
Q-test: Why limited to one point?

  1. Feb 25, 2012 #1
    So I recently learned how to Q-test possible outlying data points. But I am a little confused: why can I only Q-test one point?

    I understand that in Q-testing, you are deleting one point. But if that point was so significant an outlier that it shouldn't have been included in the data, why should deleting it affect the rest of the data? I think it should be viewed as never having been in the data. Then we could Q-test other possible outliers using N-1 instead of N for the Q-test.
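    For readers who have not met the test, here is a minimal sketch of a single Q-test in Python. It assumes the usual form Q = gap / range and the commonly tabulated 90% critical values for N = 3 to 10. The function name `q_test` and the sample readings are illustrative only, not taken from any coursework.

```python
# Commonly tabulated critical values of Dixon's Q at 90% confidence, N = 3..10.
Q90 = {3: 0.941, 4: 0.765, 5: 0.642, 6: 0.560,
       7: 0.507, 8: 0.468, 9: 0.437, 10: 0.412}


def q_test(data, table=Q90):
    """Apply one Q-test to the most extreme point of `data`.

    Returns (suspect, q, reject): the suspect value, its Q statistic,
    and whether Q exceeds the critical value for this sample size.
    """
    xs = sorted(data)
    n = len(xs)
    if n not in table:
        raise ValueError("no critical value tabulated for N = %d" % n)
    spread = xs[-1] - xs[0]          # range of the whole sample
    if spread == 0:
        return xs[0], 0.0, False     # all values identical, nothing to reject
    q_low = (xs[1] - xs[0]) / spread     # gap between lowest point and its neighbour
    q_high = (xs[-1] - xs[-2]) / spread  # gap between highest point and its neighbour
    if q_low >= q_high:
        suspect, q = xs[0], q_low
    else:
        suspect, q = xs[-1], q_high
    return suspect, q, q > table[n]


# Hypothetical replicate measurements (N = 10), made up for illustration.
sample = [0.189, 0.167, 0.187, 0.183, 0.186, 0.182, 0.181, 0.184, 0.181, 0.177]
suspect, q, reject = q_test(sample)
print(suspect, round(q, 3), reject)
```

    Note that the critical value depends on N, so re-running the test on the remaining N-1 points uses a different threshold than the first pass did.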

    I already took a test over this in my Chemistry class; luckily there was nothing on it about this, but I would still like to know the reason!

    Thank you for any replies :)
  2. jcsd
  3. Feb 25, 2012 #2



    Staff: Mentor

    I seem to remember that the idea that it should be applied only once was one of the assumptions used when the test was designed. But I could be wrong.

    I will move the thread to the statistics forum.
  4. Feb 25, 2012 #3


    Science Advisor

    Hey Icskatingqn and welcome to the forums.

    It depends on what you are trying to do.

    Let's consider trying to estimate the mean (the average value of all data points in a sample or, for a theoretical distribution, what you could think of as the average over an infinite amount of data).

    Now let's look at a familiar example: measuring income.

    We know that most people earn around a certain amount (like say 50,000 to 70,000) but there are of course a few billionaires that 'skew' the average upwards.

    You could see the billionaire incomes as the outliers and if you wanted to get a better indication of the 'average' income, then the billionaire data points might skew the average too much to give an accurate representation for the majority of the population.
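    To put rough numbers on that, here is a small sketch in Python. The incomes are made up for illustration (five typical earners plus one billionaire) and do not come from any real data set.

```python
from statistics import mean

# Hypothetical incomes: five "typical" earners and one billionaire.
typical_incomes = [50_000, 55_000, 60_000, 65_000, 70_000]
billionaire_income = 1_000_000_000

# Mean of the typical earners alone.
mean_without = mean(typical_incomes)

# Mean once the single billionaire is added to the sample.
mean_with = mean(typical_incomes + [billionaire_income])

print(mean_without)      # average of the five typical incomes
print(round(mean_with))  # average after adding the one extreme value
```

    A single extreme value moves the mean far outside the 50,000 to 70,000 range that every other person in the sample falls into.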

    This is one reason why we might censor some of the values (i.e. remove them): in the context of what we are trying to measure, they don't help us, and in many cases they are damaging.

    Another important reason to remove outliers is if they are 'data errors'. If it's known that a particular value is 'impossible' or 'implausible' then it's a good idea to remove it. You never see people 4 metres tall, and even though it might be 'possible', it's not 'plausible', so we remove it.
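    As a sketch of that kind of plausibility check in Python: the bounds below are assumptions picked purely for illustration, and the heights are made-up readings. In practice the limits would have to come from knowledge of what is being measured.

```python
# Assumed plausibility bounds for adult human height, in metres (illustrative only).
MIN_PLAUSIBLE_M = 0.4
MAX_PLAUSIBLE_M = 2.8

# Hypothetical recorded heights; 4.0 m is almost certainly a data-entry error.
heights = [1.62, 1.75, 4.0, 1.80]

# Keep the readings inside the bounds, and set the rest aside for review.
plausible = [h for h in heights if MIN_PLAUSIBLE_M <= h <= MAX_PLAUSIBLE_M]
flagged = [h for h in heights if not (MIN_PLAUSIBLE_M <= h <= MAX_PLAUSIBLE_M)]

print(plausible)
print(flagged)
```

    Keeping the flagged values in their own list, instead of silently dropping them, makes it easier to justify each removal afterwards.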

    But having said the above you have to be very careful about what data is removed and for what purpose. It depends on the nature of the experiment, what you are trying to find out, what the data is, and what the underlying process is.

    In other words, you don't just 'remove outliers' as a standard thing: there has to be a good reason for it.