Quartiles of ungrouped data

  1. Feb 25, 2012 #1
    In the attached question, is this the correct way to find the lower and upper quartiles? I have seen books take the 3rd and 8th data values (from the same question) as the lower and upper quartiles respectively. Which quartiles are correct?

  3. Feb 25, 2012 #2
    Are you asking how to calculate quartiles or how to interpret them? First, there's no such thing as an eighth quartile. By definition, quartiles partition the data into four ordered sets. The generic term is quantile, which covers division of the data points into any number of equal parts. In the convention I'm using here, the 1st quartile contains the upper 25% of the data points and the last quartile the lowest 25%. All data values must be ranked, putting them into correspondence with the integers 1 through n, where n is the total number of data points. If k is the rank of a chosen data point x, with [itex]1 \leq k \leq n[/itex], then:

    [itex]P[X < x] \leq k/n; P[X\geq x] \geq 1 - (k/n)[/itex]

    So by the first inequality, if x sits just below the top five of 100 data points, then k=95 and k/n=0.95, which is the 95th percentile. It seems you want the upper quartile (top 25%) and the lower quartile (bottom 25%). The term 75th percentile means that 75% of all data points are less than the lowest data point of the upper quartile.
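A minimal sketch of the two inequalities above, assuming distinct data values and taking x to be the value with ascending rank k. The function name `check_rank_inequalities` and the sample data 1..100 are illustrative assumptions, not part of the thread.

```python
def check_rank_inequalities(values, k):
    """Return (below, at_or_above, n) for the value with 1-based ascending rank k."""
    xs = sorted(values)
    n = len(xs)
    x = xs[k - 1]                          # the chosen data point
    below = sum(1 for v in xs if v < x)    # count of points with X < x
    at_or_above = n - below                # count of points with X >= x
    return below, at_or_above, n


data = list(range(1, 101))   # hypothetical sample: 100 distinct values
k = 95
below, at_or_above, n = check_rank_inequalities(data, k)

# Compare counts rather than ratios to avoid floating-point rounding:
# P[X < x] <= k/n       is equivalent to  below <= k
# P[X >= x] >= 1 - k/n  is equivalent to  at_or_above >= n - k
assert below <= k
assert at_or_above >= n - k
print(below / n, at_or_above / n)
```

Working with counts keeps the check exact; dividing by n only at the end gives the empirical proportions.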
    Last edited: Feb 25, 2012
  4. Feb 25, 2012 #3
  5. Feb 25, 2012 #4
    I'm having a hard time reading it, but to establish quartiles it's the number of data points and their rank that matter, not their actual values. So if n=15, the rank of the median satisfies k/n = 0.5. Solve for k to get 7.5. For a quartile: k/n = 0.25, so k = 3.75. The lower boundary of the upper quartile would then be 15 - 3.75 = 11.25. This would include your top four ranked values, which would be your last four data points in counting order: the 12th, 13th, 14th and 15th data points.
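The rank arithmetic for n=15 can be sketched as follows. The helper name `rank_boundary` is an assumption made for illustration; it simply solves k/n = p for k.

```python
def rank_boundary(p, n):
    """Rank k that solves k/n = p (k may be fractional)."""
    return p * n


n = 15
median_rank = rank_boundary(0.5, n)        # rank of the median
quartile_width = rank_boundary(0.25, n)    # width of one quartile in ranks
upper_boundary = n - quartile_width        # lower boundary of the upper quartile

# Ranks strictly above the boundary belong to the upper quartile.
top_ranks = [k for k in range(1, n + 1) if k > upper_boundary]
print(median_rank, quartile_width, upper_boundary, top_ranks)
```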

    If you type out what you're doing, I can tell you more. You seem to be doing it correctly. For an even number of values, some people use k+1, as you have, so that quantile boundaries do not fall on data points. The rank of your median is then 5.5, and the quartile boundaries would be calculated using 2.75. So 5.5 - 2.75 = 2.75. Your answer could be this or 2.25. I'm not sure which.
    Last edited: Feb 25, 2012
  6. Feb 25, 2012 #5
    There are 10 data values in my attached example.

    {51, 55, 57, 61, 62, 67, 70, 72, 73, 74}

    Are the quartiles
    Q1 = 56.5
    Q3 = 72.25

    or the 3rd and 8th values,
    Q1 = 57
    Q3 = 72
    instead?
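Both pairs of values can be reproduced from the ten data points. The sketch below assumes that 56.5 and 72.25 come from linear interpolation at ranks (n+1)/4 and 3(n+1)/4, and that 57 and 72 are the medians of the lower and upper halves; the function names are illustrative and neither convention is claimed to be the only valid one.

```python
from statistics import median


def quartiles_n_plus_1(values):
    """Q1 and Q3 by linear interpolation at ranks (n+1)/4 and 3(n+1)/4."""
    xs = sorted(values)
    n = len(xs)

    def at_rank(r):
        lo = int(r)              # integer part of the 1-based rank
        frac = r - lo            # fractional part used for interpolation
        if lo < 1:
            return xs[0]
        if lo >= n:
            return xs[-1]
        return xs[lo - 1] + frac * (xs[lo] - xs[lo - 1])

    return at_rank((n + 1) / 4), at_rank(3 * (n + 1) / 4)


def quartiles_median_of_halves(values):
    """Q1 and Q3 as medians of the lower and upper halves (middle value excluded for odd n)."""
    xs = sorted(values)
    half = len(xs) // 2
    return median(xs[:half]), median(xs[-half:])


data = [51, 55, 57, 61, 62, 67, 70, 72, 73, 74]
print(quartiles_n_plus_1(data))          # (56.5, 72.25)
print(quartiles_median_of_halves(data))  # (57, 72)
```

With n=10 the interpolation ranks are 2.75 and 8.25, while the median-of-halves rule lands exactly on the 3rd and 8th values, which is why the two answers differ slightly.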
  7. Feb 25, 2012 #6
    As far as I know, with sparse data like this, you can't be very precise in placing quantile boundaries by interpolating between the actual data values. All you can say is that the median falls between 62 and 67. The quartile boundaries fall on 57 and 72. If you use k+1 and center the rank distribution on the median, using 2.75 ranks as the quartile width, then 57 will fall into the second quartile while 72 will fall into the third quartile when strictly observing the boundaries 2.75 and 8.25. With n=10+1, you can't be more precise than that IMO. Note I'm using Q4 for the quartile with the highest values and Q1 for the one with the lowest values, as you did.
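A small sketch of this rank-based assignment, assuming boundaries at multiples of (n+1)/4 (the n+1 form given in the correction in post #7) and numbering quartiles from 1 (lowest) to 4 (highest). The function name `quartile_of_rank` is hypothetical.

```python
import bisect


def quartile_of_rank(rank, n):
    """Quartile (1 = lowest, 4 = highest) of a 1-based rank.

    Boundaries sit at multiples of (n + 1) / 4. A rank that lands exactly
    on a boundary is placed in the lower quartile (an arbitrary choice).
    """
    width = (n + 1) / 4
    boundaries = [width, 2 * width, 3 * width]
    return bisect.bisect_left(boundaries, rank) + 1


data = [51, 55, 57, 61, 62, 67, 70, 72, 73, 74]
n = len(data)   # boundaries are 2.75, 5.5 and 8.25 for n = 10
assignment = {
    value: quartile_of_rank(rank, n)
    for rank, value in enumerate(sorted(data), start=1)
}
print(assignment[57], assignment[72])  # 2 3
```

Here 57 has rank 3, just above 2.75, and 72 has rank 8, just below 8.25, which matches the placement described above.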
    Last edited: Feb 26, 2012
  8. Feb 27, 2012 #7
    I made a mistake with the "k+1" adjustment for even n. It should be n+1 of course.