Questionable statistics formula?

In summary, the conversation was about a debate between a student and their teacher regarding the formula for finding quartiles of grouped data. The student proposed a different algorithm for evaluating the quartiles, arguing that the teacher's formula was incorrect as it produced an answer for Q4, which contradicted the definition of quartiles. However, the teacher argued that the formula was still valid as it had a verbal description of the situations to which it applied. Ultimately, the student conceded to the teacher and realized that their algorithm had flaws.
  • #1
13ugwh!z
Ok so I started this debate with my teacher. It is about this formula for finding the Quartiles of grouped data. Let's take a look at this data:

With ungrouped data, 1 2 3 4 5 6 7 8 9 10 11 12, for example, we solve for Q1 as follows (at least that's what she taught us):
Q1 = 1(12) / 4
Q1 = 3
Q1 = [ 3rd + 4th ] / 2
Q1 = [3 + 4] / 2
Q1 = 3.5

This makes sense. With this formula, Q4 is not possible:
Q4 = 4(12) /4
Q4 = 12
Q4 = [12th + 13th] / 2
Q4 = doesn't exist

The rule: if kn / 4 is a whole number, take the average of the (kn / 4)th and (kn / 4 + 1)th terms. Otherwise, if it is a decimal, round it up to the nearest whole number and take that term.
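
As a rough sketch, the rule above could be written in Python like this. The function name `quartile_ungrouped` and the choice to return `None` when a needed term is missing are assumptions made for illustration, not something the teacher gave us. The sample list assumes twelve values, 1 through 12, to match n = 12 in the arithmetic above.

```python
import math


def quartile_ungrouped(data, k):
    """k-th quartile by the classroom rule; term positions are 1-based."""
    values = sorted(data)
    n = len(values)
    pos = k * n / 4
    if pos == int(pos):
        i = int(pos)
        # Needs both the i-th and (i + 1)-th terms; for k = 4 the
        # (n + 1)-th term does not exist, so there is no answer.
        if i < 1 or i + 1 > n:
            return None
        return (values[i - 1] + values[i]) / 2
    # kn / 4 is not whole: round up and take that single term.
    return values[math.ceil(pos) - 1]


sample = list(range(1, 13))            # assumed data: 1, 2, ..., 12
print(quartile_ungrouped(sample, 1))   # 3.5
print(quartile_ungrouped(sample, 4))   # None
```

Under these assumptions Q1 comes out as 3.5 and Q4 has no value, matching the hand calculation.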

With grouped data:

Class Interval    f    <F
17-24             3     3
25-32             9    12
33-40            10    22
41-48            18    40
49-56             9    49
57-64             6    55
65-72             5    60
-----
n=60

She uses the formula,

Qk = L + [ (( kn / 4 ) - <F) / f ] * s,

where Qk is the kth quartile, 'L' is the lower boundary of the class containing Qk, 'n' is the total frequency, '<F' is the cumulative frequency below L, 's' is the size of each class, and 'f' is the frequency of the class containing Qk.
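
Here is a minimal Python sketch of that formula applied to the table. The names `CLASSES` and `grouped_quartile` are made up for illustration. It assumes the lower boundary is the lower class limit minus 0.5, and that the class of Qk is the first class whose cumulative frequency reaches kn / 4.

```python
# (lower limit, upper limit, frequency) for each class in the table
CLASSES = [(17, 24, 3), (25, 32, 9), (33, 40, 10), (41, 48, 18),
           (49, 56, 9), (57, 64, 6), (65, 72, 5)]


def grouped_quartile(classes, k):
    """Qk = L + (((k * n / 4) - <F) / f) * s for grouped data."""
    n = sum(f for _, _, f in classes)
    size = classes[0][1] - classes[0][0] + 1   # class size s (8 here)
    target = k * n / 4
    below = 0                                  # cumulative frequency <F
    for lower, _, f in classes:
        if below + f >= target:
            boundary = lower - 0.5             # assumed lower boundary L
            return boundary + (target - below) / f * size
        below += f
    raise ValueError("kn / 4 exceeds the total frequency")


for k in (1, 2, 3):
    print(k, round(grouped_quartile(CLASSES, k), 2))
# 1 34.9
# 2 44.06
# 3 52.94
```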

My grounds:
How the formula is evaluated is wrong because:
  • Data, grouped or not, can only have 3 Quartiles. Quartiles by definition are those 3 values that together divide the data into 4 equal parts. So there can't be a 4th one.

Solving for Q4:

Q4 = 64.5 + [ ( ( 4 * 60 / 4 ) - 55 ) / 5 ] * 8
Q4 = 64.5 + [ ( 60 - 55 ) / 5 ] * 8
Q4 = 64.5 + [ 5 / 5 ] * 8
Q4 = 64.5 + 8
Q4 = 72.5

72.5 is the upper boundary of the 65 - 72 class, so Q4 exists, which contradicts the definition of quartiles. So this approach is wrong.

My proposal:

Leave the formula as is but evaluate it differently.
kn / 4 is the part that tells us which class Qk lies in.
kn / 4 should be consistent with how we get Qk with ungrouped data, since grouping the data doesn't change the definition of quartiles.
So if kn / 4 is a whole number, get [(kn/4)th + (kn/4 + 1)th] / 2;
else round it up to the nearest whole number.

So, solving for Q4, this is Q4's position:

Q4 = 4 ( 60 ) / 4
Q4 = [60th + 61st] / 2 -> the 61st value doesn't exist, therefore Q4 doesn't exist.
Pretending the 61st value exists...
Q4 = 60.5

So:
Q4 = 64.5 + [ (60.5 - 55) / 5 ] * 8
Q4 = 64.5 + [ 5.5 / 5 ] * 8
Q4 = 64.5 + (1.1) * 8
Q4 = 64.5 + 8.8
Q4 = 73.3

This agrees with the definition of quartiles: with this evaluation there is no Q4, which is correct.

She insists that references (books) are more reliable than this "proof" and that we should follow their formulas.
I don't know what to do when exams come asking for quartiles. I insist on using what I believe is right, though.
 
  • #2
You shouldn't call what you are writing "formulas". If you were writing mathematical formulas, you couldn't have the same variable Q1 equal to 3 and then equal to 3.5. What you are writing is algorithms, where that sort of thing is allowed to happen.

The fact that the teacher's formula produces an answer for Q4 does not imply it is an incorrect formula. Formulas in a given field of science (e.g. F = MA in classical physics) have associated verbal descriptions of the situations to which they apply. The fact that they produce answers when given numbers that can't describe one of those situations (e.g. M = sqrt(-1), A = sqrt(-5)) doesn't show the formulas are incorrect.

If there is an argument for your algorithm being better, it would be that it helps people remember that Q4 is not defined. (I haven't checked your algorithm.)
 
  • #3
Thanks for the correction! I concede to my teacher, although we didn't really have a debate. With her algorithm, Q0 and Q4 exist, but they don't divide the data; instead they serve as boundaries for the whole data set. With my algorithm, Q4 would extend beyond the true boundary of the data, which is wrong. Plus, Q0 would be shifted to the right, creating a division in the first 25% of the data (from the lowest boundary to Q0 and from Q0 to Q1). She didn't tell me this; I just realized it.
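
To check this numerically, here is a small Python sketch comparing the two evaluations at k = 0 and k = 4 on the table from post #1. The helper names (`interpolate`, `teacher_position`, `student_position`) are hypothetical. It assumes boundaries are class limits minus 0.5, that my version rounds up as in the ungrouped rule, and that a position past the last cumulative frequency falls in the last class.

```python
import math

# (lower limit, upper limit, frequency) for each class in the table
CLASSES = [(17, 24, 3), (25, 32, 9), (33, 40, 10), (41, 48, 18),
           (49, 56, 9), (57, 64, 6), (65, 72, 5)]


def interpolate(classes, position):
    """L + ((position - <F) / f) * s for a given position in the data."""
    size = classes[0][1] - classes[0][0] + 1
    below = 0
    for lower, _, f in classes[:-1]:
        if below + f >= position:
            return (lower - 0.5) + (position - below) / f * size
        below += f
    # Positions beyond every earlier class are treated as the last class.
    lower, _, f = classes[-1]
    return (lower - 0.5) + (position - below) / f * size


def teacher_position(k, n):
    return k * n / 4


def student_position(k, n):
    p = k * n / 4
    # Whole number: midpoint of the p-th and (p + 1)-th terms; else round up.
    return p + 0.5 if p == int(p) else math.ceil(p)


n = sum(f for _, _, f in CLASSES)
for k in (0, 4):
    teacher = interpolate(CLASSES, teacher_position(k, n))
    student = interpolate(CLASSES, student_position(k, n))
    print(k, round(teacher, 2), round(student, 2))
# 0 16.5 17.83
# 4 72.5 73.3
```

The teacher's evaluation lands exactly on the data boundaries 16.5 and 72.5, while mine puts Q0 inside the first class and Q4 past the last boundary.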
 

1. What is a "questionable" statistics formula?

A questionable statistics formula refers to a statistical method or formula that is not widely accepted or validated by the scientific community. It may also refer to a formula that has been used in a biased or misleading way.

2. How can I identify if a statistics formula is questionable?

One way to identify a questionable statistics formula is to check if it has been published in a reputable scientific journal and if it has been peer-reviewed. Additionally, examining the methodology and data used to derive the formula can also help determine its validity.

3. What are the potential risks of using a questionable statistics formula?

Using a questionable statistics formula can lead to inaccurate or biased results, which can have serious consequences in the scientific community. It can also undermine the credibility of the researcher and their findings.

4. Are there any guidelines for using statistics formulas in scientific research?

Yes, there are established guidelines and standards for using statistics formulas in scientific research. These include proper data collection and analysis techniques, transparency in reporting results, and using validated and accepted formulas.

5. What should I do if I encounter a questionable statistics formula in a research paper?

If you come across a questionable statistics formula in a research paper, it is important to critically evaluate its validity and discuss it with other scientists in the field. You can also reach out to the authors of the paper or consult with a statistician for further clarification.
