Comparing Proportions: Choosing the Right Statistical Test for Your Data

  • Thread starter: Chas3down
  • Tags: AP Statistics
Chas3down
Sorry, there is no subforum for statistics, so I posted it here.

Homework Statement



So, if I have a given list of proportions

n = 64
.11
.14
.16
.14
.13
.16
.16

and I want to compare it to another group of percentages

North American average (reference proportions)
.18
.19
.20
.18
.14
.06
.05

What type of test would I use?
 
Chas3down said:
So, if I have a given list of proportions and I want to compare it to another group of percentages, what type of test would I use?
[I condensed your question for clarity and to save space]

What kind of comparison are you looking for? I'm thinking it's just a two-sample z-test for proportions.
 
Just to see if there is a significant difference between the proportions or not.

I am comparing car colors at my school against the North American average. The first group (n = 64) is the colors of 64 cars I recorded in my school parking lot. The second group of proportions is the North American average.
 
Try a goodness-of-fit chi-square test.
 
Validity, etc., depends on the nature of your data.

For instance, how do you condense information about car colors into a proportion? Typically, a proportion like 0.11 could be thought of as the number of 'yes' vs. 'no' answers to some type of question. How does a car's color fit into that type of scheme? The point is that if the numbers you show represent some type of highly 'massaged' figures, their distribution may be an artifact of your data-aggregation method and not reflective of reality. So: show us how you obtained those figures.
 
Well, I recorded the colors of 64 cars (I chose 64 so the sample would be large enough to generalize to my entire student parking lot). Then, if I had, for example, 10 black cars, I computed 10/64 to get the proportion of students who drove black cars to school that day.
 
How formal do you want your test to be? If you really want a formal test, state between what wavelengths of light you consider to be blue, etc. There are other things to consider, but that might be better if you want a boolean ("yes/no") answer. For example, "red" can be very subjective. You could argue that all pink cars are red.

Excuse my haste. You can use a two-sample z-test, but it would be weird since you are working with more than one proportion. Use a chi-squared based test for goodness of fit. That should work better.
 
It is pretty informal.

Okay, yeah, I was thinking of using a chi-squared test, thanks for your thoughts.
 
hmm.. quick addition..

Would I do a chi-squared GOF test for proportions the same way as I would for non-proportions?

Is it just a summation of ((observed proportion - expected proportion)^2 / expected proportion) to get my chi-squared value? And would my degrees of freedom just be the number of cars I used to get my data, minus 1?
 
Not quite.

The summation should be of ((Observed frequency - Expected frequency)^2 / Expected frequency).
You converted the frequencies to proportions, but you really need the frequencies.

The number of degrees of freedom is the number of colors minus 1.
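
The same point can be written out in symbols. This is only a sketch, and the notation is an assumption of this note rather than anything from the thread: k is the number of color categories, O_i the observed count in category i, p_i the reference proportion, n the sample size, and \hat{p}_i = O_i / n the observed proportion.

```latex
\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i},
\qquad E_i = n\,p_i,
\qquad \mathrm{df} = k - 1

% Substituting O_i = n\hat{p}_i and E_i = n p_i:
\chi^2 = \sum_{i=1}^{k} \frac{(n\hat{p}_i - n p_i)^2}{n p_i}
       = n \sum_{i=1}^{k} \frac{(\hat{p}_i - p_i)^2}{p_i}
```

So a sum taken over proportions alone comes out too small by a factor of n (here 64), which is why the frequencies are needed. With the seven categories listed in the first post, df = 7 - 1 = 6.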
 
Oh gotcha, so I should convert the average car color proportions to frequencies out of 64?
 
Yep.
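
As a rough illustration of that conversion, the sketch below multiplies each reference proportion from the first post by the sample size of 64. The variable names are made up for this example, and it assumes the seven posted proportions are the complete set of color categories.

```python
# Reference proportions as posted in the thread (North American average).
expected_props = [0.18, 0.19, 0.20, 0.18, 0.14, 0.06, 0.05]
n = 64  # number of cars sampled in the school parking lot

# Expected count for each color category = sample size * reference proportion.
expected_counts = [n * p for p in expected_props]

for p, e in zip(expected_props, expected_counts):
    print(f"proportion {p:.2f} -> expected count {e:.2f}")

# The expected counts should add back up to the sample size.
print(f"total expected: {sum(expected_counts):.2f}")
```

The expected counts are generally not whole numbers, and they add up to the sample size.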
 
Alright, thanks, that's really a big help. One final question: I should round all my expected car counts to whole numbers, correct? Because you can't have 5.3 cars.
 
No. The expected counts should remain fractional.
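
One way to see why rounding is a problem is to compare totals. The sketch below uses the reference proportions posted at the top of the thread with n = 64; the variable names are hypothetical and it is only meant as an illustration.

```python
# Reference proportions as posted in the thread (North American average).
reference_props = [0.18, 0.19, 0.20, 0.18, 0.14, 0.06, 0.05]
sample_size = 64

# Fractional expected counts, as the test requires.
fractional_expected = [sample_size * p for p in reference_props]

# What you would get by rounding each expected count to a whole car.
rounded_expected = [round(e) for e in fractional_expected]

print("fractional total:", round(sum(fractional_expected), 2))
print("rounded total:   ", sum(rounded_expected))
```

With these proportions, the rounded counts total 65 rather than 64, so the rounded "expected" distribution no longer matches the sample size. The expected count is a long-run average, not a count of actual cars, so a value like 3.84 is fine.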
 
Okay, the only reason I thought it would be the other way was that my TI-84 would only accept whole numbers or else it would error out... time to do it by hand.

Really a big help, thanks a lot man.
 
Huh? I'd expect your TI-84 to require whole numbers for the observed frequencies, which should indeed be whole, but not for the expected frequencies.
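
For anyone checking the calculator by hand, here is a minimal pure-Python sketch of the same computation: whole-number observed counts, fractional expected counts. The function names are invented for this example. The observed tallies are placeholders that merely sum to 64; they are NOT the poster's actual data, which the thread gives only as rounded proportions. The p-value helper uses a closed form that holds only for an even number of degrees of freedom, which happens to cover df = 6.

```python
import math

def chi_square_gof(observed_counts, expected_props):
    """Return (statistic, df) for a chi-square goodness-of-fit test."""
    if len(observed_counts) != len(expected_props):
        raise ValueError("need one expected proportion per category")
    n = sum(observed_counts)
    # Expected counts stay fractional: n times each reference proportion.
    expected_counts = [n * p for p in expected_props]
    statistic = sum((o - e) ** 2 / e
                    for o, e in zip(observed_counts, expected_counts))
    df = len(observed_counts) - 1  # number of categories minus 1
    return statistic, df

def chi2_sf_even_df(x, df):
    """P(X > x) for a chi-square variable with an EVEN number of df.

    Uses the closed form exp(-x/2) * sum_{j < df/2} (x/2)^j / j!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only covers positive even df")
    half = x / 2.0
    return math.exp(-half) * sum(half ** j / math.factorial(j)
                                 for j in range(df // 2))

# Placeholder tallies (hypothetical, chosen only so they sum to 64).
hypothetical_observed = [7, 9, 10, 9, 8, 10, 11]
# Reference proportions as posted in the thread.
reference_props = [0.18, 0.19, 0.20, 0.18, 0.14, 0.06, 0.05]

stat, df = chi_square_gof(hypothetical_observed, reference_props)
p_value = chi2_sf_even_df(stat, df)
print(f"chi-square = {stat:.3f}, df = {df}, p = {p_value:.3g}")
```

Substitute the real tallies from the parking lot for `hypothetical_observed` before drawing any conclusion.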
 
/facepalm.. thanks, I put it in wrong lol.
 
:smile:
 