Testing/Comparing Distributions

  • Context: Undergrad
  • Thread starter: WWGD
  • Tags: Distributions

Discussion Overview

The discussion revolves around the statistical methods for testing whether the distribution of Physics Forums members' nationalities aligns with the distribution of the world's population. Participants explore various statistical approaches, including ANOVA and Chi-squared tests, to assess equality of proportions and goodness-of-fit.

Discussion Character

  • Technical explanation
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • One participant suggests using ANOVA to test for equality of proportions between PF members and world population regions, while questioning if a goodness-of-fit test like Chi-squared would be more appropriate.
  • Another participant agrees that Chi-squared tests can assess sample numbers against theoretical expectations but raises concerns about its limitations in identifying similarly-distributed regions.
  • A different participant expresses concern that rejecting the null hypothesis of the Chi-squared test might overlook individual nationalities that fit well, suggesting the need for a method to account for outliers.
  • One response proposes a method of iteratively removing regions with poor matching percentages to see if the remaining data passes the Chi-squared test, questioning the validity of this approach.

Areas of Agreement / Disagreement

Participants express differing views on the appropriateness of Chi-squared tests versus ANOVA for this analysis, and there is no consensus on how to handle outliers or the implications of rejecting the Chi-squared hypothesis.

Contextual Notes

Participants note potential limitations in the methods discussed, such as the risk of losing information by rejecting hypotheses based on outliers and the uncertainty regarding the fit of the PF poll to general population numbers.

WWGD
Hi all, I was going over the poll at https://www.physicsforums.com/showthread.php?t=766275, and I was wondering how one would go about testing whether the distribution of PF members' nationalities is "the same" (at some confidence level) as the distribution of the world's population.

Would this be a sort of ANOVA on proportions, i.e., taking the difference of proportions for each pair (PF region, world region), e.g., % of PF members from Asia vs. % of the world's population in Asia, and deciding statistically which pairs are equally distributed? Or would it make more sense, by some reasonable standard, to use a goodness-of-fit test, maybe a χ^2 with the world's distribution proportions as the expected ones?

I think the χ^2 would only tell us about the distributions as a whole; it would not help us decide statistically which regions are similarly distributed and which are not, whereas the ANOVA equivalent (if there is one) for differences of proportions would tell us about differences in distribution between regions.
 
WWGD said:
I think the χ^2 would only tell us about the distributions as a whole; it would not help us decide statistically which regions are similarly distributed and which are not, whereas the ANOVA equivalent (if there is one) for differences of proportions would tell us about differences in distribution between regions.

The Chi-squared goodness-of-fit test compares the sample counts with the theoretically expected counts. For an individual region, the binomial distribution should work: use the standard deviation of the binomial to see whether the sample count from that region falls within the confidence interval. The Chi-squared should also work for a single region (that region versus everything else). I suspect that the binomial and the Chi-squared are identical in that case.
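To make the two checks above concrete, here is a minimal sketch using only the Python standard library. The region names, poll counts, and world shares are made-up placeholders, not the actual PF poll results, and the helper names (`chi2_sf`, `goodness_of_fit`, `region_z`) are hypothetical. The per-region check uses the normal approximation to the binomial, which is an assumption that only holds for reasonably large expected counts. Under that approximation, the "one region versus the rest" Chi-squared statistic is exactly the square of the binomial z-score, which is the sense in which the two tests coincide.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-squared variable, integer df >= 1."""
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)                # closed form for df = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))     # closed form for df = 1
    while a < df / 2.0:                         # each pass raises df by 2
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q

def goodness_of_fit(observed, expected_share):
    """Pearson chi-squared statistic, degrees of freedom, and p-value."""
    n = sum(observed.values())
    stat = sum((observed[r] - n * expected_share[r]) ** 2 / (n * expected_share[r])
               for r in observed)
    df = len(observed) - 1
    return stat, df, chi2_sf(stat, df)

def region_z(count, n, p):
    """z-score of one region's count, using the binomial mean and standard deviation."""
    return (count - n * p) / math.sqrt(n * p * (1.0 - p))

# HYPOTHETICAL numbers for illustration only (not the real poll or census data).
observed = {"Asia": 60, "Europe": 80, "Americas": 50, "Other": 10}
world_share = {"Asia": 0.60, "Europe": 0.10, "Americas": 0.13, "Other": 0.17}

stat, df, p_value = goodness_of_fit(observed, world_share)
print(f"overall: chi2 = {stat:.2f}, df = {df}, p = {p_value:.3g}")

n = sum(observed.values())
for region, count in observed.items():
    z = region_z(count, n, world_share[region])
    verdict = "outside" if abs(z) > 1.96 else "inside"
    print(f"{region}: z = {z:+.2f} ({verdict} the approximate 95% interval)")

# One region versus the rest: the 2-category chi-squared equals z squared.
asia = observed["Asia"]
stat2, _, _ = goodness_of_fit(
    {"Asia": asia, "Rest": n - asia},
    {"Asia": world_share["Asia"], "Rest": 1.0 - world_share["Asia"]},
)
z_asia = region_z(asia, n, world_share["Asia"])
print(f"Asia vs rest: chi2 = {stat2:.4f}, z^2 = {z_asia ** 2:.4f}")
```

If SciPy is available, `scipy.stats.chisquare(f_obs, f_exp)` should give the same overall statistic and p-value. Note that checking every region separately at the 95% level involves several tests at once, so an occasional "outside" is expected by chance.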
 
Thanks. One issue I was considering: if we use a Chi-squared test and reject at a given confidence level, we may lose information, in that there may be individual nationalities which do match well, while a few outliers (maybe a single one) lead us to reject the hypothesis. Is there a way of dealing with this?
 
That's an interesting question. Suppose the entire sample fails the Chi-squared. You might take out the region with the worst matching percentage and apply the test to the remaining sample. That would be like throwing out an outlier. If that test fails, remove the samples of the next worst region. Continue removing region samples till the remaining data passes a Chi-squared. I don't know what a statistician would think of that process. I would not expect the PF poll to fit the general population numbers well.
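The removal procedure described above can be sketched as follows. This is only one possible reading of it: "worst matching" is taken to mean the region with the largest contribution to the Chi-squared statistic, the expected shares of the remaining regions are renormalized to sum to 1 after each removal, and the 0.05 threshold is arbitrary. The function names and all numbers are hypothetical placeholders. Because the regions are dropped after looking at the data, the p-value of the final test is no longer at its nominal level, so the result is best treated as exploratory.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-squared variable, integer df >= 1."""
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q

def prune_until_fit(observed, expected_share, alpha=0.05):
    """Drop the worst-fitting region until the remaining ones pass the test.

    Returns (kept counts, removed regions in order, final p-value);
    the p-value is None if fewer than two regions remain.
    """
    kept = dict(observed)
    removed = []
    while len(kept) >= 2:
        n = sum(kept.values())
        share_total = sum(expected_share[r] for r in kept)
        terms = {}
        for region, count in kept.items():
            # Expected count under shares renormalized over the kept regions.
            expected = n * expected_share[region] / share_total
            terms[region] = (count - expected) ** 2 / expected
        p = chi2_sf(sum(terms.values()), len(kept) - 1)
        if p >= alpha:
            return kept, removed, p
        worst = max(terms, key=terms.get)   # largest chi-squared contribution
        removed.append(worst)
        del kept[worst]
    return kept, removed, None

# HYPOTHETICAL numbers for illustration only (not the real poll or census data).
observed = {"Asia": 60, "Europe": 80, "Americas": 50, "Other": 10}
world_share = {"Asia": 0.60, "Europe": 0.10, "Americas": 0.13, "Other": 0.17}

kept, removed, p = prune_until_fit(observed, world_share)
print("removed, in order:", removed)
print("regions still consistent with each other:", sorted(kept))
print("final p-value:", p)
```

With these placeholder numbers the loop removes two regions before the remaining pair passes, which shows how quickly the procedure can whittle the sample down to a comparison that says little about the original question.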
 
