Testing/Comparing Distributions

  1. Aug 21, 2014 #1

    WWGD

    Science Advisor
    Gold Member

    Hi all, I was going over the poll

    https://www.physicsforums.com/showthread.php?t=766275, and I was wondering how one would go about testing whether the distribution of PF members' nationalities is "the same" (up to some confidence level) as the distribution of the world's population.

    Would this be a sort of ANOVA for proportions, i.e., comparing, region by region, the proportion of PF members with the proportion of the world's population (e.g., % of PF members from Asia vs. % of the world's population in Asia), to decide statistically which (PF region, world region) pairs are equally distributed? Or would it make more sense, by some reasonable standard, to use a goodness-of-fit test, maybe a χ^2 with the world's proportions as the expected ones?

    I think the χ^2 would just tell us about the distributions as a whole, but would not help us decide statistically which regions are similarly distributed and which are not, whereas the ANOVA equivalent (if there is one) for differences of proportions would tell us about differences in distribution between individual regions.
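
    For reference, the goodness-of-fit statistic being considered can be written as below. The notation is introduced here only for illustration: k regions, observed counts O_i among n poll respondents, and world population shares p_i.

```latex
\chi^2 = \sum_{i=1}^{k} \frac{(O_i - n p_i)^2}{n p_i}, \qquad \mathrm{df} = k - 1
```

    Under the hypothesis that PF members are distributed like the world's population, this is approximately χ^2-distributed with k - 1 degrees of freedom, provided the expected counts n p_i are not too small. Each term of the sum is one region's contribution to the total.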
     
    Last edited: Aug 21, 2014
  3. Aug 23, 2014 #2

    FactChecker

    Science Advisor
    Gold Member

    The Chi-squared goodness-of-fit test can compare the sample counts with the theoretically expected counts. For an individual region, the binomial distribution should work: use the standard deviation of the binomial to see whether the sample count from that region lies within the confidence interval around its expected count. The Chi-squared should also work. I suspect that the binomial and the Chi-squared are identical in that case.
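
    A minimal sketch of both checks, using only the Python standard library. Everything here is an assumption made for illustration: the regions "A", "B", "C", the counts, and the shares are invented (they are not the poll's data), and the helper names `chi2_sf`, `goodness_of_fit`, and `region_z` are hypothetical. The per-region check uses the normal approximation to the binomial.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-squared variable with df degrees of freedom.

    Uses the recurrence Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1),
    started from Q(1, y) = exp(-y) (even df) or Q(1/2, y) = erfc(sqrt(y)) (odd df).
    """
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


def goodness_of_fit(observed, expected_share):
    """Chi-squared goodness-of-fit: returns (statistic, degrees of freedom, p-value)."""
    n = sum(observed.values())
    stat = sum(
        (count - n * expected_share[r]) ** 2 / (n * expected_share[r])
        for r, count in observed.items()
    )
    df = len(observed) - 1
    return stat, df, chi2_sf(stat, df)


def region_z(observed, expected_share, region):
    """z-score of one region's count under a binomial(n, p) model (normal approximation)."""
    n = sum(observed.values())
    p = expected_share[region]
    return (observed[region] - n * p) / math.sqrt(n * p * (1.0 - p))


# Hypothetical data: poll counts per region and world population shares.
observed = {"A": 50, "B": 30, "C": 20}
shares = {"A": 0.60, "B": 0.25, "C": 0.15}

stat, df, p_value = goodness_of_fit(observed, shares)

# Single-region check for "A": binomial z-score versus a two-category chi-squared.
z_a = region_z(observed, shares, "A")
n_total = sum(observed.values())
stat_a_vs_rest, _, _ = goodness_of_fit(
    {"A": observed["A"], "rest": n_total - observed["A"]},
    {"A": shares["A"], "rest": 1.0 - shares["A"]},
)

print(f"overall: chi2 = {stat:.3f}, df = {df}, p = {p_value:.3f}")
print(f"region A: z = {z_a:.3f}, z^2 = {z_a ** 2:.3f}, chi2 (A vs rest) = {stat_a_vs_rest:.3f}")
```

    In this sketch the two-category chi-squared statistic for a single region comes out equal to the square of the binomial z-score, which is the sense in which the two approaches coincide when the normal approximation is used; an exact binomial test would differ slightly.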
     
  4. Aug 23, 2014 #3

    WWGD

    Science Advisor
    Gold Member

    Thanks. One issue I was considering: if we used a Chi-squared test and rejected the hypothesis at a given confidence level, we may lose information, in that many individual nationalities may match well while a few outliers, maybe a single one, lead us to reject the hypothesis. Is there a way of dealing with this?
     
  5. Aug 23, 2014 #4

    FactChecker

    Science Advisor
    Gold Member

    That's an interesting question. Suppose the entire sample fails the Chi-squared. You might take out the region with the worst matching percentage and apply the test to the remaining sample. That would be like throwing out an outlier. If that test fails, remove the samples of the next worst region. Continue removing region samples till the remaining data passes a Chi-squared. I don't know what a statistician would think of that process. I would not expect the PF poll to fit the general population numbers well.
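
    The removal procedure described above could be sketched as follows. This is only an illustration under stated assumptions: the data are invented, "worst matching" is taken to mean the region with the largest contribution to the Chi-squared statistic, the expected shares are renormalized over the regions that remain, and the names `chi2_sf` and `peel_regions` are hypothetical.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-squared variable (same recurrence for any integer df >= 1)."""
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


def peel_regions(observed, expected_share, alpha=0.05):
    """Drop the worst-fitting region until the remaining data pass the test.

    Returns (removed regions in order, remaining counts, p-value of the final test).
    The p-value is None if fewer than two regions are left before any test passes.
    """
    remaining = dict(observed)
    removed = []
    while len(remaining) >= 2:
        n = sum(remaining.values())
        total_share = sum(expected_share[r] for r in remaining)
        # Per-region contribution, with shares renormalized over the remaining regions.
        contrib = {}
        for r, count in remaining.items():
            expected = n * expected_share[r] / total_share
            contrib[r] = (count - expected) ** 2 / expected
        p_value = chi2_sf(sum(contrib.values()), len(remaining) - 1)
        if p_value >= alpha:
            return removed, remaining, p_value
        worst = max(contrib, key=contrib.get)  # assumption: "worst" = largest contribution
        removed.append(worst)
        del remaining[worst]
    return removed, remaining, None


# Hypothetical data: region "A" is heavily over-represented in the poll.
observed = {"A": 200, "B": 110, "C": 190}
shares = {"A": 0.10, "B": 0.30, "C": 0.60}

removed, kept, final_p = peel_regions(observed, shares)
print("removed:", removed)
print("kept:", kept)
print("final p-value:", final_p)
```

    One caveat worth keeping in mind: because regions are removed based on the data and the test is repeated, the p-value of the final test no longer has its nominal meaning, so the result is better read as a description of which regions drive the mismatch than as a formal test.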
     