Why are there two different Chi-square formulas?

  • Thread starter EigenCake
  • Tags
    Formula
In summary, the Chi-squared distribution is defined as the distribution of a sum of squared standard-normal random variables. For expected versus observed frequencies, the statistic formed by summing (O-E)^2/E over all categories is used to test goodness of fit and to decide whether to reject the null hypothesis. This works because, under the null hypothesis and for large expected counts, each (O-E)/sqrt(E) is approximately standard-normal, so the sum of (O-E)^2/E is approximately Chi-squared.
  • #1
EigenCake
One is:
http://en.wikipedia.org/wiki/Chi-squared_distribution

where, at the bottom of the page, chi-square = sum of something divided by variance.

However, here:
http://www.napce.org/documents/research-design-yount/23_chisq_4th.pdf [Broken]

where the chi-square formula is the sum of the same thing divided by the expectation value.

Which formula is right then?
 
  • #2
The first reference is the definition of Chi-squared: the distribution of a random variable formed from the sum of squared standard-normal random variables.

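The second reference is an application of the above distribution to the case of expected versus observed frequencies, where, under the null hypothesis, (O-E)/sqrt(E) is asymptotically standard-normal and hence (O-E)^2/E is asymptotically Chi-squared. The two agree because a count with expectation E has variance approximately E, so dividing by E is dividing by the variance.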
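To see the link between the two references numerically, here is a minimal simulation sketch. It assumes a fair six-sided die as the null hypothesis; the function names, the number of rolls, and the number of trials are illustrative choices, not anything taken from either reference. It compares the average of the (O-E)^2/E statistic with the average of a sum of five squared standard-normal draws.

```python
import random

def pearson_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def simulate_die_statistic(rng, n_rolls=300, sides=6):
    """Roll a fair die n_rolls times and return the Pearson statistic."""
    counts = [0] * sides
    for _ in range(n_rolls):
        counts[rng.randrange(sides)] += 1
    expected = [n_rolls / sides] * sides
    return pearson_statistic(counts, expected)

def sum_of_squared_normals(rng, df=5):
    """Draw df standard-normal values and return the sum of their squares."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))

rng = random.Random(0)  # fixed seed so the run is repeatable
trials = 2000
mean_pearson = sum(simulate_die_statistic(rng) for _ in range(trials)) / trials
mean_definition = sum(sum_of_squared_normals(rng) for _ in range(trials)) / trials

# Both averages should land near 5, the mean of a Chi-squared
# distribution with 6 - 1 = 5 degrees of freedom.
print(f"mean of (O-E)^2/E statistic:   {mean_pearson:.2f}")
print(f"mean of 5 squared normals:     {mean_definition:.2f}")
```

Six categories give five degrees of freedom rather than six because the counts must add up to the fixed number of rolls.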
 
  • #3
Thanks!

It seems that for goodness of fit I should use the second formula to decide whether or not to reject the null hypothesis, rather than the first formula.
 

1. Why does the Chi-square formula have two different versions?

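The two formulas are not competing versions. The first is the definition of the Chi-squared distribution: a sum of squared standard-normal variables, each formed by subtracting a mean and dividing by a standard deviation, so each squared term is divided by a variance. The second, the sum of (O-E)^2/E, is Pearson's statistic, which applies that definition to observed counts O and expected counts E. A count with expectation E has variance approximately E, so dividing by E amounts to dividing by the variance.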

2. What is the difference between the two versions of the Chi-square formula?

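The difference is the denominator. The definition divides each squared deviation by the variance of that variable, while Pearson's statistic divides each squared deviation (O-E)^2 by the expected count E. Both compare observed values with expected ones; neither uses only observed data or only expected data. For counts the two denominators approximately coincide, and the statistic is Chi-squared only asymptotically, that is, when the expected counts are large.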
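Written side by side, the two formulas look like this. This is a sketch in standard notation; the symbols Z_i, X_i, mu_i, sigma_i and k are introduced here for illustration, and the last step assumes each count O_i is approximately Poisson with mean E_i.

```latex
% Definition: sum of k squared standard-normal variables
\chi^2_k = \sum_{i=1}^{k} Z_i^2,
\qquad Z_i = \frac{X_i - \mu_i}{\sigma_i} \sim N(0, 1)

% Pearson's statistic for counts in k categories
X^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}

% Link (assumption: O_i approximately Poisson with mean E_i,
% so that \mu_i = E_i and \sigma_i^2 = E_i)
\frac{(O_i - E_i)^2}{E_i} = \left( \frac{O_i - \mu_i}{\sigma_i} \right)^2
```

When the total count is fixed, the k terms are not independent, which is why Pearson's statistic is compared against a Chi-squared distribution with k - 1 degrees of freedom rather than k.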

3. When should I use the Chi-square goodness-of-fit test?

Use the goodness-of-fit test, the sum of (O-E)^2/E over categories, when you want to determine how well observed category counts match the counts expected under a hypothesized distribution. This can be useful in fields such as genetics, where scientists may want to determine whether a population is in Hardy-Weinberg equilibrium.
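A goodness-of-fit calculation can be sketched as follows. The counts are made up for illustration, and the 9:3:3:1 ratio is an assumed null hypothesis for a hypothetical two-trait genetic cross. The critical value 7.815 is the 5% point of the Chi-squared distribution with 3 degrees of freedom, hard-coded so that the sketch needs only the standard library.

```python
def pearson_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Illustrative (made-up) counts for four phenotype categories.
observed = [315, 108, 101, 32]
# Assumed null hypothesis: categories occur in a 9:3:3:1 ratio.
ratios = [9, 3, 3, 1]

total = sum(observed)
expected = [total * r / sum(ratios) for r in ratios]

statistic = pearson_statistic(observed, expected)
df = len(observed) - 1  # no parameters were estimated from the data

# 5% critical value of Chi-squared with 3 degrees of freedom.
CRITICAL_5PCT_DF3 = 7.815
reject_null = statistic > CRITICAL_5PCT_DF3

print(f"statistic = {statistic:.3f}, df = {df}, reject null: {reject_null}")
```

If a parameter such as an allele frequency is estimated from the same data, as is usual in a Hardy-Weinberg test, the degrees of freedom drop by one for each estimated parameter.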

4. When should I use the Chi-square test of independence?

Use the test of independence when you want to determine whether two categorical variables are related. It uses the same (O-E)^2/E formula, applied to every cell of a contingency table, with each expected count computed as row total times column total divided by the grand total. For example, a scientist may use it to determine whether there is a relationship between smoking and lung cancer.
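For a two-by-two table, the calculation can be sketched as below. The counts are invented for illustration and do not come from any study. The sketch applies no continuity correction, and the p-value shortcut through `math.erfc` is valid only for one degree of freedom, where the Chi-squared variable is the square of a single standard-normal variable.

```python
import math

def expected_counts(table):
    """Expected cell counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def pearson_statistic(table):
    """Sum of (O - E)^2 / E over every cell of the table."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )

# Made-up counts: rows are smokers / non-smokers,
# columns are disease / no disease.
table = [[30, 70],
         [10, 90]]

statistic = pearson_statistic(table)
df = (len(table) - 1) * (len(table[0]) - 1)

# For df = 1 only: P(Z^2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(statistic / 2))

print(f"statistic = {statistic:.2f}, df = {df}, p = {p_value:.5f}")
```

With these counts every expected cell is 20 or 80, so the statistic is 100/20 + 100/80 + 100/20 + 100/80 = 12.5.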

5. Can I use the Chi-square formula for continuous data?

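Not directly. Pearson's (O-E)^2/E statistic is computed from counts in categories, so continuous data must first be grouped into bins before it can be applied. For questions about the means of continuous data, other statistical tests such as the t-test or ANOVA are more appropriate.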
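Grouping continuous data into bins can be sketched as follows. The sample is simulated, the null hypothesis is assumed to be a uniform distribution on [0, 1), and the choice of four equal-width bins is arbitrary; in practice the result depends on how the bins are chosen.

```python
import random

def bin_counts(values, n_bins):
    """Count how many values in [0, 1) fall into each of n_bins equal-width bins."""
    counts = [0] * n_bins
    for x in values:
        counts[int(x * n_bins)] += 1
    return counts

def pearson_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all bins."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

rng = random.Random(1)  # fixed seed so the run is repeatable
sample = [rng.random() for _ in range(200)]  # simulated continuous data

n_bins = 4
observed = bin_counts(sample, n_bins)
# Under the assumed uniform null, each bin expects the same count.
expected = [len(sample) / n_bins] * n_bins

statistic = pearson_statistic(observed, expected)
df = n_bins - 1

print(f"bin counts = {observed}, statistic = {statistic:.3f}, df = {df}")
```

The statistic would then be compared with a Chi-squared distribution with `df` degrees of freedom, exactly as for data that were categorical to begin with.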
