mnb96
Hello,
I was trying to interpret the formula of Pearson's Chi-squared test:
\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}
I thought that if we assume that each O_i is an observation of the random variable X_i, then the above formula essentially considers the sum of squares of n standardized random variables Y_i=\frac{X_i-\mu_i}{\sigma_i}. In fact, if such random variables are Y_i \sim N(0,1), then the random variable S = \sum_{i=1}^n Y_i^2 follows a \chi^2-distribution with n degrees of freedom. Thus, the Chi-squared test would essentially evaluate the tail probability \mathrm{P}\left( S \geq \chi^2 \right) and compare it to some chosen significance level.
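To make that interpretation concrete, here is a small simulation sketch I have in mind (Python; the degrees of freedom and the statistic value are just assumed numbers for illustration), checking that the sum of n squared standard normals follows a \chi^2-distribution with n degrees of freedom, and that the tail probability matches what the test would report:

```python
import numpy as np
from scipy import stats

# Illustration with assumed numbers: sum of n squared N(0,1) variables ~ chi-square(n)
n = 5                      # number of standardized terms Y_i (assumed)
reps = 100_000             # Monte Carlo repetitions
Y = np.random.standard_normal(size=(reps, n))
S = (Y ** 2).sum(axis=1)   # S = sum_i Y_i^2

observed_stat = 11.07      # an arbitrary example value of the test statistic
p_empirical = (S >= observed_stat).mean()            # empirical P(S >= observed_stat)
p_theoretical = stats.chi2.sf(observed_stat, df=n)   # chi-square(n) tail probability
print(p_empirical, p_theoretical)                    # the two should agree closely
```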
My question is about the standardization of the random variables X_i.
If my interpretation above is correct, then Pearson's Chi-squared test somehow assumes that each random variable X_i has variance equal to its expected value, that is \sigma_i^2 = \mu_i, since the squared deviation (O_i - E_i)^2 is divided by E_i rather than by \sigma_i^2.
Why so?
Can anybody explain why we would need to assume that the variance and the expected value are numerically equal? That condition is satisfied only by some distributions, such as the Poisson and the Gamma with \theta=1. Why such a restriction?
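For the Poisson case I mentioned, here is a minimal sketch (Python; the mean value is just an assumed number) showing that when the variance equals the mean, dividing by E_i is the same as dividing by \sigma_i^2, so each term (O_i - E_i)^2 / E_i really is the square of a standardized variable:

```python
import numpy as np

# Illustration with an assumed mean: for a Poisson variable, variance = mean,
# so dividing by E_i coincides with dividing by sigma_i^2.
mu = 20.0                          # assumed expected count E_i = mu_i
reps = 100_000
O = np.random.poisson(mu, size=reps)

pearson_term = (O - mu) ** 2 / mu                      # (O_i - E_i)^2 / E_i
standardized_sq = ((O - mu) / np.sqrt(O.var())) ** 2   # ((X_i - mu_i)/sigma_i)^2, sigma_i estimated

print(O.var(), mu)                                     # empirical variance is close to the mean
print(pearson_term.mean(), standardized_sq.mean())     # both averages are close to 1
```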