Solving Overdetermined Problems: χ² Distribution Requirements

  • Thread starter: Niles
  • Tags: Distribution
AI Thread Summary
The thread clarifies that for an overdetermined problem with m data points and n fit parameters (where m > n), the observed chi-square statistic (χ²obs) follows a chi-square distribution with m-n degrees of freedom, provided the data points are normally distributed. The normality assumption is often justified by the Central Limit Theorem: data points that are sums or averages of many random variables are approximately normally distributed, so the statistic is approximately chi-square distributed. The thread leaves open how to assess goodness of fit when the data are not normally distributed. The conversation also touches on the importance of clearly defining parameters and data types in statistical contexts, and points to Cochran's theorem for the precise conditions under which the chi-square distribution applies.
Niles
Hi

I'm not sure this is the right place to post, but I'll go ahead. In my book it says that if I am dealing with an overdetermined problem with m data points and n parameters (so m>n), then my observed chi-square χ²obs follows a χ² distribution with m-n degrees of freedom if the data points are normally distributed.

I thought that the number of degrees of freedom was always m-n, regardless of what distribution my data follows. Am I right or is it correct what the book is stating?
 
Niles said:
Am I right or is it correct what the book is stating?

I think no one has answered this because you haven't given a clear statement of what the book said. For example, what kind of parameters is the book talking about? Means? Covariances? Any old parameter? What kind of data are the "data points"?

Do you have a source or link that supports your own opinion that the random variables need not be normally distributed?
 
Niles said:
Hi

I'm not sure this is the right place to post, but I'll go ahead. In my book it says that if I am dealing with an overdetermined problem with m data points and n parameters (so m>n), then my observed chi-square χ²obs follows a χ² distribution with m-n degrees of freedom if the data points are normally distributed.

I thought that the number of degrees of freedom was always m-n, regardless of what distribution my data follows. Am I right or is it correct what the book is stating?

The Chi-squared distribution has an essential parameter called number of degrees of freedom. So, the bolded and red text in your quote is all part of the name.
 
By "parameters" I mean parameters used to make a fit to the data. And data points are physically measured data, which is why I believe the book is so keen on always dealing with normally distributed data (cf. Central Limit Theorem).

I have no source for my statement. In fact I believe I might be wrong. But I still think it is an interesting question: if I am dealing with data that isn't Gaussian, how would I go about making a goodness-of-fit estimate, considering I can't use χ²?

Thanks.
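
One possible way to approach the question above, offered only as a hedged sketch and not as something proposed in the thread, is to estimate the distribution of the fit statistic by Monte Carlo simulation instead of assuming it is χ². The example below assumes a polynomial model, a known noise scale `sigma`, and a noise model you can sample from; the names `chi2_stat`, `monte_carlo_p_value`, and `uniform_noise` are hypothetical helpers introduced for illustration.

```python
import numpy as np


def chi2_stat(x, y, sigma, degree=1):
    """Least-squares polynomial fit; returns (sum of squared scaled residuals, coefficients)."""
    A = np.vander(x, degree + 1, increasing=True)  # columns 1, x, x^2, ...
    coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
    stat = float(np.sum(((y - A @ coeffs) / sigma) ** 2))
    return stat, coeffs


def monte_carlo_p_value(x, y, sigma, draw_noise, degree=1, trials=2000, seed=0):
    """Estimate P(statistic >= observed) by simulating data sets from the fitted model.

    draw_noise(rng, size) must return zero-mean noise samples from the
    assumed (possibly non-Gaussian) noise model.
    """
    rng = np.random.default_rng(seed)
    stat_obs, coeffs = chi2_stat(x, y, sigma, degree)
    y_fit = np.vander(x, degree + 1, increasing=True) @ coeffs
    sims = np.empty(trials)
    for i in range(trials):
        # Simulate a data set from the fitted curve, then refit it.
        y_sim = y_fit + draw_noise(rng, len(x))
        sims[i], _ = chi2_stat(x, y_sim, sigma, degree)
    return stat_obs, float(np.mean(sims >= stat_obs))


def uniform_noise(sigma):
    """Zero-mean uniform noise with standard deviation sigma (an assumed noise model)."""
    half_width = np.sqrt(3.0) * sigma
    return lambda rng, size: rng.uniform(-half_width, half_width, size)


# Usage: a straight-line fit to data that is actually quadratic should be rejected.
x = np.linspace(0.0, 1.0, 20)
stat, p = monte_carlo_p_value(x, 5.0 * x**2, sigma=0.01, draw_noise=uniform_noise(0.01))
print(f"statistic = {stat:.1f}, estimated p-value = {p:.3f}")
```

The statistic is the same sum of squared scaled residuals as before; only its reference distribution is replaced by a simulated one, so the result is only as good as the assumed noise model.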
 
Niles said:
By "parameters" I mean parameters used to make a fit to the data.

And by parameters I meant coefficients that characterize the probability density function, just like the expectation $a$ and the standard deviation $\sigma$ in the normal distribution $\mathcal{N}(a, \sigma)$, or the endpoints $a$ and $b$ in the uniform distribution $\mathcal{U}(a, b)$, or the parameter $\lambda$ in the Poisson distribution $\mathcal{P}(\lambda)$.
 
The "if the data points are normally distributed" condition may be justified by appealing to the Central Limit Theorem. If the data points are sums or averages of many random variables, then one may assume they are "close to" normally distributed, and thus the statistic is "close to" chi-squared distributed.
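
To make the distinction concrete, here is a small simulation sketch. It assumes a straight-line model (n = 2 parameters), m = 10 data points, and independent noise with known unit standard deviation; the function names are hypothetical. For a linear least-squares fit with such noise, the mean of the statistic is m-n whatever the noise distribution, but the full χ² shape (for example its variance of 2(m-n)) is only guaranteed for normal noise.

```python
import numpy as np


def simulate_chi2(noise, m=10, n_params=2, trials=20000, seed=0):
    """Return the fit statistic from many simulated polynomial fits.

    noise(rng, size) must return zero-mean samples with unit standard deviation.
    """
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, m)
    A = np.vander(x, n_params, increasing=True)    # design matrix, one column per parameter
    y_true = 1.0 + 2.0 * x                         # arbitrary "true" line
    Y = y_true + noise(rng, (trials, m))           # one simulated data set per row
    coeffs, *_ = np.linalg.lstsq(A, Y.T, rcond=None)  # fit all data sets at once
    residuals = Y.T - A @ coeffs
    return np.sum(residuals**2, axis=0)            # sigma = 1, so no rescaling needed


def unit_gaussian(rng, size):
    return rng.normal(0.0, 1.0, size)


def unit_uniform(rng, size):
    # Uniform on [-sqrt(3), sqrt(3)] has mean 0 and variance 1.
    return rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), size)


for name, noise in [("gaussian", unit_gaussian), ("uniform", unit_uniform)]:
    chi2 = simulate_chi2(noise)
    # Theory: mean is m - n = 8 in both cases; variance is 2(m - n) = 16 only for Gaussian noise.
    print(f"{name:8s} mean = {chi2.mean():.2f}  variance = {chi2.var():.2f}")
```

So the count m-n is tied to the fit itself, while the claim that the statistic follows a χ² distribution with that many degrees of freedom needs the normality assumption, at least approximately.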

(BTW: one should say "regardless of" or "irrespective of" or even "irregarding" but not "irregardless".)
 
Cochran's theorem (http://en.wikipedia.org/wiki/Cochran%27s_theorem) gives the precise conditions when the distribution is chi-square and what the number of degrees of freedom is.
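
For the special case of a linear least-squares fit, the argument can be sketched as follows. The notation ($A$, $\theta$, $\varepsilon$, $H$, $r$) is assumed here and does not come from the thread; only m and n are taken from the original question.

```latex
% Assumed setup: linear model with m data points, n fit parameters,
% and independent normal errors of known standard deviation sigma.
\begin{align*}
  y &= A\theta + \varepsilon,
    \qquad A \in \mathbb{R}^{m \times n},\ \operatorname{rank}(A) = n,
    \qquad \varepsilon \sim \mathcal{N}(0, \sigma^2 I_m) \\
  % Least-squares residual written with the projection matrix H
  H &= A (A^{\mathsf T} A)^{-1} A^{\mathsf T},
    \qquad r = y - A\hat{\theta} = (I_m - H)\,\varepsilon \\
  % I_m - H is symmetric and idempotent with rank m - n, hence
  \chi^2_{\mathrm{obs}} &= \frac{\lVert r \rVert^2}{\sigma^2}
    = \frac{\varepsilon^{\mathsf T} (I_m - H)\,\varepsilon}{\sigma^2}
    \sim \chi^2_{m-n}
\end{align*}
```

The rank of $I_m - H$ supplies the m-n degrees of freedom, while the normality of $\varepsilon$ is what makes the quadratic form χ² distributed.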
 