Solving Overdetermined Problems: χ² Distribution Requirements

  • Thread starter: Niles
  • Tags: Distribution
Niles
Hi

I'm not sure this is the right place to post, but I'll go ahead. In my book it says that if I am dealing with an overdetermined problem with m data points and n parameters (so m > n), then my observed chi-square χ²_obs follows a χ² distribution with m - n degrees of freedom if the data points are normally distributed.

I thought that the number of degrees of freedom was always m-n, regardless of what distribution my data follows. Am I right or is it correct what the book is stating?
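
For reference, this is the statistic I have in mind. The notation is mine, not quoted from the book: I am assuming independent measurements y_i taken at points x_i with known uncertainties σ_i, and a model f with n fitted parameters θ̂.

```latex
\chi^2_{\text{obs}} = \sum_{i=1}^{m} \frac{\bigl(y_i - f(x_i;\hat{\theta})\bigr)^2}{\sigma_i^2},
\qquad \hat{\theta} \in \mathbb{R}^n, \quad m > n .
```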
 
Niles said:
Am I right or is it correct what the book is stating?

I think no one has answered this because you haven't given a clear statement of what the book said. For example, what kind of parameters is the book talking about? Means? Covariances? Any old parameter? What kind of data are the "data points"?

Do you have a source or link that supports your own opinion that the random variables need not be normally distributed?
 
Niles said:
Hi

I'm not sure this is the right place to post, but I'll go ahead. In my book it says that if I am dealing with an overdetermined problem with m data points and n parameters (so m > n), then my observed chi-square χ²_obs follows a χ² distribution with m - n degrees of freedom if the data points are normally distributed.

I thought that the number of degrees of freedom was always m-n, regardless of what distribution my data follows. Am I right or is it correct what the book is stating?

The chi-squared distribution has an essential parameter called the number of degrees of freedom. So the bolded and red text in your quote is all part of the distribution's name.
 
By "parameters" I mean parameters used to make a fit to the data. And data points are physically measured data, which is why I believe the book is so keen on always dealing with normally distributed data (cf. Central Limit Theorem).

I have no source for my statement. In fact, I believe I might be wrong. But I still think it is an interesting question: if I am dealing with data that isn't Gaussian-distributed, how would I go about making a goodness-of-fit estimate, considering I can't use χ²?

Thanks.
 
Niles said:
By "parameters" I mean parameters used to make a fit to the data.

And by parameters I meant coefficients that characterize the probability density function, just like the expectation $a$ and the standard deviation $\sigma$ in the normal distribution $\mathcal{N}(a, \sigma)$, or the endpoints $a$ and $b$ in the uniform distribution $\mathcal{U}(a, b)$, or the parameter $\lambda$ in the Poisson distribution $\mathcal{P}(\lambda)$.
 
The "if the data points are normally distributed" condition can often be justified by the Central Limit Theorem. If each data point is a sum or average of many random variables, then one may assume it is "close to" normally distributed, and thus the statistic is "close to" chi-squared distributed.
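
To see what the normality condition buys you, here is a rough simulation sketch (not from the book). It assumes a straight-line fit (n = 2) to m = 20 points with unit-variance noise; the true line, the noise models, and the function names are all made up for illustration. For a linear fit like this, the mean of the statistic should come out near m - n = 18 for every noise model, but the spread should match a chi-squared distribution (variance 2(m - n) = 36) only in the Gaussian case.

```python
import math
import random
import statistics


def chi2_of_line_fit(x, y):
    """Least-squares straight-line fit; returns the sum of squared
    residuals, which equals chi-square when every sigma_i is 1."""
    m = len(x)
    xbar = sum(x) / m
    ybar = sum(y) / m
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


def simulate(noise, m=20, trials=5000, seed=0):
    """Return (mean, variance) of the chi-square statistic over many
    simulated data sets drawn around a hypothetical true line."""
    rng = random.Random(seed)
    x = [i / (m - 1) for i in range(m)]
    stats = []
    for _ in range(trials):
        y = [1.0 + 2.0 * xi + noise(rng) for xi in x]  # assumed true line
        stats.append(chi2_of_line_fit(x, y))
    return statistics.mean(stats), statistics.variance(stats)


SQRT2 = math.sqrt(2.0)
SQRT3 = math.sqrt(3.0)

# Three noise models, each with mean 0 and variance 1.
NOISE = {
    "gaussian": lambda rng: rng.gauss(0.0, 1.0),
    "uniform": lambda rng: rng.uniform(-SQRT3, SQRT3),
    # Difference of two exponentials with rate sqrt(2) is Laplace, variance 1.
    "laplace": lambda rng: rng.expovariate(SQRT2) - rng.expovariate(SQRT2),
}

results = {name: simulate(fn) for name, fn in NOISE.items()}

for name, (mean, var) in results.items():
    print(f"{name:9s} mean = {mean:6.2f}   variance = {var:6.2f}")
```

If the heavy-tailed (Laplace) case shows a noticeably larger variance and the uniform case a smaller one, then p-values read off a chi-squared table would be off for those data, even though "m - n" still describes the mean.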

(BTW: one should say "regardless of" or "irrespective of" or even "irregarding" but not "irregardless".)
 
Cochran's theorem (http://en.wikipedia.org/wiki/Cochran%27s_theorem) gives the precise conditions under which the distribution is chi-square, and what the number of degrees of freedom is.
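
As a sketch of how this applies to a least-squares fit, assume the model is linear in its n parameters, with an m × n design matrix A of full rank and errors of common standard deviation σ. The symbols A, H, and ε are my notation, not taken from the book or the thread.

```latex
y = A\theta + \varepsilon, \qquad
H = A\,(A^{\mathsf T}A)^{-1}A^{\mathsf T}, \qquad
\chi^2_{\text{obs}} = \frac{\varepsilon^{\mathsf T}(I - H)\,\varepsilon}{\sigma^2}.

% I - H is symmetric and idempotent with rank m - n, so:
\varepsilon \sim \mathcal{N}(0, \sigma^2 I)
\;\Longrightarrow\;
\chi^2_{\text{obs}} \sim \chi^2_{m-n}.

% Without normality, only the mean is guaranteed
% (zero-mean, uncorrelated errors with variance \sigma^2):
\mathbb{E}\bigl[\chi^2_{\text{obs}}\bigr] = \operatorname{tr}(I - H) = m - n .
```

So the count m - n comes from the rank of I - H and does not depend on the error distribution, while the chi-squared shape itself relies on the normality assumption. For a model that is nonlinear in its parameters, this holds only approximately.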
 