The chi square goodness-of-fit test with no degrees of freedom left


Discussion Overview

The discussion revolves around the application of the chi-square goodness-of-fit test in the context of an empirical frequency distribution with parameters that affect the degrees of freedom. Participants explore the implications of having no degrees of freedom left and how this affects the assessment of goodness-of-fit between observed and theoretical distributions.

Discussion Character

  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • One participant presents an empirical frequency distribution and questions the validity of assessing goodness-of-fit when there are no degrees of freedom left due to the number of parameters estimated.
  • Another participant asserts that having three measurements and two parameters results in one degree of freedom, suggesting that goodness-of-fit information is still available.
  • A different participant calculates chi-square values for different sets of scores and notes that the significance of the results changes based on the inclusion of additional scores, raising questions about the interpretation of these results.
  • One participant challenges the relevance of a previous message, emphasizing the calculation of degrees of freedom based on the number of data points and parameters.
  • Another participant clarifies their understanding of degrees of freedom, stating that it is calculated as the number of classes minus the number of estimated parameters minus one, leading to a conclusion of zero degrees of freedom in the initial example.
  • A participant questions the definition of "class" in the context of degrees of freedom and reiterates that with three data points and two parameters, there is one degree of freedom available.
  • One participant reiterates their chi-square calculation and highlights the differences in fit measures when comparing two probability models against the same data.

Areas of Agreement / Disagreement

Participants express differing views on the calculation of degrees of freedom and its implications for goodness-of-fit assessments. There is no consensus on the correct interpretation of the degrees of freedom in this context, and multiple competing views remain unresolved.

Contextual Notes

Participants reference specific calculations and definitions that may depend on their interpretations of statistical concepts, such as "class" and the parameters involved in the chi-square test. The discussion reflects a variety of assumptions and conditions that are not universally agreed upon.

Ad VanderVen
TL;DR
How to deal with a chi square goodness-of-fit test if the number of degrees of freedom is equal to zero?
I have an empirical frequency distribution as for example below:

##f_{2} = \, \, \, 21##
##f_{3} = 111##
##f_{4} = \, \, \, 24##

The theoretical distribution is determined by two parameters, so for a chi-square goodness-of-fit test there are actually no degrees of freedom left. Yet the theoretical distribution deviates from the observed distribution: having no degrees of freedom left does not ensure that the theoretical and the observed distributions coincide. Can you still say something about the goodness-of-fit?
 
If you have three measurements and two parameters, you have one degree of freedom.

An example of zero degrees of freedom would be a linear fit to two points. In that case there is no goodness of fit information.
 
The theoretical distribution ##P(X = k)## is defined in this case for ##k = 2, 3, 4, \dots##. If I calculate chi square for the scores ##2, 3, 4## with expected values ##NP(X = 2)##, ##NP(X = 3)## and ##NP(X \geq 4)##, then I get 0.381. But if I compute chi square for the scores ##2, 3, 4## and ##5##, where ##f_{5} = 0##, with expected values ##NP(X = 2)##, ##NP(X = 3)##, ##NP(X = 4)## and ##NP(X \geq 5)##, then I get 3.719, and that would be significant with one degree of freedom.
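The significance claim for the two statistics quoted above can be checked directly. The sketch below is only an illustration, not part of the original post: it relies on the standard identity that a chi-square variable with one degree of freedom is the square of a standard normal, so its upper-tail probability is ##\operatorname{erfc}(\sqrt{x/2})##. The function name `chi2_sf_df1` is made up for this example.

```python
import math


def chi2_sf_df1(x: float) -> float:
    """Upper-tail probability P(X > x) for a chi-square variable with df = 1.

    Uses X = Z**2 with Z standard normal, so P(X > x) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(x / 2.0))


# The two statistics quoted in the post, both evaluated with df = 1.
for stat in (0.381, 3.719):
    print(f"chi2 = {stat:.3f}, df = 1, p = {chi2_sf_df1(stat):.4f}")
```

For reference, the 5% critical value for one degree of freedom is about 3.841, so 3.719 sits just below it: the result would count as significant at the 10% level but not at the conventional 5% level.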
 
I don't see how your second message has anything to do with your first. Your first message has 3 measurements and 2 parameters - i.e. one degree of freedom.
 
I thought the number of degrees of freedom (##df##) was equal to the number of classes minus the number of estimated parameters minus 1. So in this case for ##f_{2} = \, \, \, 21##, ##f_{3} = 111## and ##f_{4} = \, \, \, 24## one would expect ##df = 0## (##= 3-2-1##).
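The counting rule stated here can be written out as a small helper. This is a sketch of the usual textbook convention for a Pearson goodness-of-fit test on class counts, where one extra degree of freedom is lost because the expected counts are forced to add up to the same total ##N## as the observed counts. The function name `gof_degrees_of_freedom` is hypothetical.

```python
def gof_degrees_of_freedom(n_classes: int, n_estimated_params: int) -> int:
    """df = classes - estimated parameters - 1 (the -1 is the fixed total N)."""
    df = n_classes - n_estimated_params - 1
    if df < 0:
        raise ValueError("more constraints than classes")
    return df


# Three classes (k = 2, 3, >= 4) with two estimated parameters.
print(gof_degrees_of_freedom(3, 2))
# Four classes (k = 2, 3, 4, >= 5) with the same two parameters.
print(gof_degrees_of_freedom(4, 2))
```

Under this convention the three-class grouping leaves zero degrees of freedom and the four-class grouping leaves one, which is the count the 3.719 result relies on.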
 
I don't know what you mean by "class". You have X data points and Y fit parameters, so you have X-Y degrees of freedom. So if, as your OP says, you have 3 data points and 2 parameters in your model, you have one degree of freedom.
 
Ad VanderVen said:
but if I compute chi square for the scores ##2, 3, 4## and ##5##, where ##f_{5} = 0##, with expected values ##NP(X = 2)##, ##NP(X = 3)##, ##NP(X = 4)## and ##NP(X \geq 5)##, then I get 3.719, and that would be significant with one degree of freedom.

You are comparing two different probability models to the same data, so it isn't surprising that you get different measures of fit. The first probability model makes no prediction for X=4. The second one does.
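To see how regrouping the same counts changes the statistic, here is a minimal sketch. The thread never states the fitted distribution, so the probabilities in `hypothetical_probs` are invented purely for illustration and will not reproduce 0.381 or 3.719; only the observed counts 21, 111, 24 come from the thread.

```python
def pearson_chi2(observed, expected):
    """Pearson statistic: sum over classes of (O - E)**2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# HYPOTHETICAL probabilities for k = 2, 3, 4 and k >= 5 (not from the thread).
hypothetical_probs = [0.14, 0.70, 0.14, 0.02]

observed = [21, 111, 24]   # f_2, f_3, f_4 from the original post
n = sum(observed)          # N = 156

# Grouping A: classes 2, 3 and ">= 4" (the tail is merged into the last class).
p2, p3, p4, p5_plus = hypothetical_probs
expected_three = [n * p2, n * p3, n * (p4 + p5_plus)]
chi2_three = pearson_chi2(observed, expected_three)

# Grouping B: classes 2, 3, 4 and ">= 5", with an observed count of 0 in the tail.
expected_four = [n * p for p in hypothetical_probs]
chi2_four = pearson_chi2(observed + [0], expected_four)

print(f"three classes: {chi2_three:.3f}")
print(f"four classes:  {chi2_four:.3f}")
```

An empty class contributes exactly its expected count to the statistic, since ##(0 - E)^2 / E = E##, so a sparse tail class can dominate the result. In this made-up example the tail class has an expected count of about 3, below the common rule of thumb of at least 5 per class, which is one reason such classes are often merged.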
 
