The chi square goodness-of-fit test with no degrees of freedom left

Ad VanderVen · Jan 7, 2021

I have an empirical frequency distribution as for example below:

##f_{2} = \, \, \, 21##
##f_{3} = 111##
##f_{4} = \, \, \, 24##

The theoretical distribution is determined by two parameters, so for a chi-square goodness-of-fit test there are actually no degrees of freedom left. Yet the theoretical distribution deviates from the observed distribution: having no degrees of freedom left does not ensure that the two coincide. Can you still say something about the goodness of fit?

Vanadium 50 · Jan 7, 2021

If you have three measurements and two parameters, you have one degree of freedom.

An example of zero degrees of freedom would be a linear fit to two points. In that case there is no goodness of fit information.
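To illustrate the two-point case, here is a minimal sketch in Python. The data points and the helper name `fit_line_two_points` are hypothetical, chosen only for illustration. A line has two parameters, so it passes exactly through any two points with distinct x values, and the residuals carry no information about the quality of the model.

```python
def fit_line_two_points(p1, p2):
    """Return (slope, intercept) of the line through two points with distinct x."""
    (x1, y1), (x2, y2) = p1, p2
    slope = (y2 - y1) / (x2 - x1)
    intercept = y1 - slope * x1
    return slope, intercept


# Hypothetical data: two measurements, two fit parameters.
points = [(1.0, 2.0), (3.0, 5.0)]
slope, intercept = fit_line_two_points(*points)

# Every residual vanishes, so any chi-square built from them is zero
# regardless of whether a straight line is the right model.
residuals = [y - (slope * x + intercept) for x, y in points]
print(residuals)
```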

Ad VanderVen · Jan 7, 2021

The theoretical distribution ##P(X = k)## is defined in this case for ##k = 2, 3, 4, \dots##. If I calculate chi-square for the scores ##2, 3, 4## with expected values ##NP(X = 2)##, ##NP(X = 3)## and ##NP(X \geq 4)##, then I get .381. But if I compute chi-square for the scores ##2, 3, 4## and ##5##, where ##f_{5} = 0##, with expected values ##NP(X = 2)##, ##NP(X = 3)##, ##NP(X = 4)## and ##NP(X \geq 5)##, then I get 3.719, and that would be significant with one degree of freedom.
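The effect of the two groupings can be sketched in Python. The thread does not state the fitted distribution, so the probabilities in `p` below are purely hypothetical placeholders and will not reproduce the values .381 and 3.719. Only the observed counts come from the thread. The sketch shows how splitting off an empty tail class changes the Pearson statistic.

```python
def pearson_chi2(observed, expected):
    """Pearson statistic: sum over classes of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


f = {2: 21, 3: 111, 4: 24}        # observed counts from the thread
N = sum(f.values())               # total sample size, 156

# HYPOTHETICAL model probabilities; the real two-parameter model is not given.
p = {2: 0.14, 3: 0.70, 4: 0.14}

# Grouping A: classes 2, 3 and ">= 4" (tail lumped into the last class).
expected_three = [N * p[2], N * p[3], N * (1 - p[2] - p[3])]
chi2_three = pearson_chi2([f[2], f[3], f[4]], expected_three)

# Grouping B: classes 2, 3, 4 and ">= 5", where the last class is empty.
tail = 1 - p[2] - p[3] - p[4]
expected_four = [N * p[2], N * p[3], N * p[4], N * tail]
chi2_four = pearson_chi2([f[2], f[3], f[4], 0], expected_four)

print(chi2_three, chi2_four)
```

A class with an observed count of zero contributes exactly its expected count to the statistic, since ##(0 - E)^2 / E = E##. With a small expected tail count, that single term can dominate the total.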

Vanadium 50 · Jan 7, 2021

I don't see how your second message has anything to do with your first. Your first message has 3 measurements and 2 parameters, i.e. one degree of freedom.

Ad VanderVen · Jan 7, 2021

I thought the number of degrees of freedom (##df##) was equal to the number of classes minus the number of estimated parameters minus 1. So in this case for ##f_{2} = \, \, \, 21##, ##f_{3} = 111## and ##f_{4} = \, \, \, 24## one would expect ##df = 0## (##= 3-2-1##).
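The counting rule quoted here can be sketched in Python, together with an upper-tail probability for the one-degree-of-freedom case. The function names are hypothetical. The tail formula relies on the fact that a chi-square variable with one degree of freedom is the square of a standard normal, so ##P(\chi^2_1 > x) = \operatorname{erfc}(\sqrt{x/2})##.

```python
import math


def gof_degrees_of_freedom(n_classes, n_estimated_params):
    """df for a chi-square goodness-of-fit test on binned counts with fixed N."""
    return n_classes - n_estimated_params - 1


def chi2_upper_tail_df1(x):
    """P(chi-square with 1 df > x), via the standard normal identity."""
    return math.erfc(math.sqrt(x / 2.0))


df_three_classes = gof_degrees_of_freedom(3, 2)   # classes 2, 3, ">= 4"
df_four_classes = gof_degrees_of_freedom(4, 2)    # classes 2, 3, 4, ">= 5"

# Tail probability for the four-class statistic reported in the thread.
p_value = chi2_upper_tail_df1(3.719)
print(df_three_classes, df_four_classes, p_value)
```

With three classes the rule leaves zero degrees of freedom, and with four classes it leaves one. For comparison, the 5% critical value of chi-square with one degree of freedom is about 3.84, so a statistic of 3.719 falls slightly below it.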

Vanadium 50 · Jan 7, 2021

I don't know what you mean by "class". You have X data points and Y fit parameters, so you have X-Y degrees of freedom. So if, as your OP says, you have 3 data points and 2 parameters in your model, you have one degree of freedom.

Stephen Tashi · Jan 7, 2021

Ad VanderVen said:

but if I compute chi square for the scores ##2, 3, 4## and ##5##, where ##f_{5} = 0## with expected value ##NP(X = 2)##, ##NP(X = 3)##, ##NP (X = 4)## and ##NP(X \geq 5)## then I get as a result 3.719 and that would be significant with one degree of freedom.

You are comparing two different probability models to the same data, so it isn't surprising that you get different measures of fit. The first probability model makes no prediction for X=4. The second one does.
