How to set up and interpret Chi^2 test results for my data?

liquidFuzz · Jun 20, 2024

I have a curve fit of a nonlinear function (a growth model). As a sanity check I run a chi2 test, but I'm not really sure how to set it up properly. My data consist of sample points and estimated points. In a chi2 test the variables are often referred to as observed and expected values. What do these correspond to in a chi2 test of a least-squares fit? In addition, if I get a really low chi2 value, is that always a good thing, i.e., is there nothing I should worry about close to the origin or similar?

Thanks!

mathguy_1995 · Jun 20, 2024

The ##\chi^2## test statistic is defined as $$\sum_{i} \frac{\left(y_i-e_i\right)^2}{e_i},$$
where ##y_i## are the observed values and ##e_i## are the expected values (say, if you want to test whether your data come from a binomial distribution, ##\text{Bin}\left(n,p\right)##, then ##e_i=n\cdot p##). After computing the value of the test statistic, you compare it to a table of critical values (according to the degrees of freedom you have); alternatively, you can compute the p-value of the test statistic and compare it to the significance level ##\alpha##. I hope I gave some kind of an answer to your question, as (tbh) I didn't understand it completely.
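A minimal sketch of the procedure described above, in Python. The counts are hypothetical (120 rolls of a die tested against a fair die), and `scipy.stats.chi2.sf` is used in place of a printed table; neither comes from the thread.

```python
from scipy.stats import chi2

# Hypothetical counts: 120 rolls of a die, tested against a fair die.
observed = [18, 22, 16, 25, 19, 20]
expected = [20.0] * 6  # e_i = n * p = 120 * (1/6)

# Test statistic: sum over bins of (y_i - e_i)^2 / e_i
statistic = sum((y - e) ** 2 / e for y, e in zip(observed, expected))

# Degrees of freedom: number of bins minus 1 (no parameters were estimated).
dof = len(observed) - 1

# p-value: probability of a statistic at least this large under the null.
p_value = chi2.sf(statistic, dof)

alpha = 0.05  # significance level, chosen here only for illustration
reject = p_value < alpha
print(f"chi2 = {statistic:.3f}, dof = {dof}, p = {p_value:.3f}, reject: {reject}")
```

If any parameters of the hypothesized distribution are estimated from the same data, the degrees of freedom are reduced by one for each estimated parameter.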

liquidFuzz · Jun 21, 2024

What are my observed values ##y_i## and expected values ##e_i## if the model was fitted with a least-squares method?

mathguy_1995 · Jun 21, 2024

The observed values are the data given, and the expected ones, if your model is, say, simple linear regression, are $$e_i=\hat{\beta}_0+\hat{\beta}_1x_i,$$
where ##\hat{\beta}_0## and ##\hat{\beta}_1## are the least-squares estimators.
The same holds analogously for any other kind of model, e.g. multiple regression.
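As a rough illustration of this mapping, the sketch below fits a straight line with `numpy.polyfit` and uses the fitted values as ##e_i##. The sample points are made up, and the statistic is the ##\sum_i (y_i-e_i)^2/e_i## form quoted earlier in the thread. That form is strictly meant for counts; for measured values with known uncertainties ##\sigma_i##, dividing by ##\sigma_i^2## instead is the usual variant.

```python
import numpy as np

# Hypothetical sample points (x_i, y_i).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([5.1, 7.9, 11.2, 13.8, 17.0])

# Least-squares estimates; polyfit returns the highest degree first.
beta1, beta0 = np.polyfit(x, y, deg=1)

# Expected values e_i from the fitted model.
e = beta0 + beta1 * x

# Observed values are the data y_i, expected values are the fitted e_i.
statistic = np.sum((y - e) ** 2 / e)

# Degrees of freedom: data points minus fitted parameters (two here).
dof = len(x) - 2
print(f"beta0 = {beta0:.3f}, beta1 = {beta1:.3f}, chi2 = {statistic:.5f}, dof = {dof}")
```

For a nonlinear growth model the same pattern applies: evaluate the fitted function at each ##x_i## to get ##e_i##, and subtract the number of fitted parameters from the number of data points.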

liquidFuzz · Jun 21, 2024

Thanks!

FactChecker · Jun 21, 2024

I am only familiar with the Chi-squared goodness of fit test, which compares the histogram of the data with the expected theoretical frequency distribution. This seems to be a different test.

gleem · Jun 26, 2024

liquidFuzz said:

If I get a really low chi2 test value, is that always a good thing, i.e., there's nothing I should worry about in close proximity to origin or such?

Look at a table of ChiSq per degree of freedom to see how reasonable a low value can be. It gives the probability that your value of ChiSq will be exceeded for the given degrees of freedom (number of data points minus the number of fitted parameters). If that probability is large, then the ChiSq is probably too low. One thing that can account for this is overestimating the uncertainties of the data.
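A small sketch of this check, assuming each data point has a known uncertainty `sigma` and using the weighted form ##\sum_i ((y_i-e_i)/\sigma_i)^2## (an assumption; the thread does not spell out which form is meant). The helper `chi2_per_dof` and all numbers are hypothetical. The same residuals are evaluated twice, once with a plausible uncertainty and once with an inflated one.

```python
import numpy as np
from scipy.stats import chi2

def chi2_per_dof(y, e, sigma, n_params):
    """Return (chi2, dof, chi2/dof, probability of exceeding chi2)."""
    y, e, sigma = (np.asarray(a, dtype=float) for a in (y, e, sigma))
    stat = float(np.sum(((y - e) / sigma) ** 2))
    dof = y.size - n_params  # data points minus fitted parameters
    return stat, dof, stat / dof, float(chi2.sf(stat, dof))

# Hypothetical fitted values and data from a two-parameter model.
e = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.1, 1.8, 3.15, 3.95, 5.1, 5.9]

# (a) uncertainty of 0.1 per point, (b) uncertainty overestimated as 0.5.
stat_a, dof_a, red_a, p_a = chi2_per_dof(y, e, 0.1, n_params=2)
stat_b, dof_b, red_b, p_b = chi2_per_dof(y, e, 0.5, n_params=2)

print(f"sigma=0.1: chi2/dof = {red_a:.3f}, P(exceed) = {p_a:.3f}")
print(f"sigma=0.5: chi2/dof = {red_b:.3f}, P(exceed) = {p_b:.3f}")
```

In case (b) the probability of exceeding the observed value is close to 1, which is the "too low" situation described above: the fit looks better than the stated uncertainties would allow.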

liquidFuzz · Jul 5, 2024

A question regarding zero entries in expected values.

Let's say I want to test whether a set of data could be considered normally distributed. How do I treat bins where the expected value is close to zero? Fewer bins, or just outright rejecting the hypothesis?

Edit, additional: if I instead test against the cumulative distribution, can I use that as a test?

FactChecker · Jul 5, 2024

liquidFuzz said:

A question regarding zero entries in expected values.

Let's say I want to test whether a set of data could be considered normally distributed. How do I treat bins where the expected value is close to zero? Fewer bins, or just outright rejecting the hypothesis?

Edit, additional: if I instead test against the cumulative distribution, can I use that as a test?

If you include those bins in your test, does it change the results? You can combine some bins so that they add up to non-zero expected numbers. If your hypothesized distribution has many bins with an expected count of zero and your sample has results in those bins, then the hypothesis might be rightfully rejected. It is not unusual for the extreme tails of an actual distribution to be different from a normal distribution. You will have to use your judgement, based on the situation, on what to do in that case.
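One possible way to combine bins is sketched below. The helper `merge_small_bins`, the counts, and the threshold of 5 expected counts per bin are all assumptions for illustration (5 is a common rule of thumb, not something stated in the thread). Adjacent bins are pooled from left to right until the pooled expected count reaches the threshold, and any leftover tail is folded into the last merged bin.

```python
def merge_small_bins(observed, expected, min_expected=5.0):
    """Pool adjacent bins until each merged bin has expected >= min_expected."""
    merged_obs, merged_exp = [], []
    acc_obs, acc_exp = 0, 0.0
    for o, e in zip(observed, expected):
        acc_obs += o
        acc_exp += e
        if acc_exp >= min_expected:
            merged_obs.append(acc_obs)
            merged_exp.append(acc_exp)
            acc_obs, acc_exp = 0, 0.0
    # Fold any leftover tail into the last merged bin (or keep it if alone).
    if acc_obs > 0 or acc_exp > 0:
        if merged_exp:
            merged_obs[-1] += acc_obs
            merged_exp[-1] += acc_exp
        else:
            merged_obs.append(acc_obs)
            merged_exp.append(acc_exp)
    return merged_obs, merged_exp

# Hypothetical histogram with thin tails.
observed = [1, 3, 7, 19, 9, 1, 0]
expected = [0.5, 2.0, 8.0, 20.0, 8.5, 0.8, 0.2]

obs_m, exp_m = merge_small_bins(observed, expected)
statistic = sum((o - e) ** 2 / e for o, e in zip(obs_m, exp_m))
print(obs_m, exp_m, round(statistic, 4))
```

After merging, the degrees of freedom should be counted from the merged bins, minus one, minus the number of distribution parameters estimated from the data.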

liquidFuzz · Jul 5, 2024

Thanks! I'll play around with merged bins and see if I get something useful out of it.

I was hoping for a clear yes or no...

BWV · Jul 6, 2024

If you have a growth model, linearize it by taking differences or log returns. If you don’t do this, the data won’t be stationary and most statistical tests won’t make sense. Look at geometric Brownian motion or ARIMA models for examples.
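A minimal sketch of this idea, assuming the growth model is a plain exponential ##y = a\,e^{bt}## (the thread does not say which growth model is used). Taking logs gives ##\log y = \log a + b\,t##, which is linear in ##t##, and the log returns ##\log y_{t+1} - \log y_t## are then constant for noise-free data. The data are synthetic and noise-free.

```python
import numpy as np

# Synthetic, noise-free exponential growth with a = 2.0 and b = 0.3.
t = np.arange(6, dtype=float)
y = 2.0 * np.exp(0.3 * t)

# Linearize: log y = log a + b * t, then fit a straight line.
log_y = np.log(y)
b, log_a = np.polyfit(t, log_y, deg=1)  # highest degree first
a = np.exp(log_a)

# Log returns: differences of consecutive log values.
log_returns = np.diff(log_y)

print(f"a = {a:.4f}, b = {b:.4f}")
print("log returns:", np.round(log_returns, 4))
```

Note that fitting in log space weights the errors differently from a direct nonlinear fit, so the two sets of estimates will generally differ on noisy data.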

Agent Smith · Aug 27, 2024

liquidFuzz said:

I was hoping for a clear yes or no...

You poor soul!

I read about the Chi-squared test as part of null hypothesis testing. It's interesting.

##\displaystyle \sum_i \frac{(y_i - e_i)^2}{e_i}## is the crux of it. Gracias @mathguy_1995
