Goodness-of-fit chi-square test on GLMs in R

In summary, the GLMs in the link do not fit the data at all despite looking good when plotted. Chi-squared is not an appropriate distribution against which to test the residual sum of squares in either of these models.

Deimantas

Homework Statement

Hello, I've stumbled upon an interesting analysis of GLMs in R (http://www.magesblog.com/2015/08/generalised-linear-models-in-r.html).
However, it is only a graphical analysis, so I wanted to check that the models fit by using an actual statistical test. The test, however, suggests that the models do not fit at all, despite looking good when plotted.

icecream <- data.frame(
  temp  = c(11.9, 14.2, 15.2, 16.4, 17.2, 18.1,
            18.5, 19.4, 22.1, 22.6, 23.4, 25.1),
  units = c(185L, 215L, 332L, 325L, 408L, 421L,
            406L, 412L, 522L, 445L, 544L, 614L)
)

pois.mod <- glm(units ~ temp, data=icecream,
                family=poisson(link="log"))

# in the linked post, opportunity = market.size - units, with market.size = 800
market.size <- 800
icecream$opportunity <- market.size - icecream$units

bin.glm <- glm(cbind(units, opportunity) ~ temp, data=icecream,
               family=binomial(link="logit"))

The Attempt at a Solution

R commands and results:
1-pchisq(summary(pois.mod)$deviance, summary(pois.mod)$df.residual)
[1] 3.59619e-09

1-pchisq(summary(bin.glm)$deviance, summary(bin.glm)$df.residual)
[1] 6.850076e-14

Since both results are < 0.05, it would seem that neither the Poisson nor the binomial model fits at all. Am I doing something wrong? The models look good when plotted.

Unless I am missing something, chi-squared is not an appropriate distribution against which to test the residual sum of squares in either of these models. The chi-squared distribution is only applicable when the model's residuals are normally distributed. Using it when the underlying distribution is Poisson or binomial would produce unpredictable results, of unknown meaning.

Deimantas
andrewkirk said:
Unless I am missing something, chi-squared is not an appropriate distribution against which to test the residual sum of squares in either of these models. The chi-squared distribution is only applicable when the model's residuals are normally distributed. Using it when the underlying distribution is Poisson or binomial would produce unpredictable results, of unknown meaning.
That explains it. Thank you

Deimantas said:
Am I doing something wrong?

Someone would have to know R before that can be answered. If pchisq is a Pearson's chi-square goodness-of-fit test, how are the bin sizes and degrees of freedom established in that code?

The linear model in the link will have a good fit to that data. That does not necessarily imply that the residual errors will be normally distributed, although that is a basic statistical assumption if any probability conclusions are going to be asserted. Off the top of my head, I don't see what you are doing in your Chi-squared test. Comments would help a lot.

I assume that you are testing the Poisson and binomial models in the link. Let's set aside the fact that I am not convinced that the linear model is bad. The other models do not address the fact that the source of random variation may be unrelated to the basic functional form of the model. So testing the residual errors against the function used in the model may not be solid logic.

FactChecker said:
The linear model in the link will have a good fit to that data. That does not necessarily imply that the residual errors will be normally distributed, although that is a basic statistical assumption if any probability conclusions are going to be asserted. Off the top of my head, I don't see what you are doing in your Chi-squared test. Comments would help a lot.

I assume that you are testing the Poisson and binomial models in the link. Let's set aside the fact that I am not convinced that the linear model is bad. The other models do not address the fact that the source of random variation may be unrelated to the basic functional form of the model. So testing the residual errors against the function used in the model may not be solid logic.
True, and this problem applies just as much to linear models as to generalised linear models (GLMs) like the above. In a linear model, it is assumed that the idiosyncratic variation is normally distributed, and p-values and other fit measures rely on that assumption. In the above models, it is assumed that the idiosyncratic variation is distributed as Poisson and Bernoulli respectively. Parameters are estimated using maximum likelihood, and tests of parameter significance and fit use those distribution assumptions - except in the case of non-parametric tests, which would use no distributional assumptions.

The assumption of Poisson and Bernoulli variation introduces the possibility of model error, but no more so than does the assumption of normal variation in the case of a standard linear model.

The same applies to the transformation applied to the linear estimator, which is the exponential function for the Poisson case and the logistic function for the Bernoulli case. This introduces the possibility of further model error, but no more so than the implicit assumption used in linear modelling that the transformation is the identity function. A linear model is simply a GLM with an identity transformation ('link') function and an assumption that idiosyncratic variation is normally distributed.
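That last equivalence is easy to check directly in R. A minimal sketch, using made-up data, comparing lm() against glm() with a gaussian family and identity link:

```r
# A linear model is just a GLM with an identity link and gaussian errors
set.seed(42)                       # made-up illustrative data
x <- 1:10
y <- 3 + 2 * x + rnorm(10)

fit.lm  <- lm(y ~ x)
fit.glm <- glm(y ~ x, family = gaussian(link = "identity"))

all.equal(coef(fit.lm), coef(fit.glm))  # TRUE: identical coefficient estimates
```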

An appropriate parametric test of the above models would use distributions derived from those assumed in the model (Poisson and Bernoulli) - just as in OLS one can use chi-squared which is derived from the normal distribution.

andrewkirk said:
one can use chi-squared which is derived from the normal distribution.
Pearson's chi-squared goodness-of-fit test applies to any distribution. It compares the expected number of samples in bins to the number of actual test values in the bins. The first problem with applying it to this example is that the sample size is far too small. The R utility should have warned about that. A very small p-value is to be expected from such a small sample.
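For reference, Pearson's binned goodness-of-fit test can be run in R with chisq.test(), passing observed bin counts and hypothesised proportions. A minimal sketch with made-up counts (not the ice cream data):

```r
# Pearson's goodness-of-fit test: observed bin counts vs expected proportions
observed   <- c(18, 25, 22, 35)            # hypothetical bin counts, n = 100
expected.p <- c(0.20, 0.25, 0.25, 0.30)    # hypothesised proportions (sum to 1)

gof <- chisq.test(observed, p = expected.p)
gof$statistic   # Pearson X^2 = sum((O - E)^2 / E)
gof$parameter   # degrees of freedom = number of bins - 1, here 3
gof$p.value
```

With these numbers every expected count is at least 20, so R issues no small-sample warning; with smaller bins it would.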

Stephen Tashi said:
Someone would have to know R before that can be answered. If pchisq is a Person's chi-square test of goodness of fit, how are the bin sizes and degrees of freedom established in that code?
I can answer this question in a formal way, but alas not in a very helpful one.

pchisq(x,n) returns ##\chi^2_n(x)##: the value, at ##x##, of the CDF of a chi-square random variable with ##n## degrees of freedom.
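A quick sketch illustrating that pchisq is just a CDF (with qchisq as its inverse), not a test:

```r
# pchisq(x, n) is the CDF of a chi-square distribution with n degrees of
# freedom; qchisq is its inverse, so round-tripping recovers the probability
x95 <- qchisq(0.95, df = 10)   # 95th percentile, about 18.31
pchisq(x95, df = 10)           # 0.95 by construction
1 - pchisq(x95, df = 10)       # upper-tail probability, 0.05
```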

The second argument to the function in the OP is the degrees of freedom of the residuals, which is 10 = 12 - 2, since there are twelve data points and the model estimates two parameters - an intercept and a slope for the linear estimator.

The first argument to the function is the residual 'deviance' of the glm: twice the log likelihood of the data under a 'saturated' model (one with a parameter for every data point, so that it fits the observations exactly) minus twice the log likelihood of the data under the model estimated by the glm. (The 'null deviance' that summary also reports is the corresponding quantity for a model that forces the slope of the linear estimator to be zero - ie in which the dependent variable has the same conditional distribution for every value of the independent variable.)

There is no binning performed because this is not a chi-square test (which requires binning for continuous random variables) but rather a comparison of a statistic against a chi-square distribution.
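To make that concrete, the residual deviance that was fed to pchisq can be reproduced directly from log-likelihoods. A self-contained sketch for the Poisson model (re-creating the data and re-fitting the model here), using the fact that the deviance is twice the saturated-model log-likelihood minus twice the fitted-model log-likelihood:

```r
icecream <- data.frame(
  temp  = c(11.9, 14.2, 15.2, 16.4, 17.2, 18.1,
            18.5, 19.4, 22.1, 22.6, 23.4, 25.1),
  units = c(185L, 215L, 332L, 325L, 408L, 421L,
            406L, 412L, 522L, 445L, 544L, 614L)
)

pois.mod <- glm(units ~ temp, data = icecream, family = poisson(link = "log"))

y  <- icecream$units
mu <- fitted(pois.mod)

ll.fitted    <- sum(dpois(y, lambda = mu, log = TRUE))  # log-lik of fitted model
ll.saturated <- sum(dpois(y, lambda = y,  log = TRUE))  # one parameter per point

dev.manual <- 2 * (ll.saturated - ll.fitted)
all.equal(dev.manual, deviance(pois.mod))  # TRUE: the two deviances agree
```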

While I can see broadly why that deviance calc gives a number whose size is inversely related to the fit of the model, and that it makes sense to test the deviance against the CDF of some distribution in order to measure fit, I've done nowhere near enough GLMing to know what would be an appropriate distribution to use for the test. I tend to use p-values and AIC (Akaike Information Criterion) to assess fit for my GLMs - which so far have all been logistic/Bernoulli.
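As a sketch of the AIC approach mentioned above, the candidate models can be compared with R's built-in AIC() function (lower is better). This re-fits the models from the data in the original post; the binomial model's 'opportunity' column is constructed as in the linked article, assuming a market size of 800:

```r
icecream <- data.frame(
  temp  = c(11.9, 14.2, 15.2, 16.4, 17.2, 18.1,
            18.5, 19.4, 22.1, 22.6, 23.4, 25.1),
  units = c(185L, 215L, 332L, 325L, 408L, 421L,
            406L, 412L, 522L, 445L, 544L, 614L)
)
icecream$opportunity <- 800L - icecream$units  # market.size = 800 in the linked post

lin.mod  <- glm(units ~ temp, data = icecream, family = gaussian(link = "identity"))
pois.mod <- glm(units ~ temp, data = icecream, family = poisson(link = "log"))
bin.glm  <- glm(cbind(units, opportunity) ~ temp, data = icecream,
                family = binomial(link = "logit"))

AIC(lin.mod, pois.mod, bin.glm)  # table of df and AIC; lower AIC = better trade-off
```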

FactChecker said:
The R utility should have warned about that.
R does give warnings when chi square tests are performed with inadequate or inappropriate data. In this case there is no warning because no chi-square test was done by R - the test was performed by the user instead. All the pchisq utility does is return the value of a particular chi-square CDF, and it can have no way of knowing for what that value will be used.

The R command to perform a chi-square test is chisq.test, and that was not used here - nor could it have been used directly, because it takes a contingency table (or a vector of observed counts plus expected proportions) as input, and there is neither here.
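For contrast, a minimal sketch of the kind of input chisq.test does expect - a contingency table of counts (made-up numbers):

```r
# chisq.test on a 2x2 contingency table: Pearson's test of independence
tbl <- matrix(c(20, 30,
                25, 25), nrow = 2, byrow = TRUE,
              dimnames = list(group   = c("A", "B"),
                              outcome = c("yes", "no")))

res <- chisq.test(tbl)  # applies Yates' continuity correction for 2x2 tables
res$parameter           # df = (rows - 1) * (cols - 1) = 1
```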

FactChecker
andrewkirk said:
R does give warnings when chi square tests are performed with inadequate or inappropriate data. In this case there is no warning because no chi-square test was done by R - the test was performed by the user instead. All the pchisq utility does is return the value of a particular chi-square CDF, and it can have no way of knowing for what that value will be used.

The R command to perform a chi-square test is chisq.test, and that was not used here - nor could it have been used directly, because it takes a contingency table (or a vector of observed counts plus expected proportions) as input, and there is neither here.
Thanks. That does a lot to explain my confusion about what he was doing. In any case, there is not nearly enough data for that test to give a significant result.

What is a Goodness-of-fit chi-square test?

A Goodness-of-fit chi-square test is a statistical test used to determine whether there is a significant difference between the observed and expected frequencies in a categorical variable. It is commonly used in GLM (Generalized Linear Model) analysis to assess the fit of the model to the data.

How is a Goodness-of-fit chi-square test performed in R?

To perform a goodness-of-fit chi-square test in R, you can use the chisq.test() function, passing the observed frequencies and, via its p argument, the expected proportions (which must sum to 1). The function returns the chi-square test statistic, the degrees of freedom, and the p-value.
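A minimal sketch with made-up data - 100 coin flips tested against a fair-coin hypothesis:

```r
# goodness-of-fit: 100 coin flips vs the hypothesis of a fair coin
observed <- c(heads = 44, tails = 56)
res <- chisq.test(observed, p = c(0.5, 0.5))

res$statistic  # (44-50)^2/50 + (56-50)^2/50 = 1.44
res$p.value    # > 0.05, so no evidence against fairness
```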

What are the assumptions of a Goodness-of-fit chi-square test?

The main assumptions of a Goodness-of-fit chi-square test are that the data must be categorical, the sample must be random, and the expected frequencies should be at least 5 for each category. Additionally, the observations must be independent and the sample size should be large enough.

How do you interpret the results of a Goodness-of-fit chi-square test?

If the p-value is less than the chosen significance level (usually 0.05), it indicates that there is a significant difference between the observed and expected frequencies. In this case, we can reject the null hypothesis and conclude that the model does not fit the data well. On the other hand, if the p-value is greater than the significance level, it suggests that there is no significant difference, and we fail to reject the null hypothesis.

What are some potential limitations of a Goodness-of-fit chi-square test?

One limitation of a Goodness-of-fit chi-square test is that it can only be used for categorical data. Additionally, the test assumes that the expected frequencies are known and fixed, which may not always be the case. Furthermore, the test is sensitive to sample size, and small sample sizes may lead to inaccurate results. Finally, the test does not tell us anything about the direction or magnitude of the difference between the observed and expected frequencies, only whether there is a significant difference or not.