Akaike information small sample AICc

mertcan · Dec 8, 2017

Hi, I am aware that the AICc value is $$ -2\,(\text{log-likelihood}) + 2K + \frac{2K(K+1)}{n-K-1} $$ where n is the sample size and K is the number of model parameters. But I do not know where the last term on the right-hand side comes from. The AIC value is $$ -2\,(\text{log-likelihood}) + 2K $$ so AICc adds a correction to AIC. In short, my question is: what is the derivation of the correction term $$ \frac{2K(K+1)}{n-K-1} $$ in AICc?
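For concreteness, the two formulas above can be written as a short Python sketch. The function names `aic` and `aicc` are illustrative, not taken from any library, and the guard on `n - k - 1` is an assumption added so the correction term stays finite and positive.

```python
def aic(log_likelihood: float, k: int) -> float:
    """AIC = -2 * (log-likelihood) + 2K."""
    return -2.0 * log_likelihood + 2.0 * k


def aicc(log_likelihood: float, k: int, n: int) -> float:
    """AICc = AIC + 2K(K+1)/(n-K-1), with n the sample size."""
    if n - k - 1 <= 0:
        # Assumption: refuse inputs where the correction is undefined or negative.
        raise ValueError("AICc requires n > k + 1")
    return aic(log_likelihood, k) + 2.0 * k * (k + 1) / (n - k - 1)


if __name__ == "__main__":
    # Hypothetical numbers: log-likelihood -10, K = 3 parameters, n = 20 observations.
    print(aic(-10.0, 3))
    print(aicc(-10.0, 3, 20))
```

Note that the whole AICc penalty, 2K + 2K(K+1)/(n-K-1), simplifies algebraically to 2nK/(n-K-1), which is sometimes the form in which it is derived.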

Stephen Tashi · Dec 9, 2017

Unfortunately, in searching the web, we find that the usual approach is just to define AIC by a formula and to define AICc by a different formula. However, the terminology "correction" suggests that both formulae are trying to compute a common quantity, whose definition is unstated. If we only consider history as the authority on definitions, we would have to read the original papers that defined the AIC and the AICc to see if the people who proposed the AIC and AICc defined a common quantity that these formulae are supposed to approximate.

If we go beyond history to seek a respectable definition for the AIC, the section "Model Selection Criterion" on page 7 of the presentation http://www4.ncsu.edu/~shu3/Presentation/AIC.pdf defines a quantity that is to be maximized. The particular formulae used to estimate that quantity could differ between types of models and situations (e.g. linear models with large samples vs. linear models with small samples). If we define the AIC abstractly as a quantity proportional to:

##E_y E_x [\log(g(x| \hat{\theta}(y)))]##

then, in different situations, the AIC can be given by different formulae.

I don't know what level of abstraction you are comfortable with. One can probably understand the formulae for the AIC and AICc by considering specific situations, but I won't try to figure this out myself unless someone else is really interested in participating!

mertcan · Dec 9, 2017

First of all, thanks for your reply. I know how to derive the AIC value without the correction. However, when the sample size is small relative to the number of parameters (n/k < 40, where k is the number of parameters and n is the sample size), it is said that we should use the corrected AIC, i.e. AICc. I wonder where the 40 comes from: which assumptions in the definition of AIC lead to the threshold n/k < 40? Could you help me see which assumptions produce n/k < 40 in the small-sample case?
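To get a feel for the size of the correction, here is a minimal sketch that compares the AICc correction term with the ordinary AIC penalty 2k. The ratio is 2k(k+1)/(n-k-1) divided by 2k, which is (k+1)/(n-k-1). The function name `relative_correction` and the sample values of n and k are illustrative assumptions. This only shows how fast the correction shrinks as n/k grows; it does not explain why 40 in particular is quoted.

```python
def relative_correction(n: int, k: int) -> float:
    """Ratio of the AICc correction term to the AIC penalty 2k.

    [2k(k+1)/(n-k-1)] / (2k) = (k+1)/(n-k-1)
    """
    if n - k - 1 <= 0:
        # Assumption: only report the ratio where the correction is defined.
        raise ValueError("requires n > k + 1")
    return (k + 1) / (n - k - 1)


if __name__ == "__main__":
    k = 5  # hypothetical number of parameters
    for ratio in (4, 10, 40, 100):
        n = ratio * k
        print(f"n/k = {ratio:3d}: correction is "
              f"{100 * relative_correction(n, k):.1f}% of the 2k penalty")
```

With k = 5 and n/k = 40 the correction adds about 3% to the 2k penalty, while at n/k = 4 it adds over 40%, so the correction matters mainly when n is not much larger than k.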

Stephen Tashi · Dec 10, 2017

I myself don't know where the number 40 comes from.

The articles I've found that bother to footnote the recommendation n/k < 40 cite Burnham KP, Anderson DR. Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach. 2nd ed. Springer-Verlag; New York: 2002. I don't have a copy of that book.

We could try to follow the derivation given in http://myweb.uiowa.edu/cavaaugh/doc/pub/aicaicc.pdf, starting on page 3. However, I don't see the number 40 mentioned in that document.
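As a sketch of how the correction term can arise, consider the special case of a normal linear regression. This is the standard small-sample argument for that case and is not necessarily the route taken in the linked document. Assumptions: the data are ##y = X\beta_0 + \epsilon## with ##\epsilon \sim N(0, \sigma_0^2 I)##, the fitted model contains the true model, there are p regression coefficients plus the variance (so K = p + 1), ##\hat{\beta}## and ##\hat{\sigma}^2## are the maximum likelihood estimates, and n - p > 2.

The maximized log-likelihood on the observed data satisfies

$$ -2\log L(\hat{\theta} \mid y) = n\log(2\pi\hat{\sigma}^2) + n $$

The quantity we actually want to estimate is the expected value of the same expression evaluated on independent replicate data z from the true model:

$$ E_z\left[-2\log L(\hat{\theta} \mid z)\right] = n\log(2\pi\hat{\sigma}^2) + \frac{n\sigma_0^2 + \|X\hat{\beta} - X\beta_0\|^2}{\hat{\sigma}^2} $$

Under the assumptions, ##n\hat{\sigma}^2/\sigma_0^2 \sim \chi^2_{n-p}## is independent of ##\|X\hat{\beta} - X\beta_0\|^2/\sigma_0^2 \sim \chi^2_p##, and ##E[1/\chi^2_m] = 1/(m-2)##. Taking the expectation over y of the last term gives

$$ E_y\left[\frac{n\sigma_0^2 + \|X\hat{\beta} - X\beta_0\|^2}{\hat{\sigma}^2}\right] = \frac{n(n+p)}{n-p-2} $$

so the amount by which ##-2\log L(\hat{\theta} \mid y)## underestimates the target is, on average,

$$ \frac{n(n+p)}{n-p-2} - n = \frac{2n(p+1)}{n-p-2} = \frac{2nK}{n-K-1} = 2K + \frac{2K(K+1)}{n-K-1} $$

The first piece is the usual AIC penalty and the second is the AICc correction. As n grows with K fixed, the second piece goes to zero, which recovers the AIC. Nothing in this argument singles out the number 40.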
