Akaike Information Criterion Vs Likelihood Ratio Test

In summary: Cross-validation is a way to test the generalizability of a model. You randomly divide the data into two parts, using one part to train the model and the other to test it. You then use the fitted model to predict the held-out testing data; a model that predicts the held-out data well is likely to generalize to new data.
  • #1
CGandC
TL;DR Summary
In both the Akaike Information Criterion and the Likelihood Ratio Test we compare likelihoods to see which model fits the empirical data better, so what's the difference between the two?
Hello,

I want to understand the difference between these two goodness-of-fit methods; I would be glad if you could help me:

The Akaike Information Criterion is defined as:

## AIC_i = -2\log(L_i) + 2K_i ##

where ##L_i## is the likelihood function for distribution model ##i## and ##K_i## is the number of parameters of that model. For example, the exponential distribution has only ##\lambda##, so ##K_{exponential} = 1##.
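For concreteness, here is a minimal Python/SciPy sketch of this computation on placeholder data (the exponential is just an example; any fitted model works the same way):

```python
# Minimal sketch (placeholder data, assuming SciPy): compute
# AIC_i = -2*log(L_i) + 2*K_i for an exponential model fitted by maximum likelihood.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=500)   # placeholder for the empirical data

# Fit the exponential with the location fixed at 0, so lambda (via the scale)
# is the only free parameter, i.e. K_exponential = 1.
loc, scale = stats.expon.fit(x, floc=0)
log_L = np.sum(stats.expon.logpdf(x, loc=loc, scale=scale))
K = 1
AIC_exponential = -2.0 * log_L + 2.0 * K
print(AIC_exponential)
```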

So if I want to know which distribution better fits the empirical data, I see which AIC is higher and choose the representative distribution for that high AIC.

The Likelihood Ratio Test is defined as follows (according to Clauset et al., "Power-law distributions in empirical data"):

" The basic idea behind the likelihood ratio test is to compute the likelihood of
the data under two competing distributions. The one with the higher likelihood is
then the better fit. Alternatively, one can calculate the ratio of the two likelihoods,
or equivalently the logarithm R of the ratio, which is positive or negative depending
on which distribution is better, or zero in the event of a tie."

Bottom line:
We can see that in both the Akaike Information Criterion and the likelihood ratio test, I essentially compare the likelihood functions of different distribution models and choose the bigger one, which indicates the better-fitting distribution. Using both methods as described above gave me very similar results in MATLAB.
So I don't really understand the difference between the two methods. Maybe I'm understanding them wrong?
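For reference, a minimal Python/SciPy sketch of that comparison on placeholder data (exponential vs. lognormal chosen only as an example pair):

```python
# Minimal sketch (placeholder data, assuming SciPy): the basic likelihood ratio
# comparison described above, here between an exponential and a lognormal model.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.lognormal(mean=0.5, sigma=0.8, size=500)   # placeholder for the empirical data

# Maximized log-likelihood of the exponential model
loc_e, scale_e = stats.expon.fit(x, floc=0)
logL_exp = np.sum(stats.expon.logpdf(x, loc=loc_e, scale=scale_e))

# Maximized log-likelihood of the lognormal model
s, loc_l, scale_l = stats.lognorm.fit(x, floc=0)
logL_lognorm = np.sum(stats.lognorm.logpdf(x, s, loc=loc_l, scale=scale_l))

# Log likelihood ratio R: positive favours the exponential, negative the lognormal,
# zero in the event of a tie.
R = logL_exp - logL_lognorm
print(logL_exp, logL_lognorm, R)
```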
 
  • #2
CGandC said:
So if I want to know which distribution better fits the empirical data, I see which AIC is higher
Did you mean to say "lower"?
 
  • #3
Stephen Tashi said:
Did you mean to say "lower"?

Yes, sorry.
 
  • #4
A practical distinction is that a larger number of free parameters in a model makes its AIC higher, thus making the model less preferable. The log-likelihood ratio isn't directly affected by the number of free parameters in the models.
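To make that concrete with a hypothetical pair of models: if model 2 has one extra free parameter (##K_2 = K_1 + 1##), then ##AIC_2 - AIC_1 = 2(K_2 - K_1) - 2(\log L_2 - \log L_1) = 2 - 2(\log L_2 - \log L_1)##, so AIC prefers the larger model only if its maximized log-likelihood is more than 1 unit higher, whereas the raw likelihood ratio carries no such penalty.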
 
  • #5
I found another distinction:

The likelihood ratio test is defined for testing distributions that are nested.

For example, I can use this test to determine whether a power-law PDF or a power-law-with-cutoff PDF fits the data better, because the power-law PDF is nested within the power-law-with-cutoff PDF.

However, the Akaike Information Criterion is general and has no nesting restriction like the likelihood ratio test has, though the AIC values it produces are indeed large when computed.
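As a sketch of the nested case (assuming SciPy, and using the exponential nested inside the Weibull as a hypothetical stand-in for the power-law / power-law-with-cutoff pair), Wilks' asymptotic result gives a chi-squared reference distribution for the test statistic:

```python
# Sketch of a nested likelihood ratio test (placeholder data, assuming SciPy).
# The exponential is a Weibull with shape c = 1, so the two models are nested;
# this stands in for the power-law vs. power-law-with-cutoff comparison.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
x = 2.0 * rng.weibull(1.4, size=500)   # placeholder for the empirical data

# Restricted (nested) model: exponential, one free parameter
loc_e, scale_e = stats.expon.fit(x, floc=0)
logL_restricted = np.sum(stats.expon.logpdf(x, loc=loc_e, scale=scale_e))

# Full model: Weibull, two free parameters (shape and scale)
c, loc_w, scale_w = stats.weibull_min.fit(x, floc=0)
logL_full = np.sum(stats.weibull_min.logpdf(x, c, loc=loc_w, scale=scale_w))

# Under the null that the restricted model is adequate, 2*(logL_full - logL_restricted)
# is asymptotically chi-squared with df = difference in free parameters (here 1).
D = 2.0 * (logL_full - logL_restricted)
p_value = stats.chi2.sf(D, df=1)
print(D, p_value)
```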

Besides these main distinctions:
What are some advantages and disadvantages of using AIC over LRT (likelihood ratio test)?
 
  • #6
CGandC said:
I found another distinction:

The likelihood ratio test is defined for testing distributions that are nested.

Whether that's true is a matter of vocabulary. Under certain assumptions, the distribution of the likelihood ratio (as a statistic) is asymptotically chi-squared. Without the assumption that the distributions are nested, the statistic may not have a chi-squared distribution, but it may still be possible to estimate its distribution.

Some articles only use the term "likelihood ratio test" in the case of nested models. Other articles use the term in a more general sense - e.g. the abstract of https://besjournals.onlinelibrary.wiley.com/doi/full/10.1111/j.2041-210X.2010.00063.x (I haven't read the article itself.)
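One generic way to estimate that distribution (a rough sketch, assuming SciPy, with exponential vs. lognormal as a hypothetical non-nested pair) is a parametric Monte Carlo: refit and recompute the ratio on data simulated from one of the fitted models:

```python
# Rough sketch (placeholder data, assuming SciPy): estimate the distribution of the
# log likelihood ratio R by simulating from one fitted model, instead of relying on
# the asymptotic chi-squared result.
import numpy as np
from scipy import stats

def log_lik_ratio(data):
    """R = logL(exponential) - logL(lognormal), both fitted by maximum likelihood."""
    loc_e, scale_e = stats.expon.fit(data, floc=0)
    s, loc_l, scale_l = stats.lognorm.fit(data, floc=0)
    return (np.sum(stats.expon.logpdf(data, loc=loc_e, scale=scale_e))
            - np.sum(stats.lognorm.logpdf(data, s, loc=loc_l, scale=scale_l)))

rng = np.random.default_rng(3)
x = rng.exponential(scale=1.5, size=300)   # placeholder for the empirical data
R_obs = log_lik_ratio(x)

# Simulate data sets from the fitted exponential and recompute R for each one.
loc0, scale0 = stats.expon.fit(x, floc=0)
R_sim = np.array([
    log_lik_ratio(stats.expon.rvs(loc=loc0, scale=scale0, size=x.size, random_state=rng))
    for _ in range(200)
])

# Fraction of simulated R values at least as favourable to the lognormal as R_obs.
p_value = np.mean(R_sim <= R_obs)
print(R_obs, p_value)
```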
What are some advantages and disadvantages of using AIC over LRT (likelihood ratio test)?

Questions about what is a "better" or "good" statistical method are not mathematical questions unless considerable quantitative context is specified - for example, a utility or penalty function that evaluates the cost or benefit of making statistical decisions (right or wrong) on the basis of the statistical tests involved.

So you are probably asking for an answer based on empirical experience. The answer will vary from person to person and from field of study to field of study. You need to get advice from people studying the same thing that you do. If the question involves publishing something in a journal, you need to look at published articles to see which statistical methods are accepted for publication. Culture and tradition are also factors in choosing statistical methods.

I found these concise notes on model selection: https://www.stat.cmu.edu/~larry/=stat705/Lecture16.pdf They prefer using "cross-validation" to using the AIC.
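For completeness, a minimal holdout sketch of that alternative (placeholder data, assuming SciPy): fit each candidate on a training split and score it by its log-likelihood on the held-out split:

```python
# Minimal holdout sketch (placeholder data, assuming SciPy): score each candidate
# model by its log-likelihood on data that was not used for fitting.
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
x = rng.lognormal(mean=0.0, sigma=1.0, size=600)   # placeholder for the empirical data
rng.shuffle(x)
train, test = x[:400], x[400:]

# Exponential candidate
loc_e, scale_e = stats.expon.fit(train, floc=0)
score_exp = np.sum(stats.expon.logpdf(test, loc=loc_e, scale=scale_e))

# Lognormal candidate
s, loc_l, scale_l = stats.lognorm.fit(train, floc=0)
score_lognorm = np.sum(stats.lognorm.logpdf(test, s, loc=loc_l, scale=scale_l))

# The candidate with the higher held-out log-likelihood generalizes better.
print(score_exp, score_lognorm)
```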
 

What is the Akaike Information Criterion (AIC)?

The Akaike Information Criterion (AIC) is a statistical measure used to compare different statistical models based on their ability to predict a given data set. It takes into account both the goodness of fit of the model and the complexity of the model, and provides a way to balance these two factors in order to determine the most appropriate model for a given data set.

What is the Likelihood Ratio Test (LRT)?

The Likelihood Ratio Test (LRT) is a statistical test used to compare two nested models, where one model is a simplified version of the other. The test compares the likelihood of the data under each model, and determines whether the more complex model significantly improves the fit compared to the simpler model. It is often used to determine whether a particular variable should be included in a model or not.

What is the difference between AIC and LRT?

While both AIC and LRT are used to compare statistical models, they have different approaches. AIC takes into account both the goodness of fit and complexity of a model, while LRT only looks at the improvement in fit between two nested models. AIC is also used to compare non-nested models, while LRT can only be used for nested models.

Which one should I use - AIC or LRT?

It depends on the specific research question and the type of models being compared. AIC is more commonly used for model selection, as it can be used to compare non-nested models. LRT is useful for determining the significance of a particular variable in a model. In some cases, both AIC and LRT may be used in combination to provide a more comprehensive analysis.

Are AIC and LRT always reliable?

No, they are not always reliable. Both AIC and LRT have their limitations and assumptions, and should be used with caution. It is important to carefully consider the specific research question and the appropriateness of using these tests before applying them to a data set.
