Calculating log likelihood: Zero value of likelihood function

SUMMARY

The discussion focuses on calculating log likelihood values for hydrology data using various probability distributions, specifically addressing the issue of zero likelihood function values. The user employs Excel with the EasyFit add-in to compute likelihoods for eight candidate distributions, including normal, lognormal, and generalized extreme value. The primary concern is how to handle observations whose density values are vanishingly small, so that the log likelihood is undefined and the Akaike Information Criterion (AIC) cannot be computed. The user seeks advice on whether to exclude these problematic values from the analysis.

PREREQUISITES
  • Understanding of log likelihood calculations in statistical modeling
  • Familiarity with Akaike Information Criterion (AIC) for model selection
  • Experience using EasyFit add-in for Excel
  • Knowledge of probability distributions relevant to hydrology data
NEXT STEPS
  • Research methods for handling zero or undefined likelihood values in statistical analysis
  • Learn about robust statistical techniques for outlier detection in hydrology data
  • Explore advanced features of EasyFit for distribution fitting and model evaluation
  • Study the implications of AIC in model selection and alternatives like BIC
USEFUL FOR

Hydrologists, data analysts, and statisticians involved in modeling streamflow data and selecting appropriate probability distributions for analysis.

kirti1604
Hello,
I am analysing hydrology data and curve fitting to find the best probability distribution among 8 candidate distributions (2- and 3-parameter distributions).
The selection is based on the lowest AIC value.
While doing my calculation in Excel, how is it suggested to treat very low (approximately 0) likelihood function values, for which the log likelihood is undefined?
Should I just delete those values to get an AIC?
They are generally very low or very high values in the dataset (possibly outliers) which cause undefined values of the log likelihood.
(I can't delete those values because they are actual recorded data.) I would appreciate it if someone could suggest a treatment for such data, which give a zero likelihood whose log can't be determined.
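For intuition about where the exact zeros come from, here is a minimal Python sketch (using scipy rather than EasyFit, with made-up parameters and a made-up observation): the density of a far-out point underflows to exactly 0.0 in double precision, while evaluating the log density directly stays finite.

```python
import numpy as np
from scipy import stats

# Hypothetical fitted distribution; the parameters are made up.
dist = stats.norm(loc=100.0, scale=5.0)

extreme = 442.0  # an observation far out in the tail

# The density underflows to exactly 0.0 in double precision,
# so taking log(pdf) afterwards is undefined (log of zero).
print(dist.pdf(extreme))     # 0.0

# logpdf evaluates the log density directly and stays finite.
print(dist.logpdf(extreme))  # a large negative but finite number
```

Spreadsheets have the same floating-point limits, which is why a cell holding the pdf of an extreme observation can show 0 even though the true density is merely tiny, not zero.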
 
Can you give some more information on what your data is and what kind of distributions you are assuming for it? Naturally, very low likelihood indicates bad models in general.
 
Orodruin said:
Can you give some more information on what your data is and what kind of distributions you are assuming for it? Naturally, very low likelihood indicates bad models in general.

Thank you for your response.
My data is streamflow data.
I am testing 8 distributions: normal, lognormal, 3-parameter lognormal, generalised extreme value, gamma, gamma-3/Pearson type 3, log-Pearson type 3, and Gumbel.
These candidate distributions were selected based on a literature review.
To outline the steps for the calculation of the log likelihood in Excel (I got an add-in for the software called EasyFit, which has the pdfs for all the distributions I need):
For every data point I calculate the value of the likelihood function, using the density function.
So for any particular distribution, if I have 20 data points, I have 20 values of the likelihood function. Then I take the log of each value,
calculate the sum of the 20 log values,
and calculate the AIC value using the formula
AIC = 2k - 2(sum of log likelihoods),
where k is the number of parameters of the distribution.

I do the same steps for all distributions, using k = 2 for normal, lognormal, gamma, and Gumbel, and k = 3 for the others. I calculate 8 AIC values and select my distribution using the lowest AIC.
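The same procedure can be sketched outside Excel. This is only an illustration in Python with scipy (not EasyFit; scipy's fitting routines and parameterisations differ from EasyFit's, and the flow values below are made up), covering a subset of the eight candidates:

```python
import numpy as np
from scipy import stats

# Hypothetical streamflow values; the real data lives in the Excel sheets.
flows = np.array([12.0, 15.5, 9.8, 22.1, 18.3, 30.7, 11.2,
                  25.4, 14.9, 19.6, 8.5, 27.3, 16.1, 21.8])

# A subset of the candidate distributions, with parameter counts k.
candidates = {
    "normal": (stats.norm, 2),
    "gumbel": (stats.gumbel_r, 2),
    "gamma":  (stats.gamma, 3),
    "GEV":    (stats.genextreme, 3),
}

results = {}
for name, (dist, k) in candidates.items():
    params = dist.fit(flows)                   # maximum-likelihood fit
    # Sum the log densities directly: logpdf avoids the underflow that
    # turns tiny pdf values into an exact zero in a spreadsheet cell.
    loglik = np.sum(dist.logpdf(flows, *params))
    results[name] = 2 * k - 2 * loglik         # AIC = 2k - 2*logL
    print(f"{name:7s} AIC = {results[name]:8.2f}")

best = min(results, key=results.get)
print("lowest AIC:", best)
```

The key point is that summing `logpdf` values never requires taking the log of a number that has already underflowed to zero, so the AIC stays defined even for extreme observations.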

Now, the values which don't give a value of the likelihood function are generally the very low or very high values in the dataset, but I can't reject them as they are not outliers.
I am uploading 2 Excel sheets.
Station 1 was working fine, and it outlines the steps I am following.
Station 7 is causing trouble with the very high value (442) for the year 1950, as can be seen.
The columns next to it check for high and low outliers using Grubbs' test after log-transforming my data (a test chosen from the literature for streamflow analysis).
The 3 tables are the main tables for doing my log likelihood calculation, which is what I am concerned with.
The functions I used require an add-in that comes with installing EasyFit on 32-bit Office 2010.
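The Grubbs' test on log-transformed flows described above can also be sketched by hand; this is a generic two-sided Grubbs' test in Python (not EasyFit's routine), using made-up flows with 442 standing in for the suspect 1950 value:

```python
import numpy as np
from scipy import stats

def grubbs_two_sided(x, alpha=0.05):
    """Two-sided Grubbs' test for a single outlier.

    G compares the most extreme deviation from the mean against the
    sample standard deviation; the critical value comes from the
    Student t distribution. Returns (G, G_crit, is_outlier).
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    G = np.max(np.abs(x - x.mean())) / x.std(ddof=1)
    t = stats.t.ppf(1 - alpha / (2 * n), n - 2)
    G_crit = ((n - 1) / np.sqrt(n)) * np.sqrt(t**2 / (n - 2 + t**2))
    return G, G_crit, G > G_crit

# Hypothetical flows; 442 plays the role of the suspect 1950 value.
flows = np.array([12.0, 15.5, 9.8, 22.1, 18.3, 30.7, 11.2, 25.4, 442.0])

# Log-transform first, as in the thread, since streamflow is skewed.
G, G_crit, flagged = grubbs_two_sided(np.log(flows))
print(f"G = {G:.3f}, critical = {G_crit:.3f}, outlier: {flagged}")
```

Note that even a value flagged by Grubbs' test is not automatically safe to delete: a genuine recorded flood can be statistically extreme without being erroneous, which is exactly the dilemma described above.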
 

