Binned Maximum Likelihood fit in python?

Click For Summary
SUMMARY

The discussion focuses on implementing Maximum Likelihood fitting for binned and unbinned data in Python, specifically using the SciPy library. The user, ORF, initially attempted to use the curve_fit function from SciPy for Least-squares fits but encountered issues with the likelihood method. Key insights include the importance of correctly defining the probability density function (PDF) for the exponential distribution and the potential use of rv_histogram.fit for binned data fitting. The conversation highlights the need for precise implementation and understanding of statistical methods in Python.

PREREQUISITES
  • Familiarity with Python programming
  • Understanding of statistical methods, particularly Maximum Likelihood Estimation
  • Experience with SciPy library, specifically scipy.stats.rv_continuous
  • Knowledge of probability density functions (PDFs) and their implementation
NEXT STEPS
  • Research the rv_histogram.fit method for fitting binned data in Python
  • Explore the documentation for scipy.stats.rv_continuous to understand custom PDF definitions
  • Study Maximum Likelihood Estimation techniques in statistical modeling
  • Look for resources on implementing custom fitting functions in Python
USEFUL FOR

Data scientists, statisticians, and Python developers interested in advanced statistical modeling and fitting techniques for both binned and unbinned datasets.

ORF
Messages
169
Reaction score
18
TL;DR
Is there any built-in function to perform Binned Maximum Likelihood fit in python standard libraries?
Hi,

I have been using Python for a while now, but so far for Least-squares fits using curve_fit from Scipy.

I would like to start using Likelihood method to fit binned and unbinned data. I found some documentation in Scipy of how to implement unbinned likelihood fit, but I have not managed to make it work for a simple exponential...

[CODE lang="python" title="Unbinned likelihood fit"]
from scipy.stats import rv_continuous
import numpy as np

class myfunc_gen(rv_continuous):

"Exp distribution"

def _pdf(self, x,a):

return np.exp(x*a)

myfunc = myfunc_gen(name='exp')

a = 1.
x = myfunc.rvs(a, size=10)
a1, loc1, scale1 = myfunc.fit(x, a, floc=0, fscale=1)[/CODE]

I found that Pandas has some fit capabilities, but still quite limiting.

Question: is there any functionality in python equivalent to curve_fit from Scipy for Binned/Unbinned likelihood fits?

Thank you for your time.

Cheers,
ORF
 
Technology news on Phys.org
This is exactly what scipy-stats-rv-continuous-fit is for. Saying 'it doesn't work' is not going to find a solution, you need to be more specific:

Was the result not what you expected? Was it close but not accurate enough? Did it fail to execute? Does it run too slowly? Did your computer catch fire?
 
Hi,

pbuk said:
Was the result not what you expected? Was it close but not accurate enough? Did it fail to execute? Does it run too slowly? Did your computer catch fire?

In this case the result is wrong.
For other simple fitting function it complains either convergence is very slow or directly it reaches 100 iterations and it stops without converging. Probably there is something wrong with my code.

Still, this method is for unbinned data. Is there any method for binned data?

Thank you for your time.

Cheers,
ORF
 
ORF said:
In this case the result is wrong ... Probably there is something wrong with my code.
There is quite a lot wrong with your code. Take a look at how the expon distribution is defined in scipy's source. The key parts are:
Python:
    def _pdf(self, x):
        # expon.pdf(x) = exp(-x)
        return np.exp(-x)
Note that there is no scale parameter in there, _pdf must be defined with a scale factor of 1: you add the scale factor when creating an instance of the class or when calling its methods.

And note that the exponential PDF is not ## e^x ##.

Python:
expon = expon_gen(a=0.0, name='expon')
The exponential distribution is supported in ## [0, \infty) ## but the default support for a rv_continuous distribution is ## (-\infty, \infty) ##. This can be overriden with the (unhelpfully) named parameters a and b: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.rv_continuous.html

ORF said:
Still, this method is for unbinned data. Is there any method for binned data?
rv_histogram.fit?

I think you probably need a course or a book on statistical methods with Python, unfortunately I can't recommend any.
 
  • Like
Likes   Reactions: ORF
Thanks, but that example is exactly as the one in the documentation (replacing beta distro by exp). I would like to do unbinned and binned likelihood fits using a custom pdf/fitting function.

Thank you for your time.
 
pbuk said:
There is quite a lot wrong with your code. Take a look at how the expon distribution is defined in scipy's source. The key parts are:
Python:
    def _pdf(self, x):
        # expon.pdf(x) = exp(-x)
        return np.exp(-x)
Note that there is no scale parameter in there, _pdf must be defined with a scale factor of 1: you add the scale factor when creating an instance of the class or when calling its methods.

And note that the exponential PDF is not ## e^x ##.

Python:
expon = expon_gen(a=0.0, name='expon')
The exponential distribution is supported in ## [0, \infty) ## but the default support for a rv_continuous distribution is ## (-\infty, \infty) ##. This can be overriden with the (unhelpfully) named parameters a and b: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.rv_continuous.htmlrv_histogram.fit?

I think you probably need a course or a book on statistical methods with Python, unfortunately I can't recommend any.
Hi, thanks for your very complete explanation. After normalizing the pdf it converges nicely.

I hope I can continue from it.

Thanks,
ORF
 
  • Like
Likes   Reactions: pbuk
  • #10
ORF said:
I hope I can continue from it.
Do let us know how you get on with rv_histogram.fit for your binned data - I have never used it, but it looks like it should work similarly to the continuous fit once configured properly.
 

Similar threads

  • · Replies 6 ·
Replies
6
Views
7K
  • · Replies 6 ·
Replies
6
Views
3K
  • · Replies 4 ·
Replies
4
Views
2K
  • · Replies 6 ·
Replies
6
Views
7K
  • · Replies 1 ·
Replies
1
Views
3K
  • · Replies 1 ·
Replies
1
Views
2K
  • · Replies 1 ·
Replies
1
Views
1K
Replies
3
Views
1K
Replies
7
Views
2K