Binned Maximum Likelihood fit in python?

ORF · Nov 2, 2021

Hi,

I have been using Python for a while now, but so far only for least-squares fits using curve_fit from SciPy.

I would like to start using the likelihood method to fit binned and unbinned data. I found some documentation in SciPy on how to implement an unbinned likelihood fit, but I have not managed to make it work for a simple exponential...

[CODE lang="python" title="Unbinned likelihood fit"]
from scipy.stats import rv_continuous
import numpy as np

class myfunc_gen(rv_continuous):
    "Exp distribution"

    def _pdf(self, x, a):
        return np.exp(x*a)

myfunc = myfunc_gen(name='exp')

a = 1.
x = myfunc.rvs(a, size=10)
a1, loc1, scale1 = myfunc.fit(x, a, floc=0, fscale=1)[/CODE]

I found that Pandas has some fitting capabilities, but they are still quite limited.

Question: is there any functionality in Python equivalent to curve_fit from SciPy for binned/unbinned likelihood fits?

Thank you for your time.

Cheers,
ORF

jedishrfu · Nov 2, 2021

This sounds like something that would be somewhere in SciPy or some specialty library, but not in the standard Python library.

I did find this code on stackexchange

https://stats.stackexchange.com/questions/66199/maximum-likelihood-curve-model-fitting-in-python

pbuk · Nov 3, 2021

This is exactly what scipy.stats.rv_continuous.fit is for. Saying 'it doesn't work' is not going to find a solution; you need to be more specific:

Was the result not what you expected? Was it close but not accurate enough? Did it fail to execute? Does it run too slowly? Did your computer catch fire?

ORF · Nov 3, 2021

Hi,

jedishrfu said:

I did find this code on stackexchange

https://stats.stackexchange.com/questions/66199/maximum-likelihood-curve-model-fitting-in-python

Thanks! It is a nice starting point. However, this way of defining our own fitter seems quite primitive (for example, I would need to write another chunk of code for calculating the covariance matrix).

Is there any method like curve_fit but using Maximum Likelihood?

Thank you for your time.

Cheers.
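One way the curve_fit-style workflow asked about here might be approximated is to minimize the negative log-likelihood with scipy.optimize.minimize and estimate the covariance matrix from the inverse of a numerically computed Hessian at the minimum. The sketch below is only an illustration under assumptions: the helper names (exp_pdf, nll, numerical_hessian), the exponential test model, the sample size, and the finite-difference step are all hypothetical choices and are not taken from the thread or from SciPy.

```python
import numpy as np
from scipy.optimize import minimize

def exp_pdf(x, lam):
    # Normalized exponential PDF on [0, inf); swap in any normalized custom PDF
    return lam * np.exp(-lam * x)

def nll(params, x):
    # Unbinned negative log-likelihood: -sum(log f(x_i; params))
    lam = params[0]
    if lam <= 0:
        return np.inf  # keep the optimizer inside the valid parameter region
    return -np.sum(np.log(exp_pdf(x, lam)))

def numerical_hessian(f, p, rel_step=1e-4):
    # Central finite-difference Hessian of a scalar function f at point p
    p = np.asarray(p, dtype=float)
    n = p.size
    h = rel_step * np.maximum(np.abs(p), 1.0)
    hess = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n)
            ej = np.zeros(n)
            ei[i] = h[i]
            ej[j] = h[j]
            hess[i, j] = (f(p + ei + ej) - f(p + ei - ej)
                          - f(p - ei + ej) + f(p - ei - ej)) / (4.0 * h[i] * h[j])
    return hess

# Toy data (assumption): exponential with true rate 2.0
rng = np.random.default_rng(0)
data = rng.exponential(scale=1.0 / 2.0, size=2000)

result = minimize(nll, x0=[1.0], args=(data,), method="Nelder-Mead")
popt = result.x

# Asymptotic covariance estimate: inverse Hessian of the NLL at the minimum
pcov = np.linalg.inv(numerical_hessian(lambda p: nll(p, data), popt))
perr = np.sqrt(np.diag(pcov))

print("lam =", popt[0], "+/-", perr[0])
```

Here popt and pcov play the same role as the two return values of curve_fit. The covariance is only the usual large-sample approximation, so it should be treated with care for small samples or for parameters close to a boundary.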

ORF · Nov 3, 2021

Hi,

pbuk said:

Was the result not what you expected? Was it close but not accurate enough? Did it fail to execute? Does it run too slowly? Did your computer catch fire?

In this case the result is wrong.
For other simple fitting functions it complains either that convergence is very slow, or it reaches 100 iterations and stops without converging. Probably there is something wrong with my code.

Still, this method is for unbinned data. Is there any method for binned data?

Thank you for your time.

Cheers,
ORF

pbuk · Nov 3, 2021

ORF said:

In this case the result is wrong ... Probably there is something wrong with my code.

There is quite a lot wrong with your code. Take a look at how the expon distribution is defined in scipy's source. The key parts are:

Python:

    def _pdf(self, x):
        # expon.pdf(x) = exp(-x)
        return np.exp(-x)

Note that there is no scale parameter in there, _pdf must be defined with a scale factor of 1: you add the scale factor when creating an instance of the class or when calling its methods.

And note that the exponential PDF is not ## e^x ##.

Python:

expon = expon_gen(a=0.0, name='expon')

The exponential distribution is supported in ## [0, \infty) ## but the default support for a rv_continuous distribution is ## (-\infty, \infty) ##. This can be overridden with the (unhelpfully) named parameters a and b: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.rv_continuous.html
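If the rate is instead kept as a shape parameter, as in the original attempt, the points above (a normalized PDF and support starting at 0) could be combined roughly as follows. This is a minimal sketch, not scipy's own implementation: the class name RateExpon, the shape parameter name lam, the seed, and the sample size are assumptions made for illustration.

```python
import numpy as np
from scipy.stats import rv_continuous

class RateExpon(rv_continuous):
    """Exponential distribution with a rate shape parameter lam (hypothetical)."""

    def _pdf(self, x, lam):
        # Normalized on [0, inf): integrates to 1 for any lam > 0
        return lam * np.exp(-lam * x)

    def _cdf(self, x, lam):
        return 1.0 - np.exp(-lam * x)

    def _ppf(self, q, lam):
        # Closed-form inverse CDF, so rvs() does not need numerical inversion
        return -np.log1p(-q) / lam

# a=0.0 sets the lower bound of the support; the upper bound stays at +inf
rate_expon = RateExpon(a=0.0, name="rate_expon")

rng = np.random.default_rng(12345)
data = rate_expon.rvs(2.0, size=5000, random_state=rng)

# Start the shape parameter at 1.0; loc and scale are held fixed because the
# rate and the scale would otherwise be degenerate with each other
lam_hat, loc_hat, scale_hat = rate_expon.fit(data, 1.0, floc=0, fscale=1)
print(lam_hat)
```

For this model the maximum likelihood estimate of the rate is 1/mean(data), which gives a quick cross-check on the fitted value.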

ORF said:

Still, this method is for unbinned data. Is there any method for binned data?

rv_histogram.fit?

I think you probably need a course or a book on statistical methods with Python, unfortunately I can't recommend any.
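For the binned case, one common alternative to a ready-made fit method is to write the binned likelihood by hand: the expected fraction of events in each bin comes from differences of the model CDF at the bin edges, and the observed counts enter through a multinomial log-likelihood. The sketch below assumes an exponential model and hypothetical helper names (exp_cdf, binned_nll); it is not based on rv_histogram and only illustrates the idea.

```python
import numpy as np
from scipy.optimize import minimize

def exp_cdf(x, lam):
    # CDF of the exponential model; replace with the CDF of a custom model
    return 1.0 - np.exp(-lam * x)

def binned_nll(params, counts, edges):
    # Multinomial negative log-likelihood (parameter-independent terms dropped)
    lam = params[0]
    if lam <= 0:
        return np.inf
    probs = np.diff(exp_cdf(edges, lam))   # model probability in each bin
    probs = probs / probs.sum()            # condition on the histogram range
    probs = np.clip(probs, 1e-300, None)   # avoid log(0) in empty model bins
    return -np.sum(counts * np.log(probs))

# Toy data (assumption): exponential with true rate 2.0, histogrammed on [0, 3]
rng = np.random.default_rng(1)
data = rng.exponential(scale=1.0 / 2.0, size=5000)
counts, edges = np.histogram(data, bins=30, range=(0.0, 3.0))

result = minimize(binned_nll, x0=[1.0], args=(counts, edges), method="Nelder-Mead")
lam_hat = result.x[0]
print(lam_hat)
```

Because the bin probabilities are renormalized over the histogram range, events falling outside [0, 3] are simply ignored rather than biasing the estimate. A Poisson likelihood per bin would be the natural variant if the total number of events is also a fit parameter.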

jedishrfu · Nov 3, 2021

Google for the point:

https://ipython-books.github.io/75-...n-to-data-with-the-maximum-likelihood-method/

and possibly more points here:

https://www.google.com/search?q=bin...hUKEwiig6GYlfzzAhVdkmoFHX5jAAAQ4dUDCAg&uact=5

and lastly, Towards Data Science has this:

https://towardsdatascience.com/fitting-glms-by-hand-189c02af33a8

ORF · Nov 3, 2021

Thanks, but that example is exactly the same as the one in the documentation (with the beta distribution replaced by an exponential). I would like to do unbinned and binned likelihood fits using a custom pdf/fitting function.

Thank you for your time.

ORF · Nov 3, 2021

pbuk said:

There is quite a lot wrong with your code. Take a look at how the expon distribution is defined in scipy's source. The key parts are:

Python:

    def _pdf(self, x):
        # expon.pdf(x) = exp(-x)
        return np.exp(-x)

Note that there is no scale parameter in there, _pdf must be defined with a scale factor of 1: you add the scale factor when creating an instance of the class or when calling its methods.

And note that the exponential PDF is not ## e^x ##.

Python:

expon = expon_gen(a=0.0, name='expon')

The exponential distribution is supported in ## [0, \infty) ## but the default support for a rv_continuous distribution is ## (-\infty, \infty) ##. This can be overridden with the (unhelpfully) named parameters a and b: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.rv_continuous.html

rv_histogram.fit?

I think you probably need a course or a book on statistical methods with Python, unfortunately I can't recommend any.

Hi, thanks for your very complete explanation. After normalizing the pdf it converges nicely.

I hope I can continue from it.

Thanks,
ORF

pbuk · Nov 3, 2021

ORF said:

I hope I can continue from it.

Do let us know how you get on with rv_histogram.fit for your binned data - I have never used it, but it looks like it should work similarly to the continuous fit once configured properly.
