[Mathematica] FindFit/NonlinearModelFit with non-gaussian residuals

FunkyDwarf · Mar 2, 2014

Hi,

I note that in the 'more info' for NonlinearModelFit it says that it assumes the values are normally distributed around the mean response function y, which I understand is required if one wants to use maximum likelihood methods and construct confidence intervals etc.

However, there appears to be no such mention in the FindFit documentation, and my understanding (which may be way off) is that Gaussian residuals aren't so important if you only want to estimate parameters; they matter if you want to do confidence/inference stuff.

Is this correct? If so, why, when I transform my function (and data), do a fit and then transform back, do I get different parameter values compared to just fitting the 'naked' untransformed model and data? Is this due to some artefact of the algorithm being used (in this case NMinimize), or is it a deeper issue? Is there not a one-to-one mapping between the sum of the squared residuals and the parameters that minimize it?

Thanks in advance!

Bill Simpson · Mar 3, 2014

I always assume that documentation may contain errors and/or omissions. Perhaps this explains the difference in the documentation you describe.

If you do not care how accurate an estimate is then I suppose it doesn't matter what methods are used or assumptions are required, just say all the estimates are about zero.

If you do a non-linear transformation then all the errors between the measured point and the unknown model will be changed and any estimation process will use those changed values.

Imagine your model is y ~= 100 and your measured data points lie between 1 and 1000. If you average all your data the errors lie between +900 and -99. But if you do a log10 transform on your data the errors lie between +1 and -2. One attempt or the other, or perhaps both, are going to make some very questionable calculations with those errors.

Try generating a few uniformly distributed random numbers between 1 and 1000 and take the mean. Compare that with taking the log of each point, taking the mean of the logs, and then taking the antilog. Sometimes the results are close. Sometimes they are not. But that simple example can show some of what is happening with transforms.
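The experiment described above can be sketched in a few lines. This is a minimal illustration in Python rather than Mathematica; the seed and the sample size of 20 are arbitrary assumptions, not values from the thread.

```python
import math
import random

random.seed(0)  # arbitrary fixed seed so the run is repeatable

# A few uniformly distributed random numbers between 1 and 1000.
data = [random.uniform(1, 1000) for _ in range(20)]

# Plain mean of the raw values.
arithmetic_mean = sum(data) / len(data)

# Mean of the log10 values, then the antilog: this is the geometric mean.
log_mean = sum(math.log10(v) for v in data) / len(data)
geometric_mean = 10 ** log_mean

print("mean of raw data:        ", arithmetic_mean)
print("antilog of mean of logs: ", geometric_mean)
```

The second number is the geometric mean of the data, which can never exceed the arithmetic mean. For a constant model, the arithmetic mean is the least-squares estimate on the raw data and the geometric mean is the least-squares estimate on the logged data, so the two "fits" answer different questions.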

As you noted, there is a great deal of mathematics and assumptions behind the scenes that are often not appropriately explained when dealing with fitting models and dealing with errors.

FunkyDwarf · Mar 3, 2014

Hi Bill,

Thanks for the response. I suspected as much, specifically that nonlinear transforms are going to cause headaches, but it was not so much the individual values, or even certain statistics, that I was after, but more the general approach of, say, least squares. For example, it seems that if you minimize the sum of the squared residuals (SSR) for the original function and for the transformed function, the 'best fit' parameters are different, i.e. the minima occur at different points.
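A small sketch of this effect, assuming a hypothetical one-parameter model y = a*exp(x) with additive noise (the data, seed, and model are invented for illustration and are not the model from this thread). Both minimizers have closed forms here, so no optimizer is involved and any difference cannot be blamed on the algorithm.

```python
import math
import random

random.seed(1)  # arbitrary fixed seed so the run is repeatable

# Hypothetical data: y = 2*exp(x) plus additive uniform noise.
# The noise is bounded by 0.5, so every y stays positive and log(y) exists.
xs = [0.25 * i for i in range(17)]
ys = [2.0 * math.exp(x) + random.uniform(-0.5, 0.5) for x in xs]

def ssr_direct(a):
    """SSR for the untransformed model y = a*exp(x)."""
    return sum((y - a * math.exp(x)) ** 2 for x, y in zip(xs, ys))

def ssr_log(a):
    """SSR for the log-transformed model log(y) = log(a) + x."""
    return sum((math.log(y) - math.log(a) - x) ** 2 for x, y in zip(xs, ys))

# Exact minimizer of ssr_direct (the model is linear in a).
a_direct = (sum(y * math.exp(x) for x, y in zip(xs, ys))
            / sum(math.exp(2 * x) for x in xs))

# Exact minimizer of ssr_log: log(a) is the mean of log(y) - x.
a_log = math.exp(sum(math.log(y) - x for x, y in zip(xs, ys)) / len(xs))

print("direct fit: a =", a_direct, " SSR =", ssr_direct(a_direct))
print("log fit:    a =", a_log, " SSR (original space) =", ssr_direct(a_log))
```

Each estimate is optimal for its own objective and generally not for the other, which is the behaviour described above: the two SSR surfaces are different functions of the parameter.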

After thinking about it a bit this is perhaps not so strange since you can always concoct some weird nonlinear transform that squishes your function and data in different ways so that your notion of 'distance' between function and data isn't conserved across your set when you move from one function to the transformed function.
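One way to make this "squishing" concrete is a first-order expansion. The notation here is introduced for illustration: f(x_i; θ) is the model, g is the transform applied to both data and model, and the approximation assumes the residuals are small.

```latex
g(y_i) - g\bigl(f(x_i;\theta)\bigr)
  \approx g'\bigl(f(x_i;\theta)\bigr)\,\bigl(y_i - f(x_i;\theta)\bigr)
\quad\Longrightarrow\quad
\sum_i \Bigl[g(y_i) - g\bigl(f(x_i;\theta)\bigr)\Bigr]^2
  \approx \sum_i g'\bigl(f(x_i;\theta)\bigr)^2\,\bigl(y_i - f(x_i;\theta)\bigr)^2 .
```

To this order, the transformed SSR is a weighted version of the original SSR with weights g'(f)^2. For g = log the weights are 1/f^2, so points where the model is small count far more than points where it is large, and the minimum moves accordingly.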

I guess what I'm getting at is that minimizing your SSR seems intuitively like a reasonable metric by which to determine your best fit, but it seems arbitrary when you consider the number of transforms you could perform (of course most aren't sensible). Is there any 'global' approach one can use? Presumably this would bring us back to maximum likelihood and all its baggage?

Thanks again!

Bill Simpson · Mar 3, 2014

Minimizing your SSR may seem reasonable, but that probably depends on many unstated assumptions: that the residuals have a symmetric and perhaps even Gaussian distribution, that they are homoscedastic, and that you have a parametric statistical problem as opposed to a non-parametric one. I suspect the list might even be longer. All those things seem to be ignored when the mechanical process of grinding out a sum of squares is introduced.
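The link between these assumptions and the sum of squares can be written out. Assume (notation introduced for illustration) that y_i = f(x_i; θ) + ε_i with the ε_i independent, Gaussian, zero-mean, and sharing one variance σ². The log-likelihood is then

```latex
\log L(\theta,\sigma^2)
  = -\frac{n}{2}\,\log\bigl(2\pi\sigma^2\bigr)
    - \frac{1}{2\sigma^2}\sum_{i=1}^{n}\bigl(y_i - f(x_i;\theta)\bigr)^2 ,
```

so maximizing the likelihood over θ is the same as minimizing the SSR. If the errors are not Gaussian or not of equal variance in the space where you fit, the least-squares estimate is still a well-defined number, but it is no longer the maximum likelihood estimate.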

You should certainly verify this from an authority, but I think I recall that THE justified and acceptable transformation is the one that gives a Gaussian distribution of the residuals and homoscedasticity.

I've wished I could find a stats text which would start with the usually unstated requirements, clearly explain why those were the case and then proceed to the theorem that would use all this.

FunkyDwarf · Mar 5, 2014

Me too =)

Bill Simpson · Mar 5, 2014

Buried here somewhere, which I'll never find again, is an intro stats text which is oriented around teaching students to "eyeball the data" and then be able to estimate the statistics with a good deal of precision, make decisions with a good degree of confidence, etc. A couple of Google searches don't find the title. It was cute enough when I saw it on a college bookstore shelf that I bought a copy and meant to try that.

But getting the assumptions out in front of stats would be more important.
