# I need to fit a tail-heavy Gaussian curve

1. Dec 3, 2009

### Grogs


Hi, it's been a long time since I've been around PF.

1. The problem statement, all variables and given/known data

This isn't a homework problem per se, but I've been trying to fit some scattering data with a Gaussian function using a least-squares approach, and it's not working so well. Doing the fit is no problem, but the Gaussian doesn't follow the data very well. The agreement is good within ~1.5 standard deviations of the mean, but the data are too tail-heavy and the agreement is lousy beyond that.

I need to find a function that will fit the data better. Something with a scaling parameter that would let me vary the kurtosis (tail heaviness) would be ideal, but I can't think of anything that fits the bill. I'm hoping that there's a stats whiz around who can point me in the right direction.

2. Relevant equations

My Gaussian fit function: $f(\theta) = A\exp\left(-\theta^{2}/(2s^{2})\right)$ where A and s are the fit parameters (mean = 0).

I also tried using: $f(\theta) = B + A\exp\left(-\theta^{2}/(2s^{2})\right)$

3. The attempt at a solution

I tried adding a constant to the Gaussian fit, but then the fit ends up being too large at the tails.

TIA,

Grogs
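For readers who want to reproduce the behaviour described in this post, here is a minimal sketch of the two least-squares fits using `scipy.optimize.curve_fit`. The data are a synthetic stand-in (a narrow peak plus a weak broad component), not the poster's scattering data, and the function names are made up for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit

def gaussian(theta, A, s):
    """Zero-mean Gaussian: A * exp(-theta^2 / (2 s^2))."""
    return A * np.exp(-theta**2 / (2.0 * s**2))

def gaussian_plus_const(theta, B, A, s):
    """Same Gaussian with a constant offset B."""
    return B + gaussian(theta, A, s)

def demo_counts(theta):
    """Synthetic tail-heavy data (an assumption, not real detector counts)."""
    return 1000.0 * np.exp(-theta**2 / 2.0) + 5.0 * np.exp(-theta**2 / 32.0)

theta = np.linspace(-10.0, 10.0, 201)
counts = demo_counts(theta)

# Unweighted least-squares fits; p0 holds rough starting guesses.
popt_g, _ = curve_fit(gaussian, theta, counts, p0=[counts.max(), 1.0])
popt_gc, _ = curve_fit(gaussian_plus_const, theta, counts,
                       p0=[0.0, counts.max(), 1.0])

# Compare data and both models at a few angles away from the peak.
for t in (1.0, 3.0, 5.0):
    print(t, demo_counts(t), gaussian(t, *popt_g),
          gaussian_plus_const(t, *popt_gc))
```

Because the fit is unweighted, the peak dominates the sum of squares, which is one reason a pure Gaussian can look fine near the centre while missing the tails badly.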

2. Dec 3, 2009

### D H

Staff Emeritus

Some questions:

1. What is the mean? That you tried a zero mean fit and then a non-zero mean fit suggests you aren't thinking enough about your model.

2. Can the data ever be negative? For example, time-to-failure. Things don't fail before they are made; time-to-failure is inherently non-negative.

3. What made you think the underlying distribution might be gaussian?

4. Have you done a normality test? Testing whether a distribution is normal is a fairly simple procedure. For example, the Anderson-Darling test.
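As a rough illustration of the normality test mentioned in item 4, here is a sketch using `scipy.stats.anderson`. It assumes the individual scattering angles are available as a sample (the thread only describes binned counts per detector angle), and it uses draws from a Student's t distribution as a hypothetical heavy-tailed stand-in.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Hypothetical stand-in for per-event scattering angles: Student's t with
# 3 degrees of freedom is symmetric about zero but heavier-tailed than a normal.
angles = rng.standard_t(df=3, size=5000)

# Anderson-Darling test against a normal with estimated mean and variance.
result = stats.anderson(angles, dist="norm")
print("A^2 statistic:", result.statistic)
for level, crit in zip(result.significance_level, result.critical_values):
    verdict = "reject" if result.statistic > crit else "do not reject"
    print(f"{level:5.1f}% level: critical value {crit:.3f} -> {verdict} normality")
```

The Anderson-Darling statistic weights the tails more heavily than some other goodness-of-fit tests, which suits a question about tail heaviness.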

3. Dec 3, 2009

### Grogs


Hi, thanks for the quick response.

The mean is zero. The data I'm modeling is essentially the angle a neutron scatters in the x-y plane, which is going to be a symmetric function. f(theta) is the number of counts I get in a detector at angle theta. The incoming particles I modeled are all traveling along the x-axis, so that's why I know the mean scattering angle is 0. Adding the constant shouldn't affect the mean angle, just the magnitude, right?

Yes it can. The scattering angle can be either positive or negative in this case. I could just invoke symmetry and fit |theta| without making much of a difference if I need to though.

I read through a lot of previous work of this type in journal articles during my literature review (this work is part of my dissertation) and they typically used a Gaussian to fit the scattering function. My detector geometry is a bit different than anything I found in the literature review though, which I suspect is what is causing the non-Normality.

Not yet. I can run the data through JMP tomorrow when I get back to my work and see what it tells me, but I can pretty well tell by inspection that it's not Normally distributed in the tails. Values at 5 sigma are something like .001 of the maximum, which is way too high for a Gaussian. That's why I decided to see if I could find something other than a Gaussian that would fit it better.

Thanks again,

Grogs
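As a quick check of the "way too high" estimate in this post, using the Gaussian fit function from the first post:

$$\frac{f(5s)}{f(0)} = \exp\left(-\frac{(5s)^{2}}{2s^{2}}\right) = e^{-12.5} \approx 3.7\times10^{-6}$$

A tail value of about $10^{-3}$ of the maximum is therefore roughly 270 times larger than a Gaussian allows at $5s$. Equivalently, a Gaussian already falls to $10^{-3}$ of its peak at $|\theta| = s\sqrt{2\ln 10^{3}} \approx 3.7\,s$.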

4. Dec 3, 2009

### D H

Staff Emeritus

Whoa! I misread your $f(\theta) = B + A\exp\left(-\theta^{2}/(2s^{2})\right)$. That says there is a non-zero intensity at all angles. I assume that even with your broad tails, the intensity eventually tails off to zero. So that non-zero B is inviting some non-physics into your model.

One possibility: See if you can find a mapping that maps your detector geometry to something a bit more "normal" (i.e., what you found in the literature), do your statistics in that space, and then translate back to your geometry.

In your space, that would translate to some weird member of the exponential family,

$$f(\theta) = A\exp(-g(|\theta|))$$

Another possibility is the double exponential:

$$f(\theta) = A\exp\left(-\,\frac{|\theta|} b\right)$$

Note that the absolute value in the above forces the distribution to be symmetric (which makes sense given the description of your setup).
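The two forms above can be sketched in code as follows. The double exponential is taken directly from the formula in this post. The second function is one hypothetical choice of $g$, namely $g(|\theta|) = (|\theta|/a)^{\beta}$, which is not proposed anywhere in the thread; it is included only because its exponent $\beta$ acts as a tail-heaviness knob ($\beta = 2$ gives the Gaussian shape, $\beta = 1$ the double exponential). The data are synthetic and noise-free.

```python
import numpy as np
from scipy.optimize import curve_fit

def double_exponential(theta, A, b):
    """A * exp(-|theta| / b), symmetric about zero."""
    return A * np.exp(-np.abs(theta) / b)

def stretched_exponential(theta, A, a, beta):
    """A * exp(-(|theta| / a)**beta); this g is an illustrative assumption."""
    return A * np.exp(-(np.abs(theta) / a) ** beta)

theta = np.linspace(-10.0, 10.0, 201)
true_params = (1000.0, 1.2, 1.3)  # made-up values for the demo
counts = stretched_exponential(theta, *true_params)

# Bounds keep a and beta positive so the fractional power stays real.
popt, _ = curve_fit(stretched_exponential, theta, counts,
                    p0=[counts.max(), 1.0, 2.0],
                    bounds=([0.0, 1e-6, 0.1], [np.inf, np.inf, 5.0]))
A_fit, a_fit, beta_fit = popt
print("A, a, beta =", A_fit, a_fit, beta_fit)
```

On real counts it may be worth weighting the fit (for example via the `sigma` argument of `curve_fit`) so the tails are not swamped by the peak.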

5. Dec 4, 2009

### Grogs


Thanks, D H. I'll take a look at the exponential and see if it does any better. Another thought that occurred to me was that what I'm looking at may be the convolution of two different Gaussians - one with a small maximum and a very broad distribution and a second with a very large peak and a narrow distribution.

Switching back to the previous geometries isn't really an option, unfortunately. They were using a 2D surface tally and I'm using an array of volumetric ones in a cylindrical configuration. I also have a time window on the detectors, i.e., the particles have to arrive at a certain time to be counted, which complicates things as well. It's just too hard to do analytically, which is why we've had to resort to using a trusted simulation program to find the right answer and then develop an empirical model. You are right that there does need to be some reasonable explanation of why a particular fit works, though. Otherwise I could just do something like a cubic spline, get a great fit, and move on. I'm hoping that the form of the fit will at least give me a hint of the underlying physics.
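The two-Gaussian idea from this post can be tried with a small extension of the original fit. This sketch reads it as a weighted sum of a narrow and a broad zero-mean Gaussian rather than a strict convolution, since convolving two Gaussians just gives another Gaussian with variance $s_1^2 + s_2^2$. The function name, starting guesses, and synthetic data are all assumptions for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit

def two_gaussians(theta, A1, s1, A2, s2):
    """Sum of two zero-mean Gaussians with separate heights and widths."""
    narrow = A1 * np.exp(-theta**2 / (2.0 * s1**2))
    broad = A2 * np.exp(-theta**2 / (2.0 * s2**2))
    return narrow + broad

theta = np.linspace(-15.0, 15.0, 301)
# Synthetic data: tall narrow peak plus a low, wide component (made-up values).
counts = two_gaussians(theta, 1000.0, 1.0, 5.0, 4.0)

# Start with one tall/narrow and one short/wide guess; keep all parameters positive.
p0 = [counts.max(), 1.5, 0.01 * counts.max(), 6.0]
popt, _ = curve_fit(two_gaussians, theta, counts, p0=p0, bounds=(1e-9, np.inf))
A1_fit, s1_fit, A2_fit, s2_fit = popt
print("narrow:", A1_fit, s1_fit, " broad:", A2_fit, s2_fit)
```

The two components are interchangeable in the model, so which one comes out "narrow" depends on the starting guesses; sorting by width after the fit avoids confusion.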

6. Dec 4, 2009

### D H

Staff Emeritus

A mixture.

The EM algorithm is very good at pulling out mixtures. Maybe a bit overkill, though.
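For reference, a bare-bones EM iteration for a two-component mixture might look like the sketch below. It assumes both components have zero mean (as argued earlier in the thread) and that individual scattering angles are available as samples; binned detector counts would need a weighted variant. The function name and the initial guesses are hypothetical.

```python
import numpy as np

def em_zero_mean_mixture(x, n_iter=500):
    """EM for w*N(0, s1^2) + (1-w)*N(0, s2^2); returns (w, s1, s2)."""
    x = np.asarray(x, dtype=float)
    scale = x.std()
    # Crude starting point: one component narrower, one wider than the sample.
    w, s1, s2 = 0.5, 0.5 * scale, 2.0 * scale
    for _ in range(n_iter):
        # E-step: responsibility of component 1 for each point
        # (the common 1/sqrt(2*pi) factor cancels in the ratio).
        p1 = w * np.exp(-x**2 / (2.0 * s1**2)) / s1
        p2 = (1.0 - w) * np.exp(-x**2 / (2.0 * s2**2)) / s2
        r = p1 / (p1 + p2)
        # M-step: update the weight and both widths from the responsibilities.
        w = r.mean()
        s1 = np.sqrt(np.sum(r * x**2) / np.sum(r))
        s2 = np.sqrt(np.sum((1.0 - r) * x**2) / np.sum(1.0 - r))
    return w, s1, s2

# Synthetic check: 90% narrow (sigma = 1) and 10% broad (sigma = 4) samples.
rng = np.random.default_rng(1)
n = 20000
is_narrow = rng.random(n) < 0.9
x = np.where(is_narrow, rng.normal(0.0, 1.0, n), rng.normal(0.0, 4.0, n))
w, s1, s2 = em_zero_mean_mixture(x)
print("weight, s1, s2 =", w, s1, s2)
```

A fixed iteration count keeps the sketch short; a production version would normally stop when the log-likelihood stops improving.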