Error on the Gaussian mean

In summary, the uncertainty on the mean of a Gaussian fitted to data with known error bars is obtained from the covariance matrix of the fit parameters: it is the square root of the diagonal element that corresponds to the mean.
  • #1
kelly0303
Hello! If I have N points (x,y) which I know are described by a Gaussian, i.e. y(x) is a Gaussian of unknown mean and standard deviation, and each y has an associated error of ##\sqrt{y}##, is there a general formula for the uncertainty on the mean of this function? Thank you!
 
  • #2
It is not clear to me what you are asking. In particular, what does "each y has an associated error of ##\sqrt y##" mean? Perhaps you could further describe the system?
 
  • #3
hutchphd said:
It is not clear to me what you are asking. In particular, what does "each y has an associated error of ##\sqrt y##" mean? Perhaps you could further describe the system?
I meant that the uncertainty on y is ##\sqrt{y}##, which is usually the case with counting experiments. An example of such a system is the measurement of the transition between 2 levels of a system, where the lineshape is Gaussian. One sets the laser at a given frequency for a while and measures the number of counts (e.g. fluorescent photons from the induced transition), then the frequency is changed and the number of counts is measured again. In the end, one ends up with a plot of counts (or rate) vs frequency, where the uncertainty on the number of counts, N, is ##\sqrt{N}##. Then, in order to get the central value of that transition, one needs to fit a Gaussian to N vs frequency. My question is, is there a general formula for the uncertainty on the mean of this Gaussian under these circumstances?
 
  • #4
kelly0303 said:
My question is, is there a general formula for the uncertainty on the mean of this Gaussian under these circumstances?
I fear you are conflating two ideas here.
In your example, the lineshape I believe will be Lorentzian (not Gaussian) and the width arises from the intrinsic physics. Given that shape, the best fit of a Lorentzian to the data will provide an estimate of the transition frequency and width. It is not an asymmetric distribution.
The uncertainty in the measurement at each frequency, as driven by signal strength or integration time, will indeed go as ##\sqrt N##, and the fitting procedure to a parameterized curve can be weighted accordingly. One can certainly derive a relationship for the goodness of this fit in terms of the uncertainty in the data. These depend upon the slopes and curvatures of the fitted curve near the data points in question and upon the uncertainties.
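As a sketch of what such a weighted fit minimizes (the symbols here are assumptions for illustration: ##N_i## is the count at frequency ##\nu_i##, ##f(\nu;\theta)## is the chosen lineshape, and ##\theta## collects its parameters), the usual weighted least-squares objective with ##\sigma_i = \sqrt{N_i}## is

$$\chi^2(\theta) = \sum_i \frac{\left[N_i - f(\nu_i;\theta)\right]^2}{\sigma_i^2}, \qquad \sigma_i = \sqrt{N_i}$$

and, to first order, the inverse covariance matrix of the fitted parameters is built from the slopes of the curve at the data points:

$$\left(C^{-1}\right)_{jk} \approx \sum_i \frac{1}{\sigma_i^2}\,\frac{\partial f(\nu_i;\theta)}{\partial \theta_j}\,\frac{\partial f(\nu_i;\theta)}{\partial \theta_k}$$

The uncertainty on parameter ##\theta_j## is then ##\sqrt{C_{jj}}##. This linearized form assumes the error bars are reasonably Gaussian, which for counting data holds when the counts per point are not too small.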
 
  • #5
hutchphd said:
In your example, the lineshape I believe will be Lorentzian (not Gaussian) and the width arises from the intrinsic physics. Given that shape, the best fit of a Lorentzian to the data will provide an estimate of the transition frequency and width. It is not an asymmetric distribution.
Atomic lines aren't always Lorentzian. They are often Gaussian, and more generally they are described by a Voigt profile (a convolution of the two). One example is an atomic transition that is broadened by the Doppler shifts of all the atoms flying around with thermal velocities. This will produce a Gaussian lineshape, reflecting the Maxwell-Boltzmann velocity distribution.

kelly0303 said:
Hello! If I have N points (x,y) which I know are described by a Gaussian, i.e. y(x) is a Gaussian of unknown mean and standard deviation, and each y has an associated error of ##\sqrt{y}##, is there a general formula for the uncertainty on the mean of this function? Thank you!
What you want is to fit your data with Levenberg-Marquardt (or whatever other fitting algorithm you prefer) and extract the covariance matrix of the fit parameters. MATLAB's "nlinfit" function does this, and I believe scipy.optimize.least_squares does the same when you set method='lm'. (I'm not very familiar with scipy, sorry!) The scipy method only returns the Jacobian, which you can use to generate the covariance matrix from the error bars on your data points via the usual error propagation formulas. Essentially, the Jacobian is the derivative of each residual (one per data point ##y_i##) with respect to each fit parameter. Using those derivatives and the error bars on each ##y_i##, you can calculate the error bars on the fit parameters. MATLAB calculates the covariance matrix for you (it's the 4th output of nlinfit).

Once you have the covariance matrix, you'll want to look at the diagonal component that corresponds to the parameter you care about: the Gaussian center (mean) for your question, or the Gaussian width. This will be the variance of that parameter, so if you take the square root you will get its standard deviation. I hope that's somewhat helpful!
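For the scipy route, a minimal sketch might look like the following. The data, the function names and the starting guess are all made up for illustration, and the Gaussian is written with a standard deviation `sigma` rather than any particular linewidth convention. The residuals are divided by the error bars, so that, assuming those error bars are correct in absolute terms, the parameter covariance is approximately the inverse of JᵀJ.

```python
import numpy as np
from scipy.optimize import least_squares


def gaussian(params, x):
    """Gaussian plus constant background; params = (amplitude, center, sigma, background)."""
    amp, center, sigma, bkg = params
    return amp * np.exp(-0.5 * ((x - center) / sigma) ** 2) + bkg


def weighted_residuals(params, x, y, yerr):
    # Dividing by the error bars gives each residual unit variance.
    return (gaussian(params, x) - y) / yerr


def fit_with_covariance(x, y, yerr, p0):
    """Levenberg-Marquardt fit; returns best-fit parameters and their covariance."""
    result = least_squares(weighted_residuals, p0, args=(x, y, yerr), method="lm")
    jac = result.jac  # shape (n_points, n_params): d(residual_i) / d(param_j)
    cov = np.linalg.inv(jac.T @ jac)  # valid when yerr are absolute 1-sigma errors
    return result.x, cov


# Synthetic data, invented purely for this example.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 6.0, 61)
true_params = np.array([5.0, 3.0, 0.5, 1.7])
yerr = np.full_like(x, 0.2)
y = gaussian(true_params, x) + yerr * rng.normal(size=x.size)

# Starting guess estimated from the data itself.
p0 = [y.max() - y.min(), x[np.argmax(y)], 1.0, y.min()]
popt, cov = fit_with_covariance(x, y, yerr, p0)
perr = np.sqrt(np.diag(cov))
print(f"center = {popt[1]:.3f} +/- {perr[1]:.3f}")
```

The entry `perr[1]` is the uncertainty on the center and `perr[2]` the uncertainty on sigma. If the error bars are only known up to a common scale factor, the covariance is usually multiplied by the reduced chi-square of the fit.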
 
  • #6
I realize my description wasn't very good. Here's some example code for MATLAB. I wasn't able to check it because I don't have MATLAB at home. I believe nlinfit requires the Statistics and Machine Learning Toolbox. It's very useful though!
GaussianFitting:
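% Make some data with known error bars
x = 0:0.1:5;
yerr = 0.1 + 0.2*rand(size(x));
% Gaussian line plus background, with noise drawn to match the error bars
y = 5*exp(-((x-3)/0.71).^2) + 1.7 + yerr.*randn(size(x));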

% Plot the data
figure(1); clf;
errorbar(x,y,yerr,'ks')

% define a gaussian function to fit to
fitfun = @(b,x) b(1)*exp(-((x-b(2))/b(3)).^2)+b(4);
% b(1) is the amplitude of the gaussian line
% b(2) is the center value
% b(3) is the linewidth (equal to sqrt(2)*sigma in this parameterization)
% b(4) is the background

% define a vector of inverse-variance weights
w = yerr.^(-2);

% define a guess for the parameters of the gaussian, estimated from the data
[~,imax] = max(y);
b0 = [max(y)-min(y), x(imax), 1, min(y)];

% use nlinfit to fit the data to a gaussian
[b,~,~,covB] = nlinfit(x,y,fitfun,b0,'weights',w);

% plot the result
hold on
plot(x,fitfun(b,x),'k--')
hold off

% print the uncertainties on the center and the linewidth to the command window
disp(['The center uncertainty is: ',num2str(sqrt(covB(2,2)))])
disp(['The linewidth uncertainty is: ',num2str(sqrt(covB(3,3)))])
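For the counting experiment described earlier in the thread, where each point carries a ##\sqrt{N}## error bar, a rough Python counterpart could use scipy.optimize.curve_fit with `sigma` and `absolute_sigma=True`. This is only a sketch: the detuning grid, line parameters and background level are invented, and the error bars are clipped at 1 count as one simple (assumed) way to avoid zero uncertainties in empty bins.

```python
import numpy as np
from scipy.optimize import curve_fit


def gaussian(x, amp, center, sigma, bkg):
    """Gaussian line of standard deviation sigma on a flat background."""
    return amp * np.exp(-0.5 * ((x - center) / sigma) ** 2) + bkg


# Simulated scan: Poisson counts at each (hypothetical) laser detuning step.
rng = np.random.default_rng(1)
freq = np.linspace(-5.0, 5.0, 41)
expected = gaussian(freq, 200.0, 0.3, 1.0, 10.0)
counts = rng.poisson(expected).astype(float)

# sqrt(N) error bars, clipped so that empty bins do not get zero uncertainty.
count_err = np.sqrt(np.clip(counts, 1.0, None))

# Starting guess estimated from the data itself.
p0 = [counts.max() - counts.min(), freq[np.argmax(counts)], 1.0, counts.min()]

# absolute_sigma=True treats count_err as true 1-sigma errors, so pcov is
# not rescaled by the reduced chi-square of the fit.
popt, pcov = curve_fit(gaussian, freq, counts, p0=p0,
                       sigma=count_err, absolute_sigma=True)
center_err = np.sqrt(pcov[1, 1])
print(f"line center = {popt[1]:.3f} +/- {center_err:.3f}")
```

As a sanity check on the size of the result: in the idealized limit of no background and a scan that covers the whole line, the center uncertainty approaches ##\sigma/\sqrt{N_\text{tot}}##, where ##N_\text{tot}## is the total number of counts in the peak. A background or a truncated scan makes it somewhat larger.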
 

What is "Error on the Gaussian mean"?

"Error on the Gaussian mean" refers to the uncertainty or variation in the estimated mean value of a dataset that follows a Gaussian or normal distribution. It is a measure of how much the estimated mean may differ from the true population mean.

How is the error on the Gaussian mean calculated?

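For a sample of independent measurements, the error on the Gaussian mean is typically calculated with the standard error formula, which divides the sample standard deviation by the square root of the sample size. It is also known as the standard error of the mean. A confidence level enters only when the standard error is converted into a confidence interval. For a mean obtained from a curve fit, as in the thread above, the error comes instead from the covariance matrix of the fit.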
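Written out, with ##s## the sample standard deviation and ##n## the number of independent measurements (symbols chosen here for illustration):

$$\mathrm{SE}(\bar{x}) = \frac{s}{\sqrt{n}}$$

For example, ##s = 2## and ##n = 100## give ##\mathrm{SE} = 0.2##, and an approximate 95% confidence interval for the mean is ##\bar{x} \pm 1.96\,\mathrm{SE}##, assuming the sample mean is close to normally distributed.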

Why is the error on the Gaussian mean important?

The error on the Gaussian mean is important because it provides a measure of the accuracy and reliability of the estimated mean value. It also allows researchers to determine the confidence level of their results and make informed conclusions about the population mean.

Can the error on the Gaussian mean be reduced?

Yes, the error on the Gaussian mean can be reduced by increasing the sample size or by reducing the standard deviation of the dataset. This can lead to a more precise estimation of the true population mean.

Are there any limitations to using the error on the Gaussian mean?

Yes, the error on the Gaussian mean assumes that the dataset follows a normal distribution, which may not always be the case. In addition, it only takes into account the variability in the mean and not the entire dataset, so it may not provide a complete picture of the data's distribution.
