Undergrad Weighting data based on the errors

  • Thread starter: kelly0303
  • Tags: Data, Errors
The discussion centers on the appropriate method for weighting data points in a fit, particularly when dealing with Poisson errors. The user seeks to determine whether to weight data points by their absolute or relative errors, emphasizing that points with higher counts and lower relative errors should be prioritized. It is suggested that using Maximum Likelihood Estimation could be beneficial, but the specifics of the fitting algorithm in the Python module lmfit remain unclear. A key point raised is that fitting the tails of the distribution may be more critical than fitting the peak for a Voigt profile, which could influence the weighting strategy. Ultimately, the importance of accurately fitting the background versus the peak is debated, highlighting the complexity of the fitting process.
kelly0303
Hello! I have some data (counts) with a Poisson error associated with it and I want to make a fit to the data. I am trying to weight the data inversely proportional to the errors, so that data points with large errors matter less in the fit. However, using the error on its own doesn't seem right. If I have a point with a value ##100 \pm 10## and another with the value ##10000 \pm 100##, the first has the smaller absolute error, but the second should (I think) be more important for the fit, as its relative error is much smaller (1% versus 10%). So, should I weight each data point by the inverse of its relative error, i.e. the first point would have a weight of ##10## and the second a weight of ##100##? Is this the right way to do it? Thank you!
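To make the two-point example concrete, here is a minimal sketch. It assumes the fit minimizes the sum of squared residuals divided by sigma, so each residual is measured in units of its own error. The helper names are made up for illustration. With that convention, the same 1% miss costs ten times more at the high-count point, even though its absolute error is larger.

```python
import math

def poisson_sigma(counts):
    """Poisson standard deviation of a count N: sqrt(N)."""
    return math.sqrt(counts)

def pull_for_relative_miss(counts, fraction):
    """Residual in units of sigma if the model misses the point by `fraction` of its value."""
    return fraction * counts / poisson_sigma(counts)

for n in (100, 10000):
    sigma = poisson_sigma(n)
    # columns: counts, absolute error, relative error, pull for a 1% miss
    print(n, sigma, sigma / n, pull_for_relative_miss(n, 0.01))
```

Under this assumption, weighting by the inverse absolute error already favors the point with the smaller relative error, so no extra relative-error weighting is needed to get that behavior.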
 
Maximum Likelihood Estimation is an approach that may accomplish what you want to do. With the scant details you have given, I can't say whether it would be the best possible approach.
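As a rough illustration of what maximum likelihood means for count data, the sketch below writes down the Poisson negative log-likelihood and scans a grid of trial values. The counts are made up, and the constant-rate model is only a stand-in for a real lineshape plus background. For a constant rate, the likelihood is maximized at the sample mean.

```python
import math

def poisson_nll(observed, expected):
    """Negative log-likelihood of observed counts given Poisson expectations."""
    total = 0.0
    for n, mu in zip(observed, expected):
        # log P(n | mu) = n*log(mu) - mu - log(n!)
        total -= n * math.log(mu) - mu - math.lgamma(n + 1)
    return total

counts = [3, 7, 4, 6, 5]  # made-up counts, for illustration only
candidates = [c / 10 for c in range(10, 101)]  # trial constant rates 1.0 .. 10.0
best_mu = min(candidates, key=lambda mu: poisson_nll(counts, [mu] * len(counts)))
print(best_mu, sum(counts) / len(counts))
```

No weights appear here: the Poisson likelihood itself encodes how much each count constrains the model, which is why this approach is often suggested for low-count data.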
 
tnich said:
Maximum Likelihood Estimation is an approach that may accomplish what you want to do. With the scant details you have given, I can't say whether it would be the best possible approach.
Thank you for your reply! To give more details: I have some data points to which I want to fit a Voigt profile + background. For each data point I have the energy, the counts and the error (square root of the number of counts). In principle, I am just using an existing Python fitting module (lmfit), and one of its options is "weights". I assumed that I have to pass the inverse of the errors as the weights for the fit (1/errors); I am just not sure if I should use the absolute or relative error (or something totally different).
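For reference, the lmfit documentation describes `weights` as a factor that multiplies the residual, i.e. weights*(data - fit). The plain-Python sketch below mimics that convention without calling lmfit. The function names and the `floor` guard for empty bins are assumptions made here for illustration, not part of lmfit.

```python
import math

def weights_from_counts(counts, floor=1.0):
    """Return 1/sigma with sigma = sqrt(N); `floor` (an assumption) avoids dividing by zero."""
    return [1.0 / math.sqrt(max(n, floor)) for n in counts]

def weighted_chi_square(data, model, weights):
    """Sum of squared weighted residuals, (w * (data - model))**2."""
    return sum((w * (d - m)) ** 2 for d, m, w in zip(data, model, weights))

data = [100, 10000, 0]      # made-up counts, including one empty bin
model = [110, 9900, 1]      # made-up model values, each one sigma away
w = weights_from_counts(data)
print(w, weighted_chi_square(data, model, w))
```

With weights equal to the inverse absolute error, each term is the squared deviation in units of sigma, which is the usual chi-square. How to treat empty bins is a separate choice that the thread does not settle.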
 
I don't know what algorithm lmfit uses, or how it uses the weight data, so I don't really know how to answer your question. Since the only error information you have is derived solely from your count data, it would not provide any more information to the fit algorithm than the count data would by itself. I would be tempted not to enter weight data in that case.
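One known side effect of deriving the errors from the observed counts themselves can be shown with a small sketch, assuming a least-squares fit of a constant level with chi-square weights 1/sigma^2 = 1/N. The counts are made up. Points that fluctuate low get smaller errors and hence larger weights, so the estimate lands below the plain average.

```python
def weighted_constant_fit(counts):
    """Least-squares estimate of a constant level with weights 1/sigma^2 = 1/N."""
    inv_var = [1.0 / n for n in counts]
    return sum(w * n for w, n in zip(inv_var, counts)) / sum(inv_var)

counts = [90, 100, 110]  # made-up counts scattered around 100
plain_mean = sum(counts) / len(counts)
weighted = weighted_constant_fit(counts)
print(plain_mean, weighted)  # the weighted value is the harmonic mean, below the plain mean
```

The effect is small at high counts and grows as the counts get low, which is one reason a likelihood-based fit is sometimes preferred there.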

I can tell you that the data with low counts is the important data for fitting a pdf curve, especially when it is a Voigt profile. In that case you don't know whether you have a Cauchy distribution, a Gaussian distribution, or some combination of the two. So estimating the tails of the distribution is the important aspect, and that depends on fitting the experimental data you have for the tails. Your fit algorithm may take that into account or it may not.
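To see why the tails discriminate between the two limiting shapes, the sketch below compares a Gaussian and a Lorentzian (Cauchy) density at increasing distance from the center. The unit width parameters are an arbitrary assumption, and this is not an actual Voigt profile, which is the convolution of the two.

```python
import math

def gaussian_pdf(x, sigma=1.0):
    """Normal density centered at zero with standard deviation sigma."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def lorentzian_pdf(x, gamma=1.0):
    """Cauchy (Lorentzian) density centered at zero with half width at half maximum gamma."""
    return gamma / (math.pi * (x * x + gamma * gamma))

for x in (0.0, 2.0, 5.0, 10.0):
    # columns: distance from center, Gaussian density, Lorentzian density
    print(x, gaussian_pdf(x), lorentzian_pdf(x))
```

Near the center the two curves are of similar size, while a few widths out the Lorentzian is larger by orders of magnitude, so the low-count tail points are where the Gaussian and Lorentzian contributions can be told apart.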
 
tnich said:
I don't know what algorithm lmfit uses, or how it uses the weight data, so I don't really know how to answer your question. Since the only error information you have is derived solely from your count data, it would not provide any more information to the fit algorithm than the count data would by itself. I would be tempted not to enter weight data in that case.

I can tell you that the data with low counts is the important data for fitting a pdf curve, especially when it is a Voigt profile. In that case you don't know whether you have a Cauchy distribution, a Gaussian distribution, or some combination of the two. So estimating the tails of the distribution is the important aspect, and that depends on fitting the experimental data you have for the tails. Your fit algorithm may take that into account or it may not.
Thank you for your reply. So you are saying that fitting the background right (i.e. the tail) is more important than the peak itself for a Voigt profile? If that's the case, then using the inverse error as the weight will help. I just thought it made more sense to focus on fitting the peak right (and extracting the parameters of the peak) rather than the background.
 