Weighting data based on the errors

  • Context: Undergrad
  • Thread starter: kelly0303
  • Tags: Data, Errors

Discussion Overview

The discussion revolves around the appropriate method for weighting data points in a fitting procedure, specifically when dealing with data that has associated Poisson errors. Participants explore the implications of using absolute versus relative errors for weighting in the context of fitting a Voigt profile plus background.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • One participant suggests weighting data points inversely proportional to their percentage errors, proposing that points with smaller relative errors should have greater influence on the fit.
  • Another participant mentions Maximum Likelihood Estimation as a potential approach but notes that the details provided are insufficient to determine its suitability.
  • A participant expresses uncertainty about how the lmfit module utilizes weight data, suggesting that the error information derived from count data may not add significant value to the fitting process.
  • Concerns are raised about the importance of fitting the tails of the distribution in a Voigt profile, with a participant emphasizing that low count data may be crucial for accurately estimating the distribution's shape.
  • There is a discussion about whether focusing on fitting the peak or the background is more critical for the Voigt profile, indicating differing opinions on the priorities in the fitting process.

Areas of Agreement / Disagreement

Participants do not reach a consensus on the best method for weighting data points or the relative importance of fitting the peak versus the background in the Voigt profile. Multiple competing views remain regarding the appropriate approach to take.

Contextual Notes

Participants highlight limitations in their understanding of the lmfit algorithm and its handling of weight data, as well as the potential impact of using different types of error information on the fitting process.

kelly0303
Hello! I have some data (counts) with a Poisson error associated with each point, and I want to fit the data. I am trying to weight the data inversely proportionally to the errors, so that the points with large errors are less important for the fit. However, using the error on its own doesn't seem right. If I have a point with a value ##100 \pm 10## and another one with the value ##10000 \pm 100##, the first one has a smaller error, but the second one should be (I think) more important for the fit, as its relative error is much smaller. So, should I weight each data point by the inverse of its relative error, i.e. the first point (10% error) would have a weight of ##10##, while the other (1% error) would have a weight of ##100##? Is this the right way to do it? Thank you!
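The two points in the question can be checked numerically. The sketch below assumes the usual least-squares convention, in which each residual is divided by its absolute error (here the Poisson error, the square root of the counts); the helper name `pull` and the 5% mismatch are illustrative choices, not something from the thread. Under that assumption, the same fractional mismatch costs far more at the high-count point, so weighting by the absolute error already makes that point the more influential one.

```python
import math

def pull(observed, model):
    """Residual in units of the Poisson error sqrt(observed)."""
    sigma = math.sqrt(observed)
    return (observed - model) / sigma

# A model that undershoots both points by the same 5 percent.
low = pull(100, 95)        # sigma = 10
high = pull(10000, 9500)   # sigma = 100

# Contribution of each point to chi-square = sum of squared pulls.
chi2_low = low ** 2
chi2_high = high ** 2

print(f"low-count point:  pull = {low:.2f}, chi2 term = {chi2_low:.2f}")
print(f"high-count point: pull = {high:.2f}, chi2 term = {chi2_high:.2f}")
```

The low-count point is off by half an error bar, while the high-count point is off by five error bars, so the fit is pushed much harder to match the latter.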
 
Maximum Likelihood Estimation is an approach that may accomplish what you want to do. With the scant details you have given, I can't say whether it would be the best possible approach.
 
tnich said:
Maximum Likelihood Estimation is an approach that may accomplish what you want to do. With the scant details you have given, I can't say whether it would be the best possible approach.
Thank you for your reply! To give more details: I have some data points to which I want to fit a Voigt profile + background. For each data point I have the energy, the counts and the error (square root of the number of counts). I am just using an existing Python fitting module (lmfit), and one of its options is "weights". I assumed that I have to pass the inverse of the errors (1/errors) as the weights for the fit. I am just not sure whether I should use the absolute or the relative error (or something totally different).
 
I don't know what algorithm lmfit uses, or how it uses the weight data, so I don't really know how to answer your question. Since the only error information you have is derived solely from your count data, it would not provide any more information to the fit algorithm than the count data would by itself. I would be tempted not to enter weight data in that case.

I can tell you that the data with low counts is the important data for fitting a pdf curve, especially when it is a Voigt profile. In that case you don't know whether you have a Cauchy distribution, a Gaussian distribution, or some combination of the two. So estimating the tails of the distribution is the important aspect, and that depends on fitting the experimental data you have for the tails. Your fit algorithm may take that into account or it may not.
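To illustrate why the tails carry the information about the line shape, the sketch below compares a Gaussian and a Cauchy (Lorentzian) density of unit width at a few distances from the center. The unit widths and the sample points are arbitrary choices for illustration. The two curves have similar heights at the center but differ by orders of magnitude far from it.

```python
import math

def gaussian_pdf(x, sigma=1.0):
    """Normal density with standard deviation sigma, centered at 0."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def cauchy_pdf(x, gamma=1.0):
    """Cauchy (Lorentzian) density with half-width gamma, centered at 0."""
    return gamma / (math.pi * (x ** 2 + gamma ** 2))

# Compare the two shapes at the center and progressively further out.
for x in (0.0, 2.0, 5.0):
    g = gaussian_pdf(x)
    c = cauchy_pdf(x)
    print(f"x = {x:3.1f}  gaussian = {g:.3e}  cauchy = {c:.3e}  ratio = {c / g:.3g}")
```

Near the peak the two shapes are hard to tell apart, so the Gaussian and Lorentzian widths of a Voigt profile are constrained largely by the low-count bins in the wings.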
 
tnich said:
I don't know what algorithm lmfit uses, or how it uses the weight data, so I don't really know how to answer your question. Since the only error information you have is derived solely from your count data, it would not provide any more information to the fit algorithm than the count data would by itself. I would be tempted not to enter weight data in that case.

I can tell you that the data with low counts is the important data for fitting a pdf curve, especially when it is a Voigt profile. In that case you don't know whether you have a Cauchy distribution, a Gaussian distribution, or some combination of the two. So estimating the tails of the distribution is the important aspect, and that depends on fitting the experimental data you have for the tails. Your fit algorithm may take that into account or it may not.
Thank you for your reply. So you are saying that fitting the background right (i.e. the tail) is more important than the peak itself for a Voigt profile? If that's the case, then using the inverse error as the weight will help. I just thought it made more sense to focus on fitting the peak right (and extracting the parameters of the peak) rather than the background.
 
