Weighted least squares fitting

Discussion Overview

The discussion revolves around the appropriate use of weights in weighted least squares (WLS) fitting when dealing with data points that have associated errors in both x and y dimensions. Participants explore different approaches to determining weights based on error propagation and the nature of the data.

Discussion Character

  • Technical explanation
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • One participant questions whether to use 1/(xi_err^2 + yi_err^2) as the weight in WLS fitting, expressing uncertainty about the correct approach.
  • Another participant argues against using equal weights for x and y errors, suggesting that this is only valid if the slope is known to be 1, which is not the case when fitting.
  • A participant provides context about the data being properties of a celestial object and mentions using Monte Carlo methods for error propagation, seeking clarification on the correct weight choice.
  • It is suggested that 1/sigma^2 is a common choice for weighting, with sigma potentially referring to sample variance, though this is not universally agreed upon.
  • One participant proposes a method for determining weights based on the slope of the data, indicating that using weights of the form 1/(m^2 * sigma_x^2 + sigma_y^2) can be effective in practical applications.
  • Another participant expresses confusion about the meaning of sigma, questioning whether it refers to values from error bars or requires calculation from data points, and receives confirmation that measured values from error bars can be used.

Areas of Agreement / Disagreement

Participants express differing views on the appropriate method for determining weights in WLS fitting, with no consensus reached on a single correct approach. Some suggest using 1/sigma^2 while others propose more complex methods involving the slope of the data.

Contextual Notes

There is uncertainty regarding the definitions of sigma and its application in this context, as well as the implications of using different weighting methods on the fitting results.

lavender81
Hello y'all,

If I have n data points (xi, yi) each with error bars in both x and y (xi_err, yi_err), should I use 1/(xi_err^2+yi_err^2) as the weight in a weighted least squares linear fit, or should the weight be a different value that has nothing to do with error bars? I've never used WLS fitting and I'll appreciate your help!

Many thanks,

-Lav
 
lavender81 said:
Hello y'all,

should I use 1/(xi_err^2+yi_err^2) as the weight in a weighted least squares linear fit...
In a word, No. That gives equal weight to the x and y errors, which is not the right thing to do unless the slope is 1, and if you knew the slope you wouldn't be doing a fit.

It would help to know more about the experiment. Where do the x's and y's come from, and where do the errors come from? Why do you expect a linear relationship between x and y?
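
One way to see why the slope matters, sketched under the assumption that the x and y errors of each point are independent (the intercept label a is introduced here for illustration): for a line y = a + m x, the residual of point i and its variance are

$$
r_i = y_i - a - m x_i, \qquad \mathrm{Var}(r_i) = \sigma_{y,i}^2 + m^2 \sigma_{x,i}^2 .
$$

This reduces to xi_err^2 + yi_err^2 only when m^2 = 1, which is why the plain sum is not a safe weight before the slope is known.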
 
They are two properties of a celestial object that I need to fit linearly, and the errors propagate from the method of extraction of the values of these properties (e.g. Monte Carlo). I didn't extract the data, I have them ready and since there is some scatter in the data, I thought of doing a weighted LS fit, but I wasn't sure what weight to choose! I googled some websites on this and found the most frequent choice to be: 1/sigma^2 (Is sigma the sample variance?)

Many thanks!
-Lav
 
lavender81 said:
They are two properties of a celestial object that I need to fit linearly, and the errors propagate from the method of extraction of the values of these properties (e.g. Monte Carlo). I didn't extract the data, I have them ready and since there is some scatter in the data, I thought of doing a weighted LS fit, but I wasn't sure what weight to choose! I googled some websites on this and found the most frequent choice to be: 1/sigma^2 (Is sigma the sample variance?)

Many thanks!
-Lav
Yes, $\frac{1}{\sigma^2}$ is the correct choice for weighting, and $\frac{1}{\sigma_y^2}$ would be great if there were errors only in your y's. But you have errors in both x and y. There's an easy way of dealing with this and a hard way. The easy way is to eyeball the slope $m$ of the plot, reckon that if x is off by $\sigma_x$, that will give a calculated y that is off by $m \sigma_x$, and so use the weight $\frac{1}{m^2 \sigma_x^2 + \sigma_y^2}$. In almost every practical application, this works fine. The hard way is to do a total least squares fit (http://en.wikipedia.org/wiki/Total_least_squares).
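
The easy way could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the function names `wls_line` and `fit_effective_variance` are made up for this example, the data values are invented, and instead of eyeballing the slope the sketch starts from an unweighted fit and refits a few times with updated weights, which is a swapped-in convenience rather than what the post literally describes.

```python
import numpy as np

def wls_line(x, y, w):
    """Weighted least squares fit of y = a + m*x; returns (a, m)."""
    wsum = np.sum(w)
    xbar = np.sum(w * x) / wsum          # weighted mean of x
    ybar = np.sum(w * y) / wsum          # weighted mean of y
    m = np.sum(w * (x - xbar) * (y - ybar)) / np.sum(w * (x - xbar) ** 2)
    a = ybar - m * xbar
    return a, m

def fit_effective_variance(x, y, sx, sy, n_iter=20):
    """Fit a line using weights 1/(m^2*sx^2 + sy^2).

    Assumption: the starting slope comes from an unweighted fit
    (standing in for an eyeballed slope), and the weights are
    recomputed from the current slope on each pass.
    """
    x, y, sx, sy = (np.asarray(v, dtype=float) for v in (x, y, sx, sy))
    a, m = wls_line(x, y, np.ones_like(x))   # rough first guess for the slope
    for _ in range(n_iter):
        w = 1.0 / (m**2 * sx**2 + sy**2)     # weight from the post's formula
        a, m = wls_line(x, y, w)
    return a, m, w

# Invented example data with error bars in both x and y.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
sx = np.array([0.1, 0.2, 0.1, 0.3, 0.1])
sy = np.array([0.2, 0.2, 0.4, 0.2, 0.3])

a, m, w = fit_effective_variance(x, y, sx, sy)
print("intercept:", a, "slope:", m)
```

Here `sx` and `sy` are the error-bar values for each point. If the slope barely changes between passes, a single pass with an eyeballed slope would give essentially the same line.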
 
I'll read about the total LS fit and decide which method to choose!
Thank you very much!

-Lav
 
Definitely statistics is not my cup of tea! I am still uncertain what sigma refers to in this case! I am assuming that sigma_i is just the value I'd read directly from the error bars, and not the actual standard deviation that I have to calculate from the data points. Am I right?

Thank you in advance!
 
lavender81 said:
Definitely statistics is not my cup of tea! I am still uncertain what sigma refers to in this case! I am assuming that sigma_i is just the value I'd read directly from the error bars, and not the actual standard deviation that I have to calculate from the data points. Am I right?

Thank you in advance!
You can use the values you read off your error bars. For the fitted line, the absolute size of the weights doesn't matter, only their relative values.
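
A quick way to check the point about relative weights, as a minimal sketch with invented numbers: fit the same data twice with `numpy.polyfit`, once with the error bars as given and once with every error bar multiplied by 10. Note that `numpy.polyfit` applies its `w` argument to the residuals before squaring, so the value passed is 1/sigma rather than 1/sigma^2.

```python
import numpy as np

# Invented example data; sigma holds the per-point error-bar values.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
sigma = np.array([0.2, 0.3, 0.4, 0.2, 0.5])

# polyfit multiplies each residual by w before squaring,
# so w = 1/sigma corresponds to a weight of 1/sigma**2 in the sum of squares.
slope1, intercept1 = np.polyfit(x, y, 1, w=1.0 / sigma)

# Same relative weights, different absolute scale (all error bars 10x larger).
slope2, intercept2 = np.polyfit(x, y, 1, w=1.0 / (10.0 * sigma))

print("same slope:", np.isclose(slope1, slope2))
print("same intercept:", np.isclose(intercept1, intercept2))
```

The fitted slope and intercept should agree to rounding. The overall scale of the error bars does still matter if you go on to quote uncertainties on the fitted parameters or a chi-square value from them.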
 
