# Weighted least squares fitting

1. Jul 9, 2011

### lavender81

Hello y'all,

If I have n data points (xi, yi) each with error bars in both x and y (xi_err, yi_err), should I use 1/(xi_err^2+yi_err^2) as the weight in a weighted least squares linear fit, or should the weight be a different value that has nothing to do with error bars? I've never used WLS fitting and I'll appreciate your help!

Many thanks,

-Lav

2. Jul 9, 2011

### pmsrw3

In a word, no. That gives equal weight to the x and y errors, which is only right if the slope is 1, and if you knew the slope you wouldn't be doing a fit.

It would help to know more about the experiment. Where do the x's and y's come from, and where do the errors come from? Why do you expect a linear relationship between x and y?

3. Jul 9, 2011

### lavender81

They are two properties of a celestial object that I need to fit linearly, and the errors propagate from the method of extraction of the values of these properties (e.g. Monte Carlo). I didn't extract the data, I have them ready and since there is some scatter in the data, I thought of doing a weighted LS fit, but I wasn't sure what weight to choose! I googled some websites on this and found the most frequent choice to be: 1/sigma^2 (Is sigma the sample variance?)

Many thanks!
-Lav

4. Jul 9, 2011

### pmsrw3

Yes, $\frac{1}{\sigma^2}$ is the correct choice for weighting, and $\frac{1}{\sigma^2_y}$ would be great if there were errors only in your y's. But you have errors in both x and y. There's an easy way of dealing with this and a hard way. The easy way is to eyeball the slope m of the plot, reckon that if x is off by $\sigma_x$, that will give a calculated y that is off by $m \sigma_x$, and so use weight $\frac{1}{m^2 \sigma^2_x + \sigma^2_y}$. In almost every practical application, this works fine. The hard way is to do a total least squares fit (http://en.wikipedia.org/wiki/Total_least_squares).
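If it helps, here is a rough Python sketch of the easy way. Everything in it is illustrative: the function name `effective_variance_fit`, the NumPy dependency, and the synthetic data are assumptions made for the example. One departure from the recipe above: instead of eyeballing the slope once, the sketch starts from an unweighted fit and re-computes the weights from the fitted slope a few times. That iteration is an added convenience, not a requirement.

```python
import numpy as np

def effective_variance_fit(x, y, sx, sy, n_iter=20, tol=1e-10):
    """Fit y = m*x + b using weights 1 / (m^2 * sx^2 + sy^2).

    sx and sy are the per-point error bars (standard deviations) in x and y.
    Hypothetical helper written for illustration only.
    """
    x, y, sx, sy = (np.asarray(a, dtype=float) for a in (x, y, sx, sy))

    # Initial slope guess from an ordinary unweighted fit,
    # standing in for the "eyeballed" slope.
    m, b = np.polyfit(x, y, 1)

    for _ in range(n_iter):
        # An x error of sx shifts the predicted y by about m * sx.
        w = 1.0 / (m**2 * sx**2 + sy**2)

        # Closed-form weighted least squares for a straight line.
        xbar = np.sum(w * x) / np.sum(w)
        ybar = np.sum(w * y) / np.sum(w)
        m_new = np.sum(w * (x - xbar) * (y - ybar)) / np.sum(w * (x - xbar) ** 2)
        b = ybar - m_new * xbar

        converged = abs(m_new - m) <= tol * max(1.0, abs(m_new))
        m = m_new
        if converged:
            break

    return m, b

# Synthetic example (made-up numbers, not real measurements).
rng = np.random.default_rng(0)
x_true = np.linspace(0.0, 10.0, 15)
sx = np.full_like(x_true, 0.3)
sy = np.full_like(x_true, 0.5)
x_obs = x_true + rng.normal(0.0, sx)
y_obs = 2.0 * x_true + 1.0 + rng.normal(0.0, sy)

m_fit, b_fit = effective_variance_fit(x_obs, y_obs, sx, sy)
print(f"slope = {m_fit:.3f}, intercept = {b_fit:.3f}")
```

If all the x error bars are zero, the weights reduce to $\frac{1}{\sigma^2_y}$ and this is just an ordinary weighted fit.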

Last edited by a moderator: May 5, 2017

5. Jul 9, 2011

### lavender81

I'll read about the total LS fit and decide which method to choose!
Thank you very much!!!

-Lav

6. Jul 9, 2011

### lavender81

Statistics is definitely not my cup of tea! I am still uncertain what sigma refers to in this case. I am assuming that sigma_i is just the value I'd read directly from the error bar of point i, and not a standard deviation that I have to calculate from the data points myself. Am I right?