# Weighted least squares fitting

In summary: when performing a weighted least squares linear fit with data points (xi, yi) and error bars in both x and y (xi_err, yi_err), the weight should be 1/(m^2*sigma_x^2 + sigma_y^2), where m is a rough estimate of the slope. The sigma values can be read directly from the error bars; only the relative weights matter, not their absolute scale. This method works well for fitting properties of celestial objects, even with scatter in the data.
lavender81
Hello y'all,

If I have n data points (xi, yi), each with error bars in both x and y (xi_err, yi_err), should I use 1/(xi_err^2+yi_err^2) as the weight in a weighted least squares linear fit, or should the weight be a different value that has nothing to do with the error bars? I've never used WLS fitting and I'd appreciate your help!

Many thanks,

-Lav

lavender81 said:
Hello y'all,

should I use 1/(xi_err^2+yi_err^2) as the weight in a weighted least squares linear fit...
In a word, No. That gives equal weight to the x and y errors, which is not the right thing to do unless the slope is 1, and if you knew the slope you wouldn't be doing a fit.

It would help to know more about the experiment. Where do the x's and y's come from, and where do the errors come from? Why do you expect a linear relationship between x and y?

They are two properties of a celestial object that I need to fit linearly, and the errors propagate from the method of extraction of the values of these properties (e.g. Monte Carlo). I didn't extract the data, I have them ready and since there is some scatter in the data, I thought of doing a weighted LS fit, but I wasn't sure what weight to choose! I googled some websites on this and found the most frequent choice to be: 1/sigma^2 (Is sigma the sample variance?)

Many thanks!
-Lav

lavender81 said:
They are two properties of a celestial object that I need to fit linearly, and the errors propagate from the method of extraction of the values of these properties (e.g. Monte Carlo). I didn't extract the data, I have them ready and since there is some scatter in the data, I thought of doing a weighted LS fit, but I wasn't sure what weight to choose! I googled some websites on this and found the most frequent choice to be: 1/sigma^2 (Is sigma the sample variance?)

Many thanks!
-Lav
Yes, $\frac{1}{\sigma^2}$ is the correct choice for weighting, and $\frac{1}{\sigma^2_y}$ would be great if there were errors only in your y's. But you have errors in both x and y. There's an easy way of dealing with this and a hard way. The easy way is to eyeball the slope m of the plot, reckon that if x is off by $\sigma_x$, that will give a calculated y that is off by $m \sigma_x$, and so use weight $\frac{1}{m^2 \sigma^2_x + \sigma^2_y}$. In almost every practical application, this works fine. The hard way is to do a total least squares fit (http://en.wikipedia.org/wiki/Total_least_squares).
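The effective-variance trick described above can be sketched in a few lines of NumPy. This is a minimal illustration with made-up data and error bars (the function name `wls_line` and all the numbers are hypothetical, not from the thread); the unweighted fit stands in for "eyeballing" the slope:

```python
import numpy as np

def wls_line(x, y, w):
    """Weighted least-squares fit of y = m*x + b; w are inverse-variance weights."""
    W, Wx, Wy = w.sum(), (w * x).sum(), (w * y).sum()
    Wxx, Wxy = (w * x * x).sum(), (w * x * y).sum()
    m = (W * Wxy - Wx * Wy) / (W * Wxx - Wx ** 2)
    b = (Wy - m * Wx) / W
    return m, b

# Hypothetical data lying on y = 2x + 1, with per-point error bars in x and y.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 2.0 * x + 1.0
sx = np.array([0.1, 0.2, 0.1, 0.3, 0.2])  # x error bars
sy = np.array([0.2, 0.1, 0.3, 0.2, 0.1])  # y error bars

# Step 1: get a rough slope (an unweighted fit plays the role of the eyeball).
m0, _ = wls_line(x, y, np.ones_like(x))

# Step 2: fold the x errors into an effective y-variance and refit.
w = 1.0 / (m0 ** 2 * sx ** 2 + sy ** 2)
m, b = wls_line(x, y, w)
```

Since the x errors only enter through the rough slope m0, one pass is usually enough; iterating (refitting with the updated slope) changes little in practice.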

I'll read about the total LS fit and decide which method to choose!
Thank you very much!

-Lav

Definitely statistics is not my cup of tea! I am still uncertain what sigma refers to in this case! I am assuming that sigma_i is just the value I'd read directly from the error bars, and not the actual standard deviation that I have to calculate from the data points. Am I right?

lavender81 said:
Definitely statistics is not my cup of tea! I am still uncertain what sigma refers to in this case! I am assuming that sigma_i is just the value I'd read directly from the error bars, and not the actual standard deviation that I have to calculate from the data points. Am I right?

You can use the value you measure from your error bars. The absolute weights don't matter: only the relative values.
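The point that only relative weights matter is easy to check numerically. In this sketch (with made-up data and sigmas), scaling every weight by the same constant produces the same fitted line; note that NumPy's `polyfit` expects `w = 1/sigma`, not `1/sigma**2`, because it weights the residuals before squaring:

```python
import numpy as np

# Hypothetical scattered data with made-up per-point sigmas.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 3.2, 4.8])
sigma = np.array([0.1, 0.3, 0.2, 0.1])

fit1 = np.polyfit(x, y, 1, w=1.0 / sigma)
fit2 = np.polyfit(x, y, 1, w=10.0 / sigma)  # every weight scaled by 10

# fit1 and fit2 agree: a common scale factor multiplies the whole
# sum of squares and so cannot move its minimum.
```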

## 1. What is weighted least squares fitting?

Weighted least squares fitting is a statistical technique used to minimize the sum of squared errors between a set of data points and a fitted line or curve. It takes into account the variability of each data point by assigning weights to them, which are inversely proportional to their variance. This allows for a more accurate fit, especially when there is heteroscedasticity (unequal variance) in the data.
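Concretely, a weighted least squares line fit minimizes the sum of w_i*(y_i - (m*x_i + b))^2 over the slope and intercept. A minimal sketch using NumPy's `polyfit` (the data and sigmas below are invented for illustration; `polyfit` takes `w = 1/sigma` for inverse-variance weighting):

```python
import numpy as np

# Hypothetical points on y = 0.5*x + 2, with unequal per-point y uncertainties.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 0.5 * x + 2.0
sigma = np.array([0.1, 0.5, 0.2, 0.4, 0.1])

# polyfit minimizes sum((w_i * (y_i - p(x_i)))**2), so pass w = 1/sigma
# (not 1/sigma**2) to weight each point by its inverse variance.
slope, intercept = np.polyfit(x, y, 1, w=1.0 / sigma)
```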

## 2. How is weighted least squares fitting different from ordinary least squares fitting?

The main difference between weighted least squares fitting and ordinary least squares fitting is that the former takes into account the variability of the data points, while the latter assumes equal variance for all data points. This makes weighted least squares fitting more appropriate for data sets with unequal variance, as it gives more weight to the data points with lower variability and less weight to the points with higher variability.
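The difference shows up clearly when one point has a much larger uncertainty than the rest. In this made-up example, four precise points lie on y = x and a fifth, very uncertain point sits far off the line: the ordinary fit is dragged toward the outlier, while the inverse-variance weighted fit all but ignores it:

```python
import numpy as np

def line_fit(x, y, w):
    """Least-squares line y = m*x + b; w are the per-point weights."""
    W, Wx, Wy = w.sum(), (w * x).sum(), (w * y).sum()
    Wxx, Wxy = (w * x * x).sum(), (w * x * y).sum()
    m = (W * Wxy - Wx * Wy) / (W * Wxx - Wx ** 2)
    return m, (Wy - m * Wx) / W

# Four precise points on y = x, plus one wild point with a huge error bar.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.0, 2.0, 3.0, 4.0, 20.0])
sigma = np.array([0.1, 0.1, 0.1, 0.1, 10.0])

m_ols, _ = line_fit(x, y, np.ones_like(x))   # OLS: equal weights, slope ~ 4
m_wls, _ = line_fit(x, y, 1.0 / sigma ** 2)  # WLS: inverse variance, slope ~ 1
```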

## 3. How are weights determined in weighted least squares fitting?

The weights in weighted least squares fitting are typically determined by the inverse of the variance of each data point. This means that data points with smaller variances will have higher weights, while data points with larger variances will have lower weights. In some cases, the weights may also be based on the error associated with each data point.

## 4. What are the advantages of using weighted least squares fitting?

One of the main advantages of using weighted least squares fitting is that it provides a more accurate fit for data sets with unequal variance. It also allows error information to be incorporated into the fit, which can improve the overall goodness of fit. Additionally, when outlying points carry large error bars, weighted least squares fitting downweights them and so limits their influence on the fitted line or curve.

## 5. What are some common applications of weighted least squares fitting?

Weighted least squares fitting is commonly used in a variety of fields, including statistics, economics, and engineering. It is often used to fit regression models, such as linear regression, to data sets with unequal variances. It is also widely used in time series analysis, where the data may exhibit heteroscedasticity. Additionally, weighted least squares fitting can be used in machine learning to improve the accuracy of predictive models.
