# Error in least squares fit: how to include error of points?

In summary: Least squares fitting accounts for measurement errors under certain assumptions, and weighted least squares can incorporate per-point uncertainties; other models are available if you have more information about the errors. The residuals can be used to evaluate the quality of the fit and to decide whether higher-order terms should be included.

#### ORF

Hello

I have a doubt with the least squares fitting (linear fitting).

Introductory statistics textbooks only take the statistical error of the fit into account, not the errors of the fitted points themselves.

How is the error of the fitted points taken into account, and included in the total error of the fitting parameters?

My English is not very good, so if something is unclear, I will try to explain it better.

Thank you for your time :)

Greetings!

Usually, what you call the fitting error is *taken* to be the measurement error.

Hello

Thank you for answering so quickly.

Let's see an example: I have n points, (x1,y1), ..., (xn,yn), each with an associated error, i.e., (err-x1,err-y1), ..., (err-xn,err-yn).

A least squares fit only uses the values (x1,y1), ..., (xn,yn). The error in the fitting parameters comes from the statistical scatter of the points, but it does not take into account the errors associated with the points themselves, i.e., (err-x1,err-y1), ..., (err-xn,err-yn).

So, how can I include the errors associated with the points, i.e., (err-x1,err-y1), ..., (err-xn,err-yn)?

Thank you for your time :)

Greetings!

Let me ask you first: how do you know the error of the measurements? If you knew the error of each individual measurement, then you could obviously just use the *true* value (measured value minus error). If you only have an idea of the error distribution, and you want it considered separately from the fitting error, you need a more complex model, e.g. a Kalman filter.

That's the very basic least squares algorithm.

There are various weighted least squares algorithms (google that term) that account for the fact that different measurements have different uncertainties. You certainly don't want measurements whose uncertainty is (for example) a centimeter to carry the same weight as measurements whose uncertainty is less than a millimeter. The high-precision measurements should dominate over the low-precision ones.
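As a minimal sketch of weighted least squares with numpy (all data values here are made up for illustration), `np.polyfit` accepts per-point weights; note numpy's convention that the weights multiply the residuals, so the appropriate weight is 1/sigma rather than 1/sigma²:

```python
import numpy as np

# Hypothetical data: y measured at x, with per-point 1-sigma uncertainties sigma_y
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])
sigma_y = np.array([0.1, 0.1, 1.0, 0.1, 0.1])  # third point is much less precise

# Weighted linear fit; cov=True also returns the covariance of the parameters,
# from which the parameter uncertainties are the square roots of the diagonal
coeffs, cov = np.polyfit(x, y, deg=1, w=1.0 / sigma_y, cov=True)
slope, intercept = coeffs
slope_err, intercept_err = np.sqrt(np.diag(cov))

print(f"slope = {slope:.3f} +/- {slope_err:.3f}")
print(f"intercept = {intercept:.3f} +/- {intercept_err:.3f}")
```

With these weights, the imprecise third point has little influence on the fitted line, which is exactly the behavior described above.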

Going one step beyond that, you can use the residuals as a test of whether your least squares fit makes sense. Least squares in and of itself says nothing about whether the fit is a good fit. Suppose (for example) you are comparing the models $y=c_0$ vs. $y=c_0+c_1x$ vs. $y=c_0+c_1x+c_2x^2$ vs. $y=c_0+c_1x+c_2x^2+c_3x^3$ vs. $y=c_0+c_1x+c_2x^2+c_3x^3+c_4x^4$. Suppose that weighted least squares fits respectively explain 10%, 30%, 99%, 99.1%, and 99.2% of the observed variance. Which model should you use? The answer in this case is the quadratic model. The constant and linear models are garbage; both are lousy predictors. The huge jumps from the constant model to the linear model to the quadratic model tell you that those progressively higher order terms are capturing essential characteristics of the underlying process. The tiny jumps after the quadratic model tell you that those even higher order models are most likely just overfitting noise.
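That jump-then-plateau pattern is easy to see numerically. A minimal sketch with synthetic data generated from a truly quadratic process (all values here are made up for illustration), computing the fraction of variance explained ($R^2$) for polynomial fits of increasing degree:

```python
import numpy as np

# Synthetic data from a quadratic process plus Gaussian noise
rng = np.random.default_rng(0)
x = np.linspace(-3.0, 3.0, 60)
y = 1.0 + 0.5 * x + 2.0 * x**2 + rng.normal(0.0, 0.5, x.size)

# R^2 = 1 - SSE/SST for each polynomial degree
ss_tot = np.sum((y - y.mean()) ** 2)
r2s = []
for deg in range(5):
    coeffs = np.polyfit(x, y, deg)
    residuals = y - np.polyval(coeffs, x)
    r2 = 1.0 - np.sum(residuals**2) / ss_tot
    r2s.append(r2)
    print(f"degree {deg}: R^2 = {r2:.4f}")
```

The $R^2$ values jump sharply at degree 2 and then barely move for degrees 3 and 4, which is the signature of a genuinely quadratic process described in the paragraph above.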

Least squares fitting does take into account the errors of the individual points under the assumption that they are independent and normally distributed with zero mean and constant variance. If you have more information about the errors, then you should use a different model.

## 1. How is the error in a least squares fit calculated?

The error in a least squares fit is calculated by finding the difference between the actual data points and the predicted values from the least squares regression line. This difference, also known as the residual, is then squared and summed to obtain the total error.
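A minimal sketch of that calculation (the data values here are made up for illustration): fit a line, form the residuals, and sum their squares. For an ordinary least squares fit with an intercept, the residuals also sum to approximately zero, which is a quick sanity check.

```python
import numpy as np

# Hypothetical data points
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([0.9, 3.1, 4.8, 7.2])

# Ordinary (unweighted) linear least squares fit
slope, intercept = np.polyfit(x, y, deg=1)

# Residuals: actual values minus predictions from the regression line
predicted = slope * x + intercept
residuals = y - predicted

# Total error: sum of squared residuals
sse = np.sum(residuals**2)
print(f"residuals: {residuals}")
print(f"sum of squared errors: {sse:.4f}")
```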

## 2. Can the error of individual data points be included in a least squares fit?

Yes, the error of individual data points can be included in a least squares fit by using weighted least squares regression. This takes into account the uncertainty or variability in each data point and adjusts the regression line accordingly.
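Weighted least squares handles uncertainties in y; when there are also uncertainties in x, as in the original question, one standard option is orthogonal distance regression. A minimal sketch using `scipy.odr` (the data and error values here are made up for illustration):

```python
import numpy as np
from scipy import odr

# Hypothetical data with uncertainties in both x and y
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.2, 2.8, 5.1, 7.0, 8.9])
err_x = np.full(x.size, 0.1)  # 1-sigma uncertainty on each x
err_y = np.full(y.size, 0.2)  # 1-sigma uncertainty on each y

# Linear model: beta[0] is the slope, beta[1] the intercept
def linear(beta, x):
    return beta[0] * x + beta[1]

model = odr.Model(linear)
data = odr.RealData(x, y, sx=err_x, sy=err_y)  # sx/sy carry the point errors
fit = odr.ODR(data, model, beta0=[1.0, 0.0]).run()

slope, intercept = fit.beta
slope_err, intercept_err = fit.sd_beta  # parameter uncertainties
print(f"slope = {slope:.3f} +/- {slope_err:.3f}")
print(f"intercept = {intercept:.3f} +/- {intercept_err:.3f}")
```

Unlike ordinary least squares, which minimizes vertical distances only, ODR minimizes distances weighted by both sets of errors, so the point uncertainties in x and y both propagate into the fitted parameter uncertainties.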

## 3. How does including the error of points affect the accuracy of the least squares fit?

Including the error of points in a least squares fit can improve the accuracy of the fit as it takes into account the variability in the data. However, this also depends on the quality and reliability of the error estimates for each data point.

## 4. Can the error of points be included in all types of least squares fits?

Yes, the error of points can be included in all types of least squares fits, including linear, polynomial, and exponential regression. However, the method of including the error may differ depending on the type of regression being performed.

## 5. Is it necessary to include the error of points in a least squares fit?

Including the error of points in a least squares fit is not always necessary, but it can provide more accurate and reliable results, especially when dealing with large datasets or data with high variability. It ultimately depends on the specific needs and goals of the analysis.
