Error in least squares fit: how to include error of points?

  • #1
ORF
Hello

I have a question about least squares fitting (linear fitting).

Introductory statistics textbooks only take into account the statistical error of the fit, not the errors of the fitted points themselves.

How are the errors of the fitted points taken into account and included in the total error of the fitting parameters?

My English is not very good, so if something is unclear, I will try to explain it better.

Thank you for your time :)

Greetings!
 

Answers and Replies

  • #2
rumborak
Usually, what you call the fitting error is *taken* to be the measurement error.
 
  • #3
ORF
Hello

Thank you for answering so quickly.

Let's look at an example: I have n points, (x1,y1), ..., (xn,yn), each with an error, i.e., (err-x1,err-y1), ..., (err-xn,err-yn).

A least squares fit only takes into account the values (x1,y1), ..., (xn,yn). The error of the fitting parameters comes from the statistical scatter of the points, but it doesn't take into account the errors associated with the points, i.e., (err-x1,err-y1), ..., (err-xn,err-yn).

So, how can I include the errors associated with the points, i.e., (err-x1,err-y1), ..., (err-xn,err-yn)?

Thank you for your time :)

Greetings!
 
  • #4
rumborak
Let me ask you first: how do you know the error of the measurements? I mean, if you knew the exact error of each single measurement, then you could obviously just use the *true* value (measured minus error). If you just have an idea of the error distribution, and you want to have it considered separately from the fitting error, you need a more complex model, e.g. a Kalman filter.
 
  • #5
D H
That's the very basic least squares algorithm.

There are various weighted least squares algorithms (google that term) that account for the fact that different measurements have different uncertainties. You certainly don't want to let measurements whose uncertainty is (for example) a centimeter have the same weight as other measurements whose uncertainty is less than a millimeter. The high precision measurements should dominate over the lower precision measurements.
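As a minimal sketch of the idea, here is a weighted straight-line fit in plain Python. It assumes each point has a known, independent, Gaussian uncertainty in y only (errors in x are ignored), and weights each point by 1/sigma_y². The function name `weighted_line_fit` and the sample data are made up for illustration.

```python
import math

def weighted_line_fit(x, y, sigma_y):
    """Fit y = a + b*x, weighting each point by 1/sigma_y**2.

    Returns (a, b, sigma_a, sigma_b). The parameter uncertainties follow
    from propagating the stated sigma_y through the fit; they do not
    depend on how well the line actually matches the points.
    """
    w = [1.0 / s**2 for s in sigma_y]
    S = sum(w)
    Sx = sum(wi * xi for wi, xi in zip(w, x))
    Sy = sum(wi * yi for wi, yi in zip(w, y))
    Sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    Sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    delta = S * Sxx - Sx**2
    a = (Sxx * Sy - Sx * Sxy) / delta   # intercept
    b = (S * Sxy - Sx * Sy) / delta     # slope
    return a, b, math.sqrt(Sxx / delta), math.sqrt(S / delta)

# Hypothetical data: the third and fourth points are five times less precise.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.1, 2.9, 5.2, 6.8, 9.1]
sigma_y = [0.1, 0.1, 0.5, 0.5, 0.1]

a, b, sigma_a, sigma_b = weighted_line_fit(x, y, sigma_y)
print(f"intercept = {a:.3f} +/- {sigma_a:.3f}")
print(f"slope     = {b:.3f} +/- {sigma_b:.3f}")
```

With equal uncertainties on every point, the fitted line is the same as the ordinary unweighted least squares line.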

Going one step beyond that, you can use the residuals as a test of whether your least squares fit makes sense. Least squares in and of itself says nothing regarding whether the fit is a good fit. Suppose (for example) you are looking at modeling [itex]y=c_0[/itex] vs. [itex]y=c_0+c_1x[/itex] vs. [itex]y=c_0+c_1x+c_2x^2[/itex] vs. [itex]y=c_0+c_1x+c_2x^2+c_3x^3[/itex] vs. [itex]y=c_0+c_1x+c_2x^2+c_3x^3+c_4x^4[/itex]. Suppose that weighted least squares fits respectively explain 10%, 30%, 99%, 99.1%, and 99.2% of the observed variation. Which model should you use? The answer in this case is the quadratic model. The constant and linear models are garbage. Both are lousy predictors. Those huge jumps from the constant model to the linear model to the quadratic model should tell you that you are capturing some essential characteristics of the underlying process with those progressively higher order terms. The tiny jumps after the quadratic model should tell you that those even higher order models most likely are just overfitting noise.
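A rough sketch of that comparison in plain Python follows. It fits polynomials of degree 0 to 4 by weighted least squares (solving the normal equations directly) and reports the fraction of the weighted variation about the mean that each fit explains. The helper names, the choice of "explained fraction" as the measure, and the synthetic quadratic data with fixed pseudo-noise are all assumptions made for illustration.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    sol = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * sol[j] for j in range(i + 1, n))
        sol[i] = (M[i][n] - s) / M[i][i]
    return sol

def weighted_polyfit(x, y, sigma, degree):
    """Return coefficients c_0..c_degree of the weighted least squares polynomial."""
    w = [1.0 / s**2 for s in sigma]
    m = degree + 1
    A = [[sum(wi * xi**(j + k) for wi, xi in zip(w, x)) for k in range(m)]
         for j in range(m)]
    b = [sum(wi * yi * xi**j for wi, xi, yi in zip(w, x, y)) for j in range(m)]
    return solve(A, b)

def explained_fraction(x, y, sigma, degree):
    """1 - (weighted residual sum of squares) / (weighted variation about the mean)."""
    c = weighted_polyfit(x, y, sigma, degree)
    w = [1.0 / s**2 for s in sigma]
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    model = [sum(ck * xi**k for k, ck in enumerate(c)) for xi in x]
    ss_res = sum(wi * (yi - mi)**2 for wi, yi, mi in zip(w, y, model))
    ss_tot = sum(wi * (yi - ybar)**2 for wi, yi in zip(w, y))
    return 1.0 - ss_res / ss_tot

# Synthetic data: a quadratic plus small fixed "noise" (not real measurements).
noise = [0.1, -0.2, 0.15, -0.05, 0.2, -0.1, 0.05, -0.15, 0.1, -0.1]
x = [i - 4.5 for i in range(10)]
y = [2.0 - 3.0 * xi + 0.5 * xi**2 + ni for xi, ni in zip(x, noise)]
sigma = [0.15] * len(x)

for degree in range(5):
    print(degree, round(explained_fraction(x, y, sigma, degree), 4))
```

Because the models are nested, the explained fraction can only go up as terms are added, so the thing to look at is the size of each jump, not whether the number increases.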
 
  • #6
Least squares fitting does take into account the errors of the individual points under the assumption that they are normally distributed, and independent with 0 mean and constant variance. If you have more information about the errors then you should use a different model.
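To illustrate that assumption, here is a small sketch of an ordinary (unweighted) straight-line fit in plain Python. Under the constant-variance assumption, the unknown common error of the points is estimated from the scatter of the residuals and then propagated to the slope and intercept. The function name `ols_line_fit` and the sample data are hypothetical.

```python
import math

def ols_line_fit(x, y):
    """Fit y = a + b*x by ordinary least squares.

    Assumes independent, zero-mean errors in y with one common (unknown)
    variance. That variance is estimated from the residuals as
    SSR / (n - 2), and is then used for the parameter standard errors.
    Returns (a, b, se_a, se_b, s) where s is the estimated error per point.
    """
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar)**2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b = sxy / sxx                       # slope
    a = ybar - b * xbar                 # intercept
    ssr = sum((yi - a - b * xi)**2 for xi, yi in zip(x, y))
    s2 = ssr / (n - 2)                  # estimated variance of a single point
    se_b = math.sqrt(s2 / sxx)
    se_a = math.sqrt(s2 * (1.0 / n + xbar**2 / sxx))
    return a, b, se_a, se_b, math.sqrt(s2)

# Hypothetical data with no per-point error bars supplied.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]
a, b, se_a, se_b, s = ols_line_fit(xs, ys)
print(f"intercept = {a:.3f} +/- {se_a:.3f}")
print(f"slope     = {b:.3f} +/- {se_b:.3f}")
print(f"estimated error per point = {s:.3f}")
```

If the estimated error per point comes out very different from the uncertainties you actually know for your measurements, that is a hint that the constant-variance model does not describe your data.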
 