Uncertainty for least squares fit


Discussion Overview

The discussion revolves around calculating the uncertainty for a single point on a fitted parabolic equation using the least squares fit method. Participants explore the implications of using conditional standard errors and the relationship between the fitted model and the uncertainties associated with individual data points.

Discussion Character

  • Technical explanation
  • Mathematical reasoning
  • Debate/contested

Main Points Raised

  • One participant describes fitting data to a parabolic equation and seeks to calculate the uncertainty for a single point on the fitted line.
  • Another participant introduces the concept of conditional standard errors and questions its applicability to the problem.
  • A participant expresses uncertainty about the validity of their method for calculating uncertainty, noting discrepancies between the uncertainty of the fitted line and individual data points.
  • One participant provides a formula for the variance of the predicted y-value given an x-value in a simple linear model, but does not clarify how it applies to the parabolic case.
  • There is confusion regarding the terms used in the formula, with participants seeking clarification on the meaning of variables like x_i, x_bar, and sigma.
  • Another participant questions the relevance of the discussed concepts to the goodness of fit, suggesting a focus on the number of data points instead.
  • One participant explains that sigma^2 is the variance of the residual terms (sigma being their standard deviation), which is assumed to be constant across residuals.

Areas of Agreement / Disagreement

Participants express varying levels of understanding regarding the application of conditional standard errors and the relationship between the fitted model and uncertainty. There is no consensus on the best approach to calculate uncertainty for a single point on the fitted line.

Contextual Notes

Participants highlight potential limitations in understanding the relationship between the fitted model and the individual data points, as well as the assumptions underlying the calculations of uncertainty.

LizardCobra
I am fitting data to a parabolic equation using the least squares fit method. Each data point that goes into the fit is the average of 5 data points at that x value, so that each point has error bars that come from the standard deviation of those 5 y values.

I have a fitted equation, and I can calculate the residuals, but how can I calculate the uncertainty for a single point on the fitted line? So say the fitted equation is y = ax^2 +bx+c. I want to be able to plug in a value for x, get a value for y (that much is trivial) and get the associated uncertainty on that y value.

Thanks
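
One possible way to get the number asked for here is a weighted fit that propagates the coefficient covariance to the fitted value. The sketch below is only an illustration under stated assumptions: the data arrays are made up, the error bars are treated as independent standard deviations of each averaged point, and `predict_with_uncertainty` is a hypothetical helper name. If the error bars are the spread of the 5 raw values rather than of their mean, the standard error of each averaged point would be that spread divided by sqrt(5).

```python
import numpy as np

# Made-up example data: x positions, averaged y values, and their error bars.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y_mean = np.array([1.1, 2.9, 9.2, 18.8, 33.1, 51.2, 72.7])
y_err = np.array([0.3, 0.4, 0.3, 0.5, 0.4, 0.6, 0.5])

# Weighted least squares fit of y = a*x^2 + b*x + c.
# np.polyfit expects weights of 1/sigma (not 1/sigma^2).
# cov="unscaled" keeps the covariance tied to the supplied error bars
# instead of rescaling it by the reduced chi-square.
coeffs, cov = np.polyfit(x, y_mean, 2, w=1.0 / y_err, cov="unscaled")

def predict_with_uncertainty(x0):
    """Return the fitted y at x0 and its 1-sigma uncertainty."""
    # Gradient of a*x0^2 + b*x0 + c with respect to (a, b, c).
    v = np.array([x0 ** 2, x0, 1.0])
    y0 = float(v @ coeffs)
    # Linear error propagation: Var(y0) = v^T C v.
    sigma_y0 = float(np.sqrt(v @ cov @ v))
    return y0, sigma_y0

y0, sigma_y0 = predict_with_uncertainty(2.0)
print(f"y(2.0) = {y0:.3f} +/- {sigma_y0:.3f}")
```

This gives the uncertainty of the fitted curve at x0, not the scatter expected for a new measurement taken there.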
 
Hey LizardCobra and welcome to the forums.

Are you aware of conditional standard errors, i.e. se(y|x)? They are related to conditional variances and standard deviations (same sort of idea), but you are dealing with sample data and not population data.
 
No, I hadn't considered conditional standard error. Does that apply here, and if so how would I use it?

I said that the standard deviation of the fitted line (which I called the uncertainty on y_fit) was the square root of the variance of the residuals. The uncertainty for the fitted line ended up being smaller than the uncertainty for many of the individual data points. I'm not entirely sure whether what I did is kosher.
 
An example using a simple linear model (with y and x only) would be that

Var(y_hat | x) = sigma^2 * (1/n + (x - x_bar)^2 / SXX)

where SXX = sum from i = 1 to n of (x_i - x_bar)^2, and n is the number of paired observations.

The above gives the variance of the predicted y (y_hat) given an x.

You can derive these from the basic linear model.
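
As a minimal sketch of that formula for a straight-line fit (not the parabolic case), the snippet below estimates sigma^2 from the residuals with n - 2 degrees of freedom and evaluates the standard error of the fitted value at a chosen x. The function name `fitted_value_se` and the data are made up for illustration, and the fit is assumed to be unweighted.

```python
import numpy as np

def fitted_value_se(x, y, x0):
    """Fitted value and its standard error at x0 for y = b0 + b1*x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    x_bar = x.mean()
    sxx = np.sum((x - x_bar) ** 2)

    # Ordinary least squares estimates of slope and intercept.
    b1 = np.sum((x - x_bar) * (y - y.mean())) / sxx
    b0 = y.mean() - b1 * x_bar

    # Estimate sigma^2 from the residuals (two fitted parameters).
    residuals = y - (b0 + b1 * x)
    sigma_hat_sq = np.sum(residuals ** 2) / (n - 2)

    # Var(y_hat | x0) = sigma^2 * (1/n + (x0 - x_bar)^2 / SXX)
    var_y_hat = sigma_hat_sq * (1.0 / n + (x0 - x_bar) ** 2 / sxx)
    return b0 + b1 * x0, np.sqrt(var_y_hat)

# Made-up data for illustration.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([0.1, 0.9, 2.2, 2.9, 4.1])

y_mid, se_mid = fitted_value_se(x, y, 2.0)   # at the sample mean of x
y_far, se_far = fitted_value_se(x, y, 6.0)   # extrapolating beyond the data
print(se_mid, se_far)
```

The standard error is smallest at x_bar and grows as x moves away from it, which is why extrapolated values are less certain.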
 
I don't understand this at all. What are x_i, x_bar, and x referring to here? And shouldn't the uncertainty somehow depend on the y values for the data and the y values for the fit?
 
The ith observation is x_i and the sample mean is x_bar. The x is the x-value at which you are finding the standard error.

Remember you are getting the standard error of the fitted y-value at a given x point (recall that y is a function of x).
 
I don't see how this is at all related to the 'goodness of fit'. It seems like it is really only dependent on the number of data points that were used.
 
What is sigma? The standard deviation of... what?

thanks
 
You asked about uncertainty for a fitted model at a single point, and I've outlined the expression to calculate it under a simple linear regression.

Sigma^2 is the variance of the residual terms (sigma is their standard deviation; we assume in this model it's constant for every residual term), and we estimate it through sigma_hat^2, which is computed from the residuals of the fit.
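
The same idea can be carried over to a parabola by writing the fit in matrix form, where the variance of the fitted value at x0 is sigma^2 * v^T (X^T X)^(-1) v with v = (x0^2, x0, 1). The sketch below assumes an unweighted fit with constant residual variance; `quadratic_fit_se` is a hypothetical helper and the data are made up. Because sigma_hat^2 comes from the residuals, the result does depend on how well the curve fits the y values, not only on the number of points.

```python
import numpy as np

def quadratic_fit_se(x, y, x0):
    """Fitted value and its standard error at x0 for y = a*x^2 + b*x + c."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Design matrix with columns x^2, x, 1.
    X = np.column_stack([x ** 2, x, np.ones_like(x)])
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)

    # sigma_hat^2: residual sum of squares over (n - 3) degrees of freedom.
    residuals = y - X @ coeffs
    sigma_hat_sq = residuals @ residuals / (x.size - 3)

    # Covariance of (a, b, c), then propagate to the fitted value at x0.
    cov = sigma_hat_sq * np.linalg.inv(X.T @ X)
    v = np.array([x0 ** 2, x0, 1.0])
    return float(v @ coeffs), float(np.sqrt(v @ cov @ v))

# Made-up data for illustration.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([1.2, 2.8, 9.1, 19.3, 32.8, 51.4, 72.9])

y_in, se_in = quadratic_fit_se(x, y, 3.0)     # inside the data range
y_out, se_out = quadratic_fit_se(x, y, 10.0)  # far outside the data range
print(se_in, se_out)
```

With several points constraining the curve, the standard error of the fitted value can legitimately be smaller than the error bar on any single data point.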
 
