Uncertainty for least squares fit

LizardCobra
I am fitting data to a parabolic equation using the least squares fit method. Each data point that goes into the fit is the average of 5 data points at that x value, so that each point has error bars that come from the standard deviation of those 5 y values.

I have a fitted equation, and I can calculate the residuals, but how can I calculate the uncertainty for a single point on the fitted line? Say the fitted equation is y = ax^2 + bx + c. I want to be able to plug in a value for x, get a value for y (that much is trivial), and get the associated uncertainty on that y value.

Thanks
 
Hey LizardCobra and welcome to the forums.

Are you aware of conditional standard errors, i.e. se(y|x)? They are related to conditional variances and standard deviations (same sort of idea), but here you are dealing with sample data and not population data.
 
No, I hadn't considered conditional standard error. Does that apply here, and if so, how would I use it?

What I did was take the standard deviation of the fitted line (which I called the uncertainty on y_fit) to be the square root of the variance of the residuals. The uncertainty for the fitted line ended up being smaller than the uncertainty for many of the individual data points. I'm not entirely sure that what I did is kosher.
 
An example using a simple linear model (with y and x only) would be that

Var(y_hat | x) = sigma^2 * (1/n + (x - x_bar)^2 / SXX)

where n is the number of paired observations and SXX = sum over i = 1 to n of (x_i - x_bar)^2.

The above gives the variance of the predicted y (y_hat) given an x.

You can derive these from the basic linear model.
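As a rough illustration of the formula above, here is a minimal Python sketch for the simple linear model y = b0 + b1*x. The function name `fitted_value_se` and the use of NumPy are assumptions made for this example, and sigma^2 is replaced by the usual estimate from the residuals with n - 2 degrees of freedom.

```python
import numpy as np

def fitted_value_se(x, y, x0):
    """Return (y_hat, se) for the fitted line y = b0 + b1*x evaluated at x0.

    se is the square root of sigma_hat^2 * (1/n + (x0 - x_bar)^2 / SXX).
    Illustrative helper, not part of any library.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    x_bar = x.mean()
    sxx = np.sum((x - x_bar) ** 2)

    # Ordinary least squares estimates of slope and intercept.
    b1 = np.sum((x - x_bar) * (y - y.mean())) / sxx
    b0 = y.mean() - b1 * x_bar

    # Estimate sigma^2 from the residuals (data y minus fitted y),
    # using n - 2 degrees of freedom for a two-parameter model.
    residuals = y - (b0 + b1 * x)
    sigma_hat2 = np.sum(residuals ** 2) / (n - 2)

    var_y_hat = sigma_hat2 * (1.0 / n + (x0 - x_bar) ** 2 / sxx)
    return b0 + b1 * x0, np.sqrt(var_y_hat)
```

The standard error is smallest at x0 = x_bar and grows as x0 moves away from the centre of the data. The y values enter through sigma_hat2, which is computed from the residuals.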
 
I don't understand this at all. What are x_i, x_bar, and x referring to here? And shouldn't the uncertainty somehow depend on the y values for the data and the y values for the fit?
 
The ith observation is x_i and the sample mean is x_bar. The x is the x-value at which you are finding the standard error.

Remember you are getting the standard error of the fitted y-value at a given x point (recall that y is a function of x).
 
I don't see how this is at all related to the 'goodness of fit'. It seems like it is really only dependent on the number of data points that were used.
 
What is sigma? The standard deviation of... what?

thanks
 
You asked about uncertainty for a fitted model at a single point, and I've outlined the expression to calculate it under a simple linear regression.

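sigma^2 is the variance of the residual terms (in this model we assume it is the same for every residual term), so sigma is their standard deviation. We estimate it through sigma_hat^2, which is computed from the differences between the data y values and the fitted y values; that is where the quality of the fit enters.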
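For the parabolic fit with per-point error bars described in the original question, the same idea carries over by propagating the covariance matrix of the fitted coefficients to the fitted value. The following is only a sketch under stated assumptions: the function name `parabola_fit_with_se` is hypothetical, the supplied `y_err` values are treated as the true one-sigma uncertainties of each averaged y value (whether that should be the standard deviation of the 5 readings or that value divided by sqrt(5) is left to the reader), and the covariance is not rescaled by the residuals.

```python
import numpy as np

def parabola_fit_with_se(x, y, y_err, x0):
    """Weighted least squares fit of y = a*x^2 + b*x + c.

    Returns (coeffs, y0, se): coeffs = [a, b, c], y0 = fitted values at x0,
    se = one-sigma uncertainty of the fitted values at x0.
    Illustrative helper, not part of any library.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    y_err = np.asarray(y_err, dtype=float)

    X = np.vander(x, 3)              # design matrix, columns: x^2, x, 1
    w = 1.0 / y_err ** 2             # weights = inverse variances

    # Covariance of the coefficients: (X^T W X)^-1, assuming y_err is correct.
    cov = np.linalg.inv(X.T @ (w[:, None] * X))
    coeffs = cov @ (X.T @ (w * y))

    # Propagate to the fitted value: Var(y0) = v^T cov v with v = [x0^2, x0, 1].
    v = np.vander(np.atleast_1d(x0).astype(float), 3)
    y0 = v @ coeffs
    se = np.sqrt(np.einsum("ij,jk,ik->i", v, cov, v))
    return coeffs, y0, se
```

With this approach the uncertainty of the fitted curve near the middle of the data is typically smaller than the error bar on any single point, because every point contributes to the fit. A fitted-line uncertainty that is smaller than the individual error bars is therefore not by itself a sign of a mistake.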
 