Finding the Uncertainty of the Slope Parameter of a Linear Regression

richardc · Mar 1, 2014

Suppose I have measurements [itex]x_i \pm \sigma_{xi}[/itex] and [itex]y_i \pm \sigma_{yi}[/itex] where [itex]\sigma[/itex] is the uncertainty in the measurement. If I use a linear regression to estimate the value of [itex]b[/itex] in [itex]y=a+bx[/itex], I'm struggling to find a straightforward way to compute the uncertainty of [itex]b[/itex] that arises from the measurement uncertainties. This seems like it should be a very common problem, so I'm not sure why I can't find a simple algorithm or formula.

Thank you for any advice.

Stephen Tashi · Mar 1, 2014

Are you using "uncertainty" to mean "standard deviation"?

It's a common problem, but it's not simple. After all, your data gives only one value for [itex]b[/itex], so how can you estimate the standard deviation of [itex]b[/itex] from a sample of size 1?

The common way to get an answer is to oversimplify matters and compute a "linearized asymptotic" estimate. The value of [itex]b[/itex] is some function [itex]F[/itex] of the [itex](x_i,y_i)[/itex]. Let [itex]L[/itex] be the linear approximation of [itex]F[/itex]. Assume that, near the observed values in the sample, [itex]L[/itex] approximates [itex]F[/itex] well, so that the random variable [itex]b[/itex] is approximately a linear combination of the [itex]x_i[/itex] and [itex]y_i[/itex]. When a random variable is expressed as a linear combination of other random variables, you can work on expressing its standard deviation in terms of the standard deviations of those other random variables.
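
As a sketch of what that linear approximation looks like, the first-order Taylor expansion of [itex]F[/itex] can be written as below. The hats are notation introduced here for the observed values, about which the expansion is taken, and the partial derivatives are evaluated at those observed values.

```latex
L(x_1,\dots,x_N,\,y_1,\dots,y_N)
  = F(\hat{x},\hat{y})
  + \sum_{i=1}^{N} \frac{\partial F}{\partial x_i}\,(x_i - \hat{x}_i)
  + \sum_{i=1}^{N} \frac{\partial F}{\partial y_i}\,(y_i - \hat{y}_i)
```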

That's the general picture. If it's what you want to do then we can try to look up the specifics. I don't know them from memory.

richardc · Mar 1, 2014

Thank you for clarifying the problem.

With N observation pairs I believe I can write [itex]b=\frac{N \sum x_i y_i - \sum x_i \sum y_i}{N \sum x_i^2 - (\sum x_i)^2}[/itex].
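
A minimal sketch of that slope formula in plain Python, mainly to make the sums concrete. The function name `ols_slope` and the sample data are made up for illustration.

```python
def ols_slope(x, y):
    """Slope b of y = a + b*x from the closed-form least-squares sums."""
    n = len(x)
    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_xx = sum(xi * xi for xi in x)
    # b = (N*sum(x*y) - sum(x)*sum(y)) / (N*sum(x^2) - (sum(x))^2)
    return (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)


# Hypothetical data lying exactly on the line y = 1 + 2x
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]
b = ols_slope(x, y)
print(b)  # 2.0
```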

I suppose the propagation of error formula [itex]\sigma_f^2=\sum \left(\frac{\partial f}{\partial x_i} \sigma_{x_i} \right)^2[/itex] is then applied to a linear approximation of [itex]b[/itex]?
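
For the ordinary least squares slope formula, the partial derivatives that this propagation formula needs can be written out explicitly. This is a sketch obtained by differentiating the quotient directly; the shorthand [itex]S_x[/itex], [itex]S_y[/itex] and [itex]D[/itex] is introduced here and does not come from a reference.

```latex
\begin{align*}
S_x &= \sum_j x_j, \qquad S_y = \sum_j y_j, \qquad D = N \sum_j x_j^2 - S_x^2 \\
\frac{\partial b}{\partial y_i} &= \frac{N x_i - S_x}{D} \\
\frac{\partial b}{\partial x_i} &= \frac{N y_i - S_y - 2b\,(N x_i - S_x)}{D}
\end{align*}
```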

Stephen Tashi · Mar 1, 2014

You state a problem where there is an error in measurement for [itex]x_i[/itex] as well as for [itex]y_i[/itex]. In such a problem, people often use "total least squares" regression. I think the computation of the slope in "total least squares" regression is different from that in ordinary least squares regression, which assumes no error in the measurement of the [itex]x_i[/itex]. I think the formula you gave for [itex]b[/itex] is for ordinary least squares regression.
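
To illustrate how the slope computation differs, here is a sketch of the simplest total least squares case, orthogonal regression, which minimizes perpendicular distances to the line. It assumes the errors in [itex]x_i[/itex] and [itex]y_i[/itex] all have the same variance, which is a strong assumption and not the general case with per-point [itex]\sigma_{xi}[/itex] and [itex]\sigma_{yi}[/itex]. The function name and data are hypothetical.

```python
import math


def orthogonal_slope(x, y):
    """Slope of the orthogonal-regression line (equal error variance in x and y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Centered sums of squares and cross products
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxy == 0:
        raise ValueError("slope is undefined when the centered cross product is zero")
    # Closed-form direction of the principal axis of the centered data
    return (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4.0 * sxy ** 2)) / (2.0 * sxy)


# Hypothetical noisy data scattered around y = 1 + 2x
x = [0.0, 1.0, 2.0, 3.0]
y = [0.9, 3.2, 4.8, 7.1]
b_orthogonal = orthogonal_slope(x, y)
```

For data lying exactly on a line both methods return the same slope; with scatter, the orthogonal slope is at least as steep as the ordinary least squares one.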

Of course, one may ask the question: if I fit a straight line to data using the slope estimator from ordinary least squares regression, and my data also has errors in the [itex]x_i[/itex], then what is the standard deviation of this estimator? If that's the question, you need terms involving [itex]\left(\frac{\partial f}{\partial y_i}\right)^2 \sigma^2_{y_i}[/itex] and [itex]\left(\frac{\partial f}{\partial x_i}\right)^2 \sigma^2_{x_i}[/itex].
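
A rough sketch of that calculation in Python, assuming the measurement errors are independent and small enough for the linearization to hold. It sums the squared partial derivative of the ordinary least squares slope times the variance, for every [itex]x_i[/itex] and [itex]y_i[/itex]. The function name `slope_and_sigma` and the data are hypothetical, and the derivatives are obtained by differentiating the slope formula directly.

```python
import math


def slope_and_sigma(x, y, sigma_x, sigma_y):
    """OLS slope b and its first-order (linearized) standard deviation.

    Assumes independent errors with standard deviations sigma_x[i], sigma_y[i].
    """
    n = len(x)
    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_xx = sum(xi * xi for xi in x)
    d = n * sum_xx - sum_x ** 2
    b = (n * sum_xy - sum_x * sum_y) / d

    var_b = 0.0
    for xi, yi, sxi, syi in zip(x, y, sigma_x, sigma_y):
        # Partial derivatives of b with respect to y_i and x_i
        db_dy = (n * xi - sum_x) / d
        db_dx = (n * yi - sum_y - 2.0 * b * (n * xi - sum_x)) / d
        var_b += (db_dx * sxi) ** 2 + (db_dy * syi) ** 2
    return b, math.sqrt(var_b)


# Hypothetical measurements with per-point uncertainties
x = [0.0, 1.0, 2.0, 3.0]
y = [0.9, 3.2, 4.8, 7.1]
sigma_x = [0.05, 0.05, 0.05, 0.05]
sigma_y = [0.1, 0.1, 0.1, 0.1]
b, sigma_b = slope_and_sigma(x, y, sigma_x, sigma_y)
print(b, sigma_b)
```

With all [itex]\sigma_{xi}[/itex] set to zero and equal [itex]\sigma_{yi}=\sigma[/itex], this reduces to the familiar [itex]\sigma_b = \sigma / \sqrt{\sum (x_i-\bar{x})^2}[/itex].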

I don't know if the estimator for slope in ordinary least squares regression is an unbiased estimator if there are errors in the [itex]x_i[/itex].
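
One way to probe that question numerically is a small Monte Carlo experiment: simulate many data sets with noise added to the [itex]x_i[/itex] and average the ordinary least squares slope. The sketch below uses made-up parameters (true slope 2, Gaussian noise, 50 points on a fixed grid), so it only illustrates the effect for this setup. In runs like this the average slope comes out noticeably below the true value, the effect usually called attenuation or regression dilution.

```python
import random


def ols_slope(x, y):
    """Closed-form ordinary least squares slope of y = a + b*x."""
    n = len(x)
    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_xx = sum(xi * xi for xi in x)
    return (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)


def mean_ols_slope(sigma_x, sigma_y=0.5, true_a=1.0, true_b=2.0,
                   n_points=50, n_trials=2000, seed=0):
    """Average OLS slope over simulated data sets with Gaussian noise in x and y."""
    rng = random.Random(seed)
    # True x values on an evenly spaced grid from 0 to 10
    x_true = [10.0 * i / (n_points - 1) for i in range(n_points)]
    total = 0.0
    for _ in range(n_trials):
        x_obs = [xt + rng.gauss(0.0, sigma_x) for xt in x_true]
        y_obs = [true_a + true_b * xt + rng.gauss(0.0, sigma_y) for xt in x_true]
        total += ols_slope(x_obs, y_obs)
    return total / n_trials


mean_b_noisy_x = mean_ols_slope(sigma_x=2.0)  # errors in x present
mean_b_exact_x = mean_ols_slope(sigma_x=0.0)  # no errors in x
print(mean_b_noisy_x, mean_b_exact_x)
```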
