Propagating Measurement Uncertainty into a Linear Regression Model

lschong
I am trying to figure out how to combine uncertainty (in x and y) into the standard error of the best fit line from the linear regression for that dataset.

I am plotting Δt/height (x) versus concentration (y) to get a value for the flux (which is the slope).

I understand how to get the standard error of the best-fit line, but that only captures the scatter of the y values about the line. Is there a good way to combine that error with the uncertainty in the individual measurements?

For example:

Δt/h (x)    Conc. (y)
0.00        563.84
2.39        568.77
3.53        566.64
11.03       572.59

The error in each y measurement is 9%.

When I do the linear regression, I get a slope of 0.71 and a standard error of 0.21.

Is there a (relatively) simple way to propagate the 9% error into the regression error?
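
For concreteness, here is a minimal Python sketch (assuming plain ordinary least squares) that reproduces those numbers from the table above; the standard error comes out as 0.22 here, so any small difference from the quoted 0.21 is rounding:

```python
import numpy as np

# Data from the table above: x = delta-t/height, y = concentration
x = np.array([0.00, 2.39, 3.53, 11.03])
y = np.array([563.84, 568.77, 566.64, 572.59])
n = len(x)

slope, intercept = np.polyfit(x, y, 1)

# Standard error of the slope from the residual scatter about the line
resid = y - (intercept + slope * x)
s2 = np.sum(resid**2) / (n - 2)                      # residual variance
se_slope = np.sqrt(s2 / np.sum((x - x.mean())**2))

print(f"slope = {slope:.2f} +/- {se_slope:.2f}")     # -> 0.71 +/- 0.22
```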
 
Putting aside the errors in the x values, the regression error already includes the errors in y.
 
Are you referring to the standard error of the regression line? I know that the standard error measures the vertical scatter of the points about the line, but what I want is to also account for the measurement uncertainty attached to each individual data point.

So my first point is y = 531 +/- 51, my second is y = 540 +/- 46, and so on. How do I fold those +/- values for each data point into the error from the linear regression?

Thanks.
 
The computationally easy way is to generate random numbers for each y. For y = 531 +/- 51, you could generate (say) 10 uniform random numbers with mean = 531 and range = +/- 51, all matched to the same x value.
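
Here is a minimal Python sketch of that idea with the numbers from this thread: redraw every y within its error bar, refit the line, and take the spread of the fitted slopes as the propagated error. (The uniform draws follow the suggestion above; Gaussian draws, rng.normal(y, dy), would be the more conventional choice, and refitting per trial is one common variant of the replicate idea.)

```python
import numpy as np

rng = np.random.default_rng(0)

# Data and per-point uncertainties (9% of each y, as quoted above)
x = np.array([0.00, 2.39, 3.53, 11.03])
y = np.array([563.84, 568.77, 566.64, 572.59])
dy = 0.09 * y

# Redraw each y uniformly within y +/- dy, refit, and collect the slopes
n_trials = 10_000
slopes = np.empty(n_trials)
for i in range(n_trials):
    y_sim = rng.uniform(y - dy, y + dy)
    slopes[i], _ = np.polyfit(x, y_sim, 1)

# The spread of the fitted slopes estimates the propagated uncertainty
print(f"slope = {slopes.mean():.2f} +/- {slopes.std():.2f}")
```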
 
Hi,
I would like to do the same thing as lschong. Is there an analytical way to do this, rather than the Monte Carlo simulation suggested above? I'm sure the simulation would work, but I'd like something simpler, since the model is just a linear regression.

Sincerely yours,
 
Suppose you have T observations and K regressors, and that you also know the distribution of each y[t]; for example, y[t] ~ N(m[t], s[t]^2) for t = 1, ..., T, where s[t] is the known standard deviation. If s[t] is the same for all t, you have the standard OLS model. If s[t] differs across t, then each error term u[t] is distributed N(0, s[t]^2). Since you know every s[t], you can define the error variance matrix \Phi_{T\times T} = \mathrm{diag}(s[t]^2). Then

\hat{\beta} = \left(X'\Phi^{-1}X\right)^{-1} X'\Phi^{-1} y

is the best linear unbiased estimator of the regression coefficient vector. This is generalized least squares; with a diagonal \Phi it reduces to weighted least squares, with each observation weighted by 1/s[t]^2.
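
A minimal numpy sketch of that estimator with this thread's numbers (taking s[t] as 9% of each y). The slope's standard error, the square root of the [1, 1] entry of (X'\Phi^{-1}X)^{-1}, comes out around +/- 6 here: much larger than the 0.21 from plain OLS, because it now reflects the stated measurement errors rather than the scatter about the line.

```python
import numpy as np

# Data from the thread; s[t] = 9% of each y, so Phi = diag(s[t]^2)
x = np.array([0.00, 2.39, 3.53, 11.03])
y = np.array([563.84, 568.77, 566.64, 572.59])
s = 0.09 * y

X = np.column_stack([np.ones_like(x), x])   # design matrix: intercept, slope
Phi_inv = np.diag(1.0 / s**2)

# GLS estimator: beta_hat = (X' Phi^-1 X)^-1 X' Phi^-1 y
XtPX = X.T @ Phi_inv @ X
beta = np.linalg.solve(XtPX, X.T @ Phi_inv @ y)

# Covariance of beta_hat is (X' Phi^-1 X)^-1; slope SE is sqrt of [1, 1]
cov = np.linalg.inv(XtPX)
print(f"slope = {beta[1]:.2f} +/- {np.sqrt(cov[1, 1]):.2f}")
```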
 