Propagating Measurement Uncertainty into a Linear Regression Model

AI Thread Summary
Combining measurement uncertainty in both x and y into the standard error of a linear regression is not straightforward. The usual standard error of the fit reflects only the vertical scatter of the points about the line, not the stated uncertainty of each individual measurement. One suggested method is to build a variance matrix from the per-point error distributions in y and use it to weight the fit, giving a generalized least squares estimate of the regression coefficients. Monte Carlo simulation also works, but the thread asks for a simpler analytical route. Propagating the measurement uncertainty correctly matters here because the slope of the fit is the flux being estimated.
lschong
I am trying to figure out how to combine uncertainty (in x and y) into the standard error of the best fit line from the linear regression for that dataset.

I am plotting concentration (y) versus Δt/height (x) to get a value for the flux (which is the slope).

I understand how to get the standard error of the best fit line, but that only measures the scatter of the y values about the line. Is there a good way to combine that error with the uncertainty of the individual measurements?

For example:
(x) delt/h    (y) Conc.
 0.00          563.84
 2.39          568.77
 3.53          566.64
11.03          572.59

The error in each y measurement is 9%

When I do the linear regression, I get a slope of 0.71 and a standard error of 0.21.

Is there a (relatively) simple way to propagate the 9% error into the regression error?
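
For reference, a minimal sketch that reproduces the fit quoted above, assuming SciPy is available:

```python
# Ordinary least squares fit of the data above (a sketch, assuming SciPy).
from scipy.stats import linregress

x = [0.00, 2.39, 3.53, 11.03]          # delt/h
y = [563.84, 568.77, 566.64, 572.59]   # concentration

fit = linregress(x, y)
# slope and its standard error, close to the 0.71 and 0.21 quoted above
print(f"slope = {fit.slope:.2f} +/- {fit.stderr:.2f}")
```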
 
Putting aside the errors in the x values, the regression error already includes the errors in y.
 
Are you referring to the standard error of the regression line? I know the standard error captures the vertical scatter of the points about the line, but what I want is to also account for the measurement uncertainty attached to each individual data point.

So, my first point is y = 531 +/- 51, the second is y = 540 +/- 46, and so on. How do I integrate these +/- values for each data point into the error for the linear regression?

Thanks.
 
The computationally easy way is to generate random numbers for each y. For y = 531 +/- 51, you could generate (say) 10 uniform random numbers with mean = 531 and range = +/- 51, all matched to the same x value.
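
A minimal sketch of that Monte Carlo idea, assuming NumPy (the number of draws, the seed, and the refit-and-take-the-spread step are illustrative choices, not part of the original suggestion):

```python
# Monte Carlo propagation sketch: perturb each y within its stated
# uncertainty, refit, and take the spread of the fitted slopes.
import numpy as np

rng = np.random.default_rng(0)               # seed is arbitrary

x = np.array([0.00, 2.39, 3.53, 11.03])
y = np.array([563.84, 568.77, 566.64, 572.59])
dy = 0.09 * y                                # the 9% per-point uncertainty

slopes = []
for _ in range(10_000):                      # draw count is an illustrative choice
    y_sim = rng.uniform(y - dy, y + dy)      # uniform draws, as suggested above
    slope, _intercept = np.polyfit(x, y_sim, 1)
    slopes.append(slope)

print(f"slope = {np.mean(slopes):.2f} +/- {np.std(slopes):.2f}")
```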
 
Hi,
I would like to do the same thing as lschong. Is there an analytical way to do this, rather than the Monte Carlo simulation suggested above? I know the simulation would work, but I am hoping for a simpler approach, since the model is just a linear regression.

Sincerely yours,
 
Suppose you have T observations and K variables. Suppose you also know the distribution of each y[t]; for example, y[t] ~ N(m[t], s[t]), t = 1 to T. If s[t] is constant for all t, then you have the standard OLS model. If s[t] is different for each t, then each error term u[t] is distributed N(0, s[t]). Since you know s[t] for all t, you can define the matrix ##\Phi_{T\times T} = \mathrm{diag}(s[t]^2)## as the variance matrix of the errors. Then

$$\hat{\beta} = \left(X'\Phi^{-1}X\right)^{-1} X'\Phi^{-1}y$$

is the best linear unbiased estimator of the regression coefficient vector.
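
A minimal NumPy sketch of this estimator, with the per-point standard deviations s[t] assumed known (here taken as 9% of each y, as in the thread):

```python
# Weighted (generalized) least squares with a known diagonal error
# covariance Phi = diag(s[t]^2):  beta_hat = (X' Phi^{-1} X)^{-1} X' Phi^{-1} y
import numpy as np

x = np.array([0.00, 2.39, 3.53, 11.03])
y = np.array([563.84, 568.77, 566.64, 572.59])
s = 0.09 * y                                # known per-point standard deviations

X = np.column_stack([np.ones_like(x), x])   # design matrix: intercept, slope
Phi_inv = np.diag(1.0 / s**2)               # inverse of the variance matrix

A = X.T @ Phi_inv @ X
beta_hat = np.linalg.solve(A, X.T @ Phi_inv @ y)
cov_beta = np.linalg.inv(A)                 # covariance of the GLS estimates

print(f"slope = {beta_hat[1]:.2f} +/- {np.sqrt(cov_beta[1, 1]):.2f}")
```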
 