
Propagating Measurement Uncertainty into a Linear Regression Model

  1. Jan 17, 2010 #1
    I am trying to figure out how to combine measurement uncertainty (in x and y) with the standard error of the best-fit line from a linear regression on the same dataset.

    I am plotting concentration (y) versus del t/height (x) to get a value for the flux, which is the slope.

    I understand how to get the standard error of the best fit line, but that only gives the error in y in relation to the best fit line. Is there a good way to combine that error with the error from the individual measurements?

    For example:
    (x)       (y)
    delt/h    Conc.
     0.00     563.84
     2.39     568.77
     3.53     566.64
    11.03     572.59

    The error in each y measurement is 9%.

    When I do the linear regression, I get a slope of 0.71 and a standard error of 0.21.

    Is there a (relatively) simple way to propagate the 9% error into the regression error?
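
For reference, the slope and standard error quoted in this post can be reproduced from the four data points with an ordinary (unweighted) least-squares fit. The sketch below is only an illustration: the function name `ols_line` is hypothetical, and it assumes x is delt/h and y is the concentration, as in the table. It should give values close to the ones quoted.

```python
import math

# Data from the post: x = delt/h, y = concentration.
x = [0.00, 2.39, 3.53, 11.03]
y = [563.84, 568.77, 566.64, 572.59]

def ols_line(x, y):
    """Unweighted least-squares line: (slope, intercept, slope standard error)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    # Residual variance uses n - 2 degrees of freedom (two fitted parameters).
    ssr = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    slope_se = math.sqrt(ssr / (n - 2) / sxx)
    return slope, intercept, slope_se

slope, intercept, slope_se = ols_line(x, y)
print(f"slope = {slope:.2f}, standard error = {slope_se:.2f}")
```

Note that this standard error is estimated purely from the scatter of the points about the line; the 9% measurement error does not enter the calculation anywhere.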
     
  3. Jan 19, 2010 #2

    EnumaElish

    Science Advisor
    Homework Helper

    Putting aside the errors in the x values, the regression error already includes the errors in y.
     
  4. Jan 19, 2010 #3
    Are you referring to the standard error of the regression line? I know that the standard error captures the vertical scatter of the points about the line, but I also want to account for the measurement uncertainty attached to each individual data point.

    So, my first point is y = 531 +/- 51, the second point is y = 540 +/- 46, and so on. How do I incorporate the +/- values for each data point into the error for the linear regression?

    Thanks.
     
  5. Jan 20, 2010 #4

    EnumaElish

    Science Advisor
    Homework Helper

    The computationally easy way is to generate random numbers for each y. For y = 531 +/- 51, you could generate (say) 10 uniform random numbers with mean = 531 and range = +/- 51, all matched to the same x value. Then rerun the regression on the expanded dataset.
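
A minimal sketch of this suggestion, under several assumptions that are not in the thread: it uses the four data points from the first post, takes the 9% figure as the half-width of each uniform interval, draws 10 replicates per point, fixes the random seed for repeatability, and fits one ordinary least-squares line to the pooled replicates. The names `ols_line`, `x_rep` and `y_rep` are hypothetical.

```python
import math
import random

# Data from the first post: x = delt/h, y = concentration.
x = [0.00, 2.39, 3.53, 11.03]
y = [563.84, 568.77, 566.64, 572.59]
REL_ERR = 0.09   # 9% error on each y (assumed to be the interval half-width)
N_DRAWS = 10     # replicates per data point, as suggested in the post

rng = random.Random(0)  # fixed seed: an assumption, only so that reruns agree

# Build the expanded dataset: each x is repeated, each y is redrawn uniformly.
x_rep, y_rep = [], []
for xi, yi in zip(x, y):
    half_width = REL_ERR * yi
    for _ in range(N_DRAWS):
        x_rep.append(xi)
        y_rep.append(rng.uniform(yi - half_width, yi + half_width))

def ols_line(x, y):
    """Unweighted least-squares line: (slope, intercept, slope standard error)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    ssr = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return slope, intercept, math.sqrt(ssr / (n - 2) / sxx)

mc_slope, mc_intercept, mc_slope_se = ols_line(x_rep, y_rep)
print(f"slope = {mc_slope:.2f} +/- {mc_slope_se:.2f}")
```

The result depends on the seed and on the number of draws, so it is worth rerunning with more replicates to see how stable the slope error is.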
     
  6. Jun 10, 2010 #5
    Hi,
    I would like to do the same thing as Ischong. Is there an analytical way to do this, rather than the Monte Carlo simulation suggested above? I know that simulation will work, but I need a simpler method, since the model is just a linear regression.

    Sincerely yours,
     
  7. Jun 11, 2010 #6

    EnumaElish

    Science Advisor
    Homework Helper

    Suppose you have T observations and K variables, so that the regressor matrix X is T x K. Suppose you also know the distribution of each y[t]; for example, y[t] ~ N(m[t], s[t]), t = 1 to T, where s[t] is the standard deviation. If s[t] is constant for all t, then you have the standard OLS model. If s[t] is different for each t, then each error term u[t] is distributed N(0, s[t]). Since you know s[t] for all t, you can define the matrix [itex]\bold\Phi_{T\times T} = \mathrm{diag}(s[t]^2)[/itex] as the variance matrix (of the errors). Then

    [tex]\hat{\beta}=\left(X'\bold\Phi^{-1}X\right)^{-1}X'\bold\Phi^{-1}y[/tex]

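    is the best linear unbiased estimator of the regression coefficient vector (the weighted least-squares estimator). Its variance matrix is [itex]\left(X'\bold\Phi^{-1}X\right)^{-1}[/itex], and the square roots of the diagonal entries are the standard errors of the coefficients.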
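
For a straight-line fit (an intercept and a slope), the estimator above reduces to a few weighted sums, with weights 1/s[t]^2 taken from the diagonal of the inverse of Phi. The sketch below is an illustration under assumptions not stated in this post: it uses the four data points from the first post, treats the 9% figure as the standard deviation s[t] of each y, and takes the slope variance from the corresponding entry of the inverse of X' Phi^-1 X. The function name `wls_line` is hypothetical.

```python
import math

# Data from the first post: x = delt/h, y = concentration.
x = [0.00, 2.39, 3.53, 11.03]
y = [563.84, 568.77, 566.64, 572.59]
sigma = [0.09 * yi for yi in y]  # assumed: 9% of y is the standard deviation s[t]

def wls_line(x, y, sigma):
    """Weighted least-squares line with known per-point standard deviations.

    Returns (slope, intercept, slope_se, intercept_se).
    """
    w = [1.0 / s ** 2 for s in sigma]            # diagonal of Phi^-1
    s0 = sum(w)
    sx = sum(wi * xi for wi, xi in zip(w, x))
    sy = sum(wi * yi for wi, yi in zip(w, y))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    # X' Phi^-1 X = [[s0, sx], [sx, sxx]]; det is its determinant.
    det = s0 * sxx - sx ** 2
    slope = (s0 * sxy - sx * sy) / det
    intercept = (sxx * sy - sx * sxy) / det
    # Diagonal of (X' Phi^-1 X)^-1 gives the coefficient variances.
    slope_se = math.sqrt(s0 / det)
    intercept_se = math.sqrt(sxx / det)
    return slope, intercept, slope_se, intercept_se

w_slope, w_intercept, w_slope_se, w_intercept_se = wls_line(x, y, sigma)
print(f"slope = {w_slope:.2f} +/- {w_slope_se:.2f}")
```

Under these assumptions the slope error comes out far larger than the 0.21 obtained from the scatter alone, because a 9% error on concentrations near 570 is about +/- 51. When all s[t] are equal, the slope and intercept coincide with the ordinary least-squares values.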
     