Individual Measurement Uncertainty vs. Standard Error of Regression 
#1
Jul 21, 2014, 09:57 AM

P: 21

Let's say a student does a simple experiment where she conducts 10 trials at each x value (at each value of the independent variable). She collects data over 30 x values, giving her 300 total trials. For each of the 30 x values, she averages the 10 y values and calculates the standard deviation of those 10 y values. She makes a plot of average y vs. x in Excel, and uses the standard deviations as y error bars. Assume the plot is linear. Assume there's no error/uncertainty in the individual x values. Perhaps also assume the individual y errors are all uniformly distributed and equal. I want this to be the simplest possible case.
Next, the student (a) adds a linear trendline in Excel, (b) has Excel calculate the slope of her line of best fit, and (c) has Excel calculate the standard error in the slope. I have three questions:



#2
Jul 21, 2014, 10:28 AM

Mentor
P: 11,862

https://www.che.udel.edu/pdf/FittingData.pdf 


#3
Jul 21, 2014, 04:05 PM

P: 21

Thanks for the reply! As I understand it, a weighted least-squares fit is used only if the y errors (and x errors, if using an orthogonal regression) differ among different data points. Under the simplest circumstances, the individual y errors are all the same, and a weighted least-squares fit simplifies into a regular unweighted regression.
I'm interested in the simplest case, where weighting is unnecessary. Even in this simplest case, I think there is a theoretical reason for seemingly ignoring the individual y measurement uncertainty when calculating the standard error of the slope. I think the correct reason is this: if the individual y error/uncertainty is the same for all data points, then, across the entire data set, the y values will fluctuate within that fixed y uncertainty. The standard error of the regression, also called the standard error of the estimate, uses the residuals to estimate the average y error in the data set. Hence, the individual y errors are not ignored in the standard error of the regression. Rather, the standard error of the regression represents the average amount of y error in each measurement in either direction (i.e., not distinguishing between positive error, where the y measurement is above the true value, and negative error, where it is below). If this is correct, then it would answer question #1 from my original post, since I believe the standard error of the slope is calculated from the standard error of the regression. I believe questions #2 and #3 in my original post are still unanswered. Any guidance is much appreciated!
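To make this concrete, here is a minimal sketch (in Python, with made-up data: an assumed true line y = 2x + 1 and an equal y-uncertainty sigma = 0.5) of how the standard error of the regression is built from the residuals and then feeds into the standard error of the slope:

```python
import math
import random

random.seed(0)

# Hypothetical data: true line y = 2x + 1, equal y-uncertainty sigma
sigma = 0.5
xs = [float(i) for i in range(1, 31)]                       # 30 x values
ys = [2.0 * x + 1.0 + random.gauss(0.0, sigma) for x in xs]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
sxx = sum((x - x_bar) ** 2 for x in xs)

# Ordinary least-squares slope and intercept
slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
intercept = y_bar - slope * x_bar

# Standard error of the regression: residual-based estimate of the
# common y uncertainty (n - 2 because two parameters were fitted)
residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
s = math.sqrt(sum(r * r for r in residuals) / (n - 2))

# The standard error of the slope is built directly from s
se_slope = s / math.sqrt(sxx)
```

With enough data, s should land near the true sigma, which is the sense in which the individual y errors are not ignored but rather estimated from the residuals.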


#4
Jul 21, 2014, 09:20 PM

P: 21

Please also correct any improper conflation of the terms "residual" and "error" in my posts, along with any other incorrect usage of statistics terms.



#5
Jul 22, 2014, 01:29 AM

Sci Advisor
P: 3,626

I think the point is that in most situations you don't know the standard error of y but you have to estimate it from your data. Linear regression does exactly this.



#6
Jul 22, 2014, 07:57 AM

P: 21

Thanks! Many physicists use the standard error in the slope (which I believe is calculated from the standard error of the regression or the SEM) as the uncertainty in the slope. This practice is what I'm interested in, particularly since, in simple manipulations of data, uncertainty is propagated through the operation. In regression, the approach of propagating uncertainty appears to be abandoned, despite the fact that an operation is being performed to calculate the slope from other values that have uncertainty. I'm trying to better understand the justification for abandoning the uncertainty propagation rules in favor of the standard deviation value, which is calculated from the residuals.
NOTE: If I stated previously that the individual y measurement errors are KNOWN, I misspoke. In my original example, I intend for the individual y standard deviations (uncertainties) to be known and for the individual y errors to be unknown prior to the regression. 
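For what it's worth, the propagation rules may not really be abandoned here. The OLS slope is a linear combination of the y values, so applying standard propagation of equal, independent y-uncertainties to that formula gives the same closed form as the textbook slope standard error, just with the known sigma in place of the residual-based estimate s. A sketch (assuming equal, known y-uncertainties and 30 evenly spaced x values, both made up for illustration):

```python
import math

# Hypothetical equal, known y-uncertainty for every point
sigma = 0.5
xs = [float(i) for i in range(1, 31)]

x_bar = sum(xs) / len(xs)
sxx = sum((x - x_bar) ** 2 for x in xs)

# The OLS slope is linear in the y_i: slope = sum(w_i * y_i), with
# w_i = (x_i - x_bar) / Sxx. Standard propagation of independent,
# equal y-uncertainties then gives se = sigma * sqrt(sum(w_i^2)).
weights = [(x - x_bar) / sxx for x in xs]
se_propagated = sigma * math.sqrt(sum(w * w for w in weights))

# Since sum(w_i^2) = 1/Sxx, this collapses to sigma / sqrt(Sxx),
# the same form as the regression-based slope standard error
assert math.isclose(se_propagated, sigma / math.sqrt(sxx))
```

So the two approaches agree whenever the residual-based estimate s is close to the true sigma; the regression simply estimates sigma from the data instead of taking it as known.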


#7
Jul 22, 2014, 10:45 AM

P: 541

If you want a more sophisticated analysis of your data, you need to use a more sophisticated tool than OLS: either weighted least squares or some better method.


#8
Jul 22, 2014, 11:06 AM

P: 21

Thanks so much MrAchovy!
If I understand you correctly, the equation from my original post for calculating the slope is the version of OLS that Excel uses. According to that equation, the estimated slope of the best-fit line is a function of the individual x and y measurements. Consider that the student used Excel to calculate the slope and the standard error of the slope. The student then used the uncertainty propagation rules pictured above to calculate the uncertainty in the slope based upon the uncertainty in the individual x and y measurements. How would the standard error of the slope (as reported by Excel) compare to the uncertainty in the slope (as calculated by the student using the error propagation rules)? If the two values are different, then my follow-up questions are: (a) why is there a discrepancy, and (b) which of the two values is better: the standard error of the slope, or the uncertainty generated by the propagation rules cited above in this post?


#9
Jul 22, 2014, 11:44 AM

P: 541

I don't think the independence conditions of that rule apply to the OLS calculation. Think about it: the more points you add, the more confident you should become in the fit (unless they are outliers), but in that equation, the more points you add, the greater ## s_f ## becomes.
If you want to measure the goodness of fit to all the data, search for "linear regression goodness of fit"; I think an F-test is probably a better place to start than the regression coefficient.
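As a rough illustration of that F-test for a simple linear regression (on simulated data, assuming a true line y = 2x + 1 and sigma = 0.5, both made up here):

```python
import math
import random

random.seed(2)

# Simulated data: true line y = 2x + 1 with equal y-uncertainty sigma
sigma = 0.5
xs = [float(i) for i in range(1, 31)]
ys = [2.0 * x + 1.0 + random.gauss(0.0, sigma) for x in xs]

n = len(xs)
x_bar, y_bar = sum(xs) / n, sum(ys) / n
sxx = sum((x - x_bar) ** 2 for x in xs)
slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
intercept = y_bar - slope * x_bar

# Decompose the variation: total = explained by the fit + residual
sse = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
sst = sum((y - y_bar) ** 2 for y in ys)
ssr = sst - sse

r_squared = ssr / sst                    # fraction of variation explained
f_stat = (ssr / 1.0) / (sse / (n - 2))   # F statistic, dof (1, n - 2)
```

A large F statistic (compared against the F distribution with 1 and n - 2 degrees of freedom) says the slope explains far more variation than the residual scatter would by chance.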


#10
Jul 22, 2014, 12:00 PM

Sci Advisor
P: 3,626

How do you estimate the individual uncertainties? Probably as ##s_i^2 = \sum_j (y_{ij} - \bar{y}_i)^2/(n_i - 1)##; however, the ##\bar{y}_i## are not independent of each other, as they are bound to lie on the regression line. So you need to solve the regression equation first. Also, by assumption, the variances are equal, so you can get a better estimate by using the combined estimate from the linear regression.
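A sketch of the pooling idea, with made-up replicate data: each per-group estimate has only n_i - 1 degrees of freedom, while the pooled estimate (legitimate here because the variances are equal by assumption) uses all of them:

```python
import random

random.seed(1)

sigma = 0.5
n_i = 10  # trials per x value

# Hypothetical replicate data at a few x values, true line y = 2x + 1
groups = {x: [2.0 * x + 1.0 + random.gauss(0.0, sigma) for _ in range(n_i)]
          for x in [1.0, 2.0, 3.0]}

def group_var(ys):
    """Per-group variance estimate: sum_j (y_ij - ybar_i)^2 / (n_i - 1)."""
    ybar = sum(ys) / len(ys)
    return sum((y - ybar) ** 2 for y in ys) / (len(ys) - 1)

# With equal group sizes, the pooled estimate is the dof-weighted average,
# which here is just the mean of the per-group variances. It has
# len(groups) * (n_i - 1) degrees of freedom instead of n_i - 1.
pooled = sum(group_var(ys) for ys in groups.values()) / len(groups)
```

The pooled value should scatter around sigma squared (0.25 here) more tightly than any single group's estimate does.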



#11
Jul 22, 2014, 12:15 PM

P: 21

THANKS to all!!! This is very helpful to me.


