Error in slope with variance

In summary: The conversation discusses the issue of calculating the error in a slope, particularly when the values themselves are averages with a standard deviation. The suggested methods for calculating this error include using a weighted least squares approach or simulating data with known group sizes. Another consideration is the measurement error in the x-values, which can be addressed by including it as an additional variable in the analysis. The Excel function LINEST is also mentioned as a tool for calculating the slope and intercept, but there is some disagreement about the proper use of standard deviation in this calculation. Finally, there is a discussion about forcing the slope to go through zero and the differences in calculation methods for this scenario.
  • #1
Salish99
28
0
I am looking for methods to calculate the error in a slope.
the caveat is that my values themselves are averages with a STDEV.
E.g.
x
1+-1%
2+-1%
3+-1%
y
0.14+-0.01
0.27+-0.02
0.42+-0.02
(using http://www.cartage.org.lb/en/themes/sciences/chemistry/Miscellenous/Helpfile/Erroranalysis/MultiplicationDivision/MultiplicationDivision.htm [Broken] as the calculation method for these errors)

This could be simplified by assuming the x-values have no deviation.

Now I can just plot the average slope of those three values, I can make a simple linear regression analysis, obtain the least square values as shown here
https://www.physicsforums.com/showthread.php?t=194616 and somewhat related here
https://www.physicsforums.com/showthread.php?t=173827

or use the Excel Linestatistics
(http://www.trentu.ca/academic/physics/linestdemo.html)
I obtain a value for the slope of my averaged slope, and a STDEV, based on the Least Square algorithm. But this does not take into account at all my initial STDEV.
Is there a more general algorithm that can take STDEVs in the initial (at least) y values, or both x and y values, and how would I calculate that?

Thank you.
 
Last edited by a moderator:
Physics news on Phys.org
  • #2
You seem to have a problem that involves a combination of:

1. grouped data, whereby you observe the group statistics (means and variances) but not individual observations for the dependent variable (y). In this case OLS is typically unbiased but inefficient. Since OLS is unbiased, if all you care about is the sign & the size of the coefficient and not its standard error, you can go with your coefficient estimates from OLS.

OTOH, if you care about the coefficient's standard error and you happen to know the group sizes in each "bin" (i.e. for each of your "average" observations) you can use those as weights and compute a weighted least squares. If you don't have the group sizes, you can assume a uniform group size for each observation (say, g = 100) and simulate three sets of 100 individual observations with means = the y's in your data and the ranges = the +/- factors around each y.

2. measurement error: each of your x's is measured with error, and that can be a source of bias. If you know the "cause" of the variation in the x variable, you can include that as an additional variable; if you don't know or have the "cause" you can at least include the variation around each x as a second variable. You have too few observations and that's a problem, especially for including additional independent variables alongside of x. But if you decide to run the simulation in step 1 above, which will generate many "fictional" observations, this will no longer be a problem.

EnumaElish
 
  • #3
Thanks enuma.

I will disregard the x-axis uncertainty, which derives from the measurement setup.
Just for future reference, how can I convert deviations in the x-values into independent y-(or z-)values?

I am very interested in the standard error of the resulting function.
One suggestion I got was to simply make 2 lines through the graph, one using all max values (all y- values +the error), and one using all min values (all y- values -the error), then taking the average of those two, and using a simple STDEV as as standard error. I somewhat disagree with that (and would be happy for comments).


Anyways, the linear OLS function in excel from LINEST gives the slope m out as [tex]\frac{\sum(x-\bar{x})(y-\bar{y})}{\sum(x-\bar{x})}[/tex]. and the intercept with the y-axis b as b=y(av)-mx(av)
so far, so good.
So for a simple sample set of three points (x;y)=(1;1), (2;2.1), (3;2.9).
the slope m = 0.95, and b=0.1
Using Linest (y, x,,TRUE), I can create an array that gives me exactly those values, with a STDEV for m=0.086603 and for b=0.187
1) How are those values calculated?
2) How come the st.deviation for b is 80 larger than its original value?!?

Furthermore, if I now force the slope to go through 0, I use (y, x,0,TRUE), my slope becomes 0.9928 and b=o (obviously), with STDEV(m)=0.026245.
But if I use abovementioned calculation, and just add a fourth point (0;0), I get a slope of 0.98, and a b of 0.03, NOT what excel does by forcing it through zero.

Any help with how both slope-forced through zero, and all STDEVs are actually calculated?
Thanks.
 

What is "Error in slope with variance"?

"Error in slope with variance" refers to the uncertainty or variability in the calculated slope of a line in a regression analysis. It takes into account both the error in the predicted values and the variance in the data points.

Why is it important to consider the error in slope with variance?

Considering the error in slope with variance allows us to understand the reliability and accuracy of the calculated slope. It helps us determine how much the slope may deviate from the true value and whether the relationship between the variables is statistically significant.

How is the error in slope with variance calculated?

The error in slope with variance is typically calculated using the standard error of the slope, which takes into account the standard deviation of the data points and the sample size. This can be calculated using mathematical formulas or through statistical software.

What factors can affect the error in slope with variance?

The error in slope with variance can be affected by the amount of variability in the data, the sample size, and the presence of outliers or influential data points. It can also be influenced by the assumptions and limitations of the regression model used.

How can the error in slope with variance be minimized?

The error in slope with variance can be minimized by increasing the sample size, reducing the variability in the data, and using robust regression methods that are less sensitive to outliers. It is also important to carefully consider the assumptions and limitations of the regression model and to use appropriate statistical techniques to account for them.

Similar threads

  • Set Theory, Logic, Probability, Statistics
Replies
4
Views
739
  • Set Theory, Logic, Probability, Statistics
Replies
19
Views
5K
  • Set Theory, Logic, Probability, Statistics
Replies
7
Views
280
  • Set Theory, Logic, Probability, Statistics
Replies
4
Views
1K
  • Set Theory, Logic, Probability, Statistics
Replies
2
Views
1K
  • Set Theory, Logic, Probability, Statistics
Replies
5
Views
1K
  • Set Theory, Logic, Probability, Statistics
Replies
2
Views
1K
  • Set Theory, Logic, Probability, Statistics
Replies
7
Views
7K
  • Set Theory, Logic, Probability, Statistics
Replies
5
Views
1K
  • Set Theory, Logic, Probability, Statistics
Replies
9
Views
1K
Back
Top