How to prove zero correlation between residuals and predictors?

  • Context: Graduate
  • Thread starter: NotEuler
  • Tags: Correlation, Zero

Discussion Overview

The discussion revolves around proving that the covariance between residuals and predictors in a linear least squares regression model is zero. Participants explore the implications of this assertion, considering both theoretical and practical aspects of regression analysis.

Discussion Character

  • Technical explanation
  • Mathematical reasoning
  • Debate/contested

Main Points Raised

  • NotEuler seeks to prove that cov(e, X1) = cov(e, X2) = 0 for a linear regression model, suggesting that this should hold regardless of the original dataset or the nature of dependencies.
  • NotEuler proposes a method to express the covariance in terms of the fitted model and the residuals, indicating that if the sum of squared errors is minimized, the covariance should be zero.
  • Another participant notes that the partial derivatives of the error function being zero at extrema may support the argument for zero covariance.
  • A later reply outlines a proof sketch that generalizes the argument to any number of predictors, concluding that cov(e, Xk) = 0 for any predictor variable Xk.
  • NotEuler later suggests that this result implies the residuals are also uncorrelated with the predicted values from the model, reinforcing the initial claim.

Areas of Agreement / Disagreement

Participants generally explore the same hypothesis regarding the zero covariance between residuals and predictors, but there is no explicit consensus on the proof or its implications. Some participants express uncertainty about the completeness of their arguments.

Contextual Notes

The discussion includes assumptions about the nature of the dataset and the linearity of relationships, which may not hold in all cases. The proof relies on specific mathematical properties of the least squares method that may require further clarification or validation.

Who May Find This Useful

Readers interested in statistical modeling, regression analysis, and the properties of residuals in linear regression may find this discussion relevant.

NotEuler
Hi,
I'm trying to figure out something I'm pretty sure is true, but don't know how to prove it. I couldn't find the answer with a google search, but hopefully someone here knows the answer!

So I have a linear least squares multiple regression model:
Y=a+bX1+cX2+e

where a is the intercept, X1 and X2 predictor/independent variables, and e denotes the residuals.
The model (i.e. the values of a, b and c) is fitted so that Ʃe^2 is minimized.

How do I prove that cov(e,X1) = cov(e,X2) = 0?

Thanks!
NotEuler
 
Maybe I should clarify my question...

1) Assume I have a dataset of dependent variables Yi, and independent variables X1i and X2i.

2) I fit a linear regression model to that dataset: Y=a+bX1+cX2+e.

3) The model is fitted, i.e. the parameters a, b and c are determined, so that the sum of squared errors Ʃei^2 = Ʃ(Yi-a-bX1i-cX2i)^2 is minimized.

4) I then calculate the covariance between the residuals e from that same fitted model and either set of independent variables (the X1's or the X2's) from the original dataset.

5) I think both cov(e,X1) and cov(e,X2) will always equal zero, regardless of what the original dataset was, and regardless of whether the real dependences are linear or something else.
I also think this should hold for any number of independent variables.

6) I think that to prove this, I need to write the covariance as cov(e,X1) = cov(Y-a-bX1-cX2, X1) = cov(Y,X1) - cov(a,X1) - cov(bX1,X1) - cov(cX2,X1),
and then somehow use the consequences of step 3 to show that if the sum of squared errors is minimized, this covariance is always zero.

Does this make any sense? I'm no expert on regressions or covariances, so this might be hard to follow. It's also possible I'm wrong, and cov(e,X1) is not always zero; a quick numerical check is sketched below.
Either way, any hints on how to proceed would be much appreciated!
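
A minimal numerical sketch of the conjecture, assuming NumPy is available (the dataset, the deliberately nonlinear true relationship, and the variable names are illustrative, not from the thread):

```python
import numpy as np

# Numerical check (not a proof): fit a least-squares plane to data whose true
# relationship is deliberately nonlinear, then inspect the sample covariance
# between the residuals and each predictor.
rng = np.random.default_rng(0)
n = 1000
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = np.sin(x1) + x2**2 + rng.normal(size=n)   # nonlinear "true" dependence

# Design matrix with an intercept column; lstsq minimizes the sum of squared errors.
X = np.column_stack([np.ones(n), x1, x2])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ coef                              # residuals

print(np.cov(e, x1)[0, 1])                    # ~0 up to floating-point error
print(np.cov(e, x2)[0, 1])                    # ~0 up to floating-point error
```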

Cheers,
NotEuler
 
NotEuler said:
3) The model is fitted, i.e. the parameters a, b and c are determined, so that the sum of squared errors Ʃei^2 = Ʃ(Yi-a-bX1i-cX2i)^2 is minimized.

And then somehow use the consequences of step 3 to show that if the square of errors is minimized, then this covariance is always zero.

The partial derivatives of the function in step 3 with respect to a, b, c would be zero at an extremum. Perhaps that helps.
 
Yes, that helps a lot! Here's a sketch of the proof; happy to hear if you see any mistakes. I've changed the notation slightly to show that it applies to a regression model with any number of predictors. I will denote means with ~ (i.e. the mean of the ei's is E(ei) = e~).

1) Assume I have a dataset of dependent variables Yi, and independent variables X1i, X2i, X3i,... Xki.

2) I fit a linear regression model to that dataset: Y = a + bX1 + Z + e, where Z is a linear combination of all the independent variables from X2 onwards: Z = cX2 + dX3 + ...
Z therefore does not involve the parameters a and b.

3) The model is fitted, i.e. the parameters a, b, c, d, ... are determined, so that the sum of squared errors s(a,b,c,d,...) = Ʃei^2 = Ʃ(Yi-a-bX1i-Zi)^2 is minimized.

4) To do this, I calculate the partial derivatives of s with respect to a, b, c, d, ... and set them equal to 0.
I find that
∂s/∂a = -2 Ʃ(Yi-a-bX1i-Zi). Therefore Ʃ(Yi-a-bX1i-Zi) = Ʃei = 0, and E[e] = e~ = 0.
∂s/∂b = -2 Ʃ X1i (Yi-a-bX1i-Zi). Therefore Ʃ X1i (Yi-a-bX1i-Zi) = Ʃ X1i ei = 0.

5) Ʃ (ei-e~)(X1i-X1~) = Ʃ (eiX1i - eiX1~ - e~X1i + e~X1~)
= ƩeiX1i - X1~Ʃei - e~ƩX1i + n·e~·X1~ = 0 - X1~·0 - 0 + 0 = 0,
using Ʃ X1i ei = 0 and e~ = 0 from step 4.

The sample covariance Cov(e,X1) is just this sum divided by n (or n-1), so Cov(e,X1) = 0, which is what I wanted to prove.

Now I could replace X1 with any of the other X's that are all combined in Z, and repeat the above analysis. Because the least-squares objective treats all the predictor variables symmetrically, I would then find that cov(e,Xk) = 0 for any k.

Therefore the residuals are always uncorrelated with the predictors in a least squares linear regression model.
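
A minimal numerical illustration of this general claim, assuming NumPy (the number of predictors, the data-generating process, and the names below are arbitrary choices for the demo):

```python
import numpy as np

# Check cov(e, Xk) ~ 0 for every predictor in a model with k predictors,
# fitted by ordinary least squares with an intercept.
rng = np.random.default_rng(1)
n, k = 500, 6
Xs = rng.normal(size=(n, k))                     # k predictor columns
y = Xs @ rng.normal(size=k) + np.exp(Xs[:, 0]) + rng.normal(size=n)

X = np.column_stack([np.ones(n), Xs])            # intercept + predictors
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ coef

print("mean(e) =", e.mean())                     # ~0, from the intercept condition
for j in range(k):
    print(f"cov(e, X{j+1}) = {np.cov(e, Xs[:, j])[0, 1]:.2e}")  # all ~0
```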
 
Now that I think about it, this result immediately implies that the residuals are also uncorrelated with the values predicted by the model (i.e. not the original dataset Yi, but the predicted values y = a + bX1 + Z).

This is because (following the notation above) cov(e,y) = cov(e, a+bX1+Z) = cov(e,a) + b·cov(e,X1) + cov(e,Z) = 0 + 0 + 0: a is a constant, and Z is a linear combination of the remaining predictors, each of which has zero covariance with e.
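
A short numerical sketch of this corollary as well, under the same assumptions as the checks above (illustrative data, NumPy for the fit):

```python
import numpy as np

# Residuals vs. fitted values: cov(e, yhat) should also be ~0.
rng = np.random.default_rng(2)
n = 400
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 1.0 + 2.0 * x1 - x2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x1, x2])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ coef
yhat = X @ coef                                  # model's predicted values

print(np.cov(e, yhat)[0, 1])                     # ~0 up to floating-point error
```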
 
