Physics Forums > Mathematics > Set Theory, Logic, Probability, Statistics

Regression SS in multiple linear regression
Jun20-09, 08:04 AM  #1
kingwinner


In MULTIPLE linear regression, is it still true that the regression sum of squares is equal to
∑ (Y_i hat -Y bar)^2 ???

My textbook defines the regression SS in the chapters on simple linear regression as ∑ (Y_i hat -Y bar)^2. In the chapters on multiple linear regression, the regression SS is defined in MATRIX form, and the book does not say anywhere whether it is still equal to ∑ (Y_i hat -Y bar)^2, so I am confused...

If it is still equal to ∑ (Y_i hat -Y bar)^2 in MULTIPLE linear regression (this is such a simple formula), what is the point of expressing the regression SS in terms of matrices in multiple linear regression? I don't see the point when the formula ∑ (Y_i hat -Y bar)^2 is already so simple. There is no need to create additional headaches...

Thanks for explaining!
Jun22-09, 01:39 PM  #2
statdad


I think you have notation (and/or terms) confused. In simple linear regression

\begin{align*}
SSTO & = \sum (Y_i - \bar Y)^2 \\
SSE & = \sum (Y_i - \hat Y_i)^2 \\
SSR & = SSTO - SSE = \sum (\hat Y_i - \bar Y)^2
\end{align*}

In multiple linear regression, with matrix notation,

\begin{align*}
SSTO & = \mathbf{Y}'\mathbf{Y} - n \bar{Y}^2 \quad (= \sum (Y_i - \bar Y)^2) \\
SSE & = \hat{e}' \hat{e} = \mathbf{Y}'\mathbf{Y} - \hat{\boldsymbol{\beta}}' \mathbf{X}' \mathbf{Y} \quad (= \sum (Y_i - \hat Y_i)^2) \\
SSR & = SSTO - SSE = \hat{\boldsymbol{\beta}}' \mathbf{X}' \mathbf{Y} - n \bar{Y}^2
\end{align*}
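A quick numerical sanity check of these identities might look like the following. This is only a sketch on synthetic data: the sample size, the coefficients, the seed, and all variable names are made up for illustration, and it assumes the design matrix has a column of ones for the intercept and that X'X is invertible.

```python
import numpy as np

# Synthetic data (illustrative only): n observations, 3 predictors plus an intercept.
rng = np.random.default_rng(0)
n, p = 50, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])  # first column of ones = intercept
beta_true = np.array([1.0, 2.0, -0.5, 0.3])                  # hypothetical coefficients
Y = X @ beta_true + rng.normal(size=n)

# Least-squares estimate from the normal equations X'X b = X'Y.
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
Y_hat = X @ beta_hat
Y_bar = Y.mean()

# Summation forms, as in simple linear regression.
ssto_sum = np.sum((Y - Y_bar) ** 2)
sse_sum = np.sum((Y - Y_hat) ** 2)
ssr_sum = np.sum((Y_hat - Y_bar) ** 2)

# Matrix forms, as in multiple linear regression.
ssto_mat = Y @ Y - n * Y_bar ** 2
sse_mat = Y @ Y - beta_hat @ X.T @ Y
ssr_mat = beta_hat @ X.T @ Y - n * Y_bar ** 2

print("SSTO forms agree:", np.isclose(ssto_sum, ssto_mat))
print("SSE forms agree: ", np.isclose(sse_sum, sse_mat))
print("SSR forms agree: ", np.isclose(ssr_sum, ssr_mat))
```

With an intercept in the model, the summation form and the matrix form of each sum of squares should agree up to rounding. Without the column of ones, the SSR comparison is not expected to hold in general.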

The matrix approach isn't here simply to cause confusion: in multiple linear regression the "nice" approach of drawing pictures to represent things breaks down. However, a little linear algebra can be used to describe exactly why the residuals sum to zero and why the different quantities have different degrees of freedom, and it provides convenient ways to generate tests. There are many theorems that describe the probability distributions of quadratic forms in multivariate normal vectors; using matrices in multiple regression allows these theorems to be used to develop hypothesis tests.
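To make the "residuals sum to zero" remark concrete, here is a short sketch of the argument. It assumes the model includes an intercept, so the first column of X is a column of ones (written \mathbf{1} below, a symbol introduced here for illustration), and it takes the normal equations X'X \hat\beta = X'Y as given.

```latex
\begin{align*}
\mathbf{X}'\hat{e} &= \mathbf{X}'(\mathbf{Y} - \mathbf{X}\hat{\boldsymbol{\beta}})
  = \mathbf{X}'\mathbf{Y} - \mathbf{X}'\mathbf{X}\hat{\boldsymbol{\beta}} = \mathbf{0} \\
\mathbf{1}'\hat{e} &= \sum \hat{e}_i = 0
  \quad \text{(the first entry of } \mathbf{X}'\hat{e}\text{)} \\
\sum \hat Y_i &= \sum Y_i - \sum \hat{e}_i = n \bar Y \\
\sum (\hat Y_i - \bar Y)^2 &= \hat{\mathbf{Y}}'\hat{\mathbf{Y}} - 2 \bar Y \sum \hat Y_i + n \bar Y^2
  = \hat{\boldsymbol{\beta}}'\mathbf{X}'\mathbf{X}\hat{\boldsymbol{\beta}} - n \bar Y^2
  = \hat{\boldsymbol{\beta}}'\mathbf{X}'\mathbf{Y} - n \bar{Y}^2
\end{align*}
```

Under that intercept assumption, the last line is the matrix form of SSR, so the regression sum of squares is still \sum (\hat Y_i - \bar Y)^2 in multiple regression.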

On a more basic level: imagine trying to derive the normal equations (to estimate the regression coefficients) by algebra rather than via the matrix approach. It isn't fun.
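For comparison, the matrix route to the normal equations takes only a few lines. The sketch below assumes X'X is invertible; S(\beta) is a name introduced here for the sum of squared errors.

```latex
\begin{align*}
S(\boldsymbol{\beta}) &= (\mathbf{Y} - \mathbf{X}\boldsymbol{\beta})'(\mathbf{Y} - \mathbf{X}\boldsymbol{\beta})
  = \mathbf{Y}'\mathbf{Y} - 2 \boldsymbol{\beta}'\mathbf{X}'\mathbf{Y} + \boldsymbol{\beta}'\mathbf{X}'\mathbf{X}\boldsymbol{\beta} \\
\frac{\partial S}{\partial \boldsymbol{\beta}} &= -2 \mathbf{X}'\mathbf{Y} + 2 \mathbf{X}'\mathbf{X}\boldsymbol{\beta} = \mathbf{0}
  \quad \Longrightarrow \quad \mathbf{X}'\mathbf{X}\hat{\boldsymbol{\beta}} = \mathbf{X}'\mathbf{Y} \\
\hat{\boldsymbol{\beta}} &= (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{Y}
\end{align*}
```

Doing the same with scalar sums means one partial derivative and one equation per coefficient, which is the part that isn't fun.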

As one more example:

The fitted values in multiple regression can be written as

\[ \hat Y = X \left(X'X\right)^{-1} X' Y \equiv P_V Y \]

where $P_V = X \left(X'X\right)^{-1} X'$ is a projection matrix onto the space spanned by the columns of $X$.

The residuals are

\[ \hat e = Y - \hat Y = \left(I - X \left(X'X\right)^{-1} X'\right) Y \equiv P_{\hat V} Y \]

where $P_{\hat V} = I - X \left(X'X\right)^{-1} X'$ is the projection onto the space orthogonal to the column space of $X$.

Now, because $P_{\hat V}$ is symmetric and idempotent ($P_{\hat V}' = P_{\hat V}$ and $P_{\hat V}^2 = P_{\hat V}$) and $\hat Y = \left(I - P_{\hat V}\right) Y$,

\[ \hat e' \hat Y = Y' P_{\hat V} \left(I - P_{\hat V}\right) Y = Y' \left(P_{\hat V} - P_{\hat V}^2\right) Y = Y' \left(P_{\hat V} - P_{\hat V}\right) Y = 0 \]

or, in short,

\[ \sum \hat{e}_i \hat{Y}_i = 0 \]

just as in simple linear regression.
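The projection argument can also be checked numerically. The sketch below uses synthetic data with made-up dimensions and names (P_perp stands in for the projection onto the orthogonal complement), and it assumes X has full column rank and an intercept column. Forming the n-by-n projection matrix explicitly is fine for a small illustration, though it is not how one would fit a large regression.

```python
import numpy as np

# Synthetic data (illustrative only).
rng = np.random.default_rng(1)
n = 30
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])  # intercept + 2 predictors
Y = rng.normal(size=n)

# Projection onto the column space of X, and onto its orthogonal complement.
P_V = X @ np.linalg.inv(X.T @ X) @ X.T
P_perp = np.eye(n) - P_V

Y_hat = P_V @ Y      # fitted values
e_hat = P_perp @ Y   # residuals

checks = {
    "P_V symmetric": np.allclose(P_V, P_V.T),
    "P_V idempotent": np.allclose(P_V @ P_V, P_V),
    "P_perp idempotent": np.allclose(P_perp @ P_perp, P_perp),
    "residuals orthogonal to fitted values": np.isclose(e_hat @ Y_hat, 0.0, atol=1e-9),
    "residuals sum to zero (intercept present)": np.isclose(e_hat.sum(), 0.0, atol=1e-9),
}
for name, ok in checks.items():
    print(f"{name}: {ok}")
```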