How Does the Least Squares Estimator Minimize Error in Linear Regression?

  • Thread starter: squenshl
  • Tags: Estimate, Square
Summary

Homework Help Overview

The discussion revolves around the properties of the least squares estimator in the context of linear regression, specifically examining the expression for the error term in relation to the estimator and the true parameter. The problem involves a statistical model where the response variable follows a normal distribution with a specified mean and covariance structure.

Discussion Character

  • Conceptual clarification, Mathematical reasoning, Problem interpretation

Approaches and Questions Raised

  • Participants explore the implications of the problem statement, questioning the distinction between 'estimator' and 'estimate'. Some suggest that the problem may be trivial if no distinction is intended. Others propose substitution and expansion as a method to approach the proof, while expressing concerns about the implications of certain terms being zero.

Discussion Status

There is an ongoing examination of the problem's requirements, with participants actively questioning the assumptions and the clarity of the terms used. Some guidance has been offered regarding potential methods of substitution and expansion, but no consensus has been reached on the interpretation of the problem.

Contextual Notes

Participants note potential ambiguities in the problem statement and express uncertainty about the expectations regarding the proof. There is a mention of the need for precise terminology in the context of estimators and estimates, which may affect the interpretation of the task.

squenshl

Homework Statement


Suppose that ##Y \sim N_n\left(X\beta,\sigma^2I\right)##, where the density function of ##Y## is
$$\frac{1}{\left(2\pi\sigma^2\right)^{\frac{n}{2}}}e^{-\frac{1}{2\sigma^2}(Y-X\beta)^T(Y-X\beta)},$$
and ##X## is an ##n\times p## matrix of rank ##p##.
Let ##\hat{\beta}## be the least squares estimator of ##\beta##.

Show that ##(Y-X\beta)^T(Y-X\beta) = \left(Y-X\hat{\beta}\right)^T(Y-X\hat{\beta})+\left(\hat{\beta}-\beta\right)^TX^TX\left(\hat{\beta}-\beta\right)## and therefore that ##\hat{\beta}## is the least squares estimate.
Hint: ##Y-X\beta = Y-X\hat{\beta}+X\hat{\beta}-X\beta##.

Homework Equations

The Attempt at a Solution


I have no idea where to start. Do I substitute the hint into ##(Y-X\beta)^T(Y-X\beta)## and expand out the brackets?

Please help!
 
andrewkirk
There seems to be something odd about how this problem is stated. It asks the student to assume that ##\hat\beta## is the least squares estimator of ##\beta##, and then to use that to prove that it is the least squares estimate. Are they trying to draw a distinction between estimator and estimate? If not, the problem is trivial. However, if we want to be very precise about terminology, I would have thought that an estimator is a function, whereas an estimate is the result of applying that function to data. Is there some particular meaning of 'estimator' and 'estimate' that they are using in your course?

As to how to proceed to prove their formula: yes, substitution along the lines you mention sounds like a good way to start. You can rewrite the RHS of the hint as ##(Y-X\hat\beta)+X(\hat\beta-\beta)##. Expanding out then gives a right-hand side equal to what they show above, plus the cross term
$$2(X(\hat\beta-\beta))^T(Y-X\hat\beta)$$
So this needs to be shown to be zero. However, it seems to me that should be impossible, since it is a function of the unknown parameter vector ##\beta##, which can be changed without changing any of the other elements in the formula (##X,Y,\hat\beta##).

Are you sure there wasn't an expectation operator around that equation they want you to prove, or some other constraining condition?
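In fact the cross term can be probed directly: the least squares estimate satisfies the normal equations ##X^TX\hat\beta = X^TY##, so ##X^T(Y-X\hat\beta)=0##, and the ##\beta##-dependence of the cross term multiplies a zero vector. A minimal numerical sketch (numpy assumed, synthetic data, so the specific numbers are illustrative only):

```python
import numpy as np

# Synthetic regression problem: Y = X beta + noise.
rng = np.random.default_rng(0)
n, p = 20, 3
X = rng.normal(size=(n, p))
beta = rng.normal(size=p)          # an arbitrary "true" parameter
Y = X @ beta + rng.normal(size=n)  # noisy response

# Least squares estimate of beta.
bh, *_ = np.linalg.lstsq(X, Y, rcond=None)

# The two sides of the claimed decomposition, and the cross term
# 2 (X(bh - beta))^T (Y - X bh) from the expansion above.
lhs = (Y - X @ beta) @ (Y - X @ beta)
rhs = (Y - X @ bh) @ (Y - X @ bh) + (bh - beta) @ X.T @ X @ (bh - beta)
cross = 2 * (X @ (bh - beta)) @ (Y - X @ bh)

print(np.isclose(lhs, rhs))    # the decomposition holds
print(np.isclose(cross, 0.0))  # the cross term vanishes
```

The residual ##Y - X\hat\beta## is orthogonal to the column space of ##X##, so the cross term is zero no matter which ##\beta## is plugged in.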
 
andrewkirk said:
There seems to be something odd about how this problem is stated. [...] Are you sure there wasn't an expectation operator around that equation they want you to prove, or some other constraining condition?
Nope, that's the question as asked.
 
squenshl said:

Suppose that ##Y \sim N_n\left(X\beta,\sigma^2I\right)## [...] I have no idea where to start. Do I substitute the hint into ##(Y-X\beta)^T(Y-X\beta)## and expand out the brackets?
Let ##Q(\beta) = (Y - X \beta)^T (Y - X \beta)##. If you write ##\beta = b + e## you can expand ##Q(b+e)## as a quadratic in ##e##. It will have zero-order terms (not containing ##e##), first-order terms (linear in ##e##) and second-order terms (of the form ##e^T M e## for some matrix ##M## that depends on ##X, Y## and ##b##). However, if you choose ##b## correctly, the first-order terms in ##e## will vanish, leaving you with only zero-order and second-order terms in ##e##. That will happen when ##b = \hat{\beta}##. You will obtain the expression you are being asked to prove, with ##e = \beta - \hat{\beta}##.
 
