How Do You Derive the Least Squares Solution in Linear Regression?

SUMMARY

The discussion focuses on deriving the least squares solution in linear regression for the model y = Xb + e, where y is the vector of observations, X the design matrix, b the coefficient vector, and e the error term. The goal is to minimize the sum of squared errors e'e, which leads to the first-order condition 0 = -2X'y + 2X'Xb. The derivative of a quadratic form y = x'Ax is discussed: the gradient is (A + A')x in general, which simplifies to ∂y/∂x = 2Ax when A is symmetric. The conversation also notes that setting the derivative to zero amounts to solving a system of linear equations.

PREREQUISITES
  • Understanding of linear regression concepts
  • Familiarity with matrix calculus
  • Knowledge of symmetric matrices
  • Proficiency in vector derivatives
NEXT STEPS
  • Study matrix calculus in depth, focusing on derivatives of quadratic forms
  • Learn about symmetric matrices and their properties
  • Explore the application of least squares in various regression models
  • Review linear algebra concepts relevant to solving systems of equations
USEFUL FOR

Data scientists, statisticians, and machine learning practitioners interested in understanding the mathematical foundations of linear regression and optimizing model performance.

tennishaha
In least squares linear regression, say we have y = Xb + e (y, b, e are vectors and X is a matrix; y holds the observations, b the coefficients, and e the error terms).
So we need to minimize e'e = (y - Xb)'(y - Xb) = y'y - y'Xb - b'X'y + b'X'Xb. We can take the derivative of e'e with respect to b, and the answer should be 0 - 2X'y + 2X'Xb.

But I don't know how to get this result. (I don't know how to take a derivative with respect to a vector.)

Can anyone help? Thanks
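Before working through the derivation by hand, the stated gradient can be sanity-checked numerically. The following is a sketch using NumPy with arbitrary random data (the sizes and seed are illustrative assumptions, not from the thread), comparing the closed form -2X'y + 2X'Xb against central finite differences of e'e:

```python
import numpy as np

# Illustrative data: 10 observations, 3 coefficients (sizes are arbitrary).
rng = np.random.default_rng(0)
X = rng.normal(size=(10, 3))
y = rng.normal(size=10)
b = rng.normal(size=3)

def sse(b):
    """Sum of squared errors e'e = (y - Xb)'(y - Xb)."""
    e = y - X @ b
    return e @ e

# Closed-form gradient claimed in the thread: -2X'y + 2X'Xb.
analytic = -2 * X.T @ y + 2 * X.T @ X @ b

# Central finite-difference approximation of the same gradient.
h = 1e-6
numeric = np.zeros_like(b)
for k in range(b.size):
    d = np.zeros_like(b)
    d[k] = h
    numeric[k] = (sse(b + d) - sse(b - d)) / (2 * h)

print(np.allclose(analytic, numeric, atol=1e-4))  # True
```

The two gradients agree to within finite-difference error, which supports the formula even before the vector-calculus derivation below.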
 
Write it out as a sum:
$$y = x'Ax = \sum_i \sum_j x_i A_{ij} x_j$$
Differentiating with respect to the k-th component,
$$\frac{\partial y}{\partial x_k} = \sum_i \sum_j \left( x_i A_{ij} \delta_{jk} + \delta_{ik} A_{ij} x_j \right) = \sum_i x_i A_{ik} + \sum_j A_{kj} x_j.$$
If A is symmetric:
$$\frac{\partial y}{\partial x} = 2Ax$$
This works only if A is symmetric, though; otherwise it would be
$$\frac{\partial y}{\partial x} = (A + A')x.$$
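The symmetric vs. non-symmetric distinction can also be checked numerically. Here is a minimal NumPy sketch (random data assumed for illustration) comparing the general closed form (A + A')x against central finite differences of x'Ax:

```python
import numpy as np

# Arbitrary non-symmetric A and point x (sizes are illustrative).
rng = np.random.default_rng(2)
A = rng.normal(size=(4, 4))  # not symmetric in general
x = rng.normal(size=4)

# General closed form for the gradient of x'Ax.
analytic = (A + A.T) @ x

# Central finite-difference approximation, component by component.
h = 1e-6
numeric = np.zeros_like(x)
for k in range(x.size):
    d = np.zeros_like(x)
    d[k] = h
    numeric[k] = ((x + d) @ A @ (x + d) - (x - d) @ A @ (x - d)) / (2 * h)

print(np.allclose(analytic, numeric, atol=1e-4))  # True
```

For a symmetric A, A + A' = 2A, which recovers the 2Ax formula used in the regression derivation (X'X is always symmetric, so that case applies).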
You should note that demanding $\frac{\partial y}{\partial x} = 0$ actually means solving a system of linear equations, since the derivative is a vector in this case: setting $-2X'y + 2X'Xb = 0$ gives the normal equations $X'Xb = X'y$.
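Setting the derivative to zero and solving the resulting linear system can be sketched as follows. This is a NumPy illustration with synthetic data (the dimensions, true coefficients, and noise level are assumptions made for the example), cross-checked against NumPy's own least squares routine:

```python
import numpy as np

# Synthetic regression problem: 20 observations, 4 coefficients.
rng = np.random.default_rng(1)
X = rng.normal(size=(20, 4))
b_true = np.array([1.0, -2.0, 0.5, 3.0])
y = X @ b_true + 0.01 * rng.normal(size=20)

# Solve the normal equations X'X b = X'y (fine for well-conditioned X;
# prefer QR or lstsq when X'X is nearly singular).
b_normal = np.linalg.solve(X.T @ X, X.T @ y)

# Reference solution from NumPy's least squares solver.
b_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(np.allclose(b_normal, b_lstsq))  # True
```

Both routes give the same coefficients; the normal-equations form is exactly the linear system that comes out of setting the gradient to zero.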
 
