# Optimization & singular Hessian matrix

• swampwiz
In summary, the Hessian matrix is the normal product of the Vandermonde matrix, and the second derivative test requires the Hessian matrix to be positive definite (in particular, its determinant must be positive) at a minimum.

#### swampwiz

I am trying to figure out how the least squares formula is derived.

With the error function as

E_i = y_i - Σ_j x_ij a_j

the sum of the squared errors is

SSE = Σ_i E_i²

so the 1st partial derivative of SSE with respect to aj is

∂SSE / ∂a_j = 2 Σ_i E_i ( ∂E_i / ∂a_j )

with the 1st partial derivative of Ei with respect to aj being

∂E_i / ∂a_j = - x_ij

so the 2nd derivative of SSE with respect to a single a_j is

∂²SSE / ∂a_j² = 2 Σ_i { ( ∂E_i / ∂a_j ) ( ∂E_i / ∂a_j ) + E_i ( ∂²E_i / ∂a_j² ) }

and a mixed partial derivative (i.e., with respect to a_j & a_k) is

∂²SSE / ( ∂a_j ∂a_k ) = 2 Σ_i { ( ∂E_i / ∂a_j ) ( ∂E_i / ∂a_k ) + E_i ( ∂²E_i / ( ∂a_j ∂a_k ) ) }

and with the 2nd partial derivative of E_i with respect to a_j, and the mixed partial derivative (i.e., with respect to a_j & a_k), both being 0, since the 1st partial derivative - x_ij is a constant (i.e., independent of the coefficients a_j)

∂²E_i / ∂a_j² = ∂²E_i / ( ∂a_j ∂a_k ) = 0

the 2nd derivatives reduce to

∂²SSE / ∂a_j² = 2 Σ_i { ( ∂E_i / ∂a_j ) ( ∂E_i / ∂a_j ) }

∂²SSE / ( ∂a_j ∂a_k ) = 2 Σ_i { ( ∂E_i / ∂a_j ) ( ∂E_i / ∂a_k ) }

which, after substituting ∂E_i / ∂a_j = - x_ij with x_ij = x_i^j (a polynomial fit in powers of x_i), becomes

∂²SSE / ∂a_j² = 2 Σ_i x_i^(2j)

∂²SSE / ( ∂a_j ∂a_k ) = 2 Σ_i x_i^(j+k)

(the two minus signs cancel in the product)

so the Hessian matrix (e.g., 3 × 3, with j, k = 0, 1, 2, and dropping the common factor of 2) is

[ H ] =

[ n      Σ x     Σ x²  ]

[ Σ x    Σ x²    Σ x³  ]

[ Σ x²   Σ x³    Σ x⁴  ]

which is the normal product of the Vandermonde matrix

[ V ] =

[ 1   x₀   x₀² ]

[ 1   x₁   x₁² ]

[ 1   x₂   x₂² ]

[ H ] = [ V ]ᵀ [ V ]
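As a sanity check (a numerical sketch, not a proof), the derivation above can be verified with NumPy. The sample points and values below are made up for illustration; `np.vander` builds the Vandermonde matrix, a finite-difference Hessian of SSE is compared against 2 VᵀV, and det(VᵀV) is checked against det(V)².

```python
import numpy as np

# Illustrative (made-up) sample points and values for a quadratic fit.
x = np.array([0.5, 1.2, 2.0])
y = np.array([1.0, 0.3, 2.2])

# Vandermonde matrix with columns 1, x, x^2.
V = np.vander(x, 3, increasing=True)

def sse(a):
    """Sum of squared errors for coefficient vector a."""
    return np.sum((y - V @ a) ** 2)

# Analytic Hessian from the derivation: H_jk = 2 * sum_i x_i^(j+k), i.e. 2 V^T V.
H = 2 * V.T @ V

# Finite-difference Hessian of SSE (SSE is quadratic, so it is constant in a).
h = 1e-4
a0 = np.zeros(3)
H_fd = np.zeros((3, 3))
for j in range(3):
    for k in range(3):
        ej = h * np.eye(3)[j]
        ek = h * np.eye(3)[k]
        H_fd[j, k] = (sse(a0 + ej + ek) - sse(a0 + ej)
                      - sse(a0 + ek) + sse(a0)) / h**2

print(np.allclose(H, H_fd, rtol=1e-3))                          # True
print(np.isclose(np.linalg.det(V.T @ V), np.linalg.det(V)**2))  # True
```

For distinct sample points det(V) is nonzero, so det(VᵀV) = det(V)² comes out strictly positive here.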

While the diagonal terms are sums of even powers of x, and therefore always positive, the 2nd derivative test requires more: every leading principal minor of the Hessian matrix, including its full determinant, must be positive. Is there any way to prove that this normal product (or even just the Vandermonde matrix itself, since the square of its determinant equals the determinant of the normal product) is guaranteed to have a positive determinant?

So how to determine that indeed the critical point (which is the solution to the coefficients) is a minimum?

Last edited:
#### haruspex

swampwiz said:

with the 1st partial derivative of E_i with respect to a_j being

∂E_i / ∂a_j = - Σ ( x_ij )²

Isn't it just - x_ij ?

#### swampwiz

haruspex said:

Isn't it just - x_ij ?

Yes, I went through my analysis and detected the error, and continued.

## 1. What is Optimization?

Optimization is the process of finding the best solution for a given problem. In science, this often involves finding the maximum or minimum value of a function.
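A minimal sketch of the idea (a toy example unrelated to the thread's fitting problem): gradient descent on f(x) = (x − 3)², whose minimum is at x = 3.

```python
# Toy example: minimize f(x) = (x - 3)^2 by gradient descent.
# f'(x) = 2 (x - 3), so each step moves x toward the minimizer x = 3.
def grad(x):
    return 2.0 * (x - 3.0)

x = 0.0
for _ in range(200):
    x -= 0.1 * grad(x)   # step size 0.1 chosen for illustration

print(abs(x - 3.0) < 1e-9)  # True: converged to the minimum
```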

## 2. What is a singular Hessian matrix?

A singular Hessian matrix is a Hessian (the square matrix of second partial derivatives used in optimization) whose determinant is zero. This means that it does not have an inverse, making it difficult to use in certain optimization algorithms.
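A small illustration: for f(x, y) = x², the y direction contributes no curvature, so the Hessian is [[2, 0], [0, 0]], which has determinant zero and no inverse.

```python
import numpy as np

# Hessian of f(x, y) = x**2: curvature 2 in x, 0 in y -> singular.
H = np.array([[2.0, 0.0],
              [0.0, 0.0]])

print(np.linalg.det(H))  # determinant is zero -> singular

try:
    np.linalg.inv(H)
except np.linalg.LinAlgError:
    print("H is singular and has no inverse")
```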

## 3. How is a singular Hessian matrix used in optimization?

The Hessian matrix is evaluated at critical (stationary) points, where the gradient of the function is zero, to classify them. When the Hessian is singular at such a point, the second derivative test is inconclusive: the point may be a minimum, a maximum, or a saddle point, and higher-order information is needed to decide.
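A standard textbook example of this inconclusiveness (chosen here for illustration): for f(x, y) = x⁴ + y⁴, the origin is a critical point and in fact a global minimum, yet the Hessian there is the zero matrix, so the second derivative test alone cannot classify it.

```python
import numpy as np

# f(x, y) = x^4 + y^4: gradient (4x^3, 4y^3), Hessian diag(12x^2, 12y^2).
def gradient(x, y):
    return np.array([4 * x**3, 4 * y**3])

def hessian(x, y):
    return np.array([[12 * x**2, 0.0],
                     [0.0, 12 * y**2]])

g = gradient(0.0, 0.0)
H = hessian(0.0, 0.0)

print(np.all(g == 0))     # True: the origin is a critical point
print(np.linalg.det(H))   # 0.0: the second derivative test is inconclusive
```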

## 4. What are the implications of a singular Hessian matrix in optimization?

Having a singular Hessian matrix can make optimization more challenging, as it may not be possible to use certain algorithms that rely on the matrix's inverse. This can also indicate that the optimization problem may have multiple solutions or that the optimization algorithm may converge slowly.

## 5. How can a singular Hessian matrix be dealt with in optimization?

There are several approaches to dealing with a singular Hessian matrix in optimization. One option is to use a different optimization algorithm that does not rely on the matrix's inverse. Another approach is to make small adjustments to the matrix to make it nonsingular, such as adding a small value to the diagonal. Additionally, some optimization techniques, such as regularization, can help to mitigate the effects of a singular Hessian matrix.
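A sketch of the diagonal-adjustment idea mentioned above (Tikhonov-style damping, as used in the Levenberg-Marquardt method): adding a small multiple of the identity, λI, to a singular Hessian makes it nonsingular, so a Newton-type step can still be computed. The matrix, gradient, and λ below are illustrative values.

```python
import numpy as np

H = np.array([[2.0, 0.0],
              [0.0, 0.0]])          # singular Hessian: det = 0

lam = 1e-6                          # small damping parameter (illustrative)
H_damped = H + lam * np.eye(2)      # det = (2 + lam) * lam > 0, nonsingular

g = np.array([1.0, 0.0])            # illustrative gradient at the current point
step = np.linalg.solve(H_damped, g) # Newton-type step is now well defined

print(np.linalg.det(H_damped) > 0)  # True
```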