Proof: extremum has a semi-definite Hessian matrix


Discussion Overview

The discussion revolves around the proof that an extremum of a function has a semi-definite Hessian matrix. Participants explore the implications of Taylor's theorem for functions of multiple variables, particularly in the context of minima and the behavior of the Hessian matrix at those points.

Discussion Character

  • Technical explanation
  • Mathematical reasoning
  • Debate/contested

Main Points Raised

  • One participant presents a function that is twice continuously differentiable and has a minimum, referencing Taylor's theorem to relate the function's behavior near the minimum to its Hessian matrix.
  • Another participant suggests dividing both sides of the equation by the norm squared of the perturbation vector, ##||h||^2##, to facilitate taking limits.
  • A participant questions the implications of this division, noting that it leads to a limit expression that resembles a derivative but is complicated by the multivariable context.
  • Further contributions propose setting the perturbation vector as a normalized vector and analyzing the behavior of the Hessian as the perturbation approaches zero.
  • Some participants express confusion regarding the notation and the roles of the Hessian and Jacobian in the context of the discussion, indicating a need for clarification.

Areas of Agreement / Disagreement

Participants express differing views on the appropriate approach to take limits and the roles of the Hessian and Jacobian, indicating that the discussion remains unresolved with multiple competing perspectives.

Contextual Notes

There are uncertainties regarding the assumptions made about the derivatives and the notation used, particularly in distinguishing between the Hessian and Jacobian. The discussion also reflects a lack of consensus on the correct method for taking limits in this context.

Coffee_
Consider a function ##f : U \subseteq \mathbb{R}^{n} \to \mathbb{R}## of class ##C^{2}## which has a minimum at ##p \in U##.

According to Taylor's theorem for functions of several variables, for each ##h## with ##p+h \in U## there exists a ##t \in ]0,1[## such that:

##f(p+h)-f(p) = \frac{1}{2}\, h^{T}\, D^{2}(p+th)\, h##

where ##D^{2}## is the Hessian, i.e. the matrix of the mixed second-order partial derivatives; ##D^{2}(p+th)## indicates that it is evaluated at ##p+th##, not at ##p##. Since ##p## is a minimum, the Jacobian (first-derivative) term that would otherwise appear is zero.

Now I should take a limit and somehow show that whether ##p## is a strict minimum (##f(p)<f(x)##) or a non-strict one (##f(p)\leq f(x)##), in both cases ##D^{2}## will be positive semi-definite. Can someone help me finish/understand this final step formally? I'm not sure how a limit will work in the equation above; the left-hand side just goes to zero.
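The limit the OP asks about can be checked numerically. A minimal sketch, using the illustrative function ##f(x,y) = x^{2} + 2y^{2}## (not from the thread), which has a minimum at the origin with zero gradient: dividing the increment by ##||h||^{2}## with ##h = t\,u##, ##||u||=1##, the quotient stays nonnegative and converges to ##\tfrac{1}{2}\,u^{T} D^{2}(p)\, u##.

```python
# Numeric illustration of the limit (not a proof). The function is an
# illustrative choice, not from the thread: f(x, y) = x^2 + 2y^2, with a
# minimum at p = 0 and zero gradient there. For any fixed unit vector u,
# the quotient (f(p + t*u) - f(p)) / t^2 stays nonnegative and converges
# to (1/2) u^T H u as t -> 0, where H is the Hessian at p.
import numpy as np

def f(x):
    return x[0]**2 + 2.0 * x[1]**2

H = np.array([[2.0, 0.0],
              [0.0, 4.0]])   # Hessian of f (constant for this quadratic)

p = np.zeros(2)
rng = np.random.default_rng(0)
for _ in range(5):
    u = rng.normal(size=2)
    u /= np.linalg.norm(u)                     # fixed direction, ||u|| = 1
    t = 1e-4
    quotient = (f(p + t * u) - f(p)) / t**2    # divide by ||h||^2 = t^2
    target = 0.5 * u @ H @ u
    assert quotient >= 0.0                     # consistent with positive semi-definiteness
    assert abs(quotient - target) < 1e-6
```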
 
You could divide both sides by ## ||h||^2 ##.
 
wabbit said:
You could divide both sides by ## ||h||^2 ##.

Not sure what that would give me. On the left I'd get an expression similar to a derivative, but since it's in several variables I'm not sure. It would equal the limit of ##\frac{D*h}{||h||}##, I guess?
 
Well you can then have ##h\rightarrow 0## but ##h/||h||## remains finite, and you could set this to be any vector ##u=h/||h||## of your choice and see what this tells you about D when h approaches 0 but u is fixed.

Or if you prefer that formulation, set ##h=\theta u## and take the limit as ##\theta\rightarrow 0##

I don't know what your "*" is but if it is just usual matrix-vector multiplication, no this isn't what you get.
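Carrying wabbit's suggestion through gives the standard argument. A sketch, where the ##\tfrac{1}{2}## is the factor from Taylor's theorem and continuity of ##D^{2}## comes from the ##C^{2}## hypothesis:

```latex
\begin{align*}
&\text{Set } h = t\,u \text{ with } \|u\| = 1,\; t > 0.\ \text{Since } p \text{ is a minimum,} \\
&\qquad 0 \;\le\; \frac{f(p+t u) - f(p)}{t^{2}}
      \;=\; \tfrac{1}{2}\, u^{T}\, D^{2}(p + \theta\, t u)\, u
      \quad\text{for some } \theta \in \,]0,1[\,. \\
&\text{Letting } t \to 0 \text{ with } u \text{ fixed, continuity of } D^{2} \text{ gives} \\
&\qquad u^{T}\, D^{2}(p)\, u \;\ge\; 0 \quad\text{for every unit vector } u,
\end{align*}
```

so ##D^{2}(p)## is positive semi-definite.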
 
wabbit said:
Well you can then have ##h\rightarrow 0## but ##h/||h||## remains finite, and you could set this to be any vector ##u=h/||h||## of your choice and see what this tells you about D when h approaches 0 but u is fixed.

Or if you prefer that formulation, set ##h=\theta u## and take the limit as ##\theta\rightarrow 0##

I don't know what your "*" is but if it is just usual matrix-vector multiplication, no this isn't what you get.

I reasoned that that's what it's supposed to be, according to ##\lim_{h \to 0} \frac{f(p+h)-f(p)-D*h}{||h||}=0## where ##D## is the derivative matrix.
 
But as you stated in your OP, the first derivative is 0 and D is the Hessian, so suddenly using D as the Jacobian instead is truly bizarre reasoning.

Oh sorry, I see you were using ##D^2## to denote the Hessian - please read my previous statement as referring to that.
 
wabbit said:
But as you stated in your OP, the first derivative is 0 and D is the Hessian, so suddenly using D as the Jacobian instead is truly bizarre reasoning.

Oh right, should have thought a bit longer before replying.
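The conclusion of the thread can also be sanity-checked the other way around: at a minimum, all eigenvalues of the Hessian should be nonnegative. A sketch with an illustrative function (not from the thread), approximating the Hessian by central finite differences:

```python
# Hedged numeric check (illustrative function, not from the thread): at a
# minimum the Hessian should be positive semi-definite, i.e. all its
# eigenvalues are >= 0. We approximate the Hessian of
#   f(x, y) = x^2 + x*y + y^2   (minimum at the origin)
# by central finite differences and inspect its eigenvalues.
import numpy as np

def f(v):
    x, y = v
    return x**2 + x * y + y**2

def hessian_fd(f, p, eps=1e-5):
    """Central finite-difference approximation of the Hessian at p."""
    n = p.size
    H = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            e_i = np.zeros(n); e_i[i] = eps
            e_j = np.zeros(n); e_j[j] = eps
            H[i, j] = (f(p + e_i + e_j) - f(p + e_i - e_j)
                       - f(p - e_i + e_j) + f(p - e_i - e_j)) / (4 * eps**2)
    return H

p = np.zeros(2)
H = hessian_fd(f, p)
eigvals = np.linalg.eigvalsh(H)   # eigenvalues in ascending order
# Exact Hessian is [[2, 1], [1, 2]], eigenvalues 1 and 3, both >= 0.
```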
 
