Stationary points classification using definiteness of the Lagrangian


Homework Help Overview

The discussion revolves around classifying stationary points of a function using the Lagrange multipliers method, specifically focusing on the behavior of the Hessian matrix of the Lagrangian under constraints defined by an ellipse. The original poster has identified several stationary points and is attempting to classify them as maxima, minima, or saddle points based on the definiteness of the Hessian matrix.

Discussion Character

  • Exploratory, Conceptual clarification, Assumption checking

Approaches and Questions Raised

  • The original poster attempts to classify stationary points using the Hessian's definiteness but encounters confusion regarding additional steps required in constrained optimization. They question when it is sufficient to check the Hessian definiteness and why the gradient of the constraint plays a role in this classification.
  • Participants discuss the relevance of eigenvalues and eigenvectors of the Hessian matrix in relation to the gradient of the constraint, exploring how these relationships affect the classification of critical points.

Discussion Status

Participants are actively engaging with the original poster's questions, providing insights into the relationship between the Hessian matrix and the constraint gradient. Some guidance has been offered regarding the interpretation of eigenvalues and their corresponding eigenvectors, particularly in the context of constrained optimization.

Contextual Notes

There is an emphasis on the need to consider the behavior of the function restricted to the constraint curve, which may alter the interpretation of the Hessian's definiteness. The discussion also highlights the importance of understanding the orthogonality of certain vectors in this context.

fatpotato
Homework Statement
Find the max/min/saddle points of ##f(x,y) = x^4 - y^4## subject to the constraint ##g(x,y) = x^2 + 2y^2 - 1 = 0##
Use the Lagrange multipliers method
Classify the stationary points (max/min/saddle) using the definiteness of the Hessian
Relevant Equations
Positive/Negative definite matrix
Hello,

I am using the Lagrange multipliers method to find the extrema of ##f(x,y)## subject to the constraint ##g(x,y) = 0##, an ellipse.

So far, I have successfully identified several triplets ##(x^∗,y^∗,λ^∗)## such that each triplet is a stationary point for the Lagrangian: ##\nabla \mathscr{L} (x^∗,y^∗,λ^∗) = 0##
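For anyone who wants to reproduce these triplets, a minimal SymPy sketch (not part of the original post; the variable names are mine) that solves ##\nabla \mathscr{L} = 0## directly looks like this:

```python
# Minimal SymPy sketch: solve grad L = 0 for the Lagrangian L = f - lambda*g.
import sympy as sp

x, y, lam = sp.symbols('x y lam', real=True)
f = x**4 - y**4
g = x**2 + 2*y**2 - 1          # the ellipse constraint
L = f - lam * g

grad_L = [sp.diff(L, v) for v in (x, y, lam)]
triplets = sp.solve(grad_L, (x, y, lam), dict=True)
print(triplets)   # (0, ±sqrt(2)/2, -1/2) appears among the real solutions
```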

Now, I want to classify my triplets as max/min/saddle points, using the positive/negative definiteness of the Hessian like I have been doing for unconstrained optimization, so I compute what I think is the Hessian of the Lagrangian:

$$H_{\mathscr{L}}(x,y,λ)= \begin{pmatrix} 12x^2 - 2\lambda & 0 \\ 0 & -12y^2 - 4\lambda \end{pmatrix}$$

Evaluating the Hessian for my first triplet ##(0,\pm \frac{\sqrt{2}}{2},−\frac{1}{2})## gives me:

$$H_{\mathscr{L}}(0,\pm \frac{\sqrt{2}}{2},−\frac{1}{2}) = \begin{pmatrix} 1 & 0 \\ 0 & - 4\end{pmatrix}$$

This matrix is diagonal, so we can read its eigenvalues directly off the diagonal: ##\lambda_1 = 1 > 0## and ##\lambda_2 = -4 < 0##. A positive/negative definite matrix has only positive/negative eigenvalues, so I conclude that this matrix is neither: it is indefinite, since its eigenvalues have opposite signs.
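As a quick cross-check of this evaluation, a small NumPy sketch (my own illustration, not from the thread) gives the same entries and eigenvalues:

```python
# Build the Hessian of the Lagrangian (w.r.t. x and y) and evaluate it
# at the triplet (0, sqrt(2)/2, -1/2).
import numpy as np

def hessian_L(x, y, lam):
    return np.array([[12*x**2 - 2*lam, 0.0],
                     [0.0, -12*y**2 - 4*lam]])

H = hessian_L(0.0, np.sqrt(2)/2, -0.5)
print(H)                        # [[ 1.  0.] [ 0. -4.]]
print(np.linalg.eigvalsh(H))    # [-4.  1.]  -> one of each sign: indefinite
```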

When I was studying unconstrained optimization, I learned that an indefinite Hessian means a saddle point, so I would like to think that the points ##(0,\pm \frac{\sqrt{2}}{2})## are both saddle points of my function ##f##. However, the solution to this problem states that these points are minima, using the following argument:

[Attached image "Lagrange_Mult_Sol.PNG": screenshot of the textbook solution's argument]

Using the fact that ##\nabla g(0,\pm \frac{\sqrt{2}}{2}) = (0,\pm 2\sqrt{2})## and that ##w^T \nabla g = 0## for a nonzero vector ##w## if and only if ##w = (\alpha, 0)##, ##\alpha \in \mathbb{R}^{\ast}##

I thought that it was enough to check for the definiteness of the Hessian, and now I am really confused...

Here are my questions:
  1. When is it enough to check the definiteness of the Hessian to classify stationary points?
  2. Why is there this additional step in constrained optimization?
  3. What am I missing?
Thank you for your time.

Edit: PF destroyed my LaTeX formatting.
 
You are constrained to look at the behaviour of ##f## restricted to the one-dimensional ellipse ##g(x,y) = x^2 + 2y^2 - 1 = 0##. Whether ##f## increases or decreases as you move off this curve does not concern you.

To remain on the curve, your direction of travel must be orthogonal to ##\nabla g##. In this case, the eigenvector corresponding to the negative eigenvalue is parallel to ##\nabla g## and the eigenvector corresponding to the positive eigenvalue is orthogonal to it, so as far as you are concerned the negative eigenvalue is irrelevant: this critical point is a minimum.

In general, ##\nabla g## will not be an eigenvector of the Hessian of the Lagrangian. The text therefore defines a vector ##\mathbf{w}## orthogonal to ##\nabla g## and looks at
$$f(\mathbf{x} + \alpha\mathbf{w}) \approx f(\mathbf{x}) + \tfrac12\alpha^2\, \mathbf{w}^T H \mathbf{w}$$
to determine whether a critical point of ##f## subject to this constraint is a minimum (##\mathbf{w}^T H\mathbf{w} > 0##) or a maximum (##\mathbf{w}^T H\mathbf{w} < 0##).
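To make this concrete, here is a short numerical sketch of the test described above, using the numbers from this thread (an illustration I have added, taking ##H## to be the Hessian of the Lagrangian evaluated at the triplet):

```python
# Second-order test restricted to directions orthogonal to grad g
# at the critical point (0, sqrt(2)/2), with H the Hessian of the Lagrangian.
import numpy as np

H = np.array([[1.0, 0.0],
              [0.0, -4.0]])
grad_g = np.array([0.0, 2*np.sqrt(2)])   # grad g = (2x, 4y) at (0, sqrt(2)/2)

w = np.array([1.0, 0.0])                 # the only direction (up to scaling) with w . grad_g = 0
assert abs(w @ grad_g) < 1e-12

print(w @ H @ w)   # 1.0 > 0  -> constrained minimum, even though H itself is indefinite
```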
 
Thank you for your answer. There is a new point I do not understand:
pasmith said:
In this case, the eigenvector corresponding to the negative eigenvalue is parallel to ##\nabla g## and the eigenvector corresponding to the positive eigenvalue is orthogonal to it, so as far as you are concerned the negative eigenvalue is irrelevant: this critical point is a minimum.
How do you deduce that the eigenvector corresponding to the negative eigenvalue is parallel to the gradient? Same question for the eigenvector corresponding to the positive eigenvalue?
 
fatpotato said:
Thank you for your answer. There is a new point I do not understand:

How do you deduce that the eigenvector corresponding to the negative eigenvalue is parallel to the gradient? Same question for the eigenvector corresponding to the positive eigenvalue?

By inspection. In this case the Hessian is diagonal, so we immediately see that the eigenvectors are ##(1,0)## with eigenvalue ##H_{11} = 1## and ##(0,1)## with eigenvalue ##H_{22} = -4##. ##\nabla g## is easily computed to be ##(2x, 4y)##, and at the critical points this is ##(0, \pm 2\sqrt{2})##. The direction orthogonal to it is therefore ##(1,0)##.
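The same inspection can be verified numerically; the following is a small added sketch, not part of the original reply:

```python
# Eigen-decomposition of the diagonal Hessian, plus the constraint gradient
# at the critical points (0, +/- sqrt(2)/2).
import numpy as np

H = np.array([[1.0, 0.0],
              [0.0, -4.0]])
vals, vecs = np.linalg.eigh(H)
print(vals)        # [-4.  1.]
print(vecs)        # columns (up to sign): (0, 1) with -4, (1, 0) with 1

for y in (np.sqrt(2)/2, -np.sqrt(2)/2):
    grad_g = np.array([0.0, 4*y])        # (0, +/- 2*sqrt(2)), parallel to (0, 1)
    print(grad_g @ np.array([1.0, 0.0])) # 0.0: the eigenvector (1, 0) is orthogonal to grad g
```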
 
Yes, of course: we look for the kernel of ##H - \lambda_i I##, so the vector ##(1,0)^T## is mapped to ##(0,0)^T## by ##H - \lambda_1 I## with ##\lambda_1 = 1##, which makes it the eigenvector associated with ##\lambda_1##.
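A one-line symbolic check of this kernel computation (added for illustration) is:

```python
# Kernel of H - lambda_1 * I for lambda_1 = 1 gives the associated eigenvector.
import sympy as sp

H = sp.Matrix([[1, 0], [0, -4]])
print((H - 1*sp.eye(2)).nullspace())   # [Matrix([[1], [0]])]
```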

Sorry, I knew the concept, but could not deduce it myself. Now it is perfectly clear, thank you!
 
