Lagrange multipliers and critical points

SUMMARY

This discussion focuses on the application of Lagrange multipliers in constrained optimization and the behavior of gradients at critical points. It establishes that while the gradient of a function is zero at an extremum in unconstrained optimization, this need not hold under a constraint: the constraint set is typically not open in the higher-dimensional domain, and all that is required is that the gradient be orthogonal to the constraint surface at the extremum. A numerical example illustrates that constrained extrema can exist where unconstrained extrema do not, emphasizing the importance of understanding the relationship between gradients and constraints.

PREREQUISITES
  • Understanding of Lagrange multipliers in optimization
  • Knowledge of gradients and their geometric interpretations
  • Familiarity with constrained optimization problems
  • Basic calculus, particularly in multivariable functions
NEXT STEPS
  • Study the geometric interpretation of gradients in constrained optimization
  • Explore numerical examples of Lagrange multipliers in various dimensions
  • Learn about the projection of vectors onto surfaces in multivariable calculus
  • Investigate the implications of constraint boundaries on optimization solutions
USEFUL FOR

Students and professionals in mathematics, engineering, and economics who are involved in optimization problems, particularly those dealing with constraints and critical points.

mr.tea
Hi,

I have (probably) a fundamental problem understanding something related to critical points and Lagrange multipliers.

As we know, if a function assumes an extreme value in an interior point of some open set, then the gradient of the function is 0.

Now, when dealing with constrained optimization using Lagrange multipliers, we also find an extreme value of the function restricted to some curve.

So why, in the case of constrained optimization, can't we also search for points where the gradient is 0? What am I missing here?

Thank you.
 
mr.tea said:
So why, in the case of constrained optimization, can't we also search for points where the gradient is 0?
The answer occurs in an earlier part of your post:
mr.tea said:
if a function assumes an extreme value in an interior point of some open set, then the gradient of the function is 0
Typically, the set to which the function is constrained is not open in the domain of the function, which usually has at least one more dimension than the constraint set has. For instance a (1D) curve is not open in ##\mathbb R^2##, and a (2D) surface is not open in ##\mathbb R^3##.

Typically, the gradient of the unconstrained function is not zero at a point that is an extremum of the function confined to a surface. All that is required is that the projection of the gradient onto the surface is zero, i.e. that the gradient is orthogonal to the surface.
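That orthogonality claim can be checked numerically. Here is a minimal Python sketch, using a hypothetical example function ##f(x,y)=x+y## on the unit circle (chosen for illustration, not taken from this thread): at the constrained maximum, the unconstrained gradient is nonzero, but its projection onto the tangent direction of the circle vanishes.

```python
import math

# Hedged sketch: f(x, y) = x + y restricted to the unit circle x^2 + y^2 = 1.
# The constrained maximum sits at (1/sqrt(2), 1/sqrt(2)).  There, the
# unconstrained gradient of f is (1, 1): nonzero, but orthogonal to the circle.

def grad_f(x, y):
    return (1.0, 1.0)          # gradient of f(x, y) = x + y

px, py = 1 / math.sqrt(2), 1 / math.sqrt(2)   # constrained extremum
tangent = (-py, px)                            # unit tangent to the circle at (px, py)

gx, gy = grad_f(px, py)
proj_on_tangent = gx * tangent[0] + gy * tangent[1]

print(abs(proj_on_tangent) < 1e-12)   # True: projection onto the constraint set vanishes
print(math.hypot(gx, gy) > 0)         # True: but the gradient itself is not zero
```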
 
andrewkirk said:
The answer occurs in an earlier part of your post:

Typically, the set to which the function is constrained is not open in the domain of the function, which usually has at least one more dimension than the constraint set has. For instance a (1D) curve is not open in ##\mathbb R^2##, and a (2D) surface is not open in ##\mathbb R^3##.

Typically, the gradient of the unconstrained function is not zero at a point that is an extremum of the function confined to a surface. All that is required is that the projection of the gradient onto the surface is zero, i.e. that the gradient is orthogonal to the surface.
Do you mean, for example, that the constraint set ##x+y-z=0##, which has dimension 2, is not open in ##\mathbb R^3##, even though this set consists of the triples ##(x,y,x+y)##?

I am not sure I understand the first sentence of the second paragraph (did you mean "the gradient of the constrained..."?)

Thank you.
 
mr.tea said:
Do you mean, for example, that the constraint set ##x+y-z=0##, which has dimension 2, is not open in ##\mathbb R^3##, even though this set consists of the triples ##(x,y,x+y)##?
No, the plane isn't open in ##\mathbb{R}^3##. It's even closed. However, to avoid misunderstandings, closed is not the opposite of open: the complement of an open set is closed and vice versa, but there are sets which are neither, or both. The parametrization by ##(x,y,x+y)## even shows this, because you leave the plane as soon as you change one coordinate a little bit while leaving the others unchanged, so no point of the plane has an open neighborhood contained in it.

The point is that often you will find the solutions of an optimization problem on the boundaries of your feasible set and not in the middle of a nice open neighborhood. (If I remember correctly, it can even be formulated that the solution is always on the boundary, but it has been too long for me to remember whether the "always" is correct here.)

You might want to look at the examples on the Wikipedia page: https://en.wikipedia.org/wiki/Lagrange_multiplier#Example_1
In Example 1, the gradient of the constraint function ##g## is ##(2x,2y)##, which vanishes at ##(0,0)##; but the origin is excluded by the constraint.
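That remark can be sanity-checked in a few lines of Python (the function names here are invented for illustration): the only zero of ##\nabla g## violates the constraint, and on the constraint set itself the gradient of ##g## never vanishes.

```python
import math

# Hedged check: for the constraint g(x, y) = x^2 + y^2 - 1 from the
# Wikipedia example, grad g = (2x, 2y) vanishes only at the origin,
# and the origin does not satisfy the constraint g = 0.

def g(x, y):
    return x**2 + y**2 - 1

def grad_g(x, y):
    return (2 * x, 2 * y)

# grad g = (0, 0) forces (x, y) = (0, 0), which violates the constraint:
assert g(0, 0) != 0

# On the constraint set itself, |grad g| = 2 everywhere, so the gradient
# never vanishes where the Lagrange condition is actually applied.
for k in range(100):
    t = 2 * math.pi * k / 100
    x, y = math.cos(t), math.sin(t)
    assert abs(math.hypot(*grad_g(x, y)) - 2) < 1e-12
print("ok")
```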
 
mr.tea said:
I am not sure I understand the first sentence of the second paragraph (did you mean "the gradient of the constrained..."?)
No, I meant unconstrained. Let the surface be the plane you indicated in post 3 (call it ##P##), and let ##f## be some function from ##\mathbb R^3## to ##\mathbb R##. That function has a 3D domain, and its gradient is a 3D vector (a dual vector actually, but that's a complication we needn't bother with here). But the domain of the constrained function ##f|_P## is the plane, which is 2D.

The gradient of ##f|_P## will be zero at a point ##q## that is an extremum on the plane, but the gradient of ##f## is unlikely to be zero at ##q##, considered as a point in ##\mathbb R^3##.
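A numerical sketch may make this concrete. Take the plane ##P: x+y-z=0## from post 3 and a hypothetical function ##f(x,y,z) = x^2 + y^2 + z## (chosen here purely for illustration): parametrizing ##P## by ##(x,y,x+y)## gives ##f|_P(x,y) = x^2+y^2+x+y##, whose gradient vanishes at ##q = (-\tfrac12, -\tfrac12, -1)##, while the 3D gradient of ##f## at ##q## is nonzero and parallel to the normal of ##P##.

```python
# Hedged illustration with the plane P: x + y - z = 0 and the
# (hypothetical) function f(x, y, z) = x^2 + y^2 + z.

def grad_f(x, y, z):
    return (2 * x, 2 * y, 1.0)         # 3D gradient of f

def grad_f_restricted(x, y):
    return (2 * x + 1, 2 * y + 1)      # 2D gradient of f|_P(x, y) = x^2 + y^2 + x + y

qx, qy = -0.5, -0.5
qz = qx + qy                            # q = (-0.5, -0.5, -1) lies on P

print(grad_f_restricted(qx, qy))        # (0.0, 0.0): q is an extremum of f|_P
print(grad_f(qx, qy, qz))               # (-1.0, -1.0, 1.0): NOT zero in R^3
# The 3D gradient equals -1 * (1, 1, -1), i.e. it is parallel to the normal
# of P, exactly as the Lagrange condition grad f = lambda * grad g requires.
```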
 
andrewkirk said:
The gradient of ##f|_P## will be zero at a point ##q## that is an extremum on the plane, but the gradient of ##f## is unlikely to be zero at ##q##, considered as a point in ##\mathbb R^3##.

Can you give a numerical example? I still find it hard to see how the gradient of ##f|_{p} ## can be 0 but not the gradient of ##f##.
 
$$f(x,y) = x^{2} + y^{2} + x$$
Restricted to the unit circle ##x^{2} + y^{2} = 1##

In this case, you can see that the constrained extrema are at (1,0) and (-1,0), whereas the only unconstrained extremum is at (-0.5,0).

Edit: Oops, forgot to show the part with the gradients.

Unconstrained: $$\nabla f(x,y) = (2x+1, 2y)$$

Constrained: $$\nabla (f - \lambda g)(x,y) = (2(1-\lambda)x + 1, 2(1-\lambda)y)$$

Setting ##2(1-\lambda)y## to zero shows that ##y = 0## at the constrained extrema. Plugging this into the equation of the unit circle gives ##x = \pm 1##. Solving ##2(1-\lambda)x+1=0## for ##\lambda## gives ##\lambda = \frac{1}{2x} + 1##, so ##\lambda = \frac{3}{2}## at ##x = 1## and ##\lambda = \frac{1}{2}## at ##x = -1##. So the two forms of the constrained gradient are
$$\nabla(f - \frac{1}{2}g)(x,y) = (x+1, y)$$
and
$$\nabla(f - \frac{3}{2}g)(x,y) = (-x + 1, -y)$$

That clearly shows that the constrained gradient(s) don't necessarily agree with the unconstrained gradient.
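For anyone who wants to verify the arithmetic above, here is a minimal Python check of the same computation (the functions and multipliers are taken straight from the example; the code itself is only an illustration):

```python
# Hedged check of the worked example: f(x, y) = x^2 + y^2 + x on the
# unit circle g(x, y) = x^2 + y^2 - 1 = 0, with the multipliers found above.

def grad_f(x, y):
    return (2 * x + 1, 2 * y)

def grad_g(x, y):
    return (2 * x, 2 * y)

for (x, y), lam in [((1.0, 0.0), 1.5), ((-1.0, 0.0), 0.5)]:
    gf, gg = grad_f(x, y), grad_g(x, y)
    lagrange = (gf[0] - lam * gg[0], gf[1] - lam * gg[1])
    print(lagrange)                 # (0.0, 0.0) at both constrained extrema
    print(gf)                       # the plain gradient of f is nonzero here
```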
 
mr.tea said:
I still find it hard to see how the gradient of ##f|_{p}## can be 0 but not the gradient of ##f##.

For an intuitive, semi-rigorous way of thinking about it: the constrained gradient is the projection of the unconstrained gradient onto the tangent plane of the constraint surface. At an extremum, the gradient of the constraint function times the Lagrange multiplier is, in general, the normal component of the unconstrained gradient. You subtract out the normal part to leave only the tangential part, and then solve for stationary points of that tangential part (the constrained gradient) alone.

The constrained extrema have the tangential part of the gradient equal to 0, whereas the unconstrained extrema have both the tangential and the normal parts equal to 0. However, the constrained extrema still have to satisfy the equation of constraint, and that's why you can have constrained extrema that aren't also unconstrained extrema. For example, in the example I gave above, the point (-0.5,0) isn't on the unit circle. But if I had instead made the equation of constraint ##x^{2} + y^{2} = (\frac{1}{2})^{2}##, then (-0.5,0) would have been one extremum.
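That projection picture can be sketched numerically, reusing the circle example from this thread (the helper `tangential_part` is a name invented here for illustration): subtracting the normal component of ##\nabla f## leaves a tangential part that vanishes exactly at the constrained extrema.

```python
# Hedged sketch of the projection picture, for f(x, y) = x^2 + y^2 + x on
# the unit circle.  At any point p on the circle the outward unit normal is
# p itself; the tangential part of grad f is what Lagrange's method kills.

def grad_f(x, y):
    return (2 * x + 1, 2 * y)

def tangential_part(p):
    gx, gy = grad_f(*p)
    nx, ny = p                       # unit normal at p on the unit circle
    normal_comp = gx * nx + gy * ny  # normal component of the unconstrained gradient
    return (gx - normal_comp * nx, gy - normal_comp * ny)

print(tangential_part((1.0, 0.0)))   # (0.0, 0.0): constrained extremum
print(tangential_part((-1.0, 0.0)))  # (0.0, 0.0): constrained extremum
print(tangential_part((0.0, 1.0)))   # (1.0, 0.0): not an extremum on the circle
```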
 
