Lagrange multipliers and critical points

SUMMARY

This discussion focuses on the application of Lagrange multipliers in constrained optimization and the behavior of gradients at critical points. It establishes that while the gradient of a function is zero at an extremum in unconstrained optimization, this need not hold under a constraint: the constraint set is typically not open in the higher-dimensional domain, and all that is required is that the gradient be orthogonal to the constraint surface at the extremum. A numerical example illustrates that constrained extrema can exist where unconstrained extrema do not, emphasizing the importance of understanding the relationship between gradients and constraints.

PREREQUISITES
  • Understanding of Lagrange multipliers in optimization
  • Knowledge of gradients and their geometric interpretations
  • Familiarity with constrained optimization problems
  • Basic calculus, particularly in multivariable functions
NEXT STEPS
  • Study the geometric interpretation of gradients in constrained optimization
  • Explore numerical examples of Lagrange multipliers in various dimensions
  • Learn about the projection of vectors onto surfaces in multivariable calculus
  • Investigate the implications of constraint boundaries on optimization solutions
USEFUL FOR

Students and professionals in mathematics, engineering, and economics who are involved in optimization problems, particularly those dealing with constraints and critical points.

mr.tea
Hi,

I have (probably) a fundamental problem understanding something related to critical points and Lagrange multipliers.

As we know, if a function assumes an extreme value in an interior point of some open set, then the gradient of the function is 0.

Now, when dealing with constrained optimization using Lagrange multipliers, we also find an extreme value of the function restricted to some curve.

So why, in the case of constrained optimization, can't we also search for points where the gradient is 0? What am I missing here?

Thank you.
 
mr.tea said:
So why, in the case of constrained optimization, can't we also search for points where the gradient is 0?
The answer occurs in an earlier part of your post:
mr.tea said:
if a function assumes an extreme value in an interior point of some open set, then the gradient of the function is 0
Typically, the set to which the function is constrained is not open in the domain of the function, which usually has at least one more dimension than the constraint set has. For instance a (1D) curve is not open in ##\mathbb R^2##, and a (2D) surface is not open in ##\mathbb R^3##.

Typically, the gradient of the unconstrained function is not zero at a point that is an extremum of the function confined to a surface. All that is required is that the projection of the gradient onto the surface is zero, i.e. that the gradient is orthogonal to the surface.
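That orthogonality claim can be checked numerically. Here is a minimal Python sketch, using a hypothetical example function ##f(x,y)=x+y## on the unit circle (chosen for illustration, not taken from this thread): at the constrained maximum, the unconstrained gradient is nonzero, but its projection onto the tangent direction of the circle vanishes.

```python
import math

# Hedged sketch: f(x, y) = x + y restricted to the unit circle x^2 + y^2 = 1.
# The constrained maximum sits at (1/sqrt(2), 1/sqrt(2)).  There, the
# unconstrained gradient of f is (1, 1): nonzero, but orthogonal to the circle.

def grad_f(x, y):
    return (1.0, 1.0)          # gradient of f(x, y) = x + y

px, py = 1 / math.sqrt(2), 1 / math.sqrt(2)   # constrained extremum
tangent = (-py, px)                            # unit tangent to the circle at (px, py)

gx, gy = grad_f(px, py)
proj_on_tangent = gx * tangent[0] + gy * tangent[1]

print(abs(proj_on_tangent) < 1e-12)   # True: projection onto the constraint set vanishes
print(math.hypot(gx, gy) > 0)         # True: but the gradient itself is not zero
```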
 
andrewkirk said:
The answer occurs in an earlier part of your post:

Typically, the set to which the function is constrained is not open in the domain of the function, which usually has at least one more dimension than the constraint set has. For instance a (1D) curve is not open in ##\mathbb R^2##, and a (2D) surface is not open in ##\mathbb R^3##.

Typically, the gradient of the unconstrained function is not zero at a point that is an extremum of the function confined to a surface. All that is required is that the projection of the gradient onto the surface is zero, i.e. that the gradient is orthogonal to the surface.
Do you mean, for example, that the constraint set ##x+y-z=0##, which has dimension 2, is not open in ##\mathbb R^3##, even though this set consists of the triples ##(x,y,x+y)##?

I am not sure I understand the first sentence of the second paragraph (did you mean "the gradient of the constrained..."?)

Thank you.
 
mr.tea said:
Do you mean, for example, that the constraint set ##x+y-z=0##, which has dimension 2, is not open in ##\mathbb R^3##, even though this set consists of the triples ##(x,y,x+y)##?
No, the plane isn't open in ##\mathbb{R}^3##. It's even closed. However, to avoid misunderstandings, closed is not the opposite of open: the complement of an open set is closed and vice versa, but there are sets which are neither, or both. The parametrization by ##(x,y,x+y)## even shows this, because you leave the plane as soon as you change one coordinate a little bit while leaving the others unchanged, so no point of the plane has an open neighborhood contained in it.

The point is that often you will find the solutions of an optimization problem on the boundaries of your feasible set and not in the middle of a nice open neighborhood. (If I remember correctly, it can even be formulated that the solution is always on the boundary, but it has been too long for me to remember whether the "always" is correct here.)

You might want to look at the examples on the Wikipedia page: https://en.wikipedia.org/wiki/Lagrange_multiplier#Example_1
In Example 1, the gradient of the constraint function ##g## is ##(2x,2y)##, which vanishes at ##(0,0)##; but the origin is excluded by the constraint.
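That remark can be sanity-checked in a few lines of Python (the function names here are invented for illustration): the only zero of ##\nabla g## violates the constraint, and on the constraint set itself the gradient of ##g## never vanishes.

```python
import math

# Hedged check: for the constraint g(x, y) = x^2 + y^2 - 1 from the
# Wikipedia example, grad g = (2x, 2y) vanishes only at the origin,
# and the origin does not satisfy the constraint g = 0.

def g(x, y):
    return x**2 + y**2 - 1

def grad_g(x, y):
    return (2 * x, 2 * y)

# grad g = (0, 0) forces (x, y) = (0, 0), which violates the constraint:
assert g(0, 0) != 0

# On the constraint set itself, |grad g| = 2 everywhere, so the gradient
# never vanishes where the Lagrange condition is actually applied.
for k in range(100):
    t = 2 * math.pi * k / 100
    x, y = math.cos(t), math.sin(t)
    assert abs(math.hypot(*grad_g(x, y)) - 2) < 1e-12
print("ok")
```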
 
mr.tea said:
I am not sure I understand the first sentence of the second paragraph (did you mean "the gradient of the constrained..."?)
No, I meant unconstrained. Let the surface be the plane you indicated in post 3 (call it ##P##), and let ##f## be some function from ##\mathbb R^3## to ##\mathbb R##. That function has a 3D domain, and its gradient is a 3D vector (a dual vector actually, but that's a complication we needn't bother with here). But the domain of the constrained function ##f|_P## is the plane, which is 2D.

The gradient of ##f|_P## will be zero at a point ##q## that is an extremum on the plane, but the gradient of ##f## is unlikely to be zero at ##q##, considered as a point in ##\mathbb R^3##.
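A numerical sketch may make this concrete. Take the plane ##P: x+y-z=0## from post 3 and a hypothetical function ##f(x,y,z) = x^2 + y^2 + z## (chosen here purely for illustration): parametrizing ##P## by ##(x,y,x+y)## gives ##f|_P(x,y) = x^2+y^2+x+y##, whose gradient vanishes at ##q = (-\tfrac12, -\tfrac12, -1)##, while the 3D gradient of ##f## at ##q## is nonzero and parallel to the normal of ##P##.

```python
# Hedged illustration with the plane P: x + y - z = 0 and the
# (hypothetical) function f(x, y, z) = x^2 + y^2 + z.

def grad_f(x, y, z):
    return (2 * x, 2 * y, 1.0)         # 3D gradient of f

def grad_f_restricted(x, y):
    return (2 * x + 1, 2 * y + 1)      # 2D gradient of f|_P(x, y) = x^2 + y^2 + x + y

qx, qy = -0.5, -0.5
qz = qx + qy                            # q = (-0.5, -0.5, -1) lies on P

print(grad_f_restricted(qx, qy))        # (0.0, 0.0): q is an extremum of f|_P
print(grad_f(qx, qy, qz))               # (-1.0, -1.0, 1.0): NOT zero in R^3
# The 3D gradient equals -1 * (1, 1, -1), i.e. it is parallel to the normal
# of P, exactly as the Lagrange condition grad f = lambda * grad g requires.
```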
 
andrewkirk said:
The gradient of ##f|_P## will be zero at a point ##q## that is an extremum on the plane, but the gradient of ##f## is unlikely to be zero at ##q##, considered as a point in ##\mathbb R^3##.

Can you give a numerical example? I still find it hard to see how the gradient of ##f|_{p} ## can be 0 but not the gradient of ##f##.
 
$$f(x,y) = x^{2} + y^{2} + x$$
Restricted to the unit circle ##x^{2} + y^{2} = 1##

In this case, you can see that the constrained extrema are at (1,0) and (-1,0), whereas the only unconstrained extremum is at (-0.5,0).

Edit: Oops, forgot to show the part with the gradients.

Unconstrained: $$\nabla f(x,y) = (2x+1, 2y)$$

Constrained: $$\nabla (f - \lambda g)(x,y) = (2(1-\lambda)x + 1, 2(1-\lambda)y)$$

Setting ##2(1-\lambda)y## to zero shows that ##y = 0## at the constrained extrema. Plugging this into the equation of the unit circle gives ##x = \pm 1##. Solving ##2(1-\lambda)x+1=0## for ##\lambda## gives ##\lambda = \frac{1}{2x} + 1##, so ##\lambda = \frac{3}{2}## at ##x = 1## and ##\lambda = \frac{1}{2}## at ##x = -1##. So the two forms of the constrained gradient are
$$\nabla(f - \frac{1}{2}g)(x,y) = (x+1, y)$$
and
$$\nabla(f - \frac{3}{2}g)(x,y) = (-x + 1, -y)$$

That clearly shows that the constrained gradient(s) don't necessarily agree with the unconstrained gradient.
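For anyone who wants to verify the arithmetic above, here is a minimal Python check of the same computation (the functions and multipliers are taken straight from the example; the code itself is only an illustration):

```python
# Hedged check of the worked example: f(x, y) = x^2 + y^2 + x on the
# unit circle g(x, y) = x^2 + y^2 - 1 = 0, with the multipliers found above.

def grad_f(x, y):
    return (2 * x + 1, 2 * y)

def grad_g(x, y):
    return (2 * x, 2 * y)

for (x, y), lam in [((1.0, 0.0), 1.5), ((-1.0, 0.0), 0.5)]:
    gf, gg = grad_f(x, y), grad_g(x, y)
    lagrange = (gf[0] - lam * gg[0], gf[1] - lam * gg[1])
    print(lagrange)                 # (0.0, 0.0) at both constrained extrema
    print(gf)                       # the plain gradient of f is nonzero here
```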
 
mr.tea said:
I still find it hard to see how the gradient of ##f|_{p}## can be 0 but not the gradient of ##f##.

For an intuitive, semi-rigorous way of thinking about it: the constrained gradient is the projection of the unconstrained gradient onto the tangent plane of the constraint surface. At an extremum, the gradient of the constraint function times the Lagrange multiplier is, in general, the normal component of the unconstrained gradient. You subtract out the normal part to leave only the tangential part, and then solve for stationary points of that tangential part (the constrained gradient) alone.

The constrained extrema have the tangential part of the gradient equal to 0, whereas the unconstrained extrema have both the tangential and the normal parts equal to 0. However, the constrained extrema still have to satisfy the equation of constraint, and that's why you can have constrained extrema that aren't also unconstrained extrema. For example, in the example I gave above, the point (-0.5,0) isn't on the unit circle. But if I had instead made the equation of constraint ##x^{2} + y^{2} = (\frac{1}{2})^{2}##, then (-0.5,0) would have been one extremum.
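That projection picture can be sketched numerically, reusing the circle example from this thread (the helper `tangential_part` is a name invented here for illustration): subtracting the normal component of ##\nabla f## leaves a tangential part that vanishes exactly at the constrained extrema.

```python
# Hedged sketch of the projection picture, for f(x, y) = x^2 + y^2 + x on
# the unit circle.  At any point p on the circle the outward unit normal is
# p itself; the tangential part of grad f is what Lagrange's method kills.

def grad_f(x, y):
    return (2 * x + 1, 2 * y)

def tangential_part(p):
    gx, gy = grad_f(*p)
    nx, ny = p                       # unit normal at p on the unit circle
    normal_comp = gx * nx + gy * ny  # normal component of the unconstrained gradient
    return (gx - normal_comp * nx, gy - normal_comp * ny)

print(tangential_part((1.0, 0.0)))   # (0.0, 0.0): constrained extremum
print(tangential_part((-1.0, 0.0)))  # (0.0, 0.0): constrained extremum
print(tangential_part((0.0, 1.0)))   # (1.0, 0.0): not an extremum on the circle
```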
 
