# Closest approach of a parabola to a point, using Lagrange multipliers

1. Feb 11, 2013

### E'lir Kramer

Advanced Calculus of Several Variables, Edwards, problem II.4.1: Find the shortest distance from the point (1, 0) to a point of the parabola $y^{2} = 4x$.

This is the Lagrange multipliers chapter. There might be another way to solve this, but the only way I'm interested in right now is the Lagrange multipliers method.

The first tricky thing is that he has not supplied us with the function f that we are trying to minimize.

We need an equation for the distance between some point (x,y) and the point (1,0). $d(x,y) = \sqrt{(x-1)^{2} + y^{2}}$. For convenience we can minimize $d^{2}(x, y) = (x-1)^{2} + y^{2}$ and get the same answer.

Now, this is a tricky part that I may have gotten wrong, since this author has given only a brief overview of the gradient function. $\nabla d^{2}(x, y) = (2x-2, 2y)$.

Now the parabola function is actually the constraint on this distance function. However, I am having a hard time with the form of this. Is the correct constraint $0 = 4x - y^{2}$? The thing is, that's not a function, it's just an equation. I don't understand how I'm supposed to take the gradient of it. Now I can imagine a function $g : \Re^{2} \to \Re$ such that $g(x, y) = 4x - y^{2}$. And I know the parabola is the set of points {${ p \in \Re^{2} : g(p) = 0 }$}. Perhaps I just didn't understand the proof of the Lagrangian method, but I am not sure why my constraint has to be the zero set of some function.

If $\nabla g = (4, -2y)$ (is it?), then I have the Lagrangian equality:

$(2x-2, 2y) = \lambda (4, -2y)$.

The three equations to solve are:

$2x-2 = 4\lambda \\ 2y = -2\lambda y \\ 0 = 4x - y^{2}$

Is this correct? If so, it's unfortunate, because I can't solve it.

I tried first solving for $\lambda = \frac{2x-2}{4}$
and $\lambda = -1$ if y ≠ 0. I think this is justified because if y = 0, then $\lambda = 0$.

Then $-1 = \frac{2x-2}{4}$, so $x = -1$.

But then we have $0 = -4 - y^{2}$, so $-4 = y^{2}$. So y is an imaginary number? That's impossible. There must be some real solution to this problem.
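
The system above can also be sanity-checked directly. Here is a minimal Python sketch (an editorial check, assuming nothing beyond the standard library) that parametrizes the parabola $y^2 = 4x$ as $(t^2/4, t)$ and scans the squared distance to (1, 0) over a grid:

```python
# Sanity check: every point of y^2 = 4x can be written as (t^2/4, t),
# so scan the squared distance to (1, 0) over a range of t values.
def d2(t):
    x, y = t * t / 4.0, t
    return (x - 1.0) ** 2 + y * y

ts = [i / 1000.0 for i in range(-5000, 5001)]
best = min(ts, key=d2)
print(best, d2(best))  # 0.0 1.0 -> minimum at the vertex (0, 0), distance 1
```

The scan finds its minimum at t = 0, i.e. at the vertex (0, 0), with squared distance 1, so a real solution does exist.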

Last edited: Feb 11, 2013
2. Feb 11, 2013

### Zondrina

3. Feb 11, 2013

### E'lir Kramer

I'm sorry, Zondrina. I'm not saying that your graph doesn't help me, but I don't see how it does.

4. Feb 11, 2013

### marcusl

Your distance and constraint equations are fine, but you go very slightly astray in setting up the minimization equation. You should have, instead,
$$\nabla[d^2 + \lambda g] = 0.$$ This gives two (and only two) equations
$$2(x-1)+4\lambda x=0,$$ $$2y-2\lambda y = 0.$$
Solving the second gives $\lambda = +1$, and you can find x and y from there.

5. Feb 11, 2013

### E'lir Kramer

How do I simplify $\nabla[d^{2} + \lambda g]$?

Is $\nabla[d^{2} + \lambda g] = \nabla d^{2} + \lambda \nabla g$?

If so, and if $\nabla d^{2} + \lambda \nabla g = 0$ is true for some $\lambda \in \Re$, then $\nabla d^{2} - \lambda \nabla g = 0$ must be true for some other number $\gamma \in \Re$, where $\gamma = -\lambda$. So aren't these just two formulations of the same insight?

Edit, oh, wait.

$d^{2} + \lambda g = (x-1)^{2} + y ^{2} + 4\lambda x - \lambda y^{2}$, so
$\nabla [d^{2} + \lambda g] = (2(x-1) + 4\lambda , 2y - 2 \lambda y) = (0, 0)$

But you've got that $2(x-1) + 4\lambda x = 0$. It seems like $2(x-1) + 4 \lambda = 0$ to me. If your extra x was a typo, then I still have the same problem: substituting 1 for $\lambda$ into your equations yields x = -1, which is what I had. And from then on our work is the same. Where does your extra x come from: was it a typo or did I mess up somewhere?

Last edited: Feb 11, 2013
6. Feb 11, 2013

### E'lir Kramer

My justification for ruling out y = 0 is false. We have from $0 = 4x - y^{2}$ that if y = 0, then x = 0. (0,0) could be the solution?

7. Feb 11, 2013

### E'lir Kramer

I just tried to do the next problem in the chapter, which was also a minimum distance problem, and encountered the same thing with an imaginary root for y. I really hope someone can come in here and show me where I'm going wrong :).

8. Feb 11, 2013

### marcusl

Sorry, the extra x was a typo. I see your point--that y ends up being imaginary, which solves the problem (it gives the minimum value d=0) but does not give a valid closest point. I'll think about this a bit more.

9. Feb 11, 2013

### Ray Vickson

To clear up some confusion: your constraint is a curve in 2 dimensions, written as g(x,y) = 0. You want to minimize (or maximize) a function f(x,y), subject to the restriction g(x,y) = 0. Your constraint really does involve a function! At a minimizing point $p_0 = (x_0,y_0)$ the two gradient vectors $\nabla f$ and $\nabla g$ must be parallel. Here is why.

For a small increment $h = (h_x,h_y)$ to be feasible, we need $h_x g_x(p_0) + h_y g_y(p_0) = 0$, which says that the point $p_0+h$ lies in the tangent line of the constraint (and is therefore still feasible when we neglect "second-order" terms like $h_x^2, \: h_y^2, \; h_x h_y$ or higher order). In other words, if we "linearize" the problem, the condition $h \cdot \nabla g = 0$ keeps us feasible.

In order to have a constrained min of f at $p_0$ we need to have $f(p_0 + h) \geq f(p_0)$ for all feasible $h = (h_x,h_y)$, so we need to have
$$h_x f_x(p_0) + h_y f_y(p_0) \geq 0 \text{ whenever }\; h_x g_x(p_0) + h_y g_y(p_0) = 0.$$ However, the same holds for $-h = (-h_x,-h_y)$ as well, so we finally need
$$h_x f_x(p_0) + h_y f_y(p_0) = 0 \text{ whenever }\; h_x g_x(p_0) + h_y g_y(p_0) = 0.$$ This means that the gradient vectors ∇g and ∇f must be linearly dependent, so if ∇g ≠ 0 there must be a scalar multiplier λ such that ∇f = λ ∇g.

Note that it does not really matter whether you write ∇f = λ ∇g (which is ∇f - λ ∇g = 0) or whether you write ∇f + λ ∇g = 0; you will just get different signs for λ. Later, when you deal with inequality-constrained problems like min f subject to g ≥ 0 it will then matter how you write the Lagrangian, but for now don't worry about it.

So, in your case you do have three equations in the three unknowns x, y and λ. Your equation $2y = -2\lambda y$ implies either (i) y = 0; or (ii) λ = -1. (Of course, you might have both!)

If you try (i), your other conditions $2x-2 = 4\lambda$ and $4x = 0$ give $x = 0, \: \lambda = -1/2.$

If you try (ii) you have $2x - 2 = 4(-1) = -4,$ so $x = -1$ and then $y^2 = -4 < 0,$ which is impossible. Therefore, the appropriate case is (i).
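
Case (i) is easy to verify numerically. The following minimal Python check (editorial, plain floats only) confirms that the Lagrange condition and the constraint both hold at $(0, 0)$ with $\lambda = -1/2$:

```python
# Verify case (i): at (x, y) = (0, 0) with lam = -1/2, the condition
# grad f = lam * grad g holds and the constraint g = 0 is satisfied,
# where f = (x-1)^2 + y^2 and g = 4x - y^2.
x, y, lam = 0.0, 0.0, -0.5
grad_f = (2 * (x - 1), 2 * y)              # (-2.0, 0.0)
grad_g = (4.0, -2 * y)                     # (4.0, -0.0)
lhs, rhs = grad_f, tuple(lam * c for c in grad_g)
print(lhs, rhs, 4 * x - y * y)             # both gradients match; g = 0.0
```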

Last edited by a moderator: Feb 12, 2013
10. Feb 12, 2013

### HallsofIvy

Staff Emeritus

Here's another way of thinking about problems like this, similar to what Ray Vickson said. In order to find the point closest to (a, b), starting from a given point (x, y), you would find the vector from (x, y) to (a, b), move in that direction a short distance, then repeat with this new point. Of course, in this simple situation, where the function to be minimized is just distance, you would keep getting the same vector and eventually end up at (a, b), where the vector from (x, y) to (a, b) is the 0 vector.

Now, suppose you want to find the point on the curve g(x,y)= constant that is closest to (a, b). You could start the same way: find the vector from an initial point (x, y) to (a, b). But now, unless that vector happens to be tangent to the curve, you can't follow it. What you can do is find its projection onto the curve and go "left" or "right" on the curve depending upon which way the projection vector points along the curve. You can continue doing that, following the projection vector and getting closer to (a, b), until the projection vector is 0 and you can't get any closer by going either "left" or "right".

That happens when the vector from (x, y) to (a, b) is perpendicular to the curve. And, because $\nabla g(x,y)$ is perpendicular to the curve, that happens when the vector from (x, y) to (a, b) is parallel to $\nabla g(x,y)$, that is, one is a multiple of the other.

It should be obvious that the distance from (x, y) to (a, b) increases fastest when we move directly away from (a, b), in the direction of the vector from (a, b) to (x, y). That is, the vector from (a, b) to (x, y) is in the direction of the gradient of the distance function. As you found, minimizing (or maximizing) the distance function, $d(x,y)= \sqrt{(x- a)^2+ (y- b)^2}$, is the same as minimizing (or maximizing) $d^2(x,y)= (x- a)^2+ (y- b)^2$, and its gradient is $\nabla d^2(x, y)= 2(x- a)\vec{i}+ 2(y- b)\vec{j}= 2((x- a)\vec{i}+ (y- b)\vec{j})$, a multiple of the vector from (a, b) to (x, y).

That is, to find the minimum distance from (x, y) to (a, b), subject to the condition that g(x,y)= constant, we solve $\nabla g(x, y)= \lambda \nabla d^2(x, y)$, where one vector is a multiple of the other.

More generally, to find (x, y) such that f(x,y) has a minimum or maximum, subject to the requirement that g(x,y)= constant, solve $\nabla g(x,y)= k\nabla f(x,y)$.

Here, the function to be minimized is the distance to (1, 0), and that, as you say, is equivalent to minimizing $f(x,y)= (x- 1)^2+ y^2$ subject to the condition that $g(x, y)= y^2- 4x= 0$. We need to solve $\nabla f(x, y)= 2(x- 1)\vec{i}+ 2y\vec{j}= \lambda\nabla g(x, y)= -4\lambda\vec{i}+ 2\lambda y\vec{j}$. That gives the two equations $2(x- 1)= -4\lambda$ and $2y= 2\lambda y$. You have three equations, including the constraint $y^2= 4x$ to solve for x, y, and $\lambda$.

As long as y is not 0, you can divide that last equation by 2y to get $\lambda= 1$ as marcusl says. But since a value of $\lambda$ is not necessary to solve the problem, I find it often simplest to start by eliminating $\lambda$ by dividing one equation by another. Dividing each side of $2(x- 1)= -4\lambda$ by the corresponding side of $2y= 2\lambda y$, we have $2(x- 1)/2y= -4/2y$.
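
The "walk along the curve until the projection vanishes" picture can be sketched in a few lines of Python (an editorial illustration; the step rule and step size are assumptions, not from the original post). We move along the parabola $(t^2/4, t)$, following the projection of the vector toward (1, 0) onto the curve's tangent:

```python
import math

# Follow the projection of the direction toward (1, 0) onto the tangent
# of the parabola (t^2/4, t); stop moving when the projection vanishes.
def step(t, rate=0.1):
    tangent = (t / 2.0, 1.0)                # d/dt of (t^2/4, t)
    to_target = (1.0 - t * t / 4.0, -t)     # from the curve point to (1, 0)
    norm = math.hypot(*tangent)
    proj = (to_target[0] * tangent[0] + to_target[1] * tangent[1]) / norm
    return t + rate * proj

t = 3.0
for _ in range(300):
    t = step(t)
print(t)  # approaches t = 0, the closest point (0, 0)
```

Starting from any point on the parabola, the walk drifts toward t = 0, where the vector to (1, 0) is perpendicular to the curve.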

Last edited by a moderator: Feb 12, 2013
11. Feb 12, 2013

### Ray Vickson

Sorry: where in my reply I said " ... $h \cdot \nabla f = 0$ keeps us feasible..." I meant "... $h \cdot \nabla g = 0$ keeps us feasible...". For some reason, the "edit" option is no longer available, so I can't correct the typo.

12. Feb 12, 2013

### E'lir Kramer

Thank you both for your generous attention to my questions! I read both of your posts last night before bed, and again this morning over coffee. Gradually, the insight that the gradients must be parallel at the extrema is sinking in to me. I am convinced of the intuitive argument now, and I just need some more time to internalize the formal proof that Edwards gives.

13. Feb 12, 2013

### vela

Staff Emeritus

The edit option is available only for a certain length of time after you post. I made the change for you.

14. Feb 12, 2013

### Ray Vickson

Thank you.

15. Feb 12, 2013

### E'lir Kramer

I have another related question. I am not sure if I should put it here or in a new thread, so I'll put it here.

II.4.6: The equation $73x^{2} + 72xy + 52y^{2} = 100$ defines an ellipse which is centered at the origin, but has been rotated about it. Find the semiaxes of this ellipse by maximizing and minimizing $f(x, y) = x^{2} + y^{2}$ on it.

When drawing out the diagram for this one, it really brings home that the two functions must be tangent at their maximum and minimum points.

I *think* that I just have a mechanics problem in being unable to solve the system of equations that the Lagrangian generates. But, since I have been unable to do so for several hours of on-and-off tinkering, I'm hoping for a hint.

We have

$(2x, 2y) = \lambda(146x + 72y, 104y + 72x)$, so,

$2x = \lambda(146x + 72y) \\ 2x - 146\lambda x= 72\lambda y \\ x(1-73\lambda) = 36 \lambda y$

$2y = \lambda(104y + 72x) \\ y(1-52 \lambda) = 36 \lambda x$

$73x^{2} + 72xy + 52y^{2} - 100 = 0$

No matter what I try, I can't eliminate one of the variables by substitution. Is this system even solvable?

16. Feb 12, 2013

### vela

Staff Emeritus

It might help to rewrite the two linear equations in this form:
\begin{align*}
(73-k)x + 36y &= 0 \\
36x + (52-k)y &= 0
\end{align*} where $k=1/\lambda$. In matrix form, this would be
$$\begin{pmatrix} 73-k & 36 \\ 36 & 52-k \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = 0.$$ Obviously, x=y=0 is a solution to those equations, but it won't satisfy the constraint. The only way this system can have a non-trivial solution is if the determinant of the matrix vanishes. This will allow you to solve for k. Once you have that, you can solve for x in terms of y (or vice versa). Then the constraint will allow you to find specific values.
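
That determinant condition can be carried through numerically. Here is a short Python sketch (editorial, standard library only) that finds the two values of k from $(73-k)(52-k) - 36^2 = 0$ and then scales a direction from the first row onto the ellipse to read off the semiaxes:

```python
import math

# (73 - k)(52 - k) - 36^2 = 0 expands to k^2 - 125k + 2500 = 0.
disc = math.sqrt(125.0 ** 2 - 4 * 2500.0)
ks = [(125.0 - disc) / 2, (125.0 + disc) / 2]
print(ks)  # [25.0, 100.0]

for k in ks:
    x = 1.0
    y = -(73.0 - k) * x / 36.0                        # from (73-k)x + 36y = 0
    s = math.sqrt(100.0 / (73 * x * x + 72 * x * y + 52 * y * y))
    print(k, math.hypot(s * x, s * y))                # semiaxes ~2.0 and ~1.0
```

The two roots k = 25 and k = 100 yield semiaxes of length 2 and 1 respectively.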

17. Feb 15, 2013

### E'lir Kramer

Thanks, Vela. Taking your word for it that (73-k)(52-k) - 36*36 = 0, I was able to solve for k and then back-substitute for x and y.

Is this result true in general? If I have a matrix equation equal to zero, does the equation only have non-trivial solutions if the determinant is zero?

18. Feb 15, 2013

### Ray Vickson

Yes: that is a standard theorem in Linear Algebra 101: a square system Ax = 0 has a nonzero solution for x if and only if det(A) = 0.
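
A tiny illustration of that theorem (hypothetical numbers, plain Python): a singular 2x2 matrix admits a nonzero null vector.

```python
# A has det = 2*2 - 4*1 = 0, so A v = 0 has a nonzero solution;
# v = (4, -2) is one such null vector.
A = [[2.0, 4.0],
     [1.0, 2.0]]
v = (4.0, -2.0)
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
Av = (A[0][0] * v[0] + A[0][1] * v[1],
      A[1][0] * v[0] + A[1][1] * v[1])
print(det, Av)  # 0.0 (0.0, 0.0)
```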

19. Feb 15, 2013

### E'lir Kramer

A square system?

20. Feb 15, 2013

### voko

"Square system" is somewhat informal for a linear system whose matrix is square, i.e., the number of row equals the number of columns (alternatively, the number of equations equals the number of unknowns).