Closest approach of a parabola to a point, using Lagrange multipliers

In summary, the problem is to find the shortest distance from the point (1, 0) to a point on the parabola y^{2} = 4x using the Lagrange multipliers method. The distance function is d(x,y) = \sqrt{(x-1)^{2} + y^{2}}, and the constraint is the equation 0 = 4x - y^{2}. Setting the gradient of the squared distance equal to a multiple of the gradient of the constraint gives three equations; the branch with y ≠ 0 leads to an imaginary value of y and no valid closest point, while the branch y = 0 gives the closest point (0, 0) at distance 1.
  • #1
E'lir Kramer
Advanced Calculus of Several Variables, Edwards, problem II.4.1: Find the shortest distance from the point (1, 0) to a point of the parabola [itex]y^{2} = 4x[/itex].

This is the Lagrange multipliers chapter. There might be another way to solve this, but the only way I'm interested in right now is the Lagrange multipliers method.

The first tricky thing is that he has not supplied us with the function f that we are trying to minimize.

We need an equation for the distance between some point (x,y) and the point (1,0). [itex] d(x,y) = \sqrt{(x-1)^{2} + y^{2}} [/itex]. For convenience we can minimize [itex]d^{2}(x, y) = (x-1)^{2} + y^{2}[/itex] and get the same answer.

Now, this is a tricky part that I may have gotten wrong, since this author has given only a brief overview of the gradient function. [itex]\nabla d^{2}(x, y) = (2x-2, 2y) [/itex].

Now the parabola function is actually the constraint on this distance function. However, I am having a hard time with the form of this. Is the correct constraint [itex] 0 = 4x - y^{2} [/itex]? The thing is, that's not a function, it's just an equation. I don't understand how I'm supposed to take the gradient of it. Now I can imagine a function [itex] g : \Re^{2} \to \Re [/itex] such that [itex] g(x, y) = 4x - y^{2}[/itex]. And I know the parabola is the set of points [itex] \{ p \in \Re^{2} : g(p) = 0 \} [/itex]. Perhaps I just didn't understand the proof of the Lagrangian method, but I am not sure why my constraint has to be the zero set of some function.

If [itex] \nabla g = (4, -2y) [/itex] (is it?), then I have the Lagrangian equality:

[itex] (2x-2, 2y) = \lambda (4, -2y) [/itex].

The three equations to solve are:

[itex] 2x-2 = 4\lambda \\
2y = -2\lambda y \\
0 = 4x - y^{2}
[/itex]

Is this correct? If so, it's unfortunate, because I can't solve it.

I tried first solving for [itex] \lambda = \frac{2x-2}{4} [/itex]
and [itex] \lambda = -1 [/itex] if y ≠ 0. I think this is justified because if y = 0, then [itex] \lambda = 0 [/itex].

Then [itex] -1 = \frac{2x-2}{4} [/itex], so [itex] x = -1 [/itex].

But then we have [itex] 0 = -4 - y^{2} [/itex], so [itex] -4 = y^{2} [/itex]. So y is an imaginary number? That's impossible. There must be some real solution to this problem.
 
  • #3


I'm sorry, Zondrina. I'm not saying that your graph doesn't help me, but I don't see how it does.
 
  • #4


Your distance and constraint equations are fine, but you go very slightly astray in setting up the minimization equation. You should have, instead,
[tex]\nabla[d^2 + \lambda g] = 0.[/tex] This gives two (and only two) equations
[tex]2(x-1)+4\lambda x=0,[/tex] [tex]2y-2\lambda y = 0.[/tex]
The second is solved to give lambda = +1 and you can find x and y from there.
 
  • #5


How do I simplify [itex] \nabla[d^{2} + \lambda g][/itex]?

Is [itex] \nabla[d^{2} + \lambda g] = \nabla d^{2} + \lambda \nabla g[/itex]?

If so, and if [itex] \nabla d^{2} + \lambda \nabla g = 0 [/itex] is true for some [itex] \lambda \in \Re[/itex], then [itex] \nabla d^{2} - \lambda \nabla g = 0 [/itex] must be true for some other number [itex] \gamma \in \Re[/itex], where [itex] \gamma = -\lambda [/itex]. So aren't these just two formulations of the same insight?

Edit, oh, wait.

[itex] d^{2} + \lambda g = (x-1)^{2} + y^{2} + 4\lambda x - \lambda y^{2} [/itex], so
[itex] \nabla [d^{2} + \lambda g] = (2(x-1) + 4\lambda , 2y - 2\lambda y) = (0, 0) [/itex]

But you've got that [itex] 2(x-1) + 4\lambda x = 0 [/itex]. It seems like [itex] 2(x-1) + 4 \lambda = 0 [/itex] to me. If your extra x was a typo, then I still have the same problem: substituting 1 for [itex]\lambda[/itex] into your equations yields x = -1, which is what I had. And from then on our work is the same. Where does your extra x come from: was it a typo or did I mess up somewhere?
 
  • #6


My justification for y ≠ 0 is false. We have from [itex] 0 = 4x - y^{2} [/itex] that if y = 0, then x = 0. (0,0) could be the solution?
 
  • #7


I just tried to do the next problem in the chapter, which was also a minimum distance problem, and encountered the same thing with an imaginary root for y. I really hope someone can come in here and show me where I'm going wrong :).
 
  • #8


Sorry, the extra x was a typo. I see your point--that y ends up being imaginary, which solves the problem (it gives the minimum value d=0) but does not give a valid closest point. I'll think about this a bit more.
 
  • #9


E'lir Kramer said:
Advanced Calculus of Several Variables, Edwards, problem II.4.1: Find the shortest distance from the point (1, 0) to a point of the parabola [itex]y^{2} = 4x[/itex].

[...]

Now I can imagine a function [itex] g : \Re^{2} \to \Re [/itex] such that [itex] g(x, y) = 4x - y^{2}[/itex]. And I know the parabola is the set of points [itex] \{ p \in \Re^{2} : g(p) = 0 \} [/itex]. Perhaps I just didn't understand the proof of the Lagrangian method, but I am not sure why my constraint has to be the zero set of some function.

[...]

The three equations to solve are:

[itex] 2x-2 = 4\lambda \\
2y = -2\lambda y \\
0 = 4x - y^{2}
[/itex]

[...]

But then we have [itex] 0 = -4 - y^{2} [/itex], so [itex] -4 = y^{2} [/itex]. So y is an imaginary number? That's impossible. There must be some real solution to this problem.

To clear up some confusion: your constraint is a curve in 2 dimensions, written as g(x,y) = 0. You want to minimize (or maximize) a function f(x,y), subject to the restriction g(x,y) = 0. Your constraint really does involve a function! At a minimizing point ##p_0 = (x_0,y_0)## the two gradient vectors ##\nabla f## and ##\nabla g## must be parallel. Here is why.

For a small increment ##h = (h_x,h_y)## to be feasible, we need ##h_x g_x(p_0) + h_y g_y(p_0) = 0##, which says that the point ##p_0+h## lies on the tangent line of the constraint (and is therefore still feasible when we neglect "second-order" terms like ##h_x^2, \: h_y^2, \; h_x h_y## or higher order). In other words, if we "linearize" the problem, the condition ##h \cdot \nabla g = 0## keeps us feasible.

In order to have a constrained min of f at ##p_0## we need to have ##f(p_0 + h) \geq f(p_0)## for all feasible ##h = (h_x,h_y)##, so we need to have
[tex] h_x f_x(p_0) + h_y f_y(p_0) \geq 0 \text{ whenever }\; h_x g_x(p_0) + h_y g_y(p_0) = 0.[/tex] However, the same holds for ##-h = (-h_x,-h_y)## as well, so we finally need
[tex] h_x f_x(p_0) + h_y f_y(p_0) = 0 \text{ whenever }\; h_x g_x(p_0) + h_y g_y(p_0) = 0.[/tex] This means that the gradient vectors ∇g and ∇f must be linearly dependent, so if ∇g ≠ 0 there must be a scalar multiplier λ such that ∇f = λ ∇g.

Note that it does not really matter whether you write ∇f = λ ∇g (which is ∇f - λ ∇g = 0) or whether you write ∇f + λ ∇g = 0; you will just get different signs for λ. Later, when you deal with inequality-constrained problems like min f subject to g ≥ 0 it will then matter how you write the Lagrangian, but for now don't worry about it.

So, in your case you do have three equations in the three unknowns x, y and λ. Your equation ##2y = -2\lambda y## implies either (i) y = 0; or (ii) λ = -1. (Of course, you might have both!)

If you try (i), your other conditions ##2x-2 = 4\lambda## and ##4x = 0## give ## x = 0, \: \lambda = -1/2.##

If you try (ii) you have ##2x - 2 = 4(-1) = -4,## so ##x = -1## and then ##y^2 = -4 < 0,## which is impossible. Therefore, the appropriate case is (i).
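
A minimal symbolic check of these two cases, assuming SymPy is available (the script is only an illustrative sketch of the system already written out above):

[code]
import sympy as sp

x, y, lam = sp.symbols('x y lam')

eqs = [
    sp.Eq(2*x - 2, 4*lam),    # x-component of grad(d^2) = lam * grad(g)
    sp.Eq(2*y, -2*lam*y),     # y-component
    sp.Eq(4*x - y**2, 0),     # the constraint g(x, y) = 0
]

sols = sp.solve(eqs, [x, y, lam], dict=True)

# keep only the real solutions; the lam = -1 branch gives y^2 = -4 and drops out
real_sols = [s for s in sols if all(v.is_real for v in s.values())]
for s in real_sols:
    d = sp.sqrt((s[x] - 1)**2 + s[y]**2)
    print(s[x], s[y], s[lam], d)   # prints: 0 0 -1/2 1
[/code]

The only real critical point is (0, 0) with ##\lambda = -1/2##, so the shortest distance from (1, 0) to the parabola is 1.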
 
  • #10


Here's another way of thinking about problems like this, similar to what Ray Vickson said. In order to find the point closest to (a, b), starting from a given point (x, y), you would find the vector from (x, y) to (a, b), move in that direction a short distance, then repeat with this new point. Of course, in this simple situation, where the function to be minimized is just distance, you would keep getting the same vector and eventually end up at (a, b), where the vector from (x, y) to (a, b) is the 0 vector.

Now, suppose you want to find the point on the curve g(x,y)= constant that is closest to (a, b). You could start the same way- find the vector from an initial point (x, y) to (a, b). But now, unless that vector happens to be tangent to the curve, you can't follow it. What you can do is find its projection onto the curve and go "left" or "right" on the curve depending upon which way the projection vector points along the curve. You can continue doing that, following the projection vector and getting closer to (a, b), until the projection vector is 0 and you can't get any closer by going either "left" or "right".

That happens when the vector from (x, y) to (a, b) is perpendicular to the curve. And, because [itex]\nabla g(x,y)[/itex] is perpendicular to the curve, that happens when the vector from (x, y) to (a, b) is parallel to [itex]\nabla g(x,y)[/itex]- that is, one is a multiple of the other.

It should be obvious that the distance from (x, y) to (a, b) increases fastest when we move directly away from (a, b), in the direction of the vector from (a, b) to (x, y). That is, the vector from (a, b) to (x, y) is in the direction of the gradient of the distance function. As you found, minimizing (or maximizing) the distance function [itex]d(x,y)= \sqrt{(x- a)^2+ (y- b)^2}[/itex] is the same as minimizing (or maximizing) [itex]d^2(x,y)= (x- a)^2+ (y- b)^2[/itex], and its gradient is [itex]\nabla d^2(x, y)= 2(x- a)\vec{i}+ 2(y- b)\vec{j}= 2((x- a)\vec{i}+ (y- b)\vec{j})[/itex], a multiple of the vector from (a, b) to (x, y).

That is, to find the minimum distance from (x, y) to (a, b), subject to the condition that g(x,y)= constant, we solve [itex]\nabla g(x, y)= \lambda \nabla d^2(x, y)[/itex], where one vector is a multiple of the other.

More generally, to find (x, y) such that f(x,y) has a minimum or maximum, subject to the requirement that g(x,y)= constant, solve [itex]\nabla g(x,y)= k\nabla f(x,y)[/itex].

Here, the function to be minimized is the distance to (1, 0), and that, as you say, is equivalent to minimizing [itex]f(x,y)= (x- 1)^2+ y^2[/itex] subject to the condition that [itex]g(x, y)= y^2- 4x= 0[/itex]. We need to solve [itex]\nabla f(x, y)= 2(x- 1)\vec{i}+ 2y\vec{j}= \lambda\nabla g(x, y)= -4\lambda\vec{i}+ 2\lambda y\vec{j}[/itex]. That gives the two equations [itex]2(x- 1)= -4\lambda[/itex] and [itex]2y= 2\lambda y[/itex]. You have three equations, including the constraint [itex]y^2= 4x[/itex] to solve for x, y, and [itex]\lambda[/itex].

As long as y is not 0, you can divide that last equation by 2y to get [itex]\lambda= 1[/itex] as marcusl says. But since a value of [itex]\lambda[/itex] is not necessary to solve the problem, I find it often simplest to start by eliminating [itex]\lambda[/itex] by dividing one equation by another. Dividing each side of [itex]2(x- 1)= -4\lambda[/itex] by the corresponding side of [itex]2y= 2\lambda y[/itex], we have [itex]2(x- 1)/2y= -4/2y[/itex].
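
Carrying that last division through (a brief sketch of where it leads):

[tex]\frac{2(x-1)}{2y} = \frac{-4}{2y} \;\Longrightarrow\; x - 1 = -2 \;\Longrightarrow\; x = -1,[/tex]

which again forces [itex]y^2 = 4x = -4[/itex] with no real solution, so the assumption y ≠ 0 fails and the remaining case y = 0 gives the closest point (0, 0).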
 
  • #11


Sorry: where in my reply I said " ... ##h \cdot \nabla f = 0## keeps us feasible..." I meant "... ## h \cdot \nabla g = 0## keeps us feasible...". For some reason, the "edit" option is no longer available, so I can't correct the typo.
 
  • #12


Thank you both for your generous attention to my questions! I read both of your posts last night before bed, and again this morning over coffee. Gradually, the insight that the gradients must be parallel at the extrema is sinking into me. I am convinced of the intuitive argument now, and I just need some more time to internalize the formal proof that Edwards gives.
 
  • #13


Ray Vickson said:
Sorry: where in my reply I said " ... ##h \cdot \nabla f = 0## keeps us feasible..." I meant "... ## h \cdot \nabla g = 0## keeps us feasible...". For some reason, the "edit" option is no longer available, so I can't correct the typo.
The edit option is available only for a certain length of time after you post. I made the change for you.
 
  • #14


vela said:
The edit option is available only for a certain length of time after you post. I made the change for you.

Thank you.
 
  • #15


I have another related question. I am not sure if I should put it here or in a new thread, so I'll put it here.

II.4.6: The equation [itex]73x^{2} + 72xy + 52y^{2} = 100[/itex] defines an ellipse which is centered at the origin, but has been rotated about it. Find the semiaxes of this ellipse by maximizing and minimizing [itex]f(x, y) = x^{2} + y^{2}[/itex] on it.

When drawing out the diagram for this one, it really brings home that the level curves of the two functions must be tangent at the maximum and minimum points.

I *think* that I just have a mechanics problem in being unable to solve the system of equations that the Lagrangian generates. But, since I have been unable to do so for several hours of on-and-off tinkering, I'm hoping for a hint.

We have

[itex](2x, 2y) = \lambda(146x + 72y, 104y + 72x) [/itex], so,

[itex]2x = \lambda(146x + 72y) \\
2x - 146\lambda x= 72\lambda y \\
x(1-73\lambda) = 36 \lambda y [/itex]

[itex]2y = \lambda(104y + 72x) \\
y(1-52 \lambda) = 36 \lambda x[/itex]

[itex]73x^{2} + 72xy + 52y^{2} - 100 = 0[/itex]

No matter what I try, I can't eliminate one of the variables by substitution. Is this system even solvable?
 
  • #16


It might help to rewrite the two linear equations in this form:
\begin{align*}
(73-k)x + 36y &= 0 \\
36x + (52-k)y &= 0
\end{align*} where ##k=1/\lambda##. In matrix form, this would be
$$\begin{pmatrix} 73-k & 36 \\ 36 & 52-k \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = 0.$$ Obviously, x=y=0 is a solution to those equations, but it won't satisfy the constraint. The only way this system can have a non-trivial solution is if the determinant of the matrix vanishes. This will allow you to solve for k. Once you have that, you can solve for x in terms of y (or vice versa). Then the constraint will allow you to find specific values.
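
A minimal SymPy sketch of that procedure (illustrative only, assuming SymPy is available):

[code]
import sympy as sp

x, y, k = sp.symbols('x y k')

# the determinant of the matrix above must vanish for a non-trivial solution
ks = sp.solve(sp.det(sp.Matrix([[73 - k, 36], [36, 52 - k]])), k)
print(ks)                                          # [25, 100]

for kv in ks:
    # direction from (73 - k) x + 36 y = 0, scaled to lie on the ellipse
    pts = sp.solve([(73 - kv)*x + 36*y,
                    73*x**2 + 72*x*y + 52*y**2 - 100], [x, y])
    for xs, ys in pts:
        print(kv, xs, ys, sp.sqrt(xs**2 + ys**2))  # lengths 2 (k=25) and 1 (k=100)
[/code]

The two values of k give semiaxes of length 2 (for k = 25) and 1 (for k = 100).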
 
  • #17


Thanks, Vela. Taking your word on it that (73-k)(52-k) - 36*36 = 0, I was able to solve for k and then back substitute for x and y.

Is this result true in general? If I have a matrix equation equal to zero, does the equation only have non-trivial solutions if the determinant is zero?
 
  • #18


E'lir Kramer said:
Thanks, Vela. Taking your word on it that (73-k)(52-k) - 36*36 = 0, I was able to solve for k and then back substitute for x and y.

Is this result true in general? If I have a matrix equation equal to zero, does the equation only have non-trivial solutions if the determinant is zero?

Yes: that is a standard theorem in Linear Algebra 101: a square system Ax = 0 has a nonzero solution for x if and only if det(A) = 0.
 
  • #19


A square system?
 
  • #20


"Square system" is somewhat informal for a linear system whose matrix is square, i.e., the number of row equals the number of columns (alternatively, the number of equations equals the number of unknowns).
 
  • #21


E'lir Kramer said:
We have

[itex](2x, 2y) = \lambda(146x + 72y, 104y + 72x) [/itex], so
[itex]2x = \lambda(146x + 72y)[/itex]
[itex]2y = \lambda(104y + 72x)[/itex]

No matter what I try, I can't eliminate one of the variables by substitution. Is this system even solvable?
Another approach would be to do what HallsofIvy suggested earlier. Divide one equation by the other to eliminate ##\lambda##. If you assume ##y\ne 0##, you get
$$\frac{x}{y} = \frac{146x+72y}{104y+72x} = \frac{146\left(\frac{x}{y}\right)+72}{104+72\left(\frac{x}{y}\right)}$$ and then solve for x/y. You still need to go back and check that the assumption ##y \ne 0## is valid, which it is. You can avoid using math you're not familiar with this way.
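As a sketch of how that works out, write ##t = x/y## and cross-multiply:
$$t(104 + 72t) = 146t + 72 \;\Longrightarrow\; 12t^2 - 7t - 12 = 0 \;\Longrightarrow\; t = \tfrac{4}{3} \;\text{ or }\; t = -\tfrac{3}{4}.$$
Substituting ##x = \tfrac{4}{3}y## into the constraint gives ##x^2 + y^2 = 1##, and ##x = -\tfrac{3}{4}y## gives ##x^2 + y^2 = 4##, so the semiaxes are 1 and 2, matching the determinant approach above.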
 
  • #22


E'lir Kramer said:
Is this result true in general? If I have a matrix equation equal to zero, does the equation only have non-trivial solutions if the determinant is zero?
As Ray noted, it's a standard result from linear algebra, but you probably already know enough to convince yourself it's true. Typically in algebra, you learn some basic stuff about matrices, in particular, if det(A)≠0 then A is invertible. That's all you really need.

Let Ax=0. If det(A)≠0, you know ##A^{-1}## exists, so you can multiply both sides of the equation by ##A^{-1}##, which leads to x=0. Therefore, if you want a solution other than x=0, you must have det(A)=0.
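
As a small concrete illustration (a sketch assuming SymPy; the matrix is the k = 25 case from the ellipse problem above):

[code]
import sympy as sp

A = sp.Matrix([[73 - 25, 36], [36, 52 - 25]])   # the k = 25 matrix from post #16
print(A.det())        # 0, so A is singular and a non-trivial solution exists
print(A.nullspace())  # [Matrix([[-3/4], [1]])], i.e. x/y = -3/4, as found above
[/code]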
 
  • #23


Thanks, everyone. I took a few days off from math to clear my head, but I will revisit this in the morning and try to convince myself of this result in linear algebra.
 

1. What is the closest approach of a parabola to a point?

The closest approach of a parabola to a point is the shortest distance between the point and any point on the parabola. At the closest point, the segment from the given point to the parabola is perpendicular to the parabola's tangent line there, i.e., it lies along the normal to the curve.

2. How is the closest approach of a parabola to a point calculated?

The closest approach of a parabola to a point can be calculated using Lagrange multipliers, which find the minimum of a function subject to a constraint. In this case, the function is the (squared) distance between the given point and a point (x, y), and the constraint is that (x, y) lies on the parabola.

3. What is the role of Lagrange multipliers in finding the closest approach of a parabola to a point?

Lagrange multipliers are used to optimize a function subject to constraints. In the case of finding the closest approach of a parabola to a point, the multiplier condition says that at the minimizing point the gradient of the distance function is parallel to the gradient of the constraint function, which pins down the candidate points on the parabola.

4. Can Lagrange multipliers be used for any type of parabola and point?

Yes, Lagrange multipliers can be used to find the closest approach of a parabola to a point regardless of the type of parabola (e.g. horizontal, vertical, or rotated) and the location of the point. As long as the parabola can be written as the zero set of a differentiable function g(x, y), Lagrange multipliers can be applied.

5. Are there any alternative methods for finding the closest approach of a parabola to a point?

Yes. One alternative is to substitute the parabola's equation into the distance function and minimize the resulting single-variable function with ordinary calculus; another is to find the point where the normal line to the parabola passes through the given point. These approaches work well for a single explicit curve, while Lagrange multipliers generalize more readily to complicated or implicit constraints.
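
For the parabola discussed in this thread, the substitution route reduces to a one-variable minimization. A minimal sketch, assuming SymPy is available:

[code]
import sympy as sp

y = sp.symbols('y', real=True)
d2 = (y**2/4 - 1)**2 + y**2          # substitute x = y^2/4 from y^2 = 4x
crit = sp.solve(sp.diff(d2, y), y)   # real critical points
print(crit)                          # [0]
print(sp.sqrt(d2.subs(y, 0)))        # 1, the same shortest distance as before
[/code]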
