A Lagrange multiplier approach to the catenary problem

Coffee_ · Mar 20, 2015

In general, when dealing with mechanics problems whose constraints are represented by a function ##f(q_1,q_2,\dots)=0##, one minimizes the action ##S## after adding a term to the Lagrangian of the not-independent coordinates: ##L + \lambda f##. One can show that this addition doesn't change the minimal action and that the desired equations follow from a correct choice of ##\lambda##. Anyway, this is the known theoretical approach.

Now consider the following treatment of the catenary problem on page 5: http://physics.ucsd.edu/students/courses/fall2009/managed/physics110a/documents/Lecture11-9.pdf

Don't spend time on the details, because my question is about the method:

Why does one add the ##\lambda f## term to the function ##V## that one wants to minimize? As you see from my introduction above, when minimizing the action ##S##, that term is not added to the action but to the Lagrangian. Here ##V## is supposed to be the analogue of the action, since this is what we want to minimize. So it seems to me that in this problem Lagrange multipliers are used differently from how we used them in previous problems, and differently from the description in the theory.

Please help me clear up the confusion. To show that I have thought about it, here is my idea, though I'm not sure about it:

Since we are talking about the potential energy function ##V## and the kinetic energy is 0, we are basically talking about the Lagrangian and not the analogue of the action. As we know from the theory, we add this ##\lambda f## term to the Lagrangian, which is the potential energy here. We know that the physics hasn't changed from this addition. And now we minimize the new expression with the added term and forget that we ever used Lagrange multipliers?

Orodruin · Mar 21, 2015

The constraint that the chain should have a fixed length is not a holonomic constraint, so you should not be surprised that it appears differently. In fact, I think that this type of Lagrange multiplier is more reminiscent of how Lagrange multipliers are used in multivariable calculus. I agree that it is confusing, mainly because the document does not discuss this type of problem earlier and only discusses holonomic constraints.

Coffee_ · Mar 21, 2015

Do you mind elaborating on the reasoning they use for this problem then? I seem to understand it for holonomic constraints, but here something is different, as I mentioned, and I just don't seem to see it.

Orodruin · Mar 21, 2015

Consider a functional ##S[f]## which you want to find the extreme values of under the condition ##C[f] = C_0##, where ##C## is another functional of ##f##. By finding the extreme values of ##H = S- \lambda C##, you will be setting
$$
\delta H = \delta S - \lambda \delta C = 0.
$$
This of course is equivalent to setting ##\delta S = \lambda \delta C##, i.e., the variation of ##S## is proportional to the variation of ##C##. If this is true, a variation that does not change ##C## does not change ##S## either, which is what must be true for an extreme value of ##S## under the condition that ##C## is a given constant. If ##\delta S## is not proportional to ##\delta C##, then you can find a variation such that ##\delta S## is non-zero while ##\delta C = 0##, implying that this variation will change the value of ##S## while keeping the value of ##C## equal to ##C_0##. The reasoning is exactly equivalent to the corresponding reasoning in multivariable calculus.
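As a sketch of how this plays out for the catenary (the notation here is assumed, not taken from the linked notes: ##y(x)## is the height of the chain, ##\rho## its mass per unit length, ##g## the gravitational acceleration, and ##\ell_0## the fixed length), the functional to extremise and the constraint would be
$$
V[y] = \rho g \int_{x_1}^{x_2} y \sqrt{1 + (y')^2}\, dx, \qquad C[y] = \int_{x_1}^{x_2} \sqrt{1 + (y')^2}\, dx = \ell_0 .
$$
Since the constraint is a single number rather than a condition at every point, ##\lambda## is a single constant, and ##H = V - \lambda C## has the integrand ##F = (\rho g y - \lambda)\sqrt{1 + (y')^2}##. This has no explicit ##x## dependence, so the Beltrami identity ##F - y'\, \partial F/\partial y' = k## (with ##k## a constant) gives
$$
\frac{\rho g y - \lambda}{\sqrt{1 + (y')^2}} = k \quad \Longrightarrow \quad y(x) = \frac{\lambda}{\rho g} + a \cosh\!\left(\frac{x - x_0}{a}\right), \qquad a = \frac{k}{\rho g}.
$$
The constants ##a##, ##x_0## and ##\lambda## are then fixed by the two endpoint conditions together with ##C[y] = \ell_0##.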

Coffee_ · Mar 21, 2015

Thanks for the answer. I think the reason I'm having trouble with this is that I have never seen functionals or variations treated formally. I had to deduce how one works with them from examples in mechanics. This means I'm certainly going to have to look over your explanation a few times to really grasp it.

Orodruin · Mar 21, 2015

Compare to the multivariable case. Let us say you want to minimise some function ##s(x)##, where ##x \in \mathbb R^n##, under the condition that ##c(x) = c_0##. This is basically the same situation, only with functions instead of functionals. Now, in order to perform this minimisation, you construct ##h = s-\lambda c## and look for its stationary points, i.e., you find out where ##\nabla h = \nabla s - \lambda \nabla c = 0##, which is equivalent to ##\nabla s = \lambda \nabla c##. This means that the gradient of ##s## is parallel to the gradient of ##c##. So what is the geometrical interpretation of this? In order to stay on the surface ##c = c_0##, you can only go in directions perpendicular to the surface normal, i.e., to ##\nabla c##. So if you have a point ##x_0## fulfilling ##\nabla h = 0## (and ##c(x_0) = c_0##), then any infinitesimal displacement ##\delta## within the surface is going to lead to ##\delta\cdot \nabla s = \lambda \delta\cdot \nabla c = 0##, which means that ##x_0## is an extremum of ##s## on the surface ##c = c_0##. The functional case is completely analogous to this.
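The geometric statement above can be checked numerically on a toy problem. The following is only a minimal sketch under assumed choices that are not from the thread: ##s(x,y) = x + y## and ##c(x,y) = x^2 + y^2 = 2##. It finds the constrained minimum by brute force along the circle and then checks that ##\nabla s## is parallel to ##\nabla c## there, and that a displacement along the surface leaves ##s## unchanged to first order.

```python
import math

# Illustrative (assumed) example: minimise s(x, y) = x + y
# on the circle c(x, y) = x^2 + y^2 = c0 with c0 = 2.

def s(x, y):
    return x + y

def grad_s(x, y):
    return (1.0, 1.0)

def grad_c(x, y):
    return (2.0 * x, 2.0 * y)

R = math.sqrt(2.0)  # radius of the constraint circle, R^2 = c0

def point(theta):
    """Point on the constraint surface c = c0, parametrised by an angle."""
    return (R * math.cos(theta), R * math.sin(theta))

# Brute-force search for the minimum of s restricted to the circle.
n = 100000
best_theta = min((2.0 * math.pi * k / n for k in range(n)),
                 key=lambda t: s(*point(t)))
x0, y0 = point(best_theta)

gs = grad_s(x0, y0)
gc = grad_c(x0, y0)

# Parallel gradients <=> vanishing 2D cross product.
cross = gs[0] * gc[1] - gs[1] * gc[0]

# Multiplier from grad s = lambda * grad c (projection onto grad c).
lam = (gs[0] * gc[0] + gs[1] * gc[1]) / (gc[0] ** 2 + gc[1] ** 2)

# A displacement within the surface is perpendicular to grad c.
tangent = (-gc[1], gc[0])
ds_tangent = gs[0] * tangent[0] + gs[1] * tangent[1]

print("constrained minimum near:", (x0, y0))
print("cross product of gradients:", cross)
print("lambda:", lam)
print("first-order change of s along the surface:", ds_tangent)
```

The brute-force search never uses ##\lambda##, so the fact that the gradients come out parallel at the minimum is an independent check of the argument rather than something built in.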
