- #1

- 8

- 0

- Thread starter pccrp

- #2

WannabeNewton

Science Advisor

- 5,815

- 544

- #3

- 8

- 0

[QUOTE="WannabeNewton"]... coordinates of ##\mathbb{R}^{6}## which specify the possible states of the particle. As you know, a set of coordinates ##(x^i)## are independent of each other, i.e. ##\frac{\partial x^{i}}{\partial x^{j}} = \delta^{i}_{j}##.[/QUOTE]

I understand they're all needed to specify the state of the system. However, how can you start from this and prove the equations?

- #4

- 6,054

- 391

$$
\vec{r_i} = \vec{r_i}(q)
\\
\dot{\vec{r_i}} = \sum_j \frac {\partial \vec{r_i}} {\partial q_j}(q) \dot{q_j}
\\
\frac {\partial \dot{\vec{r_i}}} {\partial \dot{q_j}} = \frac {\partial \vec{r_i}} {\partial q_j}
\\
\frac {\partial T} {\partial \dot{q_j}} = \sum_i m_i \dot{\vec{r_i}} \cdot \frac {\partial \dot{\vec{r_i}}} {\partial \dot{q_j}} = \sum_i m_i \dot{\vec{r_i}} \cdot \frac {\partial \vec{r_i}} {\partial q_j}
\\
\frac {d} {dt} \frac {\partial T} {\partial \dot{q_j}} = \sum_i m_i \ddot{\vec{r_i}} \cdot \frac {\partial \vec{r_i}} {\partial q_j} + \sum_i m_i \dot{\vec{r_i}} \cdot \frac {\partial \dot {\vec{r_i}}} {\partial q_j} = \sum_i \vec{F_i} \cdot \frac {\partial \vec{r_i}} {\partial q_j} + \frac {\partial T} {\partial q_j} = Q_j + \frac {\partial T} {\partial q_j}
\\
\frac {d} {dt} \frac {\partial T} {\partial \dot{q_j}} - \frac {\partial T} {\partial q_j} = Q_j
\\
\vec{F_i} = - \frac {\partial \Pi} {\partial \vec {r_i}}
\\
\frac {\partial \Pi} {\partial q_j} = \sum_i \frac {\partial \Pi} {\partial \vec {r_i}} \cdot \frac {\partial \vec{r_i}} {\partial q_j} = - \sum_i \vec{F_i} \cdot \frac {\partial \vec{r_i}} {\partial q_j} = - Q_j
\\
\frac {d} {dt} \frac {\partial T} {\partial \dot{q_j}} - \frac {\partial T} {\partial q_j} = - \frac {\partial \Pi} {\partial q_j}
\\
\frac {d} {dt} \frac {\partial T} {\partial \dot{q_j}} - \frac {\partial (T - \Pi)} {\partial q_j} = 0
\\
L = T - \Pi
\\
\frac {d} {dt} \frac {\partial L} {\partial \dot{q_j}} - \frac {\partial L} {\partial q_j} = 0
$$
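The third line of the derivation above (the partial derivative of the velocity with respect to ##\dot{q_j}## equals the partial derivative of the position with respect to ##q_j##) can be checked numerically. The following is only a sketch under assumptions that are not in the thread: a single particle in the plane, polar coordinates ##q = (\rho, \varphi)##, central finite differences, and hypothetical function names. The point it illustrates is that the velocity is written as a function of two separate arguments, `q` and `q_dot`, and only the second argument is varied.

```python
import math

def position(q):
    """Cartesian position (x, y) for polar generalized coordinates q = (rho, phi)."""
    rho, phi = q
    return (rho * math.cos(phi), rho * math.sin(phi))

def dr_dq(q, j, h=1e-6):
    """Central-difference estimate of the partial derivative of position w.r.t. q_j."""
    q_plus, q_minus = list(q), list(q)
    q_plus[j] += h
    q_minus[j] -= h
    p_plus, p_minus = position(q_plus), position(q_minus)
    return tuple((a - b) / (2 * h) for a, b in zip(p_plus, p_minus))

def velocity(q, q_dot):
    """r_dot(q, q_dot) = sum_j (d r / d q_j)(q) * q_dot_j, a function of two separate arguments."""
    columns = [dr_dq(q, j) for j in range(len(q))]
    return tuple(
        sum(columns[j][k] * q_dot[j] for j in range(len(q)))
        for k in range(2)
    )

def dv_dqdot(q, q_dot, j, h=1e-6):
    """Central-difference estimate of d(r_dot)/d(q_dot_j), varying only the velocity slot."""
    v_plus, v_minus = list(q_dot), list(q_dot)
    v_plus[j] += h
    v_minus[j] -= h
    r_plus, r_minus = velocity(q, v_plus), velocity(q, v_minus)
    return tuple((a - b) / (2 * h) for a, b in zip(r_plus, r_minus))

# Arbitrary sample point in (q, q_dot) space; the values are illustrative only.
q, q_dot = (1.3, 0.7), (0.4, -2.0)
for j in range(2):
    print(j, dv_dqdot(q, q_dot, j), dr_dq(q, j))
```

The two printed tuples should agree for each `j` up to finite-difference error, because `velocity` is linear in its second argument.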

- #5

Erland

Science Advisor

- 741

- 138

https://www.physicsforums.com/showthread.php?t=538909&highlight=erland

There I try both to formulate the problem and to solve it. The problem is that both formulating the problem and solving it require extensive notation which might be hard to penetrate, and still, those who replied didn't see the problem.

- #6

WannabeNewton

Science Advisor

- 5,815

- 544

- #7

Erland

Science Advisor

- 741

- 138

[QUOTE="WannabeNewton"]There is nothing deep here whatsoever; it's just math. If ##M## is the configuration space, and ##TM## is the tangent bundle of the configuration space...[/QUOTE]

OK, but perhaps not everyone knows what a tangent bundle is. I think my explanation in my post (referred to above) is actually based on the same idea as yours, but is expressed differently. The important thing to note is that for every ##(2n+1)##-tuple of numbers ##(a_1,a_2,\dots,a_n,b_1,b_2,\dots,b_n,c)## (perhaps within some given boundaries), there is a path ##(q_1(t),q_2(t),\dots,q_n(t))## in configuration space such that ##q_i(c)=a_i## and ##\dot q_i(c)=b_i## for ##i=1,2,\dots,n##.

- #8

rubi

Science Advisor

- 847

- 348

You don't need fiber bundles to understand this. Let's just work on the space [itex]\mathbb R^3\times \mathbb R^3 = \mathbb R^6[/itex] of coordinates and velocities. The Lagrangian is just a function [itex]L:\mathbb R^6\rightarrow \mathbb R[/itex]. Its value at a point is better denoted by [itex]L(x,v)[/itex] instead of [itex]L(q,\dot q)[/itex]. Then the Euler-Lagrange equations read

[tex]\frac{\mathrm d}{\mathrm d t} \left(\frac{\partial L(x,v)}{\partial v}\bigg|_{x=q(t),v=\dot q(t)}\right) - \frac{\partial L(x,v)}{\partial x}\bigg|_{x=q(t),v=\dot q(t)}=0 \text{ .}[/tex]

This makes all the dependences obvious. However, it's just more convenient to write

[tex]\frac{\mathrm d}{\mathrm d t} \frac{\partial L}{\partial \dot q} - \frac{\partial L}{\partial q} =0[/tex]

instead, although it might cause confusion. You just have to keep in mind that it really means the above equation. The formulation using fiber bundles just generalizes this to spaces other than [itex]\mathbb R^6[/itex].

- #9

Erland

Science Advisor

- 741

- 138

As I explained in the old thread, the problem is that for this to be meaningful, the function ##L(x,v)## must be unique. If there were another function ##M(x,v)## such that ##L(q(t),\dot q(t))=M(q(t),\dot q(t))## for all paths ##q(t)##, but for which ##\partial L /\partial v\neq\partial M/\partial v##, then we wouldn't know which one of these expressions to use.

Therefore, it is important to prove that ##L(x,v)## is unique.

- #10

- 42

- 5

- #11

rubi

Science Advisor

- 847

- 348

[itex]L(x,v)[/itex] is given as an axiom. It completely specifies your theory. For example [itex]L(x,v)=\frac{1}{2}m v^2 - m g x[/itex] describes a falling particle. You don't need to prove any uniqueness properties. In fact, it is never unique. Just try [itex]L'(x,v)=\frac{1}{2}m v^2 - m g x + v[/itex] for example. There are always other [itex]L'(x,v)[/itex] that give you exactly the same equations of motion. It doesn't matter which one you choose.

Everything is well-defined the way it is usually taught. For a given [itex]L(x,v)[/itex], you just compute the partial derivatives of [itex]L(x,v)[/itex], plug in [itex]q(t)[/itex] and [itex]\dot q(t)[/itex] afterwards and then insert them into the Euler-Lagrange equations. Apart from technical conditions like differentiability, you don't need to worry about anything.
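The recipe in the last paragraph (differentiate [itex]L(x,v)[/itex] with respect to its arguments, then plug in [itex]q(t)[/itex] and [itex]\dot q(t)[/itex]) can be sketched numerically, and the same sketch illustrates that [itex]L[/itex] and [itex]L'=L+v[/itex] give the same equations of motion. This is only an illustration under assumptions not stated in the thread: [itex]m=1[/itex], [itex]g=9.81[/itex], central finite differences with hand-picked step sizes, and hypothetical function names.

```python
import math

M, G = 1.0, 9.81  # assumed mass and gravitational acceleration

def lagrangian(x, v):
    """L(x, v) = m v^2 / 2 - m g x, a plain function of two real arguments."""
    return 0.5 * M * v**2 - M * G * x

def lagrangian_shifted(x, v):
    """L'(x, v) = L(x, v) + v, the alternative Lagrangian from the post."""
    return lagrangian(x, v) + v

def d_dx(L, x, v, h=1e-4):
    """Partial derivative of L with respect to its first argument."""
    return (L(x + h, v) - L(x - h, v)) / (2 * h)

def d_dv(L, x, v, h=1e-4):
    """Partial derivative of L with respect to its second argument."""
    return (L(x, v + h) - L(x, v - h)) / (2 * h)

def el_residual(L, q, q_dot, t, h=1e-3):
    """d/dt [dL/dv at (q(t), q_dot(t))] - dL/dx at (q(t), q_dot(t))."""
    momentum = lambda s: d_dv(L, q(s), q_dot(s))  # partials first, trajectory inserted afterwards
    d_momentum_dt = (momentum(t + h) - momentum(t - h)) / (2 * h)
    return d_momentum_dt - d_dx(L, q(t), q_dot(t))

def fall(t):
    """A free-fall trajectory, q(t) = 1 + 2 t - g t^2 / 2."""
    return 1.0 + 2.0 * t - 0.5 * G * t**2

def fall_dot(t):
    return 2.0 - G * t

t0 = 0.3
# Free fall solves both versions of the equations (residual close to zero).
print(el_residual(lagrangian, fall, fall_dot, t0))
print(el_residual(lagrangian_shifted, fall, fall_dot, t0))
# q(t) = sin(t) is not a solution; both Lagrangians give the same nonzero residual.
print(el_residual(lagrangian, math.sin, math.cos, t0))
print(el_residual(lagrangian_shifted, math.sin, math.cos, t0))
```

Note that `d_dx` and `d_dv` never need to know how `q` and `q_dot` are related; the trajectory only enters when the partial derivatives are evaluated.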

- #12

Vanadium 50

Staff Emeritus

Science Advisor

Education Advisor

2021 Award

- 28,059

- 12,600

Don't think about the mathematics. Think about the physics. If q and q-dot are dependent, that means that every time a particle is in a given position, it has the same velocity. While there are problems where that is true, do you want those to be the only kind of problems you can solve?

- #13

rubi

Science Advisor

- 847

- 348

I wrote: "You

For a given trajectory [itex]q(t)[/itex], [itex]q[/itex] and [itex]\dot q[/itex]

- #14

Erland

Science Advisor

- 741

- 138

##L=T-V## and there is, for a given potential, a given formula to calculate this in cartesian coordinates, like the one you gave, and this can be taken as an axiom, yes. We then use the coordinate transformation to rewrite this formula in the generalized position and velocity coordinates, and here lies the problem. For how can we know that the cartesian velocities can be uniquely expressed as functions of the generalized velocities and positions? The formulas giving these expressions are not taken as axioms; they are derived in a way that does not show that they are unique. Therefore, this uniqueness must be proved.

Again, I refer to this old thread for the details:

https://www.physicsforums.com/showthread.php?t=538909&highlight=erland

- #15

rubi

Science Advisor

- 847

- 348

[QUOTE="Erland"]We then use the coordinate transformation to rewrite this formula in the generalized position and velocity coordinates[/QUOTE]

No, you don't perform any coordinate transformations. If you have [itex]L(x,v)[/itex], the choice of coordinates has already been made and isn't changed anymore. [itex]x[/itex] and [itex]v[/itex] are already the generalized coordinates. I didn't mean to imply cartesian coordinates when I wrote [itex]x[/itex]. I just wanted to distinguish it symbolically from the trajectory [itex]q(t)[/itex].

(It's unfortunate that in the case of [itex]TM=\mathbb R^{2N}[/itex], the usual coordinate chart already is [itex](\mathbb R^{2N},\mathrm{id})[/itex]. This obfuscates what's going on a little bit. Actually, everything can even be formulated completely coordinate free. One should distinguish [itex]L:TM\rightarrow \mathbb R[/itex] from [itex]L\circ f^{-1}:U\rightarrow\mathbb R[/itex], where [itex](U,f)[/itex] is a coordinate chart for [itex]TM[/itex]. The [itex]L(x,v)[/itex] I'm talking about all the time, is really some [itex]L\circ f^{-1}[/itex]. This is way too complicated however, if you just work in [itex]\mathbb R^{2N}[/itex].)

- #16

- 8

- 0

Drawing on books and on all your very helpful answers (thanks, by the way), I have reached a conclusion, and I would really appreciate it if you could tell me whether it is correct.

In my head, it's just a mathematical reason that you can consider them as independent. For example, suppose there's a function [tex] f(y(x),y'(x))= y + y' [/tex] where [itex] y=x^2 \rightarrow y'=2x [/itex].

If we evaluate [itex] f(y,y') [/itex] as a function of [itex]x[/itex] only, we have [itex]f(x)=x^2+2x[/itex]. Differentiating with respect to [itex]x[/itex] gives [itex]f'(x)=2x+2[/itex].

Similarly, if we consider [itex]y(x)[/itex] and [itex]y'(x)[/itex] as independent variables and use the chain rule to differentiate [itex]f(y,y')[/itex] w.r.t. [itex]x[/itex], we have: [tex]\frac{df(y,y')}{dx}=\frac{\partial f}{\partial y} \frac{dy}{dx}+\frac{\partial f}{\partial y'} \frac{dy'}{dx} [/tex] Evaluating each term, we have [tex]\frac{\partial f}{\partial y}=1; \quad \frac{\partial f}{\partial y'}=1; \quad \frac{dy}{dx}=y'(x)=2x; \quad \frac{dy'}{dx}=\frac{d(2x)}{dx}=2[/tex]

which, by substitution, gives: [tex]\frac{df(y,y')}{dx}=1(2x)+1(2)=2x+2[/tex]

As we can see, this is the same as the answer calculated previously. This illustrates (but does not prove) that we can consider them as independent, and with this result we see that [itex]\dot{\vec{r_i}} = \sum_j \frac {\partial \vec{r_i}} {\partial q_j} \dot{q_j}[/itex] can be considered as a function of the independent variables [itex]q_j(t)[/itex] and [itex] \dot{q_j}(t)[/itex], even though we know they're both functions of the independent variable [itex]t[/itex] and that there's a relation of dependence between them: [tex] \dot{\vec{r_i}}=\dot{\vec{r_i}}(q,\dot{q})[/tex] That being so, we can do as in the example and partially differentiate it w.r.t. [itex]\dot{q_j}[/itex], treating the [itex]q_j[/itex] as constants. With this result, it becomes possible to prove Lagrange's equation.

Please correct me if I'm wrong and, if possible, point me to a proof of the identity I illustrated with this example.

- #17

rubi

Science Advisor

- 847

- 348

I'm not sure whether you understood it. We don't "consider anything independent". In fact, given a trajectory, [itex]q[/itex] and [itex]\dot q[/itex] are not independent in general as the simple example [itex]q(t)=t^2[/itex] shows. The physical intuition behind this is: A particle usually has a different velocity at each point of the trajectory.

The problem many people are having (and I think you are having, too) is: When we evaluate [itex]\frac{\partial L}{\partial q}[/itex] and [itex]\frac{\partial L}{\partial \dot q}[/itex] in the Euler-Lagrange equations, why don't we need to do something like this:

[tex]\frac{\mathrm d L}{\mathrm d q} = \frac{\partial L}{\partial q} + \frac{\partial L}{\partial \dot q} \frac{\partial \dot q}{\partial q}[/tex]

And the answer is that the usual way the Euler-Lagrange equations are written is a little bit of an abuse of notation. [itex]\frac{\partial L(q,\dot q)}{\partial q}[/itex] isn't to be interpreted as

[tex]\frac{\mathrm d L(q,\dot q(q))}{\mathrm d q} \text{ .}[/tex]

It really means

[tex]\frac{\partial L(x,v)}{\partial x}\bigg|_{x=q(t),v=\dot q(t)} \text{ .}[/tex]

The same goes for [itex]\frac{\partial L}{\partial \dot q}[/itex]. At no point does [itex]q[/itex] need to be differentiated with respect to [itex]\dot q[/itex] (or the other way around). Thus it is irrelevant whether they are really dependent or not. You just differentiate [itex]L[/itex] with respect to its arguments and afterwards insert [itex]q[/itex] and [itex]\dot q[/itex]. This is very important.

- #18

- 8

- 0

[QUOTE="rubi"]At no point does [itex]q[/itex] need to be differentiated with respect to [itex]\dot q[/itex] (or the other way around). Thus it is irrelevant whether they are really dependent or not. You just differentiate [itex]L[/itex] with respect to its arguments and afterwards insert [itex]q[/itex] and [itex]\dot q[/itex]. This is very important.[/QUOTE]

Adapting my thoughts: the terms [itex]\frac{\partial L}{\partial q_j}[/itex] and [itex] \frac{\partial L}{\partial \dot q_j}[/itex] (where [itex]\frac{\partial L}{\partial q_j}[/itex] treats the [itex]\dot q_j[/itex] as constants and vice versa) appear in the Lagrangian equations of motion because, when deriving them from Hamilton's principle, the Taylor expansion of [itex] L+\delta L=L(q(t)+\delta q(t),\dot {q}(t) + \delta \dot{q}(t), t)[/itex] makes no distinction as to whether [itex]q[/itex] and [itex] \dot q[/itex] are independent of each other. That being so, to first order this expansion can always be written [tex]L(q(t)+\delta q(t),\dot {q}(t) + \delta \dot{q}(t), t)= L(q(t), \dot{q}(t), t) +\sum_{j} \frac{\partial L}{\partial q_j}\delta q_j(t) + \sum_{j} \frac{\partial L}{\partial \dot{q_j}}\delta \dot{q_j}(t)[/tex]

And if you apply this expansion to Hamilton's principle and manipulate it algebraically (recognizing that [itex] \dot {q_j}=\frac{dq_j}{dt}[/itex]), you get [tex]\frac{\mathrm d}{\mathrm d t} \frac{\partial L}{\partial \dot q} - \frac{\partial L}{\partial q} =0[/tex] Since at the start of the derivation [itex]\frac{\partial L}{\partial q_j}[/itex] treats the [itex]\dot q_j[/itex] as constants and vice versa, they keep this behavior in Lagrange's equations.

Am I correct? Thanks for your help

- #19

rubi

Science Advisor

- 847

- 348

Yes, this is the idea. However, I'm not happy with phrases like "consider as independent" and "treat as constants". This sounds like one could arbitrarily choose how to interpret the derivatives. That's not the case, though. You can prove the Euler-Lagrange equations with full rigour by strictly applying the rules of calculus. There is no freedom how to interpret terms.

Here is how I would derive the Euler-Lagrange equations (leaving out all technicalities for the sake of simplicity):

We want to find the trajectory ##q(t)## that makes the action

[tex]S[q] = \int_{t_a}^{t_b} L(q(t),\dot q(t))\mathrm d t[/tex]

stationary. A necessary condition for this to be true is that whenever we add a multiple of some arbitrary function ##\eta(t)## with ##\eta(t_a) = \eta(t_b) = 0## to ##q(t)##, ##S[q+\epsilon\eta]## shouldn't change much for small ##\epsilon##. Since, given fixed ##q## and ##\eta##, ##S[q+\epsilon\eta]## is just a real-valued function of the real parameter ##\epsilon##, we can state this as

[tex]\frac{\mathrm d}{\mathrm d \epsilon}\bigg|_{\epsilon=0} S[q+\epsilon\eta] =0 \text{ .}[/tex]

Now we can insert the definition of ##S[q]## and (assuming everything behaves nicely) move the derivative under the integral:

[tex]\frac{\mathrm d}{\mathrm d \epsilon}\bigg|_{\epsilon=0} S[q+\epsilon\eta] = \int_{t_a}^{t_b} \frac{\partial}{\partial\epsilon}\bigg|_{\epsilon=0} L(q(t)+\epsilon\eta(t),\dot q(t)+\epsilon\dot\eta(t))\mathrm d t[/tex]

Note that the derivative acts on a function of the form ##f(g(\epsilon,t),h(\epsilon,t))##, so we can just apply the chain rule:

[tex]\frac{\partial}{\partial\epsilon}\bigg|_{\epsilon=0} L(q(t)+\epsilon\eta(t),\dot q(t)+\epsilon\dot\eta(t)) \\= \left[\frac{\partial L(q(t)+\epsilon\eta(t),\dot q(t)+\epsilon\dot\eta(t))}{\partial (q(t)+\epsilon\eta(t))}\frac{\partial (q(t)+\epsilon\eta(t))}{\partial\epsilon}+\frac{\partial L(q(t)+\epsilon\eta(t),\dot q(t)+\epsilon\dot\eta(t))}{\partial (\dot q(t)+\epsilon\dot\eta(t))}\frac{\partial (\dot q(t)+\epsilon\dot\eta(t))}{\partial\epsilon}\right]\bigg|_{\epsilon=0}\\=\frac{\partial L(q(t),\dot q(t))}{\partial q(t)}\eta(t)+\frac{\partial L(q(t),\dot q(t))}{\partial \dot q(t)}\dot\eta(t)[/tex]

Now you just need to put this back into the integral, use the standard integration by parts trick, make sure the boundary term vanishes and derive the Euler-Lagrange equations, using the fact that it should hold for all ##\eta##. I have abused notation a little bit, but I hope it is clear how this is to be understood.
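The chain-rule step above can be illustrated numerically. The sketch below is not part of the original derivation and rests on several assumptions: the falling-particle Lagrangian with ##m=1## and ##g=9.81##, the interval ##[0,1]##, the particular choice ##\eta(t)=\sin(\pi t)##, a midpoint quadrature rule, and hypothetical function names. It compares the derivative of ##S[q+\epsilon\eta]## with respect to ##\epsilon## at ##\epsilon=0## against the integral of ##\frac{\partial L}{\partial q}\eta+\frac{\partial L}{\partial \dot q}\dot\eta##, where the partial derivatives are taken with respect to the two argument slots of ##L##.

```python
import math

M, G = 1.0, 9.81      # assumed mass and gravitational acceleration
T_A, T_B = 0.0, 1.0   # assumed integration interval

def lagrangian(x, v):
    return 0.5 * M * v**2 - M * G * x

def eta(t):
    """Variation that vanishes at both endpoints."""
    return math.sin(math.pi * (t - T_A) / (T_B - T_A))

def eta_dot(t):
    w = math.pi / (T_B - T_A)
    return w * math.cos(w * (t - T_A))

def integrate(f, n=2000):
    """Midpoint rule on [T_A, T_B]."""
    h = (T_B - T_A) / n
    return h * sum(f(T_A + (k + 0.5) * h) for k in range(n))

def action(q, q_dot, eps=0.0):
    """S[q + eps * eta], with the velocity shifted consistently by eps * eta_dot."""
    return integrate(
        lambda t: lagrangian(q(t) + eps * eta(t), q_dot(t) + eps * eta_dot(t))
    )

def action_derivative(q, q_dot, eps=1e-4):
    """Central-difference estimate of dS/d(eps) at eps = 0."""
    return (action(q, q_dot, eps) - action(q, q_dot, -eps)) / (2 * eps)

def first_variation(q, q_dot, h=1e-4):
    """Integral of (dL/dx) * eta + (dL/dv) * eta_dot, with partials taken slot by slot."""
    def integrand(t):
        x, v = q(t), q_dot(t)
        dL_dx = (lagrangian(x + h, v) - lagrangian(x - h, v)) / (2 * h)
        dL_dv = (lagrangian(x, v + h) - lagrangian(x, v - h)) / (2 * h)
        return dL_dx * eta(t) + dL_dv * eta_dot(t)
    return integrate(integrand)

def fall(t):
    """A solution of the equations of motion: q(t) = 2 t - g t^2 / 2."""
    return 2.0 * t - 0.5 * G * t**2

def fall_dot(t):
    return 2.0 - G * t

# For a solution, both numbers should be close to zero.
print(action_derivative(fall, fall_dot), first_variation(fall, fall_dot))
# For q(t) = sin(t), which is not a solution, they agree with each other but are not zero.
print(action_derivative(math.sin, math.cos), first_variation(math.sin, math.cos))
```

The agreement of the two columns for an arbitrary path is the chain rule at work; only the vanishing for the free-fall path reflects the equations of motion.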

- #20

Erland

Science Advisor

- 741

- 138

[QUOTE="rubi"]No, you don't perform any coordinate transformations. If you have [itex]L(x,v)[/itex], the choice of coordinates has already been made and isn't changed anymore.[/QUOTE]

OK, maybe you don't always have to invoke cartesian coordinates, but in most examples given in textbooks, such as Goldstein, this is how it is done.

Suppose, for example, we have a particle forced to move on a surface located in a gravity field. We then express the motion of the particle in terms of two parameters on the surface. Then, how can we express L=T-V in terms of the surface parameters and their time derivatives for the path?

The only way I know of is to express the kinetic and potential energy of the particle in cartesian coordinates and then transform to the surface parameters. This method is anyway implicit in the derivation in Goldstein.

- #21

rubi

Science Advisor

- 847

- 348

It seems like you are talking about a different problem than the OP. The OP wants to know how to properly derive the Euler-Lagrange equations,

- #22

Erland

Science Advisor

- 741

- 138

I say yes, this is the way one must think. But there is a problem which many people don't seem to see: the problem of uniqueness.

Using your example, what if we start from ##x^2+2x## and want to retrieve ##f##? It is clear that there are many (in fact, infinitely many) functions ##f(y,y')## such that ##f(x^2,2x)=x^2+2x##. ##f(y,y')=y+y'## is only one of them; another one is e.g. ##g(y,y')=\sqrt y(\frac {y'}2+2)## (assuming that ##x>0##).

Clearly, ##\frac d{dx}f(x^2,2x)=\frac d{dx}g(x^2,2x)=2x+2##, but ##\partial g/\partial y=\frac1{2\sqrt y}(\frac {y'}2+2)## and ##\partial g/\partial y'=\frac{\sqrt y}2##, very different from ##\partial f/\partial y## and ##\partial f/\partial y'##. But of course, if we calculate ##\frac d{dx}g(x^2,2x)## with the chain rule using these partial derivatives, the result is still ##2x+2##.

But in Lagrange's equations, expressions similar to these partial derivatives are used, not only as intermediates, so how can we then know whether we should use the partial derivatives of ##f## or ##g## or of some other of infinitely many possible functions?

The answer is that we need a function ##f## which works, not only for ##y(x)=x^2##, but for all choices of the function (or path) ##y(x)##, and then, there is only one possible function ##f## (this is better explained in the other thread referred to in other posts).
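The example with ##f## and ##g## can be checked numerically. The sketch below is only an illustration and does not prove the uniqueness claim; the sample point ##x=1.5##, the second path ##y=x^3##, and the helper name `partials` are assumptions introduced here. It shows that ##f## and ##g## agree along ##y=x^2##, have different partial derivatives there, and stop agreeing along another path.

```python
import math

def f(y, yp):
    """f(y, y') = y + y'."""
    return y + yp

def g(y, yp):
    """g(y, y') = sqrt(y) * (y'/2 + 2), valid for y > 0."""
    return math.sqrt(y) * (yp / 2 + 2)

def partials(func, y, yp, h=1e-6):
    """Central-difference partial derivatives with respect to each argument slot."""
    d_dy = (func(y + h, yp) - func(y - h, yp)) / (2 * h)
    d_dyp = (func(y, yp + h) - func(y, yp - h)) / (2 * h)
    return d_dy, d_dyp

x = 1.5
on_parabola = (x**2, 2 * x)    # y = x^2,  y' = 2x
on_cubic = (x**3, 3 * x**2)    # y = x^3,  y' = 3x^2 (a different path)

print("parabola:", f(*on_parabola), g(*on_parabola))
print("partials of f:", partials(f, *on_parabola))
print("partials of g:", partials(g, *on_parabola))
print("cubic:", f(*on_cubic), g(*on_cubic))
```

Requiring agreement along every path amounts to requiring agreement at every point ##(y,y')##, which is why a single path such as ##y=x^2## cannot pin down the function.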

- #23

rubi

Science Advisor

- 847

- 348

[QUOTE="Erland"]But there is a problem which many people don't seem to see: the problem of uniqueness.[/QUOTE]

Really, there definitely is no problem of uniqueness here. Given a constraint surface ##M\subset\mathbb R^N## and a Lagrangian ##L## on ##\mathbb R^N##, the Lagrangian for ##M## is uniquely defined by composing ##L## with the inclusion map ##TM\hookrightarrow\mathbb R^{2N}##. This is centuries old mainstream math. You are misunderstanding something and I'm not sure what exactly. Everything comes down to applying the chain rule correctly.

The OP isn't even dealing with constraints here. He has been given a Lagrangian that he wants to work with, so there is no need to question its uniqueness.

Edit: Here's the simplest example I can think of: Let's restrict the motion of a free particle ##L = \frac{1}{2}m(\dot x^2+\dot y^2)## to a circle ##x^2+y^2=1##. We choose coordinates ##x=\cos(\varphi)## and ##y=\sin(\varphi)## on the circle and let ##\varphi=\varphi(t)##. Then the chain rule gives ##\dot x = -\sin(\varphi) \dot\varphi## and ##\dot y = \cos(\varphi) \dot\varphi##, so ##L## restricted to the circle is just ##L = \frac{1}{2}m(\sin^2(\varphi)\dot\varphi^2+\cos^2(\varphi)\dot\varphi^2) = \frac{1}{2}m\dot\varphi^2##. The only thing we had to do was to apply the chain rule. We didn't need to worry about anything else.
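The circle example can be written as a composition of two functions. The sketch below is only an illustration with assumed names (`pushforward`, `restricted_lagrangian`) and an assumed mass ##m=1##; it composes the free Lagrangian with the chain-rule map ##(\varphi,\dot\varphi)\mapsto(\dot x,\dot y)## and compares the result with ##\frac{1}{2}m\dot\varphi^2## at a few arbitrary points.

```python
import math

M = 1.0  # assumed mass

def free_lagrangian(x_dot, y_dot):
    """L = m (x_dot^2 + y_dot^2) / 2 for a free particle in the plane."""
    return 0.5 * M * (x_dot**2 + y_dot**2)

def pushforward(phi, phi_dot):
    """Chain rule for x = cos(phi), y = sin(phi): returns (x_dot, y_dot)."""
    return (-math.sin(phi) * phi_dot, math.cos(phi) * phi_dot)

def restricted_lagrangian(phi, phi_dot):
    """The free Lagrangian composed with the chain-rule map, a function of (phi, phi_dot)."""
    return free_lagrangian(*pushforward(phi, phi_dot))

# Arbitrary sample points; phi and phi_dot are chosen independently of each other.
for phi, phi_dot in [(0.0, 1.0), (0.7, -2.5), (3.0, 0.4)]:
    print(phi, phi_dot, restricted_lagrangian(phi, phi_dot), 0.5 * M * phi_dot**2)
```

The last two printed columns should coincide up to rounding, independently of the value of `phi`.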

- #24

Erland

Science Advisor

- 741

- 138

Very good, but suppose now that there are functions ##f(u,v)\neq -\sin(u)\,v## and/or ##g(u,v)\neq\cos(u)\,v## such that ##\dot x=f(\varphi,\dot\varphi)## and ##\dot y=g(\varphi,\dot\varphi)## for all paths ##\varphi(t)##. If we plug these into the expression for ##L## and calculate ##\partial L/\partial\varphi## and ##\partial L/\partial \dot \varphi##, then the result could be different from the result using your formula. How can we then know which one is right?

Therefore, we need to prove that the functions ##f## and ##g## are uniquely determined by the requirements that ##\dot x=f(\varphi,\dot\varphi)## and ##\dot y=g(\varphi,\dot\varphi)## for all paths ##\varphi(t)##.

I think your proof using tangent bundles is actually based upon the same idea as my own proof, but I am not sure.

- #25

rubi

Science Advisor

- 847

- 348

[QUOTE="Erland"]It is possible that OP and I don't talk about exactly the same problem, but they are certainly related.[/QUOTE]

I don't think so. The OP is basically asking: "Given ##L=\frac{1}{2}m\dot\varphi^2##, how do I find the Euler-Lagrange equations?" But you are asking: "Why ##L=\frac{1}{2}m\dot\varphi^2## and not something different?"

[QUOTE="Erland"]Very good, but suppose now that there are functions ##f(u,v)\neq -\sin(u)\,v## and/or ##g(u,v)\neq\cos(u)\,v## such that ##\dot x=f(\varphi,\dot\varphi)## and ##\dot y=g(\varphi,\dot\varphi)## for all paths ##\varphi(t)##.[/QUOTE]

Well, there might be such functions, but they wouldn't be relevant for the problem, since they can't have been derived using the chain rule. Thus they don't describe the same physics anymore, i.e. they don't describe a free particle constrained to a circle in our example. There is a clearly stated rule that says "use the chain rule" or (if you want to be more geometric) "concatenate the Lagrangian with the push-forward of the embedding".

[QUOTE="Erland"]If we plug these into the expression for ##L## and calculate ##\partial L/\partial\varphi## and ##\partial L/\partial \dot \varphi##, then the result could be different from the result using your formula. How can we then know which one is right?[/QUOTE]

The one that has been derived using the chain rule is right. Substituting some arbitrary functions, just because some of their properties agree with some of the properties of the original functions, isn't valid in any part of mathematics (not just in Lagrangian mechanics). Of course the result would be different if you don't follow the rules correctly.

--

Edit: You might ask: "Why the chain rule?" The answer is that the embedding ##\varphi:M\rightarrow\mathbb R^{N}## automatically induces the map ##\varphi_*:TM\rightarrow\mathbb R^{2N}## and this can be shown to give you the chain rule if you choose a coordinate system.

- #26

- 8

- 0

Yes, this is the idea. However, I'm not happy with phrases like "consider as independent" and "treat as constants". This sounds like one could arbitrarily choose how to interpret the derivatives. That's not the case, though. You can prove the Euler-Lagrange equations with full rigour by strictly applying the rules of calculus. There is no freedom how to interpret terms.

Here is how I would derive the Euler-Lagrange equations (leaving out all technicalities for the sake of simplicity):

We want to find the trajectory ##q(t)## that makes the action

[tex]S[q] = \int_{t_a}^{t_b} L(q(t),\dot q(t))\mathrm d t[/tex]

stationary. A necessary condition for this to be true is that whenever we add a multiple of some arbitrary function ##\eta(t)## with ##\eta(t_a) = \eta(t_b) = 0## to ##q(t)##, ##S[q+\epsilon\eta]## shouldn't change much for small ##\epsilon##. Since, given fixed ##q## and ##\eta##, ##S[q+\epsilon\eta]## is just a real-valued function of the real parameter ##\epsilon##, we can state this as

[tex]\frac{\mathrm d}{\mathrm d \epsilon}\bigg|_{\epsilon=0} S[q+\epsilon\eta] =0 \text{ .}[/tex]

Now we can insert the definition of ##S[q]## and (assuming everything behaves nicely) move the derivative under the integral:

[tex]\frac{\mathrm d}{\mathrm d \epsilon}\bigg|_{\epsilon=0} S[q+\epsilon\eta] = \int_{t_a}^{t_b} \frac{\partial}{\partial\epsilon}\bigg|_{\epsilon=0} L(q(t)+\epsilon\eta(t),\dot q(t)+\epsilon\dot\eta(t))\mathrm d t[/tex]

Note that the derivative acts on a function of the form ##f(g(\epsilon,t),h(\epsilon,t))##, so we can just apply the chain rule:

[tex]\frac{\partial}{\partial\epsilon}\bigg|_{\epsilon=0} L(q(t)+\epsilon\eta(t),\dot q(t)+\epsilon\dot\eta(t)) \\= \left[\frac{\partial L(q(t)+\epsilon\eta(t),\dot q(t)+\epsilon\dot\eta(t))}{\partial (q(t)+\epsilon\eta(t))}\frac{\partial (q(t)+\epsilon\eta(t))}{\partial\epsilon}+\frac{\partial L(q(t)+\epsilon\eta(t),\dot q(t)+\epsilon\dot\eta(t))}{\partial (\dot q(t)+\epsilon\dot\eta(t))}\frac{\partial (\dot q(t)+\epsilon\dot\eta(t))}{\partial\epsilon}\right]\bigg|_{\epsilon=0}\\=\frac{\partial L(q(t),\dot q(t))}{\partial q(t)}\eta(t)+\frac{\partial L(q(t),\dot q(t))}{\partial \dot q(t)}\dot\eta(t)[/tex]

Now you just need to put this back into the integral, use the standard integration by parts trick, make sure the boundary term vanishes and derive the Euler-Lagrange equations, using the fact that it should hold for all ##\eta##. I have abused notation a little bit, but I hope it is clear how this is to be understood.
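For reference, the remaining steps described above can be sketched as follows. The notation is the same as above, and the only extra assumption is that ##L## is smooth enough for the integration by parts to be valid:

[tex]\frac{\mathrm d}{\mathrm d \epsilon}\bigg|_{\epsilon=0} S[q+\epsilon\eta] = \int_{t_a}^{t_b} \left[\frac{\partial L}{\partial q}\eta(t)+\frac{\partial L}{\partial \dot q}\dot\eta(t)\right]\mathrm d t = \left[\frac{\partial L}{\partial \dot q}\eta(t)\right]_{t_a}^{t_b} + \int_{t_a}^{t_b} \left[\frac{\partial L}{\partial q}-\frac{\mathrm d}{\mathrm d t}\frac{\partial L}{\partial \dot q}\right]\eta(t)\,\mathrm d t[/tex]

The boundary term vanishes because ##\eta(t_a)=\eta(t_b)=0##. Since the remaining integral has to vanish for every admissible ##\eta##, the bracket itself must vanish:

[tex]\frac{\mathrm d}{\mathrm d t}\frac{\partial L}{\partial \dot q}-\frac{\partial L}{\partial q} = 0[/tex]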

Thanks a lot. My only question now is: How can we prove that we can apply the chain rule to [itex] L(q,\dot{q},t) [/itex] the same way we would if [itex]q[/itex] and [itex] \dot{q}[/itex] were not related to each other at all? Furthermore, if you could point me to a proof of the general chain rule, that would also be very nice. Thanks again.

- #27

rubi

Science Advisor

- 847

- 348


What I have done is:

[tex]\frac{\mathrm d}{\mathrm d\epsilon} f(g(\epsilon,t),h(\epsilon,t)) = \frac{\partial f (x_1,x_2)}{\partial x_1}\bigg|_{x_1=g(\epsilon,t),x_2=h(\epsilon,t)}\frac{\partial g(\epsilon,t)}{\partial\epsilon}+\frac{\partial f (x_1,x_2)}{\partial x_2}\bigg|_{x_1=g(\epsilon,t),x_2=h(\epsilon,t)}\frac{\partial h(\epsilon,t)}{\partial\epsilon}[/tex]

Here I have made the particular choice ##f(x_1,x_2) = L(x_1,x_2)##, ##g(\epsilon,t) = q(t)+\epsilon\eta(t)## and ##h(\epsilon,t)=\dot q(t)+\epsilon\dot\eta(t)##. Note that ##q(t)## and ##\dot q(t)## are just functions of ##t##. It doesn't matter whether they are related or not, because we are taking partial derivatives with respect to ##\epsilon## only, so the ##t##-dependence or any other dependence doesn't play a role. Here's a simple example: ##f(x,t)=x g(t)+x^2##. Then ##\frac{\partial f}{\partial x} = g(t)+2x##. It doesn't matter what ##g(t)## is as long as we know that it doesn't depend on ##x##. If you want to review the chain rule, you might want to have a look at http://en.wikipedia.org/wiki/Chain_rule
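The statement can also be checked symbolically. The following is only a sketch: the harmonic-oscillator Lagrangian ##L(x_1,x_2) = \frac{1}{2} m x_2^2 - \frac{1}{2} k x_1^2## and the symbols `m`, `k` are assumptions picked for illustration and are not part of the argument above. It compares "substitute first, then differentiate with respect to ##\epsilon##" with the chain-rule expression, and then repeats the simple example ##f(x,t)=x g(t)+x^2##.

```python
import sympy as sp

t, eps = sp.symbols("t epsilon")
x1, x2 = sp.symbols("x1 x2")           # the two independent slots of L
m, k = sp.symbols("m k", positive=True)
q = sp.Function("q")(t)
eta = sp.Function("eta")(t)

# Hypothetical Lagrangian as a plain function of two variables.
L = m * x2**2 / 2 - k * x1**2 / 2

g = q + eps * eta                      # g(eps, t) = q + eps*eta
h = sp.diff(g, t)                      # h(eps, t) = q_dot + eps*eta_dot
slots = {x1: g, x2: h}

# Substitute first, then differentiate with respect to eps.
direct = sp.diff(L.subs(slots), eps).subs(eps, 0)

# Chain rule: partial derivatives of L in its slots times dg/deps, dh/deps.
chain = (sp.diff(L, x1).subs(slots) * sp.diff(g, eps)
         + sp.diff(L, x2).subs(slots) * sp.diff(h, eps)).subs(eps, 0)

difference = sp.simplify(direct - chain)
print(difference)

# The simple example: g_of_t may be any function of t, as long as it
# does not depend on x.
x = sp.symbols("x")
g_of_t = sp.Function("g")(t)
df_dx = sp.diff(x * g_of_t + x**2, x)
print(df_dx)
```

Both routes give the same expression, even though `h` is the time derivative of `g`. That relation never enters, because the only differentiation performed is with respect to ##\epsilon##.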

- #28

- 8

- 0


Think I got it now. [itex]\dot{q}[/itex] is not a function of [itex]q[/itex], it's only a function of the variable [itex]t[/itex]. That being so, if we partially differentiate [itex]L(q,\dot{q},t)[/itex] w.r.t. [itex]\epsilon[/itex], we know that the partial derivative w.r.t. [itex]q[/itex] (i.e. [itex]\frac{\partial L}{\partial q}[/itex]) will "see [itex]\dot{q}[/itex] as a constant" and vice versa, just as the chain rule predicts. Right? THANKS A LOT FOR ALL YOUR HELP, GUYS

- #29

WannabeNewton

Science Advisor

- 5,815

- 544

Indeed :)

- #30

rubi

Science Advisor

- 847

- 348

Yes, I think you got it. One last thing should be noted though: For the derivative with respect to ##\epsilon##, it doesn't matter whether ##\dot q## depends on ##t## or ##q##, as long as it doesn't depend on ##\epsilon##.

No problem.
