Lagrangian: q and q-dot independence

1. Aug 29, 2013

pccrp

Hello! I've read thousands of explanations of how q and q-dot are considered independent in the Lagrangian treatment of mechanics, but I just can't get it. I would really appreciate it if someone could explain why this is so (I've seen something about an a priori independence but couldn't really understand it) and prove Lagrange's equations of motion, showing how this independence works. Thanks

2. Aug 29, 2013

WannabeNewton

Consider a particle moving completely freely. $q_1,q_2,q_3$ and $\dot{q}_1,\dot{q}_2,\dot{q}_3$ are coordinates of $\mathbb{R}^{6}$ which specify the possible states of the particle. As you know, a set of coordinates $(x^i)$ are independent of each other i.e. $\frac{\partial x^{i}}{\partial x^{j}} = \delta^{i}_{j}$.

3. Aug 29, 2013

pccrp

I understand they're all needed to specify the state of the system. However, how can you start from this and prove the equations?

4. Aug 30, 2013

voko

$$\vec{r_i} = \vec{r_i}(q) \\ \dot{\vec{r_i}} = \sum_j \frac {\partial \vec{r_i}} {\partial q_j}(q) \dot{q_j} \\ \frac {\partial \dot{\vec{r_i}}} {\partial \dot{q_j}} = \frac {\partial \vec{r_i}} {\partial q_j} \\ \frac {\partial T} {\partial \dot{q_j}} = \sum_i m_i \dot{\vec{r_i}} \cdot \frac {\partial \dot{\vec{r_i}}} {\partial \dot{q_j}} = \sum_i m_i \dot{\vec{r_i}} \cdot \frac {\partial \vec{r_i}} {\partial q_j} \\ \frac {d} {dt} \frac {\partial T} {\partial \dot{q_j}} = \sum_i m_i \ddot{\vec{r_i}} \cdot \frac {\partial \vec{r_i}} {\partial q_j} + \sum_i m_i \dot{\vec{r_i}} \cdot \frac {\partial \dot {\vec{r_i}}} {\partial q_j} = \sum_i \vec{F_i} \cdot \frac {\partial \vec{r_i}} {\partial q_j} + \frac {\partial T} {\partial q_j} = Q_j + \frac {\partial T} {\partial q_j} \\ \frac {d} {dt} \frac {\partial T} {\partial \dot{q_j}} - \frac {\partial T} {\partial q_j} = Q_j \\ \vec{F_i} = - \frac {\partial \Pi} {\partial \vec {r_i}} \\ \frac {\partial \Pi} {\partial q_j} = \sum_i \frac {\partial \Pi} {\partial \vec {r_i}} \cdot \frac {\partial \vec{r_i}} {\partial q_j} = - \sum_i \vec{F_i} \cdot \frac {\partial \vec{r_i}} {\partial q_j} = - Q_j \\ \frac {d} {dt} \frac {\partial T} {\partial \dot{q_j}} - \frac {\partial T} {\partial q_j} = - \frac {\partial \Pi} {\partial q_j} \\ \frac {d} {dt} \frac {\partial T} {\partial \dot{q_j}} - \frac {\partial (T - \Pi)} {\partial q_j} = 0 \\ L = T - \Pi \\ \frac {d} {dt} \frac {\partial L} {\partial \dot{q_j}} - \frac {\partial L} {\partial q_j} = 0$$
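
As a concrete check of the third line (the step where the dots "cancel"), here is a single particle in the plane described by polar coordinates. The choice of coordinates and the symbols $\rho,\varphi$ are only an illustrative assumption, not part of the derivation above.

$$\vec{r} = (\rho\cos\varphi,\ \rho\sin\varphi), \qquad \dot{\vec{r}} = \dot\rho\,(\cos\varphi,\ \sin\varphi) + \rho\,\dot\varphi\,(-\sin\varphi,\ \cos\varphi)$$

$$\frac{\partial \dot{\vec{r}}}{\partial \dot\rho} = (\cos\varphi,\ \sin\varphi) = \frac{\partial \vec{r}}{\partial \rho}, \qquad \frac{\partial \dot{\vec{r}}}{\partial \dot\varphi} = \rho\,(-\sin\varphi,\ \cos\varphi) = \frac{\partial \vec{r}}{\partial \varphi}$$

Here $\dot{\vec{r}}$ is differentiated as an expression in the four symbols $\rho,\varphi,\dot\rho,\dot\varphi$. It is linear in $\dot\rho$ and $\dot\varphi$, with coefficients that depend only on $\rho$ and $\varphi$, which is why the identity holds.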

5. Aug 30, 2013

Erland

pccrp, I have exactly the same problem as you, and think that many of those who try to answer the question don't really understand the problem. I think I solved it, however, and in this old thread I both try to formulate the problem and solve it. The problem is that both formulating the problem and solving it required extensive notation which might be hard to penetrate, and still, those who replied didn't see the problem.

6. Aug 30, 2013

WannabeNewton

There is nothing deep here whatsoever; it's just math. If $M$ is the configuration space, and $TM$ is the tangent bundle of the configuration space, then the Lagrangian is a map $L: TM \times \mathbb{R} \rightarrow \mathbb{R}$. The coordinate charts for $TM$ have the generalized coordinates $q^i$ and generalized velocities $\dot{q}^i$ as coordinate functions i.e. $p\in TM$ can be represented as $p = (q^1,...,q^n,\dot{q}^1,...,\dot{q}^n)$ with respect to some coordinate chart. Then $L$ is just a map that takes $(q^1,...,q^n,\dot{q}^1,...,\dot{q}^n,t)$ and gives us a real number. Coordinate functions are of course independent of each other.

7. Aug 30, 2013

Erland

OK, but perhaps not everyone knows what a tangent bundle is. I think my explanation in my post (referred to above) is actually based on the same idea as yours, but is expressed differently. The important thing to note is that for every 2n+1-tuple of numbers $(a_1,a_2,\dots,a_n,b_1,b_2,\dots,b_n,c)$ (perhaps within some given boundaries), there is a path $(q_1(t),q_2(t),\dots,q_n(t))$ in configuration space such that $q_i(c)=a_i$ and $\dot q_i(c)=b_i$ for $i=1,2,\dots,n$.
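
One explicit choice of such a path, as a sketch (it ignores the "given boundaries", which a straight line may leave when $|t-c|$ is large):

$$q_i(t) = a_i + b_i\,(t-c), \qquad i=1,2,\dots,n,$$

which gives $q_i(c)=a_i$ and $\dot q_i(c)=b_i$. So at any single instant, the position and the velocity can be prescribed separately.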

8. Aug 30, 2013

rubi

You don't need fiber bundles to understand this. Let's just work on the space $\mathbb R^3\times \mathbb R^3 = \mathbb R^6$ of coordinates and velocities. The Lagrangian is just a function $L:\mathbb R^6\rightarrow \mathbb R$. Its value at a point is better denoted by $L(x,v)$ instead of $L(q,\dot q)$. Then the Euler-Lagrange equations read
$$\frac{\mathrm d}{\mathrm d t} \left(\frac{\partial L(x,v)}{\partial v}\bigg|_{x=q(t),v=\dot q(t)}\right) - \frac{\partial L(x,v)}{\partial x}\bigg|_{x=q(t),v=\dot q(t)}=0 \text{ .}$$
This makes all the dependences obvious. However, it's just more convenient to write
$$\frac{\mathrm d}{\mathrm d t} \frac{\partial L}{\partial \dot q} - \frac{\partial L}{\partial q} =0$$
instead, although it might cause confusion. You just have to keep in mind that it really means the above equation. The formulation using fiber bundles just generalizes this to spaces other than $\mathbb R^6$.
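
To make the "differentiate first, substitute afterwards" reading concrete, here is a minimal Python sketch in one dimension. The Lagrangian $L(x,v)=\frac{1}{2}mv^2-mgx$, the numbers, and all function names are illustrative assumptions, and the derivatives are approximated by finite differences rather than computed exactly.

```python
# Sketch: L is an ordinary function of two slots (x, v). Its partial
# derivatives are computed slot by slot, and only afterwards evaluated
# at x = q(t), v = q'(t). All numbers and names here are illustrative.
M, G = 2.0, 9.81  # assumed mass and gravitational acceleration


def lagrangian(x, v):
    return 0.5 * M * v * v - M * G * x


def dL_dx(x, v, h=1e-4):
    # Central difference in the first slot; v is held at its given value.
    return (lagrangian(x + h, v) - lagrangian(x - h, v)) / (2 * h)


def dL_dv(x, v, h=1e-4):
    # Central difference in the second slot; x is held at its given value.
    return (lagrangian(x, v + h) - lagrangian(x, v - h)) / (2 * h)


def euler_lagrange_residual(q, q_dot, t, h=1e-3):
    """d/dt [dL/dv at (q(t), q'(t))] - dL/dx at (q(t), q'(t))."""
    def momentum(s):
        return dL_dv(q(s), q_dot(s))
    d_momentum_dt = (momentum(t + h) - momentum(t - h)) / (2 * h)
    return d_momentum_dt - dL_dx(q(t), q_dot(t))


def free_fall(t):          # a path that solves m q'' = -m g
    return 1.0 + 3.0 * t - 0.5 * G * t * t


def free_fall_dot(t):
    return 3.0 - G * t


def parabola(t):           # an arbitrary path that is not a solution
    return t * t


def parabola_dot(t):
    return 2.0 * t


if __name__ == "__main__":
    print(euler_lagrange_residual(free_fall, free_fall_dot, 0.5))
    print(euler_lagrange_residual(parabola, parabola_dot, 0.5))
```

The residual should come out close to zero for the free-fall path and clearly nonzero for the other one. At no point is $q$ differentiated with respect to $\dot q$ or vice versa.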

Last edited: Aug 30, 2013

9. Aug 30, 2013

Erland

As I explained in the old thread, the problem is that for this to be meaningful, the function $L(x,v)$ must be unique. If there was another function $M(x,v)$ such that $L(q(t),\dot q(t))=M(q(t),\dot q(t))$ for all paths $q(t)$, but for which $\partial L /\partial v\neq\partial M/\partial v$, then we wouldn't know which one of these expressions to use.

Therefore, it is important to prove that $L(x,v)$ is unique.

10. Aug 30, 2013

marmoset

Anyone who is confused by this is in good company - Bill Burke dedicates his book on applied differential geometry to "all those who, like me, have wondered how in hell you can change $\dot q$ without changing $q$".

11. Aug 30, 2013

rubi

$L(x,v)$ is given as an axiom. It completely specifies your theory. For example $L(x,v)=\frac{1}{2}m v^2 - m g x$ describes a falling particle. You don't need to prove any uniqueness properties. In fact, it is never unique. Just try $L'(x,v)=\frac{1}{2}m v^2 - m g x + v$ for example. There are always other $L'(x,v)$ that give you exactly the same equations of motion. It doesn't matter which one you choose.
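
A quick check of that claim for the $L'$ above, written out as a sketch:

$$\frac{\partial L'(x,v)}{\partial v} = m v + 1, \qquad \frac{\partial L'(x,v)}{\partial x} = -mg$$

$$\frac{\mathrm d}{\mathrm d t}\left(m\dot q(t) + 1\right) - (-mg) = m\ddot q(t) + mg = 0$$

This is the same equation of motion that $L$ gives. The extra term $v$ becomes $\dot q = \frac{\mathrm d q}{\mathrm d t}$ along a path, a total time derivative, so it changes the action only by the constant $q(t_b)-q(t_a)$.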

Everything is well-defined the way it is usually taught. For a given $L(x,v)$, you just compute the partial derivatives of $L(x,v)$, plug in $q(t)$ and $\dot q(t)$ afterwards and then insert them into the Euler-Lagrange equations. Apart from technical conditions like differentiability, you don't need to worry about anything.

12. Aug 30, 2013

Staff Emeritus

Fiber bundles? Geeze...can we make this any more complicated?

Don't think about the mathematics. Think about the physics. If q and q-dot are dependent, that means that every time a particle is in a given position, it has the same velocity. While there are problems where that is true, do you want those to be the only kind of problems you can solve?

13. Aug 30, 2013

rubi

I wrote: "You don't need fiber bundles..."

For a given trajectory $q(t)$, $q$ and $\dot q$ are related. Just try $q(t)=t^2$. Then $q = \frac{{\dot q}^2}{4}$. The point is that this is irrelevant for the Euler-Lagrange equations, because neither $q$ nor $\dot q$ gets differentiated with respect to the other variable.

14. Aug 30, 2013

Erland

$L=T-V$ and there is, for a given potential, a given formula to calculate this in cartesian coordinates, like the one you gave, and this can be taken as an axiom, yes. We then use the coordinate transformation to rewrite this formula in the generalized position and velocity coordinates, and here lies the problem. For how can we know that the cartesian velocities can be uniquely expressed as functions of the generalized velocities and positions? The formulas giving these expressions are not taken as axioms; they are derived in a way that does not show that they are unique. Therefore, this uniqueness must be proved.

Again, I refer to this old thread for the details:

15. Aug 30, 2013

rubi

No, you don't perform any coordinate transformations. If you have $L(x,v)$, the choice of coordinates has already been made and isn't changed anymore. $x$ and $v$ are already the generalized coordinates. I didn't mean to imply cartesian coordinates when I wrote $x$. I just wanted to distinguish it symbolically from the trajectory $q(t)$.

(It's unfortunate that in the case of $TM=\mathbb R^{2N}$, the usual coordinate chart already is $(\mathbb R^{2N},\mathrm{id})$. This obfuscates what's going on a little bit. Actually, everything can even be formulated completely coordinate free. One should distinguish $L:TM\rightarrow \mathbb R$ from $L\circ f^{-1}:U\rightarrow\mathbb R$, where $(U,f)$ is a coordinate chart for $TM$. The $L(x,v)$ I'm talking about all the time, is really some $L\circ f^{-1}$. This is way too complicated however, if you just work in $\mathbb R^{2N}$.)

16. Aug 30, 2013

pccrp

Gathering answers from books and relying on all your very helpful answers (thanks, by the way), I reached a conclusion, and I would really appreciate it if you could tell me whether it is correct.

In my head, it's just a mathematical reason that you can consider them as independent. For example, suppose there's a function $$f(y(x),y'(x))= y + y'$$ where $y=x^2 \rightarrow y'=2x$
If we evaluate $f(y,y')$ as a function of $x$ only, we'll have $f(x)=x^2+2x$. If we differentiate it w.r.t. $x$ we get $f'(x)=2x+2$.

Similarly, if we consider $y(x)$ and $y'(x)$ as independent variables and use the chain rule to differentiate $f(y,y')$ w.r.t. $x$ we'll have: $$\frac{df(y,y')}{dx}=\frac{\partial f}{\partial y} \frac{dy}{dx}+\frac{\partial f}{\partial y'} \frac{dy'}{dx}$$ Evaluating each term, we have $$\frac{\partial f}{\partial y}=1 ;$$$$\frac{\partial f}{\partial y'}=1;$$$$\frac{dy}{dx}=y'(x)=2x;$$$$\frac{dy'}{dx}=\frac{d(2x)}{dx}=2;$$

Which, by substitution, gives:$$\frac{df(y,y')}{dx}=1(2x)+1(2)=2x+2$$
As we can see, this is the same as the answer previously calculated. This illustrates (but does not prove) that we can consider them as independent, and with this result we see that $\dot{\vec{r_i}} = \sum_j \frac {\partial \vec{r_i}} {\partial q_j} \dot{q_j}$ can be considered as a function of the independent variables $q_j(t)$ and $\dot{q_j}(t)$, even though we know they're both functions of the independent variable $t$ and that there's a relation of dependence between them: $$\dot{\vec{r_i}}=\dot{\vec{r_i}}(q,\dot{q})$$ That being so, we can do as in the example and partially differentiate it w.r.t. $\dot{q_j}$, treating the $q_j$ as constants. With this result, it becomes possible to prove Lagrange's equations.

Please correct me if I'm wrong and, if possible, point me to a proof of the identity that my example illustrates.
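
The identity in the example is the multivariable chain rule. As a minimal numerical sketch of it (not a proof), the check below uses a less trivial two-slot function than $y+y'$. The function $f(a,b)=ab+\sin a$ and all names are illustrative assumptions.

```python
import math


def f(a, b):
    """A function of two slots; a and b are just names for the slots."""
    return a * b + math.sin(a)


def df_da(a, b):
    # Partial derivative in the first slot, with b held fixed.
    return b + math.cos(a)


def df_db(a, b):
    # Partial derivative in the second slot, with a held fixed.
    return a


def y(x):
    return x * x


def y_prime(x):
    return 2.0 * x


def y_double_prime(x):
    return 2.0


def total_derivative_chain_rule(x):
    # Differentiate f slot by slot, then substitute a = y(x), b = y'(x).
    a, b = y(x), y_prime(x)
    return df_da(a, b) * y_prime(x) + df_db(a, b) * y_double_prime(x)


def total_derivative_numeric(x, h=1e-5):
    # Substitute first, then differentiate the resulting function of x.
    def g(s):
        return f(y(s), y_prime(s))
    return (g(x + h) - g(x - h)) / (2 * h)


if __name__ == "__main__":
    x0 = 0.7
    print(total_derivative_chain_rule(x0), total_derivative_numeric(x0))
```

Both routes should agree up to the finite-difference error, even though $y$ and $y'$ are clearly related along the curve $y=x^2$.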

Last edited: Aug 30, 2013

17. Aug 30, 2013

rubi

I'm not sure whether you understood it. We don't "consider anything independent". In fact, given a trajectory, $q$ and $\dot q$ are not independent in general as the simple example $q(t)=t^2$ shows. The physical intuition behind this is: A particle usually has a different velocity at each point of the trajectory.

The problem many people are having (and I think you are having, too) is: When we evaluate $\frac{\partial L}{\partial q}$ and $\frac{\partial L}{\partial \dot q}$ in the Euler-Lagrange equations, why don't we need to do something like this:
$$\frac{\mathrm d L}{\mathrm d q} = \frac{\partial L}{\partial q} + \frac{\partial L}{\partial \dot q} \frac{\partial \dot q}{\partial q}$$
And the answer is that the usual way the Euler-Lagrange equations are written is a little bit of an abuse of notation. $\frac{\partial L(q,\dot q)}{\partial q}$ isn't to be interpreted as
$$\frac{\mathrm d L(q,\dot q(q))}{\mathrm d q} \text{ .}$$
It really means
$$\frac{\partial L(x,v)}{\partial x}\bigg|_{x=q(t),v=\dot q(t)} \text{ .}$$
The same goes for $\frac{\partial L}{\partial \dot q}$. At no point does $q$ need to be differentiated with respect to $\dot q$ (or the other way around). Thus it is irrelevant whether they are really dependent or not. You just differentiate $L$ with respect to its arguments and afterwards insert $q$ and $\dot q$. This is very important.
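
A small worked instance of the difference, as a sketch: take $L(x,v)=\frac{1}{2}mv^2-mgx$ (an assumed example) and the path $q(t)=t^2$ for $t>0$, along which $\dot q = 2\sqrt{q}$. The path is only there to exhibit a dependence between $q$ and $\dot q$; it is not claimed to solve the equations of motion. The quantity that enters the Euler-Lagrange equations is
$$\frac{\partial L(x,v)}{\partial x}\bigg|_{x=q(t),v=\dot q(t)} = -mg \text{ ,}$$
whereas substituting first and differentiating afterwards gives something else:
$$\frac{\mathrm d L(q,\dot q(q))}{\mathrm d q} = \frac{\mathrm d}{\mathrm d q}\left(2mq - mgq\right) = 2m - mg \text{ .}$$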

18. Aug 30, 2013

pccrp

Adapting my thoughts, the terms $\frac{\partial L}{\partial q_j}$ and $\frac{\partial L}{\partial \dot q_j}$ (where the $\frac{\partial L}{\partial q_j}$ treats $\dot q_j$ as constants and vice versa) appear in the Lagrangian equations of motion because, when proving them from Hamilton's principle, the Taylor expansion for $L+\delta L=L(q(t)+\delta q(t),\dot {q} + \delta \dot{q}, t)$ does not make any distinction as to whether $q$ and $\dot q$ are independent of each other. That being so, this expansion can always be written, to first order in the variations, $$L(q(t)+\delta q(t),\dot {q}(t) + \delta \dot{q}(t), t)= L(q(t), \dot{q}(t), t) +\sum_{j} \frac{\partial L}{\partial q_j}\delta q_j(t) + \sum_{j} \frac{\partial L}{\partial \dot{q_j}}\delta \dot{q_j}(t)$$
And, if you apply this expansion to Hamilton's principle and manipulate it algebraically (recognizing that $\dot {q_j}=\frac{dq_j}{dt}$), you'll get $$\frac{\mathrm d}{\mathrm d t} \frac{\partial L}{\partial \dot q} - \frac{\partial L}{\partial q} =0$$ Since at the start of the derivation $\frac{\partial L}{\partial q_j}$ treats $\dot q_j$ as constants and vice versa, they keep this behavior in Lagrange's equations.
Am I correct? Thanks for your help

19. Aug 30, 2013

rubi

Yes, this is the idea. However, I'm not happy with phrases like "consider as independent" and "treat as constants". This sounds like one could arbitrarily choose how to interpret the derivatives. That's not the case, though. You can prove the Euler-Lagrange equations with full rigour by strictly applying the rules of calculus. There is no freedom how to interpret terms.

Here is how I would derive the Euler-Lagrange equations (leaving out all technicalities for the sake of simplicity):
We want to find the trajectory $q(t)$ that makes the action
$$S[q] = \int_{t_a}^{t_b} L(q(t),\dot q(t))\mathrm d t$$
stationary. A necessary condition for this to be true is that whenever we add a multiple of some arbitrary function $\eta(t)$ with $\eta(t_a) = \eta(t_b) = 0$ to $q(t)$, $S[q+\epsilon\eta]$ shouldn't change much for small $\epsilon$. Since, given fixed $q$ and $\eta$, $S[q+\epsilon\eta]$ is just a real-valued function of the real parameter $\epsilon$, we can state this as
$$\frac{\mathrm d}{\mathrm d \epsilon}\bigg|_{\epsilon=0} S[q+\epsilon\eta] =0 \text{ .}$$
Now we can insert the definition of $S[q]$ and (assuming everything behaves nicely) move the derivative under the integral:
$$\frac{\mathrm d}{\mathrm d \epsilon}\bigg|_{\epsilon=0} S[q+\epsilon\eta] = \int_{t_a}^{t_b} \frac{\partial}{\partial\epsilon}\bigg|_{\epsilon=0} L(q(t)+\epsilon\eta(t),\dot q(t)+\epsilon\dot\eta(t))\mathrm d t$$
Note that the derivative acts on a function of the form $f(g(\epsilon,t),h(\epsilon,t))$, so we can just apply the chain rule:
$$\frac{\partial}{\partial\epsilon}\bigg|_{\epsilon=0} L(q(t)+\epsilon\eta(t),\dot q(t)+\epsilon\dot\eta(t)) \\= \left[\frac{\partial L(q(t)+\epsilon\eta(t),\dot q(t)+\epsilon\dot\eta(t))}{\partial (q(t)+\epsilon\eta(t))}\frac{\partial (q(t)+\epsilon\eta(t))}{\partial\epsilon}+\frac{\partial L(q(t)+\epsilon\eta(t),\dot q(t)+\epsilon\dot\eta(t))}{\partial (\dot q(t)+\epsilon\dot\eta(t))}\frac{\partial (\dot q(t)+\epsilon\dot\eta(t))}{\partial\epsilon}\right]\bigg|_{\epsilon=0}\\=\frac{\partial L(q(t),\dot q(t))}{\partial q(t)}\eta(t)+\frac{\partial L(q(t),\dot q(t))}{\partial \dot q(t)}\dot\eta(t)$$
Now you just need to put this back into the integral, use the standard integration by parts trick, make sure the boundary term vanishes and derive the Euler-Lagrange equations, using the fact that it should hold for all $\eta$. I have abused notation a little bit, but I hope it is clear how this is to be understood.
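
For completeness, the remaining steps can be sketched as follows, in the same slightly abused notation:
$$0 = \int_{t_a}^{t_b} \left[\frac{\partial L}{\partial q}\eta(t)+\frac{\partial L}{\partial \dot q}\dot\eta(t)\right]\mathrm d t = \left[\frac{\partial L}{\partial \dot q}\eta(t)\right]_{t_a}^{t_b} + \int_{t_a}^{t_b} \left[\frac{\partial L}{\partial q}-\frac{\mathrm d}{\mathrm d t}\frac{\partial L}{\partial \dot q}\right]\eta(t)\,\mathrm d t$$
The boundary term vanishes because $\eta(t_a)=\eta(t_b)=0$. Since the remaining integral must vanish for every such $\eta$, the bracket itself must vanish (the fundamental lemma of the calculus of variations, assuming it is continuous), which gives
$$\frac{\mathrm d}{\mathrm d t} \frac{\partial L}{\partial \dot q} - \frac{\partial L}{\partial q} =0 \text{ .}$$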

Last edited: Aug 30, 2013

20. Aug 31, 2013

Erland

OK, maybe you don't always have to invoke cartesian coordinates, but in most examples given in textbooks, such as Goldstein, this is how it is done.
Suppose, for example, we have a particle constrained to move on a surface in a gravitational field. We then express the motion of the particle in terms of two parameters on the surface. Then, how can we express L=T-V in terms of the surface parameters and their time derivatives along the path?
The only way I know of is to express the kinetic and potential energy of the particle in cartesian coordinates and then transform to the surface parameters. This method is anyway implicit in the derivation in Goldstein.
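
As an illustration of that method, take the surface to be a sphere of radius $R$ with the $z$-axis pointing up. The sphere, the angles $\theta,\varphi$ and the uniform field $g$ are assumptions chosen only for the example.

$$x = R\sin\theta\cos\varphi, \qquad y = R\sin\theta\sin\varphi, \qquad z = R\cos\theta$$

$$T = \frac{1}{2}m\left(\dot x^2+\dot y^2+\dot z^2\right) = \frac{1}{2}mR^2\left(\dot\theta^2+\sin^2\theta\,\dot\varphi^2\right), \qquad V = mgz = mgR\cos\theta$$

$$L(\theta,\varphi,\dot\theta,\dot\varphi) = \frac{1}{2}mR^2\left(\dot\theta^2+\sin^2\theta\,\dot\varphi^2\right) - mgR\cos\theta$$

The cartesian velocities are obtained by differentiating the transformation along a path, and the result is a definite expression in $\theta,\varphi,\dot\theta,\dot\varphi$.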

Last edited: Aug 31, 2013