Help with Derivation of Euler Lagrange Equation

Whenry · Jun 30, 2012

Hello all,

I am having some trouble understanding one derivation of the Euler–Lagrange equation. I think it is most efficient to link to the derivation I am following (on Wikipedia) and then highlight the portion that is giving me trouble.

The link is here

Scroll down to the section "Derivation of one-dimensional Euler–Lagrange equation".

I am confused by the following two lines under "It follows from the total derivative that..."

[tex]\frac{dF_\epsilon}{d\epsilon} = \frac{\partial F}{\partial \epsilon} + \frac{dx}{d\epsilon}\frac{\partial F}{\partial x} + \frac{dg_\epsilon}{d\epsilon}\frac{\partial F}{\partial g_\epsilon} +\frac{dg'_\epsilon}{d\epsilon}\frac{\partial F}{\partial g'_\epsilon}[/tex]

[tex]\frac{dF_\epsilon}{d\epsilon} = \frac{dg_\epsilon}{d\epsilon}\frac{\partial F}{\partial g_\epsilon} +\frac{dg'_\epsilon}{d\epsilon}\frac{\partial F}{\partial g'_\epsilon}[/tex]

It is not clear to me why the first two terms are zero. I assume that the second term is zero because [itex]x[/itex] is constant with respect to [itex]\epsilon[/itex]. But I do not know why the partial derivative [itex]\partial F/\partial \epsilon[/itex] is zero.

Any help would be appreciated, thank you!

Will

Mute · Jun 30, 2012

In that derivation, F is not explicitly a function of [itex]\varepsilon[/itex], so the partial derivative of it with respect to [itex]\varepsilon[/itex] is zero. Similarly, as you supposed, x is independent of [itex]\varepsilon[/itex], so [itex]dx/d\varepsilon[/itex] is zero.

Whenry · Jul 1, 2012

Mute said:

In that derivation, F is not explicitly a function of [itex]\varepsilon[/itex], so the partial derivative of it with respect to [itex]\varepsilon[/itex] is zero. Similarly, as you supposed, x is independent of [itex]\varepsilon[/itex], so [itex]dx/d\varepsilon[/itex] is zero.

The reason this is confusing to me is that I have seen the chain rule applied to partial derivatives, which led me to think you could write something like
[tex]\frac{\partial F(x(\epsilon))}{\partial \epsilon} = \frac{\partial F(x(\epsilon))}{\partial x}\frac{\partial x}{\partial \epsilon}[/tex]

Mute · Jul 1, 2012

Whenry said:

The reason this is confusing to me is that I have seen the chain rule applied to partial derivatives, which led me to think you could write something like
[tex]\frac{\partial F(x(\epsilon))}{\partial \epsilon} = \frac{\partial F(x(\epsilon))}{\partial x}\frac{\partial x}{\partial \epsilon}[/tex]

The total derivative is the chain rule in higher dimensions. The statement you wrote down is not true. By definition of the partial derivative, if you differentiate a function with respect to a variable that the function does not explicitly depend on, then the partial derivative with respect to that variable is zero. This is because the partial derivative tells you how the function changes while all of its other arguments are held fixed; so, if the function only depends on a variable through the other variables, then since those variables are held fixed during a partial derivative, the partial derivative is zero. Hence,

[tex]\frac{\partial F(x(\varepsilon))}{\partial \varepsilon} = 0,[/tex]
but
[tex]\frac{d F(x(\varepsilon))}{d\varepsilon} = \frac{\partial F(x)}{\partial x} \frac{dx}{d\varepsilon}[/tex]

The total derivative [itex]dF(x(\varepsilon))/d\varepsilon[/itex] tells you how [itex]F(x(\varepsilon))[/itex] changes with epsilon overall. If your function were [itex]G(\varepsilon,x(\varepsilon))[/itex], then

[tex]\frac{\partial G(\varepsilon,x(\varepsilon))}{\partial \varepsilon} \neq 0,[/tex]
and
[tex]\frac{d G(\varepsilon,x(\varepsilon))}{d\varepsilon} = \frac{\partial G(\varepsilon,x)}{\partial \varepsilon} + \frac{\partial G(\varepsilon,x)}{\partial x} \frac{dx}{d\varepsilon}[/tex]
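As a concrete illustration (the specific functions below are chosen only for this example and do not come from the Wikipedia derivation), take [itex]F(x) = x^2[/itex], [itex]G(\varepsilon,x) = \varepsilon x[/itex], and [itex]x(\varepsilon) = 3\varepsilon[/itex]. For F,

[tex]\frac{\partial F}{\partial \varepsilon} = 0, \qquad \frac{dF}{d\varepsilon} = \frac{\partial F}{\partial x}\frac{dx}{d\varepsilon} = 2x \cdot 3 = 18\varepsilon,[/tex]

which agrees with differentiating [itex]F(x(\varepsilon)) = 9\varepsilon^2[/itex] directly. For G,

[tex]\frac{\partial G}{\partial \varepsilon} = x, \qquad \frac{dG}{d\varepsilon} = \frac{\partial G}{\partial \varepsilon} + \frac{\partial G}{\partial x}\frac{dx}{d\varepsilon} = x + 3\varepsilon = 6\varepsilon,[/tex]

which agrees with differentiating [itex]G(\varepsilon,x(\varepsilon)) = 3\varepsilon^2[/itex] directly. In both cases the partial derivative only sees the explicit [itex]\varepsilon[/itex] slot, while the total derivative also picks up the dependence through x.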

Whenry · Jul 1, 2012

Mute said:

The total derivative is the chain rule in higher dimensions. The statement you wrote down is not true. By definition of the partial derivative, if you differentiate a function with respect to a variable that the function does not explicitly depend on, then the partial derivative with respect to that variable is zero. This is because the partial derivative tells you how the function changes while all of its other arguments are held fixed; so, if the function only depends on a variable through the other variables, then since those variables are held fixed during a partial derivative, the partial derivative is zero. Hence,

[tex]\frac{\partial F(x(\varepsilon))}{\partial \varepsilon} = 0,[/tex]
but
[tex]\frac{d F(x(\varepsilon))}{d\varepsilon} = \frac{\partial F(x)}{\partial x} \frac{dx}{d\varepsilon}[/tex]

The total derivative [itex]dF(x(\varepsilon))/d\varepsilon[/itex] tells you how [itex]F(x(\varepsilon))[/itex] changes with epsilon overall. If your function were [itex]G(\varepsilon,x(\varepsilon))[/itex], then

[tex]\frac{\partial G(\varepsilon,x(\varepsilon))}{\partial \varepsilon} \neq 0,[/tex]
and
[tex]\frac{d G(\varepsilon,x(\varepsilon))}{d\varepsilon} = \frac{\partial G(\varepsilon,x)}{\partial \varepsilon} + \frac{\partial G(\varepsilon,x)}{\partial x} \frac{dx}{d\varepsilon}[/tex]

This makes sense to me. But, I have trouble reconciling that explanation with the partial derivative chain rule. An example is at the top of the pdf linked here. It seems that [itex]w[/itex] only explicitly depends on [itex]x[/itex] and [itex]y[/itex], but the partial derivative w.r.t [itex]t[/itex] is non-zero. I really appreciate you taking your time to help me... I think if you can help me understand why the chain rule example in the pdf is non-zero I will be good to go.

Will

Mute · Jul 1, 2012

Whenry said:

This makes sense to me. But, I have trouble reconciling that explanation with the partial derivative chain rule. An example is at the top of the pdf linked here. It seems that [itex]w[/itex] only explicitly depends on [itex]x[/itex] and [itex]y[/itex], but the partial derivative w.r.t [itex]t[/itex] is non-zero. I really appreciate you taking your time to help me... I think if you can help me understand why the chain rule example in the pdf is non-zero I will be good to go.

Will

It appears that there is simply an unfortunate ambiguity in notation in the pdf that you linked. The issue is that in that pdf, the function w = f depends explicitly on variables x₁ to x_m. Each of those variables, however, depends on n parameters, t₁ to t_n. So, what the author is actually doing is some sort of "total partial derivative", where only one of the t parameters, say t_i, is being varied. Because all of the other t parameters are held fixed, he is using the partial derivative notation, but it is still a total derivative.

Note that in equation 2 of the pdf you linked to, no [itex]\partial w/\partial t_i[/itex] term appears on the right hand side. This is because the function w = f(x₁,...,x_m) does not depend explicitly on t_i. (There also appears to be a typo in that formula. The last term in Eq 2 should be [itex](\partial w/\partial x_m) (\partial x_m/\partial t_i)[/itex].)

Whenry · Jul 1, 2012

Mute said:

It appears that there is simply an unfortunate ambiguity in notation in the pdf that you linked. The issue is that in that pdf, the function w = f depends explicitly on variables x₁ to x_m. Each of those variables, however, depends on n parameters, t₁ to t_n. So, what the author is actually doing is some sort of "total partial derivative", where only one of the t parameters, say t_i, is being varied. Because all of the other t parameters are held fixed, he is using the partial derivative notation, but it is still a total derivative.

Note that in equation 2 of the pdf you linked to, no [itex]\partial w/\partial t_i[/itex] term appears on the right hand side. This is because the function w = f(x₁,...,x_m) does not depend explicitly on t_i. (There also appears to be a typo in that formula. The last term in Eq 2 should be [itex](\partial w/\partial x_m) (\partial x_m/\partial t_i)[/itex].)

OK, one last example, which I found in an MIT video. (The video is short and this example is done in the first five minutes; it is here.) They also try to relate it to a total derivative.

The example is:
Suppose [itex]Z = x^2 + y^2[/itex], [itex]x = u^2 - v^2[/itex], and [itex]y = uv[/itex].
Then, [tex]\frac{\partial Z}{\partial u} = \frac{\partial Z}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial Z}{\partial y}\frac{\partial y}{\partial u}[/tex]

[tex]\frac{\partial Z}{\partial u} = 4ux + 2vy[/tex]

You're telling me this is not really a partial derivative but a total partial derivative?

But, in the derivation of the E-L equation, the partial [itex]\frac{\partial F_\epsilon}{\partial \epsilon}[/itex] is zero because [itex]F_\epsilon(x,g_\epsilon(x),g'_\epsilon(x))[/itex] does not explicitly depend on [itex]\epsilon[/itex]?

Mute · Jul 1, 2012

Whenry said:

OK, one last example, which I found in an MIT video. (The video is short and this example is done in the first five minutes; it is here.) They also try to relate it to a total derivative.

The example is:
Suppose [itex]Z = x^2 + y^2[/itex], [itex]x = u^2 - v^2[/itex], and [itex]y = uv[/itex].
Then, [tex]\frac{\partial Z}{\partial u} = \frac{\partial Z}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial Z}{\partial y}\frac{\partial y}{\partial u}[/tex]

[tex]\frac{\partial Z}{\partial u} = 4ux + 2vy[/tex]

You're telling me this is not really a partial derivative but a total partial derivative?

But, in the derivation of the E-L equation, the partial [itex]\frac{\partial F_\epsilon}{\partial \epsilon}[/itex] is zero because [itex]F_\epsilon(x,g_\epsilon(x),g'_\epsilon(x))[/itex] does not explicitly depend on [itex]\epsilon[/itex]?

I'm not exactly sure what the best name for it is. It is a total derivative, but it is a multivariate total derivative because the variables x and y are parametrized in terms of more than one auxiliary variable, so in that sense it is a partial derivative too.

The important thing to note is that the function z depends on the parameters u and v only through the variables x and y, which are taken to be functions of u and v. This example is the exact same notion as the pdf you linked to. The function z does not depend explicitly on u or v: otherwise there would be some sort of "[itex]\partial z/\partial u[/itex]" term on the right hand side of the expression that the presenter in the video writes down, which is different from the "[itex]\partial z/\partial u[/itex]" on the left hand side.
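To make this concrete for the video's example, one can substitute x and y into z and differentiate directly. Write [itex]\tilde{z}(u,v)[/itex] for the composite function; the tilde is notation introduced here only to distinguish it from z(x,y), and it is not used in the video. Then

[tex]\tilde{z}(u,v) = (u^2 - v^2)^2 + u^2 v^2, \qquad \frac{\partial \tilde{z}}{\partial u} = 4u(u^2 - v^2) + 2uv^2 = 4u^3 - 2uv^2.[/tex]

The chain rule gives the same result:

[tex]\frac{\partial z}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial u} = 2x \cdot 2u + 2y \cdot v = 4u(u^2 - v^2) + 2uv^2.[/tex]

So the quantity on the left hand side in the video is really [itex]\partial \tilde{z}/\partial u[/itex], the partial derivative of the composite function, with v held fixed.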

It's unfortunate that there is not a better or more commonly used notation to deal with this ambiguity. The left hand side [itex]\partial z/\partial u[/itex] is really some sort of multivariate total derivative, not a pure partial derivative. This situation doesn't seem to come up much, otherwise there would probably be a better notation for it.

Note that in the Euler-Lagrange example there is only one parameter, so the meaning of the partial derivative in that context is clear, and the F there does not explicitly depend on epsilon so that partial derivative is zero.

If you are still unsure of that result, you could try to Taylor expand

[tex]F(x,g(x) + \epsilon \eta(x), g'(x) + \epsilon \eta'(x))[/tex]

to lowest order in [itex]\epsilon[/itex] and see what you get. (If you are confused about how to do that with epsilon appearing in two arguments, you could start with having two epsilons, [itex]\epsilon_1[/itex] and [itex]\epsilon_2[/itex], and expand in each argument separately and set [itex]\epsilon_1 = \epsilon_2[/itex] after the expansion).
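For reference, here is a sketch of how that expansion goes, assuming F is smooth enough to Taylor expand in its second and third arguments, and with all partial derivatives of F evaluated at [itex](x, g(x), g'(x))[/itex]:

[tex]F(x, g + \epsilon_1 \eta, g' + \epsilon_2 \eta') = F(x,g,g') + \epsilon_1 \eta \frac{\partial F}{\partial g} + \epsilon_2 \eta' \frac{\partial F}{\partial g'} + (\text{terms of second order in } \epsilon_1, \epsilon_2)[/tex]

Setting [itex]\epsilon_1 = \epsilon_2 = \epsilon[/itex] and differentiating at [itex]\epsilon = 0[/itex] gives

[tex]\left.\frac{dF_\epsilon}{d\epsilon}\right|_{\epsilon = 0} = \eta \frac{\partial F}{\partial g} + \eta' \frac{\partial F}{\partial g'}[/tex]

No separate [itex]\partial F/\partial \epsilon[/itex] term shows up, because [itex]\epsilon[/itex] enters only through the second and third arguments of F.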

Whenry · Jul 1, 2012

Mute said:

I'm not exactly sure what the best name for it is. It is a total derivative, but it is a multivariate total derivative because the variables x and y are parametrized in terms of more than one auxiliary variable, so in that sense it is a partial derivative too.

The important thing to note is that the function z depends on the parameters u and v only through the variables x and y, which are taken to be functions of u and v. This example is the exact same notion as the pdf you linked to. The function z does not depend explicitly on u or v: otherwise there would be some sort of "[itex]\partial z/\partial u[/itex]" term on the right hand side of the expression that the presenter in the video writes down, which is different from the "[itex]\partial z/\partial u[/itex]" on the left hand side.

It's unfortunate that there is not a better or more commonly used notation to deal with this ambiguity. The left hand side [itex]\partial z/\partial u[/itex] is really some sort of multivariate total derivative, not a pure partial derivative. This situation doesn't seem to come up much, otherwise there would probably be a better notation for it.

Note that in the Euler-Lagrange example there is only one parameter, so the meaning of the partial derivative in that context is clear, and the F there does not explicitly depend on epsilon so that partial derivative is zero.

If you are still unsure of that result, you could try to Taylor expand

[tex]F(x,g(x) + \epsilon \eta(x), g'(x) + \epsilon \eta'(x))[/tex]

to lowest order in [itex]\epsilon[/itex] and see what you get. (If you are confused about how to do that with epsilon appearing in two arguments, you could start with having two epsilons, [itex]\epsilon_1[/itex] and [itex]\epsilon_2[/itex], and expand in each argument separately and set [itex]\epsilon_1 = \epsilon_2[/itex] after the expansion).

Thank you, I had thought I had posted a response but I do not see it here...

Anyway, I think my friend helped me see the difference. He offered a much more robust definition of the chain rule, more than I would type out in LaTeX here. However, I can say the following. The "chain rule" of partial derivatives, which seems not to be a partial derivative of the original function at all, is more accurately described in the following way.

Suppose [itex]f(x,y) = x^2 + y^2[/itex] and [itex]x = 2t,\,\,y=t^2[/itex]. Let the mapping [itex]t \mapsto (x,y)[/itex] be the function [itex]T: \mathbb{R} \to \mathbb{R}^2[/itex]. Then the chain rule is really finding [itex]\frac{\partial (f \circ T)}{\partial t}[/itex]. This is different from [itex]\frac{\partial f}{\partial t}[/itex]. Possibly this helps explain the difference.
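Working this example through (a quick check using only the f and T defined above), the composite is a function of t alone:

[tex](f \circ T)(t) = (2t)^2 + (t^2)^2 = 4t^2 + t^4, \qquad \frac{d (f \circ T)}{dt} = 8t + 4t^3.[/tex]

The chain rule gives the same thing:

[tex]\frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} = 2x \cdot 2 + 2y \cdot 2t = 8t + 4t^3.[/tex]

By contrast, f itself takes only x and y as arguments, so there is no explicit t for [itex]\frac{\partial f}{\partial t}[/itex] to act on.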
