# Simple Derivation of Euler-Lagrange Equations

Homework Helper
Gold Member
I'm trying to deduce the equations of motion in the form

$$\frac{d}{dt} \frac{\partial \cal L}{\partial \dot{q}} - \frac{\partial \cal L}{\partial q} = 0$$

with little algebra, directly from Hamilton's principle, like the geometric derivation of Snell's law from the principle of least time. It should be possible, since the equations are simple enough to write down. I've tried for about an hour with little luck. Does anyone know how to do it?

## Answers and Replies

Ben Niehoff
Gold Member
First try letting $q_i$ be the usual Cartesian coordinates, $(x, y, z)$. Then you should be able to recover

$$F_x = m \ddot x,$$

etc. Once you have that, show that the form of the Euler-Lagrange equation is invariant under coordinate transformations of the form

$$s_i = f_i(q_1, q_2, \dots q_n)$$

where $s_i$ are the new coordinates and $q_i$ are the old.
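That last step can be sketched compactly. Writing the inverse transformation as $q_j = q_j(s_1, \dots, s_n)$, two chain-rule identities do all the work (this is the standard computation, written out here for reference):

```latex
% From \dot q_j = \sum_i \frac{\partial q_j}{\partial s_i} \dot s_i we get
\frac{\partial \dot q_j}{\partial \dot s_i} = \frac{\partial q_j}{\partial s_i},
\qquad
\frac{d}{dt} \frac{\partial q_j}{\partial s_i} = \frac{\partial \dot q_j}{\partial s_i}.
% Substituting both into the chain rule for \partial \mathcal L / \partial s_i
% and \partial \mathcal L / \partial \dot s_i, and subtracting:
\frac{d}{dt} \frac{\partial \mathcal L}{\partial \dot s_i}
  - \frac{\partial \mathcal L}{\partial s_i}
= \sum_j \left( \frac{d}{dt} \frac{\partial \mathcal L}{\partial \dot q_j}
  - \frac{\partial \mathcal L}{\partial q_j} \right)
  \frac{\partial q_j}{\partial s_i}
```

So the E-L expression in the new coordinates is a linear combination of the E-L expressions in the old ones, and it vanishes whenever they do.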

Homework Helper
Gold Member
Actually, I was looking for something more deductive, i.e. starting out without already knowing the form of the E-L equations.

Ben Niehoff
Gold Member
Wait, I think I read your post wrong. My post above describes how to prove that Lagrange's and Newton's methods are equivalent. What you want to do is derive the Euler-Lagrange equation from the principle of least action, right?

It's easiest to consider a more general problem (because the Euler-Lagrange equations are, in fact, more general: they can be used to find stationary points of any integral of this form). You have some integral J expressed by

$$J = \int_A^B F(y, \dot y; x) \, dx$$

for some function F (the integrand), a function y, and independent variable x. Your goal is to find a function y that minimizes (or maximizes, or in general gives a stationary value for) the integral J on the interval [A, B], given F. So here is what you do:

Denote the "correct" function as $y(x)$. Consider some functions $y(x, \alpha)$ which are "close to" y(x) in the sense that

$$y(x, \alpha) = y(x) + \alpha \eta(x)$$

for some (arbitrary) function $\eta(x)$, chosen such that $\eta(A) = \eta(B) = 0$. This way, $y(x, \alpha)$ equals $y(x)$ at the endpoints of [A, B], and they differ only in the middle.

Now, we can express J in terms of $\alpha$ as

$$J(\alpha) = \int_A^B F\left(y(x, \alpha), \frac{d y(x, \alpha)}{dx}; x\right) \, dx$$

Note the total derivative on y; this is important. What we see is that a stationary point (with respect to variations in y) occurs when

$$\frac{dJ}{d \alpha} = 0$$

So, carry out the differentiation. What's left is a little algebra and a little cleverness, but you should be able to arrive at the Euler-Lagrange equations:

$$\frac{d}{dx} \frac{\partial F}{\partial \dot y} - \frac{\partial F}{\partial y} = 0$$

where $\dot y = dy / dx$ is treated as an independent variable.
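For reference, the "little algebra" can be written out as follows: differentiate under the integral sign, use $\partial y / \partial \alpha = \eta$ and $\partial \dot y / \partial \alpha = \dot \eta$, and integrate the second term by parts (the boundary term dies because $\eta(A) = \eta(B) = 0$):

```latex
\frac{dJ}{d\alpha}
= \int_A^B \left( \frac{\partial F}{\partial y}\,\eta
   + \frac{\partial F}{\partial \dot y}\,\dot\eta \right) dx
= \underbrace{\left[ \frac{\partial F}{\partial \dot y}\,\eta \right]_A^B}_{=\,0}
   + \int_A^B \left( \frac{\partial F}{\partial y}
   - \frac{d}{dx} \frac{\partial F}{\partial \dot y} \right) \eta(x)\, dx
```

Since $\eta$ is arbitrary, the integral can vanish for every choice of $\eta$ only if the bracket vanishes identically, which is the Euler-Lagrange equation.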

Look up "calculus of variations" if you need more help. :)

Homework Helper
Gold Member
I know the usual derivation using the calculus of variations! I was looking for a simpler way to see it without using the calculus of variations, possibly geometric.

Ben Niehoff
Gold Member
Oh, why didn't you say so in the first place? I wrote all that out for nothin'... :P

My strategy would be to go to one dimension first. From Newton's laws, you know

$$F = m\ddot x$$

$$T = \frac12 m \dot x^2$$

If you consider that F is some function of time and space, F(x,t), then you can get U(x,t) by first defining an arbitrary zero point. It might be easiest to consider a force field which is constant in time, which gives

$$U(x) = - \int_{\mathcal O}^x F(x') \, dx'$$

So then the Lagrangian is

$$\mathcal L = T - U = \frac12 m\dot x^2 + \int_{\mathcal O}^x F(x') \, dx'$$

Then we find

$$\frac{\partial \mathcal L}{\partial \dot x} = m \dot x$$

and

$$\frac{\partial \mathcal L}{\partial x} = F(x)$$

which lead directly to

$$\frac{d}{dt} \frac{\partial \mathcal L}{\partial \dot x} - \frac{\partial \mathcal L}{\partial x} = m \ddot x - F(x) = 0$$

This works for the static case, where F does not change in time. You can probably figure out how to prove it in the general case.
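The 1D computation above is easy to check symbolically. A minimal sketch using sympy, assuming (my choice of example, not from the thread) a harmonic-oscillator potential $U(x) = kx^2/2$, i.e. $F(x) = -kx$:

```python
import sympy as sp

# Symbols and the trajectory x(t)
t = sp.symbols('t')
m, k = sp.symbols('m k', positive=True)
x = sp.Function('x')(t)

# Lagrangian L = T - U for the harmonic oscillator
L = sp.Rational(1, 2) * m * sp.diff(x, t)**2 - sp.Rational(1, 2) * k * x**2

# Euler-Lagrange expression: d/dt (dL/d(xdot)) - dL/dx
el = sp.diff(sp.diff(L, sp.diff(x, t)), t) - sp.diff(L, x)

print(sp.simplify(el))  # m x'' + k x, i.e. Newton's law m x'' = -k x = F(x)
```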

Once you get it for Cartesian coordinates, it's easy to show that the form of the EL equations is invariant under a change of coordinates.

Ben Niehoff
Gold Member
Of course, this is a deduction from Newton's laws, and the definitions of potential and kinetic energy. How to get it from the least action principle, without calculus of variations, I don't know.

But the least action principle is pretty ad hoc anyway. Newton's laws are grounded in experiment.

I don't think you're going to get them from the principle of least action without using COV. The only other way I've seen is via the Feynman path integral, taking the saddle-point approximation, but that is essentially imposing $$\frac{\delta S}{\delta x} = 0$$ anyway.

Oh, and another way you can derive equations of motion is to use the theory of canonical transformations and derive the Hamilton-Jacobi equation, which is equivalent to the E-L equations. There are two ways of deriving it: one from COV and one from canonical transformations, as I said. However, I suspect that the E-L equations are used implicitly at some point.

Do you have a prejudice against COV?

Homework Helper
Gold Member
No, I don't have any prejudice against COV :) I just like having simpler ways to see things, which can help my intuition. Do you know the geometric derivation of Snell's law from the principle of least time? I was looking for something of that sort. I think one should be able to do something similar by looking at a differential element of the path and applying the stationary-action condition to it.

I'm not sure about Snell's law. The law of reflection can be done purely geometrically, but doesn't Snell's law require a single minimization? If you could point me at a reference I could think about the E-L case.

Homework Helper
Gold Member
Snell's law can be done purely geometrically too. Just draw the path from point A in the first medium to point B in the second medium, making angles $$\theta_{1}$$ and $$\theta_{2}$$ respectively with the normal. Now move the point at which the path crosses the interface by a small amount $$\Delta x$$. You will see that if the length of one segment increases, the other decreases. From the geometry it's easy to see that the first-order changes in length are $$\Delta x \sin \theta_{1}$$ and $$\Delta x \sin \theta_{2}$$ (with opposite signs). The changes in travel time are these changes in path length divided by the speed in each medium, i.e. multiplied by the respective refractive index (over c). So for the total first-order change in time to be zero, we need

$$n_{1}\Delta x \sin \theta_{1} = n_{2}\Delta x \sin \theta_{2}.$$

Canceling the $$\Delta x$$s, we get Snell's law.
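The same stationarity condition is easy to check numerically. A sketch (the geometry and numbers are my own assumptions): place A at height a above the interface, B at depth b below it, horizontal separation d, and minimize the travel time over the crossing point x:

```python
import math

# Hypothetical geometry (my own choice of numbers, not from the thread):
# source A at height a above the interface (y = 0), receiver B at depth b
# below it, horizontal separation d; refractive indices n1, n2; units c = 1.
a, b, d = 1.0, 1.0, 2.0
n1, n2 = 1.0, 1.5

def travel_time(x):
    # time along the piecewise-linear path A -> (x, 0) -> B
    return n1 * math.hypot(x, a) + n2 * math.hypot(d - x, b)

# golden-section search for the minimizing crossing point on [0, d]
lo, hi = 0.0, d
phi = (math.sqrt(5) - 1) / 2
while hi - lo > 1e-12:
    m1 = hi - phi * (hi - lo)
    m2 = lo + phi * (hi - lo)
    if travel_time(m1) < travel_time(m2):
        hi = m2
    else:
        lo = m1
x = (lo + hi) / 2

# at the stationary point, n1 sin(theta1) = n2 sin(theta2)
sin1 = x / math.hypot(x, a)
sin2 = (d - x) / math.hypot(d - x, b)
print(n1 * sin1, n2 * sin2)  # the two sides of Snell's law agree
```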

Homework Helper
Gold Member
I don't have the books with me right now, but I remember this derivation is given in Feynman's lectures vol.1, and also in Mach's "Science of Mechanics".

Ben Niehoff
Gold Member
Snell's law can be done purely geometrically too. Just draw the path from point A in the first medium to point B in the second medium, making angles $$\theta_{1}$$ and $$\theta_{2}$$ respectively with the normal. Now move the point at which the path crosses the interface by a small amount $$\Delta x$$. You will see that if the length of one segment increases, the other decreases. From the geometry it's easy to see that the first-order changes in length are $$\Delta x \sin \theta_{1}$$ and $$\Delta x \sin \theta_{2}$$ (with opposite signs). The changes in travel time are these changes in path length divided by the speed in each medium, i.e. multiplied by the respective refractive index (over c). So for the total first-order change in time to be zero, we need

$$n_{1}\Delta x \sin \theta_{1} = n_{2}\Delta x \sin \theta_{2}.$$

Canceling the $$\Delta x$$s, we get Snell's law.
But that's just doing Calculus of Variations. It just so happens that the only functions you need to consider are piecewise linear. Varying the "point at which the ray crosses the boundary" is the same as varying $\alpha$ in $y(x, \alpha) = y(x) + \alpha \eta(x)$. It just so happens that you can visualize this variation, because the path is a physical path in 3-dimensional space; whereas, in the Lagrangian formulation of mechanics, the paths being considered are paths in N-dimensional configuration space.

The only purely geometrical way of deriving Snell's Law that I know is by means of wave optics. Consider a plane wave incident on medium 2 at an angle $\theta_1$ moving at velocity $c / n_1$. In medium 2, it moves with a velocity $c / n_2$. Using Huygens' Principle, one can see that the wave fronts must emerge at an angle $\theta_2$ given by Snell's Law.

The idea of "shortest time" doesn't enter into it; and in fact, the entire concept of minimizing time is suspect in the first place. After all, why should the light know where it is going? The geometrical argument shows that refraction is an entirely local effect, and the math just happens to work out such that the path of each ray of light minimizes time along itself.

In a similar vein, my derivation of the EL equations from Newton's laws shows that they are an entirely local effect, and the math "just works out" so that the path of the system through configuration space minimizes some quantity known as "action".

P.S.: You can get pretty-looking sines, cosines, logarithms, etc., in Latex by preceding them with a backslash such as "\sin \theta_1": $\sin \theta_1$.

I think the E-L equations are secretly buried in Huygens' principle, which is essentially a Green's function solution of the wave equation.

Homework Helper
Gold Member
But that's just doing Calculus of Variations. It just so happens that the only functions you need to consider are piecewise linear. Varying the "point at which the ray crosses the boundary" is the same as varying $\alpha$ in $y(x, \alpha) = y(x) + \alpha \eta(x)$. It just so happens that you can visualize this variation, because the path is a physical path in 3-dimensional space; whereas, in the Lagrangian formulation of mechanics, the paths being considered are paths in N-dimensional configuration space.
Of course we have to do some kind of variation, because it's a variational principle (which should be called the principle of stationary time to emphasize that it is not really a teleological thing). But in this case we were able to see what the variation was geometrically. And in the case of E-L, we don't have to consider all N dimensions. We can derive the differential equation for each of the $$q_i(t)$$ independently. Each of the qs can be plotted vs time on the plane.

The idea of "shortest time" doesn't enter into it; and in fact, the entire concept of minimizing time is suspect in the first place. After all, why should the light know where it is going? The geometrical argument shows that refraction is an entirely local effect, and the math just happens to work out such that the path of each ray of light minimizes time along itself.

In a similar vein, my derivation of the EL equations from Newton's laws shows that they are an entirely local effect, and the math "just works out" so that the path of the system through configuration space minimizes some quantity known as "action".
Actually it's not true that the action is minimized. It is well known that the action is only stationary and that it is a local effect. And since it is a local effect, it should be possible to derive the E-L equations by considering only a small element of the path, instead of using the calculus of variations on the entire path.
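That local idea can at least be tested numerically. A sketch (my own construction, not from the thread): discretize the action, vary a single interior point, and check that the exact trajectory makes that one local variation vanish to leading order.

```python
import math

# Discretize S = sum_j [ m/2 ((x_{j+1} - x_j)/dt)^2 - U(x_j) ] dt for the
# harmonic oscillator U(x) = k x^2 / 2, and vary a SINGLE interior point x_i.
# With m = k = 1, the exact trajectory x(t) = cos(t) solves m x'' = -k x.
m, k = 1.0, 1.0
dt = 1e-3

def dS_dxi(x_prev, x_i, x_next):
    # derivative of the discretized action with respect to the single point x_i;
    # only the two adjacent kinetic terms and one potential term involve x_i
    return m * (2 * x_i - x_prev - x_next) / dt - k * x_i * dt

t = 0.7
residual = dS_dxi(math.cos(t - dt), math.cos(t), math.cos(t + dt)) / dt

# residual = -(m x'' + k x) + O(dt^2), so it is O(dt^2) on the true trajectory
print(abs(residual))  # small: the local variation vanishes on the true path
```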

Even in the case of light, it's not true that the time is minimum. It is only stationary.

Thanks for the tex tip :)

Ben Niehoff
Gold Member
Whether it is stationary or minimal is immaterial to the point I was making.

And besides, I know that. "Least action principle" is what it's called, by convention, and it's easier to say than "stationary action principle".

So, carry out the differentiation. What's left is a little algebra and a little cleverness, but you should be able to arrive at the Euler-Lagrange equations:

$$\frac{d}{dx} \frac{\partial F}{\partial \dot y} - \frac{\partial F}{\partial y} = 0$$

where $\dot y = dy / dx$ is treated as an independent variable.
Why is it that one can take y to be independent of its derivative, dy/dx ?

Someone must know this! :-) Every book I've seen just says that they are independent, but offers no explanation.

For a functional like F[x, y(x), y'(x)], it's F that depends on x, y(x), and y'(x). These can be regarded as independent variables, since we can define F in many different ways using all or some of them. For example, F = x + y'/y defines a particular F and therefore a particular class of "curves" (or functions) over some range of x and some choice of the function y. The functional F = x(y + y') defines an entirely different class of curves, even for identical choices of x and y. Both instances of F depend on the way the variables x, y(x), and y'(x) are used in F's definition, and so in that sense those are the independent variables of F.

[At least that's the way I've come to see it. If I'm wrong maybe someone can straighten us both out]

My last post sorta sux. For a functional F = F(x, y, y'), where y = y(x) and y' = dy/dx, F "totally" depends on x and "partially" depends on y and y'. That is, we can find the total derivative of F with respect to x, dF/dx. However, we can also find the partial derivative of F with respect to y or y'. The nature of functionals like F is a matter of the mathematics of "function spaces". The function space for F(x, y, y') is the set of all functions that have a continuous first derivative; the function space for F(x, y, y', y'') is the set of all functions having continuous second derivatives, and so on. Understanding functionals requires a little knowledge of the math of function spaces, normed linear spaces, and so on. An important part of this is how "distance" is defined between functions, and therefore what it means to add an infinitesimal increment to a function.

A very readable treatment of this is in the first chapter of the very inexpensive Dover publication, Calculus of Variations, I.M. Gelfand and S.V. Fomin. See https://www.amazon.com/dp/B001G4RG0G/?tag=pfamazon01-20

Ah, now it's starting to make a little sense. I will try and get a hold of that book.

Here's a fairly common and informative example - consider the following functional J[y]:

First, let

$$y = y(x)$$

$$y' = \frac{{dy}}{{dx}}$$

Then define F:
$$F(x,y,y') = \sqrt{1 + y'^2}$$

And then define J:
$$J[y] = \int_a^b F(x,y,y')\, dx = \int_a^b \sqrt{1 + y'^2}\; dx$$

First off, F is defined for all (and only) those functions y = y(x) that have continuous first derivatives (we'll call them "valid functions") over the closed interval from x = a to x = b. Obviously, the same goes for J[y]. Geometrically speaking, for any valid function y = y(x), J[y] represents the length of that function's curve between the two points P1 = [a, y(a)] and P2 = [b, y(b)]. Notice that y = y(x) on the closed interval from x = a to x = b isn't just one function: it's an independent variable, and it represents an infinity of possible valid functions or curves. Also, for this example, P1 and P2 are more than likely NOT the same points for different y functions. A particular value for J would require us to specify values for a and b, plus we'd need to define a particular function y = y(x), making sure we've picked a "valid function". So you can see how the function y acts as an independent variable here for both F and J. And even though y' isn't specified directly and is always dependent on our choice of y, it is still a variable in the sense that F depends on it directly.
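Applying the Euler-Lagrange equation to this arc-length example confirms the geometric expectation (a standard computation): since $\partial F / \partial y = 0$,

```latex
\frac{d}{dx} \frac{\partial F}{\partial y'}
= \frac{d}{dx} \left( \frac{y'}{\sqrt{1 + y'^2}} \right) = 0
\quad\Longrightarrow\quad
\frac{y'}{\sqrt{1 + y'^2}} = \text{const}
\quad\Longrightarrow\quad
y' = \text{const}
```

so the extremals are straight lines $y = c_1 x + c_2$, matching the geometric meaning of J[y] as curve length.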
