Woes With the Principle of Least Action.

The discussion focuses on understanding the principle of least action and its application in calculating trajectories. The Lagrangian is defined as the difference between kinetic and potential energy, and the action is the integral of the Lagrangian over time. The user seeks clarification on why traditional maxima and minima methods are insufficient for this problem; the key point is that the action is a functional, so minimizing it yields an entire function rather than a single numeric value. The calculus of variations is introduced as the mathematical technique needed to handle this, allowing functionals to be minimized through variations of functions. The conversation concludes with a reference to Feynman's lectures as a valuable resource for further understanding.
Jilvin
I am now attempting to figure out how to calculate trajectories using the highly coveted "principle of least action". I apologize beforehand if this is more of a mathematics problem than one that belongs under classical mechanics. I also apologize if I can't get the LaTeX quite right the first time around. I want to go over what I know so far so I can receive corrections for any conceptual or silly mistakes I have made along the way.

So here's basically what I was taught. The Lagrangian (L) is the difference between the kinetic and potential energy:

L = \mathrm{K.E.} - \mathrm{P.E.}
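
For example (a standard case, not spelled out in the original post), for a single particle of mass m moving in one dimension in a potential V(x), the Lagrangian is

L = \frac{1}{2} m \dot{x}^2 - V(x)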

The action (denoted S) is defined as the integral of the Lagrangian over time:

S = \int_{t_0}^{t_1} L\, dt
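
To make the definition concrete, here is a minimal numerical sketch (my own illustration, not from the thread): it approximates the action of a sampled path x(t) using finite differences and the trapezoid rule. The function name, the default potential, and the free-particle test are all hypothetical choices.

import numpy as np

# Approximate S = integral of L dt for a path x sampled at times t.
# m and V are the particle's mass and potential energy function (assumed).
def action(x, t, m=1.0, V=lambda x: 0.0 * x):
    v = np.gradient(x, t)               # velocity dx/dt by finite differences
    L = 0.5 * m * v**2 - V(x)           # Lagrangian: kinetic minus potential
    return np.sum(0.5 * (L[1:] + L[:-1]) * np.diff(t))  # trapezoid rule

t = np.linspace(0.0, 1.0, 1001)
print(action(t, t))  # free particle at unit speed: L = 1/2, so S = 1/2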

The problem I am having is being able to distinguish why the calculus of variations must be used rather than simple maxima and minima from calculus 1.

So, here's the point of what I need: can somebody explain to me the following things:

1. Why normal maxima and minima cannot solve this type of problem.
2. What exactly is the calculus of variations and how does it solve this type of problem.
 
Good questions!

1) Here's a way to think about it: remember that a function is basically a rule that transforms one number (x) into another number (y). When you minimize a function, what you get is a numeric value of x. The action, however, is what we call a functional - it's a rule that transforms an entire function (x(t)) into a number (S). So when you minimize a functional, you get not just a number, but an entire function that you can plug back into the functional. The principle of least action states that you need to find the function x(t) that makes the action a minimum, which is a lot more involved than just finding one number.
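
If it helps, the distinction is easy to state in code. A rough sketch (my own, with made-up names): a function eats a number, while a functional eats a whole function.

# A function maps a number to a number:
def f(x):
    return (x - 3.0)**2

# A functional maps an entire function to a number, e.g. by summing
# its squared values over some sample times in [0, 1]:
def S(path):
    ts = [i / 100.0 for i in range(101)]
    return sum(path(t)**2 for t in ts) / len(ts)

print(f(2.0))                # minimizing f means finding one number x
print(S(lambda t: t - 0.5))  # minimizing S means finding a whole function x(t)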

2) The calculus of variations is just the mathematical technique that allows you to solve this problem. Think about the following way of doing a regular minimization problem:

You know that if a regular function has a minimum, then at that minimum point it has a slope of zero. What this means is that if you move a teeny tiny bit away from the minimum, you should be able to ignore the first-order change in the value of the function. Mathematically, at a minimum, the following condition should be true:
f(x + \epsilon) = f(x) + \mathcal{O}(\epsilon^2)
If you compare this to the first order Taylor expansion of f,
f(x + \epsilon) = f(x) + \frac{d f}{d x} \epsilon + \mathcal{O}(\epsilon^2)
you'll see that the condition is equivalent to setting df/dx equal to zero, which is good, because if it weren't, you'd know that this method doesn't work ;-)
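
A concrete example of this (my own addition): take f(x) = x^2. Then

f(x + \epsilon) = x^2 + 2x\epsilon + \epsilon^2

At x = 0 the first-order term 2x\epsilon vanishes, so f(0 + \epsilon) = f(0) + \mathcal{O}(\epsilon^2) and the condition is satisfied; at any other x the \mathcal{O}(\epsilon) term survives, so that point is not an extremum.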

Now imagine generalizing this to minimize a functional, like the action. As I said above, a functional is just a rule that takes a function as input and produces a number. So let's make these changes: f becomes the functional S, x the number becomes x(t) the function, and logically the slight change \epsilon must also become a function, let's say \delta x(t).
S[x(t) + \delta x(t)] = S[x(t)] + \mathcal{O}(\delta x(t)^2) (**)
We need to impose the condition that \delta x(t) is zero at the endpoints of the integral, because those are fixed by the problem or physical situation - you're told "a particle starts at x=(0,0,0) at t=0" or some such thing.
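In symbols, this condition on the variation is

\delta x(t_0) = \delta x(t_1) = 0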

Now what to do? We need to figure out a way to apply Taylor expansion to a functional. The functional, of course, can be written
S[x(t) + \delta x(t)] = \int_{t_0}^{t_1} L(x + \delta x(t), \dot{x} + \delta\dot{x}(t), t) \mathrm{d}t
Here's a trick you can use: rewrite the tiny change as \epsilon \delta x(t), a tiny number \epsilon times a normal finite function \delta x(t).
S[x(t) + \epsilon\delta x(t)] = \int_{t_0}^{t_1} L(x + \epsilon\delta x(t), \dot{x} + \epsilon\delta\dot{x}(t), t) \mathrm{d}t
This way, the tiny parameter will be a plain old number and you can use regular Taylor expansion on the function L.
S[x(t) + \epsilon\delta x(t)] = \int_{t_0}^{t_1} \left(L(x, \dot{x}, t) + \epsilon \delta x(t) \frac{\partial L}{\partial x} + \epsilon \delta \dot{x}(t) \frac{\partial L}{\partial \dot{x}} + \mathcal{O}(\epsilon^2)\right) \mathrm{d}t
Now if you integrate the third term in the integral by parts, you get
S[x(t) + \epsilon\delta x(t)] = \left.\epsilon \delta x(t) \frac{\partial L}{\partial \dot{x}}\right|_{t_0}^{t_1} + \int_{t_0}^{t_1} \left(L(x, \dot{x}, t) + \epsilon \delta x(t) \frac{\partial L}{\partial x} - \epsilon \delta x(t) \frac{\mathrm{d}}{\mathrm{d}t}\frac{\partial L}{\partial \dot{x}} + \mathcal{O}(\epsilon^2)\right) \mathrm{d}t
This is where the condition that \delta x(t) = 0 at the endpoints comes in handy: that boundary term that appears in front of the integral is just equal to zero.
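
For reference (spelling out the step above), the integration by parts used here is

\int_{t_0}^{t_1} \epsilon\,\delta\dot{x}(t) \frac{\partial L}{\partial \dot{x}} \,\mathrm{d}t = \left.\epsilon\,\delta x(t) \frac{\partial L}{\partial \dot{x}}\right|_{t_0}^{t_1} - \int_{t_0}^{t_1} \epsilon\,\delta x(t) \frac{\mathrm{d}}{\mathrm{d}t}\frac{\partial L}{\partial \dot{x}} \,\mathrm{d}t

which is just the product rule \frac{\mathrm{d}}{\mathrm{d}t}(uv) = \dot{u}v + u\dot{v} integrated over [t_0, t_1], with u = \delta x(t) and v = \partial L / \partial \dot{x}.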

Anyway, now compare this to the original condition (**) that I said was necessary for the functional to be extremized. You'll notice that the only first-order difference is the two terms
\epsilon \delta x(t) \frac{\partial L}{\partial x} - \epsilon \delta x(t) \frac{\mathrm{d}}{\mathrm{d}t}\frac{\partial L}{\partial \dot{x}}
So just like with ordinary minimization, the first-order part has to vanish. These two terms share a common factor of \epsilon \delta x(t), and since \delta x(t) is an arbitrary variation, the only way the integral can vanish for every allowed choice of \delta x(t) is if the remaining factor is zero at every time t. That condition on x(t) is, of course, the Euler-Lagrange equation
\frac{\partial L}{\partial x} - \frac{\mathrm{d}}{\mathrm{d}t}\frac{\partial L}{\partial \dot{x}} = 0
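
As a sanity check (my addition, but standard textbook material): for a single particle with L = \frac{1}{2} m \dot{x}^2 - V(x), we get

\frac{\partial L}{\partial x} = -\frac{\mathrm{d}V}{\mathrm{d}x}, \qquad \frac{\mathrm{d}}{\mathrm{d}t}\frac{\partial L}{\partial \dot{x}} = \frac{\mathrm{d}}{\mathrm{d}t}(m\dot{x}) = m\ddot{x}

so the Euler-Lagrange equation reads m\ddot{x} = -\mathrm{d}V/\mathrm{d}x, which is exactly Newton's second law with force F = -\mathrm{d}V/\mathrm{d}x.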

The reason we call this procedure "calculus of variations" is because it uses a "variation" of the function x(t). \delta x(t) is the variation. By the same token, you could call regular old minimization an example of the "calculus of differentials" (or "differential calculus" - sound familiar?) because it uses things like dx and dy, which are called differentials.
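
If you want to see the principle numerically, here is a small sketch (my own illustration; the free-fall setup and all names are my choices, not from the thread): perturb the true free-fall trajectory by a bump that vanishes at the endpoints and watch the action grow.

import numpy as np

# Numerical check of least action for free fall: L = (1/2) m v^2 - m g x.
# The true path starting from rest at x = 0 is x(t) = -g t^2 / 2.
m, g = 1.0, 9.8
t = np.linspace(0.0, 1.0, 2001)

def action(x):
    v = np.gradient(x, t)                                # finite-difference velocity
    L = 0.5 * m * v**2 - m * g * x                       # Lagrangian along the path
    return np.sum(0.5 * (L[1:] + L[:-1]) * np.diff(t))   # trapezoid-rule integral

true_path = -0.5 * g * t**2
bump = np.sin(np.pi * t)          # variation that vanishes at both endpoints
for a in (0.0, 0.1, 0.5):
    print(a, action(true_path + a * bump))
# a = 0 (the unperturbed true path) gives the smallest action of the three.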
 
Thank you! That cleared up some weird things. Apparently (according to a close friend I contacted) Feynman wrote an excellent summary of this in Chapter 19 of Volume II of his Caltech lectures, so I'll pick that up from my friend tomorrow at school, and hopefully the topic will fully click with Feynman's explanation.

Again, much thanks for the clarifications provided.
 