# What is the motivation for the principle of stationary action?

1. Jun 9, 2016

### Frank Castle

Is the motivation for the action principle purely empirical, purely theoretical, or a mixture of the two? As I understand it, there was some empirical evidence from Fermat's observations in optics (i.e. that light follows the path of least time), from notions of virtual work, and from Maupertuis's studies of classical systems and his observation/assertion that the path of a classical system through configuration space is one of least distance.

Are there any other motivating ideas and/or arguments for why the action principle should be used to derive physical equations of motion, other than just that it works?!

2. Jun 9, 2016

### stevendaryl

Staff Emeritus
I'm not sure what's the motivation for least-action principles, but what's extremely beautiful about them is the way that they automatically guarantee a generalization of Newton's third law of motion (every action has an equal and opposite reaction). If you have two subsystems, $A$ and $B$, say two particles interacting, or a particle interacting with a field, then you have to figure out two different things:
1. How system $A$ affects system $B$.
2. How system $B$ affects system $A$.
For particles interacting, Newton's third law tells us that if you have the answer to $1$, then you can immediately figure out the answer to $2$. But if $A$ is the electromagnetic field, and $B$ is a charged particle, then knowing $1$ (the Lorentz force law) doesn't obviously tell you what the answer to $2$ should be (how does electric charge affect the electromagnetic field?). On the other hand, the least-action approach automatically solves the problem: there is a single term in the Lagrangian to represent the interaction between the two subsystems, and from the least-action principle you get both the effect of $A$ on $B$ and the effect of $B$ on $A$. A Lagrangian with the right kind of symmetries automatically guarantees the conservation laws for momentum and energy (and other quantities).
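
A minimal sketch of this point, assuming two particles on a line coupled through a hypothetical potential $V$ that depends only on their separation (an illustrative choice, not a formula from the post): the Lagrangian contains one interaction term,
$$L=\tfrac12 m_A\dot x_A^2+\tfrac12 m_B\dot x_B^2-V(x_A-x_B),$$
and the Euler-Lagrange equations for $x_A$ and $x_B$ give
$$m_A\ddot x_A=-V'(x_A-x_B),\qquad m_B\ddot x_B=+V'(x_A-x_B).$$
The force on $A$ is automatically minus the force on $B$, and adding the two equations shows that the total momentum $m_A\dot x_A+m_B\dot x_B$ is conserved.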

3. Jun 10, 2016

### Stavros Kiri

It belongs to the general method of the calculus of variations (or variational calculus), a separate branch of mathematics. Take a look at the Wikipedia article "Calculus of variations", particularly the section "Euler-Lagrange equation". If your question is not answered, come back here.

Last edited: Jun 10, 2016
4. Jun 11, 2016

### Frank Castle

I get that calculus of variations is all about optimisation problems and that the Euler-Lagrange equation is derived by requiring that the action is stationary (thus solving the Euler-Lagrange equation enables one to determine the extremal path of a system), and that the principle of stationary action is the statement that a system follows an extremal path between two different configurations such that the action is stationary (i.e. its first-order variation vanishes). What I'm confused about is what the actual motivation is for why this should be a universal quality of physical systems, i.e. why is it reasonable to insist that physical systems should be assigned an action and that the actual path followed by the system is an extremal of this action (other than that it works)?
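
The statement that the first-order variation vanishes can be checked numerically. The following is only a sketch under stated assumptions: a particle in uniform gravity with $L=\tfrac12 m\dot x^2-mgx$, fixed endpoints $x(0)=x(T)=0$, and illustrative values for `M`, `G`, `T` and `N` that are not taken from this thread. It perturbs the classical path by `eps * bump(t)` and prints the change in the discretized action.

```python
# Discretized action for a particle in uniform gravity, L = (1/2) m v^2 - m g x.
# All parameter values (M, G, T, N) are illustrative assumptions.
import math

M, G, T, N = 1.0, 9.81, 1.0, 2000
DT = T / N


def action(path):
    """Approximate S = integral of (kinetic - potential) dt on a uniform grid."""
    s = 0.0
    for k in range(N):
        v = (path[k + 1] - path[k]) / DT        # velocity on segment k
        x_mid = 0.5 * (path[k + 1] + path[k])   # midpoint position on segment k
        s += (0.5 * M * v * v - M * G * x_mid) * DT
    return s


def classical(t):
    """Solution of x'' = -g with x(0) = x(T) = 0."""
    return 0.5 * G * t * (T - t)


def bump(t):
    """Variation that vanishes at both endpoints."""
    return math.sin(math.pi * t / T)


times = [k * DT for k in range(N + 1)]
s0 = action([classical(t) for t in times])


def delta_s(eps):
    """Change in the action when the classical path is shifted by eps * bump."""
    return action([classical(t) + eps * bump(t) for t in times]) - s0


for eps in (1e-1, 1e-2, 1e-3):
    print(eps, delta_s(eps), delta_s(eps) / eps**2)
```

If the action is stationary on the classical path, the change should scale like `eps**2` with no term linear in `eps`, so the last printed column should stay roughly constant (for these assumed values it should be close to $\pi^2/4$, the value of $\tfrac12 m\int\dot\eta^2\,dt$ for the chosen bump).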

5. Jun 11, 2016

### vanhees71

There's a deep physical reason for Hamilton's principle of least (or better, stationary) action, which comes from quantum mechanics in Feynman's path-integral formulation. If you calculate the propagator of a particle moving from $x_1$ at time $t_1$ to $x_2$ at time $t_2$ you get the integral over all paths in configuration space with the fixed endpoints, i.e., modulo a factor
$$\langle t_1,x_1|t_2,x_2 \rangle=\int \mathrm{D} x \exp(\mathrm{i} S[x]/\hbar),$$
where $S$ is the usual action,
$$S=\int_{t_1}^{t_2} \mathrm{d} t \, L(x,\dot{x}).$$
Now, if the action is very large compared to the Planck action constant $\hbar$, you have a wildly oscillating integrand. Integrating over many trajectories then usually gives (nearly) zero, because neighbouring contributions cancel. The only region in the space of trajectories where this is not the case is where the action is stationary, i.e., along the classical trajectory, defined by the Euler-Lagrange equations of the corresponding variational principle. Since the above propagator is a transition-probability amplitude for the particle to run from $x_1$ to $x_2$ at the respective times $t_1$ and $t_2$, this means that the most likely trajectory, i.e., the trajectory contributing most to the transition amplitude, is the one that makes the action stationary, i.e., the classical trajectory of the particle.
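
The cancellation argument can be illustrated with a one-dimensional toy integral. This is only a sketch, not the path integral itself: the phase function $f(x)=x^2$ stands in for the action, the parameter `LAM` plays the role of $S/\hbar$, and the function name, the windows and the step count are all assumptions made for illustration.

```python
# Toy stationary-phase check: integrate exp(i * lam * f(x)) with f(x) = x**2,
# a one-dimensional stand-in for exp(i S / hbar); lam plays the role of S/hbar.
import cmath


def window_integral(lam, a, b, n=20000):
    """Midpoint-rule estimate of the integral of exp(i*lam*x^2) over [a, b]."""
    h = (b - a) / n
    total = 0j
    for k in range(n):
        x = a + (k + 0.5) * h
        total += cmath.exp(1j * lam * x * x)
    return total * h


LAM = 200.0
near = abs(window_integral(LAM, -0.5, 0.5))  # contains the stationary point x = 0
far = abs(window_integral(LAM, 1.5, 2.5))    # same width, no stationary point
print(near, far)
```

Both windows have the same width, but the one containing the stationary point of the phase should contribute far more than the other, where the oscillations largely cancel. Increasing `LAM` (the analogue of making the action large compared with $\hbar$) should make the contrast stronger.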

6. Jun 11, 2016

### Stavros Kiri

So your problem is what is the motivation etc. for δS = 0 (principle of stationary action).

First of all, it's a postulate (principle), used to achieve an alternative, equivalent formalism in classical mechanics. But beyond that, it gives deep insights into physics by offering a generalization for many theories, i.e. a general, equivalent method for deriving the equations of a theory in an alternative way. It is used even in electromagnetism, gravity, field theory etc., and ultimately even in quantum mechanics (and quantum field theory), where it leads to R. P. Feynman's path-integral formulation of QM. (See also vanhees71's nice comment above.)

So I would say that the motivation is first of all a formalistic one, i.e. to have a generalized alternative approach-method for deriving formalisms in physics. [Variational Principles in physics probably started, as you said, with Fermat's principle in optics, that's why (since 1989) such similar approaches (and more) are awarded with the so-called Fermat prize.]

From one standpoint, there is nothing mysterious about the concept of action (units: [energy]x[time] or [momentum]x[length]) and the principle of least (or stationary) action, other than that it formalistically works, as long as you choose the right Lagrangian (or Hamiltonian, in the Hamilton formalism) of the system. I think that's the whole trick here!

But of course, it is not a coincidence that Planck's constant ($\hbar$), the elementary quantum of action, is also an 'action' with the same units, enabling the uncertainty principle ...

And the "deep physical reason ..." given by vanhees71 is indeed the only complete physical interpretation that I have seen of the action and of the principle of least (or stationary) action. But the problem with it is that it came a lot later, and it assumes knowledge of quantum mechanics ...

Now, what is the intuition behind action and the principle of stationary action?
I have no idea, other than the above, or the minimization of time and energy and related variational problems and principles ...

Last edited: Jun 11, 2016
7. Jun 11, 2016

### Stavros Kiri

cf. Noether theorem etc.

8. Jun 13, 2016

### Frank Castle

I like this explanation a lot. The only problem I have with it is that it came along much later than the original concept of the principle of stationary action, and so it seems that there must have been some other physical motivations earlier on that led to postulating the stationary action principle?! Was it simply motivated by empirical evidence and the fact that the approach is able to reproduce Newton's laws of mechanics?

This is exactly my issue. Whilst the path integral explanation is an elegant one, it does assume knowledge of quantum mechanics, something that wasn't known at the time when the principle was originally formulated.

9. Jun 13, 2016

### vanhees71

I have no idea why many people from the mid-1700s on formulated Newtonian mechanics in terms of variational principles. The main motivation from a mathematical point of view for doing that, in my opinion, is the systematic use of symmetries in the sense of Noether's theorem, but this too was discovered much later, by Noether in 1918.

10. Jun 13, 2016

### Stavros Kiri

Yes, but I gave more answers in my previous reply. I understand your point and I totally agree (and I liked the thinking in your posts). The motivation may have been a formalistic one (plus the evidence that it works), together with the similar minimizing principles that already existed (for energy, time, etc.), which may have been combined into one concept, that of action. I don't know exactly how the original authors came up with it and introduced it; reading the original papers might help, but to be honest I haven't done so yet.
Please read my previous reply carefully (word for word) and tell me what you think. I find the problem you posed very interesting and important, though not an easy one to answer fully. Maybe, time permitting, we should all read the original papers. I'm sure there is a lot to learn from them!

Last edited: Jun 13, 2016
11. Jun 13, 2016

### Stavros Kiri

+ they were perhaps trying to find one unifying function (or functional) of the system [namely the Lagrangian or Hamiltonian, and/or the action integral] that would contain all the information and the whole mechanics of the system, including all the symmetries etc. [That would give a concise, elegant, and consistent unifying alternative formalism for mechanics ...]
I recall now, after the exchange of views in this discussion, seeing this (about the Lagrangian etc.) somewhere in the originals. Quantities such as position, energy, momentum etc. were not good enough on their own for that purpose, so they came up with the Lagrangian and/or Hamiltonian, the action, and the principle ...

But I am now more curious than ever to read the original papers, if possible.

Last edited: Jun 13, 2016
12. Jun 13, 2016

### vanhees71

I think a very good example for this is the Mechanics by Hertz, which is very formally based on the action principle. Also Helmholtz's textbooks are entirely based on it, and indeed even today, many physicists (including myself) find the action principle much more aesthetically appealing and also a way more elegant route to the equations of motion than "naive" Newtonian mechanics with all kinds of awkward "free-body force diagrams" and what not. Of course the true aesthetics comes in when marrying it with symmetry principles a la Lie and later Noether. With the Poisson-bracket formulation of Hamiltonian canonical mechanics you have everything expressed in terms of Lie derivatives and the one-to-one correspondence between symmetries (realized as symplectomorphisms on phase space) and conservation laws, i.e., the generators of symmetry transformations are conserved quantities and any conserved quantity is a symmetry-transformation generator.

This even provides the heuristics to "quantize" mechanics, which is much how Dirac found his formulation (and also Heisenberg, Born, and Jordan with their "matrix mechanics"). Using the Hamilton-Jacobi partial differential equation (HJPDE), which is another way to formulate the action principle, you have the heuristics for "wave mechanics", which in fact was how Schrödinger found his formulation of quantum mechanics: take the classical HJPDE as the "eikonal approximation" of a "wave equation", which turns out to be the Schrödinger equation.

Nowadays we rather argue the other way around as I wrote in my first posting in this thread: QT is the comprehensive theory, and you can derive classical mechanics from it using the path-integral formalism and considering the stationary-phase approximation.

13. Jun 13, 2016

### Stavros Kiri

You just described "all Physics", right there!
I'm also a fan of symmetries in physics, and of their associations, implications and consequences. The aesthetic advantages, I agree, are also important, both as motivation and as results, and symmetries are nowadays a very important key tool in physics.

Time permitting, I will study these sources.

Very interesting! + puts things in perspective

I agree.

14. Jun 13, 2016

### wrobel

By the way, Hamilton's stationary action principle does not work in the nonholonomic case. Nevertheless, Noether's theorem remains valid for nonholonomic systems, independently of Hamilton's principle.
Indeed, consider a system
$$\frac{d}{dt}\frac{\partial L}{\partial \dot x^i}-\frac{\partial L}{\partial x^i}=\lambda_s a^s_i,\quad a^s_i\dot x^i=0,\quad i=1,\ldots,m,\quad s=1,\ldots, n<m\quad (*)$$
Here $a^s_i=a^s_i(x),\quad L=L(x,\dot x),\quad x=(x^1,\ldots,x^m)$, and summation over repeated indices is understood. Strictly speaking, $x$ are local coordinates on a smooth manifold $M$, but let us drop the obvious details.

Theorem. Assume that there exists a vector field $v^i(x)$ such that
$$a_i^s v^i=0,\quad L\Big( g^\tau_v(x),\frac{\partial g^\tau_v(x)}{\partial x}y\Big)=L(x,y)\quad \forall \tau,s,y,x$$
Then system (*) has a first integral
$$F(x,\dot x)=\frac{\partial L}{\partial \dot x^i}v^i.$$
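
A brief sketch of why this holds, followed by a hypothetical example (neither is from the post itself). Differentiating $F$ along a solution of (*) and using (*) gives
$$\frac{dF}{dt}=\Big(\frac{\partial L}{\partial x^i}+\lambda_s a^s_i\Big)v^i+\frac{\partial L}{\partial \dot x^i}\frac{\partial v^i}{\partial x^j}\dot x^j.$$
The terms without $\lambda_s$ add up to the $\tau$-derivative of $L\big(g^\tau_v(x),\frac{\partial g^\tau_v(x)}{\partial x}\dot x\big)$ at $\tau=0$, which vanishes by the invariance assumption, and the remaining term vanishes because $a^s_iv^i=0$. As an assumed example, take $m=3$, $n=1$, with
$$L=\tfrac12\big((\dot x^1)^2+(\dot x^2)^2+(\dot x^3)^2\big),\qquad \dot x^3-x^2\dot x^1=0,\qquad a=(-x^2,\,0,\,1).$$
The field $v=(0,1,0)$ satisfies $a_iv^i=0$, its flow $x^2\mapsto x^2+\tau$ leaves $L$ unchanged, and the theorem gives the first integral $F=\dot x^2$. This agrees with the second component of (*), which reads $\ddot x^2=0$.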

15. Jun 13, 2016

### Frank Castle

Thanks for the references, I shall have to take a look at them when I have an opportunity.

I guess I'll just have to be satisfied with the modern explanation then (don't get me wrong, I do like this explanation).

To be honest, the original question stemmed from me thinking about how I would explain the concept to someone who was at the start of a course on Lagrangian mechanics (but hadn't had any formal teaching in quantum mechanics), how I would motivate such an approach to them, and why it is reasonable to postulate such a principle. The only ideas I could come up with were those that I've expressed in earlier posts.

16. Jun 13, 2016

### Stavros Kiri

Our discussion reminded me of a motivation that I think, after all, is a good answer to those questions at the classical (non-quantum) level, as already posted earlier in this thread.
More ideas for the classical motivations could be extracted from a comprehensive review of this whole discussion, and I am sure many more will arise from looking at the original sources, e.g. to see how on earth the authors were originally inspired to define the action that way instead of some other way.

But one thing is for sure (certain): the formalism does work!
(edit addition) [And the trick is to choose the right Lagrangian or Hamiltonian for the system or the formalism sought, one that actually makes it work (for that particular system or formalism). So in my opinion there is nothing fancy or mysterious about the formalisms of analytical mechanics (and there is a whole bunch of them), but I do like them all!
And there may in fact be a certain circularity in those formalisms, in that the result is already built into the formalism (e.g. into the conveniently chosen Lagrangian). I think that's one disadvantage of the method. In other words, it may not produce anything new (but I am not sure). But it definitely gives important and useful tools for viewing the system in a neat way and predicting its behavior.
Also I think really interesting would be existence and uniqueness issues and theorems (e.g. for Lagrangian and Hamiltonian) in Analytical Mechanics*.]
* perhaps field for mathematicians ...

Last edited: Jun 13, 2016
17. Jun 13, 2016

### wrobel

Existence and uniqueness theorems do exist for variational problems locally, and there are examples which show that globally a variational problem can have many solutions or no solutions at all.

18. Jun 13, 2016

### Stavros Kiri

But you are assuming a manifold and tacit covariance. How restrictive is that for mechanics? In any case, still very important though! (I mean the whole comment)

19. Jun 13, 2016

### wrobel

I am not sure that I understood this remark correctly; anyway, (*) are the standard equations of classical mechanics.

Noether's theorem comes in different versions and generalizations. For example, consider Hamiltonian equations with Hamiltonian function $$H=H(p,q),\quad p=(p_1,\ldots,p_m),\quad q=(q^1,\ldots,q^m) .$$ Assume that this system has a one-parameter group of symmetries $\{g^s(p,q)\}$ such that each mapping $(p,q)\mapsto g^s(p,q)$ is a canonical (symplectic) mapping. (Locally this group is generated by a Hamiltonian system with Hamiltonian $F$, where $\{F,H\}=0$.)

Theorem. Assume that $dF\ne 0$. Then there are local canonical coordinates $(P,Q)$ in which the Hamiltonian $H$ takes the form $H=H(P_2,\ldots,P_m,Q^1,\ldots,Q^m).$ Thus $Q^1$ is constant along solutions, the variables $P_2,\ldots,P_m,Q^2,\ldots,Q^m$ are separated, and we obtain a Hamiltonian system of order $2m-2$.

20. Jun 13, 2016

### Stavros Kiri

Non-uniqueness is also trivial, unless we exclude 'arbitrariness up to a constant' etc.* I am more interested in the non-existence that you mention. Is there a good counter-example? (If it's something off-hand or easily available. Otherwise I may look for the proof ...)

To be honest, intuitively I also anticipated cases of non-existence. But I haven't myself studied these issues yet. On the other hand you seem to be an expert on the topic, which is a good thing.

* If I remember correctly, also up to the total derivative (or gradient) of a scalar function ...
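
A standard textbook illustration of both cases, offered only as a sketch and assuming a harmonic oscillator with fixed endpoints (not an example given in this thread): take
$$L=\tfrac12\dot x^2-\tfrac12\omega^2x^2,\qquad x(0)=0,\quad x(T)=b.$$
Every solution of the Euler-Lagrange equation $\ddot x=-\omega^2x$ with $x(0)=0$ has the form $x(t)=A\sin\omega t$. If $T=\pi/\omega$, then $x(T)=0$ for every $A$, so for $b\neq 0$ no stationary path exists, while for $b=0$ there are infinitely many.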

Last edited: Jun 13, 2016