# Intuition behind Hamilton's Variational Principle

## Main Question or Discussion Point

Background: I am an upper level undergraduate physics student who just completed a course in classical mechanics, concluding with Lagrangian Mechanics and Hamilton's Variational Principle.

My professor gave a lecture on the material, and his explanation struck me as a truism.
Essentially, he argued that the difference between the Lagrangian evaluated along the parameters describing the true path and the Lagrangian evaluated along those same parameters mildly perturbed by a function αη(x), where α is a scale factor, is zero.

Where exactly is the profundity in this statement? I understood it as "If we deviate the parameters away from the parameters that minimize the integral, and then take the limit as that deviation vanishes, the difference between the path described by these two sets of parameters is zero and the path must be the true path." Well of course this is true. What am I missing?
Alternatively, are there any decent texts that outline this principle at an undergraduate level?

Stephen Tashi
My professor gave a lecture on the material, and his explanation struck me as a truism.

What was he explaining?

How would you give an intuitive argument that the extrema of a function occur where its derivative is zero? You'd probably say that when x varies slightly from the location of an extremum, there is very little change in f(x). Would that be a truism?
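
As a quick numerical illustration of that point (a sketch only; the function `f` and the step `h` are made-up choices, not anything from the lecture): at an extremum the change in f(x) is second order in the step, while away from it the change is first order.

```python
def f(x):
    # Illustrative function with a minimum at x = 1 (an assumed example).
    return (x - 1.0) ** 2 + 3.0

def change(x0, h):
    # Change in f when x moves a small step h away from x0.
    return f(x0 + h) - f(x0)

h = 1e-3
# Dividing by h isolates the first-order part of the change.
ratio_at_extremum = change(1.0, h) / h   # shrinks along with h
ratio_away = change(2.0, h) / h          # stays near f'(2) = 2

print(ratio_at_extremum, ratio_away)
```

The content of the statement is not that f(x + h) approaches f(x), which holds at every x, but that the change per unit step goes to zero only at the extremum.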

Perhaps you are asking why the path chosen by Nature is an extremum of some functional, rather than why we can find that extremum by setting derivatives equal to zero. I don't know how to prove that result from more fundamental laws of physics, but I think it would be easy to look up the proof. Did you want an intuitive explanation of that result?

The first derivative test is inherently intuitive. Slope is an intuitive geometrical concept. You are searching for this information contained within a function, not trying to eke out information about an unknown function L(x) using another arbitrary function of our construction, L(x + δx).

Of course L(x) = L(x + δx) when δx shrinks to zero. How does that guarantee the action is minimized and that the laws of physics are being respected? I have invoked no physical principles here. I have constructed arbitrarily everything about this operation, the function, the perturbation function, the scale factor -- everything. So how can I possibly make the claim that I am revealing anything but an equally arbitrary path?

Stephen Tashi

The first derivative test is inherently intuitive. Slope is an intuitive geometrical concept. You are searching for this information contained within a function, not trying to eke out information about an unknown function L(x) using another arbitrary function of our construction, L(x + δx).
Then you have a good mathematical question. You're guilty of expecting the way physicists use mathematics to make sense!

When people have a collection of data and want to solve a mathematical problem, they often assume the data is from some set ("family") of functions that is defined by a general formula that contains some parameters. Varying the parameters generates all the members of the family of functions. They often fit a specific member of the family to the data by assigning numerical values to the parameters. Often this is done by minimizing some function of the parameters (e.g. least squares fitting). One can view the function of the parameters as a functional, since it assigns a number to the particular function defined by particular parameters.
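
To make that concrete, here is a minimal sketch with made-up data and an assumed family of straight lines y = a·x + b. The names `loss`, `a_best`, and `b_best` are hypothetical. `loss` plays the role of the functional: it assigns one number to each member of the family.

```python
# Made-up data, roughly following y = 2x + 1 (purely illustrative).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 7.1, 8.8]

def loss(a, b):
    # Sum of squared residuals for the family member y = a*x + b.
    return sum((a * x + b - y) ** 2 for x, y in zip(xs, ys))

# Closed-form least squares solution for a straight line.
n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
s_xx = sum((x - mean_x) ** 2 for x in xs)
a_best = s_xy / s_xx
b_best = mean_y - a_best * mean_x

# Nudging either parameter away from the best fit raises the loss.
print(loss(a_best, b_best), loss(a_best + 0.1, b_best), loss(a_best, b_best + 0.1))
```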

Solving practical problems with methods like that may depend on picking the correct family of functions to fit to the data. Likewise, many theoretical results must be proven under the assumption that a very specific family of functions generates the data.

One way to look at "variations" in the calculus of variations is that it presents theoretical results that don't depend on the family of functions being as specific as polynomials, beta distributions, trig functions - or whatever. The presenter could say:

"I'm going to assume the function that does what I want comes from a family of functions. This family of functions is defined by the general form $(f_x(t),f_y(t) ) + \epsilon (\eta_x(t),\eta_y(t))$. Varying parameter $\epsilon$ generates all the members of this family of functions. I'm going to assume the function that does what I want is $( f_x(t), f_y(t))$. The catch is that I won't tell you exactly what function $(f_x(t),f_y(t))$ is. Furthermore, I won't tell you exactly what function $(\eta_x(t),\eta_y(t))$ is. I'll just assume these functions are nice enough so that certain limits and derivatives exist. I will prove that there are certain equations relating various integrals and derivatives of $(f_x(t), f_y(t))$ regardless of which $(\eta_x(t), \eta_y(t))$ is used. Solving integral and differential equations involves finding an unknown function from such equations, so it shouldn't bother you that I didn't reveal exactly what function $(f_x(t), f_y(t))$ was to begin with."

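Here is a sketch of how that argument usually runs, written for a single coordinate to keep the notation short. The symbols $S$, $q$, $t_1$, $t_2$ and the endpoint condition $\eta(t_1) = \eta(t_2) = 0$ are assumptions added for illustration. Take the family $q(t) = f(t) + \epsilon\,\eta(t)$ and define

$$S(\epsilon) = \int_{t_1}^{t_2} L\big(f + \epsilon\eta,\ \dot f + \epsilon\dot\eta,\ t\big)\,dt .$$

Differentiating with respect to $\epsilon$, setting $\epsilon = 0$, and integrating the $\dot\eta$ term by parts (the boundary term vanishes because $\eta$ is zero at both endpoints) gives

$$\left.\frac{dS}{d\epsilon}\right|_{\epsilon=0} = \int_{t_1}^{t_2} \left(\frac{\partial L}{\partial q}\,\eta + \frac{\partial L}{\partial \dot q}\,\dot\eta\right) dt = \int_{t_1}^{t_2} \left(\frac{\partial L}{\partial q} - \frac{d}{dt}\frac{\partial L}{\partial \dot q}\right)\eta\,dt ,$$

with the partial derivatives evaluated along $q = f$. Requiring this to vanish for every admissible $\eta$ forces

$$\frac{\partial L}{\partial q} - \frac{d}{dt}\frac{\partial L}{\partial \dot q} = 0 ,$$

which is a differential equation for the unrevealed function $f(t)$.
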
You might be disturbed by the use of "specific, but unrevealed" items in the discussion. However, this is the common method of "universal generalization" used in proofs. If you want to prove something "for each thing W", you can write a proof that begins "Let W be an arbitrary thing...." So W becomes specific, but without any special properties beyond those that all the "things" under discussion have.

Of course L(x) = L(x + δx) when δx shrinks to zero. How does that guarantee the action is minimized and that the laws of physics are being respected? I have invoked no physical principles here. I have constructed arbitrarily everything about this operation, the function, the perturbation function, the scale factor -- everything. So how can I possibly make the claim that I am revealing anything but an equally arbitrary path?
The Lagrangian is not arbitrary. Neither are the initial and final points of the path. It is the path between these points that is arbitrary (subject only to a smoothness restriction), and the claim is that the true physical path is the one that minimizes action. Do you understand the meaning of this claim? Is it the claim itself that you deem truistic? Is there another problem with the claim? Or is it something else, not this claim, that is a problem?
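
One way to see that the claim has content is to check it numerically. The sketch below is only an illustration under assumed values: a unit mass in uniform gravity with $L = \frac{1}{2}m\dot y^2 - mgy$, fixed endpoints $y(0) = y(T) = 0$, and an arbitrarily chosen perturbation $\eta(t) = \sin(\pi t/T)$. All names in it are hypothetical. It estimates $dS/d\epsilon$ at $\epsilon = 0$ around the true path and around a path that violates the equation of motion.

```python
import math

M, G, T = 1.0, 9.8, 1.0   # assumed mass, gravitational acceleration, travel time
N = 2000                  # number of midpoint-rule steps

def true_path(t):
    # Solution of y'' = -G with y(0) = y(T) = 0.
    return 0.5 * G * t * (T - t)

def true_velocity(t):
    return 0.5 * G * (T - 2.0 * t)

def eta(t):
    # Perturbation shape; it vanishes at both endpoints.
    return math.sin(math.pi * t / T)

def eta_dot(t):
    return (math.pi / T) * math.cos(math.pi * t / T)

def action(path, velocity, eps):
    # Midpoint-rule estimate of S = integral of (kinetic - potential) energy
    # along the perturbed path: path + eps * eta.
    dt = T / N
    total = 0.0
    for i in range(N):
        t = (i + 0.5) * dt
        y = path(t) + eps * eta(t)
        v = velocity(t) + eps * eta_dot(t)
        total += (0.5 * M * v * v - M * G * y) * dt
    return total

def slope_at_zero(path, velocity, eps=1e-4):
    # Central-difference estimate of dS/d(eps) at eps = 0.
    return (action(path, velocity, eps) - action(path, velocity, -eps)) / (2.0 * eps)

slope_true = slope_at_zero(true_path, true_velocity)
# A "wrong" path with the same endpoints: the straight line y = 0.
slope_wrong = slope_at_zero(lambda t: 0.0, lambda t: 0.0)

print(slope_true, slope_wrong)
```

Under these assumptions the slope should come out essentially zero around the parabola and clearly nonzero around the straight line $y = 0$. Every path satisfies $S(\epsilon) \to S(0)$ as $\epsilon \to 0$; only a path obeying the equation of motion has a vanishing first-order change, and that is the non-trivial part of the statement.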