Intuition behind Hamilton's Variational Principle

AI Thread Summary
The discussion centers on the understanding of Lagrangian mechanics and the significance of the principle that the true path of a system minimizes the action. The professor's assertion that the difference between the Lagrangian evaluated along the true path and a perturbed path approaches zero as the perturbation vanishes is seen as a fundamental truth. The conversation explores the intuitive nature of extrema in functions and the mathematical underpinnings of variational principles, emphasizing that the choice of functions and parameters in theoretical physics is often arbitrary yet leads to universal results. The Lagrangian is not arbitrary, but rather the path taken between fixed initial and final points is, with the claim being that the actual physical path minimizes the action. Understanding this principle is crucial for grasping the deeper implications of Lagrangian mechanics in physics.
sigma_
Background: I am an upper level undergraduate physics student who just completed a course in classical mechanics, concluding with Lagrangian Mechanics and Hamilton's Variational Principle.

My professor gave a lecture on the material, and his explanation struck me as a truism.
Essentially, he argued that the difference between the Lagrangian evaluated along the parameters describing the true path and the Lagrangian evaluated along parameters corresponding to a mild perturbation of the parameters by a function αη(x), where α is a scale factor, is zero.

Where exactly is the profundity in this statement? I understood it as "If we deviate the parameters away from the parameters that minimize the integral, and then take the limit as that deviation vanishes, the difference between the path described by these two sets of parameters is zero and the path must be the true path." Well of course this is true. What am I missing?
Alternatively, are there any decent texts that outline this principle at an undergraduate level?
 
sigma_ said:
My professor gave a lecture on the material, and his explanation struck me as a truism.
What was he explaining?

How would you give an intuitive argument that the extrema of a function occur where its derivative is zero? You'd probably say that when x varies slightly from the location of an extremum, there is very little change in f(x). Would that be a truism?

Perhaps you are asking why the path chosen by Nature is an extremum of some functional, rather than why we can find that extremum by setting derivatives equal to zero. I don't know how to prove that result from more fundamental laws of physics, but I think it would be easy to look up the proof. Did you want an intuitive explanation of that result?
 
My question is the one you address in your first paragraph:

The first derivative test is inherently intuitive. Slope is an intuitive geometrical concept. You are searching for this information contained within a function, not trying to eke out information about an unknown function L(x) using another arbitrary function of our construction, L(x + δx).

Of course L(x) = L(x + δx) when δx shrinks to zero. How does that guarantee the action is minimized and that the laws of physics are being respected? I have invoked no physical principles here. I have constructed arbitrarily everything about this operation, the function, the perturbation function, the scale factor -- everything. So how can I possibly make the claim that I am revealing anything but an equally arbitrary path?
 
sigma_ said:
My question is the one you address in your first paragraph:

The first derivative test is inherently intuitive. Slope is an intuitive geometrical concept. You are searching for this information contained within a function, not trying to eke out information about an unknown function L(x) using another arbitrary function of our construction, L(x + δx).

Then you have a good mathematical question. You're guilty of expecting the way physicists use mathematics to make sense!

When people have a collection of data and want to solve a mathematical problem, they often assume the data comes from some set ("family") of functions defined by a general formula that contains some parameters. Varying the parameters generates all the members of the family. They often fit a specific member of the family to the data by assigning numerical values to the parameters, and this is often done by minimizing some function of the parameters (e.g. least-squares fitting). One can view that function of the parameters as a functional, since it assigns a number to the particular function defined by particular parameter values.
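A minimal sketch of that idea, assuming the family is the straight lines y = a*x + b and using made-up data points. The names `sse` and `fit_line` are hypothetical helpers introduced for illustration. The sum of squared residuals assigns one number to each member of the family, and the fit picks the member for which that number is smallest.

```python
def sse(a, b, data):
    """Sum of squared residuals for the family member y = a*x + b."""
    return sum((y - (a * x + b)) ** 2 for x, y in data)


def fit_line(data):
    """Closed-form least-squares fit: the (a, b) that minimizes sse."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in data)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in data)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


# Made-up data, for illustration only.
data = [(0.0, 1.1), (1.0, 2.9), (2.0, 5.2), (3.0, 6.8)]

a_fit, b_fit = fit_line(data)
best = sse(a_fit, b_fit, data)

# Nudging either parameter away from the fitted value can only raise the number.
print(best, sse(a_fit + 0.1, b_fit, data), sse(a_fit, b_fit - 0.1, data))
```

Here the family is pinned down by two numbers. The calculus of variations plays the same game when the "parameters" are an entire unknown function.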

Solving practical problems with methods like that may depend on picking the correct family of functions to fit to the data. Likewise, many theoretical results can only be proven under the assumption that a very specific family of functions generates the data.

One way to look at "variations" in the calculus of variations is that it presents theoretical results that don't depend on the family of functions being as specific as polynomials, beta distributions, trig functions - or whatever. The presenter could say:

"I'm going to assume the function that does what I want comes from a family of functions. This family of functions is defined by the general form ##(f_x(t), f_y(t)) + \epsilon (\eta_x(t), \eta_y(t))##. Varying the parameter ##\epsilon## generates all the members of this family of functions. I'm going to assume the function that does what I want is ##(f_x(t), f_y(t))##. The catch is that I won't tell you exactly what function ##(f_x(t), f_y(t))## is. Furthermore, I won't tell you exactly what function ##(\eta_x(t), \eta_y(t))## is. I'll just assume these functions are nice enough so that certain limits and derivatives exist. I will prove that there are certain equations relating various integrals and derivatives of ##(f_x(t), f_y(t))## regardless of which ##(\eta_x(t), \eta_y(t))## is used. Solving integral and differential equations involves finding an unknown function from such equations, so it shouldn't bother you that I didn't reveal exactly what function ##(f_x(t), f_y(t))## was to begin with."
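
A minimal sketch of where that argument leads, written for a single coordinate to keep it short. The symbols ##q##, ##L(q, \dot q, t)##, ##S##, ##t_1## and ##t_2## are notation assumed here rather than taken from the thread, and ##\eta## is assumed to vanish at both endpoints so that every member of the family joins the same two points. The functional evaluated on the family becomes an ordinary function of ##\epsilon##:

$$S(\epsilon) = \int_{t_1}^{t_2} L\big(f + \epsilon\eta,\ \dot f + \epsilon\dot\eta,\ t\big)\,dt$$

Differentiating under the integral and integrating the ##\dot\eta## term by parts gives

$$\left.\frac{dS}{d\epsilon}\right|_{\epsilon=0} = \int_{t_1}^{t_2}\left(\frac{\partial L}{\partial q}\,\eta + \frac{\partial L}{\partial \dot q}\,\dot\eta\right)dt = \left[\frac{\partial L}{\partial \dot q}\,\eta\right]_{t_1}^{t_2} + \int_{t_1}^{t_2}\left(\frac{\partial L}{\partial q} - \frac{d}{dt}\frac{\partial L}{\partial \dot q}\right)\eta\,dt$$

The boundary term is zero because ##\eta(t_1) = \eta(t_2) = 0##. Demanding that the remaining integral vanish for every admissible ##\eta## forces the bracket itself to vanish along ##q = f##:

$$\frac{\partial L}{\partial q} - \frac{d}{dt}\frac{\partial L}{\partial \dot q} = 0$$

This is a differential equation that the unrevealed ##f## must satisfy. The content of the statement is that the first-order change in ##S## is zero, which is stronger than the trivial fact that ##S(\epsilon) \to S(0)## as ##\epsilon \to 0##.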

You might be disturbed by the use of "specific, but unrevealed" items in the discussion. However, this is the common method of "universal generalization" used in proofs. If you want to prove something "for each thing W", you can write a proof that begins "Let W be an arbitrary thing..." So W becomes specific, but without any special properties beyond those that all the "things" under discussion have.
 
sigma_ said:
Of course L(x) = L(x + δx) when δx shrinks to zero. How does that guarantee the action is minimized and that the laws of physics are being respected? I have invoked no physical principles here. I have constructed arbitrarily everything about this operation, the function, the perturbation function, the scale factor -- everything. So how can I possibly make the claim that I am revealing anything but an equally arbitrary path?

The Lagrangian is not arbitrary. Neither are the initial and final points of the path. It is the path between these points that is arbitrary (subject only to a smoothness restriction), and the claim is that the true physical path is the one that minimizes the action (strictly speaking, the one that makes it stationary). Do you understand the meaning of this claim? Is it the claim itself that you deem truistic? Is there another problem with the claim? Or is it something else, not this claim, that is a problem?
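
A numerical sketch of that claim, under assumptions not taken from the thread: a particle of mass `M` in uniform gravity `G`, with Lagrangian L = (1/2) M v^2 - M G y, endpoints y(0) = y(T) = 0, and a perturbation shape sin(pi t / T) that vanishes at both endpoints. The names `action`, `sample` and `slope_at_zero` are hypothetical helpers. The script estimates dS/d(eps) at eps = 0 for two paths with the same endpoints: the parabola that obeys Newton's law and a straight line that does not.

```python
import math

# Illustrative values (assumptions, not from the thread).
M, G, T, N = 1.0, 9.8, 1.0, 1000
DT = T / N


def action(path):
    """Discretized action: sum over segments of (kinetic - potential) * dt."""
    total = 0.0
    for i in range(N):
        velocity = (path[i + 1] - path[i]) / DT
        height = 0.5 * (path[i] + path[i + 1])
        total += (0.5 * M * velocity ** 2 - M * G * height) * DT
    return total


def sample(func):
    """Sample a function of time on the N + 1 grid points."""
    return [func(i * DT) for i in range(N + 1)]


true_path = sample(lambda t: 0.5 * G * t * (T - t))  # solves y'' = -G
wrong_path = sample(lambda t: 0.0)                   # same endpoints, not a solution
eta = sample(lambda t: math.sin(math.pi * t / T))    # zero at both endpoints


def slope_at_zero(path, eps=1e-3):
    """Central-difference estimate of dS/d(eps) at eps = 0."""
    plus = action([y + eps * e for y, e in zip(path, eta)])
    minus = action([y - eps * e for y, e in zip(path, eta)])
    return (plus - minus) / (2 * eps)


print("true path :", slope_at_zero(true_path))   # close to zero
print("wrong path:", slope_at_zero(wrong_path))  # close to -M*G*2*T/pi
```

For either path the action of the perturbed path tends to the action of the unperturbed one as eps shrinks, so that limit singles out nothing. What distinguishes the physical path is that the slope with respect to eps vanishes there, while for the straight line it does not.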
 