Functionals and calculus of variations

In summary, the conversation discusses the reasoning behind the use of functionals of the form [itex]I[y] = \int_{a}^{b} F\left(x,y,y' \right) dx[/itex] in the calculus of variations. The integrand [itex]F[/itex] is a function of both [itex]y[/itex] and its derivative [itex]y'[/itex], as the functional [itex]I[/itex] depends on the entire function [itex]y(x)[/itex] over the interval [itex]x \in [a,b][/itex]: the size of the integral depends on how [itex]y[/itex] varies over the interval, which is represented by its rate of change [itex]y'[/itex]. The physical justification offered is that, in order to describe the dynamics of a physical system, one must specify both the position and the velocity of its components, thus the Lagrangian should be a function of both position and velocity.
  • #1
"Don't panic!"
I have been studying the calculus of variations and have been struggling somewhat to conceptualise why we have functionals of the form [tex]I[y]= \int_{a}^{b} F\left(x,y,y' \right) dx[/tex] In particular, why is the integrand [itex]F\left(x,y,y' \right)[/itex] a function of both [itex]y[/itex] and its derivative [itex]y'[/itex]?

My thoughts on the matter are that, as the functional [itex]I[/itex] is itself dependent on the entire function [itex]y(x)[/itex] over the interval [itex]x\in [a,b][/itex], if [itex]I[/itex] is expressed as an integral over this interval then the 'size' of the integral will depend on how [itex]y[/itex] varies over this interval (i.e. its rate of change [itex]y'[/itex] over the interval [itex]x\in [a,b][/itex]), and hence the integrand will depend on [itex]y[/itex] and its derivative [itex]y'[/itex] (and, in general, higher order derivatives of [itex]y[/itex]). I'm not sure if this is a correct understanding and I'm hoping that someone can enlighten me on the subject (particularly if I'm wrong). Thanks.
 
  • #2
Think of the motion of a car. The independent variable is time t, but to describe its path you have to give its (initial) position _and_ its (initial) velocity, the derivative of position.
 
  • #3
Can one infer from this, then, that as we initially need to specify the position and the velocity in order to describe the configuration of a physical system, any function [itex]F[/itex] characterising the dynamics of the system over a given interval must be a function of both position and velocity? (In doing so, we can describe the dynamics of the system at any point in the interval that we are considering by specifying the position and velocity at that point and plugging these values into [itex]F[/itex].)

I'm trying to get an understanding of it in the abstract sense as well, without relating it to any particular physical problem: why would the integrand be a function of some function and its derivatives (first order and possibly higher order)?
 
  • #4
"Don't panic!" said:
Can one infer from this, then, that as we initially need to specify the position and the velocity in order to describe the configuration of a physical system, any function [itex]F[/itex] characterising the dynamics of the system over a given interval must be a function of both position and velocity? (In doing so, we can describe the dynamics of the system at any point in the interval that we are considering by specifying the position and velocity at that point and plugging these values into [itex]F[/itex].)

I'm trying to get an understanding of it in the abstract sense as well, without relating it to any particular physical problem: why would the integrand be a function of some function and its derivatives (first order and possibly higher order)?
My understanding is that the calculus of variations uses notation that treats the y and y' as independent variables, even though they aren't actually independent (as you point out). However the theory still works.
 
  • #5
Yeah, I guess I'm really trying to understand why the integrand is treated as a function of [itex]y[/itex] and [itex]y'[/itex], why not just [itex]y[/itex]? What's the justification/mathematical (and/or) physical reasoning behind it?

Is it just that if you wish to be able to describe the configuration of a physical system and how that configuration evolves in time you need to specify the positions of the components of the system and also how those positions change in time (i.e. the derivatives of the positions). Hence, as we wish the Lagrangian of the system to characterise its dynamics, this implies that the Lagrangian should be a function of both position and velocity?!
 
  • #6
"Don't panic!" said:
Yeah, I guess I'm really trying to understand why the integrand is treated as a function of [itex]y[/itex] and [itex]y'[/itex], why not just [itex]y[/itex]? What's the justification/mathematical (and/or) physical reasoning behind it?

Is it just that if you wish to be able to describe the configuration of a physical system and how that configuration evolves in time you need to specify the positions of the components of the system and also how those positions change in time (i.e. the derivatives of the positions). Hence, as we wish the Lagrangian of the system to characterise its dynamics, this implies that the Lagrangian should be a function of both position and velocity?!
The function F depends on the problem you are trying to solve. So for example, if you are trying to minimize the energy of a system, the energy consists of kinetic energy (depends on y') and potential energy (depends on y). However, if you are trying to minimize the length of a curve, the integrand is sqrt(1+y'^2) (since ds = sqrt(1+y'^2) dx), which does not depend on y. So the form of F depends on the problem at hand. Does that help?
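To make the two cases concrete, here is a minimal Python sketch (not part of the original post). The helper `functional` and both integrands are hypothetical names chosen for illustration, and the energy-like integrand assumes a potential of y^2/2. It approximates I[y] with a midpoint rule and shows one integrand that uses only y' and one that uses both y and y'.

```python
import math

def functional(F, y, dy, a, b, n=2000):
    """Approximate I[y] = integral_a^b F(x, y(x), y'(x)) dx with the midpoint rule."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        x = a + (k + 0.5) * h
        total += F(x, y(x), dy(x)) * h
    return total

def arc_length_integrand(x, y_val, dy_val):
    # Depends on the slope only; y_val is accepted but ignored.
    return math.sqrt(1.0 + dy_val ** 2)

def energy_integrand(x, y_val, dy_val):
    # Kinetic-like term from y', potential-like term from y (assumed form).
    return 0.5 * dy_val ** 2 + 0.5 * y_val ** 2

# Two curves joining (0, 0) and (1, 1): a straight line and a parabola.
line_length = functional(arc_length_integrand, lambda x: x, lambda x: 1.0, 0.0, 1.0)
parabola_length = functional(arc_length_integrand, lambda x: x * x, lambda x: 2.0 * x, 0.0, 1.0)
line_energy = functional(energy_integrand, lambda x: x, lambda x: 1.0, 0.0, 1.0)
```

Both integrands share the same three-argument signature; the arc-length one simply never looks at `y_val`. Between the same endpoints the straight line comes out shorter than the parabola.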
 
  • #7
Thanks. I understand in those cases, but would what I said be correct in the more general case of applying the principle of stationary action to a physical system, i.e. one wishes to describe the state of some system at time [itex]t_{0}[/itex] and what state it evolves to at a later (fixed) time [itex]t_{1}[/itex]. To do so one must specify the coordinates [itex]q_{i}[/itex] of the components of the system and also how those coordinates change in time, i.e. their derivatives, [itex]\dot{q}_{i}[/itex] in the time interval [itex]t\in [t_{0}, t_{1}][/itex]. Thus, we require a function of the form [tex]\mathcal{L}= \mathcal{L}\left(q_{i}(t),\dot{q}_{i}(t)\right)[/tex] to completely specify the state of the system at any time [itex]t\in [t_{0}, t_{1}][/itex]. From this, we define a functional, the action such that [tex]S\left[q_{i}(t)\right] = \int_{t_{0}}^{t_{1}} \mathcal{L}\left(q_{i}(t),\dot{q}_{i}(t)\right) dt [/tex] that associates a number to each path [itex]\vec{q}(t)=\left(q_{i}(t)\right)[/itex] between the (fixed) states [itex]\vec{q}\left(t_{0}\right)[/itex] and [itex]\vec{q}\left(t_{1}\right)[/itex]. We then invoke the principle of stationary action to assert that the actual physical path taken between these two points is the one which satisfies [itex]\delta S = 0 [/itex]. Would this be a correct interpretation?
 
  • #8
That is a good description and matches how I understand calculus of variations in the context of general physical systems.
 
  • #9
Great. Thanks very much for your help!
 
  • #10
As a follow-up. Would it be fair to say then, that as [itex]S\left[\vec{q}(t)\right][/itex] contains information about all possible paths between the points [itex]\vec{q}\left(t_{0}\right)[/itex] and [itex]\vec{q}\left(t_{1}\right)[/itex], this implies that the integrand will be a function of the values of those paths and their derivatives at each point [itex]t\in [t_{0}, t_{1}][/itex]. Now, as at each point along the path in the interval [itex]t\in [t_{0}, t_{1}][/itex], once we have specified the position we are free to specify how that position changes (i.e. the velocity) at that point independently, as we are considering all possible paths. However, upon imposing the principle of stationary action, we are choosing a particular path, i.e. the one which extremises the action. This re-introduces the explicit dependence of [itex]\dot{q}_{i}(t)[/itex] on [itex]q_{i}(t)[/itex] via the relation [tex]\delta\dot{q}_{i}(t) = \frac{d}{dt}\left(\delta q_{i}(t)\right)[/tex] Apologies to re-iterate, just trying to fully firm up the concept in my mind.
 
  • #11
Sorry, please ignore the post above - I realized the error in what I was writing after posting it and the forum won't let me delete it now!

Instead of the above post, is the following a correct summary (pertaining to the Lagrangian and why it is dependent on position and velocity):

The state of a mechanical system at a given time [itex]t_{0}[/itex] is completely specified by the positions of the particles within it, along with their corresponding velocities. Thus, if we wish to describe the state of this system at some later time [itex]t[/itex] in some fixed time interval, then we need to specify how the system evolves over this interval, i.e. we require a function which depends on the positions of the particles and also the rate at which those positions are changing (i.e. their velocities) at each point within the time interval (a requirement if we wish to consider external forces acting on the particles). This motivates us to consider a function [itex]\mathcal{L}= \mathcal{L}\left(q_{i}(t), \dot{q}_{i}(t)\right)[/itex] which completely specifies the state of a mechanical system at each point [itex]t \in [t_{0},t_{1}] [/itex].
 
  • #12
My intuition is that the Lagrangian is sort of a cost function. You might not care about y' in some problems. So, you could imagine a problem in which your cost per unit time to travel from point A to point B in a fixed amount of time is strictly a function of position. You would then try to spend as much of your time in the areas of lower cost to minimize your travel expenses. But let's say you want to discourage speeding as well, so you want to penalize higher velocities. My intuition is that it's easier to apply that speeding penalty if you make the Lagrangian also a function of velocity. For example, you could add the speed squared or cubed or whatever you want. So, it's natural to want to introduce y' as a variable to be able to put that into your cost function. It's a pretty flexible construction, so you can imagine that we can just try to penalize any path that doesn't follow the laws of physics that we want, and hopefully, that will give you a description of physics. When you work out the details, it does turn out to work.
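A small Python sketch of this cost-function picture, under assumptions that are not in the post: the position cost is taken to be c(x) = x (cheaper near the start), the speeding penalty is the speed squared, and the two travel schedules `steady` and `wait_then_dash` are made up for illustration.

```python
def path_cost(q, T, position_cost, speed_penalty, n=1000):
    """Sum (position_cost(q) + speed_penalty * v**2) * dt along a sampled path q(t)."""
    dt = T / n
    total = 0.0
    for k in range(n):
        q0, q1 = q(k * dt), q((k + 1) * dt)
        v = (q1 - q0) / dt                      # finite-difference velocity
        total += (position_cost(0.5 * (q0 + q1)) + speed_penalty * v ** 2) * dt
    return total

def steady(t):
    # Constant speed from x = 0 to x = 1 in unit time.
    return t

def wait_then_dash(t):
    # Sit in the cheap region until t = 0.9, then cover the distance quickly.
    return 0.0 if t < 0.9 else (t - 0.9) / 0.1

def position_cost(x):
    # Hypothetical cost per unit time: cheaper near x = 0.
    return x

# Cost depends on position only.
steady_plain = path_cost(steady, 1.0, position_cost, 0.0)
dash_plain = path_cost(wait_then_dash, 1.0, position_cost, 0.0)

# Same paths with a speed-squared penalty added.
steady_penalised = path_cost(steady, 1.0, position_cost, 1.0)
dash_penalised = path_cost(wait_then_dash, 1.0, position_cost, 1.0)
```

With no velocity term the dash schedule is cheaper because it lingers where the cost is low; once the penalty is switched on, the steady schedule wins. Expressing that preference is only possible because the integrand is allowed to see the velocity.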
 
  • #13
"Don't panic!" said:
Yeah, I guess I'm really trying to understand why the integrand is treated as a function of [itex]y[/itex] and [itex]y'[/itex], why not just [itex]y[/itex]? What's the justification/mathematical (and/or) physical reasoning behind it?

I haven't heard a mathematical answer to that question yet. Let me reiterate the question emphasizing the mathematical aspect.

When we have a function such as [itex] y = 3x + x^2 [/itex] we denote it as [itex] y = f(x) [/itex], not as [itex] y = f(x,3x,x^2) [/itex], even though evaluating [itex] f [/itex] involves the intermediate steps of evaluating [itex] 3x [/itex] and [itex] x^2 [/itex]. So why is an expression like [itex] F(x,y,y') [/itex] necessary in discussing the integrand in the calculus of variations? Isn't computing [itex] y' [/itex] from [itex] y [/itex] an intermediate step in the process? If we are given [itex] y [/itex] we can find [itex] y', y'',... [/itex] etc. Why not just write the integrand as [itex] F(x,y) [/itex] or even [itex] F(x) [/itex]? After all, the integration [itex] \int_a^b F(...) dx [/itex] is ordinary integration. The integrand must be a function of [itex] x [/itex].

My conjecture for an explanation:

In the expression [itex] I(y) = \int_a^b F(x,y,y') dx [/itex] we see that it's [itex] I(y) [/itex] instead of [itex]I(y,y') [/itex], so the fact that finding [itex] y' [/itex] is needed as an intermediate step isn't recognized on the left-hand side.

If we have a function like [itex] z = x + 3x + x^2 [/itex] we can choose to describe it in a way that exhibits intermediate calculations. For example, let [itex] y = 3x + x^2 [/itex] and [itex] F(a,b) = a + b [/itex]. Then we can write [itex] z = F(x,y) [/itex].

By analogy the notation [itex] F(x,y,y') [/itex] indicates a particular choice of representing the integrand that takes pains to exhibit intermediate calculations. It's not a simple algebraic expression. The computation implied by [itex] F(x,y,y') [/itex] is an algorithm. As far as I can see, there is nothing incorrect about notation like [itex] I(y) = \int_a^b G(x,y) dx [/itex] to describe the same functional. It's just that the processes described by [itex] F [/itex] and [itex] G [/itex] would be technically different. Thinking of [itex] F [/itex] and [itex] G [/itex] as computer routines, the routine [itex] F [/itex] requires that you compute [itex] y' [/itex] and then give it as input to [itex] F [/itex]. The routine [itex] G [/itex] does not.

So I think the notation [itex] F(x,y,y') [/itex] is not a necessary notation. It is a permissible notation that may be helpful if it reminds us of the steps involved in forming the integrand.
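The two "routines" can be sketched in Python. This is only an illustration under one reading of the description: `F` receives three numbers, while `G` is assumed to receive the function `y` itself and to estimate the derivative internally with a central difference. The step `h` and the sample integrand y^2 + y'^2 are assumptions, not taken from the thread.

```python
import math

def F(x, y_val, dy_val):
    """Integrand as a function of three plain numbers."""
    return y_val ** 2 + dy_val ** 2

def G(x, y, h=1e-6):
    """Same integrand, but the caller passes the function y, not its values."""
    dy_val = (y(x + h) - y(x - h)) / (2.0 * h)   # central-difference estimate of y'(x)
    return F(x, y(x), dy_val)

x0 = 0.3
value_from_F = F(x0, math.sin(x0), math.cos(x0))   # caller supplies y(x0) and y'(x0)
value_from_G = G(x0, math.sin)                      # routine works out y'(x0) itself
```

Both calls describe the same integrand for y = sin; they differ only in who is responsible for producing y'(x).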
 
  • #14
Thanks for your help on the matter.
Would it be fair to say the following:

The configuration of a system at a given instant in time is completely determined by specifying the coordinates of each of the particles within the system at that instant. However, using just this information one cannot determine the configuration of the system at subsequent instants in time. To do so requires knowledge of the rate of change of these positions at the instant considered. For given values of the coordinates the system can have any velocities (as we are considering the coordinates and velocities of the particles at the same instant in time), and this will affect the configuration of the system after an infinitesimal time interval [itex]dt[/itex]. Thus, by simultaneously specifying the coordinates and velocities of the particles at a given instant in time, we can, in principle, calculate its subsequent time evolution. This means that, if the coordinates and velocities of the particles are specified at a given instant [itex]t_{0}[/itex], then the accelerations of those particles are uniquely defined at that instant, enabling one to construct equations of motion for the system.

Following the principle of stationary action, we are motivated to consider a function which summarises the dynamics of a physical system at each given instant in time (over some finite time interval), along all possible paths that the system could take between two fixed configurations, [itex]\vec{q} (t_{0})[/itex] and [itex]\vec{q} (t_{1})[/itex]. As such, taking into account the discussion above, we can infer that for this function to successfully summarise the dynamics of the system at each point, it is sufficient for it to be a function of the coordinates [itex]q_{i} [/itex] and the velocities [itex]\dot{q}_{i} (t) [/itex] of the constituent components of the system, i.e. a function of the form [itex]\mathcal{L} =\mathcal{L} (q_{i} (t), \dot{q}_{i} (t))[/itex] (we need not consider higher order derivatives as it is known that the dynamical state of the system, at a given instant in time, is completely specified by the values of its coordinates and velocities at that instant).

Given this, we can then attribute a value to the dynamics of the system, depending on the path [itex] \vec{q} (t)= (q_{1},\ldots ,q_{n})[/itex] that it takes between the two fixed configurations, [itex]\vec{q} (t_{0})[/itex] and [itex]\vec{q} (t_{1})[/itex]. We do so by defining a functional, the action, as follows [tex] S[\vec{q} (t)] = \int_{t_{0}}^{t_{1}}\mathcal{L} (q_{i} (t), \dot{q}_{i} (t)) dt [/tex] The principle of stationary action then asserts that the actual path taken by the system between these two fixed configurations is the one for which the action is extremised (i.e. the path which gives an extremal value to this integral).
 
  • #15
"Don't panic!" said:
Thanks for your help on the matter.
Would it be fair to say the following:

The question I have about the physics that followed is: what does it say about mathematical notation like [itex] F(x,y,y') [/itex] or [itex] G(x,y) [/itex] when [itex] y [/itex] is a function of x? Thinking of [itex] F [/itex] and [itex] G [/itex] as being implemented by computer algorithms, what does the argument [itex] y [/itex] represent?

One possibility is that [itex] y [/itex] represents a function. In many computer languages an argument can be a function instead of a single number. If we give an algorithm the ability to access the function [itex] y(x) [/itex] then it can in principle compute [itex] y', y'', y''' [/itex]. This is the convention that applies to the notation [itex] I(y) [/itex]. In that notation, [itex] y [/itex] represents a function.

Another possibility is that [itex] y [/itex] represents a single numerical value. In that case, notation like [itex] G(x,y) [/itex] does not represent giving [itex] G [/itex] the knowledge of the function [itex] y(x) [/itex]. So we cannot assume that the algorithm [itex] G [/itex] can compute [itex] y'(x) [/itex].

Under the convention that arguments are single numerical values then I don't see how the algorithm [itex] F(x,y,y') [/itex] can reconstruct any information about [itex] y'' [/itex] (acceleration) from pure mathematics. To do that, it would have to know the behavior of [itex] y' [/itex] in an interval. Are you saying we have a physical situation where the knowledge of position and velocity at one point in time is sufficient to compute the subsequent behavior of the system (and hence compute any derivative of that behavior that is desired)?

( There is another recent thread where someone remarks that physicists often use ambiguous notation that makes it difficult to distinguish between a function and single numerical value that comes from evaluating that function.)
 
  • #16
I was following the Landau-Lifshitz book on classical mechanics to be honest, where they describe it in a similar manner. I think what is perhaps meant is that, using this information as initial conditions for an equation of motion, one can uniquely determine the acceleration at that initial instant?!
My thoughts were that for each possible path between two points, the lagrangian is a function of the coordinates and velocities of this path, such that, at each instant in time along the time interval, the lagrangian characterises the dynamics of the system if it were to follow that path (i.e. by plugging the values of the coordinates and velocities at each instant in time along the path into the lagrangian we can characterise the dynamics of the system along that path).
 
  • #17
The notation ##F(x,y,y')## is pretty bad in my opinion. It should be ##F(x,y(x),y'(x))##. ##y## is a function. ##y(x)## is an element of the codomain of ##y##, so it's typically a number. ##F## doesn't take functions as input. It takes three real numbers.

Similarly, I would never write ##S[\vec q(t)]##, because ##\vec q(t)## is an element of ##\mathbb R^3##, not a function. (It's a "function of t" in the sense that its value is determined by the value of t, but it's still not a function). I would write
$$S[\vec q]=\int_a^b L(\vec q(t),\vec q'(t),t)\mathrm dt.$$ (When I do calculations with a pen and paper, I will of course abuse the notation to avoid having to write everything out). ##L## is usually something very simple. In the classical theory of a single particle moving in 1 dimension, as influenced by a potential ##V:\mathbb R\to\mathbb R##, it can be defined by ##L(r,s,u)=\frac{1}{2}ms^2-V(r)## for all ##r,s,u\in\mathbb R##. Note that this ensures that ##L(q(t),q'(t),t)=\frac{1}{2}mq'(t)^2-V(q(t))## for all t.
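As a minimal numerical sketch of this definition (illustrative only): the code below takes ##L## exactly as a function of three real numbers, assumes a uniform-gravity potential ##V(r)=mgr## with made-up values for ##m## and ##g##, and evaluates ##S[q]## by a midpoint rule for paths with ##q(0)=q(1)=0##. The names `action`, `q_true` and `make_perturbed` are hypothetical.

```python
import math

m, g = 1.0, 9.81   # illustrative mass and gravitational acceleration (assumed values)

def V(r):
    """Assumed potential: uniform gravity, V(r) = m*g*r."""
    return m * g * r

def L(r, s, u):
    """Lagrangian as a function of three real numbers: position r, velocity s, time u."""
    return 0.5 * m * s ** 2 - V(r)

def action(q, dq, a, b, n=2000):
    """Approximate S[q] = integral_a^b L(q(t), q'(t), t) dt with the midpoint rule."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * h
        total += L(q(t), dq(t), t) * h
    return total

def q_true(t):
    # Solves q'' = -g with q(0) = q(1) = 0.
    return 0.5 * g * t * (1.0 - t)

def dq_true(t):
    return 0.5 * g * (1.0 - 2.0 * t)

def make_perturbed(eps):
    """Return a path (and its derivative) with the same endpoints, shifted by eps*sin(pi*t)."""
    q = lambda t: q_true(t) + eps * math.sin(math.pi * t)
    dq = lambda t: dq_true(t) + eps * math.pi * math.cos(math.pi * t)
    return q, dq

S_true = action(q_true, dq_true, 0.0, 1.0)
S_up = action(*make_perturbed(0.1), 0.0, 1.0)
S_down = action(*make_perturbed(-0.1), 0.0, 1.0)
```

Note that `L` never receives a path: `action` evaluates the path and its derivative at each time and hands `L` three numbers. For this particular Lagrangian the unperturbed path gives the smallest of the three action values, consistent with it being the stationary path for the fixed endpoints.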
 
  • #18
Fredrik said:
The notation ##F(x,y,y')## is pretty bad in my opinion. It should be ##F(x,y(x),y'(x))##. ##y## is a function. ##y(x)## is an element of the codomain of ##y##, so it's typically a number. ##F## doesn't take functions as input. It takes three real numbers.

Exactly. That's what I was trying to allude to in my description. The Lagrangian is a function of the values of the coordinates and velocities of the particle at each given instant over the time interval considered.

Would what I said in the post (above yours) about why the Lagrangian is a function of coordinates and velocities, in a more general sense, be correct? (I know that for conservative systems it assumes the form [itex]\mathcal{L}=T-V[/itex], but I was trying to justify to myself the reasoning as to why we consider the Lagrangian to be a function of position and velocity in the first place, before considering any particular cases, in which the components, such as [itex]T[/itex] and [itex]V[/itex], are clearly functions of the coordinates and velocities?)

Also, is what I said about the action (in previous post), i.e. as a means of attributing a value to the characteristic dynamics of a system due to it following a particular path, [itex]\vec{q}[/itex], enabling us to distinguish the actual physical path taken by the system (using variational techniques), correct?
 
  • #19
Stephen Tashi said:
To do that, it would have to know the behavior of [itex] y' [/itex] in an interval. Are you saying we have a physical situation where the knowledge of position and velocity at one point in time is sufficient to compute the subsequent behavior of the system (and hence compute any derivative of that behavior that is desired)?

In reference to this part I was following Landau-Lifshitz:

"If all the coordinates and velocities are simultaneously specified, it is known from experience that the state of the system is completely determined and its subsequent motion can, in principle, be calculated. Mathematically, this means that, if all the coordinates [itex]q[/itex] and velocities [itex]\dot{q}[/itex] are given at some instant, the accelerations [itex]\ddot{q}[/itex] at that instant are uniquely defined."

(Mechanics, L.D. Landau & E.M. Lifshitz)

I was a bit unsure about this for the same reasons that you mentioned in your post.
 
  • #20
That sounds like the theorem that says (roughly) that if f is a nice enough function, then the differential equation ##\vec x''(t)=f(\vec x(t),\vec x'(t),t)## has a unique solution for each initial condition ##\vec x(t_0)=\vec x_0##, ##\vec x'(t_0)=\vec v_0##.
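A numerical illustration of that statement, as a rough sketch rather than anything rigorous: the routine below integrates ##x''(t)=f(x(t),x'(t),t)## forward from a given position and velocity using the classical fourth-order Runge-Kutta method. The choice ##f=-x## (a unit harmonic oscillator) and the name `integrate` are assumptions made for the example.

```python
import math

def integrate(f, x0, v0, t0, t1, n=1000):
    """RK4 for x'' = f(x, v, t); returns (x(t1), v(t1)) from the data (x0, v0) at t0."""
    h = (t1 - t0) / n
    x, v, t = x0, v0, t0
    for _ in range(n):
        k1x, k1v = v, f(x, v, t)
        k2x, k2v = v + 0.5 * h * k1v, f(x + 0.5 * h * k1x, v + 0.5 * h * k1v, t + 0.5 * h)
        k3x, k3v = v + 0.5 * h * k2v, f(x + 0.5 * h * k2x, v + 0.5 * h * k2v, t + 0.5 * h)
        k4x, k4v = v + h * k3v, f(x + h * k3x, v + h * k3v, t + h)
        x += h * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
        v += h * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
        t += h
    return x, v

def oscillator(x, v, t):
    # Assumed right-hand side: x'' = -x.
    return -x

# Same initial position, two different initial velocities.
x_a, v_a = integrate(oscillator, 1.0, 0.0, 0.0, 2.0)
x_b, v_b = integrate(oscillator, 1.0, 0.5, 0.0, 2.0)
```

The first run reproduces cos(t) at t = 2 to high accuracy, and the second run ends somewhere else entirely, which is the point: the position alone at ##t_0## does not fix the trajectory, but position and velocity together do.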

Lagrangian mechanics is based on a slightly different theorem (I don't recall actually seeing such a theorem, but I'm fairly sure that one exists): A unique solution for each boundary condition ##\vec x(t_a)=\vec x_a##, ##\vec x(t_b)=\vec x_b##.
 
  • #21
Fredrik said:
That sounds like the theorem that says (roughly) that if f is a nice enough function, then the differential equation ##\vec x''(t)=f(\vec x(t),\vec x'(t),t)## has a unique solution for each initial condition ##\vec x(t_0)=\vec x_0##, ##\vec x'(t_0)=\vec v_0##.

Does that explain what Landau means in the section that I quoted?
(The text that I quoted was from their section leading onto formulating lagrangian mechanics).

Fredrik said:
Lagrangian mechanics is based on a slightly different theorem (I don't recall actually seeing such a theorem, but I'm fairly sure that one exists): A unique solution for each boundary condition ##\vec x(t_a)=\vec x_a##, ##\vec x(t_b)=\vec x_b##.

Yes, good point. Is what I said about the lagrangian correct though?

The lagrangian is a function of the coordinates and velocities of this path, such that, at each instant in time along the time interval the lagrangian characterises the dynamics of the system if it were to follow that path (i.e. by plugging in the values of the coordinates and velocities at each instant in time along the path into the lagrangian we can characterise the dynamics of the system along that path).
Then, we introduce the action as a means of attributing a value to the characteristic dynamics of a system due to it following a particular path, [itex]\vec{q}[/itex], enabling us to distinguish the actual physical path taken by the system (using variational techniques and specifying the boundary conditions, [itex] \vec{q}(t_{0}) [/itex] and [itex] \vec{q}(t_{1}) [/itex]).

I'm just trying to justify to myself a bit more what the lagrangian is, why it's a function of both coordinates and velocities (I assume that to be able to fully specify the dynamics of the system at each instant in the time interval considered, one needs to know the positions of all the particles and the rate of change of those positions at that point?!), and understand a bit more what the action actually is?!
 
  • #22
It's very hard to comment on whether a description in words of something mathematical is "correct". The wordy description isn't going to be as precise. If it was, we wouldn't have needed the mathematical description.

"Don't panic!" said:
Does that explain what Landau means in the section that I quoted?
(The text that I quoted was from their section leading onto formulating lagrangian mechanics).
I don't know what they meant exactly. I'm puzzled by the fact that they're saying that if you know the positions and the velocities at an instant, then you know the accelerations at that instant. The theorem I mentioned says that if you know the positions and the velocities at an instant, then you know the function that gives you the positions at all times. Then you can use it to determine velocities, accelerations and other things, at all times.

"Don't panic!" said:
Is what I said about the lagrangian correct though?
As I said, it's very difficult to comment, but I will give it a try.

"Don't panic!" said:
The lagrangian is a function of the coordinates and velocities of this path,
I wouldn't say that, since you only plug in the positions and velocities at one time. You could say that it's a function of the coordinates and velocities at a point on the path.

There's a much fancier way to say this. The set of all "positions" is a manifold called the system's configuration space. A velocity is a tangent vector at some point in the configuration space, so it's an element of the tangent space at that point. The set of all pairs ##(x,v)## where x is a point in the manifold and v is a tangent vector at x, is called the tangent bundle. The Lagrangian is a function from the tangent bundle (of the system's configuration space) into ##\mathbb R##.

"Don't panic!" said:
such that, at each instant in time along the time interval the lagrangian characterises the dynamics of the system if it were to follow that path (i.e. by plugging in the values of the coordinates and velocities at each instant in time along the path into the lagrangian we can characterise the dynamics of the system along that path).
What path? It doesn't take a path as input. I suppose you could, for each t, define a function ##L_t## by ##L_t[q]=L(q(t),q'(t),t)## for all paths q. But what does it mean to characterize the dynamics of the system?

"Don't panic!" said:
I'm just trying to justify to myself a bit more what the lagrangian is,
The way I see it, Newtonian, Lagrangian and Hamiltonian mechanics are three different approaches to how to add matter and interactions to an otherwise empty spacetime. To define a classical theory of physics, we must specify the matter content of spacetime, and its interactions. If we want to use the theorem about differential equations that says one solution for each initial condition on the positions and velocities, then we define the theory by writing down a force and postulating that the path is found by solving the equation called Newton's 2nd law. If we want to use the other theorem, the one that guarantees one solution for each boundary condition, we define the theory by writing down a Lagrangian and postulating that the path is found by solving the Euler-Lagrange equation. In both of these approaches, the function that defines the theory is essentially just guessed. I don't know if you can describe what it is in a meaningful way.

I suppose that you could say something like this: The action assigns a "badness score" to each path in configuration space. Each path in configuration space defines a path in the tangent bundle. The Lagrangian tells us how different parts of the tangent bundle contribute to the "badness score" of a path through those parts.

"Don't panic!" said:
why it's a function of both coordinates and velocities (I assume that to be able to fully specify the dynamics of the system at each instant in the time interval considered, one needs to know the positions of all the particles and the rate of change of those positions at that point?!), and understand a bit more what the action actually is?!
That theorem is just as useful if the Lagrangian is independent of one or more of those variables. But there are conserved quantities in that case. (If L is independent of a position coordinate, the corresponding momentum will not change with time). So if we want a theory in which the momenta are changing (e.g. when we give something a push), we can't make L independent of the positions. I think something similar can be said about the velocities, but I haven't really thought about what that would be.
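The parenthetical remark about conserved momenta follows in one line from the Euler-Lagrange equation. As a short worked check, writing ##p_i## for the momentum conjugate to ##q_i## (standard notation, not introduced earlier in this post):

```latex
\frac{d}{dt}\left(\frac{\partial L}{\partial \dot q_i}\right) = \frac{\partial L}{\partial q_i},
\qquad
p_i \equiv \frac{\partial L}{\partial \dot q_i}
\quad\Longrightarrow\quad
\left( \frac{\partial L}{\partial q_i} = 0 \;\Rightarrow\; \frac{dp_i}{dt} = 0 \right).
```

For the single-particle example ##L=\frac{1}{2}m\dot q^2 - V(q)## this reads ##p=m\dot q## and ##\dot p=-V'(q)##, so the momentum can change only if ##V##, and hence ##L##, depends on ##q##.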
 
  • #23
Thanks very much for your answers.

Yeah, sorry I didn't explain myself too well. I meant, like you said, that at each instant, [itex] t[/itex], the lagrangian is a function of the coordinates and velocities evaluated at that instant.
Would it be fair to say that we wish the lagrangian to contain all information about the dynamics of the system, the forces acting on it..., and this implies that for a general physical system, it should be a function of position and velocity (as quantities such as the configuration of the system, and potentials will depend on position, and also quantities such as momentum, etc. will depend on velocity)?!

Sorry to be a pain, I think my main issue is trying to conceptualise in my mind why we consider the lagrangian as a function of velocity as well as position in the first place?! I think I've confused myself, as I thought I understood it when I originally read the chapter I referred to in the Landau-Lifshitz book on mechanics!
 
  • #24
"Don't panic!" said:
Would it be fair to say that we wish the lagrangian to contain all information about the dynamics of the system, the forces acting on it..., and this implies that for a general physical system, it should be a function of position and velocity (as quantities such as the configuration of the system, and potentials will depend on position, and also quantities such as momentum, etc. will depend on velocity)?!
I'd say that what we wish is to be able to apply the theorems about existence and uniqueness to the Euler-Lagrange equation.

"Don't panic!" said:
Sorry to be a pain, I think my main issue is trying to conceptualise in my mind why we consider the lagrangian as a function of velocity as well as position in the first place?!
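If we drop the velocities, the Euler-Lagrange equation is reduced to ##\frac{\partial L}{\partial q_i}=0##. That is an algebraic condition rather than a differential equation of motion, and it holds along every path only if the Lagrangian does not depend on the positions either. So (apart from any explicit time dependence) the Lagrangian would be a constant function.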
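Written out, the reduction looks like this (a short sketch; ##L(q,t)## denotes an assumed Lagrangian with no velocity dependence):

```latex
\frac{d}{dt}\left(\frac{\partial L}{\partial \dot q_i}\right) - \frac{\partial L}{\partial q_i} = 0,
\qquad
\frac{\partial L}{\partial \dot q_i} = 0
\quad\Longrightarrow\quad
\frac{\partial L}{\partial q_i}\bigl(q(t),t\bigr) = 0 .
```

No time derivative of ##q## survives, so there is nothing to integrate forward from initial data: the condition only restricts which values ##q(t)## may take at each instant.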
 
  • #25
So is the only reason that we consider the lagrangian to be a function of position and velocity (and no more or fewer derivatives) that we wish the equations of motion (i.e. the Euler-Lagrange equations) to be of second order, as we know empirically that one can completely determine the motion of a system with this information?
 
  • #26
Probably yes. I can't think of any other reason.
 
  • #27
Ok, thanks for all your input. Really appreciate you spending time on the matter! :)
 
  • #28
"Don't panic!" said:
I'm just trying to justify to myself a bit more what the lagrangian is, why it's a function of both coordinates and velocities (I assume that to be able to fully specify the dynamics of the system at each instant in the time interval considered, one needs to know the positions of all the particles and the rate of change of those positions at that point?!)

There are two different questions.
1) What is the physical interpretation of the integrand [itex] L( q(t) , q'(t), t) [/itex]?
2) Why does the actual path of a physical system follow the particular q(t) that minimizes the action integral?

I wouldn't say that the integrand itself (the Lagrangian) determines or describes the dynamics of a physical system, since physical systems follow the special path that minimizes an integral of the integrand. So the value of the integrand at a point in time is not a complete physical description of a system.

One can take the attitude of expositions like http://www.physicsinsights.org/lagrange_1.html that the Lagrangian is an artificial mathematical construction. That outlook is: I want to solve a differential equation with given initial and final conditions. I will invent an integrand such that finding the path that minimizes an integral of it gives the solution to the differential equation. (The case of motion under a constant frictional force discussed in that link emphasizes the "artificial" nature of the Lagrangian.)
 
  • #29
I had thought of the Lagrangian as a function that characterises the state of the system at each point along an allowed path (i.e. one which satisfies the boundary conditions), and is as such a function of the coordinate values and the velocity values at that point (as the state of the system can be completely specified at a given instant by the coordinates and the rate of change of those coordinates, i.e. the velocities, at that instant). In this sense, for each allowed path we can specify the coordinate and velocity values at each point along the path, and inserting these values into the Lagrangian would (in principle) allow one to characterise the state of the system at each point along the path. Integrating over a given interval then gives a characteristic value to each path (the action), allowing one to distinguish between paths and determine the actual physical path taken by the system.
We then assert that the actual path taken is the one which gives a stationary value to the action, and hence (again, in principle) evaluating the Lagrangian (upon inserting the coordinate and velocity values at each point along this path) at each point along this path will characterise the actual state of the system at each point along the path.
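
Here is a small numerical sketch of that picture. It assumes a particle of mass ##m## in uniform gravity, with ##L = \frac{1}{2} m \dot y^2 - m g y## and both endpoints fixed at ##y = 0##; the helper names `action` and `make_path`, the grid size and the sine-shaped bump are all choices made just for this illustration. Each trial path gets one number, its discretised action, and the unperturbed free-fall path gets the smallest one:

```python
import math

def action(path, dt, m=1.0, g=9.81):
    """Discretised action for L = (1/2) m v^2 - m g y on a uniform time grid."""
    total = 0.0
    for y0, y1 in zip(path, path[1:]):
        v = (y1 - y0) / dt           # finite-difference velocity on this step
        y_mid = 0.5 * (y0 + y1)      # average position over the step
        total += (0.5 * m * v * v - m * g * y_mid) * dt
    return total

def make_path(eps, n=200, T=1.0, g=9.81):
    """Free-fall path with y(0) = y(T) = 0, plus a bump of amplitude eps."""
    dt = T / n
    path = []
    for i in range(n + 1):
        t = i * dt
        y_true = 0.5 * g * t * (T - t)            # solves y'' = -g
        bump = eps * math.sin(math.pi * t / T)    # vanishes at both endpoints
        path.append(y_true + bump)
    return path, dt

for eps in (-0.2, -0.1, 0.0, 0.1, 0.2):
    path, dt = make_path(eps)
    print(f"eps = {eps:+.1f}  action = {action(path, dt):.4f}")
```

For this particular Lagrangian the stationary path happens to be a genuine minimum, so the `eps = 0` row has the lowest action. In general the condition is only that the action be stationary.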

Sorry this is a bit wordy, just trying to express in words how I've been trying to think about it in my mind (it has really been bugging me, as you can probably tell!).
 

1. What is the difference between a functional and a regular function?

A functional is a mapping from a set of functions to the real numbers, while a regular function is a mapping from a set of numbers to another set of numbers. In other words, a functional takes an entire function as its input and produces a single numerical value, while a regular function takes a number as its input and produces a number as its output.
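
As a minimal sketch of this distinction, the arc-length functional ##I[y] = \int_a^b \sqrt{1 + y'^2}\,dx## can be approximated numerically. The function name `arc_length` and the two sample curves are chosen only for this example. The input is a whole function `y` and the output is a single number:

```python
import math

def arc_length(y, a, b, n=10_000):
    """Approximate I[y] = integral of sqrt(1 + y'(x)^2) dx by summing chord lengths."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        x0, x1 = a + i * h, a + (i + 1) * h
        total += math.hypot(x1 - x0, y(x1) - y(x0))
    return total

def straight(x):
    return x

def parabola(x):
    return x * x

# Both curves join (0, 0) to (1, 1), but the functional assigns them different values.
print("straight line:", arc_length(straight, 0.0, 1.0))
print("parabola:     ", arc_length(parabola, 0.0, 1.0))
```

The two curves share the same endpoints, yet the straight line gets the smaller value, because the integrand depends on the slope ##y'## along the way and not just on the endpoint values of ##y##.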

2. How is the calculus of variations used in real-world applications?

The calculus of variations is used in a variety of fields, such as physics, engineering, economics, and biology, to find the optimal solution to a problem. Some examples include finding the path of a particle that minimizes a certain physical property, determining the shape of a bridge that can support the most weight, and optimizing the shape of a wing for efficient flight.

3. What are the main principles of the calculus of variations?

The main principles of the calculus of variations include the Euler-Lagrange equation, which is a necessary condition satisfied by the stationary points of a functional, and the method of variations, which involves perturbing the input function slightly and requiring the first-order change in the functional (the first variation) to vanish.

4. How does the calculus of variations relate to classical mechanics?

The calculus of variations is closely related to classical mechanics through the principle of stationary action (often called the principle of least action): the path a system actually follows makes the action, the time integral of the Lagrangian, stationary. The Euler-Lagrange equations that follow from this condition are the equations of motion.

5. Are there any limitations to the use of the calculus of variations?

While the calculus of variations is a powerful tool, it does have some limitations. It is often unable to find solutions for complex problems with multiple constraints or when the functional is highly nonlinear. Additionally, the method can be computationally intensive and may not always provide a unique solution.
