# Total Derivative of a Constrained System

Hi all,

I was working on a problem using Euler-Lagrange equations, and I started wondering about the total and partial derivatives. After some fiddling around in equations, I feel like I have confused myself a bit.

I'm not a mathematician by training, so there is probably some terminology that would let me state my problem more clearly, but alas, it is unknown to me :) But here goes.

Suppose we have a function ##f(x,y,z,t)## whose arguments are themselves, through composition, functions of ##t##.
$$x \rightarrow x(y,z,t)\\ y \rightarrow y(x,z,t)\\ z \rightarrow z(x,y,t)$$
If the functions for x, y and z are related to each other in a simple enough way, we can back-substitute and reduce ##f## to a function of ##t## alone.

Suppose I wanted to find the derivative ##df/dt##, I could start by doing the total derivative of ##f##.
$$\dfrac{df}{dt} = \sum_i \dfrac{\partial{f}}{\partial{\sigma_i}}\dfrac{d \sigma_i}{d t}$$
Where ##\sigma_i## represents one of the four parameters ##(x,y,z,t)##.
My first insight (I hope) is that the second factor, ##d\sigma_i/dt##, is itself the total derivative of the corresponding variable (x, y, z or t).

Suppose the relationships between x, y, z and t are such that I can easily back-substitute and end up with a single function of just t. For example, if ##x\rightarrow x(y,z,t)##, ##y\rightarrow y(z,t)## and ##z \rightarrow z(t)##, then I can substitute them all back into ##f## until only ##t## is left. The total derivative then works out beautifully as follows (using subscript notation for partial derivatives):
$$\dfrac{df}{dt} = f_t + f_z\dfrac{dz}{dt}+f_y\dfrac{dy}{dt}+f_x\dfrac{dx}{dt}$$
$$\dfrac{df}{dt} = f_t + f_z\dfrac{dz}{dt}+f_y\left(y_t + y_z\dfrac{dz}{dt}\right)+f_x\left(x_t+x_y\dfrac{dy}{dt}+x_z\dfrac{dz}{dt}\right)$$
$$\dfrac{df}{dt} = f_t + f_z\dfrac{dz}{dt}+f_y\left(y_t + y_z\dfrac{dz}{dt}\right)+f_x\left(x_t+x_y\left(y_t + y_z\dfrac{dz}{dt}\right)+x_z\dfrac{dz}{dt}\right)$$
Doing the implicit partial derivatives, and doing the single-variable derivative of z, and then substituting the equations for x,y and z into the stuff that falls out, will yield the correct total (single-variable) derivative of f.
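
The bookkeeping above can be checked numerically. Below is a minimal Python sketch under stated assumptions: the concrete functions `z_of`, `y_of`, `x_of` and `f` are hypothetical examples chosen only to have the dependency pattern ##x(y,z,t)##, ##y(z,t)##, ##z(t)##; they do not come from any particular problem. It compares the expanded chain rule with a finite-difference derivative of the back-substituted function.

```python
import math

# Hypothetical example functions (not from the thread), chosen so that
# x depends on (y, z, t), y depends on (z, t), and z depends on t alone.
def z_of(t):
    return t ** 2

def y_of(z, t):
    return z + math.sin(t)

def x_of(y, z, t):
    return y * z + t

def f(x, y, z, t):
    return x * y + z * t

def f_of_t(t):
    """Back-substitute everything so that f depends on t only."""
    z = z_of(t)
    y = y_of(z, t)
    x = x_of(y, z, t)
    return f(x, y, z, t)

def chain_rule_df_dt(t):
    """Evaluate df/dt from the expanded chain rule, innermost variable first."""
    z = z_of(t)
    y = y_of(z, t)
    x = x_of(y, z, t)

    # Partial derivatives with respect to one argument, the others held fixed.
    f_x, f_y, f_z, f_t = y, x, t, z
    x_y, x_z, x_t = z, y, 1.0
    y_z, y_t = 1.0, math.cos(t)

    # Total derivatives, built up in the order z, y, x.
    dz_dt = 2.0 * t
    dy_dt = y_t + y_z * dz_dt
    dx_dt = x_t + x_y * dy_dt + x_z * dz_dt
    return f_t + f_z * dz_dt + f_y * dy_dt + f_x * dx_dt

def central_difference(g, t, h=1e-6):
    """Symmetric finite-difference approximation of dg/dt."""
    return (g(t + h) - g(t - h)) / (2.0 * h)

t0 = 0.7
print(chain_rule_df_dt(t0), central_difference(f_of_t, t0))
```

The two printed numbers should agree to several digits, which is the sense in which the nested chain rule and plain back-substitution give the same single-variable derivative.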

However, for some relationships of ##x,y,z,t##, the total-derivative scheme becomes complex. Examples are:

1) When there are non-linear relationships between the variables
2) When there is more than one independent variable

An example could be that the variables x, y and z are related by the following constraint, assuming that ##f## is now a function of (x,y,z) only:
$$A = xyz$$
When I take the total derivative, I end up with an infinite regress of terms, because the total derivative of one variable depends on the total derivatives of the others, which in turn depend on the total derivative of the first one, and so on ad infinitum.

So my question is:
How do we deal with total-derivatives of functions where the variables are related by non-trivial (non-composites?) relationships?
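
One way out of the regress for the ##A = xyz## example, sketched here under the assumptions that ##A## is a constant and that ##x## and ##y## are chosen as the independent variables (any two of the three would do, as long as the third can be solved for): a single constraint on three variables leaves two of them free, so the constraint fixes ##z## and its derivatives. The name ##g## is introduced here for the reduced function.

$$z = \frac{A}{xy}, \qquad \frac{\partial z}{\partial x} = -\frac{z}{x}, \qquad \frac{\partial z}{\partial y} = -\frac{z}{y}$$
$$g(x,y) = f\left(x,y,\frac{A}{xy}\right), \qquad \frac{\partial g}{\partial x} = f_x - \frac{z}{x}f_z, \qquad \frac{\partial g}{\partial y} = f_y - \frac{z}{y}f_z$$

The same relations follow from differentiating the constraint directly, ##yz\,dx + xz\,dy + xy\,dz = 0##, and solving for ##dz##. The regress only appears if all three variables are treated as depending on each other at once.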

Stephen Tashi
Suppose I wanted to find the derivative ##df/dt##, I could start by doing the total derivative of ##f##.
$$\dfrac{df}{dt} = \sum_i \dfrac{\partial{f}}{\partial{\sigma_i}}\dfrac{d \sigma_i}{d t}$$
Where ##\sigma_i## represents one of the four parameters ##(x,y,z,t)##.

There is a difference between taking the partial derivative of a function with respect to a "variable" and taking its partial derivative with respect to one of its arguments.

For example, if ##f(x,y,z) = 2x^2 + y + z## and ##y = 5x##, then under the usual interpretation the notation ##\frac{\partial f}{\partial x}## denotes the function ##4x##, not the function ##4x + 5##. So the "##\partial x##" in the notation refers to the first argument of ##f##, not the quantity denoted by "##x##" when we account for all ways it affects the value of the function.

However, in discussions of applied math, I've seen people use the partial derivative notation differently. Because of such confusion, some authors use notation like ##f_1(...)## to indicate the partial derivative of a function with respect to its first argument.
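
With that numbered-argument notation the example above can be written without ambiguity. A short sketch, where ##g## is a name introduced here for the composite function obtained by substituting ##y = 5x##:

$$f_1(x,y,z) = 4x, \qquad f_2(x,y,z) = 1, \qquad f_3(x,y,z) = 1$$
$$g(x,z) = f(x,5x,z) = 2x^2 + 5x + z$$
$$\frac{\partial g}{\partial x} = f_1(x,5x,z) + 5 f_2(x,5x,z) = 4x + 5$$

So ##4x## and ##4x + 5## are both correct, but they are derivatives of two different functions, ##f## and ##g##.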

In your writing, I can't tell how you are using notation like "##\frac{\partial f}{\partial t}##".

Runei

So I am using it in the sense of taking the derivative with respect to an argument, as you show: for the function ##f(x,y,z) = 2x^2+y+z##, ##\partial f/\partial x = 4x##.

My question is then about the ways in which these arguments can be related, and how we deal with the total derivative. So it could be that x, y and z are themselves functions of two other arguments (e.g. ##x\rightarrow x(h,w)##), or there could be relationships between the arguments themselves, as in the following example:
$$x = 2yz^2\\y=z^2+z+1$$
In that case, the partial derivative (##\partial f/\partial x##) is still just the derivative of ##f## with respect to its argument x (and so it is still ##4x##). Also, in that case it is fairly simple to substitute into the equation for ##f## and get a function of ##z## alone.

A more problematic relationship could be the following:
$$x=2y^2+4z\\y=x^2+2z$$
In this case there is no obvious way (I think) to substitute, and then, how do you deal with the total derivative?
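
A sketch of how implicit differentiation would handle this pair, assuming ##z## is taken as the single independent variable and that ##1 - 8xy \neq 0## at the point of interest: differentiate both relations with respect to ##z## and solve the resulting linear system for the two unknown derivatives, without ever solving for ##x(z)## and ##y(z)## explicitly.

$$\frac{dx}{dz} = 4y\frac{dy}{dz} + 4, \qquad \frac{dy}{dz} = 2x\frac{dx}{dz} + 2$$
$$\frac{dx}{dz} = \frac{4 + 8y}{1 - 8xy}, \qquad \frac{dy}{dz} = \frac{2 + 8x}{1 - 8xy}$$
$$\frac{df}{dz} = f_x\frac{dx}{dz} + f_y\frac{dy}{dz} + f_z$$

The mutual dependence gives two linear equations in two unknowns rather than an infinite chain.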

Stephen Tashi
My question is then about the ways in which these arguments can be related, and how we deal with the total derivative.

This is hard to answer mathematically. In physics we think of phenomena. So "x" represents a value associated with a phenomenon (e.g. position of something) and wherever the notation "##x##" appears in a problem, it represents the same property of the same phenomenon.

By contrast, in mathematics, the definitions:
## f(x,y) = 2x + y ,\ g(x,y) = 2y ## could also be written as ## f(a,b) = 2a + b, \ g(p,q) = 2q ##. The variables in the definition of a function are "dummy" variables. Their names are arbitrary. Technically, notation such as "##x = 2y##" stands for a function named "##x##" defined by ##x(w) = 2w##. It doesn't convey the idea that "When I speak of ##x = 2y##, I mean the ##x## that I was using in the definition of ##f(x,y) = 2x + y##".

Technically, the "scope" of the variable ##x## in the definition ##f(x,y) = 2x + y## is only that definition. The mention of an ##x## in a different context, such as ##x = 2y##, is a completely different use of "##x##". This is like using an "##x##" on page 3 exercise 1 and an "##x##" on page 12 exercise 9. The two "##x##'s" have nothing to do with each other.

People do not observe such technicalities! They try to express concepts by using ##x## in one place as the argument of a function, and using it in a different place to denote a function ##x(w)## that they wish to relate to the ##x## that is an argument of a function. The functions ##f(x,y)## and ##g(w,y) = f(x(w),y)## are different functions. It is not technically correct to refer to both of those functions as "##f##". Yet it is common for people to write things like: "##f(x,y)= 2x + y## and ## x = 2y##, so ##f(x,y) = 4y + y##". That's a misuse of "##f##" to denote two different functions.

To figure out how to take the total derivative of a function, you have to figure out what function is being discussed! The technically imperfect notation must be interpreted so it defines a specific function. A set of equations in the usual sort of imperfect notation may be nonsense - insofar as it does not actually define a particular function. So I doubt there is a set of simple rules for operating on such equations with the goal of computing the total derivative of something.

Your discussion of a recursive approach might be a useful intuition in the particular context you discuss. I'd have to study it more to comment on it.

The following is an interesting question: Let ##f## be a function of 3 arguments, ##f(x,y,z)##. Let ##T## be the total derivative of ##f##. How many arguments does the function ##T## have?

Runei
Some good points there.

The reason for my thinking about all this, is that I am working on a problem in which we have a Lagrangian density given by ##L(x,t,u,u_t,u_x)##. The Euler-Lagrange equation for this is
$$\dfrac{\partial L}{\partial u} - \dfrac{\partial}{\partial t}\dfrac{\partial L}{\partial u_t} - \dfrac{\partial}{\partial x}\dfrac{\partial L}{\partial u_x} = 0$$
We then want to study conserved quantities, and our teacher used a way of describing the various different partial derivatives in a way I found needlessly confusing. The following was done (in the way he does it):
$$\dfrac{\partial L}{\partial t} = \dfrac{DL}{Dt}+\dfrac{\partial L}{\partial u}\dfrac{\partial u}{\partial t}+\dfrac{\partial L}{\partial u_t}\dfrac{\partial u_t}{\partial t}+\dfrac{\partial L}{\partial u_x}\dfrac{\partial u_x}{\partial t}$$
He then says that the derivative on the left hand side is the derivative where only x is constant, while the derivative using the big "D" is the derivative where everything is constant (the explicit dependence of L on t). The next step in the analysis is then to combine the Euler-Lagrange equation with this equation and do some analysis.

But I found it needlessly confusing to introduce three different types of derivatives. In my mind, we only need to take the total derivative of the Lagrangian density with respect to t. Also, I thought that the derivatives of the field variables (e.g. ##\partial u_x/\partial t##) should not be partial derivatives but total derivatives.
$$\dfrac{dL}{dt} = \dfrac{\partial L}{\partial t}\dfrac{dt}{dt}+\dfrac{\partial L}{\partial u}\dfrac{du}{dt}+\dfrac{\partial L}{\partial u_t}\dfrac{du_t}{dt}+\dfrac{\partial L}{\partial u_x}\dfrac{du_x}{dt}$$
The variables x and t are independent of each other, and so, the total derivative of a field variable, obviously becomes just the partial derivative (as before).
$$\dfrac{d u_x(x,t)}{dt} = \dfrac{\partial u_x}{\partial x}\dfrac{dx}{dt}+\dfrac{\partial u_x}{\partial t}\dfrac{dt}{dt}=\dfrac{\partial u_x}{\partial t}$$
This all got me thinking and trying different stuff out, as I was first wondering whether I had misunderstood how the total derivative worked.

In the end, this analysis is supposed to examine the explicit dependence of the Lagrangian on time, by way of stress-energy tensor elements that can be derived by using the Euler-Lagrange equation in the total derivative equation.

So that was my road into this thinking.

Stephen Tashi
The reason for my thinking about all this, is that I am working on a problem in which we have a Lagrangian density given by ##L(x,t,u,u_t,u_x)##.

A physicist can probably give you a physics explanation of the professor's approach. I don't know whether you are interested in a mathematical explanation and I don't know if I can figure one out!

Mathematically, the name ##L## is being used to denote several different functions. My guess is that one interpretation for ##L## is that it is a function of 2 variables, which I shall denote as ##L_{a,b}(x,t)##.

A different function named ##L## is ##L_{a,b,c,d,e}(a,b,c,d,e)##.

##L_{a,b}## is defined in terms of ##L_{a,b,c,d,e}## and the functions ##u_{a,b}(x,t)##, ##u_a(x)##, ##u_b(t)##. Specifically, ##L_{a,b}(x,t)## is defined as ##L_{a,b,c,d,e}(x,t,u_{a,b}(x,t),u_b(t),u_a(x))##.

The following was done (in the way he does it):
$$\dfrac{\partial L}{\partial t} = \dfrac{DL}{Dt}+\dfrac{\partial L}{\partial u}\dfrac{\partial u}{\partial t}+\dfrac{\partial L}{\partial u_t}\dfrac{\partial u_t}{\partial t}+\dfrac{\partial L}{\partial u_x}\dfrac{\partial u_x}{\partial t}$$
He then says that the derivative on the left hand side is the derivative where only x is constant, while the derivative using the big "D" is the derivative where everything is constant (the explicit dependence of L on t).

We must figure out how many arguments the function ##\frac{DL}{Dt}## has.

One possibility is that ##\frac{DL}{Dt}## has 4 arguments. It can be ##\frac{DL}{Dt}_{a,b,c,d}(x,t,dx,dt)##, defined in terms of another function ##D_{2,a,b}## as ##\frac{DL}{Dt}_{a,b,c,d}(x,t,dx,dt) = D_{2,a,b}(x,t)\,dt##. So the phrase "everything is constant" apparently refers to the fact that ##\frac{DL}{Dt}## can be written as a function of its first two arguments ##(x,t)## times ##dt##. (However, I don't know why "everything is constant" is an apt description of this situation.)

Does this line of thinking make sense?

Runei
Hi again,

And thanks for your time digging into this.
I'm unsure about how you have set up the different L functions, so let me try to reiterate.

The Lagrangian density ##L(x,t,u,u_x,u_t)## is some function dependent on those 5 arguments. As you say, then there is another Lagrangian, L, which is only dependent on 2 arguments (x and t), ##L_{a,b}(x,t)##. This is because the arguments ##u##, ##u_x## and ##u_t## are themselves functions of the arguments (x and t). All in all we have the following:
$$u \rightarrow u(x,t)\\u_x \rightarrow \dfrac{\partial u(x,t)}{\partial x}\\u_t \rightarrow \dfrac{\partial u(x,t)}{\partial t}$$
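
With these substitutions the two-argument function can be written out explicitly. A sketch, where the name ##\bar L## and the numbered subscripts ##L_1,\dots,L_5## (partial derivative with respect to the first, ..., fifth argument, in the order ##(x,t,u,u_x,u_t)##) are notation introduced here for illustration:

$$\bar L(x,t) = L\big(x,t,u(x,t),u_x(x,t),u_t(x,t)\big)$$
$$\frac{\partial \bar L}{\partial t} = L_2 + L_3\,u_t + L_4\,u_{xt} + L_5\,u_{tt}$$

Every ##L_i## on the right is evaluated at ##(x,t,u(x,t),u_x(x,t),u_t(x,t))##. Read this way, ##L_2## is the explicit derivative written with a capital "D", and the left-hand side is the derivative with only x held constant.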

The phrase "everything is constant" is misleading I can see. What he means is just the "normal" partial derivative, where you only take the derivative with respect to one argument, and assume the rest to be constant.

My problem in this is that I think there is a problem in how the assignment is formulated, and that there is some misuse of notation. In particular, I think the partial derivative used on the LHS is different from the way the partial derivative is used on the RHS. I'll show now how the mathematical derivation goes.

We have the Euler-Lagrange equation given by
$$\dfrac{\partial L}{\partial u} = \dfrac{\partial}{\partial t}\dfrac{\partial L}{\partial u_t}+\dfrac{\partial}{\partial x}\dfrac{\partial L}{\partial u_x}$$
As far as I know, the partial derivatives in the Euler-Lagrange equation are supposed to be the partial derivatives wrt. one argument, keeping all other arguments constant. So already from the next step, I see trouble, but let me just continue.

We are then setting up this next partial differential equation, which is the derivative of the lagrangian L, keeping only x constant.
$$\dfrac{\partial L}{\partial t} = \dfrac{DL}{Dt}+\dfrac{\partial L}{\partial u}\dfrac{\partial u}{\partial t}+\dfrac{\partial L}{\partial u_t}\dfrac{\partial u_t}{\partial t}+\dfrac{\partial L}{\partial u_x}\dfrac{\partial u_x}{\partial t}$$
So the derivative with capital "D" is supposed to be the explicit dependence of L on t. Already at this point I feel the thing is a mess: if the partial derivatives on the RHS are not supposed to be the explicit dependencies (single-argument differentiation), what are they supposed to be? I think it's a mess, but let's carry on.

We now insert the Euler-Lagrange equation in the equation we just wrote.
$$\dfrac{\partial L}{\partial t} = \dfrac{DL}{Dt}+(\dfrac{\partial}{\partial t}\dfrac{\partial L}{\partial u_t}+\dfrac{\partial}{\partial x}\dfrac{\partial L}{\partial u_x})\dfrac{\partial u}{\partial t}+\dfrac{\partial L}{\partial u_t}\dfrac{\partial u_t}{\partial t}+\dfrac{\partial L}{\partial u_x}\dfrac{\partial u_x}{\partial t}$$
I now write ##\partial u/\partial t = u_t## and re-arrange the above equation. Additionally, I use ##u_{xt}=u_{tx}## (equality of mixed second partial derivatives), so that ##\partial u_x/\partial t = \partial u_t/\partial x##.
$$\begin{aligned}
-\dfrac{DL}{Dt} &= -\dfrac{\partial L}{\partial t}+\left(\dfrac{\partial}{\partial t}\dfrac{\partial L}{\partial u_t}+\dfrac{\partial}{\partial x}\dfrac{\partial L}{\partial u_x}\right)u_t+\dfrac{\partial L}{\partial u_t}\dfrac{\partial u_t}{\partial t}+\dfrac{\partial L}{\partial u_x}\dfrac{\partial u_t}{\partial x}\\
-\dfrac{DL}{Dt} &= u_t\dfrac{\partial}{\partial t}\dfrac{\partial L}{\partial u_t}+\dfrac{\partial L}{\partial u_t}\dfrac{\partial u_t}{\partial t}-\dfrac{\partial L}{\partial t}+u_t\dfrac{\partial}{\partial x}\dfrac{\partial L}{\partial u_x}+\dfrac{\partial L}{\partial u_x}\dfrac{\partial u_t}{\partial x}\\
-\dfrac{DL}{Dt} &= u_t\dfrac{\partial}{\partial t}\dfrac{\partial L}{\partial u_t}+\dfrac{\partial u_t}{\partial t}\dfrac{\partial L}{\partial u_t}-\dfrac{\partial L}{\partial t}+u_t\dfrac{\partial}{\partial x}\dfrac{\partial L}{\partial u_x}+\dfrac{\partial u_t}{\partial x}\dfrac{\partial L}{\partial u_x}\\
-\dfrac{DL}{Dt} &= \dfrac{\partial}{\partial t}\left(u_t\dfrac{\partial L}{\partial u_t}\right)-\dfrac{\partial L}{\partial t}+\dfrac{\partial}{\partial x}\left(u_t\dfrac{\partial L}{\partial u_x}\right)\\
-\dfrac{DL}{Dt} &= \dfrac{\partial}{\partial t}\left(u_t\dfrac{\partial L}{\partial u_t}-L\right)+\dfrac{\partial}{\partial x}\left(u_t\dfrac{\partial L}{\partial u_x}\right)\\
-\dfrac{DL}{Dt} &= \dfrac{\partial}{\partial t}T_{tt}+\dfrac{\partial}{\partial x}T_{tx}
\end{aligned}$$
At the point where the partial derivative is pulled outside the parentheses (going from the fourth equation to the fifth), I think the problem is that the partial derivative applied to L is not the same type as the one applied to the product.
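
As a sanity check of the final relation, here is a worked instance with a hypothetical Lagrangian density (the free wave equation, chosen for illustration and not taken from the assignment). It has no explicit time dependence, so ##DL/Dt = 0##, and all ##\partial/\partial t## and ##\partial/\partial x## acting on ##T_{tt}## and ##T_{tx}## are derivatives of functions of ##(x,t)## only.

$$L = \tfrac{1}{2}u_t^2 - \tfrac{1}{2}u_x^2, \qquad \text{Euler-Lagrange: } u_{tt} - u_{xx} = 0$$
$$T_{tt} = u_t\dfrac{\partial L}{\partial u_t} - L = \tfrac{1}{2}u_t^2 + \tfrac{1}{2}u_x^2, \qquad T_{tx} = u_t\dfrac{\partial L}{\partial u_x} = -u_t u_x$$
$$\dfrac{\partial}{\partial t}T_{tt} + \dfrac{\partial}{\partial x}T_{tx} = u_t u_{tt} + u_x u_{xt} - u_{xt}u_x - u_t u_{xx} = u_t\left(u_{tt} - u_{xx}\right) = 0$$

So in this example the relation holds, provided the outer derivatives are read as derivatives of the composite functions of ##(x,t)##.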

Digging further I have discovered where the problem lies, but I need a little help to understand the issue.

It all comes from the derivation of the Euler-Lagrange equation.

We define the Lagrangian as a function given by ##L(x,t,u,u_x,u_t,\alpha) = L(x,t,u,u_x,u_t)+\alpha\eta(x,t)##. We have an integral that we want to find the extremum of, wrt. ##\alpha##.
$$J = \int_RL(x,t,u,u_x,u_t,\alpha)dxdt$$
Taking the derivative wrt. ##\alpha## gives us
\begin{align}\frac{dJ}{d\alpha} & =\int_R\frac{dL}{d\alpha}dxdt\\ & =\int_R\left(\frac{\partial L}{\partial u}\frac{du}{d\alpha}+\frac{\partial L}{\partial u_x}\frac{du_x}{d\alpha}+\frac{\partial L}{\partial u_t}\frac{du_t}{d\alpha}\right)dxdt\\ & =\int_R\left(\frac{\partial L}{\partial u}\eta+\frac{\partial L}{\partial u_x}\eta_x+\frac{\partial L}{\partial u_t}\eta_t\right) dxdt \end{align}
The partial derivatives here are just plain old partial derivatives wrt. one argument, keeping all others constant. However, now we do integration by parts to get the system into another form, and then I think we encounter a problem in terms of notation.
When we do integration by parts we get the following
$$\int_R\left(\frac{\partial L}{\partial u}-\frac{\partial}{\partial t}\frac{\partial L}{\partial u_t}-\frac{\partial}{\partial x}\frac{\partial L}{\partial u_x}\right)\eta(x,t) dxdt = 0$$
My issue now is with the partial derivatives ##\partial/\partial x## and ##\partial/\partial t##, which come from the integration by parts. Since they come from an integration of a function of only x and t, they are different from the partial derivatives of the Lagrangian L.
Should the partial derivatives ##\partial/\partial x## and ##\partial/\partial t## instead be total derivatives? Or is there another way to show that those derivatives act on a function of only x and t?

It's hard to explain in text I think, but I hope someone can see the issue :-)

And I believe we have come full circle. What @Stephen Tashi said about making clear what arguments your function has, is particularly important here. A clear way to write the Euler-Lagrange equation would be
$$\frac{\partial L}{\partial u}(x,t)-\frac{\partial}{\partial t}\left(\frac{\partial L}{\partial u_t}(x,t)\right)-\frac{\partial}{\partial x}\left(\frac{\partial L}{\partial u_x}(x,t)\right)= 0$$
With that notation, it becomes clear that the partial derivatives are different.
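
Another way to make the distinction explicit, sketched here with notation that is an assumption rather than something used in the course: write ##d/dt## and ##d/dx## for derivatives of the composite function of ##(x,t)##, and reserve ##\partial## for derivatives with respect to a single argument of ##L(x,t,u,u_x,u_t)##.

$$\frac{d}{dt} = \frac{\partial}{\partial t} + u_t\frac{\partial}{\partial u} + u_{xt}\frac{\partial}{\partial u_x} + u_{tt}\frac{\partial}{\partial u_t}, \qquad \frac{d}{dx} = \frac{\partial}{\partial x} + u_x\frac{\partial}{\partial u} + u_{xx}\frac{\partial}{\partial u_x} + u_{xt}\frac{\partial}{\partial u_t}$$
$$\frac{\partial L}{\partial u} - \frac{d}{dt}\frac{\partial L}{\partial u_t} - \frac{d}{dx}\frac{\partial L}{\partial u_x} = 0$$

In this form the ##\partial/\partial t## on the right-hand side of the first line is the explicit derivative (the one written with a capital "D" earlier in the thread).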

Stephen Tashi
We define the Lagrangian as a function given by ##L(x,t,u,u_x,u_t,\alpha) = L(x,t,u,u_x,u_t)+\alpha\eta(x,t)##.

I don't understand that definition. I think of the Lagrangian as a function of time ##t## and some properties of a particular path ##(P(t),t)## at that time. So if I perturb that path by changing it to ##(P(t) + \alpha \eta(t), t)##, the Lagrangian should change from ##L##(various properties of ##(P(t),t)##) to ##L##(various properties of ##(P(t)+\alpha \eta(t),t)##).

Are we perturbing the Lagrangian or perturbing the path?

Runei
Whoops, you're absolutely right, I miswrote.

The correct expression should be that we perturb U instead. So
$$u(x,t,\alpha)=u(x,t,0)+\alpha\eta(x,t)$$

PeroK
On this whole subject, you may be interested in my Insight on the Chain Rule:

https://www.physicsforums.com/insights/demystifying-chain-rule-calculus/

I'm sure you understand most of this already, but the section on what the "total" derivative really is may be useful. You mentioned in a previous post about a "misuse" of notation. I think that's true in a lot of texts dealing with the Lagrangian, in particular. That said, it can get really tedious to expand everything and be completely precise. I did, in fact, intend to do a follow-up insight on the Lagrangian derivation, but it was getting too pedantic and I found myself preferring the sketchier approach used in most texts!

PeroK
Whoops, you're absolutely right, I miswrote.

The correct expression should be that we perturb U instead. So
$$u(x,t,\alpha)=u(x,t,0)+\alpha\eta(x,t)$$

For example, I might tend to avoid using ##u## for two different functions here and instead write:

$$A(x,t,\alpha)=u(x,t)+\alpha\eta(x,t)$$

And, then it's clear and precise that:
$$\frac{\partial A}{\partial \alpha} = \eta(x, t)$$
And
$$A(x, t, 0) = u(x, t)$$
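
Continuing in that notation, the first variation can be written with every function named explicitly. A sketch, assuming ##\eta## is smooth and that differentiation under the integral sign is allowed; ##J## and ##R## are as in the earlier posts:

$$J(\alpha) = \int_R L\left(x,t,A,A_x,A_t\right)dxdt$$
$$\left.\frac{dJ}{d\alpha}\right|_{\alpha=0} = \int_R\left(\frac{\partial L}{\partial u}\eta + \frac{\partial L}{\partial u_x}\eta_x + \frac{\partial L}{\partial u_t}\eta_t\right)dxdt$$

Here the partial derivatives of ##L## are taken with respect to its third, fourth and fifth arguments and then evaluated at ##(x,t,u,u_x,u_t)##, since ##A = u## at ##\alpha = 0##.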
