# Help with Euler-Lagrange Equation

In summary: In a Lagrangian, ##\left ( \partial _{\mu}\varphi \right )^2## is shorthand for the contraction ##\left ( \partial _{\mu}\varphi \right )\left ( \partial ^{\mu}\varphi \right ) = g^{\mu \nu} \partial_\mu \varphi \, \partial_\nu \varphi##. A scalar can only be built by pairing a vector (upper index) with a covector (lower index), and the metric is what converts one into the other, so "squaring" a covector always involves the metric. Because the expression contains two factors of ##\partial_\mu \varphi##, differentiating it with respect to ##\partial_\mu \varphi## produces a factor of two, which cancels the ##\frac{1}{2}## and leaves ##\partial^\mu \varphi##.

#### Markus Hanke

I have begun teaching myself Lagrangian field theory in preparation for taking the plunge into quantum field theory (it's just a hobby, not any kind of formal course). When working through exercises, I have run across the following issue, which I don't quite understand. I am given a Lagrangian density and asked to derive the equations of motion; I understand the principles involved, and everything is fine and easy until I get to the point where I need to evaluate the following expression:

$$\frac{1}{2}\frac{\partial }{\partial \left ( \partial _{\mu}\varphi \right )}\left ( \partial _{\mu}\varphi \right )^2$$

wherein ##\varphi(x,y,z,t)## is a scalar field. My approach to this was as simple as it was naive :

$$\frac{1}{2}\frac{\partial }{\partial \left ( \partial _{\mu}\varphi \right )}\left ( \partial _{\mu}\varphi \right )^2=\frac{1}{2}\frac{\partial }{\partial \left ( \partial _{\mu}\varphi \right )}\left ( \partial _{\mu}\varphi \right )\left ( \partial _{\mu}\varphi \right )$$

which evaluates to ##\partial _{\mu}\varphi## via the product rule. However, this is where I get stuck, because the answer is wrong - the correct approach should have been

$$\frac{1}{2}\frac{\partial }{\partial \left ( \partial _{\mu}\varphi \right )}\left ( \partial _{\mu}\varphi \right )\left ( \partial ^{\mu}\varphi \right )$$

which apparently evaluates to ##\partial ^{\mu}\varphi## ( though I have difficulties with that as well, but that's a separate issue ), and leads to the correct equations of motion. My question is : why is ##\left ( \partial _{\mu}\varphi \right )^2=\left ( \partial _{\mu}\varphi \right )\left ( \partial ^{\mu}\varphi \right )## and not ##\left ( \partial _{\mu}\varphi \right )^2=\left ( \partial _{\mu}\varphi \right )\left ( \partial _{\mu}\varphi \right )## ? I know that this is probably something very elementary, so please don't laugh at me, but I genuinely don't get it.

#### stevendaryl

This is not specific to quantum field theory, but is a fact about vectors and covectors.

A vector is something that is generally written with upper-indices. For example, if a massive particle's position is $x^\mu$, then its 4-velocity is given by $\frac{dx^\mu}{d\tau}$ (where $\tau$ is proper time for the particle), which is a vector. In contrast, a covector is written using lower-indices. For example, if $\phi$ is a scalar field, then its "gradient" is $\partial_\mu \phi \equiv \frac{\partial \phi}{\partial x^\mu}$.

To make a scalar, you have to use a vector together with a covector. That means that you have to have one object with upper-indices and another object with lower-indices. So if $A^\mu$ is a vector and $B_\mu$ is a covector, then you can make a scalar by combining them via: $\sum_\mu A^\mu B_\mu$. (It's usually written without the $\sum$, using the convention that repeated indices, one upper and one lower, are always summed over.) For example, we can combine a 4-velocity $\frac{dx^\mu}{d\tau}$ and a covector $\partial_\mu \phi$ to get the combination $\frac{dx^\mu}{d\tau} \partial_\mu \phi$. This is a scalar, the rate of change of $\phi$ for the particle as a function of proper time.

So if you have to use a vector and a covector to make a scalar, then how do you take the square of a vector? The answer is that you can't take the square of a vector without help from a metric tensor. The metric tensor $g_{\mu \nu}$ is an operator that converts a vector into a covector: If you have a vector $A^\mu$ then you can convert it into a corresponding covector $A_\mu$ through $A_\mu = g_{\mu \nu} A^\nu$ (by convention, the repeated index $\nu$ on the right side of = is summed over). Using the metric, we can square a vector as follows:

$|A^\mu|^2 \equiv A^\mu A_\mu \equiv g_{\mu \nu} A^\mu A^\nu$ (in the last expression, both $\mu$ and $\nu$ are summed over).

Similarly, the square of a covector must involve a metric, as well:
$|\partial_\mu \phi|^2 \equiv \partial^\mu \phi \partial_\mu \phi \equiv g^{\mu \nu} \partial_\mu \phi \partial_\nu \phi$
(where $g^{\mu \nu}$ is the inverse of $g_{\mu \nu}$; it converts a covector $\partial_\mu \phi$ into a vector, $\partial^\mu \phi$).

In quantum field theory, if an expression like $(\partial_\mu \phi)^2$ appears in the Lagrangian, it always means $\partial^\mu \phi \partial_\mu \phi$, because you can't take the square of a covector, otherwise.
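To make the difference between the two readings concrete, here is a minimal numerical sketch. It assumes flat spacetime with the Minkowski metric in signature (+, -, -, -), a convention the posts above do not fix, and the component values and function names are made up purely for illustration.

```python
# Minkowski metric with signature (+, -, -, -); this choice is an assumption,
# the thread does not fix a convention. In these coordinates the inverse
# metric g^{mu nu} has the same components as g_{mu nu}.
G_INV = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, -1.0, 0.0, 0.0],
    [0.0, 0.0, -1.0, 0.0],
    [0.0, 0.0, 0.0, -1.0],
]


def raise_index(covector):
    """Return the vector components v^mu = g^{mu nu} w_nu."""
    return [sum(G_INV[mu][nu] * covector[nu] for nu in range(4)) for mu in range(4)]


def square(covector):
    """Return the scalar g^{mu nu} w_mu w_nu, i.e. w^mu w_mu."""
    raised = raise_index(covector)
    return sum(raised[mu] * covector[mu] for mu in range(4))


def naive_square(covector):
    """Return sum_mu w_mu w_mu, which ignores the metric and is not a scalar."""
    return sum(w * w for w in covector)


# Hypothetical components of d_mu(phi) at one spacetime point, ordered (t, x, y, z).
dphi = [3.0, 1.0, 2.0, 1.0]

print(raise_index(dphi))   # [3.0, -1.0, -2.0, -1.0]
print(square(dphi))        # 3.0
print(naive_square(dphi))  # 15.0
```

The two results differ because raising the index flips the sign of the spatial components; only the metric contraction is the same in every inertial frame.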


Oh, one further point. Since $\partial^\mu \phi \partial_\mu \phi$ means $g^{\mu \nu} \partial_\mu \phi \partial_\nu \phi$, there are two factors of $\partial_\mu \phi$, so taking the derivative with respect to $\partial_\mu \phi$ gives a factor of two.
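The factor of two can be made explicit with a short calculation. This is a sketch that assumes the metric does not depend on $\phi$ or its derivatives, and it introduces a free index $\alpha$ for the derivative so that it does not clash with the summed indices $\mu$ and $\nu$:

$$\frac{\partial}{\partial \left( \partial_\alpha \phi \right)} \left[ \frac{1}{2} g^{\mu \nu} \partial_\mu \phi \, \partial_\nu \phi \right] = \frac{1}{2} g^{\mu \nu} \left( \delta^\alpha_\mu \, \partial_\nu \phi + \partial_\mu \phi \, \delta^\alpha_\nu \right) = \frac{1}{2} \left( g^{\alpha \nu} \partial_\nu \phi + g^{\mu \alpha} \partial_\mu \phi \right) = g^{\alpha \nu} \partial_\nu \phi = \partial^\alpha \phi$$

The two terms in the bracket are equal because the metric is symmetric, which is where the factor of two that cancels the $\frac{1}{2}$ comes from.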

#### Markus Hanke

My goodness, of course! I had come across that already in my studies of GR, but it had completely slipped my mind in this context. I knew I was missing something elementary. Thank you for taking the time to reply in such detail, I understand it now.

Yup indeed, regarding the factor of two :) Given the correct expansion of this bracket, I actually got the rest of the exercise done with no problems; it was really just this one upper index that tripped me up. But all is good now.

## 1. What is the Euler-Lagrange equation?

The Euler-Lagrange equation is a differential equation whose solutions are the functions that make a given functional stationary (a minimum, a maximum, or a saddle point). It is widely used in physics, engineering, and mathematics.

## 2. Why is the Euler-Lagrange equation important?

The Euler-Lagrange equation is important because it turns the problem of finding a stationary point of a functional, a map that takes a function (or a set of functions) as input and returns a real number, into a differential equation that can be solved. This is particularly useful in physics, where equations of motion follow from requiring the action to be stationary, and in engineering, where an optimal path or shape is sought.

## 3. How is the Euler-Lagrange equation derived?

The Euler-Lagrange equation is derived using the calculus of variations, the branch of mathematics that deals with finding stationary points of functionals. One perturbs the function by a small variation that vanishes at the boundary, requires the first-order change in the functional to vanish for every such variation, and integrates by parts; the resulting condition on the function is the Euler-Lagrange equation.
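As a sketch of what this yields for a scalar field $\phi$ with Lagrangian density $\mathcal{L}(\phi, \partial_\mu \phi)$, the stationarity condition takes the form

$$\partial_\mu \left( \frac{\partial \mathcal{L}}{\partial \left( \partial_\mu \phi \right)} \right) - \frac{\partial \mathcal{L}}{\partial \phi} = 0$$

For illustration only, assume the free scalar field Lagrangian density $\mathcal{L} = \frac{1}{2} \partial_\mu \phi \, \partial^\mu \phi - \frac{1}{2} m^2 \phi^2$ (the thread above does not say which Lagrangian the exercise used, and the mass parameter $m$ is part of this assumption). Then $\frac{\partial \mathcal{L}}{\partial \left( \partial_\mu \phi \right)} = \partial^\mu \phi$ and $\frac{\partial \mathcal{L}}{\partial \phi} = -m^2 \phi$, so the equation of motion is

$$\partial_\mu \partial^\mu \phi + m^2 \phi = 0$$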

## 4. What are some applications of the Euler-Lagrange equation?

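The Euler-Lagrange equation has many applications in physics, engineering, and mathematics. It is used in mechanics to obtain equations of motion from the principle of stationary action, and in classic variational problems such as the brachistochrone: finding the curve along which a particle sliding under gravity travels between two points in the least time. It is also used in field theory, where it yields the field equations from a Lagrangian density, and in optimization problems where the unknown is a function rather than a number.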

## 5. Are there any limitations to the Euler-Lagrange equation?

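Yes, there are some limitations to the Euler-Lagrange equation. It is a necessary condition, not a sufficient one: its solutions are stationary points of the functional, which may be minima, maxima, or saddle points, so further analysis is needed to classify them. Its standard derivation assumes that the Lagrangian and the candidate functions are sufficiently smooth (typically twice continuously differentiable). It is not restricted to a single variable, since it generalizes to several functions and to several independent variables, as in field theory. Additionally, it may not provide a unique solution, and some problems require additional constraints or techniques, such as Lagrange multipliers, to find the optimal solution.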