Differentiation of a product of 4-gradients with respect to a 4-gradient

spaghetti3451

I know that $\frac{\partial}{\partial (\partial_{\mu}\phi)} \big( \partial_{\mu} \phi\ \partial^{\mu} \phi \big) = \partial_{\mu} \phi$.

Now, I need to prove this to myself.

So, here goes nothing.

$\frac{\partial}{\partial (\partial_{\mu}\phi)} \big( \partial_{\mu} \phi\ \partial^{\mu} \phi \big)$
$= \frac{\partial}{\partial (\partial_{\mu}\phi)} \big( \eta^{\mu\nu}\partial_{\mu} \phi\ \partial_{\nu} \phi \big)$
$= \eta^{\mu\nu}\ \partial_{\nu} \phi + \eta_{\mu\nu} \eta^{\mu\nu}\ \partial_{\mu} \phi$,

where I first differentiated the factor $\partial_{\mu}\phi$ with respect to $\partial_{\mu}\phi$ and then I differentiated the factor $\partial_{\nu}\phi$ with respect to $\partial_{\mu}\phi$.

Am I correct so far?

JorisL

The indices in your second term don't match those in the first, which indicates that something is wrong.

Try using the following form

$$\frac{\partial}{\partial (\partial_{\alpha}\phi)} \big( \partial_{\mu} \phi\ \partial^{\mu} \phi \big)$$

This helps you avoid the problem where you have ill-defined expressions like $\eta_{\mu\nu}\eta^{\mu\nu}\partial_\mu\phi$.
The latter can mean two things:
$$\eta_{\mu\nu}\left(\eta^{\mu\nu}\partial_\mu\phi\right)=\eta_{\mu\nu}\partial^\nu\phi=\partial_\mu\phi$$
or it can mean (using $\eta_{\mu\nu}\eta^{\mu\nu}=D$, with $D$ the number of spacetime dimensions, here $D=4$)
$$\left(\eta_{\mu\nu}\eta^{\mu\nu}\right)\partial_\mu\phi=4\partial_\mu\phi$$
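The two readings can be compared numerically. The following is a minimal sketch, assuming the metric signature $(+,-,-,-)$ and arbitrary sample values for $\partial_\mu\phi$ (neither is fixed by the thread); the dummy indices are renamed so that each appears exactly twice.

```python
import numpy as np

# Assumption: signature (+, -, -, -); the thread does not fix a convention.
eta_lower = np.diag([1.0, -1.0, -1.0, -1.0])  # eta_{mu nu}
eta_upper = np.linalg.inv(eta_lower)          # eta^{mu nu}, the inverse metric

# Arbitrary sample values standing in for the components d_mu(phi).
dphi = np.array([0.7, -1.3, 2.1, 0.4])

# Reading 1: raise the index first, then lower it again.
raised = np.einsum("rn,r->n", eta_upper, dphi)        # d^nu(phi)
reading_1 = np.einsum("mn,n->m", eta_lower, raised)   # back to d_mu(phi)

# Reading 2: contract the two metrics with each other first.
D = np.einsum("rs,rs->", eta_lower, eta_upper)        # number of dimensions
reading_2 = D * dphi

print(reading_1)  # same components as dphi
print(reading_2)  # D times the components of dphi
```

Because the two results differ by a factor of $D$, an expression in which one index appears three times does not have a unique meaning.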

spaghetti3451

Firstly, I need to say that $\frac{\partial}{\partial (\partial_{\mu}\phi)} \big( \frac{1}{2}\partial_{\mu} \phi\ \partial^{\mu} \phi \big) = \partial^{\mu} \phi$.
In my first post I mistakenly placed the $\mu$ index on the RHS downstairs instead of upstairs, and I also forgot the factor of $\frac{1}{2}$.

Now,

$\frac{\partial}{\partial (\partial_{\alpha}\phi)} \big( \frac{1}{2}\partial_{\mu} \phi\ \partial^{\mu} \phi \big)$
$=\frac{1}{2}\eta^{\mu\nu}\frac{\partial}{\partial (\partial_{\alpha}\phi)} \big( \partial_{\mu} \phi\ \partial_{\nu} \phi \big)$
$=\frac{1}{2}\eta^{\mu\nu}(\eta_{\alpha\mu}\partial_{\nu}\phi+\eta_{\alpha\nu}\partial_{\mu}\phi)$
$=\frac{1}{2}(\eta_{\alpha\mu}\eta^{\mu\nu}\partial_{\nu}\phi+\eta_{\alpha\nu}\eta^{\nu\mu}\partial_{\mu}\phi)$
$=\eta_{\alpha\mu}\eta^{\mu\nu}\partial_{\nu}\phi$
$=\delta^{\nu}_{\alpha}\partial_{\nu}\phi$
$=\partial_{\alpha}\phi$

Am I correct?

JorisL

Why do you use the following?
$$\frac{\partial\left(\partial_\mu\phi\right)}{\partial\left(\partial_\alpha\phi\right)}=\eta_{\mu\alpha}$$

A hint that something is wrong is that the indices don't correspond. You can check this by performing a transformation $x^\mu\to x^{\prime\mu}$; you should find that there is one upper and one lower index. [*]

It's equal to $\delta^\alpha_\mu\,\,\,\left(=\eta^\alpha_{\,\,\, \mu}\right)$ as far as I can tell.

[*]: Oddly, it seems that this would turn out correct if you contract the metric, so I'm starting to doubt myself.

spaghetti3451

Ok, let me rework my answer using your suggestion.

$\frac{\partial}{\partial (\partial_{\alpha}\phi)} \big( \frac{1}{2}\partial_{\mu} \phi\ \partial^{\mu} \phi \big)$
$=\frac{1}{2}\eta^{\mu\nu}\frac{\partial}{\partial (\partial_{\alpha}\phi)} \big( \partial_{\mu} \phi\ \partial_{\nu} \phi \big)$
$=\frac{1}{2}\eta^{\mu\nu}({\eta^{\alpha}}_{\mu}\partial_{\nu}\phi+{\eta^{\alpha}}_{\nu}\partial_{\mu}\phi)$
$=\frac{1}{2}({\eta^{\alpha}}_{\mu}\eta^{\mu\nu}\partial_{\nu}\phi+{\eta^{\alpha}}_{\nu}\eta^{\nu\mu}\partial_{\mu}\phi)$
$={\eta^{\alpha}}_{\mu}\eta^{\mu\nu}\partial_{\nu}\phi$
$=\eta^{\alpha\nu}\partial_{\nu}\phi$
$=\partial^{\alpha}\phi$

It appears that in the answer, the index $\alpha$ should be upstairs, not downstairs. I made this mistake in my previous post.
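As a sanity check, the result can be tested numerically by treating the four components $\partial_\mu\phi$ as independent variables. This is only a sketch: the signature $(+,-,-,-)$, the sample values, and the helper names `kinetic_term` and `numerical_gradient` are assumptions made for illustration.

```python
import numpy as np

# Assumption: signature (+, -, -, -). This matrix is its own inverse,
# so it serves as both eta_{mu nu} and eta^{mu nu}.
eta = np.diag([1.0, -1.0, -1.0, -1.0])

def kinetic_term(p):
    """(1/2) eta^{mu nu} p_mu p_nu, with p_mu standing in for d_mu(phi)."""
    return 0.5 * p @ eta @ p

def numerical_gradient(f, p, h=1e-6):
    """Central-difference derivative of f with respect to each p_alpha."""
    grad = np.zeros_like(p)
    for alpha in range(p.size):
        step = np.zeros_like(p)
        step[alpha] = h
        grad[alpha] = (f(p + step) - f(p - step)) / (2.0 * h)
    return grad

# Arbitrary sample values for the lower-index components d_mu(phi).
p_lower = np.array([0.7, -1.3, 2.1, 0.4])

grad = numerical_gradient(kinetic_term, p_lower)
p_upper = eta @ p_lower  # d^alpha(phi) = eta^{alpha nu} d_nu(phi)

print(np.allclose(grad, p_upper))  # True: matches the upper-index gradient
print(np.allclose(grad, p_lower))  # False: the spatial components differ in sign
```

The sign flip in the spatial components is what distinguishes $\partial^{\alpha}\phi$ from $\partial_{\alpha}\phi$ in this check.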

JorisL

Try to see why the $\alpha$ index is upstairs, and also why you can write $\eta^\alpha_{\,\,\,\mu}=\delta^\alpha_\mu$, where $\delta$ is the Kronecker delta.

Other than that it is correct.

spaghetti3451

JorisL said: "Try to see why the $\alpha$ index is upstairs"
Shouldn't the index $\alpha$ be upstairs because we are differentiating a product of one factor with an upstairs index and one with a downstairs index with respect to a quantity with a downstairs index?

JorisL said: "and also why you can write $\eta^\alpha_{\,\,\,\mu}=\delta^\alpha_\mu$ where delta is the kronecker delta."
I think $\eta^\alpha_{\,\,\,\mu}=\delta^\alpha_\mu$ because $\eta^\alpha_{\,\,\, \mu} = \eta^{\alpha\nu}\eta_{\nu\mu} = \delta^\alpha_\mu$.

Am I correct?
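The second identity is easy to confirm numerically. This is a short sketch, assuming the signature $(+,-,-,-)$ and taking $\eta^{\alpha\nu}$ to be the matrix inverse of $\eta_{\nu\mu}$.

```python
import numpy as np

# Assumption: signature (+, -, -, -).
eta_lower = np.diag([1.0, -1.0, -1.0, -1.0])  # eta_{nu mu}
eta_upper = np.linalg.inv(eta_lower)          # eta^{alpha nu}

# Mixed components: eta^alpha_mu = eta^{alpha nu} eta_{nu mu}
eta_mixed = np.einsum("an,nm->am", eta_upper, eta_lower)

print(np.allclose(eta_mixed, np.eye(4)))  # True: the mixed metric is the Kronecker delta
```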

JorisL

The latter part is perfect.

The first part becomes clearest when you explicitly change coordinates. In other words, how does the following transform?

$$\partial_\mu\phi(x)=\frac{\partial\phi(x)}{\partial x^\mu}$$

Apply a coordinate transformation $x^\mu\to x^{\prime\mu}$ and look at the way the Jacobian shows up.
The Jacobian is given by the following (or its inverse; which one does not matter much, since we only consider invertible transformations):

$$\frac{\partial x^{\prime\mu}}{\partial x^{\nu}}$$

spaghetti3451

Under the coordinate transformation $x^{\mu}\rightarrow x'^{\mu}={\Lambda^{\mu}}_{\nu}x^{\nu}$,

$\partial_{\mu}\phi(x) = \frac{\partial\phi(x)}{\partial x^{\mu}}$
$\rightarrow \frac{\partial\phi(\Lambda^{-1}x)}{\partial x^{\mu}}$
$= \frac{\partial (\Lambda^{-1}x)^{\nu}}{\partial x^{\mu}}\frac{\partial\phi(\Lambda^{-1}x)}{\partial (\Lambda^{-1}x)^{\nu}}$
$= \frac{\partial}{\partial x^{\mu}}\Big( {(\Lambda^{-1})^{\nu}}_{\rho}x^{\rho} \Big)(\partial_{\nu}\phi)(\Lambda^{-1}x)$
$= {(\Lambda^{-1})^{\nu}}_{\rho}\delta^{\rho}_{\mu}(\partial_{\nu}\phi)(\Lambda^{-1}x)$
$= {(\Lambda^{-1})^{\nu}}_{\mu}(\partial_{\nu}\phi)(\Lambda^{-1}x)$.

How does this help?

JorisL

Well, now use it in the derivative w.r.t. $\partial_\mu\phi$.
You should get something of the form
$$\frac{\partial\left(\partial_\mu\phi\right)}{\partial\left(\partial_\alpha\phi\right)}\to \left(\Lambda^{-1}\right)^\nu_{\,\,\, \mu}\Lambda^\alpha_{\,\,\, \beta}\frac{\partial\left(\partial_\nu\phi(x^\prime)\right)}{\partial\left(\partial_\beta\phi(x^\prime)\right)}$$

This means that the object transforms as a (1,1)-tensor, i.e. it has one upper and one lower index.

This might help with the details: https://www.physicsforums.com/threads/kronecker-delta-as-tensor-proof.320692/
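The index structure can also be checked numerically. The sketch below assumes a boost along $x$ with velocity $0.6$ (in units with $c=1$) and uses a hypothetical helper `boost_x`. It compares how the identity components behave under the (1,1) transformation law and under a (0,2) law with two lower indices.

```python
import numpy as np

def boost_x(beta):
    """Lorentz boost along x with velocity beta (c = 1); components Lambda^mu_nu."""
    gamma = 1.0 / np.sqrt(1.0 - beta**2)
    L = np.eye(4)
    L[0, 0] = L[1, 1] = gamma
    L[0, 1] = L[1, 0] = -gamma * beta
    return L

Lam = boost_x(0.6)            # Lambda^alpha_beta (row index up, column index down)
Lam_inv = np.linalg.inv(Lam)  # (Lambda^{-1})^nu_mu

identity = np.eye(4)          # components equal to 1 if the indices agree, else 0

# (1,1) law: T'^alpha_mu = Lambda^alpha_beta (Lambda^{-1})^nu_mu T^beta_nu
as_mixed = np.einsum("ab,nm,bn->am", Lam, Lam_inv, identity)

# (0,2) law: T'_{alpha mu} = (Lambda^{-1})^beta_alpha (Lambda^{-1})^nu_mu T_{beta nu}
as_lower = np.einsum("ba,nm,bn->am", Lam_inv, Lam_inv, identity)

print(np.allclose(as_mixed, identity))  # True: invariant with one upper, one lower index
print(np.allclose(as_lower, identity))  # False: not invariant with two lower indices
```

Since $\frac{\partial(\partial_\mu\phi)}{\partial(\partial_\alpha\phi)}$ has these identity components in every frame, only the mixed index placement is consistent with this check.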

spaghetti3451

Ok. So, under the coordinate transformation $x^{\mu} \rightarrow {\Lambda^{\mu}}_{\nu}x^{\nu}$,

$\frac{\partial(\partial_{\mu}\phi(x))}{\partial(\partial_{\alpha}\phi(x))} \rightarrow \frac{\partial({(\Lambda^{-1})^{\nu}}_{\mu}(\partial_{\nu}\phi)(\Lambda^{-1}x))}{\partial({(\Lambda^{-1})^{\beta}}_{\alpha}(\partial_{\beta}\phi)(\Lambda^{-1}x))}=\frac{\partial({(\Lambda^{-1})^{\nu}}_{\mu}(\partial_{\nu}\phi)(\Lambda^{-1}x))}{\partial({(\Lambda)_{\alpha}}^{\beta}(\partial_{\beta}\phi)(\Lambda^{-1}x))}$

How should I now take ${(\Lambda)_{\alpha}}^{\beta}$ to the numerator?

JorisL

You can take $\Lambda$ outside of the derivatives. But I suggest you look at a text on (special) relativity.

I suppose you're studying relativistic field theory? This means that knowing how to quickly read and interpret indices (upper/lower) can help you focus on the physics content instead of the manipulations of expressions.

spaghetti3451

Thanks for the reply.

Let me finish the steps of my derivation:

$\frac{\partial({(\Lambda^{-1})^{\nu}}_{\mu}(\partial_{\nu}\phi)(\Lambda^{-1}x))}{\partial({(\Lambda)_{\alpha}}^{\beta}(\partial_{\beta}\phi)(\Lambda^{-1}x))}=\frac{\partial((\partial_{\beta}\phi)(\Lambda^{-1}x))}{\partial({(\Lambda)_{\alpha}}^{\beta}(\partial_{\beta}\phi)(\Lambda^{-1}x))}\frac{\partial({(\Lambda^{-1})^{\nu}}_{\mu}(\partial_{\nu}\phi)(\Lambda^{-1}x))}{\partial((\partial_{\beta}\phi)(\Lambda^{-1}x))}={(\Lambda^{-1})^{\nu}}_{\mu}{(\Lambda^{-1})^{\beta}}_{\alpha}\frac{\partial((\partial_{\nu}\phi)(\Lambda^{-1}x))}{\partial((\partial_{\beta}\phi)(\Lambda^{-1}x))}={(\Lambda^{-1})^{\nu}}_{\mu}{(\Lambda)_{\alpha}}^{\beta}\frac{\partial((\partial_{\nu}\phi)(\Lambda^{-1}x))}{\partial((\partial_{\beta}\phi)(\Lambda^{-1}x))}$

JorisL said: "I suppose you're studying relativistic field theory? This means that knowing how to quickly read and interpret indices (upper/lower) can help you focus on the physics content instead of the manipulations of expressions."
Hmm. I guess that's very important. I was just trying to practice my skills in tensor manipulations since I'm still new to this kind of math.

spaghetti3451

Wait! The order of indices on ${(\Lambda)_{\alpha}}^{\beta}$ in my final result is not the same as the order of indices in your ${\Lambda^{\alpha}}_{\beta}$.

Did I make a mistake in the second step of my calculation in the previous post?
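One way to settle the index placement is to invert the relation in the denominator before applying the chain rule. The sketch below uses the shorthand $X_\nu \equiv (\partial_\nu\phi)(\Lambda^{-1}x)$ and $B_\alpha \equiv {(\Lambda^{-1})^{\beta}}_{\alpha}X_\beta$; both names are introduced only for this illustration.

$$\begin{aligned}
B_\alpha = {(\Lambda^{-1})^{\beta}}_{\alpha} X_\beta
\quad&\Longrightarrow\quad
X_\beta = {\Lambda^{\alpha}}_{\beta} B_\alpha
\quad\Longrightarrow\quad
\frac{\partial X_\beta}{\partial B_\alpha} = {\Lambda^{\alpha}}_{\beta},\\
\frac{\partial\big({(\Lambda^{-1})^{\nu}}_{\mu} X_\nu\big)}{\partial B_\alpha}
&= {(\Lambda^{-1})^{\nu}}_{\mu}\,\frac{\partial X_\nu}{\partial X_\beta}\,\frac{\partial X_\beta}{\partial B_\alpha}
= {(\Lambda^{-1})^{\nu}}_{\mu}\,{\Lambda^{\alpha}}_{\beta}\,\frac{\partial X_\nu}{\partial X_\beta}.
\end{aligned}$$

Under these assumptions, the factor that comes out of the denominator is the inverse of the matrix multiplying $X_\beta$ there, i.e. ${\Lambda^{\alpha}}_{\beta}$ rather than ${(\Lambda^{-1})^{\beta}}_{\alpha}={(\Lambda)_{\alpha}}^{\beta}$. This reproduces the form JorisL quoted, and with $\frac{\partial X_\nu}{\partial X_\beta}=\delta^{\beta}_{\nu}$ the right-hand side reduces to ${(\Lambda^{-1})^{\nu}}_{\mu}{\Lambda^{\alpha}}_{\nu}=\delta^{\alpha}_{\mu}$.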
