Differentiation of a product of 4-gradients wrt a 4-gradient

spaghetti3451 · Apr 19, 2016

I know that ##\frac{\partial}{\partial (\partial_{\mu}\phi)} \big( \partial_{\mu} \phi\ \partial^{\mu} \phi \big) = \partial_{\mu} \phi##.

Now, I need to prove this to myself.

So, here goes nothing.

##\frac{\partial}{\partial (\partial_{\mu}\phi)} \big( \partial_{\mu} \phi\ \partial^{\mu} \phi \big)##
## = \frac{\partial}{\partial (\partial_{\mu}\phi)} \big( \eta^{\mu\nu}\partial_{\mu} \phi\ \partial_{\nu} \phi \big)##
##= \eta^{\mu\nu}\ \partial_{\nu} \phi + \eta_{\mu\nu} \eta^{\mu\nu}\ \partial_{\mu} \phi##,

where I first differentiated the factor ##\partial_{\mu}\phi## with respect to ##\partial_{\mu}\phi## and then I differentiated the factor ##\partial_{\nu}\phi## with respect to ##\partial_{\mu}\phi##.

Am I correct so far?

JorisL · Apr 20, 2016

Your indices in the second term don't match those in the first, which is an indication that something is wrong.

Try using the following form

$$\frac{\partial}{\partial (\partial_{\alpha}\phi)} \big( \partial_{\mu} \phi\ \partial^{\mu} \phi \big)$$

This helps you avoid ill-defined expressions like ##\eta_{\mu\nu}\eta^{\mu\nu}\partial_\mu\phi##, in which the index ##\mu## appears three times.
The latter can mean two things:
$$\eta_{\mu\nu}\left(\eta^{\mu\nu}\partial_\mu\phi\right)=\eta_{\mu\nu}\partial^\nu\phi=\partial_\mu\phi$$
or it can mean (using ##\eta_{\mu\nu}\eta^{\mu\nu}=D##, with ##D=4## the number of spacetime dimensions)
$$\left(\eta_{\mu\nu}\eta^{\mu\nu}\right)\partial_\mu\phi=4\partial_\mu\phi$$
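
The two readings can be told apart numerically. The following is a minimal sketch, assuming the (+,-,-,-) signature and made-up sample values for the components ##\partial_\mu\phi##; the repeated index is relabelled so that each contraction is unambiguous.

```python
import numpy as np

# Minkowski metric eta_{mu nu}; the (+,-,-,-) signature is an assumption.
eta = np.diag([1.0, -1.0, -1.0, -1.0])
eta_inv = np.linalg.inv(eta)  # eta^{mu nu}

# Hypothetical sample values for the components d_mu(phi).
dphi = np.array([2.0, 3.0, 5.0, 7.0])

# Reading 1: raise the index with eta^{nu rho}, then lower it again with
# eta_{mu nu}. This is eta_{mu nu} eta^{nu rho} d_rho(phi) = d_mu(phi).
reading_1 = np.einsum("mn,nr,r->m", eta, eta_inv, dphi)

# Reading 2: fully contract the two metrics first (giving D = 4),
# then multiply the gradient by that number.
trace = np.einsum("mn,mn->", eta, eta_inv)
reading_2 = trace * dphi

print(reading_1)  # same components as dphi
print(reading_2)  # four times the components of dphi
```

The two results differ by a factor of 4, which is why an index should never appear more than twice in one term.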

spaghetti3451 · Apr 21, 2016

Firstly, I need to say that ##\frac{\partial}{\partial (\partial_{\mu}\phi)} \big( \frac{1}{2}\partial_{\mu} \phi\ \partial^{\mu} \phi \big) = \partial^{\mu} \phi##.
In my first post, I mistakenly placed the ##\mu## index on the RHS downstairs instead of upstairs. I also forgot the factor of ##\frac{1}{2}##.

Now,

##\frac{\partial}{\partial (\partial_{\alpha}\phi)} \big( \frac{1}{2}\partial_{\mu} \phi\ \partial^{\mu} \phi \big)##
##=\frac{1}{2}\eta^{\mu\nu}\frac{\partial}{\partial (\partial_{\alpha}\phi)} \big( \partial_{\mu} \phi\ \partial_{\nu} \phi \big)##
##=\frac{1}{2}\eta^{\mu\nu}(\eta_{\alpha\mu}\partial_{\nu}\phi+\eta_{\alpha\nu}\partial_{\mu}\phi)##
##=\frac{1}{2}(\eta_{\alpha\mu}\eta^{\mu\nu}\partial_{\nu}\phi+\eta_{\alpha\nu}\eta^{\nu\mu}\partial_{\mu}\phi)##
##=\eta_{\alpha\mu}\eta^{\mu\nu}\partial_{\nu}\phi##
##=\delta^{\nu}_{\alpha}\partial_{\nu}\phi##
##=\partial_{\alpha}\phi##

Am I correct?

JorisL · Apr 21, 2016

Why do you use the following relation?
$$
\frac{\partial\left(\partial_\mu\phi\right)}{\partial\left(\partial_\alpha\phi\right)}=\eta_{\mu\alpha}
$$

A hint that something is wrong is that the indices don't correspond. You can check this by performing a transformation ##x^\mu\to x^{\prime\mu}##; you should find that there is one upper and one lower index. [*]

It's equal to ##\delta^\alpha_\mu\,\,\,\left(=\eta^\alpha_{\,\,\, \mu}\right)## as far as I can tell.

[*]: Oddly it seems that this would turn out correct if you contract the metric so I'm starting to doubt myself.

spaghetti3451 · Apr 21, 2016

Ok, let me rework my answer using your suggestion.

##\frac{\partial}{\partial (\partial_{\alpha}\phi)} \big( \frac{1}{2}\partial_{\mu} \phi\ \partial^{\mu} \phi \big)##
##=\frac{1}{2}\eta^{\mu\nu}\frac{\partial}{\partial (\partial_{\alpha}\phi)} \big( \partial_{\mu} \phi\ \partial_{\nu} \phi \big)##
##=\frac{1}{2}\eta^{\mu\nu}({\eta^{\alpha}}_{\mu}\partial_{\nu}\phi+{\eta^{\alpha}}_{\nu}\partial_{\mu}\phi)##
##=\frac{1}{2}({\eta^{\alpha}}_{\mu}\eta^{\mu\nu}\partial_{\nu}\phi+{\eta^{\alpha}}_{\nu}\eta^{\nu\mu}\partial_{\mu}\phi)##
##={\eta^{\alpha}}_{\mu}\eta^{\mu\nu}\partial_{\nu}\phi##
##=\eta^{\alpha\nu}\partial_{\nu}\phi##
##=\partial^{\alpha}\phi##

It appears that in the answer, the index ##\alpha## should be upstairs, not downstairs. I made this mistake in my previous post.
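
This result can be checked numerically by treating the four components ##\partial_\mu\phi## as independent variables ##p_\mu##. The sketch below assumes the (+,-,-,-) signature; the helper names `kinetic_term` and `numerical_gradient` and the sample values are made up for illustration.

```python
import numpy as np

# Inverse Minkowski metric eta^{mu nu}; (+,-,-,-) signature is an assumption.
eta_inv = np.diag([1.0, -1.0, -1.0, -1.0])


def kinetic_term(p):
    """(1/2) eta^{mu nu} p_mu p_nu, with p_mu standing in for d_mu(phi)."""
    return 0.5 * np.einsum("mn,m,n->", eta_inv, p, p)


def numerical_gradient(f, p, h=1e-6):
    """Central-difference estimate of df/dp_alpha for each alpha."""
    grad = np.zeros_like(p)
    for alpha in range(p.size):
        step = np.zeros_like(p)
        step[alpha] = h
        grad[alpha] = (f(p + step) - f(p - step)) / (2.0 * h)
    return grad


# Hypothetical sample values for the lower-index components p_mu.
p = np.array([2.0, 3.0, 5.0, 7.0])

grad = numerical_gradient(kinetic_term, p)
raised = eta_inv @ p  # p^alpha = eta^{alpha nu} p_nu

print(grad)    # numerical derivative with respect to p_alpha
print(raised)  # upper-index components, for comparison
```

The derivative agrees with the upper-index components ##p^\alpha## rather than with ##p_\alpha##: the spatial components pick up a sign, consistent with the index ending up upstairs.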

JorisL · Apr 21, 2016

Try to see why the ##\alpha## index is upstairs and also why you can write ##\eta^\alpha_{\,\,\,\mu}=\delta^\alpha_\mu##, where delta is the Kronecker delta.

Other than that it is correct.

spaghetti3451 · Apr 21, 2016

JorisL said:

Try to see why the ##\alpha## index is upstairs

Shouldn't the index ##\alpha## be upstairs because we are differentiating the product of an upstairs index and a downstairs index with respect to a downstairs index?

JorisL said:

and also why you can write ##\eta^\alpha_{\,\,\,\mu}=\delta^\alpha_\mu##, where delta is the Kronecker delta.

I think ##\eta^\alpha_{\,\,\,\mu}=\delta^\alpha_\mu## because ##\eta^\alpha_{\,\,\, \mu} = \eta^{\alpha\nu}\eta_{\nu\mu} = \delta^\alpha_\mu##.

Am I correct?

JorisL · Apr 21, 2016

The latter part is perfect.

The first part becomes clearest when explicitly changing coordinates. In other words, how does the following transform?

$$
\partial_\mu\phi(x)=\frac{\partial\phi(x)}{\partial x^\mu}
$$

Apply a coordinate transformation ##x^\mu\to x^{\prime\mu}## and look at the way the Jacobian shows up.
The Jacobian is given by the following (or its inverse; which one doesn't matter too much, since we look at invertible transformations):

$$
\frac{\partial x^{\prime\mu}}{\partial x^{\nu}}
$$

spaghetti3451 · Apr 22, 2016

Under the coordinate transformation ##x^{\mu}\rightarrow x'^{\mu}={\Lambda^{\mu}}_{\nu}x^{\nu}##,

##\partial_{\mu}\phi(x)~=~\frac{\partial\phi(x)}{\partial x^{\mu}}~##

##\rightarrow \frac{\partial\phi(\Lambda^{-1}x)}{\partial x^{\mu}}=\frac{\partial (\Lambda^{-1}x)^{\nu}}{\partial x^{\mu}}\frac{\partial\phi(\Lambda^{-1}x)}{\partial (\Lambda^{-1}x)^{\nu}}=\frac{\partial}{\partial x^{\mu}}\Big( {(\Lambda^{-1})^{\nu}}_{\rho}x^{\rho} \Big)(\partial_{\nu}\phi)(\Lambda^{-1}x)={(\Lambda^{-1})^{\nu}}_{\rho}\delta^{\rho}_{\mu}(\partial_{\nu}\phi)(\Lambda^{-1}x)={(\Lambda^{-1})^{\nu}}_{\mu}(\partial_{\nu}\phi)(\Lambda^{-1}x)##.

How does this help?

JorisL · Apr 22, 2016

Well, now use it in the derivative w.r.t. ##\partial_\mu\phi##.
You should get something of the form
$$
\frac{\partial\left(\partial_\mu\phi\right)}{\partial\left(\partial_\alpha\phi\right)}\to \left(\Lambda^{-1}\right)^\nu_{\,\,\, \mu}\Lambda^\alpha_{\,\,\, \beta}\frac{\partial\left(\partial_\nu\phi(x^\prime)\right)}{\partial\left(\partial_\beta\phi(x^\prime)\right)}
$$

This means that the object transforms as a (1,1)-tensor, i.e. it has one upper and one lower index.

This might help with the details: https://www.physicsforums.com/threads/kronecker-delta-as-tensor-proof.320692/
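
A numerical illustration of why the index placement matters, as a sketch only: it assumes the (+,-,-,-) signature and a boost along ##x## with ##\beta = 0.6##, and the helper `boost_x` is made up for this example. The identity array is left unchanged when it is transformed as a (1,1)-tensor, but not when it is transformed as a (0,2)-tensor; the metric, by contrast, is invariant as a (0,2)-tensor.

```python
import numpy as np


def boost_x(beta):
    """Lorentz boost along x; entry [a, b] is Lambda^a_b."""
    gamma = 1.0 / np.sqrt(1.0 - beta**2)
    lam = np.eye(4)
    lam[0, 0] = lam[1, 1] = gamma
    lam[0, 1] = lam[1, 0] = -gamma * beta
    return lam


eta = np.diag([1.0, -1.0, -1.0, -1.0])  # eta_{mu nu}, signature assumed
Lam = boost_x(0.6)                      # Lambda^alpha_beta
Lam_inv = np.linalg.inv(Lam)            # (Lambda^{-1})^nu_mu
delta = np.eye(4)

# (1,1) rule: (Lambda^{-1})^nu_mu Lambda^alpha_beta delta^beta_nu
delta_11 = np.einsum("nm,ab,bn->am", Lam_inv, Lam, delta)

# (0,2) rule applied to the same array: one inverse factor per lower index.
delta_02 = np.einsum("nm,ba,nb->ma", Lam_inv, Lam_inv, delta)

# (0,2) rule applied to the metric.
eta_02 = np.einsum("nm,ba,nb->ma", Lam_inv, Lam_inv, eta)

print(np.allclose(delta_11, delta))  # identity survives the (1,1) rule
print(np.allclose(delta_02, delta))  # identity does not survive the (0,2) rule
print(np.allclose(eta_02, eta))      # the metric survives the (0,2) rule
```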

spaghetti3451 · Apr 22, 2016

Ok. So, under the coordinate transformation ##x^{\mu} \rightarrow {\Lambda^{\mu}}_{\nu}x^{\nu}##,

##\frac{\partial(\partial_{\mu}\phi(x))}{\partial(\partial_{\alpha}\phi(x))} \rightarrow \frac{\partial({(\Lambda^{-1})^{\nu}}_{\mu}(\partial_{\nu}\phi)(\Lambda^{-1}x))}{\partial({(\Lambda^{-1})^{\beta}}_{\alpha}(\partial_{\beta}\phi)(\Lambda^{-1}x))}=\frac{\partial({(\Lambda^{-1})^{\nu}}_{\mu}(\partial_{\nu}\phi)(\Lambda^{-1}x))}{\partial({(\Lambda)_{\alpha}}^{\beta}(\partial_{\beta}\phi)(\Lambda^{-1}x))}##

How should I now take ##{(\Lambda)_{\alpha}}^{\beta}## to the numerator?

JorisL · Apr 22, 2016

You can take ##\Lambda## outside of the derivatives. But I suggest you look at a text on (special) relativity.

I suppose you're studying relativistic field theory? This means that knowing how to quickly read and interpret indices (upper/lower) can help you focus on the physics content instead of the manipulations of expressions.

spaghetti3451 · Apr 28, 2016

Thanks for the reply.

Let me finish the steps of my derivation:

##\frac{\partial({(\Lambda^{-1})^{\nu}}_{\mu}(\partial_{\nu}\phi)(\Lambda^{-1}x))}{\partial({(\Lambda)_{\alpha}}^{\beta}(\partial_{\beta}\phi)(\Lambda^{-1}x))}=\frac{\partial((\partial_{\beta}\phi)(\Lambda^{-1}x))}{\partial({(\Lambda)_{\alpha}}^{\beta}(\partial_{\beta}\phi)(\Lambda^{-1}x))}\frac{\partial({(\Lambda^{-1})^{\nu}}_{\mu}(\partial_{\nu}\phi)(\Lambda^{-1}x))}{\partial((\partial_{\beta}\phi)(\Lambda^{-1}x))}={(\Lambda^{-1})^{\nu}}_{\mu}{(\Lambda^{-1})^{\beta}}_{\alpha}\frac{\partial((\partial_{\nu}\phi)(\Lambda^{-1}x))}{\partial((\partial_{\beta}\phi)(\Lambda^{-1}x))}={(\Lambda^{-1})^{\nu}}_{\mu}{(\Lambda)_{\alpha}}^{\beta}\frac{\partial((\partial_{\nu}\phi)(\Lambda^{-1}x))}{\partial((\partial_{\beta}\phi)(\Lambda^{-1}x))}##

JorisL said:

I suppose you're studying relativistic field theory? This means that knowing how to quickly read and interpret indices (upper/lower) can help you focus on the physics content instead of the manipulations of expressions.

Hmm. I guess that's very important. I was just trying to practice my skills in tensor manipulations since I'm still new to this kind of math.

spaghetti3451 · May 4, 2016

Wait! The order of indices on ##{(\Lambda)_{\alpha}}^{\beta}## in my final result is not the same as the order of indices in your ##{\Lambda^{\alpha}}_{\beta}##.

Did I make a mistake in the second step of my calculation in the previous post?
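
The index-order question can be probed numerically. The sketch below assumes the (+,-,-,-) signature, a boost along ##x## with ##\beta = 0.6##, and made-up sample components ##p_\beta## standing in for ##\partial_\beta\phi##. It suggests that ##{\Lambda_{\alpha}}^{\beta}## has the same components as ##{(\Lambda^{-1})^{\beta}}_{\alpha}##, whereas the factor that undoes the transformation ##p'_\alpha = {(\Lambda^{-1})^{\beta}}_{\alpha}\,p_\beta## is ##{\Lambda^{\alpha}}_{\beta}##.

```python
import numpy as np

eta = np.diag([1.0, -1.0, -1.0, -1.0])  # eta_{mu nu}, signature assumed
eta_inv = np.linalg.inv(eta)            # eta^{mu nu}

# Assumed example: boost along x with beta = 0.6; Lam[a, b] = Lambda^a_b.
beta = 0.6
gamma = 1.0 / np.sqrt(1.0 - beta**2)
Lam = np.eye(4)
Lam[0, 0] = Lam[1, 1] = gamma
Lam[0, 1] = Lam[1, 0] = -gamma * beta
Lam_inv = np.linalg.inv(Lam)            # Lam_inv[b, a] = (Lambda^{-1})^b_a

# Lambda_alpha^beta = eta_{alpha rho} Lambda^rho_sigma eta^{sigma beta},
# stored with alpha as the first array index and beta as the second.
Lam_low_up = np.einsum("ar,rs,sb->ab", eta, Lam, eta_inv)
same_as_inverse = np.allclose(Lam_low_up, Lam_inv.T)

# Hypothetical lower-index components p_beta and their transform
# p'_alpha = (Lambda^{-1})^beta_alpha p_beta.
p = np.array([2.0, 3.0, 5.0, 7.0])
p_prime = np.einsum("ba,b->a", Lam_inv, p)

# Undoing the transformation with Lambda^alpha_beta recovers p_beta.
p_back = np.einsum("ab,a->b", Lam, p_prime)

# Using Lambda_alpha^beta in its place does not recover p_beta.
p_wrong = np.einsum("ab,a->b", Lam_low_up, p_prime)

print(same_as_inverse)
print(np.allclose(p_back, p))
print(np.allclose(p_wrong, p))
```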
