Differentiation of a product of 4-gradients wrt a 4-gradient

  • Context: Graduate 
  • Thread starter: spaghetti3451
  • Tags: Differentiation, Product

Discussion Overview

The discussion revolves around the differentiation of a product of 4-gradients with respect to a 4-gradient in the context of relativistic field theory. Participants explore the mathematical formulation and implications of this differentiation, addressing issues related to index notation and the application of the metric tensor.

Discussion Character

  • Technical explanation
  • Mathematical reasoning
  • Debate/contested

Main Points Raised

  • One participant claims that the differentiation of the product of 4-gradients yields a specific result, but seeks verification of their calculations.
  • Another participant points out inconsistencies in index notation and suggests a reformulation to avoid ill-defined expressions.
  • A later reply acknowledges previous mistakes regarding index placement and the inclusion of a factor of 1/2 in the differentiation process.
  • Participants discuss the implications of index positions and transformations under coordinate changes, questioning the correctness of certain assumptions about the metric tensor and its relationship to the Kronecker delta.
  • There is a suggestion to explore the transformation properties of derivatives with respect to 4-gradients, emphasizing the tensorial nature of these objects.
  • One participant expresses uncertainty about the proper treatment of indices during transformations and seeks clarification on how to manipulate these expressions correctly.

Areas of Agreement / Disagreement

Participants exhibit a mix of agreement and disagreement regarding the correctness of their mathematical manipulations and the implications of index notation. Multiple competing views on the treatment of indices and the differentiation process remain unresolved.

Contextual Notes

Limitations include potential misunderstandings of index notation, the need for clarity on the implications of coordinate transformations, and unresolved questions about the application of the metric tensor in this context.

spaghetti3451
I know that ##\frac{\partial}{\partial (\partial_{\mu}\phi)} \big( \partial_{\mu} \phi\ \partial^{\mu} \phi \big) = \partial_{\mu} \phi##.

Now, I need to prove this to myself.

So, here goes nothing.

##\frac{\partial}{\partial (\partial_{\mu}\phi)} \big( \partial_{\mu} \phi\ \partial^{\mu} \phi \big)##
## = \frac{\partial}{\partial (\partial_{\mu}\phi)} \big( \eta^{\mu\nu}\partial_{\mu} \phi\ \partial_{\nu} \phi \big)##
##= \eta^{\mu\nu}\ \partial_{\nu} \phi + \eta_{\mu\nu} \eta^{\mu\nu}\ \partial_{\mu} \phi##,

where I first differentiated the factor ##\partial_{\mu}\phi## with respect to ##\partial_{\mu}\phi## and then I differentiated the factor ##\partial_{\nu}\phi## with respect to ##\partial_{\mu}\phi##.

Am I correct so far?
 
Your indices in the second term don't match with those in the first so that's an indication something is wrong.

Try using the following form

$$\frac{\partial}{\partial (\partial_{\alpha}\phi)} \big( \partial_{\mu} \phi\ \partial^{\mu} \phi \big)$$

This helps you avoid the problem where you have ill-defined expressions like ##\eta_{\mu\nu}\eta^{\mu\nu}\partial_\mu\phi##.
The latter can mean two things: either
$$\eta_{\mu\nu}\left(\eta^{\mu\nu}\partial_\mu\phi\right)=\eta_{\mu\nu}\partial^\nu\phi=\partial_\mu\phi$$
or it can mean (using ##\eta_{\mu\nu}\eta^{\mu\nu}=D## with D the number of spacetime dimensions)
$$\left(\eta_{\mu\nu}\eta^{\mu\nu}\right)\partial_\mu\phi=4\partial_\mu\phi$$
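The two readings above can be compared numerically. The following is only a minimal sketch: the signature (+, -, -, -) and the sample components standing in for ##\partial_\mu\phi## are assumptions made for illustration, since the thread fixes neither.

```python
# Minkowski metric in Cartesian coordinates. The signature (+, -, -, -) is an
# assumption; eta_{mu nu} and eta^{mu nu} then have identical components.
D = 4
eta_diag = [1.0, -1.0, -1.0, -1.0]
eta_lower = [[eta_diag[m] if m == n else 0.0 for n in range(D)] for m in range(D)]
eta_upper = [row[:] for row in eta_lower]

# Hypothetical sample components standing in for d_mu phi.
dphi_lower = [2.0, 3.0, 5.0, 7.0]

# Full contraction eta_{mu nu} eta^{mu nu}: a pure number equal to D.
trace = sum(eta_lower[m][n] * eta_upper[m][n] for m in range(D) for n in range(D))

# Reading 1, written with distinct dummy indices:
# eta_{alpha nu} eta^{nu rho} d_rho phi, which should return d_alpha phi.
reading_one = [
    sum(eta_lower[a][n] * eta_upper[n][r] * dphi_lower[r]
        for n in range(D) for r in range(D))
    for a in range(D)
]

# Reading 2: (eta_{mu nu} eta^{mu nu}) d_alpha phi, i.e. D times d_alpha phi.
reading_two = [trace * component for component in dphi_lower]

print("trace:", trace)
print("reading 1:", reading_one)
print("reading 2:", reading_two)
```

The two lists differ by the factor D, which is why an expression that uses the same index three times has no single meaning.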
 
Firstly, I need to say that ##\frac{\partial}{\partial (\partial_{\mu}\phi)} \big( \frac{1}{2}\partial_{\mu} \phi\ \partial^{\mu} \phi \big) = \partial^{\mu} \phi##.
In my first post I mistakenly placed the ##\mu## index on the RHS downstairs instead of upstairs, and I also forgot the factor of ##\frac{1}{2}##.

Now,

##\frac{\partial}{\partial (\partial_{\alpha}\phi)} \big( \frac{1}{2}\partial_{\mu} \phi\ \partial^{\mu} \phi \big)##
##=\frac{1}{2}\eta^{\mu\nu}\frac{\partial}{\partial (\partial_{\alpha}\phi)} \big( \partial_{\mu} \phi\ \partial_{\nu} \phi \big)##
##=\frac{1}{2}\eta^{\mu\nu}(\eta_{\alpha\mu}\partial_{\nu}\phi+\eta_{\alpha\nu}\partial_{\mu}\phi)##
##=\frac{1}{2}(\eta_{\alpha\mu}\eta^{\mu\nu}\partial_{\nu}\phi+\eta_{\alpha\nu}\eta^{\nu\mu}\partial_{\mu}\phi)##
##=\eta_{\alpha\mu}\eta^{\mu\nu}\partial_{\nu}\phi##
##=\delta^{\nu}_{\alpha}\partial_{\nu}\phi##
##=\partial_{\alpha}\phi##

Am I correct?
 
Why do you use the following?
$$
\frac{\partial\left(\partial_\mu\phi\right)}{\partial\left(\partial_\alpha\phi\right)}=\eta_{\mu\alpha}
$$

A hint that something is wrong is that the indices don't correspond. You can check this by performing a transformation ##x^\mu\to x^{\prime\mu}##; you should find that there is one upper and one lower index. [*]

It's equal to ##\delta^\alpha_\mu\,\,\,\left(=\eta^\alpha_{\,\,\, \mu}\right)## as far as I can tell.

[*]: Oddly it seems that this would turn out correct if you contract the metric so I'm starting to doubt myself.
 
Ok, let me rework my answer using your suggestion.

##\frac{\partial}{\partial (\partial_{\alpha}\phi)} \big( \frac{1}{2}\partial_{\mu} \phi\ \partial^{\mu} \phi \big)##
##=\frac{1}{2}\eta^{\mu\nu}\frac{\partial}{\partial (\partial_{\alpha}\phi)} \big( \partial_{\mu} \phi\ \partial_{\nu} \phi \big)##
##=\frac{1}{2}\eta^{\mu\nu}({\eta^{\alpha}}_{\mu}\partial_{\nu}\phi+{\eta^{\alpha}}_{\nu}\partial_{\mu}\phi)##
##=\frac{1}{2}({\eta^{\alpha}}_{\mu}\eta^{\mu\nu}\partial_{\nu}\phi+{\eta^{\alpha}}_{\nu}\eta^{\nu\mu}\partial_{\mu}\phi)##
##={\eta^{\alpha}}_{\mu}\eta^{\mu\nu}\partial_{\nu}\phi##
##=\eta^{\alpha\nu}\partial_{\nu}\phi##
##=\partial^{\alpha}\phi##

It appears that in the answer, the index ##\alpha## should be upstairs, not downstairs. I made this mistake in my previous post.
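As a sanity check on the index placement, the derivative can be approximated by finite differences. This is a rough sketch under stated assumptions: the signature (+, -, -, -), the sample values, and the helper names `kinetic_term` and `derivative_wrt_lower` are all hypothetical, and `p_lower` simply stands in for the components ##\partial_\mu\phi## treated as independent variables.

```python
ETA_DIAG = [1.0, -1.0, -1.0, -1.0]  # assumed signature (+, -, -, -)

def kinetic_term(p):
    """(1/2) eta^{mu nu} p_mu p_nu, with p_mu standing in for d_mu phi."""
    return 0.5 * sum(ETA_DIAG[m] * p[m] * p[m] for m in range(4))

def derivative_wrt_lower(f, p, alpha, h=1e-4):
    """Central-difference approximation of d f / d p_alpha."""
    up = p[:]
    up[alpha] += h
    down = p[:]
    down[alpha] -= h
    return (f(up) - f(down)) / (2 * h)

# Hypothetical sample components for d_mu phi (lower index).
p_lower = [2.0, 3.0, 5.0, 7.0]

# Raised components d^mu phi = eta^{mu nu} d_nu phi.
p_upper = [ETA_DIAG[m] * p_lower[m] for m in range(4)]

numeric = [derivative_wrt_lower(kinetic_term, p_lower, a) for a in range(4)]

print("numeric derivative:", numeric)
print("d^alpha phi:       ", p_upper)
print("d_alpha phi:       ", p_lower)
```

With this signature the numerical derivative tracks the raised components, whose spatial entries carry the opposite sign from the lowered ones, in line with the result ##\partial^{\alpha}\phi##.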
 
Try to see why the ##\alpha## index is upstairs and also why you can write ##\eta^\alpha_{\,\,\,\mu}=\delta^\alpha_\mu## where delta is the Kronecker delta.

Other than that it is correct.
 
JorisL said:
Try to see why the ##\alpha## index is upstairs

Shouldn't the index ##\alpha## be upstairs because we are differentiating the product of an upstairs index and a downstairs index with respect to a downstairs index?

JorisL said:
and also why you can write ##\eta^\alpha_{\,\,\,\mu}=\delta^\alpha_\mu## where delta is the Kronecker delta.

I think ##\eta^\alpha_{\,\,\,\mu}=\delta^\alpha_\mu## because ##\eta^\alpha_{\,\,\, \mu} = \eta^{\alpha\nu}\eta_{\nu\mu} = \delta^\alpha_\mu##.

Am I correct?
 
The latter part is perfect.

The first part becomes clearest when you explicitly change coordinates. In other words, how does the following object transform?

$$
\partial_\mu\phi(x)=\frac{\partial\phi(x)}{\partial x^\mu}
$$

Apply a coordinate transformation ##x^\mu\to x^{\prime\mu}## and look at the way the Jacobian shows up.
The Jacobian is given by the following (or by its inverse; which one doesn't matter much, since we only consider invertible transformations)

$$
\frac{\partial x^{\prime\mu}}{\partial x^{\nu}}
$$
 
Under the coordinate transformation ##x^{\mu}\rightarrow x'^{\mu}={\Lambda^{\mu}}_{\nu}x^{\nu}##,

##\partial_{\mu}\phi(x)~=~\frac{\partial\phi(x)}{\partial x^{\mu}}~##

##\rightarrow \frac{\partial\phi(\Lambda^{-1}x)}{\partial x^{\mu}}##
##=\frac{\partial (\Lambda^{-1}x)^{\nu}}{\partial x^{\mu}}\frac{\partial\phi(\Lambda^{-1}x)}{\partial (\Lambda^{-1}x)^{\nu}}##
##=\frac{\partial}{\partial x^{\mu}}\Big( {(\Lambda^{-1})^{\nu}}_{\rho}x^{\rho} \Big)(\partial_{\nu}\phi)(\Lambda^{-1}x)##
##={(\Lambda^{-1})^{\nu}}_{\rho}\delta^{\rho}_{\mu}(\partial_{\nu}\phi)(\Lambda^{-1}x)##
##={(\Lambda^{-1})^{\nu}}_{\mu}(\partial_{\nu}\phi)(\Lambda^{-1}x)##.

How does this help?
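The transformation rule derived above can be spot-checked numerically. The sketch below works in 1+1 dimensions to stay short, and everything in it is an assumption chosen for illustration: the scalar field `phi`, the rapidity, the sample point, and the helper names.

```python
import math

def phi(point):
    """A hypothetical scalar field phi(t, x), chosen only for illustration."""
    t, x = point
    return t * t * x + 2.0 * x

def grad_phi(point):
    """Analytic components (d_t phi, d_x phi) of the field above."""
    t, x = point
    return [2.0 * t * x, t * t + 2.0]

# Boost with an assumed rapidity. lam_inv[nu][rho] = (Lambda^{-1})^nu_rho,
# the inverse of the boost [[cosh, -sinh], [-sinh, cosh]].
rapidity = 0.3
ch, sh = math.cosh(rapidity), math.sinh(rapidity)
lam_inv = [[ch, sh], [sh, ch]]

def apply_matrix(matrix, vec):
    return [sum(matrix[i][j] * vec[j] for j in range(2)) for i in range(2)]

def phi_transformed(point):
    """The transformed field x -> phi(Lambda^{-1} x)."""
    return phi(apply_matrix(lam_inv, point))

def numeric_gradient(f, point, h=1e-5):
    """Central-difference gradient with respect to the coordinates x^mu."""
    grad = []
    for mu in range(2):
        up = point[:]
        up[mu] += h
        down = point[:]
        down[mu] -= h
        grad.append((f(up) - f(down)) / (2 * h))
    return grad

x_point = [0.7, -1.2]  # arbitrary sample point

# Left side: d/dx^mu of phi(Lambda^{-1} x), computed numerically.
lhs = numeric_gradient(phi_transformed, x_point)

# Right side: (Lambda^{-1})^nu_mu (d_nu phi)(Lambda^{-1} x).
inner = grad_phi(apply_matrix(lam_inv, x_point))
rhs = [sum(lam_inv[nu][mu] * inner[nu] for nu in range(2)) for mu in range(2)]

print("lhs:", lhs)
print("rhs:", rhs)
```

The two sides agree to within the finite-difference error, which illustrates that ##\partial_\mu\phi## picks up one factor of ##\Lambda^{-1}## contracted on its first index.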
 
Well, now use it in the derivative w.r.t. ##\partial_\mu\phi##.
You should get something of the form
$$
\frac{\partial\left(\partial_\mu\phi\right)}{\partial\left(\partial_\alpha\phi\right)}\to \left(\Lambda^{-1}\right)^\nu_{\,\,\, \mu}\Lambda^\alpha_{\,\,\, \beta}\frac{\partial\left(\partial_\nu\phi(x^\prime)\right)}{\partial\left(\partial_\beta\phi(x^\prime)\right)}
$$

This means that the object transforms as a (1,1)-tensor, i.e. it has one upper and one lower index.

This might help with the details https://www.physicsforums.com/threads/kronecker-delta-as-tensor-proof.320692/
 
Ok. So, under the coordinate transformation ##x^{\mu} \rightarrow {\Lambda^{\mu}}_{\nu}x^{\nu}##,

##\frac{\partial(\partial_{\mu}\phi(x))}{\partial(\partial_{\alpha}\phi(x))} \rightarrow \frac{\partial({(\Lambda^{-1})^{\nu}}_{\mu}(\partial_{\nu}\phi)(\Lambda^{-1}x))}{\partial({(\Lambda^{-1})^{\beta}}_{\alpha}(\partial_{\beta}\phi)(\Lambda^{-1}x))}=\frac{\partial({(\Lambda^{-1})^{\nu}}_{\mu}(\partial_{\nu}\phi)(\Lambda^{-1}x))}{\partial({(\Lambda)_{\alpha}}^{\beta}(\partial_{\beta}\phi)(\Lambda^{-1}x))}##

How should I now take ##{(\Lambda)_{\alpha}}^{\beta}## to the numerator?
 
You can take ##\Lambda## outside of the derivatives. But I suggest you look at a text on (special) relativity.

I suppose you're studying relativistic field theory? This means that knowing how to quickly read and interpret indices (upper/lower) can help you focus on the physics content instead of the manipulations of expressions.
 
Thanks for the reply.

Let me finish the steps of my derivation:

##\frac{\partial({(\Lambda^{-1})^{\nu}}_{\mu}(\partial_{\nu}\phi)(\Lambda^{-1}x))}{\partial({(\Lambda)_{\alpha}}^{\beta}(\partial_{\beta}\phi)(\Lambda^{-1}x))}=\frac{\partial((\partial_{\beta}\phi)(\Lambda^{-1}x))}{\partial({(\Lambda)_{\alpha}}^{\beta}(\partial_{\beta}\phi)(\Lambda^{-1}x))}\frac{\partial({(\Lambda^{-1})^{\nu}}_{\mu}(\partial_{\nu}\phi)(\Lambda^{-1}x))}{\partial((\partial_{\beta}\phi)(\Lambda^{-1}x))}={(\Lambda^{-1})^{\nu}}_{\mu}{(\Lambda^{-1})^{\beta}}_{\alpha}\frac{\partial((\partial_{\nu}\phi)(\Lambda^{-1}x))}{\partial((\partial_{\beta}\phi)(\Lambda^{-1}x))}={(\Lambda^{-1})^{\nu}}_{\mu}{(\Lambda)_{\alpha}}^{\beta}\frac{\partial((\partial_{\nu}\phi)(\Lambda^{-1}x))}{\partial((\partial_{\beta}\phi)(\Lambda^{-1}x))}##

JorisL said:
I suppose you're studying relativistic field theory? This means that knowing how to quickly read and interpret indices (upper/lower) can help you focus on the physics content instead of the manipulations of expressions.

Hmm. I guess that's very important. I was just trying to practice my skills in tensor manipulations since I'm still new to this kind of math.
 
Wait! The order of indices on ##{(\Lambda)_{\alpha}}^{\beta}## in my final result is not the same as the order of indices in your ##{\Lambda^{\alpha}}_{\beta}##.

Did I make a mistake in the second step of my calculation in the previous post?
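One way to probe this question is a small numerical experiment. The sketch below is not from the thread: it assumes 2+1 dimensions (so that ##\Lambda## can be non-symmetric), the signature (+, -, -), and an arbitrary boost followed by a rotation. It checks two things: that ##{(\Lambda^{-1})^{\beta}}_{\alpha}={\Lambda_{\alpha}}^{\beta}##, and which matrix undoes the map ##Y_\alpha={(\Lambda^{-1})^{\beta}}_{\alpha}X_\beta##.

```python
import math

N = 3  # components (t, x, y); 2+1 dimensions is an assumption for brevity

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

# Assumed sample transformation: boost along x, then rotation in the xy-plane.
ch, sh = math.cosh(0.4), math.sinh(0.4)
co, si = math.cos(0.9), math.sin(0.9)
boost = [[ch, -sh, 0.0], [-sh, ch, 0.0], [0.0, 0.0, 1.0]]
boost_inv = [[ch, sh, 0.0], [sh, ch, 0.0], [0.0, 0.0, 1.0]]
rot = [[1.0, 0.0, 0.0], [0.0, co, -si], [0.0, si, co]]
rot_inv = [[1.0, 0.0, 0.0], [0.0, co, si], [0.0, -si, co]]

lam = matmul(rot, boost)              # lam[mu][nu] = Lambda^mu_nu
lam_inv = matmul(boost_inv, rot_inv)  # lam_inv[mu][nu] = (Lambda^{-1})^mu_nu

eta = [[1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, -1.0]]

# Lambda_alpha^beta = eta_{alpha mu} Lambda^mu_nu eta^{nu beta},
# stored as lam_lowered[alpha][beta].
lam_lowered = matmul(matmul(eta, lam), eta)

# Largest deviation between Lambda_alpha^beta and (Lambda^{-1})^beta_alpha.
identity_gap = max(abs(lam_lowered[a][b] - lam_inv[b][a])
                   for a in range(N) for b in range(N))

# Y_alpha = (Lambda^{-1})^beta_alpha X_beta for hypothetical components X_beta.
X = [2.0, 3.0, 5.0]
Y = [sum(lam_inv[b][a] * X[b] for b in range(N)) for a in range(N)]

# Candidate inverse map: X_beta = Lambda^alpha_beta Y_alpha.
X_back = [sum(lam[a][b] * Y[a] for a in range(N)) for b in range(N)]

print("max |Lambda_a^b - (Lambda^-1)^b_a| =", identity_gap)
print("X:     ", X)
print("X_back:", X_back)
```

Since ##X_\beta={\Lambda^{\alpha}}_{\beta}Y_\alpha## recovers the original components, this suggests ##\partial X_\beta/\partial Y_\alpha={\Lambda^{\alpha}}_{\beta}##, so the factor pulled out of the denominator would carry ##\alpha## upstairs and ##\beta## downstairs.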
 
