Gold Member
I am reading Spacetime and Geometry: An Introduction to General Relativity by Sean M. Carroll and have arrived at chapter 3, where he introduces the covariant derivative ##{\mathrm{\nabla }}_{\mu }##. He imposes the following requirements on it: \begin{align}

\mathrm{1.\ Linearity:}\mathrm{\ }\mathrm{\nabla }\left(T+S\right)=\mathrm{\nabla }T+\mathrm{\nabla }S & \phantom {10000}(1) \nonumber \\

\mathrm{2.\ Leibniz\ rule:}\mathrm{\nabla }\left(T\ \otimes \ \ S\right)=\left(\mathrm{\nabla }T\right)\ \ \otimes \ \ S+T\ \otimes \ \ \left(\mathrm{\nabla }S\right) & \phantom {10000}(2) \nonumber \\

{\mathrm{3.\ Commutes\ with\ contractions:}\mathrm{\nabla }}_{\mu }\left(T^{\lambda }_{\ \ \ \lambda \rho }\right)={\left(\mathrm{\nabla }T\right)}^{\mathrm{\ \ \ }\lambda}_{\mu \ \ \lambda \rho } & \phantom {10000}(3) \nonumber\\

{\mathrm{4.\ Reduces\ to\ partial\ derivative\ on\ scalars:}\mathrm{\nabla }}_{\mu }\phi ={\partial }_{\mu }\phi & \phantom {10000}(4) \nonumber \\

\end{align}Rules 1, 2 and 4 seem reasonable, but I cannot understand 3, and he does not seem to use it, even though he implies that he does.

The LHS of (3) seems straightforward:\begin{align}

{\mathrm{\nabla }}_{\mu }\left(T^{\lambda }_{\ \ \ \lambda \rho }\right) & ={\partial }_{\mu }T^{\lambda }_{\ \ \ \lambda \rho }+{\mathrm{\Gamma }}^{\lambda }_{\mu \kappa }T^{\kappa }_{\ \ \ \lambda \rho }-{\mathrm{\Gamma }}^{\kappa }_{\mu \lambda }T^{\lambda }_{\ \ \ \kappa \rho }-{\mathrm{\Gamma }}^{\kappa }_{\mu \rho }T^{\lambda }_{\ \ \ \lambda \kappa} & \phantom {10000}(5) \nonumber \\

& ={\partial }_{\mu }T^{\lambda }_{\ \ \ \lambda \rho }-{\mathrm{\Gamma }}^{\kappa }_{\mu \rho }T^{\lambda }_{\ \ \ \lambda \kappa} & \phantom {10000}(6) \nonumber \\

\end{align}which is very like the rule for the covariant derivative of a (0,1) tensor, because the second and third terms of (5) cancel after relabelling the dummy indices.

I understand that the ##\mathrm{\nabla }T## in (1) and (2) means ##{\mathrm{\nabla }}_{\sigma}T## where ##T## is some tensor. So the RHS of (3) appears to be ##{\left({\mathrm{\nabla }}_{\sigma}T\right)}^{\mathrm{\ \ \ }\lambda}_{\mu \ \ \lambda \rho }##, which leaves too many indices on the RHS. Otherwise the RHS is some kind of derivative with one contravariant and three covariant indices. What is that?

Help!

Staff Emeritus
Homework Helper
Gold Member
(3) is essentially what gives you the action on covariant indices (i.e., the negative sign that appears), which you have used when you say that it is straight-forward.

##\nabla## in ##\nabla T## generally does not need an index. If ##T## is an ##(n,m)## tensor, then ##\nabla T## is an ##(n,m+1)## tensor. Both (1) and (2) are written in non-index form. In general, ##\nabla T## is defined according to
$$(\nabla T)(V, \ldots) = (\nabla_V T)(\ldots),$$
where ##\ldots## are the arguments of ##T##. (Note that ##\nabla_V T## is an ##(n,m)## tensor if ##T## is.)
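
In components, this definition can be sketched as follows for a ##(1,2)## tensor. The index placement follows (3), and the vector components ##V^\mu## and the dummy index ##\kappa## are introduced here only for illustration:
$$(\nabla_V T)^{\nu}{}_{\lambda\rho} = V^\mu \nabla_\mu T^{\nu}{}_{\lambda\rho} \quad\Longrightarrow\quad (\nabla T)_{\mu}{}^{\nu}{}_{\lambda\rho} = \nabla_\mu T^{\nu}{}_{\lambda\rho}.$$
So ##\nabla T## is a ##(1,3)## tensor whose extra lower index ##\mu## is the derivative slot, and the right-hand side of (3) is that tensor with its upper index contracted against one of the original lower indices:
$$(\nabla T)_{\mu}{}^{\lambda}{}_{\lambda\rho} = \delta^{\kappa}_{\nu}\,(\nabla T)_{\mu}{}^{\nu}{}_{\kappa\rho}.$$
Read this way, both sides of (3) have exactly two free indices, ##\mu## and ##\rho##.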

Staff Emeritus
Here's the way I understand #3.

##g^{\lambda \nu} (\nabla_\mu T_{\nu \lambda \rho}) = \nabla_\mu (g^{\lambda \nu} T_{\nu \lambda \rho})##

Staff Emeritus
Homework Helper
Gold Member
There is nothing generally stating that there needs to exist a metric. Instead, (3) essentially gives the relation between the actions of the covariant derivative on upper/lower indices. More specifically, if you define ##\nabla_\mu v^\nu = \partial_\mu v^\nu + \Gamma^\nu_{\mu \lambda} v^\lambda##, then you must have (from (2), (3), and (4))
$$\nabla_\mu(v^\nu\omega_\nu) = v^\nu \partial_\mu \omega_\nu + \omega_\nu \partial_\mu v^\nu = \omega_\nu \nabla_\mu v^\nu + v^\nu \nabla_\mu \omega_\nu = v^\nu \nabla_\mu \omega_\nu + \omega_\nu \partial_\mu v^\nu + \Gamma^\lambda_{\mu\nu} v^\nu \omega_\lambda.$$
Solving for ##\nabla_\mu \omega_\nu## then leads to (also using the quotient rule)
$$\nabla_\mu \omega_\nu = \partial_\mu \omega_\nu - \Gamma^\lambda_{\mu\nu} \omega_\lambda.$$
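
Spelling out the step between the last two equations, as a sketch: the term ##\omega_\nu \partial_\mu v^\nu## appears on both sides of the chain of equalities and cancels, leaving
$$v^\nu \partial_\mu \omega_\nu = v^\nu \nabla_\mu \omega_\nu + \Gamma^\lambda_{\mu\nu} v^\nu \omega_\lambda \quad\Longleftrightarrow\quad v^\nu\left(\nabla_\mu \omega_\nu - \partial_\mu \omega_\nu + \Gamma^\lambda_{\mu\nu}\omega_\lambda\right) = 0 .$$
Since this has to hold for every vector field ##v^\nu##, the bracket must vanish component by component, which gives the formula above.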

Gold Member
Amazingly, I had just read $$g_{\mu \lambda} \nabla_\mu V^\lambda = \nabla_\rho (g_{\lambda \mu} V^\lambda)$$ in the book and I was trying to work out how it came from (3). stevendaryl has delivered me. It's very like his
$$g^{\lambda \nu} (\nabla_\mu T_{\nu \lambda \rho}) = \nabla_\mu (g^{\lambda \nu} T_{\nu \lambda \rho})$$
So I'll take that (slightly generalised) as the answer! Thanks stevendaryl.

Staff Emeritus
Homework Helper
Gold Member
It does not come from (3), it comes from requiring that the connection is metric compatible, i.e., that ##\nabla g = 0##, which is a completely separate condition (and after that it follows from (2) and (3)). (3) is a requirement for any connection.
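
The role of metric compatibility can be illustrated with a small pointwise check. This is only a sketch under stated assumptions: random integers stand in for the values of ##\Gamma##, ##g##, ##V## and their partial derivatives at a single point, and every name in it (`Gamma`, `dV`, `compatible_dg`, `incompatible_dg`, `lowering_commutes`) is hypothetical. It compares ##g_{\lambda\mu}\nabla_\rho V^\lambda## with ##\nabla_\rho V_\mu##, once with ##\partial g## chosen so that ##\nabla g = 0## and once with a choice that violates it.

```python
import itertools
import random

DIM = 4
R = range(DIM)
rng = random.Random(1)  # fixed seed so the run is repeatable


def rand_array(rank):
    """Random integer components keyed by index tuples, e.g. a[l, m, n]."""
    return {ix: rng.randint(-9, 9) for ix in itertools.product(R, repeat=rank)}


# Values at one point (all illustrative, not taken from any real spacetime).
Gamma = rand_array(3)                    # Gamma[l, r, k] = Gamma^l_{rk}
V = [rng.randint(-9, 9) for _ in R]      # V[l] = V^l
dV = rand_array(2)                       # dV[r, l] = partial_r V^l
g = rand_array(2)                        # g[l, m] = g_{lm}
for l, m in itertools.product(R, repeat=2):
    g[m, l] = g[l, m]                    # make the metric components symmetric


def compatible_dg(r, l, m):
    """partial_r g_{lm} as forced by the condition nabla g = 0."""
    return sum(Gamma[k, r, l] * g[k, m] + Gamma[k, r, m] * g[l, k] for k in R)


def incompatible_dg(r, l, m):
    """A deliberately incompatible choice: constant metric components."""
    return 0


def lowering_commutes(dg):
    """True if g_{lm} nabla_r V^l equals nabla_r V_m for all r, m."""
    V_low = [sum(g[l, k] * V[l] for l in R) for k in R]  # V_k = g_{lk} V^l
    for r, m in itertools.product(R, repeat=2):
        # Left side: lower the index after differentiating.
        lhs = sum(
            g[l, m] * (dV[r, l] + sum(Gamma[l, r, k] * V[k] for k in R))
            for l in R
        )
        # Right side: product rule for partial_r (g_{lm} V^l), then the
        # connection term for a lower index.
        dV_low = sum(dg(r, l, m) * V[l] + g[l, m] * dV[r, l] for l in R)
        rhs = dV_low - sum(Gamma[k, r, m] * V_low[k] for k in R)
        if lhs != rhs:
            return False
    return True
```

With `compatible_dg` the two sides agree identically, whereas with `incompatible_dg` they differ by ##V^\lambda \nabla_\rho g_{\lambda\mu}##, which is nonzero for generic values.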

Staff Emeritus

But as Orodruin points out, what I said is not exactly correct. It is only close to right in the case (which is true in General Relativity) where there is a metric and the connection is compatible with the metric.

The issue of "commuting with contractions" is more general than manifolds with a metric.

1. Take covariant derivative, then contract:

If ##T^\nu_{\ \ \lambda \rho}## is a tensor of rank ##(1,2)## (1 upper index, 2 lower indices), then we can get another tensor by taking the covariant derivative:

##\nabla_\mu T^\nu_{\ \ \lambda \rho}##, which is of rank ##(1,3)##.

You can then contract ##\lambda## and ##\nu## to get a tensor of rank ##(0,2)##:

##\sum_\lambda \nabla_\mu T^\lambda_{\ \ \lambda \rho}##

That has two free indices, ##\mu, \rho##

So we went from rank ##(1,2)## to rank ##(1,3)## to rank ##(0,2)##

2. Contract, then take covariant derivative:

Alternatively, we can first contract to get ##\sum_\lambda T^\lambda_{\ \ \lambda \rho}##, which is of rank ##(0,1)##. Then we can take a covariant derivative to get:

##\nabla_\mu \sum_\lambda T^\lambda_{\ \ \lambda \rho}##, which again has two free indices, ##\mu, \rho##.

This time, we went from rank ##(1,2)## to ##(0,1)## to ##(0,2)##

Saying that contraction commutes with covariant derivatives says that the resulting ##(0,2)## tensors are the same.

I have trouble writing all the indices correctly, but I think this is the identity:

##(\nabla_\mu T)^\lambda_{\ \ \lambda \rho} = \nabla_\mu (T^\lambda_{\ \ \lambda \rho})##
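
The two routes above can be compared numerically at a single point. The following is a minimal sketch, not anything from Carroll's book: random integers stand in for the values of ##\Gamma##, ##T## and ##\partial T## at that point, and the names `Gamma`, `T`, `dT`, `lower_sign` and `commutes` are hypothetical. The `lower_sign` parameter is the sign in front of the connection terms for lower indices, so the check can also be run with the "wrong" sign.

```python
import itertools
import random

DIM = 4
R = range(DIM)
rng = random.Random(0)  # fixed seed so the run is repeatable


def rand_array(rank):
    """Random integer components keyed by index tuples, e.g. a[n, l, r]."""
    return {ix: rng.randint(-9, 9) for ix in itertools.product(R, repeat=rank)}


Gamma = rand_array(3)  # Gamma[l, m, n] = Gamma^l_{mn}
T = rand_array(3)      # T[n, l, r] = T^n_{lr}
dT = rand_array(4)     # dT[m, n, l, r] = partial_m T^n_{lr}


def nabla_T(m, n, l, r, lower_sign=-1):
    """Component nabla_m T^n_{lr} of the (1,3) tensor nabla T."""
    val = dT[m, n, l, r]
    val += sum(Gamma[n, m, k] * T[k, l, r] for k in R)               # upper index
    val += lower_sign * sum(Gamma[k, m, l] * T[n, k, r] for k in R)  # first lower
    val += lower_sign * sum(Gamma[k, m, r] * T[n, l, k] for k in R)  # second lower
    return val


def derivative_then_contract(m, r, lower_sign=-1):
    """Route 1: build nabla T, then set the upper index equal to the first lower."""
    return sum(nabla_T(m, l, l, r, lower_sign) for l in R)


def contract_then_derivative(m, r, lower_sign=-1):
    """Route 2: contract to the (0,1) tensor S_r = T^l_{lr}, then differentiate."""
    S = [sum(T[l, l, k] for l in R) for k in R]
    dS = sum(dT[m, l, l, r] for l in R)  # partial derivatives are linear
    return dS + lower_sign * sum(Gamma[k, m, r] * S[k] for k in R)


def commutes(lower_sign):
    """True if both routes give the same (0,2) components for every m, r."""
    return all(
        derivative_then_contract(m, r, lower_sign)
        == contract_then_derivative(m, r, lower_sign)
        for m in R
        for r in R
    )
```

With `lower_sign=-1` the two extra connection terms cancel pairwise and the routes agree. With `lower_sign=+1` they add instead of cancelling, so for generic values the routes disagree, which is one way to see that (3) is a real constraint on how ##\nabla## acts on lower indices.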

Gold Member
I love the idea that we can 1. take the covariant derivative, then contract, or 2. contract, then take the covariant derivative, and get the same answer. It sounds like commuting with contraction. But unfortunately I get 1. $$\sum_\lambda \nabla_\mu T^\lambda_{\ \ \lambda \rho}=\nabla_\mu (T^\lambda_{\ \ \lambda \rho})$$ and 2. $$\nabla_\mu \sum_\lambda T^\lambda_{\ \ \lambda \rho}=\nabla_\mu T^\lambda_{\ \ \lambda \rho}=\nabla_\mu ( T^\lambda_{\ \ \lambda \rho})$$ So, sadly, this appears to be a statement of the obvious. I also wrote the summations out explicitly, and 1 and 2 are the same by the linearity rule (1).

Going back to
It does not come from (3), it comes from requiring that the connection is metric compatible, i.e., that ##\nabla g = 0##, which is a completely separate condition (and after that it follows from (2) and (3)). (3) is a requirement for any connection.
A thousand apologies: I made a mistake in the equation I found and omitted the last part. It is (3.24) in the book and should have been: $$g_{\mu \lambda} \nabla_\rho V^\lambda = \nabla_\rho (g_{\lambda \mu} V^\lambda) = \nabla_\rho V_\mu$$ Leibniz (2) gives $$\nabla_\rho (g_{\lambda \mu} V^\lambda) = V^\lambda \nabla_\rho g_{\lambda \mu} + g_{\lambda \mu} \nabla_\rho V^\lambda$$ and metric compatibility (which Orodruin cleverly detected) makes ##\nabla_\rho g## vanish. The next step is $$g_{\lambda \mu} \nabla_\rho V^\lambda = \nabla_\rho V_\mu$$which is using (3)??
So the rules are used in a slightly different order from what Orodruin suggested. But it's a miracle to get so far with a wrong and incomplete equation. That usage of (3) (which does not depend on the fact that the metric was the tensor involved) makes me want to rewrite it as: $$T \nabla_\rho S = \nabla_\rho (TS)$$ if an index in ##T## and ##S## is contracted. Indices have been omitted in the above expression, but I don't know how to put them in in a general enough way. (They could be many up or down indices.) This is a generalisation of what stevendaryl wrote yesterday: ##g^{\lambda \nu} (\nabla_\mu T_{\nu \lambda \rho}) = \nabla_\mu (g^{\lambda \nu} T_{\nu \lambda \rho})##.
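
One possible way to make the index bookkeeping in (3.24) explicit, offered as a sketch rather than as Carroll's own derivation (the extra index ##\sigma## is introduced here only for illustration): apply Leibniz (2) to the uncontracted product ##g \otimes V##,
$$\nabla_\rho\left(g_{\lambda\mu}V^\sigma\right) = \left(\nabla_\rho g_{\lambda\mu}\right)V^\sigma + g_{\lambda\mu}\nabla_\rho V^\sigma .$$
Now contract ##\sigma## with ##\lambda##. Rule (3) says the contracted left-hand side is the covariant derivative of the contracted tensor ##g_{\lambda\mu}V^\lambda = V_\mu##, so
$$\nabla_\rho V_\mu = \nabla_\rho\left(g_{\lambda\mu}V^\lambda\right) = \left(\nabla_\rho g_{\lambda\mu}\right)V^\lambda + g_{\lambda\mu}\nabla_\rho V^\lambda = g_{\lambda\mu}\nabla_\rho V^\lambda ,$$
where the last equality uses metric compatibility, ##\nabla g = 0##. On this reading, (3) is what lets the Leibniz rule be applied to an expression with contracted indices. For general ##T## and ##S## with a contracted index the pattern would be
$$\nabla_\rho\left(T^{\cdots\lambda\cdots}S_{\cdots\lambda\cdots}\right) = \left(\nabla_\rho T^{\cdots\lambda\cdots}\right)S_{\cdots\lambda\cdots} + T^{\cdots\lambda\cdots}\,\nabla_\rho S_{\cdots\lambda\cdots},$$
which reduces to ##T \nabla_\rho S## only when ##\nabla T = 0##, as it is for the metric.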