# Covariant derivative definition in Wald

JonnyG
I'm working through Wald's "General Relativity" right now. My questions are actually about the math, but I figure that a few of you that frequent this part of the forums may have read this book and so will be in a good position to answer my questions. I have two questions:

1) Wald first defines a derivative operator ##\nabla## which maps smooth tensors of type ##(k,l)## to smooth tensors of type ##(k, l + 1)##. He defines the operator by its properties. The one that is bothering me is the fourth one. He writes "Consistency with the notion of tangent vectors as directional derivatives on scalar fields: For all ##f \in C^{\infty} M## and all ##t^a \in V_p##, $$t(f) = t^a \nabla_a f$$" What does the notation ##t^a \nabla_a f## mean? It looks like a contraction to me, but Wald explicitly says that the index attached to ##\nabla## is just for notational convenience. I was thinking, if ##t## were a vector field on ##M## (the smooth manifold in question), then ##tf \in C^{\infty} M## and so ##t^a \nabla_a f## could be interpreted as ##\nabla_a tf##, which would make sense. But ##t## is merely a tangent vector, so in this case I am confused.

2) My second question is this: After Wald lists the five properties that the covariant derivative operator is required to satisfy, he shows that such an operator does indeed exist. He says let ##\psi## be a smooth coordinate map on ##M## and let ##\{\partial/\partial x^\mu\}## and ##\{dx^\mu\}## be bases for the tangent space and cotangent space, respectively. Then given a smooth tensor field ##T^{a_1 \cdots a_k}_{b_1 \cdots b_l}##, take its components ##T^{\mu_1 \cdots \mu_k}_{\nu_1 \cdots \nu_l}## in the given coordinate basis and define ##\partial_c T^{a_1 \cdots a_k}_{b_1 \cdots b_l}## to be the tensor whose components are the partial derivatives ##\partial(T^{\mu_1 \cdots \mu_k}_{\nu_1 \cdots \nu_l} )/ \partial x^{\sigma}##.

I understand that because ##T## is a smooth tensor field, its components are smooth real-valued maps, and consequently we can take their partial derivatives. But with respect to which variable do we differentiate? The components of ##\partial_c T## (which is a type ##(k, l+1)## tensor) are supposed to be the partial derivatives of the component functions of ##T##, but if you ask me to take the partial derivative of a component function of ##T##, I ask, which of the ##n## variables do I differentiate with respect to?

I hope my misunderstandings are clear. If they aren't, please let me know and I will clear it up.

EDIT: In regards to my first question, I just realized that ##\nabla f## is dual to ##df##, and ##t^a## is a vector, and by the isomorphism ##V_p \simeq V_p^{**}##, the vector ##t^a## acts on the dual vector ##df##. So, though ##t^a \nabla_a f## isn't a contraction, writing it as one is indeed notationally convenient because it's a quick way to say that ##t^a## is acting on ##df##. Is this correct?

EDIT 2: I am going to take a quick stab at my second question by trying to answer it with an example. Please let me know if it is correct. Let us take a simple tensor field ##T## of type ##(2,2)## on a smooth ##3##-manifold. Suppose ##T## has the simple expansion, ##T = f \Big( \partial/\partial x^1 \otimes \partial/\partial x^2 \otimes dx^1 \otimes dx^2 \Big)## where ##f## is a smooth real valued function on ##M##. Then ##\nabla T = \frac{\partial f}{\partial x^1} \Big(\partial/\partial x^1 \otimes \partial/\partial x^2 \otimes dx^1 \otimes dx^2 \otimes dx^1 \Big)+ \frac{\partial f}{\partial x^2} \Big( \partial/\partial x^1 \otimes \partial/\partial x^2 \otimes dx^1 \otimes dx^2 \otimes dx^2 \Big) + \frac{\partial f}{\partial x^3} \Big( \partial/\partial x^1 \otimes \partial/\partial x^2 \otimes dx^1 \otimes dx^2 \otimes dx^3 \Big)##. Is this what Wald means?


Homework Helper
Gold Member
EDIT: In regards to my first question, I just realized that ##\nabla f## is dual to ##df##, and ##t^a## is a vector, and by the isomorphism ##V_p \simeq V_p^{**}##, the vector ##t^a## acts on the dual vector ##df##. So, though ##t^a \nabla_a f## isn't a contraction, writing it as one is indeed notationally convenient because it's a quick way to say that ##t^a## is acting on ##df##. Is this correct?
Yes, I think that's what Wald means. So when he writes ##t^a\nabla_af## he means ##t^a(\nabla f)_a##. The notation saves him writing two parentheses. I wonder whether the saving is worth it, given the confusion it causes.

He could have just written ##t^a\nabla f_a## which would mean the same thing without needing parentheses, employing an evaluation rule in which operators are evaluated from left to right except when overruled by a precedence rule or parentheses, and operators are 'pushed up the stack' (like on those lovely old-fashioned reverse-polish calculators) when the next symbol to their right is not a legal operand. Then the ##t^a## is pushed up because it can't operate on ##\nabla##. Then, when ##\nabla## operates on ##f##, it generates an operand that ##t^a## can use.

JonnyG
@andrewkirk I think I am wrong when I say that Wald means ##t^a \nabla_a f## to mean that ##t## acts on ##df##. I mean, why not just write ##t(df)## if that's what he meant? It isn't making sense to me. But you say that you think that is what Wald means...I think I am missing something here?

JonnyG
@andrewkirk Sorry Andrew, but you may have to spell it out for me. How is ##(\nabla f)_a## a dual vector?

Homework Helper
Gold Member
@andrewkirk I think I am wrong when I say that Wald means ##t^a \nabla_a f## to mean that ##t## acts on ##df##. I mean, why not just write ##t(df)## if that's what he meant? It isn't making sense to me. But you say that you think that is what Wald means...I think I am missing something here?
The reason I suggested that interpretation is because your OP quoted Wald as saying
Wald explicitly says that the index attached to ##\nabla## is just for notational convenience.
which rules out the most natural interpretation, which is that the subscript ##a## is not just a notational convenience but that instead ##\nabla_a## denotes the vector ##\partial/\partial x^a##. With that interpretation ##\nabla_a f## is a scalar and ##t^a\nabla_a f## denotes the Einstein sum ##t^a(\nabla_af)##.

I think that interpretation is the most likely one as it is specifically about directional derivatives. Perhaps it would be best to dismiss his comment about the subscript ##a## just being a notational convenience.

JonnyG
Oh wait. ##f## is a smooth map, so it is a smooth tensor of type ##(0,0)##. Thus ##\nabla f## is a smooth tensor of type ##(0,1)##, meaning it is a dual vector. So ##t## can actually act on ##\nabla f##. So when Wald says that "consistency with the notion of tangent vectors as directional derivatives on scalar fields", he means that we retain the notion that tangent vectors act as directional derivative operators not only on scalar fields, but also on ##\nabla f##.

I don't know how I missed that...by the way, am I correct in my answer to my second question?

Homework Helper
Gold Member
I don't understand your answer to 2. The role of ##f## is unclear. You appear to be either multiplying it by a tensor or trying to apply it to a tensor (which is illegal). What Wald means is

$$\nabla T=\left(\frac{\partial}{\partial x^\alpha} T^{\mu_1,...,\mu_k}_{\nu_1,...,\nu_l}\right)\left(\partial_{\mu_1}\otimes...\otimes\partial_{\mu_k}\otimes dx^{\nu_1}\otimes...\otimes dx^{\nu_l}\otimes dx^{\alpha}\right)$$

JonnyG
I don't understand your answer to 2. The role of ##f## is unclear. You appear to be either multiplying it by a tensor or trying to apply it to a tensor (which is illegal). What Wald means is

$$\nabla T=\left(\frac{\partial}{\partial x^\alpha} T^{\mu_1,...,\mu_k}_{\nu_1,...,\nu_l}\right)\left(\partial_{\mu_1}\otimes...\otimes\partial_{\mu_k}\otimes dx^{\nu_1}\otimes...\otimes dx^{\nu_l}\otimes dx^{\alpha}\right)$$
And that is using Einstein summation notation, so you are summing over ##\alpha##, right? That is what I wrote in my answer, but I suppose I was using bad notation. The ##f## was meant to denote the function component of the tensor field ##T##.

Thanks for your help. I really appreciate it!

Staff Emeritus
Homework Helper
Gold Member
2021 Award
I don't understand your answer to 2. The role of ##f## is unclear. You appear to be either multiplying it by a tensor or trying to apply it to a tensor (which is illegal). What Wald means is

$$\nabla T=\left(\frac{\partial}{\partial x^\alpha} T^{\mu_1,...,\mu_k}_{\nu_1,...,\nu_l}\right)\left(\partial_{\mu_1}\otimes...\otimes\partial_{\mu_k}\otimes dx^{\nu_1}\otimes...\otimes dx^{\nu_l}\otimes dx^{\alpha}\right)$$

But this object does not transform properly under coordinate transformations. I do not have my copy of Wald with me so I cannot check what it actually says at the moment.

Regarding the OP question, it is the most convenient place to introduce an index in abstract index notation. For the other options you would need parentheses or some weird rule of how to read indices on covariant derivatives. Simply writing ##\nabla f_a## and giving priority to the ##\nabla## before the index is also not without problems. For example it becomes unclear if the abstract index actually belongs to ##\nabla f## or to ##f##, making it a dual vector.

An alternative is using the notation ##f_{;a}##, which also works perfectly for higher order tensors.

In any case, I think the meaning of ##t^a \nabla_a f## is clear. It is the contraction between the dual vector ##\nabla f## and the tangent vector ##t##. When ##f## is a scalar you can write this in a number of different ways, but here you are after a notation which will also serve you well for higher order tensors.
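
To make the contraction concrete, here is a sketch in a coordinate basis (the chart ##x^\mu## and the component names ##t^\mu## are introduced only for illustration). Property 4 applied to the basis vector ##\partial/\partial x^\mu## says that the components of ##\nabla f## are ##(\nabla f)_\mu = \partial f/\partial x^\mu##. Expanding ##t = t^\mu\,\partial/\partial x^\mu## then gives

$$t^a \nabla_a f = t^\mu\,(\nabla f)_\mu = t^\mu \frac{\partial f}{\partial x^\mu} = t(f)$$

so on scalars every derivative operator satisfying property 4 agrees with ##df##, and the index on ##\nabla_a## simply labels the new covariant slot.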

Homework Helper
Gold Member
But this object does not transform properly under coordinate transformations. I do not have my copy of Wald with me so I cannot check what it actually says at the moment.
Ah, what a silly mistake. Of course you are correct. What I wrote takes no account of curvature. It's only a 'comma derivative' (partial derivative of the coordinate-based tensor components), whereas what's required is a semi-colon derivative.

The correct formula is something like:

\begin{align*}\nabla T&=\left(\frac{\partial}{\partial x^\alpha}
T^{\mu_1,...,\mu_k}_{\nu_1,...,\nu_l}
+\sum_{r=1}^k T^{\mu_1,...,\mu_{r-1},\beta,\mu_{r+1},...,\mu_k}_{\nu_1,...,\nu_l}\Gamma^{\mu_r}_{\beta\alpha}
-\sum_{r=1}^l T^{\mu_1,...,\mu_k}_{\nu_1,...,\nu_{r-1},\beta,\nu_{r+1},...,\nu_l}\Gamma^\beta_{\nu_r\alpha}
\right)\\
&\qquad\left(\partial_{\mu_1}\otimes...\otimes\partial_{\mu_k}\otimes dx^{\nu_1}\otimes...\otimes dx^{\nu_l}\otimes dx^{\alpha}\right)
\end{align*}

The mess of Christoffel symbols is the difference between the comma (partial, coordinate) and the semi-colon (covariant) derivative.

JonnyG
@andrewkirk We haven't gotten to the Christoffel symbols yet in the book. Is there a way to express this without using those symbols? Still though, I don't see how your original answer isn't a tensor field. ##T## is expressed as a linear combination of elementary ##(k, l + 1)## tensor fields with smooth functions for coefficients. What's wrong with it?

@Orodruin Allow me to quote Wald exactly:

"Our first important task is to show that derivative operators exist. Let ##\psi## be a coordinate system and let ##\{\partial/\partial x^\mu\}## and ##\{dx^\mu\}## be the associated coordinate bases. Then in the region covered by these coordinates we may define a derivative operator ##\partial_c##, called an ordinary derivative, as follows. For any smooth tensor field ##T^{a_1 \cdots a_k}_{b_1 \cdots b_l}## we take its components ##T^{\mu_1 \cdots \mu_k}_{\nu_1 \cdots \nu_l}## in this coordinate basis and define ##\partial_c T^{a_1 \cdots a_k}_{b_1 \cdots b_l}## to be the tensor whose components in this coordinate basis are the partial derivatives ##\partial(T^{\mu_1 \cdots \mu_k}_{\nu_1 \cdots \nu_l}) / \partial x^{\sigma}##."

Staff Emeritus
Homework Helper
Gold Member
2021 Award
What's wrong with it?
Components defined as partial derivatives of the components of another tensor do not have the correct transformation properties under general coordinate transformations.
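
A sketch of why, for the simplest case of a vector field with components ##V^\nu## (the primed coordinates ##x'^\mu## stand for any second, hypothetical coordinate system). Using ##V'^\nu = \frac{\partial x'^\nu}{\partial x^\beta}V^\beta## and the chain rule:

$$\frac{\partial V'^\nu}{\partial x'^\mu} = \frac{\partial x^\alpha}{\partial x'^\mu}\frac{\partial}{\partial x^\alpha}\left(\frac{\partial x'^\nu}{\partial x^\beta}V^\beta\right) = \frac{\partial x^\alpha}{\partial x'^\mu}\frac{\partial x'^\nu}{\partial x^\beta}\frac{\partial V^\beta}{\partial x^\alpha} + \frac{\partial x^\alpha}{\partial x'^\mu}\frac{\partial^2 x'^\nu}{\partial x^\alpha\,\partial x^\beta}V^\beta$$

The first term on the right is the transformation law of a type ##(1,1)## tensor. The second term spoils it unless the second derivatives of the coordinate change vanish, i.e. unless the change of coordinates is affine.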

Staff Emeritus
@andrewkirk We haven't gotten to the Christoffel symbols yet in the book. Is there a way to express this without using those symbols? Still though, I don't see how your original answer isn't a tensor field. ##T## is expressed as a linear combination of elementary ##(k, l + 1)## tensor fields with smooth functions for coefficients. What's wrong with it?

Well, the axioms for ##\nabla## imply that it obeys the product rule for derivatives: ##\nabla (X Y) = (\nabla X) Y + X (\nabla Y)##. Now, let's apply that to a vector field ##A##. We can pick out a basis ##e_\mu## and write ##A = \sum_\mu e_\mu A^\mu##. So applying the product rule tells us:

$$\nabla A = \sum_\mu ((\nabla e_\mu) A^\mu + e_\mu (\nabla A^\mu))$$

Now, the assumption that ##\nabla## is just the partial derivative would mean that ##\nabla e_\mu = 0##. But you can choose any (or just about any) tetrad of vector fields to act as your basis. So there is no way for ##\nabla e_\mu## to always be zero, for every basis.

Now you can perhaps pick a specific basis, and declare that ##\nabla e_\mu## is zero for that basis, but in general, it will be nonzero for any other basis (unless it is a linear transformation of the special basis).

So the axioms for ##\nabla## specify it uniquely only after you say what ##\nabla e_\mu## is (for one basis; its value for other bases would then be computable). That's where the Christoffel symbols come in: ##\nabla e_\mu = \Gamma^\alpha_{\beta \mu} e_\alpha \otimes \omega^\beta##

(where ##\omega^\beta## is the covector basis).
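
Putting the two pieces together gives the component form. This is only a sketch in the conventions of this post, with one added assumption: by Wald's property 4, ##\nabla## acts on the scalar functions ##A^\mu## as ##\nabla A^\mu = (\partial_\beta A^\mu)\,\omega^\beta##, where ##\partial_\beta A^\mu## is shorthand for ##e_\beta(A^\mu)##. Summing over repeated indices:

$$\nabla A = A^\mu\,\Gamma^\alpha_{\beta\mu}\, e_\alpha\otimes\omega^\beta + (\partial_\beta A^\mu)\, e_\mu\otimes\omega^\beta = \left(\partial_\beta A^\alpha + \Gamma^\alpha_{\beta\mu}A^\mu\right) e_\alpha\otimes\omega^\beta$$

Setting every ##\Gamma^\alpha_{\beta\mu}## to zero in one chosen coordinate basis recovers the ordinary derivative for that chart.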

Homework Helper
Gold Member
We haven't gotten to the Christoffel symbols yet in the book. Is there a way to express this without using those symbols?
Here's an alternative. We start by writing the covariant derivative as a sum of derivatives in each of the coordinate directions:

\begin{align*}\nabla T&=T_{;\alpha}\otimes dx^\alpha\\
&=\bigg[\left(\frac{\partial}{\partial x^\alpha}
T^{\mu_1,...,\mu_k}_{\nu_1,...,\nu_l}
+\sum_{r=1}^k T^{\mu_1,...,\mu_{r-1},\beta,\mu_{r+1},...,\mu_k}_{\nu_1,...,\nu_l}\Gamma^{\mu_r}_{\beta\alpha}
-\sum_{r=1}^l T^{\mu_1,...,\mu_k}_{\nu_1,...,\nu_{r-1},\beta,\nu_{r+1},...,\nu_l}\Gamma^\beta_{\nu_r\alpha}\right)
\partial_{\mu_1}\otimes...\otimes\partial_{\mu_k}\otimes dx^{\nu_1}\otimes...\otimes dx^{\nu_l}
\bigg]\otimes dx^{\alpha}
\end{align*}
where ##T_{;\alpha}## is the derivative of ##T## in direction ##\partial_\alpha##.

Focus on the coefficient in parentheses

$$\left(\frac{\partial}{\partial x^\alpha} T^{\mu_1,...,\mu_k}_{\nu_1,...,\nu_l} +\sum_{r=1}^k T^{\mu_1,...,\mu_{r-1},\beta,\mu_{r+1},...,\mu_k}_{\nu_1,...,\nu_l}\Gamma^{\mu_r}_{\beta\alpha} -\sum_{r=1}^l T^{\mu_1,...,\mu_k}_{\nu_1,...,\nu_{r-1},\beta,\nu_{r+1},...,\nu_l}\Gamma^\beta_{\nu_r\alpha}\right)$$

The first term is simply the partial derivative of the tensor components, which is intuitive. The second and third terms represent the way that the coordinate bases change as we move in direction ##\partial_\alpha##.

The second term covers how the basis ##\partial_{\mu_1},...,\partial_{\mu_k}## changes. The Christoffel symbol ##\Gamma^{\mu_r}_{\beta\alpha}## is equal to ##dx^{\mu_r}\left(\partial_{\beta;\alpha}\right)##. The item ##\partial_{\beta;\alpha}## is the derivative of ##\partial_\beta## in direction ##\partial_\alpha##, which represents the way that the direction of ##\partial_\beta## changes as we move in direction ##\partial_\alpha##.

Similarly, the third term covers how the basis ##dx^{\nu_1},...,dx^{\nu_l}## changes. Here the Christoffel symbol ##\Gamma^{\beta}_{\nu_r\alpha}## is equal to ##-\partial_{\nu_r}\left(\left(dx^\beta\right)_{;\alpha}\right)##, where the minus sign comes from differentiating ##dx^\beta\left(\partial_{\nu_r}\right)=\delta^\beta_{\nu_r}## in direction ##\partial_\alpha##. The item ##\left(dx^\beta\right)_{;\alpha}##, usually written as just ##dx^\beta{}_{;\alpha}##, is the derivative of ##dx^{\beta}## in direction ##\partial_\alpha##, which represents the way that ##dx^\beta## changes as we move in direction ##\partial_\alpha##.

So the coefficient can be written without Christoffel symbols as:

$$\left(\frac{\partial}{\partial x^\alpha} T^{\mu_1,...,\mu_k}_{\nu_1,...,\nu_l} +\sum_{r=1}^k T^{\mu_1,...,\mu_{r-1},\beta,\mu_{r+1},...,\mu_k}_{\nu_1,...,\nu_l}dx^{\mu_r}\left(\partial_{\beta;\alpha}\right) +\sum_{r=1}^l T^{\mu_1,...,\mu_k}_{\nu_1,...,\nu_{r-1},\beta,\nu_{r+1},...,\nu_l}\partial_{\nu_r}\left(dx^\beta{}_{;\alpha}\right)\right)$$

OR, if you feel more comfortable with ##\nabla## than with semi-colons, as

$$\left(\frac{\partial}{\partial x^\alpha} T^{\mu_1,...,\mu_k}_{\nu_1,...,\nu_l} +\sum_{r=1}^k T^{\mu_1,...,\mu_{r-1},\beta,\mu_{r+1},...,\mu_k}_{\nu_1,...,\nu_l}dx^{\mu_r}\left(\nabla_\alpha\partial_{\beta}\right) +\sum_{r=1}^l T^{\mu_1,...,\mu_k}_{\nu_1,...,\nu_{r-1},\beta,\nu_{r+1},...,\nu_l}\partial_{\nu_r}\left(\nabla_\alpha dx^\beta\right)\right)$$

[...] The second and third components represent the way that curvature makes the coordinate bases change as we move in direction ##\partial_\alpha##.
It's possible to have zero curvature, but nonzero ##\Gamma##'s. If the curvature vanishes, then it's possible to find some other coordinate system in which the ##\Gamma##'s also vanish.

The 2nd and 3rd components just correct for the fact that the components in the 1st term do not transform tensorially in general.

Gold Member
2021 Award
It's possible to have zero curvature, but nonzero ##\Gamma##'s. If the curvature vanishes, then it's possible to find some other coordinate system in which the ##\Gamma##'s also vanish.

The 2nd and 3rd components just correct for the fact that the components in the 1st term do not transform tensorially in general.
Sure, you know this from the usual Euclidean 3D vector analysis. For curvilinear coordinates (like spherical or cylindrical) the Christoffel symbols do not vanish. Note that in the usual cases of orthogonal curvilinear coordinates one uses not the holonomic basis but the orthonormal basis. In any case, the Christoffel symbols don't vanish if the basis vectors depend on position.

Staff Emeritus
But this object does not transform properly under coordinate transformations. I do not have my copy of Wald with me so I cannot check what it actually says at the moment.

I think that the point is to show that interpreting ##\nabla## as the partial-derivative operator (in some specific basis) satisfies all the axioms that Wald gives for the covariant derivative. So this shows that a covariant derivative exists (that is, something satisfies the axioms).

Staff Emeritus
Homework Helper
Gold Member
2021 Award
I think that the point is to show that interpreting ##\nabla## as the partial-derivative operator (in some specific basis) satisfies all the axioms that Wald gives for the covariant derivative. So this shows that a covariant derivative exists (that is, something satisfies the axioms).
That would make more sense. Of course, this is just one particular covariant derivative and it is dependent on which coordinate system is used to define it, but I agree that it would make a perfectly fine covariant derivative, as long as one does not make the mistake of thinking its Christoffel symbols vanish in other coordinate systems.

Homework Helper
Gold Member
It's possible to have zero curvature, but nonzero ##\Gamma##'s.
True. I forgot about that. Polar coordinates in 2D Euclidean space are a simple example with nonzero Christoffel symbols but no curvature.
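
For concreteness, a sketch of that example. Write the coordinate basis vectors of polar coordinates ##(r,\theta)## in Cartesian components (this representation, and the convention ##\partial_\mu \boldsymbol{e}_\nu = \Gamma^\rho_{\mu\nu}\,\boldsymbol{e}_\rho##, are assumptions made purely for illustration):

$$\boldsymbol{e}_r = (\cos\theta,\ \sin\theta), \qquad \boldsymbol{e}_\theta = (-r\sin\theta,\ r\cos\theta)$$

Differentiating directly,

$$\partial_r \boldsymbol{e}_r = 0,\qquad \partial_\theta \boldsymbol{e}_r = \partial_r \boldsymbol{e}_\theta = \frac{1}{r}\,\boldsymbol{e}_\theta, \qquad \partial_\theta \boldsymbol{e}_\theta = -r\,\boldsymbol{e}_r$$

so the only nonzero symbols are ##\Gamma^\theta_{r\theta} = \Gamma^\theta_{\theta r} = 1/r## and ##\Gamma^r_{\theta\theta} = -r##, even though the plane is flat and every symbol vanishes in Cartesian coordinates.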

JonnyG
Okay I think I have got it. Please correct me if I am wrong. Let ##M## be a smooth manifold and let ##p \in M##. Then, given a smooth chart ##(U, \phi)## about ##p##, we can define a partial derivative operator ##\partial## (like the one we previously discussed) that satisfies the properties given by Wald. However, if ##T## is a smooth tensor field, then ## \partial T## is actually not a tensor field, because its components do not transform properly when changing coordinates. But, it can be shown that ##\nabla T = \partial T + \Gamma^c_{ab}##, where ##\Gamma^c_{ab}## is a tensor field. And actually, ##\nabla T## IS a smooth tensor field, and thus is a covariant derivative operator. So the Christoffel symbol acts as a sort of correction factor.

I have three more questions (sorry guys):

1) If ##T## is a smooth tensor field then its components are smooth real-valued functions. i.e. they are smooth functions from the manifold into ##\mathbb{R}##. But if we take the partial derivatives of the components, then ##\frac{\partial T^{\mu_1 \cdots \mu_k}_{\nu_1 \cdots \nu_l}}{\partial x^\alpha}## is the component of at least one of the terms in ##\nabla T##. But ##\frac{\partial T^{\mu_1 \cdots \mu_k}_{\nu_1 \cdots \nu_l}}{\partial x^\alpha}## is not a smooth function from the manifold into ##\mathbb{R}## - it is a smooth function from ##\mathbb{R}## into ##\mathbb{R}## (in the case where the manifold were embedded in Euclidean space). But the components of ##\nabla T## are supposed to be smooth functions from the manifold into ##\mathbb{R}##. What's up with this?

2) I understand that the Christoffel symbols act as correction factors. But what exactly is it correcting? What geometry does it describe?

3) The fifth property that Wald requires the covariant derivative operator to satisfy is the torsion free property. That is, ##\nabla_a \nabla_b f = \nabla_b \nabla_a f##. I thought the indices on the derivative operator were kind of dummy indices? So what does it really mean to interchange the order of the ##\nabla_a## and the ##\nabla_b##? If they are dummy indices, then interchanging the order is trivial, but that doesn't seem to be the case.

Homework Helper
Gold Member
However, if ##T## is a smooth tensor field, then ## \partial T## is actually not a tensor field, because its components do not transform properly when changing coordinates. But, it can be shown that ##\nabla T = \partial T + \Gamma^c_{ab}##, where ##\Gamma^c_{ab}## is a tensor field. And actually, ##\nabla T## IS a smooth tensor field
I assume that when you write ##\nabla T = \partial T + \Gamma^c_{ab}## you just mean this to be a stylised indicator that ##\nabla T## is equal to what you call ##\partial T## plus a linear combination of a bunch of Christoffel symbols. If so, that's fine.

But ##\Gamma^c_{ab}## cannot be a tensor field because in your stylised equation it is equal to ##\nabla T - \partial T##. Since the first of those is a tensor field and the second is not, the difference cannot be a tensor field. Christoffel symbols are coordinate dependent and do not transform as tensors.

Staff Emeritus
Homework Helper
Gold Member
2021 Award
1) If ##T## is a smooth tensor field then its components are smooth real-valued functions. i.e. they are smooth functions from the manifold into ##\mathbb{R}##. But if we take the partial derivatives of the components, then ##\frac{\partial T^{\mu_1 \cdots \mu_k}_{\nu_1 \cdots \nu_l}}{\partial x^\alpha}## is the component of at least one of the terms in ##\nabla T##. But ##\frac{\partial T^{\mu_1 \cdots \mu_k}_{\nu_1 \cdots \nu_l}}{\partial x^\alpha}## is not a smooth function from the manifold into ##\mathbb{R}## - it is a smooth function from ##\mathbb{R}## into ##\mathbb{R}## (in the case where the manifold were embedded in Euclidean space).

It does define smooth functions from the manifold to the real numbers. It just does not have the correct transformation properties.

2) I understand that the Christoffel symbols act as correction factors. But what exactly is it correcting? What geometry does it describe?
They are not correction factors, they tell you how the vector basis changes. Given a manifold, there are several possible connections. In a sense, the Christoffel symbols define the geometry.

If they are dummy indices, then interchanging the order is trivial, but that doesn't seem to be the case.

It tells you that ##\nabla\nabla f## is symmetric. This is not at all a trivial statement.
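
A sketch of why this is a real condition, written in components. It assumes the convention ##\nabla_a \omega_b = \partial_a \omega_b - \Gamma^c{}_{ab}\,\omega_c## for the action of ##\nabla## on a dual vector relative to an ordinary derivative ##\partial_a##, which has not been introduced at this point in the thread. Applying it to ##\omega_b = \nabla_b f = \partial_b f##:

$$\nabla_a\nabla_b f = \partial_a\partial_b f - \Gamma^c{}_{ab}\,\partial_c f \quad\Longrightarrow\quad \nabla_a\nabla_b f - \nabla_b\nabla_a f = -\left(\Gamma^c{}_{ab} - \Gamma^c{}_{ba}\right)\partial_c f$$

Since partial derivatives commute, the condition holds for every ##f## exactly when ##\Gamma^c{}_{ab}## is symmetric in its lower indices. The indices ##a## and ##b## label the two slots of the type ##(0,2)## tensor ##\nabla\nabla f##, so exchanging them is not a relabelling of dummy indices.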

Staff Emeritus
Homework Helper
Gold Member
2021 Award
No, that's a definition of zero curvature. (This will become clearer when you've read a bit further in Wald where he explains curvature.)
No, this is wrong. Wald is indeed intending zero torsion here. Note that what the covariant derivatives act on here are scalar fields, not tangent vectors or one-forms.

Indeed, now that I have access to my copy again, Wald explicitly states (my emphasis):
5. Torsion free: For all ##f \in \mathcal F##, ##\nabla_a \nabla_b f = \nabla_b \nabla_a f##.
Just because the connection is torsion free, it is not implied that it has zero curvature - as is the case, e.g., for the Levi-Civita connection on a sphere.

With respect to the OP, the important passage which should not be missed is the statement (Wald's emphasis):
Of course, a different choice of coordinate system ##\psi'## will yield a different derivative operator ##\partial_a'##, that is, the components of the tensor ##\partial_c T^{a_1\ldots a_k}{}_{b_1\ldots b_l}## in the new (primed) coordinates will not be equal to the partial derivatives of the primed components of ##T^{a_1\ldots a_k}{}_{b_1\ldots b_l}## with respect to the primed coordinates. Thus, a given ordinary derivative operator is coordinate dependent, i.e., it is not naturally associated with the structure of the manifold.

Staff Emeritus
Gold Member
But ##\Gamma^c_{ab}## cannot be a tensor field because in your stylised equation it is equal to ##\nabla T - \partial T##. Since the first of those is a tensor field and the second is not, the difference cannot be a tensor field. Christoffel symbols are coordinate dependent and do not transform as tensors.

As others have said, the ##\Gamma^c_{ab}## are not components of a tensor field. (If a tensor's components are 0 in one coordinate system, they're 0 in all coordinate systems. In general, this is not true of the ##\Gamma##'s.)

I think that another important passage on Wald's perspective comes after equation (3.1.5)

Note that, as defined here, a Christoffel symbol is a tensor field associated with the derivative operator ##\nabla_a## and the coordinate system used to define ##\partial_a##. However, if we change coordinates, we also change our ordinary derivative operator from ##\partial_a## to ##\partial'_a## and thus we change our tensor from ##\Gamma^c{}_{ab}## to a new tensor ##\Gamma'^c{}_{ab}##. Hence the coordinate components of ##\Gamma^c{}_{ab}## will not be related to the components of ##\Gamma'^c{}_{ab}## in the primed coordinates by the tensor transformation law, equation (2.3.8), since we change tensors as well as coordinates.

JonnyG
The two important passages quoted by Orodruin and George Jones clear things up a lot (I think). Just for one last clarification, from the passage that George Jones quoted, Wald is saying that a Christoffel symbol is a tensor field, but a change of coordinates will yield a different Christoffel symbol, whose components are not related to the original Christoffel symbol by the tensor transformation law.

No, this is wrong. Wald is indeed intending zero torsion here. Note that what the covariant derivatives act on here are scalar fields, not tangent vectors or one-forms.
Oh, yes, of course. It's acting on a scalar field here, not a vector field.
I think that another important passage on Wald's perspective comes after equation (3.1.5)
[...quote from Wald...]
Did you mean after eq (3.1.15)? Anyway, I must say I dislike Wald's use of terminology on this point. To call a Christoffel symbol a "tensor field", and then immediately say that its components don't obey the tensor transformation law, seems unnecessarily confusing.

(Yes, I know he's saying that the Christoffel symbol is (my bolding)
a tensor field associated with the derivative operator ##\nabla_a## and the coordinate system used to define ##\partial_a## [and that we] change tensors as well as coordinates
...meaning that it's a coordinate-dependent tensor field.)

Staff Emeritus
Did you mean after eq (3.1.15)? Anyway, I must say I dislike Wald's use of terminology on this point. To call a Christoffel symbol a "tensor field", and then immediately say that its components don't obey the tensor transformation law, seems unnecessarily confusing.

(Yes, I know he's saying that the Christoffel symbol is (my bolding)...meaning that it's a coordinate-dependent tensor field.)

This is a point that causes some confusion. People often say something like "Such and such isn't a tensor, because it doesn't transform the right way...", but I think that mixes up a mathematical object (namely, a tensor) with a particular way of describing it (using coordinates).

When you pick a coordinate system ##x^\mu##, then the basis vectors ##e_\mu## (or some people prefer ##\frac{\partial}{\partial x^\mu}##) are a 4-tuple of vector fields. So since ##\nabla## turns a vector field into a ##(1,1)## tensor field, it follows that ##\nabla e_\mu## is a ##(1,1)## tensor field. The connection coefficients ##\Gamma^\alpha_{\mu \nu}## are just the components of that tensor field:

$$\Gamma^\alpha_{\mu \nu} = (\nabla e_\mu)^\alpha_\nu$$

But what's misleading about the expression for this tensor field is that the index ##\mu## is not something to be transformed under a coordinate change. It's picking a very particular vector field ##e_\mu##. ##\mu## is not an index saying which component of a tensor, it's an index saying which vector field.

This is a point that causes some confusion. People often say something like "Such and such isn't a tensor, because it doesn't transform the right way...",
Yes, I was originally taught like that back in the Jurassic period, but now I try to maintain a distinction between the object and its components wrt a coord system.

Gold Member
2021 Award
Indeed, it is very important to distinguish between tensors (including scalars and vectors as tensors of rank zero and one, respectively), which are independent of the choice of coordinates, and their components with respect to a basis. The Christoffel symbols are NOT tensor components, and the partial derivatives with respect to some (generalized) coordinates are in general NOT tensor components either.

For simplicity let's look at a vector field ##\boldsymbol{V}##. In terms of contravariant components with respect to an arbitrary basis ##\boldsymbol{b}_{\mu}## it's written as
$$\boldsymbol{V}=V^{\mu} \boldsymbol{b}_{\mu}$$
with the Einstein summation convention applied here. Let ##\boldsymbol{b}^{\mu}## denote the dual basis. The basis vectors transform under general diffeomorphisms of the generalized coordinates ##q^{\mu}## of the manifold via
$$\boldsymbol{b}_{\mu}'=\frac{\partial q^{\nu}}{\partial q^{\prime \mu}} \boldsymbol{b}_{\nu},$$
i.e., in the covariant way, and the components of the vector in the contravariant way (i.e., like the "increments" of the generalized coordinates)
$$V^{\prime \mu} = \frac{\partial q^{\prime \mu}}{\partial q^{\nu}} V^{\nu}.$$
The same holds for the dual basis:
$$\boldsymbol{b}^{\prime \mu} = \frac{\partial q^{\prime \mu}}{\partial q^{\nu}} \boldsymbol{b}^{\nu}.$$
Obviously the partial derivative applied to an invariant object transforms in a covariant way. Thus from the vector field ##\boldsymbol{V}## you get a 2nd-rank tensor by defining
$$\nabla \boldsymbol{V} = \boldsymbol{b}^{\mu} \otimes \partial_{\mu} \boldsymbol{V}.$$
Now we have
$$\partial_{\mu} \boldsymbol{V} = \partial_{\mu}(\boldsymbol{b}_{\nu} V^{\nu}) = (\partial_{\mu} V^{\nu}) \boldsymbol{b}_{\nu} + V^{\nu} \partial_{\mu} \boldsymbol{b}_{\nu}.$$
Now we can express the latter vector again in terms of the basis, defining the Christoffel symbol by
$$\partial_{\mu} \boldsymbol{b}_{\nu}=\boldsymbol{b}_{\rho} {\Gamma^{\rho}}_{\mu \nu}.$$
Note that the Christoffel symbols are in general NOT symmetric in the lower indices. This you have to postulate in addition, defining the manifold as "torsion free". This now implies that the tensor COMPONENTS of the gradient of the vector field ##\boldsymbol{V}## are given by
$$\nabla_{\mu} V^{\rho}=\partial_{\mu} V^{\rho} + {\Gamma^{\rho}}_{\mu \nu} V^{\nu}.$$
You can easily check that in general neither of the two terms on the right-hand side transforms like "mixed tensor components" ##{T_{\mu}}^{\rho}##, but the combination of both does!
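A sketch of that check, assuming holonomic (coordinate) bases so that the transformation law for ##\boldsymbol{b}_{\mu}'## given above applies, and writing ##\partial'_{\mu} = \partial/\partial q^{\prime \mu}## as a shorthand introduced only for this illustration: differentiating the transformed basis vectors gives
$$\partial'_{\mu} \boldsymbol{b}'_{\nu} = \frac{\partial^2 q^{\sigma}}{\partial q^{\prime \mu} \partial q^{\prime \nu}} \boldsymbol{b}_{\sigma} + \frac{\partial q^{\alpha}}{\partial q^{\prime \mu}} \frac{\partial q^{\sigma}}{\partial q^{\prime \nu}} \partial_{\alpha} \boldsymbol{b}_{\sigma}.$$
Using ##\partial_{\alpha} \boldsymbol{b}_{\sigma} = \boldsymbol{b}_{\beta} {\Gamma^{\beta}}_{\alpha \sigma}## and ##\boldsymbol{b}_{\beta} = \frac{\partial q^{\prime \rho}}{\partial q^{\beta}} \boldsymbol{b}'_{\rho}##, one reads off
$${\Gamma^{\prime \rho}}_{\mu \nu} = \frac{\partial q^{\prime \rho}}{\partial q^{\beta}} \frac{\partial q^{\alpha}}{\partial q^{\prime \mu}} \frac{\partial q^{\sigma}}{\partial q^{\prime \nu}} {\Gamma^{\beta}}_{\alpha \sigma} + \frac{\partial q^{\prime \rho}}{\partial q^{\sigma}} \frac{\partial^2 q^{\sigma}}{\partial q^{\prime \mu} \partial q^{\prime \nu}},$$
while the partial derivatives of the components transform as
$$\partial'_{\mu} V^{\prime \rho} = \frac{\partial q^{\alpha}}{\partial q^{\prime \mu}} \frac{\partial q^{\prime \rho}}{\partial q^{\nu}} \partial_{\alpha} V^{\nu} + \frac{\partial q^{\alpha}}{\partial q^{\prime \mu}} \frac{\partial^2 q^{\prime \rho}}{\partial q^{\alpha} \partial q^{\nu}} V^{\nu}.$$
Each expression carries a second-derivative term that spoils the tensor transformation law. In the sum ##\partial'_{\mu} V^{\prime \rho} + {\Gamma^{\prime \rho}}_{\mu \nu} V^{\prime \nu}## these two terms cancel, as follows from differentiating ##\frac{\partial q^{\prime \rho}}{\partial q^{\sigma}} \frac{\partial q^{\sigma}}{\partial q^{\prime \nu}} = \delta^{\rho}_{\nu}## with respect to ##q^{\prime \mu}##. The inhomogeneous term in ##{\Gamma^{\prime \rho}}_{\mu \nu}## is also symmetric in ##\mu## and ##\nu##, so it drops out of any antisymmetrized combination.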

You can also easily show that the antisymmetrized Christoffel symbols, the torsion, transform like 3rd-rank tensor components,
$${\tau^{\rho}}_{\mu \nu} = {\Gamma^{\rho}}_{\mu \nu} - {\Gamma^{\rho}}_{\nu \mu},$$
i.e., the torsion defines a coordinate independent characterization of the manifold. The connection, defined by the Christoffel symbols, of course depends on the choice of the basis we started with.

For a (pseudo-)Riemannian manifold we have some additional requirements. One is that the manifold is torsion free and the other that the gradient of the (pseudo-)metric vanishes. From these requirements you get the well-known unique definition of the Christoffel symbols in terms of partial derivatives of the (pseudo-)metric.
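For reference, the formula that follows from these two requirements is
$${\Gamma^{\rho}}_{\mu \nu} = \frac{1}{2} g^{\rho \sigma} \left( \partial_{\mu} g_{\sigma \nu} + \partial_{\nu} g_{\sigma \mu} - \partial_{\sigma} g_{\mu \nu} \right).$$
As a rough numerical illustration (not part of the argument above), here is a minimal Python sketch that evaluates this formula for the unit 2-sphere with coordinates ##(\theta, \varphi)## and metric ##\mathrm{diag}(1, \sin^2 \theta)##. The choice of manifold, the function names, the finite-difference step, and the sample point are all assumptions made for the example.

```python
import math

def metric(theta, phi):
    """Round metric g_{ab} on the unit 2-sphere in coordinates (theta, phi)."""
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]

def inverse_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def d_metric(x, h=1e-5):
    """dg[s][a][b] approximates d g_{ab} / d x^s by central differences."""
    dg = []
    for s in range(2):
        plus, minus = list(x), list(x)
        plus[s] += h
        minus[s] -= h
        gp, gm = metric(*plus), metric(*minus)
        dg.append([[(gp[a][b] - gm[a][b]) / (2 * h) for b in range(2)]
                   for a in range(2)])
    return dg

def christoffel(x):
    """Gamma[r][m][n] = 1/2 g^{rs} (d_m g_{sn} + d_n g_{sm} - d_s g_{mn})."""
    g_inv = inverse_2x2(metric(*x))
    dg = d_metric(x)
    return [[[0.5 * sum(g_inv[r][s] * (dg[m][s][n] + dg[n][s][m] - dg[s][m][n])
                        for s in range(2))
              for n in range(2)]
             for m in range(2)]
            for r in range(2)]

# Sample point away from the poles (arbitrary choice for the illustration).
theta0, phi0 = 0.7, 0.3
gamma = christoffel([theta0, phi0])

# Compare with the textbook values for the sphere.
print(gamma[0][1][1], -math.sin(theta0) * math.cos(theta0))  # Gamma^theta_{phi phi}
print(gamma[1][0][1], math.cos(theta0) / math.sin(theta0))   # Gamma^phi_{theta phi}
```

The symbols come out symmetric in the lower indices by construction, which is the torsion-free condition; metric compatibility is what fixes the particular combination of derivatives.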

Staff Emeritus
Homework Helper
Gold Member
2021 Award
For a (pseudo-)Riemannian manifold we have some additional requirements. One is that the manifold is torsion free and the other that the gradient of the (pseudo-)metric vanishes. From these requirements you get the well-known unique definition of the Christoffel symbols in terms of partial derivatives of the (pseudo-)metric.

To clarify a bit here, these are not requirements that you have to impose specifically for a manifold with a metric, although we often do so. Any connection on a manifold without a metric tensor is still a connection once a metric tensor is introduced. You could easily imagine connections that fulfil none, one, or both of these conditions. An example is the connection on a sphere (without the poles) which preserves compass directions during parallel transport. This is a metric-compatible, but not torsion-free, connection.

The point is that it is only when you require both that the connection is compatible with the metric and that it is torsion free that you obtain the unique Levi-Civita connection. You are also free to require that a connection is torsion free even if you do not have access to a metric, although it will not uniquely fix the connection.

Staff Emeritus
Indeed, it is very important to distinguish between tensors (including scalars and vectors as tensors of rank zero and one, respectively), which are independent of the choice of coordinates, and their components with respect to a basis. The Christoffel symbols are NOT tensor components

I just demonstrated exactly the opposite. Let me go through it again:
1. Let ##V## be any vector field.
2. Then ##\nabla V## is a ##(1,1)## tensor field.
3. A basis vector ##e_\mu## for a coordinate system is, in fact, a vector field.
4. Therefore, for any basis vector ##e_\mu##, ##\nabla e_\mu## is a ##(1,1)## tensor field.
5. The components of ##\nabla e_\mu## in that basis are, in fact, ##(\nabla e_\mu)^\lambda_\nu = \Gamma^\lambda_{\mu \nu}##.
6. So the ##\Gamma^\lambda_{\mu \nu}## are, in fact, the components of a tensor field.
What is perhaps confusing is that, in spite of appearances, the roles of ##\mu## and ##\nu## in ##\Gamma^\lambda_{\mu \nu}## are not analogous: ##\lambda## and ##\nu## are component indices of the tensor field ##\nabla e_\mu##, while ##\mu## is not a component index, but tells us which tensor field we are talking about: ##\nabla e_\mu \neq \nabla e_\nu## if ##\mu \neq \nu##.

Staff Emeritus
Homework Helper
Gold Member
2021 Award
I just demonstrated exactly the opposite. Let me go through it again:
1. Let ##V## be any vector field.
2. Then ##\nabla V## is a ##(1,1)## tensor field.
3. A basis vector ##e_\mu## for a coordinate system is, in fact, a vector field.
4. Therefore, for any basis vector ##e_\mu##, ##\nabla e_\mu## is a ##(1,1)## tensor field.
5. The components of ##\nabla e_\mu## in that basis are, in fact, ##(\nabla e_\mu)^\lambda_\nu = \Gamma^\lambda_{\mu \nu}##.
6. So the ##\Gamma^\lambda_{\mu \nu}## are, in fact, the components of a tensor field.
What is perhaps confusing is that, in spite of appearances, the roles of ##\mu## and ##\nu## in ##\Gamma^\lambda_{\mu \nu}## are not analogous: ##\lambda## and ##\nu## are component indices of the tensor field ##\nabla e_\mu##, while ##\mu## is not a component index, but tells us which tensor field we are talking about: ##\nabla e_\mu \neq \nabla e_\nu## if ##\mu \neq \nu##.

Just to underline the structure in a coordinate representation, the mixed tensor ##\nabla e_\mu## is given by
$$\nabla e_\mu = \Gamma_{\mu\nu}^\sigma e_\sigma \otimes dx^\nu.$$
Here it is quite obvious that if you use a coordinate basis ##e_\mu = \partial_\mu##, then ##\partial_\mu## is a very specific vector field, and clearly ##\nabla V## depends on which vector field ##V## you use. Using the vector field ##\partial'_\mu## is clearly going to result in a different tensor. The important part is that the ##\mu## labels which vector field we are talking about and does not denote a tensor component, which I think is the key point in both your and vanhees71's statements. The Christoffel symbols are not the components of a type (1,2) tensor; they are the components of ##n## different (1,1) tensors. Which ##n## tensors we are talking about depends on which vector basis we have chosen.
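A small worked example, under assumptions that are mine rather than anything from the posts above: take the flat plane with the Levi-Civita connection of the Euclidean metric and polar coordinates ##(r, \varphi)##. The non-zero Christoffel symbols are ##\Gamma^\varphi_{r\varphi} = \Gamma^\varphi_{\varphi r} = 1/r## and ##\Gamma^r_{\varphi\varphi} = -r##, so the two (1,1) tensors are
$$\nabla e_r = \frac{1}{r}\, e_\varphi \otimes d\varphi, \qquad \nabla e_\varphi = \frac{1}{r}\, e_\varphi \otimes dr - r\, e_r \otimes d\varphi.$$
In Cartesian coordinates on the same plane, with the same connection, the basis fields are covariantly constant,
$$\nabla e_x = 0, \qquad \nabla e_y = 0.$$
Each of these four objects is a perfectly good tensor field. What changes between the two coordinate systems is which vector fields are being differentiated, not the connection.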

Staff Emeritus
Exactly. The point of ##\nabla## versus ##\partial## is this: If ##V## is a vector field, then ##\nabla V## is a unique tensor field, which you can evaluate in any coordinate system you like. In contrast, ##\partial V## is not a unique tensor field; you have to know which basis was used in order to know which tensor field you mean. It's still a tensor field (but probably not the one you meant).
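To make that concrete with an example of my own choosing (flat plane, Levi-Civita connection of the Euclidean metric): take the constant vector field ##V = e_x##. In Cartesian coordinates its components are ##V^x = 1, V^y = 0##, so the tensor ##\partial V## built from the Cartesian basis vanishes. In polar coordinates ##(r, \varphi)## the same field reads
$$V = \cos\varphi \, e_r - \frac{\sin\varphi}{r} \, e_\varphi,$$
and the partial derivatives of these components are
$$\partial_r V^r = 0, \qquad \partial_\varphi V^r = -\sin\varphi, \qquad \partial_r V^\varphi = \frac{\sin\varphi}{r^2}, \qquad \partial_\varphi V^\varphi = -\frac{\cos\varphi}{r}.$$
So the tensor ##\partial V## built from the polar basis is not zero, and it is a different tensor from the Cartesian one. Adding the connection terms with ##\Gamma^\varphi_{r\varphi} = \Gamma^\varphi_{\varphi r} = 1/r## and ##\Gamma^r_{\varphi\varphi} = -r## gives, for instance,
$$\nabla_\varphi V^r = -\sin\varphi + (-r)\left(-\frac{\sin\varphi}{r}\right) = 0,$$
and likewise for the other components, so ##\nabla V = 0## in both coordinate systems.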