# Covariant derivative and geometry of tensors

## Main Question or Discussion Point

I'm trying to teach myself GR from Wald's General Relativity, and it's very tough going. I do have basic knowledge of differential geometry, but I think my geometric intuition is next to nonexistent. I'd very much appreciate some help in understanding several basic questions, or pointers to texts with good explanations.

1. The text stresses that in the absence of something like covariant derivative, we have no way of connecting tangent spaces of different points with each other. What confuses me is how can we have a notion of a smooth vector field in that case? I understand the definition of a smooth vector field (tangent vectors acting as derivations turn smooth functions to smooth functions); since we have it, doesn't it mean that we do have a notion of what it means for tangent vectors from two nearby tangent spaces to be very close?

Also, why doesn't the standard trick of passing to R^n work? Given two nearby points on the manifold, why can't we translate them into R^n by a chart and identify their tangent spaces in R^n? If this leads to different results in different charts, why wouldn't the smoothness of chart transformations make everything alright, and is there an instructive example to show this failure?

2. Is there a good way of understanding tensor product and contraction geometrically? I can follow all the indices around, I just completely fail to understand what it *means*, and it's very frustrating. For example, Wald defines parallel transport of vector $$v^{b}$$ along a curve with tangent $$t^{a}$$ by the equation $$t^{a}\nabla_{a}v^{b} = 0$$. But I don't understand why this should capture the notion of parallel transport. $$\nabla_{a}v^{b}$$ is some (1,1)-tensor, and while I can write out the definition of what it means to multiply it by $$t^{a}$$ and contract over index a, I don't really understand what it means. Even worse, for a general tensor T with both covariant and contravariant indices, I have no clue how to imagine $$t^{a}\nabla_{a}T = 0$$. Is there a helpful way to visualize/understand this?

1. The text stresses that in the absence of something like covariant derivative, we have no way of connecting tangent spaces of different points with each other. What confuses me is how can we have a notion of a smooth vector field in that case? I understand the definition of a smooth vector field (tangent vectors acting as derivations turn smooth functions to smooth functions); since we have it, doesn't it mean that we do have a notion of what it means for tangent vectors from two nearby tangent spaces to be very close?
No, because how close they are depends on a coordinate system. When you have a covariant derivative, you can compare tangent vectors at different points in a way that depends on the path connecting the two points but is independent of the local coordinate system.

Also, why doesn't the standard trick of passing to R^n work? Given two nearby points on the manifold, why can't we translate them into R^n by a chart and identify their tangent spaces in R^n? If this leads to different results in different charts, why wouldn't the smoothness of chart transformations make everything alright, and is there an instructive example to show this failure?
Smoothness does not depend on a chart, but other things do depend.
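One instructive example, as a rough sketch rather than anything from Wald: take the plane with its Cartesian chart and its polar chart. The helper `polar_components` below is a hypothetical name introduced for illustration. It converts Cartesian components to polar ones with the Jacobian of the chart transition. Two vectors that have identical components in the Cartesian chart (so that chart would call them "the same vector") end up with different components in the polar chart.

```python
import numpy as np

def polar_components(v_cart, x, y):
    """Components (v^r, v^theta) of a vector given by Cartesian components at (x, y)."""
    r = np.hypot(x, y)
    # Jacobian d(r, theta)/d(x, y) evaluated at the point (x, y)
    jac = np.array([[x / r,      y / r],
                    [-y / r**2,  x / r**2]])
    return jac @ v_cart

# The "same" vector according to the Cartesian chart, attached at two points.
v = np.array([1.0, 0.0])
p = (1.0, 0.0)
q = (0.0, 1.0)

vp = polar_components(v, *p)  # polar components at p
vq = polar_components(v, *q)  # polar components at q

print("polar components at p:", vp)
print("polar components at q:", vq)
print("equal in the polar chart?", np.allclose(vp, vq))
```

The transition map is perfectly smooth away from the origin, so a smooth field stays smooth in both charts. What fails is "equal components at two different points", because the Jacobian differs from point to point.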

2. Is there a good way of understanding tensor product and contraction geometrically? I can follow all the indices around, I just completely fail to understand what it *means*, and it's very frustrating.
Tensor products of vector spaces are part of multilinear algebra. Some of that has a geometrical interpretation in terms of Grassmann (or Clifford) algebra. But otherwise algebra is just algebra - a very useful tool.

For example, Wald defines parallel transport of vector $$v^{b}$$ along a curve with tangent $$t^{a}$$ by the equation $$t^{a}\nabla_{a}v^{b} = 0$$. But I don't understand why this should capture the notion of parallel transport. $$\nabla_{a}v^{b}$$ is some (1,1)-tensor, and while I can write out the definition of what it means to multiply it by $$t^{a}$$ and contract over index a, I don't really understand what it means. Even worse, for a general tensor T with both covariant and contravariant indices, I have no clue how to imagine $$t^{a}\nabla_{a}T = 0$$. Is there a helpful way to visualize/understand this?
It is good to check some other book where these concepts are explained in detail, especially parallel transport along a path. I like Bishop and Crittenden, "Geometry of Manifolds", but you will certainly find other good books.
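For what it's worth, the general-tensor case may look less mysterious once written in a coordinate chart. Assuming the usual Christoffel symbols $$\Gamma^{b}{}_{ac}$$ of the connection, a curve $$x^{a}(\lambda)$$ with $$t^{a} = dx^{a}/d\lambda$$, and a mixed tensor $$T^{b}{}_{c}$$ picked purely for illustration, the condition reads

$$t^{a}\nabla_{a}T^{b}{}_{c} = \frac{dT^{b}{}_{c}}{d\lambda} + \Gamma^{b}{}_{ad}\,t^{a}\,T^{d}{}_{c} - \Gamma^{d}{}_{ac}\,t^{a}\,T^{b}{}_{d} = 0$$

There is one $$\Gamma$$ term per index, with a plus sign for an upper index and a minus sign for a lower one. This is a linear ordinary differential equation for the components along the curve, so the value of T at one point of the curve determines T everywhere along it.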

Also, why doesn't the standard trick of passing to R^n work?
It does, at least formally. See, for example, Dirac, General Theory of Relativity, section 6, where he uses an embedding to define parallel transport. Try searching in the forum for some interesting discussions of how big an n you need. (Oops, sorry, you're right, OP is not asking about that.)

$$t^{a}\nabla_{a}v^{b} = 0$$
$$t^a\nabla_a$$ is how one writes the directional derivative along the curve in Wald's notation. It's the same as

$$t^a \nabla_{\frac{\partial}{\partial x^a}} = \nabla_{t^a\frac{\partial}{\partial x^a}}$$

in Koszul notation, where it's easier to see that it's the covariant derivative in the direction of the tangent vector field.
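To see that this directional derivative really does encode "transport", it may help to integrate the equation numerically. In a coordinate chart, with curve parameter $$\lambda$$, it becomes $$\frac{dv^{b}}{d\lambda} + \Gamma^{b}{}_{ac}\,t^{a}v^{c} = 0$$. The sketch below is an illustration under stated assumptions, not Wald's construction. It uses the unit sphere in coordinates $$(\theta, \phi)$$ and the latitude circle $$\theta = \theta_0$$, $$\phi = \lambda$$. The function names `transport_rhs`, `parallel_transport` and `norm` are hypothetical, and the integrator is a plain hand-written RK4.

```python
import numpy as np

def transport_rhs(v, theta0):
    """dv^b/dlam = -Gamma^b_{ac} t^a v^c on the unit sphere, along the
    latitude circle theta = theta0, phi = lam, so that t = (0, 1)."""
    v_theta, v_phi = v
    return np.array([
        np.sin(theta0) * np.cos(theta0) * v_phi,  # -Gamma^theta_{phi phi} v^phi
        -v_theta / np.tan(theta0),                # -Gamma^phi_{phi theta} v^theta
    ])

def parallel_transport(v0, theta0, lam_end=2 * np.pi, steps=2000):
    """Integrate the transport equation with classical fourth-order Runge-Kutta."""
    v = np.array(v0, dtype=float)
    h = lam_end / steps
    for _ in range(steps):
        k1 = transport_rhs(v, theta0)
        k2 = transport_rhs(v + 0.5 * h * k1, theta0)
        k3 = transport_rhs(v + 0.5 * h * k2, theta0)
        k4 = transport_rhs(v + h * k3, theta0)
        v = v + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return v

def norm(v, theta0):
    """Length of v in the sphere metric ds^2 = dtheta^2 + sin^2(theta) dphi^2."""
    return np.sqrt(v[0] ** 2 + np.sin(theta0) ** 2 * v[1] ** 2)

theta0 = np.pi / 3                 # colatitude of the circle (illustrative choice)
v_start = np.array([1.0, 0.0])     # unit vector pointing along increasing theta
v_end = parallel_transport(v_start, theta0)

print("start:", v_start, "length:", norm(v_start, theta0))
print("end:  ", v_end, "length:", norm(v_end, theta0))
```

The length is preserved, but after one full trip around the circle the vector does not return to itself. In an orthonormal frame it is rotated by the angle $$2\pi\cos\theta_0$$, which for this choice of $$\theta_0$$ is a half turn. That path dependence is exactly the information a chart alone does not supply.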

It does, at least formally. See, for example, Dirac, General Theory of Relativity, section 6, where he uses an embedding to define parallel transport.
I don't think this is what the OP was asking about when he asked his question in the context of smoothness. Moreover the embedding trick is not really "geometrical", because it works on an infinitesimal level as regards the law of the parallel transport.

Tensor products of vector spaces are part of multilinear algebra. Some of that has a geometrical interpretation in terms of Grassmann (or Clifford) algebra. But otherwise algebra is just algebra - a very useful tool.
Right, I understand this. What I meant is, about the most typical thing that I see done with two tensors in Wald's book so far, is multiplying them and contracting the product on one of the indices. For example, this is done all the time to raise/lower indices using the metric; but while I understand the formalism and see that it e.g. changes an (n,m) tensor into (n-1,m+1), I don't understand the *meaning* of the operation.

I worked out that if I have a dual vector and a vector, and if I treat them as tensors and multiply then contract those tensors, what I get is the same as simply acting on the vector with the dual vector. That's about the simplest example imaginable, and that's now clear to me. But I completely lack intuition into what it means e.g. to tack on the metric to an (n,m) tensor, contracting on one index. Sure, formally it's a (n-1,m+1)-tensor, a multilinear function on n-1 V*'s and m+1 V's, and given a sample set of n-1 dual vectors and m+1 vectors, I can write out the formula of the value of the function as a sum over one basis of a product of the metric values and the original tensor's values. But that formal sum is all I have; I can't help thinking I should have some kind of (vague? geometric? algebraic?) intuitive understanding of what was actually *done*.
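One possible way to read "tacking on the metric", sketched numerically with NumPy: the slot that used to eat a dual vector now eats a vector, which the metric first converts into a dual vector. The arrays below are random illustrative components. The Minkowski-signature `g` is an arbitrary choice, and none of the variable names come from Wald.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
omega = rng.normal(size=n)   # dual vector components omega_a
v = rng.normal(size=n)       # vector components v^a

# Tensor product followed by contraction is just omega acting on v.
outer = np.einsum("a,b->ab", omega, v)   # (omega tensor v)_a^b = omega_a v^b
contracted = np.einsum("aa->", outer)    # contract the two indices
direct = omega @ v                       # omega(v)

# Lowering the first index of T^{ab} with a metric g_{ab}.
g = np.diag([-1.0, 1.0, 1.0, 1.0])         # illustrative symmetric metric
T = rng.normal(size=(n, n))                # components T^{ab}
T_lowered = np.einsum("ac,cb->ab", g, T)   # T_a^b = g_{ac} T^{cb}

# Feed T_lowered a vector and a dual vector ...
omega2 = rng.normal(size=n)
lhs = np.einsum("ab,a,b->", T_lowered, v, omega2)

# ... or first turn v into a dual vector with g, then feed the original T.
v_flat = g @ v                             # (v_flat)_c = g_{cd} v^d
rhs = np.einsum("cb,c,b->", T, v_flat, omega2)

print(np.isclose(contracted, direct), np.isclose(lhs, rhs))
```

Both comparisons agree, so the lowered tensor is the original multilinear map with the conversion of a vector into a dual vector built into one of its slots.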

Anyway, I wanted to thank you, and Daverz, for your kind answers and some pointers. I've also started reading Schutz's GR book, and it seems to go into more details setting up tensors and covariant derivatives (only just getting to that part); perhaps it'll clear things up for me. I've looked at Bishop & Crittenden and it's also very nice, though perhaps a bit more mathematically-minded than I need for getting through Wald (e.g. they discuss everything in terms of arbitrary fiber bundles, while Wald only looks at the tensor bundle w/o calling it so); but I'll try looking things up there when they're too vague for me elsewhere.

For example, this is done all the time to raise/lower indices using the metric; but while I understand the formalism and see that it e.g. changes an (n,m) tensor into (n-1,m+1), I don't understand the *meaning* of the operation.
I think it all boils down to the question of what the geometrical (or whatever) *meaning* of the trace of a matrix is. All these contractions are generalized partial traces. These are *invariants*. In physics we try to associate some geometrical and physical content with the invariants and with invariant operations. For some it is easy, for some it is not so easy. In quantum theory we even take traces over a continuous index, like x or p - we call them "integrals".
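A small numerical check of the invariance claim, as a sketch with made-up components: the contraction of one upper with one lower index does not care about the basis, whereas summing the diagonal of a tensor with two lower indices does. The change-of-basis matrix `P` is an arbitrary invertible example.

```python
import numpy as np

rng = np.random.default_rng(1)

# Arbitrary invertible change-of-basis matrix (its determinant is 3).
P = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])
P_inv = np.linalg.inv(P)

# A (1,1) tensor A^a_b transforms with one P and one inverse P.
A = rng.normal(size=(3, 3))
A_new = P_inv @ A @ P
trace_old = np.einsum("aa->", A)
trace_new = np.einsum("aa->", A_new)

# A (0,2) tensor B_{ab} transforms with two copies of P.
B = np.eye(3)
B_new = P.T @ B @ P
diag_old = np.trace(B)
diag_new = np.trace(B_new)

print("(1,1) contraction:", trace_old, trace_new)
print("(0,2) diagonal sum:", diag_old, diag_new)
```

The first pair agrees up to rounding and the second does not. This is why a contraction always pairs an upper index with a lower one, and why two indices of the same kind first need the metric to raise or lower one of them.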

I'm trying to teach myself GR from Wald's General Relativity, and it's very tough going. I do have basic knowledge of differential geometry, but I think my geometric intuition is next to nonexistent. I'd very much appreciate some help in understanding several basic questions, or pointers to texts with good explanations.
You might try the Stanford Leonard Susskind lectures on General Relativity (GR starts at lecture 28):