# I Understanding parallel transfer

1. Apr 19, 2017

### joneall

I've read Collier's book on General Relativity and consulted parts of Schutz, Hartle and Carroll. In the terms they use, I have yet to gain anything resembling an intuitive understanding of parallel transport.

In fact, it seems to me it is usually presented backwards, saying that the geodesic is the path where the covariant derivative of the tangent vector to the path is zero. This is true, but it seems to me simpler to say that the geodesic is chosen parallel to the tangent vector whose covariant derivative is zero.

Let me explain in really simple terms. I was trying to visualize it as I walked my dog thru the local park this morning. The park is on a gentle slope with hillocks here and there, so that it is a nice 2-D manifold with a tangent space at each point. The path starts out on level ground and so is just a straight line in the direction of my destination, a point at the top of the slope where there is a gate. So the tangent vector to the path parallel transports, i.e., it remains constant and its derivative is zero.

Now I reach a spot where there is a hillock nearby, ahead on the left. Suddenly, I am not in flat space any more and I must take that into account. What do I do?

I first adjust my tangent space to the terrain, so it is a plane tilted slightly with respect to the preceding one, meaning the coordinate basis has changed. I then pick the vector in the space which is approximately parallel to the preceding one and whose covariant derivative is zero. And I make my path tangent to that vector, i.e., I adopt that vector as the new tangent vector to my path, effectively choosing the path.

I then advance a distance dx and start over at the beginning of the last paragraph.

Is this reasonable or have I entirely missed something -- hopefully simpler?

I have seen another (closed) discussion on this topic which says a number of things which mean nothing to me:

"Parallel transport just keeps the vector at the same angle it started, irrespective of the worldline that's transporting it" OK, parallel with respect to what?

"...parallel transport is defined with respect to the connection...?" I did not realize a connection was a set of coordinates. I thought it was a function of derivatives of the metric, commonly known as a connection coefficient or Christoffel symbol.

I guess my questions indicate the degree of my confusion. I need some intuitive understanding of this. It is not adequate to me to say that covariant derivative = zero defines parallel transfer.

Sorry for the confusion and thanks in advance for any help.

2. Apr 19, 2017

### Orodruin

Staff Emeritus
There is no need to have a metric in order to define a connection. The connection is defined by the connection coefficients. However, if there is a metric, there is a unique torsion free affine connection that is metric compatible - the Levi-Civita connection.

Why not? It is how parallel transport is defined. A vector field $X$ is parallel along a curve if $\nabla_{\dot\gamma} X = 0$ along that curve (where $\dot\gamma$ is the tangent vector of the curve).

In heuristic terms, it tells you that the vector $X$ "does not change" along the curve. What "does not change" means is defined by the affine connection.

3. Apr 19, 2017

### pervect

Staff Emeritus
Here is how I look at parallel transport.

One very useful way of defining a straight line is to say that you walk in the same direction. I believe you use the concept yourself in your example - I'll stick with your analogy of "walking". But, on a curved manifold, there's a unique tangent space at every point, as you also mentioned. So, to be able to walk in the same direction, you implicitly define a map from one tangent space to another, that maps a vector pointing in the direction you are moving from one tangent space to the next as you walk along the path.

We say that a geodesic parallel transports itself.

If one has any definition of a straight line, one defines this map, a map from a vector in one tangent space to another, for a vector pointing in the direction one is moving - walking, in your park example. This is just a logical consequence of having any well defined notion of what a straight line is.

Parallel transport is a bit more involved. Walking in a straight line means that you can carry a vector that points along your path of travel along with you. But parallel transport is a more powerful generalization - it says you can transport ANY vector, even one that's not pointing in the direction you're walking in, along with you. So if you're holding a spear pointing to your right, you can parallel transport that spear as you walk along, and it points in the "same direction" as you walk.

Up to now, it's all been very general, but for the purposes of physics we are often less general than the bare minimum of what is logically required to be self-consistent. Instead of using any possible logical definition of a straight line, we insist that a straight line is the shortest distance between two nearby points - or to be more precise, a path that extremizes distance. A bit more on this distinction later.

So this is where physicists start to take a less general view than mathematicians, insisting that straight lines are the shortest distance between two points. And we also insist that if we parallel transport a pair of vectors, they maintain the same angle with each other. So if we point orthogonal to our path, to the right, as we travel down the path, we define a notion of parallel transporting this vector that points to the right that doesn't change its angle relative to the direction of travel.

The mathematics of this is that we preserve the dot product of the two vectors as we parallel transport them. This dot product is defined by our metric (which takes two vectors, and outputs a number. If the vectors are orthogonal, the number is zero). You can also use the covariant derivative definition you were quoting from your texts, but you were complaining that it wasn't very intuitive (and I'd have to agree).
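
As a rough sketch of why the dot product survives, in index notation (the symbols are chosen here for illustration: $t^c$ is the tangent to the path, $\lambda$ a parameter along it, and $u^a$, $v^b$ the two vectors being carried): assume the connection is metric compatible, $\nabla_c g_{ab} = 0$, and that both vectors are parallel transported, $t^c \nabla_c u^a = 0$ and $t^c \nabla_c v^b = 0$. The product rule then gives

$$\frac{d}{d\lambda}\left(g_{ab} u^a v^b\right) = t^c \nabla_c \left(g_{ab} u^a v^b\right) = (t^c \nabla_c g_{ab})\, u^a v^b + g_{ab}\, (t^c \nabla_c u^a)\, v^b + g_{ab}\, u^a\, (t^c \nabla_c v^b) = 0.$$

Each of the three terms vanishes separately, so the lengths of the vectors and the angle between them stay fixed along the path.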

These less-than-totally-general conditions (preserving angles, straight lines being locally the shortest) on parallel transport single out a single map from vectors in one tangent space to another. This map from vectors in one tangent space to vectors in another has a general name, by the way - it's called a connection. And the specific map that we single out with our conditions is the special connection, the Levi-Civita connection.

That's the basics. I've skipped over mentioning some tricky points, but there's no sense in getting too tricky right off the bat, it's better to try to keep things simple. Hopefully this is simple enough to be of some use.

Well, I will mention one tricky thing that may give some insight. On a hilly surface, it's possible for two walkers to start at the same point, both walk in straight lines, but still meet each other at a later point along their paths. This has a number of interesting logical consequences. One can ask, for instance - do each of the travellers necessarily travel the same distance when they meet again? (The answer is no). And when the answer is "no", we have two travellers walking in straight lines from one point to another that have different path lengths, so it becomes obvious that we need to be a bit more formal about what a straight line is than saying it's a path of "shortest distance". Both paths are straight, but when they have different lengths, one of them must not be the shortest possible path. So this is what motivates the necessity for talking about "extremal distance" rather than "shortest distance" in my previous remarks.

Last edited: Apr 19, 2017

4. Apr 19, 2017

### A.T.

Imagine a tank, with the gun turret rotation inversely coupled to the tank steering: When the tank hull turns X degrees relative to the local ground, the turret turns -X degrees relative to the hull, so the gun keeps its orientation relative to the local ground. This is how you parallel transport a gun.

5. Apr 19, 2017

### joneall

I exaggerated in my remark about the covariant derivative being zero defining parallel transport (not transfer): I should have said "intuitive", not "adequate".

Good analogy with the tank, A.T.

And pervect's little essay was quite enlightening.

I have been thru all the math of parallel transfers and geodesics. It's the science behind it that is hard to see. I'm still cogitating.

6. Apr 19, 2017

### Ibix

If I freefall along some path in spacetime then all the accelerometers I routinely carry about my person read zero. So I, and my accelerometers, must be pointing in the same direction all the time.

But this leads to some oddities in curved manifolds. For example, if you and I start somewhere on a sphere pointing in slightly different directions and move at the same speed on great circles we'll meet again every half circle. But each time we'll swap whether I'm facing clockwise or anticlockwise of you, despite that neither of us changed direction. And our directions relative to a latitude/longitude coordinate system will, in general, be changing all the time.
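
A minimal numerical sketch of that sphere example, assuming a unit sphere embedded in 3-D and two walkers who leave the same point at the same speed on great circles. The names (`great_circle_point`, `START`, `EPS`, the two headings) are made up for illustration. It shows the two paths crossing after half a circle and the second walker ending up on the other side of the first walker's path afterwards.

```python
import math

def great_circle_point(start, heading, s):
    """Position after arc length s on the unit-sphere great circle that
    leaves `start` in direction `heading` (assumed orthonormal 3-vectors)."""
    return tuple(p * math.cos(s) + h * math.sin(s) for p, h in zip(start, heading))

# Hypothetical setup: both walkers start at the same point on the "equator".
START = (1.0, 0.0, 0.0)
EPS = 0.1  # small angle between the two initial headings, in radians
MY_HEADING = (0.0, 1.0, 0.0)                        # I walk along the equator (z = 0)
YOUR_HEADING = (0.0, math.cos(EPS), math.sin(EPS))  # your heading is tilted toward +z

def your_side_offset(s):
    """z-coordinate of your position. I stay at z = 0, so its sign tells
    which side of my great circle you are on after arc length s."""
    return great_circle_point(START, YOUR_HEADING, s)[2]

if __name__ == "__main__":
    for s in (0.5 * math.pi, math.pi, 1.5 * math.pi):
        me = great_circle_point(START, MY_HEADING, s)
        you = great_circle_point(START, YOUR_HEADING, s)
        print(f"s = {s:.3f}  me = {me}  you = {you}  offset = {your_side_offset(s):+.3f}")
```

Neither walker ever turns, yet the offset changes sign at the antipodal point, which is the swap described above.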

The connection coefficients define how the direction of the basis vectors changes as you move. Partial derivatives describe how something changes with respect to the basis vectors. So the covariant derivative is the change of something corrected for the change of the basis vectors. So the covariant derivative of my heading, my path's tangent vector, along a free-fall path (where my direction isn't physically changing, regardless of its relation to the basis vectors) is either zero or our whole approach is wrong.

I think I've got that right.

7. Apr 19, 2017

### Staff: Mentor

But a single vector does not have a covariant derivative. In order to define any kind of derivative, you need a vector field, i.e., an assignment of vectors to different points in the manifold--more precisely, to the tangent spaces at different points in the manifold. Then you need a connection, i.e., a way of matching up vectors in the tangent spaces at different points, so you can define the difference between vector A at point #1 and vector B at neighboring point #2, which you need to be able to define in order to define derivatives. Finally, you need to know which neighboring points to use to make the comparison, which means you need a path--a particular curve whose points are the ones you use.

All this is why the definition is done the way it is, instead of the way you suggest.

8. Apr 19, 2017

### pervect

Staff Emeritus
I rather like the idea of approximating a general curve with a series of short segments that are geodesics ("straight"), and sharp turns. To parallel transport a vector you hold the angle to your path constant only when you're not making one of the sharp turns. When you make a sharp turn, you don't let the vector (your tank turret) rotate with you.

The only proof I have to offer that this process converges to a unique idea of parallel transport is that the area between the curve made of geodesic segments and the smooth curve becomes very small, and I think it can be shown that the difference in rotation depends on this area. But I have some nagging doubts about this whole process working as I think it does, and proving that the process converges to a unique answer as one takes different approximations of the smooth curve in piecewise segments formally seems like a bit of a leap of faith.
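
As a numerical sanity check on one example (not a proof of convergence), here is a sketch that applies the segments-and-sharp-turns rule to a circle of constant latitude on a unit sphere. It assumes the sphere is embedded in 3-D, approximates the circle by a closed polygon of great-circle arcs, and uses the fact that on a unit sphere the turning angles of a closed geodesic polygon plus the enclosed area add up to $2\pi$. Under the rule, the carried vector ends up rotated (modulo $2\pi$, and up to sign convention) by $2\pi$ minus the sum of the turning angles. The function names are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def unit_tangent(p, q):
    """Unit tangent at p of the great-circle arc from p toward q (unit vectors)."""
    c = dot(p, q)
    w = tuple(qi - c * pi for pi, qi in zip(p, q))
    n = math.sqrt(dot(w, w))
    return tuple(x / n for x in w)

def polygon_rotation(theta0, n_sides):
    """Rotation of a vector carried once around a regular geodesic polygon
    inscribed in the circle of colatitude theta0, using the rule: keep the
    angle to the path on each segment, do not rotate the vector at corners."""
    verts = [(math.sin(theta0) * math.cos(2.0 * math.pi * k / n_sides),
              math.sin(theta0) * math.sin(2.0 * math.pi * k / n_sides),
              math.cos(theta0)) for k in range(n_sides)]
    total_turn = 0.0
    for k in range(n_sides):
        here = verts[k]
        # Direction of travel on arrival at this corner, and on leaving it.
        arriving = tuple(-x for x in unit_tangent(here, verts[k - 1]))
        leaving = unit_tangent(here, verts[(k + 1) % n_sides])
        c = cross(arriving, leaving)
        total_turn += math.atan2(math.sqrt(dot(c, c)), dot(arriving, leaving))
    return 2.0 * math.pi - total_turn

if __name__ == "__main__":
    theta0 = math.pi / 3.0
    for n in (8, 64, 512, 4096):
        print(n, polygon_rotation(theta0, n), 2.0 * math.pi * (1.0 - math.cos(theta0)))
```

In this example the polygon values approach $2\pi(1-\cos\theta_0)$, the area of the spherical cap bounded by the circle, as the number of sides grows.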

9. Apr 19, 2017

### Orodruin

Staff Emeritus
Is this not circular reasoning? The conclusion sounds reasonable, but the proof of the rotation angle being proportional to the integral of the scalar curvature over the enclosed area relies on the notion of parallel transport along the arbitrary smooth curve being defined.

10. Apr 19, 2017

### pervect

Staff Emeritus
It probably is circular. So it's probably not suitable for any formal mathematical treatment, though it may still be useful in providing some intuition.

11. Apr 20, 2017

### joneall

Sorry, but the extent of my reading (see above) defines the symbols, connection or Christoffel, in terms of the metric and its derivatives. How can you define it without a metric? I know you can do the math with the principle of least action and Euler-Lagrange, but that does not show me what the connection is in that case.

Is there some fundamental difference between those types of metrics? I think connection coefficients are the same as Christoffel symbols. Wikipedia says Levi-Civita is what is often referred to as the covariant derivative, expressed in terms of Ch'l symbols. Is this really correct?

Sorry for all the questions. This is not an easy subject, especially if one does not want to spend months studying differential geometry.

12. Apr 20, 2017

### weirdoguy

Ehresmann connection. It's defined for a vector (or even general fibre) bundle as a horizontal distribution in the tangent bundle of the total space (and you can show that it is equivalent to the existence of an operator called "covariant derivative"). You can do that for principal bundles as well. Frankly, that was the first notion of connection that I have learned (I studied physics) and it made it much simpler to understand the Levi-Civita connection (in which case the total space of the vector bundle is TM and the distribution lies in TTM). When I was reading GR books before that I really didn't grasp the concept. Of course it requires more knowledge in differential geometry, but for me usually more "difficult"/general language makes things easier. The same was with classical gauge theories, learning about principal bundles first made things way clearer.

13. Apr 20, 2017

### joneall

To get back to my park analogy, I will replace paragraph 5, where I reach a non-flat part by the following:

First, because this is a manifold and changes smoothly from point to point, I have a connection, a mathematical device which allows me to map from the last tangent space to the current one. This is needed because the new tangent space is also a plane, but is tilted slightly with respect to the preceding one, meaning the coordinate basis has changed. So I use the connection -- whatever sort it may be -- to map the vector to the new coordinates and then calculate the direction in which its covariant derivative is zero. And I make my path tangent to that vector, i.e., I adopt that vector as the new tangent vector to my path, effectively choosing the path.

Is that approximately ok as an analogy?

I must of course also take into account pervect's remark that one can parallel transport any vector, not just a tangent one.

14. Apr 20, 2017

### Orodruin

Staff Emeritus
I think you are missing an important point of what an affine connection is. It is nothing but a way of relating vectors at different nearby points in a manifold. Given a vector $X$ at a point $p$ and a vector field $Y$ in some neighbourhood of $p$, it should satisfy
$$\nabla_{fX} Y = f \nabla_X Y \quad \mbox{and} \quad \nabla_X(fY) = df(X) Y + f \nabla_X Y$$
for any (smooth) scalar fields $f$ - that's it. If you insist on doing this by components, you can introduce the connection coefficients $\Gamma_{ab}^c$ according to
$$\nabla_a \partial_b = \Gamma_{ab}^c \partial_c.$$
Nowhere do you need a reference to a metric and any connection satisfying the mentioned identities is a perfectly fine affine connection.

Now, when you do have a metric you can require some additional things from your connection. We can require that the geodesics implied by the affine connection are extrema of the path length and that the connection is metric compatible $\nabla_X g = 0$ (this translates to parallel transport preserving lengths of and angles between parallel transported vectors).

The unique torsion free metric compatible affine connection on a manifold with a metric is called the Levi-Civita connection. This is the affine connection that you will encounter in GR.

This notion requires the embedding of your manifold in a higher-dimensional one with a pre-existing metric. It may serve as a heuristic for understanding why general curved spaces require the definition of a connection, but it is far from the most elegant or general way of looking at things. Use it for your heuristic understanding, but not as a general way of understanding manifolds. The description within the manifold is that the tangent spaces are just different and there are a priori several different ways of comparing nearby tangent spaces. This is what the affine connection is for - essentially it tells you what a vector "not changing" along a curve means.

There is no requirement that your connection must be the Levi-Civita connection. One of my favourite examples is the connection on the two-sphere (with the poles removed) for which compass directions are parallel transported (i.e., the parallel transport of a unit vector pointing east will always be a unit vector pointing east). This connection is metric compatible, but not torsion free. Its geodesics are curves of constant compass direction, i.e., a curve always going in the north-east direction is a geodesic of this connection.
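
A sketch of that compass connection in components, assuming the usual spherical coordinates $(\theta, \phi)$ with metric $ds^2 = d\theta^2 + \sin^2\theta\, d\phi^2$ and the convention $\nabla_a \partial_b = \Gamma_{ab}^c \partial_c$. The frame names $e_S$ and $e_E$ are introduced here for illustration. Take the unit vectors pointing south and east,

$$e_S = \partial_\theta, \qquad e_E = \frac{1}{\sin\theta}\, \partial_\phi,$$

and define the connection by declaring both to be parallel, $\nabla_X e_S = \nabla_X e_E = 0$ for every $X$. Then $\nabla_a \partial_\theta = 0$ and

$$\nabla_a \partial_\phi = \nabla_a (\sin\theta\, e_E) = (\partial_a \sin\theta)\, e_E = \delta_a^\theta \cot\theta\, \partial_\phi,$$

so the only non-zero connection coefficient is

$$\Gamma_{\theta\phi}^\phi = \cot\theta, \qquad \Gamma_{\phi\theta}^\phi = 0.$$

The coefficients are not symmetric in the lower indices, which is the torsion: $\nabla_\theta \partial_\phi - \nabla_\phi \partial_\theta = \cot\theta\, \partial_\phi \neq 0$. Metric compatibility holds because the parallel frame is orthonormal, so $g(e_i, e_j)$ is constant. For comparison, the Levi-Civita connection of the same metric has $\Gamma_{\theta\phi}^\phi = \Gamma_{\phi\theta}^\phi = \cot\theta$ and $\Gamma_{\phi\phi}^\theta = -\sin\theta\cos\theta$.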

15. Apr 20, 2017

### joneall

I think I see what you mean, but I do not understand the symbols in this formula. Could you explain or point me to an explanation? The nabla is the covariant derivative, normally, but what it is doing with two lower indices (indexes?) is beyond me. Sorry.

16. Apr 20, 2017

### Orodruin

Staff Emeritus
The affine connection $\nabla$ is a differential operator that takes a vector and a vector field as arguments. It never has two indices, the $\nabla_{fX}$ is letting the vector $X$ be multiplied by a constant. This is the coordinate-free notation. If you introduce coordinates, you would typically write $\nabla_a \equiv \nabla_{\partial_a}$ and therefore $\nabla_X = \nabla_{X^a\partial_a} = X^a \nabla_a$. The differential $df$ is a one-form and $df(X)$ is the value of applying that one-form to the vector $X$, i.e., $df(X) = X^a \partial_a f$ in any coordinate system.
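
To connect this with the component formula found in the textbooks, here is a short worked step. It assumes, in addition to the two identities quoted earlier, that $\nabla$ is additive in both arguments. Writing $X = X^a \partial_a$ and $Y = Y^b \partial_b$, and using $\nabla_a \partial_b = \Gamma_{ab}^c \partial_c$:

$$\nabla_X Y = X^a \nabla_a (Y^b \partial_b) = X^a \left[ (\partial_a Y^b)\, \partial_b + Y^b\, \Gamma_{ab}^c\, \partial_c \right] = X^a \left( \partial_a Y^c + \Gamma_{ab}^c Y^b \right) \partial_c.$$

The bracket is the familiar $\nabla_a Y^c = \partial_a Y^c + \Gamma_{ab}^c Y^b$, and the vector $X$ only enters through the contraction with $X^a$. Note that in this convention the differentiation index is the first lower index of $\Gamma$; many GR texts put it last, which makes no difference for the Levi-Civita connection because its coefficients are symmetric in the lower indices.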

17. Apr 20, 2017

### pervect

Staff Emeritus
The covariant derivative of a tensor adds a rank to a tensor - so if you have a rank 0 tensor, a scalar, $\nabla_a f$ is a vector (with a lower index, the way it's written, so you might call it a dual vector, depending on your notation). The covariant derivative of a vector doesn't make sense, but the covariant derivative of a vector field is a rank 2 tensor, $\nabla_a u^b$ for instance.

To parallel transport a vector along a curve, you first define the curve, C. And at every point, the curve has a tangent vector, $t^a$, which points in the direction of the path one is walking.

The parallel transport condition for a vector $u^a$ along the curve (curve C has a tangent $t^a$) satisfies the equation

$$t^a \nabla_a u^b = 0$$

This seems to be rather different from what you wrote.
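
Written out in a coordinate chart, this condition is an ordinary differential equation for the components of $u^b$. As a sketch, assume the curve is given as $x^a(\lambda)$ with $t^a = dx^a/d\lambda$ (the parameter $\lambda$ is introduced here for illustration) and that the connection is the Levi-Civita one, so the order of the lower indices on $\Gamma$ does not matter. Then $t^a \partial_a u^b = du^b/d\lambda$ and

$$t^a \nabla_a u^b = \frac{du^b}{d\lambda} + \Gamma^b_{ac}\, t^a u^c = 0.$$

Given $u^b$ at one point of the curve, this equation determines it at every other point of the curve.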

Sometimes you see $t^a \nabla_a$ written as $\nabla_{\vec{t}}$, where $\vec{t}$ is a vector (with components $t^i$). But I find this notation rather confusing, to be honest. IIRC it's called a directional derivative when written in this form.

I don't find any of this particularly intuitive - It's good to be able to write the equation for a parallel transport given the covariant derivative, but I don't think it generates much intuitive insight.

18. Apr 20, 2017

### joneall

Sorry, Orodruin, but I have the feeling you are out in some abstract mathematical space and I am down here on Earth. Certainly due to my ignorance. But there it is...

To begin with, using f to designate a constant is off-putting to us laypeople, who are used to using it for a function. Hence my confusion on that point.

Is the nabla the covariant derivative (which I understand to be a special case of a Levi-Civita connection, itself a special case of an affine connection -- Did I get that right?)? As pervect states, it adds a rank to a tensor. Here's the version of the covariant derivative I know about:

$\nabla_{\beta}V^\alpha = \partial_{\beta}V^\alpha + V^{\nu}\Gamma^{\alpha}_{\nu\beta}$

I fail to see where it has a vector and a vector field as arguments. All I see is the vector V. Again, my ignorance. Sorry to drag out this discussion.

19. Apr 20, 2017

### vanhees71

You can define a "directional derivative" by it. Take another vector field with components $W^{\beta}$ then you can define
$$(\nabla_W V)^{\alpha} = W^{\beta} \nabla_{\beta} V^{\alpha}.$$
Another application is a covariant derivative for vectors defined along a curve like $u^{\mu}=\dot{x}^{\mu}$, where the dot refers to the derivative with respect to an affine parameter like proper time and $x^{\mu}(\tau)$ given as a trajectory of a massive particle. Then the covariant derivative of arbitrary vectors $v^{\mu}(\tau)$ wrt. $\tau$ is
$$\mathrm{D}_{\tau} v^{\mu}(\tau)=\dot{v}^{\mu} + {\Gamma^{\mu}}_{\rho \sigma} u^{\rho} v^{\sigma}.$$
Applied to $u^{\mu}$ itself, you can use it to define geodesics via
$$\mathrm{D}_{\tau} u^{\mu} =0.$$
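
To see the transport equation $\dot{v}^{\mu} = -{\Gamma^{\mu}}_{\rho\sigma} u^{\rho} v^{\sigma}$ do something concrete, here is a minimal sketch that integrates it numerically once around a circle of constant colatitude $\theta_0$ on a unit sphere. The assumptions are mine: coordinates $(\theta, \phi)$ with $ds^2 = d\theta^2 + \sin^2\theta\, d\phi^2$, the Levi-Civita coefficients $\Gamma^\theta_{\phi\phi} = -\sin\theta\cos\theta$ and $\Gamma^\phi_{\theta\phi} = \Gamma^\phi_{\phi\theta} = \cot\theta$, the curve parametrized by $\phi$ itself so that $u = (0, 1)$, and a hand-written RK4 stepper. The function names are hypothetical.

```python
import math

def rhs(theta0, v):
    """dv/dphi for parallel transport along theta = theta0 on the unit sphere.
    v = (v^theta, v^phi) are coordinate components; the tangent is u = (0, 1)."""
    v_theta, v_phi = v
    return (
        math.sin(theta0) * math.cos(theta0) * v_phi,       # -Gamma^theta_{phi phi} v^phi
        -(math.cos(theta0) / math.sin(theta0)) * v_theta,  # -Gamma^phi_{phi theta} v^theta
    )

def transport_around_latitude(theta0, v0, steps=20000):
    """Integrate the transport equation for phi from 0 to 2*pi with RK4."""
    h = 2.0 * math.pi / steps
    v = v0
    for _ in range(steps):
        k1 = rhs(theta0, v)
        k2 = rhs(theta0, (v[0] + 0.5 * h * k1[0], v[1] + 0.5 * h * k1[1]))
        k3 = rhs(theta0, (v[0] + 0.5 * h * k2[0], v[1] + 0.5 * h * k2[1]))
        k4 = rhs(theta0, (v[0] + h * k3[0], v[1] + h * k3[1]))
        v = (v[0] + h * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0,
             v[1] + h * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0)
    return v

def length(theta0, v):
    """Metric length of v at colatitude theta0."""
    return math.hypot(v[0], math.sin(theta0) * v[1])

if __name__ == "__main__":
    theta0 = math.pi / 3.0   # colatitude of 60 degrees
    v_start = (1.0, 0.0)     # unit vector pointing south
    v_end = transport_around_latitude(theta0, v_start)
    print("start:", v_start, "end:", v_end, "length:", length(theta0, v_end))
```

In the orthonormal components $(v^\theta, \sin\theta_0\, v^\phi)$ the equations are a pure rotation at rate $\cos\theta_0$ per unit $\phi$, so after one circuit the vector has turned by $2\pi\cos\theta_0$ relative to the local south and east directions while keeping its length. At a colatitude of 60 degrees that is half a turn, so a vector that starts out pointing south comes back pointing north.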

20. Apr 20, 2017

### Orodruin

Staff Emeritus
The nabla is the affine connection. The notation is the same as for the covariant derivative. The Levi-Civita connection is a particular type of affine connection in a manifold with a metric.

Exactly. A type $(n,m)$ tensor is a multilinear map from $m$ copies of the tangent space to $n$ copies of the same tangent space. The typical coordinate free notation is to put the additional argument as a subscript of the $\nabla$. For example, if $T$ is a type $(1,2)$ tensor, then $T(Y,Z)$ is a tangent vector and $\nabla T$ is a type $(1,3)$ tensor $S$ such that $S(X,Y,Z) \equiv (\nabla_X T)(Y,Z)$. There is nothing here that is any different from what you are doing, except that you have put everything in coordinate form.

That would be the free $\beta$ index, which can be contracted with the components of the vector, and the field $V$, respectively.