Understanding parallel transfer

joneall · Apr 19, 2017

I've read Collier's book on General Relativity and consulted parts of Schutz, Hartle and Carroll. In the terms they use, I have yet to gain anything resembling an intuitive understanding of parallel transport.

In fact, it seems to me it is usually presented backwards, saying that the geodesic is the path where the covariant derivative of the tangent vector to the path is zero. This is true, but it seems to me simpler to say that the geodesic is chosen parallel to the tangent vector whose covariant derivative is zero.

Let me explain in really simple terms. I was trying to visualize it as I walked my dog thru the local park this morning. The park is on a gentle slope with hillocks here and there, so that it is a nice 2-D manifold with a tangent space at each point. The path starts out on level ground and so is just a straight line in the direction of my destination, a point at the top of the slope where there is a gate. So the tangent vector to the path parallel transports, i.e., it remains constant and its derivative is zero.

Now I reach a spot where there is a hillock nearby, ahead on the left. Suddenly, I am not in flat space any more and I must take that into account. What do I do?

I first adjust my tangent space to the terrain, so it is a plane tilted slightly with respect to the preceding one, meaning the coordinate basis has changed. I then pick the vector in the space which is approximately parallel to the preceding one and whose covariant derivative is zero. And I make my path tangent to that vector, i.e., I adopt that vector as the new tangent vector to my path, effectively choosing the path.

I then advance a distance dx and start over at the beginning of the last paragraph.

Is this reasonable or have I entirely missed something -- hopefully simpler?

I have seen another (closed) discussion on this topic which says a number of things which mean nothing to me:

"Parallel transport just keeps the vector at the same angle it started, irrespective of the worldline that's transporting it" OK, parallel with respect to what?

"...parallel transport is defined with respect to the connection...?" I did not realize a connection was a set of coordinates. I thought it was a function of derivatives of the metric, commonly known as a connection coefficient or Christoffel symbol.

I guess my questions indicate the degree of my confusion. I need some intuitive understanding of this. It is not adequate to me to say that covariant derivative = zero defines parallel transfer.

Sorry for the confusion and thanks in advance for any help.

Orodruin · Apr 19, 2017

joneall said:

I did not realize a connection was a set of coordinates. I thought it was a function of derivatives of the metric, commonly known as a connection coefficient or Christoffel symbol.

There is no need to have a metric in order to define a connection. The connection is defined by the connection coefficients. However, if there is a metric, there is a unique torsion free affine connection that is metric compatible - the Levi-Civita connection.

joneall said:

It is not adequate to me to say that covariant derivative = zero defines parallel transfer.

Why not? It is how parallel transport is defined. A vector field ##X## is parallel along a curve if ##\nabla_{\dot\gamma} X = 0## along that curve (where ##\dot\gamma## is the tangent vector of the curve).

In heuristic terms, it tells you that the vector ##X## "does not change" along the curve. What "does not change" means is defined by the affine connection.

pervect · Apr 19, 2017

Here is how I look at parallel transport.

One very useful way of defining a straight line is to say that you walk in the same direction. I believe you use the concept yourself in your example - I'll stick with your analogy of "walking". But, on a curved manifold, there's a unique tangent space at every point, as you also mentioned. So, to be able to walk in the same direction, you implicitly define a map from one tangent space to another, that maps a vector pointing in the direction you are moving from one tangent space to the next as you walk along the path.

We say that a geodesic parallel transports itself.

If one has any definition of a straight line, one defines this map, a map from a vector in one tangent space to another, for a vector pointing in the direction one is moving - walking, in your park example. This is just a logical consequence of having any well defined notion of what a straight line is.

Parallel transport is a bit more involved. Walking in a straight line means that you can carry a vector that points along your path of travel along with you. But parallel transport is a more powerful generalization - it says you can transport ANY vector, even one that's not pointing in the direction you're walking in, along with you. So if you're holding a spear pointing to your right, you can parallel transport that spear as you walk along, and it points in the "same direction" as you walk.

Up to now, it's all been very general, but for the purposes of physics we are often less general than the bare minimum of what is logically required to be self-consistent. Instead of using any possible logical definition of a straight line, we insist that a straight line is the shortest distance between two nearby points - or to be more precise, a path that extremizes distance. A bit more on this distinction later.

So this is where physicists start to take a less general view than mathematicians, insisting that straight lines are the shortest distance between two points. And we also insist that if we parallel transport a pair of vectors, they maintain the same angle with each other. So if we point orthogonal to our path, to the right, as we travel down the path, we define a notion of parallel transporting this vector that points to the right that doesn't change its angle relative to the direction of travel.

The mathematics of this is that we preserve the dot product of the two vectors as we parallel transport them. This dot product is defined by our metric (which takes two vectors, and outputs a number. If the vectors are orthogonal, the number is zero). You can also use the covariant derivative definition you were quoting from your texts, but you were complaining that it wasn't very intuitive (and I'd have to agree).

These less-than-totally-general conditions (preserving angles, straight lines being locally the shortest) on parallel transport single out a single map from vectors in one tangent space to another. This map from vectors in one tangent space to vectors in another has a general name, by the way - it's called a connection. And the specific map that we single out with our conditions is the special connection, the Levi-Civita connection.

That's the basics. I've skipped over mentioning some tricky points, but there's no sense in getting too tricky right off the bat, it's better to try to keep things simple. Hopefully this is simple enough to be of some use.

Well, I will mention one tricky thing that may give some insight. On a hilly surface, it's possible for two walkers to start at the same point, both walk in straight lines, but still meet each other at a later point along their paths. This has a number of interesting logical consequences. One can ask, for instance - do each of the travellers necessarily travel the same distance when they meet again? (The answer is no). And when the answer is "no", we have two travellers walking in straight lines from one point to another that have different path lengths, so it becomes obvious that we need to be a bit more formal about what a straight line is than saying it's a path of "shortest distance". Both paths are straight, but when they have different lengths, one of them must not be the shortest possible path. So this is what motivates the necessity for talking about "extremal distance" rather than "shortest distance" in my previous remarks.

A.T. · Apr 19, 2017

joneall said:

I need some intuitive understanding of this.

Imagine a tank, with the gun turret rotation inversely coupled to the tank steering: When the tank hull turns X degrees relative to the local ground, the turret turns -X degrees relative to the hull, so the gun keeps its orientation relative to the local ground. This is how you parallel transport a gun.

joneall · Apr 19, 2017

Thanks, everybody, for the helpful answers.

I exaggerated in my remark about the covariant derivative's being zero defining parallel transport (not transfer): I should have said "intuitive", not "adequate".

Good analogy with the tank, A.T.

And pervect's little essay was quite enlightening.

I have been thru all the math of parallel transport and geodesics. It's the science behind it that is hard to see. I'm still cogitating.

Ibix · Apr 19, 2017

If I freefall along some path in spacetime then all the accelerometers I routinely carry about my person read zero. So I, and my accelerometers, must be pointing in the same direction all the time.

But this leads to some oddities in curved manifolds. For example, if you and I start somewhere on a sphere pointing in slightly different directions and move at the same speed on great circles we'll meet again every half circle. But each time we'll swap whether I'm facing clockwise or anticlockwise of you, despite that neither of us changed direction. And our directions relative to a latitude/longitude coordinate system will, in general, be changing all the time.

The connection coefficients define how the direction of the basis vectors change as you move. Partial derivatives describe how something changes with respect to the basis vectors. So the covariant derivative is the change of something corrected for the change of the basis vectors. So the covariant derivative of my heading, my path's tangent vector, along a free-fall path (where my direction isn't physically changing, regardless of its relation to the basis vectors) is either zero or our whole approach is wrong.

I think I've got that right.

PeterDonis · Apr 19, 2017

joneall said:

it seems to me simpler to say that the geodesic is chosen parallel to the tangent vector whose covariant derivative is zero

But a single vector does not have a covariant derivative. In order to define any kind of derivative, you need a vector field, i.e., an assignment of vectors to different points in the manifold--more precisely, to the tangent spaces at different points in the manifold. Then you need a connection, i.e., a way of matching up vectors in the tangent spaces at different points, so you can define the difference between vector A at point #1 and vector B at neighboring point #2, which you need to be able to define in order to define derivatives. Finally, you need to know which neighboring points to use to make the comparison, which means you need a path--a particular curve whose points are the ones you use.

All this is why the definition is done the way it is, instead of the way you suggest.

pervect · Apr 19, 2017

A.T. said:

Imagine a tank, with the gun turret rotation inversely coupled to the tank steering: When the tank hull turns X degrees relative to the local ground, the turret turns -X degrees relative to the hull, so the gun keeps its orientation relative to the local ground. This is how you parallel transport a gun.

I rather like the idea of approximating a general curve with a series of short segments that are geodesics ("straight"), and sharp turns. To parallel transport a vector you hold the angle to your path constant only when you're not making one of the sharp turns. When you make a sharp turn, you don't let the vector (your tank turret) rotate when you make the sharp turn.

The only proof I have to offer that this process converges to a unique idea of parallel transport is that the area between the curve made of geodesic segments and the smooth curve becomes very small, and I think it can be shown that the difference in rotation depends on this area. But I have some nagging doubts about this whole process working as I think it does, and proving that the process converges to a unique answer as one takes different approximations of the smooth curve in piecewise segments formally seems like a bit of a leap of faith.
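As a numerical sanity check rather than a proof, here is a minimal sketch of that segment-and-corner procedure on a unit sphere. It assumes the sphere is embedded in 3D, so that carrying a vector along a great-circle arc is a rigid rotation about the arc's axis, and it approximates a circle of constant latitude by an inscribed geodesic polygon. The function names, the chosen latitude (cosine of colatitude 0.75) and the side counts are illustrative assumptions. The comparison value ##2\pi(1-\cos\theta_0)## is the standard result for the smooth circle.

```python
import math

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    return sum(x*y for x, y in zip(a, b))

def transport_along_arc(p, q, v):
    """Carry tangent vector v from p to q along the great-circle arc joining them.

    On the embedded unit sphere this is a rotation about the axis p x q,
    which keeps the angle between v and the arc's tangent fixed.
    """
    axis = cross(p, q)
    s = math.sqrt(dot(axis, axis))      # sine of the arc angle (p, q are unit vectors)
    c = dot(p, q)                       # cosine of the arc angle
    k = tuple(x / s for x in axis)
    kxv, kv = cross(k, v), dot(k, v)
    # Rodrigues rotation formula
    return tuple(v[i]*c + kxv[i]*s + k[i]*kv*(1.0 - c) for i in range(3))

def polygon_holonomy(cos_colat, n_sides):
    """Signed rotation (radians) of a vector carried once around an n-sided
    geodesic polygon inscribed in a circle of constant colatitude."""
    sin_colat = math.sqrt(1.0 - cos_colat**2)
    verts = [(sin_colat*math.cos(2*math.pi*k/n_sides),
              sin_colat*math.sin(2*math.pi*k/n_sides),
              cos_colat) for k in range(n_sides)]
    v0 = (0.0, 1.0, 0.0)                # points east at verts[0]
    v = v0
    for k in range(n_sides):
        # Geodesic segment: rotate with the arc. Sharp corner: leave v alone.
        v = transport_along_arc(verts[k], verts[(k + 1) % n_sides], v)
    e2 = cross(verts[0], v0)            # 90 degrees counterclockwise from v0
    return math.atan2(dot(v, e2), dot(v, v0))

smooth_limit = 2*math.pi*(1.0 - 0.75)   # known value for the smooth latitude circle
errors = {n: abs(polygon_holonomy(0.75, n) - smooth_limit) for n in (8, 64, 512)}
print(errors)
```

In this one example the mismatch shrinks as the polygon is refined, which is consistent with the area argument, but a single numerical case says nothing about uniqueness for arbitrary curves or approximations.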

Orodruin · Apr 19, 2017

pervect said:

The only proof I have to offer that this process converges to a unique idea of parallel transport is that the area between the curve made of geodesic segments and the smooth curve becomes very small, and I think it can be shown that the difference in rotation depends on this area.

Is this not circular reasoning? The conclusion sounds reasonable, but the proof of the rotation angle being proportional to the integral of the scalar curvature over the enclosed area relies on the notion of parallel transport along the arbitrary smooth curve being defined.

pervect · Apr 19, 2017

Orodruin said:

Is this not circular reasoning? The conclusion sounds reasonable, but the proof of the rotation angle being proportional to the integral of the scalar curvature over the enclosed area relies on the notion of parallel transport along the arbitrary smooth curve being defined.

It probably is circular. So it's probably not suitable for any formal mathematical treatment, though it may still be useful in providing some intuition.

joneall · Apr 20, 2017

Orodruin said:

There is no need to have a metric in order to define a connection. The connection is defined by the connection coefficients. However, if there is a metric, there is a unique torsion free affine connection that is metric compatible - the Levi-Civita connection.

Sorry, but the extent of my reading (see above) defines the symbols, connection or Christoffel, in terms of the metric and its derivatives. How can you define it without a metric? I know you can do the math with the principle of least action and Euler-Lagrange, but that does not show me what the connection is in that case.

Is there some fundamental difference between those types of metrics? I think connection coefficients are the same as Christoffel symbols. Wikipedia says Levi-Civita is what is often referred to as the covariant derivative, expressed in terms of Ch'l symbols. Is this really correct?

Sorry for all the questions. This is not an easy subject, especially if one does not want to spend months studying differential geometry.

weirdoguy · Apr 20, 2017

joneall said:

How can you define it without a metric?

Ehresmann connection. It's defined for a vector (or even a general fibre) bundle as a horizontal distribution in the tangent bundle of the total space (and you can show that it is equivalent to the existence of an operator called "covariant derivative"). You can do that for principal bundles as well. Frankly, that was the first notion of connection that I learned (I studied physics) and it made it much simpler to understand the Levi-Civita connection (in which case the total space of the vector bundle is TM and the distribution lies in TTM). When I was reading GR books before that I really didn't grasp the concept. Of course it requires more knowledge of differential geometry, but for me a more "difficult"/general language usually makes things easier. The same was true of classical gauge theories: learning about principal bundles first made things much clearer.

joneall · Apr 20, 2017

To get back to my park analogy, I will replace paragraph 5, where I reach a non-flat part by the following:

First, because this is a manifold and changes smoothly from point to point, I have a connection, a mathematical device which allows me to map from the last tangent space to the current one. This is needed because the new tangent space is also a plane, but is tilted slightly with respect to the preceding one, meaning the coordinate basis has changed. So I use the connection -- whatever sort it may be -- to map the vector to the new coordinates and then calculate the direction in which its covariant derivative is zero. And I make my path tangent to that vector, i.e., I adopt that vector as the new tangent vector to my path, effectively choosing the path.

Is that approximately ok as an analogy?

I must of course also take into account pervect's remark that one can parallel transport any vector, not just a tangent one.

Orodruin · Apr 20, 2017

joneall said:

Sorry, but the extent of my reading (see above) defines the symbols, connection or Christoffel, in terms of the metric and its derivatives. How can you define it without a metric? I know you can do the math with the principle of least action and Euler-Lagrange, but that does not show me what the connection is in that case.

I think you are missing an important point of what an affine connection is. It is nothing but a way of relating vectors at different nearby points in a manifold. Given a vector ##X## at a point ##p## and a vector field ##Y## in some neighbourhood of ##p##, it should satisfy
$$
\nabla_{fX} Y = f \nabla_X Y \quad \mbox{and} \quad \nabla_X(fY) = df(X) Y + f \nabla_X Y
$$
for any (smooth) scalar fields ##f## - that's it. If you insist on doing this by components, you can introduce the connection coefficients ##\Gamma_{ab}^c## according to
$$
\nabla_a \partial_b = \Gamma_{ab}^c \partial_c.
$$
Nowhere do you need a reference to a metric and any connection satisfying the mentioned identities is a perfectly fine affine connection.

Now, when you do have a metric you can require some additional things from your connection. We can require that the geodesics implied by the affine connection are extrema of the path length and that the connection is metric compatible ##\nabla_X g = 0## (this translates to parallel transport preserving lengths of and angles between parallel transported vectors).

The unique torsion free metric compatible affine connection on a manifold with a metric is called the Levi-Civita connection. This is the affine connection that you will encounter in GR.

joneall said:

This is needed because the new tangent space is also a plane, but is tilted slightly with respect to the preceding one, meaning the coordinate basis has changed.

This notion requires the embedding of your manifold in a higher-dimensional one with a pre-existing metric. It may serve as a heuristic for understanding why general curved spaces require the definition of a connection, but it is far from the most elegant or general way of looking at things. Use it for your heuristic understanding, but not as a general way of understanding manifolds. The description within the manifold is that the tangent spaces are just different and there are a priori several different ways of comparing nearby tangent spaces. This is what the affine connection is for - essentially it tells you what a vector "not changing" along a curve means.

There is no requirement that your connection must be the Levi-Civita connection. One of my favourite examples is the connection on the two-sphere (with the poles removed) for which compass directions are parallel transported (i.e., the parallel transport of a unit vector pointing east will always be a unit vector pointing east). This connection is metric compatible, but not torsion free. Its geodesics are curves of constant compass direction, i.e., a curve always going in the north-east direction is a geodesic of this connection.

joneall · Apr 20, 2017

Orodruin said:

$$
\nabla_{fX} Y = f \nabla_X Y \quad \mbox{and} \quad \nabla_X(fY) = df(X) Y + f \nabla_X Y
$$

I think I see what you mean, but I do not understand the symbols in this formula. Could you explain or point me to an explanation. The nabla is the covariant derivative, normally, but what it is doing with two lower indices (indexes?) is beyond me. Sorry.

Orodruin · Apr 20, 2017

joneall said:

I think I see what you mean, but I do not understand the symbols in this formula. Could you explain or point me to an explanation. The nabla is the covariant derivative, normally, but what it is doing with two lower indices (indexes?) is beyond me. Sorry.

The affine connection ##\nabla## is a differential operator that takes a vector and a vector field as arguments. It never has two indices, the ##\nabla_{fX}## is letting the vector ##X## be multiplied by a constant. This is the coordinate-free notation. If you introduce coordinates, you would typically write ##\nabla_a \equiv \nabla_{\partial_a}## and therefore ##\nabla_X = \nabla_{X^a\partial_a} = X^a \nabla_a##. The differential ##df## is a one-form and ##df(X)## is the value of applying that one-form to the vector ##X##, i.e., ##df(X) = X^a \partial_a f## in any coordinate system.

pervect · Apr 20, 2017

The covariant derivative of a tensor adds a rank to a tensor - so if you have a rank 0 tensor, a scalar, ##\nabla_a f## is a vector (with a lower index, the way it's written, so you might call it a dual vector, depending on your notation). The covariant derivative of a vector doesn't make sense, but the covariant derivative of a vector field is a rank 2 tensor, ##\nabla_a u^b## for instance.

To parallel transport a vector along a curve, you first define the curve, C. And at every point, the curve has a tangent vector, ##t^a##, which points in the direction of the path one is walking.

The parallel transport condition for a vector ##u^a## along the curve (curve C has a tangent ##t^a##) satisfies the equation

$$t^a \nabla_a u^b = 0$$

This seems to be rather different from what you wrote.

Sometimes you see ##t^a \nabla_a## written as ##\nabla_{\vec{t}}##, where ##\vec{t}## is a vector (with components ##t^i##). But I find this notation rather confusing, to be honest. IIRC it's called a directional derivative when written in this form.

I don't find any of this particularly intuitive - It's good to be able to write the equation for a parallel transport given the covariant derivative, but I don't think it generates much intuitive insight.

joneall · Apr 20, 2017

Orodruin said:

The affine connection ##\nabla## is a differential operator that takes a vector and a vector field as arguments. It never has two indices, the ##\nabla_{fX}## is letting the vector ##X## be multiplied by a constant. This is the coordinate-free notation. If you introduce coordinates, you would typically write ##\nabla_a \equiv \nabla_{\partial_a}## and therefore ##\nabla_X = \nabla_{X^a\partial_a} = X^a \nabla_a##. The differential ##df## is a one-form and ##df(X)## is the value of applying that one-form to the vector ##X##, i.e., ##df(X) = X^a \partial_a f## in any coordinate system.

Sorry, Orodruin, but I have the feeling you are out in some abstract mathematical space and I am down here on Earth. Certainly due to my ignorance. But there it is...

To begin with, using f to designate a constant is off-putting to us laypeople, who are used to using it for a function. Hence my confusion on that point.

Is the nabla the covariant derivative (which I understand to be a special case of a Levi-Civita connection, itself a special case of an affine connection -- Did I get that right?)? As pervect states, it adds a rank to a tensor. Here's the version of the covariant derivative I know about:

##\nabla_{\beta}V^\alpha = \partial_{\beta}V^\alpha + V^{\nu}\Gamma^{\alpha}_{\nu\beta}##

I fail to see where it has a vector and a vector field as arguments. All I see is the vector V. Again, my ignorance. Sorry to drag out this discussion.

vanhees71 · Apr 20, 2017

You can define a "directional derivative" by it. Take another vector field with components ##W^{\beta}## then you can define
$$(\nabla_W V)^{\alpha} = W^{\beta} \nabla_{\beta} V^{\alpha}.$$
Another application is a covariant derivative for vectors defined along a curve like ##u^{\mu}=\dot{x}^{\mu}##, where the dot refers to the derivative with respect to an affine parameter like proper time and ##x^{\mu}(\tau)## given as a trajectory of a massive particle. Then the covariant derivative of arbitrary vectors ##v^{\mu}(\tau)## wrt. ##\tau## is
$$\mathrm{D}_{\tau} v^{\mu}(\tau)=\dot{v}^{\mu} + {\Gamma^{\mu}}_{\rho \sigma} u^{\rho} v^{\sigma}.$$
Applied to ##u^{\mu}## itself, you can use it to define geodesics via
$$\mathrm{D}_{\tau} u^{\mu} =0.$$
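To make these formulas concrete, here is a minimal sketch that integrates ##\mathrm{D}_{\tau} v^{\mu} = 0## numerically once around a circle of constant latitude on the unit sphere. The setup is an assumption chosen for illustration: coordinates ##(\theta, \phi)## with the Levi-Civita coefficients ##\Gamma^{\theta}_{\phi\phi} = -\sin\theta\cos\theta## and ##\Gamma^{\phi}_{\theta\phi} = \Gamma^{\phi}_{\phi\theta} = \cot\theta##, the curve ##\theta = \theta_0##, ##\phi = \tau##, and a hand-written RK4 integrator. The function names are hypothetical.

```python
import math

def christoffel_term(theta, u, v):
    """Gamma^mu_{rho sigma} u^rho v^sigma on the unit sphere, coordinates (theta, phi)."""
    u_th, u_ph = u
    v_th, v_ph = v
    g_th_phph = -math.sin(theta) * math.cos(theta)   # Gamma^theta_{phi phi}
    g_ph_thph = math.cos(theta) / math.sin(theta)    # Gamma^phi_{theta phi} = Gamma^phi_{phi theta}
    return (g_th_phph * u_ph * v_ph,
            g_ph_thph * (u_th * v_ph + u_ph * v_th))

def transport_around_latitude(theta0, v0, steps=4000):
    """Integrate D_tau v = 0 once around the circle theta = theta0, phi = tau (RK4)."""
    u = (0.0, 1.0)                       # curve tangent: d(theta, phi)/d tau

    def rhs(v):
        g = christoffel_term(theta0, u, v)
        return (-g[0], -g[1])            # dv/dtau = -Gamma u v

    h = 2 * math.pi / steps
    v = v0
    for _ in range(steps):
        k1 = rhs(v)
        k2 = rhs((v[0] + 0.5*h*k1[0], v[1] + 0.5*h*k1[1]))
        k3 = rhs((v[0] + 0.5*h*k2[0], v[1] + 0.5*h*k2[1]))
        k4 = rhs((v[0] + h*k3[0], v[1] + h*k3[1]))
        v = (v[0] + h*(k1[0] + 2*k2[0] + 2*k3[0] + k4[0])/6,
             v[1] + h*(k1[1] + 2*k2[1] + 2*k3[1] + k4[1])/6)
    return v

theta0 = math.acos(0.75)                 # illustrative colatitude
v_end = transport_around_latitude(theta0, (1.0, 0.0))   # start pointing south
# Components along unit vectors (south, east): (v^theta, sin(theta) v^phi)
south, east = v_end[0], math.sin(theta0) * v_end[1]
print(south, east)

# D_tau u for the latitude circle itself: u has constant components,
# so only the Gamma term survives.
circle_accel = christoffel_term(theta0, (0.0, 1.0), (0.0, 1.0))
equator_accel = christoffel_term(math.pi / 2, (0.0, 1.0), (0.0, 1.0))
print(circle_accel, equator_accel)
```

The transported vector keeps its length but comes back rotated relative to where it started. The last two lines evaluate ##\mathrm{D}_{\tau} u^{\mu}## for the circle's own tangent: it is nonzero at this latitude and vanishes (up to rounding) on the equator, so of these circles only the equator satisfies the geodesic condition.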

Orodruin · Apr 20, 2017

joneall said:

Is the nabla the covariant derivative

The nabla is the affine connection. The notation is the same as for the covariant derivative. The Levi-Civita connection is a particular type of affine connection in a manifold with a metric.

joneall said:

As pervect states, it adds a rank to a tensor.

Exactly. A type ##(n,m)## tensor is a linear map from ##m## copies of the tangent space to ##n## copies of the same tangent space. The typical coordinate free notation is to put the additional argument as a subscript of the ##\nabla##. For example, if ##T## is a type ##(1,2)## tensor, then ##T(Y,Z)## is a tangent vector and ##\nabla T## is a type ##(1,3)## tensor ##S## such that ##S(X,Y,Z) \equiv \nabla_X T(Y,Z)##. There is nothing here that is any different from what you are doing, except that you have put everything in coordinate form.

joneall said:

I fail to see where it has a vector and a vector field as arguments.

That would be the free ##\beta## index, which can be contracted with the components of the vector, and the field ##V##, respectively.

stevendaryl · Apr 20, 2017

It seems as if this discussion has flown way past the point of giving simple examples of parallel transport, but it always helped me to understand the simplest cases first, before moving onto the abstract general theory.

The two cases that were most helpful to me in understanding parallel transport, intuitively, were:

Parallel transport of vectors in 2D Euclidean space described using polar coordinates.
Parallel transport of vectors on the surface of a sphere.

Figure 1: Parallel transport in polar coordinates

In Figure 1, I show the parallel transport of a vector [itex]\vec{V}[/itex], shown in red, around a circle. Initially, the vector is purely radial, so in polar coordinates, you would write [itex]\vec{V} = \hat{r}[/itex]. Then after you parallel-transport the vector 1/4 of the way around the circle counterclockwise, the vector is pointing purely tangentially, so in polar coordinates, you would write: [itex]\vec{V} = -\hat{\theta}[/itex]. So the components of the vector change even though the length and direction of the vector remain the same. The connection coefficients describe how the components of a vector change as the vector is parallel-transported.

Note 1: The picture says [itex]+\hat{\theta}[/itex], but the usual convention is that [itex]\theta[/itex] increases counterclockwise, which would make the vector in the negative [itex]\hat{\theta}[/itex] direction.

Note 2: The use of the notation [itex]\hat{r}[/itex] and [itex]\hat{\theta}[/itex] means a unit vector in the direction of increasing [itex]r[/itex] or increasing [itex]\theta[/itex]. Without a metric, there is no way to make sense of the notion of a "unit vector", but a metric is not necessary to make sense of parallel transport.
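The polar-coordinate example can be checked numerically. The sketch below is an illustration under stated assumptions: it uses the flat-plane connection coefficients in polar coordinates, ##\Gamma^{r}_{\theta\theta} = -r## and ##\Gamma^{\theta}_{r\theta} = \Gamma^{\theta}_{\theta r} = 1/r##, the curve ##r = R##, ##\theta = \lambda##, and a simple RK4 integrator. The radius ##R = 2## and the function names are arbitrary choices.

```python
import math

def transport_rhs(R, V):
    """Right-hand side of dV^mu/dlambda = -Gamma^mu_{rho sigma} u^rho V^sigma
    on the circle r = R, whose tangent is u = (0, 1) in (r, theta) components."""
    V_r, V_th = V
    dV_r = R * V_th        # -Gamma^r_{theta theta} u^theta V^theta, with Gamma = -R
    dV_th = -V_r / R       # -Gamma^theta_{theta r} u^theta V^r, with Gamma = 1/R
    return (dV_r, dV_th)

def transport_around_circle(R, V0, angle, steps=1000):
    """Parallel transport V0 counterclockwise through `angle` radians (RK4)."""
    h = angle / steps
    V = V0
    for _ in range(steps):
        k1 = transport_rhs(R, V)
        k2 = transport_rhs(R, (V[0] + 0.5*h*k1[0], V[1] + 0.5*h*k1[1]))
        k3 = transport_rhs(R, (V[0] + 0.5*h*k2[0], V[1] + 0.5*h*k2[1]))
        k4 = transport_rhs(R, (V[0] + h*k3[0], V[1] + h*k3[1]))
        V = (V[0] + h*(k1[0] + 2*k2[0] + 2*k3[0] + k4[0])/6,
             V[1] + h*(k1[1] + 2*k2[1] + 2*k3[1] + k4[1])/6)
    return V

R = 2.0
V_r, V_th = transport_around_circle(R, (1.0, 0.0), math.pi / 2)   # start as r-hat
# Convert coordinate components to components along the unit vectors r-hat, theta-hat.
unit_components = (V_r, R * V_th)
print(unit_components)
```

Starting from [itex]\hat{r}[/itex], a quarter turn gives components close to (0, -1) in the unit basis, i.e. [itex]-\hat{\theta}[/itex], in agreement with the text and with Note 1. Only the final conversion to unit vectors uses the metric; the transport equation itself needs only the connection coefficients.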

Figure 2: Parallel transport on a globe

In Figure 2, I show the simplest example I know of parallel transport in a curved space.

Imagine standing at the North Pole facing South along the line 0 degrees longitude. Hold your right arm out to represent a vector that will be parallel-transported. Initially, your arm points south along the line 90 degrees west longitude (90 degrees away from the direction you are facing). The vector drawn in red represents the direction your arm is pointing.
Now, march straight south along the line of 0 degrees longitude until you reach the equator. Keep facing the same direction (south) and keep your right arm pointing to your right. At this point you are at the equator, and your arm is pointing straight west.
Now, sidestep your way east along the equator until you reach the line of 90 degrees east longitude. Keep facing south, and keep your arm pointing to your right. At this point, your arm is still pointing west.
Now, walk backwards, north along the line of 90 degrees east longitude until you return to the North Pole. At this point, even though you have tried to always keep facing in the same direction, you are now facing south along the line of 90 degrees east longitude, and your arm is pointing to your right, along the line of 0 degrees longitude. So even though you tried not to rotate, your travel in the big three-leg journey has rotated you and your arm through 90 degrees.

In this case, the necessity of parallel transport is clearly not an artifact of using a bad choice of coordinate system. The rotation of the vector due to parallel transport is independent of any coordinate system. No matter what coordinate system you use, if it covers the entire journey, it will have to have nonzero connection coefficients.
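The three-leg journey can be reproduced with a few lines of code. This is a sketch that assumes a unit sphere embedded in 3D, with the North Pole at (0, 0, 1) and 0 degrees longitude along the x axis. Along each leg, which is a great-circle arc, the arm is carried by rotating it about the arc's axis; that embedding picture is used here only for convenience. The helper names are hypothetical.

```python
import math

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    return sum(x*y for x, y in zip(a, b))

def transport_along_arc(p, q, v):
    """Carry tangent vector v from p to q along the great-circle arc joining them
    (a rotation about the axis p x q, via the Rodrigues formula)."""
    axis = cross(p, q)
    s = math.sqrt(dot(axis, axis))      # sine of the arc angle (p, q are unit vectors)
    c = dot(p, q)                       # cosine of the arc angle
    k = tuple(x / s for x in axis)
    kxv, kv = cross(k, v), dot(k, v)
    return tuple(v[i]*c + kxv[i]*s + k[i]*kv*(1.0 - c) for i in range(3))

north_pole = (0.0, 0.0, 1.0)
equator_0 = (1.0, 0.0, 0.0)       # equator at 0 degrees longitude
equator_90e = (0.0, 1.0, 0.0)     # equator at 90 degrees east longitude

arm_start = (0.0, -1.0, 0.0)      # at the pole, pointing along 90 degrees west
arm = arm_start
legs = [(north_pole, equator_0),      # march south along 0 degrees
        (equator_0, equator_90e),     # sidestep east along the equator
        (equator_90e, north_pole)]    # walk backwards, north along 90 degrees east
for p, q in legs:
    arm = transport_along_arc(p, q, arm)

cos_angle = max(-1.0, min(1.0, dot(arm_start, arm)))
rotation_deg = math.degrees(math.acos(cos_angle))
print(arm, rotation_deg)
```

Back at the pole the arm lies along the x axis, i.e. along 0 degrees longitude, a quarter turn away from where it started.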

pervect · Apr 20, 2017

Maybe this will help. Suppose we consider a scalar field f, that's a function of position. To make it less abstract, we'll take an example - you're walking around the 2 dimensional surface of the park, which is a 2 dimensional manifold, and this scalar field f gives the height of the ground.

Then ##g_a = \nabla_a f## is a (dual) vector. It's a dual because it's got a lower subscript. Given that you have a metric tensor, the dual vector can be regarded as a map from a vector to a scalar. Hopefully this is all familiar and a review. If not, I suppose I should mention that the metric tensor, used to raise and lower indices, converts a vector into a dual vector - if we have some vector ##p^a##, the dual vector ##q_b = g_{ab} p^a##. And given a vector ##p^a## and a dual vector ##q_a##, the map from the pair ##(p^a, q_a)## to a scalar is just ##p^a q_a##. These facts together imply that a dual vector is a map from a vector to a scalar.

One more bit of notation. When I have a vector quantity that I write without indices, I'll put an arrow over it to show that it's a vector - i.e. ##\vec{p}##. But I'll also write it as ##p^a##, using indices. I'll write both vector and dual vector quantities with an arrow over them, at least I'll try to be consistent about my notation.

OK - I digress. We have the scalar field f, that's the height of the ground, and we have a dual vector field that gives the gradient of the height, the slope of the ground at every point, which is ##g_a = \nabla_a f##. f is a scalar, but ##\vec{g}## is a dual vector.

Now, if we are walking in some direction given by some vector of unit length, ##t^a## or ##\vec{t}##, t being a tangent vector, the height of the ground is changing at some rate with respect to the path, which is a scalar quantity ##g_a t^a##, ##g_a## being the slope of the ground, a (dual) vector, ##g_a = \nabla_a f##.

This rate of change of height is the derivative of the scalar field f in some direction ##\vec{t}##. It's a directional derivative. We can write this rate of change of the scalar quantity along the path as ##t^a \nabla_a f##, where ##t^a## is the tangent vector, or using directional derivative notation, ##\nabla_{\vec{t}} f##.

Repeating this exercise when f is not a scalar field, but some vector field ##u^a## instead, is what we need to do to understand the relationship of covariant derivatives to parallel transport. All the tensors gain a rank, ##\nabla_a u^b## is a rank 2 tensor, ##\nabla_{\vec{t}} u^b## is a rank 1 tensor, a vector. And the condition for parallel transport is just ##\nabla_{\vec{t}} u^b = 0##, where ##\vec{t}## is the tangent vector (a unit length vector) to our path. So it's similar to the case where we're walking around the ground and asking the slope, but it's different because we're not considering the "slope" of a scalar field, but the "slope" of a vector field.

PeterDonis · Apr 20, 2017

pervect said:

Given that you have a metric tensor, the dual vector can be regarded as a map from a vector to a scalar.

It's actually not necessary to have a metric tensor for this to be true; the operation of contracting a vector and a dual vector (which is what makes the dual vector a map from vectors to scalars) does not require a metric. A metric is only required if you want to establish a correspondence between vectors and dual vectors, i.e., to raise and lower indices.

Orodruin · Apr 20, 2017

pervect said:

Given that you have a metric tensor, the dual vector can be regarded as a map from a vector to a scalar.

You don't need a metric for this. A dual vector by definition maps a vector to a scalar. The metric gives a bilinear map from vectors to scalars by associating every vector with a corresponding dual vector.

pervect said:

These facts together imply that a dual vector is a map from a vector to a scalar.

This is part of the definition of a dual vector (the dual space of a vector space ##V## is the space of all linear functionals on ##V##).

pervect said:

Now, if we are walking in some direction given by some unit-length vector ##t^a## or ##\vec{t}##, t being a tangent vector, the height of the ground is changing at some rate with respect to the path, which is a scalar quantity ##g_a t^a##, ##g_a## being the slope of the ground, a (dual) vector, ##g_a = \nabla_a f##.

You will need a metric to claim that the vector is of unit length, which is not necessary for the argument that follows. The usual thing is to look at the tangent vector of a curve. The scalar quantity ##t^a \partial_a f## is then the rate of change in ##f## as the curve parameter increases.
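
A small numerical check of that last statement, as a sketch only: the height function `f`, the path `curve`, and all other names below are hypothetical choices made for illustration. The tangent used here is deliberately not of unit length, and no metric enters anywhere; the contraction ##t^a \partial_a f## is compared with a finite-difference derivative of ##f## with respect to the curve parameter.

```python
import math

def f(x, y):
    # Hypothetical height field: a single smooth hillock at the origin.
    return math.exp(-(x**2 + y**2))

def grad_f(x, y):
    # Components of the dual vector (df/dx, df/dy), computed analytically.
    h = f(x, y)
    return (-2.0 * x * h, -2.0 * y * h)

def curve(lam):
    # Hypothetical path through the park, parameterised by lam (not arc length).
    return (lam, 0.5 * lam**2)

def tangent(lam):
    # Tangent vector dx^a/dlam of the curve above; its length is not 1.
    return (1.0, lam)

def rate_by_contraction(lam):
    # t^a d_a f evaluated on the curve: no metric is needed for this.
    x, y = curve(lam)
    gx, gy = grad_f(x, y)
    tx, ty = tangent(lam)
    return gx * tx + gy * ty

def rate_by_finite_difference(lam, eps=1e-6):
    # Central difference of f along the curve with respect to lam.
    return (f(*curve(lam + eps)) - f(*curve(lam - eps))) / (2.0 * eps)

if __name__ == "__main__":
    lam = 0.7
    print(rate_by_contraction(lam), rate_by_finite_difference(lam))
```

The two numbers should agree to within the finite-difference error, whatever parameterisation of the curve is chosen.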

pervect · Apr 20, 2017

While we're cleaning up some technical points, one other issue bothers me. (I'm flashing back to Bill K, who I don't think is around much anymore, sadly.) The condition for parallel transport of a vector, ##t^a \nabla_a u^b = 0##, is straight from my text, Wald.

But actually ##\nabla_a u^b## is only defined when ##u^b## is a vector field, in the same way that ##\nabla_a f## requires f to be a scalar field.

The notion of directional derivative is actually somewhat superior, because if you want to know how fast f is changing along a path, you don't need f to be defined anywhere but on the path itself - it doesn't matter what the value of f is if you're not on the path, it doesn't even have to be defined. So, the directional derivative notation is actually a bit better, though I still am not terribly used to it. ##\nabla_{\vec{t}} f = 0## means that the path is level in the direction represented by the unit vector ##\vec{t}##, not changing the value of the scalar field f. (I say "level path", because in the analogy the scalar field f is the height of the path, something the flatlander walking along the path can't directly measure, but he knows when he's going uphill, it takes effort to climb).

##\nabla_{\vec{t}} u^a = 0## means that the vector ##\vec{u}## is parallel transported in the direction ##\vec{t}##, and we only need to define the vector on the path to write this, we don't need a vector field.

stevendaryl · Apr 20, 2017

pervect said:

While we're cleaning up some technical points, one other issue bothers me. (I'm flashing back to Bill K, who I don't think is around much anymore, sadly.) The condition for parallel transport of a vector, ##t^a \nabla_a u^b = 0##, is straight from my text, Wald.

But actually ##\nabla_a u^b## is only defined when ##u^b## is a vector field, in the same way that ##\nabla_a f## requires f to be a scalar field.

The notion of directional derivative is actually somewhat superior, because if you want to know how fast f is changing along a path, you don't need f to be defined anywhere but on the path itself - it doesn't matter what the value of f is if you're not on the path, it doesn't even have to be defined. So, the directional derivative notation is actually a bit better, though I still am not terribly used to it. ##\nabla_{\vec{t}} f = 0## means that the path is level, not changing the value of the scalar field f. (I say "level path", because in the analogy the scalar field f is the height of the path, something the flatlander walking along the path can't directly measure, but he knows when he's going uphill, it takes effort to climb).

##\nabla_{\vec{t}} u^a = 0## means that the vector u is parallel transported, and we only need to define the vector on the path to write this, we don't need a vector field.

Yeah, that's an annoying point about parallel transport, it seems like it is an operation on vector fields, but it applies to a vector defined along a single path. The curvature tensor is in some sense even more misleading: If you look at the definition, it seems to apply to vector fields (due to the appearance of the covariant derivative), but it turns out that only the value of the vector fields at a single point is relevant.

Orodruin · Apr 20, 2017

A priori, it is the vector field that is parallel along a curve. However, just as the torsion and curvature only depend on the value of the field at a single point, any vector fields that take the same values along the curve will give the same results for ##\nabla_T X##, where ##T## is the curve tangent. You can therefore happily extend the definition of the vector along the curve to a neighbourhood of the curve and apply the covariant derivative to that field. (Or talk about equivalence classes of vector fields - two vector fields being in the same equivalence class if they are equal along the curve - if you are inclined to that sort of thing.)
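
One way to see why only the values along the curve matter, sketched with a local frame ##e_b## (an assumption introduced here, not part of the post): let ##X## and ##X'## agree along the curve ##\gamma(\lambda)## and set ##Y = X - X' = Y^b e_b##, so the component functions ##Y^b## vanish on the curve. Linearity and the Leibniz rule give

$$
\nabla_T X - \nabla_T X' = \nabla_T Y = T(Y^b)\, e_b + Y^b\, \nabla_T e_b , \qquad T(Y^b) = \frac{d}{d\lambda} Y^b(\gamma(\lambda)) = 0 .
$$

Both terms vanish on the curve: the second because ##Y^b = 0## there, the first because ##T(Y^b)## only differentiates ##Y^b## along the curve, where it is identically zero.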

joneall · Apr 23, 2017

This is seeming a lot clearer to me now.

pervect said:

Maybe this will help. ... We have the scalar field f, that's the height of the ground, and we have a dual vector field that gives the gradient of the height, the slope of the ground at every point, which is ##g_a = \nabla_a f##. f is a scalar, but ##\vec{g}## is a dual vector.

Now, if we are walking in some direction given by some unit-length vector ##t^a## or ##\vec{t}##, t being a tangent vector, the height of the ground is changing at some rate with respect to the path, which is a scalar quantity ##g_a t^a##, ##g_a## being the slope of the ground, a (dual) vector, ##g_a = \nabla_a f##.

This rate of change of height is the derivative of the scalar field f in some direction ##\vec{t}##. It's a directional derivative. We can write this rate of change of the scalar quantity along the path as ##t^a \nabla_a f##, where ##t^a## is the tangent vector, or using directional derivative notation, ##\nabla_{\vec{t}} f##.

Repeating this exercise when f is not a scalar field, but some vector field ##u^a## instead, is what we need to do to understand the relationship of covariant derivatives to parallel transport. All the tensors gain a rank: ##\nabla_a u^b## is a rank 2 tensor, ##\nabla_{\vec{t}} u^b## is a rank 1 tensor, i.e. a vector. And the condition for parallel transport is just ##\nabla_{\vec{t}} u^b = 0##, where ##\vec{t}## is the tangent vector (a unit length vector) to our path. So it's similar to the case where we're walking around the ground and asking the slope, but it's different because we're not considering the "slope" of a scalar field, but the "slope" of a vector field.

Very good. But... in the first case, we consider in fact the projection of the gradient of the height on the path (or its tangent, same thing). Since the height is a scalar field, this is all we need to know. But when we extrapolate from a scalar h to a vector ##u^b## and make the projection of that on the tangent to the path be zero, we are ignoring the vector's components orthogonal to the tangent of the path. The vector could be rotating about the path and that hardly seems what I would call parallel.

I suspect I am still missing something and you will clear it up for me. Sorry for always talking about components. It's the easiest way I can see it. And since in GR (or at least in the books I have read) we are always talking about Riemannian manifolds which have a symmetric metric, that's ok. It's also much more intuitive. I don't doubt at all that doing the math first and then identifying certain quantities with physical objects is the best way to go -- as someone else has said. But I don't know how to go about learning that math without getting myself into one more big book and months of study. I am open to suggestions, tho.

Orodruin · Apr 23, 2017

joneall said:

Very good. But... in the first case, we consider in fact the projection of the gradient of the height on the path (or its tangent, same thing). Since the height is a scalar field, this is all we need to know. But when we extrapolate from a scalar h to a vector ##u^b## and make the projection of that on the tangent to the path be zero, we are ignoring the vector's components orthogonal to the tangent of the path. The vector could be rotating about the path and that hardly seems what I would call parallel.

This is not really true. You have a rank two tensor and your "projection" is contracting the derivative index with the tangent vector one - not the one of the transported vector.

I suspect I am still missing something and you will clear it up for me. Sorry for always talking about components. It's the easiest way I can see it. And since in GR (or at least in the books I have read) we are always talking about Riemannian manifolds which have a symmetric metric, that's ok.

A bit of nitpicking: GR deals with pseudo-Riemannian manifolds or, even more accurately, Lorentzian manifolds. A Riemannian manifold has a positive definite metric, a pseudo-Riemannian one replaces the requirement of being positive definite with being non-degenerate, and a Lorentzian manifold has an n+1 signature.

joneall · Apr 23, 2017

Orodruin said:

This is not really true. You have a rank two tensor and your "projection" is contracting the derivative index with the tangent vector one - not the one of the transported vector.

Ok, but in that case, I don't understand how I should see it.

A bit of nitpicking: GR deals with pseudo-Riemannian manifolds or, even more accurately, Lorentzian manifolds. A Riemannian manifold has a positive definite metric, a pseudo-Riemannian one replaces the requirement of being positive definite with being non-degenerate, and a Lorentzian manifold has an n+1 signature.

I knew it was pseudo-Riemannian but have not heard about the Lorentzian.

You are very careful to keep the mathematical statements correct. Perhaps you could indicate to me a book for learning that myself. I'd be grateful. Is Schutz's "Geometrical methods of mathematical physics" good, for instance? Or is it not necessary for what I am trying to understand?

stevendaryl · Apr 23, 2017

joneall said:

Ok, but in that case, I don't understand how I should see it.

I knew it was pseudo-Riemannian but have not heard about the Lorentzian.

You are very careful to keep the mathematical statements correct. Perhaps you could indicate to me a book for learning that myself. I'd be grateful. Is Schutz's "Geometrical methods of mathematical physics" good, for instance? Or is it not necessary for what I am trying to understand?

After all the discussion in this thread, do you think you could summarize what you still don't grasp about parallel transport?

Orodruin · Apr 23, 2017

joneall said:

Ok, but in that case, I don't understand how I should see it.

Once contracted with the tangent vector, what you obtain is a directional derivative of a vector in the tangent direction - which is again a vector. However, what this directional derivative "means" depends on your affine connection.
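
Written out in two dimensions (a sketch; the index labels 1 and 2, and the metric needed to lower the index on ##t_b##, are introduced here only for illustration), the free index ##b## means that parallel transport is one equation per component of the transported vector:

$$
t^a \nabla_a u^1 = 0, \qquad t^a \nabla_a u^2 = 0 .
$$

By contrast, a single scalar condition like ##t_b\, t^a \nabla_a u^b = 0## would constrain only the part of the derivative along the tangent and would leave its orthogonal part unconstrained. That is not the parallel transport condition.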

joneall said:

You are very careful to keep the mathematical statements correct. Perhaps you could indicate to me a book for learning that myself. I'd be grateful. Is Schutz's "Geometrical methods of mathematical physics" good, for instance? Or is it not necessary for what I am trying to understand?

I have no experience of Schutz's book. I started with Nakahara's "Geometry, Topology and Physics", but there are several books that will include this material and Nakahara is probably overkill if all you want to do is to understand parallel transport. I would suggest looking at parallel transport as a vector (or more generally, a tensor) not changing along a path - with the affine connection defining what "not changing" means.

joneall · Apr 23, 2017

stevendaryl said:

After all the discussion in this thread, do you think you could summarize what you still don't grasp about parallel transport?

I tried to in the last message. I know this is dragging out and I am sorry about that. I will soon give up.

I don't see why setting the directional derivative of a vector in the tangent direction to zero is equivalent to parallel transport, dragging a vector along a path in such a way as to keep it parallel to itself. I'm not even sure that "to itself" is correct, since I'm not sure how you define parallelism in curved coordinates. It has been stated that this is the definition of parallel transport. But that does not say what its physical significance is or why we do it.

I have studied advanced classical mechanics and quantum mechanics and special relativity (and have a very old and rather out-of-date PhD in particle physics), but never found a field where the math seems so divorced from what we see as in GR.

Orodruin · Apr 23, 2017

joneall said:

I don't see why setting the directional derivative of a vector in the tangent direction to zero is equivalent to parallel transport,

It is the definition of parallel transport.

joneall said:

I'm not even sure that "to itself" is correct, since I'm not sure how you define parallelism in curved coordinates.

This is part of the point as well - there is no unique way of defining what it means for vectors at different points in the manifold to be "the same". For nearby points, this is what the affine connection does for you - it defines what it means to "not change" and there are several possibilities for doing this. Then there is a particular way of doing it if you have a metric - the Levi-Civita connection, which in some sense is the "intuitive" notion of parallel transport you would have on a sphere.
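
To make the sphere case concrete, here is a minimal numerical sketch. It assumes the unit sphere with coordinates ##(\theta, \phi)##, metric ##ds^2 = d\theta^2 + \sin^2\theta\, d\phi^2## and its Levi-Civita connection (##\Gamma^\theta{}_{\phi\phi} = -\sin\theta\cos\theta##, ##\Gamma^\phi{}_{\theta\phi} = \cot\theta##), and transports a vector once around a circle of constant ##\theta##. The function names and the choice of a Runge-Kutta integrator are illustrative, not taken from the thread.

```python
import math

def transport_around_latitude(theta0, u0, steps=2000):
    """Parallel transport u = (u^theta, u^phi) once around theta = theta0.

    The curve is phi = lam, lam in [0, 2*pi], so the tangent is t = (0, 1).
    Integrates du^b/dlam = -Gamma^b_{ac} t^a u^c with a classical RK4 scheme.
    """
    s, c = math.sin(theta0), math.cos(theta0)

    def rhs(u):
        u_th, u_ph = u
        # -Gamma^theta_{phi phi} u^phi  and  -Gamma^phi_{phi theta} u^theta
        return (s * c * u_ph, -(c / s) * u_th)

    h = 2.0 * math.pi / steps
    u = u0
    for _ in range(steps):
        k1 = rhs(u)
        k2 = rhs((u[0] + 0.5 * h * k1[0], u[1] + 0.5 * h * k1[1]))
        k3 = rhs((u[0] + 0.5 * h * k2[0], u[1] + 0.5 * h * k2[1]))
        k4 = rhs((u[0] + h * k3[0], u[1] + h * k3[1]))
        u = (u[0] + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0,
             u[1] + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0)
    return u

def length(theta0, u):
    # Length of u measured with the sphere metric at colatitude theta0.
    return math.sqrt(u[0] ** 2 + (math.sin(theta0) * u[1]) ** 2)

if __name__ == "__main__":
    theta0 = math.pi / 3
    u_final = transport_around_latitude(theta0, (1.0, 0.0))
    print(u_final, length(theta0, u_final))
```

Along the equator (##\theta_0 = \pi/2##) the vector returns unchanged. At ##\theta_0 = \pi/3## it returns with its length preserved but pointing the opposite way relative to the local basis, even though its covariant derivative along the path was zero at every step. Which answer you get depends on the path, which is the sense in which "the same vector" at different points has no unique meaning.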

joneall · Apr 23, 2017

Thank you, all. I seem to be fairly comfortable with the subject now.
