Covariant Derivative

1. Nov 22, 2005

John_Doe

The covariant derivative is
$$A^\mu_{\sigma} = \frac{\partial A^\mu}{\partial x_{\sigma}} + \Gamma^\mu_{\sigma \alpha}A^\alpha$$
... why?

2. Nov 22, 2005

matt grime

It's a definition (or at least that is the implication from your phrasing); surely its only necessary justification is "because it works"?

3. Nov 22, 2005

John_Doe

Nonsense! I think that it is deduced by using 'parallel displacements', whatever that means.

4. Nov 22, 2005

HallsofIvy

The "ordinary" derivative of a tensor, $$\frac{\partial A^\mu}{\partial x_{\sigma}}$$ is NOT a tensor: it will have components that do not transform, under a change of coordinates, the way tensor components must.
$$\Gamma^\mu_{\sigma \alpha}A^\alpha$$ "subtracts off" those components so that we have a tensor.
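
To make this concrete, here is a sketch of the standard calculation, assuming a smooth change of coordinates $$x \rightarrow x'$$ (the primed symbols are labels introduced here, not part of the original post). The components transform as $$A'^{\mu} = \frac{\partial x'^{\mu}}{\partial x^{\nu}} A^{\nu}$$, so by the chain and product rules

$$\frac{\partial A'^{\mu}}{\partial x'^{\sigma}} = \frac{\partial x^{\rho}}{\partial x'^{\sigma}} \frac{\partial x'^{\mu}}{\partial x^{\nu}} \frac{\partial A^{\nu}}{\partial x^{\rho}} + \frac{\partial x^{\rho}}{\partial x'^{\sigma}} \frac{\partial^2 x'^{\mu}}{\partial x^{\rho} \partial x^{\nu}} A^{\nu}$$

The first term is the tensor transformation law. The second term is the part that does not transform correctly; it vanishes only when the second derivatives of the coordinate change vanish, e.g. for linear changes of coordinates. The $$\Gamma$$ term is built to pick up an equal and opposite piece under the same change of coordinates, so the sum transforms as a tensor.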

5. Nov 22, 2005

dextercioby

Write it correctly.

$$\nabla_{\sigma}A^{\mu}:=\partial_{\sigma}A^{\mu}+\Gamma^{\mu}{}_{\sigma\lambda} A^{\lambda}$$

Well, physicists in GR invoke the way a vector field should behave under general transformations of coordinates...

In abstract differential geometry one needs a bit more...

Daniel.

6. Nov 22, 2005

Hurkyl

Staff Emeritus
John: There's a big difference between "Why is this true?" and "What would compel anyone to do things that way?" The answer to the former is "by definition", so I think what you really want to know is the latter.

7. Nov 22, 2005

John_Doe

$$\frac{df}{dx} = \lim_{\Delta x \rightarrow 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}$$
Should be used to deduce the derivative.

8. Nov 22, 2005

BerkMath

I agree, John Doe! And only +, -, *, / should be used in the definition of a limit.

9. Nov 22, 2005

matt grime

Fortunately, John, maths is not compelled to stick only with what you know and are comfortable with. Why should ordinary derivatives even be *defined* in the way you want them to be? It is just nomenclature, tough if you don't like it. Someday you might even get to invent some yourself (that catches on) when you can correct what you perceive as an egregious abuse of language. Until then remember it is just a name for something.

Hurkyl's post is bang on the money. The question ought to be 'why did they pick the word derivative to describe that operator', and I have no idea at all since I know nothing about tensor analysis in this sense.

If I were to add together two integers a+b we know what I'm talking about. If I were to add together two rationals a/b+c/d the answer is (ad+bc)/(bd). Why is that addition? Why isn't the addition (a+c)/(b+d)? We extend the definition of addition so that it is consistent and does what it ought to. The same here for derivatives, I imagine, as HallsofIvy's answer indicates.

10. Nov 22, 2005

John_Doe

However, the addition of two rationals can be deduced through a logically consistent series of steps based on the former definition of addition. Otherwise, how else could we deduce the addition of two rationals?

The same argument applies to my derivative problem.

11. Nov 22, 2005

matt grime

This is a question about the philosophy of mathematics it appears, and as such it has no necessarily correct response.

You (whether you are aware of it or not, or care) want things to be innate and canonical, i.e. for the covariant derivative to be something we discover. That presupposes that there is such a thing as a covariant derivative that you want to look for and that it is intrinsic.

Most mathematics is invented to fit a purpose. We look at tensors, notice that derivatives don't behave properly and then we *invent* the corrected version since we want to take derivatives. This definition has become the one we use (there might not be any others, admittedly) because it is the one that does what we want in, presumably, the most elementary way. I.e. by investigating we can find something that behaves as we want. I'm sure one can invent others by multiplying the gamma by some function.

There is for instance a perfectly valid argument that states the derivative of a function ought not to be what you think but instead the Schwarzian derivative, just as there are many different methods of integration.

12. Nov 22, 2005

John_Doe

And for what reasons does
$$\frac{df}{dx} = \lim_{\Delta x \rightarrow 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}$$
fail to deduce the derivative as per the conventional usage of the word?

13. Nov 22, 2005

StatusX

I don't think you guys are being fair to John_Doe. He's clearly asking what motivates the definition. It didn't just come out of thin air: some mathematician figured out the natural way to extend derivatives to manifolds, and this process (likely involving parallel transport) gives rise to the formula in question. I only know of the covariant derivative from GR, and so the definition was only motivated in my textbook by reference to four-velocity along geodesics, and so probably won't help. But I think someone else can give a good explanation for why this is the formula.

14. Nov 22, 2005

matt grime

As was explained to you, twice I think, it doesn't transform covariantly. Unless you're talking about the Schwarzian derivative. Again, I think you're overlooking the fact that for a curve, $$f: \mathbb{R} \to \mathbb{R}$$, the derivative is the gradient of the tangent when it exists, and that is the limit of the above ratio, and that is the definition of the derivative. (One doesn't deduce, or fail to, the derivative from a formula.)

There is a good reason, if you're into dynamical systems, to think that that is not the best idea for a derivative, and that we should look at the Schwarzian derivative. It isn't the gradient of the tangent, it doesn't give the local rate of change, but it gives some other quantity that is more natural for the purposes of dynamical systems. But it's just a definition that works.

Similarly in GR it appears that we need to take a better definition of something to get a covariantly behaved differential operator. I am sure that the one given is not the only covariant differential operator: the space of derivations is usually quite big. But it has become clear that it is the one you want to consider, possibly because it is the simplest, or the most natural in some 'metric'. It didn't come with the label 'covariant derivative' on the bottom automatically (this is a general mathematical observation), but it is in some sense the best notion of what a covariant derivative ought to be.

15. Nov 22, 2005

matt grime

16. Nov 22, 2005

matt grime

And incidentally, what if one cannot take limits? One can differentiate things without taking limits in algebraic settings. The derivative of det is tr even if you're thinking in a field that is not metrized.
There is a problem when one conflates the name for a property with that property.
Perhaps it was just my glib 'that is the covariant derivative because we say it is' answer that is causing the problem? Why it is the correct thing to think about is a different matter entirely, but that doesn't stop what I said being correct (if glib) and certainly not nonsense.
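
The det/tr remark can be checked without any limit. A minimal sketch for 2x2 matrices, where the entries a, b, c, d and the formal variable t are labels chosen here for illustration:

$$\det \begin{pmatrix} 1 + ta & tb \\ tc & 1 + td \end{pmatrix} = 1 + t(a + d) + t^2 (ad - bc)$$

The coefficient of t is the trace, a + d. Reading off that coefficient is a purely algebraic operation, so this 'derivative of det at the identity' makes sense over any field, with no metric or notion of convergence needed.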

17. Nov 22, 2005

HallsofIvy

I answered that at the beginning. Using that definition, for the derivative of a tensor, yields a quantity that is not a tensor. It is necessary to remove parts that do not transform correctly. That is what that additional term is for.

18. Nov 22, 2005

matt grime

Notice that the definition you gave of derivative doesn't even define derivatives on Euclidean space, i.e. it is impossible to recover Df when f is a function from R^n to R^m.

19. Nov 22, 2005

George Jones

Staff Emeritus
First, you should think carefully about what the other posters have said.

In post #7, you define the derivative of a function as a limit. This limit results in another function. Contrast this with your original post, where covariant differentiation changes a vector to a second rank tensor.

The covariant directional derivative of a vector results in another vector, and can be defined in terms of parallel transport and limits. If you want covariant directional derivatives to be defined in terms of parallel transport, then parallel transport must first be defined. This can be done, but just as the definition of covariant differentiation (at first) seems non-intuitive, the definition of parallel transport may seem non-intuitive.

The covariant derivative of $$A$$ in the direction of $$V$$ is given by

$$V^{\sigma} \nabla_{\sigma}A^{\mu} = V^{\sigma} \left( \partial_{\sigma}A^{\mu} + \Gamma^{\mu}{}_{\sigma\lambda} A^{\lambda} \right).$$

If $$\gamma$$ is a curve, then a notion of parallel transport along $$\gamma$$ is defined by a collection of invertible linear maps, one for every pair of points in the image of the curve, such that for every $$a$$, $$b$$, and $$c$$ in the curve's image:

$$\Gamma_{a,b}: T_{a} M \rightarrow T_{b} M$$;

$$\Gamma_{b,c} \Gamma_{a,b} = \Gamma_{a,c}$$;

suitable smoothness conditions are satisfied.

This is a fairly general notion of parallel transport - you are probably more interested in metric-compatible parallel transport.

Once a notion of parallel transport is chosen, differentiation can be defined as

$$\nabla_{V}A = \lim_{\epsilon \rightarrow 0} \frac{1}{\epsilon} \left[ A_{||} \left( s + \epsilon \right) - A \left( s \right) \right]$$,

where $$V$$ is the tangent vector to $$\gamma$$ at curve parameter $$s$$, and where $$A_{||} \left( s + \epsilon \right)$$ means $$A \left( s + \epsilon \right)$$ parallel transported to $$\gamma \left( s \right)$$.
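
As a sketch of how this limit reproduces the component formula above, assume (this is an extra assumption, not part of the general definition) that transport along $$\gamma$$ is governed by coefficients $$\Gamma^{\mu}{}_{\sigma\lambda}$$ through $$\frac{dA^{\mu}}{ds} = -\Gamma^{\mu}{}_{\sigma\lambda} V^{\sigma} A^{\lambda}$$ for a parallel field. Carrying $$A(s+\epsilon)$$ back to $$\gamma(s)$$ then gives, to first order in $$\epsilon$$,

$$A_{||}^{\mu}(s+\epsilon) = A^{\mu}(s+\epsilon) + \epsilon \Gamma^{\mu}{}_{\sigma\lambda} V^{\sigma} A^{\lambda} + O(\epsilon^2)$$

and $$A^{\mu}(s+\epsilon) - A^{\mu}(s) = \epsilon V^{\sigma} \partial_{\sigma} A^{\mu} + O(\epsilon^2)$$. Substituting both into the limit:

$$\left( \nabla_{V} A \right)^{\mu} = V^{\sigma} \left( \partial_{\sigma}A^{\mu} + \Gamma^{\mu}{}_{\sigma\lambda} A^{\lambda} \right)$$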

Regards,
George

Last edited: Nov 22, 2005
20. Nov 22, 2005

pervect

Staff Emeritus
Just to add my \$.02, parallel transport can be defined in a fairly intuitive way if one has a metric. The basic idea is that two vectors are parallel if, when you join head to head and tail to tail by two new vectors, the resulting structure is a parallelogram.

To illustrate this, look at the picture below.

a----->b

c----->d

We say that the vector (a->b) and the vector (c->d) are parallel if and only if

(a->b) has the same length as (c->d)
(a->c) has the same length as (b->d)

This is because a parallelogram has the same length on opposing sides by definition.

Mathematically this definition is a little sloppy, but I think it gives a good intuitive picture.

A slightly more elaborate construction of this general form is known as "Schild's ladder", and it defines a purely geometric way to do parallel transport. Schild's ladder requires only that one be able to find the midpoint of a path.

While other approaches exist, I think that starting with the metric, defining parallel transport geometrically by Schild's ladder, and using this to define the covariant derivative is one of the more intuitive approaches.

Onto the covariant derivative (in the same geometric and slightly sloppy manner):

Suppose we have a vector field, V, which associates a vector V with any point. We want to ask "How fast is the vector field changing at point P?"

To take the limit, we must specify a direction in which we want to compute the rate of change of the vector field. This is the vector 'U' that exists at point P.

Because we are being sloppy, we conflate this direction with an actual curve that moves in this direction.

So we take the vector V at point P, and subtract it from the parallel-transported vector V' at $$P+\epsilon U$$. We divide the result by epsilon. The result is a new vector.

i.e.

$$\lim_{\epsilon \rightarrow 0} \frac{V'(P+\epsilon U) - V(P)}{\epsilon}$$

In order to compute this difference, though, we have to be able to "move" a vector from one point P to a nearby point P'. The ability to do this was what we defined earlier, with the notion of "parallel transport".
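
A numerical sketch of why the extra term matters, under assumptions that are not from the thread: the space is the flat plane in polar coordinates (r, theta), whose nonzero Christoffel symbols are Gamma^r_{theta theta} = -r and Gamma^theta_{r theta} = Gamma^theta_{theta r} = 1/r, and the field is the constant Cartesian vector e_x written in polar components. The function names and the step size h are illustrative choices. A constant field should have zero rate of change, yet its partial derivatives in polar components are nonzero; adding the Gamma term brings them back to zero.

```python
import math

def christoffel(r, theta):
    """Gamma[mu][sigma][lam] for the flat plane in polar coordinates.
    Index 0 is r, index 1 is theta. (Assumed setting, chosen for illustration.)"""
    gamma = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    gamma[0][1][1] = -r        # Gamma^r_{theta theta}
    gamma[1][0][1] = 1.0 / r   # Gamma^theta_{r theta}
    gamma[1][1][0] = 1.0 / r   # Gamma^theta_{theta r}
    return gamma

def field(r, theta):
    """Polar components (A^r, A^theta) of the constant Cartesian field e_x."""
    return [math.cos(theta), -math.sin(theta) / r]

def partial_derivative(r, theta, h=1e-6):
    """Return d with d[sigma][mu] = partial_sigma A^mu (central differences)."""
    point = [r, theta]
    d = []
    for sigma in range(2):
        plus = list(point)
        minus = list(point)
        plus[sigma] += h
        minus[sigma] -= h
        a_plus = field(*plus)
        a_minus = field(*minus)
        d.append([(a_plus[mu] - a_minus[mu]) / (2 * h) for mu in range(2)])
    return d

def covariant_derivative(r, theta):
    """Return nabla with nabla[sigma][mu] = partial_sigma A^mu + Gamma^mu_{sigma lam} A^lam."""
    gamma = christoffel(r, theta)
    a = field(r, theta)
    d = partial_derivative(r, theta)
    return [
        [d[sigma][mu] + sum(gamma[mu][sigma][lam] * a[lam] for lam in range(2))
         for mu in range(2)]
        for sigma in range(2)
    ]

if __name__ == "__main__":
    print("partial derivatives:  ", partial_derivative(2.0, 0.7))
    print("covariant derivatives:", covariant_derivative(2.0, 0.7))
```

At a sample point such as r = 2, theta = 0.7, the partial derivatives come out visibly nonzero while every covariant component is zero up to finite-difference error, which is the behaviour one wants from a derivative of a constant field.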

Last edited: Nov 22, 2005