Covariant Derivative: A^μ_σ Definition & Use

John_Doe · Nov 22, 2005

The covariant derivative is
[tex]A^\mu_{\sigma} = \frac{\partial A^\mu}{\partial x_{\sigma}} + \Gamma^\mu_{\sigma \alpha}A^\alpha[/tex]
... why?

matt grime · Nov 22, 2005

It's a definition (or at least that is the implication from your phrasing); surely its only necessary justification is "because it works"?

John_Doe · Nov 22, 2005

Nonsense! I think that it is deduced by using 'parallel displacements', whatever that means.

HallsofIvy · Nov 22, 2005

The "ordinary" derivative of a tensor, [tex]\frac{\partial A^\mu}{\partial x_{\sigma}}[/tex], is NOT a tensor: it will have components that do not transform, on coordinate change, the way tensor components must.
[tex]\Gamma^\mu_{\sigma \alpha}A^\alpha[/tex] "subtracts off" those components so that we have a tensor.
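
A small numerical check may make this concrete. The sketch below is an illustration only, not something taken from the posts above. It assumes plane polar coordinates (r, θ) with the standard Christoffel symbols Γ^r_θθ = -r and Γ^θ_rθ = Γ^θ_θr = 1/r, and it uses the constant Cartesian field e_x, whose polar components are A^r = cos θ and A^θ = -sin θ / r. The function names (`christoffel`, `field`, `partial`, `covariant_derivative`) are hypothetical. The partial derivatives of the components are nonzero even though the field is constant, and adding the Γ term cancels them.

```python
import math

def christoffel(r, theta):
    """Gamma[mu][sigma][lam] for plane polar coordinates (index 0 = r, 1 = theta).

    Assumed standard values: Gamma^r_{theta theta} = -r,
    Gamma^theta_{r theta} = Gamma^theta_{theta r} = 1/r, all others zero.
    """
    G = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    G[0][1][1] = -r
    G[1][0][1] = 1.0 / r
    G[1][1][0] = 1.0 / r
    return G

def field(r, theta):
    """Polar components [A^r, A^theta] of the constant Cartesian field e_x."""
    return [math.cos(theta), -math.sin(theta) / r]

def partial(f, x, sigma, h=1e-6):
    """Central-difference estimate of d f^mu / d x^sigma at the point x."""
    xp, xm = list(x), list(x)
    xp[sigma] += h
    xm[sigma] -= h
    fp, fm = f(*xp), f(*xm)
    return [(a - b) / (2 * h) for a, b in zip(fp, fm)]

def covariant_derivative(f, x, h=1e-6):
    """result[sigma][mu] = d_sigma A^mu + Gamma^mu_{sigma lam} A^lam."""
    G = christoffel(*x)
    A = f(*x)
    result = [[0.0, 0.0], [0.0, 0.0]]
    for sigma in range(2):
        dA = partial(f, x, sigma, h)
        for mu in range(2):
            correction = sum(G[mu][sigma][lam] * A[lam] for lam in range(2))
            result[sigma][mu] = dA[mu] + correction
    return result
```

Calling `partial(field, [2.0, 0.7], 1)` gives a clearly nonzero θ-derivative of the components, while every entry of `covariant_derivative(field, [2.0, 0.7])` comes out as zero up to finite-difference error.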

dextercioby · Nov 22, 2005

Write it correctly.

[tex] \nabla_{\sigma}A^{\mu}=:\partial_{\sigma}A^{\mu}+\Gamma^{\mu}{}_{\sigma\lambda} A^{\lambda} [/tex]

Well, physicists in GR invoke the way a vector field should behave under general transformations of coordinates...

In abstract differential geometry one needs a bit more...

Daniel.

Hurkyl · Nov 22, 2005

John: There's a big difference between "Why is this true?" and "What would compel anyone to do things that way?" The answer to the former is "by definition", so I think what you really want to know is the latter.

John_Doe · Nov 22, 2005

[tex]\frac{df}{dx} = \lim_{\Delta x \rightarrow 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}[/tex]
Should be used to deduce the derivative.

BerkMath · Nov 22, 2005

I agree, John Doe! And only +, -, *, / should be used in the definition of a limit.

matt grime · Nov 22, 2005

Fortunately, John, maths is not compelled to stick only with what you know and are comfortable with. Why should ordinary derivatives even be *defined* in the way you want them to be? It is just nomenclature, tough if you don't like it. Someday you might even get to invent some yourself (that catches on) when you can correct what you perceive as an egregious abuse of language. Until then remember it is just a name for something.

Hurkyl's post is bang on the money. The question ought to be 'why did they pick the word derivative to describe that operator', and I have no idea at all since I know nothing about tensor analysis in this sense.

If I were to add together two integers a+b we know what I'm talking about. If I were to add together two rationals a/b+c/d the answer is (ad+bc)/bd. Why is that addition? Why isn't the addition (a+c)/(b+d)? We extend the definition of addition so that it is consistent and does what it ought to. The same here for derivatives I imagine, as hallsofivy's answer indicates.

John_Doe · Nov 22, 2005

However, the addition of two rationals can be deduced through a logically consistent series of steps based on the former definition of addition. Otherwise, how else could we deduce the addition of two rationals?

The same argument applies to my derivative problem.

matt grime · Nov 22, 2005

This is a question about the philosophy of mathematics it appears, and as such it has no necessarily correct response.

You (whether you are aware of it or not, or care) want things to be innate and canonical, ie for that to be the covariant derivative as something we discover. That presupposes that there is such a thing as a covariant derivative that you want to look for and that it is intrinsic.

Most mathematics is invented to fit a purpose. We look at tensors, notice that derivatives don't behave properly and then we *invent* the corrected version since we want to take derivatives. This definition has become the one we use (there might not be any others, admittedly) because it is the one that does what we want in, presumably, the most elementary way. I.e., by investigating we can find something that behaves as we want. I'm sure one can invent others by multiplying the gamma by some function.

There is, for instance, a perfectly valid argument that states the derivative of a function ought not to be what you think but instead the Schwarzian derivative, just as there are many different methods of integration.

John_Doe · Nov 22, 2005

And for what reasons does
[tex]\frac{df}{dx} = \lim_{\Delta x \rightarrow 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}[/tex]
fail to deduce the derivative as per the conventional usage of the word?

StatusX · Nov 22, 2005

I don't think you guys are being fair to johndoe. He's clearly asking what motivates the definition. It didn't just come out of thin air; some mathematician figured out the natural way to extend derivatives to manifolds, and this process (likely involving parallel transport) gives rise to the formula in question. I only know of the covariant derivative from GR, and so the definition was only motivated in my textbook by reference to four-velocity along geodesics, and so probably won't help. But I think someone else can give a good explanation for why this is the formula.

matt grime · Nov 22, 2005

John_Doe said:

And for what reasons does
[tex]\frac{df}{dx} = \lim_{\Delta x \rightarrow 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}[/tex]
fail to deduce the derivative as per the conventional usage of the word?

As was explained to you, twice I think, it doesn't transform covariantly. Unless you're talking about the Schwarzian derivative. Again, I think you're overlooking the fact that for a curve, [itex]f: \mathbb{R} \to \mathbb{R}[/itex], the derivative is the gradient of the tangent when it exists, and that is the limit of the above ratio, and that is the definition of the derivative. (One doesn't deduce, or fail to, the derivative from a formula.)

There is a good reason, if you're into dynamical systems, to think that that is not the best idea for derivative, and that we should look at the Schwarzian derivative. It isn't the gradient of the tangent, it doesn't give the local rate of change, but it gives some other quantity that is more natural for the purposes of dynamical systems. But it's just a definition that works.

Similarly in GR it appears that we need to take a better definition of something to get a covariantly behaved differential operator. I am sure that the one given is not the only covariant differential operator: the space of derivations is usually quite big. But it has become clear that it is the one you want to consider, possibly because it is the simplest, or the most natural in some 'metric'. It didn't come with the label 'covariant derivative' on the bottom automatically (this is a general mathematical observation), but it is in some sense the best notion of what a covariant derivative ought to be.

matt grime · Nov 22, 2005

StatusX said:

I don't think you guys are being fair to johndoe. He's clearly asking what motivates the definition.

It has been explained at length by more informed people than me what motivates this choice. However, read hurkyl's post.

He explicitly asked how to deduce that this is the covariant derivative as one extends the notion of addition to rationals from the integers.

Deduce is the wrong word, and perhaps that is the problem.

matt grime · Nov 22, 2005

And incidentally, what if one cannot take limits? One can differentiate things without taking limits in algebraic settings. The derivative of det is tr even if you're thinking in a field that is not metrized.
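
For instance (a sketch only; the dual numbers, with ε² = 0, are chosen here as one illustrative algebraic setting): for any square matrix X over a commutative ring,

[tex]
\det(I + \epsilon X) = 1 + \epsilon \, \mathrm{tr}(X), \qquad \epsilon^{2} = 0,
[/tex]

so the coefficient of ε, which plays the role of the derivative of det at the identity in the direction X, is tr(X), and no limit has been taken.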
There is a problem when one conflates the name for a property with that property.
Perhaps it was just my glib answer, 'that is the covariant derivative because we say it is', that is causing the problem? Why it is the correct thing to think about is a different matter entirely, but that doesn't stop what I said being correct (if glib) and certainly not nonsense.

HallsofIvy · Nov 22, 2005

John_Doe said:

And for what reasons does
[tex]\frac{df}{dx} = \lim_{\Delta x \rightarrow 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}[/tex]
fail to deduce the derivative as per the conventional usage of the word?

I answered that at the beginning. Using that definition, for the derivative of a tensor, yields a quantity that is not a tensor. It is necessary to remove parts that do not transform correctly. That is what that additional term is for.
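
To see the offending part explicitly, here is a sketch of the standard computation. The primed coordinates [itex]x'^{\mu}[/itex] and the upper-index placement are notation introduced here for illustration. A vector's components transform as [itex]A'^{\mu} = \frac{\partial x'^{\mu}}{\partial x^{\nu}} A^{\nu}[/itex], so differentiating and using the chain rule gives

[tex]
\frac{\partial A'^{\mu}}{\partial x'^{\sigma}} = \frac{\partial x^{\rho}}{\partial x'^{\sigma}} \frac{\partial x'^{\mu}}{\partial x^{\nu}} \frac{\partial A^{\nu}}{\partial x^{\rho}} + \frac{\partial x^{\rho}}{\partial x'^{\sigma}} \frac{\partial^{2} x'^{\mu}}{\partial x^{\rho} \partial x^{\nu}} A^{\nu}
[/tex]

The first term is exactly the tensor transformation law. The second term, containing second derivatives of the coordinate change, is the part that does not belong; it vanishes only when the new coordinates are linear functions of the old ones.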

matt grime · Nov 22, 2005

Notice that the definition you gave of derivative doesn't even define derivatives on Euclidean space, i.e. it is impossible to recover Df from it when f is a function from R^n to R^m.

George Jones · Nov 22, 2005

First, you should think carefully about what the other posters have said.

In post #7, you define the derivative of a function as a limit. This limit results in another function. Contrast this with your original post, where covariant differentiation changes a vector to a second rank tensor.

The covariant directional derivative of a vector results in another vector, and can be defined in terms of parallel transport and limits. If you want covariant directional derivatives to be defined in terms of parallel transport, then parallel transport must first be defined. This can be done, but just as the definition of covariant differentiation (at first) seems non-intuitive, the definition of parallel transport may seem non-intuitive.

The covariant derivative of [tex]A[/tex] in the direction of [tex]V[/tex] is given by

[tex]V^{\sigma} \nabla_{\sigma}A^{\mu} = V^{\sigma} \left( \partial_{\sigma}A^{\mu} + \Gamma^{\mu}{}_{\sigma\lambda} A^{\lambda} \right).[/tex]

If [tex]\gamma[/tex] is a curve, then a notion of parallel transport along [tex]\gamma[/tex] is defined by a collection of invertible linear maps - one for every pair of points in the image of the curve - such that for every [tex]a[/tex], [tex]b[/tex], and [tex]c[/tex] in the curve's image:

[tex] \Gamma_{a,b}: T_{a} M \rightarrow T_{b} M[/tex];

[tex] \Gamma_{b,c} \circ \Gamma_{a,b} = \Gamma_{a,c}[/tex];

suitable smoothness conditions are satisfied.

This is a fairly general notion of parallel transport - you are probably more interested in metric-compatible parallel transport.

Once a notion of parallel transport is chosen, differentiation can be defined as

[tex] \nabla_{V}A = \lim_{\epsilon \rightarrow 0} \frac{1}{\epsilon} \left[ A_{||} \left( s + \epsilon \right) - A \left( s \right) \right] [/tex],

where [tex]V[/tex] is the tangent vector to [tex]\gamma[/tex] at curve parameter [tex]s[/tex], and where [tex]A_{||} \left( s + \epsilon \right)[/tex] means [tex]A \left( s + \epsilon \right)[/tex] parallel transported to [tex]\gamma \left( s \right)[/tex].
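
In coordinates, this limit can be unpacked as follows. This is only a sketch, and it assumes that infinitesimal parallel transport through a displacement [itex]dx^{\sigma}[/itex] changes the components by [itex]\delta A^{\mu} = -\Gamma^{\mu}{}_{\sigma\lambda} A^{\lambda} dx^{\sigma}[/itex]. Moving along the curve from [itex]s[/itex] to [itex]s + \epsilon[/itex], with [itex]V^{\sigma} = dx^{\sigma}/ds[/itex], the field's components become

[tex] A^{\mu}(s+\epsilon) = A^{\mu}(s) + \epsilon V^{\sigma} \partial_{\sigma} A^{\mu} + O(\epsilon^{2}), [/tex]

and transporting this vector back to [itex]\gamma(s)[/itex], a displacement of [itex]-\epsilon V^{\sigma}[/itex], gives

[tex] A_{||}^{\mu}(s+\epsilon) = A^{\mu}(s+\epsilon) + \epsilon V^{\sigma} \Gamma^{\mu}{}_{\sigma\lambda} A^{\lambda} + O(\epsilon^{2}). [/tex]

Subtracting [itex]A^{\mu}(s)[/itex], dividing by [itex]\epsilon[/itex] and letting [itex]\epsilon \rightarrow 0[/itex] leaves

[tex] \left( \nabla_{V}A \right)^{\mu} = V^{\sigma} \left( \partial_{\sigma}A^{\mu} + \Gamma^{\mu}{}_{\sigma\lambda} A^{\lambda} \right), [/tex]

which is the component expression given above.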

Regards,
George

pervect · Nov 22, 2005

Just to add my $.02, parallel transport can be defined in a fairly intuitive way if one has a metric. The basic idea is that two vectors are parallel if, when you join head to head and tail to tail by two new vectors, the resulting structure is a parallelogram.

To illustrate this, look at the picture below.

a----->b

c----->d

We say that the vector (a->b) and the vector (c->d) are parallel if and only if

(a->b) has the same length as (c->d)
(a->c) has the same length as (b->d)

This is because a parallelogram has the same length on opposing sides by definition.

Mathematically this definition is a little sloppy, but I think it gives a good intuitive picture.

A slightly more elaborate construction of this general form is known as "Schild's ladder", and it defines a purely geometric way to do parallel transport. Schild's ladder requires only that one be able to find the midpoint of a path.

While other approaches exist, I think that starting with the metric, defining parallel transport geometrically by Schild's ladder, and using this to define the covariant derivative is one of the more intuitive approaches.

Onto the covariant derivative (in the same geometric and slightly sloppy manner):

Suppose we have a vector field, V, which associates a vector V with any point. We want to ask "How fast is the vector field changing at point P?"

To take the limit, we must specify the direction in which we want to compute the rate of change of the vector field. This is the vector 'U' that exists at point P.

Because we are being sloppy, we conflate this direction with an actual curve that moves in this direction.

So we take the vector V at point P, and subtract it from the parallel-transported vector V' at (P+[itex]\epsilon U [/itex]). We divide the result by epsilon. The result is a new vector.

i.e.

[tex]
\lim_{\epsilon \rightarrow 0} \frac{V'(P+\epsilon U) - V(P)}{\epsilon}
[/tex]

In order to compute this difference, though, we have to be able to "move" a vector from one point P to a nearby point P'. The ability to do this was what we defined earlier, with the notion of "parallel transport".

John_Doe · Nov 22, 2005

[tex]A^\mu_{\sigma} = \frac{\partial A^\mu}{\partial x_{\sigma}} + \Gamma^\mu_{\sigma \alpha}A^\alpha[/tex]
Utilising this method, how can the covariant derivative be derived?

matt grime · Nov 23, 2005

What method?

If something is "derived" it is derived from somewhere. What is it that you're starting from in order to end up with that formula?

Or do you simply mean:

given that the covariant derivative is defined as SOMETHING, how can it be shown to be equivalent to this statement?

Or even,

why does this represent derivative along a vector field?

The latter two are reasonable questions and probably come about from careful bookkeeping and a picture would help, in all likelihood.

Trying to make sense of pervect's post 19 and thinking about what it means in coordinates would be a start, I guess.

John_Doe · Nov 23, 2005

Utilising the method outlined above by pervect.

matt grime · Nov 23, 2005

How far have you got in trying to do it yourself? Drawn a picture? Tried some examples? Started to try it for arbitrary objects? Where does the bookkeeping go wrong? What answer do you end up with? How close do you get? Does it always break down at the same point?

John_Doe · Nov 23, 2005

[tex]dA^\mu - \delta A^\mu = \left( \frac{\partial A^\mu}{\partial x_{\sigma}} + \Gamma^\mu_{\sigma \alpha} A^\alpha \right) dx_{\sigma}[/tex]
If [tex](A^\mu + \delta A^\mu)[/tex] is the vector resulting from an infinitesimal parallel displacement from [tex]P_{1}[/tex] to [tex]P_{2}[/tex], and [tex](A^\mu + dA^\mu)[/tex] the vector [tex]A^\mu[/tex] at the point [tex]P_{2}[/tex], then this is also a vector.

"Vier Vorlesungen ueber Relativitaetstheorie" ("The Meaning of Relativity")

matt grime · Nov 23, 2005

As someone who is as mystified as the next person as to what this covariant derivative is, I have no idea what in there is stuff *you* know to be true, what is causing *you* problems, what you're trying to show, or even what it is that you're doing.

matt grime · Nov 23, 2005

One thing that should have occurred to me earlier. If you do want someone to prove that the covariant derivative is that formula, then it must be originally given in terms of some other definition, be it words or some other formula. What precisely is it that you are starting from in order to prove it is equivalent to that form?

John_Doe · Nov 23, 2005

[tex]\delta A^\nu = -\Gamma^\nu_{\alpha \beta}A^\alpha dx_{\beta}[/tex]. I know this to be true. I want to know how the covariant derivative is derived. I know that [tex]dA^\mu - \delta A^\mu = \left( \frac{\partial A^\mu}{\partial x_{\sigma}} + \Gamma^\mu_{\sigma \alpha} A^\alpha \right) dx_{\sigma}[/tex] is important.

matt grime · Nov 23, 2005

I know you dismissed Tom's suggestion in the Homework forum with one word, but did you actually read the Wikipedia entry for covariant derivatives at all? Because that appears to explain it all very clearly and concisely to me.

It defines a derivative (derivation in my language), a covariant derivation, explains why some formula is a covariant derivative, and then shows that in coordinates it is exactly what you give in your first post.

If that is not sufficient for your question then what would be?

John_Doe · Nov 23, 2005

The covariant derivative isn't 'defined' to be [tex]A^\mu_{\sigma} = \frac{\partial A^\mu}{\partial x_{\sigma}} + \Gamma^\mu_{\sigma \alpha}A^\alpha[/tex]. That is deduced.

I read the Wikipedia article on covariant derivatives, and didn't understand. If it appears to explain it very clearly and concisely to you, could you explain it to me?

matt grime · Nov 23, 2005

And back again into the loop: what exactly is the definition of covariant derivative you're starting from then? Because the problem seems to be you can't translate from your definition to this one.

pervect · Nov 23, 2005

For the "pictorial" approach I mentioned in more detail, if you can get hold of MTW's book "Gravitation", it's discussed around pg 244 in chapter 10.

As for the detailed answer to your question:

A general vector x can be represented as [itex] x^i e_i[/itex], where the [itex]e_i[/itex] are the basis vectors.

Thus in a Cartesian coordinate system we would have as basis vectors e_x, e_y, e_z; in a cylindrical coordinate system we would have e_r, e_theta, e_z; and so on.

The definition of the Christoffel symbols [itex]\Gamma^{\mu}{}_{\sigma \alpha}[/itex] is that they describe how the basis vectors change from point to point, expressed in terms of the basis vectors themselves.

i.e. we take

[tex]\nabla_{\sigma} e_{\alpha}[/tex]

and express it in terms of the basis vectors as

[tex]\nabla_{\sigma} e_{\alpha} = \Gamma^{\mu}{}_{\alpha \sigma} e_{\mu}[/tex]

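The rest is the product (Leibniz) rule. [tex] \nabla_{\sigma} (A^{\mu}e_{\mu}) = (\nabla_{\sigma}A^{\mu}) e_{\mu} +(\nabla_{\sigma}e_{\mu})A^{\mu} [/tex]

The first term gives the partial derivative (each component [itex]A^{\mu}[/itex] is a scalar function, so [itex]\nabla_{\sigma}[/itex] acts on it as [itex]\partial_{\sigma}[/itex]); the second term gives the Christoffel symbols.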
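
Spelling out that last step, as a sketch: the relabelling of dummy indices below is added for illustration, and the final equality assumes the connection is symmetric in its lower indices, as the Levi-Civita connection is.

[tex]
\nabla_{\sigma}(A^{\mu}e_{\mu}) = (\partial_{\sigma}A^{\mu}) e_{\mu} + A^{\lambda} \, \Gamma^{\mu}{}_{\lambda\sigma} \, e_{\mu} = \left( \partial_{\sigma}A^{\mu} + \Gamma^{\mu}{}_{\sigma\lambda} A^{\lambda} \right) e_{\mu}
[/tex]

In the second term the summed index was renamed from [itex]\mu[/itex] to [itex]\lambda[/itex] before substituting [itex]\nabla_{\sigma} e_{\lambda} = \Gamma^{\mu}{}_{\lambda\sigma} e_{\mu}[/itex]. The bracket is the component formula quoted at the start of the thread.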
