I Is Covariant Derivative Notation Misleading in Vector Calculus?

  • #101
PeterDonis said:
others (such as MTW, as I mentioned before) put the hats on the component indexes, to indicate that the components are being taken with respect to a non-coordinate (usually orthonormal) basis.

Actually, looking at MTW again, I think their "hat" convention actually has nothing to do with a non-coordinate basis; I think it has to do with components with respect to a local inertial coordinate chart, as opposed to components with respect to a general coordinate chart. As far as I can tell, MTW don't write equations in component form at all unless they are using a coordinate basis.
 
  • #102
PeterDonis said:
But the symbol symbolizes the geometric object itself, which is independent of any choice of coordinates; it's the indices that (at least with MTW's convention--Wald's abstract index convention is different) are supposed to convey information about the choice of coordinates. So putting the hats on the indices makes sense given MTW's general approach. Putting the hat on the symbol itself would imply that something about the geometric object changes when you change coordinates.
That again depends on how you define your symbols. For me a tensor (e.g., a vector ##V##) is an invariant object (under the transformations under consideration, i.e., rotations in 3D Euclidean vector spaces, Lorentz transformations in 4D Lorentzian vector spaces, general linear transformations for a general vector space, general diffeomorphisms in all kinds of differentiable manifolds,...). Then there are the components wrt. a basis. Then the index notation comes into play, and we write, e.g., for a vector (field) on a manifold wrt. the holonomic basis of some coordinates, ##V=V^{\mu} \partial_{\mu}##. For me the indices just count from 0 to 3 (in GR), no matter which ornaments I put on them. That's why I have to write the components in another holonomic basis derived from the "new" coordinates ##\bar{x}^{\mu}## as ##\bar{V}^{\mu}##. Then of course you have ##V=V^{\mu} \partial_{\mu}=\bar{V}^{\mu} \bar{\partial}_{\mu}##.

It's of course possible to do it like MTW and put some ornaments on the indices to indicate wrt. which basis vectors you decompose your tensors. It's just a matter of staying consistent, but I think a notation should also be as "error resistant" as possible, and in my experience this notation is less "safe" than the other.
 
  • #103
dextercioby said:
I cannot believe there are 100 posts here about a simple pure mathematics issue (albeit with application in GR).

##\nabla## is a linear operator called covariant derivative which can be applied to any ##(n,l)## tensor to bring it to an ##(n,l+1)## tensor. It generalizes what in (pseudo-)Riemannian manifolds is called "affine connection".

In mathematics ##\nabla_{\mu}V^{\nu}## is ill defined (and from your long debate, it's certainly controversial in physics (!)). However, in order to connect the mathematical definition of a covariant derivative of a tensor with physics (in the GR tensor approach à la Einstein/Levi-Civita/Hilbert/Weyl, i.e. the old-school GR), we can define this dubious object appearing in physics texts as:

$$\nabla_{\mu}V^{\nu} := \left(\nabla V\right)_{\mu}^{~\nu}, \tag{1} $$

that is, the LHS are the components of the (1,1) tensor ##\nabla V## (with ##V## being a (1,0) tensor a.k.a. vector) in the basis ##dx^{\mu}\otimes \partial_{\nu}## (a coordinate basis in the tangent space of each point of the curved manifold).

So regarding the original "bracketing" issue brought up by @stevendaryl and its ensuing discussion, ##(1)## is the only reasonable definition of that object which appears in the "old-school" GR works. Modern (after 1960) GR uses the so-called "abstract index notation" which is meant to make more sense when analyzed from a pure math perspective. However, when the so-called "abstract tensors" are projected onto bases of directional derivatives and differential one-forms, the dubious objects of the LHS of ##(1)## appear again.

P.S. The directional derivative of a vector ##Y## along a vector ##X## is a vector. In formulas:

$$\nabla_{X} Y =: \nabla_{X^{\mu} \partial_{\mu}} \left(Y^{\nu}\partial_{\nu}\right) = \dots = \left[X^{\mu} \left(\partial_{\mu}Y^{\nu} +\Gamma^{\nu}_{~\mu\lambda} Y^{\lambda}\right)\right] \partial_{\nu} \tag{2} $$

In the round brackets of ##(2)## one recognizes the object defined in ##(1)##.
If you really want to be safe you also have to respect the horizontal order of the indices!

I don't see the advantage of complicating the notation by writing ##{(\nabla V)_{\mu}}^{\nu}## instead of the usual notation ##\nabla_{\mu} V^{\nu}##, which is simply the same thing. Mathematicians are sometimes at a disadvantage when it comes to practical calculations, compared with physicists, who tend to write things in a way that facilitates such calculations. I took a lot of pure-math lectures, and usually when it came to just calculating something my notation was far easier and quicker than the mathematicians'. My disadvantage was that I had to translate my sloppy physicist's notation into the more rigorous but cumbersome notation of the mathematicians. There's no free lunch! ;-)).
 
  • #104
PeterDonis said:
If the covariant derivative commutes with contraction, yes. The discussion in Chapter 3 of Carroll's online lecture notes [1] indicates that this property will always hold for the covariant derivative used in GR, although it strictly speaking does not have to hold for a general covariant derivative (one that only satisfies the linearity and Leibniz rule properties); the primary benefit of requiring the property to be true appears from his discussion to be that it means we can use the same connection coefficients for transforming tensors of any rank.

[1] https://ned.ipac.caltech.edu/level5/March01/Carroll3/Carroll3.html
Interesting. I thought it's usually put into the definition of a derivation operation on vector spaces that it should commute with contraction. Otherwise you could have a different connection on the co-vector space than on the vector space, but that would be a bit confusing. It's easier to require that contraction commutes with taking some kind of derivative. Then the definition of a specific kind of covariant derivative on the vector space implies a unique one on the co-vector space and vice versa. If you need different kinds of derivatives you can just introduce them. In GR you have not only the usual covariant derivatives (implying parallel transport) but also, e.g., Lie derivatives (implying Lie transport), etc.
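
A short worked sketch of why commuting with contraction fixes the covector rule. It assumes the coordinate-basis convention ##\nabla_{\mu} V^{\nu} = \partial_{\mu} V^{\nu} + {\Gamma^{\nu}}_{\mu \lambda} V^{\lambda}## used elsewhere in this thread, that ##\nabla_{\mu}## reduces to ##\partial_{\mu}## on scalars, and an arbitrary covector field ##\omega_{\nu}## (a symbol introduced here only for illustration). Since ##\omega_{\nu} V^{\nu}## is a scalar, the Leibniz rule together with commutation with contraction gives

$$\partial_{\mu}(\omega_{\nu} V^{\nu}) = \nabla_{\mu}(\omega_{\nu} V^{\nu}) = (\nabla_{\mu} \omega_{\nu}) V^{\nu} + \omega_{\nu} \left(\partial_{\mu} V^{\nu} + {\Gamma^{\nu}}_{\mu \lambda} V^{\lambda}\right).$$

Expanding the left-hand side with the ordinary product rule and comparing the coefficients of the arbitrary ##V^{\nu}## then yields

$$\nabla_{\mu} \omega_{\nu} = \partial_{\mu} \omega_{\nu} - {\Gamma^{\lambda}}_{\mu \nu} \omega_{\lambda},$$

so the same coefficients appear, with the opposite sign, on the co-vector side.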

It's amazing how much variety there is in the definitions of standard mathematical objects in the literature. Fortunately in standard GR everything is pretty unique, i.e., the usual connection is the unique torsion-free, metric-compatible connection on a Lorentzian manifold ;-)).
 
  • #105
vanhees71 said:
I don't see the advantage of complicating the notation by writing ##{(\nabla V)_{\mu}}^{\nu}## instead of the usual notation ##\nabla_{\mu} V^{\nu}##, which is simply the same thing.

My point is that it is ambiguous: Are you operating on a vector ##V## and then taking component ##\nu## of the result, or are you operating on the component ##V^\nu##?
 
  • #106
stevendaryl said:
My point is that it is ambiguous: Are you operating on a vector ##V## and then taking component ##\nu## of the result, or are you operating on the component ##V^\nu##?

With my personal convention for such things, ##\nabla_\mu## is just shorthand for the operator ##\nabla_{e_\mu}##. With this convention, plus linearity and the Leibniz rule for derivatives, we can write:

##\nabla_\mu V = \nabla_\mu (V^\sigma e_\sigma) = (\nabla_\mu V^\sigma) e_\sigma + V^\sigma (\nabla_\mu e_\sigma)##

Taking components of both sides (by operating with ##e^\nu##) gives:

##(\nabla_\mu V)^\nu = \nabla_\mu V^\nu + \Gamma^\nu_{\mu \sigma} V^\sigma##

So rather than saying ##(\nabla V)_\mu{}^\nu = \nabla_\mu V^\nu##, I would say it's equal to ##\nabla_\mu V^\nu + \Gamma^\nu_{\mu \sigma} V^\sigma##.

I guess @PeterDonis would say that this ambiguity is resolved by denying that ##\nabla_\mu## is an operator; it only appears in the context of the expression ##\nabla_\mu V^\nu## where the meaning is ##(\nabla V)_\mu{}^\nu##.
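
A concrete check of the two readings, as a sketch under assumptions not taken from the posts above: the flat plane in polar coordinates ##(r, \theta)## with the Levi-Civita connection, whose nonzero coefficients are ##\Gamma^r_{\theta\theta} = -r## and ##\Gamma^\theta_{r\theta} = \Gamma^\theta_{\theta r} = 1/r##, and the vector field ##V = e_r = \partial_r##, so that ##V^r = 1## and ##V^\theta = 0##. Reading ##\nabla_\theta V^\theta## as the directional derivative of the component function ##V^\theta## gives

$$\nabla_\theta V^\theta = \partial_\theta V^\theta = 0,$$

while differentiating the vector first and then taking the component gives

$$(\nabla_\theta V)^\theta = \partial_\theta V^\theta + \Gamma^\theta_{\theta\sigma} V^\sigma = 0 + \Gamma^\theta_{\theta r} V^r = \frac{1}{r}.$$

In this example the two readings differ by exactly the connection term.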
 
  • #107
stevendaryl said:
My point is that it is ambiguous: Are you operating on a vector ##V## and then taking component ##\nu## of the result, or are you operating on the component ##V^\nu##?
My interpretation of the notation is that I'm operating on tensor components. In any case, what's of utmost importance is to also take care of the horizontal positioning of the indices. I can accept a notation like ##{(\nabla V)_{\mu}}^{\nu}##, though I think it's most inconvenient. I don't like books that don't care about the horizontal index position. It's already a nightmare if it's not respected for Lorentz-transformation matrices in SRT!

Of course in the end ##\nabla V## (which is also sloppy notation for ##\nabla \otimes V##!) is always the same tensor, which is independent of the choice of bases and coordinates. In my notation it's
$$\nabla V=\mathrm{d} x^{\mu} \nabla_{\mu} V^{\nu} \partial_{\nu}$$
in the usual notation ##\partial_{\mu}## and ##\mathrm{d} x^{\mu}## for the holonomic basis and its dual given some coordinates.

As I said before, I think there's not much to argue about. It's all just convention, and one must make sure to understand the notation of each book/paper correctly.
 
  • #108
stevendaryl said:
I guess @PeterDonis would say that this ambiguity is resolved by denying that ##\nabla_\mu## is an operator

Not necessarily, no. With your convention, where ##\nabla_\mu## means ##\nabla_{e_\mu}##, the directional derivative operator along ##e_\mu##, it is obviously an operator. With the convention I'm used to, ##\nabla_\mu##, used in isolation, is just a way of referring to the covariant derivative operator ##\nabla## itself--Wald would write it as ##\nabla_a##. But in either case it's an operator. The ambiguity in the convention I'm used to is, as you say, that sometimes (usually in expressions where it's combined with other things), ##\nabla_\mu## can mean the ##\mu## component of some tensor obtained by applying the ##\nabla## operator to something; that is indeed not an operator. (As we have seen, it ends up being the same as the directional derivative in the ##e_\mu## direction of the thing the ##\nabla## is operating on. But that still doesn't resolve all ambiguities; see below.)

The only real way to resolve ambiguity is to, well, resolve ambiguity, by adding more notation until the expression is unambiguous.

For example, in Wald's abstract index notation, the various objects you have given would look like this:

Directional derivative of ##V##:

$$
\nabla_\mu V = \left[ \left( e_\mu \right)^a \nabla_a \right] V^b
$$

(Note the brackets enclosing the contraction that denotes the directional derivative, to make it unambiguous that it denotes a single operator.)

Extracting the ##\nu## component:

$$
\left( \nabla_\mu V \right)^\nu = \left[ \left( e_\mu \right)^a \nabla_a V^b \right] \left( e^\nu \right)_b
$$

(Here I don't have a third bracket type to use, so I'm relying on the first expression above to make it clear what the directional derivative operator is, and using the brackets to make clear that the operator is only operating on ##V##; the operator produces a vector, and we contract that vector with ##e^\nu## to extract the component.)

Directional derivative of the ##\nu## component of ##V##:

$$
\nabla_\mu V^\nu = \left[ \left( e_\mu \right)^a \nabla_a \right] \left[ V^b \left( e^\nu \right)_b \right]
$$

Expanding out the above (since now the directional derivative is operating on both ##V## and ##e^\nu##, as the brackets in the above expression make clear):

$$
\nabla_\mu V^\nu = \left[ \left( e_\mu \right)^a \nabla_a V^b \right] \left( e^\nu \right)_b + V^b \left[ \left( e_\mu \right)^a \nabla_a \left( e^\nu \right)_b \right]
$$

Notice that this does not give the same result as above.

Covariant derivative of ##V##:

$$
\nabla V = \nabla_a V^b
$$

Extracting the ##\mu##, ##\nu## component:

$$
\left( \nabla V \right)_\mu{}^\nu = \left( \nabla_a V^b \right) \left( e_\mu \right)^a \left( e^\nu \right)_b
$$

As we have seen in previous posts, since covariant differentiation commutes with contraction, this is equal to the ##\nu## component of the directional derivative of ##V## in the ##e_\mu## direction, but it is not equal to the directional derivative in the ##\mu## direction of the ##\nu## component of ##V##.
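
The extra term in the expanded expression can be evaluated explicitly. As a sketch, assume the connection coefficients are defined by ##\nabla_\mu e_\sigma = \Gamma^\lambda_{\mu\sigma} e_\lambda## (the index order is an assumption, chosen to match post #106) and that ##e^\nu## is the dual basis, ##\left( e^\nu \right)_b \left( e_\sigma \right)^b = \delta^\nu_\sigma##. Applying the directional derivative to this constant and using the Leibniz rule gives

$$
0 = \left[ \left( e_\mu \right)^a \nabla_a \left( e^\nu \right)_b \right] \left( e_\sigma \right)^b + \left( e^\nu \right)_b \left[ \left( e_\mu \right)^a \nabla_a \left( e_\sigma \right)^b \right] = \left[ \left( e_\mu \right)^a \nabla_a \left( e^\nu \right)_b \right] \left( e_\sigma \right)^b + \Gamma^\nu_{\mu\sigma}
$$

so that, writing ##V^b = V^\sigma \left( e_\sigma \right)^b##,

$$
V^b \left[ \left( e_\mu \right)^a \nabla_a \left( e^\nu \right)_b \right] = - \Gamma^\nu_{\mu\sigma} V^\sigma
$$

Substituting this into the expanded expression reproduces ##(\nabla_\mu V)^\nu = \nabla_\mu V^\nu + \Gamma^\nu_{\mu\sigma} V^\sigma##, with ##\nabla_\mu V^\nu## read as the derivative of the component function.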
 
  • #109
PeterDonis said:
The components aren't numerically the same, since they are components taken with respect to two different choices of basis. The index numbers are the same, but that's just because we number them by dimensions without taking into account anything about the particular coordinate choice. But if we were to designate indexes by coordinate instead of by index number, we would have, for example, ##V^t##, ##V^r##, ##V^\theta##, and ##V^\phi## for a coordinate basis as compared with ##V^T##, ##V^X##, ##V^Y##, and ##V^Z## for an orthonormal basis.
Even in the case when just numbering the indices, this can be addressed by using ##0123## for an unprimed coordinate system and ##0' 1' 2' 3'## for a primed coordinate system.
 
  • #110
Sorry to revive this thread. I have a question about the notation Wald uses for the torsion-free condition of a covariant derivative operator, namely ##\nabla_a \nabla_b f = \nabla_b \nabla_a f## (Wald makes extensive use of abstract index notation).

It should actually be ##\nabla_a (\nabla_b f) = \nabla_b (\nabla_a f)##, I think: the result of the brackets () is a (0,1) tensor, i.e. a covector, and then the second instance of the covariant derivative operator acts on it.

Is that correct? Thanks
 
  • #111
Yes, this is correct. But that bracketing is considered superfluous, so it's omitted.
 
  • #112
dextercioby said:
Yes, this is correct. But that bracketing is considered superfluous, so it's omitted.
OK, good. I believe the two sides (LHS and RHS), with the named abstract indices ##a## and ##b## reversed, make sense only if we think of each of them as acting on (or contracting with) a given 'fixed' (2,0) tensor field (e.g. ##u^av^b##).

Otherwise what sense would it make? The two sides would actually be the same: the same (0,2) tensor object, just with the names of its two ordered slots reversed.
 
  • #113
The covariant derivatives applied to a scalar indeed commute. In usual Ricci calculus that's very easy to see
$$\nabla_{\mu} \Phi=\partial_{\mu} \Phi$$
and then
$$\nabla_{\nu} \nabla_{\mu} \Phi = \partial_{\nu} \partial_{\mu} \Phi -{\Gamma^{\rho}}_{\mu \nu} \partial_{\rho} \Phi.$$
Since the partial derivatives commute (under the usual assumptions about the smoothness of ##\Phi##) and the Christoffel symbols are symmetric in a pseudo-Riemannian manifold, you indeed have
$$\nabla_{\nu} \nabla_{\mu} \Phi = \nabla_{\mu} \nabla_{\nu} \Phi.$$
 
  • #114
vanhees71 said:
$$\nabla_{\nu} \nabla_{\mu} \Phi = \partial_{\nu} \partial_{\mu} \Phi -{\Gamma^{\rho}}_{\mu \nu} \partial_{\rho} \Phi.$$ Since the partial derivatives commute (under the usual assumptions about the smoothness of ##\Phi##) and the Christoffel symbols are symmetric in a pseudo-Riemannian manifold, you indeed have
$$\nabla_{\nu} \nabla_{\mu} \Phi = \nabla_{\mu} \nabla_{\nu} \Phi.$$
I could be wrong, but using the Leibniz rule it should be (basically the order of the partial derivatives in the first term on the RHS is reversed) $$\nabla_{\nu} (\nabla_{\mu} \Phi) = \partial_{\mu} \partial_{\nu} \Phi -{\Gamma^{\rho}}_{\mu \nu} \partial_{\rho} \Phi$$

Then, as you pointed out, since Christoffel symbols are symmetric and using partial derivative commutativity we get the result.
 
  • #115
Well, yes, my order of the lower Christoffel-symbol indices should be switched in the first expression, but of course they are symmetric, which is why I usually don't care. You are right, in more general cases of spaces with torsion, one has to keep an eye on the order of those indices. I hope, here I get it formally right with the index ordering:
$$\nabla_{\nu} \nabla_{\mu} \Phi=\nabla_{\nu} \partial_{\mu} \Phi = \partial_{\nu} \partial_{\mu} \Phi -{\Gamma^{\rho}}_{\nu \mu} \partial_{\rho} \Phi=\partial_{\mu} \partial_{\nu} \Phi -{\Gamma^{\rho}}_{\mu \nu} \partial_{\rho} \Phi=\nabla_{\mu} \partial_{\nu} \Phi=\nabla_{\mu} \nabla_{\nu} \Phi.$$
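
For a connection with torsion, the same steps show where the symmetry is used. A sketch, assuming the convention ##\nabla_{\nu} \omega_{\mu} = \partial_{\nu} \omega_{\mu} - {\Gamma^{\rho}}_{\nu \mu} \omega_{\rho}## from the chain above and defining the torsion components in a coordinate basis as ##{T^{\rho}}_{\nu \mu} := {\Gamma^{\rho}}_{\nu \mu} - {\Gamma^{\rho}}_{\mu \nu}## (sign and index-order conventions for torsion vary between books):

$$\nabla_{\nu} \nabla_{\mu} \Phi - \nabla_{\mu} \nabla_{\nu} \Phi = -\left({\Gamma^{\rho}}_{\nu \mu} - {\Gamma^{\rho}}_{\mu \nu}\right) \partial_{\rho} \Phi = -{T^{\rho}}_{\nu \mu} \partial_{\rho} \Phi.$$

The partial-derivative terms cancel, so the covariant derivatives commute on every scalar exactly when the antisymmetric part of the connection coefficients vanishes.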
 
  • #116
vanhees71 said:
Well, yes, my order of the lower Christoffel-symbol indices should be switched in the first expression, but of course they are symmetric, which is why I usually don't care. You are right, in more general cases of spaces with torsion, one has to keep an eye on the order of those indices. I hope, here I get it formally right with the index ordering:
$$\nabla_{\nu} \nabla_{\mu} \Phi=\nabla_{\nu} \partial_{\mu} \Phi = \partial_{\nu} \partial_{\mu} \Phi -{\Gamma^{\rho}}_{\nu \mu} \partial_{\rho} \Phi=\partial_{\mu} \partial_{\nu} \Phi -{\Gamma^{\rho}}_{\mu \nu} \partial_{\rho} \Phi=\nabla_{\mu} \partial_{\nu} \Phi=\nabla_{\mu} \nabla_{\nu} \Phi.$$
My point was actually about the order of the partial derivatives in the first term on the RHS.
 
  • #117
I see, yes, but then we first apply ##\partial_{\mu}## and then ##\partial_{\nu}##, i.e., we have ##\partial_{\nu} \partial_{\mu}## in the first step. At the end it's right anyway, because the operators in question commute ;-)).
 
  • #118
Sorry, I believe the sign of Christoffel-symbol should be '+'. Btw I believe we're mixing again the meaning of Greek indices ##\mu## and ##\nu## (tensor component indexes in a basis vs "which vector in the basis"). In a coordinate basis (holonomic) it should be fine, however.
$$\nabla_{\nu} \nabla_{\mu} \Phi=\nabla_{\nu} \partial_{\mu} \Phi = \partial_{\mu} \partial_{\nu} \Phi + (\nabla_{\nu} \partial_{\mu}) \Phi = \partial_{\mu} \partial_{\nu} \Phi +{\Gamma^{\rho}}_{\mu \nu} \partial_{\rho} \Phi=\partial_{\nu} \partial_{\mu} \Phi +{\Gamma^{\rho}}_{\nu \mu} \partial_{\rho} \Phi=\nabla_{\mu} \partial_{\nu} \Phi=\nabla_{\mu} \nabla_{\nu} \Phi.$$
Note in fact that ##\nabla_{\nu}\partial_{\mu}## is actually the covariant derivative in the direction ##\nu## (i.e. in the direction ##\partial_{\nu}##) of the vector ##\partial_{\mu}##.
 
  • #119
cianfa72 said:
Sorry, I believe the sign of Christoffel-symbol should be '+'. Btw I believe we're mixing again the meaning of Greek indices ##\mu## and ##\nu## (tensor component indexes in a basis vs "which vector in the basis"). In a coordinate basis (holonomic) it should be fine, however.
$$\nabla_{\nu} \nabla_{\mu} \Phi=\nabla_{\nu} \partial_{\mu} \Phi = \partial_{\mu} \partial_{\nu} \Phi + (\nabla_{\nu} \partial_{\mu}) \Phi = \partial_{\mu} \partial_{\nu} \Phi +{\Gamma^{\rho}}_{\mu \nu} \partial_{\rho} \Phi=\partial_{\nu} \partial_{\mu} \Phi +{\Gamma^{\rho}}_{\nu \mu} \partial_{\rho} \Phi=\nabla_{\mu} \partial_{\nu} \Phi=\nabla_{\mu} \nabla_{\nu} \Phi.$$
Note in fact that ##\nabla_{\nu}\partial_{\mu}## is actually the covariant derivative in the direction ##\nu## (i.e. in the direction ##\partial_{\nu}##) of the vector ##\partial_{\mu}##.
That’s not how covariant differentiation works.
 
  • #120
Orodruin said:
That’s not how covariant differentiation works.
Yes, it was wrong. I tried to do the complete job: the goal is to work out the ##\mu##, ##\nu## component of the tensor ##\nabla(\nabla \Phi)## in a coordinate basis.

Since, as a tensor, ##\nabla \Phi = (\partial_{\alpha} \Phi) dx^{\alpha}##, we have
$$\nabla (\nabla \Phi)= [\partial_{\beta} \partial_{\alpha} \Phi - \partial_{\rho} \Phi {\Gamma^{\rho}}_{\alpha \beta}] dx^{\alpha} \otimes dx^{\beta}$$
Contract it with ##\partial_{\mu}## and ##\partial_{\nu}## to get the ##\mu##, ##\nu## component:
$$(\nabla (\nabla \Phi))_{\mu \nu}= [\partial_{\beta} \partial_{\alpha} \Phi - \partial_{\rho} \Phi {\Gamma^{\rho}}_{\alpha \beta}] dx^{\alpha}(\partial_{\mu}) dx^{\beta}(\partial_{\nu})$$
$$ \nabla_{\mu} \nabla_{\nu} \Phi = (\nabla \nabla \Phi)_{\mu \nu}= [\partial_{\nu} \partial_{\mu} \Phi - \partial_{\rho} \Phi {\Gamma^{\rho}}_{\mu \nu}]$$
Is that correct now? Thanks
 
  • #121
cianfa72 said:
Is that correct now ?
Looks good.
 
  • #122
Orodruin said:
Looks good.
And what was wrong with my derivation? I don't see any difference.
 
  • #123
vanhees71 said:
And what was wrong with my derivation? I don't see any difference.
I never said anything was wrong with it. I complained about #118.
 
  • #124
Orodruin said:
I never said anything was wrong with it. I complained about #118.
Yep, my fault sorry.
 
  • #125
dextercioby said:
I cannot believe there are 100 posts here about a simple pure ... issue
This, unfortunately, has become a characteristic feature in here.
dextercioby said:
In mathematics ##\nabla_{\mu}V^{\nu}## is ill defined
No, it is not. In mathematics we define things. So, on a generic tensor (density) ##T_{A} \equiv T^{\rho_{1}\cdots \rho_{r}}_{{}\tau_{1}\cdots \tau_{s}}##, I define the operator ##\nabla_{\mu}## by the rule
$$\nabla_{\mu}T_{A} \equiv \partial_{\mu}T_{A} + \Gamma^{\lambda}_{\mu\nu}[T_{A}]^{\nu}{}_{\lambda} ,$$
where
$$[T^{\rho_{1} \cdots \rho_{r}}_{{}\tau_{1}\cdots \tau_{s}}]^{\nu}{}_{\lambda} \equiv \sum_{p = 1}^{r} \delta^{\rho_{p}}_{\lambda}T^{\rho_{1}\cdots \rho_{p-1}\nu \rho_{p+1}\cdots \rho_{r}}_{{}{}{}{}\tau_{1} \cdots \tau_{s}} - \sum_{q = 1}^{s} \delta^{\nu}_{\tau_{q}}T^{\rho_{1}\cdots \rho_{r}}_{{}\tau_{1}\cdots \tau_{q-1}\lambda \tau_{q+1}\cdots \tau_{s}} - \delta^{\nu}_{\lambda}T_{A} ,$$
and the last term is absent when ##T_{A}## is not a density.

Remarks: (1) Notice that ##[T_{A}]^{\nu}{}_{\lambda} \epsilon^{\lambda}{}_{\nu}## is nothing but the change of ##T_{A}## under an infinitesimal ##\mbox{GL}(n)## transformation parametrized by ##\epsilon^{\lambda}{}_{\nu}##. So, for any object ##\Psi##, we define ##[\Psi]^{\nu}{}_{\lambda}## by its infinitesimal transformation under the general linear group ##\mbox{GL}(n)##.

(2) You can show that the (above defined) operator ##\nabla_{\mu}## satisfies the Leibniz rule.
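
As a sketch of how the rule above reproduces the familiar formulas, take two special cases that are not densities: a vector ##V^{\rho}## (##r = 1##, ##s = 0##) and a covector ##\omega_{\tau}## (##r = 0##, ##s = 1##); both symbols are introduced here only for illustration. The bracket then reduces to ##[V^{\rho}]^{\nu}{}_{\lambda} = \delta^{\rho}_{\lambda} V^{\nu}## and ##[\omega_{\tau}]^{\nu}{}_{\lambda} = -\delta^{\nu}_{\tau} \omega_{\lambda}##, so

$$\nabla_{\mu} V^{\rho} = \partial_{\mu} V^{\rho} + \Gamma^{\rho}_{\mu\nu} V^{\nu}, \qquad \nabla_{\mu} \omega_{\tau} = \partial_{\mu} \omega_{\tau} - \Gamma^{\lambda}_{\mu\tau} \omega_{\lambda}.$$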
 
