# Notational clash in index notation?

Staff Emeritus
Gold Member
Classical Theory of Particles and Fields, by Boris Kosyakov, has the following in appendix A:

Elie Cartan proposed to use differential coordinates $dx^i$ as a convenient basis of 1-forms. The differentials $dx^i$ transform like covectors [...] Furthermore, when used in the directional derivative $dx^i \partial F/\partial x^i$, $dx^i$ may be viewed as a linear functional which takes real values on vectors $\partial F/\partial x^i$. The line elements $dx^i$ are called [...] 1-forms.
(I don't own a copy of the book. This came to my attention through this physics.SE question, and it turned out that this part of the book is accessible through Amazon's peephole.)

This completely mystifies me, since the gradient of a scalar is what I would think of as the prototypical example of a covector, not a vector. The partial derivative operator $\partial_i=\partial/\partial x^i$ has a lower index; it's a covector.

My best guess would be that this is occurring because of the notational collision between the interpretation of upper/lower-index quantities as vectors/covectors and the use of lower/upper-index quantities as basis vectors for the space of vectors/covectors. That is, we often use $\partial_i$ as a basis for the space of vectors, and we then have expressions like $a^i\partial_i$, which require context in order to see that they are vectors expressed using the Einstein summation convention, rather than scalars. Similarly, $dx^i$ can be taken as the covector to $\partial_i$. However, none of this resolves the craziness (AFAICT) of referring to a gradient as a vector rather than a covector.

In this total derivative, I would give $dx^i$ the context-dependent interpretation of being an infinitesimally small $\Delta x^i$, which is the prototypical example of a vector (not covector). For a total derivative involving a finite change, we would have $\Delta F\approx \Delta x^i \partial F/\partial x^i$. I don't see what Kosyakov could do here, since $\Delta x^i$ is clearly a vector, not a covector.
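The finite-change estimate $\Delta F\approx \Delta x^i\,\partial F/\partial x^i$ can be checked numerically. The sketch below is only an illustration: the field `F`, the base point, and the displacement are hypothetical choices, not taken from Kosyakov.

```python
import math

def F(x, y):
    # Hypothetical smooth scalar field, chosen only for illustration.
    return x**2 * y + math.sin(y)

def grad_F(x, y):
    # Analytic partial derivatives (dF/dx, dF/dy) of the field above.
    return (2 * x * y, x**2 + math.cos(y))

def linear_estimate(point, delta):
    # Contract displacement components with the partial derivatives:
    # Delta x^i * dF/dx^i, summed over i.
    return sum(d * g for d, g in zip(delta, grad_F(*point)))

def actual_change(point, delta):
    # Exact change in F over the same finite displacement.
    x, y = point
    dx, dy = delta
    return F(x + dx, y + dy) - F(x, y)

point = (1.0, 2.0)        # assumed base point
delta = (1e-3, -2e-3)     # assumed small displacement
estimate = linear_estimate(point, delta)
exact = actual_change(point, delta)
error = abs(exact - estimate)
print(estimate, exact, error)
```

The leftover `error` is second order in the displacement, which is why the approximation becomes exact in the infinitesimal limit.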

Is Kosyakov's point of view unusual? Is there something I'm missing?

Fredrik
The partial derivative operator $\partial_i=\partial/\partial x^i$ has a lower index; it's a covector.
It's a tangent vector, not a cotangent vector. So its n-tuple of components (which in this coordinate system is ##(\delta^1_i,\dots,\delta^n_i)##) transforms contravariantly. However, the n-tuple ##(\partial_1,\dots,\partial_n)## transforms covariantly. (The definition of "covariant" is essentially that we use that term when something transforms just like this n-tuple).
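
As a sketch of the two transformation laws being contrasted here, assume new coordinates ##x'^j## that are smooth, invertible functions of the old ##x^i## (the primed notation is introduced for this illustration only). The chain rule gives
$$\partial'_j=\frac{\partial x^i}{\partial x'^j}\,\partial_i,\qquad v'^j=\frac{\partial x'^j}{\partial x^i}\,v^i,$$
so the n-tuple of basis vectors and the n-tuple of components pick up inverse Jacobian factors, which cancel in the vector itself:
$$v'^j\partial'_j=\frac{\partial x'^j}{\partial x^i}\frac{\partial x^k}{\partial x'^j}\,v^i\partial_k=\delta^k_i\,v^i\partial_k=v^i\partial_i.$$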

atyy
Also, the mathematical "gradient" is a covector - but that's because mathematicians define "gradient" differently from physicky vector calculus. For example, one can find df as a "gradient" in http://sophia.dtp.fmph.uniba.sk/~fecko/referaty/regensburg.pdf (p6) and http://www.damtp.cam.ac.uk/research/gr/members/gibbons/dgnotes3.pdf (just before Eq 3.42). That's why one often reads that the gradient is a covector. The physicist's vector calculus "gradient" is a vector, because it is defined in a space with a metric, which can be used to get the Hodge dual and make the physics vector calculus "gradient".

pervect
Is Kosyakov's point of view unusual? Is there something I'm missing?
I don't think Kosyakov's point of view is unusual. Read as a plain number, ##dx^i## has a scalar value. If ##dx^i## is instead interpreted as an operator with a domain and a range, so that it acts on something in the domain and returns a value in the range, then it is a one-form: the domain consists of vectors and the range of scalars. A linear map from vectors to scalars is a covector by definition (well, in my recollection; I should probably look that up, but I'm not going to).

MTW make a similar point. They also note that the notation is ambiguous between a pure scalar and an operator that takes in a vector and outputs a scalar. MTW prefer a boldface d for the latter case.

Vectors are pretty universally defined as derivative operators, which makes ##\partial_i## a vector and, with the appropriate definition, makes ##dx^i## a map from vectors to scalars, i.e. a covector.
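
Spelled out under the usual dual-basis convention (stated here as an assumption, since neither Kosyakov's nor MTW's exact wording is quoted in this thread), the operator reading of ##dx^i## is fixed by its action on the basis vectors and extended linearly to any vector ##v=v^j\partial_j##:
$$dx^i(\partial_j)=\delta^i_j,\qquad dx^i(v)=dx^i(v^j\partial_j)=v^j\,\delta^i_j=v^i.$$
In this reading ##dx^i## simply returns the i-th component of the vector it is fed, which is a scalar.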

As far as the upper-lower index notation goes, I've always found it very confusing in some particular cases such as this one, and I'm glad to see I'm not the only one.

Fredrik
Also, the mathematical "gradient" is a covector - but that's because mathematicians define "gradient" differently from physicky vector calculus. For example, one can find df as a "gradient"
The "physicky vector calculus" gradient ##(\partial_1f,\dots,\partial_nf)## transforms covariantly (because ##(\partial_1,\dots,\partial_n)## does, and f doesn't transform at all). Note that it's not a cotangent vector. It's an n-tuple of real-valued functions (the tangent vectors ##\partial_i## applied to f), one such tuple associated with each coordinate system, and it transforms covariantly. That makes it a "covector" according to the old-fashioned definitions.

##\mathrm df## is a 1-form (a cotangent vector field) that can be thought of as the gradient of f because its n-tuple of components is the "physicky vector calculus" gradient:
$$(\mathrm df)_\mu=(\mathrm df)(\partial_\mu)=\partial_\mu f.$$

atyy
That isn't the physicist's gradient, because the formula is only correct in Cartesian coordinates. If one switches to cylindrical coordinates, the mathematician's gradient has components given by the partial derivative, but the physicist's gradient picks up a 1/r as in http://mathworld.wolfram.com/CylindricalCoordinates.html (eq 32).
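
The 1/r factor can be seen numerically. The sketch below is an illustration under assumptions: the scalar field, the sample point, and the finite-difference helper `partial` are all hypothetical, and plane polar coordinates stand in for cylindrical ones. It compares the coordinate-basis components ##(\partial_r f,\partial_\theta f)## with the components in the orthonormal basis used in vector calculus, ##(\partial_r f,\tfrac{1}{r}\partial_\theta f)##.

```python
import math

def f_cartesian(x, y):
    # Hypothetical scalar field, chosen only for illustration.
    return x**2 + 3.0 * y

def f_polar(r, theta):
    # The same field expressed in polar coordinates.
    return f_cartesian(r * math.cos(theta), r * math.sin(theta))

def partial(func, args, index, h=1e-6):
    # Central finite difference with respect to args[index].
    up, down = list(args), list(args)
    up[index] += h
    down[index] -= h
    return (func(*up) - func(*down)) / (2 * h)

r, theta = 2.0, 0.7   # assumed sample point
x, y = r * math.cos(theta), r * math.sin(theta)

# Components of df in the coordinate basis (dr, dtheta): plain partials.
df_r = partial(f_polar, (r, theta), 0)
df_theta = partial(f_polar, (r, theta), 1)

# Vector-calculus gradient components in the orthonormal basis.
grad_r_hat = df_r
grad_theta_hat = df_theta / r

# Reference: length of the Cartesian gradient at the same point.
norm_cartesian = math.hypot(partial(f_cartesian, (x, y), 0),
                            partial(f_cartesian, (x, y), 1))
norm_orthonormal = math.hypot(grad_r_hat, grad_theta_hat)
norm_naive = math.hypot(df_r, df_theta)
print(norm_cartesian, norm_orthonormal, norm_naive)
```

Only the orthonormal components reproduce the Cartesian length of the gradient. Taking the Euclidean norm of the raw partial derivatives does not, because that norm ignores the metric factor attached to ##\theta##.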

Edit: I spoke too hastily about the Hodge dual in post #3. The Hodge dual is used to get to the vector calculus "div" and "curl". The mathematician's gradient and the physicist's gradient are related by the sharp and flat isomorphisms (e.g. http://lamington.wordpress.com/2014/05/26/div-grad-curl-and-all-this/). The notation in Calegari's post differs from that in the Fecko and Gibbons links in post #3. Calegari uses the angle brackets ##\langle,\rangle## to mean an inner product, which takes two vectors and outputs a number, whereas Fecko and Gibbons use them to mean that a form acts on a vector to give a number, or that a form is acted on by a vector to give a number.

pervect
A bit more that I didn't have time to write:

A vector ##a^i## can be expanded out as ##a^1 (e_1) + a^2 (e_2) + \dots + a^n (e_n)##, where ##a^i## are the coefficients and ##e_i## are the basis vectors. So the basis vectors really always have a lower index, which is consistent with the notion that partial derivative operators are basis vectors and with the way those operators are written, i.e. ##\partial_i##. It is the coefficients of the basis vectors that have the upper index. I also checked another text: when Wald gives a set of basis vectors, he writes the index as a subscript, i.e. ##\{v_i\}##.

The parentheses above aren't really needed, but I find them helpful, sort of like "scare quotes".
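
A small numerical sketch of why the index placement is consistent: if the basis vectors (lower index) are changed by a matrix, the coefficients (upper index) change by the inverse matrix, and the vector itself is unchanged. The matrix `M`, the components, and the helper `combine` are hypothetical values and names chosen only for this illustration.

```python
def combine(components, basis):
    # Form the sum a^i e_i for 2D basis vectors given as (x, y) tuples.
    return tuple(sum(a * e[k] for a, e in zip(components, basis))
                 for k in range(2))

old_basis = [(1.0, 0.0), (0.0, 1.0)]   # e_1, e_2
a_old = (3.0, 4.0)                     # coefficients a^1, a^2

# Assumed change of basis e'_j = M^i_j e_i, stored as M[i][j].
M = [[2.0, 1.0],
     [0.0, 1.0]]
new_basis = [combine((M[0][j], M[1][j]), old_basis) for j in range(2)]

# Coefficients transform with the inverse matrix: a'^j = (M^-1)^j_i a^i.
det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
M_inv = [[ M[1][1] / det, -M[0][1] / det],
         [-M[1][0] / det,  M[0][0] / det]]
a_new = tuple(sum(M_inv[j][i] * a_old[i] for i in range(2))
              for j in range(2))

v_old = combine(a_old, old_basis)   # the vector built from the old data
v_new = combine(a_new, new_basis)   # the vector built from the new data
print(a_new, v_old, v_new)
```

The coefficients change while `v_old` and `v_new` agree, which is the numerical counterpart of ##a^i e_i## being basis independent.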

I don't see any way to improve the notation, but I still find it confusing.

robphy