# Covariant Versus Contravariant Vectors

1. May 5, 2013

### pbhact

I'm confused about the difference between a contravariant and covariant vector. Some books and articles seem to say that there really is no difference, that a vector is a vector, and can be written in terms of contravariant components associated with a particular basis, or can be written in terms of covariant components associated with the dual basis of the original basis. In other words, it is the components that are contravariant or covariant.

Yet, elsewhere I read that the gradient is a covariant vector, and that velocity is a contravariant vector, because of the way the components transform. No mention of the original basis or dual basis. There seems to be a real difference in the nature of these vectors, as opposed to simply the components and bases used to express them.

I would very much appreciate any insight that anyone can give me on this.

2. May 5, 2013

### Fredrik

Staff Emeritus
There's no short answer I'm afraid. You can start with this post, but you will also need to look at the one I linked to near the end of it, and then the three posts that I linked to at the end of that one.

I'm going to move your post to differential geometry, because tensors are usually studied in that context. Tensors are really part of multilinear algebra, but they're not very interesting in that context. They become interesting when we use them to define tensor fields on manifolds.

3. May 5, 2013

### Popper

I recall seeing uses of them where there wasn't much of a difference. In general, however, there is a major difference, especially in modern differential geometry. There are two ways to define these geometric objects:

A geometrical object whose components transform in the same way as the differentials of the coordinates, i.e. as $dx^\mu$, is called a contravariant vector, or vector for short. A geometrical object which maps vectors linearly to real numbers is called a 1-form or a covariant vector. A 1-form can also be thought of as a geometrical object whose components transform in the same way as the components of the gradient of a scalar function. The components of a 1-form $w$ are written using a subscript, i.e. as $w_\mu$.
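As a minimal sketch of the two transformation rules (the linear change of coordinates $x' = Ax$, the matrix `A`, the sample components and all function names are illustrative assumptions, not something from the post): vector components transform with the Jacobian, like $dx^\mu$, while 1-form components transform with its inverse transpose, like the components of a gradient. The number $w_\mu v^\mu$ then comes out the same in both coordinate systems.

```python
# Illustrative 2D example of a linear change of coordinates x' = A x.
# A, v and w are made-up values chosen only for the demonstration.

def mat_vec(m, v):
    """Multiply a 2x2 matrix (list of rows) by a 2-component list."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

def inverse(m):
    """Inverse of a 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def transpose(m):
    """Transpose of a 2x2 matrix."""
    return [[m[0][0], m[1][0]], [m[0][1], m[1][1]]]

def pair(w, v):
    """Apply the 1-form w to the vector v: w_mu v^mu."""
    return w[0] * v[0] + w[1] * v[1]

A = [[2.0, 1.0], [0.0, 3.0]]   # Jacobian dx'^mu / dx^nu
v = [1.0, 2.0]                 # vector components v^mu
w = [4.0, -1.0]                # 1-form components w_mu

# Contravariant rule: v'^mu = (dx'^mu / dx^nu) v^nu
v_new = mat_vec(A, v)
# Covariant rule: w'_mu = (dx^nu / dx'^mu) w_nu
w_new = mat_vec(transpose(inverse(A)), w)

# The pairing is a scalar, so both coordinate systems give the same number.
print(pair(w, v), pair(w_new, v_new))
```

The components change in opposite ways, which is exactly why their contraction does not change.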

Let me give you an example. In a flat spacetime, $dX$ is an infinitesimal displacement in spacetime. If this is the displacement of a particle with non-zero proper mass, then dividing it by $dT$, the increment in proper time that it took the particle to undergo the displacement, gives $U = dX/dT$. This object is called the 4-velocity of the particle. For a photon $dT$ is zero, so you have to use what's called an affine parameter $s$. The quantity $dX/ds$ is not a 4-velocity, but it is a vector which is tangent to the trajectory of the photon. The quantity $A = dU/dT = d^2X/dT^2$ is called the 4-acceleration. On a general manifold the components of $A$ are written as $A^\mu = DU^\mu/dT$. If you have a linear function which assigns a real number to each vector at a point of a manifold, then that function is a geometrical object and is given the name 1-form.

The operator $D/dT$ is referred to as the absolute derivative operator. Its definition is beyond the scope of this post. Something called the metric tensor, the components of which are $g_{\mu\nu}$, can be used to convert a vector to a 1-form, and the components $g^{\mu\nu}$ of the inverse of the metric tensor can be used to convert a 1-form to a vector. The details of these things are beyond the scope of this post.
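To make the index conversion concrete, here is a small sketch in flat spacetime. The signature $(+,-,-,-)$, units with $c = 1$, the sample 4-velocity and the function names are all assumptions made for the illustration.

```python
# Minkowski metric with signature (+,-,-,-) and c = 1 (assumed conventions).
ETA = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, -1.0, 0.0, 0.0],
    [0.0, 0.0, -1.0, 0.0],
    [0.0, 0.0, 0.0, -1.0],
]
ETA_INV = ETA  # this particular matrix happens to be its own inverse

def lower_index(g, vec):
    """v_mu = g_{mu nu} v^nu: vector components -> 1-form components."""
    return [sum(g[m][n] * vec[n] for n in range(4)) for m in range(4)]

def raise_index(g_inv, form):
    """v^mu = g^{mu nu} v_nu: 1-form components -> vector components."""
    return [sum(g_inv[m][n] * form[n] for n in range(4)) for m in range(4)]

def contract(form, vec):
    """The scalar form_mu vec^mu."""
    return sum(f * v for f, v in zip(form, vec))

# 4-velocity of a particle moving at 0.6c along x:
# U = gamma * (1, v, 0, 0) with gamma = 1.25.
U = [1.25, 0.75, 0.0, 0.0]

U_lower = lower_index(ETA, U)            # spatial components flip sign
U_again = raise_index(ETA_INV, U_lower)  # raising undoes lowering
norm_sq = contract(U_lower, U)           # U_mu U^mu

print(U_lower, U_again, norm_sq)
```

In a curved spacetime `ETA` would be replaced by the position-dependent $g_{\mu\nu}$ and `ETA_INV` by its matrix inverse; the two helper functions stay the same.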

The quantity $P = m_0 U$, where $U$ is the 4-velocity of a particle and $m_0$ is the particle's proper mass, is the 4-momentum. The component $P^0$ is proportional to the particle's mass, i.e. $P^0 = mc$, where $m$ is the particle's inertial mass (also referred to as relativistic mass). The time component of the 1-form $p$ is proportional to the inertial energy of the particle. By inertial energy I'm referring to the energy of a particle whose 4-acceleration is zero. In this case one shouldn't confuse the inertial mass with the inertial energy, since they are not proportional as they are in special relativity.

Last edited: May 5, 2013

4. May 5, 2013

5. May 5, 2013

### micromass

There is a natural correspondence between covariant and contravariant vectors whenever there is a metric. So in principle, it doesn't matter that much whether you define something as covariant or contravariant. However, some things are much more naturally written as a covariant vector. An example is the gradient. It was first made into a contravariant vector, and only much later did we realize that it has a much better description as a covariant vector. Why? Because the gradient has a coordinate-invariant description as a covariant vector. That is: if you change coordinates, you won't need to change the definition of the gradient.

If you want to see whether something is covariant or contravariant, then there is a very simple test. Just ask yourself: do I want to integrate this object? (Integrate = take line or surface integrals.) If yes, then it should be a covariant vector. If not, then it should be contravariant.

6. May 5, 2013

### tiny-tim

Hi Phil! Welcome to PF!
If you're not going to change the basis (the reference frame), then it makes no difference.

If, say, you take the gradient of a scalar potential (to get a force), then you automatically get a covariant vector without having to think about it.

Technically, whenever you take a dot product of two vectors, you'll find that one is covariant and the other is contravariant, e.g. force "dot" distance …

but I wouldn't worry about it

7. May 5, 2013

### Popper

No. Absolutely not.

That's because there is a higher level of math being used. Authors of modern differential geometry/general relativity texts will say that older texts are wrong to call things like the gradient of a function a vector, or to speak of the inner product of two vectors. I think it's wrong to phrase it like that. To me it's merely a different view/take/definition of things.

8. May 5, 2013

### WannabeNewton

Stop reading stuff that talks about contravariant vectors. It is outdated terminology. Pick up a book on analysis on manifolds (usually an undergrad class).

Last edited: May 5, 2013

9. May 5, 2013

### micromass

10. May 5, 2013

### dx

The gradient is the local linear description of a function M → ℝ

The velocity is the local linear description of a function ℝ → M
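A quick numerical sketch of how the two fit together: composing a curve ℝ → M with a function M → ℝ gives an ordinary function ℝ → ℝ, and its derivative is the gradient applied to the velocity. The choice of M = ℝ², the function `f`, the curve `gamma` and the step size are all assumptions made for the illustration.

```python
import math

def f(x, y):
    """Scalar function on the plane (illustrative choice)."""
    return x * x * y

def grad_f(x, y):
    """Components (df/dx, df/dy) of the differential of f."""
    return (2.0 * x * y, x * x)

def gamma(t):
    """Curve R -> R^2: the unit circle (illustrative choice)."""
    return (math.cos(t), math.sin(t))

def velocity(t):
    """Components of the tangent vector d(gamma)/dt."""
    return (-math.sin(t), math.cos(t))

def pairing(t):
    """The gradient (a covector) applied to the velocity (a vector) at gamma(t)."""
    x, y = gamma(t)
    w = grad_f(x, y)
    v = velocity(t)
    return w[0] * v[0] + w[1] * v[1]

def derivative_along_curve(t, h=1e-6):
    """Central finite-difference estimate of d/dt f(gamma(t))."""
    return (f(*gamma(t + h)) - f(*gamma(t - h))) / (2.0 * h)

t0 = 0.7
print(pairing(t0), derivative_along_curve(t0))
```

The two printed numbers agree to within the finite-difference error, and no metric was needed to combine the gradient with the velocity.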

11. May 6, 2013

### pbhact

Thanks. I'll definitely do that. But... virtually every book or article I look at that has the word tensor in the title talks about contravariant and covariant vectors/tensors. Since I am only studying tensors for fun, I have no professional connections, and had no idea the terminology is outdated.

12. May 6, 2013

### Fredrik

Staff Emeritus
Some of this will be over your head, but I'm posting it anyway, because this is the answer to the question about the gradient. Most of what you need to understand this is explained in those old posts of mine that I recommended above. (You can skip the first one and go directly to the second one if you want. The first one is essentially just a rant about how annoying the old-fashioned terminology is).

The gradient of a function $f:\mathbb R^n\to\mathbb R$ is defined by
$$\nabla f(x)=(f_{,1}(x),\dots,f_{,n}(x)),$$ where for each $i\in\{1,\dots,n\}$, $f_{,i}$ denotes the $i$th partial derivative of $f$.

Since $\mathbb R^n$ is a smooth manifold, we can interpret the numbers $f_{,i}(x)$ as the components of either a tangent vector T or a cotangent vector C. (I will return to this in a minute). The identity map on $\mathbb R^n$ is a coordinate system, so we can use it to define bases for both the tangent space at x and the cotangent space at x. I will denote the identity map by I. The numbers $f_{,i}(x)$ can be interpreted as partial derivatives with respect to the coordinate system $I$ because
$$\frac{\partial}{\partial I^i}\!\bigg|_x f =(f\circ I^{-1})_{,i}(I(x))=f_{,i}(x).$$
Tangent vector:
$$T=f_{,i}(x)\frac{\partial}{\partial I^i}\!\bigg|_x =\left(\frac{\partial}{\partial I^i}\!\bigg|_x f\right)\frac{\partial}{\partial I^i}\!\bigg|_x$$
Cotangent vector:
$$C=f_{,i}(x)\,\mathrm dI^i\big|_x =\left(\frac{\partial}{\partial I^i}\!\bigg|_x f\right)\,\mathrm dI^i\big|_x$$
To "transform" the two right-hand sides to another coordinate system $J$ is to make the substitution $I\to J$. The reason why the gradient is considered a cotangent vector rather than a tangent vector is that, in general,
$$\left(\frac{\partial}{\partial J^i}\!\bigg|_x f\right)\frac{\partial}{\partial J^i}\!\bigg|_x\neq T,$$
but
$$\left(\frac{\partial}{\partial J^i}\!\bigg|_x f\right)\,\mathrm dJ^i\big|_x=C.$$
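Here is a small numerical sketch of that last statement, under assumptions that are mine rather than part of the argument above: the second coordinate system is taken to be linear, $J = A\circ I$ with a fixed invertible matrix `A`, and the partial derivatives `grad_I` are made-up numbers. Everything is expressed back in the $I$ bases so that the results can be compared directly.

```python
def mat_vec(m, v):
    """Multiply a 2x2 matrix (list of rows) by a 2-component list."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

def inverse(m):
    """Inverse of a 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def transpose(m):
    """Transpose of a 2x2 matrix."""
    return [[m[0][0], m[1][0]], [m[0][1], m[1][1]]]

A = [[2.0, 0.0], [0.0, 1.0]]   # hypothetical coordinate change J = A x
grad_I = [3.0, 5.0]            # f_{,i}(x) at some point (made-up values)

A_inv = inverse(A)

# Chain rule: df/dJ^i = (dI^k/dJ^i) df/dI^k, i.e. the inverse transpose of A.
grad_J = mat_vec(transpose(A_inv), grad_I)

# sum_i (df/dJ^i) d/dJ^i written in the d/dI^k basis,
# using d/dJ^i = (A^{-1})^k_i d/dI^k.
T_from_J = mat_vec(A_inv, grad_J)

# sum_i (df/dJ^i) dJ^i written in the dI^k basis,
# using dJ^i = A^i_k dI^k.
C_from_J = mat_vec(transpose(A), grad_J)

print("T built from I:", grad_I, " T built from J:", T_from_J)
print("C built from I:", grad_I, " C built from J:", C_from_J)
```

The covector built in the $J$ system reproduces the original components, while the "tangent vector" built by the same recipe does not. The two agree only when `A` is an orthogonal matrix.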

13. May 6, 2013

### Fredrik

Staff Emeritus
It is, but that doesn't mean that it's not used anymore. Some physicists still use it for some unfathomable reason. So you may need to know the old approach too, but the best way to learn it is to study the modern approach (which is described in my posts).

14. May 6, 2013

### pbhact

Thanks very much Fredrik for this comment and your previous explanation. Very helpful.

15. May 6, 2013

### WannabeNewton

It isn't your fault by any means so sorry if you thought I meant that. I was just saying you should know that the modern terminology is different and that if you end up reading modern textbooks on the subject (which you should by the way!), the outdated terminology will not show up except maybe in passing.

16. May 6, 2013

### pbhact

I appreciate (and will follow) your advice about looking into more modern texts on the subject. Thanks.

17. May 6, 2013

### Staff: Mentor

Unlike the others who have responded to this thread, I totally agree with what you have expressed in the first paragraph. Maybe I learned all this in an antiquated way, but it has served me well in my extensive engineering career. And if the modern approach is more general and sophisticated, it still could not have changed the basic fundamental characteristics of vectors and tensors that you elucidated in the first paragraph.

I'd now like to say something about the gradient. The old-time approach started out by looking at the differential change in a scalar field between two neighboring points in the space (or space-time):

$$df=\frac{\partial f}{\partial x^k}dx^k$$

Now, if $\vec{ds}$ represents a differential position vector between the two neighboring points, then $$\vec{ds}=\vec{a_i}dx^i$$
where the $\vec{a_i}$ are the coordinate basis vectors.

Based on the above two equations, we can write:
$$df=\vec{ds}\cdot \nabla f$$

According to the above equations, the most natural way of writing the gradient of f is to write:

$$\nabla f = \vec{a^j}\frac{\partial f}{\partial x^j}$$

where the $\vec{a^j}$ are the dual basis vectors (aka basis 1-forms), which satisfy $\vec{a^j}\cdot\vec{a_i}=\delta^j_i$.

This is part of the motivation for referring to the gradient as an entity expressed in terms of covariant components $\frac{\partial f}{\partial x^j}$.
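A small numerical check of why the dual basis is the natural choice, assuming a particular oblique (non-orthogonal) basis in the plane and a linear scalar field. Both choices, and all the names in the code, are illustrative assumptions.

```python
def dot(u, v):
    """Ordinary Euclidean dot product in the plane."""
    return u[0] * v[0] + u[1] * v[1]

def dual_basis(a1, a2):
    """Return (a^1, a^2) with a^i . a_j = delta^i_j for a 2D basis (a_1, a_2)."""
    det = a1[0] * a2[1] - a1[1] * a2[0]
    b1 = (a2[1] / det, -a2[0] / det)
    b2 = (-a1[1] / det, a1[0] / det)
    return b1, b2

def combine(e1, e2, c):
    """The vector c[0]*e1 + c[1]*e2."""
    return (e1[0] * c[0] + e2[0] * c[1], e1[1] * c[0] + e2[1] * c[1])

a1, a2 = (1.0, 0.0), (1.0, 1.0)   # oblique coordinate basis vectors a_i
b1, b2 = dual_basis(a1, a2)       # dual basis vectors a^j

df_dx = (3.0, 2.0)                # df/dx^j for f = 3 x^1 + 2 x^2
dx = (0.1, 0.2)                   # coordinate differentials dx^i

df = df_dx[0] * dx[0] + df_dx[1] * dx[1]   # df = (df/dx^k) dx^k

ds = combine(a1, a2, dx)          # ds = a_i dx^i
grad = combine(b1, b2, df_dx)     # grad f = a^j df/dx^j
naive = combine(a1, a2, df_dx)    # same components on the wrong basis

print(df, dot(ds, grad), dot(ds, naive))
```

Only the expansion on the dual basis reproduces $df$; attaching the same partial derivatives to the $\vec{a_i}$ gives a different number as soon as the basis is not orthonormal.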

18. May 6, 2013

### WannabeNewton

Thinking of one-forms and vectors in terms of how they transform may be fine for computational purposes but the modern notions are far superior in terms of elegance and formulating physical concepts in a coordinate-free way.

To be more precise, let $M$ be a smooth manifold and let $p\in M$. The vector space $T_{p}(M)$ is the set of all tangent vectors to $M$ at $p$ (more precisely, it is the set of all derivations at $p$, i.e. the linear functionals $X\in (C^{\infty}(M))^{*}$ such that $X(fg) = f(p)Xg + g(p)Xf$ for all $f,g\in C^{\infty}(M)$).

Before going further, let me define the pushforward of a tangent vector $X\in T_{p}(M)$ under a smooth map $F:M\rightarrow N$, where $N$ is another smooth manifold, as $F_{*}:T_{p}(M)\rightarrow T_{F(p)}(N)$ which acts on $X$ by $(F_{*}X)(f) = X(f\circ F), f\in C^{\infty}(N)$. This is in fact a derivation at $F(p)$.

Now, let $\gamma:J\subseteq \mathbb{R}\rightarrow M$ be a smooth curve and define the tangent vector $\dot{\gamma}(t_0)$ to $\gamma$ at $t_{0}\in J$ by $\dot{\gamma}(t_0)(f) = (\gamma_{*}\frac{\mathrm{d} }{\mathrm{d} t}|_{t_0})f = \frac{\mathrm{d} }{\mathrm{d} t}|_{t_0}(f\circ \gamma)$ where $\frac{\mathrm{d} }{\mathrm{d} t}|_{t_0}$ is the usual derivative in $\mathbb{R}$. Note that $\dot{\gamma}(t_0)$ is a derivation at $p = \gamma(t_0)$ by the usual product rule for ordinary derivatives, so $\dot{\gamma}(t_{0})\in T_{p}(M)$. This is why we say the 4-velocity along the worldline of a test particle in space-time (or just the velocity vector of some smooth curve in general) is a tangent vector. The above are what the outdated terminology would label as "contravariant vectors".

There is a special set of tangent vectors to $M$ at $p$ which prove invaluable for computations in coordinates. Just to develop some background, let $(U,\varphi)$ be a smooth chart containing $p$. Here $U\subseteq M$ is an open subset of $M$ that we usually call a coordinate domain and $\varphi:U\rightarrow \varphi(U)\subseteq \mathbb{R}^{n}$ is a homeomorphism which we usually call a coordinate map. So $\varphi(p) = (x^{1}(p),...,x^{n}(p))\in \mathbb{R}^{n}$ is called the coordinate representation of $p$ under $\varphi$ where the component maps $x^{i}:U\rightarrow \mathbb{R}$ are the $i$th coordinate functions.

Now define the tangent vectors $\frac{\partial }{\partial x^{i}}|_{p}$ by
$\frac{\partial }{\partial x^{i}}|_{p}(f) = (\varphi^{-1})_{*}\frac{\partial }{\partial x^{i}}|_{\varphi(p)}(f) = \frac{\partial }{\partial x^{i}}|_{\varphi(p)}(f\circ \varphi^{-1})$

(note that $f\circ \varphi^{-1}:\varphi(U)\subseteq \mathbb{R}^{n}\rightarrow \mathbb{R}$ so $\frac{\partial }{\partial x^{i}}|_{\varphi(p)}$ is just the regular partial derivative in $\mathbb{R}^{n}$).

Now denote the dual space to $T_p (M)$ by $T^{*}_p (M)$ (this is then called the cotangent space to $M$ at $p$). We call the elements $\omega \in T^{*}_{p}(M)$ covectors. So by definition of the dual space, these are just the linear functionals $\omega:T_p (M)\rightarrow \mathbb{R}$. Now let $f\in C^{\infty}(M)$ be a smooth scalar field. The differential of $f$ at $p$, denoted by $df(p)$, is the map $df(p):T_p(M)\rightarrow \mathbb{R}$ given by $df(p)(X) = X(f),X\in T_p(M)$. As you can see, $df(p)\in T_{p}^{*}(M)$ i.e. it is a covector (a linear functional). The above are what the outdated terminology would label as "covariant vectors".

Now to tie this back in to what you asked before, let $M = \mathbb{R}^{n}$ and let $f:\mathbb{R}^{n}\rightarrow \mathbb{R}$ be smooth. Let $p\in \mathbb{R}^{n}$ and take the global chart $(\mathbb{R}^{n},\text{id}_{\mathbb{R}^{n}})$. Note that $df(p)(\frac{\partial }{\partial x^{i}}|_{p}) = \frac{\partial }{\partial x^{i}}|_{p}(f) = \frac{\partial }{\partial x^{i}}|_{\text{id}_{\mathbb{R}^{n}}(p)}(f\circ \text{id}_{\mathbb{R}^{n}}^{-1}) = \frac{\partial f}{\partial x^{i}}(p)$ i.e. the $i$th component of $df(p)$ given by $(df(p))_{i} = df(p)(\frac{\partial }{\partial x^{i}}|_p)$ is just $(df(p))_{i} = \frac{\partial f}{\partial x^{i}}(p)$. This is nothing more than the usual gradient from vector calculus, so this is why we say the gradient of a smooth function is in fact a covector!

Last edited: May 6, 2013

19. May 6, 2013

### Jorriss

Can one nominate a post to be listed as a FAQ? I think WBN's post would be a useful FAQ. It was very helpful for me.

20. May 7, 2013

### HallsofIvy

I would like to add that as long as we require that the coordinate axes be orthogonal with unit-length basis vectors, i.e. Cartesian coordinates ("Euclidean Tensors"), there is NO difference between the components of "covariant" and "contravariant" vectors.
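A quick sketch of when the two kinds of components coincide, using the metric $g_{ij}=\vec a_i\cdot\vec a_j$ built from a basis. The three sample bases and the function names are assumptions chosen for the illustration. Note that the second basis is orthogonal but not of unit length, and there the components already differ by scale factors.

```python
def dot(u, v):
    """Ordinary Euclidean dot product in the plane."""
    return u[0] * v[0] + u[1] * v[1]

def metric(basis):
    """Metric components g_ij = a_i . a_j for the given basis vectors."""
    return [[dot(ai, aj) for aj in basis] for ai in basis]

def lower(g, comps):
    """Covariant components v_i = g_ij v^j from contravariant components v^j."""
    n = len(comps)
    return [sum(g[i][j] * comps[j] for j in range(n)) for i in range(n)]

v_up = [1.0, 1.0]   # contravariant components v^j (made-up values)

orthonormal = [(0.0, 1.0), (-1.0, 0.0)]   # rotated Cartesian axes
stretched = [(2.0, 0.0), (0.0, 1.0)]      # orthogonal, but a_1 has length 2
oblique = [(1.0, 0.0), (1.0, 1.0)]        # not orthogonal

for name, basis in [("orthonormal", orthonormal),
                    ("stretched", stretched),
                    ("oblique", oblique)]:
    print(name, v_up, lower(metric(basis), v_up))
```

Only in the orthonormal case is the metric the identity matrix, so only there do the covariant and contravariant components agree.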