# Change of Basis, Covariant Vectors, and Contravariant Vectors

1. Jul 31, 2015

### putongren

I'm having trouble understanding those concepts in the title. Can someone explain those concepts in an easy to understand manner? Please don't refer me to a wikipedia page. I know some linear algebra and multi-variable calculus.

Thank you.

2. Jul 31, 2015

### micromass

Staff Emeritus
So when you read about those in the wiki page or some textbook, what parts did you not understand?

3. Aug 1, 2015

### Fredrik

Staff Emeritus
A good place to start studying tensors is chapter 3 of "A first course in general relativity" by Schutz. Another place that looks good (I know that the rest of the book is very good) is chapter 8 of "Linear algebra done wrong" by Treil.

I'm going to quote myself. This was first posted in this thread, posts #11, #23 and #24.

Post #11:

The components of a vector $v$ with respect to an ordered basis $(e_1,\dots,e_n)$ are the unique real numbers $v^1,\dots,v^n$ such that $v=\sum_{i=1}^n v^i e_i$.

I will elaborate a bit...

Let $V$ be an n-dimensional vector space over $\mathbb R$. Let $V^*$ be the set of linear functions from $V$ to $\mathbb R$. Define addition and scalar multiplication on $V^*$ by $(f+g)(v)=f(v)+g(v)$ and $(af)(v)=a(f(v))$ for all $f,g\in V^*$, all $a\in\mathbb R$ and all $v\in V$. These definitions turn $V^*$ into a vector space. The $V^*$ defined this way is called the dual space of $V$.

Let $(e_i)_{i=1}^n$ be an ordered basis for $V$. (The notation denotes the n-tuple $(e_1,\dots,e_n)$). It's conventional to put these indices downstairs, and to put the indices on components of vectors in $V$ upstairs. For example, if $v\in V$, then we write $v=v^i e_i$. I'm using the summation convention here, so the right-hand side really means $\sum_{i=1}^n v^i e_i$.

For each $i\in\{1,\dots,n\}$, we define $e^i\in V^*$ by $e^i(e_j)=\delta^i_j$. It's not hard to show that $(e^i)_{i=1}^n$ is an ordered basis for $V^*$. The ordered basis $(e^i)_{i=1}^n$ is called the dual basis of $(e_i)_{i=1}^n$. It's conventional to put the indices on components of vectors in $V^*$ downstairs. For example, if $f\in V^*$, then we write $f=f_ie^i$.
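To make the dual basis concrete, here is a minimal sketch in plain Python. It assumes $V=\mathbb R^2$, with vectors written as coordinate pairs and functionals written as rows that act by the dot product (that identification, the particular basis, and the helper names are my own illustration, not part of the definitions above). In that setting, if the basis vectors are the columns of a matrix, the dual basis functionals are the rows of its inverse.

```python
# Hypothetical illustration: V = R^2, basis vectors given in standard coordinates.
e1 = (2.0, 0.0)
e2 = (1.0, 1.0)

def inverse_2x2(a, b, c, d):
    """Inverse of the matrix [[a, b], [c, d]], returned as two rows."""
    det = a * d - b * c
    if det == 0:
        raise ValueError("vectors are linearly dependent, so they are not a basis")
    return ((d / det, -b / det), (-c / det, a / det))

# The matrix with e1, e2 as columns is [[e1[0], e2[0]], [e1[1], e2[1]]].
# The rows of its inverse are the dual basis functionals e^1, e^2.
dual = inverse_2x2(e1[0], e2[0], e1[1], e2[1])

def apply(functional, vector):
    """Evaluate a functional (row) on a vector (column)."""
    return sum(f * x for f, x in zip(functional, vector))

# pairing[i][j] = e^i(e_j), which should be the Kronecker delta.
pairing = [[apply(dual[i], e) for e in (e1, e2)] for i in range(2)]
print(pairing)
```

The printed table is the $2\times 2$ identity, which is exactly the condition $e^i(e_j)=\delta^i_j$.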

Exercise: Find an interesting way to rewrite each of the following expressions:

a) $e^i(v)$
b) $f(e_i)$

Post #23:

If $(e_i)_{i=1}^n$ and $(e_i')_{i=1}^n$ are ordered bases for $V$, then for all $i$, there must exist numbers $M_i^j$ such that $e_i'=M_i^j e_j$. (In other words, we can always write the new basis vectors as linear combinations of the old).

Post #24:

Now let $M$ be the matrix such that for all $i,j$, the component on row $i$, column $j$ is $M^i_j$. Recall that the definition of matrix multiplication is $(AB)^i_j=A^i_k B^k_j$. Let $v\in V$ be arbitrary. We have
$$v=v^j e_j=v^i{}' e_i{}' =v^i{}' M^j_i e_j,$$ and therefore $v^j=v^i{}' M^j_i$. This implies that
$$(M^{-1})^k_j v^j =v^i{}' (M^{-1})^k_j M^j_i =v^i{}' (M^{-1}M)^k_i =v^i{}' \delta^k_i =v^k{}'.$$ So the n-tuple of components $(v^1,\dots,v^n)$ transforms according to
$$v^i{}'= (M^{-1})^i_j v^j.$$ The fact that the matrix that appears here is $M^{-1}$ rather than $M$ is the reason why an n-tuple of components of an element of $V$ is said to transform contravariantly. The terms "covariant" and "contravariant" should be interpreted respectively as "the same as the ordered basis" and "the opposite of the ordered basis".

It's easy to see that the dual basis transforms contravariantly. Let $N$ be the matrix such that $e^i{}' =N^i_j e^j$. We have
$$\delta^i_j =e^i{}'(e_j{}')=N^i_k e^k (M_j^l e_l) = N^i_k M_j^l e^k{}(e_l{}) =N^i_k M_j^l \delta^k_l =N^i_k M_j^k =(NM)^i_j.$$ This implies that $N=M^{-1}$. So we have
$$e^i{}' =(M^{-1})^i_j e^j.$$ Now we can easily see that an n-tuple of components of an arbitrary $f\in V^*$ transforms covariantly. We can prove it in a way that's very similar to how we determined the transformation properties of the n-tuple of components of $v$, but the simplest way is to use the formula $f_i=f(e_i)$, which I left as an easy exercise in post #11.
$$f_i{}' =f(e_i{}')=f(M_i^j e_j) =M_i^j f(e_j)= M_i^j f_j.$$ Note that what's "transforming" under a change of ordered basis in these examples are n-tuples of real numbers or n-tuples of vectors (in $V$ or $V^*$). In the case of a tensor of type $(k,l)$, what's transforming isn't the tensor, but its $n^{k+l}$-tuple of components with respect to the ordered basis $(e_i)_{i=1}^n$.
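A small numerical check of these transformation rules, as a sketch only: the 2D matrix and the component values below are made up for illustration, and `M[j][i]` is assumed to store $M^j_i$ (row $j$, column $i$), so that the new basis vector $e_i{}'$ is column $i$ of $M$.

```python
# Hypothetical 2D example; M[j][i] holds M^j_i (row j, column i),
# so the new basis vector e_i' = M^j_i e_j is column i of M.
M = [[2.0, 1.0],
     [0.0, 1.0]]

def inverse_2x2(m):
    """Inverse of a 2x2 matrix given as a list of two rows."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

M_inv = inverse_2x2(M)

v = [3.0, 4.0]   # components v^j with respect to the old basis
f = [5.0, -1.0]  # components f_j with respect to the old dual basis

# Contravariant rule: v'^i = (M^{-1})^i_j v^j
v_new = [sum(M_inv[i][j] * v[j] for j in range(2)) for i in range(2)]

# Covariant rule: f'_i = M^j_i f_j
f_new = [sum(M[j][i] * f[j] for j in range(2)) for i in range(2)]

# The number f(v) = f_i v^i does not depend on the ordered basis.
pairing_old = sum(f[i] * v[i] for i in range(2))
pairing_new = sum(f_new[i] * v_new[i] for i in range(2))
print(v_new, f_new, pairing_old, pairing_new)
```

The two n-tuples change in opposite ways, and the factors of $M$ and $M^{-1}$ cancel in $f_i v^i$, which is why that number comes out the same in both bases.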

Of course, one can take the point of view that these $n$-tuples or $n^{k+l}$-tuples are the tensors, or rather, that the function that associates tuples with ordered bases is what should be called a tensor. I'm not a fan of that view myself. I consider it inferior and obsolete. However, there isn't anything fundamentally wrong with it. The real problem is that it's so hard to find an explanation of this view that isn't unbelievably bad.

4. Aug 1, 2015

### putongren

Here is an equation that I pasted that is related to those concepts that are mentioned in the title.

The gradient component of the equation is the contravariant vector, and the partial derivative component of the equation is the covariant vector. Or is it the other way around?
What is the geometric interpretation of the contravariant and covariant vectors if we define them using the equation that I pasted? I'm trying to understand these vectors in a more visual way. What is the significance of this interpretation for understanding this topic?

Also, how is the change of basis used in understanding some of the math in physics?

5. Aug 1, 2015

### Fredrik

Staff Emeritus
I don't understand the question. The triple $\left(\frac{\partial}{\partial x^1},\frac{\partial}{\partial x^2},\frac{\partial}{\partial x^3}\right)$ transforms covariantly (i.e. like the ordered basis...and the reason is simply that this triple is the ordered basis). Since f doesn't transform, the triple $\left(\frac{\partial f}{\partial x^1},\frac{\partial f}{\partial x^2},\frac{\partial f}{\partial x^3}\right)$ (i.e. $\nabla f$) also transforms covariantly.
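For what it's worth, the covariance of the gradient components can be spelled out with the chain rule. The sketch below assumes a second coordinate system $x^1{}',x^2{}',x^3{}'$ (notation added here for illustration) such that the $x^j$ are smooth, invertible functions of the $x^i{}'$:

$$\frac{\partial}{\partial x^i{}'} =\frac{\partial x^j}{\partial x^i{}'}\frac{\partial}{\partial x^j},\qquad \frac{\partial f}{\partial x^i{}'} =\frac{\partial x^j}{\partial x^i{}'}\frac{\partial f}{\partial x^j}.$$

Comparing with $e_i{}'=M_i^j e_j$ and $f_i{}'=M_i^j f_j$, the role of $M^j_i$ is played by $\partial x^j/\partial x^i{}'$ in both formulas, so the triple of partial derivatives of $f$ transforms the same way as the ordered basis.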

They're not defined using that equation.

I haven't found visuals to be very useful. What helped me understand tensors was to do calculations like the ones I quoted in my previous post.

6. Aug 1, 2015

### putongren

In https://upload.wikimedia.org/math/8/b/2/8b23e3e6769bf4b9c638e85396983497.png, $e_i$ is the basis, is that correct? $v^i$ is the contravariant vector, is that correct? What is V?

If $e_i$ is the basis, then is $e^i$ the dual basis?

Last edited by a moderator: May 7, 2017
7. Aug 2, 2015

### Fredrik

Staff Emeritus
$e_i$ is a basis vector. $\{e_1,\dots,e_n\}$ is a basis. $(e_1,\dots,e_n)$ is an ordered basis. The dual of $(e_1,\dots,e_n)$ is denoted by $(e^1,\dots,e^n)$.

$v$ is a vector, i.e. an element of some vector space V. $v^i$ is its $i$th component with respect to the ordered basis $(e_1,\dots,e_n)$. The n-tuple $(v^1,\dots,v^n)$ transforms contravariantly, and is therefore called a contravariant vector by some. I don't like this terminology and just call it the n-tuple of components of $v$ with respect to $(e_1,\dots,e_n)$.

$v$ and $e_i$ are elements of the same vector space V. $e^i$ is an element of its dual space V* (defined in my first post in the thread).

8. Aug 8, 2015

### FactChecker

Tensors are the most natural thing in the world. The idea of tensors is to represent a physical entity (the length of an object, the density of a fluid, a rate of motion) in a way that is independent of the coordinate system used. After all, the physical entity remains the same, whether you measure it in one coordinate system or another. But any two measurements of the same thing in two different coordinate systems must transform to each other in the way defined by a tensor. Suppose you measure a 2 meter object. If you measured it in millimeters, it would be 2000 millimeters. So the smaller the units of the coordinate system, the larger the length measurement in that system. That is contravariant: small coordinate units => larger numbers. On the other hand, consider a density or rate. Suppose a rope costs \$2/meter. Now change that to millimeters: \$0.002/millimeter. That is covariant: small coordinate units => smaller numbers. Actually, it is contravariant in cost and covariant in length.
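The rope example can be played out in a few lines of Python. This is only a sketch using the numbers above; the variable names are made up, and the "total cost" line is an extra step I'm adding to show what stays fixed.

```python
# Hypothetical numbers taken from the rope example: 2 m of rope at $2 per meter.
MM_PER_M = 1000

length_m = 2.0       # length measured in meters
price_per_m = 2.0    # dollars per meter

# Smaller unit => larger number for the length (scales by 1000).
length_mm = length_m * MM_PER_M
# Smaller unit => smaller number for the rate (scales by 1/1000).
price_per_mm = price_per_m / MM_PER_M

# The total cost does not care about the unit: the two factors cancel.
total_from_m = price_per_m * length_m
total_from_mm = price_per_mm * length_mm
print(length_mm, price_per_mm, total_from_m, total_from_mm)
```

The length number and the rate number scale in opposite directions, while their product (the cost of the whole rope) is the same in either unit.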

Tensors are so natural that you may wonder if there is any mathematical vector that is not a tensor. There are simple ones. Suppose I define the unit vector as (1,0,0) in any 3 dimensional coordinate system. That is a perfectly valid math definition. It is (1,0,0) in meters and (1,0,0) in millimeters. It is not a tensor.

9. Aug 11, 2015

### mathwonk

If A is an mxn matrix, it defines a linear map from R^n to R^m, where if v is an n diml column vector, then Av is an m diml column vector. On the other hand, if w is an m diml row vector, or covector, then wA defines an n diml row vector. Now m diml row vectors, or covectors, define linear maps on the space R^m, so give elements of the dual space R^m*. Thus the same mxn matrix A defines a map R^n-->R^m on column vectors, and another map A*:R^m*-->R^n* on row vectors, depending on which side of A you multiply. One case is called covariant, and the other contravariant, but physicists and mathematicians differ on which is which. Change of basis involves multiplying these expressions on right and left by an invertible matrix and its inverse. (I doubt if this is any use to you, and apologize in advance.)
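Here is a tiny sketch of the two maps in plain Python, with a made-up 2x3 matrix (so m = 2, n = 3). The particular numbers and variable names are only for illustration.

```python
# Hypothetical 2x3 example (m = 2, n = 3).
A = [[1, 2, 0],
     [0, 1, 3]]
v = [1, 0, 2]   # column vector in R^3
w = [4, -1]     # row vector (covector) on R^2

# A acting on columns from the left: R^3 -> R^2
Av = [sum(A[i][j] * v[j] for j in range(3)) for i in range(2)]
# A acting on rows from the right: R^2* -> R^3*
wA = [sum(w[i] * A[i][j] for i in range(2)) for j in range(3)]

# Both ways of bracketing w A v give the same number.
left = sum(wA[j] * v[j] for j in range(3))    # (wA) v
right = sum(w[i] * Av[i] for i in range(2))   # w (Av)
print(Av, wA, left, right)
```

The column vector is pushed forward from R^3 to R^2, the row vector is pulled back from R^2* to R^3*, and the pairing agrees either way.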

If you have a smooth map f from an n manifold N to an m diml manifold M, a coordinate system lets you work as if your map went from R^n to R^m. In particular it gives you bases of the tangent spaces to both manifolds and lets you work as if those also are R^n and R^m. The matrix of the differential of your map f, is an m by n matrix of partials defining a linear map from the tangent space of N (at p) to that of M (at f(p)). The jth row consists of the n partials of the jth component function of f.
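As a rough illustration of that matrix of partials, here is a sketch that approximates it by central differences for a made-up map from R^2 to R^3. The map `f`, the point, and the step size `h` are all assumptions chosen for the example.

```python
# Hypothetical smooth map f: R^2 -> R^3 (so n = 2, m = 3).
def f(x, y):
    return (x * y, x + y, x * x)

def jacobian(func, point, h=1e-6):
    """Central-difference approximation of the m-by-n matrix of partials."""
    n = len(point)
    m = len(func(*point))
    J = [[0.0] * n for _ in range(m)]
    for col in range(n):
        plus = list(point)
        minus = list(point)
        plus[col] += h
        minus[col] -= h
        f_plus, f_minus = func(*plus), func(*minus)
        for row in range(m):
            J[row][col] = (f_plus[row] - f_minus[row]) / (2 * h)
    return J

J = jacobian(f, (2.0, 3.0))
# Row j holds the n partials of the jth component function of f.
# Analytically the rows at (2, 3) are (y, x) = (3, 2), (1, 1), (2x, 0) = (4, 0).
for row in J:
    print([round(entry, 6) for entry in row])
```

Multiplying this 3x2 matrix by a column of two numbers is the linear map on tangent vectors described above.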

Thus tangent vectors behave like column vectors, while cotangent vectors behave like row vectors. Since tangent vectors thus map in the same direction as f, mathematicians call them “covariant” (same direction), and call cotangent vectors “contravariant” (opposite direction). Basic tangent vectors to N are defined by the partial symbols ∂/∂x_1,...,∂/∂x_n. But beware, people differ on when to put indices up or down. Fredrik also seems to be using the terms co and contra in the opposite sense from this. I recommend listening to him, not me.

Anyway, I will opt out now.

Last edited: Aug 11, 2015
10. Aug 12, 2015

### HOI

This reminds me of the old Groucho Marx joke: "this is so easy a six year old could understand it! Quick, go find a six year old."

11. Aug 13, 2015

### rjvsngh

I assume you're looking for a way to see covariant vectors. Since they are identified with linear functionals on a vector space V, you could try to figure how to visualize linear functionals. (See https://en.wikipedia.org/wiki/Linear_form#Visualizing_linear_functionals)

It is difficult to visualise the covariant basis vectors because they live in a different space ($V^*$) from the primal vectors (which belong to $V$). However, if you add an inner product to $V$, then it is possible to draw dual vectors in the primal space. It is true, at least for finite-dimensional inner product spaces, that for any $f \in V^*$ there is some $u \in V$ such that $(u,v) = f(v)$ for all $v \in V$. This $u \in V$ is the vector corresponding to the dual vector $f \in V^*$. Here $(\cdot,\cdot)$ is an inner product on $V$.
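One way to see where such a $u$ comes from, sketched under the extra assumptions that $V$ is a real inner product space and that $(e_i)_{i=1}^n$ is an orthonormal basis of $V$ (the basis and the components $v^i$ are notation added for this example): take $u=\sum_{i=1}^n f(e_i)\,e_i$. Then for any $v=\sum_{i=1}^n v^i e_i$, using $(e_i,v)=v^i$ and the linearity of $f$,

$$(u,v)=\sum_{i=1}^n f(e_i)\,(e_i,v)=\sum_{i=1}^n f(e_i)\,v^i=f\Big(\sum_{i=1}^n v^i e_i\Big)=f(v).$$

So in an orthonormal basis the components of $u$ are just the numbers $f(e_i)$, and the arrow you would draw for $f$ is the arrow for $u$.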

While I am not a mathematician, I hope I'm right!