# Covariant and Contravariant

1. Jan 1, 2015

### TimeRip496

I am reading some notes about tensors and came across the following passage, which the notes do not elaborate on. As a result I don't quite understand it.

Here it is : " Note that we mark the covariant basis vectors with an upper index and the contravariant basis vectors with a lower index. This may sounds counter-intuitive ('did we not decide to use upper indices for contravariant vectors?') but this is precisely what we mean with the 'different meaning of the indices' here: this time they label the vectors and do not denote their components. "

I can follow except the last sentence and I do not know why. Can anyone enlighten me?

2. Jan 1, 2015

### ShayanJ

Using the Einstein summation convention, we have $\vec A=A^i e_i=A^1 e_1+A^2 e_2+A^3 e_3$. As you know, the components of this vector ($A^i$) are scalars (they are functions of the spatial coordinates). So what makes $\vec A$ a vector? It's the basis vectors $e_i$. They really are vectors, but basis vectors, which means nothing more than a distinguished set of linearly independent vectors.
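A minimal numeric sketch of this decomposition (the particular basis below is an arbitrary, illustrative choice; any invertible one works):

```python
# A = A^i e_i: the components A^i are plain numbers, and the basis vectors
# (here an arbitrary, illustrative choice) carry the "vectorness".
import numpy as np

# Columns of E are the basis vectors e_1, e_2, e_3 (any invertible choice works).
E = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 2.0]])

A = np.array([3.0, 1.0, 4.0])            # the vector A in standard coordinates

# The components A^i with respect to this basis solve A = sum_i A^i e_i:
components = np.linalg.solve(E, A)

# Recombining the numbers A^i with the basis vectors reproduces A itself.
assert np.allclose(E @ components, A)
```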

3. Jan 1, 2015

### TimeRip496

Do you mean that assigning upper indices to contravariant vector components while assigning lower indices to contravariant basis vectors is just a means to distinguish them from each other? Sorry if I didn't follow you.

4. Jan 1, 2015

### ShayanJ

Yeah, it's basically a convention, so you shouldn't look for reasons here.
But what you should understand is that $A^i$ is not a vector. It's only a general term referring to one of the components of the vector $\vec A$, and so it's a scalar. It's just that people work with the components of vectors.

5. Jan 1, 2015

### TimeRip496

Ok thanks a lot!

6. Jan 1, 2015

### stevendaryl

Staff Emeritus
If you've taken vector calculus, you've probably seen a 2-D vector $\vec{A}$ written in the form $A^x \hat{x} + A^y \hat{y}$. In that notation, $\hat{x}$ means a "unit vector" in the x-direction, while the coefficient $A^x$ means the component of $\vec{A}$ in that direction. When you get to relativity, the notion of a "unit vector" is no longer well-defined, so the more general notion is a "basis vector". You would write an arbitrary vector $\vec{A}$ in the form $\sum_\mu A^\mu e_\mu$, where the sum ranges over all basis vectors (there are 4 in SR: 3 spatial directions and one time direction). By convention, people leave off the $\sum_\mu$: it's assumed that if an index appears in both lowered and raised forms, it is summed over. So people would just write a vector as $A^\mu e_\mu$.

Now, although the components $A^\mu$ are different in different coordinate systems (which is why people say that the vector "transforms" when you change coordinates), the combination $A^\mu e_\mu$ is actually coordinate-independent. The vector has the same value, as a vector, in every coordinate system. What that means is that if you change coordinates from $x^\mu$ to some new coordinates $x^\alpha$, the value of $\vec{A}$ doesn't change:

$A^\mu e_\mu = A^\alpha e_\alpha$

The components $A^\mu$ change, and the basis vectors $e_\mu$ change, but the combination remains the same.

We can relate the old and new components through a matrix $L^\alpha_\mu$:

$A^\alpha = L^\alpha_\mu A^\mu$

If we use this matrix to rewrite $A^\alpha$ in our equation relating the two vectors, we see:

$A^\mu e_\mu = L^\alpha_\mu A^\mu e_\alpha = A^\mu (L^\alpha_\mu e_\alpha)$

Note that since this equation holds for any vector $\vec{A}$, it must mean that

$e_\mu = L^\alpha_\mu e_\alpha$

or if we let $(L^{-1})^\mu_\alpha$ be the inverse matrix, we can apply it to both sides to get:

$(L^{-1})^\mu_\alpha e_\mu = e_\alpha$

So we have the pair of transformation equations:
1. $A^\alpha = L^\alpha_\mu A^\mu$
2. $e_\alpha = (L^{-1})^\mu_\alpha e_\mu$
The basis vectors $e_\mu$ transform in the opposite way from the components $A^\mu$, so that the combination $A^\mu e_\mu$ has the same value in every coordinate system.
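The two rules can be checked numerically. In this sketch the 4x4 matrix $L$ is a hypothetical change-of-basis matrix generated at random (it is invertible with probability 1):

```python
# Numeric check: components pick up L, basis vectors pick up L^{-1},
# and the combination A^mu e_mu is unchanged (illustrative 4x4 case).
import numpy as np

rng = np.random.default_rng(0)

E = np.eye(4)                            # old basis vectors as columns
L = rng.standard_normal((4, 4))          # a hypothetical change-of-basis matrix
A_old = np.array([1.0, 2.0, 3.0, 4.0])   # components A^mu in the old basis

A_new = L @ A_old                        # rule 1: A^alpha = L^alpha_mu A^mu
E_new = E @ np.linalg.inv(L)             # rule 2: e_alpha = (L^-1)^mu_alpha e_mu

# The combination A^mu e_mu is the same vector in both bases.
assert np.allclose(E_new @ A_new, E @ A_old)
```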

7. Jan 1, 2015

### stevendaryl

Staff Emeritus
Just another point: the index on a basis vector $e_\mu$ indicates which basis vector, rather than which component of a vector. But since a basis vector is, after all, a vector, you can actually ask "what are the components of basis vector $e_\mu$?" The answer is pretty trivial:

$(e_\mu)^\mu = 1$ (In this case, $\mu$ is NOT summed over)

All other components are zero. This can be summarized using the delta-notation:

$(e_\mu)^\nu = \delta^\nu_\mu$
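This identity holds with respect to the basis itself even when the basis is not orthonormal; a quick numeric sketch (with an arbitrarily chosen invertible basis matrix):

```python
# The components of each basis vector, expanded in that same basis, are
# delta^nu_mu, even for a non-orthonormal basis (columns of E, chosen arbitrarily).
import numpy as np

E = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0],
              [3.0, 0.0, 1.0]])

# Solving E @ X = E column by column expands each e_mu in the basis {e_nu}:
components = np.linalg.solve(E, E)

assert np.allclose(components, np.eye(3))   # (e_mu)^nu = delta^nu_mu
```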

8. Jan 1, 2015

### bcrowell

Staff Emeritus
I wouldn't refer to the components of a vector as scalars. I would define a scalar as something that doesn't change under a change of coordinates, i.e., a rank-0 tensor.

9. Jan 1, 2015

### ShayanJ

Oh... yeah, sorry. I should have made it clear that I didn't mean the strict meaning of the word.
So... what should we call them? Just "components of a vector"?

EDIT: But actually in the context of linear algebra, they are scalars. So we have two conflicting definitions of the word scalar.

Last edited: Jan 1, 2015
10. Jan 1, 2015

### stevendaryl

Staff Emeritus
I guess it's a matter of taste, but I don't like that way of describing things. If $\vec{A}$ and $\vec{B}$ are vectors, then wouldn't you say that $\vec{A} \cdot \vec{B}$ is a scalar? But in the special case where $\vec{B}$ is the basis vector $e_\mu$, we have:

$\vec{A} \cdot e_\mu = A_\mu$

So it is simultaneously true that $A_\mu$ is a scalar (it is the result of taking the scalar product of two vectors), and it is also a component of a covector.

11. Jan 1, 2015

### Fredrik

Staff Emeritus
The components of a vector $v$ with respect to an ordered basis $(e_1,\dots,e_n)$ are the unique real numbers $v^1,\dots,v^n$ such that $v=\sum_{i=1}^n v^i e_i$.

I will elaborate a bit...

Let $V$ be an n-dimensional vector space over $\mathbb R$. Let $V^*$ be the set of linear functions from $V$ to $\mathbb R$. Define addition and scalar multiplication on $V^*$ by $(f+g)(v)=f(v)+g(v)$ and $(af)(v)=a(f(v))$ for all $v\in V$ and $a\in\mathbb R$. These definitions turn $V^*$ into a vector space. The $V^*$ defined this way is called the dual space of $V$.

Let $(e_i)_{i=1}^n$ be an ordered basis for $V$. (The notation denotes the n-tuple $(e_1,\dots,e_n)$). It's conventional to put these indices downstairs, and to put the indices on components of vectors in $V$ upstairs. For example, if $v\in V$, then we write $v=v^i e_i$. I'm using the summation convention here, so the right-hand side really means $\sum_{i=1}^n v^i e_i$.

For each $i\in\{1,\dots,n\}$, we define $e^i\in V^*$ by $e^i(e_j)=\delta^i_j$. It's not hard to show that $(e^i)_{i=1}^n$ is an ordered basis for $V^*$. The ordered basis $(e^i)_{i=1}^n$ is called the dual basis of $(e_i)_{i=1}^n$. It's conventional to put the indices on components of vectors in $V^*$ downstairs. For example, if $f\in V^*$, then we write $f=f_ie^i$.
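As a numeric sketch (assuming $V=\mathbb R^n$ and representing covectors as row vectors), the dual basis can be read off from a matrix inverse:

```python
# If the columns of E are the basis vectors e_1, e_2, then the rows of E^{-1}
# are the dual covectors e^i, since e^i(e_j) = (E^{-1} E)^i_j = delta^i_j
# by construction. The basis here is an arbitrary invertible choice.
import numpy as np

E = np.array([[2.0, 1.0],
              [0.0, 1.0]])

E_dual = np.linalg.inv(E)                # row i is the dual covector e^i

assert np.allclose(E_dual @ E, np.eye(2))   # e^i(e_j) = delta^i_j
```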

Exercise: Find an interesting way to rewrite each of the following expressions:

a) $e^i(v)$
b) $f(e_i)$

Last edited: Jan 1, 2015
12. Jan 1, 2015

### bcrowell

Staff Emeritus
I suppose this would depend on whether you use the convention that a basis vector like $e_\mu$ transforms, or doesn't transform. I would take the Greek index to mean that it's a concrete index rather than an abstract index, and I would then assume that it was to be kept fixed under a change of coordinates. In reality, I think this would usually be clear from context.

13. Jan 1, 2015

### stevendaryl

Staff Emeritus
The way that I think of things "transforming under coordinate changes" is this:

Vectors are fixed things (in differential geometry, they can be identified with tangents to parametrized paths). Components of a vector are projections of the vector onto a basis (a set of 4 independent vectors). If I have 4 independent vectors $\vec{A}, \vec{B}, \vec{C}, \vec{D}$, and then I have another vector $\vec{V}$, I can, as Fredrik said, write $\vec{V}$ as a linear combination of my basis: $\vec{V} = V^1 \vec{A} + V^2 \vec{B} + V^3 \vec{C} + V^4 \vec{D}$. $V^1, ..., V^4$ are just 4 real numbers that happen to express the relationship between $\vec{V}$ and my four basis vectors $\vec{A}, \vec{B}, \vec{C}, \vec{D}$. At this point, nothing has been said about a coordinate system. All 5 vectors, $\vec{V},\vec{A}, ..., \vec{D}$, have an identity that is independent of any coordinate system.

But if I want to use a different set of vectors as my basis, say, $\vec{A'}, \vec{B'}, ..., \vec{D'}$, then I can also write the same vector $\vec{V}$ in terms of this new basis: $\vec{V} = (V^1)' \vec{A'} + ... + (V^4)' \vec{D'}$. I haven't transformed $\vec{V}$, I've just written it as a different linear combination.

14. Jan 1, 2015

### ShayanJ

Here you're talking about linear algebra. I'm wondering how the two viewpoints can be reconciled!
Hmm... it seems to me that in linear algebra we never use different coordinates; in fact, we never define such things. We just pick different sets of linearly independent vectors as bases. So, yeah, what you're talking about here happens in the tangent space at a single point. But in differential geometry, where we use different coordinates, we're doing things in a much less local manner than staying at a single point.

15. Jan 2, 2015

### pervect

Staff Emeritus
In differential geometry, we have the tangent and cotangent spaces at a point, but we also usually have some additional structure, such as a connection (on a fibre bundle) that defines a map from the tangent space at one point to the tangent space at another point, given a curve connecting the two points.

16. Jan 2, 2015

### Matterwave

At any point $P$ on a manifold, the tangent and cotangent spaces are simply linear vector spaces, that's why we "talk about linear algebra" even when we are talking about differential geometry. However, there are "special" sets of basis vectors $e_i,~~i=1,...,n$ which are called "coordinate basis vectors" if they all satisfy $[e_i,e_j]=0,~~\forall i,j$. When we transform from one set of coordinate basis vectors to another set of coordinate basis vectors, the components of vectors or one forms change in the usual fashion $A^{i'}=\frac{\partial x^{i'}}{\partial x^j}A^j$.
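An illustrative instance of the rule $A^{i'}=\frac{\partial x^{i'}}{\partial x^j}A^j$, transforming the components of a vector from the Cartesian to the polar coordinate basis at a chosen point:

```python
# Transform vector components from the Cartesian (x, y) coordinate basis to the
# polar (r, phi) one at a point, via the Jacobian d(r, phi)/d(x, y).
import numpy as np

x, y = 3.0, 4.0
r = np.hypot(x, y)                       # r = 5 at this point

# Jacobian d(r, phi)/d(x, y), evaluated at (x, y):
J = np.array([[ x / r,     y / r   ],
              [-y / r**2,  x / r**2]])

A_cart = np.array([1.0, 0.0])            # the unit vector in the x-direction
A_polar = J @ A_cart                     # its components in the polar basis

assert np.allclose(A_polar, [0.6, -0.16])
```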

17. Jan 2, 2015

### ShayanJ

Yeah, I know. But here the question is: can we call the components of a vector scalars?
As bcrowell said, it seems wrong, because the components change when we do a coordinate transformation and so aren't invariant under coordinate transformations, as scalars should be!
But as stevendaryl said, $\vec A \cdot \vec B$ is a scalar, and if we put $\vec B=\hat e_i$, we get $\vec A \cdot \hat e_i=A^i$, so it seems the components of vectors actually are scalars.
(Maybe they actually settled the issue but I didn't understand!)

18. Jan 2, 2015

### Matterwave

This is a matter of terminology. The number $A^i \equiv \vec{A}\cdot\vec{e_i}$ is a scalar field certainly; however, if we view $A^i$ as "the i'th component of the vector $\vec{A}$" then certainly it is a component of a vector and not a scalar. In other words, it depends on how you want to view the quantity $A^i$. If you view it as "the i'th component of A in THIS PARTICULAR basis" then it is a scalar, if you view it as "the i'th component of A in SOME basis" then it is not a scalar.

Perhaps it's easier if we give a concrete example. Say we have a vector $\vec{A}=(3,2,0)$; then $A^1=3$. The number 3, of course, is a scalar, but $A^1$, which we use to denote what is in the first slot of $\vec{A}=(A^1, A^2, A^3)$, is the component of a vector.

19. Jan 2, 2015

### stevendaryl

Staff Emeritus
Right. In a coordinate basis, we're picking the basis vectors in a way that relates to the coordinates: $e_\mu$ is the unique vector such that $(e_\mu \cdot \nabla) \Phi = \partial_\mu \Phi$ for all scalar fields $\Phi$. But that's just a particular (very convenient) way of picking a basis. The basis doesn't have to have anything to do with coordinates. (Of course, to be useful, you need some continuous way to pick a basis at every point.)

20. Jan 2, 2015

### stevendaryl

Staff Emeritus
Yeah, physics discussions (and mathematics discussions often aren't much better) sometimes run into confusion when it's not clear whether someone is talking about a tensor (or matrix, or vector, or whatever), or about a component of a tensor (with arbitrary indices).

For example, $g_{\mu \nu}$ might mean the metric tensor, or it might mean a particular component of the metric tensor.

There is a similar ambiguity when people talk about functions: Does $f(x)$ mean a function, or does it mean the value of the function at some point $x$? Many people try to use different alphabets, or different fonts, or something to distinguish between a variable and a constant with an arbitrary value, so they might write $f(x)$ to mean the function and $f(a)$ to mean the value at point $a$. But it's hard to be consistent about such conventions, and not everybody uses the same ones.

You can disambiguate by using lambda notation (or some equivalent "binding" mechanism):

$\lambda x . f(x)$ means the function, while $f(x)$ means its value at point $x$. But it's a pain to make everything explicit that way.

21. Jan 2, 2015

### stevendaryl

Staff Emeritus
The use of lowered indices to indicate basis vectors, such as $e_\mu$, is a little more profound than simply keeping track of indices for the Einstein summation convention. It's also the case that under a change of basis, the basis vectors and the components of a single vector transform in opposite ways:

$A^\alpha = L^\alpha_\mu A^\mu$
$e_\alpha = (L^{-1})^\mu_\alpha e_\mu$

For a covector $B = B_\alpha e^\alpha$, where $e^\alpha$ is a basis of covectors, it works out the opposite way:

$B_\alpha = (L^{-1})^\mu_\alpha B_\mu$
$e^\alpha = L^\alpha_\mu e^\mu$

I'm not sure I know of a pithy way to see that an indexed collection of basis vectors should transform like the components of a covector, and an indexed collection of basis covectors should transform like the components of a vector. It has to work out that way in order for the Einstein summation convention to produce an object that is basis-independent, but I don't know a satisfying explanation for why it should work that way.

22. Jan 2, 2015

### HomogenousCow

Schutz has a pretty good explanation of this.
You just have to postulate that the basis vectors transform linearly, and then you can show that the components transform via the inverse matrix in order to preserve the vector/covector.
I guess it all comes from the fact that the vectors/covectors are geometrical objects? Not sure if everyone would find that a satisfying motivation.

23. Jan 2, 2015

### Fredrik

Staff Emeritus
It's not a postulate. If $(e_i)_{i=1}^n$ and $(e_i')_{i=1}^n$ are ordered bases for $V$, then for all $i$, there must exist numbers $M_i^j$ such that $e_i'=M_i^j e_j$. (In other words, we can always write the new basis vectors as linear combinations of the old).

24. Jan 2, 2015

### Fredrik

Staff Emeritus
I will elaborate on what I said in post #23 here, in order to answer the question of how things transform under a change of ordered basis. Some of this was already worked out by stevendaryl in post #6. $V$ denotes an arbitrary n-dimensional vector space over $\mathbb R$. $V^*$ denotes its dual space. (See post #11 if that term is unfamiliar).

Now let $M$ be the matrix such that for all $i,j$, the component on row $i$, column $j$ is $M^i_j$. Recall that the definition of matrix multiplication is $(AB)^i_j=A^i_k B^k_j$. Let $v\in V$ be arbitrary. We have
$$v=v^j e_j=v^i{}' e_i{}' =v^i{}' M^j_i e_j,$$ and therefore $v^j=v^i{}' M^j_i$. This implies that
$$(M^{-1})^k_j v^j =v^i{}' (M^{-1})^k_j M^j_i =v^i{}' (M^{-1}M)^k_i =v^i{}' \delta^k_i =v^k{}'.$$ So the n-tuple of components $(v^1,\dots,v^n)$ transforms according to
$$v^i{}'= (M^{-1})^i_j v^j.$$ The fact that the matrix that appears here is $M^{-1}$ rather than $M$ is the reason why an n-tuple of components of an element of $V$ is said to transform contravariantly. The terms "covariant" and "contravariant" should be interpreted respectively as "the same as the ordered basis" and "the opposite of the ordered basis".

It's easy to see that the dual basis transforms contravariantly. Let $N$ be the matrix such that $e^i{}' =N^i_j e^j$. We have
$$\delta^i_j =e^i{}'(e_j{}')=N^i_k e^k (M_j^l e_l) = N^i_k M_j^l e^k{}(e_l{}) =N^i_k M_j^l \delta^k_l =N^i_k M_j^k =(NM)^i_j.$$ This implies that $N=M^{-1}$. So we have
$$e^i{}' =(M^{-1})^i_j e^j.$$ Now we can easily see that an n-tuple of components of an arbitrary $f\in V^*$ transforms covariantly. We can prove it in a way that's very similar to how we determined the transformation properties of the n-tuple of components of $v$, but the simplest way is to use the formula $f_i=f(e_i)$, which I left as an easy exercise in post #11.
$$f_i{}' =f(e_i{}')=f(M_i^j e_j) =M_i^j f(e_j)= M_i^j f_j.$$ Note that what's "transforming" under a change of ordered basis in these examples are n-tuples of real numbers or n-tuples of vectors (in $V$ or $V^*$). In the case of a tensor of type $(k,l)$, what's transforming isn't the tensor, but its $n^{k+l}$-tuple of components with respect to the ordered basis $(e_i)_{i=1}^n$.
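The whole derivation can be verified numerically. In this sketch $M$ is a random (hence, with probability 1, invertible) change-of-basis matrix, and the entry convention follows the post: $M^i_j$ sits on row $i$, column $j$:

```python
# Numeric check of the rules above: components of v transform with M^{-1}
# (contravariantly), components of f transform with M (covariantly), and the
# number f(v) = f_j v^j is unchanged by the change of basis.
import numpy as np

rng = np.random.default_rng(1)
n = 3

M = rng.standard_normal((n, n))          # change of basis: e_i' = M^j_i e_j
Minv = np.linalg.inv(M)

v = rng.standard_normal(n)               # components v^j in the old basis
f = rng.standard_normal(n)               # components f_j in the old dual basis

v_new = Minv @ v                         # v^i' = (M^-1)^i_j v^j
f_new = M.T @ f                          # f_i' = M^j_i f_j (transpose sums over j)

assert np.allclose(f_new @ v_new, f @ v)     # f(v) is basis-independent
```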

Of course, one can take the point of view that these $n$-tuples or $n^{k+l}$-tuples are the tensors, or rather, that the function that associates tuples with ordered bases is what should be called a tensor. I'm not a fan of that view myself. I consider it inferior and obsolete. However, there isn't anything fundamentally wrong with it. The real problem is that it's so hard to find an explanation of this view that isn't unbelievably bad.

Last edited: Jan 2, 2015