# Covariance and contravariance

## Main Question or Discussion Point

Hi! I'm trying to get an intuition for these concepts and was just playing around at home.
My thought was to start with a 2-dimensional orthonormal coordinate system, the xy-plane, and do the following:

1. Study the vector with coordinates (1,1) in this system. At the start its covariant and contravariant coordinates are of course the same.

2. Start by imagining that we decrease the angle between the positive parts of the y and x axes. This gives an oblique system, and we get a difference between the covariant and contravariant coordinates.

3. What should we see? Well, take the covariant coordinates x = r cos(v) etc. and differentiate them with respect to the angle v.

4. Do the same thing for the contravariant coordinates (here I tried to do it by expressing them through the elements of the transformation matrix). If the length is invariant and can be expressed as x_i x^i, we should see that the derivatives cancel each other, so that:

x_i (dx^i/dv) = -x^i (dx_i/dv), since d(x^i x_i)/dv = 0.

Now to the questions!
Do you see anything wrong with my arguments here?
Is there any easier way to express the contravariant coordinates through the Cartesian x and y?
Can you show that x_i x^i is invariant when we decrease the angle v in some much easier way?
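For what it's worth, here is a quick numerical check of this setup (a sketch in numpy, assuming the usual conventions: the contravariant components x^i expand the vector in the oblique basis, and the covariant components x_i are dot products with the basis vectors):

```python
import numpy as np

# Fixed vector, expressed in the original orthonormal (Cartesian) system.
p = np.array([1.0, 1.0])

for v_deg in (90, 75, 60, 45):   # 90 degrees is the original ON system
    v = np.radians(v_deg)
    # Oblique basis: e1 along x, e2 at angle v from e1.
    E = np.column_stack([[1.0, 0.0], [np.cos(v), np.sin(v)]])
    contra = np.linalg.solve(E, p)   # x^i from p = x^i e_i
    co = E.T @ p                     # x_i = e_i . p
    print(v_deg, contra @ co)        # always 2.0 = p . p
```

The printed value stays at p·p = 2 for every angle, which is the invariance asked about in the last question.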


Fredrik
Staff Emeritus
Gold Member
I'm not sure I understand what you're doing. For example, if you tilt the y axis by 45 degrees, doesn't that put the point previously known as (1,1) on the y axis, ensuring that $x_ix^i=0$? Your terminology is also a bit strange. There's no such thing as covariant and contravariant coordinates. There are just "coordinates". Maybe you're thinking of the components of cotangent vectors and tangent vectors respectively.

Let me just outline the modern approach to these things. If there's something you want me to clarify, just ask. If you don't have time to study this, that's OK with me. Questions about this sort of thing come up often enough that I would like to have a post that I can direct people to when they ask questions about the covariant/contravariant stuff.

A manifold is a set with some other stuff defined on it, the most important being coordinate systems. They are functions that assign an n-tuple of real numbers to each point in the manifold. For example, the coordinate system x takes the point p to $(x^1(p),\dots,x^n(p))$.

For each point p in the manifold M, we define an n-dimensional vector space $T_pM$, called the tangent space at p. The members of $T_pM$ are called tangent vectors at p. For any finite-dimensional real vector space V, its dual space V* is defined as the set of linear functions from V into the set of real numbers. (This set is given a vector space structure in the obvious way.) The dual space $T_pM^*$ of $T_pM$ is called the cotangent space at p, and its members are called cotangent vectors at p.

Tangent vectors are sometimes just called "vectors". Cotangent vectors are sometimes called "covectors". Tangent vectors and cotangent vectors also go by the horrible names "contravariant vectors" and "covariant vectors" respectively. Those terms belong to an obsolete way of dealing with these things.

The metric at p is a function $g:T_pM\times T_pM\rightarrow\mathbb R$ that's linear in both variables and satisfies g(u,v)=g(v,u) and one more thing that I'll mention in a minute. For each u, the function that takes v to g(u,v) is a linear function from $T_pM$ into $\mathbb R$. That makes that function, which is often written as g(u,·), a member of $T_pM^*$. The metric is required (as part of its definition) to also have the property that the function that takes u to g(u,·) is a vector space isomorphism from $T_pM$ to $T_pM^*$.

This isomorphism ensures that for each tangent vector v, there's a "corresponding" cotangent vector g(v,·), and that for each cotangent vector, there's a corresponding tangent vector, given by the inverse of the function that takes v to g(v,·).
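In components this isomorphism is just "lowering an index": if $g_{ij}$ are the components of the metric in some basis, the corresponding cotangent vector has components $\omega_i=g_{ij}v^j$, and the inverse map solves that linear system. A minimal numerical sketch, with made-up numbers for g:

```python
import numpy as np

# Hypothetical components g_ij = g(e_i, e_j) of the metric at p
# in some oblique 2D basis (made-up numbers).
g = np.array([[1.0, 0.5],
              [0.5, 1.0]])

v = np.array([2.0, -1.0])             # components v^i of a tangent vector
omega = g @ v                         # omega_i = g_ij v^j, i.e. g(v, .)
v_back = np.linalg.solve(g, omega)    # the inverse isomorphism recovers v
print(omega, v_back)
```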

When tangent vectors are expressed in terms of a basis, it's conventional to put the indices on the components upstairs and the indices on the basis vectors downstairs. For example, if $\{e_i\}$ is a basis, then $v=v^ie_i$. For cotangent vectors, the convention is the opposite. If $\{f^i\}$ is a basis, then $\omega=\omega_if^i$.

Given a basis $\{e_i\}$ of $T_pM$, we can define a basis $\{e^i\}$ of $T_pM^*$, called the dual basis of $\{e_i\}$, by $e^i(e_j)=\delta^i_j$.
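Concretely, if the basis vectors $e_i$ are the columns of an invertible matrix E, the dual basis covectors $e^i$ are the rows of $E^{-1}$, since $E^{-1}E=I$ is exactly the statement $e^i(e_j)=\delta^i_j$. A small sketch with a hypothetical 2D basis:

```python
import numpy as np

# Basis vectors e_1, e_2 of T_pM as columns of E (hypothetical numbers).
E = np.column_stack([[1.0, 0.0], [1.0, 2.0]])

# The dual basis covectors e^i are the rows of E^{-1}: acting by
# row-times-column, E^{-1} E = I encodes e^i(e_j) = delta^i_j.
E_dual = np.linalg.inv(E)
print(E_dual @ E)   # the 2x2 identity matrix
```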

Suppose that $\{f_i\}$ is another basis for $T_pM$. Then there's always a matrix B such that $f_i=B^j{}_ie_j$, where $B^j{}_i$ is the component on row j, column i, of the matrix B. It's easy to show that this implies $f^i=(B^{-1})^i{}_je^j$. So when you change a basis of $T_pM$ as described by a matrix B, the dual basis changes as described by $B^{-1}$.

It's also easy to show that if you change the basis of $T_pM$ in the way described by B, the components of each tangent vector change as described by $B^{-1}$. As I already mentioned, a change of basis in the way described by B also changes the dual basis in the way described by $B^{-1}$, and this means that the components of each cotangent vector change as described by B. This is the source of the terms "covariant" and "contravariant". Change the basis by B, the components of a tangent vector change by $B^{-1}$ (hence "contravariant"), and the components of a cotangent vector change by B (hence "covariant").
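This pairing of $B^{-1}$ and B is what makes the number $\omega(v)=\omega_iv^i$ basis-independent. A quick numerical illustration (random invertible B and made-up components):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3
B = rng.normal(size=(n, n))        # any invertible change-of-basis matrix
Binv = np.linalg.inv(B)

v = rng.normal(size=n)             # components of a tangent vector
w = rng.normal(size=n)             # components of a cotangent vector

v_new = Binv @ v                   # tangent components change by B^{-1}
w_new = B.T @ w                    # cotangent components change by B
print(w @ v, w_new @ v_new)        # the same number: omega(v) is invariant
```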

So what does this have to do with coordinate systems? We can use each coordinate system to construct n partial derivative operators, and we can show that these operators form a basis for the tangent space. So when you change the coordinate system, you also change the associated basis of the tangent space.

These are some of my earlier posts that explain the details of some of these things:

Holy moly, that's a good answer!
Thank you very much.

I think that I've just been studying the property which you write:

Given a basis $\{e_i\}$ of $T_pM$, we can define a basis $\{e^i\}$ of $T_pM^*$, called the dual basis of $\{e_i\}$, by $e^i(e_j)=\delta^i_j$.
And looking at the geometric relationship between these two kinds of basis sets.

However I have some questions. The tangent space.. can I view this as a space which exhibits the properties that we find in an infinitesimal region around p?
Like a plane just touching the sphere, if that were to be our M.

The problem seems to be that I've only been introduced to this subject through special relativity, and haven't taken a course in GR where the curvature comes in.
In my case the metric is a diagonal matrix, invariant over the manifold (which is spacetime?).. Will this matrix vary at different points in GR later?

Haha, thank you very much Fredrik, this is damn beautiful...

I've also read some Lie algebra, where the manifold and tangent space come in again.. could you maybe comment on this? Right now I see Lie algebra as the case where we deal with manifolds which represent continuous operations, like rotations, and therefore expect some properties, such as the Jacobi identity etc.
And this differential geometry as the case where we have a physical space, where we don't expect the same properties..

One more question:

You write:
This isomorphism ensures that for each tangent vector v, there's a "corresponding" cotangent vector g(v,·), and that for each cotangent vector, there's a corresponding tangent vector, given by the inverse of the function that takes v to g(v,·).
Does this also guarantee that we can always find a dual basis for a basis X in $T_pM$?

haushofer
Holy moly, that's a good answer!
Thank you very much.

However I have some questions. The tangent space.. can I view this as a space which exhibits the properties that we find in an infinitesimal region around p?
Like a plane just touching the sphere, if that were to be our M.
The sphere is a nice example. If you would define vectors on a sphere, how would you do that? Vectors can be added, but the natural question then is: how do you compare vectors at different points on the sphere? The point is that this is ambiguous (but it can be done via a connection!).

It turns out that the natural thing to do is to consider the tangent space at each point on the sphere, and declare that vectors live in that space. Your "plane touching the sphere" is a good visualization of it. In this plane, vectors can be added. But to compare different tangent spaces, you need a description of how to "drag" vectors from one tangent space to another, and that can be done in many different ways!

In my case the metric is a diagonal matrix, invariant over the manifold (which is spacetime?).. Will this matrix vary at different points in GR later?
Yes.

I've also read some Lie algebra, where the manifold and tangent space come in again.. could you maybe comment on this? Right now I see Lie algebra as the case where we deal with manifolds which represent continuous operations, like rotations, and therefore expect some properties, such as the Jacobi identity etc.
And this differential geometry as the case where we have a physical space, where we don't expect the same properties..
A group G can be considered as a manifold with some extra structure. The Lie algebra can then be seen as the tangent space at the identity e of G. This picture can be useful. A nice question, for instance, is: if I have a homomorphism between two groups, does this induce a homomorphism between the two algebras? It turns out that this depends on the topology of the group seen as a manifold (the group manifold has to be simply connected). This question is relevant, because e.g. representations are homomorphisms.

haushofer
One more question:

You write:

Does this also guarantee that we can always find a dual basis for a basis X in $T_pM$?
The dual basis is the basis of the cotangent space. This dual of the tangent space is really a different vector space!

Compare it with bras and kets in QM. Bras and kets are really different beasts living in different spaces.

Ah, thanks. Is this the natural generalisation when doing QFT on curved space? That the bras are contravariant components and the kets covariant components.

Can I recommend looking at

Tensor Geometry

By Dodson and Poston.

You will find much meat to chew on here.

Fredrik
Staff Emeritus
Gold Member
Tangent space.. can I view this as a space which exhibits the properties that we find in an infinitesimal region around p?
Like a plane just touching the sphere, if that were to be our M.
Yes, that's a good way to picture it. It is however important to understand that the definition of an n-dimensional manifold doesn't require it to be embedded in an n+1-dimensional Euclidean space like this picture suggests. The tangent space at p is defined abstractly. There are several different ways to do it. The last link in my previous post is to a post that describes one of those ways, and contains a link to a post that describes another way.

The problem seems to be that I've only been introduced to this subject through special relativity, and haven't taken a course in GR where the curvature comes in.
In my case the metric is a diagonal matrix, invariant over the manifold (which is spacetime?).. Will this matrix vary at different points in GR later?
Let $g_p$ be the metric at p, and $\{e_i\}$ the basis of $T_pM$ associated with some coordinate system x. Then the components of $g_p$ in x are defined by $g_{ij}(p)=g_p(e_i,e_j)$. In SR, we have $g_{ij}(p)=\eta_{ij}$ for all p if and only if the coordinate system is a global inertial frame. If not, the components depend on p, even in SR. In GR, they pretty much always depend on p, no matter what coordinate system you use.

The $\eta_{ij}$ above are the components of the matrix

$$\eta=\begin{pmatrix}-1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1\end{pmatrix}$$

Note that the metric at p isn't defined as a matrix. It's a bilinear form. It is however useful to define the matrix of its components in a coordinate system, and the metric at p is completely determined by that matrix.
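For example, the statement that a change between two global inertial frames (a Lorentz transformation $\Lambda$) leaves the components $\eta_{ij}$ unchanged is the matrix equation $\Lambda^T\eta\Lambda=\eta$. A quick check for a boost along x (rapidity value chosen arbitrarily):

```python
import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])   # Minkowski metric components

phi = 0.7                               # rapidity (arbitrary value)
L = np.eye(4)
L[0, 0] = L[1, 1] = np.cosh(phi)
L[0, 1] = L[1, 0] = np.sinh(phi)        # boost along x

print(np.allclose(L.T @ eta @ L, eta))  # True
```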

A Lie group is a group that's also a manifold. A Lie algebra is a vector space V, with a function $(x,y)\mapsto [x,y]$ from V×V into V, that's bilinear and satisfies the Jacobi identity: [x,[y,z]]+[y,[z,x]]+[z,[x,y]]=0. (So it's like a distributive multiplication operation, but instead of being associative, it satisfies the Jacobi identity). Such a function is called a Lie bracket.

I'm going to describe how to define a Lie bracket on the vector space $\mathfrak g=T_eG$ of a Lie group G. (That's the tangent space at the identity element). We need a few definitions first. Suppose that M and N are manifolds and that $\phi:M\rightarrow N$. If f is a real-valued function on N, then $f\circ\phi$ is a real-valued function on M, called the pullback of f. We can use the function $\phi$ to define a function $\phi_*:T_pM\rightarrow T_{\phi(p)}N$, called a pushforward. Recall that tangent vectors at a point are defined as "derivations" at that point, so to specify a tangent vector at $\phi(p)$, we need to specify its action on an arbitrary real-valued function f:

$$\phi_* v(f)=v(f\circ\phi)$$.

A vector field is a function that takes each point p (in some open subset of a manifold) to a tangent vector at p. If X is a vector field, the corresponding tangent vector at p is written as Xp. For every real-valued function f, the function that takes p to Xpf is written as Xf. The commutator of two vector fields X and Y is another vector field, defined by

$$[X,Y]_pf=X_p(Yf)-Y_p(Xf)$$

A pushforward of a vector field is defined in terms of the pushforwards of tangent vectors:

$$(\phi_*X)_{\phi(p)}=\phi_*X_p$$.

Recall that a Lie group G is also a group. For each g in G, we define left multiplication by g as the function $\lambda_g:G\rightarrow G$ defined by

$$\lambda_g(h)=gh$$

Now we have all the tools we need. Suppose that $K,L\in\mathfrak g$. Then $(\lambda_g)_*K\in T_gG$. We define the left-invariant vector field corresponding to K by

$$X^K_g=(\lambda_g)_*K$$

and the Lie bracket on $\mathfrak g$ by

$$[K,L]=[X^K,X^L]_e$$

I hope it's obvious that I have omitted some minor technicalities (e.g. I never mentioned that "left multiplication by g" is required to be a smooth function by the definition of "Lie group"). If you're wondering if I could have used right multiplication instead of left, the answer is yes. The result would have been a Lie algebra that's isomorphic to this one.
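For matrix Lie groups, this construction works out to the plain matrix commutator $[K,L]=KL-LK$ on $\mathfrak g$. A small sketch with the standard rotation generators of so(3), checking the bracket and the Jacobi identity:

```python
import numpy as np

# Standard generators of so(3), the Lie algebra of the rotation group.
Jx = np.array([[0., 0., 0.], [0., 0., -1.], [0., 1., 0.]])
Jy = np.array([[0., 0., 1.], [0., 0., 0.], [-1., 0., 0.]])
Jz = np.array([[0., -1., 0.], [1., 0., 0.], [0., 0., 0.]])

def br(a, b):
    """Matrix commutator [a, b] = ab - ba."""
    return a @ b - b @ a

print(np.allclose(br(Jx, Jy), Jz))   # True: [Jx, Jy] = Jz
jac = br(Jx, br(Jy, Jz)) + br(Jy, br(Jz, Jx)) + br(Jz, br(Jx, Jy))
print(np.allclose(jac, 0))           # True: Jacobi identity holds
```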

that takes each point p
A small philosophical point here, Fredrik.

I prefer the expression "associates with each point p" rather than the operator language, which suggests that once the operation is concluded you no longer have the original.

In fact the uses we want here mean that we must have both the original and the result of the association together.

haushofer
Ah, thanks. Is this the natural generalisation when doing QFT on curved space? That the bras are contravariant components and the kets covariant components.
"Contra" and "co" say something about how the components of an object transform under coordinate transformations; if the components transform oppositely to the basis vectors you call them "contravariant", and if they transform the same as the basis vectors then you call them "covariant". This can be defined for any vector space, so I would say also for Hilbert spaces. Mathematicians would speak about "vectors" and "covectors", living in the tangent and cotangent space; the vectors themselves don't change under these transformations, because they are defined without any reference to coordinates. Coordinates are just labels!

These covectors are functionals acting on vectors, giving back a number. And vice versa. That's the analogy with bras and kets; if I have a bra <x|, I can act with it on a ket |y> to obtain

<x|y> = ...

which is a complex number in QM. Under a unitary transformation

|y> --> U |y>

and

<x| --> <x| U^{-1}

the number <x|y> is invariant.
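A quick numerical sketch of that invariance (random states and a random unitary, all made up):

```python
import numpy as np

rng = np.random.default_rng(1)

# Two made-up kets in C^3.
x = rng.normal(size=3) + 1j * rng.normal(size=3)
y = rng.normal(size=3) + 1j * rng.normal(size=3)

# A random unitary from the QR decomposition of a complex matrix.
U, _ = np.linalg.qr(rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3)))

before = np.vdot(x, y)              # <x|y> (vdot conjugates its first factor)
after = np.vdot(U @ x, U @ y)       # the bra picks up U^dagger = U^{-1}
print(np.allclose(before, after))   # True
```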

In the same way you can view a dual vector omega as a functional acting on a vector X to give back a number,

omega(X) = ...

where the brackets mean "acting on". Or the other way around,

X(omega) = ...

Fredrik
Staff Emeritus
Gold Member
Ah, thanks. Is this the natural generalisation when doing QFT on curved space? That the bras are contravariant components and the kets covariant components.
No, it has nothing to do with that. Bra-ket notation is just a useful tool to simplify some calculations. This is what I wrote about bra-ket notation last year.

I'm going to describe how to define a Lie bracket on the vector space $\mathfrak g=T_eG$ of a Lie group G. (That's the tangent space at the identity element). We need a few definitions first. Suppose that M and N are manifolds and that $\phi:M\rightarrow N$. If f is a real-valued function on N, then $f\circ\phi$ is a real-valued function on M, called the pullback of f. We can use the function $\phi$ to define a function $\phi_*:T_pM\rightarrow T_{\phi(p)}N$, called a pushforward. Recall that tangent vectors at a point are defined as "derivations" at that point, so to specify a tangent vector at $\phi(p)$, we need to specify its action on an arbitrary real-valued function f:

$$\phi_* v(f)=v(f\circ\phi)$$.
What exactly is v(f) here? :/

Can I recommend looking at

Tensor Geometry

By Dodson and Poston.

You will find much meat to chew on here.
I actually bought Introduction to Smooth Manifolds by John Lee last Monday, any experience with it? It should arrive tomorrow.

Fredrik
Staff Emeritus
Gold Member
What exactly is v(f) here? :/
v doesn't act on f. (It can't, because v is a tangent vector at a point of M, while f is a real-valued function on N).

$\phi_*$ acts on v (a tangent vector at p) and the result is a tangent vector at $\phi(p)$. That tangent vector (which we write as $\phi_*v$) acts on f.

I actually bought Introduction to Smooth Manifolds by John Lee last monday, any experience with it? Should arrive tomorrow.
It's very good. The only problem is that it doesn't cover connections, covariant derivatives, and curvature. I guess he didn't want to include that, since he had already written a book about those things, "Riemannian manifolds: an introduction to curvature", which is also excellent.

Ah, I got Riemannian Manifolds and Topological Manifolds also. Maybe I should start with one of those... I am writing a summary of all the good posts in Swedish now, but will come back tomorrow if anything isn't clear.

Thanks again! Maybe I can't repay you guys directly, but I'll pass it forward by volunteering as a math teacher each Monday. Peace out!

Introduction to Smooth Manifolds by John Lee
Sorry, I don't know that one. Perhaps someone else can comment, or you can give a critique when you've read it.

D & P are unusual in that they offer a blend of maths and physics perspectives, rather than just one or the other, so the maths is not too stuffy. Relativity is covered.

Fredrik
Staff Emeritus
Gold Member
Ah, I got Riemannian Manifolds and Topological Manifolds also. Maybe I should start with one of those...
If you can live with not knowing exactly what "Hausdorff and second countable topological space" means, you can start with the book on smooth manifolds. Read enough of that to ensure that you understand tensor fields. After that you can start reading sections here and there in all three books, depending on what you're the most interested in for the moment.


Why that is, is not clear to me.
What if we do not write it this way? Also I read that the perpendicular component actually has a basis of 1/e_i for e^i as contra. That adds to the confusion.

Also I do not understand why the perpendicular component is regarded as a gradient, where the bases are at the bottom, whereas they are at the top for contra.

I somehow cannot correlate the two.

"Contra" and "co" say something about how the components of an object transform under coordinate transformations; if the components transform oppositely to the basis vectors you call them "contravariant", and if they transform the same as the basis vectors then you call them "covariant".
Can you clarify this more? This has been a constant source of my confusion.

Fredrik
Staff Emeritus
Gold Member
Why that is, is not clear to me.
What if we do not write it this way?
What does "that" and "it" refer to? You quoted a 13-paragraph post that summarizes the basics of differential geometry, so it's impossible for me to know what you mean.

Also I read that the perpendicular component actually has a basis of 1/e_i for e^i as contra. That adds to the confusion.
I'm not familiar with what you're talking about here. If you want an answer to this, I think you will have to provide an exact quote of the stuff you read.

Fredrik said:
"Contra" and "co" say something about how the components of an object transform under coordinate transformations; if the components transform oppositely to the basis vectors you call them "contravariant", and if they transform the same as the basis vectors then you call them "covariant".
Can you clarify this more? This has been a constant source of my confusion.
It's in the first post I linked to in the post you quoted, but I'll cover it very briefly here. Here I'll denote row i, column j of a matrix X by $X^i_j$. Recall that the definition of matrix multiplication says that $(AB)^i_j=A^i_k B^k_j$. Suppose that M is a matrix that defines a change of basis: $\vec e'_i=M^j_i \vec e_j$. A vector $v$ in the vector space for which $\{\vec e_i\}$ is a basis can be expressed in terms of either of the two bases:
\begin{align}
&v=v^i\vec e_i=v'^i\vec e'_i=v'^i M^j_i \vec e_j\\
&v^j=M^j_i v'^i\\
&v'^i=(M^{-1})^i_j v^j
\end{align} So vector components transform as described by the inverse of the matrix that describes the transformation of the basis vectors.
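A numerical sketch of this last step, with a made-up change-of-basis matrix M:

```python
import numpy as np

M = np.array([[2.0, 1.0],
              [0.0, 3.0]])            # change-of-basis matrix (made-up numbers)
v_old = np.array([5.0, -4.0])         # components v^i in the old basis

v_new = np.linalg.inv(M) @ v_old      # v'^i = (M^{-1})^i_j v^j
print(np.allclose(M @ v_new, v_old))  # True: v^j = M^j_i v'^i recovers v
```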

I was reading about this a little more, and I see that when you write A(new) = X·B(old), where X is the transformation matrix, it is written as dx(new)/dx(old), which makes sense to me, but I cannot understand why that is the same as the basis vector e_i. Can someone throw some light on that?
As part of this question, I also see that the metric tensor is written as $g_{ij} = e_i\cdot e_j = \frac{\partial\bar x^m}{\partial x^i}\frac{\partial\bar x^n}{\partial x^j}\bar g_{mn}$, where the bars are the new coordinates. Now as I understand it, you do not need to change to a new coordinate system to get the metric tensor, and it can be defined without one. If that is true, why is it defined in terms of new and old coordinate components?

Fredrik
Staff Emeritus
Gold Member
Is your question why $\big\{\frac{\partial}{\partial x^\mu}\big|_p\big\}$ is a basis for the tangent space at p? I suggest that you start with the last post I linked to in the post you quoted above.

The metric isn't defined in terms of new and old coordinate systems. It's a tensor field g, so it associates a tensor $g_p$ with each point p. For all p and all bases $\{e_i\}$, the components of $g_p$ in the basis $\{e_i\}$ are defined as $g_p(e_i,e_j)$. Since each coordinate system that covers a region that includes p defines a basis at p, there's a set of components of $g_p$ associated with each of those coordinate systems.
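For instance, for polar coordinates on the Euclidean plane, the coordinate basis vectors at p are the columns of the Jacobian of (x, y) with respect to (r, θ), and the components $g_p(e_i,e_j)$ come out as diag(1, r²), which clearly depends on the point. A small sketch:

```python
import numpy as np

r, theta = 2.0, 0.6   # a sample point (arbitrary values)

# Columns of J are the coordinate basis vectors d/dr and d/dtheta,
# expressed in the Cartesian basis: J = d(x, y)/d(r, theta).
J = np.array([[np.cos(theta), -r * np.sin(theta)],
              [np.sin(theta),  r * np.cos(theta)]])

g = J.T @ J           # g_ij = e_i . e_j
print(np.round(g, 12))   # diag(1, r^2) -- the components depend on p
```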

julian
Gold Member
Hi! I'm trying to get an intuition for these concepts and was just playing around at home.
My thought was to start with a 2-dimensional orthonormal coordinate system, the xy-plane, and do the following:

1. Study the vector with coordinates (1,1) in this system. At the start its covariant and contravariant coordinates are of course the same.

2. Start by imagining that we decrease the angle between the positive parts of the y and x axes. This gives an oblique system, and we get a difference between the covariant and contravariant coordinates.

3. What should we see? Well, take the covariant coordinates x = r cos(v) etc. and differentiate them with respect to the angle v.

4. Do the same thing for the contravariant coordinates (here I tried to do it by expressing them through the elements of the transformation matrix). If the length is invariant and can be expressed as x_i x^i, we should see that the derivatives cancel each other, so that:

x_i (dx^i/dv) = -x^i (dx_i/dv), since d(x^i x_i)/dv = 0.

Now to the questions!
Do you see anything wrong with my arguments here?
Is there any easier way to express the contravariant coordinates through the Cartesian x and y?
Can you show that x_i x^i is invariant when we decrease the angle v in some much easier way?
If you look in "General Relativity" by I. R. Kenyon, for example, page 51 - flat 2D space - he illustrates the interpretation of the vector and covector components of an infinitesimal vector... so in a limited sense there is such a thing as vector/covector coordinates. Is this what you are referring to?