# Tangent space of a manifold: how exactly do we define it?

## Main Question or Discussion Point

If we have a manifold with a chart mapping it onto Cartesian space $\mathbb{R}^n$ and define a curve $x^\mu(\lambda)$, then for a function $f$ with $f(x^\mu(\lambda))=g(\lambda)$ we can write the identity

$$\frac{dg}{d\lambda} = \frac{dx^\mu}{d\lambda} \frac{\partial f}{\partial x^\mu}$$

in the operator form:

$$\frac{d}{d\lambda} = \frac{dx^\mu}{d\lambda} \frac{\partial}{\partial x^\mu}$$

And we interpret $\frac{\partial}{\partial x^\mu}$ as basis vectors and $\frac{d}{d\lambda}$ as a tangent vector.

What is the intuition behind that? Why do the basis vectors defined this way transform like Cartesian vectors in the tangent space? How do we define and write the equation of the tangent space to a manifold? How are the coordinates on the tangent space related to $x^\mu$?

I know these are standard questions, but the math textbooks are so involved that I don't understand them; I'm just looking for some quick intuition. Thanks!

Fredrik
Staff Emeritus
Gold Member
You're asking for a lot of explanation. It's too much to cover it comprehensively in a forum post, but if you read this one and the three I linked to at the end of it, it should get you started.

If you want more intuition about tangent vectors, you're going to have to study two different definitions of the tangent space at a point p, and the proof that the two spaces are isomorphic. Isham's book Modern differential geometry for physicists covers this pretty well. See also this post.

Let me leave only one question: why do the basis vectors defined this way transform like Cartesian vectors in the tangent space?

And thanks, I am reading the post.

Fredrik
Staff Emeritus
Gold Member
The transformation of the basis vectors is covered in this post.

WannabeNewton
Let $M$ be a smooth manifold and let $p\in M$. A linear map $X:C^{\infty}(M)\rightarrow \mathbb{R}$ is called a derivation at $p$ provided that $X(fg) = f(p)Xg + g(p)Xf$ for all $f,g\in C^{\infty}(M)$. The set $T_{p}M = \left \{ X: \text{X is a derivation at p} \right \}$ is called the tangent space to $M$ at $p$ and is easily verified to indeed be a vector space.

Now, let $(U,\varphi)$ be a chart on $M$ with $p\in U$. $\varphi_{*}:T_{p}M\rightarrow T_{\varphi(p)}\mathbb{R}^{n}$ will of course be an isomorphism since $\varphi$ is a diffeomorphism onto its image (here $\varphi_{*}$ is the pushforward). Let $x^1,...,x^{n}:\mathbb{R}^{n}\rightarrow \mathbb{R}$ be the standard coordinate functions on $\mathbb{R}^{n}$. One can show that the derivations $\partial _{i}|_{\varphi(p)}$ defined by $\partial_{i}|_{\varphi(p)}f = (\partial _{i}f)(\varphi(p))$ form a basis for $T_{\varphi(p)}\mathbb{R}^{n}$; hence if we push forward each of these coordinate vectors, $\partial _{i}|_{p} = (\varphi^{-1})_{*}\partial _{i}|_{\varphi(p)}$ will be a basis for $T_{p}M$.

Finally, let $I\subseteq \mathbb{R}$ be an interval (we want connected sets when defining curves and these are necessarily intervals or singletons). A curve is a continuous map $\gamma: I \rightarrow M$ (continuous with respect to the natural topology on $M$ induced by the smooth structure). For smooth curves, we define the tangent vector to $\gamma$ at $t_{0}\in I$ to be $\dot{\gamma}(t_{0}) = \gamma _{*}(\frac{\mathrm{d} }{\mathrm{d} t}|_{t_{0}})$ where $\frac{\mathrm{d} }{\mathrm{d} t}|_{t_{0}}$ is the single coordinate basis vector for $T_{t_{0}}\mathbb{R}$. Note that $\dot{\gamma}(t_{0})\in T_{\gamma(t_{0})}M$, so it is in fact a derivation. In GR, when we say $u^{\mu} = \frac{\mathrm{d} x^{\mu}}{\mathrm{d} \tau}$ is the "tangent vector" to a time-like particle's worldline at $\tau_{0}$, what we are really writing down is $\dot{\gamma}(\tau_{0})x^{\mu} = \frac{\mathrm{d} }{\mathrm{d} \tau}(x^{\mu}\circ \gamma)(\tau_{0})$, but we suppress the composition, as is rather common practice; we of course interpret this as the 4-velocity of the particle.
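To make the derivation picture concrete, here is a minimal numerical sketch. It assumes the simplest possible setting, $M=\mathbb{R}^2$ with the identity chart; the curve `gamma`, the test functions `f` and `g`, and the helper names are all made up for illustration. It builds $\dot{\gamma}(t_0)$ as the map $f\mapsto \frac{\mathrm{d}}{\mathrm{d}t}(f\circ\gamma)(t_0)$, checks the Leibniz rule with finite differences, and reads off the components by applying the derivation to the coordinate functions.

```python
import math

def derivative(h, t, eps=1e-6):
    """Central finite-difference approximation of h'(t)."""
    return (h(t + eps) - h(t - eps)) / (2 * eps)

# Hypothetical example: M = R^2 with the identity chart, and a curve
# passing through p = gamma(0) = (1, 0).
def gamma(t):
    return (math.cos(t), math.sin(t))

def tangent_vector(curve, t0):
    """Return the derivation f -> d/dt (f o curve)(t0)."""
    return lambda f: derivative(lambda t: f(curve(t)), t0)

f = lambda q: q[0] ** 2 + q[1]
g = lambda q: q[0] * q[1] + 3.0

v = tangent_vector(gamma, 0.0)
p = gamma(0.0)

# Leibniz rule at p: v(fg) = f(p) v(g) + g(p) v(f)
lhs = v(lambda q: f(q) * g(q))
rhs = f(p) * v(g) + g(p) * v(f)

# Components u^mu = v(x^mu): apply v to the coordinate functions.
components = (v(lambda q: q[0]), v(lambda q: q[1]))
print(lhs, rhs, components)
```

Up to finite-difference error, `lhs` and `rhs` agree, and `components` is the ordinary velocity of the curve at $t=0$.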

As far as change of coordinates goes, let $(U,\varphi), (V,\psi)$ be two charts on $M$ and let $p\in U\cap V$. Denoting $x^{i}$ as the coordinate functions of $\varphi$ and $x'^{i}$ as those of $\psi$, and using $\varphi^{-1} = \psi^{-1}\circ(\psi\circ\varphi^{-1})$, we proceed as $$\partial_{i}|_{p} = (\varphi^{-1})_{*}\partial_{i}|_{\varphi(p)} = (\psi^{-1})_{*}(\psi\circ \varphi^{-1})_{*}\partial_{i}|_{\varphi(p)} = \frac{\partial x'^{j}}{\partial x^{i}}(\varphi(p))(\psi^{-1})_{*}\partial'_{j}|_{\psi(p)} = \frac{\partial x'^{j}}{\partial x^{i}}(\varphi(p))\partial'_{j}|_{p}$$
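As a quick sanity check of the final formula (an illustrative example, not part of the argument above), take a region of $\mathbb{R}^2$ where polar coordinates form a valid chart, with $x^{i}=(x,y)$ Cartesian and $x'^{j}=(r,\theta)$ polar. Since $r=\sqrt{x^2+y^2}$ and $\tan\theta = y/x$, the formula gives

$$\partial_{x} = \frac{\partial r}{\partial x}\partial_{r} + \frac{\partial \theta}{\partial x}\partial_{\theta} = \cos\theta\,\partial_{r} - \frac{\sin\theta}{r}\,\partial_{\theta},\qquad \partial_{y} = \frac{\partial r}{\partial y}\partial_{r} + \frac{\partial \theta}{\partial y}\partial_{\theta} = \sin\theta\,\partial_{r} + \frac{\cos\theta}{r}\,\partial_{\theta}.$$

These are the familiar chain-rule expressions from multivariable calculus, which is all the pushforward computation amounts to in coordinates.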

> The transformation of the basis vectors is covered in this post.

But as far as I understood that post doesn't show why the transformations of operators are given by the same matrix as transformation of cartesian coordinates on the tangent space. Am I missing something?

Thanks, Newton. Though I didn't get some terms you used.

WannabeNewton
> Thanks, Newton. Though I didn't get some terms you used.

As Fredrik noted, it is hard to answer questions of such a fundamental nature without building up all the machinery from the ground up, and that is hard to do in a forum post. What I can suggest is that you get the book Introduction to Smooth Manifolds by Lee and ask away if you have further questions. The text is very nice if you want to go somewhat more in depth into differential topology.

Fredrik
Staff Emeritus
Gold Member
> But as far as I understood that post doesn't show why the transformations of operators are given by the same matrix as transformation of cartesian coordinates on the tangent space. Am I missing something?

I'm not sure I understand the question. That post shows how the bases associated with two different coordinate systems x and y are related. I understand now that that's not what you wanted to see, but I still don't understand what you want to see.

I second the recommendation for Lee's book on smooth manifolds. It's the best one. But I should probably also mention that it assumes that you know some topology, and doesn't include stuff about curvature. (He has also written books with "topological manifolds" and "Riemannian manifolds" in the title. The former covers topology, and the latter covers curvature). The good news is that even if you skip the parts that involve topology, you will still understand a lot.

dx
Homework Helper
Gold Member
> But as far as I understood that post doesn't show why the transformations of operators are given by the same matrix as transformation of cartesian coordinates on the tangent space. Am I missing something?

The coordinates given to the tangent space by a coordinate system $X^\alpha$ are simply the numbers $v^\alpha$ in

$$v = v^\alpha \frac{\partial}{\partial X^\alpha}$$

Since the basis vectors associated with two coordinate systems X and Y are related as

$$\frac{\partial}{\partial X^\alpha} = \frac{\partial Y^\beta}{\partial X^\alpha}\frac{\partial}{\partial Y^\beta}$$

we can deduce that the components of v in the two coordinates are related as

$$v^\alpha \frac{\partial}{\partial X^\alpha} = v^\alpha \frac{\partial Y^\beta}{\partial X^\alpha}\frac{\partial}{\partial Y^\beta}$$

i.e. the components of v in the Y system are $v^\alpha\,\partial Y^\beta/\partial X^\alpha$.
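A small numerical sketch of this component rule, under assumptions of my own choosing: X is Cartesian coordinates on the plane, Y is polar coordinates, and the point and vector are arbitrary. The function names (`to_polar`, `jacobian`, `transform_components`) are hypothetical helpers, and the Jacobian is approximated by central differences rather than computed exactly.

```python
import math

def to_polar(q):
    """Coordinate change Y(X): Cartesian (x, y) -> polar (r, theta)."""
    x, y = q
    return (math.hypot(x, y), math.atan2(y, x))

def jacobian(Y, q, eps=1e-6):
    """J[b][a] = dY^b/dX^a at q, approximated by central differences."""
    n = len(q)
    m = len(Y(q))
    J = [[0.0] * n for _ in range(m)]
    for a in range(n):
        plus = list(q)
        minus = list(q)
        plus[a] += eps
        minus[a] -= eps
        Y_plus, Y_minus = Y(tuple(plus)), Y(tuple(minus))
        for b in range(m):
            J[b][a] = (Y_plus[b] - Y_minus[b]) / (2 * eps)
    return J

def transform_components(v_X, J):
    """Components in Y: v_Y^b = (dY^b/dX^a) v_X^a, summed over a."""
    return [sum(row[a] * v_X[a] for a in range(len(v_X))) for row in J]

point = (1.0, 1.0)   # arbitrary point, chosen for illustration
v_X = [1.0, 0.0]     # the vector d/dx at that point
v_Y = transform_components(v_X, jacobian(to_polar, point))
print(v_Y)  # approximately [0.7071, -0.5]
```

At $(1,1)$ one has $\partial r/\partial x = 1/\sqrt{2}$ and $\partial\theta/\partial x = -1/2$, so the vector $\partial/\partial x$ has polar components $(1/\sqrt{2}, -1/2)$, matching the printed values.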

I started reading Isham. Thanks, guys.

There is a statement in Isham that $v_1+v_2 := [\varphi^{-1} \circ (\varphi \circ \sigma_1 + \varphi \circ \sigma_2)]$, where the sigmas are curves on the manifold representing $v_1$ and $v_2$. How do we prove that this expression is independent of the chart $(U, \varphi)$ and of the choice of sigmas within their equivalence classes? I intuitively see it, but how do we prove it?

Fredrik
Staff Emeritus
Gold Member
I'll use the following notations (so that I don't have to type Greek letters all the time): M for the manifold. p,q for points in the manifold. C,D for curves in M. x,y for coordinate systems. i,j for indices from 1 to n, where n is the dimension of M. I will use the comma notation for partial derivatives, so the value of the ith partial derivative of f at x is denoted by $f_{,i}(x)$. I will denote the equivalence class that C belongs to by [C].

Let p be an arbitrary point in M. Consider the set of all C such that C is a smooth curve in M, and C(0)=p. We define an equivalence relation on this set by saying that two curves C and D are equivalent if
$$(x\circ C)'(0) =(x\circ D)'(0)$$ for all coordinate systems x such that x(p)=0.

First we need to show that the condition above is independent of the coordinate system. Suppose that the condition above holds for one coordinate system x, and that y is another coordinate system such that y(p)=0. Then
\begin{align}(y\circ C)'(0) &= (y\circ x^{-1}\circ x\circ C)'(0) =(y\circ x^{-1})_{,i}(x(C(0))) (x\circ C)^i{}'(0)\\
&=(y\circ x^{-1})_{,i}(x(D(0)))(x\circ D)^i{}'(0) = (y\circ x^{-1}\circ x\circ D)'(0) =(y\circ D)'(0).
\end{align} Note that the second and fourth equalities are just applications of the chain rule, while the third uses the assumption together with C(0)=D(0)=p.
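Here is a rough numerical illustration of this chart independence. Everything specific in it is an assumption made for the example: the manifold is the plane, p = (1, 1), `chart_x` is a shifted Cartesian chart, `chart_y` is a shifted polar chart (both send p to 0), and the two curves are chosen to have the same velocity in `chart_x` but different second-order terms. Derivatives are approximated by central differences.

```python
import math

P = (1.0, 1.0)  # the point p (hypothetical choice)

def chart_x(q):
    """Shifted Cartesian chart with chart_x(P) = 0."""
    return (q[0] - P[0], q[1] - P[1])

def chart_y(q):
    """Shifted polar chart with chart_y(P) = 0."""
    return (math.hypot(q[0], q[1]) - math.hypot(*P),
            math.atan2(q[1], q[0]) - math.atan2(P[1], P[0]))

def curve_C(t):
    return (P[0] + t, P[1] + 2 * t)

def curve_D(t):
    # Same first-order behaviour as curve_C at t = 0, different curvature.
    return (P[0] + t + t ** 2, P[1] + 2 * t - 3 * t ** 2)

def velocity(chart, curve, eps=1e-6):
    """Approximate (chart o curve)'(0) by central differences."""
    plus, minus = chart(curve(eps)), chart(curve(-eps))
    return tuple((a - b) / (2 * eps) for a, b in zip(plus, minus))

def close(u, v, tol=1e-6):
    return all(abs(a - b) < tol for a, b in zip(u, v))

same_in_x = close(velocity(chart_x, curve_C), velocity(chart_x, curve_D))
same_in_y = close(velocity(chart_y, curve_C), velocity(chart_y, curve_D))
print(same_in_x, same_in_y)
```

The curves agree to first order in `chart_x` by construction, and the check in `chart_y` then comes out the same way, as the chain-rule argument predicts.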

We want to define an addition operation on the set of equivalence classes. So we want to define [C]+[D] for arbitrary C,D in the set of curves considered above. Unfortunately, we can't just define it as [C+D], because C+D is undefined. But $x\circ C$ and $x\circ D$ are curves in $\mathbb R^n$, so their sum is defined. So we use that to define
$$[C]+[D]=[x^{-1}\circ(x\circ C+x\circ D)].$$ We need to verify a) that for each x, the right-hand side is independent of the chosen representatives from the equivalence classes [C] and [D], and b) that the right-hand side is independent of the choice of x.

a) Let E,F be arbitrary members of [C] and [D] respectively. First we need to prove that $x^{-1}\circ(x\circ E+x\circ F)$ is a member of the set on which the equivalence relation has been defined.
\begin{align}
(x^{-1}\circ(x\circ E+x\circ F))(0) &=x^{-1}((x\circ E+x\circ F)(0)) =x^{-1}(x(E(0))+x(F(0))) =x^{-1}(0)=p.
\end{align} Then we need to prove that $x^{-1}\circ(x\circ E+x\circ F)$ is equivalent to $x^{-1}\circ(x\circ C+x\circ D)$. Since we have already proved that it doesn't matter which coordinate system we use in the equivalence condition, we can choose to use x to get a convenient cancellation.
\begin{align}
(x\circ x^{-1}\circ(x\circ E+x\circ F))'(0) &= (x\circ E+x\circ F)'(0) =(x\circ E)'(0)+(x\circ F)'(0) =(x\circ C)'(0)+(x\circ D)'(0)\\
& =(x\circ C+x\circ D)'(0) =(x\circ x^{-1}\circ(x\circ C+x\circ D))'(0)
\end{align}
b) Let y be a coordinate system such that y(p)=0. First we show that $y^{-1}\circ(y\circ C+y\circ D)$ is in the appropriate set by evaluating this function at 0. (The result should be p. I'm not going to type this part of the proof). Then show that $y^{-1}\circ(y\circ C+y\circ D)$ is equivalent to $x^{-1}\circ(x\circ C+x\circ D)$. Edit: Uhh...I see that this is a bit tricky. I don't have the energy to think everything through and type this up now. Maybe tomorrow.

One little warning about Isham. It does a lot of things non-rigorously, and its choice of topics is suitable for people who want to learn the mathematics of gauge theories, not people who want to learn the mathematics of GR. (Tangent spaces are, however, such a basic concept that they're useful both in GR and in gauge theories.)

Fantastic explanation! Thanks a lot. It would be awesome if you could help me with the proof of chart independence too.

Fredrik
Staff Emeritus
Gold Member
For each coordinate system x such that x(p)=0, define $$E_x=x^{-1}\circ(x\circ C+x\circ D).$$ We want to prove that for all x,y such that x(p)=y(p)=0, $E_x$ is equivalent to $E_y$. It's very easy to verify that $E_x(0)=E_y(0)=p$. This ensures that $E_x$ and $E_y$ are members of the set on which the equivalence relation is defined. We will prove that these curves are equivalent by showing that
$$(x\circ E_x)'(0)=(x\circ E_y)'(0).$$ The ith component of the left-hand side is
$$(x\circ E_x)^i{}'(0)=(x\circ C+x\circ D)^i{}'(0)=(x\circ C)^i{}'(0)+(x\circ D)^i{}'(0).$$
The right-hand side is much harder to evaluate. We will need a couple of lemmas. First note that
$$(y\circ C)^i{}'(0)=(y\circ x^{-1}\circ x\circ C)^i{}'(0) =\sum_j (y\circ x^{-1})^i{}_{,j}(0)(x\circ C)^j{}'(0).$$ There's of course a similar result involving D instead of C. Now let $I$ be the identity map on $\mathbb R^n$. We have
$$\delta^i_j =I^i{}_{,j} =(x\circ y^{-1}\circ y\circ x^{-1})^i{}_{,j} =\sum_k(x\circ y^{-1})^i{}_{,k}(0)(y\circ x^{-1})^k{}_{,j}(0).$$ We will need this to get rid of some annoying factors that show up in the evaluation of $(x\circ E_y)^i{}'(0)$.
\begin{align}
(x\circ E_y)^i{}'(0) &=(x\circ y^{-1}\circ (y\circ C+y\circ D))^i{}'(0) =\sum_k(x\circ y^{-1})^i{}_{,k}(0) (y\circ C+y\circ D)^k{}'(0)\\
&=\sum_k(x\circ y^{-1})^i{}_{,k}(0) \sum_j (y\circ x^{-1})^k{}_{,j}(0)\left((x\circ C)^j{}'(0)+(x\circ D)^j{}'(0)\right)\\
&=\sum_j\delta^i_j\left((x\circ C)^j{}'(0)+(x\circ D)^j{}'(0)\right) =(x\circ C)^i{}'(0)+(x\circ D)^i{}'(0)\\
&=(x\circ E_x)^i{}'(0).
\end{align}
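The same statement can be spot-checked numerically. This is only a sketch under assumptions I picked for the example: the manifold is the plane, p = (1, 1), `x` is a shifted Cartesian chart, `y` is a shifted polar chart, and `C`, `D` are two arbitrary curves through p. It builds $E_x$ and $E_y$ and compares their velocities in the chart `x` using central differences.

```python
import math

P = (1.0, 1.0)  # the point p (hypothetical choice)
R_P, TH_P = math.hypot(*P), math.atan2(P[1], P[0])

def x(q):
    return (q[0] - P[0], q[1] - P[1])

def x_inv(u):
    return (u[0] + P[0], u[1] + P[1])

def y(q):
    return (math.hypot(q[0], q[1]) - R_P, math.atan2(q[1], q[0]) - TH_P)

def y_inv(w):
    r, th = w[0] + R_P, w[1] + TH_P
    return (r * math.cos(th), r * math.sin(th))

def C(t):
    return (P[0] + t, P[1] + 2 * t)

def D(t):
    return (P[0] - t, P[1] + t + t ** 2)

def add(u, v):
    return tuple(a + b for a, b in zip(u, v))

def sum_curve(chart, chart_inv):
    """E_chart = chart^{-1} o (chart o C + chart o D)."""
    return lambda t: chart_inv(add(chart(C(t)), chart(D(t))))

def velocity(chart, curve, eps=1e-5):
    """Approximate (chart o curve)'(0) by central differences."""
    plus, minus = chart(curve(eps)), chart(curve(-eps))
    return tuple((a - b) / (2 * eps) for a, b in zip(plus, minus))

E_x = sum_curve(x, x_inv)
E_y = sum_curve(y, y_inv)
vel_Ex = velocity(x, E_x)
vel_Ey = velocity(x, E_y)
print(vel_Ex, vel_Ey)  # both approximately (0.0, 3.0)
```

Both velocities come out as the sum of the velocities of `C` and `D` in the chart `x`, namely (1, 2) + (-1, 1), even though $E_x$ and $E_y$ are different curves.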

Pretty cool. I didn't think you could have proven this straight from the definition.
But this shows that defining the tangent space as a set of derivations is vastly superior. The proofs are much shorter and less complicated. Of course, it's also less intuitive imo.

Fredrik, thank you very much, I really appreciate it.

May I ask you to explain what $\delta^i_j =I^i_{ ,j}$ means? What is the derivative of the identity? Why is the delta equal to the derivative of the identity map?

WannabeNewton
> Fredrik, thank you very much, I really appreciate it.
>
> May I ask you to explain what $\delta^i_j =I^i_{ ,j}$ means? What is the derivative of the identity? Why is the delta equal to the derivative of the identity map?

$[DI]_{S} = I_{n}$ where $S$ is the standard basis of $\mathbb{R}^{n}$. Remember that the total derivative of a linear map, in this case the identity map, is just the linear map itself, so the matrix representation of the total derivative of the identity map, in the standard basis, is just the identity matrix. In index notation this is just $(DI)^{i}_{j} = \delta^{i}_{j}$, where it is understood that the coordinate representation is with respect to the standard basis $\partial _{i} = e_{i}$.

Fredrik
Staff Emeritus
Gold Member
There is a simpler explanation. Here I will write a member of $\mathbb R^n$ as $x=(x^1,\dots,x^n)$. The identity map on $\mathbb R^n$ is the map $I:\mathbb R^n\to\mathbb R^n$ defined by $I(x)=x$ for all $x\in\mathbb R^n$. $I^i$ denotes the "ith component", i.e. $I^i:\mathbb R^n\to\mathbb R$ is defined by $I^i(x)=x^i$ for all $x\in\mathbb R^n$. It follows immediately from the definition of partial derivative that $I^i{}_{,j}(x)=\delta^i_j$ for all x.

In the ∂(something)/∂(something) notation for partial derivatives, we have
$$I^i{}_{,j}=\frac{\partial x^i}{\partial x^j}.$$
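For instance, with $n=2$ (a worked case added purely for illustration) the components are $I^1(x^1,x^2)=x^1$ and $I^2(x^1,x^2)=x^2$, so the matrix of partial derivatives is

$$\begin{pmatrix} I^1{}_{,1} & I^1{}_{,2}\\ I^2{}_{,1} & I^2{}_{,2}\end{pmatrix} =\begin{pmatrix} \dfrac{\partial x^1}{\partial x^1} & \dfrac{\partial x^1}{\partial x^2}\\ \dfrac{\partial x^2}{\partial x^1} & \dfrac{\partial x^2}{\partial x^2}\end{pmatrix} =\begin{pmatrix} 1 & 0\\ 0 & 1\end{pmatrix},$$

which is exactly $\delta^i_j$.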

Fantastic!

In the proof what did you mean by $(y \circ x^{-1})^i{}_{,j}$? It is the inverse map $x^{-1}$ that is involved, so how should the derivative with respect to $x^j$ be understood?

Fredrik
Staff Emeritus
Gold Member
> In the proof what did you mean by $(y \circ x^{-1})^i{}_{,j}$?

For any differentiable function $f:S\to\mathbb R^n$ where $S\subseteq\mathbb R^n$, $f^i$ denotes the map that takes each $x\in S$ to the ith component of $f(x)$, which we would write as $(f(x))^i$. So for all $x\in S$, we have $f^i(x)=(f(x))^i$.

Denote the domains of x and y by U and V respectively. $y\circ x^{-1}$ is a map from $x(U\cap V)$ into $y(U\cap V)$. Both of these are subsets of $\mathbb R^n$. So $(y\circ x^{-1})^i$ is a function from a subset of $\mathbb R^n$ into $\mathbb R$. The ",j" just denotes the jth partial derivative of that function, as defined by any book on calculus that covers partial derivatives.

In particular, these partial derivatives do not have anything to do with the partial derivatives with respect to the coordinate system x. I don't use the comma notation for those. This is the notation I use:
$$\left.\frac{\partial}{\partial x^j}\right|_p\ f =(f\circ x^{-1})_{,j}(x(p)).$$ Here f is a function defined on some subset of M that contains p, so $f\circ x^{-1}$ is a function from a subset of $\mathbb R^n$ into $\mathbb R$.
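To see the last formula in action, here is a small sketch with choices that are mine rather than anything from the thread: the manifold is the plane, the coordinate system x is the polar chart, f is the function that multiplies the two Cartesian components of a point, and p = (1, 1). The helper `coordinate_derivative` is a hypothetical name, and the ordinary partial derivative of $f\circ x^{-1}$ is approximated by a central difference.

```python
import math

def chart(q):
    """Polar coordinate system x: point q -> (r, theta)."""
    return (math.hypot(q[0], q[1]), math.atan2(q[1], q[0]))

def chart_inv(u):
    """x^{-1}: (r, theta) -> point of the plane."""
    r, th = u
    return (r * math.cos(th), r * math.sin(th))

def coordinate_derivative(f, j, p, eps=1e-6):
    """Approximate (f o x^{-1})_{,j} evaluated at x(p)."""
    u = list(chart(p))
    plus, minus = u[:], u[:]
    plus[j] += eps
    minus[j] -= eps
    f_local = lambda w: f(chart_inv(tuple(w)))  # f o x^{-1}, defined on R^2
    return (f_local(plus) - f_local(minus)) / (2 * eps)

f = lambda q: q[0] * q[1]   # a function on the manifold itself
p = (1.0, 1.0)

d_r = coordinate_derivative(f, 0, p)      # d/dr at p applied to f
d_theta = coordinate_derivative(f, 1, p)  # d/dtheta at p applied to f
print(d_r, d_theta)
```

Here $f\circ x^{-1}(r,\theta)=\tfrac12 r^2\sin 2\theta$, so at $x(p)=(\sqrt2,\pi/4)$ the exact values are $\sqrt2$ and $0$. The only derivatives taken are ordinary partial derivatives of a function on a subset of $\mathbb R^2$.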

I understand now, thanks again, Fredrik.

Does the smoothness of a manifold also automatically imply that it can be covered with overlapping charts? So there cannot be any stand-alone chart that doesn't overlap with the rest?

WannabeNewton
> Does the smoothness of a manifold also automatically imply that it can be covered with overlapping charts? So there cannot be any stand-alone chart that doesn't overlap with the rest?

The smooth structure that you endow the manifold with is nothing more than a collection of smoothly compatible charts: the charts determine the smooth structure, not the other way around as you seem to think. Some manifolds can be covered by only one chart ($\mathbb{R}^{n}$), but in general you will need multiple charts, which may or may not overlap, to cover your manifold (e.g. $S^{2}$ needs at least two, such as the two stereographic projections). It all depends on the smooth atlas that you choose to endow it with (a given smooth manifold has uncountably many distinct smooth atlases, btw).
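For a concrete picture of the $S^{2}$ example, here is a short sketch of the two stereographic charts. The function names are hypothetical and the test point is arbitrary. The north chart is defined everywhere except the north pole, the south chart everywhere except the south pole, and on the overlap the transition map works out to $u\mapsto u/|u|^{2}$, which is smooth for $u\neq 0$.

```python
import math

def north_chart(q):
    """Stereographic projection from the north pole; needs z != 1."""
    x, y, z = q
    return (x / (1 - z), y / (1 - z))

def south_chart(q):
    """Stereographic projection from the south pole; needs z != -1."""
    x, y, z = q
    return (x / (1 + z), y / (1 + z))

def north_chart_inv(u):
    """Inverse of north_chart: plane coordinates -> point on the unit sphere."""
    s = u[0] ** 2 + u[1] ** 2
    return (2 * u[0] / (s + 1), 2 * u[1] / (s + 1), (s - 1) / (s + 1))

def transition(u):
    """south_chart o north_chart^{-1}, defined for u != (0, 0)."""
    return south_chart(north_chart_inv(u))

u = (0.5, -2.0)              # arbitrary coordinates in the north chart
q = north_chart_inv(u)       # the corresponding point on the sphere
s = u[0] ** 2 + u[1] ** 2
print(q, transition(u), (u[0] / s, u[1] / s))
```

The printed transition value agrees with $u/|u|^{2}$, and smoothness of this map on the overlap is what makes the two charts smoothly compatible.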

So there can be smooth manifolds that can not be covered by only overlapping charts whatever chart system we try?

WannabeNewton