# Question on vectors in tangent space to a manifold being independent of coordinate chart

1. Jan 17, 2015

In Nakahara's book, "Geometry, Topology and Physics" he states that it is, by construction, clear from the definition of a vector as a differential operator $X$ acting on some function $f:M\rightarrow\mathbb{R}$ at a point $p\in M$ (where $M$ is an $m$-dimensional manifold), $$\frac{df(c(t))}{dt}\Biggr\vert_{t=0}=X^{\mu}\left(\frac{\partial f}{\partial x^{\mu}}\right)\equiv X[f]$$ (with $c(0)=p$) that a vector $X$ exists without specifying the coordinate (i.e. it is coordinate independent).

Is this the case because, on the far left-hand side, the derivative of $f$ with respect to the parameter $t$ is coordinate independent (as it depends on an equivalence class of curves $[c]$ on $M$ parameterised by some real parameter $t\in (a,b)\subset\mathbb{R}$, where $a<0<b$ for convenience, defined by $c:(a,b) \rightarrow M, t\mapsto c(t)$, with the equivalence relation $c\sim \tilde{c}$ defined such that $c(0)=p=\tilde{c}(0)$ and $\frac{dx^{\mu}(c(t))}{dt}\Biggr\vert_{t=0}=\frac{dx^{\mu}(\tilde{c}(t))}{dt}\Biggr\vert_{t=0}$, which are themselves coordinate independent)?

2. Jan 17, 2015

### Matterwave

All your tex wrappers are using the wrong slash... you should use "/" instead of "\". That would make reading your post much easier.

3. Jan 18, 2015

### "Don't panic!"

Whoops, sorry about that.

It wouldn't let me edit my original post so I've reposted it here:

In Nakahara's book, "Geometry, Topology and Physics" he states that it is, by construction, clear from the definition of a vector as a differential operator $X$ acting on some function $f:M\rightarrow\mathbb{R}$ at a point $p\in M$ (where $M$ is an $m$-dimensional manifold), $$\frac{df(c(t))}{dt}\Biggr\vert_{t=0}=X^{\mu}\left(\frac{\partial f}{\partial x^{\mu}}\right)\equiv X[f]$$ (with $c(0)=p$) that a vector $X$ exists without specifying the coordinate (i.e. it is coordinate independent).

Is this the case because, on the far left-hand side, the derivative of $f$ with respect to the parameter $t$ is coordinate independent (as it depends on an equivalence class of curves $[c]$ on $M$ parameterised by some real parameter $t\in (a,b)\subset\mathbb{R}$, where $a<0<b$ for convenience, defined by $c: (a,b) \rightarrow M, t\mapsto c(t)$, with the equivalence relation $c\sim \tilde{c}$ defined such that $c(0)=p=\tilde{c}(0)$ and $\frac{dx^{\mu}(c(t))}{dt}\Biggr\vert_{t=0}=\frac{dx^{\mu}(\tilde{c}(t))}{dt}\Biggr\vert_{t=0}$, which are themselves coordinate independent)?

4. Jan 18, 2015

### Fredrik

Staff Emeritus
I don't know Nakahara's definitions, but this is one of several options that can be considered standard: Let $F$ be the set of smooth functions from $M$ into $\mathbb R$, with addition, scalar multiplication and multiplication defined in the obvious ways. Let $T_pM$ be the vector space of all $v:F\to\mathbb R$ such that

(a) $v$ is linear.
(b) $v(fg)=v(f)g(p)+f(p)v(g)$ for all $f,g\in F$.

This is a manifestly coordinate-independent definition of $T_pM$. (We never even mentioned a coordinate system). It's now possible to show (see p. 82-84 in Isham) that if $x:U\to\mathbb R^n$ is a coordinate system such that $p\in U$, then $\big(\frac{\partial}{\partial x^i}\big|_p\big)_{i=1}^n$ is an ordered basis for $T_pM$.

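
Conditions (a) and (b) can be checked numerically for an operator of the form $v(f)=(f\circ c)'(0)$. The following is only a minimal sketch under stated assumptions: it takes $M=\mathbb R^2$, and the point `p`, the curve `c`, and the test functions `f` and `g` are hypothetical choices made for illustration. Derivatives are approximated by central differences, so the two gaps should come out small rather than exactly zero.

```python
import math


def derivative_at_zero(func, h=1e-5):
    """Central finite-difference approximation of func'(0)."""
    return (func(h) - func(-h)) / (2 * h)


# Hypothetical example data: M = R^2, the point p, and a curve c with c(0) = p.
p = (1.0, 2.0)


def c(t):
    return (1.0 + 3.0 * t + t * t, 2.0 - t + math.sin(t) ** 2)


def v(func):
    """Tangent vector at p acting on a function: v(func) = (func o c)'(0)."""
    return derivative_at_zero(lambda t: func(*c(t)))


# Two arbitrary smooth test functions on M (assumptions for illustration).
def f(x, y):
    return x * x * y


def g(x, y):
    return math.exp(x) + y


def fg(x, y):
    return f(x, y) * g(x, y)


def combo(x, y):
    return 2.0 * f(x, y) - 5.0 * g(x, y)


# (b) Leibniz rule: v(fg) = v(f) g(p) + f(p) v(g)
leibniz_gap = v(fg) - (v(f) * g(*p) + f(*p) * v(g))
# (a) linearity: v(2f - 5g) = 2 v(f) - 5 v(g)
linearity_gap = v(combo) - (2.0 * v(f) - 5.0 * v(g))

print("Leibniz gap:  ", leibniz_gap)
print("linearity gap:", linearity_gap)
```

No coordinate system other than the ambient one used to write down `c`, `f` and `g` enters the definition of `v`; the operator only ever sees the composite `func(*c(t))`.
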
It's also possible to define $T_pM$ as the vector space spanned by the $\frac{\partial}{\partial x^i}\big|_p$, where $x$ is a coordinate system with $p$ in its domain. This is less elegant (because of the apparent coordinate dependence), but requires less of an effort from the reader. The way to show coordinate independence is to show that if $y$ is another coordinate system with $p$ in its domain, then each $\frac{\partial}{\partial y^i}\big|_p$ is a linear combination of the $\frac{\partial}{\partial x^i}\big|_p$. For all $f\in F$, we have
\begin{align}
&\frac{\partial}{\partial y^i}\bigg|_p f =(f\circ y^{-1})_{,i}(y(p)) = (f\circ x^{-1}\circ x\circ y^{-1})_{,i}(y(p)) = (f\circ x^{-1})_{,j}(x\circ y^{-1}(y(p)))\, (x\circ y^{-1})^j{}_{,i}(y(p))\\
& =(x^j\circ y^{-1})_{,i}(y(p))\, (f\circ x^{-1})_{,j}(x(p)) = \frac{\partial x^j(y(p))}{\partial y^i} \frac{\partial}{\partial x^j}\bigg|_p f.
\end{align}

Edit: If we define $T_pM$ as a set of equivalence classes of curves, it's significantly more difficult to show that everything is coordinate independent. It's not sufficient to just show that the definition of the equivalence relation doesn't change if we change the coordinate system. We also have to do the same for the definitions of addition and scalar multiplication. If you want to see most of the details, start with my post here, and follow the links.

Hm, some of those pages take forever to load in my browser, so I should also add a direct link to the post where the independence proofs start. It's here.

Last edited: Jan 18, 2015

5. Jan 18, 2015

### Matterwave

Yes. You can choose which definition you want to take for a vector. Some texts would, instead of defining vectors as differential operators, define vectors as the equivalence class of curves $c$ instead. That would make the definition obviously coordinate independent.
But since each equivalence class of curves corresponds to one differential operator $X$, it is a matter of choice which concept you want to identify with a "tangent vector".

6. Jan 18, 2015

### lavinia

The thought that I stick to for the differential operator definition of tangent vector is that directional derivatives are pointwise operations, that is: given a vector, one can take the directional derivative of a function at a point by differentiating along any curve that fits the vector at that point. Only the vector matters, not the curve. This means that a vector can be thought of as an operator on functions. In other words it makes sense to talk about v.f at a point and of course the Leibniz Rule is satisfied naturally.

On a manifold the idea of a vector is a little subtle because one has different coordinate charts and one needs to know when two vectors represented in two overlapping charts are really the same. The answer is given by the Chain Rule, but one can also completely avoid coordinates by defining vectors in the first place as operators on functions. This is a coordinate-free way of doing it.

For me, the operator definition is a bit non-intuitive and I always feel more grounded thinking in terms of directional derivatives. One can formalize this method by talking about equivalence classes of curves, but this is a formalism. The underlying idea is directional derivatives.

7. Jan 19, 2015

### "Don't panic!"

Is this just a definition, or is there a deeper reasoning behind it?

I really find it quite confusing that in the book that I'm reading the only justification given is the following: "By construction, it is obvious that a vector $X$ exists without specifying a coordinate system (i.e. it is coordinate independent)" and the author then refers to the following equation as justification: $$\frac{df(c(t))}{dt}\Biggr\vert_{t=0} = X^{\mu}\frac{\partial f}{\partial x^{\mu}}\equiv X[f]$$ where $X^{\mu}= \frac{dx^{\mu}(c(t))} {dt}\Biggr\vert_{t=0}$.

I may be being a little stupid, but I don't really see what's obvious about the above equation that makes it coordinate independent (other than my suggestion in the original post)?

8. Jan 19, 2015

### lavinia

That is a directional derivative. The formula just verifies that it depends only on the vector. It is a basic property of derivatives. You would do well to verify for yourself that the directional derivative is independent of the curve.

9. Jan 19, 2015

### Fredrik

Staff Emeritus
It's a definition, and there's a reason. The reason is that this vector space is isomorphic to the vector space of equivalence classes of curves through p. (See the links at the end of my first post in this thread).

That sentence doesn't make sense to me. I suspect that what he has in mind is something like this:
\begin{align}
&\frac{df(c(t))}{dt}\Biggr\vert_{t=0} = (f\circ c)'(0) = (f\circ x^{-1}\circ x\circ c)'(0) =(f\circ x^{-1})_{,i}(x(c(0))) (x\circ c)^i{}'(0)\\
&= (x\circ c)^i{}'(0) \frac{\partial}{\partial x^i}\bigg|_{c(0)} f.
\end{align}
The right-hand side must be coordinate independent because it's equal to the left-hand side, which doesn't even mention a coordinate system.

10. Jan 19, 2015

### "Don't panic!"

That's what I had thought. I assume this is because the function $f$ and the curve $c(t)$ are themselves independent of any coordinate system?

11. Jan 19, 2015

### "Don't panic!"

Is this correct?

Let $f:M\rightarrow\mathbb{R}$ be a $C^{k}$ function and let $X\in T_{p}M$ be a vector in the tangent space to $M$ at the point $p$. There exists a curve $\gamma: (a,b) \rightarrow M$ in the neighbourhood of $p$ such that $\gamma (0)=p$ and $\frac{d\gamma(t)}{dt}\Biggr\vert_{t=0}=X^{\mu}$ (where $X^{\mu}$ is the $\mu^{th}$ component of $X$).
We have then, that the directional derivative of $f$ at the point $p\in M$ ($M$ is an $m$-dimensional manifold) with respect to $X$ is $$X[f]=\frac{d(f\circ\gamma)}{dt}\Biggr\vert_{t=0}$$

Let $\gamma_{1}: (a,b) \rightarrow M$ and $\gamma_{2}: (c,d) \rightarrow M$ be two curves passing through a point $p\in M$ satisfying $$\gamma_{1}(0)=p=\gamma_{2}(0), \qquad\qquad\frac{d\gamma_{1}(t)}{dt}\Biggr\vert_{t=0}=\frac{d\gamma_{2}(s)}{ds}\Biggr\vert_{s=0}$$

We have then, using the definition given above, that
$$\begin{aligned}
X[f] &= \frac{df(\gamma_{1}(t))}{dt}\Biggr\vert_{t=0}=f'(\gamma_{1}(t))\Biggr\vert_{t=0}\gamma'_{1}(t)\Biggr\vert_{t=0}\\
&=f'(p)\gamma'_{1}(0)=f'(p)\gamma'_{2}(0)\\
&=f'(\gamma_{2}(s))\Biggr\vert_{s=0}\gamma'_{2}(s)\Biggr\vert_{s=0}\\
&=\frac{df(\gamma_{2}(s))}{ds}\Biggr\vert_{s=0}
\end{aligned}$$

And from this, we see that the directional derivative of $f$ in the direction of $X$ is independent of the choice of curve passing through the point $p\in M$.

12. Jan 19, 2015

### Hawkeye18

Let me add my 2 cents.

First of all, any differential operator is coordinate independent by definition: it is usually given in a particular coordinate system with the understanding that in a different coordinate system one should get the same result. Then, given a representation in a particular coordinate system, one can get a representation in any other coordinate system using the chain rule and the above understanding that the operator should be coordinate independent.

Second, the thing I do not like about a lot of "modern" books on differential geometry is that they are often very formal, "algebraic", giving almost no geometrical or physical intuition. I think if one wants to understand the intuition behind the definition, one should start with embedded manifolds.

An embedded manifold $M$ of dimension $n$ in $\mathbb R^N$ is a set that can be locally represented as an image of a local representation function $\varphi:V\to \mathbb R^N$, where $V$ is an open subset of $\mathbb R^n$, $\varphi$ is a smooth injective function whose derivative matrix has full rank everywhere, and $\varphi$ is a homeomorphism between $V$ and $\varphi(V)$ (the topology on $\varphi(V)$ is inherited from $\mathbb R^N$). The requirement that $\varphi$ is a homeomorphism (i.e. that $\varphi^{-1}:\varphi(V) \to V$ is continuous) might look a bit too technical, but it is needed to forbid accumulation of different "folds" of the manifold $M$ to a point on $M$.

A probably somewhat more intuitive definition of an embedded manifold is the one used in Spivak's "Calculus on Manifolds". The intuition behind this definition is that the simplest embedded $n$-dimensional manifold in $\mathbb R^N$ is $\mathbb R^n\times 0 \subset \mathbb R^n\times \mathbb R^{N-n} =\mathbb R^N$, so the embedded manifold should locally (up to a diffeomorphism) look like this picture. Thus, a subset $M$ of $\mathbb R^N$ is an embedded $n$-dimensional manifold if for any point $p\in M$ there is a neighborhood $\widetilde W$ of $p$ in $\mathbb R^N$, an open set $\widetilde V$ in $\mathbb R^n\times \mathbb R^{N-n} =\mathbb R^N$, and a diffeomorphism $\widetilde\varphi :\widetilde V\to \widetilde W$ such that $$M\cap \widetilde W = \widetilde\varphi \big(\widetilde V\cap (\mathbb R^n\times 0)\big).$$

Restricting $\widetilde \varphi$ to $V=\widetilde V\cap (\mathbb R^n\times 0)$ one gets the function $\varphi$ from the previous definition. Similarly, from the $\varphi$ of the first definition one can get $\widetilde \varphi$; one needs the inverse function theorem here. The equivalence of the two definitions is proved, for example, in Spivak.

For an embedded manifold $M$ there are several equivalent natural definitions of the tangent space at a point $p\in M$.
The "physical" one is to consider all trajectories on $M$ going through $p$ (say all $C^1$ functions $t\mapsto x(t)\in \mathbb R^N$, $x(t) \in M$, $x(0)=p$) and for all trajectories calculate the velocities $x'(0)$ at $p$. Another natural definition would be a "geometric" one, using approximation by linear subspaces.

Of course, all the definitions are equivalent, and the tangent space can be computed in terms of the local parametrization function $\varphi$: it is exactly the range (the column space) of the Jacobi matrix $\varphi'(a)$, where $\varphi(a)=p$. The columns of $\varphi'(a)$ form a natural basis in the tangent space, so given a local parameterization $\varphi$, it is natural to use in the tangent space the coordinates in this basis. Note that the inverse map $\varphi^{-1}$ gives us a coordinate chart in a neighborhood of $p\in M$, so we can say that given a coordinate chart for an embedded manifold we have a natural basis and natural coordinates in the tangent space.

Now, suppose we have a different local parameterization $\psi$ (and so a different coordinate chart $\psi^{-1}$). We then have a different basis and a different system of coordinates in the tangent space. The change of coordinates matrix from one basis to another is not hard to compute. Namely, if $T=T_pM$ is the "real" tangent space to $M$ at the point $p$ (by real I mean that this is a subspace of $\mathbb R^N$ and not some abstract crap) and $v\in T_pM$, then $(\varphi'(a))^{-1} v$ gives the coordinates of $v$ in the basis consisting of columns of $\varphi'(a)$.

There is a question, of course, of how to interpret $(\varphi'(a))^{-1}$, because the matrix $\varphi'(a)$ is not square, and so it is not invertible. But this matrix defines a linear isomorphism between $\mathbb R^n$ and $T_pM$, so $(\varphi'(a))^{-1}$ is well defined as a map from $T_pM$ to $\mathbb R^n$.

Note also that the coordinates can be computed using the derivatives of the function $\widetilde \varphi$ introduced above: $$(\varphi'(a))^{-1} v =(\widetilde\varphi'(a))^{-1} v \qquad \forall v\in T_pM;$$ here $\widetilde \varphi'(a)$ is an invertible $N\times N$ matrix. The above identity follows from the fact that if we restrict the matrix $\widetilde \varphi'(a)$ to $\mathbb R^n\times 0$ we get exactly the matrix $\varphi'(a)$.

Continuing with the change of coordinate matrix, we can see that the matrix $(\psi'(a))^{-1}\varphi'(a)$ gives us the change of coordinate formula from $\varphi$- to $\psi$-coordinates in the tangent space. And the chain rule implies that this matrix is exactly the derivative of the change of coordinate function $\psi^{-1}\circ\varphi$ (evaluated at $a$).

There is a little detail here, because the classical chain rule in analysis requires the functions to be defined on open subsets of the underlying space ($\mathbb R^N$ in our case), but the function $\psi^{-1}$ is not defined on an open subset of $\mathbb R^N$. But there is an easy way around: one just needs to notice that $\psi^{-1}\circ \varphi$ can be obtained from $\widetilde\psi^{-1}\circ \widetilde\varphi$ by restricting the input to the first $n$ coordinates (i.e. to points $x\times 0\in V\times 0 \subset \mathbb R^n\times \mathbb R^{N-n}$) and noticing that the output in this case will be $\psi^{-1}\circ \varphi(x)\times 0\in \mathbb R^n\times \mathbb R^{N-n}$. Then we can apply the chain rule to the function $\widetilde\psi^{-1}\circ \widetilde\varphi$ and then restrict everything to the first $n$ coordinates, to get the change of coordinate formula we just stated (the change of coordinate matrix is exactly the derivative of the change of coordinate function $\psi^{-1}\circ\varphi$ evaluated at $a$).

BTW, this also proves that the change of coordinate function $\psi^{-1}\circ \varphi$ is smooth; I do not know a proof not using the extended functions $\widetilde\varphi$, $\widetilde \psi$.

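
The change of coordinate formula can be sanity-checked numerically on a concrete surface. The sketch below is only an illustration under assumptions that are not part of the post: the manifold is taken to be the paraboloid $z=x^2+y^2$ in $\mathbb R^3$, `phi` is a graph parameterization, `psi` is a polar-type parameterization, and `transition` is the closed form of $\psi^{-1}\circ\varphi$ away from the origin. The point `b` (a label introduced here) is the $\psi$-parameter of the same point $p=\varphi(a)$. Since $\varphi=\psi\circ(\psi^{-1}\circ\varphi)$, the statement that the change of coordinate matrix equals the derivative of $\psi^{-1}\circ\varphi$ is equivalent to $\psi'(b)\,D(\psi^{-1}\circ\varphi)(a)=\varphi'(a)$, which is what the code compares using finite-difference Jacobians.

```python
import math


def jacobian(func, a, h=1e-5):
    """Central finite-difference Jacobian of func at a, as a list of rows."""
    cols = []
    for j in range(len(a)):
        up, dn = list(a), list(a)
        up[j] += h
        dn[j] -= h
        f_up, f_dn = func(*up), func(*dn)
        cols.append([(u - d) / (2 * h) for u, d in zip(f_up, f_dn)])
    # cols[j][i] holds d(func_i)/d(a_j); transpose to rows[i][j].
    return [[cols[j][i] for j in range(len(a))] for i in range(len(cols[0]))]


def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


# Hypothetical example: two local parameterizations of the paraboloid z = x^2 + y^2.
def phi(u, v):
    return (u, v, u * u + v * v)


def psi(r, t):
    return (r * math.cos(t), r * math.sin(t), r * r)


def transition(u, v):
    """Closed form of psi^{-1} o phi, valid away from the origin."""
    return (math.hypot(u, v), math.atan2(v, u))


a = (1.0, 0.5)        # phi-parameter of the point p = phi(a)
b = transition(*a)    # psi-parameter of the same point, so psi(b) = phi(a)

D_phi = jacobian(phi, a)             # 3 x 2, columns span the tangent space at p
D_psi = jacobian(psi, b)             # 3 x 2, a second basis of the same space
D_trans = jacobian(transition, a)    # 2 x 2, derivative of psi^{-1} o phi at a

# Chain rule: psi'(b) * D(psi^{-1} o phi)(a) should reproduce phi'(a).
product = matmul(D_psi, D_trans)
max_gap = max(abs(product[i][j] - D_phi[i][j])
              for i in range(3) for j in range(2))
print("largest entrywise gap:", max_gap)
```

The gap is limited by the finite-difference step, so it should be small but not exactly zero.
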
Now, finishing the story: after we have the change of coordinates formula, we notice that the same formula governs the change of coefficients of the first order differential operators (if we require the operator to be coordinate-independent, i.e. to give the same result in all coordinate systems). Thus, the language of first order differential operators gives a convenient formalism for the description of the tangent space. The operators ${\partial}/{\partial x^k}$ (evaluated at $a$) give us a basis in the abstract tangent space in $\varphi$-coordinates. In the "real" tangent space the $k$th column of $\varphi'(a)$ corresponds to the abstract vector ${\partial}/{\partial x^k}$.

After that one can introduce many equivalent definitions of the abstract tangent space; I think most of them are presented in this thread. Some of them are more intuitive than others, some are very elegant but completely non-intuitive. For example, the definition in terms of classes of equivalent trajectories is quite intuitive. You can bring some intuition into the definition using differential operators: you need vectors in order to compute derivatives along them, and that is exactly what the differential operator does.

These definitions are more or less convincing, in the sense that you can convince yourself that they give you something similar to the "real" tangent space for an embedded manifold. But to show that they give you exactly the "real" tangent space for the embedded manifold, you need the narrative presented in this post or something similar.

I understand why many of the authors do not like to present this narrative: filling in all the details requires of the reader a good understanding of analysis in several variables and quite a bit of work. And an abstract definition allows one to start writing formulas and proving theorems pretty fast, but the price is often that students have very little understanding of the connection between abstract and real life objects.
I think it is beneficial for a student to have an understanding of going from embedded manifolds to abstract ones, even without filling in all the details: this might be very beneficial for the geometric intuition. Note that similar narratives (from embedded to abstract manifolds) exist for other objects like the metric tensor, curvature, etc.

13. Jan 19, 2015

### Fredrik

Staff Emeritus
Yes, it looks good, except for some inaccuracies like $\frac{d\gamma(t)}{dt}\Biggr\vert_{t=0}=X^{\mu}$ that you're probably aware of. I like the dot notation for tangent vectors of curves. I define $\dot\gamma(t)$ for each t by $\dot\gamma(t) f =(f\circ\gamma)'(t)$ for all smooth f. $X^\mu$ enters the picture when you use the chain rule:
\begin{align}
\dot\gamma(t) f =(f\circ\gamma)'(t) =(f\circ x^{-1}\circ x\circ \gamma)'(t) =(f\circ x^{-1})_{,i}(x(\gamma(t))) (x\circ\gamma)^i{}'(t) =(x\circ\gamma)^i{}'(t) \frac{\partial}{\partial x^i}\bigg|_{\gamma(t)} f.
\end{align}

14. Jan 20, 2015

### "Don't panic!"

Yes, sorry I realised that after the point at which I was able to edit the post. Other than that are there any other inaccuracies that I should know about or have I understood it correctly?

15. Jan 20, 2015

### "Don't panic!"

Also, would it be enough to say what you put (in the quote above) to justify the coordinate independence of a tangent vector at a given point $p\in M$?

16. Jan 20, 2015

### Fredrik

Staff Emeritus
That question doesn't entirely make sense. Every element of every set is coordinate independent. A tangent vector at p is coordinate independent just like the number 3 is coordinate independent.

What makes sense is to ask if an expression (=string of text) such as $(x\circ c)^i{}'(0)\frac{\partial}{\partial x^i}\big|_{c(0)}$ is coordinate independent in the sense that it represents the same element of the same set, regardless of what coordinate system the "$x$" represents. If you can rewrite the expression so that it doesn't contain any references to a coordinate system, you will immediately know that the answer is "yes".
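
To see this concretely, one can evaluate $(x\circ c)^i{}'(0)\,\frac{\partial}{\partial x^i}\big|_{c(0)} f$ in two different coordinate systems and compare both results with $(f\circ c)'(0)$, which never mentions a chart. The following is a rough numerical sketch, not a proof. It assumes $M$ is the right half of $\mathbb R^2$, and the curve `c`, the function `f`, and the two charts (`cart` and `polar`, with their inverses) are hypothetical choices made for illustration. All derivatives are central finite differences.

```python
import math

H = 1e-5


def ddt0(func, h=H):
    """Central finite-difference approximation of func'(0)."""
    return (func(h) - func(-h)) / (2 * h)


def partial(func, a, i, h=H):
    """Central finite-difference partial derivative of func at a in slot i."""
    up, dn = list(a), list(a)
    up[i] += h
    dn[i] -= h
    return (func(*up) - func(*dn)) / (2 * h)


# Two charts on M = {(x, y) : x > 0}, each given with its inverse.
def cart(pt):
    return pt


def cart_inv(u, v):
    return (u, v)


def polar(pt):
    x, y = pt
    return (math.hypot(x, y), math.atan2(y, x))


def polar_inv(r, th):
    return (r * math.cos(th), r * math.sin(th))


# Hypothetical curve through p = c(0) and a smooth function on M.
def c(t):
    return (1.0 + t, 0.5 + 2.0 * t - t * t)


def f(pt):
    x, y = pt
    return x * x * y + math.sin(y)


def via_chart(chart, chart_inv):
    """Sum over i of (chart o c)^i'(0) times d(f o chart^{-1})/du^i at chart(c(0))."""
    a = chart(c(0.0))
    total = 0.0
    for i in range(2):
        component = ddt0(lambda t: chart(c(t))[i])
        basis_on_f = partial(lambda u, v: f(chart_inv(u, v)), a, i)
        total += component * basis_on_f
    return total


direct = ddt0(lambda t: f(c(t)))          # no chart involved
in_cartesian = via_chart(cart, cart_inv)  # components (x o c)^i'(0)
in_polar = via_chart(polar, polar_inv)    # components (y o c)^i'(0)

print(direct, in_cartesian, in_polar)
```

The individual components and the individual partial derivatives differ between the two charts; only their contracted sum agrees, up to finite-difference error.
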

17. Jan 20, 2015

### "Don't panic!"

Ah, ok. So as the quantity $(x\circ c)^{i\prime}(0) \frac{\partial}{\partial x^{i}}\Biggr\vert_{c(0)} f$ is equal to a quantity that is coordinate independent (namely $\frac{d(f\circ c(t))}{dt}\Biggr\vert_{t=0}$) it must be itself coordinate independent?! (Sorry for the, my brain is being a little sluggish today)

18. Jan 20, 2015

### Fredrik

Staff Emeritus
Yes.

19. Jan 20, 2015

### "Don't panic!"

Ok, great. Thanks for your help.