# Question about the derivation of the tangent vector on a manifold

I am trying to understand the following derivation in my lecture notes. We are given an n-dimensional manifold ##M## and a parametrized curve ##\gamma : (-\epsilon, \epsilon) \rightarrow M : t \mapsto \gamma(t)##, with ##\gamma(0) = \mathbf{P} \in M##.

Also define an arbitrary (dummy) scalar field ##f \in C^{\infty}(M)##, and $$\bar{\gamma} = \phi_\alpha \circ \gamma, \qquad \qquad \bar{f} = f \circ \phi_\alpha^{-1}.$$ Here ##\phi_\alpha(\mathbf{P}')## gives the coordinates of the point ##\mathbf{P'}## on the manifold, and ##\phi_\alpha^{-1}## is its inverse. The tangent vector of ##\gamma## at ##\mathbf{P}## is then given as follows, $$\dot{\gamma}(0)f = \left.\frac{d}{dt}(f \circ \gamma)\right|_{t=0} = \left.\frac{d}{dt}(\bar{f} \circ \phi_\alpha \circ \phi_\alpha^{-1} \circ \bar{\gamma})\right|_{t=0} = \left.\frac{d}{dt}(\bar{f} \circ \bar{\gamma})\right|_{t=0} = \dot{x}^i(0) \left.\frac{\partial\bar{f}}{\partial x^i}\right|_{x(0)}.$$

Here ##x^i(t) = \bar{\gamma}^i(t)## are the coordinates of the curve and ##\dot{x}^i(t) = \dot{\bar{\gamma}}^i(t)## are their derivatives with respect to ##t##. I understand everything up to this point. Next, however, the following step is introduced, $$\dot{x}^i(0) \left.\frac{\partial\bar{f}}{\partial x^i}\right|_{x(0)} = \dot{x}^i(0) \left.\frac{\partial f}{\partial x^i}\right|_{\gamma(0)}.$$

I cannot understand or derive why this holds. The lhs makes sense to me, since ##\bar{f}## takes coordinates as its arguments, and thus differentiating with respect to the coordinates seems logical. However, on the rhs we are suddenly differentiating ##f##, which takes a point ##\mathbf{P}## on the manifold as its argument. How can we still differentiate this with respect to the coordinates? Trying to fill in the definitions of ##\bar{f}## and ##\bar{\gamma}## on the lhs also doesn't seem to take me anywhere. Any help would be greatly appreciated.

Infrared
Gold Member
It's hard to say without seeing how your class set things up, but I'd say that ##\frac{\partial f}{\partial x^i}=\frac{\partial (f\circ \phi_{\alpha}^{-1})}{\partial x^i}## is the usual definition of the LHS.

If this is not what your class did, can you give your definition of ##\frac{\partial f}{\partial x^i}?##

Decimal
After looking into this a little more I agree with you that the equation is actually a definition and not a result. This still confuses me, though: what are you actually defining? Aren't all the functions in this case already explicitly defined, so what is there left to choose? It seems odd to me to set two separate things equal "by definition" when these things have already been explicitly defined earlier.

fresh_42
Mentor
It is always a good idea to make a little drawing of the spaces involved: ##M, \mathbb{R}^n, \mathbb{R}## and the functions between them: ##f, \bar{f}, \phi_\alpha, \gamma,\bar{\gamma}##.

Personally, I would even write ##f:M\longrightarrow N## instead of ##f:M\longrightarrow \mathbb{R}## and name the one coordinate of ##N## as ##y##.

Decimal
I am not a star at drawing ;), but I can try to give an overview of my current intuitive understanding of these functions. Please correct me if something is wrong.

On the manifold ##M##, which I envision as a "curved sheet", there exist points ##\mathbf{P}##. The chart ##\phi_{\alpha}## takes such a point and produces a set of ##n## coordinates belonging to that point, so maps into ##\mathbb{R}^n##.

The function ##f## assigns to each point ##\mathbf{P}## on the manifold a certain numerical value, so it maps into ##\mathbb{R}##. This function is really just used so that we can define a tangent vector by the rate of change of this function. The function ##\bar{f}## then assigns a numerical value to each set of coordinates.

##\gamma## represents a curve on the manifold ##M##, which I envision as a line on the curved sheet. This function takes a numerical value and produces a certain point on the manifold, so maps into ##M##. The function ##\bar{\gamma}## does the same but produces the set of coordinates belonging to that point, so maps into ##\mathbb{R}^n##.
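To make the overview above concrete, here is a minimal Python sketch of these maps. Everything specific in it is an assumption chosen for illustration and does not come from the lecture notes: ##M## is taken to be the open upper hemisphere of the unit sphere, the chart `phi` projects onto the first two coordinates, and `f` and `gamma` are arbitrary smooth choices.

```python
import math

# Hypothetical example: M = open upper hemisphere of the unit sphere in R^3.

def phi(P):
    """Chart phi: M -> R^2, sending a point (X, Y, Z) to its coordinates (X, Y)."""
    X, Y, Z = P
    return (X, Y)

def phi_inv(x):
    """Inverse chart phi^{-1}: R^2 -> M, lifting coordinates back onto the hemisphere."""
    x1, x2 = x
    return (x1, x2, math.sqrt(1.0 - x1**2 - x2**2))

def f(P):
    """Scalar field f: M -> R (an arbitrary smooth choice)."""
    X, Y, Z = P
    return X * Z + Y

def f_bar(x):
    """Coordinate representation f_bar = f o phi^{-1}: R^2 -> R."""
    return f(phi_inv(x))

def gamma(t):
    """Curve gamma: (-eps, eps) -> M, here a circle of latitude."""
    theta = 0.6  # fixed polar angle, an arbitrary choice
    return (math.sin(theta) * math.cos(t),
            math.sin(theta) * math.sin(t),
            math.cos(theta))

def gamma_bar(t):
    """Coordinate representation gamma_bar = phi o gamma: (-eps, eps) -> R^2."""
    return phi(gamma(t))

# f o gamma and f_bar o gamma_bar are the same function R -> R.
for t in (-0.1, 0.0, 0.1):
    assert math.isclose(f(gamma(t)), f_bar(gamma_bar(t)), rel_tol=1e-12)
```

The point of the sketch is only the bookkeeping: `f` and `gamma` touch the manifold, while `f_bar` and `gamma_bar` live entirely in coordinates, and the two compositions agree.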

fresh_42
Mentor
I wouldn't start with tangent vectors, just the situation at the beginning. That ##f(M)\subseteq \mathbb{R}## is a source of confusion, as we automatically have a trivial chart here, and the function values are indistinguishable from the coordinates: all are real numbers. That's why I suggested using a neutral notation ##f:M \longrightarrow N## and only going back to ##N=\mathbb{R}## if necessary. The drawing doesn't need to be of high accuracy, just good enough to distinguish the spaces. E.g.

[Image: sketch of the spaces ##M, \mathbb{R}^n, N## and the maps between them]

I drew the codomain of ##f## two-dimensional, only to visualize it. This is wrong, as it is only a real line, but then the values of ##f## couldn't be seen.

I read ##\dot{x}^i(0) \left.\dfrac{\partial\bar{f}}{\partial x^i}\right|_{x(0)} = \dot{x}^i(0) \left.\dfrac{\partial f}{\partial x^i}\right|_{\gamma(0)}## as a definition and would have written ##\bar{\gamma}(0)## instead of ##x(0)## on the left.

It says basically: We need the chart to calculate the differentiation. But we are interested in a function that ends up in its own chart anyway, so we can avoid the manifold and calculate in coordinates from the start by using ##\bar{f}## instead.

Decimal
@fresh_42 Thank you! That helps a lot, I feel like my intuition still needs a lot of work in this area but it's getting much clearer already.

fresh_42
Mentor
https://www.physicsforums.com/insights/the-pantheon-of-derivatives-i/

It is confusing, because there are so many different points of view, depending on what is considered the variable: the point of evaluation as at school, the function, the direction, the coordinates (and which ones, those of the domain or of the codomain), the transformation from function to tangent, the Jacobi matrix, or whatever. It is more a question of where we are at a certain point of calculation than what it is. That's why I like those drawings to separate the many spaces involved. If you replace the chart ##\phi_\alpha## by ##D_p## then a similar image can be used for tangent spaces. This way ##\mathbb{R}^n## isn't the coordinate chart anymore, but the Euclidean tangent space of ##M##.

If you study physics, then you will need the coordinates very much. Consider the system as a language and the formulas as vocabulary. Unfortunately, there is no unique notation. ##J_p(f)(v)=D_pf(v)=\nabla_p(f).v=\langle \operatorname{grad}(f)(p),v\rangle = \left.\dfrac{df}{dx}\right|_p \cdot v= v\cdot \left.\dfrac{df}{dx}\right|_p ## all mean the same thing: a real number, the slope of ##f## at point ##p## in direction ##v##.
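As a small illustration of two of these notations giving the same number, here is a hedged Python sketch in plain ##\mathbb{R}^2##. The function `f`, the point `p` and the direction `v` are arbitrary choices made up for the example. It compares ##\langle \operatorname{grad}(f)(p),v\rangle##, using a hand-computed gradient, with a finite-difference estimate of the derivative of ##t \mapsto f(p+tv)## at ##t=0##.

```python
import math

def f(x, y):
    """An arbitrary smooth function R^2 -> R, chosen only for illustration."""
    return x**2 * y + math.sin(y)

def grad_f(x, y):
    """Gradient of f, computed by hand."""
    return (2.0 * x * y, x**2 + math.cos(y))

def directional_derivative(func, p, v, h=1e-6):
    """Central-difference estimate of d/dt func(p + t v) at t = 0."""
    forward = func(p[0] + h * v[0], p[1] + h * v[1])
    backward = func(p[0] - h * v[0], p[1] - h * v[1])
    return (forward - backward) / (2.0 * h)

p = (1.0, 0.5)
v = (2.0, -1.0)

via_gradient = sum(g * vi for g, vi in zip(grad_f(*p), v))  # <grad f(p), v>
via_limit = directional_derivative(f, p, v)                 # D_p f(v), numerically

assert math.isclose(via_gradient, via_limit, abs_tol=1e-7)
```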

Infrared
Gold Member
The LHS hasn't been defined previously. Although ##f:M\to\mathbb{R}## is already defined, we haven't said yet what it means to take a partial derivative of a real-valued function defined on a manifold. So we just define it to be the derivative of the corresponding function in local coordinates.
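Here is a rough numerical sketch of that definition in Python. The setup is entirely an assumption for illustration: ##M## is the upper unit hemisphere, the chart is projection onto the first two coordinates, and `f` and the curve are arbitrary. The partial derivative of ##f## at a point of ##M## is computed as the ordinary partial derivative of ##f\circ\phi_\alpha^{-1}## at the coordinates of that point, and the result is checked against ##\frac{d}{dt}(f\circ\gamma)|_{t=0}##.

```python
import math

THETA = 0.6  # fixed polar angle of the example curve, an arbitrary choice

def phi_inv(x):
    """Inverse chart R^2 -> M (upper unit hemisphere), a hypothetical choice."""
    x1, x2 = x
    return (x1, x2, math.sqrt(1.0 - x1**2 - x2**2))

def f(P):
    """Scalar field on M, chosen arbitrarily."""
    X, Y, Z = P
    return X * Z + Y

def f_bar(x):
    """f in local coordinates: f o phi^{-1}."""
    return f(phi_inv(x))

def gamma_bar(t):
    """The curve in coordinates, x^i(t)."""
    return (math.sin(THETA) * math.cos(t), math.sin(THETA) * math.sin(t))

def gamma(t):
    """The curve on the manifold, phi^{-1} o gamma_bar."""
    return phi_inv(gamma_bar(t))

def derivative(g, t, h=1e-5):
    """Central-difference derivative of a function of one real variable."""
    return (g(t + h) - g(t - h)) / (2.0 * h)

def partial(g, x, i, h=1e-5):
    """Central-difference partial derivative of g with respect to x^i."""
    up, down = list(x), list(x)
    up[i] += h
    down[i] -= h
    return (g(tuple(up)) - g(tuple(down))) / (2.0 * h)

x0 = gamma_bar(0.0)  # coordinates of P = gamma(0)
x_dot = [derivative(lambda t, i=i: gamma_bar(t)[i], 0.0) for i in range(2)]

# Definition: (df/dx^i) at P  :=  d(f o phi^{-1})/dx^i at phi(P)
df_dx = [partial(f_bar, x0, i) for i in range(2)]

lhs = derivative(lambda t: f(gamma(t)), 0.0)        # d/dt (f o gamma) at t = 0
rhs = sum(x_dot[i] * df_dx[i] for i in range(2))    # x_dot^i * df/dx^i

assert math.isclose(lhs, rhs, abs_tol=1e-7)
```

Note that `f` itself is never differentiated with respect to coordinates in the code; only `f_bar` is, which is exactly what the definition says.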

This is probably too pedantic, but maybe helpful to think about anyway. In a coordinate system around ##p##, we have the tangent vectors ##\frac{\partial}{\partial x^i}## in ##T_pM##, which is really shorthand for ##{\phi_{\alpha}^{-1}}_*\left(\frac{\partial}{\partial x^i}\right).## Viewing both sides ##\frac{\partial}{\partial x^i}={\phi_{\alpha}^{-1}}_*\left(\frac{\partial}{\partial x^i}\right)## as derivations on ##M##, we have ##\frac{\partial f}{\partial x^i}=\frac{\partial (f\circ\phi_{\alpha}^{-1})}{\partial x^i}## just by definition.

On the other hand, we could have defined ##\frac{\partial f}{\partial x^i}=f_*\left(\frac{\partial}{\partial x^i}\right)=(f\circ\phi_{\alpha}^{-1})_*\left(\frac{\partial}{\partial x^i}\right),## where we identify ##T_{f(p)}\mathbb{R}## with ##\mathbb{R}.##

This is the derivation on ##\mathbb{R}## that takes a smooth function ##g:\mathbb{R}\to\mathbb{R}## to ##\frac{\partial}{\partial x^i}\left(g\circ f\circ \phi_{\alpha}^{-1}\right)=\frac{\partial g}{\partial t}\frac{\partial (f\circ\phi_{\alpha}^{-1})}{\partial x^i},## (where ##t## is the coordinate on ##\mathbb{R}##), that is, ##f_*\left(\frac{\partial}{\partial x^i}\right)=\frac{\partial (f\circ\phi_{\alpha}^{-1})}{\partial x^i}\frac{\partial}{\partial t}##, which is identified with ##\frac{\partial (f\circ\phi_{\alpha}^{-1})}{\partial x^i}## through ##T_{f(p)}\mathbb{R}=\mathbb{R}##.

This gives us the equality ##\frac{\partial f}{\partial x^i}=\frac{\partial (f\circ\phi_{\alpha}^{-1})}{\partial x^i}## that we wanted.

I've mostly avoided notating where the derivatives are being evaluated as it should be clear from context.

This explanation also assumes that you used the 'derivation' definition of tangent space, which I think is the most natural one for this explanation.

mathwonk
Homework Helper
2020 Award
That notation made my eyes spin around, and I am a professional. This may not add anything to what others have said, but I was motivated to write it anyway, because obscuring simple ideas with notation is a pet peeve of mine.

Just try to keep the ideas in mind and remember that whatever the notation is, it must reduce down to some way of expressing the basic idea.

The basic idea here is that the derivative of a real-valued function defined on the real line, evaluated at a point (usually zero), is a number. Conversely, any derivative of a real-valued function f: M → R defined in a neighborhood of p on a manifold, and which is a number, must be obtained by first composing your function with another function from the real line to the manifold, c: R → M, called a curve, with c(0) = p, and then taking the derivative of the composition (f∘c): R → R, evaluated at zero.

Moreover, given f, the value of the derivative we get does not depend on the full choice of the curve c, but only on the velocity vector of that curve at zero (by the chain rule).

Thus, given a function f: M → R and a point p on M, if c: R → M is a curve with c(0) = p and with velocity vector v = c’(0), we call the value (f∘c)’(0) of the derivative at zero of the composite function f∘c “the directional derivative of f, at p, in the direction v”.
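A quick numerical sanity check of this, as a sketch only: the manifold (upper unit hemisphere), the chart, the function f and both curves below are made-up examples. The two curves pass through the same point with the same velocity but bend differently, and the derivative of f along them at zero agrees up to finite-difference error.

```python
import math

def phi_inv(x):
    """Inverse chart R^2 -> M (upper unit hemisphere), a hypothetical choice."""
    x1, x2 = x
    return (x1, x2, math.sqrt(1.0 - x1**2 - x2**2))

def f(P):
    """An arbitrary smooth function on M."""
    X, Y, Z = P
    return X * Z + Y

X0 = (0.3, 0.2)    # coordinates of the point p
V = (1.0, -0.5)    # common velocity of both curves, in coordinates

def c1(t):
    """Straight line in coordinates through X0 with velocity V, lifted to M."""
    return phi_inv((X0[0] + t * V[0], X0[1] + t * V[1]))

def c2(t):
    """Same point and velocity at t = 0, but with an extra quadratic bend."""
    return phi_inv((X0[0] + t * V[0] + 0.7 * t**2,
                    X0[1] + t * V[1] + 2.0 * t**2))

def derivative_at_zero(g, h=1e-5):
    """Central-difference estimate of g'(0)."""
    return (g(h) - g(-h)) / (2.0 * h)

d1 = derivative_at_zero(lambda t: f(c1(t)))
d2 = derivative_at_zero(lambda t: f(c2(t)))

# Different curves, same velocity at zero, same directional derivative.
assert math.isclose(d1, d2, abs_tol=1e-7)
```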

Every derivative of f at p, which is a number, must be the directional derivative of f in some direction. So if the notation ∂f/∂xj (p) defines some derivative which is a number, that number must be the directional derivative of f, at p, in the direction of some velocity vector to some curve passing through p. You just have to figure out what curve it is.

Now x in your book's discussion denotes a “coordinate function”, i.e. a function from some neighborhood of p in M to R^n. Hence its inverse is a map from an open subset of R^n to a neighborhood of p in M. Thus one can get n distinguished curves passing through p, by restricting the inverse of the coordinate map to each of the n coordinate axes.

Thus the only conceivable meaning of ∂f/∂xj (p) is to take the inverse of the coordinate map defining the coordinate system x, restrict it to the jth coordinate axis (more precisely, to the line through the coordinates of p parallel to that axis, which is the axis itself when p has coordinates 0), call this restriction c, and then compute the derivative (f∘c)’(0).

And since partial derivatives are just derivatives of restrictions to the various axes, if phi is the map defining the coordinate system x, this is just ∂(f∘phi^-1)/∂xj evaluated at phi(p), which is 0 when the chart is centered at p.
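The recipe above can be sketched in Python as follows. The hemisphere, the chart and f are assumptions invented for the example, and the helper name `coordinate_curve` is hypothetical. The sketch builds the jth coordinate curve through p, differentiates f along it numerically, and compares with the hand-computed partial derivatives of f∘phi^-1.

```python
import math

def phi(P):
    """Chart M -> R^2 on the upper unit hemisphere (a hypothetical choice)."""
    return (P[0], P[1])

def phi_inv(x):
    """Inverse chart R^2 -> M."""
    x1, x2 = x
    return (x1, x2, math.sqrt(1.0 - x1**2 - x2**2))

def f(P):
    """An arbitrary smooth function on M."""
    X, Y, Z = P
    return X * Z + Y

def coordinate_curve(p, j):
    """Curve c with c(0) = p: phi^{-1} restricted to the line through phi(p) along axis j."""
    x0 = phi(p)
    def c(s):
        x = list(x0)
        x[j] += s
        return phi_inv(tuple(x))
    return c

def derivative_at_zero(g, h=1e-5):
    """Central-difference estimate of g'(0)."""
    return (g(h) - g(-h)) / (2.0 * h)

p = phi_inv((0.3, 0.2))

# (f o c_j)'(0) for j = 0, 1: the meaning of df/dx^j at p
partials = [derivative_at_zero(lambda s, c=coordinate_curve(p, j): f(c(s)))
            for j in range(2)]

# Hand-computed partials of f o phi^{-1}(x1, x2) = x1 * sqrt(1 - x1^2 - x2^2) + x2
x1, x2 = phi(p)
r = math.sqrt(1.0 - x1**2 - x2**2)
expected = [r - x1**2 / r, 1.0 - x1 * x2 / r]

for got, want in zip(partials, expected):
    assert math.isclose(got, want, abs_tol=1e-7)
```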

Remember the ideas are always simpler than the notation, but no matter how awful the notation is, it still has to mean what the ideas mean; there is no other possibility.
