# Questions about the differential

1. Jul 11, 2015

### orion

In some books, the differential is defined by:

(a) $df(v) = v(f) : T_pM → \mathbb {R}$

while in other books, a more abstract definition is made:

(b) Given two manifolds $M,N$, $v \in T_pM$ and a map $F :M → N$ then $dF$ is defined

$dF(v)(f) = v(f\circ F) : T_pM → T_{F(p)}N$

where $f \in C^∞(N)$.

My questions:

(1) How are these two definitions equivalent? And if they are not equivalent how to reconcile them?

(2) I need help understanding (b). Why are there both $F$ and $f$? Also, in what space does $v(f \circ F)$ operate, since $F$ is a map from $M$ to $N$, $v$ belongs to $T_pM$, and $f \in C^∞(N)$?

(3) It would seem that definition (a) is more in line with the idea that a differential is a linear functional on $T_pM$ and hence a covector in a space dual to $T_pM$. How to understand (b) in this context?

Last edited: Jul 11, 2015
2. Jul 11, 2015

### WWGD

To possibly save you some work: in (a), you do start with a map $\mathbb R^n \rightarrow \mathbb R^m$, both of which are manifolds; these are the $M,N$ you use below.

In (b), $v$ is a derivation. Its image $dF(v)$ acts (as a directional derivative) in the tangent space $T_{F(p)}N$, based at $F(p) \in N$, and it acts on functions $f: N \rightarrow \mathbb R$.

3. Jul 11, 2015

### orion

I made a mistake in the mapping in (a). It is correct now.

So in (a) what is $F$ and $f$ as defined in (b)?

Last edited: Jul 11, 2015
4. Jul 11, 2015

### WWGD

But please do note that there is a map associated to, or giving rise to, the differential in (a). That will make it easier to compare with the description in (b). Please describe this map, or explain if there is no such map in the definition.

EDIT: I need to leave for a while now, will be back in a few hours.

5. Jul 11, 2015

### orion

Is this correct:

$f :M → \mathbb{R}$
$df: T_pM → T_{f(p)}\mathbb{R}$

but

$T_{f(p)} \mathbb{R} ↔ \mathbb{R}$?

What plays the role of $f$ from definition (b)? Would that be the identity map?

Honestly, I have no idea what I'm doing. I'm just trying to make the maps work out.

Edit: Where does $v$ come into play in all this?

Last edited: Jul 11, 2015
6. Jul 11, 2015

### Fredrik

Staff Emeritus
Everyone uses the df notation for the map defined in (a), but not everyone uses it for the map defined in (b). Your (b) defines a map associated with F called a "pushforward" (because it "pushes" tangent vectors from the point p to the point F(p)). It's usually denoted by something like $F_*|_p$ instead of $\mathrm dF|_p$. I wasn't even aware that some people are using the dF notation until a recent thread where someone asked about it.

I think I found a reason why some people might like the dF notation. (I still prefer $F_*$). $df|_p$ and $F_*|_p$ are both linear maps, so we can compute their matrix representations (https://www.physicsforums.com/threads/matrix-representations-of-linear-transformations.694922/) with respect to a pair of ordered bases (one for the domain and one for the codomain). If the manifolds we're dealing with are $\mathbb R^n$ (possibly with different values of n), and we choose the ordered bases associated with the identity maps on these spaces, then in both cases (i.e. $df$ and $F_*$), we end up with the Jacobian matrix of the function.

\begin{align*}
[df|_p]_j &=(df|_p)\left(\frac{\partial}{\partial I^j}\right)_p =\left(\frac{\partial}{\partial I^j}\right)_p f =(f\circ I^{-1})_{,j}(I(p)) =f_{,j}(p)\\
[F_*|_p]^i_j &=\left(F_*|_p\left(\frac{\partial}{\partial I^j}\right)_p\right)^i = \left(F_*|_p\left(\frac{\partial}{\partial I^j}\right)_p\right)(I^i) =\left(\frac{\partial}{\partial I^j}\right)_p (I^i\circ F) =\left(\frac{\partial}{\partial I^j}\right)_p F^i\\
& =(F^i\circ I^{-1})_{,j}(I(p)) =F^i{}_{,j}(p).
\end{align*}
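
The second calculation can also be checked numerically. The following is only a rough sketch under stated assumptions: tangent vectors at $p\in\mathbb R^n$ are modelled as directional-derivative functionals approximated by central differences, and the map `F`, the point `p` and all helper names are hypothetical choices made for the example.

```python
# Sketch: (F_* v)(f) = v(f o F), with tangent vectors approximated by
# central differences. The finite-difference model is an assumption of
# this example, not part of the definitions.

H = 1e-5  # finite-difference step


def tangent_vector(p, direction):
    """Return v: f -> d/dt f(p + t*direction) at t = 0 (approximately)."""
    def v(f):
        plus = [a + H * d for a, d in zip(p, direction)]
        minus = [a - H * d for a, d in zip(p, direction)]
        return (f(plus) - f(minus)) / (2 * H)
    return v


def pushforward(F, v):
    """Definition (b): the functional f -> v(f o F)."""
    return lambda f: v(lambda x: f(F(x)))


def coordinate(i):
    """The i-th component I^i of the identity map."""
    return lambda y: y[i]


def F(x):
    # Hypothetical smooth map R^2 -> R^3.
    return [x[0] * x[1], x[0] + x[1] ** 2, x[0] ** 3]


def jacobian_exact(x):
    # Partial derivatives of F computed by hand, for comparison.
    return [[x[1], x[0]], [1.0, 2 * x[1]], [3 * x[0] ** 2, 0.0]]


p = [1.5, -0.5]
basis = [[1.0, 0.0], [0.0, 1.0]]

# Entry [i][j] is (F_* (d/dI^j)_p)(I^i), as in the calculation above.
matrix = [[pushforward(F, tangent_vector(p, e))(coordinate(i)) for e in basis]
          for i in range(3)]
```

Up to finite-difference error, `matrix` should agree with `jacobian_exact(p)`. Note that `pushforward` never inspects `F` beyond composing with it, which is the point of definition (b).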

Last edited by a moderator: May 7, 2017
7. Jul 11, 2015

### orion

Is your $I$ the identity map? I'm confused by the step $(F \circ I^{-1})_{,j}(I(p))$ in both your equations. Not the derivative part, but the compositions. How should I understand that? If $M=\mathbb{R^n}$ and $N=\mathbb{R^m}$, then $I$ as you are using it lives where? In $\mathbb{R^n}$ or $\mathbb{R^m}$?

In the books that I've read, they say that they are the same because of the "natural identification" of $T_{f(p)}\mathbb{R}$ with $\mathbb{R}$

Thanks, Fredrik, for your reply. I think it will be immensely helpful. (I just have to understand it completely).

8. Jul 11, 2015

### Fredrik

Staff Emeritus
The equality $\left(\frac{\partial}{\partial I^j}\right)_p f =(f\circ I^{-1})_{,j}(I(p))$ is just the definition of the left-hand side. (I think I elaborated a little bit more on this in my first post in your other thread). Yes, $I$ is the identity map. I didn't bother to think about whether it's the identity map on $\mathbb R^n$ or on $\mathbb R^m$. I'll think about it now...

OK, in the first calculation, $I$ is the identity map on the manifold that's the domain of f.

In the second, if $F:\mathbb R^n\to\mathbb R^m$ then $I^i$ is the $i$th component of the identity map on $\mathbb R^m$ and $I^j$ is the $j$th component of the identity map on $\mathbb R^n$. It would probably have been less confusing if I had used a different symbol (e.g. J) for the identity map on $\mathbb R^n$.

To understand the second calculation, you need to know that if $v\in T_pM$ and $x:U\to\mathbb R^n$ is a coordinate system such that $p\in U$, we have $v=v(x^i)\left(\frac{\partial}{\partial x^i}\right)_p$. So we can write $v^i=v(x^i)$.

9. Jul 11, 2015

### orion

What I'm not understanding is what is the point of the step $(f \circ I^{-1})_{,j} (I(p))$?

For example, if I were writing that out, it would not occur to me to include that step.

Edit: Forget it. I understand it now.

10. Jul 11, 2015

### orion

No, wait. I'm still having a problem with that step. I don't understand why that step is necessary. I understand the notation from the other thread, but I just don't understand why that step is necessary.

11. Jul 12, 2015

### Fredrik

Staff Emeritus
$\left(\frac{\partial}{\partial I^j}\right)_p$ is defined as the map that takes an arbitrary smooth function f to $(f\circ I^{-1})_{,j}(I(p))$. So I'm just using the definition of the notation. Are you asking why it's defined that way? It's because partial derivatives in calculus are defined using the addition operation on $\mathbb R^n$, but in differential geometry, the domain of the function is a manifold, and most manifolds aren't equipped with an addition operation. So we define the partial derivatives at p of $f:M\to\mathbb R$ with respect to the coordinate system $x:U\to\mathbb R^n$ by
$$\left(\frac{\partial}{\partial x^i}\right)_p f=(f\circ x^{-1})_{,i}(x(p)).$$
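
A concrete instance may show why the composition with $x^{-1}$ matters. This is an illustrative example only; the chart and the function are assumptions chosen here. Take $U\subset\mathbb R^2$ to be the plane minus the closed negative horizontal axis, let $x=(r,\theta):U\to(0,\infty)\times(-\pi,\pi)$ be polar coordinates, so that $x^{-1}(r,\theta)=(r\cos\theta,r\sin\theta)$, and let $f(a,b)=a$. Then $(f\circ x^{-1})(r,\theta)=r\cos\theta$, and

\begin{align*}
\left(\frac{\partial}{\partial x^1}\right)_p f &=(f\circ x^{-1})_{,1}(x(p)) =\cos\theta(p),\\
\left(\frac{\partial}{\partial x^2}\right)_p f &=(f\circ x^{-1})_{,2}(x(p)) =-r(p)\sin\theta(p).
\end{align*}

With the identity map $I$ as the coordinate system, the same $f$ gives $\left(\frac{\partial}{\partial I^1}\right)_p f=1$ and $\left(\frac{\partial}{\partial I^2}\right)_p f=0$. The function is the same in both cases; only the coordinate system changed, and the step through $x^{-1}$ is what records that.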

12. Jul 12, 2015

### orion

What I'm asking is why can't we simply write:

$(\partial/\partial I^j)_pF^i = (\partial F^i/\partial I^j)_p = F^i{}_{,j}(p)$ ?

Edit: Ok. I accept it as the definition in light of your explanation above.
Edit again: From where did you learn that notation? I haven't seen it anywhere else.

Another question I'm struggling to understand in definition (b) above is what exactly is the role of $f$? Why is that in the definition? I mean, when $f=I$ it's clear, but what about in other cases? I found yet another definition defined in terms of curves ($γ$ and $F \circ γ$) which I understand but neither that definition nor definition (a) above has the additional $f$.

Last edited: Jul 12, 2015
13. Jul 12, 2015

### orion

Here is another proof of the equivalence of (a) and (b) using the identity map $I$:

$df^b(v)(I) = v(I \circ f) = v(f) = df^a(v)$

where $df^a$ and $df^b$ are definitions (a) and (b) respectively.
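
The "natural identification" of $T_{f(p)}\mathbb R$ with $\mathbb R$ can be made explicit here. This is a sketch in the thread's notation; the symbol $\iota$ is introduced only for this illustration. Every $w\in T_t\mathbb R$ satisfies $w=w(I)\left(\frac{d}{dI}\right)_t$, where $I$ is the identity map on $\mathbb R$, so $\iota(w)=w(I)$ defines a linear isomorphism $\iota:T_t\mathbb R\to\mathbb R$. Applied to $w=df^b(v)$ with $t=f(p)$:

\begin{align*}
\iota\big(df^b(v)\big) &=df^b(v)(I) =v(I\circ f) =v(f) =df^a(v),\\
df^b(v) &=v(f)\left(\frac{d}{dI}\right)_{f(p)}.
\end{align*}

So $df^a=\iota\circ df^b$: the two maps differ exactly by the identification, which is presumably what the books mean when they treat them as the same.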

14. Jul 12, 2015

### Fredrik

Staff Emeritus
I first learned it from Spivak, "A comprehensive introduction to differential geometry", vol. 1. Page 35 of the first edition uses the notation
$$\frac{\partial f}{\partial x^i}(p) =\frac{\partial f}{\partial x^i}\bigg|_p =D_i(f\circ x^{-1})(x(p)).$$ Page 39 says that the operator taking f to $\frac{\partial f}{\partial y^i}(p)$ is denoted by $\frac{\partial}{\partial y^i}\big|_p$.

Spivak doesn't use the comma notation for partial derivatives. I think I picked that up from some physics course.

I sometimes use $\big(\frac{\partial}{\partial x^i}\big)_p$ instead of $\frac{\partial}{\partial x^i}\big|_p$. This makes sense because the value at $p$ of a vector field $X$ is denoted by $X_p$, and the map $p\mapsto\frac{\partial}{\partial x^i}\big|_p$ is a vector field that in my opinion should be denoted by $\frac{\partial}{\partial x^i}$.

I see now that Lee is using a different notation, that in my opinion is clearly worse. (Formula (3.8) on page 60 in the second edition). For no good reason, he denotes the coordinate system by $\varphi$, and still denotes the associated partial derivative functional by $\frac{\partial}{\partial x^i}\big|_p$. So his notation doesn't reveal which coordinate system is involved. Why not denote the coordinate system by $x$, or the partial derivative functional by $\frac{\partial}{\partial\varphi^i}\big|_p$ so that the connection is clear?

I noticed something else when I was checking this. Lee used the $F_*$ notation and the "pushforward" terminology in the first edition, but replaced it with the $dF$ notation and the "differential" terminology in the second edition.

Let $M$ be a smooth manifold. Let $p\in M$. Denote the set of all smooth $f:M\to\mathbb R$ by $C^\infty(M)$. The tangent space of $M$ at $p$ is denoted by $T_pM$ and is defined as the vector space of all $v:C^\infty(M)\to\mathbb R$ that satisfy the following two conditions

(1) $v$ is linear.
(2) $v(fg)=v(f)g(p)+f(p)v(g)$ for all $f,g\in C^\infty(M)$.

(By the way, I think my main source for this definition and the basics about tangent vectors was Wald, "General Relativity". Much later I also read a similar but much more detailed presentation of tangent spaces in Isham, "Modern differential geometry for physicists". I think I've learned a lot of this by putting together pieces from different books).

Edit: I wrote $T_{f(p)}N$ in a few places in the section below. It should have been $T_{F(p)}N$. I have corrected it now.

Now suppose that you want to use a function $F:M\to N$ to map tangent vectors at p to tangent vectors at F(p). We define $F_*|_p:T_pM\to T_{F(p)}N$ by
$$(F_*|_p v)(f)=v(f\circ F)$$ for all $f\in C^\infty(N)$. The way to think here is that if $v\in T_pM$, then $F_*|_p v$ is supposed to be an element of $T_{F(p)}N$, and this is a vector space whose elements are maps from $C^\infty(N)$ into $\mathbb R$. So to specify which element of $T_{F(p)}N$ that $F_*|_p v$ is, we have to specify what real number $(F_*|_p v)(f)$ is for all $f\in C^\infty(N)$.
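
To confirm that $F_*|_p v$ really is an element of $T_{F(p)}N$, one can check conditions (1) and (2) above at the point $F(p)$. The following is a short verification, assuming $F$ is smooth so that $f\circ F\in C^\infty(M)$ whenever $f\in C^\infty(N)$. Linearity follows because $v$ is linear and $f\mapsto f\circ F$ is linear. For the Leibniz rule, with $f,g\in C^\infty(N)$:

\begin{align*}
(F_*|_p v)(fg) &=v\big((fg)\circ F\big) =v\big((f\circ F)(g\circ F)\big)\\
&=v(f\circ F)\,(g\circ F)(p)+(f\circ F)(p)\,v(g\circ F)\\
&=(F_*|_p v)(f)\,g(F(p))+f(F(p))\,(F_*|_p v)(g).
\end{align*}

The functions are evaluated at $F(p)$, not at $p$, which is why the result is a tangent vector at $F(p)$.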

Last edited: Jul 12, 2015
15. Jul 12, 2015

### Fredrik

Staff Emeritus
The calculation is correct, but I'm not sure what it proves. I guess it proves that the map $v\mapsto df^b(v)(I)$ is $df^a$. So you have found a simple relationship between $df^a$ and $df^b$, but I don't see why it is a reason to think of them as the same thing, or a reason to use the same notation for them.

16. Jul 12, 2015

### mathwonk

A manifold M can be studied by two natural auxiliary objects, namely maps R-->M of the real numbers into it, and maps M-->R of it into the real numbers, i.e. curves in it, or functions on it.

Secondly we can look at the infinitesimal properties of both of these objects, i.e. the behavior near a given point p of M. So we can study curves through p that have the same velocity vector, and functions defined near p that have the same partial derivatives.

Furthermore, these two approaches are dual to each other, i.e. they admit a natural pairing. Namely, given a curve g:R-->M sending 0 to p, and a function f:M-->R defined near p, if we compose we get a function (fog):R-->R whose derivative we can take at 0.

Then to capture just the infinitesimal nature of curves through p, we can identify two such curves g1,g2 if for every function f, the derivatives of (fog1) and (fog2) are equal at 0. Intuitively this means the two curves have the same velocity vector.

Thus we could use this equivalence relation as a means of defining tangent vectors to M at p, I.e. a tangent vector at p is defined as an equivalence class of curves through p that yield the same derivatives for all functions f defined near p.
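
A small numerical sketch of this pairing follows. It is only an illustration under assumptions: M is taken to be the plane, derivatives are approximated by central differences, and the curves `g1`, `g2`, `g3` and the function `f` are hypothetical examples. Testing a single `f` proves nothing about the equivalence class, which quantifies over all functions; the sketch only shows what is being compared.

```python
import math

H = 1e-5  # finite-difference step


def derivative_at_zero(h):
    """Central-difference approximation of h'(0)."""
    return (h(H) - h(-H)) / (2 * H)


def pair(f, g):
    """Natural pairing of a function f and a curve g: (f o g)'(0)."""
    return derivative_at_zero(lambda t: f(g(t)))


# Two different curves through the origin with the same velocity (1, 0).
def g1(t):
    return (t, t * t)


def g2(t):
    return (math.sin(t), 0.0)


# A curve through the origin with a different velocity, (2, 1).
def g3(t):
    return (2 * t, t)


def f(q):
    # Hypothetical smooth function on the plane.
    x, y = q
    return math.exp(x) * math.cos(y) + 3 * x * y


d1 = pair(f, g1)
d2 = pair(f, g2)
d3 = pair(f, g3)
```

Here `d1` and `d2` agree up to finite-difference error even though the curves differ away from t = 0, while `d3` differs, as expected for a curve with another velocity vector.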

If we define the tangent space Tp(M) this way, then we could define the dual space to be the formal algebraic dual of this space, namely all linear functionals defined on Tp(M). Then we would see that every function f defined near p gives us such a functional, by taking the derivatives mentioned above. Thus each function f determines a cotangent vector, called df, a linear mapping from Tp(M)-->R.

With this definition we also get a definition of the differential or push forward of a mapping of manifolds. Namely if F:M-->N is such a map, we just push curves forward, by composing with F, to define a map F_* on tangent spaces. I.e. a curve g:R-->M goes to the curve F_*(g) = (Fog):R-->N.

Then if N = R, and F = f is just a function, we have two definitions, namely df and push forward by f.

Let’s see how they compare. df is a linear map on Tp(M) that takes an equivalence class of curves g through p to their common derivative after composition with f, i.e. to (fog)’(0).

Push forward takes the curve g to the curve (fog). But it takes it to its equivalence class as a curve through f(p). So we want to see that the equivalence class of the curve (fog) is determined by the derivative (fog)’(0). But this follows from the chain rule, i.e. for any further function h, the derivative of the composition ho(fog) will be determined by the value of the derivatives of h and of (fog). I.e. two curves through f(p) having the same derivative at 0, will also have the same derivative after composition with any h.

One can also do things in the other direction, assuming as more basic the concept of functions defined near p in M. Thus we could define the cotangent space of M at p or T*p(M) to be the equivalence classes of all functions defined near p, where two functions are equivalent if their compositions with all curves through p have the same derivative.

Then we could define the tangent space Tp(M) as the formal algebraic dual of this, i.e. we could consider a tangent vector to be a linear functional on covectors. Then the natural pairings we have been discussing give us immediately a map from curves to tangent vectors. I.e. given any curve g we get after composition with any function f, a function (fog) whose derivative at zero gives us a number, which is the same for equivalent functions in T*p(M), so our curve g gives us a linear functional on T*p(M), hence a tangent vector.

Finally, the Leibniz rule holds for all these pairings and evaluations, so we are also getting from each curve a real valued mapping on the family of functions, before taking equivalence classes, that obeys the Leibniz rule, i.e. each curve also gives us a “derivation” on functions. So another point of view is to define a tangent vector, not as a linear functional on covectors, i.e. on differentials of functions, but as a derivation on the functions themselves. This is often done in differential geometry and also makes sense in algebraic geometry.

Or you could define both tangent and cotangent spaces separately, Tp(M) as equivalence classes of curves, and T*p(M) as equivalence classes of functions, and then prove that they are formally algebraically dual to one another.

But they are all the same. i.e. you have functions R-->M and functions M-->R. One type gives you tangent vectors and the other gives you cotangent vectors. And you can compose these getting functions R-->R whose derivatives you can take at 0, and this gives a natural pairing of the two concepts. It is up to you what concept you think is more basic, and which one you define to be dual to the other.

17. Jul 12, 2015

### orion

Suppose we have manifolds $M$ and $N$. Suppose $f \in C^\infty(N)$, i.e. $f: N \rightarrow \mathbb R$ is smooth, and $F: M \rightarrow N$.
Why is it a push forward of a vector and not a pull back of a curve since the domain of $f \circ F$ is in $M$?

My other question is that since $dF_p$ is a derivation on $f \circ F$, why is it $dF_p$ and not $df_p$? (I know you prefer the notation $F_*$ (push forward) but I've seen quite a few books define the differential this way.) I'm asking this because it seems like $f$ is the function of interest where $F$ is just the map between manifolds.

18. Jul 12, 2015

### orion

Mathwonk, thanks for your detailed reply. I will read it this evening when I'm home.

19. Jul 12, 2015

### mathwonk

You are welcome. Here is another twist on the same stuff. Basically we have seen that on a manifold we know what the smooth functions are near each point. Moreover, since some of these functions are coordinate functions, these functions tell us everything there is to know about the local structure of the manifold near p. In particular they can be used in various ways to describe the tangent space there. I.e. cotangent vectors are just infinitesimal bits of functions, and vectors are duals of those. Here is another way to define infinitesimal bits of functions, i.e. cotangent vectors:

Fix a point p on M and let Op(M) or just Op, or just O, be equivalence classes of smooth functions defined near p, where two functions are equivalent if they agree on some neighborhood of p. This is called the "local ring of smooth germs of functions at p". I.e. we are just interested in the behavior of M near p so we only look at functions defined near there and we set them equal if they are the same near p. Moreover we are only interested in first order behavior, or derivatives, so we don't care what their value is, so we just take ones that equal zero. I.e. let mp or just m, be the ideal of those functions in O which vanish at p. This is called the maximal ideal at p, since O is a ring and m is its (only) maximal ideal. (If you know about/care about ring theory.)

Now we are only interested in the first order i.e. derivative behavior of functions at p, so we want to mod out by higher order behavior. I.e. define a smaller ideal m^2 to be all functions f in m such that for all curves g through p, the derivative of fog at 0 is zero, i.e. functions "vanishing (at least) twice at p". Then m/m^2 is the space of functions vanishing at p, and equated if their difference vanishes twice at p, so this captures only the first order behavior of a function at p. This is another possible definition of the cotangent space T*p(M).

In fact I believe you can show using smooth partial Taylor expansions of smooth functions near p (Lang, Analysis I, page 291, ex.4), that in fact m^2 consists of the square of the ideal m in the sense of ring theory, i.e. a function vanishes at least twice at p if and only if it is (a sum of) products of functions vanishing at least once. This would justify my notation m^2.
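
A sketch of that argument, written in LaTeX and under these assumptions: we work in a coordinate system $x$ centred at p (so $x(p)=0$) on a convex neighborhood, and the functions $g_i$ are helper functions introduced here. For smooth f, the fundamental theorem of calculus applied to $t\mapsto f(tx)$ gives

\begin{align*}
f(x) &=f(0)+\sum_i x^i g_i(x), & g_i(x) &=\int_0^1 \frac{\partial f}{\partial x^i}(tx)\,dt, & g_i(0) &=\frac{\partial f}{\partial x^i}(0).
\end{align*}

If f vanishes at p and its derivative along every curve through p is zero, then $f(0)=0$ and every $g_i(0)=0$, so each term $x^i g_i$ is a product of two functions vanishing at p. Conversely, if a and b both vanish at p, the Leibniz rule gives $(ab\circ g)'(0)=0$ for every curve g through p. So the two descriptions of m^2 agree, at least for germs of smooth functions.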

This lets us define cotangent spaces in algebraic geometry by taking the ring of polynomials defined near a point, and giving exactly this same definition. Then tangent spaces are defined as the dual of this, so we get away without having access to smooth curves in algebraic geometry. Of course we could also use algebraic derivations of the local ring of polynomial functions defined near p.

If you grasp this general approach, you can read any type of geometry, algebraic, analytic, differential, arithmetic, p adic (? actually i don't know this theory, but what else do they have to work with?), formal, etc.... and you can still understand the definitions of tangent vectors etc...

Last edited: Jul 12, 2015
20. Jul 12, 2015

### Fredrik

Staff Emeritus
f isn't a curve, but you could ask why we don't say that F defines a pullback of smooth functions. The answer is that this terminology is perfectly fine and is used by some people. I think Wald uses it. If I remember correctly, he used the notation $F^*f=f\circ F$. This defines a map $F^*:C^\infty(N)\to C^\infty(M)$, which can be called a "pullback" of smooth functions.

We can also use the pushforward of tangent vectors to define a pullback of cotangent vectors. We define $F^*|_p:T_{F(p)}N^*\to T_pM^*$ by
$$(F^*|_p\omega)(v)=\omega(F_*|_p v)$$ for all $\omega\in T_{F(p)}N^*$ and all $v\in T_pM$.
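
As a quick check of how the two pullbacks fit together (a worked consequence of the definitions in this thread, with $g\in C^\infty(N)$ an arbitrary function chosen for the example), apply $F^*|_p$ to the differential $dg|_{F(p)}$ in the sense of definition (a):

\begin{align*}
\big(F^*|_p\,dg|_{F(p)}\big)(v) &=dg|_{F(p)}(F_*|_p v) =(F_*|_p v)(g) =v(g\circ F) =d(g\circ F)|_p(v).
\end{align*}

So $F^*|_p\,dg|_{F(p)}=d(F^*g)|_p$: pulling back the cotangent vector $dg$ gives the same result as taking the differential of the pulled-back function $F^*g=g\circ F$.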

This $dF|_p$ is a map from $T_p(M)$ into $T_{F(p)}N$, that's defined using the map F and the point p, but not any specific function f. f is a dummy variable in the definition. We define $dF|_p$ by specifying $dF|_p v$ for each $v\in T_pM$. And since $dF|_pv$ is a map from $C^\infty(N)$ into $\mathbb R$, the way to specify it is to specify the real number $(dF|_pv)(f)$ for each $f\in C^\infty(N)$.

Here's the definition of $dF|_p: T_pM\to T_{F(p)}N$ again: For each $v\in T_pM$, we define $dF|_pv$ by
$$(dF|_p v)(f)=v(f\circ F)$$ for all $f\in C^\infty(N)$.