Tangent vectors as directional derivatives

"Don't panic!"

I have a few conceptual questions that I'd like to clear up if possible.

The first is about directional derivatives in general. If one has a function $f$ defined in some region and one wishes to know the rate of change of that function (i.e. its derivative) along a particular direction in that region, is the reason why one specifies a curve along the direction one wishes to consider because the curve specifies the direction (in a sense)? That is, if we choose a curve $\gamma$ (parametrised by $t$) along some direction in the region in which $f$ is defined, then we can evaluate the function along that curve by composing $f$ with $\gamma$, i.e. $f\circ\gamma$. Then for each value of $t$ we can evaluate $f$ at the point $\gamma(t)$, and as such the rate of change of the function along the direction defined by the curve $\gamma$ (at a particular point) is given by $$\frac{d}{dt}(f\circ\gamma)$$ Would this be correct?
When it comes to defining tangent vectors on manifolds, is the point that we define a curve $\gamma : (-\varepsilon, \varepsilon)\rightarrow M$ such that a particular direction along the manifold, at a given point $p\in M$, is specified? Then we can consider a function $f:M\rightarrow\mathbb{R}$ and evaluate this function along the curve $\gamma$ at the point $p\in M$. This involves composing the function with the curve $\gamma$ and noting that $\gamma (0)=p$. Then $$\frac{d}{dt}(f\circ\gamma)\bigg\vert_{p}$$ is the derivative of $f$ along a particular direction (specified by $\gamma$) on the manifold at a given point $p\in M$. We note that, in general, there will be more than one curve that has the same tangent at a given point, and so we identify a tangent vector at the point $p\in M$ as an equivalence class of curves passing through $p\in M$ and satisfying $(\phi\circ\gamma_{1})'(0)=(\phi\circ\gamma_{2})'(0)$ (where $\phi$ is some coordinate chart)?!
Adding to this, I was asked a question as to why we don't consider the parameter $t$ to be a one-dimensional coordinate system for the curve $\gamma$?! My response was that $t$ simply parametrises a curve in $M$: each value of $t\in(-\varepsilon, \varepsilon)\subset\mathbb{R}$ is mapped to a specific point on the manifold, i.e. $t\mapsto \gamma (t)\in M$. However, this doesn't specify the actual location of the point on the manifold and therefore $t$ is not a coordinate; one requires a mapping from $M$ to $\mathbb{R}^{n}$ in order to specify the actual location of the point in terms of an $n$-tuple of coordinate values. I'm unsure whether this is a valid argument though?!

My second question is, given a function $F: \mathbb{R}^{n}\rightarrow\mathbb{R}$ is it valid to consider a curve $\gamma :[0,1]\rightarrow\mathbb{R}^{n}$ defined such that $\gamma (0)=a\in\mathbb{R}^{n}$ and $\gamma (1)=x\in\mathbb{R}^{n}$, and express it as $$\gamma(t)=(x^{1}(t),\ldots,x^{n}(t))=\gamma (0)+t\left(\gamma (1)-\gamma (0)\right)=a+t(x-a)$$ where the $x^{i}:[0,1]\rightarrow\mathbb{R}$ are coordinate functions defined by $$x^{i}(t)=a^{i}+t\left(x^{i}-a^{i}\right)$$ Then one can write $$\frac{d}{dt}\left((F\circ\gamma)(t)\right)= \frac{d}{dt}\left(F(\gamma(t))\right)=\frac{d}{dt}\left(F((x^{1}(t),\ldots,x^{n}(t)))\right) \\ \qquad\qquad\qquad\qquad\qquad\qquad\;\;\;=\sum_{i=1}^{n}\frac{\partial F(a+t(x-a))}{\partial x^{i}}\frac{dx^{i}}{dt} \\ \qquad\qquad\qquad\qquad\qquad\qquad\;\;\;=\sum_{i=1}^{n}\frac{\partial F(a+t(x-a))}{\partial x^{i}}\left(x^{i}-a^{i}\right)$$ and as $x\in\mathbb{R}^{n}$ was chosen arbitrarily, this result holds $\forall x\in\mathbb{R}^{n}$. Would this be valid?

Fredrik

Staff Emeritus
Gold Member
The first is about directional derivatives in general. If one has a function $f$ defined in some region and one wishes to know the rate of change of that function (i.e. its derivative) along a particular direction in that region, is the reason why one specifies a curve along the direction one wishes to consider because the curve specifies the direction (in a sense)?
Yes. Also, the composition $f\circ\gamma$ is a real-valued function defined on a subset of $\mathbb R$, so the definition of "derivative" from calculus can be applied.

When it comes to defining tangent vectors on manifolds,...
No objections to this part, other than to the notation $|_p$. It should be $|_0$. I would also put that notation right next to the d/dt, but that's a matter of taste.
$$(f\circ\gamma)'(0)=\frac{d}{dt}\bigg|_0 f(\gamma(t)).$$
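A small worked example may make this concrete (the function and curves are illustrative choices, not taken from the posts above). Take $M=\mathbb R^2$, $f(x,y)=x^2y$, $p=(1,2)$ and $\gamma(t)=(1+t,\,2+3t)$, so that $\gamma(0)=p$:
$$\frac{d}{dt}\bigg|_0 f(\gamma(t))=\frac{d}{dt}\bigg|_0 (1+t)^2(2+3t)=2\cdot 1\cdot 2+1^2\cdot 3=7.$$
The curve $\tilde\gamma(t)=(1+t+t^2,\,2+3t-t^2)$ also passes through $p$ at $t=0$ with the same velocity $(1,3)$, and it gives the same number:
$$\frac{d}{dt}\bigg|_0 f(\tilde\gamma(t))=\nabla f(p)\cdot(1,3)=(4,1)\cdot(1,3)=7.$$
This is the sense in which curves with the same velocity at $p$ are grouped into one equivalence class.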
Adding to this, I was asked a question as to why we don't consider the parameter $t$ to be a one-dimensional coordinate system for the curve $\gamma$?!
You can view the range of the curve $\gamma:(-\varepsilon,\varepsilon)\to M$ as a 1-dimensional submanifold. If this map is injective, then its inverse can be thought of as a coordinate system.

...this doesn't specify the actual location of the point on the manifold and therefore $t$ is not a coordinate; one requires a mapping from $M$ to $\mathbb{R}^{n}$ in order to specify the actual location of the point in terms of an $n$-tuple of coordinate values.
A coordinate system on M is a map into $\mathbb R^n$. A coordinate system on a 1-dimensional submanifold of M is a map into $\mathbb R$.

...express it as $$\gamma(t)=(x^{1}(t),\ldots,x^{n}(t))=\gamma (0)+t\left(\gamma (1)-\gamma (0)\right)=a+t(x-a)$$
This would be an approximation, and it might be a bad one.

It's confusing that you're using the symbol x for two different things.

"Don't panic!"

A coordinate system on M is a map into $\mathbb R^n$. A coordinate system on a 1-dimensional submanifold of M is a map into $\mathbb R$.
But isn't the map meant to be coordinate independent? I had a friend question me on this last weekend as he thought the argument for defining a vector using this approach was a bit circular as he thought that it implicitly introduced a coordinate system from the start?! I wasn't able to give a particularly convincing argument, the best I was able to do was to give the explanation I put in my first post

This would be an approximation, and it might be a bad one.

It's confusing that you're using the symbol x for two different things.
I included this bit in reference to Hadamard's lemma for smooth functions, which I've been trying to prove, i.e. for any smooth function $F:\mathbb{R} ^{n} \rightarrow\mathbb{R}$ there exist smooth functions $H_{\mu}$ such that for all $x\in\mathbb{R}^{n}$ $$F(x) =F(a) +\sum_{\mu=1}^{n}(x^{\mu}-a^{\mu})H_{\mu}(x)$$ where $a\in\mathbb{R}^{n}$.

Fredrik

But isn't the map meant to be coordinate independent? I had a friend question me on this last weekend as he thought the argument for defining a vector using this approach was a bit circular as he thought that it implicitly introduced a coordinate system from the start?! I wasn't able to give a particularly convincing argument, the best I was able to do was to give the explanation I put in my first post
I don't quite understand what your friend is arguing against or what his argument is.

I included this bit in reference to Hadamard's lemma for smooth functions, which I've been trying to prove, i.e. for any smooth function $F:\mathbb{R} ^{n} \rightarrow\mathbb{R}$ there exist smooth functions $H_{\mu}$ such that for all $x\in\mathbb{R}^{n}$ $$F(x) =F(a) +\sum_{\mu=1}^{n}(x^{\mu}-a^{\mu})H_{\mu}(x)$$ where $a\in\mathbb{R}^{n}$.
There's a nice proof of that lemma in Isham's book. The same proof can be found in Wald. It involves a trick that's easy to understand but difficult to find.

"Don't panic!"

I don't quite understand what your friend is arguing against or what his argument is.
I think the issue came up when I was trying to introduce the concept of vectors on a manifold to him. I said that we need to develop a notion of a tangent vector when we have no notion of a specific origin or what a straight line is, or how to relate nearby points on a manifold. I said that we want to develop a coordinate independent definition (as vectors are coordinate independent quantities) and at this point started talking about curves on a manifold and this is where the issue came up, as he said how can we introduce vectors in this way when it appears that one is introducing a 1-dimensional coordinate system to do so?!

There's a nice proof of that lemma in Isham's book. The same proof can be found in Wald. It involves a trick that's easy to understand but difficult to find.
I was actually trying to do the problem in Wald's book where he asks the reader to prove it, and this was the approach that I took, but I was really unsure about it.

Fredrik

...he said how can we introduce vectors in this way when it appears that one is introducing a 1-dimensional coordinate system to do so?!
I would agree that there's a problem if we had used one specific curve $\gamma$ to define $T_pM$, and a different choice of $\gamma$ would have given us a different $T_pM$. But that's not the case. The definition of $T_pM$ goes roughly like this: We start with a set that contains lots of curves through p. We define an equivalence relation on that set. Then we define a vector space structure (an addition operation and a scalar multiplication operation) on the set of equivalence classes. The vector space we end up with is denoted by $T_pM$. Note in particular that an equivalence class of curves doesn't define a coordinate system on a 1-dimensional submanifold.

I was actually trying to do the problem in Wald's book where he asks the reader to prove it, and this was the approach that I took, but I was really unsure about it.
I actually don't remember what Wald said about this theorem. When I saw the proof in Isham a few years ago, it made me think that I had previously studied the same proof in Wald. But maybe I had just done the exercise you mentioned. Isham's proof can be read at Google Books. Link.

"Don't panic!"

Note in particular that an equivalence class of curves doesn't define a coordinate system on a 1-dimensional submanifold.
Is this because each curve in the equivalence class can, in principle, be parametrised by a different parameter and so the equivalence class just captures the properties that they each pass through the same point and have the same derivative at that point, without needing to specify any particular coordinate system on a 1-dimensional submanifold?

I actually don't remember what Wald said about this theorem. When I saw the proof in Isham a few years ago, it made me think that I had previously studied the same proof in Wald
Wald quotes the result in chapter 2 of his book and then asks the reader to prove it by induction at the end of this chapter. I get the 1-dimensional case as we can write $$F(x)=F(a)+(F(x)-F(a))=F(a)+F(a+t(x-a))\big\vert_{t=0}^{t=1}=F(a)+\int_{0}^{1}\frac{dF(a+t(x-a))}{dt}dt\\ =F(a)+\int_{0}^{1}\frac{\partial F(a+t(x-a))}{\partial u}\frac{du}{dt}dt =F(a)+\int_{0}^{1}\frac{\partial F(a+t(x-a))}{\partial u}(x-a)dt\\ =F(a)+(x-a)\int_{0}^{1}\frac{\partial F(a+t(x-a))}{\partial u}dt$$
where $u(t)=a+t(x-a)$.

I assume that Isham is doing something similar to this, although I don't quite follow his proof. Is the point that he is adding and subtracting the same functions (reducing in dimension by 1 each time) such that he can write an $n$-dimensional function as a sum of functions (decreasing in dimension)? Also, when he goes from $$F(a^{1},\ldots,a^{n})=F(0,\ldots,0)+\sum_{\mu =1}^{n}F(a^{1},\ldots,ta^{\mu},0,\ldots,0)\big\vert_{t=0}^{t=1}$$ to $$F(a^{1},\ldots,a^{n})=F(0,\ldots,0)+\sum_{\mu =1}^{n}\int_{0}^{1}\frac{d}{dt}F(a^{1},\ldots,ta^{\mu},0,\ldots,0)dt\\ =F(0,\ldots,0)+\sum_{\mu =1}^{n}\int_{0}^{1}\frac{\partial F(a^{1},\ldots,ta^{\mu},0,\ldots,0)}{\partial u^{\mu}}a^{\mu}dt$$ I assume he's using the fundamental theorem of calculus and the chain rule (with $u^{\mu}(t)=ta^{\mu}$)? My confusion arises from reading other texts where they just define a function $h(t)=F(a+t(x-a))$ and then note that $$\frac{dh}{dt}=\sum_{\mu =1}^{n}\frac{\partial F(a+t(x-a))}{\partial x^{\mu}}(x^{\mu}-a^{\mu})$$ and then note that $$h(1)-h(0)=F(x)-F(a)=\int_{0}^{1}\frac{dh}{dt}dt=\sum_{\mu =1}^{n}(x^{\mu}-a^{\mu})\int_{0}^{1}\frac{\partial F(a+t(x-a))}{\partial x^{\mu}}dt$$ but I don't see how this follows (unless they are abusing notation)?!

Fredrik

Is this because each curve in the equivalence class can, in principle, be parametrised by a different parameter...
No, it's because two different curves in the same equivalence class define coordinate systems on different submanifolds. The point p may be the only point in M that's in the range of both curves.

The possibility of reparametrization means that each curve defines infinitely many coordinate systems (on the same 1-dimensional submanifold), rather than just one.

I will think about that proof and post my comments later.

"Don't panic!"

it's because two different curves in the same equivalence class define coordinate systems on different submanifolds
Ah, so is the point that two different curves in the same equivalence class are mappings from different submanifolds to the manifold, with their particular parameter defining a 1-dimensional coordinate system for each of the respective submanifolds? The two curves may differ wildly in general, but are such that they both pass through a particular point p, and the values of their derivatives (evaluated at this point) are equal. Would this be correct?

The possibility of reparametrization means that each curve defines infinitely many coordinate systems (on the same 1-dimensional submanifold), rather than just one.
Is the point here that how a given curve is parametrised is somewhat arbitrary and as such we are free to choose how we parametrise it and hence making it coordinate independent? How does one show that the derivative of a curve at a point is independent of parametrisation?

I will think about that proof and post my comments later.
Thanks, really appreciate you taking the time to look at it :)

Fredrik

As far as I can tell, the rewrite
$$F(x)=F(0)+\sum_{k=1}^n F(x_1,\dots,x_{k-1},tx_k,0,\dots,0)\big|_{t=0}^{t=1}$$ is in no way better than the much simpler rewrite
$$F(x)=F(0)+F(x)-F(0)= F(0)+F(tx)\big|_{t=0}^{t=1}.$$ They both seem to get the job done, in the sense that they both give us ways to write
$$F(x)=F(0)+\sum_{k=1}^n x_k F_k(x).$$ They give us different sets of functions $F_k$, but for each k, both $F_k$ have the same value at 0. I guess the $F_k$ aren't unique. Does that make sense? Let's denote the members of the second set of such functions by $G_k$ instead of $F_k$. For all x, we have
$$F(0)+\sum_{k=1}^n x_k F_k(x)=F(0)+\sum_{k=1}^n x_k G_k(x)$$ and therefore
$$\sum_{k=1}^n x_k (F_k(x)-G_k(x))=0.$$ This implies in particular that for all $k$, we must have $F_k(x)=G_k(x)$ when $x$ is the $k$th standard basis vector. But it doesn't seem to imply that $F_k=G_k$. So it seems plausible that both of those rewrites get the job done.
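A concrete case (an illustrative choice of $F$, not from the book) suggests that the two sets of functions really can differ. For $F(x)=x_1x_2$ on $\mathbb R^2$ we have $F_{,1}(y)=y_2$ and $F_{,2}(y)=y_1$, so the first rewrite gives
$$F_1(x)=\int_0^1 F_{,1}(tx_1,0)\,dt=0,\qquad F_2(x)=\int_0^1 F_{,2}(x_1,tx_2)\,dt=x_1,$$
while the second gives
$$G_1(x)=\int_0^1 F_{,1}(tx)\,dt=\int_0^1 tx_2\,dt=\frac{x_2}{2},\qquad G_2(x)=\int_0^1 tx_1\,dt=\frac{x_1}{2}.$$
Both satisfy $x_1F_1+x_2F_2=x_1G_1+x_2G_2=x_1x_2$, and $F_k(0)=G_k(0)=0=F_{,k}(0)$, but $F_k\neq G_k$.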

Fredrik

Ah, so is the point that two different curves in the same equivalence class are mappings from different submanifolds to the manifold,
They map intervals in $\mathbb R$ (possibly the same one) to different 1-dimensional submanifolds of M. If they are injective, their inverses map those submanifolds back to those intervals. Those inverses can be thought of as coordinate systems on 1-dimensional submanifolds.

The two curves may differ wildly in general, but are such that they both pass through a particular point p, and the values of their derivatives (evaluated at this point) are equal. Would this be correct?
"Their derivatives" is the tangent vector that we're trying to define, so we shouldn't be talking about it when we're discussing the very early parts of the construction. But given any two curves C,D in the same equivalence class, and any coordinate system x such that x(p)=0, the maps $x\circ C$ and $x\circ D$ (curves in $\mathbb R^n$) have the same derivative at 0. (Now I'm just talking about the kind of derivative encountered in calculus). The fact that we're mentioning a coordinate system here is a problem if and only if the equivalence relation somehow depends on x. That's why one of the first steps in this construction is to prove that it doesn't.

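A sketch of why the equivalence relation doesn't depend on the coordinate system, assuming a second coordinate system $y$ defined near $p$ and curves with $C(0)=D(0)=p$ (the symbols $y$ and $J$ are introduced here only for illustration). Let $J$ be the Jacobian matrix of the transition map $y\circ x^{-1}$ at $x(p)$. By the chain rule in $\mathbb R^n$,
$$(y\circ C)'(0)=\big((y\circ x^{-1})\circ(x\circ C)\big)'(0)=J\,(x\circ C)'(0),\qquad (y\circ D)'(0)=J\,(x\circ D)'(0).$$
So $(x\circ C)'(0)=(x\circ D)'(0)$ implies $(y\circ C)'(0)=(y\circ D)'(0)$, and two curves that are equivalent with respect to $x$ are also equivalent with respect to $y$.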
Is the point here that how a given curve is parametrised is somewhat arbitrary and as such we are free to choose how we parametrise it and hence making it coordinate independent? How does one show that the derivative of a curve at a point is independent of parametrisation?
It's not. The derivative (of the composition of the coordinate system and the curve) can be interpreted as a velocity, and most reparametrizations will change it. A reparametrization of a curve $\gamma:(a,b)\to M$ is a map $s:(c,d)\to(a,b)$. The reparametrized curve is the map $\gamma\circ s$. If x is a coordinate system as above, then we have
$$(x\circ\gamma\circ s)'(0)=(x\circ\gamma)'(s(0))s'(0).$$ So if $s'(0)\neq 1$, $x\circ\gamma\circ s$ and $x\circ\gamma$ have different velocities.
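For instance (an illustrative choice of $s$, assuming $s(0)=0$): with $s(\tau)=2\tau$ the reparametrized curve runs through the same points twice as fast,
$$(x\circ\gamma\circ s)'(0)=(x\circ\gamma)'(0)\cdot 2.$$
Unless $(x\circ\gamma)'(0)=0$, the curves $\gamma$ and $\gamma\circ s$ therefore lie in different equivalence classes even though they trace out the same points near $p$. With $s(\tau)=\tau+\tau^2$ we have $s'(0)=1$, and the two curves lie in the same class.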

"Don't panic!"

They map intervals in $\mathbb R$ (possibly the same one) to different 1-dimensional submanifolds of M. If they are injective, their inverses map those submanifolds back to those intervals. Those inverses can be thought of as coordinate systems on 1-dimensional submanifolds.
But I thought the mapping was from a subset of $\mathbb{R}$ to the manifold, i.e. $\gamma : (-\varepsilon, \varepsilon)\rightarrow M$?

It's not. The derivative (of the composition of the coordinate system and the curve) can be interpreted as a velocity, and most reparametrizations will change it. A reparametrization of a curve $\gamma:(a,b)\to M$ is a map $s:(c,d)\to(a,b)$. The reparametrized curve is the map $\gamma\circ s$. If x is a coordinate system as above, then we have
$$(x\circ\gamma\circ s)'(0)=(x\circ\gamma)'(s(0))s'(0).$$ So if $s'(0)\neq 1$, the reparametrized curve doesn't have the same velocity vector.
So is the reason why we need not worry about the introduction of these 1-dimensional coordinate systems the fact that a tangent vector at a point is defined as an equivalence class of curves which will, in general, be parametrised differently (corresponding to them having different 1-dimensional coordinate systems), and will, in general, vary very differently at other points over the manifold. As such, as the definition of a tangent vector is not dependent on any one particular curve, we do not need to even consider how these curves are parametrised (i.e. we don't need to introduce such 1-dimensional coordinate systems into the definition) and thus the definition is completely coordinate independent?!

Also, I have to confess, I'm a little confused by your previous post (sorry). Is there any reference for the identity that Isham uses, or is it literally noticing that you can write out an alternating sum of functions (decreasing in dimension) such that $$F(a^{1},\ldots ,a^{n})=F(a^{1},\ldots ,a^{n})-F(a^{1},\ldots ,a^{n-1},0)+F(a^{1},\ldots ,a^{n-1},0)-F(a^{1},\ldots ,a^{n-2},0,0)\\ \qquad\qquad\qquad+\cdots +F(a^{1},0,\ldots ,0,0)-F(0,0,\ldots ,0,0)+F(0,0\ldots ,0,0)$$ It just seems a little unsatisfactory (non-rigorous) to write it out this way.
I'm slightly confused by his notation as well. I see that the above sum can be written as $$F(a^{1},\ldots ,a^{n})=F(a^{1},\ldots ,ta^{n})\bigg\vert_{t=0}^{t=1}+\cdots +F(ta^{1},0,\ldots ,0,0)\bigg\vert_{t=0}^{t=1}+F(0,0,\ldots,0,0)$$
but I'm not sure how one can write this as $$F(a^{1},\ldots ,a^{n})=F(0,0,\ldots,0,0)+\sum_{\mu =1}^{n}F(a^{1},\ldots ,ta^{\mu},0,\ldots, 0)\bigg\vert_{t=0}^{t=1}$$
Is this just compact notation for noting that as $\mu$ increases, the dimension of the function increases?!
Finally, assuming this notation I see how he can write it as $$F(a^{1},\ldots ,a^{n})=F(0,0,\ldots,0,0)+\sum_{\mu =1}^{n}F(a^{1},\ldots ,ta^{\mu},0,\ldots, 0)\bigg\vert_{t=0}^{t=1}\\=F(0,0,\ldots,0,0)+\sum_{\mu =1}^{n}\int_{0}^{1}\frac{d}{dt}F(a^{1},\ldots ,ta^{\mu},0,\ldots, 0)dt$$ but in the next step he re-writes this as $$F(a^{1},\ldots ,a^{n})=F(0,0,\ldots,0,0)+\sum_{\mu =1}^{n}F(a^{1},\ldots ,ta^{\mu},0,\ldots, 0)\bigg\vert_{t=0}^{t=1}\\ =F(0,0,\ldots,0,0)+\sum_{\mu =1}^{n}\int_{0}^{1}\frac{\partial}{\partial u^{\mu}}F(a^{1},\ldots ,ta^{\mu},0,\ldots, 0)a^{\mu}dt$$ Is this simply the chain rule upon defining a function $u^{\mu}(t)=ta^{\mu}$ such that $$\frac{d}{dt}F(a^{1},\ldots ,ta^{\mu},0,\ldots, 0)=\frac{d}{dt}F(u)\\=\frac{\partial}{\partial u^{\mu}}F(a^{1},\ldots ,u^{\mu}(t),0,\ldots, 0)\frac{du^{\mu}(t)}{dt}\\ =\frac{\partial}{\partial u^{\mu}}F(a^{1},\ldots ,u^{\mu}(t),0,\ldots, 0)a^{\mu}$$
or is there another reason? Can one assert from this that, as $a=(a^{1},\ldots ,a^{n})$ was chosen arbitrarily from the open ball that we are considering, this is true $\forall x=(x^{1},\ldots ,x^{n})$ in this open ball, such that $$F(x)=F(0,0,\ldots,0,0)+\sum_{\mu =1}^{n}\int_{0}^{1}\frac{\partial}{\partial u^{\mu}}F(x^{1},\ldots ,tx^{\mu},0,\ldots, 0)x^{\mu}dt$$

Fredrik

But I thought the mapping was from a subset of $\mathbb{R}$ to the manifold, i.e. $\gamma : (-\varepsilon, \varepsilon)\rightarrow M$?
Right, that's correct, and consistent with what I said, but not with the statement I was replying to. I said that the range of an injective curve is a 1-dimensional submanifold of M. That doesn't contradict the statement that the codomain of the curve is M. But in the text I was replying to, you said that the domain of the curve is a submanifold of M (which would make it a subset of M rather than $\mathbb R$).

So is the reason why we need not worry about the introduction of these 1-dimensional coordinate systems the fact that a tangent vector at a point is defined as an equivalence class of curves which will, in general, be parametrised differently (corresponding to them having different 1-dimensional coordinate systems), and will, in general, vary very differently at other points over the manifold. As such, as the definition of a tangent vector is not dependent on any one particular curve, we do not need to even consider how these curves are parametrised (i.e. we don't need to introduce such 1-dimensional coordinate systems into the definition) and thus the definition is completely coordinate independent?!
You seem to have a pretty good understanding of what's going on, but the statements about parametrization look strange to me. It only makes sense to talk about the parametrization of a curve in M when we use the word "curve" to refer to a set of points in M. But our "curves" are maps $\gamma:(a,b)\to M$. The range of such a map is a set of points in M. The curve is the parametrization of that set. So you should be talking about different curves, not different parametrizations.

Is there any reference for the identity that Isham uses, or is it literally noticing that you can write out an alternating sum of functions (decreasing in dimension) such that $$F(a^{1},\ldots ,a^{n})=F(a^{1},\ldots ,a^{n})-F(a^{1},\ldots ,a^{n-1},0)+F(a^{1},\ldots ,a^{n-1},0)-F(a^{1},\ldots ,a^{n-2},0,0)\\ \qquad\qquad\qquad+\cdots +F(a^{1},0,\ldots ,0,0)-F(0,0,\ldots ,0,0)+F(0,0\ldots ,0,0)$$ It just seems a little unsatisfactory (non-rigorous) to write it out this way.
I think he just expects people to be familiar enough with induction proofs to recognize this as an argument that can easily be turned into a rigorous proof using induction.

I'm slightly confused by his notation as well. I see that the above sum can be written as $$F(a^{1},\ldots ,a^{n})=F(a^{1},\ldots ,ta^{n})\bigg\vert_{t=0}^{t=1}+\cdots +F(ta^{1},0,\ldots ,0,0)\bigg\vert_{t=0}^{t=1}+F(0,0,\ldots,0,0)$$
but I'm not sure how one can write this as $$F(a^{1},\ldots ,a^{n})=F(0,0,\ldots,0,0)+\sum_{\mu =1}^{n}F(a^{1},\ldots ,ta^{\mu},0,\ldots, 0)\bigg\vert_{t=0}^{t=1}$$
Is this just compact notation for noting that as $\mu$ increases, the dimension of the function increases?!
I'm not sure what the concern is here, because you seem to understand exactly what's going on. Is the issue that a notation like $F(a^1,\dots,a^{\mu-1},ta^\mu,0\dots,0)$ appears to suggest that $\mu\geq 3$? That's not how it's supposed to be interpreted.

Finally, assuming this notation I see how he can write it as $$F(a^{1},\ldots ,a^{n})=F(0,0,\ldots,0,0)+\sum_{\mu =1}^{n}F(a^{1},\ldots ,ta^{\mu},0,\ldots, 0)\bigg\vert_{t=0}^{t=1}\\=F(0,0,\ldots,0,0)+\sum_{\mu =1}^{n}\int_{0}^{1}\frac{d}{dt}F(a^{1},\ldots ,ta^{\mu},0,\ldots, 0)dt$$ but in the next step he re-writes this as $$F(a^{1},\ldots ,a^{n})=F(0,0,\ldots,0,0)+\sum_{\mu =1}^{n}F(a^{1},\ldots ,ta^{\mu},0,\ldots, 0)\bigg\vert_{t=0}^{t=1}\\ =F(0,0,\ldots,0,0)+\sum_{\mu =1}^{n}\int_{0}^{1}\frac{\partial}{\partial u^{\mu}}F(a^{1},\ldots ,ta^{\mu},0,\ldots, 0)a^{\mu}dt$$ Is this simply the chain rule
Yes.

Can one assert from this, that as $a=(a^{1},\ldots ,a^{n})$ was chosen arbitrarily from the open ball that we are considering, that this is true $\forall x=(x^{1},\ldots ,x^{n})$ in this open ball, such that $$F(x)=F(0,0,\ldots,0,0)+\sum_{\mu =1}^{n}\int_{0}^{1}\frac{\partial}{\partial u^{\mu}}F(x^{1},\ldots ,tx^{\mu},0,\ldots, 0)x^{\mu}dt$$
Yes.

Fredrik

As I said in post #10 (the one you found confusing), I don't see a reason to prefer Isham's approach over the other one you posted. I see nothing wrong with this (my version of the alternative approach):

Let B be an open ball in $\mathbb R^n$ around 0. Let F be a real-valued function on a subset of $\mathbb R^n$ that contains B. If F is smooth on B, then for all x in B, we have
\begin{align}
&F(x)=F(0)+F(x)-F(0)=F(0)+F(tx)\big|_{t=0}^{t=1} =F(0)+\int_0^1\frac{d}{dt}F(tx)dt \\
& =F(0)+\int_0^1\left(\sum_{k=1}^n F_{,k}(tx)x_k\right)dt =F(0)+\sum_{k=1}^n x_k \int_0^1 F_{,k}(tx)dt.
\end{align}
For each k, define $F_k$ by $F_k(x)=\int_0^1 F_{,k}(tx) dt$ for all x in B. Now we can write the above as
$$F(x)=F(0)+\sum_{k=1}^n x_k F_k(x).$$
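As a quick check in one dimension (an illustrative choice of $F$): for $F(x)=e^x$ on $\mathbb R$,
$$F_1(x)=\int_0^1 e^{tx}\,dt=\begin{cases}\dfrac{e^x-1}{x}, & x\neq 0,\\[4pt] 1, & x=0,\end{cases}$$
so that $F(0)+xF_1(x)=1+(e^x-1)=e^x$ and $F_1(0)=1=F'(0)$.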
Now suppose that $f:M\to\mathbb R$ is smooth, that $p\in M$, and that $x:U\to\mathbb R^n$ is a coordinate system such that $x(p)=0$. Define $F=f\circ x^{-1}:x(U)\to\mathbb R$. For each k, define $f_k=F_k\circ x$. Let $B$ be an open ball around 0 that's also a subset of $x(U)$. For all $q\in x^{-1}(B)$, we have
\begin{align}
&f(q)=(f\circ x^{-1}\circ x)(q)=F(x(q)) =F(0)+\sum_{k=1}^n x^k(q)F_k(x(q))\\
&=f(p) +\sum_{k=1}^n x^k(q)f_k(q).
\end{align} We can allow ourselves to write
$$f=f(p)+\sum_{k=1}^n x^kf_k.$$ (The first term should be interpreted as the number f(p) times the constant function 1, i.e. the constant function with value f(p).) This equality is strictly speaking not true, but it doesn't need to be. What matters is that there's an open neighborhood of $p$ such that the restriction of the left-hand side to that neighborhood is equal to the restriction of the right-hand side to that neighborhood. That's all we need for the final calculation below to be valid.

Before we do it, note that
\begin{align}
&f_k(p)=F_k(x(p))= \int_0^1 F_{,k}(tx(p))dt =\int_0^1 F_{,k}(0)dt =F_{,k}(0)\\
&=(f\circ x^{-1})_{,k}(x(p)) =\left(\frac{\partial}{\partial x^k}\right)_p f.
\end{align}
Now let's wrap things up. For all $v\in T_pM$, we have
\begin{align}
v(f)=v\left(f(p)+\sum_{k=1}^n x^kf_k\right) =\sum_{k=1}^n \big(v(x^k)f_k(p)+\underbrace{x^k(p)}_{=0} v(f_k)\big) =\sum_{k=1}^n v(x^k)\left(\frac{\partial}{\partial x^k}\right)_p f.
\end{align} Since f is an arbitrary smooth function, this implies that
$$v=\sum_{k=1}^n v(x^k)\left(\frac{\partial}{\partial x^k}\right)_p.$$
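The calculation uses $v(f(p))=0$, i.e. that $v$ kills constants. A sketch of that step, assuming $v$ is linear and satisfies the Leibniz rule $v(gh)=g(p)v(h)+h(p)v(g)$ (the same rule applied to the products $x^kf_k$ above), and writing $1$ for the constant function with value 1:
$$v(1)=v(1\cdot 1)=1\cdot v(1)+1\cdot v(1)=2v(1)\quad\Rightarrow\quad v(1)=0.$$
Hence $v(c)=c\,v(1)=0$ for every constant $c$, and in particular for $c=f(p)$.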

"Don't panic!"

Right, that's correct, and consistent with what I said, but not with the statement I was replying to. I said that the range of an injective curve is a 1-dimensional submanifold of M. That doesn't contradict the statement that the codomain of the curve is M. But in the text I was replying to, you said that the domain of the curve is a submanifold of M (which would make it a subset of M rather than $\mathbb R$).
I don't quite see how the range is 1-dimensional, the curve maps to points on $M$ which would have the same dimension as the manifold wouldn't they? (sorry if I'm just being really stupid).

But our "curves" are maps $\gamma:(a,b)\to M$. The range of such a map is a set of points in M. The curve is the parametrization of that set. So you should be talking about different curves, not different parametrizations.
This is what I was trying to explain to my friend, but when I said that each $t\in (a, b)$ maps to a point on the manifold I couldn't convince him that $t$ doesn't define a 1-dimensional coordinate system. I think he was thinking in terms of classical cases (in Euclidean space), but I tried to explain that the case is the same there as well, as one can always parameterise a curve in Euclidean space such that each value of $t$ corresponds to a set of coordinate values, but the parameter $t$ itself isn't considered as a coordinate.

$$F(0)+\int_0^1\left(\sum_{k=1}^n F_{,k}(tx)x_k\right)dt$$
Is this correct, $F_{, k} (tx)=\frac{\partial F(tx)} {\partial x^{k}}$? My original confusion came about because in Wald (and a couple of other texts that I've read) he denotes it as $H_{\mu} (x)$ and that $$H_{\mu} (x) =\int_{0}^{1} \frac{\partial F(tx)} {\partial x^{\mu}} dt$$

Fredrik

I don't quite see how the range is 1-dimensional, the curve maps to points on $M$ which would have the same dimension as the manifold wouldn't they? (sorry if I'm just being really stupid)
The dimension of a manifold is by definition the (vector space) dimension of the $\mathbb R^n$ that's the codomain of all the coordinate systems. If $\gamma:(a,b)\to M$ is a curve such that $\gamma^{-1}:\gamma(M)\to(a,b)$ can be considered a coordinate system on $\gamma(M)$, then the fact that $(a,b)$ is a subset of $\mathbb R$ makes $\gamma(M)$ 1-dimensional.

Edit: The notation $\gamma(M)$ doesn't make sense. I should have written $\gamma\big((a,b)\big)$.

Is this correct, $F_{, k} (tx)=\frac{\partial F(tx)} {\partial x^{k}}$?
It depends on what you mean by what you wrote on the right.

Interpretation 1: Compute the partial derivative of $F$ with respect to the $k$th variable slot. The result is a function. The notation represents the value of that function at $tx$.

Interpretation 2: Compute the partial derivative with respect to $x^k$ of the function of $x$ defined by the expression $F(tx)$. (To be more precise, compute the partial derivative of the function $x\mapsto F(tx)$ with respect to the $k$th variable slot). The result is a function. The notation represents the value of that function at $x$.

If you meant what I called "interpretation 1", then yes.

I'm inclined to interpret the notation $\frac{\partial F(tx)} {\partial x^{k}}$ according to interpretation 1 (because it's a partial derivative of F, right?), but I think the only interpretation of the notation $\frac{\partial}{\partial x^k} F(tx)$ that makes sense is interpretation 2. This would give us the horrendously ugly result
$$\frac{\partial}{\partial x^k} F(tx) =t\frac{\partial F(tx)} {\partial x^k}.$$ This sort of thing is why I don't use the $\partial/\partial x^k$ notation in calculus.
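A one-variable example (an illustrative choice of $F$) shows where the factor of $t$ comes from. For $F(y)=y^2$ we have $F_{,1}(y)=2y$, so interpretation 1 gives $F_{,1}(tx)=2tx$, while interpretation 2 differentiates the function $x\mapsto F(tx)=t^2x^2$ and gives
$$\frac{\partial}{\partial x}F(tx)=2t^2x=t\,F_{,1}(tx).$$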

"Don't panic!"

$\gamma^{-1}:\gamma(M)\to(a,b)$ can be considered a coordinate system on $\gamma(M)$, then the fact that $(a,b)$ is a subset of $\mathbb R$ makes $\gamma(M)$ 1-dimensional.
Isn't $\gamma (M)$ the image set of $\gamma$ though, and as they are points on $M$, which is locally homeomorphic to $\mathbb {R} ^{n}$, I would've thought it would be n-dimensional?! (sorry to go on a bit, I'm just getting more confused over the whole situation).
Also, would what I've put here be correct at all?

This is what I was trying to explain to my friend, but when I said that each $t\in (a, b)$ maps to a point on the manifold I couldn't convince him that $t$ doesn't define a 1-dimensional coordinate system. I think he was thinking in terms of classical cases (in Euclidean space), but I tried to explain that the case is the same there as well, as one can always parameterise a curve in Euclidean space such that each value of $t$ corresponds to a set of coordinate values, but the parameter $t$ itself isn't considered as a coordinate.
Interpretation 1: Compute the partial derivative of FF with respect to the kkth variable slot. The result is a function. The notation represents the value of that function at txtx.
Yes, I assumed by the notation they give that it's the $\mu$th variable slot, as $F : \mathbb{R} ^{n} \rightarrow \mathbb{R}$ and so the notation $\frac{\partial F} {\partial x^{\mu}}$ is, symbolically, the derivative of $F$ with respect to its $\mu$th "coordinate" (i.e. the derivative of $F$ with respect to $tx^{\mu}$)?

micromass

The dimension of a manifold is by definition the (vector space) dimension of the $\mathbb R^n$ that's the codomain of all the coordinate systems.
While correct, it misses that a manifold also has a very intrinsic dimension. You don't need to have coordinate systems to be able to determine the dimension. In fact, we have something like the topological dimension of a manifold. It is intrinsic to the manifold and agrees with the dimension of the codomain of the coordinate systems. So I'm inclined to take that as the definition of the dimension (although - agreed - most books take your definition, which skips over the subtle point that perhaps two coordinate systems might exist which give different dimensions; it's difficult to prove this can't occur).

Back to curves then. The image of a curve does not need to be one-dimensional (see space-filling curves). Even with smooth curves things can go wrong. The point is the difference between an immersed and an embedded submanifold. While an (injective) smooth curve always is an immersed submanifold, it doesn't need to be embedded (that is: the topology of the manifolds need not agree). Luckily, a corollary of the inverse function theorem says that we can always limit the domain of a curve so that it becomes embedded. So if you only care about arbitrarily small domains of curves (this is the germ approach which is popular in algebraic geometry), then you're ok.
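A standard example of the immersed-versus-embedded distinction (not taken from this thread, and sketched here only under the stated assumptions) is the irrational winding of the torus. Take $\alpha$ irrational and define
$$\gamma:\mathbb R\to S^1\times S^1,\qquad \gamma(t)=\big(e^{it},\,e^{i\alpha t}\big).$$
This curve is smooth and injective, and $\gamma'(t)\neq 0$ for every $t$, but its image is dense in the torus, so the subspace topology on $\gamma(\mathbb R)$ does not agree with the topology of $\mathbb R$. Restricting $\gamma$ to a short interval $(-\varepsilon,\varepsilon)$ removes the problem: the restricted curve is a homeomorphism onto its image.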

micromass

Isn't $\gamma (M)$ the image set of $\gamma$ though, and as they are points on $M$, which is locally homeomorphic to $\mathbb {R} ^{n}$, I would've thought it would be n-dimensional?! (sorry to go on a bit, I'm just getting more confused over the whole situation).
Also, would what I've put here be correct at all?

I think you're missing that "locally homeomorphic" is a property of an open set, while you're acting like it is a property of a point. The image $\gamma(M)$ is not an open set, so you can't use homeomorphisms.

"Don't panic!"

Would this be correct though?

This is what I was trying to explain to my friend, but when I said that each $t\in (a, b)$ maps to a point on the manifold I couldn't convince him that $t$ doesn't define a 1-dimensional coordinate system. I think he was thinking in terms of classical cases (in Euclidean space), but I tried to explain that the case is the same there as well, as one can always parameterise a curve in Euclidean space such that each value of $t$ corresponds to a set of coordinate values, but the parameter $t$ itself isn't considered as a coordinate.

I was trying to rationalise with him why the definition of a tangent vector using this approach is intrinsically coordinate independent? Is what I put correct, or is it more that as it is defined as an equivalence class of curves and therefore not dependent on any one particular curve it is independent of any coordinate system introduced when specifying the form of a particular curve?!
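For reference, the usual chain-rule argument behind this can be sketched as follows, assuming $\phi$ and $\psi$ are two charts around $p=\gamma(0)$ (the symbol $\psi$ is introduced here only for illustration). Changing chart gives
$$(\psi\circ\gamma)'(0)=D\big(\psi\circ\phi^{-1}\big)\Big\vert_{\phi(p)}\,(\phi\circ\gamma)'(0),$$
so if $(\phi\circ\gamma_{1})'(0)=(\phi\circ\gamma_{2})'(0)$ holds in one chart, it holds in every chart. Likewise, for a smooth $f:M\rightarrow\mathbb{R}$,
$$(f\circ\gamma)'(0)=D\big(f\circ\phi^{-1}\big)\Big\vert_{\phi(p)}\,(\phi\circ\gamma)'(0),$$
so the directional derivative depends only on the equivalence class of $\gamma$, and its left-hand side involves no chart at all.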

Fredrik

Staff Emeritus
Gold Member
Isn't $\gamma (M)$ the image set of $\gamma$ though, and as they are points on $M$, which is locally homeomorphic to $\mathbb {R} ^{n}$, I would've thought it would be n-dimensional?! (sorry to go on a bit, I'm just getting more confused over the whole situation).
I don't understand why you think the dimension of a submanifold has to be the same as the dimension of the manifold. Have you seen a definition of "dimension" that makes you think that this is the case? This would be very different from how things work with vector spaces. For example, a straight line through the origin in $\mathbb R^3$ is 1-dimensional, not 3-dimensional.

And oops, I see now that I wrote "$\gamma(M)$". That notation makes no sense. Maybe that has contributed to the confusion. The range of $\gamma:(a,b)\to M$ is of course $\gamma\big((a,b)\big)$. If $\gamma$ is a smooth injective curve, then $\gamma^{-1}:\gamma\big((a,b)\big)\to(a,b)$ is a homeomorphism. Since the range of this map is $(a,b)$, which is a subset of $\mathbb R$, the submanifold $\gamma\big((a,b)\big)$ is 1-dimensional.

Also, would what I've put here be correct at all?
$\gamma$ takes each $t\in (a,b)$ to a point $\gamma(t)$ in M. The inverse of $\gamma$ (assuming that this is an injective curve) is a coordinate system on a 1-dimensional submanifold, but it's not one of the coordinate systems associated with M itself. So you got that last part right at least.
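A concrete instance may help. The sketch below assumes $M=\mathbb R^3$ and picks a helix as the injective curve; the names `gamma` and `gamma_inv` are hypothetical and chosen just for this illustration. A point on the curve needs three numbers as a point of $M$, but only one number, $t$, as a point of the 1-dimensional submanifold.

```python
import math

def gamma(t):
    """Helix in R^3: a smooth injective curve (an illustrative choice)."""
    return (math.cos(t), math.sin(t), t)

def gamma_inv(p):
    """Inverse of gamma on its image: for the helix, t is the third component."""
    return p[2]

t = 0.75
p = gamma(t)

# As a point of M = R^3, p is described by three coordinates.
print(len(p))
# As a point of the curve's image, p is described by the single number t.
print(gamma_inv(p))
```

Note that `gamma_inv` is only meaningful on the image of `gamma`; it says nothing about points of $\mathbb R^3$ off the helix, which is why it is not a coordinate system on $M$ itself.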


Fredrik

Staff Emeritus
Gold Member
While correct, it misses that a manifold also has a very intrinsic dimension. You don't need to have coordinate systems to be able to determine the dimension. In fact, we have something like topological dimension of a manifold. It is intrinsic to the manifold and agrees with the codomain of the coordinate systems.
Cool. I have heard of the concept, but I have never studied it.

Back to curves then. The image of a curve does not need to be one-dimensional (see space-filling curves).
I was thinking that if we restrict the domain of a smooth curve sufficiently, it will be injective...and I see that your comment supports that idea.

micromass

I was thinking that if we restrict the domain of a smooth curve sufficiently, it will be injective...and I see that your comment supports that idea.
Well no, take the constant curve for example.
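To spell the counterexample out (a short sketch, with the second curve added here purely for illustration): the constant curve $\gamma(t)=p$ is smooth but not injective on any interval, and neither is
$$\gamma(t)=(t^{2},0)\quad\text{in }\mathbb R^{2},\qquad\text{since }\gamma(t)=\gamma(-t).$$
In both cases $\gamma'(0)=0$. If instead $\gamma'(t_{0})\neq 0$ in some chart, then at least one component of $\phi\circ\gamma$ is strictly monotone near $t_{0}$, so $\gamma$ is injective on a sufficiently small interval around $t_{0}$.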

WWGD

Gold Member
While correct, it misses that a manifold also has a very intrinsic dimension. You don't need to have coordinate systems to be able to determine the dimension. In fact, we have something like the topological dimension of a manifold. It is intrinsic to the manifold and agrees with the dimension of the codomain of the coordinate systems. So I'm inclined to take that as the definition of the dimension (although - agreed - most books take your definition, which skips over the subtle point that perhaps two coordinate systems might exist which give different dimensions; it's difficult to prove this can't occur).

Back to curves then. The image of a curve does not need to be one-dimensional (see space-filling curves). Even with smooth curves things can go wrong. The point is the difference between an immersed and an embedded submanifold. While an (injective) smooth curve always is an immersed submanifold, it doesn't need to be embedded (that is: the topology of the manifolds need not agree). Luckily, a corollary of the inverse function theorem says that we can always limit the domain of a curve so that it becomes embedded. So if you only care about arbitrarily small domains of curves (this is the germ approach which is popular in algebraic geometry), then you're ok.
What do you mean by intrinsic dimension? By the definition I know, a space is an $n$-manifold if every point has a neighborhood that is homeomorphic to $\mathbb R^n$.

"Don't panic!"

I don't understand why you think the dimension of a submanifold has to be the same as the dimension of the manifold. Have you seen a definition of "dimension" that makes you think that this is the case? This would be very different from how things work with vector spaces. For example, a straight line through the origin in $\mathbb R^3$ is 1-dimensional, not 3-dimensional.
Yes, sorry, I think I was just stressing out about it a bit and started conflating ideas. I think I sometimes get a bit lost in the abstraction, as I can easily see how a curve in $\mathbb{R} ^{3}$ can be described in terms of a single parameter (hence 1-dimensional).

And oops, I see now that I wrote "$\gamma(M)$". That notation makes no sense. Maybe that has contributed to the confusion. The range of $\gamma:(a,b)\to M$ is of course $\gamma\big((a,b)\big)$. If $\gamma$ is a smooth injective curve, then $\gamma^{-1}:\gamma\big((a,b)\big)\to(a,b)$ is a homeomorphism. Since the range of this map is $(a,b)$, which is a subset of $\mathbb R$, the submanifold $\gamma\big((a,b)\big)$ is 1-dimensional.
$\gamma$ takes each $t\in (a,b)$ to a point $\gamma(t)$ in M. The inverse of $\gamma$ (assuming that this is an injective curve) is a coordinate system on a 1-dimensional submanifold, but it's not one of the coordinate systems associated with M itself. So you got that last part right at least.
I think this all makes sense now, thanks for your patience!
