Diffeomorphisms & the Lie derivative

Frank Castle · Jan 23, 2017

I've been studying a bit of differential geometry in order to try and gain a deeper understanding of the mathematics of general relativity (GR). As you may guess from this, I am approaching this subject from a physicist's perspective so I apologise in advance for any lack of rigour.

As I understand it, given a base manifold ##M##, a vector field ##X:M\rightarrow TM## is a section of the tangent bundle of ##M##. Furthermore, a given vector field ##X## generates a local one-parameter group of diffeomorphisms ##\sigma_{t}(p):=\sigma(t,p)## whose orbits ##t\mapsto\sigma_{t}(p)## are the integral curves of ##X##, i.e. $$\frac{d}{dt}\sigma^{\mu}_{t}(x)=X^{\mu}(\sigma_{t}(x))$$ with the initial condition $$\sigma^{\mu}_{0}(x)=x^{\mu}$$ where ##\sigma^{\mu}_{t}(x)## is the coordinate representation of the flow ##\sigma :\mathbb{R}\times M\rightarrow M## and, similarly, ##X^{\mu}(\sigma_{t}(x))## are the coordinate components of ##X## at the point ##\sigma_{t}(x)##.

My first question comes from reading Sean Carroll's GR notes. When introducing the Lie derivative, he states that "a single discrete diffeomorphism is insufficient; we require a one-parameter family of diffeomorphisms...". Does this mean that, in order to compare vectors at different points on a manifold, we need a mapping that continuously connects the two points, such that one can meaningfully take a limit? By this I mean: if we have some starting point ##p\in M## and some end point ##q\in M##, then as we vary the parameter ##t##, ##\sigma_{t}## maps ##p## to a continuous set of points between ##p## and ##q##, and for some value of ##t## it maps ##p## to ##q## itself. We thus have a parametrised curve connecting ##p## to ##q##, along which we can "drag" vectors from one (base) point to another.

Given this initial formalism, my second question pertains to the pullback and pushforward defined by the diffeomorphism ##\sigma_{t}## and calculating the Lie derivative of a vector:

Choosing a particular local coordinate chart, under the action of ##\sigma_{t}## (for infinitesimal ##t##), a point ##p## with coordinates ##x^{\mu}## is mapped to $$x'^{\mu}:=\sigma^{\mu}_{t}(x)=\sigma^{\mu}(t,x)=x^{\mu}+tX^{\mu}(x)+\mathcal{O}(t^{2})$$ Given a vector field ##Y##, suppose we wish to evaluate its change as we move from the point ##p## to a nearby point ##q##, whose coordinates are ##x'^{\mu}=\sigma^{\mu}_{t}(x)##. To do this we must use the pullback map ##\left(\sigma_{-t}\right)_{\ast}:T_{\sigma_{t}(x)}M\rightarrow T_{x}M## to map the vector ##Y\big\vert_{\sigma_{t}(x)}\in T_{\sigma_{t}(x)}M## to ##T_{x}M## and then take the difference between this and the vector ##Y\big\vert_{x}\in T_{x}M##. Let ##e_{\mu}\big\vert_{x}## and ##e_{\mu}\big\vert_{x'}## denote the coordinate bases for ##T_{x}M## and ##T_{\sigma_{t}(x)}M##, respectively. Then, $$Y_{x'}:=Y\big\vert_{\sigma_{t}(x)}=Y^{\mu}(x')e_{\mu}\big\vert_{x'}=\left[Y^{\mu}(x)+tX^{\nu}(x)\frac{\partial Y^{\mu}(x)}{\partial x^{\nu}}+\mathcal{O}(t^{2})\right]e_{\mu}\big\vert_{x'}$$ Then, using the pullback map, $$\left(\sigma_{-t}\right)_{\ast}Y\big\vert_{x'}=\left[Y^{\mu}(x)+tX^{\nu}(x)\frac{\partial Y^{\mu}(x)}{\partial x^{\nu}}+\mathcal{O}(t^{2})\right]\left(\sigma_{-t}\right)_{\ast}e_{\mu}\big\vert_{x'}$$ Now this is the bit that I'm unsure about. How does one interpret ##\left(\sigma_{-t}\right)_{\ast}e_{\mu}\big\vert_{x'}##?
Does one simply use the chain rule, such that $$\left(\sigma_{-t}\right)_{\ast}e_{\mu}\big\vert_{x'}=\frac{\partial x^{\nu}}{\partial x'^{\mu}}e_{\nu}\big\vert_{x}=\left(\delta^{\nu}_{\,\mu}-t\frac{\partial X^{\nu}}{\partial x'^{\mu}}\right)e_{\nu}\big\vert_{x}=\left(\delta^{\nu}_{\,\mu}-t\frac{\partial X^{\nu}}{\partial x^{\lambda}}\frac{\partial x^{\lambda}}{\partial x'^{\mu}}\right)e_{\nu}\big\vert_{x}=\left(\delta^{\nu}_{\,\mu}-t\frac{\partial X^{\nu}}{\partial x^{\mu}}+\mathcal{O}(t^{2})\right)e_{\nu}\big\vert_{x}$$ If this is the case, then I think I get what's going on, since then we have $$\begin{aligned}\left(\sigma_{-t}\right)_{\ast}Y\big\vert_{x'}&=\left[Y^{\mu}(x)+tX^{\lambda}(x)\frac{\partial Y^{\mu}(x)}{\partial x^{\lambda}}+\mathcal{O}(t^{2})\right]\left[\delta^{\nu}_{\,\mu}-t\frac{\partial X^{\nu}}{\partial x^{\mu}}+\mathcal{O}(t^{2})\right]e_{\nu}\big\vert_{x}\\ &=\left[Y^{\mu}(x)+t\left(X^{\nu}(x)\frac{\partial Y^{\mu}(x)}{\partial x^{\nu}}-Y^{\nu}(x)\frac{\partial X^{\mu}(x)}{\partial x^{\nu}}\right)+\mathcal{O}(t^{2})\right]e_{\mu}\big\vert_{x}\end{aligned}$$ and the Lie derivative of ##Y## with respect to ##X## is then $$\mathcal{L}_{X}Y=\lim_{t\rightarrow 0}\frac{\left(\sigma_{-t}\right)_{\ast}Y\big\vert_{x'}-Y\big\vert_{x}}{t}=\left[X^{\nu}(x)\frac{\partial Y^{\mu}(x)}{\partial x^{\nu}}-Y^{\nu}(x)\frac{\partial X^{\mu}(x)}{\partial x^{\nu}}\right]e_{\mu}\big\vert_{x}=\left[X,Y\right]$$ However, I'm quite unsure as to whether I've understood this all correctly. Any help and/or insight would be much appreciated.
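For what it's worth, the final formula can be sanity-checked numerically. The sketch below is only an illustration under assumptions that are not part of the question: ##M=\mathbb{R}^{2}##, ##X=-y\,\partial_{x}+x\,\partial_{y}## (whose flow is a rotation, known in closed form), and an arbitrarily chosen test field ##Y=xy\,\partial_{x}+y\,\partial_{y}##. All function names are made up for the example. It compares the difference quotient ##\left[(\sigma_{-t})_{\ast}Y\vert_{x'}-Y\vert_{x}\right]/t## at small ##t## with the bracket ##[X,Y]## worked out by hand.

```python
import math

def X(p):
    """Assumed generator: X = -y d/dx + x d/dy (rotations about the origin)."""
    x, y = p
    return (-y, x)

def Y(p):
    """Arbitrary test field (an assumption): Y = x*y d/dx + y d/dy."""
    x, y = p
    return (x * y, y)

def flow(t, p):
    """Exact flow sigma_t of X: rotation of the plane by angle t."""
    x, y = p
    c, s = math.cos(t), math.sin(t)
    return (c * x - s * y, s * x + c * y)

def dragged_back(t, p):
    """(sigma_{-t})_* applied to Y at sigma_t(p), giving a vector at p.

    sigma_{-t} is a linear map here, so its Jacobian acts on vector
    components exactly as the map itself acts on points.
    """
    q = flow(t, p)
    return flow(-t, Y(q))

def lie_derivative_numeric(p, t=1e-6):
    """Forward difference quotient from the definition of L_X Y."""
    a = dragged_back(t, p)
    b = Y(p)
    return ((a[0] - b[0]) / t, (a[1] - b[1]) / t)

def bracket_exact(p):
    """[X, Y]^mu = X^nu d_nu Y^mu - Y^nu d_nu X^mu, computed by hand."""
    x, y = p
    return (x * x - y * y + y, x - x * y)

p = (0.7, -0.3)
print("difference quotient:", lie_derivative_numeric(p))
print("bracket [X, Y]:     ", bracket_exact(p))
```

The two printed pairs should agree up to an error of order ##t##, which is consistent with the ##\mathcal{O}(t^{2})## terms dropping out of the limit.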

TeethWhitener · Feb 8, 2017

Bump. I'm curious about this too. The "one-parameter family of diffeomorphisms" sailed over my head the first time I read it in Wald.

xaos · Feb 20, 2017

Second Bump.

This seems to be a good calculation. One thing I don't see is how the higher-order terms behave under the diffeomorphisms, since this gives only a first-order condition on the expansion. They would have to fall off to zero faster than the region collapses, but I don't see what information allows this. Perhaps my eyes are blurred...

Frank Castle · Mar 16, 2017

If anyone has any ideas on this it'd be much appreciated. I'd really like to understand the intuition behind "one-parameter family of diffeomorphisms".

fresh_42 · Mar 16, 2017

Frank Castle said:

If anyone has any ideas on this it'd be much appreciated. I'd really like to understand the intuition behind "one-parameter family of diffeomorphisms".

Your calculation looks good to me, although I haven't drawn any pictures to check exactly where each part takes place. As far as I understand it, the role of one-parameter (connected) subgroups is only to establish a correspondence to one-dimensional Lie subalgebras, as the basis for the correspondence subgroups ##\leftrightarrow## subalgebras. All I've found was an example (torus) where the author finishes:

... that it is rather difficult in general to tell the precise character of a one-parameter subgroup just from knowledge of its infinitesimal generator.

It's a book about Lie groups, so he works directly with the exponential map from the beginning, which makes things a little bit less abstract.

zwierz · Mar 17, 2017

Frank Castle said:

If anyone has any ideas on this it'd be much appreciated. I'd really like to understand the intuition behind "one-parameter family of diffeomorphisms".

Then try to define and calculate the Lie derivative of a 1-form ##\omega=\omega_i(x)dx^i,\quad \mathcal L_X\omega=?##
Another useful exercise: show that ##g_X^s\circ g_Y^t= g_Y^t\circ g_X^s\Longleftrightarrow [X,Y]=0##, where ##g_X^t## is the one-parameter group generated by the vector field ##X##.
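A quick way to get a feel for the second exercise is to try it on flows that can be written down by hand. The sketch below illustrates one instance in each direction and is not a proof. The fields ##X=\partial_{x}##, ##Y=x\,\partial_{y}## and ##Z=\partial_{y}## on ##\mathbb{R}^{2}## are assumptions chosen only because their flows are explicit, and the function names are made up. Here ##[X,Z]=0##, while ##[X,Y]=\partial_{y}\neq 0##.

```python
def g_X(s, p):
    """Flow of X = d/dx: translation along x."""
    x, y = p
    return (x + s, y)

def g_Y(t, p):
    """Flow of Y = x d/dy: a shear, since dy/dt = x with x constant."""
    x, y = p
    return (x, y + t * x)

def g_Z(t, p):
    """Flow of Z = d/dy: translation along y."""
    x, y = p
    return (x, y + t)

def commutation_gap(flow_a, flow_b, s, t, p):
    """Componentwise difference flow_a^s(flow_b^t(p)) - flow_b^t(flow_a^s(p))."""
    a = flow_a(s, flow_b(t, p))
    b = flow_b(t, flow_a(s, p))
    return (a[0] - b[0], a[1] - b[1])

p = (1.0, 2.0)
s, t = 0.5, 0.25
print("X and Z flows:", commutation_gap(g_X, g_Z, s, t, p))
print("X and Y flows:", commutation_gap(g_X, g_Y, s, t, p))
```

For the shear, the gap works out by hand to ##(0,-st)##, i.e. a displacement along ##[X,Y]=\partial_{y}## of size ##st##, which is the usual picture of the bracket measuring the failure of the flows to commute.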

Frank Castle · Mar 17, 2017

fresh_42 said:

Your calculation looks good to me, although I haven't drawn any pictures to check exactly where each part takes place. As far as I understand it, the role of one-parameter (connected) subgroups is only to establish a correspondence to one-dimensional Lie subalgebras, as the basis for the correspondence subgroups ##\leftrightarrow## subalgebras. All I've found was an example (torus) where the author finishes:

It's a book about Lie groups, so he works directly with the exponential map from the beginning, which makes things a little bit less abstract.

Is the idea behind the need for a one-parameter family of diffeomorphisms that a single diffeomorphism ##\phi## will map a given point ##p## to one (and only one) "new" point ##q=\phi(p)##? As such, this is not enough if we wish to connect two points by a curve along which we can subsequently parallel transport a vector between their respective tangent spaces. To do so we need a "family" of diffeomorphisms, one for each value of some parameter ##t##, since then for different values of ##t##, ##\phi_{t}## will map the initial point ##p## to different points between ##p## and ##q##. Thus, ##\phi_{t}(p)## describes a curve, parametrised by ##t##, with the constraints that ##\phi_{0}(p)=p## and ##\phi_{\tau}(p)=q## for some value ##t=\tau##. Would this be a correct understanding of the intuition behind it at all?

TeethWhitener · Mar 17, 2017

Frank Castle said:

Is the idea behind the need for a one-parameter family of diffeomorphisms that a single diffeomorphism ##\phi## will map a given point ##p## to one (and only one) "new" point ##q=\phi(p)##? As such, this is not enough if we wish to connect two points by a curve along which we can subsequently parallel transport a vector between their respective tangent spaces. To do so we need a "family" of diffeomorphisms, one for each value of some parameter ##t##, since then for different values of ##t##, ##\phi_{t}## will map the initial point ##p## to different points between ##p## and ##q##. Thus, ##\phi_{t}(p)## describes a curve, parametrised by ##t##, with the constraints that ##\phi_{0}(p)=p## and ##\phi_{\tau}(p)=q## for some value ##t=\tau##. Would this be a correct understanding of the intuition behind it at all?

This is my understanding of it. A single diffeomorphism ##\phi: M \rightarrow M## basically just relabels the points on ##M##. Evaluating a family of diffeomorphisms at a fixed point ##p## gives a map ##t \mapsto \phi_t(p)## from ##\mathbb{R} \rightarrow M##, so it's equivalent to a curve on ##M## parameterized by ##t \in \mathbb{R}##.

Edit: one other thing Wald talks about, which I didn't see in Carroll, is that ##\phi_{s+t} = \phi_s \circ \phi_t## and ##\phi_0(p) = p##, which means that the family of diffeomorphisms is actually an Abelian group.

Frank Castle · Mar 17, 2017

TeethWhitener said:

This is my understanding of it. A single diffeomorphism ##\phi: M \rightarrow M## basically just relabels the points on ##M##. Evaluating a family of diffeomorphisms at a fixed point ##p## gives a map ##t \mapsto \phi_t(p)## from ##\mathbb{R} \rightarrow M##, so it's equivalent to a curve on ##M## parameterized by ##t \in \mathbb{R}##.

Edit: one other thing Wald talks about, which I didn't see in Carroll, is that ##\phi_{s+t} = \phi_s \circ \phi_t## and ##\phi_0(p) = p##, which means that the family of diffeomorphisms is actually an Abelian group.

Ok cool, this makes intuitive sense to me.

fresh_42 · Mar 17, 2017

TeethWhitener said:

Edit: one other thing Wald talks about, which I didn't see in Carroll, is that ##\phi_{s+t} = \phi_s \circ \phi_t## and ##\phi_0(p) = p##, which means that the family of diffeomorphisms is actually an Abelian group.

... which are exactly the defining equations of flows of the vector field.
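As a concrete illustration of these flow equations, here is a minimal sketch that checks ##\phi_{s+t}=\phi_{s}\circ\phi_{t}##, ##\phi_{0}(p)=p## and ##\phi_{s}\circ\phi_{t}=\phi_{t}\circ\phi_{s}## for one explicit flow. The choice of flow (rotations of ##\mathbb{R}^{2}##, generated by ##-y\,\partial_{x}+x\,\partial_{y}##) and the names `phi` and `close` are assumptions made for the example, not anything taken from Wald or Carroll.

```python
import math

def phi(t, p):
    """Assumed example flow: rotation of the plane by angle t."""
    x, y = p
    c, s = math.cos(t), math.sin(t)
    return (c * x - s * y, s * x + c * y)

def close(a, b, tol=1e-12):
    """True if two points agree componentwise within tol."""
    return abs(a[0] - b[0]) < tol and abs(a[1] - b[1]) < tol

p = (0.7, -0.3)
s, t = 0.4, 1.1

# Identity element: phi_0 leaves every point where it is.
print("phi_0(p) == p:          ", close(phi(0.0, p), p))
# Composition law: flowing for t and then for s equals flowing for s + t.
print("phi_{s+t} == phi_s.phi_t:", close(phi(s + t, p), phi(s, phi(t, p))))
# Commutativity follows from the composition law, since s + t = t + s.
print("phi_s.phi_t == phi_t.phi_s:", close(phi(s, phi(t, p)), phi(t, phi(s, p))))
```

The inverse of ##\phi_{t}## is ##\phi_{-t}##, which is the map used to drag vectors back in the Lie derivative calculation earlier in the thread.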
