Issue with the definition of a Lie derivative and its components (Carroll's GR)

haushofer · Aug 21, 2019

Dear all,

I'm having a small issue with the notion of Lie derivatives after rereading Carroll's notes (https://arxiv.org/abs/gr-qc/9712019), page 135 onward. The Lie derivative of a tensor T w.r.t. a vector field V is defined in eqn.(5.18) via a diffeomorphism ##\phi##. In this definition, both terms are "tensors at the point p", as he remarks after eqn.(5.17). My issue is with the term

[tex]\phi_{* t}\Bigl[ T(\phi_t (p))\Bigr] [/tex]

in eqn.(5.17). As I read this, you first evaluate the tensor T at the shifted point ##\phi_t (p)##, and after that you pull this back via ##\phi_{* t}## to the point p. After eqn.(5.21), however, I get confused. In this part, Carroll tries to show that in a particular coordinate system the Lie derivative becomes an ordinary partial derivative, so he can introduce Lie brackets. He takes as a vector field ##V=\partial_1##, and states:

"The magic of this coordinate system is that a diffeomorphism by t amounts to a coordinate transformation from ##x^{\mu} = (x^1, x^2, . . . , x^n)## to ##y^{\mu} = (x^1 + t, x^2, . . . , x^n)##."

All right, you go along the flow in the ##x^1##-direction.

My confusion is with eqn.(5.23): he evaluates the tensor components at ##y^{\mu} = (x^1 + t, x^2, . . . , x^n)##. But why? Aren't we supposed to evaluate the terms at the original point p, that is, with coordinates ##x^{\mu} = (x^1, x^2, . . . , x^n)##? The confusion also arises because I thought the point of Lie derivatives was that you compare tensor components at the very same point (and hence, evaluated at the same coordinate values!). But in eqn.(5.24), the Lie derivative becomes an ordinary partial derivative because in Carroll's magic coordinate system it reduces to

[tex]
\lim_{t \rightarrow 0 }\frac{T^{\mu \ldots}_{\nu \ldots}(x^1 + t, x^2, . . . , x^n) - T^{\mu \ldots}_{\nu \ldots}(x^1, x^2, . . . , x^n)}{t} = \partial_{x^1} T^{\mu \ldots}_{\nu \ldots}(x^1, x^2, . . . , x^n)
[/tex]

But here we are comparing the same tensor components in two different points! I understand that this is the whole idea of a partial derivative, but I'm confused in the context of Lie derivatives and Carroll's remark after eqn.(5.17).

So I guess my question really is: if a diffeomorphism brings us from a point with coordinates ##x^{\mu}## to a point with coordinates ##y^{\mu}##, what do the components of

[tex]\phi_{* t}\Bigl[ T(\phi_t (p))\Bigr] [/tex]

look like? I thought it would be ##T^{'\mu \ldots}_{'\nu \ldots}(x)##, but because of Carroll's discussion above I'm confused.

fresh_42 · Aug 21, 2019

Don't we simply have ##\phi_t(p)=\phi_t(x_1,x_2,\ldots,x_n)=(x_1+t,x_2,\ldots,x_n)=:(y_1,\ldots,y_n)\stackrel{(t \to 0)}{\longrightarrow}(5.24)## in this case? Then we apply ##T## as we formally went from ##\phi\, : \,M\longrightarrow N##, leading us to the tangent space ##T_{\phi(p)}N## which we pull back to ##T_pM##.

I guess I don't understand your concerns. What is a bit confusing is that we have only one potato here: ##N=M##.

Martin Scholtz · Aug 22, 2019

I am not sure what exactly you worry about but I'll try to answer.

Yes, you have to evaluate both terms in the definition of the Lie derivative at the same point. If ##P## is that point and ##\Phi_t## is the flow of the vector field ##V=V^\mu \partial_\mu##, let ##Q## be the point lying on the integral line of ##V## through ##P##, i.e.

$$ Q = \Phi_t(P).$$
At this point, the field ##V## has the coordinate expression

$$V(y)=V^\mu(y) \frac{\partial}{ \partial y^\mu}$$
where ##y^\mu = x^\mu +t V^\mu## are the coordinates of ##Q## up to first order in ##t##.

The mapping ##x\mapsto y## is the coordinate expression for the flow, and in order to calculate the pull-back you need to transform ##V(y)## to the coordinates ##x## with the usual coordinate transformation for vector fields, i.e. by the Jacobi matrix. You also need to expand ##V^\mu(y)## in ##t## to express ##y## as a function of ##x##.

haushofer · Aug 22, 2019

Ok, I agree, so how do you explain that eqn.(5.23)? Why evaluate it in the point with coordinates y and not x?

Martin Scholtz · Aug 22, 2019

Sorry, I wasn't very focused when I wrote the first post. So let's go from the beginning. We agree that the Lie derivative of tensor field ##T## in the direction of vector field V at point P is

$$L_V T = \lim_{t\to 0} \frac{1}{t}( \Phi_t^* T(\Phi_t(P)) - T(P)),$$ right? ##\Phi_t## is the flow of ##V## and ##\Phi_t^*## is the pull-back against the flow. So, tensor T must be evaluated at, in coordinates, y where

$$ y^\mu = x^\mu + t\,V^\mu(x) + O(t^2) $$

Now, if the coordinates ##x^\mu## are adapted to field V, we have

$$ V^\mu = \delta_1^\mu $$ and $$y^\mu = x^\mu + t\,\delta^\mu_1 + O(t^2) $$

For simplicity, suppose that T is a vector field ##T = T^\mu \partial_\mu##. Its components evaluated at y are
$$T^\mu(y) = T^\mu(x^1 + t, x^2, \dots, x^n).$$

Then
$$\Phi_t^* T(y)|_{x} = T^\mu(y) \frac{\partial x^\nu}{\partial y^\mu}\,\frac{\partial}{\partial x^\nu}=
T^\mu(x^1 + t, \dots, x^n) \partial_\mu|_{x},$$ where ##\partial x^\mu / \partial y^\nu = \delta ^\mu_\nu##.

Then the abstract definition of the Lie derivative immediately gives you simply the partial derivative. If you employ general coordinates, derivatives of the field V would appear both in the Jacobi matrix and the expansion of V(y).
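
As a sketch of what happens in general coordinates (assuming, as above, that ##T## is a vector field, and keeping only first order in ##t##), the Jacobi matrix and the expansion of the components at ##y## are

$$\frac{\partial x^\nu}{\partial y^\mu} = \delta^\nu_\mu - t\,\partial_\mu V^\nu(x) + O(t^2), \qquad T^\mu(y) = T^\mu(x) + t\,V^\rho(x)\,\partial_\rho T^\mu(x) + O(t^2),$$

so that

$$\Phi_t^* T(y)|_{x} = \Bigl[T^\nu + t\bigl(V^\rho \partial_\rho T^\nu - T^\mu \partial_\mu V^\nu\bigr)\Bigr]\partial_\nu|_{x} + O(t^2), \qquad (L_V T)^\nu = V^\rho \partial_\rho T^\nu - T^\mu \partial_\mu V^\nu .$$

In adapted coordinates, ##V^\mu = \delta^\mu_1##, the second term drops out and only ##\partial_1 T^\nu## remains.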

haushofer · Aug 22, 2019

Thanks for thinking with me everyone, much appreciated as always.

Martin Scholtz said:

Sorry, I wasn't very focused when I wrote the first post. So let's go from the beginning. We agree that the Lie derivative of tensor field ##T## in the direction of vector field V at point P is

$$L_V T = \lim_{t\to 0} \frac{1}{t}( \Phi_t^* T(\Phi_t(P)) - T(P)),$$ right? ##\Phi_t## is the flow of ##V## and ##\Phi_t^*## is the pull-back against the flow. So, tensor T must be evaluated at, in coordinates, y where

$$ y^\mu = x^\mu + t\,V^\mu(x) + O(t^2) $$

Yes, I agree with the definition of the Lie derivative and follow your derivation afterwards, but my issue is with your conclusion in the quote above:

"So, tensor T must be evaluated at the point with coordinates y".

So, in the first term of the Lie derivative ##\Phi_t^* T(\Phi_t(P))##, if I first evaluate the tensor at the shifted point with coordinates y,

##T(\Phi_t(P))##

and then pull it back to the point with coordinates x,

##\Phi_t^* \Bigl(T(\Phi_t(P))\Bigr)##

I get a tensor "at P" (as the remark after eqn.(5.17) states), even though the argument involves the (point ##\Phi_t(P)## with) coordinate y. So my confusion arises because of the "evaluate the tensor first in the shifted point, and then pull it back to the original point, so we get a tensor at that original point". I have to think this through more carefully, but let me ask a follow-up question which should clarify.

Imagine, as in Carroll's text, we're using a diffeomorphism ##\Phi_t## to shift the point P with coordinates ##x^{\mu}## to another point ##\Phi_t(P)## with coordinates ##y^{\mu}##. I want to understand how the terms in the Lie derivative of our tensor ##T## translate to "tensor component transformation notation". So, e.g.

##T(P) : T^{\mu\ldots}_{\nu\ldots} (x)##

##T(\Phi_t(P)) : T^{\mu\ldots}_{\nu\ldots} (y)##

##\Phi_t^* \Bigl(T(P)\Bigr) : \frac{\partial x^{\mu}}{\partial y^{\rho}} \ldots \frac{\partial y^{\lambda} }{\partial x^{\nu} } T^{\rho\ldots}_{\lambda\ldots} (y)## (the standard "tensor transformation law")

And then, last but not least, of course:

##\Phi_t^* \Bigl(T(\Phi_t(P))\Bigr)##

What will this term become in component notation, according to you? Would this be

##\Phi_t^* \Bigl(T(\Phi_t(P))\Bigr) : \frac{\partial x^{\mu}}{\partial y^{\rho}} \ldots \frac{\partial y^{\lambda} }{\partial x^{\nu} } T^{\rho\ldots}_{\lambda\ldots} (x)## ? And how would you then translate the Lie derivative into this component notation?

(removed a few confusing lines)

fresh_42 · Aug 22, 2019

You could formally consider ##\phi\, : \,M \longrightarrow N## as a diffeomorphism between two different manifolds. Now we use the tensor field on ##N## to make a statement about ##M##.

... and we can ask how fast a tensor changes as we travel down the integral curves ...

We compare ##T(p)## with its variation along the flow ##\phi##. That's what a differentiation does: it compares a tangent vector at a certain point with what happens nearby. In school nearby is a secant, here it is a flow.

haushofer · Aug 22, 2019

fresh_42 said:

You could formally consider ##\phi\, : \,M \longrightarrow N## as a diffeomorphism between two different manifolds. Now we use the tensor field on ##N## to make a statement about ##M##.

We compare ##T(p)## with its variation along the flow ##\phi##. That's what a differentiation does: it compares a tangent vector at a certain point with what happens nearby. In school nearby is a secant, here it is a flow.

Yes, but that's not the issue. My issue is that I cannot reconcile this coordinate-free notation with the way I usually calculate Lie derivatives, namely as the difference (using components and suppressing indices)

##T'(x)-T(x)##

(where the prime indicates the transformed components under infinitesimal transformations) and expanding. Both terms are evaluated at the same point x, while in the example of Carroll you evaluate one term in the shifted point with coordinates y (here that would be x').
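
One way to see that the two pictures can agree is the following sketch for a vector field ##T^\mu##. It assumes the sign convention ##x'^\mu = x^\mu - t V^\mu(x)## for the infinitesimal transformation; texts that use the opposite sign get the opposite overall sign. The transformation law gives ##T'^\mu(x') = \frac{\partial x'^\mu}{\partial x^\nu}T^\nu(x)##. The point whose primed coordinates take the numerical value ##x## has unprimed coordinates ##y = x + tV(x) + O(t^2)##, so

$$T'^\mu(x) = \frac{\partial x'^\mu}{\partial x^\nu}\Big|_{y}\, T^\nu(y) = T^\mu(x) + t\bigl(V^\rho\partial_\rho T^\mu - T^\nu \partial_\nu V^\mu\bigr) + O(t^2),$$

and therefore

$$\lim_{t\to 0}\frac{T'^\mu(x) - T^\mu(x)}{t} = V^\rho\partial_\rho T^\mu - T^\nu \partial_\nu V^\mu .$$

Both terms carry the same coordinate value ##x##, but the unprimed components that enter ##T'^\mu(x)## are those at the shifted point ##y##. Under this reading, that is where the evaluation at ##y## in eqn.(5.23) comes from.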

vanhees71 · Aug 24, 2019

I think Stephani provides a more intuitive explanation of the Lie derivative. Given is a congruence of world lines, defining a vector field ##a^{\mu}(x)## through tangent vectors along these world lines. Now consider an observer, who moves along one of the lines by an infinitesimal step, i.e., in coordinates from
$$x^{\mu} \rightarrow \bar{x}^{\mu}=x^{\mu}+\delta t a^{\mu}(x).$$
Now suppose that observer uses his coordinate system at the original point ##P## at the point ##\bar{P}##. This implies a coordinate transformation
$$x^{\prime \mu}=x^{\mu}-\delta t a^{\mu}(x)$$
and a transformation matrix
$${A^{\mu}}_{\nu}=\delta^{\mu}_{\nu} - \delta t \partial_{\nu} a^{\mu}(x).$$
Now take an arbitrary vector field ##T^{\mu}(x)##. The observer will associate the components of this field at point ##\bar{P}## to
$$T^{\prime \mu}(\bar{P})={A^{\mu}}_{\nu} T^{\nu}(x+\delta t a(x))=T^{\mu}(P) + \delta t (a^{\nu}(x) \partial_{\nu} T^{\mu}-T^{\nu} \partial_{\nu} a^\mu) +\mathcal{O}(\delta t^2).$$
Then the Lie derivative of ##T## at point ##P## is defined as
$$\mathcal{L}_{a} T^{\mu}=\lim_{\delta t \rightarrow 0} \frac{1}{\delta t} [T^{\prime \mu}(\bar{P})-T^{\mu}(P)]=a^{\nu} \partial_{\nu} T^{\mu} - T^{\nu} \partial_{\nu} a^{\mu}.$$
That's (slightly adapted to a more careful notation concerning the indices) from

H. Stephani, Relativity, An introduction to Special and General Relativity, 3rd Ed., Cambridge University Press (2004).

It's of course straightforward to define the Lie derivative for any tensor field of higher rank and also for covariant components. It's also easy to show that instead of the partial derivatives ##\partial_{\nu}## you can write the covariant derivatives ##\nabla_{\nu}## everywhere since the terms involving Christoffel symbols all finally cancel.
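
As a sketch of that cancellation for a vector field (assuming a torsion-free connection, so that ##\Gamma^{\mu}_{\nu\rho}=\Gamma^{\mu}_{\rho\nu}##):

$$a^{\nu}\nabla_{\nu}T^{\mu} - T^{\nu}\nabla_{\nu}a^{\mu} = a^{\nu}\partial_{\nu}T^{\mu} + \Gamma^{\mu}_{\nu\rho}a^{\nu}T^{\rho} - T^{\nu}\partial_{\nu}a^{\mu} - \Gamma^{\mu}_{\nu\rho}T^{\nu}a^{\rho} = a^{\nu}\partial_{\nu}T^{\mu} - T^{\nu}\partial_{\nu}a^{\mu}.$$

The two Christoffel terms differ only by the exchange ##\nu\leftrightarrow\rho## of the lower indices, so they cancel by that symmetry.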

haushofer · Aug 25, 2019

vanhees71 said:

I think Stephani provides a more intuitive explanation of the Lie derivative. Given is a congruence of world lines, defining a vector field ##a^{\mu}(x)## through tangent vectors along these world lines. Now consider an observer, who moves along one of the lines by an infinitesimal step, i.e., in coordinates from
$$x^{\mu} \rightarrow \bar{x}^{\mu}=x^{\mu}+\delta t a^{\mu}(x).$$
Now suppose that observer uses his coordinate system at the original point ##P## at the point ##\bar{P}##. This implies a coordinate transformation
$$x^{\prime \mu}=x^{\mu}-\delta t a^{\mu}(x)$$
and a transformation matrix
$${A^{\mu}}_{\nu}=\delta^{\mu}_{\nu} - \delta t \partial_{\nu} a^{\mu}(x).$$
Now take an arbitrary vector field ##T^{\mu}(x)##. The observer will associate the components of this field at point ##\bar{P}## to
$$T^{\prime \mu}(\bar{P})={A^{\mu}}_{\nu} T^{\nu}(x+\delta t a(x))=T^{\mu}(P) + \delta t (a^{\nu}(x) \partial_{\nu} T^{\mu}-T^{\nu} \partial_{\nu} a^\mu) +\mathcal{O}(\delta t^2).$$
Then the Lie derivative of ##T## at point ##P## is defined as
$$\mathcal{L}_{a} T^{\mu}=\lim_{\delta t \rightarrow 0} \frac{1}{\delta t} [T^{\prime \mu}(\bar{P})-T^{\mu}(P)]=a^{\nu} \partial_{\nu} T^{\mu} - T^{\nu} \partial_{\nu} a^{\mu}.$$
That's (slightly adapted to a more careful notation concerning the indices) from

H. Stephani, Relativity, An introduction to Special and General Relativity, 3rd Ed., Cambridge University Press (2004).

It's of course straightforward to define the Lie derivative for any tensor field of higher rank and also for covariant components. It's also easy to show that instead of the partial derivatives ##\partial_{\nu}## you can write the covariant derivatives ##\nabla_{\nu}## everywhere since the terms involving Christoffel symbols all finally cancel.

Thanks. My issue is, funnily enough, not with how to derive the Lie derivative algebraically; I can do these derivations in my sleep. My issue is that I can't reconcile different texts which use (apparently) slightly different notions and definitions (and then I'm not even starting on the passive vs. active point of view, which some texts seem to combine in defining Lie derivatives). But I think I do understand the answer to the question I posed to @Martin Scholtz, so thanks Martin.

E.g., I'm puzzled by your definition,

$$\mathcal{L}_{a} T^{\mu}=\lim_{\delta t \rightarrow 0} \frac{1}{\delta t} [T^{\prime \mu}(\bar{P})-T^{\mu}(P)] .$$

which I would define as

$$\mathcal{L}_{a} T^{\mu}=\lim_{\delta t \rightarrow 0} \frac{1}{\delta t} [T^{\prime \mu}(\bar{P})-T^{\mu}(\bar{P})] $$

This is equivalent to how e.g. Zee or d'Inverno define the Lie derivative. It makes sense, because you compare tensors at one and the same point (here ##\bar{P}##), which is the only sensible thing to do without a connection.

What I don't get specifically in your post, is your quote "The observer will associate the components of this field at point ##\bar{P}## to"
$$T^{\prime \mu}(\bar{P})={A^{\mu}}_{\nu} T^{\nu}(x+\delta t a(x))$$

I'd say that one gets instead

$$T^{\prime \mu}(\bar{P})={A^{\mu}}_{\nu} T^{\nu}(x)$$

i.e. the tensor transformation law,

$$T^{\prime \mu}(x')= \frac{\partial x^{'\mu}}{\partial x^{\nu}} T^{\nu}(x)$$

One then first evaluates the tensor in the shifted point ##\bar{P}##, and compares this value with the 'dragged along value of the tensor' at that very same point. Note that this definition differs from the earlier mentioned definition by Martin and me, but it gives the same notion of the derivative.

Maybe I should just stick to the definitions and conventions I do understand. ;)

Anyway, I'm starting a new job tomorrow, so I probably won't be able to give more elaborate reactions in the coming days, but if anyone has more comments, be assured they will be read. Many thanks to all of those who helped me with this issue :)

vanhees71 · Aug 25, 2019

The latter is the transformation of the vector components at the same space-time point ##P##. In the Lie derivative you carry the vector components in the above specified way to the "infinitesimally close" point ##\bar{P}## along the world-line with tangent ##a^{\mu}##.

In the postings above this has been formalized in the coordinate-free way using the diffeomorphism ##\phi_t## parametrizing (locally) the curve along which you want to evaluate the Lie derivative. The pullback is formally describing what Stephani calls "the observer takes his coordinate bases with him".

haushofer · Aug 25, 2019

vanhees71 said:

The pullback is formally describing what Stephani calls "the observer takes his coordinate bases with him".

I find these kinds of statements highly confusing. The only thing I can make of this wording is that you first shift the points actively, and then perform a passive transformation such that the old point in the old coordinate system has the same numerical value as the shifted point in the new (passively obtained) coordinate system. But as I said, this combines active and passive notions of coordinate transformations ("diffeomorphisms on the manifold ##M## and diffeomorphisms in ##R^n##"), and personally I find that this confuses things a whole lot more than just sticking to the active picture.

I'll compare your given definition more carefully (=algebraically and conceptually) with my own understanding. (edit:) I suspect Stephani means something subtly different with ##T^{\prime \mu}(\bar{P})## in his Lie derivative than what I'm used to ;)

vanhees71 · Aug 25, 2019

I'm usually puzzled more about the coordinate-free statements, but that's because I'm not familiar enough with it :-((.

haushofer · Aug 25, 2019

Just a side remark: as a teacher, I know how important it is to be aware of pre-existing concepts in students' heads when they are learning something new. I guess that somewhere in the process of learning this stuff years ago I implanted wrong or confusing concepts in my own head, which makes me feel uncomfortable about it every time I compare different textbooks.

So in that sense I did learn a valuable lesson :P

haushofer · Aug 25, 2019

vanhees71 said:

I'm usually puzzled more about the coordinate-free statements, but that's because I'm not familiar enough with it :-((.

Well, maybe that's the other lesson to learn today: stay away from it as much as you can if you understand your own definitions! :P
