# The Road to Reality - exercise on scalar product

cianfa72
TL;DR Summary
On the proof of chain rule applied to the definition of scalar product expression for the special case of gradient
Hi, I keep studying The Road to Reality by R. Penrose.

In section 12.4 he asks for a proof, using the chain rule, that the scalar product ##\alpha \cdot \xi=\alpha_1 \xi^1 + \alpha_2 \xi^2 + \dots + \alpha_n \xi^n## is consistent with ##df \cdot \xi## in the particular case ##\alpha = df## for a scalar function ##f##.

My idea is to use the chain rule in a given chart around a point ##P## in the manifold ##M##. Pick a coordinate chart ##\phi: U \rightarrow \mathbb R^n## and a smooth curve ##\gamma (t)## in ##M## through ##P##. Then in that chart the curve is ##\phi \circ \gamma (t)##. The equivalence class of smooth curves with the same derivative $$\left. \frac {d(\phi \circ \gamma)} {dt} \right|_{\phi(P)}$$ at ##\phi(P)## defines the tangent vector ##\xi## at ##P##.

Now ##\xi(f)## is defined as the derivative of ##f## along one of the curves in the equivalence class. By definition, in the given chart we get: $$\frac {d(f \circ {\phi}^{-1} \circ \phi \circ \gamma (t))} {dt}$$ By the chain rule this equals $$\frac {\partial {(f \circ {\phi}^{-1})}} {\partial x^i} \cdot \left. \frac {d(\phi \circ \gamma(t))} {dt} \right|_{\phi(P)}$$ The first factor gives the components ##(\alpha_1, \alpha_2, \dots, \alpha_n)## of ##df## in the chart, and the second the components ##(\xi^1, \xi^2, \dots, \xi^n)## of the tangent vector ##\xi## in the chart (with the index ##i## summed over). Hence the result holds.

Does it make sense? Thank you.

It makes sense.

$$\frac {\partial {(f \circ {\phi}^{-1})}} {\partial x^i} \cdot \left. \frac {d(\phi \circ \gamma(t))} {dt} \right|_{\phi(P)}$$

should be

$$\left.\frac {\partial {(f \circ {\phi}^{-1})}} {\partial x^i}\right|_{\phi(P)} \cdot \left. \frac {d(\phi \circ \gamma(t))} {dt} \right|_{t=0}$$

If you use the Weierstraß definition of a derivative instead, i.e. ##f(p+v)=f(p)+(J_p(f))\cdot v+o(v)=f(p)+(\nabla f)\cdot v+o(v),## then you get the linearity for free.
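
Spelled out with the sum over the chart index made explicit, the chain-rule step reads as follows. This is only a sketch: it assumes the curve is parametrized so that ##\gamma(0)=P## and writes ##x^i## for the coordinate functions of ##\phi##, both of which are notational assumptions rather than Penrose's own.

$$\xi(f) = \left.\frac{d(f\circ\gamma)}{dt}\right|_{t=0} = \left.\frac{d\left((f\circ\phi^{-1})\circ(\phi\circ\gamma)\right)}{dt}\right|_{t=0} = \sum_{i=1}^n \left.\frac{\partial (f\circ\phi^{-1})}{\partial x^i}\right|_{\phi(P)} \left.\frac{d(x^i\circ\gamma)}{dt}\right|_{t=0} = \sum_{i=1}^n \alpha_i\,\xi^i$$

Here ##\alpha_i## are the components of ##df## and ##\xi^i## those of ##\xi## in the chart, so the right-hand side is exactly ##\alpha \cdot \xi## with ##\alpha = df##.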

fresh_42 said:
$$\left.\frac {\partial {(f \circ {\phi}^{-1})}} {\partial x^i}\right|_{\phi(P)} \cdot \left. \frac {d(\phi \circ \gamma(t))} {dt} \right|_{t=0}$$
Yes, definitely.

We could use the Weierstraß definition; however, Penrose explicitly asks to show the result using the chain rule.
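
As a numerical sanity check of the chain-rule identity ##\xi(f) = \alpha_i \xi^i##, here is a minimal sketch in plain Python. The function `f`, the curve `gamma` and the identity chart on ##\mathbb R^2## are all made up for the illustration; derivatives are approximated by central finite differences.

```python
import math

def f(x, y):
    # Hypothetical scalar function, written in the chart coordinates (x, y).
    return x * x * y + math.sin(y)

def gamma(t):
    # Hypothetical curve in the chart, with gamma(0) = P = (1.0, 0.5).
    return (1.0 + 2.0 * t, 0.5 - 3.0 * t + t * t)

def derivative(g, s0, h=1e-5):
    # Central finite difference of a one-variable function g at s0.
    return (g(s0 + h) - g(s0 - h)) / (2.0 * h)

x0, y0 = gamma(0.0)

# Left-hand side: xi(f) = d/dt f(gamma(t)) at t = 0.
xi_of_f = derivative(lambda t: f(*gamma(t)), 0.0)

# Components alpha_i of df at P and xi^i of the tangent vector at t = 0.
alpha = (derivative(lambda x: f(x, y0), x0), derivative(lambda y: f(x0, y), y0))
xi = (derivative(lambda t: gamma(t)[0], 0.0), derivative(lambda t: gamma(t)[1], 0.0))

# Right-hand side: alpha_1 xi^1 + alpha_2 xi^2.
pairing = sum(a * v for a, v in zip(alpha, xi))

print(xi_of_f, pairing)
```

The two printed numbers should agree up to finite-difference error, which is what the chain rule predicts.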

I'm resuming this thread for another exercise from the same book (Ex. 14.19). He asks to show that ##[X,Y] = \nabla_XY - \nabla_YX## for any symmetric (i.e. zero-torsion) affine connection, using the fact that for any smooth function ##f##, by definition, ##X(f)=\nabla_Xf## for any affine connection ##\nabla##.

My work: by definition $$[X,Y]f = X(Y(f)) - Y(X(f))$$ then $$[X,Y]f = \nabla_X(\nabla_Yf) - \nabla_Y(\nabla_Xf)= \nabla_X(Y^{\mu}\nabla_{\mu}f) - \nabla_Y(X^{\mu}\nabla_{\mu}f)$$ $$= \nabla_XY^{\mu} \nabla_{\mu}f + Y^{\mu} \nabla_X\nabla_{\mu}f - \nabla_YX^{\mu} \nabla_{\mu}f - X^{\mu}\nabla_Y \nabla_{\mu}f$$ $$=\nabla_XY^{\mu} \nabla_{\mu}f + Y^{\mu}X^{\nu} \nabla_{\nu}\nabla_{\mu}f - \nabla_YX^{\mu} \nabla_{\mu}f - X^{\mu}Y^{\nu}\nabla_{\nu} \nabla_{\mu}f$$ Since by hypothesis ##\nabla## is symmetric, the second and fourth terms cancel (relabel ##\mu \leftrightarrow \nu## in the fourth), so $$[X,Y](f) = \nabla_XY(f) - \nabla_YX(f)$$ hence, since ##f## is arbitrary, we get the result.

Btw, I was thinking about the following writing $$\nabla_X \nabla_Y f \text{ vs } \nabla_Y \nabla_X f$$ it should be actually $$\nabla_X (\nabla_Yf) \text{ vs } \nabla_Y (\nabla_Xf)$$ since the partial derivatives always commute then the above quantities should be always the same (btw it resembles the notation from Wald section 3.1, namely ##\nabla_a\nabla_bf = \nabla_b\nabla_af##).

Can you spot where the mistake is? Thanks.

cianfa72 said:
Btw, I was thinking about the following writing $$\nabla_X \nabla_Y f \text{ vs } \nabla_Y \nabla_X f$$ it should be actually $$\nabla_X (\nabla_Yf) \text{ vs } \nabla_Y (\nabla_Xf)$$ ...
Yes, that's meant.
cianfa72 said:
... since the partial derivatives always commute then the above quantities should be always the same ...
They do not.
cianfa72 said:
... (btw it resembles the notation from Wald section 3.1, namely ##\nabla_a\nabla_bf = \nabla_b\nabla_af##).
This is true for smooth functions on flat manifolds like ##\mathbb{R}^n##. It is not true in general.
cianfa72 said:
Can you spot where the mistake is? Thanks.
Try ##M=\operatorname{SL}(2,\mathbb{R}).##

fresh_42 said:
This is true for smooth functions on flat manifolds like ##\mathbb{R}^n##. It is not true in general.
Sorry, ##\nabla_{\mu}f = \partial_{\mu}f## always holds by definition. Hence $$\nabla_{\nu} (\nabla_{\mu}f) = \nabla_{\nu} (\partial_{\mu}f) = \partial_{\nu}( \partial_{\mu}f) = \frac {\partial^2 f} {\partial x^{\nu} \partial x^{\mu}} = \frac {\partial^2 f} {\partial x^{\mu} \partial x^{\nu}} = \nabla_{\mu}(\nabla_{\nu}f)$$

cianfa72 said:
I'm resuming this thread for another exercise from the same book (Ex. 14.19).
Which pages is this on?

cianfa72 said:
Sorry, ##\nabla_{\mu}f = \partial_{\mu}f## always holds by definition. Hence $$\nabla_{\nu} (\nabla_{\mu}f) = \nabla_{\nu} (\partial_{\mu}f) = \partial_{\nu}( \partial_{\mu}f) = \frac {\partial^2 f} {\partial x^{\nu} \partial x^{\mu}} = \frac {\partial^2 f} {\partial x^{\mu} \partial x^{\nu}} = \nabla_{\mu}(\nabla_{\nu}f)$$
##\partial_\mu f## is not a function, so the second equality does not hold.

cianfa72 said:
Section 14.6, p. 315.
I don't see it there. I cannot find ex 14.19, just a figure with that numbering.

martinbn said:
I don't see it there. I cannot find ex 14.19, just a figure with that numbering.
It is a note at the beginning of page 315 in my English edition.

fresh_42 said:
That's why I said smooth functions to be safe. Here is the theorem:
https://calculus.subwiki.org/wiki/Clairaut's_theorem_on_equality_of_mixed_partials
Conversely, if we take a smooth function ##f## defined for instance on ##SL(2,\mathbb R)## with its "standard" differentiable structure, then in general the second-order mixed partial derivatives will not commute.

That said, does my exercise solution in post #4 make sense? Thanks.

cianfa72 said:
Conversely, if we take a smooth function ##f## defined for instance on ##SL(2,\mathbb R)## with its "standard" differentiable structure, then in general the second-order mixed partial derivatives will not commute.

That said, does my exercise solution in post #4 make sense? Thanks.
Looks ok to me.

cianfa72 said:
That said, does my exercise solution in post #4 make sense? Thanks.
What is the exercise? You said "prove the identity for a torsion free connection". But what is the definition of torsion free? Most texts use this identity as a definition.

The exercise/note 14.19 asks to derive that formula for the Lie derivative ##\mathcal L_XY = [X,Y]##, namely $$[X,Y] = \nabla_XY - \nabla_YX$$ for any symmetric connection ##\nabla##.

cianfa72 said:
The exercise/note 14.19 asks to derive that formula for the Lie derivative ##\mathcal L_XY = [X,Y]##, namely $$[X,Y] = \nabla_XY - \nabla_YX$$ for any symmetric connection ##\nabla##.
What is the definition of symmetric connection?

martinbn said:
What is the definition of symmetric connection?
$$\nabla_{\mu}\nabla_{\nu}f = \nabla_{\nu}\nabla_{\mu}f$$ for any smooth ##f##.

Note that in post #4 the definition of the Lie derivative for vector fields is $$\mathcal L_XY = [X,Y] = XY - YX$$ Note that symmetric is the same as torsion-free, since the latter is equivalent to saying that the Christoffel symbols of ##\nabla## in any coordinates are symmetric in the lower indices.
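
In coordinates the link between the bracket and the symmetry of the coefficients can be checked directly. This is a sketch under an assumed index convention, ##\nabla_\nu Y^\mu = \partial_\nu Y^\mu + \Gamma^\mu{}_{\nu\lambda}Y^\lambda##; texts differ on the order of the lower indices.

$$(\nabla_XY)^\mu - (\nabla_YX)^\mu = X^\nu\partial_\nu Y^\mu - Y^\nu\partial_\nu X^\mu + \left(\Gamma^\mu{}_{\nu\lambda} - \Gamma^\mu{}_{\lambda\nu}\right)X^\nu Y^\lambda = [X,Y]^\mu + \left(\Gamma^\mu{}_{\nu\lambda} - \Gamma^\mu{}_{\lambda\nu}\right)X^\nu Y^\lambda$$

The extra term vanishes for all ##X, Y## exactly when the coefficients are symmetric in the lower indices.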

cianfa72 said:
Conversely, if we take a smooth function ##f## defined for instance on ##SL(2,\mathbb R)## with its "standard" differentiable structure, then in general the second-order mixed partial derivatives will not commute.
Thinking again about it, I'm puzzled by the following: an open neighborhood of a differentiable manifold ##M## of dimension ##n##, by very definition, looks like an open neighborhood of ##\mathbb R^n## via a diffeomorphism.

So, given a smooth function ##f## defined on ##M##, why might the second-order mixed partial derivatives fail to commute when they do commute for a smooth function defined on an open set of ##\mathbb R^n##?

cianfa72 said:
Thinking again about it, I'm puzzled by the following: an open neighborhood of a differentiable manifold ##M## of dimension ##n##, by very definition, looks like an open neighborhood of ##\mathbb R^n## via a diffeomorphism.

So, given a smooth function ##f## defined on ##M##, why might the second-order mixed partial derivatives fail to commute when they do commute for a smooth function defined on an open set of ##\mathbb R^n##?
We have a global chart on flat manifolds, but we need two different charts between ##X## and ##Y## on a curved manifold. Write your equations with points of evaluation! Then it becomes ##(X\circ Y)(p) = X\left(Y(p)\right)## and ##Y(p)## possibly requires a different chart.

cianfa72 said:
Thinking again about it, I'm puzzled by the following: an open neighborhood of a differentiable manifold ##M## of dimension ##n##, by very definition, looks like an open neighborhood of ##\mathbb R^n## via a diffeomorphism.

cianfa72 said:
So, given a smooth function ##f## defined on ##M##, why might the second-order mixed partial derivatives fail to commute when they do commute for a smooth function defined on an open set of ##\mathbb R^n##?
Why do you think they do not commute?

martinbn said:
Why do you think they do not commute?
That's the point: if they always commute, then I don't understand the point raised by @fresh_42 in post #10, namely that on a generic manifold second-order mixed partial derivatives may not commute in a given chart (when acting on a smooth function ##f## defined on the manifold).

fresh_42 said:
We have a global chart on flat manifolds, but we need two different charts between ##X## and ##Y## on a curved manifold. Write your equations with points of evaluation! Then it becomes ##(X\circ Y)(p) = X\left(Y(p)\right)## and ##Y(p)## possibly requires a different chart.
I'm not sure about that: by definition of manifold, for any point ##p## there is an (open) chart that includes it within the atlas defining its differential structure. Hence both ##(X\circ Y)(p) = X\left(Y(p)\right)## and ##Y(p)## are defined on the same open neighborhood of ##p##.

cianfa72 said:
I'm not sure about that: by definition of manifold, for any point ##p## there is an (open) chart that includes it within the atlas defining its differential structure. Hence both ##(X\circ Y)(p) = X\left(Y(p)\right)## and ##Y(p)## are defined on the same open neighborhood of ##p##.
We are talking about vector fields.
cianfa72 said:
$$\nabla_XY(f) - \nabla_YX(f)=[X,Y](f)$$
This means we follow a flow along ##Y## and then along ##X## and compare it with the path the other way around: follow a flow along ##X## and then along ##Y##. And these two paths do not necessarily end up at the same final point. This is only a heuristic. If you want to "see" it by formulas, you will need to do the math, e.g. on ##\operatorname{SO}(3)## or with the lowest-dimensional example
$$\Bigl\langle \left. \begin{pmatrix} e^t&c\\0&e^{-t} \end{pmatrix} \right| t,c\in \mathbb{R} \Bigr\rangle$$

fresh_42 said:
We are talking about vector fields.
Of course, nevertheless they are defined on the same open neighborhood and the operations involved are actually local.

I suspect that the meaning of the expression ##\nabla_{\nu} \nabla_{\mu}f## is not actually ##\nabla_{\nu}( \nabla_{\mu}f)## but ##(\nabla \nabla f)_{\nu \mu}##. Note that ##\nu,\mu## are indices (1,2,3...) in any chart.

fresh_42 said:
This means we follow a flow along ##Y## and then along ##X## and compare it with the path the other way around: follow a flow along ##X## and then along ##Y##. And these two paths do not necessarily end up at the same final point.
Ok yes, since in general ##[X,Y] \neq 0##.

fresh_42 said:
If you want to "see" it by formulas, you will need to do the math, e.g. on ##\operatorname{SO}(3)## or with the lowest-dimensional example
$$\Bigl\langle \left. \begin{pmatrix} e^t&c\\0&e^{-t} \end{pmatrix} \right| t,c\in \mathbb{R} \Bigr\rangle$$
These are elements (vector fields) of the associated Lie algebra, right?

cianfa72 said:
These are elements (vector fields) of the associated Lie algebra, right?
This is a non-abelian Lie group. The tangent vectors (at the identity) are
$$X=\begin{pmatrix} 1&0\\0&-1 \end{pmatrix}\, , \,Y=\begin{pmatrix} 0&1\\0&0 \end{pmatrix}$$ with ##[X,Y]=2Y.##

fresh_42 said:
This is a non-abelian Lie group. The tangent vectors (at the identity) are
$$X=\begin{pmatrix} 1&0\\0&-1 \end{pmatrix}\, , \,Y=\begin{pmatrix} 0&1\\0&0 \end{pmatrix}$$ with ##[X,Y]=2Y.##
Ah ok, one gets the basis of tangent vectors at the Lie group identity (i.e. the Lie algebra basis vectors) by calculating the derivatives of the group element w.r.t. the parameters ##t## and ##c## and evaluating them at ##t=0,c=0##.
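
That recipe can be checked numerically. The following is a minimal sketch in plain Python: the helper names (`group_element`, `partial`, `matmul`, `commutator`) are invented for the illustration, and the derivatives at the identity are approximated by central finite differences rather than computed exactly.

```python
import math

def group_element(t, c):
    # Element of the matrix group quoted above, with parameters t and c.
    return [[math.exp(t), c], [0.0, math.exp(-t)]]

def partial(param_index, h=1e-6):
    # Central difference of the group element at the identity (t = 0, c = 0)
    # with respect to t (param_index 0) or c (param_index 1).
    plus = [0.0, 0.0]
    minus = [0.0, 0.0]
    plus[param_index] = h
    minus[param_index] = -h
    a, b = group_element(*plus), group_element(*minus)
    return [[(a[i][j] - b[i][j]) / (2.0 * h) for j in range(2)] for i in range(2)]

def matmul(a, b):
    # Product of two 2x2 matrices stored as nested lists.
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def commutator(a, b):
    # Matrix commutator [a, b] = ab - ba.
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

X = partial(0)  # approximately [[1, 0], [0, -1]]
Y = partial(1)  # approximately [[0, 1], [0, 0]]
bracket = commutator(X, Y)  # should be close to 2 * Y

print(X, Y, bracket)
```

Up to finite-difference error, `bracket` comes out as twice `Y`, in line with ##[X,Y]=2Y##.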

Sorry, regarding my previous statement
cianfa72 said:
I suspect that the meaning of the expression ##\nabla_{\nu} \nabla_{\mu}f## is not actually ##\nabla_{\nu}( \nabla_{\mu}f)## but ##(\nabla \nabla f)_{\nu \mu}##. Note that ##\nu,\mu## are indices (1,2,3...) in any local chart.
What is the actual meaning of ##\nabla_{\nu} \nabla_{\mu}f##?
Thanks.

cianfa72 said:
Sorry, regarding my previous statement

What is the actual meaning of ##\nabla_{\nu} \nabla_{\mu}f##?
Thanks.
The first derivative produces a one form, the second derivative is of that one form.

martinbn said:
The first derivative produces a one form, the second derivative is of that one form.
I think it is a matter of notation: ##\nabla_{\nu}\nabla_{\mu}f## should be written as $$(\mathbf {\nabla} (\mathbf {\nabla} f))_{\nu \mu}$$ i.e. starting from a scalar field ##f##, take the covariant derivative to get its gradient (a 1-form), then apply the covariant derivative again. You get a (0,2) tensor. Fix a basis ##\{\mathbf e_{\alpha}\}## in the tangent space at any point and insert basis vectors in order into the (0,2) tensor "slots" (i.e. do the contractions). This way you get the ##\nu,\mu## components of the (0,2) tensor in that basis.

Now what is going on in post #4 makes sense: since on manifolds the connection coefficients in general do not vanish, the covariant derivative of the gradient doesn't reduce to the second-order mixed partial derivatives of ##f##, hence the aforementioned expression need not commute in ##\nu, \mu## unless the connection is symmetric.
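
In a coordinate chart this reads as follows. It is a sketch, assuming the convention ##\nabla_\nu \omega_\mu = \partial_\nu\omega_\mu - \Gamma^\lambda{}_{\nu\mu}\omega_\lambda## for a 1-form, applied to ##\omega_\mu = \partial_\mu f##.

$$(\nabla\nabla f)_{\nu\mu} = \partial_\nu\partial_\mu f - \Gamma^\lambda{}_{\nu\mu}\,\partial_\lambda f, \qquad (\nabla\nabla f)_{\nu\mu} - (\nabla\nabla f)_{\mu\nu} = -\left(\Gamma^\lambda{}_{\nu\mu} - \Gamma^\lambda{}_{\mu\nu}\right)\partial_\lambda f$$

The mixed partials cancel in the difference, so only the antisymmetric part of the connection coefficients (the torsion) can spoil the symmetry in ##\nu, \mu##.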

Btw, even in ##\mathbb R^n##, if one picks a non-standard affine connection with torsion (which is then not a Levi-Civita connection), ##\nabla_{\nu} \nabla_{\mu}f## doesn't commute either.
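
A small numerical illustration of this on ##\mathbb R^2##, as a sketch only: the function ##f = x^2 y## and the constant coefficients in `GAMMA` are made up for the example, and the index convention ##(\nabla\nabla f)_{\nu\mu} = \partial_\nu\partial_\mu f - \Gamma^\lambda{}_{\nu\mu}\partial_\lambda f## is an assumption.

```python
def grad_f(x, y):
    # Partial derivatives of the hypothetical function f(x, y) = x**2 * y.
    return (2.0 * x * y, x * x)

def hess_f(x, y):
    # Second partial derivatives of f; symmetric, as Clairaut's theorem says.
    return ((2.0 * y, 2.0 * x), (2.0 * x, 0.0))

# GAMMA[lam][nu][mu] stands for Gamma^lam_{nu mu} in Cartesian coordinates.
# Hypothetical constant coefficients: only Gamma^0_{01} = 1 (zero-based
# indices), so the coefficients are NOT symmetric in the lower indices.
GAMMA = [
    [[0.0, 1.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 0.0]],
]

def second_cov(nu, mu, x, y):
    # (nabla nabla f)_{nu mu} = d_nu d_mu f - Gamma^lam_{nu mu} d_lam f
    g = grad_f(x, y)
    hess = hess_f(x, y)
    return hess[nu][mu] - sum(GAMMA[lam][nu][mu] * g[lam] for lam in range(2))

x0, y0 = 1.0, 2.0
antisym = second_cov(0, 1, x0, y0) - second_cov(1, 0, x0, y0)

print(antisym, -grad_f(x0, y0)[0])
```

With these coefficients the antisymmetric part equals ##-\partial_x f##, which is nonzero at the chosen point even though the ordinary mixed partials of ##f## commute.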
