# Confusion about the notion of a connection & covariant derivative

1. Mar 14, 2016

### "Don't panic!"

I have been reading Nakahara's book "Geometry, Topology & Physics" with the aim of teaching myself some differential geometry. Unfortunately I've gotten a little stuck on the notion of a connection and how it relates to the covariant derivative.

As I understand it a connection $\nabla :\mathcal{X}(M)\times\mathcal{X}(M)\rightarrow\mathcal{X}(M)$, where $\mathcal{X}(M)$ is the set of tangent vector fields over a manifold $M$, is defined such that given two vector fields $X,V\in\mathcal{X}(M)$ then $\nabla :(X,V)\mapsto\nabla_{X}V$. The connection enables one to "connect" neighbouring tangent spaces such that one can meaningfully compare vectors in the two tangent spaces.

What confuses me is that Nakahara states that this is in some sense the correct generalisation of a directional derivative and that we identify the quantity $\nabla_{X}V$ with the covariant derivative, but what makes this a derivative of a vector field? In what sense is the connection enabling one to compare the vector field at two different points on the manifold (surely required in order to define its derivative), when the mapping is from the (Cartesian product of) the set of tangent vector fields to itself? I thought that the connection $\nabla$ "connected" two neighbouring tangent spaces through the notion of parallel transport in which one transports a vector field along a chosen curve, $\gamma :(a,b)\rightarrow M$, in the manifold connecting the two tangent spaces.

Given this, what does the quantity $\nabla_{e_{\mu}}e_{\nu}\equiv\nabla_{\mu}e_{\nu}=\Gamma_{\mu\nu}^{\lambda}e_{\lambda}$ represent? ($e_{\mu}$ and $e_{\nu}$ are coordinate basis vectors in a given tangent space $T_{p}M$ at a point $p\in M$) I get that since $e_{\mu},e_{\nu}\in T_{p}M$, then $\nabla_{\mu}e_{\nu}\in T_{p}M$ and so can be expanded in terms of the coordinate basis of $T_{p}M$, but I don't really understand what it represents?!

Apologies for the long-windedness of this post but I've really confused myself over this notion and really want to clear up my understanding.

2. Mar 15, 2016

### andrewkirk

$\nabla_XV$ at point $p\in M$ (where $M$ is the manifold) is the derivative at $p$ of vector field $V$ in the direction of vector $X$. The tangent spaces that it is 'connecting' are the tangent spaces at points along the geodesic $\gamma$ that passes through $p$ with velocity $X$. It is only tangent spaces that are 'infinitesimally close' to $p$ that are relevant and need to be 'connected' for this purpose. The directional derivative gives information about the rate, and in what direction, the vector field $V$ deviates from the vector $V(p)$ as the latter is parallel transported along $\gamma$. More precisely, if we define $V^{(p,\gamma)}(t)$ to be the result of parallel transporting $V(p)$ from $p$ along $\gamma$ to point $\gamma(t)$ then $\nabla_XV$ measures the rate and direction of deviation of $V(\gamma(t))$ from $V^{(p,\gamma)}(t)$ as $t$ increases.

A coordinate system on an open set $U\subseteq M$ provides a set of $n$ vector fields on $U$ (the coordinate basis $e_\mu$) that is linearly independent everywhere in $U$. So $\nabla_{e_\mu}e_\nu$ at $p\in U$ is the rate and direction of deviation at $p$ of the vector $e_\nu(\gamma(t))$ from $e_\nu{}^{(p,\gamma)}(t)$, which is the result of parallel transporting $e_\nu(p)$ along $\gamma$, and the geodesic $\gamma$ is determined by $e_\mu(p)$.
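As a sketch of the description above in symbols (the notation $P^{\gamma}_{t}$ for parallel transport along $\gamma$ from $p$ to $\gamma(t)$ is introduced here for illustration and does not appear in the posts), the deviation of $V(\gamma(t))$ from the transported vector $V^{(p,\gamma)}(t) = P^{\gamma}_{t}V(p)$ can be pulled back to $T_{p}M$ and turned into a difference quotient:

$$(\nabla_{X}V)_{p} = \lim_{t\to 0}\frac{(P^{\gamma}_{t})^{-1}\,V(\gamma(t)) - V(p)}{t}, \qquad \gamma(0)=p,\quad \dot{\gamma}(0)=X_{p}.$$

Both terms in the numerator lie in $T_{p}M$, so the subtraction makes sense even though $V(\gamma(t))$ and $V(p)$ originally live in different tangent spaces.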

3. Mar 15, 2016

### "Don't panic!"

This was kind of my intuition for it (from a purely physics perspective) before studying differential geometry, but I can't 'see' from its axiomatic definition (in terms of the mapping I put in my first post) how it defines a derivative? Maybe I'm just being blinkered on the subject.

4. Mar 15, 2016

### lavinia

I am not sure that the definition of a connection always corresponds to a directional derivative unless it is a Levi-Civita connection. We should think about this further.

That said, this may help with intuition.

Suppose the manifold is embedded in Euclidean space. Then vector fields tangent to the manifold have the usual directional derivative in Euclidean space. The result is another vector field in Euclidean space. The covariant derivative is the orthogonal projection of the directional derivative onto the tangent space of the manifold. In this sense it is a natural generalization of directional derivative in Euclidean space. One can think of it as the part of the directional derivative that someone living on the manifold can see. The orthogonal part is invisible to him. This covariant derivative will be compatible with the metric that the manifold inherits from Euclidean space. It will be the unique Levi-Civita connection for that metric.

It is a theorem (the Nash embedding theorem) that any Riemannian manifold can be embedded isometrically in a sufficiently high dimensional Euclidean space. Its Levi-Civita connection is then the projection of the directional derivative onto the tangent space of the manifold. So all Levi-Civita connections can be thought of as orthogonal projections of Euclidean directional derivatives.

A vector field in Euclidean space is parallel along a curve if its directional derivative is zero. A vector field along a curve in an embedded manifold is parallel if the orthogonal projection of its directional derivative is zero. This generalizes the idea of parallel to Riemannian manifolds.
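The projection picture can be checked numerically. The following is a minimal sketch, assuming the unit sphere in $R^3$ with colatitude $\theta$ and longitude $\phi$; the helper names (`sphere_frame`, `covariant_derivative_of_e_phi`) are made up for this illustration. It differentiates the unit field $\hat{e}_{\phi}$ along a latitude circle in the ambient space and then discards the normal component.

```python
import math

def sphere_frame(theta, phi):
    """Unit normal and orthonormal tangent frame of the unit sphere at (theta, phi)."""
    n = (math.sin(theta) * math.cos(phi), math.sin(theta) * math.sin(phi), math.cos(theta))
    e_theta = (math.cos(theta) * math.cos(phi), math.cos(theta) * math.sin(phi), -math.sin(theta))
    e_phi = (-math.sin(phi), math.cos(phi), 0.0)
    return n, e_theta, e_phi

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def covariant_derivative_of_e_phi(theta, phi, h=1e-6):
    """Differentiate e_phi in R^3 along the latitude circle (direction d/dphi),
    then project the result onto the tangent plane."""
    _, _, ahead = sphere_frame(theta, phi + h)
    _, _, behind = sphere_frame(theta, phi - h)
    # Ordinary Euclidean derivative, approximated by a central difference.
    euclid = tuple((a - b) / (2 * h) for a, b in zip(ahead, behind))
    n, e_theta, e_phi = sphere_frame(theta, phi)
    normal_part = dot(euclid, n)
    # Orthogonal projection: subtract the component along the unit normal.
    tangential = tuple(d - normal_part * c for d, c in zip(euclid, n))
    return dot(tangential, e_theta), dot(tangential, e_phi), normal_part

theta0 = math.pi / 3
comp_theta, comp_phi, comp_normal = covariant_derivative_of_e_phi(theta0, 0.7)
print(comp_theta, comp_phi, comp_normal)
```

Analytically the tangential part is $-\cos\theta\,\hat{e}_{\theta}$ and the discarded normal part is $-\sin\theta$. On the equator ($\theta = \pi/2$) the tangential part vanishes, so the unit tangent of the equator is parallel along it, as expected for a great circle.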

Generally speaking, if one thinks of a connection as an $\mathbb{R}$-linear (or $\mathbb{C}$-linear) map on vector fields, $v \mapsto ∇v$, that satisfies the Leibniz rule, then it is a local operator. It is a theorem that a linear local operator can always be expressed, in local coordinates, as a linear combination of partial derivatives. But beyond this I am not sure how you get a natural correspondence with directional derivative unless the connection is a Levi-Civita connection.

In a general connection - not necessarily a Levi-Civita connection - one can define what it means for a vector field to be parallel along a curve. A vector field is parallel if its covariant derivative is zero.

Last edited: Mar 15, 2016
5. Mar 15, 2016

### "Don't panic!"

Is the idea then that the connection maps two vector fields to the covariant derivative of one vector field with respect to the other? Is this done by considering the integral curve of the vector field (that we are taking the covariant derivative with respect to) that passes through the point at which the original tangent space is attached and a point infinitesimally nearby at which a neighbouring tangent space is located? Then the vector field we wish to take the covariant derivative of is parallel transported along this curve from the neighbouring tangent space to the original tangent space so that we can compare its coordinate components (with respect to both tangent spaces) at the same point and then take the limit to get the derivative of the vector field at that point?! Apologies if this description is a little convoluted, just trying to put my thoughts on the matter in words.

6. Mar 15, 2016

### andrewkirk

Yes.
The remainder of your post looks like a prose version of my post 2, in which case the answer to that is also yes.

7. Mar 15, 2016

### lavinia

You do not seem to be following the description of the covariant derivative in terms of generalized directional derivatives which is what I thought you asked. One does not need to parallel translate a vector along a curve to calculate its directional derivative. Just take the derivative. Check this for the ordinary directional derivative in Euclidean space.

If $X$ and $Y$ are the vector fields then at the point $p$, $∇_{X}Y$ depends only on the value of $X$ at $p$. One can take any curve fitting $X_{p}$ just like with the ordinary directional derivative.

andrewkirk's description also works but for that you have to already know what parallel translation is. If you parallel translate $Y_{X_{t}}$ back to the point $p$ you can compare it to $Y_{p}$ and form a Newton quotient. The limit will also be the covariant derivative.

If one starts with a connection on a principal bundle, then parallel translation is defined first. One then defines the covariant derivative from parallel translation. This approach works in a general context that encompasses non-Levi-Civita connections and vector bundles other than the tangent bundle.

Last edited: Mar 15, 2016
8. Mar 15, 2016

### "Don't panic!"

Sorry, I hadn't meant to brush your description aside, I was just trying to understand how one can take a derivative without taking a difference quotient (maybe I'm just being stupid on this point)? And then in this case how does one interpret $\nabla_{\mu}e_{\nu}$ - is it describing how the basis vectors vary from point to point? My confusion arose initially as when I was first taught the notion of a covariant derivative from a very informal physics perspective it was done so by taking the partial derivative of a vector as follows $$\partial_{\mu}A=\partial_{\mu}(A^{\nu}e_{\nu})=(\partial_{\mu}A^{\nu})e_{\nu}+A^{\nu}\partial_{\mu}e_{\nu}$$ Now, in Euclidean space the second term would vanish as there is a canonical global coordinate system in which the basis vectors are constant, however, on a more general manifold, this is not the case.
I don't quite see how this handwavy description and the more formal description agree with each other? (again, maybe I'm just tired and not seeing the wood for the trees)

9. Mar 15, 2016

### Twigg

Technical, Fancy Answer:

Remember vector fields are really derivations. So $\nabla_{X}Y$ evaluated at $p \in M$ is like saying "the derivative of tangent vector Y along the integral curve of X through point p".

Consider two vector fields $X$ and $Y$ at point $p \in M$, and write them as $X = X^{i}\hat{e}_{i}$ and $Y = Y^{i}\hat{e}_{i}$, where $(\hat{e}_{i})^{m}_{i=1}$ is some basis of orthonormal tangent vectors. Consider a neighborhood of $p$ on which $X$ has an integral curve $\gamma (t)$, such that $X = \dot{\gamma}(t)$ and $\gamma (0) = p$. Then the "directional derivative" of vector field $Y$ along vector field $X$ at $p$ is $\nabla_{X}Y = \left.\frac{d}{dt} Y_{\gamma(t)}\right|_{t=0}$. You know that both the components $Y^{i}$ of $Y$ and the basis vectors $\hat{e}_{i}$ vary at different points along the curve, so you have to apply the product rule:

$\frac{d}{dt} Y_{\gamma(t)} = \frac{d}{dt} \left(Y^{i}(\gamma(t))\, \hat{e}_{i}(\gamma(t))\right) = \left(\frac{d}{dt}Y^{i}(\gamma(t))\right)\hat{e}_{i} + Y^{i}(\gamma(t))\left(\frac{d}{dt}\hat{e}_{i}(\gamma(t))\right) = \dot{\gamma}(t) \cdot \left(\mathrm{grad}(Y^{i})\hat{e}_{i} + Y^{i}\,\mathrm{grad}(\hat{e}_{i})\right)$

If $grad(\hat{e}_{i}) = 0$, then you get the same expression for directional derivative as in Euclidean space. But in general, you need to define some connection coefficients, $\Gamma_{ij}^{k} = \hat{e}_{k} \cdot \nabla_{\hat{e}_{j}}\hat{e}_{i}$ to simplify:

$\nabla_{X}Y = \dot{\gamma}(t) \cdot \left(\mathrm{grad}(Y^{i})\hat{e}_{i} + Y^{i}\,\mathrm{grad}(\hat{e}_{i})\right) = X^{j}Y^{i}_{,j}\hat{e}_{i} + X^{j}Y^{i}\Gamma_{ij}^{k}\hat{e}_{k}$

An "affine connection" $\nabla$ can be any map that sends two vector fields to another vector field provided it's bilinear and obeys the product rule, and every connection defines an associated covariant derivative. I would agree with lavinia that any covariant derivative associated with an affine connection that doesn't preserve the metric wouldn't be a directional derivative.

Non-technical Answer:

Take a familiar example, like the vector fields in $\mathbb{R}^{2}$ using polar coordinates, $X = r\hat{\theta}$ and $Y = \hat{r}$. In polar coordinates, nabla is defined as $\vec{\nabla} = \hat{\theta}\frac{1}{r}\frac{\partial}{\partial\theta}+\hat{r}\frac{\partial}{\partial r}$. So the directional derivative of $Y$ along $X$ is $\nabla_{X}Y = (X \cdot \vec{\nabla}) Y = r\,\frac{1}{r}\frac{\partial}{\partial\theta}\hat{r} = \frac{\partial}{\partial\theta}(\cos\theta\,\hat{i}+\sin\theta\,\hat{j}) = -\sin\theta\,\hat{i} + \cos\theta\,\hat{j} = \hat{\theta}$, a vector whose $\hat{\theta}$-component is $\hat{\theta} \cdot \hat{\theta} = 1$. Notice, the coefficients of $Y$ are constants $(1,0)$, so if there were no connection coefficients here you'd expect $\nabla_{X} Y = 0$.

To put that in connection-y terms:
$\Gamma^{r}_{r\theta} = \hat{r} \cdot \frac{1}{r}\frac{\partial}{\partial\theta}\hat{r} = 0$
$\Gamma^{\theta}_{r\theta} = \hat{\theta} \cdot \frac{1}{r}\frac{\partial}{\partial\theta}\hat{r} = \frac{1}{r}$
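A quick numerical check of the polar example, as a minimal sketch: it works in Cartesian components, where the basis is constant, and differentiates $\hat{r}$ along the circle through the point, whose velocity at $t = 0$ is $X = r\hat{\theta}$. The function names are chosen for this illustration only.

```python
import math

def r_hat(x, y):
    """Unit radial vector field Y = r_hat, in Cartesian components."""
    r = math.hypot(x, y)
    return (x / r, y / r)

def derivative_along_X(x, y, h=1e-6):
    """Directional derivative of r_hat along X = r * theta_hat at (x, y).

    The circle through (x, y) centred at the origin, parametrised by the
    rotation angle t, has velocity (-y, x) = r * theta_hat at t = 0.
    """
    def point(t):
        return (x * math.cos(t) - y * math.sin(t), x * math.sin(t) + y * math.cos(t))
    ahead, behind = r_hat(*point(h)), r_hat(*point(-h))
    # Central difference of the Cartesian components along the curve.
    return tuple((a - b) / (2 * h) for a, b in zip(ahead, behind))

r, theta = 2.0, 0.4
x, y = r * math.cos(theta), r * math.sin(theta)
dY = derivative_along_X(x, y)
theta_hat = (-math.sin(theta), math.cos(theta))
print(dY, theta_hat)
```

The two printed vectors agree to numerical precision: the derivative of $\hat{r}$ along $r\hat{\theta}$ has $\hat{\theta}$-component 1 and $\hat{r}$-component 0, matching $r\,\Gamma^{\theta}_{r\theta} = 1$ and $r\,\Gamma^{r}_{r\theta} = 0$.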

10. Mar 15, 2016

### Twigg

Bingo.

11. Mar 15, 2016

### lavinia

You should work out examples where you take directional derivatives in Euclidean space then project. Try it for the sphere.

Suppose for instance that $X(s,t)$ is a parameterized surface in $R^3$.
Then the tangent vectors to the surface are finite linear combinations $aX_{s} + bX_{t}$

The ordinary partial derivative of a tangent vector $v = aX_{s} + bX_{t}$ with respect to $s$ is $(∂a/∂s)X_{s} + (∂b/∂s)X_{t} + aX_{ss} + bX_{st}$
This is just the directional derivative of $v$ with respect to $X_{s}$.

The first two terms are a tangent vector to the surface but the last two may not be. Splitting the second derivatives into tangential and normal parts gives the decompositions

$X_{ss} = Γ_{ss}^{s}X_{s} + Γ_{ss}^{t}X_{t} + eN$
$X_{st} = Γ_{st}^{s}X_{s} + Γ_{st}^{t}X_{t} + fN$

where N is the unit normal to the surface. This writes the second derivatives as a tangent vector (the terms with the Christoffel symbols) and a normal component.

So the tangential component of $∂v/∂s$ is
$(∂a/∂s)X_{s} + (∂b/∂s)X_{t} + a(Γ_{ss}^{s}X_{s} + Γ_{ss}^{t}X_{t}) + b(Γ_{st}^{s}X_{s} + Γ_{st}^{t}X_{t})$

This is the covariant derivative $∇_{X_{s}}v$.

One can express the covariant derivative on the parameter domain $(s,t)$ as

$∇_{∂/∂s}(a∂/∂s+b∂/∂t) = (∂a/∂s)∂/∂s + (∂b/∂s)∂/∂t + a∇_{∂/∂s}∂/∂s + b∇_{∂/∂s}∂/∂t = (∂a/∂s)∂/∂s + (∂b/∂s)∂/∂t + a(Γ_{ss}^{s}∂/∂s + Γ_{ss}^{t}∂/∂t) + b(Γ_{st}^{s}∂/∂s + Γ_{st}^{t}∂/∂t)$

Exercise: Show that this covariant derivative in the parameter space is compatible with the metric
$\langle ∂/∂s,∂/∂s\rangle = X_{s}\cdot X_{s}$, $\langle ∂/∂s,∂/∂t\rangle = X_{s}\cdot X_{t}$, $\langle ∂/∂t,∂/∂t\rangle = X_{t}\cdot X_{t}$
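For the sphere, the decomposition can be worked out by hand. The following is a sketch for the unit sphere, with $s$ the colatitude and $t$ the longitude; this parameterization is an assumed choice, not one fixed in the post.

$$\begin{aligned}
X(s,t) &= (\sin s\cos t,\ \sin s\sin t,\ \cos s), \qquad N = X,\\
X_{s} &= (\cos s\cos t,\ \cos s\sin t,\ -\sin s), \qquad X_{t} = (-\sin s\sin t,\ \sin s\cos t,\ 0),\\
X_{ss} &= (-\sin s\cos t,\ -\sin s\sin t,\ -\cos s) = 0\cdot X_{s} + 0\cdot X_{t} - N,\\
X_{st} &= (-\cos s\sin t,\ \cos s\cos t,\ 0) = \cot s\, X_{t}.
\end{aligned}$$

So $Γ_{ss}^{s} = Γ_{ss}^{t} = 0$ with $e = -1$, and $Γ_{st}^{s} = 0$, $Γ_{st}^{t} = \cot s$ with $f = 0$. The tangential component of $∂v/∂s$ is then $(∂a/∂s)X_{s} + (∂b/∂s + b\cot s)X_{t}$.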

Last edited: Mar 16, 2016
12. Mar 16, 2016

### "Don't panic!"

Thanks for the worked example. I shall have a go at the exercise you've suggested.

What I've also found hard is how, from the definition of the connection given in Nakahara's book, one can arrive at the notion of a covariant derivative in the first place? He simply states that we define the action of $\nabla$ on the basis vectors $e_{\mu}$ as $$\nabla_{\mu}e_{\nu}=\Gamma_{\mu\nu}^{\lambda}e_{\lambda}$$ and then states that the connection coefficients $\Gamma_{\mu\nu}^{\lambda}$ quantify how the basis vectors change from point to point without giving any justification why.
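One way to see why the coefficients $\Gamma_{\mu\nu}^{\lambda}$ are enough is to expand a general $\nabla_{X}V$ using only the defining properties. This is a sketch assuming the usual axioms (linearity over functions in the first slot, additivity and the Leibniz rule in the second) and a coordinate basis, so that $e_{\mu}[f]=\partial_{\mu}f$:

$$\begin{aligned}
\nabla_{X}V &= \nabla_{X^{\mu}e_{\mu}}\left(V^{\nu}e_{\nu}\right) = X^{\mu}\,\nabla_{\mu}\left(V^{\nu}e_{\nu}\right)\\
&= X^{\mu}\left[(\partial_{\mu}V^{\nu})\,e_{\nu} + V^{\nu}\,\nabla_{\mu}e_{\nu}\right]\\
&= X^{\mu}\left(\partial_{\mu}V^{\lambda} + \Gamma_{\mu\nu}^{\lambda}V^{\nu}\right)e_{\lambda}.
\end{aligned}$$

The first term differentiates the components and the second records how the basis itself changes, which is the same split as in the informal expression $\partial_{\mu}A=(\partial_{\mu}A^{\nu})e_{\nu}+A^{\nu}\partial_{\mu}e_{\nu}$ with $\partial_{\mu}e_{\nu}$ replaced by $\nabla_{\mu}e_{\nu}$.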

13. Mar 16, 2016

### lavinia

If you change notation in the worked example to $∂/∂s=e_{1}$, $∂/∂t=e_{2}$ and set $a = 1$ and $b = 0$ then the formula for the covariant derivative becomes $∇_{e_{1}}e_{1} = Γ_{11}^λe_{λ}$ with $λ=1,2$

The derivative of $e_{1}$ with respect to itself is the rate of change of $e_{1}$ with respect to itself. This has two components.

One can always write a connection in terms of the covariant derivatives of a basis of vector fields in a coordinate domain. But these formulas do not automatically tell you that the covariant derivatives are generalizations of directional derivatives. However, if the connection is a Levi-Civita connection then the covariant derivative is exactly the projection of a directional derivative in a Euclidean space in which the manifold is isometrically embedded.

Last edited: Mar 16, 2016
14. Mar 16, 2016

### "Don't panic!"

Going back to your earlier point (quoted above), is the idea of a connection and the covariant derivative that we want to be able to compute the derivative of a vector field $V$ at a point $p\in M$ on the manifold $M$ along another vector field $X$ at that point, which we denote as $(\nabla_{X}V)_{p}$? In order to do this in a meaningful way the value of such a derivative should only depend on the value of $X$ at $p$, and accordingly the resulting object should be a vector field evaluated at the point $p$. A consistent notion of a derivative of one vector field along another at a given point on the manifold can be achieved through introducing a connection $\nabla$ that maps two vector fields to another vector field and whose image satisfies the linearity and Leibniz properties of a derivative operator. This allows one to meaningfully define a derivative of a vector field along another vector field at a point on the manifold, without having to compare its value at two different points on the manifold.
The connection then provides further structure as it allows one to meaningfully compare vectors in different tangent spaces on the manifold since it can be used to define a notion of parallel transport in which one parallel transports a vector in the tangent space $T_{p}M$ over $p$ to a neighbouring tangent space $T_{q}M$ over (a neighbouring point) $q$ along the integral curve of a vector field that passes through both points (e.g. if $\gamma (t)$ is the integral curve of some vector field $X$, i.e. $X(\gamma (t))=\dot{\gamma}(t)$, then we require that $\gamma (0)=p$ and $\gamma (t)=q$ ).

Would this description capture the correct intuition at all?

15. Mar 16, 2016

### lavinia

More or less.

The precise definition of an affine connection is:

For each smooth vector field $Y$ and each tangent vector $X_{p}$ at the point $p$ there is a new tangent vector at $p$, $∇_{X_{p}}Y$, called the covariant derivative of $Y$ with respect to $X_{p}$. $∇_{X_{p}}Y$ must be bilinear (over $\mathbb{R}$) in $Y$ and $X_{p}$ and satisfy the Leibniz rule $∇_{X_{p}}(fY) = f(p)∇_{X_{p}}Y + (X_{p}.f)Y_{p}$. It is also required that if $X$ and $Y$ are smooth vector fields, then the vector field $∇_{X}Y$ obtained by evaluating $∇_{X_{p}}Y$ at every point of the domain of $Y$ is also a smooth vector field.

If the covariant derivative of $Y$ along a curve $c(t)$ with respect to the velocity vector $c'(t)$ is everywhere zero along the curve then $Y$ is said to be parallel along $c(t)$. A vector $Y_{p}$ at a point $p$ on the curve can always be extended to a vector field along $c(t)$ so that $Y$ is parallel. $Y$ is the parallel translation of $Y_{p}$ along $c(t)$.
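In coordinates, the condition that $Y$ is parallel along $c(t)$ is the linear ODE $\dot{Y}^{\lambda} + \Gamma_{\mu\nu}^{\lambda}\,\dot{c}^{\mu}\,Y^{\nu} = 0$. The following is a minimal sketch that integrates it around a latitude circle on the unit sphere, assuming the standard Levi-Civita coefficients $\Gamma^{\theta}_{\phi\phi} = -\sin\theta\cos\theta$ and $\Gamma^{\phi}_{\phi\theta} = \cot\theta$; the function names are hypothetical.

```python
import math

def transport_rhs(theta0, Y):
    """dY/dt for parallel transport along the latitude circle
    theta = theta0, phi = t on the unit sphere (so c' = d/dphi)."""
    Y_theta, Y_phi = Y
    dY_theta = math.sin(theta0) * math.cos(theta0) * Y_phi      # -Gamma^theta_{phi phi} Y^phi
    dY_phi = -(math.cos(theta0) / math.sin(theta0)) * Y_theta   # -Gamma^phi_{phi theta} Y^theta
    return (dY_theta, dY_phi)

def parallel_transport(theta0, Y0, t_end, steps=2000):
    """Integrate the parallel-transport equation with the classical RK4 scheme."""
    h = t_end / steps
    Y = Y0
    for _ in range(steps):
        k1 = transport_rhs(theta0, Y)
        k2 = transport_rhs(theta0, tuple(y + 0.5 * h * k for y, k in zip(Y, k1)))
        k3 = transport_rhs(theta0, tuple(y + 0.5 * h * k for y, k in zip(Y, k2)))
        k4 = transport_rhs(theta0, tuple(y + h * k for y, k in zip(Y, k3)))
        Y = tuple(y + h * (a + 2 * b + 2 * c + d) / 6
                  for y, a, b, c, d in zip(Y, k1, k2, k3, k4))
    return Y

theta0 = math.pi / 3
Y_end = parallel_transport(theta0, (1.0, 0.0), 2 * math.pi)
# Components in the orthonormal frame: rescale the phi-component by sin(theta0).
u, w = Y_end[0], math.sin(theta0) * Y_end[1]
print(u, w)
```

In the orthonormal frame the transported vector rotates at rate $\cos\theta_{0}$, so after one full loop at $\theta_{0} = \pi/3$ it has turned through the angle $\pi$ and comes back pointing the opposite way, with its length unchanged.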

Last edited: Mar 16, 2016
16. Mar 16, 2016

### lavinia

BTW: Instead of using Christoffel symbols a connection is often written as

$∇e_{i} = ∑_{j}ω_{ij}⊗e_{j}$ where $e_{j}$ is a basis of vector fields and the $ω_{ij}$ are 1-forms called the connection 1-forms.

In this notation, $∇_{X}e_{i} = ∑_{j}ω_{ij}(X)e_{j}$. This description has the virtue that the $e_{j}$'s can be a basis in an arbitrary smooth vector bundle, not only the tangent bundle. The vector $X$ though is always a tangent vector. Connection 1-forms are commonly used in Differential Geometry.
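As a small illustration, here are the connection 1-forms for the orthonormal polar frame $(\hat{r},\hat{\theta})$ on the plane from Twigg's example. This is a sketch using the convention $\Gamma_{ij}^{k} = \hat{e}_{k}\cdot\nabla_{\hat{e}_{j}}\hat{e}_{i}$ from that post, so that $ω_{ik}(\hat{e}_{j}) = \Gamma_{ij}^{k}$:

$$\begin{aligned}
∇\hat{r} &= d\theta ⊗ \hat{\theta}, \qquad & ω_{r\theta} &= d\theta,\\
∇\hat{\theta} &= -\,d\theta ⊗ \hat{r}, \qquad & ω_{\theta r} &= -\,d\theta .
\end{aligned}$$

Since $\hat{\theta} = \frac{1}{r}\frac{\partial}{\partial\theta}$, this gives $ω_{r\theta}(\hat{\theta}) = \frac{1}{r}$ and $ω_{r\theta}(\hat{r}) = 0$, in agreement with $\Gamma^{\theta}_{r\theta} = \frac{1}{r}$. The antisymmetry $ω_{r\theta} = -ω_{\theta r}$ is what compatibility with the metric looks like in an orthonormal frame.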

Last edited: Mar 17, 2016
17. Mar 16, 2016

### "Don't panic!"

So is the covariant derivative essentially defined analogously to the axiomatic definition of tangent vectors on manifolds in terms of derivations? (In this case derivations map to directional derivatives of functions along curves, but the covariant derivative generalises this idea to enable one to meaningfully take a derivative of a vector along the direction of another vector)

Does the requirement of the Leibniz rule as one of its defining axioms guarantee that the connection defines a derivative operator?

18. Mar 16, 2016

### lavinia

Yes. One can generalize this further to the covariant derivative of a tensor field.

You also need linearity.

19. Mar 17, 2016

### "Don't panic!"

Ah ok, thanks for all your help. I think it's starting to become a little clearer now.

Is the thought process for defining vectors in terms of differential operators on manifolds that, as there is no well-defined way to subtract values of functions at different points on a manifold, the approach one takes is to consider vectors as directional derivatives of functions at particular points, since a directional derivative depends only upon the direction and point one is considering? Then, one defines the operation of taking a derivative algebraically (i.e. in terms of the axioms of linearity and the Leibniz rule) such that it doesn't depend on limits of differences of functions at two different points, thus providing a well-defined notion of a derivative at a point. One can then generalise this argument to the problem of comparing vectors in different tangent spaces to arrive at the covariant derivative, i.e. define a derivative operator algebraically so that it only depends on the properties of two vectors at a single point (and therefore in the same tangent space). This is equivalent to defining the notion of a connection, which has the added benefit of providing a way of comparing vectors in different tangent spaces by parallel transport.

20. Mar 18, 2016

### lavinia

The covariant derivative $∇_{X_{p}}Y$ depends on the tangent vector $X_{p}$ and the vector field $Y$ along a curve fitting $X_{p}$ not just $Y$ at $p$.

The directional derivative $X_{p}.f$ depends on the tangent vector $X_{p}$ and the function $f$ along a curve fitting $X_{p}$ not just $f$ at $p$.

If you look at the "algebraic" definition of tangent vector it acts on functions linearly over the base field and satisfies the Leibniz rule on functions - not just on the value of the functions at a point. The same is true for the covariant derivative.

I strongly suspect that the covariant derivative is not a directional derivative when the connection is not a Levi-Civita connection. Work through examples to get a better feel for this. For instance compute some covariant derivatives for connections that are not torsion free or are not compatible with a metric.
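A simple connection to experiment with, offered as a hypothetical example rather than one taken from a reference: on $R^2$ with the standard coordinate basis $e_{1}, e_{2}$, set $∇_{e_{1}}e_{2} = e_{1}$ and all other $∇_{e_{\mu}}e_{\nu} = 0$, and extend by linearity and the Leibniz rule.

$$\begin{aligned}
∇_{X}V &= X^{\mu}(\partial_{\mu}V^{\lambda})\,e_{\lambda} + X^{1}V^{2}\,e_{1},\\
T(e_{1},e_{2}) &= ∇_{e_{1}}e_{2} - ∇_{e_{2}}e_{1} - [e_{1},e_{2}] = e_{1} \neq 0,\\
e_{1}\langle e_{1},e_{2}\rangle &= 0 \quad\text{but}\quad \langle ∇_{e_{1}}e_{1},e_{2}\rangle + \langle e_{1},∇_{e_{1}}e_{2}\rangle = 1 .
\end{aligned}$$

This satisfies all the axioms of a connection, yet it has torsion and is not compatible with the Euclidean metric, so it differs from the ordinary directional derivative on the plane.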

More generally, work through geometric examples of connections in simple cases - e.g. two dimensional surfaces. You will then see directly how a covariant derivative generalizes a directional derivative in the case of a Levi-Civita connection. I suggest not going straight to the General Relativity definition since it is more mathematically abstract. Save that until you have an intuition from simple examples.

Last edited: Mar 18, 2016