Confusion on notion of connection & covariant derivative

"Don't panic!" · Mar 14, 2016

I have been reading Nakahara's book "Geometry, Topology & Physics" with the aim of teaching myself some differential geometry. Unfortunately I've gotten a little stuck on the notion of a connection and how it relates to the covariant derivative.

As I understand it a connection ##\nabla :\mathcal{X}(M)\times\mathcal{X}(M)\rightarrow\mathcal{X}(M)##, where ##\mathcal{X}(M)## is the set of tangent vector fields over a manifold ##M##, is defined such that given two vector fields ##X,V\in\mathcal{X}(M)## then ##\nabla :(X,V)\mapsto\nabla_{X}V##. The connection enables one to "connect" neighbouring tangent spaces such that one can meaningfully compare vectors in the two tangent spaces.

What confuses me is that Nakahara states that this is in some sense the correct generalisation of a directional derivative and that we identify the quantity ##\nabla_{X}V## with the covariant derivative, but what makes this a derivative of a vector field? In what sense is the connection enabling one to compare the vector field at two different points on the manifold (surely required in order to define its derivative), when the mapping is from the (Cartesian product of) the set of tangent vector fields to itself? I thought that the connection ##\nabla## "connected" two neighbouring tangent spaces through the notion of parallel transport, in which one transports a vector field along a chosen curve, ##\gamma :(a,b)\rightarrow M##, in the manifold connecting the two tangent spaces.

Given this, what does the quantity ##\nabla_{e_{\mu}}e_{\nu}\equiv\nabla_{\mu}e_{\nu}=\Gamma_{\mu\nu}^{\lambda}e_{\lambda}## represent? (##e_{\mu}## and ##e_{\nu}## are coordinate basis vectors in a given tangent space ##T_{p}M## at a point ##p\in M##) I get that since ##e_{\mu},e_{\nu}\in T_{p}M##, then ##\nabla_{\mu}e_{\nu}\in T_{p}M## and so can be expanded in terms of the coordinate basis of ##T_{p}M##, but I don't really understand what it represents?!

Apologies for the long-windedness of this post but I've really confused myself over this notion and really want to clear up my understanding.

andrewkirk · Mar 15, 2016

##\nabla_XV## at point ##p\in M## (where ##M## is the manifold) is the derivative at ##p## of vector field ##V## in the direction of vector ##X##. The tangent spaces that it is 'connecting' are the tangent spaces at points along the geodesic ##\gamma## that passes through ##p## with velocity ##X##. It is only tangent spaces that are 'infinitesimally close' to ##p## that are relevant and need to be 'connected' for this purpose. The directional derivative gives information about the rate, and in what direction, the vector field ##V## deviates from the vector ##V(p)## as the latter is parallel transported along ##\gamma##. More precisely, if we define ##V^{(p,\gamma)}(t)## to be the result of parallel transporting ##V(p)## from ##p## along ##\gamma## to point ##\gamma(t)## then ##\nabla_XV## measures the rate and direction of deviation of ##V(\gamma(t))## from ##V^{(p,\gamma)}(t)## as ##t## increases.

A coordinate system on open set ##U\subseteq M## is a set of ##n## vector fields on ##U## that is linearly independent everywhere in ##U##. So ##\nabla_{e_\mu}e_\nu## at ##p\in U## is the rate and direction of deviation at ##p## of the vector ##e_\nu(\gamma(t))## from ##e_\nu{}^{(p,\gamma)}(t)##, which is the result of parallel transporting ##e_\nu(p)## along ##\gamma##, and the geodesic ##\gamma## is determined by ##e_\mu(p)##.
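
The verbal description above can be condensed into one formula. As a sketch, write ##P_{t}:T_{\gamma(t)}M\rightarrow T_{p}M## for parallel transport back along ##\gamma## to ##p=\gamma(0)##, where ##\dot{\gamma}(0)=X_{p}## (the symbol ##P_{t}## is notation introduced here, not taken from the posts). The standard characterisation is then

$$(\nabla_{X}V)_{p}=\lim_{t\rightarrow 0}\frac{P_{t}\,V(\gamma(t))-V(p)}{t}$$

Both terms in the numerator lie in ##T_{p}M##, so the subtraction is meaningful. Transporting ##V(p)## forward and comparing at ##\gamma(t)##, as described above, should give the same result to first order in ##t##, and the limit depends only on ##X_{p}##, not on which curve with that velocity is used.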

"Don't panic!" · Mar 15, 2016

andrewkirk said:

##\nabla_XV## at point ##p\in M## (where ##M## is the manifold) is the derivative at ##p## of vector field ##V## in the direction of vector ##X##. The tangent spaces that it is 'connecting' are the tangent spaces at points along the geodesic ##\gamma## that passes through ##p## with velocity ##X##. It is only tangent spaces that are 'infinitesimally close' to ##p## that are relevant and need to be 'connected' for this purpose. The directional derivative gives information about the rate, and in what direction, the vector field ##V## deviates from the vector ##V(p)## as the latter is parallel transported along ##\gamma##. More precisely, if we define ##V^{(p,\gamma)}(t)## to be the result of parallel transporting ##V(p)## from ##p## along ##\gamma## to point ##\gamma(t)## then ##\nabla_XV## measures the rate and direction of deviation of ##V(\gamma(t))## from ##V^{(p,\gamma)}(t)## as ##t## increases.

This was kind of my intuition for it (from a purely physics perspective) before studying differential geometry, but I can't 'see' from its axiomatic definition (in terms of the mapping I put in my first post) how it defines a derivative? Maybe I'm just being blinkered on the subject.

lavinia · Mar 15, 2016

I am not sure that the definition of a connection always corresponds to a directional derivative unless it is a Levi-Civita connection. We should think about this further.

That said, this may help with intuition.

Suppose the manifold is embedded in Euclidean space. Then the vector fields on the tangent space of the manifold have the usual directional derivative in Euclidean space. The result is another vector field in Euclidean space. The covariant derivative is the orthogonal projection of the directional derivative onto the tangent space of the manifold. In this sense it is a natural generalization of directional derivative in Euclidean space. One can think of it as the part of the directional derivative that someone living on the manifold can see. The orthogonal part is invisible to him. This covariant derivative will be compatible with the metric that the manifold inherits from Euclidean space. It will be the unique Levi-Civita connection for that metric.

It is a theorem that any Riemannian manifold can be embedded isometrically in a sufficiently high dimensional Euclidean space. Its Levi-Civita connection is then the projection of the directional derivative onto the tangent space of the manifold. So all Levi-Civita connections can be thought of as orthogonal projections of Euclidean directional derivatives.

A vector field in Euclidean space is parallel along a curve if its directional derivative is zero. A vector field along a curve in an embedded manifold is parallel if the orthogonal projection of its directional derivative is zero. This generalizes the idea of parallel to Riemannian manifolds.
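
For a hypersurface the projection can be written out explicitly. As a sketch, with ##D_{X}Y## denoting the Euclidean directional derivative and ##N## a unit normal field (both symbols are notation introduced here for illustration):

$$\nabla_{X}Y=D_{X}Y-\langle D_{X}Y,N\rangle N$$

On the unit sphere one can take ##N(x)=x##. Differentiating ##\langle Y,N\rangle =0## along ##X## gives ##\langle D_{X}Y,N\rangle =-\langle Y,D_{X}N\rangle =-\langle Y,X\rangle##, so in that case

$$\nabla_{X}Y=D_{X}Y+\langle X,Y\rangle\,x$$

In particular, a tangent field is parallel along a curve on the sphere exactly when its ordinary derivative along the curve points in the normal direction.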

Generally speaking, if one thinks of a connection as an R-linear (or C-linear) map on vector fields ##v \rightarrow ∇v## that satisfies the Leibniz rule, then it is a local operator. It is a theorem that a local operator can always be expressed as a linear combination of partial derivatives. But beyond this I am not sure how you get a natural correspondence with directional derivative unless the connection is a Levi-Civita connection.

In a general connection - not necessarily a Levi-Civita connection - one can define what it means for a vector field to be parallel along a curve. A vector field is parallel if its covariant derivative is zero.

"Don't panic!" · Mar 15, 2016

lavinia said:

Suppose the manifold is embedded in Euclidean space. Then the vector fields on the tangent space of the manifold have the usual directional derivative in Euclidean space. The result is another vector field in Euclidean space. The covariant derivative is the orthogonal projection of the directional derivative onto the tangent space of the manifold. In this sense it is a natural generalization of directional derivative in Euclidean space. One can think of it as the part of the directional derivative that someone living on the manifold can see. The orthogonal part is invisible to him. This covariant derivative will be compatible with the metric that the manifold inherits from Euclidean space. It will be the unique Levi-Civita connection for that metric.

It is a theorem that any Riemannian manifold can be embedded isometrically in a sufficiently high dimensional Euclidean space. Its Levi-Civita connection is then the projection of the directional derivative onto the tangent space of the manifold. So all Levi-Civita connections can be thought of as orthogonal projections of Euclidean directional derivatives.

Generally speaking, if one thinks of a connection as a linear map that sends vector fields to vector fields and satisfies the Leibniz rule, then it is a local operator. It is a theorem that a local operator can always be expressed as a linear combination of partial derivatives. But beyond this I am not sure how you get a natural correspondence with directional derivative unless the connection is a Levi-Civita connection.

Is the idea then that the connection maps two vector fields to the covariant derivative of one vector field with respect to the other? Is this done by considering the integral curve of the vector field (that we are taking the covariant derivative with respect to) that passes through the point at which the original tangent space is attached and a point infinitesimally nearby at which a neighbouring tangent space is located? Then the vector field we wish to take the covariant derivative of is parallel transported along this curve from the neighbouring tangent space to the original tangent space so that we can compare its coordinate components (with respect to both tangent spaces) at the same point and then take the limit to get the derivative of the vector field at that point?! Apologies if this description is a little convoluted, just trying to put my thoughts on the matter in words.

andrewkirk · Mar 15, 2016

"Don't panic!" said:

Is the idea then that the connection maps two vector fields to the covariant derivative of one vector field with respect to the other?

Yes.
The remainder of your post looks like a prose version of my post 2, in which case the answer to that is also yes.

lavinia · Mar 15, 2016

"Don't panic!" said:

Is the idea then that the connection maps two vector fields to the covariant derivative of one vector field with respect to the other? Is this done by considering the integral curve of the vector field (that we are taking the covariant derivative with respect to) that passes through the point at which the original tangent space is attached and a point infinitesimally nearby at which a neighbouring tangent space is located? Then the vector field we wish to take the covariant derivative of is parallel transported along this curve from the neighbouring tangent space to the original tangent space so that we can compare its coordinate components (with respect to both tangent spaces) at the same point and then take the limit to get the derivative of the vector field at that point?! Apologies if this description is a little convoluted, just trying to put my thoughts on the matter in words.

You do not seem to be following the description of the covariant derivative in terms of generalized directional derivatives which is what I thought you asked. One does not need to parallel translate a vector along a curve to calculate its directional derivative. Just take the derivative. Check this for the ordinary directional derivative in Euclidean space.

If ##X## and ##Y## are the vector fields then at the point ##p##, ##∇_{X}Y## depends only on the value of ##X## at ##p##. One can take any curve fitting ##X_{p}## just like with the ordinary directional derivative.

andrewkirk's description also works but for that you have to already know what parallel translation is. If you parallel translate ##Y_{X_{t}}## back to the point ##p## you can compare it to ##Y_{p}## and form a Newton quotient. The limit will also be the covariant derivative.

If one starts with a connection on a principal bundle, then parallel translation is defined first. One then defines the covariant derivative from parallel translation. This approach works in a general context that encompasses non-Levi-Civita connections and vector bundles other than the tangent bundle.

"Don't panic!" · Mar 15, 2016

lavinia said:

You do not seem to be following the description of the covariant derivative in terms of generalized directional derivatives which is what I thought you asked. One does not need to parallel translate a vector along a curve to calculate its directional derivative. Just take the derivative. Check this for the ordinary directional derivative in Euclidean space.

If ##X## and ##Y## are the vector fields then at the point ##p##, ##∇_{X}Y## depends only on the value of ##X## at ##p##.

Sorry, I hadn't meant to brush your description aside, I was just trying to understand how one can take a derivative without taking a difference quotient (maybe I'm just being stupid on this point)? And then in this case how does one interpret ##\nabla_{\mu}e_{\nu}## - is it describing how the basis vectors vary from point to point? My confusion arose initially as when I was first taught the notion of a covariant derivative from a very informal physics perspective it was done so by taking the partial derivative of a vector as follows $$\partial_{\mu}A=\partial_{\mu}(A^{\nu}e_{\nu})=(\partial_{\mu}A^{\nu})e_{\nu}+A^{\nu}\partial_{\mu}e_{\nu}$$ Now, in Euclidean space the second term would vanish as there is a canonical global coordinate system in which the basis vectors are constant, however, on a more general manifold, this is not the case.
I don't quite see how this handwavy description and the more formal description agree with each other? (again, maybe I'm just tired and not seeing the wood for the trees)
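
One way to reconcile the two pictures, sketched under the assumption that the symbol ##\partial_{\mu}e_{\nu}## in the informal derivation is read as ##\nabla_{\mu}e_{\nu}=\Gamma_{\mu\nu}^{\lambda}e_{\lambda}## (the definition quoted from Nakahara in the first post), is to apply linearity and the Leibniz rule to ##A=A^{\nu}e_{\nu}##:

$$\nabla_{\mu}A=\nabla_{\mu}(A^{\nu}e_{\nu})=(\partial_{\mu}A^{\nu})e_{\nu}+A^{\nu}\nabla_{\mu}e_{\nu}=\left(\partial_{\mu}A^{\lambda}+\Gamma_{\mu\nu}^{\lambda}A^{\nu}\right)e_{\lambda}$$

Read this way, the formal axioms reproduce the handwavy product-rule expansion term by term, with the connection coefficients supplying the meaning of "the derivative of a basis vector".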

Twigg · Mar 15, 2016

"Don't panic!" said:

What confuses me is that Nakahara states that this is in some sense the correct generalisation of a directional derivative and that we identify the quantity ##\nabla_{X}V## with the covariant derivative, but what makes this a derivative of a vector field?

Technical, Fancy Answer:

Remember vector fields are really derivations. So ##\nabla_{X}Y## evaluated at ##p \in M## is like saying "the derivative of tangent vector Y along the integral curve of X through point p".

Consider two vector fields ##X## and ##Y## at point ##p \in M ##, and write them as ##X = X^{i}\hat{e}_{i} ## and ## Y = Y^{i}\hat{e}_{i}##, where ## (\hat{e}_{i})^{m}_{i=1} ## is some basis of orthonormal tangent vectors. Consider a neighborhood of p on which ##X## has an integral curve ## \gamma (t) ##, such that ## X_{\gamma(t)} = \dot{\gamma}(t) ## and ##\gamma (0) = p##. Then the "directional derivative" of vector field Y along vector field X at ##p## is ##\nabla_{X}Y = \left.\frac{d}{dt} (Y_{\gamma(t)})\right|_{t=0} ##. You know that both the components of Y, ##Y^{i}##, and the basis vectors ##\hat{e}_{i}## vary at different points along the curve, so you have to apply the product rule:

##\frac{d}{dt} Y_{\gamma(t)} = \frac{d}{dt} (Y^{i}(\gamma(t)) \hat{e}_{i}(\gamma(t))) = (\frac{d}{dt}Y^{i}(\gamma(t)))\hat{e}_{i} + Y^{i}(\gamma(t))(\frac{d}{dt}\hat{e}_{i}(\gamma(t))) = \dot{\gamma}(t) \cdot (grad(Y^{i})\hat{e}_{i} + Y^{i}grad(\hat{e}_{i}))##

If ##grad(\hat{e}_{i}) = 0##, then you get the same expression for directional derivative as in Euclidean space. But in general, you need to define some connection coefficients, ##\Gamma_{ij}^{k} = \hat{e}_{k} \cdot \nabla_{\hat{e}_{j}}\hat{e}_{i}## to simplify:

## \nabla_{X}Y = \dot{\gamma}(t) \cdot (grad(Y^{i})\hat{e}_{i} + Y^{i}grad(\hat{e}_{i})) = X^{j}Y^{i}_{,j}\hat{e}_{i} + X^{j}Y^{i}\Gamma_{ij}^{k}\hat{e}_{k} ##

An "affine connection" ##\nabla## can be any map that sends two vector fields to another vector field provided it's bilinear and obeys the product rule, and every connection defines an associated covariant derivative. I would agree with lavinia that any covariant derivative associated with an affine connection that doesn't preserve the metric wouldn't be a directional derivative.

Non-technical Answer:

Take a familiar example, like the vector fields in ##\mathbb{R}^{2}## using polar coordinates, ## X = r\hat{\theta}## and ## Y = \hat{r}##. In polar coordinates, nabla is defined ## \vec{\nabla} = \hat{\theta}\frac{1}{r}\frac{\partial}{\partial\theta}+\hat{r}\frac{\partial}{\partial r}##. So the directional derivative of Y along X is ##\nabla_{X}Y = (X \cdot \vec{\nabla}) Y = r\,\frac{1}{r}\frac{\partial}{\partial\theta}\hat{r} = \frac{\partial}{\partial\theta}(\cos\theta\,\hat{i}+\sin\theta\,\hat{j}) = -\sin\theta\,\hat{i} + \cos\theta\,\hat{j} = \hat{\theta}##, a vector of unit length. Notice, the coefficients of Y are constants (1,0), so if there were no connection coefficients here you'd expect ##\nabla_{X} Y = 0##.

To put that in connection-y terms:
##\Gamma^{r}_{r\theta} = \hat{r} \cdot \frac{1}{r}\frac{\partial}{\partial\theta}\hat{r} = 0##
##\Gamma^{\theta}_{r\theta} = \hat{\theta} \cdot \frac{1}{r}\frac{\partial}{\partial\theta}\hat{r} = \frac{1}{r}##
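
A quick numerical sanity check of these two coefficients, as a sketch only: the function names below are made up for illustration, and the ##\theta##-derivative is approximated by a central finite difference rather than computed symbolically.

```python
import math

def r_hat(theta):
    """Unit radial vector, in Cartesian components."""
    return (math.cos(theta), math.sin(theta))

def theta_hat(theta):
    """Unit angular vector, in Cartesian components."""
    return (-math.sin(theta), math.cos(theta))

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def d_dtheta(field, theta, h=1e-6):
    """Central finite difference of a vector-valued function of theta."""
    fp, fm = field(theta + h), field(theta - h)
    return tuple((a - b) / (2 * h) for a, b in zip(fp, fm))

def frame_coefficients(r, theta):
    """Components of (1/r) d(r_hat)/d(theta) along r_hat and theta_hat."""
    deriv = tuple(c / r for c in d_dtheta(r_hat, theta))
    return dot(r_hat(theta), deriv), dot(theta_hat(theta), deriv)

gamma_r, gamma_theta = frame_coefficients(r=2.0, theta=0.7)
print(gamma_r, gamma_theta)  # close to 0 and to 1/r = 0.5
```

The radial component comes out numerically zero and the angular one close to ##1/r##, in line with the two expressions above.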

Twigg · Mar 15, 2016

"Don't panic!" said:

And then in this case how does one interpret ##\nabla_{\mu}e_{\nu}## - is it describing how the basis vectors vary from point to point?

Bingo.

lavinia · Mar 15, 2016

"Don't panic!" said:

Sorry, I hadn't meant to brush your description aside, I was just trying to understand how one can take a derivative without taking a difference quotient (maybe I'm just being stupid on this point)?

You should work out examples where you take directional derivatives in Euclidean space then project. Try it for the sphere.

Suppose for instance that ##X(s,t)## is a parameterized surface in ##R^3##.
Then the tangent vectors to the surface are finite linear combinations ##aX_{s} + bX_{t}##

The ordinary partial derivative of a tangent vector, ##v##, with respect to ##s## is ##(∂a/∂s)X_{s} + (∂b/∂s)X_{t} + aX_{ss} + bX_{st}##
This is just the directional derivative of ##v## with respect to ##X_{s}##.

The first two terms are a tangent vector to the surface but the last two may not be. Projecting these terms onto the tangent space gives the decompositions

##X_{ss} = Γ_{ss}^{s}X_{s} + Γ_{ss}^{t}X_{t} + eN##
##X_{st} = Γ_{st}^{s}X_{s} + Γ_{st}^{t}X_{t} + fN##

where N is the unit normal to the surface. This writes the second derivatives as a tangent vector (the terms with the Christoffel symbols) and a normal component.

So the tangential component of ##∂v/∂s## is
##(∂a/∂s)X_{s} + (∂b/∂s)X_{t} + a(Γ_{ss}^{s}X_{s} + Γ_{ss}^{t}X_{t}) + b(Γ_{st}^{s}X_{s} + Γ_{st}^{t}X_{t})##

This is the covariant derivative ##∇_{X_{s}}v##

One can express the covariant derivative on the parameter domain ##(s,t)## as

##∇_{∂/∂s}(a∂/∂s+b∂/∂t) = (∂a/∂s)∂/∂s + (∂b/∂s)∂/∂t + a∇_{∂/∂s}∂/∂s + b∇_{∂/∂s}∂/∂t = (∂a/∂s)∂/∂s + (∂b/∂s)∂/∂t + a(Γ_{ss}^{s}∂/∂s + Γ_{ss}^{t}∂/∂t) + b(Γ_{st}^{s}∂/∂s + Γ_{st}^{t}∂/∂t)##

Exercise: Show that this covariant derivative in the parameter space is compatible with the metric
##\langle ∂/∂s,∂/∂s\rangle = X_{s}\cdot X_{s}##, ##\langle ∂/∂s,∂/∂t\rangle = X_{s}\cdot X_{t}##, ##\langle ∂/∂t,∂/∂t\rangle = X_{t}\cdot X_{t}##
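
To see the decomposition in a concrete case, here is a minimal sketch for the unit sphere, assuming the parameterization ##X(s,t)=(\sin s\cos t,\sin s\sin t,\cos s)## (a choice made here; the post keeps the surface general). The helper names are hypothetical. The tangential coefficients of each second derivative are found by solving the 2×2 linear system given by dot products with ##X_{s}## and ##X_{t}##, which is what orthogonal projection onto the tangent plane amounts to.

```python
import math

def sphere_derivatives(s, t):
    """Partial derivatives of X(s, t) = (sin s cos t, sin s sin t, cos s),
    written out by hand (assumed parameterization of the unit sphere)."""
    ss, cs, st, ct = math.sin(s), math.cos(s), math.sin(t), math.cos(t)
    return {
        "s": (cs * ct, cs * st, -ss),
        "t": (-ss * st, ss * ct, 0.0),
        "ss": (-ss * ct, -ss * st, -cs),
        "st": (-cs * st, cs * ct, 0.0),
        "tt": (-ss * ct, -ss * st, 0.0),
    }

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def tangential_components(w, x_s, x_t):
    """Return (alpha, beta) with tangential part of w = alpha*X_s + beta*X_t."""
    e, f, g = dot(x_s, x_s), dot(x_s, x_t), dot(x_t, x_t)  # induced metric
    ws, wt = dot(w, x_s), dot(w, x_t)
    det = e * g - f * f
    return ((g * ws - f * wt) / det, (e * wt - f * ws) / det)

def christoffel(s, t):
    """Map 'ss', 'st', 'tt' to the pair (Gamma^s, Gamma^t) for that second derivative."""
    d = sphere_derivatives(s, t)
    return {k: tangential_components(d[k], d["s"], d["t"]) for k in ("ss", "st", "tt")}

gamma = christoffel(1.0, 0.3)
print(gamma["ss"], gamma["st"], gamma["tt"])
```

For this parameterization a hand computation gives ##Γ_{ss}^{s}=Γ_{ss}^{t}=0##, ##Γ_{st}^{t}=\cot s## and ##Γ_{tt}^{s}=-\sin s\cos s##, which the printed values should match up to rounding.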

"Don't panic!" · Mar 16, 2016

lavinia said:

You should work out examples where you take directional derivatives in Euclidean space then project. Try it for the sphere.

Thanks for the worked example. I shall have a go at the exercise you've suggested.

What I've also found hard is how, from the definition of the connection given in Nakahara's book, one can arrive at the notion of a covariant derivative in the first place? He simply states that we define the action of ##\nabla## on the basis vectors ##e_{\mu}## as $$\nabla_{\mu}e_{\nu}=\Gamma_{\mu\nu}^{\lambda}e_{\lambda}$$ and then states that the connection coefficients ##\Gamma_{\mu\nu}^{\lambda}## quantify how the basis vectors change from point to point without giving any justification why.

lavinia · Mar 16, 2016

"Don't panic!" said:

Thanks for the worked example. I shall have a go at the exercise you've suggested.

What I've also found hard is how, from the definition of the connection given in Nakahara's book, one can arrive at the notion of a covariant derivative in the first place? He simply states that we define the action of ##\nabla## on the basis vectors ##e_{\mu}## as $$\nabla_{\mu}e_{\nu}=\Gamma_{\mu\nu}^{\lambda}e_{\lambda}$$ and then states that the connection coefficients ##\Gamma_{\mu\nu}^{\lambda}## quantify how the basis vectors change from point to point without giving any justification why.

If you change notation in the worked example to ##∂/∂s=e_{1}##, ##∂/∂t=e_{2}## and set ##a = 1## and ##b = 0## then the formula for the covariant derivative becomes ##∇_{e_{1}}e_{1} = Γ_{11}^λe_{λ}## with ##λ=1,2##

The derivative of ##e_{1}## with respect to itself is the rate of change of ##e_{1}## with respect to itself. This has two components.

One can always write a connection in terms of the covariant derivatives of a basis of vector fields in a coordinate domain. But these formulas do not automatically tell you that the covariant derivatives are generalizations of directional derivatives. However, if the connection is a Levi-Civita connection then the covariant derivative is exactly the projection of a directional derivative in a Euclidean space in which the manifold is isometrically embedded.

"Don't panic!" · Mar 16, 2016

lavinia said:

One does not need to parallel translate a vector along a curve to calculate its directional derivative. Just take the derivative.

Going back to your earlier point (quoted above), is the idea of a connection and the covariant derivative that we want to be able to compute the derivative of a vector field ##V## at a point ##p\in M## on the manifold ##M## along another vector field ##X## at that point, which we denote as ##(\nabla_{X}V)_{p}##? In order to do this in a meaningful way the value of such a derivative should only depend on the value of ##X## at ##p##, and accordingly the resulting object should be a vector field evaluated at the point ##p##. A consistent notion of a derivative of one vector field along another at a given point on the manifold can be achieved through introducing a connection ##\nabla## that maps two vector fields to another vector field and whose image satisfies the linearity and Leibniz properties of a derivative operator. This allows one to meaningfully define a derivative of a vector field along another vector field at a point on the manifold, without having to compare its value at two different points on the manifold.
The connection then provides further structure as it allows one to meaningfully compare vectors in different tangent spaces on the manifold since it can be used to define a notion of parallel transport in which one parallel transports a vector in the tangent space ##T_{p}M## over ##p## to a neighbouring tangent space ##T_{q}M## over (a neighbouring point) ##q## along the integral curve of a vector field that passes through both points (e.g. if ##\gamma (t)## is the integral curve of some vector field ##X##, i.e. ##X(\gamma (t))=\dot{\gamma}(t)##, then we require that ##\gamma (0)=p## and ##\gamma (t)=q## ).

Would this description capture the correct intuition at all?

lavinia · Mar 16, 2016

"Don't panic!" said:

Would this description capture the correct intuition at all?

More or less.

The precise definition of an affine connection is:

For each smooth vector field ##Y## and each tangent vector ##X_{p}## at the point ##p## there is a new tangent vector at ##p##, ##∇_{X_{p}}Y##, called the covariant derivative of ##Y## with respect to ##X_{p}##. ##∇_{X_{p}}Y## must be bilinear in ##Y## and ##X_{p}## and satisfy the Leibniz rule ##∇_{X_{p}}fY = f∇_{X_{p}}Y + (X_{p}.f)Y##. It is also required that if ##X## and ##Y## are smooth vector fields, then the vector field ##∇_{X}Y## obtained by evaluating ##∇_{X_{p}}Y## at every point of the domain of ##Y## is also a smooth vector field.

If the covariant derivative of ##Y## along a curve ##c(t)## with respect to the velocity vector ##c'(t)## is everywhere zero along the curve then ##Y## is said to be parallel along ##c(t)##. A vector ##Y_{p}## at a point ##p## can always be extended to a vector field along ##c(t)## so that ##Y## is parallel. ##Y## is the parallel translation of ##Y_{p}## along ##c(t)##.
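
In a coordinate basis, and using the convention ##\nabla_{\mu}e_{\nu}=\Gamma_{\mu\nu}^{\lambda}e_{\lambda}## from the first post, the parallel condition can be sketched as a linear ODE for the components ##Y^{\lambda}(t)## of ##Y## along the curve, with ##c^{\mu}(t)## denoting the coordinates of ##c(t)## (notation assumed here):

$$\nabla_{c'(t)}Y=\left(\frac{dY^{\lambda}}{dt}+\Gamma_{\mu\nu}^{\lambda}(c(t))\,\frac{dc^{\mu}}{dt}\,Y^{\nu}\right)e_{\lambda}=0$$

Because this is a first-order linear system, the usual existence and uniqueness theorem for ODEs is what underlies the statement that ##Y_{p}## can always be extended to a parallel field along the curve.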

lavinia · Mar 16, 2016

BTW: Instead of using Christoffel symbols a connection is often written as

##∇e_{i} = ∑_{j}ω_{ij}⊗e_{j}## where the ##e_{j}## form a basis of vector fields and the ##ω_{ij}## are 1-forms called the connection 1-forms.

In this notation, ##∇_{X}e_{i} = ∑_{j}ω_{ij}(X)e_{j}##. This description has the virtue that the ##e_{j}##'s can be a basis in an arbitrary smooth vector bundle, not only the tangent bundle. The vector ##X## though is always a tangent vector. Connection 1-forms are commonly used in Differential Geometry.
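
As a small illustration, take the flat connection on the plane minus the origin and the orthonormal polar frame ##e_{1}=\hat{r}##, ##e_{2}=\hat{\theta}## from Twigg's example (the numbering of the frame is an assumption made here). Since ##\partial_{\theta}\hat{r}=\hat{\theta}##, ##\partial_{\theta}\hat{\theta}=-\hat{r}##, and neither vector depends on ##r##, one gets

$$\nabla\hat{r}=d\theta\otimes\hat{\theta},\qquad \nabla\hat{\theta}=-\,d\theta\otimes\hat{r}$$

so ##ω_{12}=d\theta=-ω_{21}## and ##ω_{11}=ω_{22}=0##. Evaluating on ##X=\hat{\theta}=\frac{1}{r}\frac{\partial}{\partial\theta}## gives ##d\theta(\hat{\theta})=\frac{1}{r}##, hence ##\nabla_{\hat{\theta}}\hat{r}=\frac{1}{r}\hat{\theta}##, which agrees with the coefficient ##\Gamma^{\theta}_{r\theta}=\frac{1}{r}## found earlier in the thread.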

"Don't panic!" · Mar 16, 2016

lavinia said:

For each smooth vector field ##Y## and each tangent vector ##X_{p}## at the point ##p## there is a new tangent vector at ##p##, ##∇_{X_{p}}Y##, called the covariant derivative of ##Y## with respect to ##X_{p}##. ##∇_{X_{p}}Y## must be bilinear in ##Y## and ##X_{p}## and satisfy the Leibniz rule ##∇_{X_{p}}fY = f∇_{X_{p}}Y + (X_{p}.f)Y##. It is also required that if ##X## and ##Y## are smooth vector fields, then the vector field ##∇_{X}Y## obtained by evaluating ##∇_{X_{p}}Y## at every point of the domain of ##Y## is also a smooth vector field.

So is the covariant derivative essentially defined analogously to the axiomatic definition of tangent vectors on manifolds in terms of derivations? (In this case derivations map to directional derivatives of functions along curves, but the covariant derivative generalises this idea to enable one to meaningfully take a derivative of a vector along the direction of another vector)

Does the requirement of the Leibniz rule as one of its defining axioms guarantee that the connection defines a derivative operator?

lavinia · Mar 16, 2016

"Don't panic!" said:

So is the covariant derivative essentially defined analogously to the axiomatic definition of tangent vectors on manifolds in terms of derivations? (In this case derivations map to directional derivatives of functions along curves, but the covariant derivative generalises this idea to enable one to meaningfully take a derivative of a vector along the direction of another vector)

Yes. One can generalize this further to the covariant derivative of a tensor field.

Does the requirement of the Leibniz rule as one of its defining axioms guarantee that the connection defines a derivative operator?

You also need linearity.

"Don't panic!" · Mar 17, 2016

lavinia said:

Yes. One can generalize this further to the covariant derivative of a tensor field.
You also need linearity.

Ah ok, thanks for all your help. I think it's starting to become a little clearer now.

Is the thought process for defining vectors in terms of differential operators on manifolds that as there is no well-defined way to subtract values of functions at different points on a manifold the approach one takes is to consider vectors as directional derivatives of functions at particular points since a directional derivative depends only upon the direction and point one is considering. Then, one defines the operation of taking a derivative algebraically (i.e. in terms of the axioms of linearity and the Leibniz rule) such that it doesn't depend on limits of differences of functions at two different points thus providing a well-defined notion of a derivative at a point. One can then generalise this argument to the problem of comparing vectors in different tangent spaces to arrive at the covariant derivative, i.e. define a derivative operator algebraically so that it only depends on the properties of two vectors at a single point (and therefore in the same tangent space). This is equivalent to defining the notion of a connection which has the added benefit of providing a way of comparing vectors in different tangent spaces by parallel transport.

lavinia · Mar 18, 2016

"Don't panic!" said:

One can then generalise this argument to the problem of comparing vectors in different tangent spaces to arrive at the covariant derivative, i.e. define a derivative operator algebraically so that it only depends on the properties of two vectors at a single point (and therefore in the same tangent space).

The covariant derivative ##∇_{X_{p}}Y## depends on the tangent vector ##X_{p}## and the vector field ##Y## along a curve fitting ##X_{p}## not just ##Y## at ##p##.

The directional derivative ##X_{p}.f## depends on the tangent vector ##X_{p}## and the function ##f## along a curve fitting ##X_{p}## not just ##f## at ##p##.

If you look at the "algebraic" definition of tangent vector it acts on functions linearly over the base field and satisfies the Leibniz rule on functions - not just on the value of the functions at a point. The same is true for the covariant derivative.

I strongly suspect that the covariant derivative is not a directional derivative when the connection is not a Levi-Civita connection. Work through examples to get a better feel for this. For instance compute some covariant derivatives for connections that are not torsion free or are not compatible with a metric.

More generally, work through geometric examples of connections in simple cases - e.g. two dimensional surfaces. You will then see directly how a covariant derivative generalizes a directional derivative in the case of a Levi-Civita connection. I suggest not going straight to the General Relativity definition since it is more mathematically abstract. Save that until you have an intuition from simple examples.
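
One standard example of the kind suggested above, offered as a sketch: on ##R^{3}##, with ##D## the ordinary directional derivative and ##\times## the cross product, define

$$\nabla_{X}Y=D_{X}Y+\tfrac{1}{2}\,X\times Y$$

This is linear in both arguments and satisfies the Leibniz rule, and it is compatible with the Euclidean metric because ##\langle X\times Y,Z\rangle +\langle Y,X\times Z\rangle =0##. Its torsion, however, is

$$T(X,Y)=\nabla_{X}Y-\nabla_{Y}X-[X,Y]=X\times Y$$

so it is not the Levi-Civita connection. For a constant field ##Y## one has ##D_{X}Y=0## while ##\nabla_{X}Y=\tfrac{1}{2}X\times Y##, which is generally non-zero, so here the covariant derivative visibly differs from the ordinary directional derivative.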

"Don't panic!" · Mar 18, 2016

lavinia said:

The covariant derivative ##∇_{X_{p}}Y## depends on the tangent vector ##X_{p}## and the vector field ##Y## along a curve fitting ##X_{p}## not just ##Y## at ##p##.

The directional derivative ##X_{p}.f## depends on the tangent vector ##X_{p}## and the function ##f## along a curve fitting ##X_{p}## not just ##f## at ##p##.

If you look at the "algebraic" definition of tangent vector it acts on functions linearly over the base field and satisfies the Leibniz rule on functions - not just on the value of the functions at a point. The same is true for the covariant derivative.

I strongly suspect that the covariant derivative is not a directional derivative when the connection is not a Levi-Civita connection. Work through examples to get a better feel for this. For instance compute some covariant derivatives for connections that are not torsion free or are not compatible with a metric.

More generally, work through geometric examples of connections in simple cases - e.g. two dimensional surfaces. You will then see directly how a covariant derivative generalizes a directional derivative in the case of a Levi-Civita connection. I suggest not going straight to the General Relativity definition since it is more mathematically abstract. Save that until you have an intuition from simple examples.

Thanks for all the information. I'll try and work through some examples then.

"Don't panic!" · Mar 18, 2016

Sorry to go on a bit, but is the reason for the "algebraic" definition of a derivative so that one can circumvent the issue of needing to take the limit of a function (or vector field) evaluated at two points, instead only requiring information about the integral curve passing through the point one is considering?

lavinia · Mar 18, 2016

"Don't panic!" said:

Sorry to go on a bit, but is the reason for the "algebraic" definition of a derivative so that one can circumvent the issue of needing to take the limit of a function (or vector field) evaluated at two points, instead only requiring information about the integral curve passing through the point one is considering?

If one defines tangent vectors by derivatives along curves one has to worry about changes of coordinates and equivalence classes of curves. This is a little messy.
The algebraic definition is intrinsic and more concise. It has no need for worrying about coordinate transformations and equivalent curves.
But both approaches are equivalent.

If you like I could check your work on examples and also suggest ones to work on.

"Don't panic!" · Mar 18, 2016

lavinia said:

The algebraic definition is intrinsic and more concise. It has no need for worrying about coordinate transformation and equivalent curves.

What I find a little unintuitive about this definition though is what guarantees that this definition defines a differential operator? Are all derivations (i.e. maps from ##C^{\infty}(M)## to ##\mathbb{R}## satisfying the properties of linearity and the Leibniz rule) linear combinations of differential operators?

lavinia said:

If you like I could check your work on examples and also suggest ones to work on.

Yes please, some suggestions would be much appreciated.

lavinia · Mar 18, 2016

"Don't panic!" said:

What I find a little unintuitive about this definition though is what guarantees that this definition defines a differential operator? Are all derivations (i.e. maps from ##C^{\infty}(M)## to ##\mathbb{R}## satisfying the properties of linearity and the Leibniz rule) linear combinations of differential operators?

There are theorems about linear maps that satisfy the Leibniz rule which say that they can be expressed as linear combinations of partial derivatives, but I don't know the proof. Milnor says that a linear map that decreases supports is a local operator, and a theorem of Peetre says that a local operator is a differential operator.

Let's try to think this through. It shouldn't be too bad.
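One standard way the argument can go is sketched below. The symbols are introduced here for illustration: ##v## is a derivation at a point ##p##, and ##x^i## are local coordinates on a convex chart around ##p##, with ##p^i## the coordinates of ##p##. First, ##v(1) = v(1\cdot 1) = 2v(1)##, so ##v(1)=0## and ##v## vanishes on constants. Second, Taylor's theorem with remainder gives smooth functions ##g_i## with ##g_i(p) = \frac{\partial f}{\partial x^i}(p)## such that

$$f(x) = f(p) + \sum_i (x^i - p^i)\, g_i(x)$$

Applying ##v## and using linearity and the Leibniz rule at ##p##, the terms containing ##v(g_i)## are multiplied by ##x^i(p) - p^i = 0##, leaving

$$v(f) = \sum_i v(x^i)\, g_i(p) = \sum_i v(x^i)\, \frac{\partial f}{\partial x^i}(p)$$

So on the chart ##v## is a linear combination of the partial derivatives at ##p## with coefficients ##v(x^i)##. Passing from the chart to functions defined on all of ##M## needs an additional bump-function argument, which is omitted in this sketch.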

BTW: The example of a surface that I illustrated above is a good one to follow through. It is not a specific surface but a general derivation. We can walk through it.

Ben Niehoff · Mar 19, 2016

There is a neat way of thinking of things that clarifies the relationship between connections and covariant derivatives. In a sense, a derivative operator is an infinitesimal generator of translations. For fun, try working out

$$e^{a \frac{d}{dx} } f(x)$$

where ##a## is a constant, and ##e^{\hat O} \equiv 1 + \hat O + \frac12 \hat O^2 + \frac{1}{3!} \hat O^3 + \ldots## is the exponential of an operator. For simplicity, you can assume ##f(x)## is analytic, so you can replace it by its Taylor series.

In a similar sense, the covariant derivative operator ##\nabla_X## (along a given vector field ##X##) is the infinitesimal generator of parallel transport along ##X##, where that parallel transport is defined in terms of some connection coefficients (or equivalently, a connection 1-form).
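For reference, here is a sketch of where the exercise leads, assuming ##f## is analytic and ##a## lies within the radius of convergence of its Taylor series at ##x##. Expanding the operator exponential term by term reproduces the Taylor series of ##f## about ##x##:

$$e^{a \frac{d}{dx}} f(x) = \sum_{n=0}^{\infty} \frac{a^n}{n!} \frac{d^n f}{dx^n}(x) = f(x+a)$$

so ##\frac{d}{dx}## generates the translation ##x \mapsto x + a##, which is the sense of "infinitesimal generator" used above.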

stevendaryl · Mar 19, 2016

"Don't panic!" said:

What I find a little unintuitive about this definition though is what guarantees that this definition defines a differential operator? Are all derivations (i.e. maps from ##C^{\infty}(M)## to ##\mathbb{R}## satisfying the properties of linearity and the Leibniz rule) linear combinations of differential operators?

Well, the Leibniz rule basically axiomatizes what a directional derivative should do, so it doesn't seem surprising to me that only a directional derivative satisfies those axioms.

I think it's kind of a strange way to axiomatize it, though. Usually derivatives are introduced through limits of ratios: ##\frac{\Delta x}{\Delta t}##

lavinia · Mar 19, 2016

stevendaryl said:

Well, the Leibniz rule basically axiomatizes what a directional derivative should do, so it doesn't seem surprising to me that only a directional derivative satisfies those axioms.

I think it's kind of a strange way to axiomatize it, though. Usually derivatives are introduced through limits of ratios: ##\frac{\Delta x}{\Delta t}##

For functions, linearity and the Leibniz rule pin down the directional derivative. For connections one gets an operator, but I don't see in general how to interpret it as a directional derivative. In the case of a Levi-Civita connection on a Riemannian manifold, one can interpret the covariant derivative as the projection of a directional derivative in Euclidean space onto the tangent plane of an embedded submanifold. But for other connections this seems unclear and is probably false.

Every connection satisfies a Leibniz rule ##\nabla(fs) = df\otimes s + f\nabla s##. This rule together with linearity over the base field guarantees that ##\nabla## is a differential operator.
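To see what this rule gives in components, here is a short sketch using the convention from the opening post, ##\nabla_{e_{\mu}}e_{\nu}=\Gamma_{\mu\nu}^{\lambda}e_{\lambda}##, and assuming the expansions ##X = X^{\mu}e_{\mu}## and ##V = V^{\nu}e_{\nu}## in a coordinate basis. Linearity over functions in the first slot and the Leibniz rule in the second give

$$\nabla_{X}V = X^{\mu}\nabla_{e_{\mu}}\left(V^{\nu}e_{\nu}\right) = X^{\mu}\left(\partial_{\mu}V^{\nu}\,e_{\nu} + V^{\nu}\Gamma_{\mu\nu}^{\lambda}e_{\lambda}\right) = X^{\mu}\left(\partial_{\mu}V^{\lambda} + \Gamma_{\mu\nu}^{\lambda}V^{\nu}\right)e_{\lambda}$$

The first term is the ordinary directional derivative of the components, and the ##\Gamma## term records how the basis vectors themselves change from point to point. Only first derivatives of the components of ##V## appear, so ##\nabla_{X}## is a first-order differential operator.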

One can always get a Newton quotient by parallel translating vectors back to a fixed point and comparing them to the value of the vector field at that point. This will give you the covariant derivative. If one has parallel translation as given then one can define covariant differentiation this way.

andrewkirk · Mar 19, 2016

"Don't panic!" said:

I was just trying to understand how one can take a derivative without taking a difference quotient

It can be expressed as a quotient that has almost exactly the same form as the usual definitions of derivatives, using the concepts of post 2, as follows:

Let ##\mathscr{T}(M)## be the set of all smooth vector fields on manifold ##M##.
Given a connection ##\nabla:TM\times\mathscr{T}(M)\to TM##,
the covariant derivative of a smooth vector field ##V:M\to TM## at point ##p\in M##, with respect to vector ##X\in T_pM## is a vector in ##T_pM## whose value is:

$$\lim_{h\to 0}\frac{V_{(X)}(h)-V_{(X)}(0)}{h}$$

where ##V_{(X)}(h)## is defined to be the result of parallel transporting ##V(\gamma_{X}(h))## along ##\gamma_{X}## to ##p##, and ##\gamma_{X}## is the geodesic that passes through ##p## with velocity ##X##.

Parallel transport is defined by the connection, as one would expect from the name: it connects a vector in one tangent space to an equivalent - parallel transported - vector in another tangent space that is a little way along a curve.

Note that, once we have the connection, the derivative can be simply expressed as a limit of a quotient of differences of vectors all in the tangent space at ##p##. We do not need to take any differences against nearby tangent spaces.
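The quotient above can be checked numerically. The following is a minimal sketch on the unit sphere with its Levi-Civita connection, under some stated assumptions: the test field `V`, the point `p`, the direction `X` and all function names are hypothetical, and instead of the geodesic ##\gamma_{X}## the code uses the coordinate-straight curve ##t \mapsto p + tX## in ##(\theta,\phi)## coordinates, which has the same velocity ##X## at ##p## and therefore the same limit. Parallel transport is integrated with a fourth-order Runge-Kutta scheme.

```python
import math

def gamma(theta):
    """Levi-Civita symbols G[lam][mu][nu] of the unit sphere, coordinates (theta, phi)."""
    G = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    G[0][1][1] = -math.sin(theta) * math.cos(theta)
    G[1][0][1] = G[1][1][0] = math.cos(theta) / math.sin(theta)
    return G

def V(q):
    """Hypothetical test field: components (V^theta, V^phi) at the point q."""
    return [math.sin(q[1]), q[0] * math.cos(q[1])]

def curve(p, X, t):
    """Coordinate-straight curve through p with velocity X (not a geodesic in general)."""
    return [p[0] + t * X[0], p[1] + t * X[1]]

def rhs(p, X, t, W):
    """Parallel transport ODE: dW^lam/dt = -Gamma^lam_{mu nu} X^mu W^nu along the curve."""
    G = gamma(curve(p, X, t)[0])
    return [-sum(G[l][m][n] * X[m] * W[n] for m in range(2) for n in range(2))
            for l in range(2)]

def transported_back(p, X, h, steps=100):
    """V(curve(h)) parallel transported along the curve back to p, using RK4."""
    W, t, dt = V(curve(p, X, h)), h, -h / steps
    for _ in range(steps):
        k1 = rhs(p, X, t, W)
        k2 = rhs(p, X, t + dt / 2, [w + dt / 2 * k for w, k in zip(W, k1)])
        k3 = rhs(p, X, t + dt / 2, [w + dt / 2 * k for w, k in zip(W, k2)])
        k4 = rhs(p, X, t + dt, [w + dt * k for w, k in zip(W, k3)])
        W = [w + dt / 6 * (a + 2 * b + 2 * c + d)
             for w, a, b, c, d in zip(W, k1, k2, k3, k4)]
        t += dt
    return W

def quotient(p, X, h):
    """(V_(X)(h) - V_(X)(0)) / h, with both vectors in the tangent space at p."""
    return [(a - b) / h for a, b in zip(transported_back(p, X, h), V(p))]

def coordinate_formula(p, X, eps=1e-6):
    """X^mu (d_mu V^lam + Gamma^lam_{mu nu} V^nu), partials by central differences."""
    G, v = gamma(p[0]), V(p)
    out = []
    for l in range(2):
        total = 0.0
        for m in range(2):
            step = [eps if i == m else 0.0 for i in range(2)]
            d_plus = V([p[0] + step[0], p[1] + step[1]])[l]
            d_minus = V([p[0] - step[0], p[1] - step[1]])[l]
            christoffel_term = sum(G[l][m][n] * v[n] for n in range(2))
            total += X[m] * ((d_plus - d_minus) / (2 * eps) + christoffel_term)
        out.append(total)
    return out

if __name__ == "__main__":
    p, X = [1.0, 0.5], [0.3, -0.7]
    print(quotient(p, X, 1e-5))
    print(coordinate_formula(p, X))
```

For small `h` the two printed vectors should agree to within an error of order `h`, illustrating that the transported difference quotient and the Christoffel-symbol expression describe the same covariant derivative.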

"Don't panic!" · Mar 20, 2016

Thanks everyone for the information provided. So is the idea essentially that in order to calculate a derivative in a well-defined manner we consider a vector in the tangent space and then take its derivative along the integral curve of a vector field passing through that point (is this what is meant by the connection characterising an infinitesimal parallel transport, since the integral curve of the vector field will pass through the tangent space at a particular point and an infinitesimally close neighbouring tangent space?). If this is correct, why is it formalised as a mapping of two vector fields, i.e. ##\nabla :\mathcal{X}(M)\times\mathcal{X}(M)\rightarrow\mathcal{X}(M)## (where ##\mathcal{X}(M)## is the set of vector fields over ##M##)? Shouldn't it be a map of the form ##T_{p}M\times\mathcal{X}(M)\rightarrow T_{p}M##, or is the point that we choose a vector field evaluated at a particular point ##p\in M## and then take its derivative along the direction of another vector field at that point (i.e. along the integral curve of a vector field passing through that point), assuming a Levi-Civita connection?
