How is the gradient covariant?

In summary, the covector and vector gradients are two different ways of describing the rate of change of a scalar function along a path. The covector gradient is the more natural object, since it needs no metric; the vector gradient is built from it with the inverse metric tensor, and its components transform differently under a coordinate change.
  • #1
plob
Hi, take basic cartesian coordinates, and suppose we want to know the gradient of a scalar function of x, y, and z. We can use the most basic basis there is, three orthogonal unit vectors, and come up with the gradient of the scalar function. Now, without rescaling the coordinate system or altering it in any way, just double the length of each basis vector. If it is the same scalar function as before, then each component of the gradient is reduced by half.

Isn't that contravariance?
 
  • #2
There are two different meanings of "gradient" whose differences are glossed over when you're dealing with cartesian coordinates, but which need to be kept distinct when you're dealing with general coordinates and coordinate changes.

Most fundamentally,
  • Let ##\phi(\vec{r})## be a scalar field---a function that returns a scalar (a real or complex number) for each point in space.
  • Let ##\vec{r}(t)## be a path through space as a function of a real-valued parameter ##t## (this doesn't have to be time, it could be any quantity that increases continuously along the path).
  • Let ##\overrightarrow{V}(t)## be the corresponding "velocity vector", ##\overrightarrow{V} = \frac{d \vec{r}}{dt}##.
Then we can define the "directional derivative" of ##\phi## along vector ##\overrightarrow{V}## to be the rate of change of ##\phi## along the path ##\vec{r}(t)##:

Directional derivative: ##\nabla_{\overrightarrow{V}}(\phi) \equiv \frac{d}{dt} \phi(\vec{r}(t))##
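As a rough numerical illustration of this definition, the sketch below differentiates ##\phi(\vec{r}(t))## along a path by finite differences and compares the result with the chain-rule sum of partial derivatives times velocity components. The field `phi`, the path `path`, and the step size `h` are hypothetical choices made only for this example.

```python
import math

def phi(x, y):
    # Hypothetical scalar field, chosen only for illustration
    return x**2 * y + math.sin(y)

def path(t):
    # Hypothetical path r(t) through the plane
    return (math.cos(t), t**2)

def velocity(t):
    # Exact velocity V = dr/dt for the path above
    return (-math.sin(t), 2 * t)

def partials(x, y):
    # Analytic partial derivatives (d phi/dx, d phi/dy)
    return (2 * x * y, x**2 + math.cos(y))

def rate_along_path(t, h=1e-6):
    # d/dt phi(r(t)) estimated with a central difference
    xp, yp = path(t + h)
    xm, ym = path(t - h)
    return (phi(xp, yp) - phi(xm, ym)) / (2 * h)

def chain_rule(t):
    # Sum over j of (d phi / d x^j) * V^j
    x, y = path(t)
    px, py = partials(x, y)
    vx, vy = velocity(t)
    return px * vx + py * vy

t0 = 0.7
numeric = rate_along_path(t0)
analytic = chain_rule(t0)
print(numeric, analytic)
```

The two numbers should agree up to the finite-difference error, which is the sense in which the directional derivative depends only on the velocity vector at the point.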

In terms of the directional derivative, we can define two different mathematical objects that might be called the "gradient" of ##\phi##:

Covector gradient: Define ##(\nabla \phi)## to be an operator that takes a vector ##\overrightarrow{V}## and returns the directional derivative ##\nabla_{\overrightarrow{V}}(\phi)##

Vector gradient: Define ##\overrightarrow{\nabla \phi}## to be a vector such that ##\overrightarrow{\nabla \phi} \cdot \vec{V} = \nabla_{\overrightarrow{V}}(\phi)##

The components of the covector gradient are given by: ##(\nabla \phi)_j = \frac{\partial \phi}{\partial x^j}##. The components of the vector gradient are given by: ##(\overrightarrow{\nabla \phi})^j = \sum_{i} g^{ij} \frac{\partial \phi}{\partial x^i} ##, where ##g^{ij}## is the inverse metric tensor. In Cartesian coordinates, ##g^{ij} = 0## unless ##i=j##, and ##g^{jj} = 1##, so the components of the covector gradient and the vector gradient coincide. However, under a coordinate change, the components of the covector gradient transform differently than the components of the vector gradient.
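As a worked check against the original question, here is a sketch under one stated assumption: the new basis vectors are ##\vec{e}'_i = 2 \vec{e}_i##, so the coordinates measured along them are ##x'^i = x^i / 2##. The primed symbols are notation introduced only for this example.

Covector components: ##\frac{\partial \phi}{\partial x'^i} = \sum_j \frac{\partial x^j}{\partial x'^i} \frac{\partial \phi}{\partial x^j} = 2 \frac{\partial \phi}{\partial x^i}##

Metric and inverse metric: ##g'_{ij} = \vec{e}'_i \cdot \vec{e}'_j = 4 \delta_{ij}##, so ##g'^{ij} = \frac{1}{4} \delta^{ij}##

Vector components: ##(\overrightarrow{\nabla \phi})'^j = \sum_i g'^{ij} \frac{\partial \phi}{\partial x'^i} = \frac{1}{4} \cdot 2 \frac{\partial \phi}{\partial x^j} = \frac{1}{2} \frac{\partial \phi}{\partial x^j}##

So the halving described in the question is what happens to the components of the vector gradient, which behave contravariantly. The components of the covector gradient double along with the basis vectors, which is the covariant behavior.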
 
  • #3
Hi, thank you, I'm sure this is probably right.

But I don't really have the background to grasp the last part. My fault, sorry.

Is it possible there might be a more intuitive way of understanding it?
 
  • #4
Which part? The business about the metric tensor?
 
  • #5
Hi steven, yes
 
  • #6
Suppose you have some weird coordinate system with two coordinates, ##u## and ##v##, and you want to know the distance between point ##A## and point ##B##. If you were using cartesian coordinates, then the distance would be given by: ##D = \sqrt{\delta u^2 + \delta v^2}##, where ##\delta u## is the change in the ##u## coordinate in going from ##A## to ##B##, and ##\delta v## is the change in the ##v## coordinate. But what about polar coordinates? In terms of ##r## and ##\theta##, the distance between ##A## and ##B## is given approximately (when ##A## and ##B## are very close together) by:

##D = \sqrt{\delta r^2 + r^2 \delta \theta^2}##

In general, for points that are close together, the distance will be given by: ##D^2 = g_{uu} \delta u^2 + g_{uv} \delta u \delta v + g_{vu} \delta v \delta u + g_{vv} \delta v^2##. Those four numbers, ##g_{uu}, g_{uv}, g_{vu}, g_{vv}## are the components of the "metric tensor" in the ##u-v## coordinate system.

In cartesian coordinates ##x,y##, it's trivial: ##g_{xx} = 1, g_{xy} = 0, g_{yx} = 0, g_{yy} = 1##. But in polar coordinates, it's a little more interesting: ##g_{rr} = 1, g_{r\theta} = 0, g_{\theta r} = 0, g_{\theta \theta} = r^2##.
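A quick numerical sanity check of the polar metric, offered as a sketch: it converts two nearby points to cartesian coordinates, measures the straight-line distance there, and compares it with ##\sqrt{\delta r^2 + r^2 \delta \theta^2}##. The sample point and the small offsets are arbitrary values picked for illustration.

```python
import math

def cartesian_distance(r1, th1, r2, th2):
    # Convert both points to (x, y) and use the ordinary distance formula
    x1, y1 = r1 * math.cos(th1), r1 * math.sin(th1)
    x2, y2 = r2 * math.cos(th2), r2 * math.sin(th2)
    return math.hypot(x2 - x1, y2 - y1)

def metric_distance(r, dr, dth):
    # Polar metric: g_rr = 1, g_theta_theta = r^2, off-diagonal terms zero
    return math.sqrt(dr**2 + r**2 * dth**2)

r, th = 2.0, 0.5          # arbitrary sample point
dr, dth = 1e-4, 2e-4      # small coordinate offsets

exact = cartesian_distance(r, th, r + dr, th + dth)
approx = metric_distance(r, dr, dth)
print(exact, approx)
```

The agreement is only approximate, and it improves as the offsets shrink, which matches the "when the points are very close together" caveat above.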

The metric tensor is how you compute dot-products of two vectors: ##\vec{A} \cdot \vec{B} = (A^u)(B^u) g_{uu} + (A^u)(B^v) g_{uv} + (A^v)(B^u) g_{vu} + (A^v)(B^v) g_{vv}##. For cartesian coordinates, since the components of ##g## are pretty trivial, then it simplifies a lot: ##\vec{A} \cdot \vec{B} = A^x B^x + A^y B^y##.

Viewed as a 2x2 matrix, the metric tensor ##g## has an inverse. Its components are denoted by raised indices: ##g^{ij}##. You can use the inverse metric tensor to take a "dot" product of two covectors, or to convert a covector into a vector.
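For a 2x2 metric the inverse can be written out explicitly. This is the standard formula for the inverse of a 2x2 matrix, stated here only as an illustration in the ##u-v## notation used above:

##\begin{pmatrix} g^{uu} & g^{uv} \\ g^{vu} & g^{vv} \end{pmatrix} = \frac{1}{g_{uu} g_{vv} - g_{uv} g_{vu}} \begin{pmatrix} g_{vv} & -g_{uv} \\ -g_{vu} & g_{uu} \end{pmatrix}##

Converting a covector with components ##A_j## into a vector with components ##A^i## then reads ##A^i = \sum_j g^{ij} A_j##. When the metric is diagonal, the off-diagonal terms vanish and each inverse component is just the reciprocal of the corresponding metric component.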

So let's take the case of the gradient in polar coordinates. The covector form is: ##\nabla \phi## with components ##(\nabla \phi)_r = \frac{\partial \phi}{\partial r}## and ##(\nabla \phi)_\theta = \frac{\partial \phi}{\partial \theta}##. To convert it into a vector, you use the inverse of the metric tensor. In this case,

##g^{rr} = \frac{1}{g_{rr}} = 1##
##g^{\theta \theta} = \frac{1}{g_{\theta \theta}} = \frac{1}{r^2}##
(the other two components are zero).

So the vector form of the gradient is: ##\overrightarrow{\nabla \phi}## with components

##(\overrightarrow{\nabla \phi})^r = g^{rr} \frac{\partial \phi}{\partial r} = \frac{\partial \phi}{\partial r}##
##(\overrightarrow{\nabla \phi})^\theta = g^{\theta \theta} \frac{\partial \phi}{\partial \theta} = \frac{1}{r^2} \frac{\partial \phi}{\partial \theta} ##

To get the directional derivative, you take the dot-product with a direction vector ##\overrightarrow{V}##:
##\overrightarrow{\nabla \phi} \cdot \overrightarrow{V}##

but computing the dot-product in curvilinear coordinates involves the metric tensor again:

##\overrightarrow{\nabla \phi} \cdot \overrightarrow{V} = g_{rr} (\overrightarrow{\nabla \phi})^r (\overrightarrow{V})^r + g_{\theta \theta} (\overrightarrow{\nabla \phi})^\theta (\overrightarrow{V})^\theta##

The metric tensor just cancels out the use of the inverse metric tensor in forming the vector gradient, so the result is just:

##\overrightarrow{\nabla \phi} \cdot \overrightarrow{V} = \frac{\partial \phi}{\partial r} V^r + \frac{\partial \phi}{\partial \theta} V^\theta##

This shows that the covector (covariant) form of the gradient is more natural; the vector form uses the inverse metric tensor to create the vector, and then uses the metric tensor again to get the result. In the final result, the metric tensor components drop out.
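The cancellation can also be checked numerically. The sketch below assumes a hypothetical field ##\phi = r^2 \cos\theta## and an arbitrary direction vector with components ##V^r, V^\theta##. It computes the directional derivative three ways: by contracting the covector gradient with ##\overrightarrow{V}##, by the metric dot product with the vector gradient, and by a finite difference along a straight line in the ##(r, \theta)## coordinates.

```python
import math

def phi(r, th):
    # Hypothetical scalar field in polar coordinates
    return r**2 * math.cos(th)

def covector_gradient(r, th):
    # Components (d phi/dr, d phi/dtheta)
    return (2 * r * math.cos(th), -r**2 * math.sin(th))

def vector_gradient(r, th):
    # Raise the index with the inverse metric: g^rr = 1, g^thth = 1/r^2
    d_r, d_th = covector_gradient(r, th)
    return (d_r, d_th / r**2)

def metric_dot(a, b, r):
    # Dot product with the polar metric: g_rr = 1, g_thth = r^2
    return a[0] * b[0] + r**2 * a[1] * b[1]

r0, th0 = 1.5, 0.8        # arbitrary sample point
V = (0.3, -0.7)           # arbitrary components (V^r, V^theta)

d_r, d_th = covector_gradient(r0, th0)
via_covector = d_r * V[0] + d_th * V[1]
via_vector = metric_dot(vector_gradient(r0, th0), V, r0)

# Finite difference along the coordinate line with velocity V
h = 1e-6
numeric = (phi(r0 + h * V[0], th0 + h * V[1])
           - phi(r0 - h * V[0], th0 - h * V[1])) / (2 * h)
print(via_covector, via_vector, numeric)
```

All three values should agree up to rounding and finite-difference error, with the factors of ##r^2## and ##1/r^2## cancelling in the second computation.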
 
  • #7
Please allow me a followup to this thread, which I found after struggling with the matter.

What you say, stevendaryl, appears quite clear to me. I would just be grateful for confirmation that, consequently, the total derivative of any scalar ##S##, expressed through the differentials of the coordinates as ##\mbox{d}S = \frac{\partial S}{\partial x^\mu} \mbox{d}x^\mu##, looks trivial in any coordinate system, in the sense that the metric is not visible. In particular, in polar coordinates ##\mbox{d}S = \frac{\partial S}{\partial r} \mbox{d}r + \frac{\partial S}{\partial \theta} \mbox{d}\theta##.

I got quite confused because most of the literature says that the gradient in polar coordinates is ##\mbox{col}(\frac{\partial S}{\partial r} , \; \frac{1}{r}\frac{\partial S}{\partial \theta} )## . But I think the reason is that they refer to a normalized coordinate system (unit vectors in all directions) in place of the natural system given by the coordinates.

Thank you for any reply.
 
  • #8
Forgive me for not reading most of the thread, but I'll try to answer @gerald V's last post.

Yes, the derivative ##dS=\frac{\partial S}{\partial x^i}dx^i## is independent of a metric. You do not need a metric to define the derivative of a scalar function.

I think you are right about why you found that formula in the literature. If you just follow the usual formula for the gradient vector in coordinates, you would get ##\nabla S=\frac{\partial S}{\partial r}\frac{\partial}{\partial r}+\frac{1}{r^2}\frac{\partial S}{\partial \theta}\frac{\partial}{\partial \theta},## so if you expressed ##\nabla S## in the basis ##\{\frac{\partial}{\partial r},\frac{\partial}{\partial \theta}\},## you would get the vector ##\begin{pmatrix}\frac{\partial S}{\partial r}\\ \frac{1}{r^2}\frac{\partial S}{\partial\theta}\end{pmatrix}.##

But people often like to use a basis of unit vectors, and from ##g_{rr}=1, g_{\theta\theta}=r^2##, we see that while ##\frac{\partial}{\partial r}## has unit length, the vector ##\frac{\partial}{\partial\theta}## has length ##r##. So the second component should be multiplied by ##r## if you want the components in the re-scaled basis of unit vectors ##\{\frac{\partial}{\partial r},\frac{1}{r}\frac{\partial}{\partial \theta}\}##, which gives the familiar ##\frac{1}{r}\frac{\partial S}{\partial\theta}##. Sometimes, especially in physics contexts, you might see these unit vectors called ##\hat{r}## and ##\hat{\theta}##, respectively.
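A small worked example may help here, assuming the hypothetical function ##S = r \sin\theta## (which is just ##S = y## in cartesian coordinates, so its gradient should have length 1 everywhere):

Covector components: ##\frac{\partial S}{\partial r} = \sin\theta##, ##\frac{\partial S}{\partial \theta} = r \cos\theta##

Vector components in the coordinate basis ##\{\frac{\partial}{\partial r},\frac{\partial}{\partial \theta}\}##: ##\left(\sin\theta, \; \frac{1}{r^2} \cdot r\cos\theta\right) = \left(\sin\theta, \; \frac{\cos\theta}{r}\right)##

Vector components in the unit basis ##\{\hat{r}, \hat{\theta}\}##: ##\left(\sin\theta, \; r \cdot \frac{\cos\theta}{r}\right) = \left(\sin\theta, \; \cos\theta\right)##

Squared length, computed with the metric in the coordinate basis: ##g_{rr} \sin^2\theta + g_{\theta\theta} \frac{\cos^2\theta}{r^2} = \sin^2\theta + \cos^2\theta = 1##

The three sets of components look different, but they all describe the same gradient.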

Edit: I just noticed that the rest of this thread is from 2018. In the future, consider starting a new thread instead.
 

1. What is a gradient?

The gradient of a scalar function describes how the function changes near a given point. In its vector form it points in the direction of the steepest increase of the function, and its magnitude is the rate of change in that direction.

2. How is the gradient related to the concept of a covariant derivative?

The covariant derivative generalizes the directional derivative from scalar functions to vector and tensor fields. Applied to a scalar function, it reduces to the ordinary partial derivatives, ##\nabla_j \phi = \frac{\partial \phi}{\partial x^j}##, which are the components of the covector gradient.

3. Why is it important to have a covariant gradient?

The covector gradient is defined without reference to a metric: acting on a vector ##\overrightarrow{V}##, it returns the directional derivative of the function along ##\overrightarrow{V}##, and that number does not depend on the coordinate system used. This makes it the natural form of the gradient in general coordinates, which is particularly useful in physics and engineering whenever curvilinear coordinates are involved.

4. How is the covariant gradient different from the ordinary gradient?

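For a scalar function, the components of the covector gradient are the partial derivatives ##\frac{\partial \phi}{\partial x^j}## in any coordinate system. The vector gradient is obtained from them with the inverse metric tensor, ##(\overrightarrow{\nabla \phi})^j = \sum_i g^{ij} \frac{\partial \phi}{\partial x^i}##. In Cartesian coordinates the two sets of components coincide; in curvilinear coordinates such as polar coordinates they do not.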

5. Can the covariant gradient be applied to any type of function?

The gradient as discussed here is defined for any differentiable scalar function. For vector and tensor fields, the analogous operation is the covariant derivative, which requires additional structure (a connection) beyond the partial derivatives.
