Why is grad(f) a covariant vector

  • Context: Graduate
  • Thread starter: Benjam:n
  • Tags: Covariant Vector
SUMMARY

The discussion centers on the nature of the gradient of a function f defined on R² and its transformation properties under coordinate changes. The gradient vector, denoted as ∇f, is established as a covariant vector because it transforms in the same manner as the basis vectors of the tangent space. The transformation of the coordinate n-tuple, however, is shown to be contravariant, as it involves the inverse of the transformation matrix used for the gradient. This distinction is crucial for understanding the behavior of gradients in differential geometry.

PREREQUISITES
  • Understanding of gradient vectors and their properties in calculus.
  • Familiarity with coordinate transformations in differential geometry.
  • Knowledge of partial derivatives and their notation.
  • Basic concepts of linear algebra, particularly linear bijections.
NEXT STEPS
  • Study the properties of covariant and contravariant vectors in differential geometry.
  • Learn about the implications of the chain rule in coordinate transformations.
  • Explore the concept of vector spaces and dual spaces in the context of Riemannian metrics.
  • Investigate the role of the gradient in various coordinate systems, including polar coordinates.
USEFUL FOR

Mathematicians, physicists, and students of differential geometry who are interested in the transformation properties of vectors and gradients in various coordinate systems.

Benjam:n
Take R². Take a function f(x,y) defined on R² which maps every point to a real number. The gradient of this at any point means a vector which points in the direction of steepest incline. The magnitude of the vector is the value of the derivative of the function in that direction. Both of these things are very real. This vector is solid and is surely there, so why doesn't it transform contravariantly? To explore this, I took the x coordinates to be Cartesian and the x-bar coordinates to be polar, and defined the function f(x,y) as x^2 + y^2. If you work out the gradient vector and transform it contravariantly, it does give you (2r, 0), which is the gradient vector relative to polar coordinates.
 
The gradient of a function ##f:\mathbb R^n\to\mathbb R## is the function ##\nabla f:\mathbb R^n\to\mathbb R^n## defined by
$$\nabla f(x)=(f_{,1}(x),\dots,f_{,n}(x)),$$ for all ##x\in\mathbb R^n##. For each ##i\in\{1,\dots,n\}##, ##f_{,i}## denotes the ith partial derivative of f. In differential geometry, partial derivatives are defined using both a coordinate system and the conventional type of partial derivatives. For example, if ##x:U\to\mathbb R^n## is a coordinate system on ##U\subseteq\mathbb R^n##, and ##p\in U##, then for all ##i\in\{1,\dots,n\}##, we have
$$\frac{\partial}{\partial x^i}\bigg|_p f= (f\circ x^{-1})_{,i}(x(p)).$$ This statement defines the notation on the left.

The conventional partial derivatives in a gradient can be interpreted as partial derivatives in the sense of differential geometry, if we use the fact that the identity map ##I##, defined by ##I(x)=x## for all ##x\in\mathbb R^n##, is a coordinate system. We have
$$\frac{\partial}{\partial I^i}\bigg|_p f = (f\circ I^{-1})_{,i}(I(p)) = f_{,i}(p).$$ To see how partial derivatives in the sense of differential geometry transform under a change of coordinates ##x\to y##, we need to use the chain rule:
\begin{align}
\frac{\partial}{\partial y^i}\bigg|_p f &=(f\circ y^{-1})_{,i}(y(p)) = (f\circ x^{-1}\circ x\circ y^{-1})_{,i}(y(p))= (f\circ x^{-1})_{,j} \big((x\circ y^{-1})(y(p))\big)\, (x\circ y^{-1})^j{}_{,i}(y(p))\\
& = (x\circ y^{-1})^j{}_{,i}(y(p)) \frac{\partial}{\partial x^j}\bigg|_p f.
\end{align} Is the transformation
$$\frac{\partial}{\partial x^i}\bigg|_p \to \frac{\partial}{\partial y^i}\bigg|_p =(x\circ y^{-1})^j{}_{,i}(y(p)) \frac{\partial}{\partial x^j}\bigg|_p$$ covariant or contravariant? Well, "covariant" means that the components transform the same way as the basis vectors, but the partial derivative functionals ##\frac{\partial}{\partial x^i}\big|_p## are the basis vectors (of the tangent space at p) associated with the coordinate system x. So the transformation is by definition covariant.
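As a rough numerical check of the transformation law above, here is a minimal Python sketch. It assumes ##x## is the Cartesian (identity) coordinate system and ##y## is polar coordinates, so that ##x\circ y^{-1}## maps ##(r,\theta)## to ##(r\cos\theta, r\sin\theta)##. The test function ##f(x,y)=xy## and all function names are hypothetical choices made for illustration, not something taken from the thread.

```python
import math

def f(x, y):
    # Hypothetical test function, chosen only for illustration.
    return x * y

def grad_cartesian(x, y):
    # Conventional partials (f_{,1}, f_{,2}) of f(x, y) = x*y.
    return (y, x)

def jacobian_cartesian_wrt_polar(r, theta):
    # Entry [j][i] is the partial derivative of Cartesian coordinate j
    # with respect to polar coordinate i, for x = r cos(theta), y = r sin(theta).
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta),  r * math.cos(theta)]]

def polar_components_by_chain_rule(r, theta):
    x, y = r * math.cos(theta), r * math.sin(theta)
    g = grad_cartesian(x, y)
    J = jacobian_cartesian_wrt_polar(r, theta)
    # Sum over j of J[j][i] * g[j]: the covariant transformation law.
    return tuple(sum(J[j][i] * g[j] for j in range(2)) for i in range(2))

def polar_components_by_differencing(r, theta, h=1e-6):
    # Central differences of f written directly in polar coordinates.
    F = lambda r_, t_: f(r_ * math.cos(t_), r_ * math.sin(t_))
    d_r = (F(r + h, theta) - F(r - h, theta)) / (2 * h)
    d_theta = (F(r, theta + h) - F(r, theta - h)) / (2 * h)
    return (d_r, d_theta)

if __name__ == "__main__":
    print(polar_components_by_chain_rule(2.0, 0.7))
    print(polar_components_by_differencing(2.0, 0.7))
```

The two printed pairs should agree up to finite-difference error, which is what the chain-rule formula predicts for the components ##\frac{\partial}{\partial y^i}\big|_p f##.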

I guess this changes the question to why the coordinate n-tuple ##(x^1(p),\cdots,x^n(p))## that a coordinate system x associates with a point ##p\in\mathbb R^n## transforms contravariantly. It doesn't always. Under the coordinate change ##x\to y##, ##x(p)## changes to
$$y(p)=(y\circ x^{-1}\circ x)(p)= (y\circ x^{-1})(x(p)).$$ To proceed from here, an assumption is necessary. We assume that ##y\circ x^{-1}## is a linear bijection from ##\mathbb R^n## to ##\mathbb R^n## (for example a rotation or a Lorentz transformation). The ##i##th component of the matrix equation corresponding to the above (see https://www.physicsforums.com/showthread.php?t=694922 if you don't understand that concept) is
$$(y(p))^i = (y\circ x^{-1})^i{}_j (x(p))^j.$$ Let T be an arbitrary linear bijection from ##\mathbb R^n## to ##\mathbb R^n##. For all ##x\in\mathbb R^n## (apologies for using the symbol x for a second purpose), we have
\begin{align}
&T^i(x)=T^i{}_j x^j\\
&T^i{}_{,k}(x)=T^i{}_j \delta^j_k =T^i{}_k.
\end{align} This implies that ##(y\circ x^{-1})^i{}_{,j}(x(p)) =(y\circ x^{-1})^i{}_j##. So we have
$$y^i(p)=(y(p))^i =(y\circ x^{-1})^i{}_j (x(p))^j = (y\circ x^{-1})^i{}_{,j}(x(p))\, x^j(p).$$ As you can see, the numbers ##(y\circ x^{-1})^i{}_{,j}(x(p))## that appear in this transformation equation are not the same as the numbers ##(x\circ y^{-1})^j{}_{,i}(y(p))## that appear in the transformation equation for the components of the gradient. However, we have
\begin{align}
\delta^j_k &=I^j{}_{,k}(x(p))= (x\circ y^{-1}\circ y\circ x^{-1})^j{}_{,k}(x(p)) =(x\circ y^{-1})^j{}_{,i}(y(p)) (y\circ x^{-1})^i{}_{,k}(x(p)).
\end{align} This is how we see that coordinate n-tuples transform contravariantly, i.e. using the inverse of the matrix that's used to transform the basis vectors.
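
The linear case can be sketched numerically as well. In the following Python snippet the matrix `T` stands in for ##y\circ x^{-1}##; its entries, the sample coordinates `x_p` and the sample gradient components `df_x` are made-up values for illustration only. The matrix is deliberately not orthogonal, so that the inverse is visibly different from the transpose.

```python
def inverse_2x2(M):
    # Inverse of a 2x2 matrix given as nested lists.
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul(A, B):
    # Product of two 2x2 matrices.
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Hypothetical matrix of the linear bijection y∘x^{-1}.
T = [[2.0, 1.0], [0.0, 1.0]]
T_inv = inverse_2x2(T)  # matrix of x∘y^{-1}

def transform_coordinates(x_p):
    # Contravariant law: y^i(p) = T^i_j x^j(p).
    return [sum(T[i][j] * x_p[j] for j in range(2)) for i in range(2)]

def transform_gradient_components(df_x):
    # Covariant law: d/dy^i f = (T^{-1})^j_i d/dx^j f.
    return [sum(T_inv[j][i] * df_x[j] for j in range(2)) for i in range(2)]

x_p = [3.0, -1.0]   # made-up coordinate pair x(p)
df_x = [4.0, 7.0]   # made-up gradient components in the x system

y_p = transform_coordinates(x_p)
df_y = transform_gradient_components(df_x)

# Contracting gradient components with the coordinate pair gives the
# same number in both systems, because the two matrices are inverses.
pairing_x = sum(df_x[i] * x_p[i] for i in range(2))
pairing_y = sum(df_y[i] * y_p[i] for i in range(2))

if __name__ == "__main__":
    print(matmul(T_inv, T))
    print(pairing_x, pairing_y)
```

The design point is only that one law uses `T` and the other uses `T_inv`; any invertible matrix could be substituted for the one assumed here.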
 
I wonder if the general definition of the gradient as the (vector-space) dual of a vector field (using the Riemannian metric as a non-degenerate bilinear form to make the isomorphism V-->V* natural) is done to address this issue, i.e., to make the gradient locally independent (within a chart) of coordinate changes. Does anyone know?
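
One way to probe the role of the metric numerically is the sketch below. It assumes the standard Euclidean metric on the plane, whose polar-coordinate components are diag(1, r^2), and again uses the made-up test function ##f(x,y)=xy##. It compares two things: the covariant polar components with the index raised by the inverse metric, and the Cartesian gradient transformed contravariantly. All function names are hypothetical.

```python
import math

def cartesian_gradient(x, y):
    # Partials of the illustrative function f(x, y) = x*y.
    return (y, x)

def polar_covariant_components(r, theta):
    # (df/dr, df/dtheta) for f = r^2 sin(theta) cos(theta).
    s, c = math.sin(theta), math.cos(theta)
    return (2.0 * r * s * c, r * r * (c * c - s * s))

def raise_index_polar(components, r):
    # Inverse Euclidean metric in polar coordinates: diag(1, 1/r^2).
    return (components[0], components[1] / (r * r))

def contravariant_transform_of_cartesian_gradient(r, theta):
    s, c = math.sin(theta), math.cos(theta)
    gx, gy = cartesian_gradient(r * c, r * s)
    # Jacobian of polar w.r.t. Cartesian coordinates:
    # dr/dx = cos, dr/dy = sin, dtheta/dx = -sin/r, dtheta/dy = cos/r.
    return (c * gx + s * gy, (-s * gx + c * gy) / r)

if __name__ == "__main__":
    r, theta = 2.0, 0.7
    print(raise_index_polar(polar_covariant_components(r, theta), r))
    print(contravariant_transform_of_cartesian_gradient(r, theta))
```

Under these assumptions the two results agree, while the unraised covariant components differ from them by a factor of r^2 in the angular slot. For f = x^2 + y^2 the angular component is zero, so both laws give (2r, 0) and the distinction is hidden.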
 
