Why is grad(f) a covariant vector

In summary, the gradient of a function in differential geometry is defined using both a coordinate system and conventional partial derivatives. It can be interpreted as partial derivatives in the sense of differential geometry, and its transformation under a change of coordinates is covariant. However, coordinate n-tuples do not always transform contravariantly, and this is addressed in the general definition of the gradient as vector-space duals to vector fields using a non-degenerate bilinear form.
  • #1
Benjam:n
Take R2. Take a function f(x,y) defined on R2 which maps every point to a real number. The gradient of this at any point means a vector which points in the direction of steepest incline. The magnitude of the vector is the value of the derivative of the function in that direction. Both of these things are very real. This vector is solid and is surely there, so why doesn't it transform contravariantly? I had a go at exploring this: take the x coordinates as Cartesian and the x bar coordinates as polar. Then define the function f(x,y) as x^2 + y^2. If you work out the gradient vector and transform it contravariantly, it does give you (2r, 0), which is the gradient vector relative to polars.
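The experiment described above can be repeated with a function that is not purely radial. The sketch below is only an illustration: the test function f(x, y) = y and all helper names are assumptions, not part of the original post. It applies both transformation rules to the Cartesian partial derivatives and compares the results with the polar partial derivatives (df/dr, df/dtheta).

```python
import math

def grad_cartesian(x, y):
    # Assumed test function f(x, y) = y, whose Cartesian partials are (0, 1).
    return (0.0, 1.0)

def polar_of(x, y):
    # Polar coordinates (r, theta) of the Cartesian point (x, y).
    return math.hypot(x, y), math.atan2(y, x)

def transform_covariant(g, r, theta):
    # Uses the matrix d(x, y)/d(r, theta), the same one that maps basis vectors.
    gx, gy = g
    comp_r = math.cos(theta) * gx + math.sin(theta) * gy
    comp_theta = -r * math.sin(theta) * gx + r * math.cos(theta) * gy
    return comp_r, comp_theta

def transform_contravariant(g, r, theta):
    # Uses the inverse matrix d(r, theta)/d(x, y).
    gx, gy = g
    comp_r = math.cos(theta) * gx + math.sin(theta) * gy
    comp_theta = (-math.sin(theta) * gx + math.cos(theta) * gy) / r
    return comp_r, comp_theta

def polar_partials(r, theta):
    # f = y = r sin(theta), so df/dr = sin(theta) and df/dtheta = r cos(theta).
    return math.sin(theta), r * math.cos(theta)

if __name__ == "__main__":
    r, theta = polar_of(1.0, 2.0)
    g = grad_cartesian(1.0, 2.0)
    print("covariant rule:    ", transform_covariant(g, r, theta))
    print("contravariant rule:", transform_contravariant(g, r, theta))
    print("polar partials:    ", polar_partials(r, theta))
```

The r-components agree under both rules, while the theta-components differ by a factor of r^2. For f = x^2 + y^2 the theta-component is zero, which is why that particular example cannot tell the two rules apart.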
 
  • #2
The gradient of a function ##f:\mathbb R^n\to\mathbb R## is the function ##\nabla f:\mathbb R^n\to\mathbb R^n## defined by
$$\nabla f(x)=(f_{,1}(x),\dots,f_{,n}(x)),$$ for all ##x\in\mathbb R^n##. For each ##i\in\{1,\dots,n\}##, ##f_{,i}## denotes the ith partial derivative of f. In differential geometry, partial derivatives are defined using both a coordinate system and the conventional type of partial derivatives. For example, if ##x:U\to\mathbb R^n## is a coordinate system on ##U\subseteq\mathbb R^n##, and ##p\in U##, then for all ##i\in\{1,\dots,n\}##, we have
$$\frac{\partial}{\partial x^i}\bigg|_p f= (f\circ x^{-1})_{,i}(x(p)).$$ This statement defines the notation on the left.

The conventional partial derivatives in a gradient can be interpreted as partial derivatives in the sense of differential geometry, if we use the fact that the identity map ##I##, defined by ##I(x)=x## for all ##x\in\mathbb R^n##, is a coordinate system. We have
$$\frac{\partial}{\partial I^i}\bigg|_p f = (f\circ I^{-1})_{,i}(I(p)) = f_{,i}(p).$$ To see how partial derivatives in the sense of differential geometry transform under a change of coordinates ##x\to y##, we need to use the chain rule:
\begin{align}
\frac{\partial}{\partial y^i}\bigg|_p f &=(f\circ y^{-1})_{,i}(y(p)) = (f\circ x^{-1}\circ x\circ y^{-1})_{,i}(y(p))= (f\circ x^{-1})_{,j} \big((x\circ y^{-1})(y(p))\big)\, (x\circ y^{-1})^j{}_{,i}(y(p))\\
& = (x\circ y^{-1})^j{}_{,i}(y(p)) \frac{\partial}{\partial x^j}\bigg|_p f.
\end{align} Is the transformation
$$\frac{\partial}{\partial x^i}\bigg|_p \to \frac{\partial}{\partial y^i}\bigg|_p =(x\circ y^{-1})^j{}_{,i}(y(p)) \frac{\partial}{\partial x^j}\bigg|_p$$ covariant or contravariant? Well, "covariant" means that the components transform the same way as the basis vectors, but the partial derivative functionals ##\frac{\partial}{\partial x^i}\big|_p## are the basis vectors (of the tangent space at p) associated with the coordinate system x. So the transformation is by definition covariant.
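The chain-rule identity above can be checked numerically. The following is a minimal sketch under stated assumptions: x is taken to be the identity (Cartesian) coordinate system, y the polar one, and the test function f(x, y) = x y^2 and the helper names are hypothetical choices made only for illustration. Partial derivatives are estimated with central differences.

```python
import math

def f(x, y):
    # Assumed test function, not taken from the thread.
    return x * y**2

def y_inv(r, theta):
    # x o y^{-1}: maps the polar n-tuple to the Cartesian n-tuple
    # (x is the identity coordinate system here).
    return (r * math.cos(theta), r * math.sin(theta))

def partial(g, args, i, h=1e-6):
    # Central-difference estimate of the i-th partial derivative of g at args.
    up = list(args)
    dn = list(args)
    up[i] += h
    dn[i] -= h
    return (g(*up) - g(*dn)) / (2 * h)

def lhs(r, theta, i):
    # (f o y^{-1})_{,i}(y(p)): differentiate f expressed in polar coordinates.
    return partial(lambda a, b: f(*y_inv(a, b)), (r, theta), i)

def rhs(r, theta, i):
    # (x o y^{-1})^j_{,i}(y(p)) * f_{,j}(p), summed over j.
    p = y_inv(r, theta)
    total = 0.0
    for j in range(2):
        jac_ji = partial(lambda a, b: y_inv(a, b)[j], (r, theta), i)
        total += jac_ji * partial(f, p, j)
    return total

if __name__ == "__main__":
    for i in range(2):
        print(i, lhs(2.0, 0.7, i), rhs(2.0, 0.7, i))
```

Both sides should agree up to finite-difference error, which is the statement that the components of the gradient pick up the same matrix as the basis vectors.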

I guess this changes the question to why the coordinate n-tuple ##(x^1(p),\cdots,x^n(p))## that a coordinate system x associates with a point ##p\in\mathbb R^n## transforms contravariantly. It doesn't always. Under the coordinate change ##x\to y##, ##x(p)## changes to
$$y(p)=(y\circ x^{-1}\circ x)(p)= (y\circ x^{-1})(x(p)).$$ To proceed from here, an assumption is necessary. We assume that ##y\circ x^{-1}## is a linear bijection from ##\mathbb R^n## to ##\mathbb R^n## (for example a rotation or a Lorentz transformation). The ##i##th component of the matrix equation corresponding to the above (see https://www.physicsforums.com/showthread.php?t=694922 if you don't understand that concept) is
$$(y(p))^i = (y\circ x^{-1})^i{}_j (x(p))^j.$$ Let T be an arbitrary linear bijection from ##\mathbb R^n## to ##\mathbb R^n##. For all ##x\in\mathbb R^n## (apologies for using the symbol x for a second purpose), we have
\begin{align}
&T^i(x)=T^i{}_j x^j\\
&T^i_{,k}(x)=T^i{}_j \delta^j_k =T^i{}_k.
\end{align} This implies that ##(y\circ x^{-1})^i{}_{,j}(x(p)) =(y\circ x^{-1})^i{}_j##. So we have
$$y^i(p)=(y(p))^i =(y\circ x^{-1})^i{}_j (x(p))^j = (y\circ x^{-1})^i{}_{,j}(x(p))\, x^j(p).$$ As you can see, the numbers ##(y\circ x^{-1})^i{}_{,j}(x(p))## that appear in this transformation equation are not the same as the numbers ##(x\circ y^{-1})^j{}_{,i}(y(p))## that appear in the transformation equation for the components of the gradient. However, we have
\begin{align}
\delta^j_k &=I^j{}_{,k}(x(p))= (x\circ y^{-1}\circ y\circ x^{-1})^j{}_{,k}(x(p)) =(x\circ y^{-1})^j{}_{,i}(y(p)) (y\circ x^{-1})^i{}_{,k}(x(p)).
\end{align} This is how we see that coordinate n-tuples transform contravariantly, i.e. using the inverse of the matrix that's used to transform the basis vectors.
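For a linear change of coordinates, the inverse relationship can be seen with plain 2x2 matrices. The sketch below assumes a hypothetical shear matrix T playing the role of ##y\circ x^{-1}##; the sample point, the sample gradient components and the helper names are likewise made up for illustration. Coordinate tuples are multiplied by T, gradient components by the inverse of T, and the sum "gradient component times coordinate" comes out the same in both coordinate systems.

```python
def matmul(a, b):
    # Product of two 2x2 matrices stored as nested lists.
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse_2x2(m):
    # Inverse of a 2x2 matrix via the adjugate formula.
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def transform_coordinates(T, x):
    # y^i = T^i_j x^j (contravariant rule).
    return [sum(T[i][j] * x[j] for j in range(2)) for i in range(2)]

def transform_gradient(T_inv, g):
    # (df/dy^i) = (T^{-1})^j_i (df/dx^j) (covariant rule).
    return [sum(g[j] * T_inv[j][i] for j in range(2)) for i in range(2)]

def pairing(g, x):
    # Sum over i of g_i x^i.
    return sum(gi * xi for gi, xi in zip(g, x))

# Assumed example data: a shear, which is linear and invertible but not orthogonal.
T = [[2.0, 1.0], [0.0, 1.0]]
T_inv = inverse_2x2(T)
x_coords = [3.0, 4.0]
grad_x = [1.0, -2.0]

if __name__ == "__main__":
    print(matmul(T_inv, T))  # the identity matrix, i.e. delta^j_k
    y_coords = transform_coordinates(T, x_coords)
    grad_y = transform_gradient(T_inv, grad_x)
    print(pairing(grad_x, x_coords), pairing(grad_y, y_coords))
```

A non-orthogonal matrix is used on purpose: for a rotation, the inverse equals the transpose, so the two rules would be hard to tell apart numerically.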
 
  • #3
I wonder if the general definition of the gradient as (vector-space) duals to vector fields (using the Riemannian metric as a non-degenerate bilinear form to make the isomorphism V-->V* natural) is done to address this issue, i.e., to make the gradient locally independent (within a chart) of coordinate changes. Does anyone know?
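As a rough illustration of the metric-based definition, consider polar coordinates ##(r,\theta)## on ##\mathbb R^2## with the standard Euclidean metric. This choice of example is an assumption made here; the thread itself does not work it out. The gradient vector is obtained from the partial derivatives by raising the index with the inverse metric:
$$g_{ij}=\begin{pmatrix}1&0\\0&r^2\end{pmatrix},\qquad (\nabla f)^i=g^{ij}\,\partial_j f \quad\Rightarrow\quad (\nabla f)^r=\frac{\partial f}{\partial r},\qquad (\nabla f)^\theta=\frac{1}{r^2}\frac{\partial f}{\partial\theta}.$$
For ##f=x^2+y^2=r^2## both the covariant components ##(\partial_r f,\partial_\theta f)## and the contravariant components ##((\nabla f)^r,(\nabla f)^\theta)## equal ##(2r,0)##, which is consistent with the result reported in post #1. For a function with ##\partial_\theta f\neq 0## the two sets of components differ by the factor ##1/r^2## in the ##\theta## slot.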
 

Related to Why is grad(f) a covariant vector

1. What does it mean for grad(f) to be a covariant vector?

Grad(f) being a covariant vector means that, when the coordinate system is changed, its components transform with the same matrix as the coordinate basis vectors, rather than with the inverse matrix, as the components of a contravariant vector do. These transformation laws allow for consistent calculations and interpretations of the vector in different coordinate systems.

2. Why is it important for grad(f) to be a covariant vector?

It is important for grad(f) to be a covariant vector because it allows for a consistent and meaningful interpretation of the vector in different coordinate systems. Without this property, the vector would behave differently in different coordinate systems, making it difficult to analyze and interpret its values and properties.

3. How does grad(f) transform when the coordinate system is changed?

The components of grad(f) are the partial derivatives of f with respect to the coordinates, and under a change of coordinates ##x\to y## they transform by the chain rule: ##\frac{\partial f}{\partial y^i}=\frac{\partial x^j}{\partial y^i}\frac{\partial f}{\partial x^j}##, with a sum over j. The familiar textbook gradient in curvilinear coordinates, such as polar or spherical coordinates, also involves the scale factors of the coordinate system, because it is written in terms of unit basis vectors rather than the coordinate basis.

4. What are the implications of grad(f) being a covariant vector in physics?

The covariant nature of grad(f) has important implications in physics, particularly in the study of fields and their potentials. For example, in electrostatics the electric field is minus the gradient of the scalar potential, ##\mathbf E=-\nabla\phi##, so its components ##-\partial\phi/\partial x^i## transform covariantly. This allows for consistent calculations and interpretations of these quantities in different coordinate systems.

5. Can a vector be both covariant and contravariant?

Not quite. A given set of components is either covariant or contravariant, but when a metric is available the same geometric object can be described by either kind: contravariant components ##v^i## and covariant components ##v_i=g_{ij}v^j## are related by lowering and raising the index. A mixed tensor is something different: it has at least one upper and one lower index, for example ##T^i{}_j##. The stress-energy tensor in general relativity is a rank-2 tensor that can be written with both indices up, both down, or in mixed form.
