Alain De Vos said:
I learned about gradients in 3D space, where gradients were always vectors, pointing in the direction of steepest ... and normal to the surface where the function is constant.
But from reading about one-forms, a gradient of a function is not always a vector, and it has something to do with the metric... Can you prove this mathematically? Or give an example which disproves that a gradient is a vector? Or visualise it?
If it is not a vector, does a change of coordinate basis not change the coordinates in the "correct way"?
I will use the notation ##f{}_{,i}## for the ##i##th partial derivative of a function ##f:\mathbb R^n\to\mathbb R##. In differential geometry, partial derivatives are defined using coordinate systems (charts). If ##M## is a smooth manifold, ##p\in U\subseteq M##, and ##x:U\to\mathbb R^n## is a coordinate system, then we define ##\frac{\partial}{\partial x^i}\big|_p## by
$$\frac{\partial}{\partial x^i}\bigg|_p f =(f\circ x^{-1})_{,i}(x(p))$$ for all smooth functions ##f:M\to\mathbb R##. The notation ##\big(\frac{\partial}{\partial x^i}\big)_p## can be used instead of ##\frac{\partial}{\partial x^i}\big|_p##, and the notation ##\frac{\partial f(p)}{\partial x^i}## can be used instead of ##\frac{\partial}{\partial x^i}\big|_p f##.
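As a quick illustration of this definition (my own example, with the function and chart chosen just for convenience): take ##M=\mathbb R^2##, let ##U## be the open right half-plane, and let ##x:U\to\mathbb R^2## be polar coordinates, so that ##x^{-1}(r,\theta)=(r\cos\theta,r\sin\theta)##. For ##f(a,b)=ab## we get
$$(f\circ x^{-1})(r,\theta)=r^2\cos\theta\sin\theta=\tfrac 1 2 r^2\sin 2\theta,$$ and therefore, at a point ##p=(a,b)\in U## with ##x(p)=(r,\theta)##,
$$\frac{\partial}{\partial x^1}\bigg|_p f=r\sin 2\theta=\frac{2ab}{\sqrt{a^2+b^2}},\qquad \frac{\partial}{\partial x^2}\bigg|_p f=r^2\cos 2\theta=a^2-b^2.$$ These are different from the ordinary partial derivatives ##f_{,1}(a,b)=b## and ##f_{,2}(a,b)=a##, which are what the definition gives when the coordinate system is the identity map.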
The gradient of a differentiable function ##f:\mathbb R^n\to\mathbb R## is the function ##\nabla f:\mathbb R^n\to\mathbb R^n## defined by ##\nabla f (x)=(f_{,1}(x),\dots,f_{,n}(x))## for all ##x\in\mathbb R^n##. The right-hand side can be rewritten using the differential geometry definition of partial derivative, if we use the fact that the identity map on ##\mathbb R^n## is a coordinate system. I will denote the identity map by ##I##. We have
$$\nabla f(x)=(f_{,1}(x),\dots,f_{,n}(x)) =\left((f\circ I^{-1})_{,1}(I(x)),\dots,(f\circ I^{-1})_{,n}(I(x))\right) =\left(\frac{\partial}{\partial I^1}\bigg|_x f,\dots,\frac{\partial}{\partial I^n}\bigg|_x f \right).$$ This is the n-tuple of components of the cotangent vector ##(\mathrm df)_x## in the coordinate system ##I##, since
$$(\mathrm df)_x =\left((\mathrm df)_x \frac{\partial}{\partial I^i}\bigg|_x\right) (\mathrm dI^i)_x =\left(\frac{\partial}{\partial I^i}\bigg|_x f\right) (\mathrm dI^i)_x.$$ This is one reason to think of the gradient of ##f## as a cotangent vector, specifically as the cotangent vector ##(\mathrm df)_x##.
Another reason is that we can interpret the formula for ##\nabla f(x)## above as associating an n-tuple with each coordinate system. The right-hand side is the n-tuple associated with the coordinate system ##I##. To get the n-tuple associated with an arbitrary coordinate system ##J##, just make the substitution ##I\to J##. Now that we have an n-tuple associated with each coordinate system, we can investigate how they're related to each other, i.e. we can investigate how an n-tuple "transforms" under a change of coordinates. If you know that ##\big(\frac{\partial}{\partial J^1}\big|_x,\dots,\frac{\partial}{\partial J^n}\big|_x\big)## is an ordered basis for the tangent space at ##x##, and that "transforms covariantly" means "transforms in the same way as the ordered basis", then you should see that it follows almost immediately that our n-tuple ##\big(\frac{\partial}{\partial J^1}\big|_x f,\dots,\frac{\partial}{\partial J^n}\big|_x f\big)## "transforms covariantly" (unlike n-tuples of components of tangent vectors, which "transform contravariantly").
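If you want to see the covariant transformation numerically, here is a minimal sketch in plain Python. Everything in it is my own choice for illustration, not something fixed by the argument above: the test function ##f(a,b)=a^2b##, polar coordinates in the role of ##J##, the sample point, and the helper names. It computes the n-tuple in the coordinate system ##J## directly from the definition (partial derivatives of ##f\circ J^{-1}##), and then again by applying the matrix ##\partial I^j/\partial J^i## to the n-tuple in the coordinate system ##I##. That matrix is the same one that expresses the basis vectors ##\frac{\partial}{\partial J^i}\big|_x## in terms of the ##\frac{\partial}{\partial I^j}\big|_x##.

```python
import math


def f(a, b):
    # Hypothetical test function on R^2, chosen only for illustration.
    return a * a * b


def polar_to_cartesian(r, theta):
    # Inverse of the polar chart J: maps (r, theta) to the point (a, b).
    return (r * math.cos(theta), r * math.sin(theta))


def partial(g, point, i, h=1e-6):
    # Central finite-difference estimate of the i-th partial derivative of g.
    lo, hi = list(point), list(point)
    lo[i] -= h
    hi[i] += h
    return (g(*hi) - g(*lo)) / (2 * h)


def components_identity_chart(a, b):
    # n-tuple associated with the identity chart I: (f_{,1}, f_{,2}).
    return [partial(f, (a, b), i) for i in range(2)]


def components_polar_chart(r, theta):
    # n-tuple associated with J: partial derivatives of f o J^{-1}.
    def f_in_polar(r_, theta_):
        return f(*polar_to_cartesian(r_, theta_))

    return [partial(f_in_polar, (r, theta), i) for i in range(2)]


def jacobian(r, theta):
    # jac[j][i] = dI^j / dJ^i, i.e. derivatives of (a, b) w.r.t. (r, theta).
    return [
        [math.cos(theta), -r * math.sin(theta)],
        [math.sin(theta), r * math.cos(theta)],
    ]


def transform_covariantly(components, r, theta):
    # Same rule that turns the basis d/dI^j into the basis d/dJ^i.
    jac = jacobian(r, theta)
    return [sum(components[j] * jac[j][i] for j in range(2)) for i in range(2)]


r0, theta0 = 2.0, 0.7  # arbitrary sample point, given in polar coordinates
a0, b0 = polar_to_cartesian(r0, theta0)

tuple_in_I = components_identity_chart(a0, b0)
tuple_in_J_direct = components_polar_chart(r0, theta0)
tuple_in_J_transformed = transform_covariantly(tuple_in_I, r0, theta0)

print("direct:     ", tuple_in_J_direct)
print("transformed:", tuple_in_J_transformed)
```

The two printed pairs should agree up to finite-difference error. This is just the chain rule, but it shows that the n-tuple of partial derivatives changes with the same matrix as the ordered basis does.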