I've had my difficulties with this, but fundamentally it's
not too hard.
In the small limit, there are only derivatives and the geometry is
necessarily linear: a vector space with its dual. Any point of
a manifold must, by definition, have a neighbourhood which can
be mapped to an n-dimensional coordinate system, and any such
system gives us a vector space in the limit. There are two
primary types of vector: tangent vectors, and gradients or
cotangent vectors; we say they comprise the tangent space and
cotangent space at the point, respectively.
Suppose for example our coordinates are (x,y,z). Basis vectors
of the tangent space are unit vectors along the coordinate
axes; basis vectors of the cotangent space are gradients of
linear functions whose level surfaces are parallel to the
coordinate planes.
E.g. the unit vector \partial_x has components (1,0,0).
It is the tangent to the parametric curve
(t,0,0),
or any other curve
(t,0,0) + O(>1) in t.
Because we have a vector space, curves can be added to make
(v_1, v_2, v_3) t + O(>1),
and they all have the same tangent,
(v_1, v_2, v_3).
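As a quick numerical sanity check of the claim that curves differing only at higher order share a tangent, here is a minimal Python sketch. The helper name tangent_at_zero, the sample vector v and the particular second-order terms are all made up for illustration; none of them come from the discussion above.

```python
# Illustrative sketch only: names and numbers are hypothetical.

def tangent_at_zero(curve, h=1e-6):
    """Central-difference estimate of d(curve)/dt at t = 0."""
    forward = curve(h)
    backward = curve(-h)
    return tuple((f - b) / (2 * h) for f, b in zip(forward, backward))

v = (2.0, -1.0, 0.5)

def straight(t):
    # The straight line v t through the origin.
    return tuple(vi * t for vi in v)

def bent(t):
    # The same line plus arbitrary second-order terms: v t + O(t^2).
    return (v[0] * t + 3.0 * t**2,
            v[1] * t - 1.0 * t**2,
            v[2] * t + 5.0 * t**2)

print(tangent_at_zero(straight))
print(tangent_at_zero(bent))
```

Both printed triples should agree with v up to rounding. The central difference cancels the t^2 terms; cubic or higher odd-order terms would only contribute at order h^2.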
The unit covector dx has components (1,0,0). It is just
the gradient of the function x, or any other function
x + O(>1) in (x,y,z).
Functions can be added, and so can gradients because differentiation is
linear, so for example the function
f(x,y,z) = a x + b y + c z + O(>1)
has gradient components (a,b,c).
The components of a gradient are directional derivatives
along the axes. Let's play this out in full. The derivative
of f in the x direction is
Limit(t->0) f(t, 0, 0)/t =
Limit(t->0) ( a t + O(>1) )/t =
a
Another way of stating this, obviously, is
\partial f/\partial x = (1, 0, 0) \cdot (a, b, c)
Note that because all functions look like linear functions in the
small limit, the working above always comes out as a scalar
product
(a, b, c) \cdot (v_1, v_2, v_3)
between a gradient vector and a tangent vector. This is the sense
of the MTW picture of "counting the number of level surfaces
pierced by a curve".
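The scalar-product statement can be checked numerically. The following is only a sketch: the coefficients (a, b, c) = (3, -2, 5), the extra nonlinear terms in f, and the helper names directional_derivative and pairing are assumptions chosen for the example.

```python
# Illustrative sketch only: coefficients and helper names are hypothetical.

def f(x, y, z):
    # a x + b y + c z with (a, b, c) = (3, -2, 5), plus higher-order terms.
    return 3.0 * x - 2.0 * y + 5.0 * z + x * y + z**2

gradient_at_origin = (3.0, -2.0, 5.0)  # the (a, b, c) read off from f

def directional_derivative(func, v, h=1e-6):
    """Central-difference estimate of d/dt func(v t) at t = 0."""
    plus = func(*(vi * h for vi in v))
    minus = func(*(-vi * h for vi in v))
    return (plus - minus) / (2 * h)

def pairing(covector, vector):
    """Scalar product of gradient components with tangent components."""
    return sum(c * vi for c, vi in zip(covector, vector))

v = (1.0, 2.0, -1.0)
numeric = directional_derivative(f, v)
exact = pairing(gradient_at_origin, v)  # 3*1 + (-2)*2 + 5*(-1)

print(numeric, exact)
```

The two numbers should agree up to rounding, and choosing v = (1, 0, 0) recovers the x component a of the gradient.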
Let's get this into standard terminology now.
The basis cotangent vectors are the gradients of the coordinate
functions: dx, dy, dz, or dx^\mu (\mu = 1,2,3).
Any gradient value of a function is a linear combination of these,
e.g.
df = a dx + b dy + c dz
The basis tangent vectors are tangents to curves aligned with
the coordinate axes: \partial_x, \partial_y, \partial_z, or \partial_{x^\mu}.
Any tangent vector is a linear combination of these, e.g.
v = v_1 \partial_x + v_2 \partial_y + v_3 \partial_z.
Components of gradients are partial derivatives with respect to the coordinates.
Partial derivatives are directional derivatives along the coordinate
axes.
Directional derivatives are scalar products of tangents with
gradients.
Thus for example
\partial f / \partial x = [ trad. notation ]
< \partial_x, df > = [ scalar product ]
a
It follows from the expansion of df and the scalar product,
that
< \partial_{x^\mu}, dx^\nu > = \delta_\mu^\nu
Fundamentally, that's ALL there is to it. Tangents and
gradients are dual, and their scalar products are partial
derivatives.
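One way to make the duality concrete is to model tangent vectors as component tuples and covectors as linear functions acting on them. This is a sketch under that modelling assumption; the Python names (pairing, tangent_basis and so on) and the sample coefficients are hypothetical.

```python
# Illustrative sketch only: a toy model of the pairing < v, covector >.

# Basis tangent vectors, stored as plain component tuples.
partial_x = (1.0, 0.0, 0.0)
partial_y = (0.0, 1.0, 0.0)
partial_z = (0.0, 0.0, 1.0)

# Basis covectors, modelled as linear functions on tangent vectors.
def dx(v):
    return v[0]

def dy(v):
    return v[1]

def dz(v):
    return v[2]

def pairing(v, covector):
    """< v, covector >: evaluate the covector on the tangent vector."""
    return covector(v)

tangent_basis = [partial_x, partial_y, partial_z]
cotangent_basis = [dx, dy, dz]

# < \partial_{x^\mu}, dx^\nu > for all mu, nu: rows are mu, columns are nu.
delta = [[pairing(e, w) for w in cotangent_basis] for e in tangent_basis]

# df = a dx + b dy + c dz with sample coefficients.
a, b, c = 3.0, -2.0, 5.0

def df(v):
    return a * dx(v) + b * dy(v) + c * dz(v)

# Pairing df with each basis tangent picks out one partial derivative.
components = [pairing(e, df) for e in tangent_basis]

print(delta)
print(components)
```

Here delta comes out as the identity matrix and components as [a, b, c], which is the expansion of df read back through the scalar product.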
But there is an extra refinement of the notation, which can
be very helpful in some cases. Since a tangent vector can
operate on a function, by making a scalar product with its
gradient, so as to calculate a directional derivative, we
can identify the tangent with just that directional
derivative operation; i.e.
v(f) = < v, df >
This is also written
\partial_v f = < v, df >
I don't care for this notation, because it overloads the
\partial symbol, which we already used for the basis tangent
vectors, \partial_{x^\mu}.
It would be OK to write
\partial_x (f) = < \partial_x, df >
but to be consistent, we should write the second form as
\partial_{\partial_x} f = < \partial_x, df >
It's probably not a crippling confusion, but if I happen to
see \partial_u somewhere, I need to check whether u is a
point-function or a tangent, in order to interpret whether
\partial_u is a basis vector or a derivative in the basis
direction.
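The distinction between a tangent vector and the derivative operator built from it, via v(f) = < v, df >, can be kept explicit in code. This is a rough sketch: the helper names gradient and as_operator, the sample function f and the finite-difference step are assumptions for illustration.

```python
# Illustrative sketch only: helper names and the sample f are hypothetical.

def gradient(func, point, h=1e-6):
    """Central-difference estimate of the gradient components at point."""
    comps = []
    for i in range(len(point)):
        up = list(point)
        down = list(point)
        up[i] += h
        down[i] -= h
        comps.append((func(*up) - func(*down)) / (2 * h))
    return tuple(comps)

def as_operator(v, point):
    """Turn the tangent vector v at point into the map f -> < v, df >."""
    def apply(func):
        df = gradient(func, point)
        return sum(vi * gi for vi, gi in zip(v, df))
    return apply

point = (0.0, 0.0, 0.0)
partial_x = (1.0, 0.0, 0.0)            # basis tangent vector: plain data
d_dx = as_operator(partial_x, point)   # derivative operator: a callable

def f(x, y, z):
    return 4.0 * x + 7.0 * y - z + x * z

print(d_dx(f))  # close to 4, the x component of df at the origin
```

The tuple partial_x and the callable d_dx carry the same information, but only the second can be applied to a function, which is the distinction being argued for here.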
Just while I'm whinging, and this is purely pedagogical,
I've wasted crucial hours trying to understand how a
directional derivative operator can be something of dual
type to a gradient. How did the scalar product work? Should
I be applying one to the other? Major angst, and completely
misconceived.
(1) Tangent vectors are dual to gradients.
(2) Directional derivatives are applied to functions.
Yes I know that tangent vectors and d.d. operators are
isomorphic; nevertheless, they are to be distinguished.
At least in teaching!
Apologies: This probably reads way too assertively. I'm just putting down
statements as explicitly as I can, so I don't lose track.
Appreciations: Excellent explanations, lethe. Please keep it up. I'm very
glad to have discovered the site, things are just at my level.
Jonathan