sunrah said:
The argument that I don't understand is that this
T^{a}\frac{d}{dx^{a}}
is a vector. To me it looks like the inner product of two vectors, \vec{T} = (T_{x}, T_{y}) and ∇x,y, so looks like a scalar to me.
This is really just a convention, and if it bothers you, you don't have to adopt it, but you should understand it.
I think it's clear in terms of coordinates that if you have a parametrized path, x^\mu(s), giving a path through space (or spacetime) as a function of a parameter s (not necessarily time, nor even proper time, just a real-valued parameter that increases as you move down the path), then there is a corresponding vector U whose components are given by U^\mu = \dfrac{dx^\mu}{ds}. This vector is called the tangent vector to the path x^\mu(s). For each tangent vector U with components U^\mu, there is a corresponding operator, the "directional derivative", that can be defined by: (U \cdot \nabla) \phi = U^\mu \dfrac{d \phi}{dx^\mu}
The above definitions are all in terms of components, which are specific to a coordinate system. Is there a coordinate-independent way to talk about directional derivatives, without mentioning components of the vector? Yes. Let \mathcal{P}(s) be a smooth function from a real number s to points in space (or spacetime). We define the directional derivative along path \mathcal{P}, \frac{d \mathcal{P}}{ds}, to be the operator \hat{U} defined by:
For any scalar field \phi (that is, a function that assigns a real number to each point in space, or spacetime),
\hat{U}(\phi) = \dfrac{d\phi(\mathcal{P}(s))}{ds}
This definition of "directional derivative" doesn't mention coordinates or components. It only mentions scalar fields and parametrized paths and derivatives of real-valued functions.
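To make that concrete, here is a minimal numerical sketch of the definition. The helix path, the field `phi`, and the function name `directional_derivative` are all made up for illustration. A computer has to represent points somehow, so the points are stored as tuples, but the operator itself only ever uses the path and the scalar field, never the components of a vector.

```python
import math

def directional_derivative(path, field, s, h=1e-5):
    """Approximate d/ds of field(path(s)) with a central difference."""
    return (field(path(s + h)) - field(path(s - h))) / (2 * h)

# Hypothetical example path: a helix, P(s) = (cos s, sin s, s)
def helix(s):
    return (math.cos(s), math.sin(s), s)

# Hypothetical scalar field: phi = x*y + z at each point
def phi(point):
    x, y, z = point
    return x * y + z

s0 = 0.3
numeric = directional_derivative(helix, phi, s0)

# Along the helix, phi(P(s)) = cos(s) sin(s) + s, so d/ds = cos(2s) + 1
exact = math.cos(2 * s0) + 1.0
print(numeric, exact)
```

The two printed numbers should agree up to the finite-difference error.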
At this point, we note that there is a one-to-one correspondence between directional derivatives (which are operators) and tangent vectors (which are...I don't know...abstract objects that can be represented by column matrices that transform in some particular way under coordinate transformations). We have a coordinate-free definition of a directional derivative, and every tangent vector corresponds to a directional derivative, so there is really no reason not to identify the two: A tangent vector simply IS a directional derivative.
This way of looking at it flips the idea of what is fundamental. Instead of defining a vector in terms of components, and using components to define a directional derivative, we view the directional derivative as fundamental, and components to be a derived concept:
Pick a coordinate system. Since a scalar field is any function from points in space \mathcal{P} to real numbers, a coordinate system is equivalent to a collection of four scalar fields X(\mathcal{P}), Y(\mathcal{P}), Z(\mathcal{P}), T(\mathcal{P}), where X(\mathcal{P}) gives the value of the x-coordinate at point \mathcal{P}, etc. Then we can define the components of a directional derivative \hat{U} via:
U^x = \hat{U}(X)
U^y = \hat{U}(Y)
U^z = \hat{U}(Z)
U^t = \hat{U}(T)
or more compactly: U^\mu = \hat{U}(x^\mu) (where x^\mu is understood as the scalar field corresponding to that coordinate).
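As a rough check of U^\mu = \hat{U}(x^\mu), the sketch below feeds the coordinate functions themselves into a numerical directional derivative. The helix path and all function names are illustrative assumptions, and only three spatial coordinates are used to keep it short.

```python
import math

def directional_derivative(path, field, s, h=1e-5):
    """Approximate d/ds of field(path(s)) with a central difference."""
    return (field(path(s + h)) - field(path(s - h))) / (2 * h)

# Hypothetical example path: a helix, P(s) = (cos s, sin s, s)
def helix(s):
    return (math.cos(s), math.sin(s), s)

# The coordinates, viewed as scalar fields that read off one number per point
def X(point):
    return point[0]

def Y(point):
    return point[1]

def Z(point):
    return point[2]

s0 = 0.3
# Components of the tangent vector: U^mu = U(x^mu)
components = tuple(directional_derivative(helix, f, s0) for f in (X, Y, Z))

# For this path, dx^mu/ds = (-sin s, cos s, 1)
expected = (-math.sin(s0), math.cos(s0), 1.0)
print(components, expected)
```

Applying the operator to the coordinate fields gives back the usual dx^\mu/ds, which is the sense in which components are a derived concept here.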
Once we identify directional derivatives with tangent vectors, we can pick out a set of basis vectors corresponding to a coordinate system as follows:
e_x = \frac{d}{dx}
e_y = \frac{d}{dy}
e_z = \frac{d}{dz}
e_t = \frac{d}{dt}
Those are the directional derivatives in the x, y, z, and t-directions. An arbitrary vector U can be written as a linear combination of these basis vectors:
U = U^x e_x + U^y e_y + U^z e_z + U^t e_t = U^x \frac{d}{dx} + U^y \frac{d}{dy} + U^z \frac{d}{dz} + U^t \frac{d}{dt} = U^\mu \frac{d}{dx^\mu}
So in the expression U^\mu \frac{d}{dx^\mu}, you're just creating a linear combination of basis vectors.
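For completeness, here is a short derivation of why this linear combination is the same operator as the coordinate-free \hat{U} defined earlier. It assumes the path is written in coordinates as x^\mu(s), and it keeps this post's habit of writing d where a partial derivative is meant.

```latex
\begin{align*}
\hat{U}(\phi) &= \frac{d\phi(\mathcal{P}(s))}{ds} \\
 &= \frac{d\phi}{dx^\mu}\,\frac{dx^\mu}{ds} && \text{(chain rule, sum over } \mu\text{)} \\
 &= U^\mu \frac{d\phi}{dx^\mu} && \text{(since } U^\mu = \tfrac{dx^\mu}{ds}\text{)} \\
 &= \left( U^\mu \frac{d}{dx^\mu} \right)\phi
\end{align*}
```

Since this holds for every scalar field \phi, the operators \hat{U} and U^\mu \frac{d}{dx^\mu} are equal, which is why the expression is a vector rather than a scalar.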