Instead of just criticizing, let me add my two cents. Then you can all see and point out my errors as well.
This is purely my own vulgar view of the topic. I hope it helps someone.
A manifold M is a space of points, on which it makes sense to define a real valued smooth function say f:M-->R.
Then we want to differentiate this function, but a derivative is a linear function at each point p of M which approximates f locally near p.
So how to do this when M is not a linear space?
First one approximates M itself locally near p by a linear space called the tangent space to M at p, i.e. Tp(M).
This construction takes work unless M is already embedded in R^n, in which case Tp(M) just consists of the velocity vectors at p of all smooth curves in M through p. [Even if M is not embedded, one can embed it and check that the equivalence relation "same velocity vector at p" is independent of the embedding. Then one defines the tangent space as the set of equivalence classes of curves through p having the same velocity vector in all embeddings, instead of as the set of those velocity vectors.]
Once it is done, we want the derivative dfp, or f'(p), of f at p, to be a linear map from Tp(M) to R, i.e. f'(p) = dfp:Tp(M)-->R must be defined and linear.
Well, one way would be, for each vector v in Tp(M), to choose a smooth curve s through p in M with velocity vector at p equal to v [i.e. s(0) = p and s'(0) = v]. Then the composition f∘s is a smooth map from an interval to R, and itself has a derivative at 0. That number, (f∘s)'(0), is the value of dfp(v).
Of course invariance under choice of s must be checked (chain rule).
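Here is a small numerical sanity check of that invariance, as a sketch only. Everything in it is my own choice for illustration: M is the unit sphere in R^3, p = (1, 0, 0), v = (0, 1, 0), and the function f and the two curves are hypothetical examples. Both curves pass through p with velocity v, so the two estimates of the derivative of f∘s at 0 should agree.

```python
import math

def f(x, y, z):
    # Hypothetical smooth function on R^3; we only evaluate it on the unit sphere M.
    return x * y + z * z

def curve_a(t):
    # Great circle through p = (1, 0, 0) with velocity (0, 1, 0) at t = 0.
    return (math.cos(t), math.sin(t), 0.0)

def curve_b(t):
    # A different curve on the sphere: normalize the point (1, t, t^2).
    # It also satisfies s(0) = p and s'(0) = (0, 1, 0).
    norm = math.sqrt(1.0 + t * t + t ** 4)
    return (1.0 / norm, t / norm, t * t / norm)

def derivative_along(curve, h=1e-5):
    # Central-difference estimate of the derivative of f o s at t = 0.
    return (f(*curve(h)) - f(*curve(-h))) / (2.0 * h)

value_a = derivative_along(curve_a)
value_b = derivative_along(curve_b)
print(value_a, value_b)  # the two estimates should agree up to numerical error
```

For this particular f the ambient gradient at p is (0, 1, 0), so both estimates come out close to 1, which is the dot product of that gradient with v, exactly as the chain rule predicts.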
Anyway, we have at last defined a procedure for sending a smooth function f:M-->R, a point p of M, and a tangent vector v of Tp(M), to a number dfp(v).
If we fix some of these objects, say f and p, and let v vary, we have a linear function dfp from Tp(M) to R. This is called a "cotangent vector" at p, or an element of "T*p(M)".
If we fix only f, say, and let both p and v vary, we have a "covector field" df that assigns to each point p of M a linear function, or covector, on Tp(M).
If we fix only M, we have an operator d, which assigns to functions f and points p,... etc etc...
Now if we fix nothing at all, not even M, we still have a construction d, which associates to manifolds M and functions f, a covector field df on M.
We can call d a total differential, or exterior derivative, or whatever. The terminology is less important than the behavior.
Now if we do not like to learn abstract definitions, or even if we do, but we want to make some calculations, we can begin to introduce coordinates to represent these objects.
For instance we can put a coordinate system on M, and hence a basis of each space Tp(M), and call these basis elements ∂/∂x, ∂/∂y, ∂/∂z, or whatever.
This of course immediately chooses also a dual basis for the dual linear space T*p(M) of linear functions on Tp(M). This basis is called dxp, dyp, dzp, i.e. to get a basis for the local derivatives of all functions f at p, one takes the differentials at p of the basic coordinate functions x, y, z.
[If M is already sitting in R^n, then one can simply restrict the ambient dx, dy, dz from TpR^n to Tp(M), but then they are of course usually no longer independent on Tp(M), so they span T*p(M) without being a basis.]
Then since dfp belongs to T*p(M), one can of course express dfp as a linear combination of dxp, dyp, and dzp. The coefficients in such an expression are called, essentially by definition, ∂f/∂x(p), ∂f/∂y(p), ∂f/∂z(p).
So one has the operation "del" that assigns to each f and each p the triple (i.e. "vector" to some people) of coefficients of dfp in terms of the basis dxp, dyp, dzp.
Thus the operation assigning f --> (∂f/∂x(p), ∂f/∂y(p), ∂f/∂z(p)) might be called a differential operator of some kind. Then some people might argue over whether to call it a scalar operator or a vector operator, to distinguish the two operators
f --> (∂f/∂x(p), ∂f/∂y(p), ∂f/∂z(p))
and f --> ∂f/∂x(p) dxp + ∂f/∂y(p) dyp + ∂f/∂z(p) dzp,
from each other.
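Whichever name one prefers, the coefficients are pinned down by the dual basis relation. A short derivation for the x coefficient, written in LaTeX; the letters a, b, c are just my placeholder names for the unknown coefficients, and f is written in the chosen coordinates:

```latex
\[
df_p = a\,dx_p + b\,dy_p + c\,dz_p ,
\qquad
dx_p\!\left(\left.\frac{\partial}{\partial x}\right|_p\right) = 1,\quad
dy_p\!\left(\left.\frac{\partial}{\partial x}\right|_p\right) = 0,\quad
dz_p\!\left(\left.\frac{\partial}{\partial x}\right|_p\right) = 0 ,
\]
\[
a = df_p\!\left(\left.\frac{\partial}{\partial x}\right|_p\right)
  = \left.\frac{d}{dt}\right|_{t=0} f\bigl(x(p)+t,\; y(p),\; z(p)\bigr)
  = \frac{\partial f}{\partial x}(p).
\]
```

The middle step uses the curve whose coordinates are (x(p)+t, y(p), z(p)); its velocity at t = 0 is the basis vector along x, so the curve recipe for dfp(v) applies directly. The same argument gives b and c.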
This is a linguistic discussion. The point is to understand what is going on: one is approximating a real valued smooth f by a family of real valued linear functions.
Note too that the operator del above depends on the unnatural choice of coordinates x, y, z, hence is a computational aid, and not a concept.
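To see that coordinate dependence concretely, here is a rough numerical sketch. The setup is entirely my own assumption for illustration: M is the plane (so two coordinates rather than three), the function f is a hypothetical example, and the two coordinate systems are Cartesian (x, y) and polar (r, theta). The list of partial derivatives changes with the coordinates, while the number dfp(v) does not.

```python
import math

def f_cartesian(x, y):
    # Hypothetical smooth function on the plane, written in coordinates (x, y).
    return x * x * y

def f_polar(r, theta):
    # The same function written in polar coordinates (r, theta).
    return f_cartesian(r * math.cos(theta), r * math.sin(theta))

def del_coefficients(g, point, h=1e-6):
    # Central-difference partial derivatives of g with respect to each coordinate.
    partials = []
    for i in range(len(point)):
        plus, minus = list(point), list(point)
        plus[i] += h
        minus[i] -= h
        partials.append((g(*plus) - g(*minus)) / (2.0 * h))
    return partials

def pair(coefficients, components):
    # Evaluate a covector on a vector, both given in the same coordinate basis.
    return sum(c * w for c, w in zip(coefficients, components))

# The same point p in both coordinate systems.
x, y = 1.0, 1.0
r, theta = math.hypot(x, y), math.atan2(y, x)

# The same tangent vector v at p, as components in each coordinate basis,
# using dr = (x dx + y dy)/r and dtheta = (x dy - y dx)/r^2.
v_cartesian = (1.0, 2.0)
v_polar = ((x * v_cartesian[0] + y * v_cartesian[1]) / r,
           (x * v_cartesian[1] - y * v_cartesian[0]) / (r * r))

del_cartesian = del_coefficients(f_cartesian, (x, y))
del_polar = del_coefficients(f_polar, (r, theta))

print(del_cartesian, del_polar)  # different lists of numbers
print(pair(del_cartesian, v_cartesian), pair(del_polar, v_polar))  # same value
```

In this example the Cartesian coefficients are about (2, 1) and the polar ones about (2.12, -1), yet both pairings with v give about 4. The coefficient lists are bookkeeping; the covector dfp is the actual object.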