A way of seeing a differential that makes sense to me is this:
The differential measures the change of the function along the
tangent line, tangent plane, etc. (depending on whether you are in R, R^2, etc.).
A function is differentiable at a point if the change in its values can be
approximated locally by a linear function, to any degree of precision
(in an epsilon-delta sense). The linear function that does this approximation
is itself the derivative at that point.
This is one way of seeing the definition:
lim_{h->0} ||f(x+h) - f(x) - L(h)|| / ||h|| = 0
where L is the linear function (the derivative at x) and h is a small displacement.
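Here is a rough numerical sanity check of that limit, assuming f(x)=x^2 and the candidate linear map L(h)=2x*h. The names f, L and remainder_ratio are just made up for this sketch, they are not from any library.

```python
def f(x):
    return x ** 2

def L(x, h):
    # Candidate linear map at x (assumed here): h -> f'(x) * h, with f'(x) = 2x
    return 2 * x * h

def remainder_ratio(x, h):
    # |f(x+h) - f(x) - L(h)| / |h|, the quantity inside the limit
    return abs(f(x + h) - f(x) - L(x, h)) / abs(h)

x = 10.0
for h in (0.1, 0.01, 0.001):
    # For x^2 the numerator is exactly h^2, so the ratio is about |h|
    print(h, remainder_ratio(x, h))
```

The ratio shrinks along with h, which is what the definition asks for. If you swap in a wrong slope (say 3x instead of 2x), the ratio should stop going to 0.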
An example:
For f(x)=x^2, we have:
df = f'(x) dx, so df = 2x dx.
This means that the change in the value of f(x)=x^2 in
a 'hood ('hood = neighborhood) of a point x can be approximated
by 2x times the change in x.
Take a small 'hood of, say, 10 on the real line, take
(9.9,10.1), so dx = +0.1 or -0.1. The change of f(x)=x^2 from its value at 10 is:
i) |10.1^2 - 10^2| = 2.01
ii) |9.9^2 - 10^2| = 1.99
Now, consider the approximation to the change of x^2, using the derivative
(df = 2x dx, with x = 10):
i') |2(10)(0.1)| = 2
ii') |2(10)(-0.1)| = 2
So the approximation is off by 0.01 in each direction.
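If you want to play with this yourself, here is a small sketch comparing the actual change of x^2 at x = 10 with the change predicted by df = 2x dx. The helper names f and df are just picked for illustration.

```python
def f(x):
    return x ** 2

def df(x, dx):
    # Differential of f(x) = x^2: df = f'(x) dx = 2x dx
    return 2 * x * dx

x0 = 10.0
for dx in (0.1, -0.1):
    actual = f(x0 + dx) - f(x0)   # true change in f
    linear = df(x0, dx)           # change along the tangent line
    print(f"dx={dx:+.1f}  actual={actual:.4f}  linear={linear:.4f}  "
          f"error={abs(actual - linear):.4f}")
```

For dx = +0.1 the actual change is about 2.01 against a linear estimate of 2, and for dx = -0.1 it is about -1.99 against -2. Try smaller values of dx and the gap should shrink like dx^2.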
The error is pretty small, right? There are, of course, analogies to this
in higher dimensions, with approximations along tangent planes, etc.
Unfortunately, things get much hairier outside of R^n, where sometimes you
only have a locally Euclidean structure, as on manifolds, without the standard
tangent planes.
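To make the higher-dimensional analogy a bit more concrete, here is a sketch in R^2. The function f(x,y) = x^2 + y^2, the point (1,2) and the step h are all my own picks for illustration. The tangent-plane approximation to the change is the gradient dotted with h.

```python
def f(x, y):
    return x ** 2 + y ** 2

def grad_f(x, y):
    # Partial derivatives of the assumed example f(x, y) = x^2 + y^2
    return (2 * x, 2 * y)

def tangent_plane_change(point, h):
    # Linear approximation to the change in f: gradient at point, dotted with h
    gx, gy = grad_f(*point)
    return gx * h[0] + gy * h[1]

point = (1.0, 2.0)
h = (0.01, 0.02)
actual = f(point[0] + h[0], point[1] + h[1]) - f(*point)
linear = tangent_plane_change(point, h)
print(actual, linear, abs(actual - linear))
```

Here the actual change is about 0.1005 and the tangent-plane estimate is about 0.1, so the error is roughly ||h||^2, same story as on the real line.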
Hope that helped