Buri said:
Okay now I'm really getting confused. We were told that the differential of a multivariate function f: R^n -> R^m is a linear transformation Df = A: R^n -> R^m. We also have that if a function is differentiable then:
D_v f = Df(v)
where the first is the directional derivative in the direction of v and the second is the differential of f applied to v. For f(x,y) = |x| + |y| the differential would be something like [a b] and at v would be something like av1 + bv2 which is linear. So I'm confused now. Could you explain?
D H said:
That means the one dimensional derivative is a linear operator. That of course does not mean that derivatives are linear functions of x. That would preclude rather useful functions such as exp(x).
This is a common confusion, because the 1-dimensional case from high school calculus is very special.
The (total) derivative of f:\mathbb{R}^n\to\mathbb{R}^m at a is a linear map Df(a):\mathbb{R}^n\to\mathbb{R}^m. Its matrix representation (w.r.t. the standard bases) is just the Jacobi matrix of partial derivatives at a. If f is differentiable at all a in R^n, then we obtain a map
Df:\mathbb{R}^n\to \text{Lin}(\mathbb{R}^n,\mathbb{R}^m)
a\mapsto Df(a),
where \text{Lin}(\mathbb{R}^n,\mathbb{R}^m) is the vector space of all linear maps from R^n to R^m.
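To make the distinction between Df and Df(a) concrete, here is a minimal Python sketch. The example function f: R^2 -> R^3, the helper names jacobian and Df, and the step size h are all assumptions chosen for illustration; the Jacobi matrix is only approximated by central differences, so this is not an exact derivative.

```python
import math

def f(p):
    """Hypothetical example map f: R^2 -> R^3 (not from the thread)."""
    x, y = p
    return [x * y, math.sin(x), x + y * y]

def jacobian(func, a, h=1e-6):
    """Approximate the Jacobi matrix of func at a by central differences.

    Entry [i][j] approximates the partial derivative of component i
    with respect to variable j, evaluated at a.
    """
    m = len(func(a))
    n = len(a)
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        plus = list(a)
        minus = list(a)
        plus[j] += h
        minus[j] -= h
        f_plus = func(plus)
        f_minus = func(minus)
        for i in range(m):
            J[i][j] = (f_plus[i] - f_minus[i]) / (2 * h)
    return J

def Df(a):
    """The map a -> Df(a): each point a yields a linear map R^2 -> R^3."""
    J = jacobian(f, a)

    def linear_map(v):
        # Matrix-vector product J v.
        return [sum(J[i][j] * v[j] for j in range(len(v))) for i in range(len(J))]

    return linear_map

L = Df([1.0, 2.0])      # L is an element of Lin(R^2, R^3)
print(L([1.0, 0.0]))    # approximately the first column of the Jacobi matrix at (1, 2)
```

Note that Df takes a point and returns a function; it is the returned function L that is linear in v, while the dependence of L on the point a is not linear in general.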
Now observe what happens if m=n=1. We have a function f:\mathbb{R}\to\mathbb{R}. Its (total) derivative at a is a linear map Df(a):\mathbb{R}\to\mathbb{R}, whose matrix representation is the 1x1 matrix consisting of the value f'(a)\in\mathbb{R}. If f is differentiable at all of R, we get the map
Df:\mathbb{R}\to \text{Lin}(\mathbb{R},\mathbb{R})
a\mapsto Df(a).
But there is a nice linear isomorphism from \text{Lin}(\mathbb{R},\mathbb{R}) to good old \mathbb{R} given by L\mapsto L(1), i.e. evaluating at 1! This identification is always implicitly used in high school calculus, so that the above map becomes
Df:\mathbb{R}\to \text{Lin}(\mathbb{R},\mathbb{R})\cong \mathbb{R}
a\mapsto Df(a)(1).
Example:
Take f(x)=exp(5x). Then f'(a)=5e^{5a}\in\mathbb{R}, and Df(a) is the linear map x\mapsto 5e^{5a}x, i.e. multiplication by f'(a). Now since f is differentiable at all a in R, we get the map
Df:\mathbb{R}\to\text{Lin}(\mathbb{R},\mathbb{R})
a\mapsto Df(a).
Under the identification just described, we evaluate Df(a)(1)=f'(a)\cdot 1=f'(a)\in\mathbb{R}, so this map becomes
Df:\mathbb{R}\to \text{Lin}(\mathbb{R},\mathbb{R})\cong \mathbb{R}
a\mapsto 5e^{5a}.
Each Df(a) is a linear map, but a\mapsto 5e^{5a} is certainly not a linear function of a.
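The same example can be sketched in Python. By the chain rule, f(x) = exp(5x) has f'(a) = 5 exp(5a). The names Df and identify below are hypothetical helpers introduced only to mirror the notation: Df(a) is modelled as a closure, and identify is the "evaluate at 1" isomorphism from Lin(R, R) to R.

```python
import math

def f(x):
    return math.exp(5 * x)

def f_prime(a):
    # Chain rule: d/dx exp(5x) = 5 exp(5x).
    return 5 * math.exp(5 * a)

def Df(a):
    """Return Df(a) as a linear map R -> R: multiplication by f'(a)."""
    slope = f_prime(a)
    return lambda h: slope * h

def identify(L):
    """The isomorphism Lin(R, R) -> R: evaluate the linear map at 1."""
    return L(1.0)

a = 0.3
L = Df(a)

# Df(a) is linear in its argument h: L(2) + L(3) agrees with L(5).
print(L(2.0) + L(3.0), L(5.0))

# Under the identification, Df(a) becomes the number f'(a).
print(identify(L), f_prime(a))

# The map a -> f'(a) is not linear in a: the two values below differ.
print(identify(Df(1.0)) + identify(Df(2.0)), identify(Df(3.0)))
```

The first two print statements show linearity in h and the identification with f'(a); the last one shows that the dependence on the base point a is not additive.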
Again, consider f:R^n\to R^m. Let D_vf(a)\in\mathbb{R}^m denote the directional derivative of f at a in the direction of v. Then:
If f is differentiable at a (i.e. the linear map Df(a) from above exists), then f has directional derivatives at a in all directions v\in\mathbb{R}^n, and they are given by
D_vf(a)=Df(a)v,
i.e. D_vf(a) is obtained from Df(a) by evaluating at v, or in matrix notation by multiplying the Jacobi matrix of f at a by the column vector v.
Also, the map
\mathbb{R}^n\to\mathbb{R}^m
v\mapsto D_vf(a)
is linear.
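A small numerical sketch may help connect this to the function from the question. It assumes a one-sided difference quotient (t -> 0+) with a fixed small t, and a hypothetical smooth comparison function g(x, y) = x^2 + 3xy that is not from the thread. The one-sided limit is used because for f(x, y) = |x| + |y| the two-sided limit at the origin does not exist for v != 0.

```python
def one_sided_dir_derivative(func, a, v, t=1e-7):
    """Approximate lim_{t -> 0+} (func(a + t*v) - func(a)) / t using a small t > 0."""
    shifted = [ai + t * vi for ai, vi in zip(a, v)]
    return (func(shifted) - func(a)) / t

def smooth(p):
    """Hypothetical differentiable example: g(x, y) = x^2 + 3xy."""
    x, y = p
    return x * x + 3 * x * y

def abs_sum(p):
    """The function from the question: f(x, y) = |x| + |y|."""
    x, y = p
    return abs(x) + abs(y)

# Differentiable case at a = (1, 2): the Jacobi matrix is [2x + 3y, 3x] = [8, 3].
a = [1.0, 2.0]
v = [1.0, 1.0]
jacobian_times_v = 8.0 * v[0] + 3.0 * v[1]
quotient = one_sided_dir_derivative(smooth, a, v)
print(jacobian_times_v, quotient)  # the two values should nearly agree

# The question's function at the origin, in the directions e1 and -e1.
origin = [0.0, 0.0]
d_plus = one_sided_dir_derivative(abs_sum, origin, [1.0, 0.0])
d_minus = one_sided_dir_derivative(abs_sum, origin, [-1.0, 0.0])
print(d_plus, d_minus)  # a linear map in v would give d_minus == -d_plus
```

For |x| + |y| at the origin, the one-sided directional derivative in direction v is |v1| + |v2|, which is not linear in v. So no row vector [a b] can represent it there, and by the statement above f cannot be differentiable at the origin.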