Buri said:
Okay now I'm really getting confused. We were told that the differential of a multivariate function f: R^n -> R^m is a linear transformation Df = A: R^n -> R^m. We also have that if a function is differentiable then:
D_v f = Df(v)
where the first is the directional derivative in the direction of v and the second is the differential of f evaluated at v. For f(x,y) = |x| + |y| the differential would be something like [a b], and at v it would be something like av1 + bv2, which is linear. So I'm confused now. Could you explain?
D H said:
That means the one dimensional derivative is a linear operator. That of course does not mean that derivatives are linear functions of x. That would preclude rather useful functions such as exp(x).
This is a common confusion, because the 1-dimensional case from high school calculus is very special.
The (total) derivative of [itex]f:\mathbb{R}^n\to\mathbb{R}^m[/itex] at a is a linear map [itex]Df(a):\mathbb{R}^n\to\mathbb{R}^m[/itex]. Its matrix representation (w.r.t. the standard bases) is just the Jacobi matrix of partial derivatives at a. If f is differentiable at all a in R^n, then we obtain a map
[tex]Df:\mathbb{R}^n\to \text{Lin}(\mathbb{R}^n,\mathbb{R}^m)[/tex]
[tex]a\mapsto Df(a),[/tex]
where [itex]\text{Lin}(\mathbb{R}^n,\mathbb{R}^m)[/itex] is the vector space of all linear maps from R^n to R^m.
Now observe what happens if m=n=1. We have a function [itex]f:\mathbb{R}\to\mathbb{R}[/itex]. Its (total) derivative at a is a linear map [itex]Df(a):\mathbb{R}\to\mathbb{R}[/itex], whose matrix representation is the 1x1 matrix consisting of the value [itex]f'(a)\in\mathbb{R}[/itex]. If f is differentiable at all of R, we get the map
[tex]Df:\mathbb{R}\to \text{Lin}(\mathbb{R},\mathbb{R})[/tex]
[tex]a\mapsto Df(a).[/tex]
But there is a nice linear isomorphism from [itex]\text{Lin}(\mathbb{R},\mathbb{R})[/itex] to good old [itex]\mathbb{R}[/itex] given by [itex]L\mapsto L(1)[/itex], i.e. evaluating at 1! This identification is always implicitly used in high school calculus, so that the above map becomes
[tex]Df:\mathbb{R}\to \text{Lin}(\mathbb{R},\mathbb{R})\cong \mathbb{R}[/tex]
[tex]a\mapsto Df(a)(1).[/tex]
Example:
Take f(x)=exp(5x). Then [itex]f'(a)=5\exp(5a)\in\mathbb{R}[/itex] by the chain rule, and Df(a) is the linear map [itex]x\mapsto 5e^{5a}x[/itex], i.e. multiplication by f'(a). Now since f is differentiable at all a in R, we get the map
[tex]Df:\mathbb{R}\to\text{Lin}(\mathbb{R},\mathbb{R})[/tex]
[tex]a\mapsto Df(a).[/tex]
Under the identification just described, we evaluate [itex]Df(a)(1)=f'(a)\cdot 1=f'(a)\in\mathbb{R}[/itex], so the map becomes
[tex]Df:\mathbb{R}\to \text{Lin}(\mathbb{R},\mathbb{R})\cong \mathbb{R}[/tex]
[tex]a\mapsto 5e^{5a}.[/tex]
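For anyone who wants to see the identification numerically, here is a minimal Python sketch. It models Df(a) as a function h -> f'(a)*h for f(x)=exp(5x), whose derivative is 5exp(5x) by the chain rule, and recovers the ordinary number f'(a) by evaluating that linear map at 1. The names `Df`, `to_number`, the point a=0.3 and the step size are my own choices for illustration, not anything fixed by the discussion above.

```python
import math


def f(x):
    # The example function from the thread: f(x) = exp(5x).
    return math.exp(5.0 * x)


def Df(a):
    """Total derivative of f at a, returned as a linear map R -> R.

    By the chain rule f'(a) = 5 * exp(5a), so the map is h -> f'(a) * h.
    """
    slope = 5.0 * math.exp(5.0 * a)
    return lambda h: slope * h


def to_number(L):
    """The identification Lin(R, R) -> R: evaluate the linear map at 1."""
    return L(1.0)


a = 0.3            # illustrative base point (an assumption)
L = Df(a)          # an element of Lin(R, R)
fprime = to_number(L)  # the "high school" derivative f'(a)

# Central difference quotient as an independent sanity check.
eps = 1e-6
fd = (f(a + eps) - f(a - eps)) / (2.0 * eps)

print("Df(a)(1)          =", fprime)
print("difference quot.  =", fd)
print("L(2) + L(3), L(5) =", L(2.0) + L(3.0), L(5.0))
```

The point is that `Df(a)` is linear in its argument h for each fixed a, while the assignment a -> `to_number(Df(a))` is the usual, generally nonlinear, function f'.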
Again, consider [itex]f:\mathbb{R}^n\to\mathbb{R}^m[/itex]. Let [itex]D_vf(a)\in\mathbb{R}^m[/itex] denote the directional derivative of f at a in the direction of v. Then:
If f is differentiable at a (i.e. the linear map Df(a) from above exists), then f has directional derivatives at a in all directions [itex]v\in\mathbb{R}^n[/itex], and they are given by
[tex]D_vf(a)=Df(a)v,[/tex]
i.e. D_vf(a) is obtained from Df(a) by evaluating at v, or in matrix notation by multiplying the Jacobi matrix of f at a by the column vector v.
Also, the map
[tex]\mathbb{R}^n\to\mathbb{R}^m[/tex]
[tex]v\mapsto D_vf(a)[/tex]
is linear.
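A rough numerical illustration of both halves of this, in Python. The first part takes a smooth map R^2 -> R^2 that I made up for the purpose (it is not from the thread) and compares a difference quotient in direction v with the Jacobi matrix times v. The second part goes back to f(x,y) = |x| + |y| from the original question: at the origin the one-sided difference quotients exist, but they do not depend linearly on v, which is consistent with that function not being differentiable there. Note that the quotient used is one-sided (t > 0), an assumption chosen so that the second part has a limit at all.

```python
import math


def f(x, y):
    # Hypothetical smooth example R^2 -> R^2 (an assumption for illustration).
    return (x * x * y, math.sin(x) + y)


def jacobian(x, y):
    # Jacobi matrix of f at (x, y), computed by hand from the formula above.
    return ((2.0 * x * y, x * x),
            (math.cos(x), 1.0))


def apply(J, v):
    # Matrix-vector product J v for a matrix with two columns.
    return tuple(row[0] * v[0] + row[1] * v[1] for row in J)


def directional_derivative(func, a, v, t=1e-6):
    """One-sided difference quotient (func(a + t v) - func(a)) / t, t > 0."""
    fa = func(*a)
    fb = func(a[0] + t * v[0], a[1] + t * v[1])
    return tuple((q - p) / t for p, q in zip(fa, fb))


# Part 1: differentiable case, D_v f(a) agrees with Df(a) v.
a = (0.7, -1.2)
v = (2.0, 0.5)
exact = apply(jacobian(*a), v)
approx = directional_derivative(f, a, v)
print("Jacobian times v :", exact)
print("difference quot. :", approx)


# Part 2: the function from the question, at the origin.
def g(x, y):
    return (abs(x) + abs(y),)


origin = (0.0, 0.0)
d_plus = directional_derivative(g, origin, (1.0, 0.0))[0]
d_minus = directional_derivative(g, origin, (-1.0, 0.0))[0]
# A linear map would give d_minus == -d_plus; here both quotients are positive.
print("direction (1,0)  :", d_plus)
print("direction (-1,0) :", d_minus)
```

So away from the axes, where |x| + |y| is differentiable, the differential really is something like [a b]; at the origin there is no such [a b], and the formula D_vf(a)=Df(a)v simply does not apply.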