# Understanding of Higher-Order Derivatives

1. Aug 22, 2009

### shaggymoods

Hey guys, so this may be a really silly question, but I'm trying to grasp a subtle point about higher-order derivatives of multivariable functions. In particular, suppose we have an infinitely differentiable function

$$f: \mathbb{R}^{n} \rightarrow \mathbb{R}$$

I know that the first derivative of this function is a linear map $$\lambda: \mathbb{R}^{n}\rightarrow\mathbb{R}$$. However, when we take the second derivative, i.e. the derivative of $$\lambda$$, some questions arise for me:

1.) If we are taking this derivative when considering $$\lambda$$ as a linear function, then we'd just get back $$\lambda$$, which isn't the case. So how are we interpreting the first derivative when taking a second?

2.) In general, why do we say that $$D^{k}f:\mathbb{R}^{n^{k}}\rightarrow\mathbb{R}$$ and not $$D^{k}f:\mathbb{R}^{n}\rightarrow\mathbb{R}$$?

Thanks in advance.

2. Aug 22, 2009

### VKint

If $$f : \mathbb{R}^n \to \mathbb{R}$$, then the derivative of $$f$$ is a linear map $$\lambda : \mathbb{R}^n \to \mathbb{R}$$ at each point in $$\mathbb{R}^n$$. That is to say, the derivative of $$f$$, properly considered, is a map $$Df : \mathbb{R}^n \to L(\mathbb{R}^n, \mathbb{R})$$, where $$L(\mathbb{R}^n, \mathbb{R})$$ denotes the space of all linear maps $$\lambda : \mathbb{R}^n \to \mathbb{R}$$, which is just the dual of $$\mathbb{R}^n$$ (and is thus isomorphic to $$\mathbb{R}^n$$). The second derivative of $$f$$ is then a map $$D^2 f : \mathbb{R}^n \to L(\mathbb{R}^n, L(\mathbb{R}^n, \mathbb{R})) \cong L(\mathbb{R}^n, \mathbb{R}^n)$$, where $$L(\mathbb{R}^n, \mathbb{R}^n)$$ is the space of all $$n \times n$$ matrices, and is isomorphic to $$\mathbb{R}^{n^2}$$. (The output of the second derivative is usually called the Hessian matrix of $$f$$.) Continuing in this vein, you can show that $$D^k f$$ is a map from $$\mathbb{R}^n$$ to $$\mathbb{R}^{n^k}$$, not a map from $$\mathbb{R}^{n^k} \to \mathbb{R}$$ as you suggest in #2. That said, each individual value $$D^k f(\mathbf{p})$$ can be viewed as a $$k$$-linear map $$(\mathbb{R}^n)^k \to \mathbb{R}$$, or equivalently as a linear map on the $$k$$-fold tensor power of $$\mathbb{R}^n$$, which is isomorphic to $$\mathbb{R}^{n^k}$$; that may be where the notation in #2 comes from.
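To make the counting concrete, here is a small worked example. The function $$f(x, y) = x^2 y$$ on $$\mathbb{R}^2$$ is just an illustrative choice (it is not from the original question), and the derivatives are written in coordinates with respect to the standard basis:

$$
\begin{aligned}
f(x, y) &= x^2 y, \\
Df(x, y) &= \begin{pmatrix} 2xy & x^2 \end{pmatrix} \in L(\mathbb{R}^2, \mathbb{R}) \cong \mathbb{R}^{2}, \\
D^2 f(x, y) &= \begin{pmatrix} 2y & 2x \\ 2x & 0 \end{pmatrix} \in L(\mathbb{R}^2, L(\mathbb{R}^2, \mathbb{R})) \cong \mathbb{R}^{2^2}, \\
D^3 f(x, y) &\in \mathbb{R}^{2^3}, \quad \text{with } \partial_x \partial_x \partial_y f = \partial_x \partial_y \partial_x f = \partial_y \partial_x \partial_x f = 2 \text{ and all other entries } 0.
\end{aligned}
$$

Each row lists the value of the derivative at the point $$(x, y)$$. The number of components grows as $$2, 4, 8$$, i.e. as $$n^k$$ with $$n = 2$$, while the input of every $$D^k f$$ is still a point of $$\mathbb{R}^2$$.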

Basically, what's going on here is that a derivative, properly defined, is a best linear approximation to a function. At a point $$\mathbf{p} \in \mathbb{R}^n$$, the derivative $$Df$$ takes as its value the linear map $$\lambda : \mathbb{R}^n \to \mathbb{R}$$ which most closely resembles $$f$$ near $$\mathbf{p}$$, in the sense that $$f(\mathbf{p} + \mathbf{h}) \approx f(\mathbf{p}) + \lambda(\mathbf{h})$$ for small $$\mathbf{h}$$. So $$Df$$ is actually a map from $$\mathbb{R}^n$$ into the space of all possible such approximations, and $$D^k f$$ is a map from $$\mathbb{R}^n$$ into the $$k$$-fold tensor power of the dual space of $$\mathbb{R}^n$$.

Your answer to #1 is thus that, while each element of the range of $$Df$$ is a linear map and so has a trivial derivative (a linear map is its own derivative at every point), $$Df$$ itself is not necessarily linear, and it is $$Df$$, not the individual map $$\lambda$$, that gets differentiated to produce the second derivative. This is why it is necessary to specify two arguments when evaluating $$Df$$: a location $$\mathbf{a}$$, and a direction $$\mathbf{h}$$. The location specifies a linear map, i.e., there is some linear map $$\lambda$$ for which $$Df : \mathbf{a} \mapsto \lambda$$. The direction then serves as the argument for $$\lambda$$, and, in a slight abuse of notation, we usually write $$\lambda(\mathbf{h}) \equiv Df(\mathbf{a})(\mathbf{h})$$ or $$Df(\mathbf{a}, \mathbf{h})$$.
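If it helps to see the "location, then direction" structure in code, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the function $$f(x, y) = x^2 y$$, the hand-computed gradient, and the names `Df`, `lam`, and `add` are not from the original question. It checks that $$Df(\mathbf{a})$$ is linear in the direction, that $$Df$$ is not linear in the location, and that $$Df(\mathbf{a})$$ approximates $$f$$ near $$\mathbf{a}$$.

```python
from typing import Callable, Tuple

Vector = Tuple[float, float]


def f(p: Vector) -> float:
    """Illustrative function f(x, y) = x^2 * y (an assumption for this sketch)."""
    x, y = p
    return x * x * y


def Df(a: Vector) -> Callable[[Vector], float]:
    """Map a location a to the linear map lambda = Df(a)."""
    x, y = a
    grad = (2.0 * x * y, x * x)  # hand-computed partial derivatives of f at a

    def lam(h: Vector) -> float:
        # lambda(h) is the dot product of the gradient at a with the direction h
        return grad[0] * h[0] + grad[1] * h[1]

    return lam


def add(u: Vector, v: Vector) -> Vector:
    """Componentwise sum of two vectors."""
    return (u[0] + v[0], u[1] + v[1])


a, b = (1.0, 2.0), (3.0, -1.0)   # two locations
h, k = (0.5, 0.25), (-1.0, 4.0)  # two directions

# Linear in the direction: Df(a)(h + k) equals Df(a)(h) + Df(a)(k).
linear_in_direction = Df(a)(add(h, k)) == Df(a)(h) + Df(a)(k)

# Not linear in the location: Df(a + b)(h) differs from Df(a)(h) + Df(b)(h).
linear_in_location = Df(add(a, b))(h) == Df(a)(h) + Df(b)(h)

# Best linear approximation: the remainder is second order in the step size t.
t = 1e-3
step = (t * h[0], t * h[1])
remainder = f(add(a, step)) - f(a) - Df(a)(step)

print(linear_in_direction, linear_in_location, remainder)
```

The point of returning a closure from `Df` is that it mirrors the type $$Df : \mathbb{R}^n \to L(\mathbb{R}^n, \mathbb{R})$$: calling `Df(a)` picks out a linear map, and calling that map on `h` evaluates it.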
