Understanding of Higher-Order Derivatives

  • Thread starter: shaggymoods
  • Tags: Derivatives
shaggymoods
Hey guys, so this may be a really silly question, but I'm trying to grasp a subtle point about higher-order derivatives of multivariable functions. In particular, suppose we have an infinitely differentiable function

f: \mathbb{R}^{n} \rightarrow \mathbb{R}

I know that the first derivative of this function is a linear map \lambda: \mathbb{R}^{n}\rightarrow\mathbb{R}. However, when we take the second derivative, i.e. the derivative of \lambda, some questions arise for me:

1.) If we are taking this derivative when considering \lambda as a linear function, then we'd just get back \lambda, which isn't the case. So how are we interpreting the first derivative when taking a second?

2.) In general, why do we say that D^{k}f:\mathbb{R}^{n^{k}}\rightarrow\mathbb{R} and not D^{k}f:\mathbb{R}^{n}\rightarrow\mathbb{R}?

Thanks in advance.
 
If f : \mathbb{R}^n \to \mathbb{R}, then the derivative of f at each point of \mathbb{R}^n is a linear map \lambda : \mathbb{R}^n \to \mathbb{R}. That is to say, the derivative of f, properly considered, is a map Df : \mathbb{R}^n \to L(\mathbb{R}^n, \mathbb{R}), where L(\mathbb{R}^n, \mathbb{R}) denotes the space of all linear maps \lambda : \mathbb{R}^n \to \mathbb{R}, which is just the dual of \mathbb{R}^n (and is thus isomorphic to \mathbb{R}^n). The second derivative of f is then a map D^2 f : \mathbb{R}^n \to L(\mathbb{R}^n, L(\mathbb{R}^n, \mathbb{R})) \cong L(\mathbb{R}^n, \mathbb{R}^n), where L(\mathbb{R}^n, \mathbb{R}^n) can be identified with the space of all n \times n matrices, which is isomorphic to \mathbb{R}^{n^2}. (The value of the second derivative at a point is usually called the Hessian matrix of f at that point.) Continuing in this vein, you can show that D^k f is a map from \mathbb{R}^n to a space isomorphic to \mathbb{R}^{n^k}, not a map from \mathbb{R}^{n^k} \to \mathbb{R} as you suggest in #2. The n^k counts the components of D^k f(\mathbf{p}), which is a k-linear map taking k direction vectors in \mathbb{R}^n to a number.
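To make the dimension count concrete, here is a small worked example. The function f(x, y) = x^2 y is my own illustrative choice (it does not come from the question), with n = 2:

```latex
\begin{align*}
f(x, y) &= x^2 y, \\
Df(x, y) &= \begin{pmatrix} 2xy & x^2 \end{pmatrix}
  \in L(\mathbb{R}^2, \mathbb{R}) \cong \mathbb{R}^2, \\
D^2 f(x, y) &= \begin{pmatrix} 2y & 2x \\ 2x & 0 \end{pmatrix}
  \in L(\mathbb{R}^2, L(\mathbb{R}^2, \mathbb{R})) \cong \mathbb{R}^{2^2}.
\end{align*}
```

Note that the entries of Df(x, y), such as 2xy, are not linear in (x, y), so Df is not a linear map even though each of its values is one. Differentiating once more gives 2^3 = 8 third-order partials, of which only the three orderings of \partial_x \partial_x \partial_y f = 2 are nonzero.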

Basically, what's going on here is that a derivative, properly defined, is a best linear approximation to a function. At a point \mathbf{p} \in \mathbb{R}^n, the derivative Df takes as its value the linear map \lambda : \mathbb{R}^n \to \mathbb{R} that most closely resembles the change in f near \mathbf{p}, i.e., f(\mathbf{p} + \mathbf{h}) \approx f(\mathbf{p}) + \lambda(\mathbf{h}) for small \mathbf{h}. Thus, Df is actually a map from \mathbb{R}^n into the space of all possible such approximations, and D^k f is a map from \mathbb{R}^n into a k-fold tensor power of the dual space of \mathbb{R}^n.

Your answer to #1 is thus that, while each value of Df is a linear map (and the derivative of a linear map \lambda is just \lambda again at every point), Df itself is not necessarily linear as a function of the location. This is why it is necessary to specify two arguments when evaluating Df: a location \mathbf{a}, and a direction \mathbf{h}. The location specifies a linear map, i.e., there is some linear map \lambda for which Df : \mathbf{a} \mapsto \lambda. The direction then serves as the argument for \lambda, and, in a slight abuse of notation, we usually write \lambda(\mathbf{h}) \equiv Df(\mathbf{a})(\mathbf{h}) or Df(\mathbf{a}, \mathbf{h}).
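If it helps to see the "location, then direction" idea in code, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the function f(x, y) = x^2 y, the names `Df`, `lam` and `remainder`, and the sample points are all hypothetical, and the partial derivatives are written out by hand rather than computed automatically.

```python
import math


def f(x, y):
    """Illustrative function f: R^2 -> R (an assumed example)."""
    return x * x * y


def Df(a):
    """Take a location a and return the linear map lambda = Df(a)."""
    ax, ay = a
    grad = (2 * ax * ay, ax * ax)  # hand-computed partials of f at a

    def lam(h):
        # The direction h is the argument of the linear map.
        hx, hy = h
        return grad[0] * hx + grad[1] * hy

    return lam


a = (1.0, 2.0)   # location
h = (0.5, -1.0)  # first direction
k = (2.0, 3.0)   # second direction
lam = Df(a)

# Df(a) is linear in the direction: lam(h + k) == lam(h) + lam(k).
linear_in_h = math.isclose(lam((h[0] + k[0], h[1] + k[1])), lam(h) + lam(k))

# Df is NOT linear in the location: Df(2a)(h) differs from 2 * Df(a)(h).
linear_in_a = math.isclose(Df((2 * a[0], 2 * a[1]))(h), 2 * lam(h))


def remainder(t):
    """Error of the linear approximation f(a + t*h) ~ f(a) + Df(a)(t*h)."""
    th = (t * h[0], t * h[1])
    return f(a[0] + th[0], a[1] + th[1]) - f(*a) - lam(th)


print(linear_in_h, linear_in_a)
print(remainder(1e-2), remainder(1e-3))
```

Here `Df(a)` returns a function, which mirrors the notation Df(\mathbf{a})(\mathbf{h}): the first call picks out \lambda, and the second call feeds it a direction. The remainder shrinks faster than t as t decreases, which is what "best linear approximation" means.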
 
