Understanding of Higher-Order Derivatives

shaggymoods · Aug 22, 2009

Hey guys, so this may be a really silly question, but I'm trying to grasp a subtle point about higher-order derivatives of multivariable functions. In particular, suppose we have an infinitely differentiable function

f: \mathbb{R}^{n} \rightarrow \mathbb{R}

I know that the first derivative of this function is a linear map \lambda: \mathbb{R}^{n}\rightarrow\mathbb{R}. However, when we take the second derivative, that is, the derivative of \lambda, some questions arise for me:

1.) If we are taking this derivative when considering \lambda as a linear function, then we'd just get back \lambda, which isn't the case. So how are we interpreting the first derivative when taking a second?

2.) In general, why do we say that D^{k}f:\mathbb{R}^{n^{k}}\rightarrow\mathbb{R} and not D^{k}f:\mathbb{R}^{n}\rightarrow\mathbb{R}?

Thanks in advance.

VKint · Aug 22, 2009

If f : \mathbb{R}^n \to \mathbb{R}, then the derivative of f is a linear map \lambda : \mathbb{R}^n \to \mathbb{R} at each point in \mathbb{R}^n. That is to say, the derivative of f, properly considered, is a map Df : \mathbb{R}^n \to L(\mathbb{R}^n, \mathbb{R}), where L(\mathbb{R}^n, \mathbb{R}) denotes the space of all linear maps \lambda : \mathbb{R}^n \to \mathbb{R}, which is just the dual of \mathbb{R}^n (and is thus isomorphic to \mathbb{R}^n). The second derivative of f is then a map D^2 f : \mathbb{R}^n \to L(\mathbb{R}^n, L(\mathbb{R}^n, \mathbb{R})) \cong L(\mathbb{R}^n, \mathbb{R}^n), where L(\mathbb{R}^n, \mathbb{R}^n) can be identified with the space of all n \times n matrices, and is isomorphic to \mathbb{R}^{n^2}. (The value of the second derivative at a point is usually called the Hessian matrix of f at that point.) Continuing in this vein, you can show that D^k f is, up to isomorphism, a map from \mathbb{R}^n to \mathbb{R}^{n^k}, not a map from \mathbb{R}^{n^k} \to \mathbb{R} as you suggest in #2. The statement in #2 likely comes from looking at the value D^k f(\mathbf{a}) at a fixed point \mathbf{a}: that value can be viewed as a k-linear map taking k vectors in \mathbb{R}^n to \mathbb{R}, so its domain is (\mathbb{R}^n)^k, i.e. k copies of \mathbb{R}^n, rather than \mathbb{R}^{n^k}.
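To make the dimension count concrete, here is a small worked example. The function f(x, y) = x^2 y is my own choice for illustration (it does not come from the question), so n = 2 throughout.

```latex
\begin{align*}
f(x, y) &= x^2 y, \\
Df(x, y) &= \begin{pmatrix} 2xy & x^2 \end{pmatrix}
  \in L(\mathbb{R}^2, \mathbb{R}) \cong \mathbb{R}^{2}, \\
D^2 f(x, y) &= \begin{pmatrix} 2y & 2x \\ 2x & 0 \end{pmatrix}
  \in L(\mathbb{R}^2, L(\mathbb{R}^2, \mathbb{R})) \cong \mathbb{R}^{2^2}.
\end{align*}
```

The entries of Df(x, y) depend nonlinearly on (x, y), so Df is not a linear map even though each of its values is one. Going one step further, D^3 f(x, y) has 2^3 = 8 entries; for this f the three entries corresponding to two x-derivatives and one y-derivative equal 2, and the rest vanish.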

Basically, what's going on here is that a derivative, properly defined, is a best linear approximation to a function. Thus, at some point \mathbf{p} \in \mathbb{R}^n, the derivative Df takes as its value the linear map \lambda : \mathbb{R}^n \to \mathbb{R} that best approximates the change in f near \mathbf{p}, in the sense that f(\mathbf{p} + \mathbf{h}) \approx f(\mathbf{p}) + \lambda(\mathbf{h}) for small \mathbf{h}. So Df is actually a map from \mathbb{R}^n into the space of all possible such approximations, and D^k f is a map from \mathbb{R}^n into the k-fold tensor power of the dual space of \mathbb{R}^n.

The answer to #1 is thus that, while each value Df(\mathbf{a}) is a linear map and so has a trivial derivative (a linear map is its own derivative at every point), the map Df : \mathbf{a} \mapsto Df(\mathbf{a}) is itself not necessarily linear. The second derivative differentiates Df with respect to the location \mathbf{a}; it does not differentiate \lambda with respect to its argument. This is why it is necessary to specify two arguments when evaluating Df: a location \mathbf{a}, and a direction \mathbf{h}. The location specifies a linear map, i.e., there is some linear map \lambda for which Df : \mathbf{a} \mapsto \lambda. The direction then serves as the argument for \lambda, and, in a slight abuse of notation, we usually write \lambda(\mathbf{h}) \equiv Df(\mathbf{a})(\mathbf{h}) or Df(\mathbf{a}, \mathbf{h}).
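If it helps to see the location/direction distinction numerically, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the example function f(x, y) = x^2 y, the helper names `Df`, `add` and `remainder`, and the sample points. It checks that Df(a)(h) is linear in the direction h, is not linear in the location a, and that the approximation error shrinks faster than the step size.

```python
# Hypothetical example function f(x, y) = x**2 * y, chosen only for illustration.
def f(p):
    x, y = p
    return x * x * y


def Df(a):
    """Return the linear map lambda = Df(a) as a function of a direction h."""
    x, y = a
    grad = (2 * x * y, x * x)  # partial derivatives of f evaluated at a

    def lam(h):
        return grad[0] * h[0] + grad[1] * h[1]

    return lam


def add(u, v):
    """Componentwise sum of two vectors in R^2."""
    return (u[0] + v[0], u[1] + v[1])


a, b = (1.0, 2.0), (3.0, -1.0)   # two locations
h, k = (0.5, 0.25), (-1.0, 4.0)  # two directions

# Linear in the direction: Df(a)(h + k) equals Df(a)(h) + Df(a)(k).
linear_in_direction = Df(a)(add(h, k)) == Df(a)(h) + Df(a)(k)

# Not linear in the location: Df(a + b)(h) differs from Df(a)(h) + Df(b)(h).
linear_in_location = Df(add(a, b))(h) == Df(a)(h) + Df(b)(h)


def remainder(t):
    """Error of the linear approximation at a along h, divided by the step t."""
    step = (t * h[0], t * h[1])
    return abs(f(add(a, step)) - f(a) - Df(a)(step)) / t


print(linear_in_direction, linear_in_location)
print(remainder(0.1), remainder(0.01))  # the ratio shrinks as t shrinks
```

The design choice here is to have `Df(a)` return a function rather than a number, which mirrors the type Df : \mathbb{R}^n \to L(\mathbb{R}^n, \mathbb{R}): fixing the location produces a linear map, and only then is a direction supplied.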
