Understanding of Higher-Order Derivatives

  • Context: Graduate
  • Thread starter: shaggymoods
  • Tags: Derivatives
SUMMARY

This discussion focuses on the interpretation of higher-order derivatives of multivariable functions, specifically their mapping properties. The first derivative of a function f: ℝⁿ → ℝ at a point is a linear map λ: ℝⁿ → ℝ, so Df is a map from ℝⁿ to L(ℝⁿ, ℝ), and the second derivative D²f is a map from ℝⁿ to L(ℝⁿ, L(ℝⁿ, ℝ)), which is isomorphic to the space of n × n matrices; its value at a point is the Hessian matrix. The confusion arises in understanding why Dᵏf maps ℝⁿ into ℝ^(nᵏ), rather than mapping ℝ^(nᵏ) into ℝ. The key takeaway is that derivatives are best linear approximations, so evaluating one requires two arguments: a location and a direction.

PREREQUISITES
  • Understanding of multivariable calculus
  • Familiarity with linear maps and their properties
  • Knowledge of tensor products and dual spaces
  • Concept of Hessian matrices in optimization
NEXT STEPS
  • Study the properties of Hessian matrices in multivariable optimization
  • Learn about the implications of higher-order derivatives in Taylor series expansions
  • Explore the concept of Fréchet derivatives in functional analysis
  • Investigate applications of higher-order derivatives in machine learning algorithms
USEFUL FOR

Mathematicians, students of multivariable calculus, and professionals in fields requiring optimization techniques will benefit from this discussion.

shaggymoods
Hey guys, so this may be a really silly question, but I'm trying to grasp a subtle point about higher-order derivatives of multivariable functions. In particular, suppose we have an infinitely differentiable function

[tex]f: \mathbb{R}^{n} \rightarrow \mathbb{R}[/tex]

I know that the first derivative of this function is a linear map [tex]\lambda: \mathbb{R}^{n}\rightarrow\mathbb{R}[/tex]. However, when we take the second derivative of [tex]f[/tex], some questions arise for me:

1.) If we are taking this derivative when considering [tex]\lambda[/tex] as a linear function, then we'd just get back [tex]\lambda[/tex], which isn't the case. So how are we interpreting the first derivative when taking a second?

2.) In general, why do we say that [tex]D^{k}f:\mathbb{R}^{n^{k}}\rightarrow\mathbb{R}[/tex] and not [tex]D^{k}f:\mathbb{R}^{n}\rightarrow\mathbb{R}[/tex]?

Thanks in advance.
 
If [tex]f : \mathbb{R}^n \to \mathbb{R}[/tex], then the derivative of [tex]f[/tex] is a linear map [tex]\lambda : \mathbb{R}^n \to \mathbb{R}[/tex] at each point in [tex]\mathbb{R}^n[/tex]. That is to say, the derivative of [tex]f[/tex], properly considered, is a map [tex]Df : \mathbb{R}^n \to L(\mathbb{R}^n, \mathbb{R})[/tex], where [tex]L(\mathbb{R}^n, \mathbb{R})[/tex] denotes the space of all linear maps [tex]\lambda : \mathbb{R}^n \to \mathbb{R}[/tex], which is just the dual of [tex]\mathbb{R}^n[/tex] (and is thus isomorphic to [tex]\mathbb{R}^n[/tex]). The second derivative of [tex]f[/tex] is then a map [tex]D^2 f : \mathbb{R}^n \to L(\mathbb{R}^n, L(\mathbb{R}^n, \mathbb{R})) \cong L(\mathbb{R}^n, \mathbb{R}^n)[/tex], where [tex]L(\mathbb{R}^n, \mathbb{R}^n)[/tex] is the space of all [tex]n \times n[/tex] matrices, and is isomorphic to [tex]\mathbb{R}^{n^2}[/tex]. (The output of the second derivative is usually called the Hessian matrix of [tex]f[/tex].) Continuing in this vein, you can show that [tex]D^k f[/tex] is a map from [tex]\mathbb{R}^n[/tex] to [tex]\mathbb{R}^{n^k}[/tex], not a map from [tex]\mathbb{R}^{n^k} \to \mathbb{R}[/tex] as you suggest in #2.
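To make this concrete, here is a minimal numerical sketch of the chain Df : ℝⁿ → L(ℝⁿ, ℝ) and D²f : ℝⁿ → L(ℝⁿ, ℝⁿ), using central finite differences on a hypothetical example function f(x) = x₀²x₁ + sin(x₂) (the function and step sizes are illustrative choices, not from the thread):

```python
import numpy as np

# Hypothetical smooth example function f: R^3 -> R.
def f(x):
    return x[0]**2 * x[1] + np.sin(x[2])

def grad(f, a, eps=1e-6):
    """Central-difference approximation of Df(a), an element of L(R^n, R),
    represented by its n components (the gradient at a)."""
    n = len(a)
    g = np.zeros(n)
    for i in range(n):
        e = np.zeros(n)
        e[i] = eps
        g[i] = (f(a + e) - f(a - e)) / (2 * eps)
    return g

def hessian(f, a, eps=1e-4):
    """Central-difference approximation of D^2 f(a): differentiating the
    gradient map a -> Df(a) yields an n x n matrix, the Hessian at a."""
    n = len(a)
    H = np.zeros((n, n))
    for i in range(n):
        e = np.zeros(n)
        e[i] = eps
        H[i] = (grad(f, a + e) - grad(f, a - e)) / (2 * eps)
    return H

a = np.array([1.0, 2.0, 0.5])
g = grad(f, a)      # Df(a): one linear functional on R^3, stored as 3 numbers
H = hessian(f, a)   # D^2 f(a): the Hessian, a symmetric 3 x 3 matrix
```

Note that the *value* of Df at each point a is a single linear map, but Df itself varies nonlinearly with a; differentiating that dependence is exactly what produces the Hessian.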

Basically, what's going on here is that a derivative, properly defined, is a best linear approximation to a function. Thus, at some point [tex]\mathbf{p} \in \mathbb{R}^n[/tex], the derivative [tex]Df[/tex] takes the value of the linear map [tex]\lambda : \mathbb{R}^n \to \mathbb{R}[/tex] which most closely resembles [tex]f[/tex] near [tex]\mathbf{p}[/tex]. Thus, [tex]Df[/tex] is actually a map from [tex]\mathbb{R}^n[/tex] into the space of all possible such approximations, and [tex]D^k f[/tex] is a map from [tex]\mathbb{R}^n[/tex] into some higher tensor product of [tex]\mathbb{R}^n[/tex] and its dual space. Your answer to #1 is thus that, while elements of the range of [tex]Df[/tex] must be linear maps and have trivial derivatives, [tex]Df[/tex] itself is not necessarily linear. This is why it is necessary to specify two arguments when evaluating [tex]Df[/tex]: a location [tex]\mathbf{a}[/tex], and a direction [tex]\mathbf{h}[/tex]. The location specifies a linear map, i.e., there is some linear map [tex]\lambda[/tex] for which [tex]Df : \mathbf{a} \mapsto \lambda[/tex]. The direction then serves as the argument for [tex]\lambda[/tex], and, in a slight abuse of notation, we usually write [tex]\lambda(\mathbf{h}) \equiv Df(\mathbf{a})(\mathbf{h})[/tex] or [tex]Df(\mathbf{a}, \mathbf{h})[/tex].
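The two-argument evaluation Df(a)(h) can be sketched numerically as well: the location a selects a linear map, and the direction h is that map's argument. The example function below is a hypothetical choice for illustration; the key point is that the result is linear in h even though Df is not linear in a.

```python
import numpy as np

# Hypothetical smooth example function f: R^3 -> R.
def f(x):
    return x[0]**2 * x[1] + np.sin(x[2])

def Df(f, a, h, eps=1e-6):
    """Directional derivative Df(a)(h), approximated by a central
    difference of f along the direction h at the location a."""
    return (f(a + eps * h) - f(a - eps * h)) / (2 * eps)

a = np.array([1.0, 2.0, 0.5])   # location: picks out the linear map lambda = Df(a)
h = np.array([1.0, -1.0, 2.0])  # direction: the argument fed to lambda

v1 = Df(f, a, h)       # lambda(h)
v2 = Df(f, a, 2 * h)   # lambda(2h) = 2 * lambda(h), by linearity in the direction
```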
 
