- #1
bwest121
I'm reading a textbook that says:
"The directional derivative in direction ##u## is the derivative of the function ##f( \mathbf x + \alpha \mathbf u)## with respect to ##\alpha##, evaluated at ##\alpha=0##. Using the chain rule, we can see that ##\frac {\partial}{\partial \alpha} f( \mathbf x + \alpha \mathbf u)## evaluates to ##\mathbf u^\intercal \nabla_\mathbf x f(\mathbf x)## when ##\alpha = 0##."
I understand that the directional derivative is the dot product of the gradient and the direction vector. However, I don't fully see how to get this result using the chain rule.
Here's my attempt:
$$\frac {\partial}{\partial \alpha} f(\mathbf x + \alpha\mathbf u) = \frac {\partial f}{\partial \alpha} \cdot \frac {\partial (\mathbf x + \alpha\mathbf u)}{\partial \alpha}$$
I know that ##\frac {\partial (\mathbf x + \alpha\mathbf u)}{\partial \alpha} = \mathbf u##, either by applying the limit definition of the derivative or by writing ##(\mathbf x + \alpha\mathbf u)## out in components and applying ##\frac{\partial}{\partial\alpha}## to each one, which eliminates the components of ##\mathbf x## and leaves only ##\mathbf u##. Thus, I'll be dotting ##\mathbf u## with ##\frac {\partial f}{\partial \alpha}##, i.e., ##\mathbf u^\intercal \frac {\partial f}{\partial \alpha}##. However, how does $$\frac {\partial f}{\partial \alpha} = \nabla_\mathbf x f(\mathbf x)?$$
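For what it's worth, the textbook's claim itself can be sanity-checked numerically. The snippet below is only a rough sketch: the test function `f`, its hand-computed gradient `grad_f`, and the particular point `x` and direction `u` are all made up for illustration and do not come from the textbook. It compares a central finite difference of ##f(\mathbf x + \alpha \mathbf u)## at ##\alpha = 0## against ##\mathbf u^\intercal \nabla_\mathbf x f(\mathbf x)##.

```python
# Hypothetical test function (an assumption, not from the textbook):
# f(x) = x0^2 + 3*x0*x1
def f(x):
    return x[0] ** 2 + 3.0 * x[0] * x[1]


def grad_f(x):
    # Gradient of the f above, worked out by hand:
    # df/dx0 = 2*x0 + 3*x1, df/dx1 = 3*x0
    return [2.0 * x[0] + 3.0 * x[1], 3.0 * x[0]]


def g(alpha, x, u):
    # The one-variable function g(alpha) = f(x + alpha*u)
    shifted = [xi + alpha * ui for xi, ui in zip(x, u)]
    return f(shifted)


def directional_derivative_fd(x, u, h=1e-6):
    # Central finite-difference estimate of dg/dalpha at alpha = 0
    return (g(h, x, u) - g(-h, x, u)) / (2.0 * h)


def directional_derivative_grad(x, u):
    # u^T grad f(x), i.e. the dot product of u with the gradient at x
    return sum(ui * gi for ui, gi in zip(u, grad_f(x)))


x = [1.0, 2.0]   # arbitrary evaluation point
u = [0.6, 0.8]   # arbitrary unit direction
fd = directional_derivative_fd(x, u)
exact = directional_derivative_grad(x, u)
print(fd, exact)
```

For this choice of `f`, `x`, and `u`, both numbers should agree to within finite-difference rounding error, which is at least consistent with the quoted formula, though of course it doesn't explain the chain-rule step.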