I How do I apply Chain Rule to get the desired result?

  1. Jan 15, 2017 #1
    I'm reading a textbook that says:

    "The directional derivative in direction ##u## is the derivative of the function ##f( \mathbf x + \alpha \mathbf u)## with respect to ##\alpha##, evaluated at ##\alpha=0##. Using the chain rule, we can see that ##\frac {\partial}{\partial \alpha} f( \mathbf x + \alpha \mathbf u)## evaluates to ##\mathbf u^\intercal \nabla_\mathbf x f(\mathbf x)## when ##\alpha = 0##."

    I understand that the directional derivative is the dot product of the gradient and the direction vector. However, I don't fully see how to get that result using the chain rule.

    Here's my attempt:
    $$\frac {\partial}{\partial \alpha} f(\mathbf x + \alpha\mathbf u) = \frac {\partial f}{\partial \alpha} \cdot \frac {\partial (\mathbf x + \alpha\mathbf u)}{\partial \alpha}$$

    I know that ##\frac {\partial (\mathbf x + \alpha\mathbf u)}{\partial \alpha} = \mathbf u##, either by applying the limit definition of the derivative or by writing out the components of ##\mathbf x + \alpha\mathbf u## and applying ##\frac{\partial}{\partial\alpha}## to each one, which eliminates the components of ##\mathbf x## and leaves only ##\mathbf u##. Thus, I'll be dotting ##\mathbf u## with ##\frac {\partial f}{\partial \alpha}##, i.e., ##\mathbf u^\intercal \frac {\partial f}{\partial \alpha}##. However, how does $$\frac {\partial f}{\partial \alpha} = \nabla_\mathbf x f(\mathbf x)?$$
     
  3. Jan 15, 2017 #2

    Orodruin

    Staff Emeritus
    Science Advisor
    Homework Helper
    Gold Member

    You have used the chain rule on the (wrong) form df/dx = (df/dx)(dy/dx). The chain rule is df/dx = (df/dy)(dy/dx). If you have several variables y you get a sum over the variables and the derivatives of f will be the partial derivatives.
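
    As a sketch of that sum for the function in post #1, take ##y_i = x_i + \alpha u_i## as the intermediate variables (the name ##y_i## is an assumption made here for illustration, not notation from the textbook):

    $$\frac{d}{d\alpha} f(\mathbf y(\alpha)) = \sum_i \frac{\partial f}{\partial y_i} \frac{d y_i}{d\alpha} = \sum_i \frac{\partial f}{\partial y_i} u_i$$

    At ##\alpha = 0## we have ##\mathbf y = \mathbf x##, so the partial derivatives are evaluated at ##\mathbf x## and the sum is ##\mathbf u^\intercal \nabla_\mathbf x f(\mathbf x)##.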
     
  4. Jan 15, 2017 #3

    PeroK

    Science Advisor
    Homework Helper
    Gold Member

    The main issue is your understanding of a partial derivative. A scalar function of a vector ##\mathbf{x}## is actually a function of three variables ##f(x, y, z)##. Now, for each of these variables, you can take the partial derivative wrt that variable, leaving the others fixed. The result is another function of the three variables. There are various notations for these functions, but normally it's ##f_x, f_y, f_z## or ##\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z}##.

    Both these notations create something of a problem (that is rarely discussed, I feel). They tie the definition of these partial derivative functions to a particular choice of variable. And, if you start changing variables in some way, it can be difficult to understand what the partial derivatives actually mean.

    There are two alternatives that make things clearer. With ##f## defined as a function of ##(x, y, z)##, then:

    ##f_x = \frac{\partial f}{\partial x} = ## "the partial derivative of ##f## wrt its first argument", which could be written ##f_1##, say.

    Now, if you defined a function ##g(x, y, z) = f(x^2, 2xy, x+z)##, then what is ##g_x##?

    The solution is to see the chain rule as:

    ##g_x = ## "the partial derivative of ##f## wrt its first argument times the partial derivative of its first argument with respect to ##x##" + "the partial derivative of ##f## wrt its second argument times the partial derivative of its second argument with respect to ##x##" + "the partial derivative of ##f## wrt its third argument times the partial derivative of its third argument with respect to ##x##".

    Now, in my new notation this is quite clear:

    ##g_x = f_1 2x + f_2 2y + f_3##

    Or, in the more usual notation this is:

    ##g_x = f_x 2x + f_y 2y + f_z##
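
    The formula for ##g_x## can be checked numerically. The sketch below assumes a made-up test function ##f(a, b, c) = ab + \sin c## (chosen only for illustration; any differentiable ##f## would do) and compares the chain-rule expression with a central finite difference. The helper names are hypothetical.

```python
import math

# Hypothetical test function f(a, b, c) and its partial derivatives with
# respect to its first, second and third arguments.
def f(a, b, c):
    return a * b + math.sin(c)

def f1(a, b, c):
    return b

def f2(a, b, c):
    return a

def f3(a, b, c):
    return math.cos(c)

def g(x, y, z):
    # g(x, y, z) = f(x^2, 2xy, x + z), as defined in the post
    return f(x**2, 2 * x * y, x + z)

def g_x_chain(x, y, z):
    # Chain rule: every partial of f is evaluated at the inner arguments
    # (x^2, 2xy, x + z) and multiplied by the x-derivative of that argument.
    args = (x**2, 2 * x * y, x + z)
    return f1(*args) * 2 * x + f2(*args) * 2 * y + f3(*args) * 1

def g_x_numeric(x, y, z, h=1e-6):
    # Central finite difference in x, with y and z held fixed
    return (g(x + h, y, z) - g(x - h, y, z)) / (2 * h)

x0, y0, z0 = 1.3, -0.7, 0.4
chain_value = g_x_chain(x0, y0, z0)
numeric_value = g_x_numeric(x0, y0, z0)
print(chain_value, numeric_value)
```

    The two printed values should agree to several decimal places. Note that the partial derivatives of ##f## are evaluated at ##(x^2, 2xy, x+z)##, not at ##(x, y, z)##.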

    I think this is worth remembering, as it can be very useful in clearing up any confusion over pd's.

    Finally, how I would analyse your example is, with ##\mathbf x## and ##\mathbf u## fixed, we define:

    ##g(\alpha) = f(\mathbf x + \alpha \mathbf u) = f(x + \alpha u_x, y + \alpha u_y, z + \alpha u_z)##

    And:

    ##\frac{dg}{d \alpha} = f_x u_x + f_y u_y + f_z u_z = \mathbf{ \nabla}f \cdot \mathbf{u}##

    And, as you want the derivative evaluated at ##\mathbf x = (x, y, z)## you take ##\alpha = 0##.
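
    As a quick sanity check of ##\frac{dg}{d\alpha} = \nabla f \cdot \mathbf u## at ##\alpha = 0##, here is a minimal sketch that assumes a made-up field ##f(x, y, z) = x^2 y + e^z## and an arbitrary point and direction. None of these choices come from the textbook.

```python
import math

def f(x, y, z):
    # Hypothetical scalar field used only for this check
    return x**2 * y + math.exp(z)

def grad_f(x, y, z):
    # Analytic gradient (f_x, f_y, f_z) of the field above
    return (2 * x * y, x**2, math.exp(z))

def g(alpha, x, u):
    # g(alpha) = f(x + alpha * u), with the point x and direction u held fixed
    return f(*(xi + alpha * ui for xi, ui in zip(x, u)))

def dg_dalpha_at_zero(x, u, h=1e-6):
    # Central finite difference of g with respect to alpha, at alpha = 0
    return (g(h, x, u) - g(-h, x, u)) / (2 * h)

x = (1.0, 2.0, 0.5)   # point at which the derivative is taken
u = (0.6, 0.0, 0.8)   # unit direction vector

dot_value = sum(gi * ui for gi, ui in zip(grad_f(*x), u))
numeric_value = dg_dalpha_at_zero(x, u)
print(dot_value, numeric_value)
```

    The gradient dotted with ##\mathbf u## and the finite-difference derivative of ##g## should match closely, which is the textbook's statement in numerical form.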
     
  5. Jan 15, 2017 #4

    FactChecker

    Science Advisor
    Gold Member

    The first factor on the right-hand side of your attempt is wrong. It is not $$\frac {\partial f}{\partial \alpha} $$
    The simple, one variable version is df/dx = df/du * du/dx. Notice the df/du rather than df/dx.
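
    For instance, taking ##f(u) = \sin u## with ##u = x^2## (an example chosen here purely for illustration):

    $$\frac{df}{dx} = \frac{df}{du} \cdot \frac{du}{dx} = \cos(u) \cdot 2x = 2x \cos(x^2)$$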
     
  6. Jan 15, 2017 #5
    Thank you so much. I very much appreciate you taking the time to provide such a thorough explanation. :)
     