
Chain Rule - intuitive Proof

  1. Oct 13, 2009 #1
    Chain Rule - intuitive "Proof"

    Suppose y = f(u) and u = g(x). Then dy/dx = dy/du * du/dx.

    An intuitive "proof" of the chain rule has this step: dy/dx = [tex]\lim_{\Delta x \to 0} \frac {\Delta y}{\Delta x}[/tex] = [tex]\lim_{\Delta x \to 0} \left( \frac {\Delta y}{\Delta u} \cdot \frac {\Delta u}{\Delta x} \right)[/tex]

    My question is, why multiply by [tex]\frac {\Delta u}{\Delta u}[/tex]? I know that mathematically it works because [tex]\frac {\Delta u}{\Delta u}[/tex] = 1 and multiplying by 1 doesn't change the value of the expression, but I'm looking for a philosophical reason. I found a quote by the user mathwonk in an old thread, which says: "it seems plausible that the best linear approximation to a composite function, is obtained by composing the best approximations to the component functions. On the other hand for a linear function, composing means simply multiplying." Can someone expand on this?...
  3. Oct 13, 2009 #2


    Science Advisor
    Homework Helper
    Gold Member

    Re: Chain Rule - intuitive "Proof"

    A change Δx in x causes a change Δu in u which causes a change Δy in y. The derivatives are limits of difference quotients so you would expect Δy/Δu [itex]\rightarrow[/itex] dy/du and Δu/Δx [itex]\rightarrow[/itex] du/dx. Multiplying the numerator and denominator by Δu is a convenient and seemingly appropriate way to involve the intermediate variable. That's about as philosophical as I get. And I gather that you do know that method isn't really a proof.
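
    The factorization above can be checked numerically. This is only a sketch under assumptions: the functions g(x) = x^2 and f(u) = sin(u), the point x = 1, and the helper name quotients are illustrative choices, not anything from the thread.

```python
import math

def g(x):
    # Inner function u = g(x); a hypothetical choice for illustration.
    return x * x

def f(u):
    # Outer function y = f(u); also a hypothetical choice.
    return math.sin(u)

def quotients(x, dx):
    """Return the difference quotients (dy/dx, dy/du, du/dx) at x for step dx."""
    du = g(x + dx) - g(x)          # change in u caused by the change dx
    dy = f(g(x + dx)) - f(g(x))    # change in y caused by that change in u
    return dy / dx, dy / du, du / dx

x = 1.0
exact = math.cos(g(x)) * 2 * x     # f'(g(x)) * g'(x), computed by hand
for dx in (1e-1, 1e-2, 1e-3, 1e-4):
    dy_dx, dy_du, du_dx = quotients(x, dx)
    # The second and third columns agree up to rounding for every dx;
    # both approach the fourth column as dx shrinks.
    print(dx, dy_dx, dy_du * du_dx, exact)
```

    Note that the sketch divides by Δu, so it only works when Δu is nonzero. For a function like a constant g, Δu is always zero and the factorization breaks down, which is one reason the argument is a heuristic and not a proof.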
  4. Oct 13, 2009 #3
    Re: Chain Rule - intuitive "Proof"

    But why is it appropriate?

    Yes, that's why I put proof in quotation marks.
  5. Oct 13, 2009 #4


    Science Advisor
    Homework Helper
    Gold Member

    Re: Chain Rule - intuitive "Proof"

    Because I need a Δy/Δu and Δu/Δx in the equation and multiplying by Δu/Δu doesn't change the equation.
  6. Oct 13, 2009 #5


    Science Advisor

    Re: Chain Rule - intuitive "Proof"

    Surely you've seen that done before in mathematics? When you get common denominators to add fractions, you multiply numerator and denominator by the same thing; that's exactly the same idea. When you complete the square in a quadratic, you add and subtract the same thing. Almost the same idea.
  7. Oct 13, 2009 #6
    Re: Chain Rule - intuitive "Proof"

    Interesting quote. I'm glad I came across this. (Thanks mathwonk!)

    I'm not sure what more can be said. But for my own sake, and maybe yours as well, I'll do a little explaining to think the idea through.

    Differentiable functions are locally linear. That means: take any differentiable function f and a point p. There exist constants a and b such that f(x) ≈ ax + b as long as x is pretty damn close to p, with an error that shrinks faster than |x - p| does. The derivative of f at point p is simply a. (That's almost TOO convenient...).
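
    Here is a small numerical sketch of what "locally linear" means. The function exp, the point p = 0.5, and the helper names tangent_line and relative_error are assumptions chosen for illustration.

```python
import math

def tangent_line(func, dfunc, point):
    """Return (a, b) so that a*x + b is the tangent line of func at point.

    dfunc is the derivative of func, supplied by hand.
    """
    a = dfunc(point)
    b = func(point) - a * point
    return a, b

p = 0.5
a, b = tangent_line(math.exp, math.exp, p)  # exp is its own derivative

def relative_error(h):
    """Error of the tangent line at p + h, divided by the step h."""
    x = p + h
    return abs(math.exp(x) - (a * x + b)) / h

for h in (1e-1, 1e-2, 1e-3):
    # The ratio error / h shrinks along with h, which is the precise
    # sense in which the line a*x + b matches exp near p.
    print(h, relative_error(h))
```

    The point of dividing by h is that any line through (p, f(p)) has an error that goes to zero; only the tangent line has an error that goes to zero faster than h.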

    Combine this idea with the chain rule. Let f and g be differentiable functions. We note that f . g is differentiable too. We pick a point p. We want to find the derivative of f . g at p. We use our rule above. Since f . g is differentiable, it is locally linear. Which means, (f . g)(x) = f(g(x)) ≈ a x + b for some a and b, as long as x is pretty damn close to p. Our goal is to determine the value of a.

    Well, since f and g are both differentiable, we know that they too are locally linear. So let's will some more variables into existence, using the same rule:

    g(x) ≈ a' x + b' for all x's pretty damn close to p.

    (The primes are not differentiation. The a's are the slopes, that is, the derivatives at the relevant points, and the b's are the constant terms).

    So the derivative of g at point p is a'.

    Next, we use our rule on f. But it's a little different this time! We're not taking the derivative of f at p. Instead, we're taking the derivative of f at the point g(p)! But a point is a point, and fixing g(p), we use our rule to conclude that

    f(x) ≈ a'' x + b'' for all x's that are pretty damn close to g(p).

    So now we have a' (the derivative of g at point p) and a'' (the derivative of f at point g(p)). Let's compose f and g!

    (f . g)(x) = f(g(x))

    If x is close to p, then we can expand g(x) as a' x + b':

    (f . g)(x) ≈ f(a' x + b') for all x's pretty damn close to p.

    Now, since g is continuous, g(x) is pretty damn close to g(p) whenever x is close to p, so we can expand f as well:

    (f . g)(x) ≈ a'' (a' x + b') + b'' = (a'' a') x + (a'' b' + b'') for all x pretty damn close to p.

    And we draw our conclusion. The derivative of f . g at point p is simply the coefficient of the first-order term, a'' a'. Exciting. What does that mean? Pulling from the definitions above, a'' is the derivative of f at point g(p), and a' is the derivative of g at point p. That is exactly what the chain rule says.
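
    The same substitution can be carried out numerically. This is a sketch, not a proof: g(x) = x^2, f(u) = sin(u), p = 1, and the names tangent_line, a1, b1, a2, b2 are hypothetical stand-ins for a', b', a'', b''.

```python
import math

def tangent_line(func, dfunc, point):
    """Return (slope, intercept) of the tangent line of func at point."""
    slope = dfunc(point)
    return slope, func(point) - slope * point

# Hypothetical example functions with their hand-computed derivatives.
g, dg = (lambda x: x * x), (lambda x: 2 * x)
f, df = math.sin, math.cos
p = 1.0

a1, b1 = tangent_line(g, dg, p)      # a', b': g near p
a2, b2 = tangent_line(f, df, g(p))   # a'', b'': f near g(p), not near p

# Substitute one line into the other:
# a''(a' x + b') + b'' = (a'' a') x + (a'' b' + b'')
slope = a2 * a1
intercept = a2 * b1 + b2

# Independent check: central difference quotient of the composite at p.
h = 1e-6
numeric = (f(g(p + h)) - f(g(p - h))) / (2 * h)
print(slope, numeric)
```

    The two printed numbers should agree to several decimal places, and the composed line also passes through the point (p, f(g(p))), so it really is the tangent line of the composite and not just a line with the right slope.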

    OK. That wasn't as simple as I hoped. But I hope you get the picture a little better when you replace the f'(g(x)) clutter with constants. I think, in particular, you can see where the multiplication comes in. It's linear. You substitute. You shave off a few bits, toss them into your constant, and multiply a few coefficients.
  8. Oct 13, 2009 #7
    Re: Chain Rule - intuitive "Proof"
