Chain Rule of Differentiation

  • Thread starter Drakkith
  • #1
Drakkith
Staff Emeritus
Science Advisor

Main Question or Discussion Point

I'm in Calc 1 and the Chain Rule is giving me one hell of a rough time. I've spent about 10-12 hours over the last few days just on the homework problems in this one section (only getting about 15-20 problems done) and still feel like I barely understand it. Does anyone have any tips, tricks, hints, or suggestions that you found helpful when learning the chain rule?

I understand that the chain rule is used when you have composite functions, but there's something about the whole process that makes it super difficult for some reason. Perhaps it's trying to keep track of each function as I work? I've tried using letters in place of the functions when I decompose them, but it's almost equally confusing. Writing out the entire function as I go seems to work, but it takes forever and is prone to sign errors and other little mistakes. Any suggestions with this?

And if someone tells me to just do more problems I'm going to wrap the chain rule around their neck and pull it tight!! :-p

Thanks!
 

Answers and Replies

  • #2
Consider the functions ##f(x,y,v)##, ##x(u,v)##, ##y(u,v,t)## and ##t(v)##. Now let's calculate ##\frac{df}{dv}##!
The thing to keep in mind is that you should have a map of all the variables and their relationships, and then identify all the paths that go from f to v. One path is really straightforward: f explicitly depends on v, so we should have a ##\frac{\partial f}{\partial v}## term. The next path is through x, because x depends on v. The first step, from f to x, is traversed by ##\frac{\partial f}{\partial x}##. Then we go from x to v by multiplying that derivative by ##\frac{\partial x}{\partial v}##. The next two paths are through y. One goes from f to y to v and the other from f to y to t to v, which means we should have ##\frac{\partial f}{\partial y} \frac{\partial y}{\partial v}## and ##\frac{\partial f}{\partial y}\frac{\partial y}{\partial t}\frac{dt}{dv}##. The derivative ##\frac{df}{dv}## is the sum of the contributions from all of these paths.
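As a sketch of where the path-by-path description leads, the four contributions can be collected into a single formula. This assumes that u is held fixed (or does not depend on v), which the post does not state explicitly:

```latex
\frac{df}{dv}
  = \frac{\partial f}{\partial v}
  + \frac{\partial f}{\partial x}\frac{\partial x}{\partial v}
  + \frac{\partial f}{\partial y}\frac{\partial y}{\partial v}
  + \frac{\partial f}{\partial y}\frac{\partial y}{\partial t}\frac{dt}{dv}
```

Each term corresponds to one path from f to v: the direct one, the one through x, and the two through y.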
 
  • #3
Drakkith
Staff Emeritus
Science Advisor
I'm sorry, Shyan, but you lost me in your first sentence. I've never seen a function with multiple variables inside it and I can't follow the rest of your post.
 
  • #4
Drakkith said:
I'm sorry, Shyan, but you lost me in your first sentence. I've never seen a function with multiple variables inside it and I can't follow the rest of your post.
I thought you were asking about the full-grown chain rule!
So I don't understand what your problem with the chain rule is. If you give an example and explain why it's hard for you to do the example, I can help better.
 
  • #5
The chain rule, for functions of one variable, is all about differentiating function compositions. If you have a function that doesn't involve composition, such as ##f(x) = x^3##, differentiation is pretty straightforward: ##f'(x) = \frac d {dx} x^3 = 3x^2##.

Now, if you have a function that involves a composition, such as ##g(x) = (x^2 + x)^3##, you could think of this as ##g(x) = u^3##, where ##u(x) = x^2 + x##. Here ##g'(x) = \frac d {dx} u^3 = \frac d {du} u^3 * \frac{du}{dx} = 3u^2 * (2x + 1) = 3(x^2 + x)^2 * (2x + 1)##.

The key to understanding the notation above is realizing that ##\frac {du^3}{dx} = \frac{du^3}{du} * \frac{du}{dx}##. In essence, we have multiplied by ##\frac{du}{du}##. Although you've probably been told not to regard ##\frac{dy}{dx}## and other derivatives as fractions, it does no harm to do so here, and has the added advantage of producing the correct answer.

If you expand ##(x^2 + x)^3## and then differentiate, you'll get exactly the same thing, provided that you factor the result of this alternate technique.
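One way to convince yourself of the claim about expanding first is to check it numerically. The sketch below is only an illustration: the function names and the step size `h` are made up for this example, and the expansion ##(x^2 + x)^3 = x^6 + 3x^5 + 3x^4 + x^3## was worked out by hand.

```python
def g(x):
    """g(x) = (x^2 + x)^3, the composite function from the post."""
    return (x**2 + x) ** 3


def g_prime_chain(x):
    """Chain rule: d/du(u^3) * du/dx with u = x^2 + x."""
    u = x**2 + x
    return 3 * u**2 * (2 * x + 1)


def g_prime_expanded(x):
    """Expand first to x^6 + 3x^5 + 3x^4 + x^3, then apply the power rule."""
    return 6 * x**5 + 15 * x**4 + 12 * x**3 + 3 * x**2


def central_difference(f, x, h=1e-6):
    """Numerical estimate of f'(x); h is an arbitrary small step."""
    return (f(x + h) - f(x - h)) / (2 * h)


if __name__ == "__main__":
    # The three columns after x should agree up to rounding error.
    for x in (-1.5, 0.0, 1.0, 2.0):
        print(x, g_prime_chain(x), g_prime_expanded(x), central_difference(g, x))
```

The chain-rule answer and the expanded answer are the same polynomial written two ways, so they match exactly; the finite-difference column is only approximate.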
 
  • #6
Drakkith
Staff Emeritus
Science Advisor
Shyan said:
If you give an example and explain why it's hard for you to do the example, I can help better.
If I knew exactly what the problem was, I'd tell you. It just feels like crawling through mud when I do these problems.

Now, if you have a function that involves a composition,
Okay, how do you know if a function is composite? I know 2x+2 is a composite function, but I don't feel like I could explain to someone why. Is 2x a composite function? Is (2x)/(1+x)?
 
  • #7
Drakkith said:
Okay, how do you know if a function is composite? I know 2x+2 is a composite function, but I don't feel like I could explain to someone why.
Let ##g(x) = x + 2## and ##f(x) = 2x##. Then ##2x + 2 = g(f(x))##, which is also written as ##(g \circ f)(x)##.
Drakkith said:
Is 2x a composite function?
You could write it as a composite function, but it would be trivial, with one of the functions being g(x) = x.
Drakkith said:
Is (2x)/(1+x)?
You could write it as a composite function, but there wouldn't be much point in doing so. f(x) = 2x, g(x) = ##\frac x {1 + x}##. Then ##\frac{2x} {1 + x} = f(g(x))##. It's more important to recognize this one as a rational function: the quotient of two polynomials. To differentiate this function, it would be silly to use the chain rule. The quotient rule would be most applicable here. However, if you wrote ##\frac{2x} {1 + x}## as ##(2x) * (1 + x)^{-1}##, then you would probably be thinking of using the product rule, followed by the chain rule (to get the derivative of ##(1 + x)^{-1}##).
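If it helps to see composition as something concrete, here is a small sketch of the two decompositions above. The helper `compose` and the function names are invented for this illustration; nothing here comes from a library.

```python
def compose(outer, inner):
    """Return the function x -> outer(inner(x))."""
    return lambda x: outer(inner(x))


def double(x):
    """f(x) = 2x"""
    return 2 * x


def add_two(x):
    """g(x) = x + 2"""
    return x + 2


def ratio(x):
    """g(x) = x / (1 + x), undefined at x = -1"""
    return x / (1 + x)


# 2x + 2 written as g(f(x)): double first, then add two.
two_x_plus_two = compose(add_two, double)

# 2x / (1 + x) written as f(g(x)): form the ratio first, then double it.
rational = compose(double, ratio)

if __name__ == "__main__":
    print(two_x_plus_two(3))  # same value as 2*3 + 2
    print(rational(3))        # same value as (2*3) / (1 + 3)
```

The order of the arguments to `compose` matters: the inner function is the one applied to x first.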
 
  • #8
Drakkith
Staff Emeritus
Science Advisor
Okay, now I'm confused. I thought you had to use the chain rule if the function is composite...
 
  • #9
Drakkith said:
Okay, how do you know if a function is composite? I know 2x+2 is a composite function, but I don't feel like I could explain to someone why. Is 2x a composite function? Is (2x)/(1+x)?
OK. Let's start from the beginning. The definition of a derivative is:
## \displaystyle f'(x)=\lim_{\delta \to 0} \frac{f(x+\delta)-f(x)}{\delta} ##
This definition gives us x'=1. This is a direct computation, no tricks involved. Other examples are ##(e^x)'=e^x##, ##(\sin x)'=\cos x##, etc.
You can do all differentiations like this, directly. But it'll be really painful. So people figured out some tricks:
##(uv)'=u'v+uv'##, ##(\frac u v)'=\frac{u'v-uv'}{v^2}##, etc.
So differentiating is now a procedure with different layers of abstraction, with the last layer being the direct derivatives with no abstraction. I can say that each layer of abstraction is a composition, and so any time you use a rule of the above kind, you're dealing with a composite function. In fact, in differentiating something like ##y=\tan(\frac{2x}{x+1})##, we do ##\frac{dy}{dx}=\frac{dy}{du} \frac{du}{dv}\frac{dv}{dx}## where ##u=\frac{2x}{x+1}## and ##v=x##.
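To get a feel for the limit definition quoted above, you can evaluate the difference quotient for smaller and smaller ##\delta## and watch it approach the known derivative. This is only a rough numerical sketch; the function name `difference_quotient` and the chosen values of delta are assumptions made for the example.

```python
import math


def difference_quotient(f, x, delta):
    """(f(x + delta) - f(x)) / delta, the expression inside the limit."""
    return (f(x + delta) - f(x)) / delta


if __name__ == "__main__":
    # For f = sin at x = 1 the quotient should approach cos(1) as delta shrinks.
    for delta in (1e-1, 1e-3, 1e-5):
        print(delta, difference_quotient(math.sin, 1.0, delta), math.cos(1.0))
```

For f(x) = x the quotient equals 1 for any nonzero delta (up to rounding), which is the x'=1 result mentioned in the post.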
 
  • #10
Drakkith
Staff Emeritus
Science Advisor
Okay, that all seems to make sense, Shyan. Now my question is when do you use the chain rule? Do you use it only when the other methods don't work?
 
  • #11
Drakkith said:
Okay, that all seems to make sense, Shyan. Now my question is when do you use the chain rule? Do you use it only when the other methods don't work?
We always use it, because the only alternative is using the definition of the derivative directly, which is painful. The point is that we don't always mention that we used it!
Let's go back to my example, ##y=\tan(\frac{2x}{x+1})##. If I want to take the derivative, I write something like:
##y'=\sec^2(\frac{2x}{x+1}) \frac{2(x+1)-2x}{(x+1)^2}##. Yeah, I didn't write ##u=\frac{2x}{x+1}## and v=x, but that doesn't mean I didn't use the chain rule. I first took the derivative of the tangent with respect to its argument, whatever that argument happens to be, and then multiplied by the derivative of the argument w.r.t. x. It's the chain rule, whether I mention it or not.
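The two-step recipe for the tangent example can be written out and checked against a numerical derivative. The sketch below uses only the standard `math` module; the names `y`, `y_prime`, and `central_difference`, and the step size, are choices made for this illustration.

```python
import math


def y(x):
    """y = tan(2x / (x + 1)), the example from the post."""
    return math.tan(2 * x / (x + 1))


def y_prime(x):
    """Chain rule: sec^2(u) times du/dx, with u = 2x / (x + 1)."""
    u = 2 * x / (x + 1)
    sec_squared = 1 / math.cos(u) ** 2                     # derivative of tan w.r.t. its argument
    du_dx = (2 * (x + 1) - 2 * x) / (x + 1) ** 2           # quotient rule on the argument
    return sec_squared * du_dx


def central_difference(f, x, h=1e-6):
    """Numerical estimate of f'(x); h is an arbitrary small step."""
    return (f(x + h) - f(x - h)) / (2 * h)


if __name__ == "__main__":
    for x in (0.0, 0.25, 1.0):
        print(x, y_prime(x), central_difference(y, x))
```

The test points stay away from x = -1 and from places where the tangent blows up, since neither the formula nor the numerical estimate is meaningful there.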
 
  • #12
Drakkith
Staff Emeritus
Science Advisor
No no, I mean that if I have a function ##F(x) = 2x^2##, that's technically a composite function consisting of ##G(U) = U^2## and U(x) = 2x, right? But we don't use the chain rule here, we use the power rule and just multiply ##2x^2## by 2, reduce the exponent by 1, and come up with F'(x) = 4x. So why didn't we use the chain rule if the function was composite?
 
  • #13
Drakkith said:
No no, I mean that if I have a function ##F(x) = 2x^2##, that's technically a composite function consisting of ##G(U) = U^2## and U(x) = 2x, right? But we don't use the chain rule here, we use the power rule and just multiply ##2x^2## by 2, reduce the exponent by 1, and come up with F'(x) = 4x. So why didn't we use the chain rule if the function was composite?
Yeah, you're right, but this is one of those cases where the function being differentiated is an elementary function and there is no layer of abstraction.
You should distinguish between rules like ##(x^n)'=nx^{n-1}##, ##(\sin x)'=\cos x##, etc. and rules like ##(uv)'=u'v+uv'##, ##(u+v)'=u'+v'##, etc.
The former rules are the rock bottom of the layers of abstraction while the latter rules are for transition between layers, or something like that!
I should say that we can still think of ##(x^n)'=nx^{n-1}## as a rule of the second kind and only think of x'=1 as a rock bottom rule.
 
  • #14
SteamKing
Staff Emeritus
Science Advisor
Homework Helper
Drakkith said:
No no, I mean that if I have a function ##F(x) = 2x^2##, that's technically a composite function consisting of ##G(U) = U^2## and U(x) = 2x, right?
Well, here's a basic problem: if U(x) = 2x and ##G(U) = U^2##, then ##G(U) \neq 2x^2##. Instead, ##G(U) = U^2(x) = (2x)^2 = 4x^2##.

If you wanted to find dG(U)/dx, then the chain rule would give ##\frac{dG(U)}{dx} = 2U(x) \cdot \frac{dU(x)}{dx} = 2 \cdot (2x) \cdot 2 = 8x = \frac{d(4x^2)}{dx}##.
 
  • #15
Drakkith
Staff Emeritus
Science Advisor
SteamKing said:
Well, here's a basic problem: if U(x) = 2x and ##G(U) = U^2##, then ##G(U) \neq 2x^2##. Instead, ##G(U) = U^2(x) = (2x)^2 = 4x^2##.
Whoops, you're right. Would it be ##U(x) = x^2## and G(U) = 2U?
 
  • #16
Drakkith
Staff Emeritus
Science Advisor
Shyan said:
The former rules are the rock bottom of the layers of abstraction while the latter rules are for transition between layers, or something like that!
That makes sense.
 
  • #17
Drakkith
Staff Emeritus
Science Advisor
So, let's say we have a function F(x) = (2x-1)(x+1).
Is it correct to say that G(x) = 2x-1 and H(x) = x+1 and neither function is part of the other? In other words, you can't use the chain rule because neither H(x) nor G(x) is dependent on the other? This contrasts with the function T(x) = 2(x+1)-1, which can be decomposed into V(U) = 2U-1 and U(x) = x+1, which means that V(U) is dependent on U(x). Is all that right?
 
  • #18
Drakkith said:
So, let's say we have a function F(x) = (2x-1)(x+1).
Is it correct to say that G(x) = 2x-1 and H(x) = x+1 and neither function is part of the other? In other words, you can't use the chain rule because neither H(x) nor G(x) is dependent on the other? This contrasts with the function T(x) = 2(x+1)-1, which can be decomposed into V(U) = 2U-1 and U(x) = x+1, which means that V(U) is dependent on U(x). Is all that right?
The chain rule, in the restricted sense that you know, can't be used for F(x). But if you consider the full-grown chain rule as I explained in post #2, then it can be applied to F(x) too.
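For readers who have met partial derivatives, here is a sketch of what that claim amounts to. It treats the product GH as a function of the two quantities G and H, which is an interpretation of the post rather than something it spells out:

```latex
\begin{aligned}
F(x) &= G(x)\,H(x), \qquad G(x) = 2x - 1, \qquad H(x) = x + 1 \\
F'(x) &= \frac{\partial (GH)}{\partial G}\,G'(x) + \frac{\partial (GH)}{\partial H}\,H'(x) \\
      &= H(x)\,G'(x) + G(x)\,H'(x) \\
      &= (x + 1)(2) + (2x - 1)(1) = 4x + 1
\end{aligned}
```

The third line is exactly the product rule, so in this setting the multivariable chain rule and the product rule give the same computation. Expanding first gives ##F(x) = 2x^2 + x - 1## and the same derivative.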
 
  • #19
SteamKing
Staff Emeritus
Science Advisor
Homework Helper
Drakkith said:
Whoops, you're right. Would it be ##U(x) = x^2## and G(U) = 2U?
That would work.
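As a quick check of that decomposition (a worked sketch, not part of the original reply), applying the chain rule to it reproduces the power-rule answer:

```latex
\begin{aligned}
F(x) &= G(U(x)), \qquad G(U) = 2U, \qquad U(x) = x^2 \\
F'(x) &= \frac{dG}{dU} \cdot \frac{dU}{dx} = 2 \cdot 2x = 4x
\end{aligned}
```

So the chain rule was available for ##2x^2## all along; the constant multiple rule and power rule just get to 4x with less work.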
 
  • #20
Drakkith said:
So, let's say we have a function F(x) = (2x-1)(x+1).
Is it correct to say that G(x) = 2x-1 and H(x) = x+1 and neither function is part of the other?
Yes, correct.
Drakkith said:
In other words, you can't use the chain rule because neither H(x) nor G(x) is dependent on the other? This contrasts with the function T(x) = 2(x+1)-1, which can be decomposed into V(U) = 2U-1 and U(x) = x+1, which means that V(U) is dependent on U(x). Is all that right?
Yes.

It is often the case that you can use more than one differentiation rule, and as you get more experience you'll be able to decide which is the better rule to use. Some examples:

1. f(x) = (x + 1)(x - 2)
The product rule would work, or you could expand the right side to ##x^2 - x - 2## and differentiate using the sum rule (the derivative of a sum is the sum of the derivatives) and the power rule (##\frac d {dx} x^n = nx^{n - 1}##).
2. g(x) = 3*sin(x)
The product rule would work, but should never be used when one of the factors is a constant (3 here). Instead, use the constant multiple rule (d/dx(k*f(x)) = k * f'(x)).
3. h(x) = 1/x
Here you could use the quotient rule, or you could rewrite the function's formula as ##x^{-1}## and use the power rule, which would be easier in this problem.
4. f(x) = ##\frac{x^2}{2}##
The quotient rule appears to be called for, but it would be silly to use it here, as ##\frac{x^2}{2} = (1/2)x^2##. The better choice is to use the constant multiple rule and the power rule.
5. ##g(x) = (x^2 + x)^2##
You could use the chain rule here, or you could expand the right side and use the sum rule and power rule. If you are not required to use the chain rule, then the alternative I gave is a good choice.
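To make example 5 concrete, here are both routes written out side by side (a worked sketch added for illustration):

```latex
\begin{aligned}
\text{Chain rule:}\quad g'(x) &= 2(x^2 + x)(2x + 1) = 4x^3 + 6x^2 + 2x \\
\text{Expand first:}\quad g(x) &= x^4 + 2x^3 + x^2 \;\Longrightarrow\; g'(x) = 4x^3 + 6x^2 + 2x
\end{aligned}
```

Both give the same polynomial, so the choice between them comes down to which one you are less likely to slip on.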

The bottom line -- always use the simplest rule that applies.
Constant multiple rule is "better than" the product rule.
Product rule is usually "better than" the quotient rule.
Simple rules such as the constant rule (d/dx(C) = 0), constant multiple rule, sum rule, and power rule are better to use than the product rule, quotient rule, and chain rule.

By "better than" I mean simpler to use, which means you are less likely to make a mistake. A longer computation provides more opportunities to make a mistake.
 
  • #21
Shyan said:
The chain rule, in the restricted sense that you know, can't be used for F(x). But if you consider the full-grown chain rule as I explained in post #2, then it can be applied to F(x) too.
Drakkith is currently working with functions of a single variable, so please tailor your help to his current level of knowledge.
 
