Is There Proof for the Chain Rule?


Discussion Overview

The discussion centers around the proof of the chain rule in calculus, a fundamental concept in differentiation. Participants express their doubts and seek clarification on the proof, exploring various approaches and intuitions related to the chain rule.

Discussion Character

  • Exploratory
  • Technical explanation
  • Conceptual clarification
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • Some participants request a formal proof of the chain rule, expressing ongoing doubts despite familiarity with its application.
  • One participant provides links to external resources for proofs and tools related to the chain rule.
  • Another participant suggests using the definition of the derivative to derive the chain rule, but notes potential issues with undefined expressions.
  • A non-rigorous argument is presented, approximating the derivatives involved and suggesting that a rigorous proof would require detailed ε-δ arguments.
  • One participant mentions that there may be inaccuracies in the theorem's statement and proof, indicating a need for further review.
  • A detailed proof is outlined, involving definitions and limits, with emphasis on the conditions under which the chain rule holds.

Areas of Agreement / Disagreement

Participants express a range of views on the proof of the chain rule, with some providing informal arguments while others seek formal proofs. There is no consensus on a single approach or resolution of doubts regarding the chain rule.

Contextual Notes

Some participants highlight the complexity of the proof, noting that it may involve multiple pages and detailed definitions. There are also mentions of potential inaccuracies in the theorem's statement and proof that remain unresolved.

Yh Hoo
Any proof for the CHAIN RULE ??

Can somebody please show me a proof of the chain rule? I have been applying it ever since I started learning differentiation, but I still have doubts and questions about it!
 


Yh Hoo said:
Can somebody please show me a proof of the chain rule? I have been applying it ever since I started learning differentiation, but I still have doubts and questions about it!

Hey Yh Hoo.

A quick google search will give you:

http://en.wikipedia.org/wiki/Chain_rule#Proofs_of_the_chain_rule

But even without this, it's best to go back to the definition of the derivative, f'(x) = lim h->0 [f(x+h) - f(x)]/h, and apply it to the composite f(g(x)). Write the difference quotient [f(g(b)) - f(g(a))]/[b-a] as the product of [f(g(b)) - f(g(a))]/[g(b)-g(a)] and [g(b) - g(a)]/[b-a], then let a->b: the first factor tends to f'(g(b)) and the second to g'(b).
 


Here's a proof:

http://kruel.co/math/chainrule.pdf

Here's a java applet:

http://webspace.ship.edu/msrenault/GeoGebraCalculus/derivative_intuitive_chain_rule.html

Here's my own input on the matter:

Let's agree on some notation: let y=g(x), z=f(y), so z=f(g(x)).

So there is a proof of z'(x)=f'(g(x))g'(x) in a link above.

But let's ignore the proof and build some intuition, so that we also have a feel for why it is true, not just stare blankly through the proof and nod.

Try y=ax+b, and z=cy+d. If you play with that, you may have more of a feel for it. You can try b=d=0 too.
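
Worked out explicitly, the linear case suggested above looks like this (a direct computation using only the y=ax+b and z=cy+d from the post):
\begin{align*}
z=c(ax+b)+d=acx+(bc+d),\qquad \frac{dz}{dx}=ac=\frac{dz}{dy}\cdot\frac{dy}{dx}.
\end{align*}
The constants b and d shift the lines but drop out of the slope, which is why setting b=d=0 loses nothing.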

More intuition: I think I read once that Newton, a big helper in inventing the calculus, thought of the chain rule in terms of gears. So the way z spins depends on how it is connected to y, while the way y spins depends on how it is connected to x. The g(x) inside of f' is there, you could say, because z is connected to y=g(x), not to x.

Some tips: Don't forget the g(x) inside of f'. To see what I mean, try g(x)=ln(x), f(x)=1+x^2.
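
Here is a small numerical check of that tip, as a sketch only. It takes the tip's f(x)=1+x^2 as the outer function and g(x)=ln(x) as the inner one, so the composite is f(g(x))=1+(ln x)^2. The helper names (`composite`, `chain_rule`, `forgot_inner`, `central_difference`) and the sample point x0=3 are my own choices for illustration.

```python
import math

def composite(x):
    # f(g(x)) with outer f(u) = 1 + u^2 and inner g(x) = ln(x)
    return 1.0 + math.log(x) ** 2

def chain_rule(x):
    # Correct: f'(g(x)) * g'(x) = 2*ln(x) * (1/x)
    return 2.0 * math.log(x) / x

def forgot_inner(x):
    # Common mistake: f'(x) * g'(x) = 2x * (1/x), which is always 2
    return 2.0 * x * (1.0 / x)

def central_difference(func, x, h=1e-6):
    # Symmetric difference quotient as a numerical stand-in for the derivative
    return (func(x + h) - func(x - h)) / (2.0 * h)

x0 = 3.0  # arbitrary sample point with x0 > 0 so that ln is defined
numeric = central_difference(composite, x0)
print("numerical slope      :", numeric)
print("f'(g(x)) g'(x)       :", chain_rule(x0))
print("f'(x) g'(x) (mistake):", forgot_inner(x0))
```

The numerical slope should agree with the first formula and not with the second, which is the whole point of keeping g(x) inside f'.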

A nice tool for remembering the rule is what I think is called Leibniz's notation: dz/dx=(dz/dy)(dy/dx).

Be careful of notation, for instance, z' could mean dz/dx or dz/dy.
 


chiro said:
Hey Yh Hoo.

A quick google search will give you:

http://en.wikipedia.org/wiki/Chain_rule#Proofs_of_the_chain_rule

But even without this, it's best to go back to the definition of the derivative, f'(x) = lim h->0 [f(x+h) - f(x)]/h, and apply it to the composite f(g(x)). Write the difference quotient [f(g(b)) - f(g(a))]/[b-a] as the product of [f(g(b)) - f(g(a))]/[g(b)-g(a)] and [g(b) - g(a)]/[b-a], then let a->b: the first factor tends to f'(g(b)) and the second to g'(b).

This doesn't always work as g(b)-g(a) can be zero and then the fraction is undefined.
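
A standard example of this (my choice of function, not one taken from the posts above) is a differentiable g that returns to the value g(0) infinitely often near 0:
\begin{align*}
g(x)=\begin{cases}x^2\sin(1/x), & x\neq 0\\ 0, & x=0\end{cases}
\qquad g'(0)=\lim_{h\to 0}h\sin(1/h)=0,
\qquad g\!\left(\tfrac{1}{n\pi}\right)=0=g(0)\ \text{ for } n=1,2,3,\dots
\end{align*}
So with a=0 and b=1/(nπ), the denominator g(b)-g(a) vanishes for points b arbitrarily close to a, and the first fraction is 0/0 there.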
 


Thanks guys, I will look at that first.
 


I like the following non-rigorous argument:

It follows immediately from the definition of "derivative" that when h is small,
\begin{align*}
f(x+h)\approx f(x)+hf'(x).
\end{align*} Let's use this formula (twice) to approximate f(g(x+h)).
\begin{align*}
f(g(x+h))\approx f\big(g(x)+hg'(x)\big)\approx f(g(x))+hg'(x)f'(g(x)).
\end{align*} This implies that
\begin{align*}(f\circ g)'(x) &\approx \frac{f(g(x+h))-f(g(x))}{h}\approx \frac{f(g(x))+hg'(x)f'(g(x))-f(g(x))}{h}\\ &\approx f'(g(x))g'(x).
\end{align*}
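
To see the approximation at work numerically, here is a minimal sketch. The choices f=sin, g(x)=x^2 and the point x=1.3 are assumptions made purely for illustration; any differentiable pair would do.

```python
import math

# Illustrative choices (not from the argument above): f = sin, g(x) = x^2
def f(u):
    return math.sin(u)

def f_prime(u):
    return math.cos(u)

def g(x):
    return x * x

def g_prime(x):
    return 2.0 * x

x = 1.3
target = f_prime(g(x)) * g_prime(x)  # the chain-rule value f'(g(x)) g'(x)

errors = []
for h in (1e-1, 1e-2, 1e-3, 1e-4):
    quotient = (f(g(x + h)) - f(g(x))) / h  # difference quotient of f o g
    errors.append(abs(quotient - target))
    print(f"h={h:g}  quotient={quotient:.6f}  error={errors[-1]:.2e}")

print(f"f'(g(x)) g'(x) = {target:.6f}")
```

The error shrinks roughly in proportion to h, which is what the "≈" signs above are hiding.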
A rigorous proof will cover at least two pages in a book, if the author does it as a straightforward application of the ε-δ definition of "limit", and includes all the details. There are tricks you can use to make the proof shorter, but I prefer not to use them, for the following two reasons:

1. In my opinion, they make it harder to understand what's really going on.
2. The people who need to study the proof have recently learned the ε-δ definition and are still pretty bad at using it, so it's an excellent exercise for them to study the longer but more straightforward proof.

I actually typed up a long but straightforward proof for my personal notes the last time I participated in one of these threads, but I never posted it. I will do that now. See my next post below (in ten minutes or so).
 


There may still be some minor inaccuracies in the statement and the proof of the theorem. I will take another look at this later today to see if I find any. Feel free to post a comment if you find a mistake before I do.

Theorem: Let ##x\in\mathbb R## be arbitrary. Let f,g be arbitrary functions. If g is differentiable at x, and f is differentiable at g(x), then ##f\circ g## is differentiable at x, and
\begin{align}
(f\circ g)'(x)=f'(g(x))g'(x).
\end{align}
(Comment: In this context, "function" means "real-valued function with a domain that's a subset of ℝ").

Proof:
Let ε>0 be arbitrary. We're going to show that there exists a δ>0 such that for all h,
\begin{align}
|h|<\delta\ \Rightarrow\ \left|\frac{f\circ g(x+h)-f\circ g(x)}{h}-f'(g(x))g'(x)\right|<\varepsilon
\end{align} This will prove both that ##f\circ g## is differentiable at x, and that we have the right formula for ##(f\circ g)'(x)##. We start by defining a notation. For each real number x and each function u that's differentiable at x, define
\begin{align}
R_{u,x}(h)=u(x+h)-u(x)-hu'(x)
\end{align} for all h such that x+h is in the domain of u. Note that ##R_{u,x}(0)=0## and that
\begin{align}
\frac{R_{u,x}(h)}{h}=\frac{u(x+h)-u(x)}{h}-u'(x)\rightarrow 0\text{ as }h\rightarrow 0.
\end{align} Let h be arbitrary. Define ##k=hg'(x)+R_{g,x}(h)##. Since g is differentiable at x, and f is differentiable at g(x), we have
\begin{align}
f(g(x+h)) &=f\big(g(x)+hg'(x)+R_{g,x}(h)\big)
=f(g(x)+k)\\
&=f(g(x))+kf'(g(x))+R_{f,g(x)}(k).
\end{align}This implies that
\begin{align}
&\frac{f\circ g(x+h)-f\circ g(x)}{h} =\frac{f(g(x+h))-f(g(x))}{h}\\
&=\frac{f(g(x))+kf'(g(x))+R_{f,g(x)}(k)-f(g(x))}{h}\\
&=\frac{\big(hg'(x)+R_{g,x}(h)\big)f'(g(x))+R_{f,g(x)}(k)}{h}\\
&=f'(g(x))g'(x)+f'(g(x))\frac{R_{g,x}(h)}{h}+\frac{R_{f,g(x)}(k)}{h},
\end{align} which implies that
\begin{align}
&\left|\frac{f\circ g(x+h)-f\circ g(x)}{h}-f'(g(x))g'(x)\right|
=\left|f'(g(x))\frac{R_{g,x}(h)}{h}+\frac{R_{f,g(x)}(k)}{h}\right|\\
&\leq |f'(g(x))|\left|\frac{R_{g,x}(h)}{h}\right|+\left|\frac{R_{f,g(x)}(k)}{h}\right|.
\end{align}
We're going to show that there exists a δ>0 such that each of the two terms above is <ε/2 when |h|<δ. The first term presents no difficulties. We just choose ##\delta_1>0## such that
\begin{align}
|h|<\delta_1\ \Rightarrow\ \left|\frac{R_{g,x}(h)}{h}\right| <\frac{\varepsilon}{2(|f'(g(x))|+1)}.
\end{align} The second term is much more difficult to deal with. If k=0, we have
$$\left|\frac{R_{f,g(x)}(k)}{h}\right|=0
<\frac{\varepsilon}{2}
$$ for all h. If k≠0, we have
\begin{align}
\left|\frac{R_{f,g(x)}(k)}{h}\right|
&=\left|\frac{R_{f,g(x)}(k)}{k}\right|\left|\frac{k}{h}\right|=
\left|\frac{R_{f,g(x)}(k)}{k}\right|\left|g'(x) +\frac{R_{g,x}(h)}{h}\right|\\
&\leq \left|\frac{R_{f,g(x)}(k)}{k}\right|\bigg(|g'(x)|
+\left|\frac{R_{g,x}(h)}{h}\right|\bigg).
\end{align} Choose ##\delta_2>0## such that
\begin{align}
|h|<\delta_2\ \Rightarrow\ \left|\frac{R_{g,x}(h)}{h}\right|<1.
\end{align} Choose ##\delta_3>0## such that
\begin{align}
0<|k|<\delta_3\ \Rightarrow\ \left|\frac{R_{f,g(x)}(k)}{k}\right| <\frac{\varepsilon}{2(|g'(x)|+1)}.
\end{align} (The ##+1## keeps this bound meaningful when ##g'(x)=0##.) Choose ##\delta_4>0## such that
$$|h|<\delta_4\ \Rightarrow |k|<\delta_3.$$ This is possible because
$$|k|=|hg'(x)+R_{g,x}(h)|=|g(x+h)-g(x)|,$$ and g is continuous at x. These choices ensure that if k≠0, then for all h with ##|h|<\min\{\delta_2,\delta_4\}##, we have
\begin{align}
\left|\frac{R_{f,g(x)}(k)}{h}\right|\leq
\left|\frac{R_{f,g(x)}(k)}{k}\right|\bigg(|g'(x)|
+\left|\frac{R_{g,x}(h)}{h}\right|\bigg) <\frac{\varepsilon}{2(|g'(x)|+1)}\big(|g'(x)|+1\big)
=\frac{\varepsilon}{2}.
\end{align} If we define ##\delta=\min\{\delta_1,\delta_2,\delta_4\}##, then for all real numbers h such that ##|h|<\delta##,
\begin{align}
&\left|\frac{f\circ g(x+h)-f\circ g(x)}{h}-f'(g(x))g'(x)\right|\\
&\leq |f'(g(x))|\left|\frac{R_{g,x}(h)}{h}\right|+\left|\frac{R_{f,g(x)}(k)}{h}\right|
<\frac{\varepsilon}{2}+\frac{\varepsilon}{2} =\varepsilon.
\end{align}
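
The remainder notation in the proof can also be explored numerically. The sketch below checks the decomposition of the difference quotient into ##f'(g(x))g'(x)## plus the two remainder terms, and prints how those terms shrink with h. The choices f=exp, g=sin, x=0.7 and the helper name `remainder` are assumptions for illustration only.

```python
import math

# Illustrative choices (not part of the proof): f = exp, g = sin
def f(u):
    return math.exp(u)

def f_prime(u):
    return math.exp(u)

def g(x):
    return math.sin(x)

def g_prime(x):
    return math.cos(x)

def remainder(u, u_prime, x, h):
    # R_{u,x}(h) = u(x+h) - u(x) - h u'(x)
    return u(x + h) - u(x) - h * u_prime(x)

x = 0.7
rows = []
for h in (1e-1, 1e-2, 1e-3):
    r_g = remainder(g, g_prime, x, h)
    k = h * g_prime(x) + r_g                         # equals g(x+h) - g(x)
    term_g = f_prime(g(x)) * r_g / h                 # f'(g(x)) R_{g,x}(h)/h
    term_f = remainder(f, f_prime, g(x), k) / h      # R_{f,g(x)}(k)/h
    quotient = (f(g(x + h)) - f(g(x))) / h
    rebuilt = f_prime(g(x)) * g_prime(x) + term_g + term_f
    rows.append((h, quotient, rebuilt, abs(term_g), abs(term_f)))
    print(f"h={h:g}  |term_g|={abs(term_g):.2e}  |term_f|={abs(term_f):.2e}  "
          f"quotient-rebuilt={quotient - rebuilt:.1e}")
```

Up to floating-point rounding, `quotient` and `rebuilt` agree for every h, while both remainder terms go to zero, which is exactly what the ε-δ bookkeeping above makes precise.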
 


Yh Hoo said:
Can somebody please show me a proof of the chain rule? I have been applying it ever since I started learning differentiation, but I still have doubts and questions about it!


Try to formalize the following:
$$\frac{f(g(x))-f(g(x_0))}{x-x_0}=\frac{f(g(x))-f(g(x_0))}{g(x)-g(x_0)}\cdot\frac{g(x)-g(x_0)}{x-x_0}$$

Just note the following: as ##x\to x_0##, also ##g(x)\to g(x_0)## (why?), and if ##g(x)=g(x_0)## identically on a certain neighbourhood of the point ##x_0##, then ##f(g(x))=f(g(x_0))## identically in the same neighbourhood, so the result is trivial then...

DonAntonio
 


The only thing wrong with the proof using

(f(g(x+h)) - f(g(x)))/h = ((f(g(x+h)) - f(g(x)))/(g(x+h) - g(x))) * ((g(x+h) - g(x))/h)

is that you don't know that g(x + h) - g(x) != 0 for all sufficiently small h != 0.

You can get around this by letting h_n be a sequence going to 0 and splitting it into two subsequences, one consisting of the elements where g(x+h_i) = g(x) and one consisting of the elements that do not and taking the limits of both.
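
A sketch of why the split works, assuming g is differentiable at x and f is differentiable at g(x); this is one way to fill in the details, not necessarily the exact argument intended above:
\begin{align*}
&\text{If } g(x+h_i)=g(x):\quad \frac{f(g(x+h_i))-f(g(x))}{h_i}=0.\\
&\text{If there are infinitely many such } i:\quad g'(x)=\lim_{i\to\infty}\frac{g(x+h_i)-g(x)}{h_i}=0,\ \text{ so } f'(g(x))g'(x)=0 \text{ as well}.\\
&\text{If } g(x+h_j)\neq g(x):\quad \frac{f(g(x+h_j))-f(g(x))}{g(x+h_j)-g(x)}\cdot\frac{g(x+h_j)-g(x)}{h_j}\to f'(g(x))g'(x).
\end{align*}
Each subsequence, when infinite, gives the same limit f'(g(x))g'(x), so the full sequence of difference quotients converges to it too.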

Also this idea is way, way easier than the standard proof given in most analysis books.
 
