Any proof for the CHAIN RULE ?

In summary, the chain rule states that the derivative of a composite function is the derivative of the outer function evaluated at the inner function, multiplied by the derivative of the inner function: (f(g(x)))' = f'(g(x))·g'(x).
  • #1
Yh Hoo
Any proof for the CHAIN RULE ??

Can somebody please show me the proof of the chain rule? Even though I have been applying this concept ever since I first touched differentiation, I still have doubts and questions about it!
 
  • #2


Yh Hoo said:
Can somebody please show me the proof of the chain rule? Even though I have been applying this concept ever since I first touched differentiation, I still have doubts and questions about it!

Hey Yh Hoo.

A quick google search will give you:

http://en.wikipedia.org/wiki/Chain_rule#Proofs_of_the_chain_rule

But even without this, it's best to go back to the definition of the derivative, f'(x) = lim h->0 [f(x+h) - f(x)]/h, and apply it to the composition f(g(x)). Writing the difference quotient for f(g(x)) and multiplying and dividing by g(b) - g(a), you get lim b->a {[f(g(b)) - f(g(a))]/[g(b) - g(a)]} * {[g(b) - g(a)]/[b - a]}, and taking the limit of each factor gives the derivative of the whole thing.
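
If you just want to see the rule hold numerically before working through a proof, here is a rough finite-difference check in Python (sin and x^2 + 1 are arbitrary sample functions chosen for illustration, nothing special):

[code]
# Rough numerical sanity check (not a proof): compare the difference
# quotient of f(g(x)) with f'(g(x)) * g'(x) for sample functions.
import math

def f(u):
    return math.sin(u)

def fprime(u):
    return math.cos(u)

def g(x):
    return x**2 + 1.0

def gprime(x):
    return 2.0 * x

x = 0.7
for h in (1e-2, 1e-4, 1e-6):
    quotient = (f(g(x + h)) - f(g(x))) / h   # [f(g(x+h)) - f(g(x))] / h
    chain = fprime(g(x)) * gprime(x)         # f'(g(x)) * g'(x)
    print(h, quotient, chain, abs(quotient - chain))
[/code]

The difference in the last column shrinks as h does, which is exactly what the chain rule predicts.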
 
  • #3


Here's a proof:

http://kruel.co/math/chainrule.pdf

Here's a java applet:

http://webspace.ship.edu/msrenault/GeoGebraCalculus/derivative_intuitive_chain_rule.html

Here's my own input on the matter:

Let's agree on some notation, let y=f(x), z=g(y), so z=g(f(x)).

So there is a proof of z'(x)=g'(f(x))f'(x) in a link above.

But let's ignore the proof and build some intuition, so that we also have a feel for why it is true, not just stare blankly through the proof and nod.

Try y=ax+b, and z=cy+d. If you play with that, you may have more of a feel for it. You can try b=d=0 too.
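
Worked out explicitly, in case it helps (this is just the linear case suggested above):

[tex]z=cy+d=c(ax+b)+d=(ca)x+(cb+d),\qquad \frac{dz}{dx}=ca=\frac{dz}{dy}\cdot\frac{dy}{dx},[/tex]

so for straight lines the slopes simply multiply.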

More intuition: I think I read once that Newton, a big helper in inventing the calculus, thought of the chain rule in terms of gears. So the way z spins depends on how it is connected to y, and the way y spins depends on how it is connected to x. The f(x) inside of g', you can say, is there because z is connected to y=f(x), not directly to x.

Some tips: Don't forget the f(x) inside of g'. To see what I mean, try g(x)=ln(x), f(x)=1+x^2.
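
With those choices, z=g(f(x))=ln(1+x^2), and the rule gives

[tex]z'(x)=g'(f(x))\,f'(x)=\frac{1}{1+x^2}\cdot 2x=\frac{2x}{1+x^2},[/tex]

whereas forgetting the inner function and writing g'(x)f'(x)=(1/x)(2x)=2 gives something that is not the derivative of ln(1+x^2).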

A nice tool for remembering the rule is what I think is called Leibniz's notation: dz/dx=(dz/dy)(dy/dx).

Be careful of notation, for instance, z' could mean dz/dx or dz/dy.
 
  • #4


chiro said:
Hey Yh Hoo.

A quick google search will give you:

http://en.wikipedia.org/wiki/Chain_rule#Proofs_of_the_chain_rule

But even without this, it's best to go back to the definition of the derivative, f'(x) = lim h->0 [f(x+h) - f(x)]/h, and apply it to the composition f(g(x)). Writing the difference quotient for f(g(x)) and multiplying and dividing by g(b) - g(a), you get lim b->a {[f(g(b)) - f(g(a))]/[g(b) - g(a)]} * {[g(b) - g(a)]/[b - a]}, and taking the limit of each factor gives the derivative of the whole thing.

This doesn't always work, since g(b)-g(a) can be zero, and then the fraction is undefined. For example, if g is constant on a neighbourhood of a, the middle quotient is undefined for every b near a, even though the chain rule itself still holds there (both sides are zero).
 
  • #5


Thanks guys, I will look at those first.
 
  • #6


I like the following non-rigorous argument:

It follows immediately from the definition of "derivative" that when h is small,
\begin{align*}
f(x+h)\approx f(x)+hf'(x).
\end{align*} Let's use this formula (twice) to approximate f(g(x+h)).
\begin{align*}
f(g(x+h))\approx f\big(g(x)+hg'(x)\big)\approx f(g(x))+hg'(x)f'(g(x)).
\end{align*} This implies that
\begin{align*}(f\circ g)'(x) &\approx \frac{f(g(x+h))-f(g(x))}{h}\approx \frac{f(g(x))+hg'(x)f'(g(x))-f(g(x))}{h}\\ &\approx f'(g(x))g'(x).
\end{align*}
A rigorous proof will cover at least two pages in a book, if the author does it as a straightforward application of the ε-δ definition of "limit", and includes all the details. There are tricks you can use to make the proof shorter, but I prefer not to use them, for the following two reasons:

1. In my opinion, they make it harder to understand what's really going on.
2. The people who need to study the proof have recently learned the ε-δ definition and are still pretty bad at using it, so it's an excellent exercise for them to study the longer but more straightforward proof.

I actually typed up a long but straightforward proof for my personal notes the last time I participated in one of these threads, but I never posted it. I will do that now. See my next post below (in ten minutes or so).
 
  • #7


There may still be some minor inaccuracies in the statement and the proof of the theorem. I will take another look at this later today to see if I find any. Feel free to post a comment if you find a mistake before I do.

Theorem: Let ##x\in\mathbb R## be arbitrary. Let f,g be arbitrary functions. If g is differentiable at x, and f is differentiable at g(x), then ##f\circ g## is differentiable at x, and
\begin{align}
(f\circ g)'(x)=f'(g(x))g'(x).
\end{align}
(Comment: In this context, "function" means "real-valued function with a domain that's a subset of ℝ").

Proof:
Let ε>0 be arbitrary. We're going to show that there exists a δ>0 such that for all h,
\begin{align}
|h|<\delta\ \Rightarrow\ \left|\frac{f\circ g(x+h)-f\circ g(x)}{h}-f'(g(x))g'(x)\right|<\varepsilon
\end{align} This will prove both that ##f\circ g## is differentiable at x, and that we have the right formula for ##(f\circ g)'(x)##. We start by defining a notation. For each real number x and each function u that's differentiable at x, define
\begin{align}
R_{u,x}(h)=u(x+h)-u(x)-hu'(x)
\end{align} for all h such that x+h is in the domain of u. Note that ##R_{u,x}(0)=0## and that
\begin{align}
\frac{R_{u,x}(h)}{h}=\frac{u(x+h)-u(x)}{h}-u'(x)\rightarrow 0\text{ as }h\rightarrow 0.
\end{align} Let h be arbitrary. Define ##k=hg'(x)+R_{g,x}(h)##. Since g is differentiable at x, and f is differentiable at g(x), we have
\begin{align}
f(g(x+h)) &=f\big(g(x)+hg'(x)+R_{g,x}(h)\big)
=f(g(x)+k)\\
&=f(g(x))+kf'(g(x))+R_{f,g(x)}(k).
\end{align}This implies that
\begin{align}
&\frac{f\circ g(x+h)-f\circ g(x)}{h} =\frac{f(g(x+h))-f(g(x))}{h}\\
&=\frac{f(g(x))+kf'(g(x))+R_{f,g(x)}(k)-f(g(x))}{h}\\
&=\frac{\big(hg'(x)+R_{g,x}(h)\big)f'(g(x))+R_{f,g(x)}(k)}{h}\\
&=f'(g(x))g'(x)+f'(g(x))\frac{R_{g,x}(h)}{h}+\frac{R_{f,g(x)}(k)}{h},
\end{align} which implies that
\begin{align}
&\left|\frac{f\circ g(x+h)-f\circ g(x)}{h}-f'(g(x))g'(x)\right|
=\left|f'(g(x))\frac{R_{g,x}(h)}{h}+\frac{R_{f,g(x)}(k)}{h}\right|\\
&\leq |f'(g(x))|\left|\frac{R_{g,x}(h)}{h}\right|+\left|\frac{R_{f,g(x)}(k)}{h}\right|.
\end{align}
We're going to show that there exists a δ>0 such that each of the two terms above is <ε/2 when |h|<δ. (The +1 terms in the denominators below are only there to avoid dividing by zero in case f'(g(x)) or g'(x) is zero.) The first term presents no difficulties. We just choose ##\delta_1>0## such that
\begin{align}
|h|<\delta_1\ \Rightarrow\ \left|\frac{R_{g,x}(h)}{h}\right| <\frac{\varepsilon}{2\big(|f'(g(x))|+1\big)}.
\end{align} The second term is much more difficult to deal with. If k=0, we have
$$\left|\frac{R_{f,g(x)}(k)}{h}\right|=0
<\frac{\varepsilon}{2}
$$ for all h. If k≠0, we have
\begin{align}
\left|\frac{R_{f,g(x)}(k)}{h}\right|
&=\left|\frac{R_{f,g(x)}(k)}{k}\right|\left|\frac{k}{h}\right|=
\left|\frac{R_{f,g(x)}(k)}{k}\right|\left|g'(x) +\frac{R_{g,x}(h)}{h}\right|\\
&\leq \left|\frac{R_{f,g(x)}(k)}{k}\right|\bigg(|g'(x)|
+\left|\frac{R_{g,x}(h)}{h}\right|\bigg).
\end{align} Choose ##\delta_2>0## such that
\begin{align}
|h|<\delta_2\ \Rightarrow\ \left|\frac{R_{g,x}(h)}{h}\right|<1.
\end{align} Choose ##\delta_3>0## such that
\begin{align}
|k|<\delta_3\ \Rightarrow\ \left|\frac{R_{f,g(x)}(k)}{k}\right| <\frac{\varepsilon}{2\big(|g'(x)|+1\big)}.
\end{align}Choose ##\delta_4>0## such that
$$|h|<\delta_4\ \Rightarrow |k|<\delta_3.$$ This is possible because
$$|k|=|hg'(x)+R_{g,x}(h)|=|g(x+h)-g(x)|,$$ and g is continuous at x. These choices ensure that if k≠0, then for all h with ##|h|<\min\{\delta_2,\delta_4\}##, we have
\begin{align}
\left|\frac{R_{f,g(x)}(k)}{h}\right|\leq
\left|\frac{R_{f,g(x)}(k)}{k}\right|\bigg(|g'(x)|
+\left|\frac{R_{g,x}(h)}{h}\right|\bigg) <\frac{\varepsilon}{2\big(|g'(x)|+1\big)}\big(|g'(x)|+1\big)
=\frac{\varepsilon}{2}.
\end{align} If we define ##\delta=\min\{\delta_1,\delta_2,\delta_4\}##, then for all real numbers h such that ##|h|<\delta##,
\begin{align}
&\left|\frac{f\circ g(x+h)-f\circ g(x)}{h}-f'(g(x))g'(x)\right|\\
&\leq |f'(g(x))|\left|\frac{R_{g,x}(h)}{h}\right|+\left|\frac{R_{f,g(x)}(k)}{h}\right|
<\frac{\varepsilon}{2}+\frac{\varepsilon}{2} =\varepsilon.
\end{align}
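For readers who like numerical sanity checks, the remainder-term identity used above (the expression for the difference quotient minus f'(g(x))g'(x)) can be tested in Python with arbitrary sample functions (sin and x^3+x below; this illustrates the identity but of course proves nothing):

[code]
import math

def R(u, uprime, x, h):
    # R_{u,x}(h) = u(x+h) - u(x) - h*u'(x)
    return u(x + h) - u(x) - h * uprime(x)

f, fprime = math.sin, math.cos
g = lambda t: t**3 + t
gprime = lambda t: 3 * t**2 + 1

x = 0.5
for h in (1e-2, 1e-4, 1e-6):
    k = h * gprime(x) + R(g, gprime, x, h)   # k = h*g'(x) + R_{g,x}(h) = g(x+h) - g(x)
    lhs = (f(g(x + h)) - f(g(x))) / h - fprime(g(x)) * gprime(x)
    rhs = fprime(g(x)) * R(g, gprime, x, h) / h + R(f, fprime, g(x), k) / h
    print(h, lhs, rhs)                       # the two columns agree and both shrink to 0
[/code]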
 
  • #8


Yh Hoo said:
Can somebody please show me the proof of the chain rule? Even though I have been applying this concept ever since I first touched differentiation, I still have doubts and questions about it!


Try to formalize the following: [tex]\frac{f(g(x))-f(g(x_0))}{x-x_0}=\frac{f(g(x))-f(g(x_0))}{g(x)-g(x_0)}\frac{g(x)-g(x_0)}{x-x_0}[/tex]

Just note the following: as [itex]\,x\to x_0\,[/itex], also [itex]\,g(x)\to g(x_0)\,[/itex] (why?), and if [itex]\,g(x)=g(x_0)\,[/itex] identically on a certain neighbourhood of the point [itex]\,x_0\,[/itex], then [itex]\,f(g(x))=f(g(x_0))\,[/itex] identically in the same neighbourhood, so the result is trivial in that case...

DonAntonio
 
  • #9


The only thing wrong with the proof using

(f(g(x+h)) - f(g(x)))/h = ((f(g(x+h)) - f(g(x)))/(g(x+h) - g(x))) * ((g(x+h) - g(x))/h)

is that you don't know that g(x+h) - g(x) != 0 for all h in some small interval around 0.

You can get around this by letting h_n be a sequence going to 0 and splitting it into two subsequences, one consisting of the elements where g(x+h_i) = g(x) and one consisting of the elements that do not and taking the limits of both.

Also, this idea is way, way easier than the standard proof given in most analysis books.
 

What is the chain rule?

The chain rule is a mathematical concept that allows us to find the derivative of a composite function, where one function is inside another. It helps us to calculate the rate of change of a dependent variable with respect to an independent variable.

Why do we need the chain rule?

The chain rule is an important tool in calculus because many real-world situations involve multiple functions working together. For example, in physics and engineering, we often need to find the rate of change of a quantity that depends on several variables. The chain rule helps us to do this efficiently.

How does the chain rule work?

The chain rule involves taking the derivative of the outer function and multiplying it by the derivative of the inner function. In other words, we differentiate the outer function while leaving the inner function in place as its argument, and then multiply the result by the derivative of the inner function.

Can you give an example of the chain rule?

Sure, let's say we have the function f(x) = (x^2 + 3x)^2. To find the derivative of this function, we can use the chain rule. First, we identify the outer function as ( )^2 and the inner function as x^2 + 3x. Then, we take the derivative of the outer function, which is 2( ), keeping the inner function in its place, and multiply it by the derivative of the inner function, which is 2x + 3. This gives us f'(x) = 2(x^2 + 3x)(2x + 3) = 4x^3 + 18x^2 + 18x.
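
If you want to double-check an example like this, a computer algebra system will do it for you; for instance, in Python with SymPy (assuming SymPy is installed):

[code]
import sympy as sp

x = sp.symbols('x')
f = (x**2 + 3*x)**2
print(sp.expand(sp.diff(f, x)))             # 4*x**3 + 18*x**2 + 18*x
print(sp.expand(2*(x**2 + 3*x)*(2*x + 3)))  # same polynomial, via the chain rule by hand
[/code]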

Is the chain rule applicable to all functions?

Yes, the chain rule can be applied to any composite function, as long as the derivatives of the individual functions exist. However, it can become more complex when dealing with multivariable functions or functions with multiple layers of composition. In these cases, it may require more advanced techniques, but the basic principle of the chain rule still applies.
