# Proving chain rule

1. Mar 6, 2005

### Icebreaker

If I were trying to prove the chain rule for partial derivatives, could I start with the definition of a total differential? What I mean is:

Let $$f(x,y)=z$$ where $$x=g(t)$$ and $$y=h(t)$$.

I'm looking for $$\frac{dz}{dt}$$.

By definition,

$$dz = \frac{\partial z}{\partial x}dx + \frac{\partial z}{\partial y}dy$$

So, can I simply divide both sides by $$dt$$ and obtain,

$$\frac{dz}{dt} = \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt}$$

This happens to be the chain rule for partial derivatives, so does that prove it? Or must I go with the definitions of the derivatives ($$\Delta$$ and all)?
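A numerical check is not a proof, but it can at least confirm that the formula in question gives the right number. The sketch below assumes a concrete example that is not in the original post: $$f(x,y)=x^2y$$ with $$g(t)=\cos t$$ and $$h(t)=\sin t$$. It compares a finite-difference estimate of $$\frac{dz}{dt}$$ with the right-hand side of the chain rule.

```python
import math

# Hypothetical example functions, chosen only for illustration.
def f(x, y):
    return x**2 * y

def g(t):
    return math.cos(t)

def h(t):
    return math.sin(t)

def central_diff(func, t, step=1e-5):
    """Symmetric finite-difference estimate of func'(t)."""
    return (func(t + step) - func(t - step)) / (2 * step)

def direct_derivative(t):
    """Differentiate z(t) = f(g(t), h(t)) numerically, without the chain rule."""
    return central_diff(lambda s: f(g(s), h(s)), t)

def chain_rule_rhs(t):
    """Evaluate dz/dx * dx/dt + dz/dy * dy/dt with hand-computed derivatives."""
    x, y = g(t), h(t)
    dz_dx = 2 * x * y       # partial derivative of x^2 * y with respect to x
    dz_dy = x**2            # partial derivative of x^2 * y with respect to y
    dx_dt = -math.sin(t)    # derivative of cos t
    dy_dt = math.cos(t)     # derivative of sin t
    return dz_dx * dx_dt + dz_dy * dy_dt

t0 = 0.7
print(direct_derivative(t0), chain_rule_rhs(t0))
```

The two printed values should agree to within the finite-difference error. Agreement at sample points says nothing about whether "dividing by $$dt$$" is a legitimate step in a proof, which is the actual question here.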

2. Mar 6, 2005

### dextercioby

Depends on how rigorous you want to be. A course on analysis for and by mathematicians would reject your proof.

As for us physicists, it's good enough.

Daniel.

3. Mar 6, 2005

### HallsofIvy

Actually, I would do it almost the opposite way: first prove the chain rule by using the basic definition of "derivative" (as $\lim_{h \to 0} \frac{f(x+h)-f(x)}{h}$), THEN show $dz = \frac{\partial z}{\partial x}dx + \frac{\partial z}{\partial y}dy$ from that!

4. Mar 6, 2005

### mathwonk

I still remember the very clear presentation by Lynn Loomis in my advanced calc class in 1964.

The derivative of f:R^n-->R^m, at x is a linear map L such that the difference o(y) = f(x+y)-f(x) -L(y) is "little oh" as a function of y, i.e. |o(y)|/|y| approaches zero as y does.

A map O is "big oh" if the ratio |O(y)|/|y| remains bounded as y approaches zero. Then simply check (trivial) that the composition of two little oh maps is little oh, and the composition of a little oh and a big oh map is little oh, and that linear maps are big oh (in finite dimensions), and that the sum of big oh maps is big oh, and the sum of little oh maps is little oh.

Then it is trivial to prove that the derivative of a composition is the composition of the derivatives, in the sense of composing linear maps.
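One way the argument might be written out, as a sketch only: the names $$g$$, $$M$$, $$o_1$$, $$o_2$$ and $$k$$ below are assumptions introduced for illustration and do not appear in the post. Suppose $$g$$ is differentiable at $$x$$ with derivative $$M$$, and $$f$$ is differentiable at $$g(x)$$ with derivative $$L$$, so that

$$g(x+y) - g(x) = M(y) + o_1(y), \qquad f(g(x)+k) - f(g(x)) = L(k) + o_2(k).$$

Take $$k = M(y) + o_1(y)$$. This is big oh in $$y$$, being the sum of a linear map and a little oh map (and a little oh map is in particular big oh). Substituting and using the linearity of $$L$$,

$$f(g(x+y)) - f(g(x)) = L(M(y)) + L(o_1(y)) + o_2\big(M(y) + o_1(y)\big).$$

The term $$L(o_1(y))$$ is a big oh map composed with a little oh map, and $$o_2(M(y)+o_1(y))$$ is a little oh map composed with a big oh map, so both are little oh, and so is their sum. Hence the derivative of $$f \circ g$$ at $$x$$ is the linear map $$L \circ M$$.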

in particular the derivative wrt t of f(x(t),y(t)) is the matrix product of [df/dx,df/dy] with the column matrix [dx/dt, dy/dt].
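Written out for $$z = f(x(t), y(t))$$, and assuming the partial derivatives are evaluated at the point $$(x(t), y(t))$$, the matrix product reads

$$\frac{dz}{dt} = \begin{pmatrix} \dfrac{\partial f}{\partial x} & \dfrac{\partial f}{\partial y} \end{pmatrix} \begin{pmatrix} \dfrac{dx}{dt} \\[1ex] \dfrac{dy}{dt} \end{pmatrix} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt},$$

which is the formula from the first post.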

this language mimics that in Hardy's classic calculus book, A Course of Pure Mathematics.

I never get to teach this stuff, but seem to spend my life reteaching elementary calc over and over. this stuff is so beautiful.

this treatment also works in infinite dimensional Banach spaces, where of course the derivative is assumed to be both continuous and linear.

this may seem odd today, but in 1964, most of the students in Loomis' Banach space advanced calc course were freshmen.

Last edited: Mar 6, 2005
5. Mar 7, 2005

### Galileo

There's an equivalent way of stating the existence of a derivative.
A function f is differentiable at x=a iff there's a number f'(a), such that:

$f(a+\Delta x)-f(a)=f'(a)\Delta x+\epsilon \Delta x$

with
$$\lim_{\Delta x \to 0}\epsilon=0$$.

Using this it's very easy to prove the one variable chain rule. The chain rule for a function R^n -> R can be proven similarly.
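A sketch of how the one-variable proof might go with this formulation. The symbols $$g$$, $$u$$, $$\Delta u$$, $$\epsilon_1$$ and $$\epsilon_2$$ are assumptions introduced here, not part of the post. Let $$g$$ be differentiable at $$a$$, let $$f$$ be differentiable at $$u = g(a)$$, and write $$\Delta u = g(a+\Delta x) - g(a)$$. Then

$$\Delta u = g'(a)\Delta x + \epsilon_1 \Delta x, \qquad f(u+\Delta u) - f(u) = f'(u)\Delta u + \epsilon_2 \Delta u,$$

with $$\epsilon_1 \to 0$$ as $$\Delta x \to 0$$ and $$\epsilon_2 \to 0$$ as $$\Delta u \to 0$$. Setting $$\epsilon_2 = 0$$ when $$\Delta u = 0$$ keeps the second identity valid even if $$g$$ does not change. Substituting the first identity into the second gives

$$f(g(a+\Delta x)) - f(g(a)) = \big(f'(u) + \epsilon_2\big)\big(g'(a) + \epsilon_1\big)\Delta x = f'(g(a))\,g'(a)\,\Delta x + \big[f'(u)\epsilon_1 + g'(a)\epsilon_2 + \epsilon_1\epsilon_2\big]\Delta x.$$

Since $$g$$ is continuous at $$a$$, $$\Delta u \to 0$$ as $$\Delta x \to 0$$, so the bracket tends to zero. This is exactly the statement that $$f \circ g$$ is differentiable at $$a$$ with derivative $$f'(g(a))\,g'(a)$$, and no division by $$\Delta u$$ was needed.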

6. Mar 13, 2005

### Icebreaker

I did it using Galileo's suggestion. Thanks to everyone.

7. Mar 13, 2005

### mathwonk

i guess then you did not notice that galileo's suggestion was the same as mine.

i.e. the statement f(x+y)-f(x) -L(y) is "little oh" as a function of y, i.e. |o(y)|/|y| approaches zero as y does, means that

[f(x+y)-f(x) - f'(x).y]= e(y) |y|, where e(y)-->0 as y approaches zero.

here y = delta(x) in galileo's notation (in one variable |y| and y differ only by a sign, which can be absorbed into e(y)).

the way i stated it (and galileo's as well, suitably interpreted) was designed to work in all dimensions or even infinite dimensions.

Last edited: Mar 13, 2005
8. Mar 13, 2005

### Icebreaker

Well, I needed to hand in the assignment before the deadline, so I went with the simplest form. I do appreciate your extended help, mathwonk, and I will study it further.

9. Mar 13, 2005

### mathwonk

thanks for your response. I know it is harder to assimilate but it is the better formulation.

another lesson that assignments are evil influences! do not be seduced by the darkness!