Can I Prove the Chain Rule with the Definition of a Total Differential?

In summary, the conversation discusses the proof of the chain rule for partial derivatives. The speaker asks if they can start with the definition of a total differential to prove the chain rule, while another person suggests first proving the chain rule using the basic definition of a derivative. The conversation also touches on the existence of a derivative and its properties in different dimensions. Ultimately, the group agrees that both methods can be used to prove the chain rule, with the latter being a more rigorous approach.
  • #1
Icebreaker
If I were trying to prove the chain rule for partial derivatives, could I start with the definition of a total differential? What I mean is:

Let [tex]f(x,y)=z[/tex] where [tex]x=g(t)[/tex] and [tex]y=h(t)[/tex].

I'm looking for [tex]\frac{dz}{dt}[/tex].

By definition,

[tex]dz = \frac{\partial z}{\partial x}dx + \frac{\partial z}{\partial y}dy[/tex]

So, can I simply divide both sides by [tex]dt[/tex] and obtain,

[tex]\frac{dz}{dt} = \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt}[/tex]

Which happens to be the chain rule for partial derivatives, and thus proves it? Or must I go with the definitions of the derivatives ([tex]\Delta[/tex] and all)?
 
  • #2
Depends on how rigorous you want to be. A course on analysis for and by mathematicians would reject your proof.

As for us physicists, it's good enough.

Daniel.
 
  • #3
Actually, I would do it almost the opposite way: first prove the chain rule by using the basic definition of "derivative" (as [itex]\lim_{h \to 0} \frac{f(x+h)-f(x)}{h}[/itex]), THEN show [itex]dz = \frac{\partial z}{\partial x}dx + \frac{\partial z}{\partial y}dy[/itex] from that!
 
  • #4
I still remember the very clear presentation by Lynn Loomis in my advanced calc class in 1964.

The derivative of f:R^n-->R^m, at x is a linear map L such that the difference o(y) = f(x+y)-f(x) -L(y) is "little oh" as a function of y, i.e. |o(y)|/|y| approaches zero as y does.

A map O is "big oh" if the ratio |O(y)|/|y| remains bounded as y approaches zero. Then simply check (trivial) that the composition of two little oh maps is little oh, and the composition of a little oh and a big oh map is little oh, and that linear maps are big oh (in finite dimensions), and that the sum of big oh maps is big oh, and the sum of little oh maps is little oh.


Then it is trivial to prove that the derivative of a composition is the composition of the derivatives, in the sense of composing linear maps.
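
A sketch of that argument, in notation chosen here for illustration (the labels g, f, L, M, o_g, o_f are assumptions, not Loomis's own): suppose g has derivative L at x and f has derivative M at g(x), so that

[tex]g(x+y) = g(x) + L(y) + o_g(y), \qquad f(g(x)+k) = f(g(x)) + M(k) + o_f(k)[/tex]

Taking [itex]k = L(y) + o_g(y)[/itex] gives

[tex]f(g(x+y)) - f(g(x)) = M(L(y)) + M(o_g(y)) + o_f(L(y) + o_g(y))[/tex]

The term [itex]M(o_g(y))[/itex] is a big oh map composed with a little oh map. The term [itex]o_f(L(y) + o_g(y))[/itex] is a little oh map composed with a big oh map, since L is big oh and a little oh map is in particular big oh. Both terms are therefore little oh by the facts listed above, which leaves [itex]M \circ L[/itex] as the derivative of the composition.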

in particular the derivative wrt t of f(x(t),y(t)) is the matrix product of the row matrix of partial derivatives [∂f/∂x, ∂f/∂y] with the column matrix [dx/dt, dy/dt].
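
Written out, with the partial derivatives evaluated at (x(t), y(t)), that product is just the formula from the original question:

[tex]\frac{d}{dt}f(x(t),y(t)) = \begin{pmatrix}\frac{\partial f}{\partial x} & \frac{\partial f}{\partial y}\end{pmatrix}\begin{pmatrix}\frac{dx}{dt}\\ \frac{dy}{dt}\end{pmatrix} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt}[/tex]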

this language mimics that in Hardy's classic calculus book, A Course of Pure Mathematics.

I never get to teach this stuff, but seem to spend my life reteaching elementary calc over and over. this stuff is so beautiful.

this treatment works also in infinite dimensional Banach spaces, where of course the derivative is assumed to be both continuous and linear.

this may seem odd today, but in 1964, most of the students in Loomis' Banach space advanced calc course were freshmen.
 
  • #5
There's an equivalent way of stating the existence of a derivative.
A function f is differentiable at x=a iff there's a number f'(a), such that:

[itex]f(a+\Delta x)-f(a)=f'(a)\Delta x+\epsilon \Delta x[/itex]

with
[tex]\lim_{\Delta x \to 0}\epsilon=0[/tex].

Using this it's very easy to prove the one variable chain rule. The chain rule for a function R^n -> R can be proven similarly.
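
A sketch of how this might go (the names g, u and the error terms [itex]\epsilon_g[/itex], [itex]\epsilon_f[/itex] are labels introduced here, not part of the statement above). Let u = g(x), with g differentiable at a and f differentiable at g(a), and write

[tex]\Delta u = g(a+\Delta x)-g(a) = g'(a)\Delta x + \epsilon_g \Delta x[/tex]

[tex]f(g(a)+\Delta u) - f(g(a)) = f'(g(a))\Delta u + \epsilon_f \Delta u[/tex]

Substituting the first line into the second:

[tex]f(g(a+\Delta x)) - f(g(a)) = f'(g(a))\,g'(a)\,\Delta x + \left[f'(g(a))\,\epsilon_g + \epsilon_f\,(g'(a) + \epsilon_g)\right]\Delta x[/tex]

As [itex]\Delta x \to 0[/itex] we have [itex]\Delta u \to 0[/itex], so both [itex]\epsilon_g[/itex] and [itex]\epsilon_f[/itex] tend to zero (setting [itex]\epsilon_f = 0[/itex] whenever [itex]\Delta u = 0[/itex]), and the bracket vanishes in the limit. That gives [itex](f \circ g)'(a) = f'(g(a))\,g'(a)[/itex] without ever dividing by [itex]\Delta u[/itex].

For the two-variable case in the original question, the analogous starting point, assuming f is differentiable at (a, b), would be

[tex]f(a+\Delta x, b+\Delta y) - f(a,b) = \frac{\partial f}{\partial x}\Delta x + \frac{\partial f}{\partial y}\Delta y + \epsilon_1 \Delta x + \epsilon_2 \Delta y[/tex]

with [itex]\epsilon_1, \epsilon_2 \to 0[/itex] as [itex](\Delta x, \Delta y) \to (0,0)[/itex]. Here [itex]\Delta x[/itex] and [itex]\Delta y[/itex] are the changes in x(t) and y(t) over a nonzero step [itex]\Delta t[/itex], so dividing by [itex]\Delta t[/itex] is legitimate, and letting [itex]\Delta t \to 0[/itex] yields the formula for dz/dt.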
 
  • #6
I did it using Galileo's suggestion. Thanks to everyone.
 
  • #7
i guess then you did not notice that galileo's suggestion was the same as mine.

i.e. the statement f(x+y)-f(x) -L(y) is "little oh" as a function of y, i.e. |o(y)|/|y| approaches zero as y does, means that

[f(x+y)-f(x) - f'(x).y]= e(y) |y|, where e(y)-->0 as y approaches zero.

here y = delta(x), in galileo's notation (in one variable, writing |y| in place of delta(x) only changes the sign of e).

the way i stated it (and galileo's as well, suitably interpreted) was designed to work in all dimensions or even infinite dimensions.
 
  • #8
Well, I needed to hand in the assignment before the deadline, so I went with the simplest form. I do appreciate your extended help, mathwonk, and I will study it further.
 
  • #9
thanks for your response. I know it is harder to assimilate but it is the better formulation.

another lesson that assignments are evil influences! do not be seduced by the darkness!
 

Related to Can I Prove the Chain Rule with the Definition of a Total Differential?

1. What is the chain rule?

The chain rule is a fundamental rule in calculus that relates the derivative of a composite function to the derivatives of its individual components. It allows us to find the rate of change of a function that is composed of two or more functions.

2. How is the chain rule used in mathematics?

The chain rule is used in mathematics to simplify the process of finding the derivative of a composite function. By breaking down the function into smaller, simpler functions, we can apply the chain rule to find the derivative of each component and then combine them to find the overall derivative.

3. What is an example of using the chain rule?

One example of using the chain rule is finding the derivative of the function f(x) = sin(x²). We can use the chain rule to break down the function into two simpler functions, g(x) = x² and h(x) = sin(x). Then, we can find the derivatives of each component and apply the chain rule to find the derivative of f(x).
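
Carrying that example through, with g and h as named above so that f(x) = h(g(x)):

[tex]g'(x) = 2x, \qquad h'(x) = \cos(x)[/tex]

[tex]f'(x) = h'(g(x))\,g'(x) = \cos(x^2)\cdot 2x = 2x\cos(x^2)[/tex]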

4. Why is the chain rule important?

The chain rule is important because it allows us to find the derivative of complicated functions that are composed of simpler functions. This is crucial in many areas of mathematics, physics, and engineering, where functions can be very complex and difficult to differentiate without the use of the chain rule.

5. Can the chain rule be used for functions with more than two components?

Yes, the chain rule can be extended to functions with any number of components. The general formula for the chain rule can be applied recursively to find the derivative of a function with multiple nested functions. However, as the number of components increases, the calculations can become more complex and time-consuming.
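
For instance, with three nested one-variable functions (f, g, h here are generic placeholders, each assumed differentiable where needed), applying the rule twice gives

[tex]\frac{d}{dx} f(g(h(x))) = f'(g(h(x)))\,g'(h(x))\,h'(x)[/tex]

As an illustration, for [itex]\sin(e^{x^2})[/itex] this yields

[tex]\frac{d}{dx}\sin\left(e^{x^2}\right) = \cos\left(e^{x^2}\right)\cdot e^{x^2}\cdot 2x[/tex]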
