What is the Connection Between the Chain Rule and Differentials?


Discussion Overview

The discussion revolves around the connection between the chain rule and differentials in calculus. Participants explore various proofs of the chain rule, the nature of differentials, and their interpretations in different contexts, including non-standard analysis and differential geometry.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • One participant questions the validity of a proof for the chain rule based on the properties of real numbers, suggesting that since dy, du, and dx are real-valued functions, one could assert that dy/dx = dy/du * du/dx.
  • Another participant counters that dy is not a real number and that dy/dx is not simply a ratio of two real numbers, emphasizing the need for correct definitions in proofs.
  • Some participants discuss the nature of differentials, with one asserting that both dx and dy can be treated as real numbers, while another clarifies that the definition of the derivative involves limits, not just ratios.
  • A participant introduces the idea of infinitesimals and mentions that the chain rule can be proved using infinitesimals, referencing non-standard analysis and its requirements.
  • There is a discussion about how calculus textbooks relate differentials to increments, with one participant reading the relation between dx and Δx as an approximation and another noting that most texts define dx = Δx exactly and treat only dy ≈ Δy as approximate.
  • One participant elaborates on the modern definition of differentials in differential geometry, explaining how they relate to tangent vectors and the derivative of functions.

Areas of Agreement / Disagreement

Participants express differing views on the nature of differentials and the validity of various proofs of the chain rule. There is no consensus on the definitions and interpretations of differentials, leading to multiple competing perspectives throughout the discussion.

Contextual Notes

Participants highlight limitations in their understanding of differentials and the chain rule, with some noting that definitions and contexts can vary significantly across different calculus resources. The discussion also touches on the complexities of infinitesimals and their role in advanced mathematical frameworks.

MathStudent
Hi,
I've seen a couple of proofs for the chain rule, and I know this probably sounds stupid, but I'm wondering why it can't be proved as follows:

given the real valued functions [itex]y=f(u), u=g(x)[/itex]
since [itex]dy, du, dx,[/itex] are all real valued functions as well
can't you just state:
[tex]\frac{dy}{dx}=\frac{dy}{du}\frac{du}{dx}[/tex]
by the properties of real numbers?

can someone explain why this isn't an acceptable proof?

Also, since I'm on the subject of differentials, does anyone know of any good books on the theory of differentials? I've spent a lot of time thinking about this concept, and it seems to have a different meaning in different applications. I've heard that there are several theories that explain what a differential is and say more about its uses (beyond the scope of a Calc 1-3 book). Any info would be greatly appreciated.

Thanks in advance!
 
by the properties of real numbers?

can someone explain why this isn't an acceptable proof?

And what property would that be? Remember that, for instance, dy is not a real number, and dy/dx is not a ratio of two real numbers.

To make an acceptable proof, you have to apply your idea to the correct definitions of the terms involved.
 
Thanks for your reply, Hurkyl, but I don't understand why dy is not a real number. From what I know,

[itex]dx = \Delta x[/itex] is an increment of x, which is the difference of two real numbers and is therefore itself a real number,

and [itex]dy = f'(x)dx[/itex], where [itex]f'(x)[/itex] evaluated at some x is a real number.

So it seems to me that both dx and dy take on real values and thus can be treated as real numbers, and so dy/dx could be treated as the ratio of two real numbers.

Please pardon my ignorance here I realize I must be missing something important.
 
That is certainly the inspiration for differentials, but it's not that easy. And while the notation is such that you can usually manipulate them as if they were real numbers, that's not always the case.

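Matt grime likes to give this identity involving three mutually dependent variables, where each partial derivative is taken with the remaining variable held fixed:

[tex] \left(\frac{\partial x}{\partial y}\right)_z \left(\frac{\partial y}{\partial z}\right)_x \left(\frac{\partial z}{\partial x}\right)_y = -1[/tex]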
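A quick numerical check makes the point concrete. This is only a sketch under stated assumptions: the constraint z = xy, the sample point, and the helper names `central_diff` and `triple_product` are illustrative choices that do not come from the thread, and each derivative in the identity is read as a partial derivative with the third variable held fixed. If the symbols cancelled like ordinary fractions, the product would be +1.

```python
def central_diff(func, t, h=1e-6):
    """Symmetric finite-difference estimate of func'(t)."""
    return (func(t + h) - func(t - h)) / (2 * h)

def triple_product(x, y):
    """Estimate (dx/dy)_z * (dy/dz)_x * (dz/dx)_y on the surface z = x*y."""
    z = x * y
    dx_dy = central_diff(lambda yy: z / yy, y)  # x = z/y with z held fixed
    dy_dz = central_diff(lambda zz: zz / x, z)  # y = z/x with x held fixed
    dz_dx = central_diff(lambda xx: xx * y, x)  # z = x*y with y held fixed
    return dx_dy * dy_dz * dz_dx

result = triple_product(2.0, 3.0)
print(result)  # close to -1, not +1
```

The sign comes from the first factor: with z fixed, x = z/y decreases as y increases, while the other two factors are positive at this point.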

The actual definition of the derivative is:

[tex] \frac{dy}{dx} := \lim_{\Delta x \rightarrow 0} \frac{\Delta y}{\Delta x}[/tex]

So it's not simply the ratio of two real numbers, but the limit of such ratios. For the proof to be valid, you have to factor in the limit. For the case of the chain rule, it just means that you are cancelling real numbers inside the limit.
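To see what "cancelling inside the limit" looks like numerically, here is a minimal sketch. The functions f(u) = sin(u) and g(x) = x², the point x = 1, and the name `increment_ratios` are assumptions made for illustration. For each finite Δx the increments really are real numbers, so Δy/Δx = (Δy/Δu)(Δu/Δx) holds by ordinary arithmetic, provided Δu ≠ 0; the derivative only appears as Δx shrinks.

```python
import math

def f(u):
    return math.sin(u)  # illustrative outer function, y = f(u)

def g(x):
    return x * x        # illustrative inner function, u = g(x)

def increment_ratios(x, delta_x):
    """Return (Δy/Δx, Δy/Δu, Δu/Δx) for a finite, nonzero increment delta_x."""
    delta_u = g(x + delta_x) - g(x)
    delta_y = f(g(x + delta_x)) - f(g(x))
    # Dividing by delta_u is only legitimate when delta_u != 0.
    return delta_y / delta_x, delta_y / delta_u, delta_u / delta_x

x = 1.0
limit_value = math.cos(g(x)) * 2.0 * x  # f'(g(x)) * g'(x)
rows = []
for delta_x in (1e-1, 1e-2, 1e-3, 1e-4):
    dy_dx, dy_du, du_dx = increment_ratios(x, delta_x)
    rows.append((delta_x, dy_dx, dy_du * du_dx))
    print(f"Δx={delta_x:g}  Δy/Δx={dy_dx:.8f}  product={dy_du * du_dx:.8f}  limit={limit_value:.8f}")
```

The two middle columns agree up to floating-point rounding at every step, and both approach the last column. The caveat in the comment is the reason a careful proof needs more than cancellation: for some functions g, Δu can be zero for arbitrarily small nonzero Δx.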
 
Hurkyl said:
For the case of the chain rule, it just means that you are cancelling real numbers inside the limit.

Just to clarify, you are stating that we must cancel real numbers inside the limit before evaluating the limit in order for a proof of the chain rule to be valid?

Also, I apologize: I didn't realize that there is already a thread on this subject, which I found helpful and which may help anyone else with similar problems.
https://www.physicsforums.com/showthread.php?t=57419

It seems that there are subjects that go deeper into the theory of differentials and infinitesimals than what can be found in a standard calculus book.
Are there courses that go deeper into this theory?

Thanks by the way for all your help!
(I'm very impressed with this site!)
 
Look closely at your calculus book! The one I am looking at (Calculus by Salas, Hille and Etgen, ninth edition) says "If Δx is small then df is approximately f'Δx". The statement "dx= Δx" is not true: it is an approximation.

[itex]\frac{dy}{dx}[/itex] is approximately equal to [itex]\frac{\Delta y}{\Delta x}[/itex].
 
you are stating that we must cancel real numbers inside the limit before evaluating the limit in order for a proof of the chain rule to be valid?

I don't know all possible proofs of the chain rule -- I was just referring to the one I think is most straightforward, and highlighting the key difference between it and your invalid argument.


Your calc 1 book says exactly what df/dx means; you don't need to appeal to anything else.


Differentials are something else (but similar), but you wouldn't use them until you start doing differential geometry, or the like.


Infinitesimals are a different subject entirely. In standard analysis, the only infinitesimal is 0, so it's not a particularly useful concept.
 
Actually you can prove the chain rule by just asserting
[tex]\frac{dy}{dt}= \frac{dy}{dx}\frac{dx}{dt}[/tex]
provided you have defined dy, dx, and dt as "infinitesimals". In order to define infinitesimals themselves, you have to go to "non-standard" analysis, which requires sophisticated notions from logic (specifically, the compactness theorem: if every finite subset of a set of axioms has a model, then the entire set of axioms has a model).
 
Almost, but not quite. In nonstandard analysis, dy/dx is defined to be equal to the standard part of Δy/Δx, provided that this exists and is the same for all choices of the nonzero infinitesimal Δx.

Taking the standard part of a number means to round to the nearest standard (i.e. real) number.

So, for the proof to be accurate, you need a theorem about how multiplication interacts with the standard part operation -- that std (xy) = (std x)(std y), given the appropriate hypothesis.
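As a sketch of how that theorem would be used, write st for the standard part. The steps below assume that Δx is a nonzero infinitesimal, that Δu = g(x+Δx) − g(x) is a nonzero infinitesimal, and that both quotients are finite, which is the hypothesis needed for st to respect multiplication. The case Δu = 0 would have to be handled separately.

```latex
\begin{align*}
\frac{dy}{dx}
  &= \operatorname{st}\!\left(\frac{\Delta y}{\Delta x}\right)
   = \operatorname{st}\!\left(\frac{\Delta y}{\Delta u}\cdot\frac{\Delta u}{\Delta x}\right) \\
  &= \operatorname{st}\!\left(\frac{\Delta y}{\Delta u}\right)
     \operatorname{st}\!\left(\frac{\Delta u}{\Delta x}\right)
   = \frac{dy}{du}\,\frac{du}{dx}.
\end{align*}
```

The cancellation happens in the first line, among hyperreal numbers; the theorem about std is what justifies the step from the first line to the second.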
 
HallsofIvy said:
Look closely at your calculus book! The one I am looking at (Calculus by Salas, Hille and Etgen, ninth edition) says "If Δx is small then df is approximately f'Δx". The statement "dx= Δx" is not true: it is an approximation.

[itex]\frac{dy}{dx}[/itex] is approximately equal to [itex]\frac{\Delta y}{\Delta x}[/itex].
Hmm... that's interesting.
I have never looked in that book, and that is something I have never heard before. Every source I've seen uses a slightly different definition.
They let
[tex]dx = \Delta x[/tex]
and then define
[tex]dy = f'(x)dx[/tex]

so if [itex]\Delta x[/itex] is small then
[itex]\Delta y[/itex] is approximately dy
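A small numerical sketch of that last statement, using the definitions dx = Δx and dy = f'(x)dx. The function f(x) = x³ and the point x = 2 are illustrative assumptions only. Here dx equals Δx exactly, and the approximation sits entirely in Δy ≈ dy.

```python
def f(x):
    return x ** 3      # illustrative function

def f_prime(x):
    return 3 * x ** 2  # its derivative, written out by hand

x = 2.0
errors = []
for dx in (1e-1, 1e-2, 1e-3):
    delta_y = f(x + dx) - f(x)  # actual increment of y
    dy = f_prime(x) * dx        # differential of y
    errors.append((dx, delta_y - dy))
    print(f"dx={dx:g}  Δy={delta_y:.9f}  dy={dy:.9f}  Δy-dy={delta_y - dy:.3e}")
```

For this f the gap is 6·dx² + dx³ by direct expansion, so it shrinks roughly a hundredfold each time dx shrinks tenfold. That is the sense in which Δy is "approximately" dy for small Δx.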
 
In general, a proof first requires a definition. If you define dx to be Δx and define dy to be f'(x)Δx, then obviously f'(x) = dy/dx whenever Δx is nonzero.

The modern differential geometry definition of df, for any differentiable function f, is that it is a function on tangent vectors to the real line. That is, given a point p on the real line and a tangent vector v at that point, df_p(v) is the derivative of f at p in the direction v. The standard tangent vector is the unit vector e in the positive x direction, and the derivative of f in that direction is the usual derivative: f'(p) = df_p(e).

If v is any tangent vector at p, one can always write it as a scalar multiple of the standard unit vector, v = ce, and then df_p(v) = c df_p(e) = c f'(p). So df_p is a linear function on the tangent space at p.

Now x is a function on the x axis, namely the identity function, and as such it has a differential dx, whose value at any point p and any vector v = ce is simply dx_p(v) = dx_p(ce) = c dx_p(e) = c · 1 = c. Since a tangent vector v at p is merely the vector from p to p+v, we also call v = Δx. Thus in this sense dx_p(v) does equal Δx, i.e. it equals the difference v between p and p+v.

Now if v = ce, since df_p(v) = c df_p(e) = c f'(p) and dx_p(v) = c dx_p(e) = c, it follows that df_p is a function which, on every tangent vector at p, equals exactly f'(p) times what dx_p equals. Thus the quotient of the two linear functions df_p and dx_p is a constant function with value f'(p).

In this sense df_p/dx_p = f'(p) as a quotient of linear functions, for all p, and hence df/dx = f' is true as a quotient.

The definition of dx as Δx, while well meaning, is misleading, since it should say that for all p, the function dx_p on tangent vectors v at p equals the function Δx_p. Namely, both of them, acting on the tangent vector from p to p+h (identified with the point p+h), yield the number h.

In geometric terms, df is the family of linear functions whose family of graphs is simply the family of tangent lines to the graph of f. Thus dx is the family of tangent lines to the graph of y = x, namely the family of lines of slope 1, one copy for each point p on the x axis. Dividing df_p by dx_p, for a given p, means dividing these two linear functions, which amounts to dividing their slopes. This gives f'(p)/1 = f'(p); i.e. the function taking v to f'(p)v, divided by the function taking v to v, can be said to equal the constant function taking v to f'(p), i.e. the number f'(p).
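The description above can be sketched in a few lines of code. Everything named here is an illustrative assumption: the helper `differential`, the choices f(u) = sin(u) and g(x) = x², and the point p = 1.5. Derivatives are supplied by hand rather than computed. Tangent vectors v = ce are represented by the number c. The sketch also shows how the chain rule reads in this language: the differential of a composite is the composite of the differentials.

```python
import math

def differential(f_prime):
    """Return df as a map (p, v) -> f'(p) * v, linear in the tangent vector v."""
    return lambda p, v: f_prime(p) * v

dx = differential(lambda p: 1.0)      # x is the identity function, so x' = 1
df = differential(math.cos)           # f(u) = sin(u)
dg = differential(lambda p: 2.0 * p)  # g(x) = x**2

p = 1.5

# The quotient dg_p(v) / dx_p(v) is the same for every nonzero v: it is g'(p).
quotients = [dg(p, v) / dx(p, v) for v in (-2.0, 0.5, 3.0)]

def d_composite(p, v):
    """d(f o g)_p(v) = df_{g(p)}(dg_p(v)): push v through dg, then through df."""
    return df(p * p, dg(p, v))

chain_slope = d_composite(p, 1.0) / dx(p, 1.0)
print(quotients, chain_slope)
```

Dividing by dx_p(v) is harmless here because dx_p(v) = v is an ordinary nonzero number once v is fixed; the quotient does not depend on which nonzero v is chosen.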
 