# Reasoning behind Infinitesimal multiplication


#### Chenkel

TL;DR Summary
Calculus Made Easy is a book on infinitesimal analysis by Silvanus P. Thompson; I have a question about its treatment of infinitesimals.
Hello everyone!

I have quite a bit of experience with standard calculus methods of differentiation and integration, but after watching some of Walter Lewin's lectures I noticed that in his derivation of the change in momentum for a rocket ejecting a mass dm, with a change in velocity of the rocket dv, he gets a term dv*dm and says we can ignore this term in the overall summation: as the product of two very small numbers, it is very small relative to the change of momentum. I believe this is a method of infinitesimal analysis, and I've seen the method discussed in Silvanus P. Thompson's book "Calculus Made Easy." I feel I have some intuition for the method, but I lack a rigorous understanding of our treatment of infinitesimals in calculus and physics. When computing volumes, I sometimes see dV = dxdydz, and people treat this as a differential quantity, but isn't this a product of three incredibly small numbers? When can we safely ignore a differential term as being too small to matter in the overall equation?

In the following I write an example of the derivation of the derivative of the product uv with respect to x
$$y = uv$$$$y + dy = (u + du)(v + dv) = uv + u\,dv + v\,du + du\,dv$$Based on the methods of infinitesimal calculus we can ignore ##du\,dv##, so we get$$y + dy = uv + u\,dv + v\,du$$Subtracting the first equation and dividing by ##dx##, we get$$\frac {dy}{dx} = u\frac{dv}{dx} + v\frac{du}{dx}$$In summary, when can we safely ignore a differential term? Is there a way to show when a specific term can be safely ignored, as the change in the independent variable is taken arbitrarily close to 0, to make a perfect approximation?
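The question can be probed numerically. Here is a short Python sketch (my own illustration; the sample functions u = x², v = x³ and the base point x = 1 are arbitrary choices) showing that the dropped cross term du·dv shrinks one order faster than the kept terms u·dv + v·du:

```python
# Numerical sketch: for u = x^2, v = x^3 at x = 1, compare the product-rule
# terms u*dv + v*du against the cross term du*dv as dx shrinks.
# The cross term falls off one order faster.
def u(x): return x**2
def v(x): return x**3

x = 1.0
ratios = []
for dx in [1e-1, 1e-2, 1e-3, 1e-4]:
    du = u(x + dx) - u(x)
    dv = v(x + dx) - v(x)
    linear = u(x) * dv + v(x) * du   # the terms we keep
    cross = du * dv                  # the term we drop
    ratios.append(cross / linear)

# Each tenfold shrink of dx shrinks cross/linear roughly tenfold too,
# i.e. du*dv is negligible *relative to* the linear terms as dx -> 0.
print(ratios)
```

The point is that du·dv is not merely small; it is small compared with the other small terms, which is why dropping it does not change the limit.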

Let me know what you guys think, thank you!

I do not like that approach very much. Too much voodoo in my opinion.

Nevertheless, the basic idea behind differentiation is to find a linear approximation ##\dfrac{d}{dx}y=\dfrac{d}{dx}(uv)## to ##y=uv##. The term ##dudv## is quadratic, i.e. smaller than linear. Therefore, it can be ignored.

My favorite definition is Weierstraß's decomposition formula: ##f(x_0+v)=f(x_0)+ \left. \dfrac{d}{dx}\right|_{x=x_0}f \cdot v+ r(v)## where the remainder function, the error, runs faster to zero than ##v,## i.e. ##\lim_{v \to 0} \dfrac{r(v)}{\|v\|}=0.## The ##dudv## term is that remainder, and if we divide out one factor, e.g. ##dv##, then we are still left with an infinitesimally small quantity ##du.##
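As a numerical illustration of the Weierstraß decomposition (my own sketch; f(x) = x³ and x₀ = 2 are arbitrary choices), one can watch the remainder r(v) vanish faster than v:

```python
# Numerical sketch of the Weierstrass decomposition for f(x) = x^3 at x0 = 2:
# r(v) = f(x0 + v) - f(x0) - f'(x0)*v, and r(v)/|v| should go to 0 with v.
def f(x): return x**3

x0 = 2.0
fprime = 3 * x0**2               # f'(x0) = 3*x0^2 = 12
rel = []
for vv in [1e-1, 1e-2, 1e-3]:
    r = f(x0 + vv) - f(x0) - fprime * vv   # the remainder term
    rel.append(r / abs(vv))

# Here r(v) = 3*x0*v^2 + v^3, so r(v)/|v| = 6*v + v^2 -> 0.
print(rel)
```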

I'm a little confused by the vertical line next to the derivative operator, what does that mean?

I'm a little confused by the vertical line next to the derivative operator, what does that mean?
It means that we evaluate the derivative with respect to the variable ##x## at a certain point ##x_0##; it gives the slope of the tangent at ##x_0##.

If we have a function ##f(x)=x^3## then ##f'(x)=3x^2## but what is ##x## here? It is actually
$$f'(x_0)=\left. \dfrac{d}{dx}\right|_{x=x_0} f=\left. \dfrac{d}{dx}\right|_{x=x_0} x^3 = 3x_0^2$$
Whenever we write ##f'(x)## we usually mean a number, a slope, namely the value of ##f'## at a certain point ##x=x_0.## It only becomes a function if we consider its dependence on the location, the point of evaluation: ##\{x_0\}\stackrel{A}{\longmapsto} \{3x_0^2\}.##

There is also a completely different function ##v \stackrel{B}{\longmapsto} f'(x_0)\cdot v=3x_0^2\cdot v##. This time we consider the linear function "multiplication by the slope at ##x_0##". This is what is meant when people say that the derivative is a linear function. ##v## is here the direction (and variable) in which we took the tangent. (In the one-dimensional case it is of course only the ##x##-direction.)

This means we have to be careful when we talk about the derivative as a function. In my example we have a quadratic function ##A## and a linear function ##B##, although we only considered ##f(x)=x^3.##
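The distinction between the two functions ##A## and ##B## can be sketched in a few lines of Python (my own illustration for ##f(x)=x^3##):

```python
# Two different "derivative" functions for f(x) = x^3:
# A maps the base point x0 to the slope 3*x0^2 (quadratic in x0),
# B fixes x0 and maps a direction v to slope * v (linear in v).
def A(x0):
    return 3 * x0**2          # slope as a function of the point

def B(x0):
    s = A(x0)
    return lambda v: s * v    # linear map "multiply by the slope at x0"

print(A(2))        # 12: the slope at x0 = 2
print(B(2)(0.5))   # 6.0: the linear map applied to direction v = 0.5
```

Note that B(2) really is linear: B(2)(v + w) = B(2)(v) + B(2)(w), while A is not.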

here are two cents more: to expand on fresh's second paragraph, the derivative is by definition the linear part of the change in the output, as a function of the change in the input.

hence by definition, for the function u(x), to get du/dx, write the change in u as a function of the change in x, and take the linear part, or linear term of that function. e.g. if that function is a polynomial, or power series, you keep only the linear term. thus, since you are looking only for the linear part, a term may be ignored or thrown away iff it has higher order than one as a function of dx.

thus by definition u-u0 = (du/dx)dx + higher order terms in dx, and thus the linear part is just the term (du/dx)dx (or, if you prefer, the derivative is its coefficient du/dx).

this is written as u-u0 ≈ (du/dx)dx, where ≈ means the difference between the two sides is of higher order than one as a function of dx.

the product of any term with a higher order term is also of higher order, so for the derivative of uv, we have

uv - u0v0 ≈ (u0 + (du/dx)dx)(v0 + (dv/dx)dx) - u0v0 ≈ u0 (dv/dx)dx + v0 (du/dx)dx.

less precisely, d(uv) = u0 dv + v0 du. or even less precisely, d(uv) = udv + vdu.

I hope I haven't messed this up too much. but basically it has nothing to do with terms being "infinitely small". unfortunately, for me, the lovely book of thompson is quite mystical about what is being explained. still the idea is there that in taking derivatives, we ignore all but linear terms. the higher order terms are not really zero, but they may be ignored because just as a product with zero is still zero, so also a product with a higher order term still has higher order.
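The "keep only the linear term" idea can be made concrete with coefficient lists (a sketch of my own, using u = x² and v = x³ at x0 = 1; index k holds the coefficient of dx^k):

```python
# Expand u(x0+dx) and v(x0+dx) as polynomials in dx (coefficient lists,
# index = power of dx), multiply them, and read off the dx^1 coefficient
# of the product: that coefficient is the derivative of uv at x0.
def polymul(a, b):
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

# u = x^2, v = x^3 at x0 = 1:
u = [1.0, 2.0, 1.0]        # (1+dx)^2 = 1 + 2 dx + dx^2
v = [1.0, 3.0, 3.0, 1.0]   # (1+dx)^3 = 1 + 3 dx + 3 dx^2 + dx^3
prod = polymul(u, v)

linear = prod[1]           # coefficient of dx: u0*v' + v0*u' = 1*3 + 1*2
print(linear)              # 5.0, i.e. d(x^5)/dx at x = 1
```

The higher-order coefficients (prod[2] onward) carry the terms that get thrown away; the du·dv product never touches the dx^1 slot.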

Is u = u0 the only place where (du/dx)dx is precisely equal to u - u0? That would be where the change du, or dx, is so close to 0 or equal to 0 that our equations are still valid.

well that only happens when u-u0 equals its linear part, and that usually happens only at u = u0, except when u is itself linear, in which case it happens everywhere. e.g. if u = x^2 it happens at only one point, but if u = x, it happens everywhere.

I believe the most straightforward reason why we ignore terms like ##dvdu## is that, given that both are differentials of the same variable ##t##, that is ##dv=v'(t)dt## and ##du=u'(t)dt##, we have $$dvdu=v'(t)u'(t)(dt)^2.$$

Now, because in usual applications what we'll do at some next step is to divide by ##dt## and take the limit ##dt\to 0##, we will have $$\lim_{dt\to 0}\frac{dvdu}{dt}=\lim_{dt\to 0}v'(t)u'(t)dt=0$$ and hence the term can effectively be ignored. But if somehow we were going to divide by ##(dt)^2##, then the term cannot be ignored.
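Numerically the same point looks like this (a sketch with arbitrary sample slopes u' = 2 and v' = 5 at some fixed t):

```python
# With du = u'(t)*dt and dv = v'(t)*dt:
# du*dv/dt -> 0 as dt -> 0, but du*dv/dt**2 does not (it stays at u'v').
uprime, vprime = 2.0, 5.0        # hypothetical sample slopes
for dt in [1e-2, 1e-4, 1e-6]:
    du, dv = uprime * dt, vprime * dt
    over_dt = du * dv / dt       # -> 0 with dt
    over_dt2 = du * dv / dt**2   # -> u'v' = 10.0, never small
print(over_dt, over_dt2)
```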

If you wonder what happens if ##u,v## are differentials of different variables ##t,s##, then similar things happen, provided that at some step we take the double limit $$\lim_{dt\to 0}\lim_{ds\to 0} \frac{dudv}{dt}=\lim_{dt\to 0}\lim_{ds\to 0} u'(t)v'(s)ds=0,$$ and provided that we divide only by ##dt## or only by ##ds##. If we divide by ##ds\cdot dt##, then again the term ##dudv## can't be ignored.

That makes some sense to me, but I'm wondering whether treating dt as approaching 0 is adequate in a rigorous approach. dt seems, at least to me, to exist as a very small number and never as a large one; its relationship to other differentials, in the form of ratios and formulas, gives it meaning.

Perhaps I'm reaching for a perfect understanding that is difficult to obtain.

Perhaps I'm reaching for a perfect understanding that is difficult to obtain.
Not difficult, laborious. After all, it is still ... Now, differentials are as important in mathematics as in physics, so there are really many angles from which they can be studied. Even this simple image already contains three possible perspectives: the point of tangency (red), the linear space (green), and the curve (black). Now add dimensions, the set of all possible points of tangency, curvature and its change, linear algebra, coordinates, the set of all possible tangents, etc.

https://www.physicsforums.com/insights/journey-manifold-su2mathbbc-part/
contains a list of ten perspectives, and the word slope isn't even among them.

https://www.physicsforums.com/insights/the-pantheon-of-derivatives-i/
is an attempt to walk through the maze of derivatives.

Thank you for the links, I will look them over. One problem I have is understanding the 'linear space' for ##u## and ##v##; it's easy for me to imagine a line when there is one independent variable, as in ##y = mx + b##, but here we have ##dy = u\,dv + v\,du + du\,dv##, which still has only one independent variable, yet it is less clear what is going on.

I believe I have a way of seeing how ##du\,dv## can safely be ignored in the summation. I can think of ##u## and ##v## as functions of ##x##, where$$u(x) + du = u(x + dx)$$$$v(x) + dv = v(x + dx)$$$$du = u(x + dx) - u(x)$$$$dv = v(x + dx) - v(x)$$This works for small changes in ##x## and not large ones, which is why I designate the change ##dx## infinitesimal. So we have to show ##du\,dv## is so small relative to the overall sum that it can be set to ##0##, i.e. we look at ##\frac{du\,dv}{u(x)\,dv + v(x)\,du + du\,dv}##. Dividing numerator and denominator by ##du\,dv##, we get ##\frac{1}{\frac{u(x)}{du} + \frac{v(x)}{dv} + 1}##, and this is ##0## for any infinitesimal ##dx##, so we can see that setting ##du\,dv## to ##0## is not a problem for the summation when comparing its relative size to ##dy##. Let me know what you guys think of my proof, thank you!

I am probably the wrong person to ask about the ##dy## notation. I still don't like it despite the fact that I attended a Leibniz school.

In my view of the world, we have
$$\dfrac{dy}{dx}=\dfrac{d}{dx}(u\cdot v)=u\cdot \dfrac{dv}{dx}+ \dfrac{du}{dx} \cdot v$$
so the quadratic term doesn't even occur if we only consider (linear!) tangents. If you want to multiply the formula by ##dx## then we get
$$d(uv)=u\cdot dv+v\cdot du$$
which, again in my world, is an equation in the exterior or Graßmann algebra of, say, smooth real functions. Or it is the defining equation of the fact that ##d## is a derivation (no typo!). But by no means is it something infinitesimal. You should use
$$\dfrac{d}{dx}y(x) :=\lim_{h \to 0}\dfrac{y(x+h)-y(x)}{h}$$
or
$$y(x) = y(x_0) +y'(x_0)\cdot (x-x_0)+r(x)(x-x_0)$$
or
$$y'(x_0) =\left. \dfrac{d}{dx}\right|_{x_0}y(x)=\dfrac{y(x)-y(x_0)}{x-x_0} + r(x)$$
Anything else is problematic. The approach in your book mixes really high mathematics (Graßmann algebra, derivations, Pfaffian forms) with intuition. The result is that you understand neither.

The question is: What is ##d## for you?

This question has to be answered before we can discuss it. These voodoo equations in your post above aren't helpful. Define ##d## and then we can continue to talk about it.

I refined the wording in my post to be more clear. As for differential notation, ##d## is called a 'differential operator,' if I'm not mistaken, and is used for determining the change in the operand relative to a change in the independent variable.

I refined the wording in my post to be more clear. As for differential notation, ##d## is called a 'differential operator,' if I'm not mistaken, and is used for determining the change in the operand relative to a change in the independent variable.
You have written ##u(x+dx)=u(x)+du.## This looks a bit like ##u(x+\text{sth.})=u(x)+\text{sth. else}.## Now you are trying to figure out what these somethings are. The problem is that it cannot be answered without either using heavy abstract algebra or returning to basic calculus where we have limits.

You cannot treat ##dy\, , \,du\, , \,dv## with handwavy arguments and demand a rigorous explanation. The best I can offer, besides studying mathematics, is to take it as
$$\dfrac{d}{dx}y(x) =\lim_{h \to 0}\dfrac{y(x+h)-y(x)}{h}=y'(x)$$
This is an operator that maps a function ##x\longmapsto y(x)## onto a function ##x\longmapsto y'(x)##. But ##\dfrac{d}{dx}## is the operator, not ##d.## ##dy\, , \,du\, , \,dv## simply do not exist at any basic level, except in an integral where ##dx## signals the variable of the function we integrate. In all other cases ##d## should either be a quotient ##\dfrac{d}{dx}## instead, or a derivation, not a derivative.
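The limit definition can be watched converging numerically (a small sketch I'm adding; f = sin and the point x = 0.3 are arbitrary choices):

```python
# The difference quotient (y(x+h) - y(x))/h approaches y'(x) as h shrinks;
# no infinitesimal quantities are needed, only an ordinary limit.
import math

def y(x): return math.sin(x)

x = 0.3
true = math.cos(x)   # the known derivative of sin
errs = [abs((y(x + h) - y(x)) / h - true) for h in [1e-2, 1e-3, 1e-4]]
print(errs)          # decreasing toward 0
```

The error of the forward difference shrinks roughly linearly in h here, which is exactly the "remainder goes to zero" behavior in the rigorous definitions above.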

I refined the wording in my post to be more clear. As for differential notation, ##d## is called a 'differential operator,' if I'm not mistaken, and is used for determining the change in the operand relative to a change in the independent variable.
From
$$u(x+dx)=u(x)+du$$
we get $$\dfrac{du}{dx}=\dfrac{u(x+dx)-u(x)}{dx}.$$ Now compare this with the definition of $$\dfrac{du}{dx}=\lim_{h\to 0}\dfrac{u(x+h)-u(x)}{h}.$$

It is almost the same, but only almost. We set ##dx=h## and neglected the limit. So the best we can do is to describe ##dx## as ##\lim_{h\to 0}h.## But then we run into the next problem: what to do with the numerator ##du##?

That is the reason why I call it voodoo.
$$\dfrac{du}{dx}=\lim_{h\to 0}\dfrac{u(x+h)-u(x)}{h} \stackrel{?}{=} \dfrac{u(x+dx)-u(x)}{dx}$$
At least, the equality does not hold rigorously. The ##d## notation helps sometimes, e.g. in the chain rule, but it also causes confusion because it simply forgets the limits and talks about infinitesimals instead. There are historical reasons for this, and some scientists are good at it. But when it is taught in mathematics classes, it is the limit description, not something small.

I read this as 'the infinitesimal change in u with respect to the infinitesimal change in x.' It seems to make sense to me, but I wonder how far one can get with infinitesimal analysis... Now that I look at it again, there are definitely problems with my 'proof.' For example, I said:
we divide by ##du\,dv## and we get ##\frac{1}{\frac{u(x)}{du} + \frac{v(x)}{dv} + 1}##
And this seems to have a well defined answer only when du and dv are both positive. But what happens if du is negative and dv is positive? One part of the summation in the denominator goes to negative infinity, the other to positive infinity, and they don't cancel each other out evenly; instead one grows faster than the other two, and this gives the quotient the value 0, but this isn't easy to show...
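That sign worry can at least be tested numerically (my own sketch with sample functions u = 2 - x, so du < 0, and v = x², so dv > 0, at x = 1):

```python
# Let u decrease (du < 0) while v increases (dv > 0); the ratio
# du*dv / (u*dv + v*du + du*dv) still heads to 0 as dx shrinks.
def u(x): return 2 - x     # sample function with du < 0
def v(x): return x**2      # sample function with dv > 0

x = 1.0
vals = []
for dx in [1e-1, 1e-2, 1e-3]:
    du = u(x + dx) - u(x)
    dv = v(x + dx) - v(x)
    vals.append(du * dv / (u(x) * dv + v(x) * du + du * dv))
print(vals)  # negative, but shrinking toward 0 in magnitude
```

So even with mixed signs the cross term is still higher order than the denominator, which is linear in dx.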
This is an operator that maps a function ##x\longmapsto y(x)## onto a function ##x\longmapsto y'(x)##. But ##\dfrac{d}{dx}## is the operator, not ##d.## ##dy\, , \,du\, , \,dv## simply do not exist at any basic level, except in an integral where ##dx## signals the variable of the function we integrate. In all other cases ##d## should either be a quotient ##\dfrac{d}{dx}## instead, or a derivation, not a derivative.
I'm not sure they 'do not exist unless used in integrals.' I understand they are very small, but I have trouble understanding how they do not exist... For example, if we identify the change in momentum of an object with respect to time we get dp/dt, and we know this is the same as the force applied to the object. The change dp in the momentum is equal to the impulse Fdt applied to the object; both dt and dp are small, but they seem, at least to me, to exist and have a relation to one another...
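The impulse relation can also be checked numerically (a sketch I'm adding; the force profile F(t) = 3t² and the time t = 1 are hypothetical choices):

```python
# Compare the exact momentum change over [t, t+dt] against the estimate
# F(t)*dt for a varying force F(t) = 3*t**2, whose exact impulse over
# [t, t+dt] is the integral (t+dt)**3 - t**3.
def F(t): return 3 * t**2     # hypothetical force profile

t = 1.0
rel_err = []
for dt in [1e-1, 1e-2, 1e-3]:
    exact = (t + dt)**3 - t**3      # integral of F over [t, t+dt]
    approx = F(t) * dt              # the "dp = F dt" estimate
    rel_err.append(abs(exact - approx) / exact)
print(rel_err)  # shrinking: F*dt captures dp to first order
```

This is the same story as before: dp = F dt is exact only in the limit, and the discrepancy is higher order in dt.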

For example, if we identify the change in momentum of an object with respect to time we get dp/dt, and we know this is the same as the force applied to the object, and dp is the change in momentum of the object, this dp is equal to the impulse applied to the object Fdt, both dt and dp are small, but they, at least to me, seem to exist and have a relation to one another...
Yes, but making it rigorous is a difficult task. Maybe you are interested in hyperreal numbers.

In my understanding, ##\Delta x## is the _actual_ change of x, while dx, the differential, is the linear approximation to the change along the tangent object, be it the tangent line, plane, etc., depending on the dimension we're working in. So, specifically, given ##y=f(x)##, for the case of one variable, we get:

##df= \lim_{\Delta x \rightarrow 0} \frac {f(x+\Delta x)-f(x)}{\Delta x}##

is the local linear approximation to the change of f along the tangent line. So ##d## describes a local linear approximation, while ##\Delta## is the _actual_ change.

##df= \lim_{\Delta x \rightarrow 0} \frac {f(x+\Delta x)-f(x)}{\Delta x}##
... which is wrong. That limit would be ##\dfrac{df}{dx}.## Applied to ##dx## your formula would yield ##dx=1## which is nonsense. Conclusion: you use two different meanings of the same ##d##, depending on where it is placed. Such a context-sensitive "definition" is worse than no definition.

To me, ##d## is a coboundary operator. And the link between that definition and the common usage of ##d## is really hard work.

I need to rewrite a few things with the limits, since I wrote this way too late yesterday, but, no, it would come to be ##df=f'(x)dx##:
##\lim_{\Delta x \rightarrow 0} \frac {\Delta f}{\Delta x} = f'(x)##, from which we conclude the estimate ##\Delta f \approx f'(x)\Delta x##, which is the linear approximation to the change.
But I need to rewrite it more carefully. Will do so asap.

I need to rewrite a few things with the limits
That won't help you. If we define ##dx## as a limit, then ##\dfrac{df}{dx}## becomes a quotient of two limits, which is wrong. It is one limit of a quotient and that is crucial. Not only because we would get different results, but a quotient of two limits is also in general different from the limit of the quotient.
$$\lim_{x \to 1 }\dfrac{1-x^4}{1-x^2}=2\quad\text{ and }\quad \dfrac{\displaystyle{\lim_{x \to 1}1-x^4}}{\displaystyle{\lim_{x \to 1}1-x^2}}\quad\text{ is undefined }$$

Again, this is why I do not like the infinitesimal picture. Either we work with hyperreals, or we are really, really cautious using ##dx.## We simply cannot treat it like an actual number, small or not, nor as a limit, which is usually a number.
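The example above is easy to verify numerically (a quick check I'm adding):

```python
# (1 - x^4)/(1 - x^2) -> 2 as x -> 1, even though numerator and denominator
# each -> 0: the limit of the quotient exists while the quotient of the
# limits is the undefined expression 0/0.
vals = []
for x in [1.1, 1.01, 1.001]:
    vals.append((1 - x**4) / (1 - x**2))
print(vals)  # heading to 2, since the quotient simplifies to 1 + x^2
```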

The derivative is, by definition, the limit ##\lim_{\Delta x \rightarrow 0} \frac {\Delta f}{\Delta x}##. Then
##\Delta f## is approximately ##f'(x)\Delta x##, and the approximation is denoted ##df##, which is ultimately a differential form, so that it works out. In terms of forms, ##df## is precisely this: ##f'(x)\,dx##.
