Infinitesimals

What Are Infinitesimals – Advanced Version

Introduction

When I learned calculus, the intuitive idea of an infinitesimal was used. Infinitesimals were treated as real numbers so small (say, 1/trillion to the power of a trillion) that, for all practical purposes, they can be thrown away as negligible. That way, when defining the derivative, for example, you do not run into 0/0, but when required, you can still throw the infinitesimals away.

This is fine for applied mathematicians, physicists, actuaries, etc., who want it as a tool to use in their work. But mathematicians, while conceding it is OK to start that way, will eventually need to replace the hand-wavy arguments with logically sound ones. In calculus, that is sometimes called doing your ‘epsilonics’. This is code for studying what is called real analysis:

http://ramanujan.math.trinity.edu/wtrench/texts/TRENCH_REAL_ANALYSIS.PDF

I posted the above link so the reader can skim through it and get a feel for real analysis. I don’t expect the reader to know it, but I would like readers to get the gist of what it is about. I won’t be using it; any analysis ideas I need will be stated explicitly when required. Instead, I will make the idea of an infinitesimal logically sound. I will not do so with complete rigour, which I leave to specialist texts, but with enough to satisfy those interested in the fundamental ideas. Along the way I will introduce a number of ideas from real analysis. About 1960, mathematicians (notably Abraham Robinson) did something nifty. They created the hyperreal numbers, which contain the real numbers plus actual infinitesimals.

These are numbers x with a very strange property: if X is any positive real number, then -X < x < X, that is, |x| < X. Normally zero is the only number with that property, but in the hyperreals there are actual numbers not equal to zero whose absolute value is less than every positive real number. That way, the infinitesimal approach can be justified without logical issues. We can legitimately neglect x if |x| < X for every positive real X. It also aligns with how many are likely to do calculus in practice. Even though I know real analysis, I hardly ever use it and instead use infinitesimals. After reading this, you can continue doing the same, knowing it is logically sound. I will link to a book that uses this approach at the end.

Learning calculus, IMHO, should proceed from the intuitive use of infinitesimals and limits, to understanding what infinitesimals are (which, as we will see, also introduces many of the ideas of real analysis), and then to more advanced uses of infinitesimals and to analysis topics such as Hilbert spaces. At each step one should do problems, many problems. You learn math not by reading articles like this, but by actually doing mathematics. I have also written a simplified version of this article that the reader may wish to look at first:

https://www.physicsforums.com/insights/what-are-infinitesimals-simple-version/

Getting off my soapbox on how I think Calculus should be learned, many books on infinitesimals introduce, IMHO, unnecessary ideas, such as ultrafilters, making understanding them more complex than needed. Eventually of course you will want to see more advanced treatments, but we all must start somewhere.

I will assume here that the reader has done calculus to the level of a typical calculus textbook and would be ready for a real analysis course. No real analysis, such as the formal definition of limits, is required to read this article; what is needed will be developed as required. A formal definition of the integers, rationals, and reals may not have been studied yet. If that is the case, see:

http://www.math.uni-konstanz.de/~krapp/research/Presentation_Contruction_of_the_real_numbers_1 

The above is more advanced than the audience I had in mind for this article.   It uses technical terms a beginner probably would not know.   However I was not able to locate one at the appropriate level.  A beginner however would probably be able to read it and get the general gist.   I can see I will need to do an insights article at a more appropriate level.

As can be seen there are a number of ways of defining real numbers.   The construction methods of finite hyperrationals, Cauchy Sequences, and Dedekind Cuts will be used here.

The General Idea

First let’s look at the idea of convergence (or limit; the terms are often used interchangeably) of a sequence An. Informally, if An gets arbitrarily close to a number A as n gets larger, then An is said to converge to A, written limit n → ∞ An = A. For example, 1/n gets closer and closer to zero as n gets larger, so it converges to zero. Formally, we say that for any ε > 0 an N can be found such that if n > N then |An - A| < ε. Suppose An and Bn converge to the same number; then An - Bn converges to zero. Informally, as n gets larger, An - Bn can be made arbitrarily small. Formally, for any ε > 0 an N can be found such that if n > N then |An - Bn| < ε. We notice something interesting about this definition. If I remove a large enough, but finite, number of terms, then |An - Bn| < ε for all the terms that remain. In the intuitive sense of infinitesimal, ε can be taken as negligible and thrown away. Then two sequences An and Bn converge to the same value if an N exists such that if n > N then An = Bn.

This leads to a new definition of sequences having the same limit: An = Bn except for a finite number of terms. Two sequences certainly have the same limit in the usual sense if this is true, but it is not true of all sequences that converge to the same number. For example, An and An + 1/n both converge to A, yet given any N, for n > N we have |An + 1/n - An| = 1/n ≠ 0. What is true is that for any positive X an N exists such that if n > N then 1/n < X. We will define the < relation on sequences as A < B if An < Bn except for a finite number of terms. In particular, given any real number X, the sequence x = xn satisfies x < X if xn < X except for a finite number of terms. Because 1/n converges to zero, the formal definition of convergence (for any positive X an N can be found such that if n > N then 1/n < X) shows that the sequence x = 1/n satisfies x < X for any positive real X under our new definition of less than. This is because, regardless of how large N is, only finitely many terms come before 1/N. The sequence x is a true infinitesimal.
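To make the claim about 1/n concrete, here is a minimal Python sketch. It is only an illustration, not part of the construction: it assumes exact rational arithmetic from the standard fractions module, and the helper name threshold is hypothetical. For a positive rational X it returns an N such that 1/n < X for every n > N, so only finitely many terms of the sequence can fail the inequality.

```python
from fractions import Fraction
from math import floor


def threshold(x: Fraction) -> int:
    """Return an N such that 1/n < x for every integer n > N (x must be positive)."""
    if x <= 0:
        raise ValueError("x must be positive")
    # 1/n < x is the same as n > 1/x, so N = floor(1/x) works.
    return floor(1 / x)


for x in (Fraction(1, 10), Fraction(3, 1000), Fraction(7, 2)):
    N = threshold(x)
    # Spot-check a window of indices beyond N; the inequality holds for all of them.
    assert all(Fraction(1, n) < x for n in range(N + 1, N + 1000))
    print(f"X = {x}: 1/n < X for every n > {N}")
```

However small X is chosen, the function still returns a finite N, which is exactly what "except for a finite number of terms" asks for.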

With this change in perspective infinitesimals can be defined. Instead of thinking of a number as infinitesimal, we can think of a sequence like 1/n as infinitesimal. Let’s see what happens if we apply this rule of two sequences being =, >, or < except for a finite number of terms to sets of sequences. This will lead not only to infinitesimals, but also to infinitely large numbers. As a byproduct we will gain a greater understanding of what the reals are and why the rationals need to be extended to the reals. The liberal use of ε is standard practice, and is why some call real analysis doing your ‘epsilonics’.

The Hyperrationals

The hyperrationals are all the sequences of rational numbers. Two hyperrationals, A and B, are equal if An = Bn except for a finite number of terms. However, a hyperrational, unless specifically referred to as a sequence, is considered a single object. It is what is called a Urelement. This is part of formal set theory that the reader can investigate if desired; there is a Wikipedia article on it. When two sequences are equal they are considered the same object. Often this is expressed by saying they belong to the same equivalence class, and the equivalence class is considered a single object. But, this being a beginner’s article, I did not want to delve further into set theory, so I will just use the idea of a Urelement, which is easy to grasp. A < B is defined as An < Bn except for a finite number of terms, and similarly for A > B. Note there are pathological sequences, such as 1 0 1 0 1 0 …, that are neither =, >, nor < 1. We will require that every sequence is =, >, or < every rational. If it is not, it will be taken to equal zero.

If F(X) is a rational function defined on the rationals, then it can easily be extended to the hyperrationals termwise by F(X) = F(Xn). This important principle of extension is used a lot in infinitesimal calculus. In particular, A + B = An + Bn and A*B = An*Bn. Division will not be defined directly because of the divide-by-zero issue; instead 1/X is defined as the extension 1/Xn, throwing away terms that are 1/0. If that doesn’t work, then 1/X is undefined. If X is a rational number, then the constant sequence Xn = X X X …, in which every term is the rational number X, is the hyperrational of the rational number X. Obviously a hyperrational B is also this rational if, according to the definition of equality above, B equals the constant sequence.

We will show that the hyperrationals contain actual infinitesimals using the argument detailed before. Let X be any positive rational number. Let B be the hyperrational Bn = 1/n. Then regardless of what value X is, an N can be found such that 1/n < X for any n > N. Hence, by the definition of < in the hyperrationals, |B| < X for any positive rational number, hence B is an actual infinitesimal.

Also, we have infinitesimals smaller than other infinitesimals, eg 1/n^2 < 1/n, except when n = 1.

Note that if a and b are infinitesimal, so are a+b and a*b. To see this, let X be any positive rational. Since |a| < X/2 and |b| < X/2, we have |a+b| ≤ |a| + |b| < X. Similarly, since |b| < 1, |a*b| < |a*1| = |a| < X.

Hyperrationals also contain infinite numbers larger than any rational number. Let A be the sequence An = n. If X is any rational number, there is an N such that An > X for all n > N. Again, we have infinitely large numbers greater than other infinitely large numbers because, except for n = 1, n^2 > n. Even 1 + n > n for all n.

If a hyperrational is not infinitesimal or infinitely large it is called finite.

Also note that if a is a positive infinitesimal, then a/a = 1. 1/a can’t be infinitesimal because then a/a = a*(1/a) would be a product of infinitesimals and hence infinitesimal. Similarly, it can’t be finite, because then there would be a rational N with |1/a| < N, so |a/a| < N*|a|, and a/a would again be infinitesimal. Hence 1/a is infinitely large.

.9999999….. is the sequence A = .9 .99 .999 ………. Every term is less than 1, thus A < 1. However, 1 - .9999999….. is the sequence B = .1 .01 .001 ……. = B1 B2 B3 … Bn …. For any positive rational number X, we can find an N such that if n > N then Bn < X. Hence .9999999….. differs infinitesimally from 1. This leads us to look at limits in a different way. Suppose An converges to A. Consider the sequence Bn = (An - A). As n gets larger, |Bn| gets arbitrarily small. This means that given any positive rational X, an N can be found such that if n > N then |Bn| < X. Hence if An converges to A, then An as a hyperrational is infinitesimally close to its limit, but may not equal its limit, as demonstrated by .9999999….. and 1.
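The .999… example can be checked term by term. The following Python sketch is only an illustration under stated assumptions: the function names nines and first_index_below are hypothetical, and exact fractions are used so that no rounding hides the gap. It shows that every term is strictly below 1, while the gap 1 - An drops below any chosen positive rational X after finitely many terms.

```python
from fractions import Fraction


def nines(n: int) -> Fraction:
    """n-th term of the sequence .9, .99, .999, ... (n >= 1)."""
    return 1 - Fraction(1, 10 ** n)


def first_index_below(x: Fraction) -> int:
    """Smallest n with 1 - nines(n) < x; the gap shrinks with n, so it stays below x afterwards."""
    if x <= 0:
        raise ValueError("x must be positive")
    n = 1
    while 1 - nines(n) >= x:
        n += 1
    return n


# Every term is strictly less than 1, so as a hyperrational A < 1 ...
assert all(nines(n) < 1 for n in range(1, 50))

# ... yet the gap falls below any positive rational after finitely many terms.
for x in (Fraction(1, 2), Fraction(1, 1000), Fraction(1, 10 ** 12)):
    print(f"X = {x}: 1 - A_n < X from n = {first_index_below(x)} onwards")
```

Both facts hold at once: the sequence never reaches 1, and the difference is smaller than every positive rational once finitely many terms are ignored.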

Real Numbers

As detailed in the link on how integers, rational numbers, etc. are constructed, one way to define real numbers uses the concept of a Cauchy sequence. Intuitively, it is a sequence such that as n gets larger the terms get closer and closer to each other, until eventually they are so close the difference can be neglected, ie the sequence is convergent. Formally, a sequence A1 A2 A3 …… An …… is Cauchy if for any ε > 0 an N can be found such that if m, n > N then |Am - An| < ε. It is easy to see that if a sequence is convergent it is Cauchy. Formally, fix ε > 0; then we can find an N such that if n > N, |An - A| < ε/2, and if m > N, |Am - A| < ε/2. Then |Am - An| = |(Am - A) - (An - A)| ≤ |Am - A| + |An - A| < ε. A tip for those doing epsilon-type proofs: a good trick is to first fix ε > 0, then use something like ε/2 in the proof so that you end up proving something is < ε at the end. It was told to me by my analysis professor and has been an enormous help in these kinds of proofs.

However, the reverse is not true for the rationals. Sometimes a Cauchy sequence of rationals converges to a rational, in which case there are no problems. But sometimes the limit is something we have not formally defined, called an irrational number. For example, let X1 = 2, Xn+1 = Xn/2 + 1/Xn be the recursively defined sequence Xn. Each Xn is rational. Calculate the first few terms; even the fourth term is close to √2. Indeed, let εn’ = Xn - √2 and define εn = εn’/√2, so that Xn = √2*(1+εn). We have seen εn is small after a few terms. Then Xn+1 = (1/√2)*(1+εn) + (1/√2)*(1/(1+εn)) = (1/√2)*((1+εn) + 1/(1+εn)). If S = 1 + x + x^2 + x^3 + …, then S - S*x = 1, so S = 1/(1-x) = 1 + x + x^2 + x^3 + …. If x is small, then to good approximation 1/(1-x) = 1 + x, or 1/(1+x) = 1 - x. We call this true to the first order of smallness because we neglected terms of higher powers than 1. Hence Xn+1 = (1/√2)*((1+εn) + (1-εn)) = √2 to the first order of smallness in εn. The sequence quickly converges to √2, which is well known not to be rational. As an aside, for those that know it, the sequence was constructed using Newton’s method, which generally converges quickly.
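For readers who would rather let a computer calculate the first few terms, here is a short Python sketch of the recursion X1 = 2, Xn+1 = Xn/2 + 1/Xn. The function name newton_sqrt2 is hypothetical, and exact fractions are an assumption made so that every term is visibly rational.

```python
from fractions import Fraction


def newton_sqrt2(terms: int) -> list[Fraction]:
    """Return the first `terms` values of X1 = 2, X(n+1) = Xn/2 + 1/Xn as exact rationals."""
    xs = [Fraction(2)]
    while len(xs) < terms:
        x = xs[-1]
        xs.append(x / 2 + 1 / x)
    return xs


for n, x in enumerate(newton_sqrt2(5), start=1):
    # x*x - 2 measures how far the rational x is from being a square root of 2.
    print(f"X{n} = {x}  ~ {float(x):.12f}   X{n}^2 - 2 = {float(x * x - 2):.3e}")
```

The fourth term is 577/408, whose square exceeds 2 by exactly 1/166464. The terms are always rational and their squares get closer and closer to 2, but no term ever squares to exactly 2.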

Because of this the rationals are called incomplete. It is a general concept: if the Cauchy sequences of a set of objects do not always converge to elements of the set, the set is called incomplete. If all Cauchy sequences converge to an element of the set, it is complete. Formally, if the Cauchy sequence An does not converge to a rational limit, the Urelement of the sequence will be the single object A. Two Cauchy sequences are represented by the same Urelement if limit (An - Bn) = 0. Rational and irrational numbers are both called reals, and the union of both sets is the set of reals. Note that two Cauchy sequences that are equal by convergence are not necessarily equal as hyperrationals. An and An + 1/n are equal as convergent Cauchy sequences, but not as hyperrationals. For reals, A ≥ B is defined to hold when A ≥ B holds for A and B as hyperrationals, and similarly for A ≤ B. We can then define =, > and < for reals. Because equality is defined differently for hyperrationals, > and < are different for reals.

In the set of reals, the limit n → ∞ An = A exists under the usual definition, but in the hyperrationals A is simply a formal definition, although we will still say An converges to A (or, equivalently, limit n → ∞ An = A) just to make life simple.

Are the reals complete? Let Xn be a Cauchy sequence of real numbers. Since every real number has a sequence of rationals that converges to it, we can always find a rational arbitrarily close to any real. Hence for each n we can find a rational Rn with |Xn - Rn| < 1/n, so limit n → ∞ |Xn - Rn| = 0. Xn - Rn is convergent, hence Cauchy. The difference of two Cauchy sequences is Cauchy, so Xn - (Xn - Rn) = Rn is Cauchy. Rn is then a Cauchy sequence of rationals, hence it converges to a real number. But Xn - Rn converges to zero. Hence Xn converges to the same real number. The reals are complete.

I will now prove a very important property of the reals: every nonempty set S of reals with an upper bound has a least upper bound (LUB). If S has exactly one element, then its only element is a least upper bound. So consider S with more than one element, and suppose that S has an upper bound B1. Since S has more than one element, there exists a real number A1 that is not an upper bound for S. Define A1 A2 A3 … and B1 B2 B3 … as follows. Check if (An + Bn)/2 is an upper bound for S. If it is, let An+1 = An and let Bn+1 = (An + Bn)/2. Otherwise there is an element s in S such that s > (An + Bn)/2; let An+1 = s and let Bn+1 = Bn. Then A1 ≤ A2 ≤ A3 ≤ ⋯ ≤ B3 ≤ B2 ≤ B1 and An - Bn converges to zero. It follows that both sequences are Cauchy and have the same limit L, which must be the least upper bound for S. This is not true for the rationals because, while the sequences are still Cauchy, the limit may not exist, ie the rationals are not complete.
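The halving construction in this proof can be sketched in Python. This is a simplified variation, not the exact procedure above: when the midpoint is not an upper bound, the sketch keeps the midpoint itself as the next An instead of choosing an element s of S, which avoids having to search S. The names bracket_lub and is_upper_bound_of_s, and the example set S = {rational q : q*q < 2}, are assumptions made for illustration.

```python
from fractions import Fraction
from typing import Callable, Tuple


def bracket_lub(is_upper_bound: Callable[[Fraction], bool],
                a: Fraction, b: Fraction, steps: int) -> Tuple[Fraction, Fraction]:
    """Halve the gap between a non-upper-bound a and an upper bound b, `steps` times."""
    if is_upper_bound(a) or not is_upper_bound(b):
        raise ValueError("need a that is not an upper bound and b that is one")
    for _ in range(steps):
        mid = (a + b) / 2
        if is_upper_bound(mid):
            b = mid  # B(n+1) = (An + Bn)/2, A(n+1) = An
        else:
            a = mid  # variation: keep the midpoint rather than picking s in S
    return a, b


def is_upper_bound_of_s(m: Fraction) -> bool:
    """For S = {rational q : q*q < 2}, m is an upper bound exactly when m > 0 and m*m >= 2."""
    return m > 0 and m * m >= 2


low, high = bracket_lub(is_upper_bound_of_s, Fraction(1), Fraction(2), 30)
print(float(low), float(high), high - low)
```

After 30 steps the two rationals differ by exactly 1/2^30 and squeeze the least upper bound between them. For this S the two sequences are Cauchy sequences of rationals with no rational limit, which is the incompleteness of the rationals seen from another angle.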

A hyperrational B is called finite, or bounded, if |B| < Q where Q is some positive rational number. If B is infinitesimally close to a rational Q, then B = Q + q where q is infinitesimal. As the sequence that converges to √2 shows, such is not always the case. If B is not infinitesimally close to a rational, then the rationals < B together with those > B define the real R closest to B. Hence B = R + r where r is infinitesimal. Since r is infinitesimal, rn converges to zero. Hence rn is Cauchy. If R is added to all elements of a Cauchy sequence, the sequence is still Cauchy. Hence B is Cauchy. Conversely, any Cauchy sequence is bounded, hence is a finite hyperrational. The bounded hyperrationals are all the rational Cauchy sequences and each defines a real.

This can be viewed another way. A Dedekind Cut is a partition of the rational numbers into two sets A and B, such that all elements of A are less than all elements of B, and A contains no greatest element. Any real number R is defined by a Dedekind Cut. In fact, since B is all the rationals not in A, a Dedekind Cut is defined by A alone. A set A of rationals that has no largest element, and for which every rational not in A is greater than every element of A, defines a real number R. It is the LUB of A. Let X be any finite hyperrational. Let A be the set of rationals < X. A is a Dedekind Cut. Hence X can be identified with a real number R. If Y is infinitesimally close to X, then the set of rationals < Y is also A, hence Y defines the same real R. Only if Y differs from X by a finite, non-infinitesimal amount does it define a different real number S. That is because the difference is then a finite hyperrational that defines a nonzero real number Z, so R ≠ S. This leads to a new definition of the reals. Two finite hyperrationals are equal as reals if they are infinitesimally close. The finite hyperrationals infinitesimally close to each other are denoted by the same object. These objects are the reals.

The Hyperreals

Now that we know what the reals are, we can extend the hyperrationals to the hyperreals, ie all the sequences of reals. The hyperrationals are a proper subset of the hyperreals. As before, the real number A is the sequence An = A A A A …. Similar to the hyperrationals, if F(X) is a function defined on the reals, then it can easily be extended to the hyperreals by F(X) = F(Xn). In particular, A + B = An + Bn and A*B = An*Bn. Two hyperreals, A and B, are equal if An = Bn except for a finite number of terms. As usual they are treated as a single object. Again the limit of the terms is the usual definition, except this time if the sequence is Cauchy the limit will also be a hyperreal. We define A < B and A > B similarly, ie ignoring a finite number of terms. We have infinitesimals and infinitely large hyperreal numbers. Again pathological sequences are set to zero. Also note that a sequence that converges to a real number can be infinitesimally close to that real number but, under the definition of equality, not equal to it. However, as we will see, we can now throw away the infinitesimal part and take them as equal.

We want to show that if B is a finite hyperreal then B is infinitesimally close to some real R, ie B = R + r where r is infinitesimal. Let A be the set of all rationals < B. A is a Dedekind Cut, hence defines a real R, the standard part of B, denoted by st(B). We also call this throwing away the infinitesimal part of B. In intuitive infinitesimal calculus, where the infinitesimal r is small, we throw away r when required. Before the hyperreals this had issues with exactly how small r must be before it can be thrown away. But here r is infinitesimal, so |r| < X for any positive real X. It can legitimately be thrown away.

How It Is Applied

It is instructive and fun to go through the infinitesimal arguments in a book like Calculus Made Even Easier and apply the hyperreals to them, instead of the intuitive way the book does it. For example, d(x^2) = (x+dx)^2 - x^2 = 2x*dx + dx^2 = dx*(2x + dx). But since dx is smaller than any positive real number, it can be neglected in (2x + dx) to give simply 2x. So d(x^2) = 2x*dx, or d(x^2)/dx = 2x.

Let’s define limits using infinitesimals. limit x → c f(x) = st(f(c+a)), where a is any nonzero infinitesimal and st(f(c+a)) is the same regardless of the value of a. limit x → ∞ f(x) = st(f(A)), where A is any infinitely large number and st(f(A)) is the same regardless of the value of A.

The definition of derivative is easy. dy/dx = limit Δx → 0 Δy/Δx = st((y(x+dx) - y(x))/dx), where dx is any nonzero infinitesimal.
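The standard part in this definition can be watched at work by treating dx as the sequence 1/n, as in the earlier sections. The Python sketch below is an illustration under that assumption, with hypothetical names difference_quotient and square. For y = x^2 it computes the quotient term by term with exact fractions; each term equals 2x + dx, so the quotient differs from 2x by the infinitesimal 1/n.

```python
from fractions import Fraction


def difference_quotient(f, x: Fraction, dx: Fraction) -> Fraction:
    """Return (f(x + dx) - f(x)) / dx for a nonzero step dx."""
    return (f(x + dx) - f(x)) / dx


def square(t: Fraction) -> Fraction:
    return t * t


x = Fraction(3)
for n in (1, 10, 100, 1000):
    dx = Fraction(1, n)  # n-th term of the infinitesimal sequence 1/n
    q = difference_quotient(square, x, dx)
    # Each term is exactly 2x + dx, so q - 2x is the n-th term of an infinitesimal.
    assert q == 2 * x + dx
    print(f"n = {n}: quotient = {q}, quotient - 2x = {q - 2 * x}")
```

Taking the standard part discards the 1/n sequence and leaves 2x, here 6, in agreement with d(x^2)/dx = 2x.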

f(x) is continuous at c if st(f(c+a)) = f(c) for any non zero infinitesimal a.

The indefinite integral ∫f(x)*dx is defined as F(x) + C, where F(x) is an antiderivative of f(x). All antiderivatives have the form F(x) + C, where C is any constant. It is actually not a function, but a family of functions, any two of which differ by a constant. Not only that, but if F(x) is a member of the family, so is F(x) + C where C is any constant. All members of this family are antiderivatives of f(x). This notation allows the easy derivation of the important change of variables formula ∫f*dy = ∫f*(dy/dx)*dx. It is used often in actually calculating integrals, or to be more exact, antiderivatives.

Application to Area

Without having any idea of what area is, from the definition of the indefinite integral, ∫1*dA = ∫dA = A + C, where A is this thing called area. Doing a change of variable, ∫dA = ∫(dA/dx)*dx. Let f(x) = dA/dx; then ∫f(x)*dx = A(x) + C. We do not have a definition of A from this because of the arbitrary constant C. But note something interesting: (A(b) + C) - (A(a) + C) = A(b) - A(a). Now the arbitrary constant C has gone. This leads to the following unique definition of the area A between a and b. If A(x) is an antiderivative of a function f(x), the area between a and b is A(b) - A(a). It is given a special name, the definite integral, denoted by ∫(a to b)f(x)dx = A(b) - A(a), where A(x) is an antiderivative of f(x).

We know that, to good approximation, if Δx is small the area under f(x) from x to x+Δx is ΔA = f(x)*Δx. The approximation gets better as Δx gets smaller. It would be exact when Δx = 0, except for one problem: then ΔA = 0. To circumvent this we extend ΔA to the hyperreals, giving dA = f(x)dx, so f(x)dx can be thought of as an infinitesimal area. But dx can be neglected. So we can have our cake and eat it too: dx is effectively zero, so the approximation is exact, but it isn’t zero, so dA is not zero.

In this way other things like the volume of rotation can be defined. If Δx is small, the volume ΔV swept out by rotating f(x) about the x-axis is f(x)^2*π*Δx to good approximation, with the approximation getting better as Δx gets smaller. In order to be exact Δx would need to be zero, but then the volume of rotation ΔV is zero. Similar to area, what we want is for Δx to be effectively zero, but not zero. Extending the formula to the hyperreals, dV = f(x)^2*π*dx. Then ∫dV = ∫f(x)^2*π*dx and the volume can be calculated. The same goes for surface area.
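The area argument can be checked numerically for a simple case. The Python sketch below is an illustration only: the choice f(x) = x^2 on the interval from 0 to 1, the antiderivative x^3/3, and the names left_riemann_sum, square and antiderivative are all assumptions made for the example. It adds up the strips f(x)*Δx with Δx = 1/n and compares the total with A(b) - A(a).

```python
from fractions import Fraction


def left_riemann_sum(f, a: Fraction, b: Fraction, n: int) -> Fraction:
    """Sum of the strip areas f(x) * dx over n equal strips, using left endpoints."""
    dx = (b - a) / n
    return sum(f(a + k * dx) * dx for k in range(n))


def square(x: Fraction) -> Fraction:
    return x * x


def antiderivative(x: Fraction) -> Fraction:
    """An antiderivative of x^2, namely x^3 / 3."""
    return x ** 3 / 3


a, b = Fraction(0), Fraction(1)
exact = antiderivative(b) - antiderivative(a)  # A(b) - A(a)
for n in (10, 100, 1000):
    approx = left_riemann_sum(square, a, b, n)
    print(f"n = {n}: strips = {approx}, A(b) - A(a) - strips = {exact - approx}")
```

For this example the gap works out to (3n - 1)/(6n^2), which shrinks as the strips get narrower. Viewed as a sequence in n, the gap is an infinitesimal in the sense used in this article, so the sum of strips and A(b) - A(a) have the same standard part.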

Diving Deeper

This is just an overview of a rich subject.   For more detail see:

people.math.wisc.edu/…ler/foundations.pdf

To see a development of calculus from true infinitesimals see Elementary Calculus – An Infinitesimal Approach – by Jerome Keisler (the above link is an appendix to that book):

https://people.math.wisc.edu/~hkeisler/calc.html

Other Applications

For even more advanced applications into Hilbert Spaces etc see the book Applied Nonstandard Analysis.   It goes much deeper into axiomatic set theory, ultrafilters etc.   However I would not attempt it until you have done Lebesgue integration at least – it is not meant for the beginner level.  Actually while not assuming any knowledge of real analysis I did introduce some ideas from it, which hopefully will assist when studying real analysis.

Concluding Remarks

Next step – see the following article and the associated thread for further recommendations.

 

 
