# Cantor's decimal proof that (0,1) is uncountable

Assume for contradiction that there exists a bijection ##f : \mathbb{N} \rightarrow (0,1)##. For each ##m \in \mathbb{N}##, ##f(m)## is a real number between 0 and 1, which we write in decimal notation as ##f(m) = .a_{m1}a_{m2}a_{m3}...##. By assumption, every real number in ##(0,1)## appears in the set ##\{ f(1), f(2), f(3),...\}##. Now define ##x \in (0,1)## by ##x=.b_1b_2b_3...##, where ##b_n = 2## if ##a_{nn} \ne 2## and ##b_n = 3## if ##a_{nn} = 2##. Then ##x \ne f(n)## for every ##n \in \mathbb{N}##, since ##x## and ##f(n)## differ in the ##n##th decimal place. This means that ##f## is not onto, hence not a bijection, which is a contradiction.
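For a finite prefix of the list, the diagonal rule can be sketched as follows (the digit strings below are hypothetical, just for illustration):

```python
def diagonal_number(rows):
    # rows[m][n] is the (n+1)-th decimal digit of f(m+1), as a character.
    # b_n = 2 unless the diagonal digit a_nn is 2, in which case b_n = 3.
    return ''.join('3' if row[i] == '2' else '2'
                   for i, row in enumerate(rows))

rows = ["123", "222", "987"]   # hypothetical truncated expansions of f(1), f(2), f(3)
x = diagonal_number(rows)      # '232': differs from each row at the diagonal position
```

By construction, `x` disagrees with the ##n##th row in its ##n##th digit, which is the whole content of the diagonal argument.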

My only question about this proof is: do we need to take into account the fact that, for example, ##0.59999... = 0.6##? Do we need to state at the beginning of the proof that we don't allow strings of 9's? Is this the only way to resolve the issue?
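For what it's worth, the identity in question can be checked with exact rational arithmetic: the tail of ##0.5999...## is a geometric series summing to exactly ##1/10##:

```python
from fractions import Fraction

# 0.5999... = 5/10 + (9/100 + 9/1000 + ...)
# The tail is a geometric series with first term 9/100 and ratio 1/10.
tail = Fraction(9, 100) / (1 - Fraction(1, 10))   # = 1/10
value = Fraction(5, 10) + tail                    # = 6/10
```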


andrewkirk
Homework Helper
Gold Member
Yes. In order for ##a_{mk}## to be well defined, we must create a function that maps a natural number ##m## to a unique sequence of digits. Hence we need to make a decision up-front about things like whether to represent a half by ##0.5\dot 0## or ##0.4\dot 9##. A simple way to do it is to exclude from the range of the function all sequences that end in infinite strings of nines. Having made that decision, there is no ambiguity left.

By the way, this proof used to greatly trouble me because I felt that it somehow used the Axiom of Choice. It was only after I returned to it many years later that I was able to convince myself that it doesn't use that axiom.

FactChecker
Gold Member
It is not necessary to worry about that. The proof starts by assuming a bijection onto the reals, without regard to how they are represented. Then each real can be given a decimal representation, one by one, without changing the bijection. The number 5.999999... = 6.0 appears on the list only once, and the proof works no matter which representation is chosen.

CORRECTION: I think you do have to make sure that you are not creating a bad diagonal counterexample that ends in infinite 9s, since it might then match a 6.0-type representation already on the list. But that is easy to avoid, since you have the freedom to choose any digit that does not match the current diagonal digit. So you can simply specify that you periodically pick a digit other than 9.

PS. This way of creating a number that is not on the list also shows what a tiny fraction of the reals can be counted. At every digit, you can create 9 numbers that were not counted. After two digits, you have shown that the counted numbers make up at most 1 in 100 reals; after 3 digits, 1 in 1000; after ##n## digits, 1 in ##10^n##. So you can see how small the set of counted numbers is compared to all the reals.

Cantor's diagonal proof is one of the most elegantly simple proofs in mathematics. Yet its simplicity tempts educators to simplify it even further, so that it can be taught to students who may not be ready for it. Because the proposition is not intuitive, this leads inquisitive students to doubt the steps that get misrepresented.

The proposition Cantor was trying to prove is that there exists an infinite set that cannot have a bijection with the set of all natural numbers. All that is needed to prove this proposition is an example. And the example Cantor used in diagonalization was not the set of real numbers ℝ. He said so explicitly: "There is a proof of this proposition that ... does not depend on considering the irrational numbers."

Cantor didn't use quite this language - it wasn't established at that time - but the following is a paraphrasing of his entire proof:

A string is essentially a function from a set of natural numbers to a set of characters. Cantor used what I call Cantor Strings, which are infinite-length strings combining only two characters. He used c : ℕ→{'m','w'}.

Cantor called the set of all such Cantor Strings M.

He then assumed that an injective function E : ℕ→M exists. He did not assume it was bijective.

Using E, diagonalization defines a new function B : ℕ→{'m','w'}: B(n) is whichever character E(n)(n) is not. By definition, B is a Cantor String, and by construction it is not in the image of E. This proves that if a function from ℕ to M is injective, then it is not surjective.

Cantor did not use proof-by-contradiction. If anything, he used contraposition: this also proved that if a function from ℕ to M is surjective, then it is not injective. (But note that this isn't really necessary. I include it, because it paraphrases what Cantor said.)

Thus, no function from ℕ to M can be bijective.
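A minimal sketch of this two-character diagonalization (the listing `E` below is a made-up example of a map ℕ→M, not anything from Cantor):

```python
def anti_diagonal_prefix(E, n):
    # B(i) is 'w' where E(i)(i) is 'm' and vice versa, so B disagrees
    # with the i-th listed Cantor String at position i, for every i < n.
    return ['w' if E(i)(i) == 'm' else 'm' for i in range(n)]

# A hypothetical injective listing E of Cantor Strings:
# E(i) is the string with 'm' in positions 0..i and 'w' afterwards.
E = lambda i: (lambda j: 'm' if j <= i else 'w')
B = anti_diagonal_prefix(E, 5)   # disagrees with E(i) at position i
```

Since `B` differs from every `E(i)` somewhere, no injective listing can be surjective.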

Math_QED
Homework Helper
2019 Award
Yes. In order for ##a_{mk}## to be well defined, we must create a function that maps a natural number ##m## to a unique sequence of digits. Hence we need to make a decision up-front about things like whether to represent a half by ##0.5\dot 0## or ##0.4\dot 9##. A simple way to do it is to exclude from the range of the function all sequences that end in infinite strings of nines. Having made that decision, there is no ambiguity left.

By the way, this proof used to greatly trouble me because I felt that it somehow used the Axiom of Choice. It was only after I returned to it many years later that I was able to convince myself that it doesn't use that axiom.
Interesting. I hadn't thought about the use of AC here. Now that you've made the comment, can you clarify why it doesn't need choice? Is it because there is an explicit choice function?

I.e., for ##n \in N##, define choice(n) = "a fixed number not equal to blabla"

I haven't been formally introduced to choice yet.

andrewkirk
Homework Helper
Gold Member
In very loose terms, the axiom of choice says that it is possible to make an infinite number of choices. More rigorously, it says that given a collection ##C## of nonempty sets, there exists a function ##f:C\to \bigcup_{S\in C} S## that 'arbitrarily' picks one element from every set, i.e. such that for every ##S\in C## we have ##f(S)\in S## as the element that is picked from set ##S##.

What concerned me about the Cantor proof was that, for every ##n##, we had to choose as the digit in the ##n##th decimal place a digit that does not match the ##n##th digit of the ##n##th number in our ##\mathbb N##-indexation of the reals. That is making an infinite number of choices.

What resolved my concern was that we can specify how the choice is made, so that it is not arbitrary and hence we don't need to use the axiom of choice. One specification is that if ##d##, the ##n##th digit of the ##n##th number, is less than 9, we choose ##d+1##, otherwise we choose 0. That ensures it will be different, but the choice is not arbitrary.
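That explicit rule ("add one, wrapping 9 to 0") is itself a choice function, which is why no appeal to the axiom is needed; a one-line sketch:

```python
def pick_digit(d):
    # Explicit, choice-free rule: take d + 1, or 0 if d == 9.
    # The result always differs from the diagonal digit d.
    return d + 1 if d < 9 else 0
```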

The general approach to trying to convert a proof that uses the axiom of choice to one that does not is to try to find a constructive, non-arbitrary way of making each choice, as we did here. Sometimes it is not possible, and in those cases we have a proof that genuinely relies on the axiom of choice.

stevendaryl
Staff Emeritus
There is an alternative proof, also due to Cantor, that doesn't make use of decimal representations at all.

Theorem: If ##r_1, r_2, ...## is a countably infinite sequence of distinct reals, then there is a real ##r## that is unequal to any in the sequence.

Proof:

We construct an infinite sequence of pairs ##(a_n, b_n)## with ##a_n < b_n## as follows:

Let ##a_1 =r_1##. Let ##b_1## be the first number in the sequence ##r_1, r_2, ...## that is strictly greater than ##a_1##. Then to go from ##(a_n, b_n)## to ##(a_{n+1}, b_{n+1})##, we do the following:

Let ##i## be the smallest integer such that ##a_n < r_i < b_n##. If there is no such ##i##, then we can let ##r = \frac{a_n + b_n}{2}## and we're done.
Let ##j## be the smallest integer such that ##r_i < r_j < b_n##. Again, if there is no such ##j##, then we can let ##r = \frac{r_i + b_n}{2}## and we're done.
Then we let ##a_{n+1} = r_i## and ##b_{n+1} = r_j##.

Now, assuming that we never quit early (by being unable to continue), we construct two infinite sequences:

##a_1 < a_2 < a_3 ... < b_3 < b_2 < b_1##

Since all the ##a_n## are bounded above by ##b_1##, they must converge to a limit, ##A##. Similarly, all the ##b_n## are bounded below by ##a_1##, so they must converge to a limit, ##B##. Then neither ##A## nor ##B## is in the sequence ##r_1, r_2, ...##.

Why not? Well, if you look at the process for constructing the sequences ##(a_n, b_n)##, you can see that for every item in the sequence ##r_k##, one of four things happens:
1. There is some ##n## such that ##r_k = a_n##
2. There is some ##n## such that ##r_k = b_n##
3. There is some ##n## such that ##r_k < a_n##
4. There is some ##n## such that ##b_n < r_k##
In all four cases, there is some ##n## for which ##r_k## lies outside the open interval ##(a_n, b_n)##. But ##a_n < A \leq B < b_n## for every ##n##, so ##r_k## cannot equal ##A## or ##B##.
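The construction above can be sketched for a finite list standing in for the sequence (a made-up example; with finitely many terms the process must "quit early" and return a midpoint that differs from every listed number):

```python
def find_missing(rs):
    # rs: finite list of distinct reals, all comparisons by list order,
    # mimicking "smallest index i such that a_n < r_i < b_n".
    a = rs[0]                              # a_1 = r_1
    b = next(r for r in rs if r > a)       # b_1: first term exceeding a_1
    while True:
        inside = [r for r in rs if a < r < b]
        if not inside:
            return (a + b) / 2             # no r_i strictly inside (a, b)
        ri = inside[0]                     # first-listed r_i with a < r_i < b
        inner = [r for r in rs if ri < r < b]
        if not inner:
            return (ri + b) / 2            # no r_j strictly inside (r_i, b)
        a, b = ri, inner[0]                # (a_{n+1}, b_{n+1}) = (r_i, r_j)

rs = [0.5, 0.8, 0.6, 0.7]                  # hypothetical finite "enumeration"
x = find_missing(rs)                       # 0.65, which is not in rs
```

Each pass strictly shrinks the interval and the number of listed points inside it, so for a finite list the loop always terminates with one of the two midpoints.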

mathwonk
Homework Helper
another simple way to make the proof avoid decimals which end in all 9's is just to use the argument to prove that the set of decimals consisting only of 0's and 1's is already uncountable. Consequently the larger set of all reals in the interval is also uncountable. Since here the choice function merely takes the opposite digit in the set {0,1}, the choice is not very exotic logically either. This is a concrete version of the abstract argument that the set of subsets of any set has larger cardinality than the set itself. Here the big set is the natural numbers, and a decimal sequence consisting of 0's and 1's picks out the subset of those natural numbers which index a decimal place containing a "1".

Abstractly, if you have any map from a set to the set of all its subsets, you consider those elements that do not belong to the subset they map to. Such a subset cannot be the image of any element, since if that element belonged to the subset it would contradict the definition of the subset, and if the element did not belong to the subset it would also contradict the definition of the subset, since then this element would satisfy the condition that it should belong to that subset.

If you write down a sequence of decimals, thought of as a sequence of subsets, then you start your counterexample decimal with 1 if and only if the first decimal did not start with 1; i.e., your counterexample subset includes the natural number 1 if and only if 1 does not belong to the subset it labels. Similarly, 2 labels the second decimal, or subset, and we choose 2 to belong to our subset (i.e., we put a 1 in the second position of our decimal) if and only if there was not a 1 in the second position of the second decimal, i.e. if and only if the subset labeled by 2 does not contain the natural number 2, ...etc....

Thus just as, given any map f:S--->2^S (= set of subsets of S), there is a canonical way to pick out a subset of S not contained in the image of f, so exactly in the same way, given any map from the natural numbers to the set of decimals using only 0's and 1's, there is a canonical way to pick out a decimal not in the image of the map.
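The abstract version is short enough to demonstrate directly; here is a sketch on a small finite set (the map `f` is a hypothetical example, and of course for finite sets the conclusion is just that `f` misses some subset):

```python
def diagonal_subset(S, f):
    # Cantor's diagonal set D = { s in S : s not in f(s) }.
    # D cannot equal f(s) for any s: s is in D iff s is not in f(s),
    # so D and f(s) always disagree about the element s itself.
    return {s for s in S if s not in f(s)}

S = {1, 2, 3}
f = {1: {1, 2}, 2: set(), 3: {1, 3}}        # a hypothetical map S -> 2^S
D = diagonal_subset(S, lambda s: f[s])      # D = {2}, not in the image of f
```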

As an item of personal history, I read this proof as a high school student, and the next year while interviewing for an honors calculus class in college, used it to convince the professor, who was at first reluctant, to put me in the class. As I recall, he decided that even though I was rather ignorant in general, at least I had some interest in mathematics.

FactChecker
Gold Member
another simple way to make the proof avoid involving decimals which end in all 9's is just to use the argument to prove that those decimals consisting only of 0's and 1's is already uncountable.
Decimals which end in all 9's are not a problem for the usual proof. Just rule them out and specify that those numbers on the list are represented by the equivalent representation ending in all 0's; that is still a valid representation of the list, which we assumed includes all real numbers. In forming the diagonal counterexample, just rule out expansions ending in all 9's. That is easy, since we are allowed to pick any digit not matching the diagonal digit: we only have to specify some digit other than the diagonal digit or 9. (This also clearly illustrates how abundant the counterexamples are: many more than are on the list.)

If one assumes there is an injection ##\mathbb R\to \mathbb N ##, there is necessarily a surjection ##\mathbb N\to \mathbb R ##, i.e. an enumeration ##(a_n)_{n\in\mathbb N}## of ##\mathbb R ##.
Write each ##a_j## in decimal with digits ##a_{jk}\in \{0,\ldots ,9\} ##, and construct a real number
$$a := \sum_{k\in\mathbb N} b_k\cdot 10^{-k},\quad \begin{cases}b_k = 2, & a_{kk}\geq 5\\ b_k = 7, &a_{kk}<5 \end{cases}\quad (\text{the choice is somewhat arbitrary})$$
For any ##n\in\mathbb N## we have ##a\neq a_n##, since the two differ in the ##n##th decimal digit; this contradicts the assumed surjectivity.

Since no such injection exists, the cardinality of ##\mathbb R ## must exceed that of ##\mathbb N ##.

One can construct injections ##(0,1)\to \mathbb R ## (this one is immediate) and ##\mathbb R\to (0,1) ##, hence by Cantor–Bernstein there is necessarily a bijection between the two sets. You can adapt the above sketch by instead assuming an injection ##(0,1)\to\mathbb N ##.
