# Is Zero a Natural Number?

[Total: 2    Average: 4.5/5]

Using: Anderson-Feil Chapter 1.1

Is zero a natural number?

This is a pretty controversial question. Many mathematicians – especially those working in foundational areas – say yes. Another good deal of mathematicians say no. It’s not really an important question, since it is essentially just a definition and it matters very little either way. I will follow the convention that ##0## is a natural number – contrary to Anderson and Feil’s convention.

== Peano axioms ==
Anderson and Feil describe the natural numbers intuitively and assume that you are familiar with them. But they don’t go much into detail into how to make the natural numbers rigorous. A famous axiom system by Peano describes axioms that the natural numbers should satisfy. It makes it possible to give a completely rigorous account of the natural number system.

Basically, a Peano system consist of a set ##N##, a function ##s:N\rightarrow N## (the successor function) and an element ##0\in N##. These must satisfy the following three properties:
# There is no ##n\in N## such that ##s(n) = 0##.
# The function ##s## is injective.
# If ##G\subseteq N## is a set such that ##1\in G## and such that ##s(n)\in G## whenever ##n\in G##, then ##G=N##.

It is clear intuitively that the natural numbers satisfy these axioms. But perhaps more axioms should be needed to completely specify the natural numbers? We will see soon that this is not the case.

== Operations in the Peano system ==
We fix a Peano system ##\mathbb{N}## with successor function ##s## and distinguished element ##0##. Using our axiomatic definition of the natural numbers as a Peano system, we can now define addition, multiplication and exponentiation as follows:

We define ##+## recursively as the operation such that
# ##n+0 = n##
# ##n + s(m) = s(n+m)##

If we define ##1 = s(0)##, then we can prove easily that ##s(n) = n+1##, so our successor function does pick out the next element. Here are some things that can be proven about addition:

# For all ##n,m,k \in \mathbb{N}##, we have ##n+(m+k) = (n+m) + k##.
# For all ##n,m\in \mathbb{N}##, we have ##n+m = m+n##.
# For all ##n\in \mathbb{N}##, we have ##n+0 = 0+n = n##.
# For all ##n,m,k\in \mathbb{N}##, we have that ##n+k = m+k## if and only if ##n=m##.

For the proofs, try them as an exercise, or read the first few sections of Bloch’s “The real numbers and real analysis”.

We can define multiplication ##\cdot## recursively as the operation such that
# ##n\cdot 0 = 0##
# ##n\cdot s(m) = n\cdot m + n##

We can prove the following:

# For all ##n,m,k\in \mathbb{N}##, we have that ##n\cdot (m\cdot k) = (n\cdot m)\cdot k##.
# For all ##n,m,k\in \mathbb{N}##, we have that ##n\cdot m = m\cdot n##.
# For all ##n\in \mathbb{N}##, we have that ##n\cdot 1 = n = 1\cdot n##.
# For all ##n,m,k\in \mathbb{N}## such that ##k\neq 0##, we have that ##n\cdot k = m\cdot k## if and only if ##n=m##.
# For all ##n,m\in \mathbb{N}##, we have that ##n\cdot m = 0## if and only if ##n=0## or ##m=0## (or both).
# For all ##n,m,k\in \mathbb{N}##, we have that ##n\cdot (m+k) = n\cdot m + n\cdot k## and ##(m+k)\cdot n = m\cdot n + k\cdot n##.
# For all ##n,m\in \mathbb{N}##, we have that ##n\cdot m = 1## if and only if ##n=m=1##.

Again, prove this as an exercise or see Bloch.

Finally, we can define recursively exponentiation as
# ##n^0 = 1##
# ##n^{s(m)} = n^m \cdot n##

We can prove the following
# For all ##n\in \mathbb{N}##, we have that ##n^1 = n##.
# For all ##n,m,k\in \mathbb{N}##, we have that ##n^{m+k} = n^m \cdot n^k##.
# For all ##n,m,k\in \mathbb{N}##, we have that ##n^{m\cdot k} = (n^m)^k = (n^k)^m##.
# For all ##n,m,k \in \mathbb{N}##, we have that ##(n\cdot m)^k = n^k\cdot m^k##.

Try to prove this yourself.

We can also define an ordering ##\leq## as ##n\leq m## if and only if there is some ##k## such that ##n+k = m##. It is then possible to prove:

# For all ##n\in \mathbb{N}##, we have ##n\leq n##.
# For all ##n,m\in \mathbb{N}##, we have that ##n\leq m## and ##m\leq n## implies that ##n=m##.
# For all ##n,m,k\in \mathbb{N}##, we have that ##n\leq m## and ##m\leq k## implies ##n\leq k##.
# For all ##n,m\in \mathbb{N}##, we have that either ##n\leq m## or ##m\leq n## (or both, but this only happens if and only if ##n=m##.
# For all ##n,m,k,l\in \mathbb{N}##, we have that ##n\leq k## and ##m\leq l## implies that ##n+m\leq k+l##.
# For all ##n,m,k,l\in \mathbb{N}##, we have that ##n\leq k## and ##m\leq l## implies that ##n\cdot m \leq k\cdot l##.
# For all ##n,m\in \mathbb{N}##, we have that ##n\leq m\leq n+1## implies that ##n=m## or ##n+1 = m##.

The ordering ##<## can then be defined as ##n<m## if and only if ##n\leq m## is true and ##n=m## is not true. This satisfies similar properties as above: formulate and prove them.

Substraction can also be defined: if ##n\leq m##, then we define ##m-n## as the number such that ##n+(m-n) = m##. We can prove that ##m-n## is well-defined (that is: there is exactly one number ##k## such that ##n+k = m##).

== Categoricity of the natural numbers ==
We have asserted that the Peano system identifies the natural numbers exactly. It should be obvious intuitively that the natural numbers satisfy the Peano axioms. But aren’t there any axioms missing? Another way to ask this is: sure, the natural numbers satisfy the Peano axioms, but are there other structures that satisfy the Peano axioms too?

The answer is no. There is essentially only one structure that satisfies the Peano axioms. Why the word “essentially”? Because pure uniqueness is something that we can never get out of an axiomatic treatment. For example, ##\mathbb{N}## satisfies the Peano axioms. But also ##\{0,2,4,6,8,…\}## satisfies them if we let our successor function be ##s(n) = n+2##. But this second example is just a renaming of our first example, it is not essentially different.

Formally, if ##N## is a Peano system with successor function ##s## and distinguished element ##0##, and if ##M## is a Peano system with successor function ##t## and distinguished element ##a##, then there is a unique bijection ##f:N\rightarrow M## such that ##f(s(n)) = t(f(n))## for all ##n\in N## and such that ##f(0) = a##. This function intuitively does the renaming in the sense that ##f(n)## is just another name for ##n##.

Some people are confused that ##\{0,2,4,6,8,…\}## is also a model for the Peano axioms, while surely it’s not the natural numbers that we know. This is true, but it does behave exactly like our real numbers in the exact same way. The main differences between the set ##\{0,2,4,6,8,…\}## as Peano system and as set that we usually know, is that as we usually know it, we have ##2\cdot 2 = 4##, while as a Peano system, we have
$$2\cdot 2 = 2\cdot s(0) = 2\cdot 0 + 2 = 0 + 2 = 2.$$
So if we see ##\{0,2,4,6,8,…\}## as a Peano system, then the Peano operations are completely different as the one we are used to. Indeed, the operation are just like we see ##2## as another name for ##1##. Imagine a parallel universe where one would be denoted as ##2##, where two would be denoted as ##4##, and so on. It is exactly that universe that the Peano system ##\{0,2,4,6,8,…\}## describes! People in that parallel universe, the people would find our system of ##0,1,2,3,4,…## very weird!

== Independence of the axioms ==
It is a useful mathematical exercise to show that the three Peano axioms are independent. This means that if we assume two of the axioms to be true, then it is impossible to prove the third. Here is how to do this in this case:

1) Assume that the first two axioms are true, then the induction axiom is not necessarily true.
A good example of such a structure is ##[1,+\infty)##, with ##s(x) = x+1##. Then the function ##s## is clearly injective and there is no ##x## such that ##s(x) = 1##. The induction theorem is false. Indeed, consider ##\mathbb{N} = \{1,2,3,…\}##. This is a subset of our structure, and it satisfies that ##1\in \mathbb{N}## and if ##n\in \mathbb{N}## then ##n+1\in \mathbb{N}##. But our set isn’t entire ##[1,+\infty)##.

In this sense we can see the induction axiom as an axiom delimiting the size of our system. So the induction hypothesis might be seen as an axiom saying that our system is the smallest possible satisfying axiom (1) and (2).

2) Assume that the first axiom and the induction are true, then the second axiom is not necessarily true.
The idea here is to take ##X = \{1,2,…,n\}## with a somewhat weird ##s## function. We define ##s(1) = 2##, ##s(2) = 3##, etc. until ##s(n-1) = n##. The only difference is that we also define ##s(n) = 2##. Then the first axiom is clearly satisfied and the induction hypothesis can be satisfied, but ##s## is not injective since ##s(n) = s(1) = 2##.

3) Assume that the second axiom and the induction axiom is true, then the first axiom is not necessarily true.
The idea is to take ##X = \{1,…,n\}## and to take ##s(1) = 2##, ##s(2) = 3##, etc. until ##s(n-1) = n##. The only difference with the previous example is by taking ##s(n) = 1##. This gives some kind of cyclic structure.

This example is actually a very useful mathematically since it can be used to describe divisibility. A lot of useful theorems of natural numbers actually can be carried over to this context. This yields the example of cyclic groups and rings.

== Construction of the natural numbers ==
It is possible to construct a Peano system by using only sets. Note that we take a Peano system in the sense of the above definition, where the “first element” is ##0##.

The idea is that
##0 = \emptyset,~1 = \{\emptyset\},~2 = \{\emptyset,\{\emptyset\}\},…##
In general, we define ##0=\emptyset##, and if ##n## is defined, then we define ##n+1 = n\cup \{n\}##.

More formally, we call a set ##X## to be inductive if ##\emptyset \in X## and if for all ##x\in X##, we have that ##x\cup\{x\}## is an element of ##X##. It is an axiom of set theory that an inductive set exists. The set of natural numbers can then be defined as the smallest inductive set, or:
##\mathbb{N}=\bigcap\{X~\vert~X~\text{is inductive}\}.##
We can check that this set ##\mathbb{N}## satisfies the Peano axioms with ##0 = \emptyset## and ##s(x) = x\cup \{x\}##. Let us do that:

1) There is no ##n\in \mathbb{N}## such that ##s(n) = 0##.
Indeed, if such an ##n## exists, then we would have ##n\cup \{n\} = \emptyset##. But then ##n\in \emptyset## which is impossible.

2) The function ##s## is injective.
First we prove that every ##n\in \mathbb{N}## is transitive. This means that if ##m\in n##, then ##m\subseteq n##. Indeed, let ##X\subseteq \mathbb{N}## be the set of all elements ##n\in \mathbb{N}## that are transitive. We prove that ##X## is inductive, from which follows that ##X=\mathbb{N}##. It is clear that ##\emptyset## is inductive, so ##\emptyset \in X##. Now assume that ##n\in X##. Then ##n## is inductive. Now take ##m\in n\cup \{n\}##, then either ##m = n##, from which follows that ##m\subseteq n\cup \{n\}##. Or we have that ##m\in n##. Since ##n## is inductive, it follows that ##m\subseteq n##. So ##X## is inductive.

Now assume that ##s(n) = s(m)##. Then ##n\cup \{n\} = m\cup \{m\}##. If ##n\neq m##, then we have that ##n\in m## and ##m\in n##. But since ##n## is inductive, it follows that ##m\subseteq n##. So it follows from ##n\in m## that ##n\in n##. We now show that this is impossible for any ##n\in \mathbb{N}##. To do this, we take ##X## to be the set of all ##n\in \mathbb{N}## such that ##n\notin n##. Clearly, ##\emptyset \in X##. Now assume that ##n\notin n##. If ##n\cup \{n\} \in n\cup \{n\}##, then either ##n\cup \{n\}\in n## or ##n\cup \{n\} = n##. We show that both are impossible:
a) If ##n\cup \{n\} \in n##, then by transitivity of ##n## follows that ##n\cup \{n\}\subseteq n##. Then ##n\in n## which was against the assumption.
b) If ##n\cup \{n\} = n##, then ##n\in n##, which from the assumption was again impossible.
So we have prove that if ##n\in X##, then so is ##n\cup \{n\}##. So ##X## is inductive. But then ##X=\mathbb{N}##.

3) Assume that ##G\in \mathbb{N}##. Assume that ##0\in G## and that from ##g\in G## follows that ##s(g)\in G##. Then ##G=\mathbb{N}##.
This follows immediately since ##G## is inductive and ##\mathbb{N}## is the smallest inductive set.

== What is a natural number ==
We have now exhibited two definitions of natural numbers: one as a set that satisfies the Peano axioms, and the other a very weird explicit construction in terms of sets. This last definition is very weird, we had things like ##1 = \{\emptyset\}##. But surely the number ##1## is something different?

This is a philosophical issue. What exactly is the nature of the number ##1## (or any other number) is unknown and debatable. All mathematicians care about is what properties the natural numbers have. It seems very agreeable to say that the natural numbers (whatever they are) should satisfy the Peano axioms. So what mathematicians do is not worry about what natural numbers are exactly, but rather construct a model that behaves exactly like the natural numbers. The benefit of that is we don’t need anything extra outside of set theory in order to talk about natural numbers.

== Consistency of the Peano axioms ==
Consistency means that the axioms cannot lead to a contradiction. A contradiction is a statement that can be proven true and false. It is crucial in mathematics that our systems are consistent. For example, consider the following axiom system which is a set ##X## satisfying the following axioms
1) ##X## is nonempty
2) ##X## is empty
Clearly, there is no such axiomatic system, since my axioms are contradictory. So this pretty silly example shows that we cannot just allow anything to be an axiom: for any axiomatic system it has the risk that it is similarly contradictory.

An important question is whether the Peano axioms are consistent. Sadly, it has been shown by Gödel that nobody can show whether the axiomatic system is consistent in itself. Gödel’s result means that we can only talk about the consistency of the Peano axioms relatively to another axiomatic system. With that in mind, we can prove that the Peano axioms are consistent within set theory: if set theory is consistent, then so are the Peano axioms. Our construction of the natural numbers above is one way of showing this.

But isn’t the Peano system just a formalized version of the counting numbers? Numbers we use every single day? Yes, but a Peano system also asserts that there are infinitely many such numbers. While this may seem obvious, this is outside the realm of our experience. Surely, the numbers ##1##, ##2##, ##3## behave like we expect, but why should gigantic numbers like ##10^{10^{10}}## exist at all? It is safe to say that such infinite numbers do not exist in our reality. If they don’t exist in our reality, then nothing guarantees that these made-up numbers make sense at all. And nothing guarantees that the collection of these numbers is a consistent whole.

== Fibonacci numbers ==
Anderson and Feil define the Fibonnaci numbers as those numbers ##F_n## such that
##F_0 = 0,~F_1 = 1~\text{and}~F_{n+2} = F_{n+1} + F_n~\text{if}~n\in \mathbb{N}.##
Subsequently, he proves the rather mysterious formula
##F_n = \frac{(1+ \sqrt{5})^n – (1 – \sqrt{5})^n}{2^n \sqrt{5}}.##
Sadly, he does not indicate how we found this remarkable relation!

I will indicate two ways to find this remarable relation, one relies on a lucky guess, the other is more general but requires some knowledge of linear algebra.

”’Method 1:”’ What is the lucky guess? Well, the idea is that a general form to such a recursion relation is of the form
##CA^n + DB^n##
for some ##A,B,C,D\in \mathbb{R}##. It is not clear why this should be true (hence the lucky guess), but it does always work for this recursive relation and many similar ones. That it always works, we can show with the linear algebra technique.

Anyway, we guess that ##F_n = CA^n + DB^n## for all ##n##. In particular, this must be true for ##n=0## and ##n=1##. This gives us the constraints that ##0 = C+D## and ##1 = CA + DB##. Because ##C = -D##, the second relation gives us ##1 = D(B – A)##. In particular ##D\neq 0## and ##B\neq A##.

Now, the relation ##F_{n+2} = F_{n+1} + F_n## must also be satisfied. Since we know that ##F_n = D(B^n – A^n)##, we see that
##D(B^{n+2} – A^{n+2}) = D(B^{n+1} – A^{n+1}) + D(B^n – A^n).##
We know that ##D\neq 0##, so dividing by ##D## gives us
##B^{n+2} – B^{n+1} – B^n = A^{n+2} – A^{n+1} – A^n##
for all ##n##. And so
##B^n(B^2 – B -1) = A^n(A^2 – A – 1).##

”’Lemma:”’ For real numbers ##x,y,\alpha,\beta## with ##x## and ##y## nonzero, it holds that that if ##\alpha x^n = \beta y^n## for two consecutive integers ##n##, then either ##x=y## or ##\alpha = \beta=0##.

”’Proof:”’ Since this must be true for some integers ##k## and ##k+1##, we see that ##\alpha x^k = \beta y^k##. It must also be true for ##n=k+1##, so ##\alpha x^{k+1}= \beta y^{k+1}##. We get
##\beta y^{k+1} = \alpha x^{k+1} = \alpha x^k x = \alpha y^k x##
So we get from ##y## being nonzero that
##\beta y = \alpha x.##

”’Case 1:”’We see that if ##\alpha = 0##, then so is ##\beta y = 0##. Since ##y## is nonzero, we get ##\beta = 0##. Thus we get the case ##\alpha = \beta = 0##.

”’Case 2:”’ In the other case, assume that ##\alpha \neq 0##. Since ##y## is nonzero, we get ##\beta = \frac{\alpha x}{y}##. Substituting this into ##\alpha x^n = \beta y^n## yields
##\alpha x^n = \alpha x y^{n-1}##
for all ##n\geq k##. Since ##\alpha\neq 0##, we get ##x^n = x y^{n-1}##. Since ##x## is nonzero, we get ##x^{n-1} = y^{n-1}##. By setting ##n## to be an even number (note that either ##k## or ##k+1## is even), we see that ##n-1## is an odd number. By taking the ##n-1##th root, we get ##x = y## as desired. ”’QED”’

Note that the proof goes through even in the case that ##\alpha x^n = \beta y^n## only for two consecutive natural numbers ##n##.

##B^n(B^2 – B -1) = A^n(A^2 – A – 1).##
which we assume to be true for all natural numbers. From the lemma we get three possible solutions:

:: ”’Case (1)”’ ##B = 0##: In this case, the general solution is ##F_n = CA^n##. Since this must be true for ##n=0##, we get that ##0 = F_0 = C##. But then ##F_n = 0## for all ##n##, this is in contradiction with ##F_1 = 1##.

:: ”’Case (2)”’ ##A = 0##: this is similar as the case above.

:: ”’Case (3)”’ So the final case must be true that ##B^2 – B – 1 = A^2 – A – 1 = 0##. Otherwise put: both ##A## and ##B## are solutions to the quadratic equation ##x^2 – x – 1 = 0##. Notice that from ##D(B-A) = 1## (derived above), we get that ##A\neq B##. So ##A## and ##B## must be the two distinct solutions to ##x^2 – x – 1##. The two solutions are given by ##\frac{1+\sqrt{5}}{2}## and ##\frac{1 – \sqrt{5}}{2}##. We can choose one of these as ##A## and one of these as ##B## (the situation is clearly symmetric since ##F_n = CA^n + DB^n## is equivalent to ##F_n = CB^n + DA^n##). So we get that
##F_n = C\left(\frac{1+\sqrt{5}}{2}\right)^n + D\left(\frac{1-\sqrt{5}}{2}\right)^n.##
:: Again from the relation ##D(B-A) = 1##, we get that
##1 = D(B-A) = D\left(\frac{1 – \sqrt{5}}{2} – \frac{1 + \sqrt{5}}{2}\right) = -\sqrt{5}.##
:: So ##D = -\frac{1}{\sqrt{5}}## and ##C = -D = \frac{1}{\sqrt{5}}##.
:: We end up with
##F_n = \frac{1}{\sqrt{5}}\left(\frac{1+\sqrt{5}}{2}\right)^n – \frac{1}{\sqrt{5}}\left(\frac{1-\sqrt{5}}{2}\right)^n.##
:: This is exactly the claimed solution.

”’Method 2:”’ The second solution does not rely on the lucky guess ##F_n = CA^n + DB^n##. It depends on rewriting our equation as matrices. Note that if ##F_{n+2} = F_{n+1} + F_n##, then we can rewrite this as
##\left(\begin{array}{c}F_{n+2}\\ F_{n+1}\end{array}\right)=\left(\begin{array}{cc} 1 & 1\\ 1 & 0\end{array}\right)\left(\begin{array}{c}F_{n+1}\\ F_n\end{array}\right).##
We can rewrite the final column matrix in the same way, we get
##\left(\begin{array}{c}F_{n+2}\\ F_{n+1}\end{array}\right)=\left(\begin{array}{cc} 1 & 1\\ 1 & 0\end{array}\right)^2\left(\begin{array}{c}F_n\\ F_{n-1}\end{array}\right).##
We can do this over and over, eventually, we get to
##\left(\begin{array}{c}F_{n+2}\\ F_{n+1}\end{array}\right)=\left(\begin{array}{cc} 1 & 1\\ 1 & 0\end{array}\right)^{n+1}\left(\begin{array}{c}F_1\\ F_0\end{array}\right) = \left(\begin{array}{cc} 1 & 1\\ 1 & 0\end{array}\right)^{n+1}\left(\begin{array}{c} 1 \\ 0\end{array}\right).##

We can write this differently by diagonalizing the matrix ##\left(\begin{array}{cc} 1 & 1\\ 1 & 0\end{array}\right)##. The eigenvalues can be found by solving the characteristic equation ##x^2 – x – 1 = 0## (notice that we encountered this in the first solution too). This has as solution ##\frac{1 \pm \sqrt{5}}{2}##. Eigenvalues are easily seen to be given by ##(\frac{1\pm \sqrt{5}}{2}, 1)##.

Using diagonalization, we then can write
##\left(\begin{array}{cc} 1 & 1\\ 1 & 0\end{array}\right) = \left(\begin{array}{cc} \frac{1 – \sqrt{5}}{2} & \frac{1 + \sqrt{5}}{2}\\ 1 & 1\end{array}\right)\left(\begin{array}{cc} \frac{1 -\sqrt{5}}{2} & 0\\ 0 & \frac{1 + \sqrt{5}}{2} \end{array}\right)\left(\begin{array}{cc} – \frac{1}{\sqrt{5}} & \frac{5 + \sqrt{5}}{10}\\ \frac{1}{\sqrt{5}} & \frac{5 – \sqrt{5}}{10}\end{array}\right)##

Note now that if ##A = XDX^{-1}##, then ##A^2 = XDX^{-1}XDX^{-1} = XD^2 X^{-1}##. In general, we see easily that ##A^n = X D^n X^{-1}##. So we get that

##\begin{eqnarray*}
\left(\begin{array}{c}F_{n+2}\\ F_{n+1}\end{array}\right)
& = & \left(\begin{array}{cc} \frac{1 – \sqrt{5}}{2} & \frac{1 + \sqrt{5}}{2}\\ 1 & 1\end{array}\right)\left(\begin{array}{cc} \frac{1 -\sqrt{5}}{2} & 0\\ 0 & \frac{1 + \sqrt{5}}{2}\end{array}\right)^{n+1}\left(\begin{array}{cc} – \frac{1}{\sqrt{5}} & \frac{5 + \sqrt{5}}{10}\\ \frac{1}{\sqrt{5}} & \frac{5 – \sqrt{5}}{10}\end{array}\right)\left(\begin{array}{c}1 \\ 0 \end{array}\right)\\
& = & \left(\begin{array}{cc} \frac{1 – \sqrt{5}}{2} & \frac{1 + \sqrt{5}}{2}\\ 1 & 1\end{array}\right)\left(\begin{array}{cc} \left(\frac{1 -\sqrt{5}}{2}\right)^{n+1} & 0\\ 0 & \left(\frac{1 + \sqrt{5}}{2}\right)^{n+1} \end{array}\right)\left(\begin{array}{c} – \frac{1}{\sqrt{5}}\\ \frac{1}{\sqrt{5}}\end{array}\right)\\
& = & \left(\begin{array}{cc} \frac{1 – \sqrt{5}}{2} & \frac{1 + \sqrt{5}}{2}\\ 1 & 1\end{array}\right)\left(\begin{array}{c} -\frac{1}{\sqrt{5}}\left(\frac{1 -\sqrt{5}}{2}\right)^{n+1}\\ \frac{1}{\sqrt{5}}\left(\frac{1 + \sqrt{5}}{2}\right)^{n+1}\end{array}\right)\\
& = & \left(\begin{array}{c} \frac{1}{\sqrt{5}}\left(\frac{1 + \sqrt{5}}{2}\right)^{n+2} -\frac{1}{\sqrt{5}}\left(\frac{1 -\sqrt{5}}{2}\right)^{n+2}\\ \frac{1}{\sqrt{5}}\left(\frac{1 + \sqrt{5}}{2}\right)^{n+1} -\frac{1}{\sqrt{5}}\left(\frac{1 -\sqrt{5}}{2}\right)^{n+1}\end{array}\right)
\end{eqnarray*}##

This gets us exactly the solution we wanted.

Exercises:
# Find the general form of the Lucas numbers defined by ##L_0 = 2##, ##L_1 = 1## and ##L_{n+2}= L_{n+1} + L_n##. Use both methods to find the answer.
# Find the general form of the numbers defined by ##A_0 = 2##, ##A_1 = 4## and ##A_{n+2} = 5A_{n+1} + 3A_n##. Use both methods to find the answer.
# Find the general form of the numbers defined by ##B_0 = 0##, ##B_1 = 0##, ##B_2 = 1## and ##B_{n+3} = B_{n+2} + B_{n+1} + B_n##. Adapt both methods to also deal with this more general case.

== Summation formulas in exercise 1,2,6 ==
In exercise 1,2 and 6, you are asked to provide proofs of several formulas. These formulas all are of the form
##1^k + 2^k + 3^k + 4^k + … + n^k = …##
for some fixed number ##k##. In exercise 1, this fixed number is ##k=1##. In exercise 2, it is ##k=2## and in exercise 3, it is ##k=3##.

Neither the exercise, nor the proof actually tells us how to find these formulas in practice! This is one of the bad parts about induction proofs: an induction proof shows that a formula is completely true, but it doesn’t tell us how we found it. As such, an induction proof is not really useful, because it doesn’t tell us how to generalize the result or how to find related results. This is why many mathematicians try to avoid induction proofs: it’s not that they are invalid, it’s more that non-induction proofs are usually way more informative.

The idea behind a non-induction proof is that we can get the value of ##1^{k+1} + 2^{k+1}+ … n^{k+1}## from ##1^k + 2^k + … + n^k##. In particular, we can get the value of exercise 1 from ##1^0 + 2^0 + … + n^0##, which is easily computed to be ##n##. We can then use the value from exercise 1 to get the value for exercise 2, and we can use the value for exercise 2 to get the value for exercise 3.

I will illustrate the procedure here. I will assume that we already know the formulas
##1 + 1 + … + 1 = n##
##1 + 2 + … + n = \frac{n(n+1)}{2}##
##1^2 + 2^2 + … + n^2 = \frac{n(2n+1)(n+1)}{6}## and I will demonstrate how to compute ##1^3 + 2^3 + … + n^3## without knowing the value of this sum beforehand (like in an induction proof).

The idea is behind the binomial formula ##(a+b)^4 = a^4 + 4a^3b + 6a^2b^2 + 4 ab^3 + b^4## (see exercise 14 on how to find this formula). We can then write
##(m+1)^4 – m^4 = (m^4 + 4m^3 + 6m^2 + 4m + 1) – m^4 = 4m^3 + 6m^2 + 4m + 1##
We can easily compute the following telescoping sum
##\begin{eqnarray*}
\sum_{m=1}^n[(m-1)^4 – m^4]
& = & (2^4 – 1^4) + (3^4 – 2^4) + … + ((n+1)^4 – n^4)\\
& = & (n+1)^4 – 1
\end{eqnarray*}##
On the other hand, we know that
##\begin{eqnarray*}
\sum_{m=1}^n[(m-1)^4 – m^4]
& = & \sum_{m=1}^n [4m^3 + 6m^2 + 4m + 1]\\
& = & 4\sum_{m=1}^n m^3 + 6\sum_{m=1}^n m^2 + 4\sum_{m=1}^n m + \sum_{m=1}^n 1\\
& = & 4\sum_{m=1}^n m^3 + 6\frac{n(2n+1)(n+1)}{6} + 4\frac{n(n+1)}{2} + n
\end{eqnarray*}##
By equating the previous equations and isolating ##\sum_{m=1}^n m^3##, we get
##\begin{eqnarray*}
\sum_{m=1}^n m^3
& = & \frac{1}{4}\left((n+1)^4 – 1 – 6\frac{n(2n+1)(n+1)}{6} – 4\frac{n(n+1)}{2} – n\right)\\
& = & \frac{1}{4}\left((n^4 + 4n^3 + 6n^2 + 4n +1) – 1 – n(2n+1)(n+1) – 2n(n+1) – n\right)\\
& = & \frac{1}{4}\left((n^4 + 4n^3 + 6n^2 + 4n +1) – 1 – (2n^3 + 3n^2 + n) – (2n^2 + 2n) – n\right)\\
& = & \frac{1}{4}\left(n^4 + 2n^3 + n^2 \right)\\
& = & \frac{1}{4} n^2(n+1)^2
\end{eqnarray*}##
which is the formula.

”’Exercises:”’
# Use the same technique to find ##1 + 2 + … + n## by only assuming the (trivial) identity ##1^0 + 2^0 + … + n^0 = n##.
# Use the same technique to find ##1^2 + 2^2 + … + n^2## by only assume the (trivial) identity ##1^0 + 2^0 + … + n^0 = n## and ##1+ 2 +… + n = \frac{n(n+1)}{2}##.
# Find ##1^4 + 2^4 + … + n^4##.

== Exercise 8 ==
In exercise 8, you are asked to prove a formula of the form
##\frac{1}{1\cdot 2} + \frac{1}{2\cdot 3} + … +\frac{1}{n(n+1)}##
Again, we can ask the question on how to find this formula if we don’t know the answer in advance. The trick here is one that you should know from integration: the technique of partial fractions. What we try to do is to decompose ##\frac{1}{k(k+1)}## in partial fractions. That is, we assume
##\frac{1}{k(k+1)} = \frac{A}{k} + \frac{B}{k+1}.##
This is equivalent with
##\frac{1}{k(k+1)} = \frac{Ak + A}{k(k+1)} + \frac{Bk}{k(k+1)}.##
Thus we get that ##(A+B) = 0## and ##A = 1##. This gets us as solutions ##A=1## and ##B = -1##. So we see that
##\frac{1}{k(k+1)} = \frac{1}{k} – \frac{1}{k+1}.##
Now we get a telescoping sum:
##\begin{eqnarray*}
\frac{1}{1\cdot 2} + \frac{1}{2\cdot 3} + … +\frac{1}{n(n+1)}
& = & \left(\frac{1}{1} – \frac{1}{2}\right)+\left(\frac{1}{2} – \frac{1}{3}\right)+ … + \left(\frac{1}{n} – \frac{1}{n+1}\right)\\
& = & 1 – \frac{1}{n+1}
\end{eqnarray*}##
which is exactly the formula we want.

== Exercise 14 ==
In Exercise 14, Anderson and Feil introduce the binomial coefficients as ##\binom{n}{k} = \frac{n!}{k!(n-k)!}##. Now, the book doesn’t prove it, and doesn’t really need it, but this binomial coefficient has a very nice interpretation. Namely ##\binom{n}{k}## is the number of subsets with ##k## element in a set with ##n## elements.

Let us try to prove this. We will show it in two ways: one intuitive combinatorial way, and one somewhat more formal way.

”’Intuitive”’ Let ##X## be a set with ##n## elements. In order to select a subset of ##k## elements of ##X##, we can do this by selecting a first element from ##X##, then from the remaining ##n-1## elements, we select a second element, and we continue this process ##k## times.

Selecting the first element can be done in ##n## ways. Selecting the second element can be done in ##n-1## ways. Selecting the ##k##th element can be done in ##n-k+1## ways since there are ##n-k+1## elements left. In total, selecting ##k## elements can be done in ##n(n-1)…(n-k+1)## ways. This is easily seen to equal ##\frac{n!}{(n-k)!}##.

But we are not finished. When selecting the ##k## elements, we have also implicitely ordered them. We must remove this ordering. For example, if we select ##2##-element sets from the set ##\{a,b,c,d\}##. Then I can first select ##a## and then ##b##, but I can also select first ##b## and then select ##a##. This yields the same set, while my method counted them as different sets.

The number of orderings on a set with ##k## elements is ##k!##. So in order to remove the orderings, we have to divide by ##k!##. We get ##\frac{n!}{k!(n-k)!}## subsets with ##k## elements.

”’Formal:”’ Let ##X## be a set with ##n## elements. Let
##\Omega = \{(A,f)~\vert~A\subseteq X~\text{has}~k~\text{elements and}~f~\text{is a bijection between}~A~\text{and}~\{1,…,k\}\}.##
We can count the number of elements in ##\Omega##. To obtain an element in ##\Omega##, we first need to select a subset of ##X## of ##k## elements, and then we need to select a bijection between ##A## and ##\{1,…,k\}##. Selecting a subset of ##X## of size ##k## can happen in ##\binom{n}{k}## elements. There are also ##k!## bijections between this subset ##A## and between ##\{1,…,k\}##. So
##|\Omega| = \binom{n}{k} k!##
On the other hand, an element in ##\Omega## is already completely specified by selecting an injection ##g:\{1,…,k\}\rightarrow X##. Indeed, given such a bijection, we get the following element of ##\Omega##: ##(g(\{1,…,k\}), g^{-1})##. The number of such injections is ##\frac{n!}{(n-k)!}##. Thus we get that
##|\Omega| = \frac{n!}{(n-k)!}##
In particular, we get the formula
##\binom{n}{k} = \frac{n!}{k!(n-k)!}.##

This more formal technique is somewhat more formal than the combinatorial way since we don’t deal with things like “up to reordering”. But more importantly: this formal way also is a great technique to find certain formulas! Try to use this technique to solve the following:

:: 1) Finding the number of injections. Let ##|A| = n## and ##|B| = k##. Assume without loss of generality that ##A\subseteq B##. show that the number of injections ##A\rightarrow B## is given by ##\frac{n!}{(n-k)!}## by considering the set
##\Omega = \{(f,g)~\vert~f~\text{is an injection}~A\rightarrow B~\text{and}~g~\text{is an injection}~B\setminus A\rightarrow B\setminus f(A)\}.##
:: 2) Let ##S## be a set and let ##A_1##, ##A_2##, … ##A_m##, and ##B_1##, ##B_2##, … ##B_n## be two groups of subsets of ##S##. There are ##p## elements in each ##A_i## and each element of ##S## is in exactly ##p_1## sets of the ##A## group. There are ##q## elements in each ##B_i## and each element of ##S## is in exactly ##q_1## sets of the ##B## group. Write ##n## in terms of ##m##, ##p##, ##p_1##, ##q## and ##q_1##.

== Multinomial theorem ==
There is also a multinomial theorem:
##(x_1 + … + x_k)^n = \sum \binom{n}{n_1,…,n_k} x^{n_1} … x^{n_k}##
where the sum ranges over the set ##\{(n_1,…,n_k)~\vert~n_i\in \mathbb{N}~\text{and}~n_1 + … + n_k = n\}##, and where the multinomial coefficient is
##\binom{n}{n_1,…,n_k} = \frac{n!}{n_1!…n_k!}.##

”’Exercises:”’
# Prove the multinomial theorem by induction.
# Interpret the multinomial coefficient combinatorially.
# Prove the multinomial theorem combinatorially.
# Prove that ##\binom{n}{k} = \binom{n}{k,n-k}##
# Note that also ##\binom{n}{n-k} = \binom{n}{k,n-k}##. This proves that ##\binom{n}{k} = \binom{n}{n-k}##. Prove this formula also combinatorially.

0 replies