MHB Understanding induction proof of pigeonhole principle

AI Thread Summary
The discussion revolves around understanding the induction proof of the pigeonhole principle as presented in Biggs' Discrete Mathematics. The theorem asserts that if there is an injection from the natural numbers up to n to those up to m, then n must be less than or equal to m. Participants express confusion about specific steps in the proof, particularly the construction of the injection j* and the implications of the cases where j(x) equals or does not equal s+1. Clarifications highlight that the induction statement should consider m as a variable rather than fixed, which simplifies the proof process. The conversation concludes with a better understanding of the proof's structure and the importance of correctly framing induction statements.
Mathmellow
Messages
7
Reaction score
0
I am struggling to understand the induction proof of the pigeonhole principle in my textbook. The theorem and the proof, from Biggs Discrete Mathematics, is pasted below, and I will explain further (see bold text) what I am having trouble with.

Theorem. Let m be a natural number. Then the following statement is true for every natural number n: if there is an injection from $$\Bbb{N}_n$$ to $$\Bbb{N}_m$$, then $$n\leq m$$.

Proof. We use the principle of induction.
The statement is true when $$n=1$$, since $$1\leq m$$ for any natural number m.
The induction hypothesis is that the statement is true when n takes a specific value $$k\ge 1$$. We have to deduce that it is true when $$n=k+1$$.
Suppose that $$j:\Bbb{N}_{k+1}\to\Bbb{N}_m$$ is an injection. Since $$k+1\geq 2$$, it follows (from the definition of an injection) that m cannot be 1. So $$m=s+1$$, for some natural number s. In order to show that $$k+1\leq m=s+1$$, we construct an injection $$j^*:\Bbb{N}_k\to\Bbb{N}_s$$, and use the induction hypothesis to conclude that $$k\leq s$$.
There are two cases.

Case 1: Suppose that $$j(x)\neq s+1$$ for all $$x\in \Bbb{N}_k$$. Then we let $$j^*$$ be the injection defined by $$j^*(x)=j(x)$$, for all $$x\in\Bbb{N}_k$$.
What do they mean by this step? What are they accomplishing by defining the injection $$j^*(x)=j(x)$$, and what exactly do they mean by $$j(x)\neq s+1$$?

Case 2: (See figure below) Suppose that there is an $$x\in\Bbb{N}_k$$ such that $$j(x)=s+1$$.
Then $$j(k+1)=y$$, where (since j is an injection) $$y\neq s+1$$. In this case define $$j^*$$ as follows: $$j^*(x)=y,\quad j^*(z)=j(z) (z\neq x)$$.
It is easy to check that $$j^*$$ is an injection.
View attachment 8180
Again, I am having problems understanding this case. I would appreciate if someone could explain this, as well as the previous step more thoroughly.
Thanks in advance!
 

Attachments

  • IMG_5444.JPG
    IMG_5444.JPG
    11.7 KB · Views: 147
Physics news on Phys.org
Mathmellow said:
what exactly do they mean by $j(x)\ne s+1$?
Well, to be honest, this question is strange. In mathematics, any two objects, including numbers, are either equal or not equal. In this case, $j(x)$ and $s+1$ are either the same number, or they are two different numbers. So one has the right two consider two cases in a proof: $j(x)=s+1$ and $j(x)\ne s+1$. In both cases one has to prove the same thing. It is similar to the following reasoning: If it rains tomorrow, I won't go to the park. If it doesn't rain, I won't go to the park, either. Therefore, I won't go there, period. To be more precise we are not considering a specific $x$, but rather the following cases: (1) $j(x)\ne s+1$ for all $1\le x\le k$ and (2) $j(x)= s+1$ for some $1\le x\le k$. I assume you understand all this.

If $j(x)\ne s+1$ for all $1\le x\le k$, then we have the right to appeal to the induction hypothesis for the initial portion of the function $j$, i.e., for $j$ restricted to the set $\mathbb{N}_k=\{1,\ldots,k\}$. That is, we throw away the value of $j$ at $k+1$. In the proof this portion is denoted by $j^*$. On the domain $\mathbb{N}_k$ the values of $j$ lie in $\mathbb{N}_s$. Indeed, they lie in $\mathbb{N}_m=\mathbb{N}_{s+1}$, but the value $s+1$ is never taken: that's the assumption in the first sentence of this paragraph. The restriction of $j$ to $\mathbb{N}_k$ is still an injection, so the induction hypothesis concludes that $k\le s$, which implies $k+1\le s+1=m$.

I have a small complaint about the way the induction statement is apparently stated in the book. By induction statement I mean the property of $n$ that is being proved by induction. Judging by the theorem, the statement is
\[
\forall j:\mathbb{N}_n\to \mathbb{N}_m.\,j\text{ is an injection}\implies n\le m.
\]
It looks like $m$ is a parameter here, i.e., $m$ is fixed for the duration of the proof. This is not the case. The correct induction statement is
\[
\forall m\,\forall j:\mathbb{N}_n\to \mathbb{N}_m.\,j\text{ is an injection}\implies n\le m.
\]
That is, if in the induction step we have to conclude that $k+1\le m$ if $j:\mathbb{N}_{k+1}\to \mathbb{N}_m$ is an injection, we can manufacture an injection $j^*:\mathbb{N}_k\to \mathbb{N}_s$ for any $s$, not necessarily for $s=m$, and the induction hypothesis guarantees that $k\le s$. This is what we have done above when we took the restriction $j^*$ of $j$ to $\mathbb{N}_k$: we have $j^*:\mathbb{N}_k\to \mathbb{N}_s$ where $m=s+1$. By the hypothesis $k\le s$, so $k+1\le s+1=m$, as required.

Let's get back to the second case. If $j(x)= s+1=m$ for some $1\le x\le k$, then we can't apply the induction hypothesis to the restriction $j^*$ of $j$ to $\mathbb{N}_k$. Or, rather, we can, and $j^*$ is still an injection, but it maps $\mathbb{N}_k$ to $\mathbb{N}_m$. Therefore the induction hypothesis is only able to guarantee that $k\le m$. We can't conclude from there that $k+1\le m$. So we consider a different $j^*$:
\[
j^*(z)=
\begin{cases}
j(k+1),&z=x\\
j(z),&z\ne x
\end{cases}.
\]
Since $j$ is an injection, $j(k+1)\ne j(x)=s+1$ since $x\le k$. So $j(k+1)\le s$ (it cannot be greater that $s+1$ because $j:\mathbb{N}_{k+1}\to \mathbb{N}_{s+1}$, and it cannot be $s+1$ because $j(x)=s+1$). Similarly $j(z)\le s$ for $z\ne x$. Thus, $j^*:\mathbb{N}_k\to \mathbb{N}_s$ and $j^*$ is an injection. By induction hypothesis $k\le s$.
 
Evgeny.Makarov said:
To be more precise we are not considering a specific $x$, but rather the following cases: (1) $j(x)\ne s+1$ for all $1\le x\le k$ and (2) $j(x)= s+1$ for some $1\le x\le k$. I assume you understand all this.
Yes, of course. Thank you for the clarification though.

Evgeny.Makarov said:
If $j(x)\ne s+1$ for all $1\le x\le k$, then we have the right to appeal to the induction hypothesis for the initial portion of the function $j$, i.e., for $j$ restricted to the set $\mathbb{N}_k=\{1,\ldots,k\}$. That is, we throw away the value of $j$ at $k+1$. In the proof this portion is denoted by $j^*$. On the domain $\mathbb{N}_k$ the values of $j$ lie in $\mathbb{N}_s$. Indeed, they lie in $\mathbb{N}_m=\mathbb{N}_{s+1}$, but the value $s+1$ is never taken: that's the assumption in the first sentence of this paragraph. The restriction of $j$ to $\mathbb{N}_k$ is still an injection, so the induction hypothesis concludes that $k\le s$, which implies $k+1\le s+1=m$.
This just became so much clearer, I really did not understand what the point of $j^*$ really was.
Evgeny.Makarov said:
The correct induction statement is
\[
\forall m\,\forall j:\mathbb{N}_n\to \mathbb{N}_m.\,j\text{ is an injection}\implies n\le m.
\]
That is, if in the induction step we have to conclude that $k+1\le m$ if $j:\mathbb{N}_{k+1}\to \mathbb{N}_m$ is an injection, we can manufacture an injection $j^*:\mathbb{N}_k\to \mathbb{N}_s$ for any $s$, not necessarily for $s=m$, and the induction hypothesis guarantees that $k\le s$. This is what we have done above when we took the restriction $j^*$ of $j$ to $\mathbb{N}_k$: we have $j^*:\mathbb{N}_k\to \mathbb{N}_s$ where $m=s+1$. By the hypothesis $k\le s$, so $k+1\le s+1=m$, as required.
So if I understand you correctly, the proof would have been simpler if they had used this statement? Was it a mistake the author made?
Evgeny.Makarov said:
So we consider a different $j^*$:
\[
j^*(z)=
\begin{cases}
j(k+1),&z=x\\
j(z),&z\ne x
\end{cases}.
\]
Since $j$ is an injection, $j(k+1)\ne j(x)=s+1$ since $x\le k$. So $j(k+1)\le s$ (it cannot be greater that $s+1$ because $j:\mathbb{N}_{k+1}\to \mathbb{N}_{s+1}$, and it cannot be $s+1$ because $j(x)=s+1$). Similarly $j(z)\le s$ for $z\ne x$. Thus, $j^*:\mathbb{N}_k\to \mathbb{N}_s$ and $j^*$ is an injection. By induction hypothesis $k\le s$.
Thank you so much! The way you wrote this is so much clearer than my textbook. I think I finally understand this proof completely, I am not very familiar with induction proofs where functions are involved like this.
 
Mathmellow said:
So if I understand you correctly, the proof would have been simpler if they had used this statement?
The author never wrote the induction statement explicitly (at least the proof in post #1 does not have it), but he made a distinction between $m$ and $n$ as follows: "Let $m\in\mathbb{N}$. Then for every $n\in\mathbb{N}$...". This suggests that $m$ is supposed to be fixed throughout the proof and we are proving $P(n)$ for all $n$ where $P(n)$ is
\[
\forall j:\mathbb{N}_n\to \mathbb{N}_m.\,j\text{ is an injection}\implies n\le m.
\]
It's not that the proof is more complicated with such $P(n)$; the problem is that the induction does not go through. The induction statement must start with $\forall m$.

In general if one proves $\forall m\forall n\,Q(m,n)$ by induction on $n$, it is better to prove $\forall m\,Q(m,n)$ for all $n$ rather than fix $m$ and prove $Q(m,n)$ for all $n$ and that specific $m$. In the first approach the induction step is
\[
(\forall m\,Q(m,n))\to\forall m\,Q(m,n+1),
\]
and in the second one it is
\[
Q(m,n)\to Q(m,n+1).
\]
While there is less to prove with $Q(m,n+1)$ compared to $\forall m\,Q(m,n)$, the induction hypothesis is also weaker: for example, it does not allow using $Q(m-1,n)$ or $Q(m+1,n)$. However, this is a subtle point that many students and instructors skip over, and this is OK. It becomes important, for example, when one studies formal proofs in mathematical logic.
 
I will definitely keep this in mind, I have a course in formal logic that I am planning to take so it will probably come in handy.

Thank you for taking your time to respond to my questions! :D
 
I was reading documentation about the soundness and completeness of logic formal systems. Consider the following $$\vdash_S \phi$$ where ##S## is the proof-system making part the formal system and ##\phi## is a wff (well formed formula) of the formal language. Note the blank on left of the turnstile symbol ##\vdash_S##, as far as I can tell it actually represents the empty set. So what does it mean ? I guess it actually means ##\phi## is a theorem of the formal system, i.e. there is a...
Back
Top