MHB Understanding induction proof of pigeonhole principle

Mathmellow
Messages
7
Reaction score
0
I am struggling to understand the induction proof of the pigeonhole principle in my textbook. The theorem and the proof, from Biggs Discrete Mathematics, is pasted below, and I will explain further (see bold text) what I am having trouble with.

Theorem. Let m be a natural number. Then the following statement is true for every natural number n: if there is an injection from $$\Bbb{N}_n$$ to $$\Bbb{N}_m$$, then $$n\leq m$$.

Proof. We use the principle of induction.
The statement is true when $$n=1$$, since $$1\leq m$$ for any natural number m.
The induction hypothesis is that the statement is true when n takes a specific value $$k\ge 1$$. We have to deduce that it is true when $$n=k+1$$.
Suppose that $$j:\Bbb{N}_{k+1}\to\Bbb{N}_m$$ is an injection. Since $$k+1\geq 2$$, it follows (from the definition of an injection) that m cannot be 1. So $$m=s+1$$, for some natural number s. In order to show that $$k+1\leq m=s+1$$, we construct an injection $$j^*:\Bbb{N}_k\to\Bbb{N}_s$$, and use the induction hypothesis to conclude that $$k\leq s$$.
There are two cases.

Case 1: Suppose that $$j(x)\neq s+1$$ for all $$x\in \Bbb{N}_k$$. Then we let $$j^*$$ be the injection defined by $$j^*(x)=j(x)$$, for all $$x\in\Bbb{N}_k$$.
What do they mean by this step? What are they accomplishing by defining the injection $$j^*(x)=j(x)$$, and what exactly do they mean by $$j(x)\neq s+1$$?

Case 2: (See figure below) Suppose that there is an $$x\in\Bbb{N}_k$$ such that $$j(x)=s+1$$.
Then $$j(k+1)=y$$, where (since j is an injection) $$y\neq s+1$$. In this case define $$j^*$$ as follows: $$j^*(x)=y,\quad j^*(z)=j(z) (z\neq x)$$.
It is easy to check that $$j^*$$ is an injection.
View attachment 8180
Again, I am having problems understanding this case. I would appreciate if someone could explain this, as well as the previous step more thoroughly.
Thanks in advance!
 

Attachments

  • IMG_5444.JPG
    IMG_5444.JPG
    11.7 KB · Views: 145
Physics news on Phys.org
Mathmellow said:
what exactly do they mean by $j(x)\ne s+1$?
Well, to be honest, this question is strange. In mathematics, any two objects, including numbers, are either equal or not equal. In this case, $j(x)$ and $s+1$ are either the same number, or they are two different numbers. So one has the right two consider two cases in a proof: $j(x)=s+1$ and $j(x)\ne s+1$. In both cases one has to prove the same thing. It is similar to the following reasoning: If it rains tomorrow, I won't go to the park. If it doesn't rain, I won't go to the park, either. Therefore, I won't go there, period. To be more precise we are not considering a specific $x$, but rather the following cases: (1) $j(x)\ne s+1$ for all $1\le x\le k$ and (2) $j(x)= s+1$ for some $1\le x\le k$. I assume you understand all this.

If $j(x)\ne s+1$ for all $1\le x\le k$, then we have the right to appeal to the induction hypothesis for the initial portion of the function $j$, i.e., for $j$ restricted to the set $\mathbb{N}_k=\{1,\ldots,k\}$. That is, we throw away the value of $j$ at $k+1$. In the proof this portion is denoted by $j^*$. On the domain $\mathbb{N}_k$ the values of $j$ lie in $\mathbb{N}_s$. Indeed, they lie in $\mathbb{N}_m=\mathbb{N}_{s+1}$, but the value $s+1$ is never taken: that's the assumption in the first sentence of this paragraph. The restriction of $j$ to $\mathbb{N}_k$ is still an injection, so the induction hypothesis concludes that $k\le s$, which implies $k+1\le s+1=m$.

I have a small complaint about the way the induction statement is apparently stated in the book. By induction statement I mean the property of $n$ that is being proved by induction. Judging by the theorem, the statement is
\[
\forall j:\mathbb{N}_n\to \mathbb{N}_m.\,j\text{ is an injection}\implies n\le m.
\]
It looks like $m$ is a parameter here, i.e., $m$ is fixed for the duration of the proof. This is not the case. The correct induction statement is
\[
\forall m\,\forall j:\mathbb{N}_n\to \mathbb{N}_m.\,j\text{ is an injection}\implies n\le m.
\]
That is, if in the induction step we have to conclude that $k+1\le m$ if $j:\mathbb{N}_{k+1}\to \mathbb{N}_m$ is an injection, we can manufacture an injection $j^*:\mathbb{N}_k\to \mathbb{N}_s$ for any $s$, not necessarily for $s=m$, and the induction hypothesis guarantees that $k\le s$. This is what we have done above when we took the restriction $j^*$ of $j$ to $\mathbb{N}_k$: we have $j^*:\mathbb{N}_k\to \mathbb{N}_s$ where $m=s+1$. By the hypothesis $k\le s$, so $k+1\le s+1=m$, as required.

Let's get back to the second case. If $j(x)= s+1=m$ for some $1\le x\le k$, then we can't apply the induction hypothesis to the restriction $j^*$ of $j$ to $\mathbb{N}_k$. Or, rather, we can, and $j^*$ is still an injection, but it maps $\mathbb{N}_k$ to $\mathbb{N}_m$. Therefore the induction hypothesis is only able to guarantee that $k\le m$. We can't conclude from there that $k+1\le m$. So we consider a different $j^*$:
\[
j^*(z)=
\begin{cases}
j(k+1),&z=x\\
j(z),&z\ne x
\end{cases}.
\]
Since $j$ is an injection, $j(k+1)\ne j(x)=s+1$ since $x\le k$. So $j(k+1)\le s$ (it cannot be greater that $s+1$ because $j:\mathbb{N}_{k+1}\to \mathbb{N}_{s+1}$, and it cannot be $s+1$ because $j(x)=s+1$). Similarly $j(z)\le s$ for $z\ne x$. Thus, $j^*:\mathbb{N}_k\to \mathbb{N}_s$ and $j^*$ is an injection. By induction hypothesis $k\le s$.
 
Evgeny.Makarov said:
To be more precise we are not considering a specific $x$, but rather the following cases: (1) $j(x)\ne s+1$ for all $1\le x\le k$ and (2) $j(x)= s+1$ for some $1\le x\le k$. I assume you understand all this.
Yes, of course. Thank you for the clarification though.

Evgeny.Makarov said:
If $j(x)\ne s+1$ for all $1\le x\le k$, then we have the right to appeal to the induction hypothesis for the initial portion of the function $j$, i.e., for $j$ restricted to the set $\mathbb{N}_k=\{1,\ldots,k\}$. That is, we throw away the value of $j$ at $k+1$. In the proof this portion is denoted by $j^*$. On the domain $\mathbb{N}_k$ the values of $j$ lie in $\mathbb{N}_s$. Indeed, they lie in $\mathbb{N}_m=\mathbb{N}_{s+1}$, but the value $s+1$ is never taken: that's the assumption in the first sentence of this paragraph. The restriction of $j$ to $\mathbb{N}_k$ is still an injection, so the induction hypothesis concludes that $k\le s$, which implies $k+1\le s+1=m$.
This just became so much clearer, I really did not understand what the point of $j^*$ really was.
Evgeny.Makarov said:
The correct induction statement is
\[
\forall m\,\forall j:\mathbb{N}_n\to \mathbb{N}_m.\,j\text{ is an injection}\implies n\le m.
\]
That is, if in the induction step we have to conclude that $k+1\le m$ if $j:\mathbb{N}_{k+1}\to \mathbb{N}_m$ is an injection, we can manufacture an injection $j^*:\mathbb{N}_k\to \mathbb{N}_s$ for any $s$, not necessarily for $s=m$, and the induction hypothesis guarantees that $k\le s$. This is what we have done above when we took the restriction $j^*$ of $j$ to $\mathbb{N}_k$: we have $j^*:\mathbb{N}_k\to \mathbb{N}_s$ where $m=s+1$. By the hypothesis $k\le s$, so $k+1\le s+1=m$, as required.
So if I understand you correctly, the proof would have been simpler if they had used this statement? Was it a mistake the author made?
Evgeny.Makarov said:
So we consider a different $j^*$:
\[
j^*(z)=
\begin{cases}
j(k+1),&z=x\\
j(z),&z\ne x
\end{cases}.
\]
Since $j$ is an injection, $j(k+1)\ne j(x)=s+1$ since $x\le k$. So $j(k+1)\le s$ (it cannot be greater that $s+1$ because $j:\mathbb{N}_{k+1}\to \mathbb{N}_{s+1}$, and it cannot be $s+1$ because $j(x)=s+1$). Similarly $j(z)\le s$ for $z\ne x$. Thus, $j^*:\mathbb{N}_k\to \mathbb{N}_s$ and $j^*$ is an injection. By induction hypothesis $k\le s$.
Thank you so much! The way you wrote this is so much clearer than my textbook. I think I finally understand this proof completely, I am not very familiar with induction proofs where functions are involved like this.
 
Mathmellow said:
So if I understand you correctly, the proof would have been simpler if they had used this statement?
The author never wrote the induction statement explicitly (at least the proof in post #1 does not have it), but he made a distinction between $m$ and $n$ as follows: "Let $m\in\mathbb{N}$. Then for every $n\in\mathbb{N}$...". This suggests that $m$ is supposed to be fixed throughout the proof and we are proving $P(n)$ for all $n$ where $P(n)$ is
\[
\forall j:\mathbb{N}_n\to \mathbb{N}_m.\,j\text{ is an injection}\implies n\le m.
\]
It's not that the proof is more complicated with such $P(n)$; the problem is that the induction does not go through. The induction statement must start with $\forall m$.

In general if one proves $\forall m\forall n\,Q(m,n)$ by induction on $n$, it is better to prove $\forall m\,Q(m,n)$ for all $n$ rather than fix $m$ and prove $Q(m,n)$ for all $n$ and that specific $m$. In the first approach the induction step is
\[
(\forall m\,Q(m,n))\to\forall m\,Q(m,n+1),
\]
and in the second one it is
\[
Q(m,n)\to Q(m,n+1).
\]
While there is less to prove with $Q(m,n+1)$ compared to $\forall m\,Q(m,n)$, the induction hypothesis is also weaker: for example, it does not allow using $Q(m-1,n)$ or $Q(m+1,n)$. However, this is a subtle point that many students and instructors skip over, and this is OK. It becomes important, for example, when one studies formal proofs in mathematical logic.
 
I will definitely keep this in mind, I have a course in formal logic that I am planning to take so it will probably come in handy.

Thank you for taking your time to respond to my questions! :D
 
Namaste & G'day Postulate: A strongly-knit team wins on average over a less knit one Fundamentals: - Two teams face off with 4 players each - A polo team consists of players that each have assigned to them a measure of their ability (called a "Handicap" - 10 is highest, -2 lowest) I attempted to measure close-knitness of a team in terms of standard deviation (SD) of handicaps of the players. Failure: It turns out that, more often than, a team with a higher SD wins. In my language, that...
Hi all, I've been a roulette player for more than 10 years (although I took time off here and there) and it's only now that I'm trying to understand the physics of the game. Basically my strategy in roulette is to divide the wheel roughly into two halves (let's call them A and B). My theory is that in roulette there will invariably be variance. In other words, if A comes up 5 times in a row, B will be due to come up soon. However I have been proven wrong many times, and I have seen some...
Back
Top