Office_Shredder said:
so you could solve that recurrence relation numerically and see how it compares...
Recursion is a good idea. We can apply it to recursively generate the relevant distributions.
Let ##N## be the population size, and let ##M## the number of samples taken. We can imagine the samples taken taken in a sequence. Let ##f_M()## be the probability distribution for the number of distinct populations members in the sample after M samples are taken. So ## f_M(k)## is the probability that after ##M## samples are taken that exactly ##k## distinct members of the population are in the sample. Let ##F_M(k)## be the cumulative distribution for ##f##.
Consider computing ##F_{M+1}(k+1)##. In order to have ##k+1## or fewer samples after the ##M+1## sample is taken we must be in one of the following mutually exclusive cases after the ##M##th sample was taken.
1) In the previous ##M## samples there are ##k## or fewer distinct members in the sample.
2) In the previous ##M## samples there are exactly ##k+1## distinct members in the sample
So we have the recursion relation
##F_{M+1}(k+1) = F_M(k) + (f_M(k+1))(\frac{k+1}{N})\ \ ## [eq. 1]
which is based on computing the probability of transitioning between each of the above cases to the situation of having ##k+1## or fewer distinct members after the ##M+1## sample is taken.We have the initial conditions
##f_1(1) = 1, F_1(1) = 1##
##F(k) =1 ## for ##k \ge M##
##f_M(k) = 0## for ##k > M## or ##k < 1##Examples:
Applying eq. 1 with ##M = 1## we have:
##F_2(1) = F_1(0) + f_1(1) \frac{1}{N} = 0 + (1)(\frac{1}{N}) = \frac{1}{N}##
##F_2(2) = F_1(1) + f_1(2)\frac{2}{N} = 1 + 0 = 1##
which implies ##f_2(1) = \frac{1}{N},\ f_2(2) = 1 - \frac{1}{N} = \frac{N-1}{N} ##
For ##M = 2## we have:
##F_3(1) = F_2(0) + (f_2(1))(\frac{1}{N}) = 0 + (\frac{1}{N})(\frac{1}{N}) = \frac{1}{N^2}##
##F_3(2) = F_2(1) + (f_2(2))(\frac{2}{N}) = \frac{1}{N} + (\frac{N-1}{N})(\frac{2}{N}) = \frac{3N-2}{N^2}##
##F_3(3) = F_2(2) + f_2(3)(\frac{3}{N}) = 1 + (0)(\frac{3}{N}) = 1 ##
which implies
##f_3(1) = \frac{1}{N^2}##
## f_3(2) = \frac{3N-2}{N^2} - \frac{1}{N^2} = \frac{2N-2}{N^2}##
## f_3(3) = 1 - \frac{3N-2}{N^2} = \frac{N^2 - 3N + 2}{N^2}##
I don't know if pursuing the symbolic algebra implied by the recursion works out in a nice way. At least the recursion implies a numerical algorithm for generating ##F_M##.