# Homework Help: Poisson Approximation Q

1. Dec 7, 2012

### CAF123

1. The problem statement, all variables and given/known data
Consider $n$ independent trials, each of which results in one of the outcomes $1,...k$ with respective probabilities $p_1,....p_k, \sum_{i=1}^{k} p_i = 1$. (I interpret this summation as just saying mathematically that definitely one of the outcomes has to occur on each trial). Show that if all the $p_i$ are small, then the probability that no trial outcome occurs more than once is a approximately equal to $$exp(-n(n-1)\sum_{i} p_i^2/2)$$

3. The attempt at a solution

So if all the $p_i$ are small, in comparison to $n$, then I believe I can approximate this to a Poisson RV (that is where the exp comes in). (Correct?) So, first I can write the prob that no trial occurs more than once as $$p_1 (1-p_1)^{n-1} p_2 (1-p_2)^{n-1}...p_k (1-p_k)^{n-1} \cdot k! = k!\,\prod_{i=1}^{k} p_i (1-p_i)^{n-1}$$ I am not really sure where to go from here. I am guessing at some stage, I will need to take the limit that as n tends to infinity, something tends to $e$ Thanks.

2. Dec 7, 2012

### Ray Vickson

There is something wrong with the wording of the question. If k is fixed and n gets large the conditions are impossible to meet. For example, if k = 3 and n = 1000 we cannot have all of the three outcomes appearing <= 1 time. Furthermore, the three p_i cannot all be small (although, of course, they can be small compared to n). For the event {all are <= 1 in n trials} I think you need either n being <= k or else you need to have k and n growing large together. Finally, did the question say anything about what "approximately" means?

Last edited: Dec 7, 2012
3. Dec 8, 2012

### CAF123

The question didn't give any more details - what I wrote is exactly as it was. Now that I think about it, then if we don't assume $n \leq k$ then the condition of the trial outcomes cannot be satisfied. Perhaps the question wanted me to realise this? (I.e make a realistic assumption)

In terms of the approximation, I think it means approximate to Poisson (again, I may need an assumption here than the question means that $n$ tends to infinity). The question did not explicitly state anything. The only other approximations I have come across is that the sum of Bernouilli →standard normal. (or binomial as n gets large →normal)

4. Dec 8, 2012

### Ray Vickson

I know what they want you to use an Poisson approximation; that was not the issue. The issue is: do they want an asymptotic approximation, in the sense that a(n) is approximately b(n) for large n if $lim_{n \to \infty} a(n)/b(n) = 1$, or do they want a pointwise approximation, so that a(n) is approximately b(n) for large n if $|a(n) - b(n) | < \epsilon$ for small ε and large n? Or, did they not say? (Without some notion of approximation, nothing prevents me from claiming that √2 is approximately 1.)

5. Dec 8, 2012

### CAF123

We haven't really defined anything in a truly rigorous way yet (i haven't heard of an asymptotic or pointwise approx). I suppose by approximation, they want something like as n tends to infinity and as long as the p_i are small, something tends to exp. The course I am doing is the very first introductory course in Probability, so that might give you an idea of what level I am at, if you are happy to help out. Thanks!

6. Dec 8, 2012

### Ray Vickson

Asymptotic approximation means exactly what I wrote: $$a(n) \sim b(n) \text{ if } a(n)/b(n) \rightarrow 0 \text{ as } n \rightarrow \infty.$$
A familiar example is Stirling's formula for approximating n! when n is large:
$$n! \sim \sqrt{2 \pi} n^{n}\sqrt{n} e^{-n}.$$ The _absolute_ errors can be large, but the _relative_ (percentage) errors go to zero. For example, 20! ≈ 2.43290e(18), while Stirling's formula gives 2.42279e(18). The absolute error (20! - Stirling) is large: error ≈ 1.01152e(16), while the relative error (20! - Stirling)/Stirling = 0.00412 = 0.412% is small.

The other type of error I called 'pointwise', but I don't think that is its official name; I just needed some terminology. It is what I said: the absolute error is small.

I think you need an asymptotic notion of approximation, since you are approximating some probability p(n) with some other formula q(n) that involves n itself.

7. Dec 9, 2012

### CAF123

Ok, thanks for letting me know. I'll maybe email my professor today and see what he expects from this question and then I'll get back to you.

8. Dec 10, 2012

### CAF123

So I got the following hint:
'Consider the probability that a given pair of trials results in the same outcome.
Now use the Poisson principle, i.e. use the approximation that all pairs of trials are independent.'

The probability that any two trials results in the same outcome is just $p_i p_i$, for $i = 1,...k$ Then perhaps I could do $$P(\text{no two trials the same})\, = 1 - P(\text{at least one pair the same})\, = 1 - P(\cup_{i=1}^k (\text{pair i the same}))\, = 1 - \sum_i P(\text{pair i the same}) \, = 1 - \sum_i p_i^2$$ So then since each outcome of pair of trials may be assumed independent, I can write $\lambda = np = n(1 - \sum_i p_i^2)$ and then using the Poisson approximation $~ e^{-\lambda} \frac{\lambda^i}{i!}$, I can write $$exp(-n(1 - \sum_i p_i^2) \frac{[n(1 - \sum_i p_i^2)]^i}{i!}$$ Am I making some progress?

Last edited: Dec 10, 2012
9. Dec 10, 2012

### haruspex

Seems to be some confusion here between outcomes and pairs.
Do you mean 1-nC2*P(a given pair the same) (+smaller terms, according to principle of inclusion-exclusion)? To be rigorous you'll need to put a bound on the error induced by ignoring the smaller terms.

10. Dec 11, 2012

### Ray Vickson

This problem involves the multinomial distribution, in which there are k categories and n independent trials, with category i having probability p_i in each trial and where Ʃ p_i = 1. The probability that category i appears n_i times (with Ʃ n_i = n) is given by the multinomial distribution:
$$p(n_1,n_2, \ldots, p_k) = {n \choose n_1,n_2, \cdots, n_k} p_1^{n_1} p_2^{n_2} \cdots p_k^{n_k}\\ \text{ for } \sum_{j=1}^k n_j = n \text{ and } \sum_{j=1}^k p_j = 1,$$
where
$${n \choose n_1,n_2, \cdots, n_k} = \frac{n!}{n_1! n_2! \cdots n_k!}\\ \text{ for } \sum_{j=1}^n n_j = n.$$ I think what the question is asking for is an approximate expression for the probability that all n_i are <= 1, and I suppose the assumption is that k is large compared to n, and that all p_i are small compared to n.

Basically, you are tossing n balls independently into k cells and you want the probability that no cell receives more than one ball.

Last edited: Dec 11, 2012
11. Dec 11, 2012

### CAF123

Yes, there are $n$ trials and ${n \choose 2}$ pairs. So what I have is $1-{n \choose 2}\sum_i p_i^2$ I don't really see how to progress with this method. Any ideas?

Then I tried a slightly different approach. Let $X$ denote the number of trials which are the same. We want $P(X = 0)$. The probability that 2 pairs are the same is $\sum_i^k p_i^2$(sum over all possible outcomes) and there are, as you said, ${n \choose 2}$ pairings. So, in approximating this to Poisson, $$\lambda = np = {n \choose 2} \sum_i p_i^2 = \frac{n(n-1)}{2} \sum_i p_i^2$$. The probability that X=0 reduces the Poisson mass function to simply $exp(-\lambda)$ and so subbing in what I had for $\lambda$ gives the answer I wanted. Is it a good argument? Thanks.

12. Dec 11, 2012

### Ray Vickson

Here is another argument. Let X_i = number of balls that fall into cell i (where we toss n balls independently into k cells, with cell i having probability p_i each time). The multivariate distribution of the (X_i) is the multinomial distribution given before, but the (marginal) distribution of each X_i separately is Binomial(n,p_i). (This is "obvious", but is also provable explicitly from the multinomial distribution.) Let E_i = event {cell i has <= 1 ball} = {X_i <= 1}. We want the probability P{E_1 & E_2 & E_3 & ... & E_k}. The E_k are not independent, but are "almost independent". If we pretend they are truly independent then we get that the answer we want is
$$\text{answer } \doteq P(E_1) P(E_2) \cdots P(E_k).$$

Now P(E_i) is just the probability P{X_i<=1} for X_i ~ Binomial(n,p_i). Write the formula for this, take the ln of it, and then expand that in powers of p_i, stopping at p_i^2 (since p_i is small; and in fact, we assume n*p_i is small as well). If we now exponentiate the approximate logarithm, we get an alternative, simple approximate formula for P(E_i) that will involve an exponential. Use that approximation in the above approximate expression.

The approximation of P(E_i) can be studied in some numerical cases to check how good it is, so that is not a real problem. The main difficulty is in assessing the effects of assuming independence when it is not really true. I have no useful suggestions about that.

13. Dec 11, 2012

### CAF123

Could you elaborate on what you mean by 'almost independent'?

14. Dec 11, 2012

### Ray Vickson

No, because it is a somewhat intuitive notion.

One can try to look at some measures of overlap. For example, the correlation coefficient between X_i and X_j in the multinomial can be computed, and it is
$$\text{Corr}(X_i,X_j) = \sqrt{\frac{p_i p_j}{(1-p_i)(1-p_j)}},$$ which is "small" if both p_i and p_j are small. However, that does not tell the whole story.

One can also look at things like
$$P\{ E_i | E_j \}$$ by using the multinomial distribution, and then see whether we have $$P\{ E_i | E_j \} \doteq P \{ E_i \},$$ but again, that would not tell the whole story.

Note added in editing: For the multinomial distribution, we can look at distributions of pairs, such as (X_1,X_2), which has a bivariate binomial distribution. Now we can look at
$$P(E_1 \& E_2) = P(X_1 = 0,X_2=0) + P(X_1=0,X_2=1) + P(X_1=1,X_2=0) + P(X_1 = 1,X_2 = 1).$$ This involves 4 terms in n, p_1 and p_2. Now we can take its logarithm and expand that in powers of p_1, p_2, up to terms of order p_1^2, p_2^2 and p_1*p_2. We can also look at $P(E_1) P(E_2)$, take its logarithm and expand that in powers of p_1, p_2 up to second order. We find that the two final expansions are the same, so to the extent that we can neglect higher powers in the p_i, we will have
$$P(E_1 \& E_2) = P(E_1) P(E_2) + \text{ corrections of higher order in the } p_i.$$
If we allow ourselves to neglect the corrections, we get independence.

Note: we could keep going like that, to see if P(E_1 & E_2 & E_3) is approximately the product of the P(E_i), but it starts to get really complicated. I did all the above in the computer algebra system Maple, and certainly the more complicated cases would be nasty if done by hand (although back in the Stone Age, when I was a student, that's what we had to do).

Last edited: Dec 11, 2012
15. Dec 11, 2012

### haruspex

Looks like a useful step. You know that 1 - δ ≈ e for small δ, so apply that. That's an approx for the prob that a given outcome does not occur twice. The error term is O((npi)3). Now take the product over i. Of course, that's not exactly valid because each time a certain outcome is taken not to occur twice it slightly increases the odds that others do, so again you need a bound on the error term.
But it's not clear how much rigour is required. Maybe you have enough already.

16. Dec 12, 2012

### CAF123

So if the $p_i$ are small then $p_i^2$ will be even smaller so we can definitely apply the approximation. Given that my course is introductory probability, I think it would suffice to leave it like this rather than look at higher order terms and bounds. Thanks RGV and haruspex!