Poisson vs Binomial distribution.

1. Feb 19, 2013

Jcampuzano2

Hello PF

This might be a fairly simple question to most of you, but I was given this problem (don't worry, I already solved it; I'm just wondering about something).

Suppose the probability of suffering a side effect of a certain flu vaccine is 0.005. If 1000 persons are inoculated, find the approximate probability that

(a) at most 1 person suffers, (b) 4,5, or 6 persons suffer.

I already solved it, but this problem is in the chapter on the Poisson distribution. Unfortunately my teacher didn't cover this distribution in detail, and when I first looked at the problem it looked like a typical Binomial distribution problem. I later figured out I was supposed to approximate with the Poisson distribution.

Why would we use an approximation for the Binomial when we could just apply it, and under what circumstances am I allowed to make this approximation in the first place?

2. Feb 19, 2013

micromass

The problem with the binomial distribution is that it is very tedious to calculate by hand.

So the exact answer to the second question, (b), would be

$$\sum_{k=4}^6 \binom{1000}{k} (0.005)^k0.995^{1000-k}$$

This is the correct answer. But computing those binomial coefficients is not very fun.

However, we can show that if we are working with Binomial$(n,p_n)$ distributions and if $np_n\rightarrow \lambda$ for some $\lambda$, then for each fixed $k$

$$\binom{n}{k} p_n^k (1-p_n)^{n-k} \rightarrow e^{-\lambda} \frac{\lambda^k}{k!}$$

So, if n is very large and p is very small, then the Binomial(n,p) distribution is very close to the Poisson(np) distribution.
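A sketch of why the limit holds, under the simplifying assumption (not made in the post) that $p_n = \lambda/n$ exactly, which is the simplest case of $np_n\rightarrow\lambda$. Rearranging the binomial probability for a fixed $k$ gives

$$\binom{n}{k}\left(\frac{\lambda}{n}\right)^k\left(1-\frac{\lambda}{n}\right)^{n-k} = \frac{\lambda^k}{k!}\cdot\frac{n(n-1)\cdots(n-k+1)}{n^k}\cdot\left(1-\frac{\lambda}{n}\right)^{n}\cdot\left(1-\frac{\lambda}{n}\right)^{-k}$$

As $n\rightarrow\infty$ with $k$ fixed, the second factor tends to 1 (it is a product of $k$ terms, each tending to 1), the third tends to $e^{-\lambda}$, and the fourth tends to 1, which leaves $e^{-\lambda}\lambda^k/k!$.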

So, in our case, p=0.005 is small and n=1000 is large, while the product np=5 is moderate. So we can approximate the answer to (b) by

$$\sum_{k=4}^6 e^{-5} \frac{5^k}{k!}$$

And we are also rid of that pesky binomial coefficient.
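As a rough numerical check of how close the two distributions are for this problem, the following Python sketch evaluates parts (a) and (b) both exactly (binomial) and with the Poisson approximation, and prints them side by side. The helper names `binomial_pmf` and `poisson_pmf` are illustrative and not from the thread; `math.comb` requires Python 3.8 or later.

```python
from math import comb, exp, factorial


def binomial_pmf(k, n, p):
    """Exact Binomial(n, p) probability of exactly k successes."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    """Poisson(lam) probability of exactly k events."""
    return exp(-lam) * lam**k / factorial(k)


n, p = 1000, 0.005   # values from the problem statement
lam = n * p          # Poisson parameter, np = 5

# (a) at most 1 person suffers: k = 0 or 1
exact_a = sum(binomial_pmf(k, n, p) for k in range(0, 2))
approx_a = sum(poisson_pmf(k, lam) for k in range(0, 2))

# (b) 4, 5, or 6 persons suffer
exact_b = sum(binomial_pmf(k, n, p) for k in range(4, 7))
approx_b = sum(poisson_pmf(k, lam) for k in range(4, 7))

print(f"(a) binomial = {exact_a:.6f}, Poisson = {approx_a:.6f}")
print(f"(b) binomial = {exact_b:.6f}, Poisson = {approx_b:.6f}")
```

With a computer the exact binomial sum is no harder than the approximation, so the Poisson form mainly pays off for hand calculation or for theoretical work.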

This approximation is also theoretically interesting. The sum of two independent Poisson random variables is always Poisson, for example. But the sum of two independent binomial random variables is binomial only when they share the same p.
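The Poisson closure property can be checked numerically. The sketch below convolves two Poisson distributions and compares the result with a single Poisson whose parameter is the sum; the parameters 2 and 3 and the function names are arbitrary choices for illustration, not values from the thread.

```python
from math import exp, factorial


def poisson_pmf(k, lam):
    """Poisson(lam) probability of exactly k events."""
    return exp(-lam) * lam**k / factorial(k)


def pmf_of_sum(k, lam1, lam2):
    """P(X + Y = k) for independent X ~ Poisson(lam1), Y ~ Poisson(lam2),
    computed by direct convolution over the ways to split k."""
    return sum(
        poisson_pmf(j, lam1) * poisson_pmf(k - j, lam2) for j in range(k + 1)
    )


lam1, lam2 = 2.0, 3.0  # illustrative parameters

for k in range(6):
    convolved = pmf_of_sum(k, lam1, lam2)
    direct = poisson_pmf(k, lam1 + lam2)
    print(f"k={k}: convolution = {convolved:.6f}, Poisson(5) = {direct:.6f}")
```

The two columns should agree up to floating-point rounding, which follows from the binomial theorem applied to $(\lambda_1+\lambda_2)^k$.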

3. Feb 20, 2013

ImaLooser

Partly it is a holdover from the old days when computation was expensive. The teaching of statistics hasn't changed much in the past 50 years, as far as I can tell. The binomial is still tricky to compute because the factorials in the intermediate results can be very large, and you have to be careful not to overflow.
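One common way around the overflow issue is to work with logarithms, so the huge factorials never appear as intermediate values. The following is a minimal Python sketch of that idea using `math.lgamma` (the log of the gamma function, where `lgamma(n + 1)` equals `log(n!)`). The function names are illustrative, and the sketch assumes 0 < p < 1 and 0 <= k <= n.

```python
from math import exp, lgamma, log, log1p


def log_binomial_pmf(k, n, p):
    """Log of the Binomial(n, p) probability of exactly k successes.

    Assumes 0 < p < 1 and 0 <= k <= n. Uses lgamma so that n! is only
    ever handled through its logarithm.
    """
    log_coeff = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    # log1p(-p) computes log(1 - p) accurately when p is small
    return log_coeff + k * log(p) + (n - k) * log1p(-p)


def binomial_pmf(k, n, p):
    """Binomial probability, exponentiating only at the very end."""
    return exp(log_binomial_pmf(k, n, p))


# The flu-vaccine problem: probability that 4, 5, or 6 of 1000 suffer
print(sum(binomial_pmf(k, 1000, 0.005) for k in range(4, 7)))

# A much larger n, where a floating-point factorial would overflow
print(binomial_pmf(5000, 10**6, 0.005))
```

Python's integers do not overflow, so `math.comb` would also work there; the log-space form matters more in languages with fixed-width numbers or when n is very large.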

You can use the Poisson approximation when n is large (greater than 50 is probably enough) and when the chance of 0 successes or n successes is negligible. It depends on how much accuracy you need, so there can be no hard and fast rule.
