# Homework Help: Poisson distribution (approximation)

1. Dec 27, 2017

### tzx9633

1. The problem statement, all variables and given/known data
The number of flaws in a plastic panel used in the interior of cars has a mean of 2.2 flaws per square meter of panel.
What is the probability that there are fewer than 20 surface flaws in 10 square meters of panel?

2. Relevant equations

3. The attempt at a solution
This is a Poisson distribution problem, am I right?

X ~ Po(22) for 10 square meters of panel

It's quite insane to calculate the probability from 1 to 20, right? I'm wondering whether there is any approximation method so that I can solve this question easily?

2. Dec 27, 2017

You didn't show the solution, but I'm presuming you know that it is $P=e^{-\lambda} \, \sum\limits_{k=0}^{19} \frac{\lambda^k}{k!}$ where $\lambda =10(2.2)=22$. The only way I know of to get this answer is by letting the computer process it. Perhaps someone else knows some approximation method. Meanwhile, with today's spreadsheets, this sum is readily computed. Because the mean $\lambda=22$, I anticipate the numerical answer is going to come out somewhere near $P=\frac{1}{2}$. If you process it by computer, let us know what you come up with. :)
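For anyone who wants to let the computer process it, here is a minimal Python sketch of the sum above. The function name `poisson_cdf` is just an illustrative choice, and the code assumes the Poisson model with $\lambda = 22$ really does apply.

```python
import math

def poisson_cdf(k_max, lam):
    """P(X <= k_max) for X ~ Poisson(lam), by direct summation."""
    term = math.exp(-lam)      # P(X = 0)
    total = term
    for k in range(1, k_max + 1):
        term *= lam / k        # P(X = k) obtained from P(X = k - 1)
        total += term
    return total

lam = 10 * 2.2                 # 2.2 flaws per square meter, 10 square meters
p_fewer_than_20 = poisson_cdf(19, lam)
print(f"P(X <= 19) = {p_fewer_than_20:.4f}")
```

Building each term from the previous one avoids computing large factorials and powers separately. The result comes out near 0.31.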

Last edited: Dec 27, 2017
3. Dec 27, 2017

### Ray Vickson

Whether or not it is a Poisson probability problem depends a lot on details of the manufacturing process that we have not been told. If the positions and numbers of flaws are, indeed, random and independent, then the Poisson model is justified. (Often, textbooks will give problems like this one, where crucial information is lacking, in the expectation that the student will make some assumptions because of context---such as the problem occurring in a chapter about the Poisson distribution, for example.)

Since the mean (22) is not very small, one can try a Normal approximation to the Poisson; then if you have a scientific calculator with the Normal distribution on it you could use that to get an answer that is, at least, in the right "ballpark". However, it would not be particularly accurate, even if you included a so-called "1/2" correction.

With modern tools (spreadsheets, etc.) computing and summing 20 Poisson probabilities is a snap, and including all of them is far from "insane". Of course, on an exam the situation would be different, and making a reasonable approximation would be more important.

4. Dec 27, 2017

### StoneTemplePython

You need to tell us what the distribution is. It certainly feels like it could be Poisson.

-- edit: I basically concur with @Ray Vickson's post. However I've thrown in a few different ways of estimating / bounding the results that may be helpful. ---

I'd suggest solving this problem a few different ways, and then comparing results at the end.

What is the $\lambda$ i.e. key parameter, of this Poisson distribution?

1.) It is not insane, and you should in fact go from 0 to 19. (Or solve for the complement that goes from $\{20, 21, 22, 23, ....\}$.) This is one way, and you should try it in Excel or with something like Python or Matlab.

2.) How would you tackle the complement? You could come up with your own approach if you have any knowledge of how to bound the power series for the exponential function. This could be worth playing around with, or not.

3.) You may also consider using a normal approximation to a Poisson.

4.) I think the process at whatever scale gets a fresh start, as Poisson processes do. You could model this as some large number n of discrete events (read: Bernoulli trials), since a Poisson process can be interpreted as a limit of a Bernoulli process. Select some appropriately large n (in fact try a few different big values for n). It takes some care to do this right and properly convert between large Bernoullis and Poissons, but it isn't that tough. The model is that you have simple binary outcomes where something is either flawed or not, and we have independence between the Bernoulli trials. This gives a nice and easy approach, using Chernoff Bounds. The most accessible writeup I've seen is in section 20.6.2 (i.e. page 891) of MIT's 'Math for CS', available here: https://courses.csail.mit.edu/6.042/spring17/mcs.pdf
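To illustrate approach 4.), here is a rough Python sketch that treats the 10 square meters as n tiny cells, each flawed independently with probability 22/n, and compares the resulting binomial probability with the Poisson one as n grows. The cell model and the function names are assumptions made for illustration only; the sketch does not compute a Chernoff bound.

```python
import math

def binomial_cdf(k_max, n, p):
    """P(Y <= k_max) for Y ~ Binomial(n, p), built up term by term."""
    term = (1.0 - p) ** n                        # P(Y = 0)
    total = term
    for k in range(k_max):
        term *= (n - k) / (k + 1) * p / (1.0 - p)  # P(Y = k + 1) from P(Y = k)
        total += term
    return total

def poisson_cdf(k_max, lam):
    """P(X <= k_max) for X ~ Poisson(lam), by direct summation."""
    term = math.exp(-lam)
    total = term
    for k in range(1, k_max + 1):
        term *= lam / k
        total += term
    return total

lam = 22.0
target = poisson_cdf(19, lam)
for n in (100, 1_000, 10_000, 1_000_000):
    approx = binomial_cdf(19, n, lam / n)
    print(f"n = {n:>9}: binomial {approx:.5f}, Poisson {target:.5f}")
```

As n increases with n·p held at 22, the binomial value should settle toward the Poisson value, which is the limit being described.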

5. Dec 27, 2017

### Orodruin

Staff Emeritus
22 is a rather large number. If you just want a reasonable approximation, apply the central limit theorem.

To be honest, I never understood why people use spreadsheets when the corresponding code in Matlab or Mathematica is one line.

6. Dec 27, 2017

I am not terribly computer savvy. (Probably an understatement). I would normally compute this with an Excel spreadsheet, but I don't even have that capability on my present computer. :)

7. Dec 27, 2017

### Orodruin

Staff Emeritus
I am not complaining about you in particular, just that many people use spreadsheets when a simple Matlab or Mathematica command not only would have been easier to write down, but also would have given a more accurate result.

Example: A student I am familiar with, who was asked to compute an integral, did it with Mathematica in one line and was told by the professor that that was an unnecessarily complicated way of doing things and "we just want something easy that works". The professor's solution involved computing the integrand at 20 different points in a spreadsheet and summing them ... Of course, what Matlab or Mathematica would do would be exactly the same, but with a much larger number of points and with a much less cumbersome input. Doing stuff like this in a spreadsheet is like rewriting the integration routines, just in a much less suited language and with less precision ...

8. Dec 27, 2017

I know. I am retired, and today's generation has somewhat better tools than what some of us had. I, myself, like doing calculations by hand whenever possible, and this is a skill that I think the present generation, for the most part, does not do quite as well as my generation. :)

9. Dec 27, 2017

### Ray Vickson

When I composed my response I included a note about on-line Poisson calculators, but for some reason it did not come through in the final, posted version. Anyway, I mentioned spreadsheets---although I mostly try to avoid them myself---because they are likely to be the most widely available tools on almost any computer a student will own. Personally, I just slap such calculations into Maple and let that take care of things. (As for spreadsheets, I started using them primarily because the introductory textbooks in my subject---Operations Research---went almost exclusively the spreadsheet route. I never liked it, but went with the flow.)

Back in the Stone Age when I was a student--and even in more recent times when I started teaching---I would go to the library and consult statistical tables if I did not happen to have the appropriate ones available in the back of some book. Writing a Fortran program on punch cards would be more trouble than it was worth for a problem of this type; and doing things by hand using a sliderule and/or log tables also seems a bit excessive. That was when I really learned the importance of making reasonable approximations.

10. Dec 27, 2017

Perhaps it is worth mentioning to the OP that if a Gaussian (normal distribution) approximation is used for this problem, the mean $\mu=\lambda$, and $\sigma=\sqrt{\mu}=\sqrt{\lambda}$. (Compare to binomial, where $\sigma=\sqrt{Npq}$). That gives a $z=(20-22)/4.7=-.425$. The table showed this gives $P=.335$. It would be interesting to hear what the OP got when he performed the sum.
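As a quick check of the table lookup, here is a small Python sketch of the Gaussian approximation, with and without the "1/2" continuity correction mentioned earlier in the thread. The helper name `normal_cdf` is an illustrative choice; it uses the standard identity $\Phi(z) = \tfrac{1}{2}\left(1 + \operatorname{erf}(z/\sqrt{2})\right)$.

```python
import math

def normal_cdf(x, mu, sigma):
    """P(Z <= x) for Z ~ Normal(mu, sigma), via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

lam = 22.0
mu = lam
sigma = math.sqrt(lam)                      # about 4.69

p_plain = normal_cdf(20.0, mu, sigma)       # cut at 20, no correction
p_corrected = normal_cdf(19.5, mu, sigma)   # cut halfway between 19 and 20

print(f"no correction:   {p_plain:.3f}")
print(f"with correction: {p_corrected:.3f}")
```

The uncorrected value reproduces the 0.335 from the table, while the corrected cut at 19.5 gives a value a little under 0.30.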

Last edited: Dec 27, 2017
11. Dec 27, 2017

### tzx9633

I have learnt the normal and Poisson approximations to the binomial, but not the normal approximation to the Poisson distribution. Does the normal approximation to the Poisson distribution exist? Sorry

12. Dec 28, 2017

### Ray Vickson

Yes. For large $\lambda$, $\text{Poisson}(\lambda)$ and the $\text{Normal}(\lambda, \sqrt{\lambda})$ are quite close. In your case the issue is whether or not '22' is a large enough number to justify the replacement. The basic reason here is that your $X \sim \text{Po}(22)$ can be considered as a sum of $N = 22$ independent, identically-distributed random variables, each having distribution $\text{Po}(1)$ (or as a sum of $M = 20$ variables having distribution $\text{Po}(1.1)$). Anyway, you are getting into the territory of summing a moderate-to-large number of iid random variables, so you are getting close to the territory where the Central Limit Theorem applies.

See, e.g.,
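To get a feel for how large $\lambda$ needs to be, here is a rough Python sketch that measures the largest gap between the Poisson CDF and the continuity-corrected normal CDF for a few values of $\lambda$. The function names, the chosen $\lambda$ values, and the cutoff of ten standard deviations above the mean are all illustrative assumptions.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam), computed in log space to avoid underflow."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def normal_cdf(x, mu, sigma):
    """P(Z <= x) for Z ~ Normal(mu, sigma), via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def max_cdf_gap(lam):
    """Largest |Poisson CDF - normal CDF| over k, using the cut at k + 1/2."""
    sigma = math.sqrt(lam)
    k_hi = int(lam + 10 * sigma) + 1      # far enough into the upper tail
    cdf = 0.0
    gap = 0.0
    for k in range(k_hi + 1):
        cdf += poisson_pmf(k, lam)
        gap = max(gap, abs(cdf - normal_cdf(k + 0.5, lam, sigma)))
    return gap

for lam in (1.0, 22.0, 100.0, 400.0):
    print(f"lambda = {lam:>5}: max CDF gap = {max_cdf_gap(lam):.4f}")
```

The gap shrinks as $\lambda$ grows, which is the Central Limit Theorem behaviour described above; at $\lambda = 22$ it is small but still visible in the second decimal place.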
http://wiki.stat.ucla.edu/socr/index.php/AP_Statistics_Curriculum_2007_Limits_Norm2Poisson
or
http://www.socr.ucla.edu/Applets.dir/NormalApprox2PoissonApplet.html