Are Both Sensible Interpretations of Poisson Behavior?

nonequilibrium
Are both sensible (equivalent? contradictory?) interpretations of "Poisson" behavior?

I've come across two quite distinct notions (or so it seems to me, anyway) of Poisson behavior and I'm not sure if they're equally sensible or perhaps even equivalent. I'll apply both "views" to the same case to show you what I mean.

The first view is how I first met the Poisson distribution in my statistics textbook (and the presentation I didn't like):
Say we have n houses, and we know that the probability of one house burning down in the course of one year is p. We can assume that n is very large and p is quite small. We're interested in the probability distribution of the number of houses that burn down in the course of one year. Looking more closely, since the burning down of each house is a Bernoulli experiment, we see that this number follows a binomial distribution with parameters n and p. As n is large and p is small, we can approximate this distribution by a Poisson distribution with parameter \lambda = np (with \lambda of "normal" size: way larger than p, way smaller than n).
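
To make the approximation concrete, here is a minimal numerical sketch (the specific values of n and p are my own illustrative assumptions, not part of the original problem):

```python
# Minimal check of the first view: Binomial(n, p) vs. Poisson(np)
# for large n and small p. The numbers are illustrative assumptions.
from scipy.stats import binom, poisson

n, p = 10_000, 0.0003        # many houses, small per-house burn probability
lam = n * p                  # lambda = np = 3 expected fires per year

for k in range(7):
    b = binom.pmf(k, n, p)
    q = poisson.pmf(k, lam)
    print(f"P(X = {k}): binomial = {b:.6f}, Poisson = {q:.6f}")
```

For values in this regime the two printed probabilities should agree closely, which is the content of the textbook approximation.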

The second "Poisson" view is called a Poisson process (it has got its own wiki page):
Clear your head of the previous case. We now regard the burning down of one specific house as an intrinsically random event, hence the time until burning down is modeled well by an exponential distribution. Call the exponential distribution parameter \lambda. We want to know the probability distribution for the number of houses that burn down after a time t (afterwards we will take "t = one year"). As the probability of a house burning down in a given time interval dt is \lambda \mathrm d t (you can see this as a consequence of the memorylessness of the exponential distr.), we can see that "the number of houses that burn down after a time t" as a sum of \frac{t}{\mathrm dt} bernouilli experiments, each with probability \lambda \mathrm d t. In the limit \mathrm d t \to 0, this is described by a Poisson distribution with parameter \lim_{\mathrm d t \to 0} \left(\lambda \mathrm dt \right) \left( \frac{t}{\mathrm d t} \right) = \lambda t.
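
A quick way to convince yourself of this limit is to simulate it; the sketch below (all parameter values are illustrative assumptions) builds event times from exponential waiting times and checks the resulting counts against the Poisson prediction:

```python
# Simulation sketch of the second view: events whose waiting times are
# i.i.d. Exponential(lam) produce counts in [0, t] that follow Poisson(lam*t).
import numpy as np

rng = np.random.default_rng(0)
lam, t, trials = 3.0, 1.0, 100_000   # rate "per year", one year, many repetitions

counts = np.empty(trials, dtype=int)
for i in range(trials):
    total, k = 0.0, 0
    while True:
        total += rng.exponential(1.0 / lam)  # next exponential waiting time
        if total > t:
            break
        k += 1
    counts[i] = k

print("empirical mean:", counts.mean())   # should be close to lam * t = 3
print("empirical var: ", counts.var())    # Poisson signature: variance = mean
```

The empirical mean and variance should both come out near \lambda t, the defining fingerprint of a Poisson count.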

You see that in both cases we arrive at a Poisson distribution, but in quite distinct ways. (For ease, take the units of \lambda to be "per year"; then the \lambda in both cases is equal.) But is the result equivalent? I think not, right? For one thing, in the former case the end result depended on the number of houses, whereas in the latter case it didn't enter at all.
Even conceptually it is quite different, no? In the former case it was truly a binomial distribution, which we mathematically approximated by a Poisson distribution (because n was big and p was small); in the latter case the binomial distribution was only a temporary approximation, and taking the limit gave the actual nature of the problem. But perhaps this note is too much philosophy and too little hardcore mathematical objection.

Anyway, I was wondering: are both sensible applications of the Poisson distribution? I think both derivations make sense, but on the other hand the results are different. Which of the two should a company use? Or should we expect them to make similar predictions? Is either of the two more true than the other? Do they apply in distinct cases? And am I the only one who finds the second view somehow more pleasing? (I only just encountered the second view in my physics Markov course, and only now do I feel I somehow understand Poisson behavior.)
 


Although similar, the questions being asked in the two cases are slightly different. In the first example you are looking at the distribution of the number of houses burnt down, while the second concentrates on the probability of a given house burning down.
 


I don't think so. Both answer the question "how many houses can I expect to burn down in a year?".
 


You are being confused by the fact that λ means different things in the two explanations, and by a feature of the trial that I shall come to later. The following should be clearer, though it lacks rigour:

In the first view we use the fact that a binomial distribution tends towards a Poisson distribution as n -> ∞ with np held fixed (google "binomial limit poisson" for more details). With n trials the expected value of the number of houses burning down is np, and so the limit is a Poisson distribution with mean μ = np.

The explanation of the second view is somewhat confusing. It starts from the result that when the times between successes are exponentially distributed (as they are in many cases, because if you wait twice as long you expect twice as many successes), the number of successes in time t follows a Poisson distribution with mean μ = ft, where f is the rate parameter of the exponential distribution. This equivalence can be derived directly; there is no need to introduce the binomial distribution and take it to the limit.
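
For reference, here is a sketch of that direct derivation (standard textbook material, filled in by me rather than taken from the post). Write S_k for the time of the k-th success, the sum of k independent \mathrm{Exponential}(f) waiting times, which has an Erlang (Gamma) distribution. Then

P(N(t) \ge k) = P(S_k \le t) = 1 - e^{-ft} \sum_{j=0}^{k-1} \frac{(ft)^j}{j!},

and therefore

P(N(t) = k) = P(N(t) \ge k) - P(N(t) \ge k+1) = e^{-ft} \frac{(ft)^k}{k!},

which is exactly the Poisson distribution with mean μ = ft, with no binomial limit needed.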

Combining the two expressions for μ we see that μ = np = ft, or f = np / t which is what we would expect.

But (here is the bit about the feature of the trial): with a finite set of houses and a finite time (in particular, a time shorter than the time it takes to rebuild a house), this is not a Poisson process. In a year a house either burns down or it doesn't; it can't burn down twice. So the first model, a Bernoulli experiment with the associated binomial distribution, is exact, and the result does depend on n. In the second model we are using the Poisson distribution to approximate the binomial distribution; the Poisson distribution does not depend on n, and that is why it is not a precise model: the error is a function of n (and p).
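
One way to watch that error shrink is to compute the total variation distance between the exact binomial and its Poisson approximation as n grows with λ = np held fixed; a sketch with illustrative numbers of my own choosing:

```python
# Illustrative sketch: the Poisson-approximation error to the exact
# Binomial(n, p) model, with lambda = n*p held fixed, shrinks as n grows.
import numpy as np
from scipy.stats import binom, poisson

lam = 3.0
for n in (10, 100, 1_000, 10_000):
    p = lam / n
    ks = np.arange(0, n + 1)
    # Total variation distance; poisson.sf(n, lam) accounts for the
    # Poisson mass above n, where the binomial has none.
    tv = 0.5 * (np.abs(binom.pmf(ks, n, p) - poisson.pmf(ks, lam)).sum()
                + poisson.sf(n, lam))
    print(f"n = {n:>6}: total variation distance ~ {tv:.5f}")
```

The printed distances should fall steadily with n, matching the claim that the error is a function of n (and p).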

If instead the trial were a house being struck by lightning, the first scenario would be the approximation: it is quite feasible (however improbable) for a given house to be struck by lightning twice in a year, so this is not a Bernoulli experiment, but it is a Poisson process. If p is small enough, though, we can use the binomial distribution as an approximation: the error is a function of p (and n).

So with these (abbreviated) facts:
  • Bernoulli experiments are accurately modeled by the binomial distribution
  • Events whose waiting times follow an exponential distribution form a Poisson process
  • The differences between the Poisson distribution and the binomial distribution become small for large N (provided p is small enough)
you can answer your own questions.

I'll add one more thing: if N and the mean np are sufficiently large, both the binomial distribution and the Poisson distribution tend towards the normal distribution (this can be derived directly or seen quickly from the Central Limit Theorem), which is often the easiest to use.
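
As a quick illustrative check (the values of n and p below are my own choice, not from the thread), the three distributions nearly coincide once the mean is large:

```python
# Sketch: for a large mean np, Binomial(n, p), Poisson(np), and a normal
# density with the same mean are numerically close.
from scipy.stats import binom, poisson, norm

n, p = 10_000, 0.05
mu = n * p                 # mean = 500
sigma = mu ** 0.5          # sqrt(mu): the Poisson standard deviation

for k in (460, 480, 500, 520, 540):
    print(f"k = {k}: "
          f"binomial = {binom.pmf(k, n, p):.5f}, "
          f"Poisson = {poisson.pmf(k, mu):.5f}, "
          f"normal = {norm.pdf(k, mu, sigma):.5f}")
```

(The normal here uses the Poisson's standard deviation sqrt(mu); the binomial's is the slightly smaller sqrt(np(1-p)), which is why its column differs marginally.)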
 