Physical intuitions for simple statistical distributions

Discussion Overview

The discussion revolves around understanding various statistical distributions, particularly the Gaussian, Log Normal, and Poisson distributions, and their applications in real-world scenarios such as emergency call behavior. Participants explore the underlying principles and intuitions behind these distributions, as well as their relationships and potential simulations.

Discussion Character

  • Exploratory
  • Technical explanation
  • Conceptual clarification
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • One participant seeks to understand why statistical distributions like Gaussian and Log Normal are prevalent, citing examples of random variations in activities like dart throwing and bacterial growth.
  • Another participant explains that the Gaussian and Poisson distributions are limiting cases of the Binomial Distribution, with specific conditions under which they arise.
  • Questions arise regarding the notation used in the explanation of the Binomial Distribution, particularly concerning the relationship between n, Np, and the scaling to the Gaussian Distribution.
  • A participant inquires about simulating a Poisson distribution using agents that make rare entries in a queue, questioning the appropriateness of their proposed method and the probability calculations involved.
  • One participant reports success in simulating the Poisson distribution based on their real data, indicating a practical application of the discussed concepts.

Areas of Agreement / Disagreement

Participants express varying degrees of understanding and curiosity about the statistical distributions, with some points of clarification needed regarding notation and simulation methods. There is no consensus on the deeper intuitions behind the distributions or their relationships, particularly concerning the Exponential and Log Normal distributions.

Contextual Notes

Participants highlight potential confusion regarding the notation and assumptions in the mathematical explanations, as well as the subtleties involved in simulating the Poisson distribution accurately.

schip666!
I'm trying to understand why various statistical distributions are so common. For the most part, all I can find online is how to calculate and manipulate them... I did finally find a couple of refs that helped with Gaussians, this being one:
http://stat.ethz.ch/~stahel/lognormal/bioscience.pdf

According to the above, a Gaussian Normal distribution arises due to having some "central tendency" or constraint summed with a buncha small plus/minus "random" variations -- such as in tossing darts at a target or playing Pachinko. And a Log Normal distribution is similar but the variations are multiplicative -- such as with exponential bacterial growth...Any other insights into these would be appreciated...
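
The additive-versus-multiplicative picture above can be checked numerically. The following is a minimal sketch, not taken from the linked paper: the step count, the trial count, and the ±10% range of each variation are arbitrary assumptions chosen for illustration. It sums many small random variations in one case and multiplies them in the other, then uses skewness as a rough check of symmetry.

```python
import math
import random
import statistics

def skewness(xs):
    """Sample skewness: near 0 for a symmetric (e.g. Gaussian) sample."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return sum(((x - m) / s) ** 3 for x in xs) / len(xs)

rng = random.Random(0)       # fixed seed so the run is repeatable
STEPS, TRIALS = 200, 5000    # arbitrary illustrative sizes

sums, products = [], []
for _ in range(TRIALS):
    total, product = 0.0, 1.0
    for _ in range(STEPS):
        delta = rng.uniform(-0.1, 0.1)   # one small random variation
        total += delta                   # additive: darts, Pachinko
        product *= 1.0 + delta           # multiplicative: growth by a random factor
    sums.append(total)
    products.append(product)

log_products = [math.log(x) for x in products]

print("skew of sums         :", round(skewness(sums), 3))
print("skew of products     :", round(skewness(products), 3))
print("skew of log(products):", round(skewness(log_products), 3))
```

The sums come out roughly symmetric, the products are skewed to the right, and taking the logarithm of the products restores the symmetry, because the log of a product is a sum of logs.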

But the one I'm really interested in is the Poisson distribution -- or its companion, the Exponential -- which describes things like queuing behavior. For instance, the time between emergency calls for a small volunteer fire department (and probably a large one too) -- I have exactly such data, which matches these distributions almost perfectly. But I have no intuition about how this happens. I would think that folks call 911 pretty much at random (modulo time of day and such) and that that would lead to a fairly even distribution across time. But no, I get that exponential instead. Why?
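
One way to see where the exponential comes from is to scatter calls completely at random over a long period and look at the gaps between consecutive calls. The sketch below is illustrative only: the 5000 calls over 3650 days are made-up numbers, not the fire department's data, and time-of-day effects are ignored.

```python
import math
import random
import statistics

rng = random.Random(1)       # fixed seed so the run is repeatable
DAYS, CALLS = 3650.0, 5000   # hypothetical: ten years, about 500 calls a year

# Each call lands at a uniformly random moment, independent of all the others.
times = sorted(rng.uniform(0.0, DAYS) for _ in range(CALLS))
gaps = [later - earlier for earlier, later in zip(times, times[1:])]

mean_gap = statistics.fmean(gaps)
spread = statistics.pstdev(gaps)
short_fraction = sum(g < mean_gap for g in gaps) / len(gaps)

print("mean gap (days)        :", round(mean_gap, 3))
print("std dev / mean         :", round(spread / mean_gap, 3))
print("fraction of gaps < mean:", round(short_fraction, 3))
print("exponential predicts   :", round(1 - math.exp(-1), 3))
```

Evenly scattered call times do not give evenly sized gaps: short gaps are the most common and long ones are progressively rarer, which is the exponential shape. For an exponential distribution the standard deviation equals the mean and about 63% of the gaps are shorter than the mean, and those are the two figures the sketch compares against.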

A secondary question, which I am just too stupid to be able to figure out on my own, is: Is the Exponential distribution actually a case of Log Normal? If so, then multiplicative variations would be an explanation, except I don't see a physical reason that queues might have that property.

I know, I know, I should post in Math. But this is a question about "Reality"...
 
Start with an event that has two possible outcomes, 'heads' and 'tails', or 'success' and 'failure', or whatever you want to call them. The probability of a success is p, while the probability of a failure is q, where p + q = 1.

The Binomial Distribution P(n) is the probability distribution of getting n successes out of N independent such events. It has a peak at n = Np, and an approximate width of √(Npq). Both the Gaussian and Poisson distributions are limiting cases of the Binomial Distribution.

You get the Gaussian Distribution by letting N get large (infinite, actually) in a way such that Np and Npq also get large. You then scale down this enormously large graph back to a reasonable size by switching to a variable x = (n - Np)/√(Npq), and look at P(x) - that's the Gaussian.
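
The rescaling in the paragraph above can be tried numerically. This is a rough sketch under stated assumptions: p = 0.3 and the values of N are arbitrary, the function names are made up for illustration, and the binomial probabilities are computed through logarithms only to avoid floating-point underflow at large N. It switches to x = (n - Np)/√(Npq), multiplies P(n) by √(Npq) so that a probability per integer n becomes a density in x, and reports the largest difference from the standard Gaussian.

```python
import math

def binomial_pmf(n, N, p):
    """P(n): probability of n successes in N trials (via logs to avoid underflow)."""
    log_pmf = (math.lgamma(N + 1) - math.lgamma(n + 1) - math.lgamma(N - n + 1)
               + n * math.log(p) + (N - n) * math.log(1 - p))
    return math.exp(log_pmf)

def gaussian_pdf(x):
    """Standard Gaussian density (mean 0, width 1)."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def gaussian_gap(N, p):
    """Largest difference between the rescaled binomial and the Gaussian."""
    q = 1 - p
    width = math.sqrt(N * p * q)
    worst = 0.0
    for n in range(N + 1):
        x = (n - N * p) / width
        # Multiplying by the width converts probability per n into density in x.
        worst = max(worst, abs(width * binomial_pmf(n, N, p) - gaussian_pdf(x)))
    return worst

p = 0.3   # arbitrary illustrative value
for N in (10, 100, 1000, 10000):
    print(N, round(gaussian_gap(N, p), 5))
```

The printed gap shrinks as N grows, which is the sense in which the rescaled binomial approaches the Gaussian.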

You get the Poisson Distribution P(n) by letting N get large and p get small, in such a way that a = Np stays finite. The Poisson Distribution is the probability distribution for the number of successes among a very large number of independent events, each of which succeeds only rarely (p ≈ 0).
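
The Poisson limit can be checked the same way. In this sketch a = 2 and the values of N are arbitrary choices, and the function names are hypothetical. It holds a = Np fixed, sets p = a/N, and compares the binomial probabilities with the Poisson formula e^(-a) a^n / n! for small n.

```python
import math

def binomial_pmf(n, N, p):
    """P(n): probability of n successes in N trials (via logs to avoid underflow)."""
    log_pmf = (math.lgamma(N + 1) - math.lgamma(n + 1) - math.lgamma(N - n + 1)
               + n * math.log(p) + (N - n) * math.log(1 - p))
    return math.exp(log_pmf)

def poisson_pmf(n, a):
    """Poisson probability of n events when the mean number of events is a."""
    return math.exp(-a) * a ** n / math.factorial(n)

def poisson_gap(N, a, n_max=10):
    """Largest difference between binomial and Poisson for n = 0..n_max."""
    p = a / N   # p shrinks as N grows, so that a = Np stays fixed
    return max(abs(binomial_pmf(n, N, p) - poisson_pmf(n, a))
               for n in range(min(n_max, N) + 1))

a = 2.0   # arbitrary illustrative mean
for N in (10, 100, 1000, 10000):
    print(N, round(poisson_gap(N, a), 6))
```

The gap falls steadily as N grows with a held fixed, so for many trials with a small p the binomial and the Poisson become hard to tell apart.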
 
Thanks for the quick answer...Of course, I have more questions...

First in notation.

Your second paragraph sets "n = Np", but the third has "x = (n - Np)/√(Npq)". Would not n - Np then always equal 0? Or are we presuming those values to be sets or something?

Then in the last paragraph "a = Np stays finite", should that not be n = Np ... I suppose a quibble, but since I'm still not sure what I'm looking at consistency is my hob-gob.

But the real question is... Given that Poisson is a distribution over very rare events, can I simulate/generate one using a bunch of "agents" who randomly (and very rarely) make entries in a queue? For instance, could I regenerate my real data, summarized here:
http://hondovfd.org/statistics.php
Around the middle of that page is a graph of our Calls per Day compared to (what I hope to be) the ideal Poisson distrib.

I have about 5000 people in my fire district and about 500 emergency calls per year. That means about 1/10 of them need help every year (there are a number of repeat customers, but I hope we can ignore them here). So on any particular day each of those 5000 potential customers has a probability of calling of about 0.1/365 ≈ 0.00027.

Would I get the right result by stepping through days with 5000 guys each having a 0.00027 probability of going "true" on each step? Or are there more subtleties to consider? Or I suppose I should just try it, eh?
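
The stepping procedure described above might look something like the following. This is only a sketch of the idea, not the code actually used for the linked page: the 1000-day run length and the random seed are arbitrary assumptions, and repeat callers and time-of-day effects are ignored as in the post.

```python
import math
import random
import statistics
from collections import Counter

rng = random.Random(2)    # fixed seed so the run is repeatable
PEOPLE = 5000
P_CALL = 0.1 / 365        # per-person, per-day chance of a call, from the post
DAYS = 1000               # arbitrary length of the simulated record

# Step through the days; every person independently "goes true" with P_CALL.
calls_per_day = []
for _ in range(DAYS):
    calls_per_day.append(sum(rng.random() < P_CALL for _ in range(PEOPLE)))

a = PEOPLE * P_CALL       # expected calls per day, the a = Np of the Poisson limit
histogram = Counter(calls_per_day)

print("expected calls/day :", round(a, 3))
print("simulated calls/day:", round(statistics.fmean(calls_per_day), 3))
for n in range(max(histogram) + 1):
    simulated = histogram[n] / DAYS
    ideal = math.exp(-a) * a ** n / math.factorial(n)
    print(f"{n:2d} calls/day: simulated {simulated:.3f}  Poisson {ideal:.3f}")
```

Each person is an independent rare event, so this is the large N, small p, finite a = Np case, with a ≈ 1.37 calls per day. The simulated fractions should track the Poisson column to within sampling noise.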
 
Well, I'll be dangnabbled... I did try it and it did work.
thanks!
 
