Probability of event occurring - poisson distribution?

schip666! · Aug 26, 2010

I am the keeper of records for my local Volunteer Fire Dept. I have now collected data for each of our incident calls from the last 3 years and have made some _very_ basic stabs at interesting statistics which you can see at:
http://hondovfd.org/statistics.php

We have about 500 calls a year -- a bit over 40 a month or around 1.3 per day. But as you can see from the graphs at the bottom of that page -- which are just about the full extent of my Excel skills -- they are not randomly distributed over the days of the week or hours of the day. More interesting to all our responders is how they are distributed by number per day. My "Calls per Day" graph seems to show a sorta-exponential decay from 1 per day to 8 (our all time high during a snow storm when our little section of Interstate turned into a Bumper Car arena). However we can go for up to a week with nada, and then break the drought with 3 or 4 in an afternoon.

So the question is: How do I characterize the likelihood of getting a certain number of calls on any particular day, with the number 0 being of special interest? I think I should be able to compare to a Poisson distribution to see how un-random things are, but my eyes roll into the back of my head about a quarter of the way through the wiki page. Can anyone point me to some other explanations and examples, or have better thoughts on the approach?

moonman239 · Aug 27, 2010

Just use the POISSON function in Excel. Set the third value (cumulative) to 0.
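
For anyone following along without Excel, the same numbers can be sketched in Python. This is only a minimal sketch: it assumes that POISSON with the cumulative argument switched off returns the Poisson probability of exactly k events, P(k) = rate^k * e^(-rate) / k!. The function name `poisson_pmf` and the example rate of 1.3 calls per day (the rough figure from the first post) are illustrative choices, not anything taken from Excel.

```python
import math

def poisson_pmf(k, rate):
    """Probability of exactly k events when the average is `rate` per interval."""
    return rate ** k * math.exp(-rate) / math.factorial(k)

# Rough calls-per-day figure from the first post (used here only for illustration).
rate = 1.3

# Probability of 0, 1, ..., 8 calls in a day if calls arrive independently.
for k in range(9):
    print(f"{k} calls: {poisson_pmf(k, rate):.4f}")
```

The k = 0 row is the "quiet day" probability, which under this model is simply e^(-rate).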

schip666! · Aug 27, 2010

Oh, huh... thanks.

So what I did was get a Poisson value for the integers from 0 to 8 using my average call/day rate of 1.39: POISSON( <N=0::8>, 1.39, FALSE) and then plotted those values against the real data I have (after fixing a mistake in my original; how come no one told me my percentages only added up to 72%?). It all looks pretty random, except we are a couple percent more likely to get 3 runs a day and less likely to get 2... plus two intriguingly fat-tailed points out there at 7 and 8 per day.

http://www.etantdonnes.com/TMP/image011.gif

Too bad. I was hoping for some publishable results (hah) -- but at least I've learned a bit more arithmetic now.

Thanks again!

bpet · Aug 28, 2010

The fit looks pretty good. The small discrepancies you observed seem to be from correlation between events; for example the snow storm would have made individual events more likely on that day.

It would be interesting to see the results with days classified according to their risk, e.g. icy vs normal vs hot/dry weather.

schip666! · Aug 29, 2010

Yup, as I understand it, that fat tail sort of distribution indicates non-random correlation between events and your storm-hypothesis is probably correct. Of course I'm basing that on data for about 4 days out of 1000. Is there a way to determine how significant a deviation is? (I guess, modulo the amount of data used to start with...) I was trying to figure out how I could plot a power-law relationship, like the poisson, for comparison, but got stuck again.

Unfortunately, except for the date and time which could be indicative, I don't have weather correlations for the calls.

Another thing I should try is to break out the different types of calls... We have about 40% medical and 20% vehicle accidents (which are medical with added traffic -- you'd think the Sheriff would do traffic but they like to measure skid marks so it's the FD that stands out there with the SLOW signs). When it comes down to it there's less than 10% that even have a fire involved somehow. Also, the data I'm working with is what we send to the National Fire Information Something-or-Other database and it doesn't have an easy way to determine how significant each event is, so a type: "Structure Fire" with action: "Fire Control" could be anything from, A) someone climbing through an open window to take a pot of caramelized eggs off the stove, to B) everyone in the region saving half of a house belonging to a guy who spilled lacquer thinner in his garage and then closed it all up to take his kid to school.

Maybe I should be satisfied with being able to say that things are mostly random...

bpet · Aug 30, 2010

schip666! said:

Of course I'm basing that on data for about 4 days out of 1000.

That's not necessarily a bad thing, in fact if the outliers are all explainable by foreseeable factors such as weather then that would be a very good result. A perfectly valid and useful model could be along the lines of "if the weather is normal, number of calls has the Poisson distribution, otherwise anything can happen."

Is there a way to determine how significant a deviation is?

The Chi-square test is probably most appropriate here (implemented as CHITEST in Excel; it tests the difference between the observed and expected number of events). You may need to group the days with 5 or more calls into a single bin so that the expected frequency is high enough to be tested accurately.
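
A rough sketch of that test in plain Python, in case it helps to see the moving parts. Everything here is an assumption for illustration: the observed counts are HYPOTHETICAL (made up to total 1000 days, not the real call data), the helper names are invented, and the degrees of freedom are taken as bins - 2 because the rate is treated as estimated from the same data. As far as I know CHITEST itself uses bins - 1 for a single row, so its answer would differ a little.

```python
import math

def poisson_pmf(k, rate):
    """Probability of exactly k events for a Poisson process with mean `rate`."""
    return rate ** k * math.exp(-rate) / math.factorial(k)

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    h = x / 2.0
    if df % 2 == 0:
        return math.exp(-h) * sum(h ** j / math.factorial(j) for j in range(df // 2))
    tail = sum(h ** (j - 0.5) / math.gamma(j + 0.5)
               for j in range(1, (df - 1) // 2 + 1))
    return math.erfc(math.sqrt(h)) + math.exp(-h) * tail

def poisson_chi_square(observed, rate):
    """observed[k] = number of days with k calls; the last bin is 'that many or more'."""
    total = sum(observed)
    probs = [poisson_pmf(k, rate) for k in range(len(observed) - 1)]
    probs.append(1.0 - sum(probs))  # lump the whole upper tail into the last bin
    expected = [total * p for p in probs]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 2  # bins - 1, minus 1 more for the estimated rate
    return stat, df, chi2_sf(stat, df)

# HYPOTHETICAL day counts for 0, 1, 2, 3, 4 and 5+ calls (not the real data).
observed = [250, 350, 225, 125, 35, 15]
stat, df, p_value = poisson_chi_square(observed, 1.39)
print(f"chi-square = {stat:.3f}, df = {df}, p = {p_value:.3f}")
```

A small p-value (say below 0.05) would suggest the counts do not follow a Poisson distribution; a large one means the data are consistent with it. Note that the inputs are day counts, not percentages.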

schip666! · Aug 31, 2010

When I put my two tiny distributions into CHITEST() I get "1". Maybe that means I'm in perfect agreement with myself? The Excel "help" says:
"CHITEST returns the probability for a χ² statistic and degrees of freedom, df"
however there is only one return value so I'm not sure what I'm looking at or for...

Excel "help" (using the word in its broadest sense) does not, and wiki seems to assume that I already know what they are trying to explain. I guess I'd better try to read that stats book I've stored away all these years.

But thanks for the attempt anyway.
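
One possible explanation for the "1", offered as a guess rather than a diagnosis: CHITEST returns only the p-value, and the underlying statistic, the sum of (observed - expected)^2 / expected, shrinks in proportion if both ranges are entered as fractions instead of day counts. A tiny statistic gives a p-value that displays as 1. The sketch below uses HYPOTHETICAL counts over 1000 days (not the real data) and five bins, with df = bins - 1 = 4 as I believe CHITEST would use for a single row; for df = 4 the upper-tail probability has the closed form e^(-x/2) * (1 + x/2).

```python
import math

def chi_square_stat(observed, expected):
    """Pearson chi-square statistic: sum of (O - E)^2 / E over the bins."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def chi2_sf_df4(x):
    """Upper-tail probability of a chi-square variable with exactly 4 degrees of freedom."""
    return math.exp(-x / 2.0) * (1.0 + x / 2.0)

# HYPOTHETICAL day counts for 0, 1, 2, 3 and 4+ calls over 1000 days.
observed_counts = [250, 350, 225, 125, 50]
expected_counts = [249, 346, 241, 111, 53]

# The same two distributions expressed as fractions of all days.
observed_frac = [o / 1000 for o in observed_counts]
expected_frac = [e / 1000 for e in expected_counts]

stat_counts = chi_square_stat(observed_counts, expected_counts)
stat_frac = chi_square_stat(observed_frac, expected_frac)

print(f"counts:    statistic = {stat_counts:.4f}, p = {chi2_sf_df4(stat_counts):.4f}")
print(f"fractions: statistic = {stat_frac:.6f}, p = {chi2_sf_df4(stat_frac):.6f}")
```

If that is what happened, feeding the test the raw number of days in each bin, rather than percentages, should give a more meaningful probability.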
