Simple error analysis for probabilities

SUMMARY

This discussion focuses on the use of Poisson errors for histogram bins when analyzing probabilities. The author describes a data set of 8 elements, 7 of which fall in one bin, so that scaling the Poisson error gives P = 0.875 ± 0.33, a range that extends above 1. The Poisson approximation is only valid when the bin contains a small fraction of the total sample; when a single bin holds most of a small sample, a binomial approach should be used instead. The binomial standard error for this scenario is √(pq/N) = √((7/8)(1-7/8)/8) ≈ 0.12.

PREREQUISITES
  • Understanding of Poisson distribution and its limitations
  • Familiarity with binomial distribution and its application
  • Basic knowledge of probability theory
  • Experience with histogram data representation
NEXT STEPS
  • Study the differences between Poisson and binomial distributions in statistical analysis
  • Learn how to calculate standard errors for different probability distributions
  • Explore advanced histogram techniques for data visualization
  • Investigate the implications of asymmetric distributions on statistical results
USEFUL FOR

Statisticians, data analysts, and researchers working with probability distributions and histogram data who need to understand error analysis in small sample sizes.

cahill8
I'm dealing with a histogram and want to use Poisson errors for each bin. For example, having 7 items in a bin gives that bin an error of sqrt(7). I'm comparing four data sets of different sizes, so I'm scaling everything to probabilities so that they can be compared.

My smallest data set contains only 8 elements, of which 7 are in one bin, with an error of sqrt(7). Now when this is scaled to a probability, y=7/8 and yerr=sqrt(7)/8. However, this gives P=0.875+-0.33, which does not make sense since the probability cannot exceed 1. Is there something simple I'm missing?
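
The arithmetic in the question can be reproduced with a short Python sketch. The function name `poisson_scaled_error` is made up for illustration; it only divides the count and its sqrt(count) error by the sample size, as described above.

```python
import math

def poisson_scaled_error(count, total):
    """Scale a bin count and its Poisson error sqrt(count) to a probability.

    Hypothetical helper: it mirrors the y = count/total and
    yerr = sqrt(count)/total scaling described in the question.
    """
    p = count / total
    err = math.sqrt(count) / total
    return p, err

p, err = poisson_scaled_error(7, 8)
print(f"P = {p:.3f} +- {err:.2f}")
print(f"upper edge of the range = {p + err:.3f}")
```

The upper edge comes out above 1, which is the problem being asked about.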
 
There are a couple of things going on here. First, the Poisson approximation is only valid when the bin contains only a small fraction of the total sample. So if you have 100 elements and 7 are in one bin, the Poisson approximation is good. But if you have 8 elements and 7 are in one bin, the Poisson approximation is very, very bad. You'd do better to use a binomial, where the standard error of the proportion will be \sqrt{pq/N} = \sqrt{\frac{(7/8)(1-7/8)}{8}} \approx 0.12. (Actually, it should be 0.125 but I'll not get into that.)
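
As a rough check of these numbers, here is a minimal Python sketch comparing the scaled Poisson error with the binomial standard error for 7 out of 8. The helper name and its `ddof` parameter are assumptions made for illustration. The reply does not say where 0.125 comes from; dividing by N-1 instead of N happens to reproduce it, but that reading is a guess.

```python
import math

def binomial_standard_error(count, total, ddof=0):
    """Standard error of a proportion, sqrt(p*q / (N - ddof)).

    Hypothetical helper. ddof=0 divides by N; ddof=1 divides by N-1,
    which is only an assumed explanation for the 0.125 in the reply.
    """
    p = count / total
    return math.sqrt(p * (1 - p) / (total - ddof))

poisson_err = math.sqrt(7) / 8                             # scaled Poisson error
se_n = binomial_standard_error(7, 8)                       # divides by N
se_n_minus_1 = binomial_standard_error(7, 8, ddof=1)       # divides by N-1

print(f"Poisson (scaled): {poisson_err:.3f}")
print(f"binomial, N:      {se_n:.3f}")
print(f"binomial, N-1:    {se_n_minus_1:.3f}")
```

With only 8 elements the Poisson figure is nearly three times the binomial one, which is why the approximation breaks down here.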

Second, there is no reason why a one standard error range cannot include impossible values. For very asymmetric distributions, it will commonly be the case.
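
To illustrate the second point, the sketch below computes the binomial standard error and the standard binomial skewness, (1-2p)/\sqrt{Npq}, for a few proportions at N = 8. The values p = 0.5 and p = 0.99 are hypothetical, chosen only for contrast with the 7/8 case from the thread.

```python
import math

def proportion_summary(p, n):
    """Return (standard error, skewness) for a binomial proportion.

    Skewness uses the textbook binomial formula (1 - 2p) / sqrt(n*p*q).
    """
    q = 1 - p
    se = math.sqrt(p * q / n)
    skew = (1 - 2 * p) / math.sqrt(n * p * q)
    return se, skew

results = {}
for p in (0.5, 0.875, 0.99):  # 0.5 and 0.99 are illustrative, not from the thread
    se, skew = proportion_summary(p, 8)
    results[p] = (p + se, skew)
    print(f"p={p}: upper edge={p + se:.3f}, skewness={skew:.2f}")
```

For the strongly skewed p = 0.99 case the one-standard-error range runs past 1 even with the binomial formula, while for 7/8 it stays just below 1.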
 
