Could you use the binomial distribution here?

Discussion Overview

The discussion revolves around the appropriateness of using the binomial distribution to model the number of red counters in a sample drawn from a larger population. Participants explore the conditions under which the binomial distribution is applicable versus when the hypergeometric distribution should be used, considering factors such as population size and sampling methods.

Discussion Character

  • Debate/contested
  • Technical explanation
  • Mathematical reasoning

Main Points Raised

  • Some participants suggest that the binomial distribution could provide a reasonable estimate for the probabilities in the given problem, assuming the population is large enough that removing a counter does not significantly change the probabilities.
  • Others argue that if the population size is not large, the hypergeometric distribution should be used instead, as it accounts for sampling without replacement.
  • A participant questions what constitutes "large enough" for the population size to justify using the binomial distribution, suggesting that a sample size of 1% of the population might be a threshold.
  • Another participant clarifies that the hypergeometric distribution is the correct model when calculating probabilities for sampling without replacement.
  • There is mention of a general guideline that if the sample size is at most 5% of the population size, the binomial distribution can typically be applied.
  • Some participants note that a small population can lead to significant errors in probability estimates when using the binomial distribution.
  • One participant highlights that the binomial distribution assumes a constant probability for all selections, which may not hold true in small populations when sampling without replacement.

Areas of Agreement / Disagreement

Participants broadly agree that the hypergeometric distribution is the exact model for sampling without replacement and that the binomial distribution is a good approximation when the population is large relative to the sample. What remains unsettled is the specific threshold: suggestions range from a sample of 1% of the population to the common 5% guideline, and the acceptable error depends on the accuracy required.

Contextual Notes

Limitations include the lack of a clear definition of "large enough" population size and the dependence on the sampling method (with or without replacement) affecting the choice of distribution.

trollcast
I'm looking through my statistics notes and on the page that's giving examples of cases where you can use a binomial distribution it gives the problem:

"The number of red counters in a randomly chosen sample of 30 counters taken from a large number of counters of which 10% are red."

Now my notes go on to say that this can't be modeled by a binomial distribution, but they don't say how you could model it with any other distribution.

But given that very limited data, could you not obtain a reasonable estimate of the probabilities using the binomial distribution? Since the question says "a large number", could we not assume that removing a counter isn't going to change the probability very much?
 
trollcast said:
could you not obtain a reasonable estimate of the probabilities using the binomial distribution? Since the question says "a large number", could we not assume that removing a counter isn't going to change the probability very much?

Yes, the binomial distribution is appropriate here. I think your notes should say that "if the number of counters is not large, it cannot be modeled using the binomial distribution"; in that case you would have to calculate the specific probabilities of 0, 1, 2, ... red counters when selecting 30 from N without replacement (if there is replacement, the binomial distribution always applies).
 
MrAnchovy said:
Yes, the binomial distribution is appropriate here. I think your notes should say that "if the number of counters is not large, it cannot be modeled using the binomial distribution"; in that case you would have to calculate the specific probabilities of 0, 1, 2, ... red counters when selecting 30 from N without replacement (if there is replacement, the binomial distribution always applies).

Thanks,

How would you define large enough? If the sample is 1% of the population or something?
 
Firstly, I should have pointed out that "calculating the specific probabilities of 0, 1, 2... red counters when selecting 30 from N without replacement" is in fact the Hypergeometric Distribution.
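
For reference, the standard hypergeometric formula can be written out as below. N is the population size from the post above; the symbols K (number of red counters in the population), n (sample size, 30 here) and k (number of red counters in the sample) are labels introduced for this sketch, not notation from the thread. The binomial formula with p = K/N is shown next to it for comparison.

```latex
% Hypergeometric: exact when sampling n counters without replacement
P(X = k) = \frac{\binom{K}{k}\,\binom{N-K}{n-k}}{\binom{N}{n}},
\qquad \max(0,\, n - (N - K)) \le k \le \min(n,\, K)

% Binomial: exact with replacement, approximate when N is much larger than n
P(X = k) = \binom{n}{k}\, p^{k} (1-p)^{n-k},
\qquad p = \frac{K}{N}
```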

"Large enough" depends on how accurate you want to be; for further investigation and limits on errors see statistical textbooks or google "binomial hypergeometric difference".
 
MrAnchovy said:
Firstly, I should have pointed out that "calculating the specific probabilities of 0, 1, 2... red counters when selecting 30 from N without replacement" is in fact the Hypergeometric Distribution.

"Large enough" depends on how accurate you want to be; for further investigation and limits on errors see statistical textbooks or google "binomial hypergeometric difference".

OK, I thought that if the population was too small then it wouldn't work at all, but I see now that it's all to do with a small population making the error far too large to get a sensible value out of it.
 
"Large enough" depends on how accurate you want to be; for further investigation and limits on errors see statistical textbooks or google "binomial hypergeometric difference".

Typically, if the sample size is at most 5% of the population size, the binomial distribution can be used as an approximation. For a sample of 30 counters, that means a population of at least 600.
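
To get a feel for that guideline, here is a rough Python sketch that compares the two distributions for a sample of 30 with 10% red counters as the population grows. The function names and the list of population sizes are my own choices for illustration, and "largest pointwise difference" is only one possible way of measuring the error.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(k successes in n independent draws, each with probability p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def hypergeom_pmf(k, N, K, n):
    """P(k red in a sample of n drawn without replacement from N counters, K of them red)."""
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible values of k get probability 0 automatically.
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

def max_abs_error(N, n=30):
    """Largest pointwise gap between the two pmfs when exactly 10% of N counters are red."""
    K = N // 10  # assumes N is a multiple of 10
    return max(abs(binom_pmf(k, n, 0.1) - hypergeom_pmf(k, N, K, n))
               for k in range(n + 1))

if __name__ == "__main__":
    for N in (60, 100, 600, 3000, 30000):
        print(f"N = {N:>5}  sample fraction = {30 / N:6.1%}  "
              f"max |binomial - hypergeometric| = {max_abs_error(N):.5f}")
```

The gap shrinks steadily as the sample becomes a smaller fraction of the population, which is all the 5% rule is really saying; whether the error at N = 600 is acceptable still depends on how accurate you need to be.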
 
trollcast said:
OK, I thought that if the population was too small then it wouldn't work at all, but I see now that it's all to do with a small population making the error far too large to get a sensible value out of it.

Indeed. In this case the binomial distribution gives P(3) ≈ 0.24, whereas the hypergeometric distribution with N = 60 (so 6 red counters) gives P(3) ≈ 0.33.
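
Those two figures can be checked with a few lines of Python using only the standard library. This is just a sketch: the variable names are mine, and it assumes N = 60 means exactly 6 red counters out of 60.

```python
from math import comb

n, k = 30, 3   # sample size and number of red counters of interest
p = 0.1        # proportion of red counters

# Binomial: every draw is red with the same probability p
p_binomial = comb(n, k) * p**k * (1 - p)**(n - k)

# Hypergeometric: 30 drawn without replacement from N = 60, of which K = 6 are red
N, K = 60, 6
p_hypergeometric = comb(K, k) * comb(N - K, n - k) / comb(N, n)

print(f"binomial        P(3) = {p_binomial:.4f}")
print(f"hypergeometric  P(3) = {p_hypergeometric:.4f}")
```

Rounded to two decimal places these come out as 0.24 and 0.33, matching the values quoted above.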
 
The population size must be large so that the 10% value holds for every one of the 30 draws -- the binomial distribution assumes the same probability (10% in this example) for each of the 30 counters picked.

If you sample without replacement from a small population, then removing a counter significantly affects the likelihood that the next counter chosen is red. But if you sample with replacement, then the 10% value remains fixed no matter the population size -- as MrAnchovy indicated, the binomial distribution would apply exactly, not just approximately.
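
As a small illustration of that point, the sketch below computes the chance that the next counter is red after one red counter has already been removed. The helper name and the two population sizes are made up for the example; only the 10% figure comes from the problem.

```python
from fractions import Fraction

def next_red_prob(N, K, drawn=0, reds_drawn=0):
    """Chance the next counter is red when sampling WITHOUT replacement.

    N: counters in the population, K: red counters in the population,
    drawn: counters already removed, reds_drawn: how many of those were red.
    """
    return Fraction(K - reds_drawn, N - drawn)

if __name__ == "__main__":
    for N in (60, 60000):
        K = N // 10  # 10% red
        before = next_red_prob(N, K)
        after = next_red_prob(N, K, drawn=1, reds_drawn=1)
        print(f"N = {N:>5}: first draw {float(before):.4f}, "
              f"second draw after one red {float(after):.4f}")
```

With N = 60 the probability drops from 6/60 to 5/59 after a single red counter is removed, while with N = 60000 the change is negligible. With replacement the probability would stay at exactly 10% on every draw, whatever the population size.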
 
