Could you use the binomial distribution here?

trollcast · Mar 8, 2013

I'm looking through my statistics notes and on the page that's giving examples of cases where you can use a binomial distribution it gives the problem:

"The number of red counters in a randomly chosen sample of 30 counters taken from a large number of counters of which 10% are red."

My notes go on to say that this can't be modeled by a binomial distribution, but they don't say how you could model it with any other distribution.

But given that very limited amount of information, could you not obtain a reasonable estimate of the probabilities using the binomial distribution? Since the question states "a large number", could we not assume that removing a counter isn't going to change the probability very much?

pbuk · Mar 8, 2013

trollcast said:

could you not obtain a reasonable estimate of the probabilities using the binomial distribution? Since the question states "a large number", could we not assume that removing a counter isn't going to change the probability very much?

Yes, the binomial distribution is appropriate here. I think your notes should say that "if the number of counters is not large it cannot be modeled using the binomial distribution"; in that case you would have to calculate the specific probabilities of 0, 1, 2... red counters when selecting 30 from N without replacement (if there is replacement then the binomial distribution always applies).

trollcast · Mar 8, 2013

MrAnchovy said:

Yes, the binomial distribution is appropriate here. I think your notes should say that "if the number of counters is not large it cannot be modeled using the binomial distribution"; in that case you would have to calculate the specific probabilities of 0, 1, 2... red counters when selecting 30 from N without replacement (if there is replacement then the binomial distribution always applies).

Thanks,

How would you define large enough? If the sample is 1% of the population or something?

pbuk · Mar 8, 2013

Firstly, I should have pointed out that "calculating the specific probabilities of 0, 1, 2... red counters when selecting 30 from N without replacement" is in fact the Hypergeometric Distribution.
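For reference, the two probability mass functions can be written side by side. The symbols are my own labels rather than anything from the notes: N is the population size, K the number of red counters in it (so K/N = 0.1 here), n = 30 the sample size, and k the number of red counters drawn.

```latex
% Hypergeometric: n counters drawn without replacement from N, of which K are red
P_{\text{hyp}}(X = k) = \frac{\binom{K}{k}\binom{N-K}{n-k}}{\binom{N}{n}}

% Binomial: n independent draws, each red with fixed probability p = K/N
P_{\text{bin}}(X = k) = \binom{n}{k}\, p^{k} (1-p)^{n-k}
```

As N grows with K/N held fixed, the first expression tends to the second, which is why "a large number of counters" makes the binomial a reasonable model.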

"Large enough" depends on how accurate you want to be; for further investigation and limits on errors see statistical textbooks or google "binomial hypergeometric difference".

trollcast · Mar 8, 2013

MrAnchovy said:

Firstly, I should have pointed out that "calculating the specific probabilities of 0, 1, 2... red counters when selecting 30 from N without replacement" is in fact the Hypergeometric Distribution.

"Large enough" depends on how accurate you want to be; for further investigation and limits on errors see statistical textbooks or google "binomial hypergeometric difference".

OK, I thought that if the population was too small then it wouldn't work at all, but I see now that it's all to do with a small population making the error far too large to get a sensible value out of it.

statdad · Mar 8, 2013

MrAnchovy said:

"Large enough" depends on how accurate you want to be; for further investigation and limits on errors see statistical textbooks or google "binomial hypergeometric difference".

Typically, if the sample size is at most 5% of the population size, the binomial distribution can be used as an approximation.
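One way to get a feel for that rule of thumb is to tabulate the worst-case gap between the two distributions as the population grows. This is only a sketch: the function names and the choice of population sizes are mine, and it assumes exactly 10% of the N counters are red (so N must be a multiple of 10).

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for n independent draws with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def hypergeom_pmf(k, n, K, N):
    """P(X = k) red counters when drawing n without replacement
    from N counters of which K are red."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


def max_pmf_gap(N, n=30):
    """Largest absolute difference between the two pmfs over k = 0..n.
    Assumes exactly 10% of the N counters are red."""
    K = N // 10
    p = K / N
    return max(
        abs(hypergeom_pmf(k, n, K, N) - binom_pmf(k, n, p))
        for k in range(n + 1)
    )


# A sample of 30 is 5% of the population when N = 600.
for N in (60, 100, 300, 600, 6000):
    print(f"N = {N:5d}  sample/population = {30 / N:6.1%}  max gap = {max_pmf_gap(N):.4f}")
```

The gap shrinks steadily as the sample becomes a smaller share of the population; where you draw the line depends on how much error you can tolerate.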

pbuk · Mar 8, 2013

trollcast said:

OK, I thought that if the population was too small then it wouldn't work at all, but I see now that it's all to do with a small population making the error far too large to get a sensible value out of it.

Indeed. In this case the binomial distribution gives P(3) ≈ 0.24, whereas the hypergeometric distribution with N = 60 (so 6 red counters) gives P(3) ≈ 0.33.
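Those two figures can be checked with a few lines of Python. The variable names are mine, and the only assumption is that N = 60 means exactly 6 red counters (10% of 60).

```python
from math import comb

n, k = 30, 3   # sample size and number of red counters of interest
p = 0.10       # proportion of red counters

# Binomial: every draw is red with the same probability p
binom_p3 = comb(n, k) * p**k * (1 - p) ** (n - k)

# Hypergeometric: 30 drawn without replacement from N = 60, of which K = 6 are red
N, K = 60, 6
hyper_p3 = comb(K, k) * comb(N - K, n - k) / comb(N, n)

print(f"binomial       P(3) = {binom_p3:.2f}")   # 0.24
print(f"hypergeometric P(3) = {hyper_p3:.2f}")   # 0.33
```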

Redbelly98 · Mar 10, 2013

The population size must be large so that the 10% value holds, at least approximately, for every one of the 30 draws -- the binomial distribution assumes the same probability (10% in this example) for all 30 counters to be picked.

If you sample without replacement from a small population, then removing a counter significantly affects the likelihood that the next counter chosen is red. But if you sample with replacement, then the 10% value remains fixed no matter the population size -- as MrAnchovy indicated, the binomial distribution would apply exactly, not just approximately.
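To put numbers on how much one removed counter matters, here is a small sketch. The helper name and the two population sizes are my own choices, and it assumes exactly 10% of the counters are red to begin with.

```python
from fractions import Fraction


def prob_next_red(N, reds_drawn, total_drawn):
    """Chance the next counter is red, sampling without replacement,
    given how many counters (and how many red ones) are already removed.
    Assumes exactly 10% of the N counters start out red."""
    K = N // 10
    return Fraction(K - reds_drawn, N - total_drawn)


for N in (60, 60000):
    after_red = prob_next_red(N, reds_drawn=1, total_drawn=1)
    after_other = prob_next_red(N, reds_drawn=0, total_drawn=1)
    print(f"N = {N}: after a red {float(after_red):.5f}, "
          f"after a non-red {float(after_other):.5f}")
```

With N = 60 the chance moves from 6/60 to 5/59 or 6/59 after a single draw, while with N = 60000 it barely moves from 10%.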
