Jonathan212
What is the formula for this? Flipping a coin N times, what is the probability that the number of tails is higher than M?
Jonathan212 said: This is not homework, if that's what you were thinking. It sounds specialized enough to me that a Google answer is not readily available, unless you dig a lot, in which case you might as well derive it from scratch.
Jonathan212 said: Thanks. So the answer is:
Pr(M; N, 0.5) = N! / (M! * (N - M)!) * 0.5^M * (1 - 0.5)^(N-M)
https://en.wikipedia.org/wiki/Binomial_distribution
Is this High School level in the US nowadays?
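As a minimal sketch of the formula quoted above, assuming Python 3.8+ for `math.comb` (the function name `binom_pmf` is made up for illustration and does not come from the thread):

```python
from math import comb

def binom_pmf(m: int, n: int, p: float = 0.5) -> float:
    """Probability of exactly m tails in n flips with tail probability p."""
    return comb(n, m) * p**m * (1 - p) ** (n - m)

# Exactly 51 tails in 100 fair flips
print(round(binom_pmf(51, 100), 3))  # 0.078
```

Note that this is the probability of exactly M tails. The "more than M" case in the original question needs a sum of these terms over the tail. For very large N the float conversion of `comb(n, m)` can overflow, so this direct form is only suitable for moderate N.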
Jonathan212 said: Oops. Excel can't deal with N > 170. Any trick to avoid the overflow of "171!"?
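One common way around the factorial overflow is to work with logarithms of the factorials and exponentiate only at the end. A sketch in Python rather than Excel, assuming `math.lgamma`, for which `lgamma(n + 1)` equals `ln(n!)`; the function name is hypothetical:

```python
from math import exp, lgamma, log

def binom_pmf_log(m: int, n: int, p: float = 0.5) -> float:
    """P(exactly m tails in n flips), computed in log space to avoid overflow."""
    # ln of N! / (M! * (N - M)!)
    log_coeff = lgamma(n + 1) - lgamma(m + 1) - lgamma(n - m + 1)
    return exp(log_coeff + m * log(p) + (n - m) * log(1 - p))

# N = 1000 is far beyond the point where 171! overflows a double
print(binom_pmf_log(601, 1000))
```

No intermediate value here gets anywhere near the floating-point limit, because only sums and differences of logarithms are formed before the final `exp`.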
https://www.statisticshowto.datasci...theorem/normal-approximation-to-the-binomial/
Jonathan212 said: Got stuck now. Something is wrong. Do we want a normal distribution with a standard deviation of N * 0.5 * ( 1 - 0.5 ) and a mean of N * 0.5? Excel has the NORMDIST() function but it returns a result 4 times bigger than the binomial distribution at N = 170, M = 102.
Jonathan212 said: So we want sigma = sqrt( N * 0.5 * ( 1 - 0.5 ) ). And the answer for N = 1000, M = 601 is 1 in 21 million. Hurray! That's statistically significant as hell.
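To illustrate the corrected choice of sigma, here is a small sketch that compares the normal density (mean N * 0.5, sigma = sqrt(N * 0.5 * 0.5)) with the exact binomial term at N = 170, M = 102. The helper `normal_pdf` is written out by hand and is only an illustration, not what Excel's NORMDIST() does internally:

```python
from math import comb, exp, pi, sqrt

def normal_pdf(x: float, mean: float, sigma: float) -> float:
    """Density of a normal distribution with the given mean and sigma."""
    return exp(-0.5 * ((x - mean) / sigma) ** 2) / (sigma * sqrt(2 * pi))

n, m, p = 170, 102, 0.5
mean = n * p
sigma = sqrt(n * p * (1 - p))   # standard deviation, not the variance n*p*(1-p)

exact = comb(n, m) / 2**n       # binomial probability of exactly m tails
approx = normal_pdf(m, mean, sigma)
print(exact, approx, approx / exact)
```

Passing the variance N * 0.5 * (1 - 0.5) in place of sigma gives a much wider curve, which is the kind of mismatch described in the post above.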
Looking at the error of normal versus the binomial up to N = 170:
View attachment 238253
It actually gets worse after N = 140. Is it meant to?
Jonathan212 said: At N = 100, M = 51, I get 0.078 from both the binomial and the normal. Sure this is right?
Jonathan212 said: But I want M or more tails, not M.
Jonathan212 said: Looks like there is no formula that saves you having to do the sum of the binomial. But luckily, there is for the normal in Excel.
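The two options mentioned here, summing the binomial terms and using the cumulative normal, can be sketched side by side. This assumes a fair coin and uses a continuity correction of 0.5 for the normal version; both function names are invented for the example:

```python
from math import comb, erfc, sqrt

def binom_tail(m: int, n: int) -> float:
    """Exact P(at least m tails in n fair flips), summed with integer arithmetic."""
    return sum(comb(n, k) for k in range(m, n + 1)) / 2**n

def normal_tail(m: int, n: int) -> float:
    """Normal approximation to the same tail, with a continuity correction."""
    mean, sigma = n / 2, sqrt(n) / 2
    z = (m - 0.5 - mean) / sigma
    return 0.5 * erfc(z / sqrt(2))  # upper tail of the standard normal

print(binom_tail(60, 100), normal_tail(60, 100))
```

Because Python integers do not overflow, the exact sum stays usable well past the N where a spreadsheet factorial fails, at the cost of adding up N - M + 1 terms.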
PeroK said: I'm taking a guess that the normal distribution, being essentially ##e^{-x^2}##, will fall off to zero faster than the binomial. So, after a certain point the relative error will increase, but relative to some very small probabilities.
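One way to probe this guess numerically is to print the signed relative error of the normal density against the exact binomial term as M moves away from the mean. A sketch for N = 170, with the step of 10 in M chosen arbitrarily:

```python
from math import comb, exp, pi, sqrt

n = 170
mean, sigma = n / 2, sqrt(n) / 2

def rel_error(m: int) -> float:
    """(normal - binomial) / binomial for exactly m tails in n fair flips."""
    exact = comb(n, m) / 2**n
    approx = exp(-0.5 * ((m - mean) / sigma) ** 2) / (sigma * sqrt(2 * pi))
    return (approx - exact) / exact

for m in range(85, 136, 10):
    print(m, f"{rel_error(m):+.3%}")
```

The sign of the printed error shows which of the two distributions falls off faster in the tail, and its size shows how quickly the relative error grows.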
What are you ultimately looking for here?
Jonathan212 said: The way to assess the statistical significance of a true random number generator's observed bias. Got the answer in Excel, it is as BvU says. Only thing is, the result is a bit suspicious for high N: probability of 600 OR MORE heads in 1000 throws is 1 in 19 billion.
Jonathan212 said: Look at some more numbers. All at 60% heads or more:
Probability of 15 or more heads in 25 throws = 1 in 4.7
Probability of 60 or more heads in 100 throws = 1 in 35
Probability of 150 or more heads in 250 throws = 1 in 1,062
Probability of 240 or more heads in 400 throws = 1 in 27,000
Probability of 300 or more heads in 500 throws = 1 in 220,000
Probability of 360 or more heads in 600 throws = 1 in 1,800,000
Probability of 480 or more heads in 800 throws = 1 in 1,200,000,000
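For anyone who wants to cross-check figures like these without Excel, here is a sketch that recomputes each "1 in X" value from the exact binomial sum for a fair coin. The list of (M, N) pairs is copied from the posts above; the function name is hypothetical:

```python
from math import comb

rows = [(15, 25), (60, 100), (150, 250), (240, 400),
        (300, 500), (360, 600), (480, 800), (600, 1000)]

def odds_against(m: int, n: int) -> float:
    """Return X such that P(at least m heads in n fair flips) is 1 in X."""
    favourable = sum(comb(n, k) for k in range(m, n + 1))
    return 2**n / favourable

for m, n in rows:
    print(f"{m} or more heads in {n} throws: 1 in {odds_against(m, n):,.1f}")
```

The output can be compared row by row with the quoted numbers. Any row that disagrees by a large factor is worth rechecking in the spreadsheet.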
Jonathan212 said: Does the integral of the standard distribution have an analytical solution by any chance? If not, how do you know what interval to use to get 2 significant figures in the probability?
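The integral of the normal density has no closed form in elementary functions, but it is available in most numerical libraries through the error function. The sketch below compares a hand-rolled Simpson's rule integration of the standard normal density with `math.erfc`. The integration width of 10 and the 2000 panels are arbitrary choices for illustration:

```python
from math import erfc, exp, pi, sqrt

def pdf(x: float) -> float:
    """Standard normal density."""
    return exp(-0.5 * x * x) / sqrt(2 * pi)

def tail_simpson(z: float, width: float = 10.0, panels: int = 2000) -> float:
    """Integrate the density from z to z + width (panels must be even)."""
    h = width / panels
    total = pdf(z) + pdf(z + width)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * pdf(z + i * h)
    return total * h / 3

def tail_erfc(z: float) -> float:
    """Upper tail of the standard normal via the complementary error function."""
    return 0.5 * erfc(z / sqrt(2))

for z in (4.47, 6.32):
    print(z, tail_simpson(z), tail_erfc(z))
```

Halving the step and checking that the first two significant figures no longer change is one practical way to decide whether the interval is fine enough.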
Jonathan212 said: Indeed Excel's cumulative binomial runs out of steam at N = 1029, it fails above that value. But so does Excel's normal distribution at N = 1555 (the cumulative option).
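One plausible reason for a cumulative normal to give up at large N is that `1 - cdf` rounds to zero in double precision once the cdf is within about 1e-16 of 1. This is an assumption, since the thread does not show the spreadsheet formula. The sketch below shows the effect and the usual workaround of computing the upper tail directly, next to an exact binomial sum at an N beyond the limits mentioned above:

```python
from math import comb, erf, erfc, sqrt

def upper_tail_naive(z: float) -> float:
    cdf = 0.5 * (1 + erf(z / sqrt(2)))
    return 1 - cdf  # loses all precision once cdf rounds to 1.0

def upper_tail_direct(z: float) -> float:
    return 0.5 * erfc(z / sqrt(2))  # no subtraction from 1

n, m = 2000, 1200  # 60% heads
z = (m - 0.5 - n / 2) / (sqrt(n) / 2)  # with continuity correction
exact = sum(comb(n, k) for k in range(m, n + 1)) / 2**n

print(upper_tail_naive(z), upper_tail_direct(z), exact)
```

The naive version prints 0.0 here while the other two values stay positive, so a zero result of this kind reflects rounding rather than a true probability of zero.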
Jonathan212 said: Btw the source of my true random data is not Excel but /dev/random or "haveged".
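A rough sketch of how such a bias check could look end to end: count the 1 bits in a block of random bytes and ask how likely a fair source is to produce at least that many. Here `os.urandom` is only a stand-in for the /dev/random or haveged data mentioned above, and both function names are made up for the example:

```python
import os
from math import comb

def count_ones(data: bytes) -> int:
    """Number of 1 bits in a byte string."""
    return sum(bin(byte).count("1") for byte in data)

def tail_p(ones: int, n: int) -> float:
    """P(a fair source gives at least `ones` 1 bits out of n bits)."""
    return sum(comb(n, k) for k in range(ones, n + 1)) / 2**n

data = os.urandom(125)        # 1000 bits; replace with the real source's bytes
n_bits = len(data) * 8
ones = count_ones(data)
print(ones, tail_p(ones, n_bits))
```

This is a one-sided check for an excess of 1 bits. A deficit would be tested the same way after swapping the roles of 0 and 1.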
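from scipy.stats import norm
# norm.sf(z) is the upper tail 1 - cdf(z), computed without subtracting from 1
# z-scores, presumably (M - N/2) / sqrt(N/4): 4.47 for 300 of 500, 6.32 for 600 of 1000
print(norm.sf(4.47))
print(norm.sf(6.32))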
StoneTemplePython said: If you want to get the numerical values of the Gaussian integral, can't you just use a built-in Excel function?
Jonathan212 said: At N = 1000 it still has a 6.5% error relative to the binomial.
slappmunkey said: That's the same question just reworded as "if I flip a coin 10 times and 9 times it's heads, what's the odds it will land tails on the 10th flip?"
It isn't.
mfb said: The question is about seeing (at least) 600 tails in total, not about the probability of a specific flip out of the 1000.
slappmunkey said: It is. Each individual flip is a 50% chance to land either way, that means it's a 50/50% chance no matter how many flips you do.
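The two questions being argued over can be told apart on a small case by enumerating every sequence of 10 fair flips. A sketch, using 10 flips and "at least 9 heads" as a scaled-down version of the 1000-flip question:

```python
from itertools import product

flips = list(product("HT", repeat=10))  # all 1024 equally likely sequences

# Conditional question: the first 9 flips are heads, what about the 10th?
first9_heads = [s for s in flips if all(c == "H" for c in s[:9])]
p_tail_next = sum(s[9] == "T" for s in first9_heads) / len(first9_heads)

# Counting question: at least 9 heads anywhere among the 10 flips
p_9_or_more = sum(s.count("H") >= 9 for s in flips) / len(flips)

print(p_tail_next)   # 0.5
print(p_9_or_more)   # 0.0107421875
```

Each single flip is indeed 50/50, yet the probability of a lopsided total is small, and that total is the quantity the original question asks about.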