Dismiss Notice
Join Physics Forums Today!
The friendliest, high quality science and math community on the planet! Everyone who loves science is here!

Probability Problem - Coin Tossing

  1. May 24, 2010 #1
    If you toss a fair coin 20 times, what is the chance that you get a run of at least 5 heads at any time?

    Easy to solve by simulation, and I know the answer already, but I'm trying to create a spreadsheet where I can enter the number of tosses and the length of the run required and get the probability.

    Can someone tell me how to do this in excel?

  2. jcsd
  3. May 24, 2010 #2


    User Avatar
    Science Advisor
    Homework Helper

  4. May 26, 2010 #3
    Thanks EnumaElish,

    Can you explain how this continues to get x?

    where x = 1 + q.p^r + (r+1)(q.p^r)^2 + ....
  5. May 26, 2010 #4
    Ok, here's a solution I used after getting some help from a friend. This is a very simple method to get the correct answer exactly, rather than an approximation.

    I've asked a few people about this and many take (1-0.5^5)^16 as a starting point, as you can get at least 5 heads in a row on tosses 1-5, 2-6,...16-20. But that seems to lead to a dead end as it doesn't simplify anything. A sequence of HTHHH on tosses 1-5
    already rules out HHHHH on tosses 2-6 and so on.

    Start with :
    0.03125 Chance of HHHHH on toss 1-5
    0.015625 Chance of THHHHH on toss 1-6
    0.015625 Chance of THHHHH on toss 2-7
    They are mutually exclusive and all possible ways of achieving at least 5 consecutive heads are covered. We treat THHHHHH on toss 1-7 as identical to THHHHH on toss 1-6, as we are interested in at least 5 consecutive heads.
    0.015625 Chance of THHHHH on toss 3-8
    0.015625 Chance of THHHHH on toss 4-9
    0.015625 Chance of THHHHH on toss 5-10
    Now comes the tricky bit. We can't simply continue like this until toss 16-20 because sequences get counted multiple times (for instance HHHHHTHHHHH would be counted twice). So you need to multiply The chance of the next group of 6 being THHHHH by the chance that there hasn't previously been a group of THHHHH. At this stage that chance would be 1 - ( the chance of THHHHH on tosses 1-6 + the chance of THHHHH on tosses 1-7). and continue.

    In excel this can be done in a few seconds:

    A1 = 0.5^5
    A2 through A6 = 0.5^6
    B7 = 1-sum($A$1:A1)
    A7 = 0.5^6*B7
    Copy the range A7:B7 to A16:B16
    The sum of A1:A16 gives the answer. It also gives all the answers for at least 5 heads from 5-19 tosses.

    The output is

    0.015136719 0.96875
    0.014892578 0.953125
    0.014648438 0.9375
    0.014404297 0.921875
    0.014160156 0.90625
    0.013916016 0.890625
    0.013679504 0.875488281
    0.013446808 0.860595703
    0.013217926 0.845947266
    0.012992859 0.831542969
    Answer (rounded off to 7 digits): 0.2498703

    This can also be adapted to solve 6 consecutive heads in 100 tosses or any other number.

    I'd still appreciate some help in getting this solution in excel though: http://www.win.tue.nl/~iadan/blockq/rows.pdf
    As I'd like to understand how it works!
  6. May 26, 2010 #5
    "1 - ( the chance of THHHHH on tosses 1-6 + the chance of THHHHH on tosses 1-7)"

    Should be
    1 - ( the chance of HHHHH on tosses 1-5)

    The probability is the same.
  7. May 28, 2010 #6
    I know you are specifically asking how to do that recurrence method in Excel, and I don't know that one. But there is a way to get an exact, explicit solution for any number of tosses n and any run length k by counting. In fact, you and your friend figured out the main trick, which is considering sequences of the form THHHHH. I learned this from Tiny Tim, who I think is an admin here.

    A run of at least 5 heads can happen in two ways. Either the series of 20 tosses starts out HHHHH, or at some point in the series the sequence THHHHH comes up. These are not mutually exclusive, so the probability you are looking for is

    P(series starts out HHHHH) + P(sequence THHHHH comes up) - P(start out HHHHH and THHHHH comes up later)

    The first probability is (1/2)5, of course. To find the second probability, the probability that THHHHH comes up at least once in 20 tosses, you again use the Principle of Inclusion/Exclusion (PIE) to avoid over counting.

    The sequence THHHHH could start on the first toss, the second toss, the third toss, etc. Symbolize these by

    (there are 14 + 1 = 15 of these)

    where X could be either H or T. Each of these sequences has probability (1/2)6. As you have already pointed out, this overcounts such sequences as


    So they must be subtracted out. There are 20 -(2*6) + 2 = 10 places for either an X or a THHHHH to fit in, and you must choose 2. Thus there are 10 C 2 such sequences, and each sequence has probability (1/2)12

    But now we have undercounted those sequences that have 3 occurrences of THHHHH, such as


    We must add these back in. There are 20-(3*6) + 3 = 5 places for either an X or a THHHHH to fit in, and you must choose 3. So the probability of all these sequences is (5 C 3)*(1/2)18. Therefore the probability for an occurrence of THHHHH in 20 tosses is

    P(sequence THHHHH comes up) = 15*(1/2)6 - (10 C 2)*(1/2)12 + (5 C 3)*(1/2)18

    The probability that you start out with HHHHH and also have a THHHHH come up later is calculated in a similar way.
  8. May 29, 2010 #7
    [tex]P(H_5|N_{20}) = \binom{20}{5} \left (\frac{1}{2} \right )^5 \left (\frac{1}{2} \right ) ^{15} [/tex]

    (should be to the power of 15... not sure why PF latex is broken)

    Where [tex]\binom{20}{5}[/tex] is the binomial coefficient found by [tex]\binom{n}{k} = \frac{n!}{k! (n - k)!}[/tex]


    [tex]P(H_x;x \geq 5|H_{20}) = 1 - P(H_x; x \leq 4|H_{20})[/tex]


    [tex]= 1 - \sum_j^4 P(H_j|H_{20})[/tex]

    Where (from earlier),

    [tex]P(H_k|N_{n}) = \binom{n}{k} p^k (1-p) ^{n-k}[/tex]

    So it comes out to something like:

    20 flips, at least 5 heads.
    Combinations for 0 heads = 20! / [0! * (20 - 0)!] = 1
    Combinations for 1 heads = 20! / [1! * 19!] = 20
    Combinations for 2 heads = 20! / [2! * 18!] = 180
    C for 3 heads = 20! / [3! * 17!] = 1140
    C for 4 heads = 20! / [4! * 16!] = 4845

    Probability of 0 heads in 20 flips = 1 combination * (1/2)^20
    Probability of 1 head in 20 flips = 20 * (1/2)^19 * (1/2) = 20 * (1/2)^20
    2 heads = 180 * (1/2)^20

    at least 5 heads = 1 - 1*(1/2)^20 - 20*(1/2)^20 - 180*(1/2)^20 - 1140*(1/2)^20 - 4845*(1/2)^20
    = 1 - (1/2^20) * [1 + 20 + 180 + 1140 + 4845]

    There you go.
    Last edited: May 29, 2010
  9. May 29, 2010 #8
    Thanks techmologist, that works well. Might be a bit time consuming if you are working with large numbers though (if I understand you correctly).

    luma, thanks for your contribution. The math is fine, but it answers a different question. Here I am looking specifically at a the probability of a run of at least 5 heads occurring rather than at least 5 heads occurring in the sequence in total.

    Someone else suggested another way to solve the problems which seems to be the simplest of all.

    For N tosses of a coin, the number of outcomes that don't include H consecutive heads can be calculated by addition as follows. You start at N=0, and work you way up to the amount of tosses you are interested in:

    For 0≤N<H it's simply 2^N
    For N≥H it's the sum of the H previous values. (Hope that makes sense, as I'm not sure this is the proper way of putting it).

    So for N=20 and H=5, you would work out

    N H
    0 1 =(2^N)
    1 2 =(2^N)
    2 4 =(2^N)
    3 8 =(2^N)
    4 16= (2^N)
    5 31 =(1+2+4+8+16)
    6 61 =(2+4+8+16+31) etc.
    7 120
    8 236
    9 464
    10 912
    11 1793
    12 3525
    13 6930
    14 13624
    15 26784
    16 52656
    17 103519
    18 203513
    19 400096
    20 786568

    Now you have the number of outcomes that don't include at least 5 consecutive heads after 20 tosses, as well as those after 19, 18 etc.
    The total possible combinations = 2^20, so the probability for at least 5 consecutive heads =1-786568/2^20

    for N=100 and H=6

    N H
    0 1
    1 2
    2 4
    3 8
    4 16
    5 32
    6 63
    7 125
    8 248
    9 492
    10 976
    11 1936
    12 3840
    13 7617
    14 15109
    15 29970
    16 59448
  10. May 30, 2010 #9
    Correct me if im wrong here but if you have a 0.03125% chance of getting 5 heads in a row and you want to find out how likely you are to get that in 20 coin tosses then wouldn't you be able to just take the chance of getting 5 heads in a row and times it by the amount of trials you get to do that? Like in 20 coin toss you would have a chance from the start of 1-15 of the coin tosses so 15 chances overall? In that case I get like a .46 chance of it happening from .03125% x 15.
  11. May 30, 2010 #10
    (It's a 0.03125 = 3.125% chance for 5 heads in a row for 5 tosses)

    That doesn't work as it counts sequences that are impossible. For instance, if tosses 1-5 HTHHH, the chance of HHHHH on tosses 2-6 is 0%, not 3.125%. It would also count HHHHHH[rest of tosses] as two separate occurrences, because the sequence of at least 5 heads occurs on 1-5, and 2-6.

    Try to figure out what the probability is for at least 5 heads in a row for 50 tosses. .3125x46 is not possible as it is greater than one.

    One small thing, there are 16 rather 15 chances. 1-5,...,16-20
  12. May 30, 2010 #11
    Yeah, sometimes an exact closed form solution is less useful than just solving a problem iteratively or recursively on a computer. And with this problem being as hard as it is, any solution is going to be messy. But I think the general solution using PIE is manageable. You could get Matlab or Mathematica to evaluate it quickly.

    Probability of a run of at least H heads in N tosses =
    [tex]\frac{1}{2^H}+\sum_{j=1}^{\lfloor \frac{N}{H+1} \rfloor}(-1)^{j+1}\binom{N-j(H+1)+j}{j}\frac{1}{2^{j(H+1)}} - \frac{1}{2^H}\sum_{j=1}^{\lfloor \frac{N-H}{H+1} \rfloor} (-1)^{j+1}\binom{N-H-j(H+1)+j}{j}\frac{1}{2^{j(H+1)}}[/tex]

    I get that.

    I don't understand that. Can you explain why that works? If it does, that is definitely the easiest way to do it.

    EDIT: latex formula should be right now. Thanks to D H for corrections.
    Last edited: May 31, 2010
  13. May 30, 2010 #12

    D H

    User Avatar
    Staff Emeritus
    Science Advisor

    The Investor's answer in post #4 is simple, correct, and easily generalizable to determine the probability of obtaining a run of some length N in more in a set of M (M≥N) tosses.

    techmologist, your counting in post #6 is incorrect. A run starting on the first throw is conceptually different from a run starting on any other roll. You need to carry this distinction throughout in your accounting. The count approach quickly gets intractable. With 23 rolls you have to worry about overcounting four runs of five, for example.
  14. May 30, 2010 #13
    Sorry for my muddled up symbols. I'm using H as the run length of heads we're interested in, and in the table H stands for the number of variations without that run occurring, so it's all a bit confusing. I guess you do see that it works, but want to know why? I'm not sure, because someone else gave me this solution, and I asked this question too, but haven't received a response yet!

    I did come up with another solution independently while trying to work it out:

    This involves solving a series of simple problems that give the answer for the complex one.
    Once again solve it in one step increments. So if you are interested in the probability of at least 5 consecutive heads in 20 tosses, first solve the number of occurrences without a run of 5 consecutive heads in 0 tosses, then 1 toss, 2 tosses, 3 tosses, .... 20 tosses.

    When N<H, number of outcomes without a run of 5 heads= N(x) = 2^x
    When N=H, number of outcomes without a run of 5 heads = N(x) = 2^x-1
    When N>H, number of outcomes without a run of 5 heads = N(x) = N(x-1)*2-N(x-1-H)

    So for N=20 and H=5, first find the number of outcomes without H consecutive heads for N=0 and H=5, then N=1 and H=5,... N=20 and H=5

    N(6)= N(5)*2-N(5-H)
    N(7)= N(6)*2-N(6-H)
    N(8)=N(7)*2- N(7-H)

    And for N=100 and H=6, first find the number of outcomes without H consecutive heads for N=0 and H=6, then N=1 and H=6,... N=100 and H=6

    N(6)= 2^6-1=63
    N(7)= N(6)*2-N(6-H)
    N(8)=N(7)*2- N(7-H)

    I have no idea if I have conveyed this correctly, so feel free to correct my notation or ask for clarification. I have checked it, and the answers generated using this algorithm are correct, but the way I have described it probably isn't!
  15. May 30, 2010 #14
    I don't really see what the hthhh roll has to do with it because if your going to account for one random roll effecting the rest then imo you just cherry picked that number and the roll and should have to factor for all possible rolls and in that case your left with the same chance.

    When I read the Op question it seemed he was just asking what are the chances of getting 5 heads in a row in 20 coin tosses. I personally think that the answer for that type of question should be very simple because the question itself sounds simple.
  16. May 30, 2010 #15
    Also I wanted to point out that you should probably test your numbers irl before you use them. I was just checking with a coin toss generator however the one I was using only allowed for 16 coins tossed at a time at max. Anyhow it seems like with just 16 coins tossed the amount of times I got 5 heads in a row was about 33% if I did it for like 10 or so trys. But that site was a little wierd as it tryed to factor for different coins and stuff id rather have used a just random generated coin toss but was stressed for time.
  17. May 30, 2010 #16
    That's what I thought to, I was initially surprised that answering such a simple question was not that easy. To understand why your method doesn't work, you can take a much simpler problem:

    If you toss a fair coin 6 times, what is the chance that you get a run of at least 5 heads at any time?

    You'll see that reasoning along the lines of two chances to get 5 consecutive heads (tosses 1-5 and tossed 2-6), does not lead to the correct answer.

    We can simply count them. HHHHHH, THHHHH, HHHHHT, there are 3 sequences out of 64 possibilities, so P(5 consecutive heads in 6 tosses)=3/64
  18. May 30, 2010 #17
    huh? the hhhhhh is included with the thhhhh and hhhhht imo... At least with the way the problem is worded.
  19. May 30, 2010 #18
    That's quite a lot more than I would expect.
    P(5 consecutive heads in 16 tosses) = 0.196533203
  20. May 30, 2010 #19
    If you toss a fair coin 3 times, what is the chance that you get a run of at least 2 heads at any time?

    There are 8 equiprobable outcomes:

    so the chance = 3/8
    Do you see that HHHHHH, THHHHH and HHHHHT are distinct?
  21. May 30, 2010 #20
    Nope take it back I still think what I said was right. There is a 1/32 chance of getting 5 coins in a row as heads. There is 15 chances where you could get 5 coins in a row as heads.

    Your saying it will happen about 24% of the time im saying it will happen about 46% of the time. I totally think 25% is too low of a chance %.

    Also if you count the HHH as two you end up with 4/8 and that's exactly what my math will give. I can understand why you would not want to count the HHH as if it was 2 but I think you should.
    Last edited: May 30, 2010
Share this great discussion with others via Reddit, Google+, Twitter, or Facebook