How many coin tosses to get HHTH?

In summary, the conversation discusses finding the expected number of flips to achieve a specific sequence of coin tosses (HHTH) with a given probability of heads. The method proposed is using Markov chains, but there is an error in the analysis. The conversation also touches on using expected values and infinite sums to approach the problem.
  • #1
member 428835
Hi PF!

Given a coin with probability ##p## for heads, how many flips to achieve HHTH?

The method I was thinking of is Markov chains: $$E = (1-p)(E+1)+p(1-p)(2+E)+p^3(3+E) + p^2(1-p)^2(4+E) + p^3(1-p)4$$ and solve this for ##E##, which implies 30 flips on average for a fair coin. My thought process was to find the expected number of tosses for the scenarios T, HT, HHH, HHTT, HHTH respectively (one term per scenario in the sum).

Is this correct? If so, is there an approach to this using expected value and infinite sums?
 
  • #2
I think the probability of HHTH is ##p^3(1-p)## and the expectation value of the number of flips to get it is its inverse. For an example of p=1/2, the average number of flips to get it is eight. Too simple?

[EDIT] not eight, but 2^4=16
 
Last edited:
  • #3
anuttarasammyak said:
I think the probability of HHTH is ##p^3(1-p)## and the expectation value of the number of flips to get it is its inverse. For an example of p=1/2, the average number of flips to get it is eight. Too simple?
Yea, I think it's too simple, since what I'm looking for isn't the probability of HHTH (which you gave) but rather the expected number of flips to see it. I'm almost positive my analysis is correct. However, I'm guessing there's a way to do this using expected values and infinite sums.

Edit: just read your entire post. Sorry, it's late here and my eyes skipped. I feel it's incorrect, but I can't explain why aside from the fact it doesn't match the above approach (30 vs 16).
 
Last edited by a moderator:
  • #4
joshmccraney said:
Yea, I think it's too simple, since what I'm looking for isn't the probability of HHTH (which you gave) but rather the expected number of flips to see it. I'm almost positive my analysis is correct. However, I'm guessing there's a way to do this using expected values and infinite sums.

Edit: just read your entire post. Sorry, it's late here and my eyes skipped. I feel it's incorrect, but I can't explain why aside from the fact it doesn't match the above approach (30 vs 16).
I think that the Markov chain is the proper way to approach this. Suppose that we already have HHT. Then the probability of success versus failure on the next toss is very different from having HHH. IMO, with all those kinds of conditional probabilities to deal with, a Markov chain is best. And if it gives a different answer from the simpler approach, then it proves that the simpler approach was wrong.
 
  • Like
Likes member 428835
  • #5
FactChecker said:
I think that the Markov chain is the proper way to approach this. Suppose that we already have HHT. Then the probability of success versus failure on the next toss is very different from having HHH. IMO, with all those kinds of conditional probabilities to deal with, a Markov chain is best. And if it gives a different answer from the simpler approach, then it proves that the simpler approach was wrong.
But is there a way to do this via infinite series? For example, the expected number of coin tosses for a heads can be computed as $$E = (1-p)(E+1) + p$$ or we can take another approach. Let ##N## be the number of coin tosses to get heads. Then it is clear that if ##N = k##, we had ##k-1## consecutive tails before a heads. From this we know ##P(N=k) = (1-p)^{k-1}p##, and if the coin is fair, ##P(N=k) = 0.5^k##. Thus the expected value is $$E = \sum_{k=1}^\infty k\, 0.5^k$$ which can be made geometric if we multiply ##E## by 0.5, take the difference, and voilà.

Is there a similar way to approach this? It seems it would be overly complicated, but I was wondering.
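For what it's worth, the single-heads series above is easy to check numerically. A minimal sketch (truncating the infinite sum; the variable names are just illustrative):

Python:
# Numeric check of E = sum_{k>=1} k (1-p)^(k-1) p, truncated at k = 200
p = 0.5
E = sum(k * (1 - p) ** (k - 1) * p for k in range(1, 200))
print(E)  # ~2.0, i.e. 1/p, matching E = (1-p)(E+1) + p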
 
  • #6
I would like to check whether I have understood the problem properly by restating it:
--------------------
Let us make a binary decimal, digit by digit, e.g.
0
0.1
0.10
0.101
...
How many digits after the point will we require, on average, to get the sequence "0010" for the first time, where the probability for a 0 to occur is p and for a 1 is 1-p?
---------------------
 
Last edited:
  • #7
anuttarasammyak said:
I think the probability of HHTH is ##p^3(1-p)## and the expectation value of the number of flips to get it is its inverse. For an example of p=1/2, the average number of flips to get it is eight. Too simple?
Okay, after thinking about it I think you're calculating the number of tosses required to get three heads and one tail in any position.
 
  • #8
joshmccraney said:
Given a coin with probability ##p## for heads, how many flips to achieve HHTH?

The method I was thinking of is Markov chains: $$E = (1-p)(E+1)+p(1-p)(2+E)+p^3(3+E) + p^2(1-p)^2(4+E) + p^3(1-p)4$$ and solve this for ##E##, which implies 30 flips on average for a fair coin.
Okay, so there is an error in my analysis. When calculating the scenarios in the Markov chain, I first consider T, then HT, then HHH which I write as ##p^3(3+E) ##. But this is incorrect! It doesn't put us back at the start, since after HHH we only need TH, as opposed to restarting the entire process. So how do you factor this in?
 
  • #9
joshmccraney said:
Okay, so there is an error in my analysis. When calculating the scenarios in the Markov chain, I first consider T, then HT, then HHH which I write as ##p^3(3+E) ##. But this is incorrect! It doesn't put us back at the start, since after HHH we only need TH, as opposed to restarting the entire process. So how do you factor this in?
Your Markov process should have a state where the string ends in "HH". Another "H" occurring in that state should keep you in that state. Are you doing this with a transition matrix?
 
  • Like
Likes mfb
  • #10
FactChecker said:
Your Markov process should have a state where the string ends in "HH". Another "H" occurring in that state should keep you in that state. Are you doing this with a transition matrix?
Can you show me how? This is unclear to me.
 
  • #11
anuttarasammyak said:
I think the probability of HHTH is ##p^3(1-p)## and the expectation value of the number of flips to get it is its inverse. For an example of p=1/2, the average number of flips to get it is eight. Too simple?
This is not right.
 
  • #12
joshmccraney said:
Yea, I think it's too simple, since what I'm looking for isn't the probability of HHTH (which you gave) but rather the expected number of flips to see it. I'm almost positive my analysis is correct. However, I'm guessing there's a way to do this using expected values and infinite sums.
A simple approach won't work for the following reason. Let's assume ##p = \frac 1 2##. The probability of success after four tosses is ##\frac 1{16}##. But, a failure takes a variable number of tosses and leads to two different scenarios for the next attempt.

If you fail by getting ##T## or ##HT## or ##HHTT##, then you are back to the same starting position. You still have to calculate the expected number of tosses for these failures, although that's not too complicated.

If, however, you fail by getting ##HHH##, then you are starting the next attempt with ##HH## and only need ##TH## to succeed on the next attempt.

So, it's not a simple case of having success or failure, then starting again. Effectively two different games with different probabilities of success and different expected lengths are interwoven.

That said, it should be possible to get an answer by elementary means.
 
  • #13
Here's the structure you need.

We define game 1 as trying to get ##HHTH## in four tosses. There are three relevant probabilities:

##F_1## is the probability of failing and staying in game 1. I.e. failing in one of the three ways that end in a ##T##, which means game 1 starts again.

##S_1## is the probability of success in game 1.

##T_1## is the probability of failing in game 1 and transitioning to game 2. I.e. getting ##HHH##.

Game 2 is trying to get ##TH##. Again there are three relevant probabilities:

##F_2## is the probability of failing in game 2 and staying in game 2. I.e. getting ##H##.

##S_2## is the probability of success in game 2.

##T_2## is the probability of failing in game 2 and transitioning back to game 1. I.e. getting ##TT##.

Note that we can also calculate the expected number of tosses for each of these six possibilities.

The trick is to calculate the probability of returning to the start of game 1, and the expected number of tosses when we do. There are a number of possibilities for this:

a) ##F_1##: i.e. we simply fail at game 1 and stay in game 1.

b) ##T_1F_2^kT_2##: we transition out of game 1 (by getting ##HHH##), fail ##k## times at game 2, then transition back to game 1. Note that ##k = 0, 1, 2, \dots##.

Ultimately that will give us an expected number of tosses, ##E_1## say, before returning to the start of the game, and an associated probability ##P_1##.

The other calculation is, of course, the expected number of tosses to succeed without returning to the start of game 1. This is similar to the above:

c) ##S_1##: i.e. we simply succeed at game 1.

d) ##T_1F_2^kS_2##: we transition to game 2, fail ##k## times at game 2, then succeed at game 2.

This will give us an expected number of tosses, ##E_2## say, to succeed without returning to the start of game 1. The associated probability must be ##P_2 = 1 - P_1##.

Edit: and, of course, this gives us a sequence of independent trials, each with probability ##P = P_2## of success.

The expected number of trials to get success is ##\frac 1 P##. And the expected number of tosses is:
$$E = E_2 + (\frac 1 P - 1)E_1$$
 
Last edited:
  • #14
PS I assume that the theory of Markov chains gives you a formal way to do that with transition matrices. But, that's not something I've studied.
 
  • Like
Likes pbuk
  • #15
Re: my post #6, starting from the simplest case of getting "T", the expected number of tosses to get it is calculated as
[tex] \sum_{n=0}^\infty (n+1)p^n (1-p)=\frac{1}{1-p}[/tex]
Am I understanding the problem properly?
 
Last edited:
  • #16
anuttarasammyak said:
Am I understanding the problem properly?
I don't think so.
 
  • #17
PS I've done a quick calculation and get the expected length (for ##p = \frac 1 2##) to be ##283/16 = 17.6875##.

This probably has an error in it. I'll double check later. My Python code seems to give exactly 18.
 
Last edited:
  • #18
PeroK said:
PS I've done a quick calculation and get the expected length (for ##p = \frac 1 2##) to be ##283/16 = 17.6875##.

This probably has an error in it. I'll double check later.
I would also estimate the expected length for p=1/2 roughly. Say we have a machine which outputs 4 random binary digits.
1st flip: 0.1
2nd flip: 0.10
3rd flip: 0.101
4th flip: 0.1011 : compare the last 4 digits 1011 with a new machine output. When they coincide, stop flipping the coin. Otherwise go to the next flip.
5th flip: 0.10110 : compare the last 4 digits 0110 with a new machine output. When they coincide, stop flipping the coin. Otherwise go to the next flip.
-----
nth flip: 0.10110...0010 : compare the last 4 digits 0010 with a new machine output. When they coincide, stop flipping the coin. Otherwise go to the next flip.
-----
Expected length is
[tex] 2^{-4}\sum_{n=0}^\infty (n+4)(1-2^{-4})^n=19=3+16[/tex]
obviously. I know this is the answer for another process, not the original one, so it remains a rough estimate.
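A quick numerical check of this truncated-series estimate (a minimal sketch):

Python:
# Check the rough-estimate series 2^-4 * sum_{n>=0} (n+4) (1 - 2^-4)^n
q = 1 - 2 ** -4
print(2 ** -4 * sum((n + 4) * q ** n for n in range(5000)))  # ~19.0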
 
Last edited:
  • #19
A computer simulation shows that the answer is 18.
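The actual code isn't shown in the thread; a minimal Monte Carlo sketch along these lines (the function and parameter names here are illustrative, not the posted code) reproduces the figure:

Python:
import random

def tosses_until(target="HHTH", p=0.5):
    """Toss a p-biased coin until target appears; return the toss count."""
    recent, n = "", 0
    while recent != target:
        # Keep only the last len(target) tosses; equality means target just appeared
        recent = (recent + ("H" if random.random() < p else "T"))[-len(target):]
        n += 1
    return n

runs = 100_000
print(sum(tosses_until() for _ in range(runs)) / runs)  # ~18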
 
  • Like
Likes mfb and anuttarasammyak
  • #20
joshmccraney said:
Can you show me how? This is unclear to me.
I'm afraid that I have not studied Markov processes enough to help you. This is called the "hitting time" of a Markov process. Here is an article on it. It has an example using the transition matrix.
 
  • #21
I've found a slightly better variation of the above.

First, we look at all the ways to get back to the starting state. These are:$$T, HT, HHTT, HHHTT, HHHHTT \dots$$The probability of this is $$P_1 = \frac 1 2 + \frac 1 4 + \frac 1 {16} + \frac 1 {32} + \dots = 1 - \frac 1 8 = \frac 7 8$$We also need the expected length of this scenario, conditional on it happening. We have:$$E_1 = \frac 8 7[\frac 1 2 + \frac 2 4 + \frac 4 {16} + \frac 5 {32} \dots] = \frac 4 7[
1 + \frac 2 2 + \frac 4 {8} + \frac 5 {16} \dots] = \frac 4 7[4 - \frac 3 4] = \frac{13}{7}$$Where I've used the binomial expansion$$\frac 1 {(1-x)^2} = 1 + 2x + 3x^2 + 4x^3 + 5x^4 \dots$$with ##x = \frac 1 2##.

In other words, there is a probability of ##P_1 = \frac 7 8## of returning to the starting state, with an expected length of ##E_1 = \frac{13}7## tosses.

Next we look at all the ways to get ##HHTH## without returning to the starting state. These are:
$$HHTH, HHHTH, HHHHTH, \dots$$The probability of this is:
$$P_2 = \frac 1 {16} + \frac 1 {32} + \frac 1 {64} \dots = \frac 1 8$$And the expected length conditional on this happening is:$$E_2 = 8[\frac 4 {16} + \frac 5 {32} + \frac 6 {64} \dots] = 4[4 - 1 - 1 - \frac 3 4] = 5$$Now we effectively have the binomial trials scenario we want: there is a probability of ##P = \frac 1 8## of success, hence an expected number of ##7## failures before the successful trial. The expected number of tosses is then:
$$E = (7 \times E_1) + E_2 = (7 \times \frac {13}{7}) + 5 = 18$$
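All four quantities (and the final answer) are easy to verify numerically; a minimal sketch with truncated series:

Python:
# Return-to-start paths T, HT, HHTT, HHHTT, ... have lengths 1, 2, 4, 5, 6, ...
# and a length-k path has probability 2^-k for a fair coin.
lengths = [1, 2] + list(range(4, 80))
P1 = sum(0.5 ** k for k in lengths)
E1 = sum(k * 0.5 ** k for k in lengths) / P1
# Success paths HHTH, HHHTH, HHHHTH, ... have lengths 4, 5, 6, ...
P2 = sum(0.5 ** k for k in range(4, 80))
E2 = sum(k * 0.5 ** k for k in range(4, 80)) / P2
print(P1, E1, P2, E2)  # 0.875, ~1.857 (13/7), 0.125, 5.0
print(7 * E1 + E2)     # 18.0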
 
Last edited:
  • Like
  • Love
Likes mfb, .Scott, DrClaude and 4 others
  • #22
PeroK said:
I've found a slightly better variation of the above.

First, we look at all the ways to get back to the starting state. These are:$$T, HT, HHTT, HHHTT, HHHHTT \dots$$The probability of this is $$P_1 = \frac 1 2 + \frac 1 4 + \frac 1 {16} + \frac 1 {32} + \dots = 1 - \frac 1 8 = \frac 7 8$$We also need the expected length of this scenario, conditional on it happening. We have:$$E_1 = \frac 8 7[\frac 1 2 + \frac 2 4 + \frac 4 {16} + \frac 5 {32} \dots] = \frac 4 7[
1 + \frac 2 2 + \frac 4 {8} + \frac 5 {16} \dots] = \frac 4 7[4 - \frac 3 4] = \frac{13}{7}$$Where I've used the binomial expansion$$\frac 1 {(1-x)^2} = 1 + 2x + 3x^2 + 4x^3 + 5x^4 \dots$$with ##x = \frac 1 2##.

In other words, there is a probability of ##P_1 = \frac 7 8## of returning to the starting state, with an expected length of ##E_1 = \frac{13}7## tosses.

Next we look at all the ways to get ##HHTH## without returning to the starting state. These are:
$$HHTH, HHHTH, HHHHTH, \dots$$The probability of this is:
$$P_2 = \frac 1 {16} + \frac 1 {32} + \frac 1 {64} \dots = \frac 1 8$$And the expected length conditional on this happening is:$$E_2 = 8[\frac 4 {16} + \frac 5 {32} + \frac 6 {64} \dots] = 4[4 - 1 - 1 - \frac 3 4] = 5$$Now we effectively have the binomial trials scenario we want: there is a probability of ##P = \frac 1 8## of success, hence an expected number of ##7## failures before the successful trial. The expected number of tosses is then:
$$E = (7 \times E_1) + E_2 = (7 \times \frac {13}{7}) + 5 = 18$$
Brilliant! Very nicely done. This was very enlightening!
 
  • #23
PeroK said:
The expected number of tosses is then:
$$E = (7 \times E_1) + E_2 = (7 \times \frac {13}{7}) + 5 = 18$$
Excellent, and this is verified by your Monte Carlo computation.

One advantage of using Markov chain modelling is that you don't have to hunt for the right way to approach the problem as you did in #17: just follow the right steps and you get to the right answer (and for some problems, get to answers that are difficult to achieve by first-principles analysis).

If you are interested in looking at the Markov chain approach this Cambridge University course material provides both an excellent introduction to the subject and a number of example problems which are comparable to the one in the OP: http://www.statslab.cam.ac.uk/~rrw1/markov/index.html

Don't get tricked into modelling many states for partial strings such as "HHTT"; the only states you are interested in are ##\{0, H, HH, HHT, HHTH\}##. The first and last rows of the transition matrix should look like
$$ \begin{bmatrix}
q & p & 0 & 0 & 0 \\
\dots \\
0 & 0 & 0 & 0 & 1 \\
\end{bmatrix}
$$
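For example, a sketch of the hitting-time calculation in Python (assuming numpy), solving ##(I-Q)h = \mathbf{1}## where ##Q## is the transient part of the matrix above:

Python:
import numpy as np

def expected_tosses(p):
    q = 1 - p
    # Transient states: "", "H", "HH", "HHT"; the absorbing "HHTH" row is dropped
    Q = np.array([
        [q,   p,   0.0, 0.0],  # "":    T stays, H -> "H"
        [q,   0.0, p,   0.0],  # "H":   T -> "", H -> "HH"
        [0.0, 0.0, p,   q  ],  # "HH":  H stays, T -> "HHT"
        [q,   0.0, 0.0, 0.0],  # "HHT": H absorbs, T -> ""
    ])
    # Expected hitting times satisfy h = 1 + Q h, i.e. (I - Q) h = 1
    h = np.linalg.solve(np.eye(4) - Q, np.ones(4))
    return h[0]

print(expected_tosses(0.5))  # 18.0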
 
  • Like
Likes PeroK, FactChecker and member 428835
  • #24
For a general probability ##p## of getting heads we have:
$$E = \frac{1 + p^2 - p^3}{p^3(1-p)}$$Which agrees closely enough, it seems, with my revised Python simulation.
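As a quick check, the formula can be evaluated directly:

Python:
def formula(p):
    return (1 + p ** 2 - p ** 3) / (p ** 3 * (1 - p))

print(formula(0.5))     # 18.0, the fair-coin answer
print(formula(1 / 16))  # ~4385, matching the rare-event figure quoted below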
 
  • Like
Likes FactChecker
  • #25
joshmccraney said:
Hi PF!

Given a coin with probability ##p## for heads, how many flips to achieve HHTH?

The method I was thinking of is Markov chains: $$E = (1-p)(E+1)+p(1-p)(2+E)+p^3(3+E) + p^2(1-p)^2(4+E) + p^3(1-p)4$$ and solve this for ##E##, which implies 30 flips on average for a fair coin. My thought process was to find the expected number of tosses for the scenarios T, HT, HHH, HHTT, HHTH respectively (one term per scenario in the sum).

Is this correct? If so, is there an approach to this using expected value and infinite sums?
My guess is to find the probability (1/16) of HHTH and then get its reciprocal (16) since there is a one in sixteen chance of HHTH happening.
 
  • Sad
Likes Motore and PeroK
  • #26
PeroK said:
For a general probability ##p## of getting heads we have:
$$E = \frac{1 + p^2 - p^3}{p^3(1-p)}$$Which agrees closely enough it seems with my revised Python simulation.
Having bragged about the advantages of Monte Carlo simulation, I feel obliged to point out that this is a good example of an advantage of the analytical approach when rare events are involved.
With ##p=1/16##, the expected string length before "HHTH" occurs is 4,385, and ##p=1/100## has an expected length of 1,010,201. (As ##p## gets small, the expected length approaches ##1/p^3## asymptotically.) I would hate to have to simulate a lot of those with my program.
On the other hand, the analytical approach can quickly get out of hand if there are any tweaks to the problem logic.
 
  • Like
Likes PeroK
  • #27
FactChecker said:
I feel obliged to point out that this is a good example of an advantage of the analytical approach when rare events are involved.
...
On the other hand, the analytical approach can quickly get out of hand if there are any tweaks to the problem logic.
A third alternative is the approach of @.Scott in this post. I have not studied it, but I think it is estimating the integral ##\int x Prob(x)## directly. That would not require simulations, or advanced analysis, and would be adaptable to "tweaks" in the problem logic. I think that it could handle cases like p=1/100 with no problem even if the analytical alternatives got out of hand.
 
  • #28
FactChecker said:
A third alternative is the approach of @.Scott in this post. I have not studied it, but I think it is estimating the integral ##\int x Prob(x)## directly. That would not require simulations, or advanced analysis, and would be adaptable to "tweaks" in the problem logic. I think that it could handle cases like p=1/100 with no problem even if the analytical alternatives got out of hand.
This is basically correct.

For any "n" tosses, I am tallying up what portion ended up un-resolve in each of the possible 8 states (determined by the last three toss results) and how many are contribute to the "resolved" state. I am calculating ##\int x Prob(x)## directly. The only "estimate" is that I compute it for "##x=4## to ##1000##" instead of "##x = 4## to ##\infty##". But the calculation are fast and at pH=pT=0.5, the result (as an estimate of "##x=4## to ##\infty##") is accurate to 10 places after only 320 "tosses", so the only real limit is from the imprecision of the floating point arithmetic.

When I ran the program for pH = 0.1, it took about 29,000 tosses to approximate the full integral.
The expected value was 1121.11111111.

I have updated the code to compute for pHeads = 0.1 and posted it to the other thread. That new post is at this link.
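For reference, a minimal sketch of the same idea (not the posted code): step the state distribution forward one toss at a time and accumulate ##\sum_n n\,P(n)## directly, here using the four prefix states rather than the eight last-three-toss states:

Python:
def expected_tosses_direct(p, max_tosses=1000):
    q = 1 - p
    dist = [1.0, 0.0, 0.0, 0.0]  # probability mass in states "", "H", "HH", "HHT"
    expected = 0.0
    for n in range(1, max_tosses + 1):
        e0, e1, e2, e3 = dist
        expected += n * e3 * p       # "HHT" followed by H resolves at toss n
        dist = [q * (e0 + e1 + e3),  # T from "", "H" or "HHT" -> ""
                p * e0,              # H from "" -> "H"
                p * (e1 + e2),       # H from "H" or "HH" -> "HH"
                q * e2]              # T from "HH" -> "HHT"
    return expected

print(expected_tosses_direct(0.5))                     # ~18.0
print(expected_tosses_direct(0.1, max_tosses=30_000))  # ~1121.11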
 
Last edited:
  • #29
.Scott said:
I am calculating ##\int x Prob(x)## directly. The only "estimate" is that I compute it for "##x=4## to ##1000##" instead of "##x = 4## to ##\infty##".
Yes. That is what I meant. But I agree that your answer was exact for all practical purposes.
.Scott said:
But the calculations are fast and at pH=pT=0.5, the result (as an estimate of "##x=4## to ##\infty##") is accurate to 10 places after only 320 "tosses", so the only real limit is the imprecision of the floating point arithmetic.
That is fast enough. Using Monte Carlo simulations, I made several runs, each with 10 million trials. The answers for the expected value varied by 0.002. I imagine it took a lot longer and still was not nearly as accurate as your answer.
.Scott said:
When I ran the program for pH = 0.1, it took about 29,000 tosses to approximate the full integral.
The expected value was 1211.11111111.
I think you mean 1121.11111111. I ran two simulations of 100,000 trials each and got 1122.58 and 1115.5. Each simulation took about 5 minutes. I don't think I will do it with 10 million trials as I did for p=1/2. :cool:
 
  • Like
Likes .Scott
  • #30
A simpler approach is to relate the expected values using the transitions. If we take the states ##S_0, S_1, S_2, S_3## to be the starting state, ##H##, ##HH## and ##HHT## respectively, and ##N_i## to be the expected number of tosses starting from the ##i##th state, then we can iterate through the options as follows:
$$N_0 = 1 + \frac 1 2 N_0 + \frac 1 2 N_1$$That's because after the first toss we must, in general, have reduced the expected number by ##1## and are now in a 50-50 split of ##S_0## and ##S_1##. This gives:
$$N_1 = N_0 - 2$$Similarly:
$$N_1 = 1 + \frac 1 2 N_0 + \frac 1 2 N_2$$Giving:$$N_2 = N_0 - 6$$And:
$$N_2 = 1 + \frac 1 2 N_3 + \frac 1 2 N_2$$Giving:$$N_3 = N_0 - 8$$Finally:
$$N_3 = 1 + \frac 1 2 N_0$$Giving:$$N_0 = 18, \ N_1 = 16, \ N_2 = 12, \ N_3 = 10$$
 
  • Like
Likes member 428835
  • #31
And with ##p## as the probability of throwing a head we have:
$$N_0 = 1 + pN_1 + (1-p)N_0, \dots$$Which leads to:
$$N_1 = N_0 - \frac 1 p, \ N_2 = N_0 - \frac 1 p - \frac 1 {p^2}, \ N_3 = N_0 - \frac 1 p - \frac 1 {p^2} - \frac 1 {1-p}$$And$$N_3 = 1 + (1-p)N_0$$Which leads to the same result as before:
$$N_0 = \frac{1 + p^2 - p^3}{p^3(1-p)}$$
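These relations can be checked symbolically; a sketch assuming sympy:

Python:
import sympy as sp

p = sp.symbols('p', positive=True)
N0, N1, N2, N3 = sp.symbols('N0 N1 N2 N3')
sol = sp.solve([
    sp.Eq(N0, 1 + p * N1 + (1 - p) * N0),  # start: H -> "H", T -> start
    sp.Eq(N1, 1 + p * N2 + (1 - p) * N0),  # "H":   H -> "HH", T -> start
    sp.Eq(N2, 1 + p * N2 + (1 - p) * N3),  # "HH":  H stays, T -> "HHT"
    sp.Eq(N3, 1 + (1 - p) * N0),           # "HHT": H succeeds, T -> start
], [N0, N1, N2, N3])
print(sp.simplify(sol[N0]))  # equivalent to (1 + p**2 - p**3)/(p**3*(1 - p))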
 
  • Like
Likes member 428835 and FactChecker
  • #32
FactChecker said:
I think you mean 1121.11111111. I ran two simulations of 100,000 trials each and got 1122.58 and 1115.5 Each simulation took about 5 minutes. I don't think I will do it with 10 million trials as I did for p=1/2. :cool:
Yes, 1121.11111111. I have corrected my post. The results in the other thread were cut-and-pasted, and so they show the accurate values without being filtered through bio-data-processing elements.
 
  • Like
Likes FactChecker
  • #33
.Scott said:
Yes, 1121.11111111. I have corrected my post. The results in the other thread were cut-and-pasted, and so they show the accurate values without being filtered through bio-data-processing elements.
Ha! :-) Those dang bio-data-processing elements will get you every time!
 
  • #34
There's always the non-elegant-but-sure way:

1) It can be done after ##n## throws with probability ##P_n##, giving expected value ##\sum_n n P_n##. It may get somewhat messy, but after 15 or so throws ##n P_n## is small enough that you can ignore it. And it seems some sort of recursion would help compute ##P_n##.

For ##n=4##, the expected value contribution is ##4 \cdot 2^{-4}##; for ##n=5##: ##5 \cdot 2^{-5}##. Then it becomes a bit more complicated, since you need to exclude the substring HHT at the beginning.
 
Last edited:
  • Skeptical
Likes PeroK
  • #35
@PeroK: What part are you skeptical about?
 
