# Prob puzzle question

1. Apr 25, 2013

### BWV

came up with a different answer than someone posted on this, what do you all think?

Say you have one bacterium now. In any minute, each living one has a quarter chance of dying, a quarter of staying still, a quarter of splitting into 2, and a quarter of splitting into 3. What is the probability of all the bacteria eventually dying out?

2. Apr 25, 2013

### Staff: Mentor

If you have two different answers, why don't you write them down?

3. Apr 26, 2013

### BWV

my initial thought was the geometric series over n for p=.25

Ʃpn-1 =0.333

but it is more complex than that

at T1 there is a p chance of extinction

but at T2 there is a p2+p3+p4 chance of extinction, as if the germ does not go extinct at T1 there is an equal probability of 1,2 or 3 germs

and at T3 there is a p3+p4+p5+p6+p7+p8+p9 chance of extinction

after that for Tj get a sum pnof over a range n=j to j+3(j-1)

but there should be some pattern of coefficients to reflect the multiple paths to the middle range of probabilities at each T, which is where I am hung up

another guess is:
$\displaystyle\sum\limits_{n=1}^∞ p^n$ + $\displaystyle\sum\limits_{n=2}^∞ p^n$+ $\displaystyle\sum\limits_{n=3}^∞ p^n$...≈0.45

but this overcounts p2 as there is only one path to it

Last edited: Apr 26, 2013
4. Apr 26, 2013

### Staff: Mentor

How did you get that?
The probability distribution of T2 is not so trivial, I don't see why this expression should be true.

P(n->m) is the number of all options to write m as sum of n distinct integers from 0 to 3; divided by 4^n.
P(n->0) is always 1/4^n, but the extinction chance depends on the distribution of n.

Written as matrix Pmn, we look for the limit of the first component in $\displaystyle \lim_{k\to\infty}\,(0,1,0,...) \cdot P_{mn}^{k}$

5. Apr 26, 2013

### BWV

Sorry was omitting unknown coefficients at t3 as you start to get multiple paths, but t2 is correct as there are either 1 2 or 3 bacteria and they all have to go extinct

Thanks for the response, will play with it when I get some time

6. Apr 28, 2013

### bahamagreen

Seems to me it has to be p=1 for all dieing out.
Each of the four possibilities per minute is subject to fluctuation.
Considering the maximum fluctuation cases for any minute:

- every cell splits into three
- every cell splits into two
- every cell survives without splitting
- every cell dies

The probability of these fluctuations within one particular arbitrary minute are equal, but their impacts to growth are not.
Advances in population by the first two, or maintenance of stasis by the third, may be somewhat recovered by subsequent lessor fluctuations of the fourth (where not all cells die), and the first three may enjoy arbitrary repetitions...
The occurrence of the maximum fourth is a total stop unrecoverable by the first three, and it only needs to happen once.
Given an indefinitely long period of time, it will.

7. Apr 28, 2013

### Staff: Mentor

Those extreme fluctuations are extremely rare - and their probability is decreasing with increasing population size. If you start with a population of size n, their probability to die out goes to 0 for n->infinity.
The probability of this goes down quicker than exponentially, giving a finite sum.

I simulated the problem in python. I stopped if the population reached 0 (="dead"), 1 (=back to the original state), >50 (="living"), or had 30 iterations (="unclear"). In 1 million samples, 397509 were living, 280790 were dead, 0 were unclear. The others returned to 1 and were not counted. This gives a fraction of 41.4% dead populations with an uncertainty of about 0.1%.

A population with 50 is basically immortal. It gets 75+-8 descendants, the probability that it shrinks is tiny. If I consider only populations of >=500 as "living", the program runs longer, but the result stays the same.

A quick check (with 10-100 thousand populations each) for other starting values:

8. Apr 28, 2013

### BWV

this was supposedly a job interview question for a quant job at Morgan Stanley

maybe "I would just monte carlo it" was the answer they were looking for?

9. Apr 28, 2013

### Office_Shredder

Staff Emeritus
Let p be the probability of dying. Then after one generation either you die, you stay at 1 guy, you get 2 guys (who both have to die off independently) or you get three guys (who have to die off independently). So
p = .25+.25p+.25p2+.25 p3

10. Apr 28, 2013

### Staff: Mentor

Very nice, I did not see that.
This gives about 41.42%, in agreement with my simulations.

11. Apr 28, 2013

### Office_Shredder

Staff Emeritus
I just realized that that polynomial has p=1 as a root, so you have to prove separately that the line will not die off with probability 1. The proof isn't that bad... if Pn is the population on the nth iteration, there is a nonzero probability that for really large n, Pn is at least 1.5n (because that's the expected value of Pn). Now suppose that we have one of these really large populations.

Pn+1/Pn is the average value of the size of each of the individual cell's iterations. It has expected value 1.5 and variance 1.25/Pn. Therefore Markov's inequality for the variance tells us
$$Prob(P_{n+1}/P_n < 1.3) \leq \frac{c}{P_n}$$
for some constant c. Now we can show that there is a non-zero probability that each Pn+k is 1.3 times larger than the one that comes before it. Suppose n+k is the first iteration where this fails:
$$Prob(P_{n+k}/P_{n+k-1} <1.3 \ |\ P_{n+j}/P_{n+j-1} > 1.3\ \forall\ j<k ) \leq \frac{c}{P_{n+k-1}} \leq \frac{c}{(1.3)^{k-1}P_{n}}$$

Which means that adding up the probability for each k being the first one
$$\sum_{k \geq 0} \frac{c}{(1.3)^{k-1}P_{n}} = \frac{1.3c}{P_n} \frac{1}{1-1/(1.3)}$$
which is smaller than 1 if Pn is large enough. So there is a nonzero probability that this population increases by a factor of 1.3 for EVERY subsequent iteration

12. Apr 29, 2013

### BWV

cool, I could not see on the site where I found the problem where the √2-1 answer posted came from