Probability of Bacteria Dying Out: What Do You Think?

BWV · Apr 25, 2013

came up with a different answer than someone posted on this, what do you all think?

Say you have one bacterium now. In any minute, each living one has a quarter chance of dying, a quarter of staying still, a quarter of splitting into 2, and a quarter of splitting into 3. What is the probability of all the bacteria eventually dying out?

mfb · Apr 25, 2013

If you have two different answers, why don't you write them down?

BWV · Apr 26, 2013

my initial thought was the geometric series over n for p=.25

Ʃpⁿ-1 =0.333

but it is more complex than that

at T₁ there is a p chance of extinction

but at T₂ there is a p²+p³+p⁴ chance of extinction, as if the germ does not go extinct at T₁ there is an equal probability of 1,2 or 3 germs

and at T₃ there is a p³+p⁴+p⁵+p⁶+p⁷+p⁸+p⁹ chance of extinction

after that for T_j get a sum pⁿof over a range n=j to j+3(j-1)

but there should be some pattern of coefficients to reflect the multiple paths to the middle range of probabilities at each T, which is where I am hung up

another guess is:
\displaystyle\sum\limits_{n=1}^∞ p^n + \displaystyle\sum\limits_{n=2}^∞ p^n+ \displaystyle\sum\limits_{n=3}^∞ p^n...≈0.45

but this overcounts p² as there is only one path to it

mfb · Apr 26, 2013

BWV said:

and at T₃ there is a p³+p⁴+p⁵+p⁶+p⁷+p⁸+p⁹ chance of extinction

How did you get that?
The probability distribution of T₂ is not so trivial, I don't see why this expression should be true.

P(n->m) is the number of all options to write m as sum of n distinct integers from 0 to 3; divided by 4^n.
P(n->0) is always 1/4^n, but the extinction chance depends on the distribution of n.

Written as matrix P_mn, we look for the limit of the first component in ##\displaystyle \lim_{k\to\infty}\,(0,1,0,...) \cdot P_{mn}^{k}##

BWV · Apr 26, 2013

Sorry was omitting unknown coefficients at t3 as you start to get multiple paths, but t2 is correct as there are either 1 2 or 3 bacteria and they all have to go extinct

Thanks for the response, will play with it when I get some time

bahamagreen · Apr 28, 2013

Seems to me it has to be p=1 for all dieing out.
Each of the four possibilities per minute is subject to fluctuation.
Considering the maximum fluctuation cases for any minute:

- every cell splits into three
- every cell splits into two
- every cell survives without splitting
- every cell dies

The probability of these fluctuations within one particular arbitrary minute are equal, but their impacts to growth are not.
Advances in population by the first two, or maintenance of stasis by the third, may be somewhat recovered by subsequent lessor fluctuations of the fourth (where not all cells die), and the first three may enjoy arbitrary repetitions...
The occurrence of the maximum fourth is a total stop unrecoverable by the first three, and it only needs to happen once.
Given an indefinitely long period of time, it will.

mfb · Apr 28, 2013

Those extreme fluctuations are extremely rare - and their probability is decreasing with increasing population size. If you start with a population of size n, their probability to die out goes to 0 for n->infinity.

The occurrence of the maximum fourth is a total stop unrecoverable by the first three, and it only needs to happen once.
Given an indefinitely long period of time, it will.

The probability of this goes down quicker than exponentially, giving a finite sum.I simulated the problem in python. I stopped if the population reached 0 (="dead"), 1 (=back to the original state), >50 (="living"), or had 30 iterations (="unclear"). In 1 million samples, 397509 were living, 280790 were dead, 0 were unclear. The others returned to 1 and were not counted. This gives a fraction of 41.4% dead populations with an uncertainty of about 0.1%.

A population with 50 is basically immortal. It gets 75+-8 descendants, the probability that it shrinks is tiny. If I consider only populations of >=500 as "living", the program runs longer, but the result stays the same.

A quick check (with 10-100 thousand populations each) for other starting values:
0: 100% dead (of course)
2: 18% dead
3: 7% dead
4: 2.8% dead
5: 1.2% dead

BWV · Apr 28, 2013

this was supposedly a job interview question for a quant job at Morgan Stanley

maybe "I would just monte carlo it" was the answer they were looking for?

Office_Shredder · Apr 28, 2013

Let p be the probability of dying. Then after one generation either you die, you stay at 1 guy, you get 2 guys (who both have to die off independently) or you get three guys (who have to die off independently). So
p = .25+.25p+.25p²+.25 p³

mfb · Apr 28, 2013

Office_Shredder said:

(who both have to die off independently)

Very nice, I did not see that.
This gives about 41.42%, in agreement with my simulations.

Office_Shredder · Apr 28, 2013

I just realized that that polynomial has p=1 as a root, so you have to prove separately that the line will not die off with probability 1. The proof isn't that bad... if P_n is the population on the nth iteration, there is a nonzero probability that for really large n, P_n is at least 1.5ⁿ (because that's the expected value of P_n). Now suppose that we have one of these really large populations.

P_n+1/P_n is the average value of the size of each of the individual cell's iterations. It has expected value 1.5 and variance 1.25/P_n. Therefore Markov's inequality for the variance tells us
Prob(P_{n+1}/P_n < 1.3) \leq \frac{c}{P_n}
for some constant c. Now we can show that there is a non-zero probability that each P_n+k is 1.3 times larger than the one that comes before it. Suppose n+k is the first iteration where this fails:
Prob(P_{n+k}/P_{n+k-1} <1.3 \ |\ P_{n+j}/P_{n+j-1} > 1.3\ \forall\ j<k ) \leq \frac{c}{P_{n+k-1}} \leq \frac{c}{(1.3)^{k-1}P_{n}}

Which means that adding up the probability for each k being the first one
\sum_{k \geq 0} \frac{c}{(1.3)^{k-1}P_{n}} = \frac{1.3c}{P_n} \frac{1}{1-1/(1.3)}
which is smaller than 1 if P_n is large enough. So there is a nonzero probability that this population increases by a factor of 1.3 for EVERY subsequent iteration

BWV · Apr 29, 2013

cool, I could not see on the site where I found the problem where the √2-1 answer posted came from

Probability of Bacteria Dying Out: What Do You Think?

Similar threads

Hot Threads

B A Little Probability Puzzle

I Need help solving this Existence Algorithm for truth

A Does this computation satisfy LTL formulas?

A Prove that points which are indistinguishable from 0 exist (using logic)

A Mathematical Connection between Cosmic Expansion and Exponential Growth

Recent Insights

Insights Quantum Entanglement is a Kinematic Fact, not a Dynamical Effect

Insights What Exactly is Dirac’s Delta Function? - Insight

Insights Relativator (Circular Slide-Rule): Simulated with Desmos - Insight

Insights Fixing Things Which Can Go Wrong With Complex Numbers

Insights Fermat's Last Theorem

Insights Why Vector Spaces Explain The World: A Historical Perspective