Probability of Bacteria Dying Out: What Do You Think?

  • Context: Undergrad 
  • Thread starter: BWV
  • Tags: Puzzle

Discussion Overview

The discussion revolves around the probability of bacteria dying out given a specific model of bacterial reproduction and death. Participants explore various mathematical approaches to calculate extinction probabilities based on defined probabilities for different outcomes each minute.

Discussion Character

  • Exploratory
  • Mathematical reasoning
  • Debate/contested

Main Points Raised

  • One participant proposes a model where each bacterium has a 25% chance of dying, staying the same, splitting into 2, or splitting into 3, and questions the overall extinction probability.
  • Another participant suggests using a geometric series to approach the problem but acknowledges the complexity of the extinction probabilities at different time steps.
  • A later reply challenges the initial assumptions about the probability distribution at T2 and suggests a matrix approach to analyze the extinction probabilities.
  • Some participants discuss the implications of extreme fluctuations in population size and how they affect the probability of extinction, with one arguing that such fluctuations become rarer as population size increases.
  • One participant shares simulation results indicating that larger initial populations have a significantly lower probability of dying out, suggesting a threshold effect.
  • Another participant introduces a polynomial equation to express the extinction probability and notes that it has a root at p=1, prompting a discussion about proving the non-extinction of the population.
  • Several participants express uncertainty about the correct approach and calculations, with one noting the need for further exploration of the coefficients involved in the extinction probabilities.

Areas of Agreement / Disagreement

Participants express differing views on the extinction probabilities, with no consensus reached on the correct method or final probability. Multiple competing models and interpretations of the problem remain present throughout the discussion.

Contextual Notes

Some assumptions about the independence of bacterial deaths and the distribution of outcomes are not fully resolved. The complexity of the mathematical expressions and the potential for overcounting in certain scenarios are acknowledged but not clarified.

BWV
I came up with a different answer than the one someone posted for this problem. What do you all think?

Say you have one bacterium now. In any minute, each living one has a quarter chance of dying, a quarter of staying still, a quarter of splitting into 2, and a quarter of splitting into 3. What is the probability of all the bacteria eventually dying out?
 
If you have two different answers, why don't you write them down?
 
my initial thought was the geometric series over n for p = 0.25:

[itex]\displaystyle\sum\limits_{n=1}^{\infty} p^n \approx 0.333[/itex]

but it is more complex than that

at T1 there is a p chance of extinction

but at T2 there is a [itex]p^2+p^3+p^4[/itex] chance of extinction, as if the germ does not go extinct at T1 there is an equal probability of 1, 2 or 3 germs

and at T3 there is a [itex]p^3+p^4+p^5+p^6+p^7+p^8+p^9[/itex] chance of extinction

after that, for Tj you get a sum of [itex]p^n[/itex] over a range n = j to j+3(j-1)

but there should be some pattern of coefficients to reflect the multiple paths to the middle range of probabilities at each T, which is where I am hung up

another guess is:
[itex]\displaystyle\sum\limits_{n=1}^{\infty} p^n[/itex] + [itex]\displaystyle\sum\limits_{n=2}^{\infty} p^n[/itex] + [itex]\displaystyle\sum\limits_{n=3}^{\infty} p^n[/itex] + ... ≈ 0.45

but this overcounts [itex]p^2[/itex] as there is only one path to it
 
BWV said:
and at T3 there is a [itex]p^3+p^4+p^5+p^6+p^7+p^8+p^9[/itex] chance of extinction
How did you get that?
The probability distribution of T2 is not so trivial, I don't see why this expression should be true.

P(n->m) is the number of ways to write m as an ordered sum of n integers, each from 0 to 3 (repeats allowed), divided by 4^n.
P(n->0) is always 1/4^n, but the extinction chance depends on the distribution of n.

Written as matrix Pmn, we look for the limit of the first component in ##\displaystyle \lim_{k\to\infty}\,(0,1,0,...) \cdot P_{mn}^{k}##
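
A rough numerical sketch of this matrix idea, not taken from the thread: truncate the state space at a hypothetical cap `max_pop`, build row n of the matrix as the n-fold convolution of the one-cell offspring distribution, and push the start vector (0,1,0,...) through it repeatedly. The names `transition_rows` and `extinction_by_step`, the cap, and the step count are all assumptions chosen for illustration. Probability mass that grows past the cap is simply dropped, which amounts to treating those populations as never dying out.

```python
def transition_rows(max_pop):
    """Row n holds P(n -> m) for m = 0..3n, built by repeated convolution."""
    base = [0.25, 0.25, 0.25, 0.25]  # one cell leaves 0, 1, 2 or 3 cells
    rows = [[1.0]]                   # 0 cells always stay at 0 cells
    for _ in range(max_pop):
        prev = rows[-1]
        cur = [0.0] * (len(prev) + 3)
        for i, a in enumerate(prev):
            for j, b in enumerate(base):
                cur[i + j] += a * b
        rows.append(cur)
    return rows


def extinction_by_step(max_pop=40, steps=60):
    """First component of (0,1,0,...) * P^steps on the truncated state space."""
    rows = transition_rows(max_pop)
    dist = [0.0] * (max_pop + 1)
    dist[1] = 1.0                    # start with a single bacterium
    for _ in range(steps):
        new = [0.0] * (max_pop + 1)
        for n, mass in enumerate(dist):
            if mass == 0.0:
                continue
            # Mass landing above max_pop is dropped (assumed to survive forever).
            for m, pr in enumerate(rows[n][: max_pop + 1]):
                new[m] += mass * pr
        dist = new
    return dist[0]


if __name__ == "__main__":
    print(extinction_by_step())
```

Raising `max_pop` or `steps` is a simple way to check that the first component has stopped changing.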
 
Sorry, I was omitting the unknown coefficients at T3, where you start to get multiple paths, but T2 is correct, as there are either 1, 2 or 3 bacteria and they all have to go extinct

Thanks for the response, will play with it when I get some time
 
Seems to me it has to be p=1 for all dying out.
Each of the four possibilities per minute is subject to fluctuation.
Considering the maximum fluctuation cases for any minute:

- every cell splits into three
- every cell splits into two
- every cell survives without splitting
- every cell dies

The probabilities of these fluctuations within one particular arbitrary minute are equal, but their impacts on growth are not.
Advances in population by the first two, or maintenance of stasis by the third, may be somewhat recovered by subsequent lesser fluctuations of the fourth (where not all cells die), and the first three may enjoy arbitrary repetitions...
The occurrence of the maximum fourth is a total stop unrecoverable by the first three, and it only needs to happen once.
Given an indefinitely long period of time, it will.
 
Those extreme fluctuations are extremely rare, and their probability decreases with increasing population size. If you start with a population of size n, its probability of dying out goes to 0 as n → ∞.
"The occurrence of the maximum fourth is a total stop unrecoverable by the first three, and it only needs to happen once.
Given an indefinitely long period of time, it will."
The probability of this goes down faster than exponentially, giving a finite sum.

I simulated the problem in Python. I stopped a run when the population reached 0 ("dead"), 1 (back to the original state), more than 50 ("living"), or after 30 iterations ("unclear"). In 1 million samples, 397509 were living, 280790 were dead, and 0 were unclear. The others returned to 1 and were not counted. This gives a fraction of 41.4% dead populations, with an uncertainty of about 0.1%.

A population of 50 is basically immortal: it gets 75 ± 8 descendants, so the probability that it shrinks is tiny. If I consider only populations of >=500 as "living", the program runs longer, but the result stays the same.

A quick check (with 10-100 thousand populations each) for other starting values:
0: 100% dead (of course)
2: 18% dead
3: 7% dead
4: 2.8% dead
5: 1.2% dead
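
For anyone who wants to repeat this kind of check, here is a minimal Monte Carlo sketch. It is not the code used for the numbers above: the function name `estimate_extinction`, the seed, and the default parameters are assumptions for illustration, and it skips the "returned to 1" shortcut by simply running each population until it hits 0 or exceeds the cap.

```python
import random


def estimate_extinction(start=1, trials=20000, cap=50, max_steps=1000, seed=0):
    """Estimate the fraction of populations that die out.

    Each cell independently leaves 0, 1, 2 or 3 cells per minute, with
    probability 1/4 each. A run counts as "dead" at population 0 and as
    "living" once the population exceeds `cap` (an assumed threshold).
    Runs still undecided after `max_steps` minutes are left out.
    """
    rng = random.Random(seed)
    dead = 0
    living = 0
    for _ in range(trials):
        pop = start
        for _ in range(max_steps):
            if pop == 0 or pop > cap:
                break
            pop = sum(rng.randrange(4) for _ in range(pop))
        if pop == 0:
            dead += 1
        elif pop > cap:
            living += 1
    decided = dead + living
    return dead / decided


if __name__ == "__main__":
    for start in (1, 2, 3, 4, 5):
        print(start, estimate_extinction(start=start))
```

The statistical error shrinks roughly like one over the square root of `trials`, so more trials are needed to resolve the small fractions for larger starting populations.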
 
this was supposedly a job interview question for a quant job at Morgan Stanley

maybe "I would just monte carlo it" was the answer they were looking for?
 
Let p be the probability of dying. Then after one generation either you die, you stay at 1 guy, you get 2 guys (who both have to die off independently) or you get three guys (who have to die off independently). So
[itex]p = 0.25 + 0.25p + 0.25p^2 + 0.25p^3[/itex]
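
A quick numerical way to pick out the relevant solution, sketched under the standard branching-process assumption that the probability of being extinct by minute k obeys the same recursion starting from 0. Rearranged, the equation is p^3 + p^2 - 3p + 1 = (p - 1)(p^2 + 2p - 1) = 0, with roots 1 and -1 ± √2; iterating from 0 climbs to the smallest non-negative one. The names `offspring_pgf` and `extinction_probability` are made up for this illustration.

```python
import math


def offspring_pgf(s):
    """Right-hand side of the equation: 0.25 * (1 + s + s^2 + s^3)."""
    return 0.25 * (1.0 + s + s * s + s ** 3)


def extinction_probability(tol=1e-12, max_iter=10000):
    """Iterate p <- f(p) from p = 0 until successive values agree within tol."""
    p = 0.0  # probability of already being extinct at minute 0
    for _ in range(max_iter):
        nxt = offspring_pgf(p)
        if abs(nxt - p) < tol:
            return nxt
        p = nxt
    return p


if __name__ == "__main__":
    print(extinction_probability(), math.sqrt(2) - 1)
```

Starting the iteration at 0 matters: starting exactly at 1 would stay at the root p = 1.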
 
  • #10
Office_Shredder said:
(who both have to die off independently)
Very nice, I did not see that.
This gives about 41.42%, in agreement with my simulations.
 
  • #11
I just realized that that polynomial has p=1 as a root, so you have to prove separately that the line will not die off with probability 1. The proof isn't that bad... if [itex]P_n[/itex] is the population on the nth iteration, there is a nonzero probability that for really large n, [itex]P_n[/itex] is at least [itex]1.5^n[/itex] (because that's the expected value of [itex]P_n[/itex]). Now suppose that we have one of these really large populations.

[itex]P_{n+1}/P_n[/itex] is the average number of descendants per cell after one iteration. It has expected value 1.5 and variance [itex]1.25/P_n[/itex]. Therefore Chebyshev's inequality (Markov's inequality applied to the variance) tells us
[tex]Prob(P_{n+1}/P_n < 1.3) \leq \frac{c}{P_n}[/tex]
for some constant c. Now we can show that there is a non-zero probability that each [itex]P_{n+k}[/itex] is 1.3 times larger than the one that comes before it. Suppose n+k is the first iteration where this fails:
[tex]Prob(P_{n+k}/P_{n+k-1} <1.3 \ |\ P_{n+j}/P_{n+j-1} > 1.3\ \forall\ j<k ) \leq \frac{c}{P_{n+k-1}} \leq \frac{c}{(1.3)^{k-1}P_{n}}[/tex]

Which means that adding up the probability for each k being the first one
[tex]\sum_{k \geq 0} \frac{c}{(1.3)^{k-1}P_{n}} = \frac{1.3c}{P_n} \frac{1}{1-1/(1.3)}[/tex]
which is smaller than 1 if [itex]P_n[/itex] is large enough. So there is a nonzero probability that this population increases by a factor of 1.3 for EVERY subsequent iteration
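
To put rough numbers on "large enough", here is a small sketch. It assumes the constant c is the one Chebyshev's inequality gives for a deviation of 1.5 - 1.3 = 0.2 from the mean, i.e. c = 1.25 / 0.2^2; the post itself leaves c unspecified, and the function names are made up for illustration.

```python
import math

MEAN = 1.5      # expected descendants per cell
VARIANCE = 1.25 # variance of descendants per cell
GROWTH = 1.3    # growth factor required at every step


def chebyshev_constant():
    """Assumed c: P(ratio < 1.3) <= VARIANCE / (P_n * (MEAN - GROWTH)^2)."""
    return VARIANCE / (MEAN - GROWTH) ** 2


def failure_bound(pop):
    """Summed bound 1.3 * c / P_n * 1 / (1 - 1/1.3) on ever failing to grow."""
    return GROWTH * chebyshev_constant() / pop / (1.0 - 1.0 / GROWTH)


def min_population():
    """Smallest P_n for which the summed bound drops below 1."""
    threshold = GROWTH * chebyshev_constant() / (1.0 - 1.0 / GROWTH)
    return math.floor(threshold) + 1


if __name__ == "__main__":
    n = min_population()
    print(n, failure_bound(n))
```

Chebyshev is a crude bound, so the population size at which survival becomes likely in practice is much smaller than what this estimate suggests.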
 
  • #12
cool, I could not see on the site where I found the problem where the √2-1 answer posted came from
 
