Mathematica Gillespie Algorithm (Monte Carlo Simulation) for simple process in Mathematica

AI Thread Summary
The discussion centers on modeling a birth and death process in Mathematica using the Gillespie Algorithm, focusing on the transcription of DNA to mRNA and its subsequent degradation. The user finds discrepancies between deterministic and stochastic results, particularly when using small initial DNA numbers, leading to a steady state mRNA number of 2.3 instead of the expected 1.73. In contrast, larger initial conditions yield matching results, suggesting that the stochastic model's accuracy improves with higher DNA counts. The conversation highlights the challenges of applying continuous derivatives to a discrete process, especially with low population sizes, which can skew results. Overall, the forum emphasizes the need for careful consideration of initial conditions and sampling in stochastic simulations.
Lucid Dreamer
Messages
25
Reaction score
0
I am trying to model a simple birth and death process in Mathematica using the Gillespie Algorithm.

I am using 1 DNA molecule that is transcribed to mRNA with rate k1,
\mbox{DNA} \longrightarrow \mbox{DNA + RNA}
and the transcribed RNA are subject to degradation with rate k2,
\mbox{RNA} \longrightarrow \mbox{0}
I expect my DNA numbers to remain the same (since DNA is not being produced or degraded), and the steady state RNA number occurs when \frac{\partial (RNA)}{\partial t}= 0
for,
\frac{\partial (RNA)}{\partial t} = \mbox{(# of DNA)k1 - (# of RNA)k2}
Thus, deterministically, the steady state RNA number is given by \mbox{(# of DNA)}\frac{\mbox{k1}}{\mbox{k2}}
Using the initial conditions # of DNA = 1, k1 = 0.01, k2 = 0.00577, I expect to get a steady state mRNA number of 1.73

I model this process stochastically in Mathematica, but my steady state numbers for the deterministic and stochastic solution do not match. Using the gillespie algorithm, my steady state value is close to 2.3.

The funny thing is that if I start my code with the initial condition # of DNA = 50, k1= 0.01, k2= 0.00577, my deterministic and stochastic solution mach perfectly (in this regime, I expect steady mRNA to be 86.6)

So when I choose small numbers for my initial condition, the deterministic and stochastic solution do not match, but if I choose larger initial conditions, then there is a perfect match between the deterministic and stochastic solution.

Here is the code I used. Any help is appreciated (both in solving this problem and in making the code more efficient)

Code:
Clear[dna0, mRNA0, protein0, time0, dna, mRNA, protein, time, ndna, 
nmRNA, nprotein, ntime, k1, k2, k3, k4, b, n, m, concTime, 
concTimeAvg, rand, randexp]
(*dna -> dna + nMRNA [k1]*)
(*mRNA -> 0 [k2]*)
time0 = 0(*Initial Time*);
dna0 = 1(*Initial number of DNA molecules*);
mRNA0 = 0(*Initial number of mRNA molecules*);
n = 100(*Number of iterations*);
m = 1000(*Number of runs*);
k1 = 0.01(*mRNA production rate*);
k2 = 0.00577(*mRNA degredation rate*);

(*Single iteration*)
concentration[nd_, nm_, t0_, n_] := NestList[
  (
    a1 = #[[1]]*k1;
    a2 = #[[2]]*k2;
    a0 = a1 + a2;
    randexp = 
     a0 // ExponentialDistribution // 
      RandomVariate;(*step time forward*)
    rand = RandomReal[];
    If[rand <= a1/a0,
     {#[[1]], #[[2]] + 1, #[[3]] + 
       randexp},(*add one mRNA molecule*)
     {#[[1]], #[[2]] - 1, #[[3]] + 
       randexp}(*subtract one mRNA molecule*)]
    ) &,
  {nd, nm, t0}, n]

{dna, mRNA, time} = Transpose[concentration[dna0, mRNA0, time0, n]];

(*Average over m runs*)
concentrationAverage[nd_, nm_, t0_, n_, m_] := 
 Table[concentration[nd, nm, t0, n], {m}]
{dnaAverage, mRNAAverage, timeAverage} = 
  concentrationAverage[dna0, mRNA0, time0, n, m] // N // Mean // 
   Transpose;
ListLinePlot[Transpose[{timeAverage, dnaAverage}]]
ListLinePlot[Transpose[{timeAverage, mRNAAverage}], PlotRange -> All]
 
Physics news on Phys.org
LucidDreamer,

try to weight your averages by Delta t at each simulation step.
 
The reason you aren't getting a match when using small numbers is because the system will then require more sampling, say... infinity? Well, not infinity... but in theory you always need an infinite number of calculations to get true convergence, so you should consider running more calculations to see if you get convergence. I also do monte carlo simulations and the general rule is always do more computing if you aren't sure of the results.
 
Hi Lucid Dreamer, welcome to PF!

I ran your code and reproduced your results. It's not a matter of more sampling. It's a matter of applying (continuous) derivatives to try to reproduce a discrete process. When you have small DNA values, the system keeps bumping against mRNA=0, which it can't go below. As a result, the distribution of final values is strongly right-skewed and the mean is larger than one would expect by solving the continuous equation. As the DNA value increases, the approximation of a continuous system gets better and better.

The problem of small numbers is why Gillespie developed his algorithm in the first place!
 
Back
Top