Hi,

I'm really close to deriving something that many biology textbooks like to skip: the probability of a certain DNA molecule (base) remaining the same while it suffers changes both to and from the other 3 types of DNA molecules.

This sounds specifically biological, but as a good many of you probably know, it applies to plenty of other situations: migration in and out of cities, inflow and outflow of liquids, growth and decay, and lots more besides. DNA substitution is a nice case to treat in its own right, though.

However, I am stuck at a certain point, mainly that I am not getting the right answer, and I am definitely missing something, so if anybody could shine any light, especially in relation to my missteps, I'd appreciate it.

Ok, let's start. We have a DNA site which happens to be a "C", a cytosine molecule. Over time, the site can change into any one of the other three DNA bases, "A", "G", or "T". We want to calculate [itex]P_c(t)[/itex], which is the probability of the site being a "C" at time [itex]t[/itex].

Now, let's characterise a rate that we can use. We will say it is instantaneous, which is somewhat disconcerting in reality, because we'd be unable to count changes in a split instant of time. But we're in the world of theory right now, so we can simply give it a symbol, [itex]\alpha[/itex]: the rate of change from one base to any one particular other base. I already know the solution, and it makes perfect sense: it's

[itex]P_c(t)=\frac{1}{4} +\frac{3}{4} e^{-4 \alpha t}[/itex]

It makes sense, because [itex]P_c(t)[/itex] will eventually settle to a steady state of [itex]\frac{1}{4}[/itex], while the [itex]\frac{3}{4}[/itex] part, which dominates at the beginning, decays exponentially with time. But I want to derive this from (fairly) first principles.

So, we're going to observe a site for a certain unit of time, and we say we'll start at [itex]t = 0[/itex].

So we start at the beginning with our "C". What's [itex]P_c(t)[/itex] at [itex]t= 0[/itex]? Easy one:

[itex]P_c(0)=1[/itex]

Great, let's move on to our second time unit.

[itex]P_c(1)=1-3\alpha[/itex]

i.e. during this time, there has been movement away from "C" to each of the other three bases, each at rate [itex]\alpha[/itex]. Let's move on:

[itex]P_c(2) = (1-3 \alpha) P_c(1) + \alpha (1-P_c(1))[/itex]

Here we get the two main phenomena at work: we continue to lose chances of staying at "C" by [itex]3 \alpha[/itex], but the bases that now are not "C" will also suffer a rate of change back into "C" again. This second step allows us to generalise to any time point [itex]t[/itex] and a time advance [itex]\triangle t[/itex]. We must replace our [itex]\alpha[/itex] with [itex]\alpha \triangle t[/itex] to get

[itex]P_c(t + \triangle t) = (1-3 \alpha \triangle t) P_c(t) + \alpha \triangle t (1-P_c(t))[/itex]
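As a sanity check on this recurrence, here is a minimal sketch that simply iterates it from [itex]P_c(0)=1[/itex] with smaller and smaller time steps and prints the result next to the known solution quoted earlier. The function names, and the values of [itex]\alpha[/itex] and the end time, are just my own picks for illustration.

```python
import math

def iterate_recurrence(alpha, t_end, steps):
    """Iterate P(t+dt) = (1 - 3*alpha*dt)*P(t) + alpha*dt*(1 - P(t)) from P(0) = 1."""
    dt = t_end / steps
    p = 1.0  # the site starts as a "C" with certainty
    for _ in range(steps):
        p = (1 - 3 * alpha * dt) * p + alpha * dt * (1 - p)
    return p

def known_solution(alpha, t):
    """The textbook result: P_c(t) = 1/4 + 3/4 * exp(-4*alpha*t)."""
    return 0.25 + 0.75 * math.exp(-4 * alpha * t)

# alpha and t_end are arbitrary illustrative values
alpha, t_end = 0.1, 5.0
for steps in (10, 100, 10000):
    print(steps, iterate_recurrence(alpha, t_end, steps), known_solution(alpha, t_end))
```

If the recurrence is set up correctly, the iterated value should approach the known solution as the number of steps grows, which would suggest that any mistake lies later on, in the integration.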

which is

[itex]P_c(t + \triangle t) = P_c(t) - 4 \alpha \triangle t P_c(t) + \alpha \triangle t[/itex]

and also

[itex]\frac{P_c(t + \triangle t) - P_c(t)}{\triangle t} = \alpha - 4 \alpha P_c(t)[/itex]

So taking limits we get

[itex]\frac{d P_c(t)}{dt} = \alpha ( 1 - 4 P_c(t))[/itex]
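A quick way to check this differential equation against the known solution is to estimate the derivative numerically and compare it with the right-hand side. This is only a sketch: the helper names and the step size `h` are my own assumptions.

```python
import math

def known_solution(alpha, t):
    """The textbook result: P_c(t) = 1/4 + 3/4 * exp(-4*alpha*t)."""
    return 0.25 + 0.75 * math.exp(-4 * alpha * t)

def ode_residual(alpha, t, h=1e-6):
    """Central-difference estimate of dP/dt minus alpha*(1 - 4*P); near 0 if the ODE holds."""
    dpdt = (known_solution(alpha, t + h) - known_solution(alpha, t - h)) / (2 * h)
    return dpdt - alpha * (1 - 4 * known_solution(alpha, t))

# alpha = 0.1 is an arbitrary illustrative value
for t in (0.0, 1.0, 10.0):
    print(t, ode_residual(0.1, t))
```

The residuals should come out at the level of rounding error, so the known solution does appear to satisfy this equation together with [itex]P_c(0)=1[/itex].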

And we can start gearing up to do an integration as we re-arrange:

[itex]\frac{d P_c(t)}{1 - 4 P_c(t)} = \alpha dt [/itex]

At this point, I'm going to drop the pseudo-authoritative tone, which I guess you noticed, and admit that I'm getting unsteady, because I'm unsure whether to go for an indefinite integral or start applying some limits. However, there is time for a substitution before I make that nasty decision:

Our integral:

[itex]\int \frac{d P_c(t)}{1 - 4 P_c(t)} = \int \alpha dt [/itex]

Let [itex]Q = 1 -4 P_c(t)[/itex], so that

[itex]dQ = -4 dP_c(t)[/itex]

and [itex]dP_c(t) = - \frac{dQ}{4}[/itex]

which we apply to the above integral to get

[itex]- \frac{1}{4} \int \frac{dQ}{Q} = \int \alpha dt [/itex]

OK, time to do the dirty. I'm going to go for indefinite integrals and combine constant terms.

[itex]\ln |Q| = - 4 \alpha t + C [/itex]

which is

[itex]\ln |1-4 P_c(t)| = - 4 \alpha t + C [/itex]

So,

[itex]|1-4 P_c(t)| = e^{-4 \alpha t} + C [/itex]

and I move awkwardly ahead

[itex]4 P_c(t) = 1 - e^{-4 \alpha t} + C [/itex]

Initial condition:

[itex]P_c(0) = 1[/itex], so [itex]C = 4[/itex]

And so,

[itex]P_c(t) = \frac{1}{4} - \frac{1}{4} e^{-4 \alpha t} + 1 [/itex]

Which is not the right answer, though it would be if that final 1 was also multiplied by

[itex]e^{-4 \alpha t}[/itex]

But my suspect way of integrating has not given this.
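To make the mismatch concrete, here is a small sketch that evaluates my final expression next to the known solution at a few times. The function names and the choice of [itex]\alpha[/itex] are assumptions for illustration only.

```python
import math

def known_solution(alpha, t):
    """The textbook result: P_c(t) = 1/4 + 3/4 * exp(-4*alpha*t)."""
    return 0.25 + 0.75 * math.exp(-4 * alpha * t)

def derived_expression(alpha, t):
    """My result from above: P_c(t) = 1/4 - 1/4 * exp(-4*alpha*t) + 1."""
    return 0.25 - 0.25 * math.exp(-4 * alpha * t) + 1

# alpha = 0.1 is an arbitrary illustrative value
alpha = 0.1
for t in (0, 1, 10, 100):
    print(t, derived_expression(alpha, t), known_solution(alpha, t))
```

Both expressions equal 1 at [itex]t = 0[/itex], but mine climbs towards [itex]\frac{5}{4}[/itex] for large [itex]t[/itex], which cannot be a probability, whereas the known solution settles at [itex]\frac{1}{4}[/itex].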

Which leads me to my humble request for guidance on this.

Many thanks in advance!
