How can I compute Markov transition probabilities from given data?

In summary, the results I got were: [1] 49903, [2] 22974, [3] 0.4608319. I'm not sure if this is the correct approach, but it seems to make sense to me.
  • #1
user366312
TL;DR Summary
I know how to calculate the probabilities of various Markov chain events mathematically.
But how can I estimate Markov transition probabilities from data?
Suppose, I have a Markov Chain as follows:

##S=\{1, 2\}##

##
\alpha = \begin{bmatrix}
0.5&0.5\end{bmatrix}
##

##
P =
\begin{bmatrix}
0.5&0.5\\
0&1
\end{bmatrix}
##

And the following data, from 12 runs of the Markov chain, each of 5 steps:
Markov Chain:
     Run     Step 1   Step 2   Step 3   Step 4   Step 5
     --------------------------------------------------
       1        2        2        1        1        1
       2        1        1        1        1        1
       3        1        1        1        1        1
       4        1        1        1        1        1
       5        2        2        2        1        1
       6        2        2        2        1        1
       7        1        1        1        1        1
       8        1        1        1        1        1
       9        1        1        1        1        1
      10        2        2        2        2        2
      11        1        1        1        1        1
      12        1        1        1        1        1

How can I calculate, say, ##P(X_1 = 1|X_0 = 1)## and ##P(X_5 = 2|X_2 = 1)## from this table?

Kindly explain your answer.
 
  • #2
In the data, count the number of transitions from one state directly to another (or to itself). That should give you probabilities for a transition matrix. From that, the rest follows. The data you show doesn't seem to match the probability matrix, P. Maybe there is not enough data, but the difference is large.
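As an illustration of that counting procedure applied to the 12 runs in the table, here is a sketch in Python (the thread's own code is R; the names `runs` and `p_hat` are mine):

```python
from collections import Counter

# The 12 observed runs of 5 steps from the table above
runs = [
    [2, 2, 1, 1, 1],
    [1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1],
    [2, 2, 2, 1, 1],
    [2, 2, 2, 1, 1],
    [1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1],
    [2, 2, 2, 2, 2],
    [1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1],
]

# Count every consecutive pair (state_t, state_{t+1}) across all runs
counts = Counter()
for run in runs:
    for a, b in zip(run, run[1:]):
        counts[(a, b)] += 1

def p_hat(a, b):
    """Estimated transition probability: transitions a->b
    divided by all observed transitions out of a."""
    total = sum(n for (s, _), n in counts.items() if s == a)
    return counts[(a, b)] / total

print(counts[(1, 1)], counts[(2, 1)], counts[(2, 2)])  # 36 3 9
print(p_hat(2, 1))  # 3/12 = 0.25
```

Note that these counts (9 transitions 2->2 versus 3 transitions 2->1, and no 1->2 at all) are exactly the mismatch with the stated matrix P that is discussed below.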

I don't understand why you are asking about ##P(X_1=1|X_0=1)##. Are you sure that you really understand the transition matrix, P?
 
  • #3
FactChecker said:
In the data, count the number of transitions from one state directly to another (or to itself). That should give you probabilities for a transition matrix. From that, the rest follows. The data you show doesn't seem to match the probability matrix, P. Maybe there is not enough data, but the difference is large.

I used the following code:

R code:
alpha <- c(1, 1) / 2
mat <- matrix(c(1 / 2, 0, 1 / 2, 1), byrow=TRUE, nrow = 2, ncol = 2) # Different than yours

chainSim <- function(alpha, mat, n)
{
    out <- numeric(n) # create a numeric vector of size-n
    out[1] <- sample(1:2, 1, prob = alpha) # create 1 random number
    for(i in 2:n)
    {
        #set.seed(1)
        out[i] <- sample(1:2, 1, prob = mat[out[i - 1], ])
    }
    out
}

set.seed(1)
# Doing once
#sss <- chainSim(alpha, mat, 1 + 5)
#sss
# [1] 2 2 2 2 2 2

# Doing 12 times
sim <- replicate(chainSim(alpha, mat, 1 + 5), n = 12)
t(sim)
# rowMeans(sim - 1)

# mean(sim[2, sim[1, ] == 1] == 1)
# [1] 0.4583333
FactChecker said:
I don't understand why you are asking about ##P(X_1=1|X_0=1)##.

Because that is what I was asked in my class to find from the simulation.

FactChecker said:
Are you sure that you really understand the transition matrix, P?
I think so, since I passed the midterm test.
 
  • #4
I don't see anything wrong with the code, but your results have many more transitions from 2 to 2 than from 2 to 1 (9 versus 3). I thought those two transitions should be equally probable. I have tried a couple of seeds, and no seed at all, and keep getting more 2->2 than 2->1 transitions. I don't understand that; I will have to defer to anyone who understands this better.

CORRECTION: From the data, I thought that 1 was supposed to be the absorbing state. That is wrong. From the definition of P, 2 is the absorbing state and 1 transitions to 1 and 2 with equal probabilities.
 
  • #5
I made a couple of 1000-run simulations (using different seeds) and keep getting about twice as many 2->2 transitions as 2->1 transitions. So there is definitely something that I don't understand.

CORRECTION: From the data, I thought that 1 was supposed to be the absorbing state. That is wrong. From the definition of P, 2 is the absorbing state and 1 transitions to 1 and 2 with equal probabilities.
 
  • #6
I think that your code defining mat is wrong: mat <- matrix(c(1 / 2, 0, 1 / 2, 1), byrow=TRUE, nrow = 2, ncol = 2)
With byrow=TRUE, that vector fills the matrix as rows [0.5, 0] and [0.5, 1], so chainSim draws from the wrong probabilities: state 2 uses [0.5, 1.0] where it should use [0, 1], and state 1 uses [0.5, 0] where it should use [0.5, 0.5].
I changed it to mat <- matrix(c(1 / 2, 1/2, 0, 1), byrow=TRUE, nrow = 2, ncol = 2) and got transition results that make sense: 2 is an absorbing state, and the transitions 1->1 and 1->2 are equally probable. Looking at your code, that seems to be what you intended.
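One quick way to check the corrected matrix is to rerun the simulation and look at the transition frequencies. Here is a minimal sketch of the same procedure in Python (the R code above is the source of truth; `chain_sim` and the run count here are my choices):

```python
import random
from collections import Counter

# Corrected transition probabilities: state 2 is absorbing,
# state 1 goes to 1 or 2 with equal probability.
P = {1: {1: 0.5, 2: 0.5},
     2: {2: 1.0}}

def chain_sim(n_steps, rng):
    """Simulate n_steps states of the chain, with X0 uniform on {1, 2}."""
    out = [rng.choice([1, 2])]
    for _ in range(n_steps - 1):
        states = list(P[out[-1]].keys())
        probs = list(P[out[-1]].values())
        out.append(rng.choices(states, weights=probs)[0])
    return out

rng = random.Random(1)
sims = [chain_sim(6, rng) for _ in range(100_000)]

# Count one-step transitions: 2->1 should now never occur,
# and 1->1 vs 1->2 should be roughly 50/50.
trans = Counter((a, b) for s in sims for a, b in zip(s, s[1:]))
print(trans[(2, 1)])  # 0
print(trans[(1, 1)] / (trans[(1, 1)] + trans[(1, 2)]))  # roughly 0.5
```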
 
  • #7
FactChecker said:
..., that seems to be what you intended.

Okay. Thanks.

Could you now shed some light on finding ##P(X_1=1|X_0=1)##?

The exact question was like the following:

Estimate ##P(X_1 = 1|X_0 = 1)## from simulation. Compare your result with the exact probability you calculated.
 
  • #8
user366312 said:
Okay. Thanks.

Could you now shed some light on finding ##P(X_1=1|X_0=1)##?

The exact question was like the following:
To do that, I recommend greatly increasing the number of runs. Then count how many runs have initial state 1, #initialState1. Among those, also count how many have 1 as the second state, #secondState1. The estimated probability is #secondState1 / #initialState1.

I set the R code to do 100000 runs and used this code at the end to do the counting:

R code:
numInitial = which(sim[1, ] == 1)          # runs whose initial state is 1
N1 = length(numInitial)
print(N1)
numSecond = which(sim[2, numInitial] == 1) # of those, runs whose second state is also 1
N2 = length(numSecond)
print(N2)
prob = N2 / N1                             # estimate of P(X1 = 1 | X0 = 1)
print(prob)
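For the comparison the assignment asks for, the exact values follow from powers of the transition matrix. A minimal sketch (Python with numpy used here for illustration; state 1 is mapped to index 0):

```python
import numpy as np

# Transition matrix from the problem statement (state 1 -> index 0, state 2 -> index 1)
P = np.array([[0.5, 0.5],
              [0.0, 1.0]])

# P(X1 = 1 | X0 = 1) is simply the (1,1) entry of P
p_one_step = P[0, 0]

# P(X5 = 2 | X2 = 1) is the (1,2) entry of P^3: three steps,
# using the Chapman-Kolmogorov equation for a time-homogeneous chain
p_three_step = np.linalg.matrix_power(P, 3)[0, 1]

print(p_one_step)    # 0.5 -- the simulated estimate above should be close to this
print(p_three_step)  # 0.875
```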
 

1. How do I define a Markov chain?

A Markov chain is a mathematical model that describes a sequence of events where the probability of transitioning from one state to another depends only on the current state and not on the previous states.

2. What data do I need to compute Markov transition probabilities?

You will need a dataset that records the sequence of states (or categories) of interest over time, so that the transitions between states can be counted.

3. How do I calculate the transition probabilities?

The transition probabilities can be calculated by dividing the number of transitions from one state to another by the total number of transitions from that state.

4. Can I use any type of data to compute Markov transition probabilities?

Markov transition probabilities can be computed from various types of data, such as time series data, survey data, or any other type of data that includes categorical variables and their transitions.

5. What are some common applications of Markov transition probabilities?

Markov transition probabilities are commonly used in fields such as finance, economics, biology, and engineering to model and predict the behavior of complex systems. They can also be used in natural language processing, speech recognition, and other areas of artificial intelligence.
