How can I compute Markov transition probabilities from given data?

user366312 · May 8, 2019

Suppose, I have a Markov Chain as follows:

##S=\{1, 2\}##

##
\alpha = \begin{bmatrix}
0.5&0.5\end{bmatrix}
##

##
P =
\begin{bmatrix}
0.5&0.5\\
0&1
\end{bmatrix}
##

And, the following data regarding 5 steps of Markov Chain taken 12 times:
[CODE title="Markov Chain"] Steps 1 2 3 4 5
--------------------------------
1 2 2 1 1 1
2 1 1 1 1 1
3 1 1 1 1 1
4 1 1 1 1 1
5 2 2 2 1 1
6 2 2 2 1 1
7 1 1 1 1 1
8 1 1 1 1 1
9 1 1 1 1 1
10 2 2 2 2 2
11 1 1 1 1 1
12 1 1 1 1 1[/CODE]

How can I calculate,say, ##P(X_1 = 1|X_0 = 1)## and ##P(X_5 = 2|X_2 = 1)## from this table?

Kindly, explain your answer.

FactChecker · May 9, 2019

In the data, count the number of transitions from one state directly to another (or to itself). That should give you probabilities for a transition matrix. From that, the rest follows. The data you show doesn't seem to match the probability matrix, P. Maybe there is not enough data, but the difference is large.

I don't understand why you are asking about ##P(X_1=1|X_0=1)##. Are you sure that you really understand the transition matrix, P?

user366312 · May 9, 2019

FactChecker said:

In the data, count the number of transitions from one state directly to another (or to itself). That should give you probabilities for a transition matrix. From that, the rest follows. The data you show doesn't seem to match the probability matrix, P. Maybe there is not enough data, but the difference is large.

I used the following code:

[CODE title="R code" highlight="16,23,24"]alpha <- c(1, 1) / 2
mat <- matrix(c(1 / 2, 0, 1 / 2, 1), byrow=TRUE, nrow = 2, ncol = 2) # Different than yours

chainSim <- function(alpha, mat, n)
{
out <- numeric(n) # create a numeric vector of size-n
out[1] <- sample(1:2, 1, prob = alpha) # create 1 random number
for(i in 2:n)
{
#set.seed(1)
out <- sample(1:2, 1, prob = mat[out[i - 1], ])
}
out
}

set.seed(1)
# Doing once
#sss <- chainSim(alpha, mat, 1 + 5)
#sss
# [1] 2 2 2 2 2 2

# Doing 100 times
sim <- replicate(chainSim(alpha, mat, 1 + 5), n = 12)
t(sim)
# rowMeans(sim - 1)

# mean(sim[2, sim[1, ] == 1] == 1)
# [1] 0.4583333[/CODE]

FactChecker said:

I don't understand why you are asking about ##P(X_1=1|X_0=1)##.

Coz, that is what I was told in my class to find from the simulation.

FactChecker said:

Are you sure that you really understand the transition matrix, P?

I think so. Coz, I passed the MidTerm test.

FactChecker · May 9, 2019

I don't see anything wrong with the code, but your results have many more transitions from 2 to 2 than I get from 2 to one (9 versus 3). I thought those two transitions should have equal probability. I have tried a couple of seeds and no seed at all and seem to keep getting more transitions 2 -> 2 than 2->1. I don't understand that. I will have to defer to anyone who understands this better.

CORRECTION: From the data, I thought that 1 was supposed to be the absorbing state. That is wrong. From the definition of P, 2 is the absorbing state and 1 transitions to 1 and 2 with equal probabilities.

FactChecker · May 9, 2019

I made a couple of 1000 simulation runs (using different seeds) and keep getting about twice as many 2->2 transitions than 2->1 transitions. So there is definitely something that I don't understand.

CORRECTION: From the data, I thought that 1 was supposed to be the absorbing state. That is wrong. From the definition of P, 2 is the absorbing state and 1 transitions to 1 and 2 with equal probabilities.

FactChecker · May 9, 2019

I think that your code defining mat is wrong: mat <- matrix(c(1 / 2, 0, 1 / 2, 1), byrow=TRUE, nrow = 2, ncol = 2)
That gives probabilities [0.5, 1.0] in chainSim where it should be using [0.5, 0.5].
I changed it to mat <- matrix(c(1 / 2, 1/2, 0, 1), byrow=TRUE, nrow = 2, ncol = 2) and got transition results that make sense. 2 is an absorbing state and the transitions 1->1 and 1->2 are equally probable. Looking at your code, that seems to be what you intended.

user366312 · May 9, 2019

FactChecker said:

..., that seems to be what you intended.

Okay. Thanks.

Could you now shed some light on finding ##P(X_1=1|X_0=1)##?

The exact question was like the following:

Estimate P(X1 = 1|X0 = 1) from simulation. Compare your result with the exact probability you calculated.

FactChecker · May 9, 2019

user366312 said:

Okay. Thanks.

Could you now shed some light on finding ##P(X_1=1|X_0=1)##?

The exact question was like the following:

To do that, I recommend greatly increasing the number of runs. Then count how many times the initial state is 1, #initialState1. In those cases, also count how many runs have the second state of 1, #secondState1. the estimated probability would be #secondState1/#initialState1

I set the R code to do 100000 runs and used this code at the end to do the counting:

[CODE title="R code"]
numInitial=which(sim[1,]==1)
N1= length(numInitial)
print(N1)
numSecond = which(sim[2,numInitial]==1)
N2 = length(numSecond)
print(N2)
prob = N2/N1
print(prob)
[/CODE]

How can I compute Markov transition probabilities from given data?

Discussion Overview

Discussion Character

Main Points Raised

Areas of Agreement / Disagreement

Contextual Notes

Similar threads

Graduate Hypothesis testing: Defining H0, HA hypotheses so that ( H_A)_A' makes sense

Undergrad My basic understanding of set theory

Undergrad The problem of points

Graduate Expected numbers of cards of a last color remaining

Undergrad How does axiom of foundation prevent infinite sequence of elements?

Insights Revisiting the Velocity-Time Function

Insights Remote Operated Gate Control System

Insights AI Enriched Problem Solving

Insights Thinking Outside The Box Versus Knowing What’s In The Box

Insights Why Entangled Photon-Polarization Qubits Violate Bell’s Inequality

Insights Quantum Entanglement is a Kinematic Fact, not a Dynamical Effect