Question about transition matrix of Markov chain

Discussion Overview

The discussion revolves around the conventions used for the transition matrix in Markov chains, specifically regarding the orientation of states in the matrix and the implications for calculations. Participants explore the differences between two proposed conventions and their effects on matrix multiplication.

Discussion Character

  • Debate/contested

Main Points Raised

  • One participant notes that their teacher's note states the column part of the transition matrix represents the current state and the row part represents the future state, leading to the sum of each column equaling 1.
  • Another source suggests the opposite, where the row part represents the current state and the column part represents the future state, resulting in the sum of each row equaling 1.
  • Some participants propose that the differences arise from conventions regarding whether to multiply the transition matrix with a column vector on the right or a row vector on the left.
  • A later reply confirms a preference for using the first convention (matrix A) when multiplying the transition matrix with a column vector on the right.
  • One participant introduces a related issue regarding the iteration of Markov chains and the use of adjoint methods, asking whether such methods have been applied to Markov processes.

Areas of Agreement / Disagreement

Participants generally agree that the differences in matrix orientation are a matter of convention, but no consensus is reached on which convention is universally correct.

Contextual Notes

The discussion highlights the potential for confusion arising from different conventions and the importance of consistency in applying these conventions within calculations.

songoku
TL;DR
A transition matrix is a matrix that gives the probability of moving to a future state from a given current state.
The note I got from my teacher states that, in a transition matrix, the columns correspond to the current state and the rows to the future state (call this matrix A), so the sum of each column must equal 1. But I read from another source that the rows are the current state and the columns are the future state (call this matrix B), so the sum of each row equals 1. Matrix B is the transpose of matrix A, but when I multiply each of them by the same vector (the current distribution over states), I get different results.
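A minimal numpy sketch reproduces the discrepancy, using a hypothetical 2-state chain (the numbers are made up purely for illustration):

```python
import numpy as np

# Hypothetical 2-state chain, "matrix A" convention:
# A[i, j] = P(next state = i | current state = j), so each COLUMN sums to 1.
A = np.array([[0.9, 0.5],
              [0.1, 0.5]])

# "Matrix B" convention: B[i, j] = P(next = j | current = i), rows sum to 1.
B = A.T

p = np.array([0.7, 0.3])  # current distribution over the two states

print(A @ p)  # [0.78, 0.22] -- a valid next-step distribution
print(B @ p)  # [0.66, 0.50] -- not even a distribution (sums to 1.16)
```

The second result is wrong because matrix B expects a row vector multiplied on its left, not a column vector on its right; applying both matrices to the same column vector mixes the conventions.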

https://www.dartmouth.edu/~chance/teaching_aids/books_articles/probability_book/Chapter11.pdf

The first link states that the rows are the current state and the columns are the future state.

https://www.math.ucdavis.edu/~dadde...tions/MarkovChain/MarkovChain_9_18/node1.html

The second link states that the columns are the current state and the rows are the future state.

So which one is correct, matrix A or matrix B? Or maybe I am missing something?

Thanks
 
It's a matter of convention. Do you prefer to multiply the transition matrix with a column vector on the right or with a row vector on the left?
It's unfortunate that there are places where people didn't agree on a single convention, but as long as you keep the convention consistent within your work it will give the right result.
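This can be checked numerically: used consistently, both conventions give the same next-step distribution (again with a hypothetical 2-state chain for illustration):

```python
import numpy as np

# Column convention: columns sum to 1, update is p' = A @ p (column vector on the right).
A = np.array([[0.9, 0.5],
              [0.1, 0.5]])

# Row convention: rows sum to 1, update is p' = p @ B (row vector on the left).
B = A.T

p = np.array([0.7, 0.3])  # current distribution

# Same distribution either way, as long as each convention is applied consistently.
assert np.allclose(A @ p, p @ B)
```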
 
mfb said:
It's a matter of convention. Do you prefer to multiply the transition matrix with a column vector on the right or with a row vector on the left?
Oh I see. I prefer to multiply the transition matrix by a column vector on its right side, so the one I should use is matrix A, correct?

Thanks
 
Right.
 
Thank you very much mfb
 
This brings up a related issue. One can iterate a Markov chain $p(i,t+1)=\sum_j T_{i,j} p(j,t)$ from $t=0$ to $t=N$, i.e. in vector form $p(t+1)=T p(t)$, and then make the measurement $c=(q(N),p(N))$, where $(\cdot , \cdot)$ is the $l^2$ inner product. Or you could advance the measurement vector $q(N)$ *backwards* by the transpose $T^T$ of the transition matrix $T$, and then take the inner product at $t=0$. This is a basic adjoint method. Have such adjoint methods been used in Markov processes?
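The equivalence of the two orderings follows from $q^T T^N p(0) = \big((T^T)^N q\big)^T p(0)$, and is easy to verify numerically. A sketch with a hypothetical random column-stochastic chain (all values generated just for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n, N = 3, 5  # hypothetical: 3 states, 5 time steps

# Random column-stochastic transition matrix T (columns sum to 1).
T = rng.random((n, n))
T /= T.sum(axis=0)

p0 = rng.random(n)
p0 /= p0.sum()        # initial distribution p(0)
q = rng.random(n)     # measurement vector

# Forward: advance p to t = N, then measure.
p = p0.copy()
for _ in range(N):
    p = T @ p
c_forward = q @ p

# Adjoint: advance q backwards with T^T, then measure at t = 0.
qb = q.copy()
for _ in range(N):
    qb = T.T @ qb
c_adjoint = qb @ p0

# Both orderings give the same measurement.
assert np.isclose(c_forward, c_adjoint)
```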
 
