Probability of Hall in H after John's Second Play: Markov Matrix Solution

Thread starter: Aleoa
Tags: Build Matrix
Mentor note: Thread moved from Mathematics section, so is missing the HW template
John and Eli are playing a game with a hall that can roll into one of two pockets labelled H and Y. John wants to keep the hall in H and Eli wants to keep it in Y. When it is John's turn to play, if he finds the hall in H that is fine with him and he does nothing; but if he finds it in Y he attempts to roll it into pocket H. This takes some skill; the probability that he succeeds is 2/3, there being a 1/3 chance that the hall will roll back into Y. When Eli's turn comes, he does nothing if the hall is in Y, but tries to get it there if he finds it in H. Eli is less skillful than John and his probability of succeeding in his effort is only 1/2.

I built this 2x2 matrix:

$$A=\begin{bmatrix} \frac{1}{3} & \frac{1}{2}\\ \frac{2}{3} & \frac{1}{2} \end{bmatrix}$$
The first column is for state Y and the second for state H.

But when I read the first exercise that uses this matrix, it seems the matrix is not properly built:

(a) Starting with the hall in Y and John to play, what is the probability that the hall will be in H after John's second play?


What is the correct Markov matrix to represent this scenario?
 
Have you taken into account that John and Eli take turns? Whose turn it is is an essential part of the definition of the current state. So you have a situation where the state (John's turn and ball in H) goes to the state (Eli's turn and ball in H). Similarly for the other states. I think this must be a 4x4 state transition matrix.

PS. There may be some way that I don't see that allows you to make this a 2x2, but I recommend that you keep it simple and 4x4 for now.
 
@Aleoa, please post questions like this in the Homework sections, not in the technical Math sections.

Also, John and Eli are playing with a ball, not a hall. You can't normally roll a hall into a pocket.
 
Aleoa said:
John and Eli are playing a game with a hall that can roll into one of two pockets labelled H and Y. John wants to keep the hall in H and Eli wants to keep it in Y. When it is John's turn to play, if he finds the hall in H that is fine with him and he does nothing; but if he finds it in Y he attempts to roll it into pocket H. This takes some skill; the probability that he succeeds is 2/3, there being a 1/3 chance that the hall will roll back into Y. When Eli's turn comes, he does nothing if the hall is in Y, but tries to get it there if he finds it in H. Eli is less skillful than John and his probability of succeeding in his effort is only 1/2.

I built this 2x2 matrix:

$$A=\begin{bmatrix} \frac{1}{3} & \frac{1}{2}\\ \frac{2}{3} & \frac{1}{2} \end{bmatrix}$$
The first column is for state Y and the second for state H.

But when I read the first exercise that uses this matrix, it seems the matrix is not properly built:

(a) Starting with the hall in Y and John to play, what is the probability that the hall will be in H after John's second play?

What is the correct Markov matrix to represent this scenario?

The vast majority of Markov chain presentations (in applied areas, at least, not necessarily in pure math) use the convention that the rows of a transition matrix sum to 1. You have columns summing to 1. Is that intentional on your part (for example, because your course notes do it that way), or is it an error you have made?

Note: having the columns sum to 1 is not, in itself, an error. However, if you do it that way you must very carefully and laboriously check each and every formula you are using, to make sure you are using the correct version.
 
FactChecker said:
PS. There may be some way that I don't see that allows you to make this a 2x2, but I recommend that you keep it simple and 4x4 for now.
Since the turn always alternates, you could look at the transition from Eli's current turn to Eli's next turn (states H and Eli's turn + Y and Eli's turn), or the transition between John's turns (H and John's turn + Y and John's turn). Of course, those transition matrices will not be the same; they will be the two blocks of the block-diagonal matrix which is the square of the 4x4 matrix that considers all four states (which is itself block off-diagonal).
 
Orodruin said:
Since the turn always alternates, you could look at the transition from Eli's current turn to Eli's next turn (states H and Eli's turn + Y and Eli's turn), or the transition between John's turns (H and John's turn + Y and John's turn).
Good point. I guess a problem with that approach would be that it would be difficult to answer questions about state transition probabilities from John's turn to any of Eli's turns. In general, I think it is a messy way to address the process. It's probably a "simplification" to two separate 2x2 matrices that makes some things more complicated.
Orodruin said:
Of course, those transition matrices will not be the same and they will be the two blocks of the block diagonal matrix which is the square of the 4x4 matrix that considers all four states (which is block off-diagonal).
They would be related, but I don't think that any of the entries of the 4x4 would match any of the 2x2 entries. The 2x2 entries are for a John-Eli (or Eli-John) sequence of two state transitions. (CORRECTION: @Orodruin did specify that the 2x2 entries would match entries in the square of the 4x4 matrix.)
 
A few thoughts:

1.)
The states are: (1) Eli's turn, ball in H; (2) Eli's turn, ball in Y; (3) John's turn, ball in H; (4) John's turn, ball in Y.

This seems to be a bipartite graph.

From a past post of mine, here's the blocked multiplication:
##
\mathbf Y =
\begin{bmatrix}\mathbf 0 &\mathbf{A} \\ \mathbf{B} &\mathbf 0\end{bmatrix}
##

##\mathbf Y^2 =
\begin{bmatrix}\mathbf{AB} & \mathbf 0 \\ \mathbf 0 & \mathbf{BA}\end{bmatrix}
##

so for odd powers, say to the seventh power:
##\mathbf Y^7 = \mathbf Y^2\mathbf Y^2\mathbf Y^2\mathbf Y = \big(\mathbf Y^2\big)^3 \mathbf Y##

and for even powers, like the sixth power:
##\mathbf Y^6 = \mathbf Y^2 \mathbf Y^2 \mathbf Y^2= \big(\mathbf Y^2\big)^3##

which should make the multiplications a lot easier.
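As a quick numerical sanity check of the blocked multiplication, here is a minimal NumPy sketch; the blocks ##\mathbf A## and ##\mathbf B## below are placeholder column-stochastic values, not the matrices from this problem:

```python
import numpy as np

# Arbitrary 2x2 blocks standing in for the bipartite transition blocks
# (placeholder values, not the matrices from this problem).
A = np.array([[0.9, 0.4],
              [0.1, 0.6]])
B = np.array([[0.7, 0.2],
              [0.3, 0.8]])
Z = np.zeros((2, 2))

# Y has the block off-diagonal form [[0, A], [B, 0]]
Y = np.block([[Z, A],
              [B, Z]])

Y2 = Y @ Y
# The square is block diagonal: [[A @ B, 0], [0, B @ A]]
assert np.allclose(Y2[:2, :2], A @ B)
assert np.allclose(Y2[2:, 2:], B @ A)
assert np.allclose(Y2[:2, 2:], 0) and np.allclose(Y2[2:, :2], 0)

# Odd powers reduce to a power of Y^2 times one extra Y, e.g. Y^7 = (Y^2)^3 @ Y
assert np.allclose(np.linalg.matrix_power(Y, 7),
                   np.linalg.matrix_power(Y2, 3) @ Y)
```

The same check works for any block off-diagonal matrix, whatever the block contents.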

2.) Ray's right that the standard approach is for the matrix to be row stochastic, not column stochastic. Pretty much everyone (including Feller) does this, except me (most of the time). For some reason I've developed a distaste for vector-matrix products and have a strong preference for matrix-vector products. That said, if you leave it in matrix-vector form, it's very easy to transpose between the two different approaches.
 
FactChecker said:
They would be related, but I don't think that any of the entries of the 4x4 would match any of the 2x2 entries. The 2x2 entries are for a John-Eli (or Eli-John) sequence of two state transitions.
That is not what I said. I said the entries of the 2x2 matrices would match the diagonal blocks of the square of the 4x4.

StoneTemplePython said:
Pretty much everyone --including Feller-- does this-- except me (most of the time). For some reason I've developed a distaste for vector matrix products and have a strong preference for matrix vector products.
This may be another mathematician-physicist discrepancy. I would also prefer to be column stochastic.
 
StoneTemplePython said:
A few thoughts:

The states are: (1) Eli's turn, ball in H; (2) Eli's turn, ball in Y; (3) John's turn, ball in H; (4) John's turn, ball in Y.

Considering this ordering of the states, I built this 4x4 Markov matrix:

$$M=\begin{bmatrix} 0 & 0 & 1 & \frac{2}{3}\\ 0 & 0 & 0 & \frac{1}{3}\\ \frac{1}{2} & 0 & 0 & 0\\ \frac{1}{2} & 1 & 0 & 0 \end{bmatrix}$$

It seems correct to me. Is that it?
 
  • #10
Aleoa said:
Considering this ordering of the states, I built this 4x4 Markov matrix:

$$M=\begin{bmatrix} 0 & 0 & 1 & \frac{2}{3}\\ 0 & 0 & 0 & \frac{1}{3}\\ \frac{1}{2} & 0 & 0 & 0\\ \frac{1}{2} & 1 & 0 & 0 \end{bmatrix}$$

It seems correct to me. Is that it?
Yes.
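As a numerical check, here is a minimal NumPy sketch of part (a) with this matrix; it assumes the state ordering above and the column-stochastic convention, and counts John's second play as the third play overall (John, Eli, John):

```python
import numpy as np

# States: (1) Eli's turn, H  (2) Eli's turn, Y  (3) John's turn, H  (4) John's turn, Y
# Column convention: M[i, j] = P(next state i | current state j)
M = np.array([[0,   0, 1, 2/3],
              [0,   0, 0, 1/3],
              [0.5, 0, 0, 0  ],
              [0.5, 1, 0, 0  ]])
assert np.allclose(M.sum(axis=0), 1.0)   # column stochastic

v0 = np.array([0.0, 0.0, 0.0, 1.0])      # start: John's turn, ball in Y
v = np.linalg.matrix_power(M, 3) @ v0    # John, Eli, John = three plays

p_H = v[0]  # after John's play it is Eli's turn, so "ball in H" is state (1)
assert np.isclose(p_H, 7/9)
```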
 
  • #11
Thanks so much to all you guys :)
 
  • #12
Aleoa said:
Considering this ordering of the states, I built this 4x4 Markov matrix:

$$M=\begin{bmatrix} 0 & 0 & 1 & \frac{2}{3}\\ 0 & 0 & 0 & \frac{1}{3}\\ \frac{1}{2} & 0 & 0 & 0\\ \frac{1}{2} & 1 & 0 & 0 \end{bmatrix}$$

It seems correct to me. Is that it?

Sometimes drawing the state diagram helps...

[Attached state diagram: upload_2018-5-21_23-3-48.png]


edit: I just dropped my state diagram labels in with your matrix. Maybe some labeling nits remain (graph isomorphism, I know...)
- - - -
it seems to be OK after all.
 

  • #13
Orodruin said:
That is not what I said. I said the entries of the 2x2 matrices would match the diagonal blocks of the square of the 4x4.This may be another mathematician-physicist discrepancy. I would also prefer to be column stochastic.

All the mathematician-authored books I have seen or own that treat Markov chains use row stochasticity. More to the point, while I got my PhD in Physics, I spent all of my working life after my post-doc doing Operations Research, and every single book and almost all papers in that field use row stochasticity. The column-stochasticity convention is used, but rarely, at least outside physics.
 
  • #14
Ray Vickson said:
All the mathematician-authored books I have seen or own that treat Markov chains use row stochasticity. More to the point, while I got my PhD in Physics, I spent all of my working life after my post-doc doing Operations Research, and every single book and almost all papers in that field use row stochasticity. The column-stochasticity convention is used, but rarely, at least outside physics.

The book I'm using is https://www.amazon.com/dp/0521406498/?tag=pfamazon01-20, and the author uses column stochasticity.
 
  • #15
Orodruin said:
That is not what I said. I said the entries of the 2x2 matrices would match the diagonal blocks of the square of the 4x4.
Sorry Orodruin, I missed that you specified square. I have corrected that post.
 
  • #16
Aleoa said:
The book I'm using is https://www.amazon.com/dp/0521406498/?tag=pfamazon01-20, and the author uses column stochasticity.

Well, I did specify that the row method is used extensively in applied areas outside of physics.

Anyway, the main point is that you were, indeed, doing it that way on purpose (following your textbook) and are presumably perfectly aware of the issue and know the correct formulas to use. In other words, you did not make a mistake of the kind I have seen many people make in the past.
 
  • #17
This is the third point of the exercise:
(c) Suppose the game has been going on for a 'long time' and you look in
just after Eli has played. What is the probability that the ball is now in
H? How many turns constitutes a 'long time' if we want to be certain
that this probability is correct within 0.001?


Is there a way to reduce the matrix I built, or do I have to calculate the eigenvectors/eigenvalues of a 4x4 matrix?
 
  • #18
Aleoa said:
This is the third point of the exercise:
(c) Suppose the game has been going on for a 'long time' and you look in
just after Eli has played. What is the probability that the ball is now in
H? How many turns constitutes a 'long time' if we want to be certain
that this probability is correct within 0.001?


Is there a way to reduce the matrix I built, or do I have to calculate the eigenvectors/eigenvalues of a 4x4 matrix?

You may enjoy this post: https://www.physicsforums.com/threads/eigenvalues-of-block-matrices.925046/#post-5840702

And you may want to consider that

##
\big(\mathbf Y^2\big)^k =
\begin{bmatrix}\big(\mathbf{AB}\big)^k & \mathbf 0 \\ \mathbf 0 & \big(\mathbf{BA}\big)^k\end{bmatrix}
##

so

##\text{trace}\Big(\big(\mathbf Y^2\big)^k\Big) = \text{trace}\Big(\big(\mathbf{AB}\big)^k+ \big(\mathbf{BA}\big)^k\Big) = 2\text{trace}\Big(\big(\mathbf{AB}\big)^k\Big)##

i.e. the above is true for an even number of iterations.

For odd number of iterations, the trace is always zero --e.g. like ##\text{trace}\Big(\mathbf Y^7\Big)##.

This is some special bipartite graph stuff, which gives strong information on the eigenvalues.

But it comes down to how comfortable you are with linear algebra.

- - - -
edit:
Clever use of ##\text{trace}\Big(\big(\mathbf Y^2\big)\Big)## can give you all the eigenvalues with no real work. But I'm not supposed to give this all away. Happy to show the way after you're done if you don't uncover the secret.
- - - -
If the above makes you uncomfortable, you can take a simpler, longer approach and just calculate the characteristic polynomial and solve for the eigenvalues and eigenvectors. You know at least one eigenvalue is 1 because it is (column) stochastic, so ##\mathbf 1^* \mathbf Y = \lambda_1 \mathbf 1^* = 1 \mathbf 1^* = \mathbf 1^*##

as another hint, you may consider that there are basically two separate partitions (i.e. a bipartition) of the graph, so maybe you can do something interesting with the ones vector here...
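The trace identities above can be checked numerically with the 4x4 matrix confirmed earlier in the thread; a minimal NumPy sketch:

```python
import numpy as np

# Confirmed 4x4 column-stochastic matrix from earlier in the thread
M = np.array([[0,   0, 1, 2/3],
              [0,   0, 0, 1/3],
              [0.5, 0, 0, 0  ],
              [0.5, 1, 0, 0  ]])

Q = M[:2, 2:]   # block mapping John's-turn states to Eli's-turn states
R = M[2:, :2]   # block mapping Eli's-turn states to John's-turn states

# Even powers: trace(M^(2k)) = trace((QR)^k) + trace((RQ)^k) = 2 trace((QR)^k)
for k in range(1, 5):
    M2k = np.linalg.matrix_power(M, 2 * k)
    QRk = np.linalg.matrix_power(Q @ R, k)
    assert np.isclose(np.trace(M2k), 2 * np.trace(QRk))

# Odd powers of a block off-diagonal matrix are block off-diagonal: zero trace
for k in (1, 3, 5, 7):
    assert np.isclose(np.trace(np.linalg.matrix_power(M, k)), 0.0)
```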
 
  • #19
What I thought is to calculate the eigenvalues and eigenvectors of the 2x2 top-right matrix and of the 2x2 lower-left matrix, and then to use the initial state vector
$$V=\begin{bmatrix} a\\ b\\ 0\\ 0 \end{bmatrix}$$
multiplying the sub-vector ##[a,b]## with the top-right 2x2 matrix, and then multiplying the result of this calculation with the lower-left 2x2 matrix and watching what happens.
Is this correct?
 
  • #20
Aleoa said:
What I thought is to calculate the eigenvalues and eigenvectors of the 2x2 top-right matrix and of the 2x2 lower-left matrix, and then to use the initial state vector
$$V=\begin{bmatrix} a\\ b\\ 0\\ 0 \end{bmatrix}$$
multiplying the sub-vector ##[a,b]## with the top-right 2x2 matrix, and then multiplying the result of this calculation with the lower-left 2x2 matrix and watching what happens.
Is this correct?

It's a decent idea, kinda close, but...

Look, your matrices in the corners are triangular, so that would seem ideal, right? I mean, I can see the eigenvalues on the diagonals of triangular matrices.

Your approach would tell you that ##\mathbf Y## has eigenvalues of ##\{1, \frac{1}{3}, \frac{1}{2}, 1\}## but

##1 + \frac{1}{3} + \frac{1}{2} + 1 = \frac{17}{6} \gt 0 = \text{trace}\Big(\mathbf Y\Big) = \lambda_1 + \lambda_2 + \lambda_3 + \lambda_4##

which is a contradiction.
 
  • #21
Hint: Look at the transition matrix from just before John’s turn n to just before John’s turn n+1. The state space is then two-dimensional as discussed earlier.
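In the column-stochastic convention used in this thread, that two-dimensional John-to-John chain is the lower-right block of ##M^2##; a minimal NumPy sketch of the claim:

```python
import numpy as np

M = np.array([[0,   0, 1, 2/3],
              [0,   0, 0, 1/3],
              [0.5, 0, 0, 0  ],
              [0.5, 1, 0, 0  ]])

Q = M[:2, 2:]  # John's turn -> Eli's turn
R = M[2:, :2]  # Eli's turn -> John's turn

# Two plays (John then Eli) take "just before John's turn n" to
# "just before John's turn n+1"; in the column convention that composite
# map is R @ Q, i.e. the lower-right block of M^2.
john_to_john = R @ Q
M2 = M @ M
assert np.allclose(john_to_john, M2[2:, 2:])
assert np.allclose(john_to_john.sum(axis=0), 1.0)  # still column stochastic
```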
 
  • #22
Aleoa said:
What I thought is to calculate the eigenvalues and eigenvectors of the 2x2 top-right matrix and of the 2x2 lower-left matrix, and then to use the initial state vector
$$V=\begin{bmatrix} a\\ b\\ 0\\ 0 \end{bmatrix}$$
multiplying the sub-vector ##[a,b]## with the top-right 2x2 matrix, and then multiplying the result of this calculation with the lower-left 2x2 matrix and watching what happens.
Is this correct?
Your transition matrix has the form
$$P = \begin{bmatrix} 0 & Q \\ R & 0 \end{bmatrix} ,$$
where the ##2 \times 2## matrices ##Q## and ##R## are stochastic.
Thus,
$$P^2= \begin{bmatrix} A & 0 \\ 0 & B \end{bmatrix} .$$
Here, the ##2 \times 2## matrices ##A## and ##B## are the 2-step E-E and J-J transition matrices, respectively. So, ##A^n## is the ##2n##-step transition matrix for player E, and similarly for ##B^n##.

We want to know the rates of convergence of ##A^n## and ##B^n## to their limits. My favorite way of doing such problems is as follows.

If ##r_1## and ##r_2## are the eigenvalues of ##A##, and if ##r_1 \neq r_2## (true in this case) then from diagonalization arguments we know that
$$A = r_1 E_1 + r_2 E_2,$$
for some ##2 \times 2## matrices ##E_1## and ##E_2##. Again, because of diagonalization arguments, it is a fact that for any analytic function
$$f(x) = c_0 + c_1 x + c_2 x^2 + \cdots$$
the matrix function
$$f(A) = c_0 I + c_1 A + c_2 A^2 + \cdots $$
(with ##I## the identity matrix) obeys
$$f(A) = f(r_1) E_1+ f(r_2) E_2$$
That is, the function ##f## acts on the eigenvalues only, and leaves the ##E_i## unchanged.

We can get the ##E_i## from
$$\begin{array}{rcl}
f(x) = 1 = x^0 & \Rightarrow &I = r_1^0 E_1 + r_2^0 E_2 = E_1 + E_2\\
f(x) = x = x^1 & \Rightarrow & A = r_1^1 E_1 + r_2^1 E_2 = r_1 E_1 + r_2 E_2
\end{array}
$$
That is, ##I = E_1 + E_2## and ##A = r_1 E_1 + r_2 E_2##. You can solve these for ##E_1## and ##E_2##, and so can easily get ##A^n = r_1^n E_1 + r_2^n E_2.## From that you can assess convergence rates, etc.

You can just repeat all that for the matrix ##B##.

Things become a bit more complicated if you have repeated eigenvalues, but it is still possible to do such calculations without going into the laborious process of finding the Jordan canonical form. It is not relevant to this thread, so I will skip the details for now.
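A minimal NumPy sketch of this ##E_1, E_2## construction, applied here to the column-stochastic 2-step block ##A = QR## from this thread (the method works the same either way, since it only assumes distinct eigenvalues):

```python
import numpy as np

# 2-step block in the thread's column-stochastic convention: A = QR
Q = np.array([[1.0, 2/3],
              [0.0, 1/3]])
R = np.array([[0.5, 0.0],
              [0.5, 1.0]])
A = Q @ R

r1, r2 = np.linalg.eigvals(A)   # two distinct eigenvalues
assert not np.isclose(r1, r2)

# Solve I = E1 + E2 and A = r1 E1 + r2 E2 for the spectral projectors
I = np.eye(2)
E1 = (A - r2 * I) / (r1 - r2)
E2 = (A - r1 * I) / (r2 - r1)

assert np.allclose(E1 + E2, I)
assert np.allclose(r1 * E1 + r2 * E2, A)

# f(A) = f(r1) E1 + f(r2) E2; in particular A^n = r1^n E1 + r2^n E2
for n in (1, 2, 5, 10):
    assert np.allclose(np.linalg.matrix_power(A, n),
                       r1**n * E1 + r2**n * E2)
```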
 
  • #23
I did some calculations using your advice; I get that the two states as ##N\rightarrow \infty## are
$$V_1=\begin{bmatrix} 0\\ 0\\ \frac{2}{3}\\ \frac{1}{3} \end{bmatrix} \qquad \text{and} \qquad V_2=\begin{bmatrix} \frac{1}{3}\\ \frac{2}{3}\\ 0\\ 0 \end{bmatrix}$$

Is it correct?
 
  • #24
Aleoa said:
I did some calculations using your advice; I get that the two states as ##N\rightarrow \infty## are
$$V_1=\begin{bmatrix} 0\\ 0\\ \frac{2}{3}\\ \frac{1}{3} \end{bmatrix} \qquad \text{and} \qquad V_2=\begin{bmatrix} \frac{1}{3}\\ \frac{2}{3}\\ 0\\ 0 \end{bmatrix}$$

Is it correct?

I'm not sure what you did here, but it doesn't seem correct to me. Help us understand what kinds of calculations you did.

I think we've all hinted at this... can you write out what ##\mathbf Y^2## looks like? This is a much easier problem to deal with as shown in the blocked multiplication.
 
  • #25
I simply got the eigenvalues and eigenvectors, and I deduced that whatever vector I start with, for ##N\rightarrow \infty## we have only two states and they continuously alternate:

$$V_{1}=\begin{bmatrix} \frac{2}{3}\\ \frac{1}{3}\\ 0\\ 0 \end{bmatrix}$$

and

$$V_{2}=\begin{bmatrix} 0\\ 0\\ \frac{2}{3}\\ \frac{1}{3} \end{bmatrix}$$
 
  • #26
Sorry, but it is incorrect. For example, you do not get ##V_1## from multiplying ##V_2## with the matrix from the left.
 
  • #27
I have swapped the terms. If you multiply ##[2/3, 1/3, 0, 0]^T## by the matrix, you get ##[0, 0, 1/3, 2/3]^T##, and so on infinitely.
 
  • #28
No you don’t. Check your algebra.
 
  • #29
I'm very sorry. My algebra was wrong :(. I'm going to try another way.
 
  • #30
Ray Vickson said:
Your transition matrix has the form
$$P = \begin{bmatrix} 0 & Q \\ R & 0 \end{bmatrix} ,$$
where the ##2 \times 2## matrices ##Q## and ##R## are stochastic.
Thus,
$$P^2= \begin{bmatrix} A & 0 \\ 0 & B \end{bmatrix} .$$
Here, the ##2 \times 2## matrices ##A## and ##B## are the 2-step E-E and J-J transition matrices, respectively. So, ##A^n## is the ##2n##-step transition matrix for player E, and similarly for ##B^n##.

We want to know the rates of convergence of ##A^n## and ##B^n## to their limits. My favorite way of doing such problems is as follows.

If ##r_1## and ##r_2## are the eigenvalues of ##A##, and if ##r_1 \neq r_2## (true in this case) then from diagonalization arguments we know that
$$A = r_1 E_1 + r_2 E_2,$$
for some ##2 \times 2## matrices ##E_1## and ##E_2##. Again, because of diagonalization arguments, it is a fact that for any analytic function
$$f(x) = c_0 + c_1 x + c_2 x^2 + \cdots$$
the matrix function
$$f(A) = c_0 I + c_1 A + c_2 A^2 + \cdots $$
(with ##I## the identity matrix) obeys
$$f(A) = f(r_1) E_1+ f(r_2) E_2$$
That is, the function ##f## acts on the eigenvalues only, and leaves the ##E_i## unchanged.

We can get the ##E_i## from
$$\begin{array}{rcl}
f(x) = 1 = x^0 & \Rightarrow &I = r_1^0 E_1 + r_2^0 E_2 = E_1 + E_2\\
f(x) = x = x^1 & \Rightarrow & A = r_1^1 E_1 + r_2^1 E_2 = r_1 E_1 + r_2 E_2
\end{array}
$$
That is, ##I = E_1 + E_2## and ##A = r_1 E_1 + r_2 E_2##. You can solve these for ##E_1## and ##E_2##, and so can easily get ##A^n = r_1^n E_1 + r_2^n E_2.## From that you can assess convergence rates, etc.

You can just repeat all that for the matrix ##B##.

Things become a bit more complicated if you have repeated eigenvalues, but it is still possible to do such calculations without going into the laborious process of finding the Jordan canonical form. It is not relevant to this thread, so I will skip the details for now.
It's been a long time, I'm sorry.
Anyway, I calculated ##\lim_{n\to\infty} (QR)^{n}##, getting
$$\begin{bmatrix} \frac{1}{5} & -\frac{4}{5}\\ -\frac{1}{5} & \frac{4}{5} \end{bmatrix}.$$

In order to do so, I calculated ##QR##, transformed it into the form ##B\Lambda B^{-1}##, and then calculated ##\lim_{n\to\infty}(B\Lambda^{n}B^{-1})##.

If it's correct, I'll continue with the exercise, calculating first of all ##\lim_{n\to\infty} (RQ)^{n}##.
 
  • #31
Aleoa said:
It's been a long time, I'm sorry.
Anyway, I calculated ##\lim_{n\to\infty} (QR)^{n}##, getting
$$\begin{bmatrix} \frac{1}{5} & -\frac{4}{5}\\ -\frac{1}{5} & \frac{4}{5} \end{bmatrix}.$$

In order to do so, I calculated ##QR##, transformed it into the form ##B\Lambda B^{-1}##, and then calculated ##\lim_{n\to\infty}(B\Lambda^{n}B^{-1})##.

If it's correct, I'll continue with the exercise, calculating first of all ##\lim_{n\to\infty} (RQ)^{n}##.

Your calculations must be wrong: ##Q## and ##R## are stochastic matrices, so ##M = QR## is also a stochastic matrix; that is, its elements are ##\geq 0## and its columns sum to 1. Therefore, for any positive integer ##n## the matrix ##M^n## is also stochastic, as is ##\Pi = \lim_{n \to \infty} M^n##, provided that the limit exists (which it does in this case). Your limiting matrix should be of the form
$$\Pi = \begin{bmatrix}\pi_1 & \pi_1 \\ \pi_2 & \pi_2\end{bmatrix}, $$
where ##\pi_1, \pi_2 > 0## and ##\pi_1 + \pi_2 = 1.##

You do not need to find eigenvalues or anything like that in order to find the limiting matrix ##\Pi##; just use the standard "equilibrium equations" to get a simple ##2 \times 2## system of linear equations whose solution gives you what you need. Of course, you could use eigenvalues, do diagonalization and all that in order to find the limit, but that is like using a sledgehammer to crack a peanut. Why do it the hard way, when the easy way is almost trivial?
 
  • #32
Also, note that the requirement of summing to one constrains the possible form of the columns to an (n-1)-dimensional hyperplane. This implies that one eigenvalue will always be one. This is the eigenvalue you are interested in in the end, and its corresponding eigenvector is the limiting distribution.
 
  • #33
Ray Vickson said:
Of course, you could use eigenvalues, do diagonalization and all that in order to find the limit, but that is like using a sledgehammer to crack a peanut. Why do it the hard way, when the easy way is almost trivial?

If the OP is getting lost, maybe it is advisable to solve it first with the sledgehammer, and then we can go back and show a better way? There's a line from Gauss that I'm fond of (mis?)quoting: "First get the proof. Then get the right proof."

Also, we should recall

Aleoa said:
What is the probability that the ball is now in
H? How many turns constitutes a 'long time' if we want to be certain
that this probability is correct within 0.001?
i.e. some estimates on convergence rates (closely related to diagonalization) need to be done here. Using your suggested spectral projector approach, these can of course be fairly easily backed into given the starting matrix and the steady state...
 
  • #34
StoneTemplePython said:
If OP is getting lost, maybe it is advisable to be solved first with sledgehammer, and then we can go back and show a better way? There's a line from Gauss that I'm fond of (mis?)quoting -- "First get the proof. Then get the right proof."

Also, we should recall... i.e. some estimates on convergence rates (closely related to diagonalization) need to be done here. Using your suggested spectral projector approach, these can of course be fairly easily backed into given the starting matrix and the steady state...

If the OP is truly lost, one strategy might be to take the straightforward way out and use something like a spreadsheet to do numerical computations of the first few matrix powers ##M^n##, just to see what is occurring. He/she could then easily enough extract the needed value of ##n## to get 0.001 precision just by looking. Perhaps the person setting the problem would regard that as an acceptable solution (or, perhaps not).

In any case, I cannot believe the OP was given such a problem with zero background and no textbook or lecture-note backup.
 
  • #35
I made errors in my calculations. Now
$$\lim_{n\to\infty} (QR)^{n} = \begin{bmatrix} \frac{4}{5} & \frac{4}{5}\\ \frac{1}{5} & \frac{1}{5} \end{bmatrix}.$$

In order to do so, I calculated ##QR##, transformed it into the form ##B\Lambda B^{-1}##, and then calculated ##\lim_{n\to\infty}(B\Lambda^{n}B^{-1})##.

If it's correct, I'll continue with the exercise, calculating first of all ##\lim_{n\to\infty} (RQ)^{n}##.
 
  • #36
I continued:
$$\lim_{n\to\infty} (RQ)^{n} = \begin{bmatrix} \frac{2}{5} & \frac{2}{5}\\ \frac{3}{5} & \frac{3}{5} \end{bmatrix}.$$

In order to do so, I calculated ##RQ##, transformed it into the form ##B\Lambda B^{-1}##, and then calculated ##\lim_{n\to\infty}(B\Lambda^{n}B^{-1})##.

Now I have to build the 4x4 matrix from the two 2x2 matrices just computed, and then I can deduce the resulting probability distribution over states for large ##n##. Is it correct?
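Both limits can be sanity-checked numerically against matrix powers; a minimal NumPy sketch using the ##Q## and ##R## blocks read off the confirmed 4x4 matrix:

```python
import numpy as np

Q = np.array([[1.0, 2/3],
              [0.0, 1/3]])
R = np.array([[0.5, 0.0],
              [0.5, 1.0]])

lim_QR = np.array([[4/5, 4/5],
                   [1/5, 1/5]])
lim_RQ = np.array([[2/5, 2/5],
                   [3/5, 3/5]])

# A modest power suffices: the second eigenvalue of both products is small
assert np.allclose(np.linalg.matrix_power(Q @ R, 30), lim_QR)
assert np.allclose(np.linalg.matrix_power(R @ Q, 30), lim_RQ)
```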
 
  • #37
Aleoa said:
I continued:
$$\lim_{n\to\infty} (RQ)^{n} = \begin{bmatrix} \frac{2}{5} & \frac{2}{5}\\ \frac{3}{5} & \frac{3}{5} \end{bmatrix}.$$

In order to do so, I calculated ##RQ##, transformed it into the form ##B\Lambda B^{-1}##, and then calculated ##\lim_{n\to\infty}(B\Lambda^{n}B^{-1})##.

Now I have to build the 4x4 matrix from the two 2x2 matrices just computed, and then I can deduce the resulting probability distribution over states for large ##n##. Is it correct?

Not quite: your 4-state Markov chain is "periodic", with period two. Therefore, a limiting distribution for ##M^n## does not exist. You get different limits for large even ##n## or large odd ##n##.

However, if you write
$$M = \begin{bmatrix} 0 & Q\\ R & 0 \end{bmatrix},$$
the ##2 \times 2## matrices ##A = QR## and ##B = RQ## are the transition matrices for the John-John or Eli-Eli transitions, that is, for the transitions between successive starting states for John's or Eli's plays. Both ##A^n## and ##B^n## do have well-defined limits for large ##n##; these would be the diagonal blocks of ##M^{2n}## for large ##n##.

Note: my interpretation of ##A## and ##B## uses row-wise stochasticity, where rows sum to 1. Things may be different if you use column-wise stochasticity, but you can work it out.

However, the matrix ##M## does have a well-defined "equilibrium" value, a kind of long-run average, given as
$$\Pi_M = \lim_{n \to \infty} \frac 1 2 \left( M^n + M^{n+1} \right) . $$
The interpretation here is that half the time it is John's turn (in which case you get the limiting value ##A^n##) and half the time it is Eli's turn (in which case you get the limiting value of ##B^n##). More precisely, John sees ##A^n## half the time and sees ##B^n Q## half the time, while Eli sees ##B^n## half the time and sees ##A^n R## half the time.
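A minimal NumPy sketch of this behavior: ##M^n## itself keeps alternating, while the ensemble average ##\tfrac12(M^n + M^{n+1})## settles down:

```python
import numpy as np

M = np.array([[0,   0, 1, 2/3],
              [0,   0, 0, 1/3],
              [0.5, 0, 0, 0  ],
              [0.5, 1, 0, 0  ]])

n = 40  # large even n; M^n and M^(n+1) alternate between two different limits
Mn = np.linalg.matrix_power(M, n)
Mn1 = np.linalg.matrix_power(M, n + 1)

avg = (Mn + Mn1) / 2
avg_next = (Mn1 + np.linalg.matrix_power(M, n + 2)) / 2

# Consecutive powers stay different (period two), but the average converges
assert not np.allclose(Mn, Mn1)
assert np.allclose(avg, avg_next)
assert np.allclose(avg.sum(axis=0), 1.0)  # still column stochastic
```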
 
  • #38
What I think is that it's sufficient to know the limit of ##M^{2n}## for large ##n## in order to know the general behaviour of this process.
Since, if I start with a probability vector ##\bar{v}_{0}##, then after 1000 iterations I get the resulting probability vector ##M^{1000}\bar{v}_{0}##, and from this vector I can know the general behaviour of the process. Is it correct?
 
  • #39
Aleoa said:
What I think is that it's sufficient to know the limit of ##M^{2n}## for large ##n## in order to know the general behaviour of this process.
Since, if I start with a probability vector ##\bar{v}_{0}##, then after 1000 iterations I get the resulting probability vector ##M^{1000}\bar{v}_{0}##, and from this vector I can know the general behaviour of the process. Is it correct?

Sort of. It depends what you want with it. The even powers show two distinct graphs (i.e. your chain is reducible when looked at from even powers only; specifically, there are two separate 2x2 matrices denoting two separate 'mini' Markov chains). If you are only interested in even powers, then you are good to go.

What Ray suggested is that you may be interested in an ensemble average, which is a common interpretation when dealing with periodic behavior. If you take a closer look, your matrix ##M## has an eigenvalue of ##\lambda_1 = 1##, and ##M^2## has two eigenvalues of 1, which means ##M## must have an eigenvalue of ##\lambda_2 = -1##. What happens when you use the ensemble average is that ##\frac{\lambda_1 + \lambda_1^2}{2} = 1## whereas ##\frac{\lambda_2 + \lambda_2^2}{2} = 0##, so the ensemble average has a limit that deals strictly with the behavior/eigenvector of ##\lambda_1##. When you look at only the even powers of ##M##, you are actually considering limiting behavior associated with two distinct eigenvalues, ##\lambda_1^2 = 1## and ##\lambda_2^2 = 1##; this should make sense because, again, the even powers show a reducible matrix, i.e. two distinct 2x2 Markov chain matrices (and every Markov chain must have at least one eigenvalue of 1; why?).

Back to your question

" you look in just after Eli has played. What is the probability that the ball is now in H?"
Does this need even or odd powers? If even, then you're good to go. If odd, you can of course compute the limit on even powers and do a single matrix multiplication afterward (though a little finesse is needed if you want to properly speak about this as some kind of 'limiting' behavior)

"How many turns constitutes a 'long time' if we want to be certain that this probability is correct within 0.001?" How do you propose doing this?
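One way to attack the "how many turns" part numerically is sketched below; it reads "correct within 0.001" as the maximum entrywise deviation of ##M^{2n}## from its limit, which is one reasonable interpretation (the limit matrix is assembled from the two 2x2 limits found earlier in the thread):

```python
import numpy as np

M = np.array([[0,   0, 1, 2/3],
              [0,   0, 0, 1/3],
              [0.5, 0, 0, 0  ],
              [0.5, 1, 0, 0  ]])

# Limit of M^(2n): block diagonal, built from the two 2x2 limits found above
limit = np.array([[4/5, 4/5, 0,   0  ],
                  [1/5, 1/5, 0,   0  ],
                  [0,   0,   2/5, 2/5],
                  [0,   0,   3/5, 3/5]])

# Smallest number of double-steps n with max entrywise error below 0.001
n = 1
while np.abs(np.linalg.matrix_power(M, 2 * n) - limit).max() >= 0.001:
    n += 1
print(n)  # -> 4, i.e. 8 individual plays under this reading
```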
 
  • #40
StoneTemplePython said:
Sort of.
What Ray suggested is that you may be interested in an ensemble average, which is a common interpretation when dealing with periodic behavior. If you take a closer look, your matrix ##M## has an eigenvalue of ##\lambda_1 = 1##, and ##M^2## has two eigenvalues of 1, which means ##M## must have an eigenvalue of ##\lambda_2 = -1##. What happens when you use the ensemble average is that ##\frac{\lambda_1 + \lambda_1^2}{2} = 1## whereas ##\frac{\lambda_2 + \lambda_2^2}{2} = 0##, so the ensemble average has a limit that deals strictly with the behavior/eigenvector of ##\lambda_1##.

Are there efficient ways to do these eigenvalue calculations (due to the particular form of ##M##), or do I have to use the classical approach to eigenvalue calculation?

StoneTemplePython said:
"How many turns constitutes a 'long time' if we want to be certain that this probability is correct within 0.001?" How do you propose doing this?

The limiting state of ##M^{n}## is the eigenvector associated with ##\lambda = 1##; however, I don't know how to calculate a rate of convergence...
StoneTemplePython said:
What Ray suggested is that you may be interested in an ensemble average, which is a common interpretation when dealing with periodic behavior. If you take a closer look, your matrix ##M## has an eigenvalue ##\lambda_1 = 1## and ##M^2## has two eigenvalues equal to 1, which means ##M## must also have an eigenvalue ##\lambda_2 = -1##. But when you use the ensemble average you have ##\frac{\lambda_1 + \lambda_1^2}{2} = 1## whereas ##\frac{\lambda_2 + \lambda_2^2}{2} = 0##, so the ensemble average has a limit determined strictly by the behavior / eigenvector of ##\lambda_1##. When you look at only the even powers of ##M##, you are actually considering limiting behavior associated with two distinct eigenvalues ##\lambda_1^2 = 1## and ##\lambda_2^2 = 1##... this should make sense because, again, the even powers show a reducible matrix, i.e. two distinct 2x2 Markov chain matrices inside (and every Markov chain must have at least one eigenvalue of 1; why?).

Unfortunately I'm not able to clearly understand the meaning of what you said here :(
 
Last edited:
  • #41
Aleoa said:
The limiting state of ##M^{n}## is the eigenvector associated with ##\lambda = 1##; however, I don't know how to calculate a rate of convergence...
If you write the initial state as a linear combination of eigenvectors ##v_i##, i.e.,
$$
v = \sum_{i = 1}^4 a_i v_i,
$$
what is the state after applying ##M^n##?
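A numerical sketch of this hint, under the same assumption as before about the 4x4 chain (my reconstruction from the problem statement; columns = "from" states): expand the initial state in the eigenvectors of ##M##, and applying ##M^n## simply scales each coefficient by ##\lambda_i^n##.

```python
import numpy as np

# Assumed 4x4 transition matrix, states (Y,John), (H,John), (Y,Eli), (H,Eli).
M = np.array([
    [0,   0, 1, 1/2],
    [0,   0, 0, 1/2],
    [1/3, 0, 0, 0],
    [2/3, 1, 0, 0],
])

lam, V = np.linalg.eig(M)          # columns of V are the eigenvectors v_i
v0 = np.array([1.0, 0, 0, 0])      # ball in Y, John to play
a = np.linalg.solve(V, v0)         # coefficients a_i with v0 = sum_i a_i v_i

# M^n v0 = sum_i a_i lambda_i^n v_i: every mode with |lambda_i| < 1 decays
# geometrically, and that decay IS the rate of convergence being asked about.
n = 10
state_n = (V * lam**n) @ a         # == sum_i a_i lambda_i^n v_i
assert np.allclose(state_n.real, np.linalg.matrix_power(M, n) @ v0)
```

The assertion confirms the expansion reproduces the direct matrix power; the surviving terms as ##n \to \infty## are exactly those with ##|\lambda_i| = 1##.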
 
  • #42
Aleoa said:
Are there efficient ways to do these eigenvalue calculations (due to the particular form of ##M##), or do I have to use the classical approach to eigenvalue calculation?
The limiting state of ##M^{n}## is the eigenvector associated with ##\lambda = 1##; however, I don't know how to calculate a rate of convergence...

Unfortunately I'm not able to clearly understand the meaning of what you said here :(

If I were doing the question I would avoid eigenvalues altogether. If you want to know how fast ##M^{2n}## approaches its limit, by far the easiest way is to first compute the limit; you do not need to use fancy eigenvalue methods to do that; in fact, since
$$M^{2n} = \pmatrix{A^n & 0 \\ 0 & B^n}$$
you can analyze the two ##A^n## and ##B^n## problems separately. You get ##\lim_n A^n = \Pi_A## and ##\lim_n B^n = \Pi_B## by simple, high-school algebraic methods, taking about 3-4 lines of work each in your two cases. (However, you need to either read the textbook or go on-line for how to do it; I will leave that up to you.) After you have ##\Pi_A## you can let ##D_a(1) = A - \Pi_A##, ##D_a(2)= A D_a(1), ## ##\ldots, ## ## D_a(n) = A D_a(n-1) = A^{n-1} D_a(1)##. These are the matrices of deviations ##A^n-\Pi_A##, whose elements will go to zero geometrically fast as ##n## grows. You can just go ahead and compute them numerically (for example, using a spreadsheet), and look for the first ##n## that makes all the elements less than 0.0001 in magnitude.

Naturally, you could do all that using the eigenvalues of ##A## (which might be just as easy in this little ##2 \times 2## case), but if you had a Markov chain with several thousand states (as in some real-world applications), just getting the eigenvalues themselves would be almost out of the question. However, getting the limiting matrix ##\Pi## would just involve solving linear equations---admittedly with thousands of equations in thousands of unknowns---and would usually be doable without excessive pain. In practice, that is how people actually do it; nobody ever uses eigenvalues for that task. (However, to be fair, the rows of the limiting matrix ##\Pi## are, in fact, the normalized eigenvector for eigenvalue ##\lambda = 1##.)

Computing something like the norm of ##D(n) = M^n - \Pi## via a recursive numerical scheme would also be doable (but slow).
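The procedure described above can be sketched numerically. A minimal sketch, assuming one of the 2x2 diagonal blocks of ##M^2## (my reconstruction from the problem statement; the entries are a guess at what the thread calls ##A##, not taken from a post): compute ##\Pi_A## by solving the stationary linear equations (no eigenvalue routine needed), then iterate the deviation matrices ##D_a(n) = A\,D_a(n-1)## until every entry is below the tolerance.

```python
import numpy as np

# Assumed 2x2 block of M^2 acting on the John-turn states (columns = "from").
A = np.array([[2/3, 1/2],
              [1/3, 1/2]])

# Stationary distribution pi: solve (A - I) pi = 0 together with sum(pi) = 1,
# i.e. plain linear equations -- no eigenvalue machinery.
k = A.shape[0]
lhs = np.vstack([A - np.eye(k), np.ones(k)])
rhs = np.append(np.zeros(k), 1.0)
pi, *_ = np.linalg.lstsq(lhs, rhs, rcond=None)

Pi_A = np.tile(pi.reshape(-1, 1), (1, k))   # limit of A^n: every column = pi

# Deviation matrices D(n) = A D(n-1) = A^n - Pi_A; entries decay geometrically.
D = A - Pi_A
n = 1
while np.abs(D).max() >= 1e-4:
    D = A @ D
    n += 1
print(n)   # first n with all |(A^n - Pi_A)_{ij}| < 0.0001
```

The same loop works unchanged for the other block ##B##; for a chain with thousands of states one would swap `lstsq` for a sparse linear solver.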
 
  • Like
Likes Aleoa
  • #43
Ray Vickson said:
If I were doing the question I would avoid eigenvalues altogether. If you want to know how fast ##M^{2n}## approaches its limit, by far the easiest way is to first compute the limit; you do not need to use fancy eigenvalue methods to do that; in fact, since
$$M^{2n} = \pmatrix{A^n & 0 \\ 0 & B^n}$$
you can analyze the two ##A^n## and ##B^n## problems separately. You get ##\lim_n A^n = \Pi_A## and ##\lim_n B^n = \Pi_B## by simple, high-school algebraic methods, taking about 3-4 lines of work each in your two cases. (However, you need to either read the textbook or go on-line for how to do it; I will leave that up to you.) After you have ##\Pi_A## you can let ##D_a(1) = A - \Pi_A##, ##D_a(2)= A D_a(1), ## ##\ldots, ## ## D_a(n) = A D_a(n-1) = A^{n-1} D_a(1)##. These are the matrices of deviations ##A^n-\Pi_A##, whose elements will go to zero geometrically fast as ##n## grows. You can just go ahead and compute them numerically (for example, using a spreadsheet), and look for the first ##n## that makes all the elements less than 0.0001 in magnitude.

Naturally, you could do all that using the eigenvalues of ##A## (which might be just as easy in this little ##2 \times 2## case), but if you had a Markov chain with several thousand states (as in some real-world applications), just getting the eigenvalues themselves would be almost out of the question. However, getting the limiting matrix ##\Pi## would just involve solving linear equations---admittedly with thousands of equations in thousands of unknowns---and would usually be doable without excessive pain. In practice, that is how people actually do it; nobody ever uses eigenvalues for that task. (However, to be fair, the rows of the limiting matrix ##\Pi## are, in fact, the normalized eigenvector for eigenvalue ##\lambda = 1##.)

Computing something like the norm of ##D(n) = M^n - \Pi## via a recursive numerical scheme would also be doable (but slow).

I honestly find this a much more cumbersome way than solving a few easy eigenvalue and eigenvector equations. I also fail to see why one would use a spreadsheet to do matrix algebra when there are far better suited options like Python. I also doubt anyone would solve for the eigenvalues and eigenvectors of a Markov chain system with thousands of states by hand, but you do not need to, because Python or Matlab will do that for you as well.
 
  • Like
Likes Aleoa
  • #44
Orodruin said:
I honestly find this a much more cumbersome way than solving a few easy eigenvalue and eigenvector equations. I also fail to see why one would use a spreadsheet to do matrix algebra when there are far better suited options like Python. I also doubt anyone would solve for the eigenvalues and eigenvectors of a Markov chain system with thousands of states by hand, but you do not need to, because Python or Matlab will do that for you as well.

Of course; I personally would never use a spreadsheet for such a task---I always use Maple---but I have no idea what software is available to the OP. Almost everybody has access to EXCEL or an open-source equivalent.

I DID say that in the OP's case using eigenvalues may be about as good as the method I mentioned; however, I also referred to much larger problems, where eigenvalues start to become nearly inaccessible, no matter what packages you use.

Basically, I am trying to steer the OP away from excessive reliance on eigenvalues, since for Markov chains there are so many attractive alternatives---but he needs to do some reading (not much, but some at least). I am trying to get him to actually do that reading, but will gladly give up if he/she has no interest.
 
  • Like
Likes Aleoa
  • #45
Orodruin said:
I honestly find this a much more cumbersome way than solving a few easy eigenvalue and eigenvector equations. I also fail to see why one would use a spreadsheet to do matrix algebra when there are far better suited options like Python. I also doubt anyone would solve for the eigenvalues and eigenvectors of a Markov chain system with thousands of states by hand, but you do not need to, because Python or Matlab will do that for you as well.

Furthermore, there is a ton of research on estimating / bounding the subdominant eigenvalue. There are also many computational routines geared toward computing just the few largest- and smallest-magnitude eigenvalues (for this reason).

For really massive problems, e.g. classical PageRank, the problem turns on its head. In those cases people want an estimate on the subdominant eigenvalue and then try to iterate to the steady state -- the matrix is too large and massively sparse so direct solution is infeasible. The subdominant eigenvalue gives an estimate on how many iterations will be needed.
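The iterate-to-steady-state idea can be sketched as follows. A minimal illustration (assuming a tiny 2x2 stand-in for what would in practice be a huge sparse matrix; the entries are my reconstruction of one block of the chain, not from a post): power iteration converges to the stationary distribution, and the subdominant eigenvalue tells you roughly how many iterations that takes.

```python
import numpy as np

# Tiny column-stochastic stand-in for a large sparse transition matrix.
P = np.array([[2/3, 1/2],
              [1/3, 1/2]])

x = np.array([1.0, 0.0])             # any starting probability distribution
for _ in range(100):
    x_new = P @ x                    # one power-iteration step
    done = np.abs(x_new - x).max() < 1e-12
    x = x_new
    if done:
        break

# x is now (numerically) the stationary distribution.  Each step shrinks the
# error by roughly |lambda_2| (here 1/6), so reaching tolerance eps takes
# about log(eps) / log(|lambda_2|) iterations -- that is why an estimate of
# the subdominant eigenvalue is what you want for massive chains.
```

For a PageRank-scale matrix the only change is that `P @ x` becomes a sparse matrix-vector product; the convergence estimate from ##|\lambda_2|## works the same way.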

- - - -
May I make a suggestion to @Aleoa ? Pick one of the methods mentioned on here, and solve the problem in full. Then Ray, Orodruin and I can show you alternative approaches that are easier, or more satisfying, etc. But I think it's required for OP to put in the work first.

- - - -
For a complementary take, you may like chapter 11 of:

https://www.math.dartmouth.edu/~prob/prob/prob.pdf

They just about refuse to use eigenvalues in there for whatever reason. The flip side is that they introduce a coupling argument and some other goodies. There are some really good problems in there (though a few of them have wording that I found a little hard to interpret... still good overall, though).
 
  • Like
Likes Aleoa and Orodruin