Probability of Ball in H after John's Second Play: Markov Matrix Solution

  • Thread starter: Aleoa
  • Tags: Build Matrix

Summary:
John and Eli are engaged in a game involving a ball that can end up in one of two pockets, H or Y, with each player having different probabilities of success in moving the ball. A 2x2 transition matrix was initially proposed, but it was determined that a 4x4 matrix is necessary to accurately represent the alternating turns of the players and their respective states. The correct 4x4 Markov matrix was constructed and verified by participants, emphasizing the importance of row versus column stochasticity in matrix representation. Discussions also touched on the implications of long-term game play and the calculation of probabilities after many turns. The conversation highlighted the complexities of Markov chains and the need for careful consideration of state transitions in such scenarios.
  • #31
Aleoa said:
It's been a long time, I'm sorry.
Anyway, I calculated ##\lim_{n \to \infty} (QR)^{n}##, getting
$$\begin{bmatrix} \frac{1}{5} & -\frac{4}{5} \\ -\frac{1}{5} & \frac{4}{5} \end{bmatrix}.$$

To do so, I calculated ##QR##, transformed it into the form ##B\Lambda B^{-1}##, and then calculated ##\lim_{n \to \infty} B\Lambda^{n}B^{-1}##.

If it's correct, I'll continue with the exercise, calculating first of all ##\lim_{n \to \infty} (RQ)^{n}##.

Your calculations must be wrong: ##Q## and ##R## are stochastic matrices, so ##M = QR## is also a stochastic matrix; that is, its elements are ##\geq 0## and its columns sum to 1. Therefore, for any positive integer ##n## the matrix ##M^n## is also stochastic, as is ##\Pi = \lim_{n \to \infty} M^n##, provided that the limit exists (which it does in this case). Your limiting matrix should be of the form
$$\Pi = \pmatrix{\pi_1 & \pi_1 \\ \pi_2 &\pi_2}, $$
where ##\pi_1, \pi_2 > 0## and ##\pi_1 + \pi_2 = 1.##

You do not need to find eigenvalues or anything like that in order to find the limiting matrix ##\Pi##; just use the standard "equilibrium equations" to get a simple ##2 \times 2## system of linear equations whose solution gives you what you need. Of course, you could use eigenvalues, do diagonalization and all that in order to find the limit, but that is like using a sledgehammer to crack a peanut. Why do it the hard way, when the easy way is almost trivial?
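The "equilibrium equations" route can be sketched in a few lines of Python. The 2x2 matrix below is a hypothetical column-stochastic stand-in for ##M = QR## (the thread's actual ##Q## and ##R## are not reproduced in this excerpt):

```python
import numpy as np

# Hypothetical column-stochastic 2x2 matrix standing in for M = QR;
# entries are >= 0 and each column sums to 1, as described above.
M = np.array([[0.9, 0.4],
              [0.1, 0.6]])

# Equilibrium equations: (M - I) pi = 0 together with pi_1 + pi_2 = 1.
# One row of (M - I) is redundant, so replace it with the
# normalization condition and solve the resulting linear system.
A = M - np.eye(2)
A[1, :] = [1.0, 1.0]
pi = np.linalg.solve(A, np.array([0.0, 1.0]))
print(pi)  # -> [0.8 0.2] for this example matrix

# Sanity check: every column of M^n approaches pi.
print(np.linalg.matrix_power(M, 50))
```

No eigenvalue machinery is needed: it is one call to a linear solver, which is exactly the point being made here.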
 
  • #32
Also, note that the requirement of summing to one constrains the possible form of the columns to an (n-1)-dimensional hyperplane. This implies that one eigenvalue will always be one. This is the eigenvalue you are ultimately interested in, and its corresponding eigenvector is the limiting distribution.
 
  • #33
Ray Vickson said:
Of course, you could use eigenvalues, do diagonalization and all that in order to find the limit, but that is like using a sledgehammer to crack a peanut. Why do it the hard way, when the easy way is almost trivial?

If OP is getting lost, maybe it is advisable to solve it first with the sledgehammer, and then we can go back and show a better way? There's a line from Gauss that I'm fond of (mis?)quoting -- "First get the proof. Then get the right proof."

Also, we should recall

Aleoa said:
What is the probability that the ball is now in H? How many turns constitute a 'long time' if we want to be certain that this probability is correct within 0.001?
i.e. some estimates on convergence rates (closely related to diagonalization) need to be done here. Using your suggested spectral projector approach, these can of course be fairly easily backed into given the starting matrix and the steady state...
 
  • #34
StoneTemplePython said:
If OP is getting lost, maybe it is advisable to solve it first with the sledgehammer, and then we can go back and show a better way? There's a line from Gauss that I'm fond of (mis?)quoting -- "First get the proof. Then get the right proof."

Also, we should recall ... i.e. some estimates on convergence rates (closely related to diagonalization) need to be done here. Using your suggested spectral projector approach, these can of course be fairly easily backed into given the starting matrix and the steady state...

If the OP is truly lost, one strategy might be to take the straightforward way out and use something like a spreadsheet to do numerical computations of the first few matrix powers ##M^n##, just to see what is occurring. He/she could then easily enough extract the needed value of ##n## to get 0.0001 precision just by looking. Perhaps the person setting the problem would regard that as an acceptable solution (or, perhaps not).

In any case, I cannot believe the OP was given such a problem with zero background and no textbook or lecture-note backup.
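The "just compute the first few powers numerically and look" suggestion is easy to try in Python instead of a spreadsheet. The 2x2 matrix and its limit below are hypothetical stand-ins, not the exercise's actual ##A = QR##:

```python
import numpy as np

# Hypothetical column-stochastic stand-in for A = QR.
A = np.array([[0.9, 0.4],
              [0.1, 0.6]])
# Its limiting matrix (every column equals the stationary vector).
Pi = np.array([[0.8, 0.8],
               [0.2, 0.2]])

# Power A until every entry of A^n is within 1e-4 of the limit.
P = np.eye(2)
for n in range(1, 1000):
    P = P @ A
    if np.max(np.abs(P - Pi)) < 1e-4:
        break
print(n)  # -> 13 for this example: convergence is geometric
```

Reading the needed ##n## off a table of powers like this is exactly the "acceptable solution (or, perhaps not)" being described.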
 
  • #35
I made errors in my calculations. Now
$$\lim_{n \to \infty} (QR)^{n} = \begin{bmatrix} \frac{4}{5} & \frac{4}{5} \\ \frac{1}{5} & \frac{1}{5} \end{bmatrix}.$$

To do so, I calculated ##QR##, transformed it into the form ##B\Lambda B^{-1}##, and then calculated ##\lim_{n \to \infty} B\Lambda^{n}B^{-1}##.

If it's correct, I'll continue with the exercise, calculating first of all ##\lim_{n \to \infty} (RQ)^{n}##.
 
  • #36
I continued:
$$\lim_{n \to \infty} (RQ)^{n} = \begin{bmatrix} \frac{2}{5} & \frac{2}{5} \\ \frac{3}{5} & \frac{3}{5} \end{bmatrix}.$$

To do so, I calculated ##RQ##, transformed it into the form ##B\Lambda B^{-1}##, and then calculated ##\lim_{n \to \infty} B\Lambda^{n}B^{-1}##.

Now I have to build the 4x4 matrix from the two 2x2 matrices just built, and I can deduce, for large ##n##, the resultant probability distribution of states. Is that correct?
 
  • #37
Aleoa said:
I continued:
$$\lim_{n \to \infty} (RQ)^{n} = \begin{bmatrix} \frac{2}{5} & \frac{2}{5} \\ \frac{3}{5} & \frac{3}{5} \end{bmatrix}.$$

To do so, I calculated ##RQ##, transformed it into the form ##B\Lambda B^{-1}##, and then calculated ##\lim_{n \to \infty} B\Lambda^{n}B^{-1}##.

Now I have to build the 4x4 matrix from the two 2x2 matrices just built, and I can deduce, for large ##n##, the resultant probability distribution of states. Is that correct?

Not quite: your 4-state Markov chain is "periodic", with period two. Therefore, a limiting distribution for ##M^n## does not exist. You get different limits for large even ##n## or large odd ##n##.

However, if you write
$$M = \pmatrix{0 & Q\\R & 0 },$$
the ##2 \times 2## matrices ##A = QR## and ##B = R Q## are the transition matrices for the John-John or Eli-Eli transitions---that is, for the transitions between successive starting states for John's or Eli's plays. Both ##A^n## and ##B^n## do have well-defined limits for large ##n##; these would be the diagonal blocks of ##M^{2n}## for large ##n##.

Note: my interpretation of ##A## and ##B## uses row-wise stochasticity, where rows sum to 1. Things may be different if you use column-wise stochasticity---but you can work it out.

However, the matrix ##M## does have a well-defined "equilibrium" value---a kind of long-run average---given as
$$\Pi_M = \lim_{n \to \infty} \frac 1 2 \left( M^n + M^{n+1} \right) . $$
The interpretation here is that half the time it is John's turn (in which case you get the limiting value ##A^n##) and half the time it is Eli's turn (in which case you get the limiting value of ##B^n##). More precisely, John sees ##A^n## half the time and sees ##B^n Q## half the time, while Eli sees ##B^n## half the time and sees ##A^n R## half the time.
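The block structure described here is easy to verify numerically. The ##Q## and ##R## below are hypothetical column-stochastic matrices (not the exercise's actual ones), so the blocks come out in the column-stochastic convention:

```python
import numpy as np

# Hypothetical column-stochastic transition matrices for the two players.
Q = np.array([[0.7, 0.2],
              [0.3, 0.8]])
R = np.array([[0.5, 0.6],
              [0.5, 0.4]])

# Periodic 4-state chain: M = [[0, Q], [R, 0]].
M = np.block([[np.zeros((2, 2)), Q],
              [R, np.zeros((2, 2))]])

# M^2 is block diagonal with A = QR and B = RQ on the diagonal,
# which is why only the even powers of M settle down.
M2 = np.linalg.matrix_power(M, 2)
print(M2)
```

Powering `M2` then reproduces the two separate 2x2 chains on the diagonal blocks.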
 
Last edited:
  • #38
What I think is that it's sufficient to know the limit of ##M^{2n}## for large ##n## in order to know the general behaviour of this process.
Since, if I start with a probability vector ##\bar{v}_{0}##, then after 1000 iterations I get the resultant probability vector as ##M^{1000}\bar{v}_{0}##, and from this vector I can know the general behaviour of the process. Is that correct?
 
  • #39
Aleoa said:
What i think is that it's sufficient to know the limit for large n of M^{2n} in order to know the general behaviour of this process.
Since, if i start with a probability vector \bar{v_{0}}, then after 1000 iterations i get the resultant probability vector as M^{1000}\bar{v_{0}}, from this vector can know the general behaviour of the process. Is it correct ?

Sort of. It depends on what you want from it. The even powers show two distinct graphs (i.e. your chain is reducible when looked at through even powers only -- specifically, there are 2 separate 2x2 matrices denoting 2 separate 'mini' Markov chains). If you are only interested in even powers, then you are good to go.

What Ray suggested is that you may be interested in an ensemble average, which is a common interpretation when dealing with periodic behavior. If you take a closer look, your matrix ##M## has an eigenvalue of ##\lambda_1 = 1##, and ##M^2## has two eigenvalues of 1, which means ##M## must have an eigenvalue of ##\lambda_2 = -1##. But when you use the ensemble average you have ##\frac{\lambda_1 + \lambda_1^2}{2} = 1## whereas ##\frac{\lambda_2 + \lambda_2^2}{2} = 0##, so the ensemble average has a limit that deals strictly with the behavior/eigenvector of ##\lambda_1##. When you look at only the even powers of ##M##, you are actually considering limiting behavior associated with two distinct eigenvalues ##\lambda_1^2 = 1## and ##\lambda_2^2 = 1##... this should make sense because, again, the even powers show a reducible matrix, i.e. 2 distinct 2x2 Markov chain matrices in there (and every Markov chain must have at least one eigenvalue of 1 -- why?).

Back to your question

" you look in just after Eli has played. What is the probability that the ball is now in H?"
Does this need even or odd powers? If even, then you're good to go. If odd, you can of course compute the limit on even powers and do a single matrix multiplication afterward (though a little finesse is needed if you want to properly speak about this as some kind of 'limiting' behavior)

"How many turns constitutes a 'long time' if we want to be certain that this probability is correct within 0.001?" How do you propose doing this?
 
  • #40
StoneTemplePython said:
Sort of.
What Ray suggested is that you may be interested in an ensemble average, which is a common interpretation when dealing with periodic behavior. If you take a closer look, your matrix ##M## has an eigenvalue of ##\lambda_1 = 1##, and ##M^2## has two eigenvalues of 1, which means ##M## must have an eigenvalue of ##\lambda_2 = -1##. But when you use the ensemble average you have ##\frac{\lambda_1 + \lambda_1^2}{2} = 1## whereas ##\frac{\lambda_2 + \lambda_2^2}{2} = 0##, so the ensemble average has a limit that deals strictly with the behavior/eigenvector of ##\lambda_1##.

Are there efficient ways to do these eigenvalue calculations (due to the particular form of ##M##), or do I have to use the classical approach to eigenvalue calculation?

StoneTemplePython said:
"How many turns constitutes a 'long time' if we want to be certain that this probability is correct within 0.001?" How do you propose doing this?

The limiting state of ##M^{n}## is the eigenvector associated with ##\lambda = 1##; however, I don't know how to calculate a rate of convergence...
StoneTemplePython said:
What Ray suggested is that you may be interested in an ensemble average, which is a common interpretation when dealing with periodic behavior. If you take a closer look, your matrix ##M## has an eigenvalue of ##\lambda_1 = 1##, and ##M^2## has two eigenvalues of 1, which means ##M## must have an eigenvalue of ##\lambda_2 = -1##. But when you use the ensemble average you have ##\frac{\lambda_1 + \lambda_1^2}{2} = 1## whereas ##\frac{\lambda_2 + \lambda_2^2}{2} = 0##, so the ensemble average has a limit that deals strictly with the behavior/eigenvector of ##\lambda_1##. When you look at only the even powers of ##M##, you are actually considering limiting behavior associated with two distinct eigenvalues ##\lambda_1^2 = 1## and ##\lambda_2^2 = 1##... this should make sense because, again, the even powers show a reducible matrix, i.e. 2 distinct 2x2 Markov chain matrices in there (and every Markov chain must have at least one eigenvalue of 1 -- why?).

Unfortunately I'm not able to clearly understand the meaning of what you said here :(
 
Last edited:
  • #41
Aleoa said:
The limiting state of ##M^{n}## is the eigenvector associated with ##\lambda = 1##; however, I don't know how to calculate a rate of convergence...
If you write the initial state as a linear combination of eigenvectors ##v_i##, i.e.,
$$
v = \sum_{i = 1}^4 a_i v_i,
$$
what is the state after applying ##M^n##?
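Spelled out, the step this hint is driving at (a sketch): since ##M v_i = \lambda_i v_i##, applying ##M## repeatedly gives
$$M^n v = \sum_{i = 1}^4 a_i \lambda_i^n v_i,$$
so every term with ##|\lambda_i| < 1## decays geometrically, and the distance from the limit shrinks at the rate set by the largest such ##|\lambda_i|## (the subdominant eigenvalue). For the periodic ##M## in this problem both ##\lambda = 1## and ##\lambda = -1## survive, which is again why only the even powers, or the ensemble average, settle down.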
 
  • #42
Aleoa said:
Are there efficient ways to do these eigenvalue calculations (due to the particular form of ##M##), or do I have to use the classical approach to eigenvalue calculation?
The limiting state of ##M^{n}## is the eigenvector associated with ##\lambda = 1##; however, I don't know how to calculate a rate of convergence...

Unfortunately I'm not able to clearly understand the meaning of what you said here :(

If I were doing the question I would avoid eigenvalues altogether. If you want to know how fast ##M^{2n}## approaches its limit, by far the easiest way is to first compute the limit; you do not need to use fancy eigenvalue methods to do that; in fact, since
$$M^{2n} = \pmatrix{A^n & 0 \\ 0 & B^n}$$
you can analyze the two ##A^n## and ##B^n## problems separately. You get ##\lim_n A^n = \Pi_A## and ##\lim_n B^n = \Pi_B## by simple, high-school algebraic methods, taking about 3-4 lines of work each in your two cases. (However, you need to either read the textbook or go on-line for how to do it; I will leave that up to you.) After you have ##\Pi_A## you can let ##D_a(1) = A - \Pi_A##, ##D_a(2)= A D_a(1),## ##\ldots,## ##D_a(n) = A D_a(n-1) = A^{n-1} D_a(1)##. These are the matrices of deviations ##A^n-\Pi_A##, whose elements go to zero geometrically fast as ##n## grows. You can just go ahead and compute them numerically (for example, using a spreadsheet), and look for the first ##n## that makes all the elements less than 0.0001 in magnitude.

Naturally, you could do all that using the eigenvalues of ##A## (which might be just as easy in this little ##2 \times 2## case), but if you had a Markov chain with several thousand states (as in some real-world applications), just getting the eigenvalues themselves would be almost out of the question. However, getting the limiting matrix ##\Pi## would just involve solving linear equations---admittedly with thousands of equations in thousands of unknowns---and would usually be doable without excessive pain. In practice, that is how people actually do it; nobody ever uses eigenvalues for that task. (However, to be fair, the rows of the limiting matrix ##\Pi## are, in fact, the normalized eigenvector for eigenvalue ##\lambda = 1##.)

Computing something like the norm of ##D(n) = M^n - \Pi## via a recursive numerical scheme would also be doable (but slow).
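The deviation recursion above can be sketched directly. The matrix ##A## and its limit ##\Pi_A## below are hypothetical stand-ins (in the column-stochastic convention used earlier in the thread), not the exercise's actual ##QR##:

```python
import numpy as np

# Hypothetical column-stochastic stand-in for A = QR and its limit.
A = np.array([[0.9, 0.4],
              [0.1, 0.6]])
Pi_A = np.array([[0.8, 0.8],
                 [0.2, 0.2]])

# D(1) = A - Pi_A, D(n) = A D(n-1) = A^n - Pi_A;
# entries shrink geometrically, so the loop terminates quickly.
D = A - Pi_A
n = 1
while np.max(np.abs(D)) >= 1e-4:
    D = A @ D
    n += 1
print(n)  # first n with every entry of A^n - Pi_A below 1e-4 in magnitude
```

Note that each step is one matrix multiply and one comparison; no eigenvalues are computed anywhere.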
 
  • #43
Ray Vickson said:
If I were doing the question I would avoid eigenvalues altogether. If you want to know how fast ##M^{2n}## approaches its limit, by far the easiest way is to first compute the limit; you do not need to use fancy eigenvalue methods to do that; in fact, since
$$M^{2n} = \pmatrix{A^n & 0 \\ 0 & B^n}$$
you can analyze the two ##A^n## and ##B^n## problems separately. You get ##\lim_n A^n = \Pi_A## and ##\lim_n B^n = \Pi_B## by simple, high-school algebraic methods, taking about 3-4 lines of work each in your two cases. (However, you need to either read the textbook or go on-line for how to do it; I will leave that up to you.) After you have ##\Pi_A## you can let ##D_a(1) = A - \Pi_A##, ##D_a(2)= A D_a(1),## ##\ldots,## ##D_a(n) = A D_a(n-1) = A^{n-1} D_a(1)##. These are the matrices of deviations ##A^n-\Pi_A##, whose elements go to zero geometrically fast as ##n## grows. You can just go ahead and compute them numerically (for example, using a spreadsheet), and look for the first ##n## that makes all the elements less than 0.0001 in magnitude.

Naturally, you could do all that using the eigenvalues of ##A## (which might be just as easy in this little ##2 \times 2## case), but if you had a Markov chain with several thousand states (as in some real-world applications), just getting the eigenvalues themselves would be almost out of the question. However, getting the limiting matrix ##\Pi## would just involve solving linear equations---admittedly with thousands of equations in thousands of unknowns---and would usually be doable without excessive pain. In practice, that is how people actually do it; nobody ever uses eigenvalues for that task. (However, to be fair, the rows of the limiting matrix ##\Pi## are, in fact, the normalized eigenvector for eigenvalue ##\lambda = 1##.)

Computing something like the norm of ##D(n) = M^n - \Pi## via a recursive numerical scheme would also be doable (but slow).

I honestly find this a much more cumbersome way than solving a few easy eigenvalue and eigenvector equations. I also fail to see why one would use a spreadsheet to do matrix algebra when there are far better suited options like Python. I also doubt anyone would solve for the eigenvalues and eigenvectors of a Markov chain system with thousands of states by hand, but you do not need to, because Python or Matlab will do that for you as well.
 
  • #44
Orodruin said:
I honestly find this a much more cumbersome way than solving a few easy eigenvalue and eigenvector equations. I also fail to see why one would use a spreadsheet to do matrix algebra when there are far better suited options like Python. I also doubt anyone would solve for the eigenvalues and eigenvectors of a Markov chain system with thousands of states by hand, but you do not need to, because Python or Matlab will do that for you as well.

Of course; I personally would never use a spreadsheet for such a task---I always use Maple---but I have no idea what software is available to the OP. Almost everybody has access to EXCEL or an open-source equivalent.

I DID say that in the OP's case using eigenvalues may be about as good as the method I mentioned; however, I also referred to much larger problems, where eigenvalues start to become nearly inaccessible, no matter what packages you use.

Basically, I am trying to steer the OP away from excessive reliance on eigenvalues, since for Markov chains there are so many attractive alternatives---but he needs to do some reading (not much, but some at least). I am trying to get him to actually do that reading, but will gladly give up if he/she has no interest.
 
  • #45
Orodruin said:
I honestly find this a much more cumbersome way than solving a few easy eigenvalue and eigenvector equations. I also fail to see why one would use a spreadsheet to do matrix algebra when there are far better suited options like Python. I also doubt anyone would solve for the eigenvalues and eigenvectors of a Markov chain system with thousands of states by hand, but you do not need to, because Python or Matlab will do that for you as well.

Furthermore there is a ton of research on estimating / bounding the subdominant eigenvalue. There's also a lot of computational routines that are geared toward computing just a few biggest and smallest magnitude eigenvalues (for this reason).

For really massive problems, e.g. classical PageRank, the problem turns on its head. In those cases people want an estimate on the subdominant eigenvalue and then try to iterate to the steady state -- the matrix is too large and massively sparse so direct solution is infeasible. The subdominant eigenvalue gives an estimate on how many iterations will be needed.
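A minimal power-iteration sketch of that idea. The 5-state random matrix here is just a small dense stand-in for a huge sparse one, and the state count and tolerance are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Small dense column-stochastic stand-in for a huge sparse matrix.
M = rng.random((5, 5))
M /= M.sum(axis=0)  # normalize each column to sum to 1

# Iterate v <- M v from the uniform distribution; the number of
# iterations needed is governed by the subdominant eigenvalue.
v = np.full(5, 1 / 5)
for _ in range(500):
    v_new = M @ v
    if np.max(np.abs(v_new - v)) < 1e-12:
        v = v_new
        break
    v = v_new

print(v)  # approximate stationary distribution: M v = v
```

Only matrix-vector products are needed, which is why this scales to problems where forming ##M^n## or an eigendecomposition is infeasible.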

- - - -
May I make a suggestion to @Aleoa? Pick one of the methods mentioned here, and solve the problem in full. Then Ray, Orodruin and I can show you alternative approaches that are easier, or more satisfying, etc. But I think OP needs to put in the work first.

- - - -
For a complementary take, you may like chapter 11 of:

https://www.math.dartmouth.edu/~prob/prob/prob.pdf

They just about refuse to use eigenvalues in there, for whatever reason. The flip side is that they introduce a coupling argument and some other goodies. There are some really good problems in there (though a few have wording that I found a little hard to interpret... still good overall, though).
 