Finding the Steady State Distribution of a Markov Chain

  • Thread starter Thread starter HmBe
  • Start date Start date
  • Tags Tags
    Probability
Click For Summary

Homework Help Overview

The discussion revolves around finding the steady state distribution of a Markov chain, specifically focusing on the eigenvalues and eigenvectors associated with the transition matrix. Participants are exploring the implications of these eigenvalues in relation to the system's behavior over time.

Discussion Character

  • Exploratory, Mathematical reasoning, Assumption checking

Approaches and Questions Raised

  • Participants discuss the representation of the initial state as a linear combination of eigenvectors and the implications of the eigenvalues on the long-term behavior of the system. There is uncertainty about how to proceed from the eigenvalues and eigenvectors provided, particularly in relation to the second-largest eigenvalue.

Discussion Status

Some participants have offered guidance on how to express the initial state in terms of the eigenvectors and have discussed the significance of the eigenvalues as time progresses. There is an ongoing exploration of the implications of these eigenvalues, particularly the largest and second-largest, without reaching a definitive conclusion.

Contextual Notes

Participants have noted the complexity of the eigenvalues, including complex conjugates, and the potential need for additional calculations or considerations regarding the constants associated with the eigenvalues. There is also mention of the importance of precision in calculations to avoid roundoff errors.

HmBe
Messages
45
Reaction score
0

Homework Statement



snakes.png


Homework Equations





The Attempt at a Solution



I'm fine on part a), and for part b) I got 0.162 and c) I got 1/9, 2/9 and 0.481, although didn't feel too sure about those.

I'm stuck on part d).

The eigenvalues are

1
0.8734
-0.0799 + 0.1101i
-0.0799 - 0.1101i
0.5
0.2865

And eigenvectors are

[0] ... [-0.1269] ... [-0.6328] ..... [-0.6328]
[0] ... [-0.0832] ... [ 0.3823 + 0.3589i] ... [ 0.3823 - 0.3589i]
[0] ... [-0.0991] ... [ 0.4630 - 0.2784i] ... [ 0.4630 + 0.2784i]
[0] ... [-0.1380] ... [-0.0547 - 0.0335i] ... [-0.0547 + 0.0335i]
[0] ... [-0.4276] ... [-0.1049 - 0.1061i] ... [-0.1049 + 0.1061i]
[1] ... [ 0.8748] ... [-0.0529 + 0.0591i] ... [-0.0529 - 0.0591i]

[ 0] .... [-0.1637]
[ 0] .... [ 0.3793]
[ 0] .... [ 0.6000]
[ 0.7071] ... [-0.6365]
[-0.7071] ... [-0.2447]
[ 0] .... [ 0.0657]


But I don't really know what to do from here. I know 0 < μ < 1, and I'm guessing there might be a way to write x(0) in a linear combination of the eigenvectors, but don't see how this will help. Sorry for the rubbish formatting.
 
Physics news on Phys.org
I haven't read your whole post, but I think I get the gist of what you're asking from skimming it

Let B, bet the matrix composed of the eigenvectors.

x(0) = B y

Solve this system for y. This is how to write x(0) in terms of the eigenvectors. Suppose for the moment that x(0) WAS an eigenvector, but not the one corresponding to eigenvalue 1 (this isn't true but just imagine for a second).

x(1) = A x(0) = λ x(0)
x(2) = AAx(0) = λ2 x(0)
.
.
.
x(n) = λnx(0)

Okay so I think you can pretty clearly see how to answer the question (d) in that simple case. Now how does the situation change if x(0) is a linear combination of eigenvectors? As I said above, y is how you represent x(0) in that way. What happens as n → ∞
 
Following your instructions I got

x(t)= A^t * x(0) = A^t * ( V1*y1 + V2*y2 + ... + V6*y6 ) = (λ1^t)*V1*y1 + (λ2^t)*V2*y2 + ... + (λ6^t)*V6*y6

And as t grows large we can ignore the smaller eigenvalues as they tend to zero.

The eigenvalues are

1
0.8734
-0.0799 + 0.1101i
-0.0799 - 0.1101i
0.5
0.2865

So the only eigenvalue that doesn't tend to zero is 1, but the term containing this eigenvalue is constant over time, so I said to look at the next largest eigenvalue, 0.8734. And that for suitably large (but not too large) t,

x(t) = (λ1^t)*V1*y1 + (λ2^t)*V2*y2

x(t) = V1 + (0.8734^t)*V2*y2

So 0.8734 is the answer?
 
HmBe said:

Homework Statement



snakes.png


Homework Equations


The Attempt at a Solution



I'm fine on part a), and for part b) I got 0.162 and c) I got 1/9, 2/9 and 0.481, although didn't feel too sure about those.

I'm stuck on part d).

The eigenvalues are

1
0.8734
-0.0799 + 0.1101i
-0.0799 - 0.1101i
0.5
0.2865

And eigenvectors are

[0] ... [-0.1269] ... [-0.6328] ..... [-0.6328]
[0] ... [-0.0832] ... [ 0.3823 + 0.3589i] ... [ 0.3823 - 0.3589i]
[0] ... [-0.0991] ... [ 0.4630 - 0.2784i] ... [ 0.4630 + 0.2784i]
[0] ... [-0.1380] ... [-0.0547 - 0.0335i] ... [-0.0547 + 0.0335i]
[0] ... [-0.4276] ... [-0.1049 - 0.1061i] ... [-0.1049 + 0.1061i]
[1] ... [ 0.8748] ... [-0.0529 + 0.0591i] ... [-0.0529 - 0.0591i]

[ 0] .... [-0.1637]
[ 0] .... [ 0.3793]
[ 0] .... [ 0.6000]
[ 0.7071] ... [-0.6365]
[-0.7071] ... [-0.2447]
[ 0] .... [ 0.0657]But I don't really know what to do from here. I know 0 < μ < 1, and I'm guessing there might be a way to write x(0) in a linear combination of the eigenvectors, but don't see how this will help. Sorry for the rubbish formatting.

You may find it convenient to know the following fact about functions of a matrix: if an nxn matrix A has n distinct eigenvalues r_1, r_2, ..., r_n, then there exist nxn matrices F_1, F_2,..., F_n such that f(A) = F_1 * f(r1) + F_2 * f(r_2) + ... + F_n * f(r_n) for any analytic function f(x) = c0 + c1*x + c2*x^2 + c3*x^3 + ... whose circle of convergence contains all the r_i. We define f(A) as f(A) = c0*I + c1*A + c2*A^2 + ..., where I = identity matrix. Note that the matrices F_i are independent of what function f you choose.

We may apply this to f(x) = 1 (so f(A) = I), f(x) = x (so f(A) = A), f(x) = x^2 (so f(A) = A^2), etc. Going up to A^(n-1) we get linear equations to determine all the F_i. In practice, leave the I, A, A^2,... unevaluated (i.e., let them be symbolic constants i, a1, a2, a3,...), then solve the equations i = sum(f_i), a1 = sum(r_i*f_i), a2 = sum(r_i^2*f_i),... for symbolic f_1, f_2,..., f_n, then determine the matrices F_i by substituting i=I,a1=A,a2=A^2,... into the formula for f_i.

You can also get the F_i as u_i * v_i, where u_i is the normalized right eigenvector of r_i (column vector) and v_i is the normalized left eigenvector of r_i (row vector). This has the form column x row , so is a matrix.

For the case where two of the eigenvalues are complex conjugates of each other, one can obtain purely real results by going to the polar form u + i*v = r*cos(w) + i*r*sin(w), and writing, for example, C*r^n*cos(n*w) + D*r^n*sin(n*w) in the expression for A^n. Aside from that not much changes.

If you carry out this procedure using a computer package, you should keep a large number of digits until writing down the final results; that avoids serious roundoff issues when solving for the F_i, etc.

On the other hand, in this question all you need is the second-largest eigenvalue; you don't need to know the F_i to get that. However, if you did want the constant C in the question, you would need knowledge of the F_i associated with the two largest eigenvalues.

RGV
 
Last edited:
HmBe said:
Following your instructions I got

x(t)= A^t * x(0) = A^t * ( V1*y1 + V2*y2 + ... + V6*y6 ) = (λ1^t)*V1*y1 + (λ2^t)*V2*y2 + ... + (λ6^t)*V6*y6

And as t grows large we can ignore the smaller eigenvalues as they tend to zero.

The eigenvalues are

1
0.8734
-0.0799 + 0.1101i
-0.0799 - 0.1101i
0.5
0.2865

So the only eigenvalue that doesn't tend to zero is 1, but the term containing this eigenvalue is constant over time, so I said to look at the next largest eigenvalue, 0.8734. And that for suitably large (but not too large) t,

x(t) = (λ1^t)*V1*y1 + (λ2^t)*V2*y2

x(t) = V1 + (0.8734^t)*V2*y2

So 0.8734 is the answer?

I haven't checked your calculations or anything, but you're saying the right things so at least the approach should be correct.