Finding the Steady State Distribution of a Markov Chain

  • #1
HmBe

Homework Statement

[Problem statement given in the attached image: snakes.png]


Homework Equations



The Attempt at a Solution


I'm fine on part a); for part b) I got 0.162, and for part c) I got 1/9, 2/9 and 0.481, although I didn't feel too sure about those.

I'm stuck on part d).

The eigenvalues are

1
0.8734
-0.0799 + 0.1101i
-0.0799 - 0.1101i
0.5
0.2865

And the corresponding eigenvectors (one per eigenvalue, in the same order, written as column vectors) are

v1 = [ 0, 0, 0, 0, 0, 1 ]^T
v2 = [ -0.1269, -0.0832, -0.0991, -0.1380, -0.4276, 0.8748 ]^T
v3 = [ -0.6328, 0.3823 + 0.3589i, 0.4630 - 0.2784i, -0.0547 - 0.0335i, -0.1049 - 0.1061i, -0.0529 + 0.0591i ]^T
v4 = [ -0.6328, 0.3823 - 0.3589i, 0.4630 + 0.2784i, -0.0547 + 0.0335i, -0.1049 + 0.1061i, -0.0529 - 0.0591i ]^T
v5 = [ 0, 0, 0, 0.7071, -0.7071, 0 ]^T
v6 = [ -0.1637, 0.3793, 0.6000, -0.6365, -0.2447, 0.0657 ]^T


But I don't really know what to do from here. I know 0 < μ < 1, and I'm guessing there might be a way to write x(0) as a linear combination of the eigenvectors, but I don't see how this will help.
 
  • #2
I haven't read your whole post, but I think I get the gist of what you're asking from skimming it.

Let B be the matrix whose columns are the eigenvectors.

x(0) = B y

Solve this system for y. This is how to write x(0) in terms of the eigenvectors. Suppose for the moment that x(0) WAS an eigenvector, but not the one corresponding to eigenvalue 1 (this isn't true but just imagine for a second).

x(1) = A x(0) = λ x(0)
x(2) = A^2 x(0) = λ^2 x(0)
.
.
.
x(n) = A^n x(0) = λ^n x(0)

Okay, so I think you can pretty clearly see how to answer part (d) in that simple case. Now how does the situation change if x(0) is a linear combination of eigenvectors? As I said above, y is how you represent x(0) in that way. What happens as n → ∞?
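For what it's worth, here is a rough NumPy sketch of that recipe on a made-up 3-state chain (the matrix, the starting vector, and all the numbers below are purely illustrative, not the ones from the attached problem): decompose x(0) in the eigenvector basis, then watch every term with |λ| < 1 die off as n grows.

```python
import numpy as np

# Hypothetical column-stochastic transition matrix (each column sums to 1),
# chosen only for illustration -- not the matrix from the problem.
A = np.array([[0.5, 0.2, 0.1],
              [0.3, 0.6, 0.2],
              [0.2, 0.2, 0.7]])

eigvals, B = np.linalg.eig(A)      # columns of B are the eigenvectors
x0 = np.array([1.0, 0.0, 0.0])     # some starting distribution

y = np.linalg.solve(B, x0)         # coefficients of x(0) in the eigenbasis: x0 = B @ y

# x(n) = sum_i lambda_i^n * y_i * v_i ; every term with |lambda_i| < 1 decays to 0.
for n in (1, 5, 50):
    xn = (B * eigvals**n) @ y      # same vector as A^n @ x0
    print(n, np.real_if_close(xn))
```

By n = 50 only the eigenvalue-1 term is left, i.e. the printout is essentially the steady-state distribution of this toy chain.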
 
  • #3
Following your instructions I got

x(t) = A^t * x(0) = A^t * ( V1*y1 + V2*y2 + ... + V6*y6 ) = (λ1^t)*V1*y1 + (λ2^t)*V2*y2 + ... + (λ6^t)*V6*y6

And as t grows large we can ignore the terms whose eigenvalues have modulus less than 1, since λ^t → 0 for those.

The eigenvalues are

1
0.8734
-0.0799 + 0.1101i
-0.0799 - 0.1101i
0.5
0.2865

So the only eigenvalue whose powers don't tend to zero is 1, but the term containing that eigenvalue is constant over time, so I looked at the next-largest eigenvalue, 0.8734. For suitably large (but not too large) t,

x(t) ≈ (λ1^t)*V1*y1 + (λ2^t)*V2*y2

x(t) ≈ V1 + (0.8734^t)*V2*y2

(using λ1 = 1 and y1 = 1, since the entries of x(t) must sum to 1 and V1 already does).

So 0.8734 is the answer?
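If you want a quick numerical sanity check of that conclusion, here is a sketch along the same lines (again with a made-up 3-state matrix, not the one from the problem): the distance between x(t) and the steady state should shrink by roughly a factor of the second-largest eigenvalue each step.

```python
import numpy as np

# Illustrative column-stochastic matrix only -- not the matrix from the problem.
A = np.array([[0.5, 0.2, 0.1],
              [0.3, 0.6, 0.2],
              [0.2, 0.2, 0.7]])

eigvals, V = np.linalg.eig(A)
order = np.argsort(-np.abs(eigvals))
lam2 = abs(eigvals[order[1]])          # second-largest eigenvalue (in modulus)

pi = np.real(V[:, order[0]])           # steady state: eigenvector for eigenvalue 1,
pi = pi / pi.sum()                     # rescaled so its entries sum to 1

x = np.array([1.0, 0.0, 0.0])
prev = np.linalg.norm(x - pi)
for t in range(1, 15):
    x = A @ x
    err = np.linalg.norm(x - pi)
    print(t, err / prev)               # this ratio settles near lam2
    prev = err
print("second-largest eigenvalue:", lam2)
```

For this toy matrix the ratio settles at 0.5, its second eigenvalue; by the same reasoning the rate for your chain would be governed by the 0.8734 you found.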
 
  • #4
HmBe said:

I'm stuck on part d). [...] I know 0 < μ < 1, and I'm guessing there might be a way to write x(0) as a linear combination of the eigenvectors, but don't see how this will help.

You may find it convenient to know the following fact about functions of a matrix: if an n x n matrix A has n distinct eigenvalues r_1, r_2, ..., r_n, then there exist n x n matrices F_1, F_2, ..., F_n such that f(A) = F_1 * f(r_1) + F_2 * f(r_2) + ... + F_n * f(r_n) for any analytic function f(x) = c0 + c1*x + c2*x^2 + c3*x^3 + ... whose circle of convergence contains all the r_i. We define f(A) as f(A) = c0*I + c1*A + c2*A^2 + ..., where I is the identity matrix. Note that the matrices F_i are independent of which function f you choose.

We may apply this to f(x) = 1 (so f(A) = I), f(x) = x (so f(A) = A), f(x) = x^2 (so f(A) = A^2), etc. Going up to A^(n-1) we get linear equations to determine all the F_i. In practice, leave the I, A, A^2,... unevaluated (i.e., let them be symbolic constants i, a1, a2, a3,...), then solve the equations i = sum(f_i), a1 = sum(r_i*f_i), a2 = sum(r_i^2*f_i),... for symbolic f_1, f_2,..., f_n, then determine the matrices F_i by substituting i=I,a1=A,a2=A^2,... into the formula for f_i.

You can also get the F_i as u_i * v_i, where u_i is the right eigenvector of r_i (a column vector) and v_i is the left eigenvector of r_i (a row vector), normalized so that v_i * u_i = 1. This has the form column x row, so it is a matrix.
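In case a concrete version helps, here is a hedged SciPy sketch of that construction (with a small illustrative matrix, not the one from the problem): build each F_i as the right eigenvector times the left eigenvector, rescaled so the scalar product of the two is 1, and check that the F_i reproduce I and A.

```python
import numpy as np
from scipy.linalg import eig

# Small illustrative matrix only.
A = np.array([[0.5, 0.2, 0.1],
              [0.3, 0.6, 0.2],
              [0.2, 0.2, 0.7]])

w, vl, vr = eig(A, left=True, right=True)   # w[i]: eigenvalue, vl[:, i]: left, vr[:, i]: right

# F_i = (u_i v_i) / (v_i u_i), with u_i a column vector and v_i a row vector,
# so that f(A) = F_1*f(r_1) + ... + F_n*f(r_n) for analytic f.
F = []
for i in range(len(w)):
    u = vr[:, i].reshape(-1, 1)              # right eigenvector as a column
    v = vl[:, i].conj().reshape(1, -1)       # left eigenvector as a row
    F.append((u @ v) / (v @ u))

# Sanity checks: f(x) = 1 should give I, and f(x) = x should give A.
print(np.allclose(sum(F), np.eye(3)))
print(np.allclose(sum(wi * Fi for wi, Fi in zip(w, F)), A))
```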

For the case where two of the eigenvalues are complex conjugates of each other, one can obtain purely real results by going to the polar form u + i*v = r*cos(w) + i*r*sin(w), and writing, for example, C*r^n*cos(n*w) + D*r^n*sin(n*w) in the expression for A^n. Aside from that not much changes.
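Spelling that out a bit (under the assumption that A is real, so the F's attached to the conjugate pair are complex conjugates of each other, say F and conj(F)): if the pair of eigenvalues is r*e^(±i*w), then

F*(r*e^(i*w))^n + conj(F)*(r*e^(-i*w))^n = 2*Re(F)*r^n*cos(n*w) - 2*Im(F)*r^n*sin(n*w),

so the C and D above are just C = 2*Re(F) and D = -2*Im(F).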

If you carry out this procedure using a computer package, you should keep a large number of digits until writing down the final results; that avoids serious roundoff issues when solving for the F_i, etc.

On the other hand, in this question all you need is the second-largest eigenvalue; you don't need to know the F_i to get that. However, if you did want the constant C in the question, you would need knowledge of the F_i associated with the two largest eigenvalues.

RGV
 
  • #5
HmBe said:
[...] So 0.8734 is the answer?

I haven't checked your calculations or anything, but you're saying the right things so at least the approach should be correct.
 

1. What is a Markov chain?

A Markov chain is a mathematical model that describes a sequence of events where the probability of transitioning from one state to another depends only on the current state and not on any previous states.
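As a tiny illustration (a hypothetical two-state chain with arbitrarily chosen probabilities), the next state is sampled using only the row of the transition matrix for the current state:

```python
import numpy as np

# Row i holds the transition probabilities out of state i (each row sums to 1).
P = np.array([[0.9, 0.1],
              [0.4, 0.6]])

rng = np.random.default_rng(0)
state = 0
for _ in range(5):
    # The next state depends only on the current state, not on the path so far.
    state = rng.choice(2, p=P[state])
    print(state)
```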

2. How are Markov chains used in probability?

Markov chains are used in probability to model and analyze various systems with random behavior. They can help predict the likelihood of certain events occurring and understand the overall behavior of a system.

3. What are the limitations of Markov chains?

Markov chains assume that the future state of a system is only dependent on the current state, which may not always be true in real-world situations. They also require a finite number of states and may not be able to accurately model complex systems.

4. Can Markov chains be used for continuous variables?

Yes, Markov chains can be extended to continuous variables, either by discretizing the data into a finite set of states or by using models with a continuous state space or continuous time.

5. Are there any real-life applications of Markov chains?

Yes, Markov chains have various real-life applications, such as predicting stock market trends, weather forecasting, analyzing customer behavior, and speech recognition.
