How can numpy.dot() be used in Python to solve Markov chain Monte Carlo methods?


Discussion Overview

The discussion revolves around the use of the numpy.dot() function in Python for implementing Markov chain Monte Carlo methods, particularly in the context of a provided code snippet. Participants explore the mechanics of matrix operations, the structure of the transfer matrix, and the iterative process of updating the position vector.

Discussion Character

  • Homework-related
  • Technical explanation
  • Conceptual clarification
  • Debate/contested

Main Points Raised

  • One participant expresses confusion about the numpy.dot() function and its application in the provided code, particularly regarding the transfer matrix and the position vector updates.
  • Another participant explains that numpy.zeros creates numpy arrays and clarifies the dimensions of the transfer and position arrays, noting that numpy.dot() computes the matrix product.
  • A different participant suggests using the @ operator for matrix multiplication instead of numpy.dot(), citing it as a more modern and cleaner approach in Python 3.4 and above.
  • Some participants discuss the iterative nature of the code, indicating that it applies the transfer matrix multiple times to approach a steady state, referencing the "power method" for convergence in Markov chains.
  • There is mention of the computational efficiency of different methods for matrix operations, with concerns raised about the performance of repeated matrix multiplications versus diagonalization for large powers.
  • One participant notes the importance of the structure of the transition matrix and its implications for convergence rates, mentioning the relevance of eigenvalues in this context.
  • Another participant highlights the potential for direct solutions in certain cases of stochastic matrices, contrasting it with the challenges posed by large, sparse matrices.

Areas of Agreement / Disagreement

Participants generally agree on the mechanics of matrix operations in Python and the iterative approach to finding steady states in Markov chains. However, there are differing opinions on the best practices for implementing these operations and the implications of matrix structure on computational efficiency.

Contextual Notes

Some participants mention limitations related to the computational complexity of matrix operations, particularly in the context of large matrices and the potential for diagonalization to improve efficiency. The discussion also touches on the nuances of convergence rates in Markov chains, which may depend on the specific properties of the transition matrix.

jbowers9

Homework Statement



I'm taking an online course in Stat. Mech. and we're working on Markov chain Monte Carlo methods. One of the programs supplied uses the following:
Mentor note: Added code tags
Python:
import numpy
neighbor =  [[1, 3, 0, 0], [2, 4, 0, 1], [2, 5, 1, 2],
             [4, 6, 3, 0], [5, 7, 3, 1], [5, 8, 4, 2],
             [7, 6, 6, 3], [8, 7, 6, 4], [8, 8, 7, 5]]
transfer = numpy.zeros((9, 9))
for k in range(9):
    for neigh in range(4):
        transfer[neighbor[k][neigh], k] += 0.25
position = numpy.zeros(9)
position[8] = 1.0
for t in range(10):
    print(t, '  ', ["%0.5f" % abs(i - 1.0 / 9.0) for i in position])
    position = numpy.dot(transfer, position)

I was able to deconstruct most of the code using print statements, but I'm lost on the final statement, position = numpy.dot(transfer, position). I know how to take a dot product, but I'm clueless as to how the values in position are being computed over successive iterations of t. I'm also a little fuzzy about what the transfer matrix is. Mathematically I think it's the probability matrix for the random walk, but when I print it, it looks to me like an array of strings, so I don't see how it can be used in a dot product. The [ ] brackets in the print statement aren't obvious either, since i isn't declared or initialized anywhere before its use. I'm new to Python, if you haven't guessed, though I have done some C++ programming. Any help would be greatly appreciated.

Homework Equations

##\pi_a\,\big(p(a\to b)+p(a\to c)\big) = \pi_b\, p(b\to a) + \pi_c\, p(c\to a)##

This is the global balance condition.

The Attempt at a Solution


Inserted print statements in the code.
 
One useful thing to help you understand is:
print(type(variable))
This will print out what type of object a variable is. It might be a string, an integer, a numpy array, and so forth. numpy.zeros creates a numpy array, by default an array of floats. So position is a length-9 1-D array of floats. You could also print position.shape, which will tell you the shape of a numpy array. Similarly, transfer is a 9x9 array of floats, in other words a matrix. So numpy.dot just takes the matrix product of the 9x9 array with the length-9 array, and it returns another length-9 array.
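To make the point above concrete, here is a minimal sketch (using the arrays from the original snippet) that prints the types and shapes being described:

```python
import numpy

# Rebuild the two arrays from the thread's snippet so we can inspect them.
transfer = numpy.zeros((9, 9))   # 9x9 matrix of floats
position = numpy.zeros(9)        # length-9 1-D array of floats
position[8] = 1.0

print(type(transfer))                        # <class 'numpy.ndarray'>
print(transfer.shape)                        # (9, 9)
print(position.shape)                        # (9,)
print(numpy.dot(transfer, position).shape)   # (9,)
```

Note that position has shape (9,), a 1-D array rather than a true 9x1 column, but numpy.dot treats it as a column vector here and returns another shape-(9,) array.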

The python structure [<something with an i> for i in <some list>] is just a shorthand way to loop on a variable i. It is called a "list comprehension". Below is a brief tutorial on how it works. It is very handy but confusing if you are not familiar with it.

http://www.pythonforbeginners.com/basics/list-comprehensions-in-python
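As a quick illustration of the shorthand described above, the comprehension from the OP's print statement is equivalent to an explicit loop (a plain list stands in for the numpy array here):

```python
# Stand-in for the position array at t = 0: all mass on site 8.
position = [0.0] * 8 + [1.0]

# The list comprehension from the thread's print statement...
formatted = ["%0.5f" % abs(i - 1.0 / 9.0) for i in position]

# ...is shorthand for this loop; i is bound by the comprehension itself,
# which is why it needs no prior declaration.
formatted_loop = []
for i in position:
    formatted_loop.append("%0.5f" % abs(i - 1.0 / 9.0))

print(formatted == formatted_loop)  # True
```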
 
I recognize that this is someone else's code though a couple things jump out at me.

First thing I'd point out:

If you're using Python ##\geq 3.4##, in general for matrix vector operations (i.e. ##\text{m x n }## arrays and ##\text{n}## dimensional arrays), you don't need to use np.dot(), which always felt clunky to me.

Python:
numpy.dot(transfer, position)

### equivalently, but better:

transfer @ position

If the @ sign is at the very start of a line, with nothing before it on that row, then it is a decorator. Otherwise it's used for matrix-vector multiplication these days. Note: there are differences between the @ operator and .dot() for higher-dimensional tensor operations (outside the scope here).

- - - -
Second: it bucks other conventions. In particular it is customary to use lower case for vectors and upper case for matrices, so in mathese it would be something like ##\mathbf p## for position and ##\mathbf T## for the transition (transfer) matrix.

- - - -
Now for your main question, as for what the for loop is actually doing...

Python:
for t in range(10):
    position = numpy.dot(transfer, position)

but better is

Python:
for t in range(10):
    position = Transfer @ position
which reads as
##\mathbf p^{(1)} = \mathbf T \mathbf p^{(0)}##
##\mathbf p^{(2)} = \mathbf T \mathbf p^{(1)}##
##\mathbf p^{(3)} = \mathbf T \mathbf p^{(2)}##
##\mathbf p^{(4)} = \mathbf T \mathbf p^{(3)}##
##\mathbf p^{(5)} = \mathbf T \mathbf p^{(4)}##
##\mathbf p^{(6)} = \mathbf T \mathbf p^{(5)}##
##\mathbf p^{(7)} = \mathbf T \mathbf p^{(6)}##
##\mathbf p^{(8)} = \mathbf T \mathbf p^{(7)}##
##\mathbf p^{(9)} = \mathbf T \mathbf p^{(8)}##
##\mathbf p^{(10)} = \mathbf T \mathbf p^{(9)}##

Mathematically you can collapse this into one statement:

##\mathbf p^{(10)} = \big(\mathbf T \big(\mathbf T...\big(\mathbf T\big(\mathbf T \mathbf p^{(0)}\big)\big)...\big)\big) = \mathbf T^{10} \mathbf p^{(0)}##

i.e. they are making use of associativity of matrix vector multiplication, and applying your matrix ##\mathbf T## or "Transfer", 10 times in total.
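A small sketch checking this equivalence numerically, using the thread's own arrays (the loop of ten matrix-vector products lands on the same vector as a single application of ##\mathbf T^{10}##):

```python
import numpy

# Build the transfer matrix exactly as in the OP's snippet.
neighbor = [[1, 3, 0, 0], [2, 4, 0, 1], [2, 5, 1, 2],
            [4, 6, 3, 0], [5, 7, 3, 1], [5, 8, 4, 2],
            [7, 6, 6, 3], [8, 7, 6, 4], [8, 8, 7, 5]]
transfer = numpy.zeros((9, 9))
for k in range(9):
    for neigh in range(4):
        transfer[neighbor[k][neigh], k] += 0.25

p0 = numpy.zeros(9)
p0[8] = 1.0

# Ten successive matrix-vector products...
p = p0.copy()
for t in range(10):
    p = transfer @ p

# ...agree with one application of T**10, by associativity.
p_power = numpy.linalg.matrix_power(transfer, 10) @ p0
print(numpy.allclose(p, p_power))  # True
```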

another approach to the same result (i.e. the right hand side of the above) is to use something like

Python:
result = numpy.linalg.matrix_power(Transfer, 10) @ position

except that is wasteful, because it performs matrix-matrix multiplications, each costing more than the ##O\big(N^2\big)## of a single matrix-vector product, to reach a result that 10 matrix-vector operations deliver anyway. (Doing so could also destroy special structure that is common in practice, e.g. sparsity.)
 
Thank you StoneTemple. I will use your suggested coding for dot products and the matrix/array nomenclature. As for the code print(t, '  ', ["%0.5f" % abs(i - 1.0 / 9.0) for i in position]): the results of the position = Transfer @ position operation are being used to output a vector (not saved, mind you) after t iterations, and it approaches an equilibrium(?) condition describing where the pebble will be after a sufficiently long time. And thank you "bicycle repairman," aka phyzguy. I'm using PythonAnywhere to run the code :wink:. I finally figured out that the code is print(transfer.shape), not print(transfer.shape()). Needs further investigation...
 
jbowers9 said:
As per the code, print(t,' ',["%0.5f" % abs(i- 1.0 / 9.0) for i in position]), the results of the position = Transfer @ position operation are being used to output a matrix, not saved mind you, after t iterations that approaches an equilibrium(?) condition of where the pebble will be after a sufficiently long time.

yes... they are using what is known as the "power method": iterating toward the steady-state vector. There's a ton of literature out there, particularly on finite-state Markov chains, which discusses the nuances behind this. Iterative methods, particularly for large sparse matrices, are quite useful in practice.
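A minimal power-method sketch on the thread's matrix, iterating until the distribution stops changing (the stopping tolerance and iteration cap are arbitrary choices for illustration):

```python
import numpy

# Build the transfer matrix exactly as in the OP's snippet.
neighbor = [[1, 3, 0, 0], [2, 4, 0, 1], [2, 5, 1, 2],
            [4, 6, 3, 0], [5, 7, 3, 1], [5, 8, 4, 2],
            [7, 6, 6, 3], [8, 7, 6, 4], [8, 8, 7, 5]]
transfer = numpy.zeros((9, 9))
for k in range(9):
    for neigh in range(4):
        transfer[neighbor[k][neigh], k] += 0.25

# Power method: start with all mass on site 8 and iterate until convergence.
p = numpy.zeros(9)
p[8] = 1.0
for _ in range(1000):
    p_next = transfer @ p
    if numpy.linalg.norm(p_next - p, 1) < 1e-12:
        break
    p = p_next

# This transfer matrix is symmetric (hence doubly stochastic), so the
# steady state is the uniform distribution: 1/9 on each site.
print(p_next)
```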
 
StoneTemplePython said:
except that is wasteful because it does 10 matrix matrix multiplications which are ##\gt O\big(N^2\big)## required for 10 matrix vector operations. (Doing so could also mess up a lot of special features that are common -- e.g. sparse matrices.)
For large powers (e.g. to study the limit) it could still be interesting. You have the additional overhead of diagonalizing the matrix once, but then you can calculate arbitrary powers in ##O(n^2)##.
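A sketch of the diagonalization idea on the thread's matrix: pay the decomposition cost once, then apply any power to a vector with a few cheap steps. (The transfer matrix here is symmetric, so numpy.linalg.eigh applies; a general matrix would need eig or a Jordan-type treatment.)

```python
import numpy

# Build the transfer matrix exactly as in the OP's snippet.
neighbor = [[1, 3, 0, 0], [2, 4, 0, 1], [2, 5, 1, 2],
            [4, 6, 3, 0], [5, 7, 3, 1], [5, 8, 4, 2],
            [7, 6, 6, 3], [8, 7, 6, 4], [8, 8, 7, 5]]
transfer = numpy.zeros((9, 9))
for k in range(9):
    for neigh in range(4):
        transfer[neighbor[k][neigh], k] += 0.25

# One-time O(n^3) cost: transfer == V @ diag(w) @ V.T for symmetric matrices.
w, V = numpy.linalg.eigh(transfer)

def apply_power(k, p):
    # transfer**k @ p, evaluated right-to-left: two O(n^2) matrix-vector
    # products plus an O(n) elementwise scaling by the eigenvalue powers.
    return V @ ((w ** k) * (V.T @ p))

p0 = numpy.zeros(9)
p0[8] = 1.0
direct = numpy.linalg.matrix_power(transfer, 10) @ p0
print(numpy.allclose(apply_power(10, p0), direct))  # True
```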
 
mfb said:
For large powers (e.g. to study the limit) it could still be interesting. You have the additional overhead of diagonalizing the matrix once, but then you can calculate arbitrary powers in ##O(n^2)##.

diagonalization -- and related Jordan forms-- always lurk nearby here for sure. For an awful lot of markov chain problems (think directed graph of links on the internet), it's not advisable or even feasible because the graphs are massive and extremely sparse.

Other problems' structure and size may vary.

There's a ton of literature on estimating/bounding the second-largest-magnitude eigenvalue, as it gives an estimate of the rate of convergence. I had a bunch more typed on the matter (the results are extra nice since the OP's transition matrix is symmetric) but I didn't want to go off on too much of a tangent.

- - - -
side note:

For what it's worth: at the opposite extreme of sparsity, there is the case where a stochastic matrix has every entry ##\gt 0##.

In this case people can find the steady state solution directly from looking at the nullspace of ##\big(\mathbf T - \mathbf I\big)##, and directly apply a coupling argument to get an easy estimate on the exponential rate of convergence.
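A sketch of the direct nullspace approach. The thread's matrix has zero entries, so it isn't strictly in the all-positive case described above, but the chain is irreducible and the same construction works: solve ##\big(\mathbf T - \mathbf I\big)\boldsymbol\pi = \mathbf 0## with the normalization ##\sum_i \pi_i = 1## appended as an extra equation.

```python
import numpy

# Build the transfer matrix exactly as in the OP's snippet.
neighbor = [[1, 3, 0, 0], [2, 4, 0, 1], [2, 5, 1, 2],
            [4, 6, 3, 0], [5, 7, 3, 1], [5, 8, 4, 2],
            [7, 6, 6, 3], [8, 7, 6, 4], [8, 8, 7, 5]]
transfer = numpy.zeros((9, 9))
for k in range(9):
    for neigh in range(4):
        transfer[neighbor[k][neigh], k] += 0.25

n = 9
# Stack (T - I) with a row of ones enforcing sum(pi) = 1; the resulting
# overdetermined system has a unique exact solution, which lstsq recovers.
A = numpy.vstack([transfer - numpy.eye(n), numpy.ones((1, n))])
b = numpy.zeros(n + 1)
b[n] = 1.0
pi, *_ = numpy.linalg.lstsq(A, b, rcond=None)
print(pi)  # uniform for this symmetric walk
```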
 
