Recognitions:

Metropolis Algorithm - Why does it work?

 Quote by Pi-Bond So is it the accept reject factor I mentioned earlier than ensures that there is a stationary distribution for our chain?
Yes, it assures there is a particlar stationary distribution.

X(A) Pr(A->B) = X(B) Pr(B->A)

Suppose X(A) < X(B)

The Metropolis algorithm defines Pr(A->B) = 1 and Pr(B->A) = X(A)/X(B)

Those probabilities change above equation into an identity:

X(A) (1) = X(B) ( X(A)/X(B))
 Ok, I understand this better now. However I'm having trouble with the concept of "state" here. For a conventional (discrete time) Markov Chain, we have a row vector which tells us the probability of being in a state. So the size of the row vector is the number of states. I'm wondering what is the analogous vector for the Markov Chain we construct in the Metropolis Algorithm. Is it an infinite row vector, as there are infinitely many points in the parameter space? Then the transition matrix would also be infinite then.

 Quote by Pi-Bond Ok, I understand this better now. However I'm having trouble with the concept of "state" here. For a conventional (discrete time) Markov Chain, we have a row vector which tells us the probability of being in a state. So the size of the row vector is the number of states. I'm wondering what is the analogous vector for the Markov Chain we construct in the Metropolis Algorithm. Is it an infinite row vector, as there are infinitely many points in the parameter space? Then the transition matrix would also be infinite then.
In a way, yes, though infinite vectors can be tricky to compute with and there's a few technical considerations in going to infinite number of states (especially if there is a continuum of lilypads) :)

If you have the time, some courses on measure theory and stochastic processes would certainly be worth your while, but the gist of it is that the same arguments for Metropolis-Hastings apply as long as the jump "density" is defined and non-zero between every pair of distinct lilypads.

HTH
 Alright, thanks for the information. Final query (I think)- should a Markov Chain be aperiodic if its stationary distribution is to be explored by the Metropolis Algorithm? (and why?)
 I recommend the third edition of the book Understanding Probability (Cambridge University Press, 2012) by Henk Tijms. The author provides a very clear explanation of Markov chain Monte Carlo simulation. The explanation of the Metropolis algorithm allows one to understand how to construct an algorithm of their own. Nice illustrations are given.