I'm doing a project based around the Metropolis algorithm to explore parameter spaces.The computational aspect I can understand and implement. I understand that a Markov Chain is constructed in the parameter space. However, I don't understand the theory behind why it works so well in exploring it. Being a Physics student, most of the statistical literature I have read has not helped me. Can anyone explain it to me, or point to basic references? Thanks.