Hi there, I'm currently working on a relatively simple code to do some lattice simulations. I have access to a computing cluster at school and have been learning how to use OpenMP to parallelize my code (each node has 16 cores). I'm not planning to use MPI for now.

My main question is about thermalizing the lattice. I'm using a method due to Creutz to perform updates on links (note that I'm simulating pure SU(2), not SU(3), with no fermions or anything). My code is written in Fortran 90 and looks something like this:

    Do t=1,L
      Do z=1,L
        Do y=1,L
          Do x=1,L
            Do d=1,4   !each link at point (x,y,z,t)
              Update the link specified by (t,x,y,z,d)
            End Do
          End Do
        End Do
      End Do
    End Do

Updating a link depends on the links that make up the plaquettes containing it, so if I want to parallelize the thermalization I have to be sure that no two threads update links that share a plaquette at the same time. I thought the simplest way to do that would be to split up the update so that each thread updates a layer (say at constant t) that isn't adjacent to any other layer being updated. So I wrote my code to update all the even layers, then all the odd layers, in parallel.

The problem is that the results I'm getting now don't agree exactly with the results I get when I update sequentially. Simple observables like Wilson loops don't show any difference, but when I measure correlators of spacelike-separated timelike links, I find they disagree slightly at large distances. The parallel updating seems to yield results that are incorrect at large distances (compared with results from a paper I've been given to reproduce).

Can anyone explain why updating even layers and then odd layers would yield different results from updating lattice sites one by one? Should it make a difference? I can't see any reason it would, and it seems to me a fairly obvious, simple parallelization method.
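For concreteness, here is a minimal sketch of the even-then-odd layer sweep described above. It is an illustration under stated assumptions, not the actual simulation code: `update_link` is a hypothetical placeholder that only counts how often each link is visited, `hits` stands in for the real link array, and `L` is assumed to be even (with periodic boundaries and odd `L`, layers 1 and `L` would have the same parity and be adjacent). A real Creutz update would also need a random number generator that is safe to call from several threads, which this placeholder does not address.

```fortran
program even_odd_sweep
  implicit none
  integer, parameter :: L = 8        ! lattice extent, assumed even
  integer :: hits(L, L, L, L, 4)     ! hypothetical stand-in for the link array:
                                     ! counts how often each link is updated
  integer :: first, t, x, y, z, d

  hits = 0

  ! first = 2 sweeps the even layers, first = 1 sweeps the odd layers
  do first = 2, 1, -1
     ! Layers handled in one pass all have the same parity, so no two
     ! threads work on adjacent constant-t layers at the same time.
     !$omp parallel do private(t, x, y, z, d)
     do t = first, L, 2
        do z = 1, L
           do y = 1, L
              do x = 1, L
                 do d = 1, 4
                    call update_link(t, x, y, z, d)
                 end do
              end do
           end do
        end do
     end do
     !$omp end parallel do
     ! The implicit barrier at the end of the parallel loop guarantees that
     ! the even pass has finished before the odd pass starts.
  end do

  ! Sanity check: one full sweep should touch every link exactly once
  if (any(hits /= 1)) then
     print *, 'some link was not updated exactly once'
     stop 1
  end if
  print *, 'sweep complete: every link updated once'

contains

  ! Placeholder for the real link update (hypothetical): it only records
  ! the visit. The real routine would read the neighbouring staples.
  subroutine update_link(t, x, y, z, d)
    integer, intent(in) :: t, x, y, z, d
    hits(t, x, y, z, d) = hits(t, x, y, z, d) + 1
  end subroutine update_link

end program even_odd_sweep
```

With gfortran this should build with `gfortran -fopenmp`; without the flag the directives are treated as comments and the same sweep runs serially, which makes it easy to compare the two orderings.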
Everything I've found refers to breaking the lattice into chunks and handling the parallel updating of dependent links at the boundaries between chunks. I don't really want to deal with anything that involved right now. Any help appreciated.