In the review by Shakhnovich that Ygggdrasil quoted, Shakhnovich mentions at the end of the paper:
In a certain sense, such atomic-level simulations will represent a “final solution” of the problem of the protein folding mechanism. However, protein folding has been an active field for more than 30 years, and probably all conceivable mechanisms have been proposed in the literature either as pure speculations or as insights from coarse-grained models. In this sense, “the final solution” of the problem of the protein folding mechanism will most likely look like a multiple-choice problem rather than an “essay”-like solution presenting an entirely novel mechanism that nobody thought of in the past. Most likely, the “final solution” will combine elements of many mechanisms that researchers observed in simplified models in more pure forms, so that in a sense the best “multiple choice” answer will sound like “all of the above”. Nevertheless, we are bound to witness decisive progress in studies of protein folding in the coming years.
I take the opinion that the above is a pretty reasonable answer: while you might be able to prepare a set of all possible mechanistic elements responsible for protein folding, the mix of elements, and each one's contribution to the process, is going to vary for a particular protein or protein family.
Even for something that is modeled as a two-state (or nearly so) equilibrium process, the kinetics don't have to be that simple. So having multiple contributions crop up as the protein folds makes a certain intuitive sense in that regard: it just doesn't have to be one factor. There is also the possibility that the internal environment of the cell is a factor, from viscosity effects to its heterogeneity. I've also come across mentions that dewetting can, in principle, be a major contribution to facilitating folding processes (don't have the citation on me, but I think it was in PNAS last year).
So, for example, suppose someone told me that protein A had a hydrophobic collapse at one end of the polypeptide chain that served as a nucleation site for the rest of the protein, that secondary structure elements formed spontaneously elsewhere and collided, with dewetting then occurring where they met, and that a disulfide bridge then formed and restricted a mobile linker region, which caused the collided secondary structure to wrap around the nucleation site... it would sound perfectly reasonable. Maybe not as straightforward as some may like, but so it goes.
I actually thought that sentiment of Anfinsen's was both funny and true. There's a vague sense that we can usually trust macromolecular structures as the "native" structure (after all, it's our quantitatively and physics-inclined colleagues in crystallography and NMR doing that work). Then you hear about NMR spectroscopists playing fast and loose with the sample conditions so as to minimize linewidth (without much concern about their reasonableness as a faux-biological environment) and how crystal structures are frequently done under cryogenic conditions...well, anyway.
(FYI, I'm an NMR type who has fiddled around with crystallography. Some honest self-reflection never hurt anyone, I figure...)