Most of the explanations I've read of Propp-Wilson sampling describe the method in terms of "sampling from the past," in order to make sense of the fact that you get your random numbers before attempting to obtain a sample from the target distribution, and don't re-sample them until you succeed (hence the way the Markov chain is grown from to ).

I find it more intuitive to think of this in terms of "sampling from deterministic universes." The basic hand-waving intuition is that instead of a non-deterministic system, you are sampling from a probabilistic ensemble of fully deterministic systems, so you first a) select the deterministic system (that is, the infinite series of random numbers you'll use to walk through the Markov chain), and b) run it until its story doesn't depend on the choice of original state. The result of this procedure will be a sample from the exact equilibrium distribution (because you have sampled from or "burned off" the two sources of distortion from this equilibrium distribution, the non-deterministic nature of the system and the dependence on the original state).

As I said, I think this is mathematically equivalent to Propp-Wilson sampling, although you'd have to tweak a bit the proofs. But it feels more understandable to me than other arguments I've read, so at least it has that benefit (assuming, of course, it's true).

PS: On the other hand "sampling from the past" is too fascinating a turn of phrase not to use, so I can see the temptation.