The way I think of it is in terms of causal connections between regions of spacetime.
In the figure, the triangle whose apex is labeled "Alice" is the collection of all events in the causal past of Alice's experimental result (if we assume relativity, then it's the backward light cone of the event at the apex of the triangle). The triangle labeled "Bob" is the collection of all events in the causal past of Bob's result (the backward light cone of Bob's result). The various regions of spacetime are numbered:
- Region 1 is the region immediately prior to Alice's result.
- Region 2 is the region immediately prior to Bob's result.
- Region 3 includes events possibly relevant to Alice's experimental setup that are prior to her measurement, but recent enough that they could have no causal influence on Bob's result. This region includes Alice's choice of a detector setting.
- Region 4 includes events possibly relevant to Bob's experimental setup that are prior to his measurement, but recent enough that they could have no causal influence on Alice's result. This region includes Bob's choice of a detector setting.
- Region 5 includes events that are possibly relevant to both Bob's result and Alice's result.
Bell's assumption is basically that events in region 1 can be influenced by events in regions 3 and 5, but not by events in regions 2 or 4. Likewise, events in region 2 can be influenced by events in regions 4 and 5, but not by events in regions 1 or 3. That's the locality assumption.
Then there is a second assumption, and I'm not sure exactly what the technical name is, but it's something like "completeness of dependencies". Let F_1, F_2, F_3, F_4, F_5 be facts about the 5 regions. We are interested in conditional probabilities:
P(F_1 \wedge F_2\ |\ F_3 \wedge F_4 \wedge F_5)
the probability of both F_1 and F_2 being true, given that F_3, F_4, F_5 are all true.
If F_5 were the complete description of everything there is to know about the common influences of regions 1 and 2, then we assume that probabilities would factor as follows:
P(F_1 \wedge F_2\ |\ F_3 \wedge F_4 \wedge F_5)
= P(F_1\ |\ F_3 \wedge F_5) \cdot P(F_2\ |\ F_4 \wedge F_5)
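To see where the extra assumption enters, it helps to compare with what the product rule alone gives (this is just standard probability, nothing specific to Bell):
P(F_1 \wedge F_2\ |\ F_3 \wedge F_4 \wedge F_5)
= P(F_1\ |\ F_2 \wedge F_3 \wedge F_4 \wedge F_5) \cdot P(F_2\ |\ F_3 \wedge F_4 \wedge F_5)
That much always holds. The factored form then amounts, roughly, to two replacements: dropping F_2 and F_4 from the first factor, and dropping F_3 from the second. Dropping F_4 and F_3 is the locality assumption about the distant settings. Dropping F_2 is the claim that, once F_5 is fully specified, Bob's result tells you nothing more about Alice's.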
I don't think that such a factoring is a law of probability. It's an additional assumption, it seems to me. It certainly holds in any deterministic model, and it holds in the simple sort of local hidden variables models that one is likely to come up with. But whether it holds in every possible local hidden variables model, I'm not sure. I suppose you could just use it as the definition of a local hidden variables model.
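Here is a minimal sketch of the deterministic case, just to make the role of "completeness" concrete. Everything in it is a made-up toy, not a model anyone has proposed: the hidden variable lam (standing in for F_5) is two fair bits, the settings a and b (standing in for F_3 and F_4) pick which bit each side reads, and the names alice_outcome, bob_outcome and factorizes are hypothetical. The outcomes x and y play the roles of F_1 and F_2.

```python
from fractions import Fraction
from itertools import product

# Assumption: toy hidden variable, uniform over four values (two fair bits).
LAMBDAS = [0, 1, 2, 3]
SETTINGS = [0, 1]
OUTCOMES = [1, -1]


def alice_outcome(a, lam):
    """Alice's result depends only on her own setting a and on lam."""
    return 1 if (lam >> a) & 1 else -1


def bob_outcome(b, lam):
    """Bob's result depends only on his own setting b and on lam."""
    return -1 if (lam >> b) & 1 else 1


def prob(event, lam=None):
    """Exact probability of event(lam).

    If lam is given, condition on it (F_5 fully specified).
    If lam is None, average over the uniform distribution (F_5 left out).
    """
    lams = LAMBDAS if lam is None else [lam]
    hits = sum(1 for value in lams if event(value))
    return Fraction(hits, len(lams))


def factorizes(lam=None):
    """Check P(x, y | a, b, lam) == P(x | a, lam) * P(y | b, lam) everywhere."""
    for x, y, a, b in product(OUTCOMES, OUTCOMES, SETTINGS, SETTINGS):
        joint = prob(
            lambda v: alice_outcome(a, v) == x and bob_outcome(b, v) == y, lam
        )
        p_alice = prob(lambda v: alice_outcome(a, v) == x, lam)
        p_bob = prob(lambda v: bob_outcome(b, v) == y, lam)
        if joint != p_alice * p_bob:
            return False
    return True


# Conditioning on the complete hidden variable: every probability is 0 or 1.
factors_given_lam = all(factorizes(lam) for lam in LAMBDAS)
# Leaving the hidden variable out of the conditioning: correlations remain.
factors_without_lam = factorizes(None)

print(factors_given_lam, factors_without_lam)
```

In this toy, the factoring holds once lam is part of what you condition on, because every probability is then 0 or 1. It fails when lam is averaged out: with a = b = 0 the two results are perfectly anticorrelated, so the joint probability of both being +1 is 0 while the product of the marginals is 1/4. That is the sense in which the factoring depends on F_5 being complete.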