An Alternative form of Bell's Inequality

jambaugh · Mar 27, 2021

Bell's inequality in its original form is:
[tex]|cor(a,b) - cor(a,c)| \le 1 - cor(b,c)[/tex]
where ##a,b## and ##c## are random variables with values ##\pm 1##, and the correlation is then simply the expectation value of their products, ##cor(a,b)=E[ab]## or as usually expressed ##\langle ab\rangle##.

I found it instructive to recast this in terms of Bernoulli {0,1} valued random variables. I will use upper case for these: ##A= (a+1)/2, a=2A-1## and likewise with ##b## and ##c##. ##A=1## if ##a=1## else ##A=0##.
So, a bit of algebra:
[tex]cor(b,c) = E[bc] = E[4BC-2B-2C+1] \to[/tex] [tex]1-cor(b,c) = 2E[B-2BC+C] = 2E[B^2 - 2BC + C^2] = 2E[(B-C)^2][/tex] (using ##B^2 = B## etc.) Note, however, that for Boolean (0 or 1) values ##(B-C)^2 = B\veebar C##, their exclusive or. Also note that the expected value of a Boolean variable is its probability, so ##1-cor(b,c) = 2P(B\veebar C)##, and likewise for the others.
Similarly:
[tex]cor(a,b) - cor(a,c) = 2P(A\veebar C) - 2P(A\veebar B)[/tex] So in these terms Bell's original inequality becomes:
[tex]|P(A\veebar C) - P(A\veebar B)| \le P(B\veebar C)[/tex] When we expand the absolute value we get:
[tex]-P(B\veebar C) \le P(A\veebar C) - P(A\veebar B) \le P(B\veebar C)[/tex] Rearrange the terms for each inequality separately and you get two versions of the general inequality:
[tex]P(X\veebar Y) + P(Y\veebar Z) \ge P(X\veebar Z)[/tex] It's a "triangle inequality" and we can use ##d(X,Y) = P(X\veebar Y)## as a "metric" between events. The other Bell-like inequalities are, I believe, just extensions of the same.

This "triangle inequality" follows from just the additivity and positivity of a probability distribution over a sample space (an "underlying set of possible realities"), and it assumes that events (including specific measurements) are subsets of that "set of possible realities". This form of Bell's inequality is, at least in my mind, much easier to understand.
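As a quick sanity check of the claims above, here is a minimal Python sketch. It assumes nothing beyond a joint distribution over the eight possible values of ##(A,B,C)##; the helper names (`random_distribution`, `xor_distance`, `correlation`) are mine, not standard. It draws random distributions and checks both the identity ##1-cor(b,c) = 2P(B\veebar C)## and the triangle inequality.

```python
import itertools
import random

def random_distribution(n_atoms, rng):
    """Return n_atoms nonnegative weights that sum to 1."""
    weights = [rng.random() for _ in range(n_atoms)]
    total = sum(weights)
    return [w / total for w in weights]

def xor_distance(p, atoms, i, j):
    """P(X_i xor X_j) for the joint distribution p over the atoms."""
    return sum(pk for pk, atom in zip(p, atoms) if atom[i] != atom[j])

def correlation(p, atoms, i, j):
    """E[x_i x_j] for the +/-1 variables x = 2X - 1."""
    return sum(pk * (2 * atom[i] - 1) * (2 * atom[j] - 1)
               for pk, atom in zip(p, atoms))

rng = random.Random(0)
atoms = list(itertools.product([0, 1], repeat=3))  # all (A, B, C) assignments

worst_slack = float("inf")
max_identity_error = 0.0
for _ in range(10_000):
    p = random_distribution(len(atoms), rng)
    d_ab = xor_distance(p, atoms, 0, 1)
    d_bc = xor_distance(p, atoms, 1, 2)
    d_ac = xor_distance(p, atoms, 0, 2)
    # Triangle inequality: d(A,B) + d(B,C) >= d(A,C)
    worst_slack = min(worst_slack, d_ab + d_bc - d_ac)
    # Identity: 1 - cor(b,c) = 2 P(B xor C)
    err = abs((1 - correlation(p, atoms, 1, 2)) - 2 * d_bc)
    max_identity_error = max(max_identity_error, err)

print(worst_slack >= -1e-12, max_identity_error < 1e-12)
```

The slack ##d(A,B) + d(B,C) - d(A,C)## is twice the probability that ##B## differs from both ##A## and ##C##, which is why it can never go negative when a joint distribution exists.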

Demystifier · Mar 31, 2021

The original Bell inequality is violated by quantum mechanics. However, this does not automatically imply that your final inequality is also violated by quantum mechanics. You have to verify this by an additional calculation. If you can indeed show this, then your result is very interesting.

wle · Mar 31, 2021

Demystifier said:

The original Bell inequality is violated by quantum mechanics. However, this does not automatically imply that your final inequality is also violated by quantum mechanics. You have to verify this by an additional calculation. If you can indeed show this, then your result is very interesting.

Well his inequality is just Bell's one expressed in a different way. So it is violated in the same circumstances that Bell's one is.

Incidentally, Bell's original inequality is just a special case of a more general one called CHSH that was derived a few years later. You can derive Bell's inequality from CHSH by assuming that one pair of measurements always gives anticorrelated outcomes. This also means that Bell's original inequality only applies under a rather artificial condition (you are never going to have perfect anticorrelations in a real Bell experiment). So Bell's inequality is fine to show that quantum physics (a mathematically-defined model) makes predictions incompatible with any local hidden-variable theory (another mathematically-defined model) but it can't by itself be applied to directly say anything about the correlations observed in a real experiment. Bell actually thought of this and there's a part of his original paper that allows for being ##\epsilon##-close to having perfect anticorrelations, but CHSH does this in a cleaner way.

A common expression of CHSH for ##\pm 1##-valued measurements is $$\langle A_{1} B_{1} \rangle + \langle A_{1} B_{2} \rangle + \langle A_{2} B_{1} \rangle - \langle A_{2} B_{2} \rangle \leq 2 \,.$$ If you rewrite this in terms of the probabilities of getting different outcomes (substitute ##\langle A_{x} B_{y} \rangle = 1 - 2 P(A_{x} \neq B_{y})##) you get $$P(A_{1} \neq B_{1}) + P(A_{1} \neq B_{2}) + P(A_{2} \neq B_{1}) \geq P(A_{2} \neq B_{2}) \,,$$ which looks kind of like a triangle (or in this case, quadrilateral) inequality. However, not all Bell inequalities are as simple as CHSH. Another Bell inequality, applicable to a Bell experiment with three measurements done on each side, is: $$\begin{eqnarray*}
\langle A_{1} \rangle + \langle A_{2} \rangle + \langle B_{1} \rangle + \langle B_{2} \rangle && \\
-\> \langle A_{1} B_{1} \rangle - \langle A_{1} B_{2} \rangle - \langle A_{2} B_{1} \rangle - \langle A_{2} B_{2} \rangle && \\
+\> \langle A_{3} B_{1} \rangle - \langle A_{3} B_{2} \rangle
+ \langle A_{1} B_{3} \rangle - \langle A_{2} B_{3} \rangle &\leq& 4 \,.
\end{eqnarray*}$$ This Bell inequality has been known for a few decades^*. It is also known to be tight, i.e., it is known that this Bell inequality can't be derived by taking the sum of simpler Bell inequalities or other trivial inequalities like ##\langle A_{1} \rangle \leq 1##.
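Both expressions are linear in the probabilities, so their largest value over local hidden-variable models is reached at a deterministic strategy, where every ##A_x## and ##B_y## has a fixed value ##\pm 1##. The following Python sketch (function names are my own) brute-forces all such strategies to confirm the bounds 2 and 4 quoted above. It says nothing about tightness, only about the local bound.

```python
import itertools

def chsh(a, b):
    """CHSH expression for deterministic outcomes a = (A1, A2), b = (B1, B2)."""
    return a[0] * b[0] + a[0] * b[1] + a[1] * b[0] - a[1] * b[1]

def froissart(a, b):
    """Three-setting expression above for a = (A1, A2, A3), b = (B1, B2, B3)."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a1 + a2 + b1 + b2
            - a1 * b1 - a1 * b2 - a2 * b1 - a2 * b2
            + a3 * b1 - a3 * b2 + a1 * b3 - a2 * b3)

signs = [-1, 1]

# Maximise each expression over every deterministic assignment of outcomes.
chsh_max = max(chsh(a, b)
               for a in itertools.product(signs, repeat=2)
               for b in itertools.product(signs, repeat=2))
froissart_max = max(froissart(a, b)
                    for a in itertools.product(signs, repeat=3)
                    for b in itertools.product(signs, repeat=3))

print(chsh_max, froissart_max)
```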

In general, Bell inequalities characterise the sets of joint probability distributions that admit a local hidden-variable model, i.e., that can be expressed in the form $$P(ab|xy) = \sum_{\lambda} p_{\lambda} P_{\mathrm{A}}(a|x; \lambda) P_{\mathrm{B}}(b | y; \lambda) \,.$$ The set of joint probability distributions that can be expressed in such a way is typically called the "local set" or "classical set" or "local polytope". A couple of things that are known about it:

Deciding whether a given joint distribution ##P(ab|xy)## is in the local set (membership testing) in general is known to be an NP-complete problem.
As one of the names implies, the local set is known to be a polytope, i.e., it can be seen as the convex hull of a finite number of "vertex" probability distributions which consist of the local deterministic ones (##P(ab|xy) = P_{\mathrm{A}}(a|x) P_{\mathrm{B}}(b|y)## with ##P_{\mathrm{A}}(a|x), P_{\mathrm{B}}(b|y) \in \{0, 1\}##).

The first point means that it is very unlikely that anyone will be able to actually derive all possible valid Bell inequalities or that there is a simple intuitive expression for all of them. The second point means that it is nevertheless in principle possible to derive all the tight Bell inequalities for a given Bell setting in a systematic way (any software that can find the facets of a polytope given its vertices can do this), but in practice it is only computationally feasible for small Bell settings involving very limited numbers of different measurements and measurement outcomes. Both CHSH and the Froissart inequality above correspond to facets of the local polytope.
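To make the decomposition concrete, here is a small Python sketch for the CHSH setting. It enumerates the local deterministic vertices, mixes them with arbitrary weights standing in for ##p_{\lambda}##, and checks the probability form of CHSH on the result. The random weights and the names `behaviour` and `prob_differ` are illustrative assumptions. It is not a facet-enumeration tool.

```python
import itertools
import random

outcomes = [-1, 1]
settings = [0, 1]  # two measurement choices per party (CHSH setting)

# A deterministic strategy fixes Alice's outcome for each x and Bob's for each y.
vertices = [(fa, fb)
            for fa in itertools.product(outcomes, repeat=2)
            for fb in itertools.product(outcomes, repeat=2)]

def behaviour(weights):
    """P(ab|xy) = sum_lambda p_lambda [a = fa(x)] [b = fb(y)]."""
    p = {}
    for a, b, x, y in itertools.product(outcomes, outcomes, settings, settings):
        p[a, b, x, y] = sum(w for w, (fa, fb) in zip(weights, vertices)
                            if fa[x] == a and fb[y] == b)
    return p

def prob_differ(p, x, y):
    """P(A_x != B_y)."""
    return p[1, -1, x, y] + p[-1, 1, x, y]

rng = random.Random(1)
raw = [rng.random() for _ in vertices]
weights = [r / sum(raw) for r in raw]  # hypothetical p_lambda, sums to 1
p = behaviour(weights)

lhs = prob_differ(p, 0, 0) + prob_differ(p, 0, 1) + prob_differ(p, 1, 0)
rhs = prob_differ(p, 1, 1)
print(len(vertices), lhs >= rhs)
```

For this setting there are only 16 vertices. With ##m## settings per party and binary outcomes the count is ##2^{2m}##, which gives a feel for why facet enumeration quickly becomes impractical.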

There was a review article published several years ago covering this sort of thing: https://arxiv.org/abs/1303.2849.

^*It was first derived by Marcel Froissart [Nuov Cim B 64, 241-251 (1981)] and later rederived by others in the early 2000s.

Demystifier · Mar 31, 2021

wle said:

Well his inequality is just Bell's one expressed in a different way. So it is violated in the same circumstances that Bell's one is.

You are right. His derivation can be inverted, i.e. one can start from his final inequality and derive Bell's. Hence the two inequalities are equivalent. (In dealing with inequalities one has to be careful because e.g. ##x<1## implies ##x<2##, but the converse is not true. However, in his derivation there are no such irreversible steps.)

wle · Apr 1, 2021

Well in general there are multiple ways of expressing any given Bell inequality due to equality constraints satisfied by the probabilities.

For instance, in the setting that CHSH is defined in (2 parties each do 2 different possible measurements each with 2 possible outcomes), in a hypothetical Bell experiment the correlations are fully described by sixteen probabilities ##P(ab|xy)##, ##a, b \in \{\pm 1\}##, ##x, y \in \{1, 2\}##, but only eight of them are really independent since the others can be inferred from normalisation (##\sum_{ab} P(ab|xy) = 1##) and the no-signalling (e.g., ##\sum_{b} P(ab|x1) = \sum_{b} P(ab|x2)##) conditions.

You can "standardise" the representation of a Bell inequality by choosing in advance a suitable projection of the probabilities to work with. For binary-outcome measurements a popular one is the expectation values of ##\pm 1##-valued observables associated to the measurements: $$\begin{eqnarray*}
\langle A_{x} \rangle &=& P(++|xy) + P(+-|xy) - P(-+|xy) - P(--|xy) \,, \\
\langle B_{y} \rangle &=& P(++|xy) - P(+-|xy) + P(-+|xy) - P(--|xy) \,, \\
\langle A_{x} B_{y} \rangle &=& P(++|xy) - P(+-|xy) - P(-+|xy) + P(--|xy) \,.
\end{eqnarray*}$$ For the CHSH setting this gives eight independent numbers that can fully describe any nonsignalling correlations. Another popular one, which also works for measurements with more than two outcomes, is to discard one of the outcomes and describe everything in terms of the marginal and joint probabilities of the non-discarded outcomes (e.g., describe everything in terms of ##P_{\mathrm{A}}(+|x)##, ##P_{\mathrm{B}}(+|y)##, and ##P(++|xy)##, which also gives eight independent numbers in the CHSH setting).
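As an illustration of the first projection, the sketch below computes ##\langle A_{x} \rangle##, ##\langle B_{y} \rangle## and ##\langle A_{x} B_{y} \rangle## from a table of four probabilities. For the input I assume the standard spin-singlet prediction ##P(ab|xy) = (1 - ab\cos\theta)/4##, with ##\theta## the angle between the two measurement axes. That formula is not taken from this thread, and the function names are mine.

```python
import math

def singlet_probability(a, b, theta):
    """Assumed P(ab|xy) for a spin singlet with angle theta between the axes."""
    return (1 - a * b * math.cos(theta)) / 4

def expectations(p):
    """Return (<A>, <B>, <AB>) from a dict p[(a, b)] with a, b in {+1, -1}."""
    mean_a = sum(a * prob for (a, b), prob in p.items())
    mean_b = sum(b * prob for (a, b), prob in p.items())
    mean_ab = sum(a * b * prob for (a, b), prob in p.items())
    return mean_a, mean_b, mean_ab

theta = math.radians(120)
p = {(a, b): singlet_probability(a, b, theta) for a in (1, -1) for b in (1, -1)}
mean_a, mean_b, mean_ab = expectations(p)
print(mean_a, mean_b, mean_ab)
```

Under that assumption both marginals vanish for every ##\theta##, so they cannot depend on the other party's setting (no-signalling), and ##\langle A_{x} B_{y} \rangle = -\cos\theta##.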

stevendaryl · Apr 1, 2021

Another form of the inequality for probabilities (I don't know the name for it, maybe "Boole's inequality"?) is this:

##P(A \wedge B) + P(\neg B \wedge \neg C) \geq P(A \wedge \neg C)##

This is easily proved by expanding:

##P(A \wedge B) = P(A \wedge B \wedge C) + P(A \wedge B \wedge \neg C)##
##P(\neg B \wedge \neg C) = P(A \wedge \neg B \wedge \neg C) + P(\neg A \wedge \neg B \wedge \neg C)##
##P(A \wedge \neg C) = P(A \wedge B \wedge \neg C) + P(A \wedge \neg B \wedge \neg C)##

So we have:

##P(A \wedge B) + P(\neg B \wedge \neg C) - P(A \wedge \neg C) = P(A \wedge B \wedge C) + P(\neg A \wedge \neg B \wedge \neg C)##

So if probabilities are all nonnegative, then

##P(A \wedge B) + P(\neg B \wedge \neg C) - P(A \wedge \neg C) \geq 0##

So the above inequality has to hold.
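The expansion above can also be checked numerically. This Python sketch (a sanity check under the assumption that a joint distribution over the eight combinations of ##A, B, C## exists; the helper `prob` is my own) draws random distributions and confirms that the left side minus the right side always equals ##P(A \wedge B \wedge C) + P(\neg A \wedge \neg B \wedge \neg C)##.

```python
import itertools
import random

atoms = list(itertools.product([True, False], repeat=3))  # (A, B, C) truth values

def prob(p, event):
    """Probability of the event, given as a predicate on (A, B, C)."""
    return sum(pk for pk, atom in zip(p, atoms) if event(*atom))

rng = random.Random(2)
max_gap = 0.0
min_slack = float("inf")
for _ in range(10_000):
    raw = [rng.random() for _ in atoms]
    total = sum(raw)
    p = [r / total for r in raw]
    # P(A and B) + P(not B and not C) - P(A and not C)
    slack = (prob(p, lambda a, b, c: a and b)
             + prob(p, lambda a, b, c: not b and not c)
             - prob(p, lambda a, b, c: a and not c))
    # P(A and B and C) + P(not A and not B and not C)
    remainder = (prob(p, lambda a, b, c: a and b and c)
                 + prob(p, lambda a, b, c: not a and not b and not c))
    max_gap = max(max_gap, abs(slack - remainder))
    min_slack = min(min_slack, slack)

print(max_gap < 1e-12, min_slack > -1e-12)
```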

To illustrate how QM seems to violate this inequality (of course, it doesn't really, since it's a mathematical theorem...), let's take an example from EPR with anti-correlated spin-1/2 particles.

We have a source of anti-correlated twin pairs. Out of each pair, Alice measures the spin of one of them relative to one of three axes, labeled 0, 120, or 240. The angle between any two axes is 120 degrees. Bob measures the spin of the other relative to the same three axes.

The predictions of QM are:

No matter what axes Alice and Bob choose, the probability for each of them is 50% of measuring spin-up, and 50% of measuring spin-down.
If they choose the same axis, they will always get opposite results.
If they choose different axes, they will get the same result 75% of the time (##\sin^2(120^\circ/2)##).

A naive realist model is that there is a hidden variable ##\lambda## associated with each twin pair, and that there are 8 different possible values for ##\lambda##:

##\lambda_{UUU}##: If the twin pair has this value, then Alice will get spin-up for any axis, and Bob will get spin-down.
##\lambda_{UUD}##: If the twin pair has this value, then Alice will get spin-up for axis 0 or 120, but spin-down for axis 240. Bob will get the opposite: spin-down for 0 or 120, and spin-up for 240.
etc.

So we assume that when a twin pair is created, one of those 8 types is created, with a certain probability, and that afterwards, Alice's and Bob's results are completely determined by the type of twin-pair and the axes they choose.

To agree with the predictions of quantum mechanics:

##P(U??) = P(UUU) + P(UUD) + P(UDU) + P(UDD) = 0.5## (where ? means the value can be anything).
##P(U?D) = P(UUD) + P(UDD) = 0.375##
##P(UU?) = P(UUU) + P(UUD) = 0.125##
##P(?DD) = P(UDD) + P(DDD) = 0.125##

Equations 2, 3 and 4 may not be obvious. If Alice chooses to measure at 0 degrees and Bob chooses to measure at 240 degrees, then they will get the same result 75% of the time and different results 25% of the time. Since Bob's results are always different from Alice's on the same axis, that means that Alice's result for 0 and her result for 240 are different 75% of the time and the same 25% of the time. This means that 75% of the time, the value of the hidden variable is U?D or D?U, so the probability of U?D is half that, 37.5%. The probabilities for UU? and ?DD are similarly half of 25%, or 12.5%.

Now, letting A be the event U?? and letting B be the event ?U? and letting C be the event ??U, we have:
##A \wedge B## is the event ##UU?## and ##\neg B \wedge \neg C## is the event ##?DD## and ##A \wedge \neg C## is the event ##U?D##. So Boole's inequality would tell us:

##P(UU?) + P(?DD) \geq P(U?D)##

But plugging in the numbers above gives us:
##0.125 + 0.125 \lt 0.375##

So QM (or rather, this hidden-variable explanation for the QM probabilities) seems to violate Boole's inequality.
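For anyone who wants to reproduce the numbers, here is a short Python sketch of the arithmetic. It assumes only the QM prediction quoted earlier (same result with probability ##\sin^2(\theta/2)## for axes ##\theta## apart) and the up/down symmetry used to halve each class. The variable names are mine.

```python
import math

theta = math.radians(120)             # angle between any two distinct axes
p_same = math.sin(theta / 2) ** 2     # Alice and Bob get the same result

# Bob's result is always opposite to Alice's on the same axis, so Alice's own
# hidden values for two different axes agree with probability 1 - p_same.
p_alice_agree = 1 - p_same
p_alice_differ = p_same

# By the U <-> D symmetry, each pattern takes half of its class.
p_UU_ = p_alice_agree / 2    # P(UU?)
p__DD = p_alice_agree / 2    # P(?DD)
p_U_D = p_alice_differ / 2   # P(U?D)

# Boole's inequality would need P(UU?) + P(?DD) >= P(U?D).
violated = p_UU_ + p__DD < p_U_D
print(p_UU_, p__DD, p_U_D, violated)
```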

If we assume for simplification that there is complete symmetry among the three axes, and there is a symmetry between ##U## and ##D##, then there will only be two different probabilities:

##x = P(UUU) = P(DDD)##
##y = P(UUD) = P(UDU) = P(DUU) = P(DDU) = P(DUD) = P(UDD)##

The numbers above would tell us that:

##P(U?D) = 2y = 0.375##
##P(UU?) = x+y = 0.125##

Which would lead to the conclusion that:

##y = 0.1875 = 3/16##
##x = -0.0625 = -1/16##

So unless there is a way to make sense of negative probabilities, this hidden-variable model is ruled out.
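The two-unknown system can be solved exactly with rational arithmetic. This sketch takes the constraints from the list of QM requirements, namely ##P(UU?) = P(UUU) + P(UUD) = 0.125## and ##P(U?D) = P(UUD) + P(UDD) = 0.375##, and imposes the full symmetry ##x##, ##y## defined above.

```python
from fractions import Fraction

p_UU_ = Fraction(1, 8)   # P(UU?) = P(UUU) + P(UUD) = x + y
p_U_D = Fraction(3, 8)   # P(U?D) = P(UUD) + P(UDD) = 2y

y = p_U_D / 2
x = p_UU_ - y
total = 2 * x + 6 * y    # all eight hidden-variable types together

print(x, y, total)       # prints: -1/16 3/16 1
```

The weights still sum to 1, so the only thing that fails is nonnegativity of ##x##.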
