Axiomatization of quantum mechanics and physics in general?

  • #201
billschnieder said:
I don't understand how you could calculate a probability that any local deterministic theory will violate Bell inequalities without clearly defining the space of "local deterministic theories". For a given theory (and Gill gives one), yes, I can imagine one easily checking the probability that it will violate the inequalities, but how do you do that for "any local deterministic theory"? It seems to me our interest is in the latter probability, not the former one.

billschnieder said:
What assumptions do we have to apply to this in order to end up with 2 on the RHS? I can think of one. We could say ##A_1 = A_2, A'_3 = A'_4, B'_2 = B'_4, B_1 = B_3##, which, translating from the numbers to spreadsheets of numbers, means the corresponding columns are identical: not just that they have the same ratios of {+1, -1}, but that the pattern of changing back and forth is identical, or can be made identical by rearranging. This is a condition that will allow us to factorize the terms from 4 disjoint sets. For that to be the case, the source will have to know what set each pair will end up in, or the distributions will have to be so uniform at all angle settings that a single set will not be able to reproduce the experimentally observed expectation value for one angle pair.

That seems notionally right, and basically corresponds to the condition that everything is independent and identically distributed, and that the measurement settings and the hidden variables are independent. Gill does discuss the possibility of weaker conditions, but this is the typical assumption. See also wle's post #197 and the paper he linked to, where apparently a bound is derived in which the i.i.d. assumption is only needed on the measurement settings, not on the N samples on which the measurements are made.

Edit: In fact, the Pironio paper http://arxiv.org/abs/0911.3427 that wle linked to cites an earlier paper by Gill http://arxiv.org/abs/quant-ph/0301059 for a bound in which the i.i.d. assumption on the N samples is removed. Interestingly, Gill does comment that the 30 standard deviations given in Weihs et al. rely on the i.i.d. assumption and on the probabilities being equal to the observed frequencies, and that the bound under weaker conditions cannot be as strong.
 
  • #202
billschnieder said:
The issue is what is the correct inequality to use for this kind of result.

You first assume that you know what kind of result they assumed it to be, and then you question whether the inequality is proper for it. I'm saying we first need to find out what kind of result they assumed it to be and then decide whether the inequality or the assumption is proper.

Should we use an inequality we derived...

First we should look at the actual CHSH inequality derivation and make sure we understand each step, especially step (6).

[Three equation images from the Wikipedia derivation linked below; contents not recoverable here]

http://en.wikipedia.org/wiki/CHSH_inequality#Derivation_of_the_CHSH_inequality

That's the true origin of how ab, ab', a'b and a'b' got together. The similarity with the AB + AB' + A'B - A'B' inequality is more of a coincidence, because they both share the same {-1,+1} limits. But they are completely different and based on very different premises; they have nothing in common. One deals with binary units and the other with a decimal range; it's like apples and elephants. AB + AB' + A'B - A'B' = -2 or +2 cannot be violated by QM or any other theory, because it is not subject to any theory; it's absolutely general and purely mathematical.

We know exactly why AB + AB' + A'B - A'B' is what it is. It's not the result of any derivation; it's a starting premise, a purely mathematical premise completely unrelated to anything but abstract numbers by themselves. But we do not know what premise the combination of ab, ab', a'b and a'b' is based on. You're asking the right question, just talking about the wrong inequality. I wish we would focus on the actual CHSH derivation and try to understand that first.
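
For reference, the purely algebraic statement about AB + AB' + A'B - A'B' can be checked by brute force. The following is a minimal sketch, assuming each of the four quantities takes only the values +1 or -1 within a single row; the function name `chsh_row` is hypothetical and used only for illustration.

```python
from itertools import product

def chsh_row(A, A_prime, B, B_prime):
    # Value of AB + AB' + A'B - A'B' for one row of +/-1 entries.
    return A * B + A * B_prime + A_prime * B - A_prime * B_prime

# Enumerate all 16 assignments of +/-1 to (A, A', B, B').
values = {chsh_row(*row) for row in product((-1, 1), repeat=4)}
print(sorted(values))  # [-2, 2]
```

This only covers the case where all four numbers sit in the same row. It says nothing by itself about averages taken over four separate sets of measurements, which is the point under discussion in the thread.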
 
  • #203
Alien8 said:
First we should look at the actual CHSH inequality derivation and make sure we understand each step, especially step (6).

[Three equation images from the Wikipedia derivation linked below; contents not recoverable here]

http://en.wikipedia.org/wiki/CHSH_inequality#Derivation_of_the_CHSH_inequality

That's the true origin of how ab, ab', a'b and a'b' got together. The similarity with the AB + AB' + A'B - A'B' inequality is more of a coincidence, because they both share the same {-1,+1} limits. But they are completely different and based on very different premises; they have nothing in common. One deals with binary units and the other with a decimal range; it's like apples and elephants. AB + AB' + A'B - A'B' = -2 or +2 cannot be violated by QM or any other theory, because it is not subject to any theory; it's absolutely general and purely mathematical.

We know exactly why AB + AB' + A'B - A'B' is what it is. It's not the result of any derivation; it's a starting premise, a purely mathematical premise completely unrelated to anything but abstract numbers by themselves. But we do not know what premise the combination of ab, ab', a'b and a'b' is based on. You're asking the right question, just talking about the wrong inequality. I wish we would focus on the actual CHSH derivation and try to understand that first.

In that step, the idea they are using is that ##A = B + C## can also be written as ##A = B + C + D - D##. Basically we can add any term that is of the form ##D - D## since ##D - D = 0##.
 
  • #204
billschnieder said:
I don't follow. S is already a result of 4 different realizations, but then you appear to be averaging more than one S.

By "realisation" I mean what you might call "measurement on one particle pair". (Though I don't like that terminology much since Bell's theorem is about locally causal theories, which may or may not be theories about particles.) Alice and Bob each pick a measurement to do. They measure their systems. They each get a result which they record. That's one realisation. This would normally be repeated thousands of times in a Bell experiment to get a good statistical estimate of the Bell correlator.

Yes, I would expect ##-4 \le S_{k} \le +4##. How you get from this to ##-2 \le \langle S_{k} \rangle \le +2## is what the problem is.

I explained that in the subsequent part of my post. The point is that the estimator is defined in such a way that its expectation value, including the average taken over the choice of measurements (which is random), is exactly what's bounded in most derivations of the CHSH inequality. If you want me to do that explicitly, then start with the last line I wrote down: $$\begin{eqnarray}
\langle S_{k} \rangle &=& \sum_{abxy} (-1)^{a + b + xy} P(ab \mid xy) \\
&=& \sum_{xy} (-1)^{xy} \sum_{ab} (-1)^{a} (-1)^{b} P(ab \mid xy) \\
&=& E(00) + E(01) + E(10) - E(11) \,,
\end{eqnarray}$$ with $$\begin{eqnarray}
E(xy) &=& \sum_{ab} (-1)^{a} (-1)^{b} P(ab \mid xy) \\
&=& P(00 \mid xy) - P(01 \mid xy) - P(10 \mid xy) + P(11 \mid xy)
\end{eqnarray}$$ defined for convenience. For a Bell-local model, the probability distribution should have the form $$P(ab \mid xy) = \int \mathrm{d}\lambda \, \rho(\lambda) \, P_{\mathrm{A}}(a \mid x; \lambda) \, P_{\mathrm{B}}(b \mid y; \lambda) \,,$$ so the quantities ##E(xy)## can be written as $$E(xy) = \int \mathrm{d}\lambda \, \rho(\lambda) \, E(xy; \lambda)$$ with $$\begin{eqnarray}
E(xy; \lambda) &=& \sum_{ab} (-1)^{a} (-1)^{b} \, P_{\mathrm{A}}(a \mid x; \lambda) \, P_{\mathrm{B}}(b \mid y; \lambda) \\
&=& \bigl( P_{\mathrm{A}}(0 \mid x; \lambda) - P_{\mathrm{A}}(1 \mid x; \lambda) \bigr) \bigl( P_{\mathrm{B}}(0 \mid y; \lambda) - P_{\mathrm{B}}(1 \mid y; \lambda) \bigr) \\
&=& E_{\mathrm{A}}(x; \lambda) \, E_{\mathrm{B}}(y; \lambda) \,.
\end{eqnarray}$$ In the last line, I set $$\begin{eqnarray}
E_{\mathrm{A}}(x; \lambda) &=& P_{\mathrm{A}}(0 \mid x; \lambda) - P_{\mathrm{A}}(1 \mid x; \lambda) \,, \\
E_{\mathrm{B}}(y; \lambda) &=& P_{\mathrm{B}}(0 \mid y; \lambda) - P_{\mathrm{B}}(1 \mid y; \lambda) \,,
\end{eqnarray}$$ which are bounded by ##-1 \leq E_{\mathrm{A}}(x; \lambda) \leq 1## and ##-1 \leq E_{\mathrm{B}}(y; \lambda) \leq 1##. For any given ##\lambda##, $$\begin{eqnarray}
E(00; \lambda) + E(01; \lambda) + E(10; \lambda) - E(11; \lambda)
&=& E_{\mathrm{A}}(0; \lambda) \bigl( E_{\mathrm{B}}(0; \lambda) + E_{\mathrm{B}}(1; \lambda) \bigr) \\
&&+\> E_{\mathrm{A}}(1; \lambda) \bigl( E_{\mathrm{B}}(0; \lambda) - E_{\mathrm{B}}(1; \lambda) \bigr) \\
&\leq& \lvert E_{\mathrm{B}}(0; \lambda) + E_{\mathrm{B}}(1; \lambda) \rvert + \lvert E_{\mathrm{B}}(0; \lambda) - E_{\mathrm{B}}(1; \lambda) \rvert \\
&\leq& 2 \,,
\end{eqnarray}$$ so for the CHSH estimator expectation value, you get $$\begin{eqnarray}
\langle S_{k} \rangle &=& \int \mathrm{d}\lambda \, \rho(\lambda) \, \bigl( E(00; \lambda) + E(01; \lambda) + E(10; \lambda) - E(11; \lambda) \bigr) \\
&\leq& \max_{\lambda} \bigl( E(00; \lambda) + E(01; \lambda) + E(10; \lambda) - E(11; \lambda) \bigr) \\
&\leq& 2 \,.
\end{eqnarray}$$
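
The chain of equalities above can also be checked numerically. The following is a rough sketch under stated assumptions: the integral over ##\lambda## is replaced by a sum over a small finite set of hidden-variable values, and the local response probabilities are drawn at random. The names `random_local_model`, `joint_probability` and `chsh_expectation` are hypothetical helpers, not anything taken from the posts.

```python
import random

def random_local_model(n_lambda, rng):
    # rho(lambda): normalised weights over a finite set of hidden-variable values.
    weights = [rng.random() for _ in range(n_lambda)]
    total = sum(weights)
    rho = [w / total for w in weights]
    # P_A(0 | x; lambda) and P_B(0 | y; lambda), indexed as [lambda][setting].
    pA0 = [[rng.random() for _ in range(2)] for _ in range(n_lambda)]
    pB0 = [[rng.random() for _ in range(2)] for _ in range(n_lambda)]
    return rho, pA0, pB0

def joint_probability(a, b, x, y, model):
    # P(ab | xy) = sum_lambda rho(lambda) P_A(a | x; lambda) P_B(b | y; lambda)
    rho, pA0, pB0 = model
    total = 0.0
    for lam, weight in enumerate(rho):
        pa = pA0[lam][x] if a == 0 else 1.0 - pA0[lam][x]
        pb = pB0[lam][y] if b == 0 else 1.0 - pB0[lam][y]
        total += weight * pa * pb
    return total

def chsh_expectation(model):
    # <S_k> = sum_{abxy} (-1)^(a + b + xy) P(ab | xy)
    return sum(
        (-1) ** (a + b + x * y) * joint_probability(a, b, x, y, model)
        for a in (0, 1) for b in (0, 1) for x in (0, 1) for y in (0, 1)
    )

rng = random.Random(0)
largest = max(chsh_expectation(random_local_model(5, rng)) for _ in range(1000))
print(largest <= 2.0 + 1e-9)
```

Random sampling is of course not a proof. It only illustrates that any distribution of the factorised form stays within the bound derived above.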
 
  • #205
wle said:
In the last line, I set $$\begin{eqnarray}
E_{\mathrm{A}}(x; \lambda) &=& P_{\mathrm{A}}(0 \mid x; \lambda) - P_{\mathrm{A}}(1 \mid x; \lambda) \,, \\
E_{\mathrm{B}}(y; \lambda) &=& P_{\mathrm{B}}(0 \mid y; \lambda) - P_{\mathrm{B}}(1 \mid y; \lambda) \,,
\end{eqnarray}$$ which are bounded by ##-1 \leq E_{\mathrm{A}}(x; \lambda) \leq 1## and ##-1 \leq E_{\mathrm{B}}(y; \lambda) \leq 1##. For any given ##\lambda##, $$\begin{eqnarray}
E(00; \lambda) + E(01; \lambda) + E(10; \lambda) - E(11; \lambda)
&=& E_{\mathrm{A}}(0; \lambda) \bigl( E_{\mathrm{B}}(0; \lambda) + E_{\mathrm{B}}(1; \lambda) \bigr) \\
&&+\> E_{\mathrm{A}}(1; \lambda) \bigl( E_{\mathrm{B}}(0; \lambda) - E_{\mathrm{B}}(1; \lambda) \bigr) \\
&\leq& \lvert E_{\mathrm{B}}(0; \lambda) + E_{\mathrm{B}}(1; \lambda) \rvert + \lvert E_{\mathrm{B}}(0; \lambda) - E_{\mathrm{B}}(1; \lambda) \rvert \\
&\leq& 2 \,,
\end{eqnarray}$$ so for the CHSH estimator expectation value, you get $$\begin{eqnarray}
\langle S_{k} \rangle &=& \int \mathrm{d}\lambda \, \rho(\lambda) \, \bigl( E(00; \lambda) + E(01; \lambda) + E(10; \lambda) - E(11; \lambda) \bigr) \\
&\leq& \max_{\lambda} \bigl( E(00; \lambda) + E(01; \lambda) + E(10; \lambda) - E(11; \lambda) \bigr) \\
&\leq& 2 \,.
\end{eqnarray}$$
So, let us focus on the part where you are doing the factorization, as I keep coming back to the factorization (it is the crucial part of every such proof). You are doing algebra with the functions ##E_{\mathrm{A}}(0; \lambda), E_{\mathrm{A}}(1; \lambda), E_{\mathrm{B}}(0; \lambda), E_{\mathrm{B}}(1; \lambda)##, factorizing them as on the 4th line above. One may ask: if you can factorize them out of their respective pairs, and you have just 4 functions, why can't you just measure each one individually in the experiment and use that to verify your inequality? For example, you have a very interesting inequality there, this one:

$$\begin{eqnarray}
\lvert E_{\mathrm{B}}(0; \lambda) + E_{\mathrm{B}}(1; \lambda) \rvert + \lvert E_{\mathrm{B}}(0; \lambda) - E_{\mathrm{B}}(1; \lambda) \rvert
&\leq& 2
\end{eqnarray}$$

It involves just single-sided results, which are actually quite easy to measure, and for which QM has predictions. If QM does not violate this inequality, there is no chance it will violate ##E(00; \lambda) + E(01; \lambda) + E(10; \lambda) - E(11; \lambda) \le 2##, is there? Do you know what the QM predictions for ##E_{\mathrm{A}}(0; \lambda), E_{\mathrm{A}}(1; \lambda), E_{\mathrm{B}}(0; \lambda), E_{\mathrm{B}}(1; \lambda)## are for Bell states?
 
  • #206
atyy said:
In that step, the idea they are using is that ##A = B + C## can also be written as ##A = B + C + D - D##. Basically we can add any term that is of the form ##D - D## since ##D - D = 0##.

I think we have strayed far from what the original topic was supposed to be. I started a new thread specifically about the CHSH derivation:
https://www.physicsforums.com/threads/derivation-of-the-chsh-inequality.772844/
 
  • #207
billschnieder said:
So, let us focus on the part where you are doing the factorization, as I keep coming back to the factorization (it is the crucial part of every such proof). You are doing algebra with the functions ##E_{\mathrm{A}}(0; \lambda), E_{\mathrm{A}}(1; \lambda), E_{\mathrm{B}}(0; \lambda), E_{\mathrm{B}}(1; \lambda)##, factorizing them as on the 4th line above. One may ask: if you can factorize them out of their respective pairs, and you have just 4 functions, why can't you just measure each one individually in the experiment and use that to verify your inequality?

Because they depend on a variable ##\lambda## that a local hidden variable theory would supply, and which may not be measurable or even exist. If it does exist, then according to a local hidden variable theory you should have the factorisation ##E(xy; \lambda) = E_{\mathrm{A}}(x; \lambda) \, E_{\mathrm{B}}(y; \lambda)##, but for the terms ##E(xy)## all this lets you say is that they can be expressed in the form $$E(xy) = \int \mathrm{d}\lambda \, \rho(\lambda) E_{\mathrm{A}}(x; \lambda) \, E_{\mathrm{B}}(y; \lambda) \,,$$ which doesn't necessarily factorise into something like ##E(xy) = E_{\mathrm{A}}(x) \, E_{\mathrm{B}}(y)##.
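
A toy example may make the last point concrete. This is only a sketch, assuming a hidden variable with two equally likely values and deterministic, perfectly correlated outcomes; the names `E_joint` and `E_marginal` are hypothetical.

```python
# Two equally likely hidden-variable values (a discrete stand-in for rho(lambda)).
rho = [0.5, 0.5]
# E_A[lam][x] and E_B[lam][y]: both sides give +1 for lambda = 0 and -1 for lambda = 1.
E_A = [[+1, +1], [-1, -1]]
E_B = [[+1, +1], [-1, -1]]

def E_joint(x, y):
    # E(xy) = sum_lambda rho(lambda) E_A(x; lambda) E_B(y; lambda)
    return sum(r * E_A[lam][x] * E_B[lam][y] for lam, r in enumerate(rho))

def E_marginal(E_side, setting):
    # E_A(x) or E_B(y): the lambda-averaged single-sided expectation value.
    return sum(r * E_side[lam][setting] for lam, r in enumerate(rho))

print(E_joint(0, 0))                            # 1.0
print(E_marginal(E_A, 0) * E_marginal(E_B, 0))  # 0.0
```

Here each ##E(xy; \lambda)## factorises, yet ##E(00) = 1## while ##E_{\mathrm{A}}(0) \, E_{\mathrm{B}}(0) = 0##, so the averaged correlator does not factorise into single-sided averages.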
 
  • #208
wle, your choice of notation is very confusing. What the heck is ##E_{\mathrm{A}}(x; \lambda)## supposed to mean that is different from ##E_{\mathrm{A}}(x)##? Why not just use the standard notation ##A(x; \lambda)##?
 
  • #209
billschnieder said:
wle, your choice of notation is very confusing. What the heck is ##E_{\mathrm{A}}(x; \lambda)## supposed to mean that is different from ##E_{\mathrm{A}}(x)##? Why not just use the standard notation ##A(x; \lambda)##?

What notation looks "standard" depends on where you learned Bell's theorem from. I explained how to derive the CHSH inequality starting from the factorisation condition $$P(ab \mid xy) = \int \mathrm{d}\lambda \, \rho(\lambda) \, P_{\mathrm{A}}(a \mid x; \lambda) \, P_{\mathrm{B}}(b \mid y; \lambda)$$ for a probability distribution, which is how Bell defined a local model in some of his later essays. This is a more general definition than what's used in many derivations of the Bell or CHSH inequality because it doesn't require the local model to be deterministic (though as atyy pointed out in an earlier post, it's always possible to turn a local stochastic model into a local deterministic model by adding more hidden variables, so it doesn't make any difference). I also personally find the definition given in terms of probabilities a lot clearer and less prone to misconceptions.

Most of the terms in post #204 are simply defined in terms of the elements appearing in the factorisation above. For instance, ##E_{\mathrm{A}}(x; \lambda)## was an intermediate variable defined as $$E_{\mathrm{A}}(x; \lambda) = P_{\mathrm{A}}(0 \mid x; \lambda) - P_{\mathrm{A}}(1 \mid x; \lambda) \,,$$ which I introduced just because it was convenient. If you insist on giving this an interpretation, then it's the expectation value of Alice's result for a given ##\lambda## if the results are called ##A = +1## or ##A = -1## instead of ##a = 0## or ##a = 1##. In general, this is a real number bounded by ##-1 \leq E_{\mathrm{A}}(x; \lambda) \leq 1##. For a deterministic local model, ##E_{\mathrm{A}}(x; \lambda)## can only be either ##+1## or ##-1## and it's the same thing that many derivations of the CHSH inequality would call ##A(x; \lambda)## or something similar.

I didn't explicitly define what ##E_{\mathrm{A}}(x)## was because I never needed such a term, but if I did I'd define it as $$E_{\mathrm{A}}(x) = \int \mathrm{d}\lambda \, \rho(\lambda) \, E_{\mathrm{A}}(x; \lambda) \,.$$ In the notation for deterministic local models that you're more familiar with, that would be the same thing as $$\langle A(x) \rangle = \int \mathrm{d}\lambda \, \rho(\lambda) \, A(x; \lambda) \,,$$ though this particular expectation value is never used in derivations of the CHSH inequality.
 
  • #210
wle said:
What notation looks "standard" depends on where you learned Bell's theorem from. I explained how to derive the CHSH inequality starting from the factorisation condition $$P(ab \mid xy) = \int \mathrm{d}\lambda \, \rho(\lambda) \, P_{\mathrm{A}}(a \mid x; \lambda) \, P_{\mathrm{B}}(b \mid y; \lambda)$$ for a probability distribution, which is how Bell defined a local model in some of his later essays. This is a more general definition than what's used in many derivations of the Bell or CHSH inequality because it doesn't require the local model to be deterministic (though as atyy pointed out in an earlier post, it's always possible to turn a local stochastic model into a local deterministic model by adding more hidden variables, so it doesn't make any difference). I also personally find the definition given in terms of probabilities a lot clearer and less prone to misconceptions.

Where did you learn to derive CHSH? I like your proof. I'm a biologist, so probabilities and directed graphical nonsense are much more my cup of tea too.
 
  • #211
atyy said:
Where did you learn to derive CHSH?

Originally a combination of an introduction to Bell's theorem by Travis Norsen [arXiv:0707.0401 [quant-ph]], one of Bell's explanations ["The theory of local Beables"], and just sitting down and working it out. I'd read both Bell's original 1964 article and the 1969 CHSH article before that but didn't find the reasoning quite as clear.

Deriving the local bound on a given linear Bell correlator isn't really an issue though. Like you pointed out earlier in post #192, it's sufficient to consider deterministic models. You can always work out the local bound on a linear Bell correlator just by maximising it over the set of local deterministic strategies (i.e., deterministic ways of mapping inputs ##x## and ##y## to outputs ##a_{x}## and ##b_{y}##), and there are a finite number of these (e.g., there are sixteen in the situation that the CHSH correlator applies to).
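
As a rough illustration of that procedure, here is a minimal sketch. It assumes a correlator of the form ##\sum_{xy} c_{xy} A_{x} B_{y}## with outcomes relabelled as ##A_{x} = (-1)^{a_{x}}## and ##B_{y} = (-1)^{b_{y}}##; the function name `local_bound` and the coefficient layout are hypothetical choices made for this example.

```python
from itertools import product

def local_bound(coefficients):
    """Maximise sum_{xy} c[x][y] * A_x * B_y over all deterministic +/-1 strategies."""
    n_x = len(coefficients)
    n_y = len(coefficients[0])
    best = float("-inf")
    # A deterministic strategy fixes one output per input on each side.
    for A in product((-1, 1), repeat=n_x):
        for B in product((-1, 1), repeat=n_y):
            value = sum(coefficients[x][y] * A[x] * B[y]
                        for x in range(n_x) for y in range(n_y))
            best = max(best, value)
    return best

# CHSH: E(00) + E(01) + E(10) - E(11); 2^2 * 2^2 = 16 strategies are searched.
chsh = [[1, 1], [1, -1]]
print(local_bound(chsh))  # 2
```

The search grows exponentially with the number of settings, so this brute-force form is only practical for small scenarios like CHSH.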
 