# Nick Herbert's proof?

## Main Question or Discussion Point

This is a spin-off from the thread on Bell's theorem, following this post:

Instead of discussing λ and the way to account for it, Nick Herbert seems to have provided a stunningly simple and convincing proof of "quantum non-locality" here (thanks for the link, Lugita15):

- http://quantumtantra.com/bell2.html
The essential part is in the last drawing, with the text just above it:

simple arithmetic and the assumption that Reality is Local leads one to confidently predict that the code mismatch at 60 degrees must be less than 50%.

It surely looks very convincing to me!
Thus my questions:

- are there known issues with that proof?
- I thought that models exist that reproduce the characteristic of QM of a greater "mismatch". However, according to Herbert's proof, that is not possible. What's going on?
gill1109
Gold Member
That's a perfectly good proof. Nothing wrong with it.

What do you mean by "I thought that models exist that reproduce the characteristic of QM of a greater 'mismatch'"?

There are published models which are wrong (as they must be: you can't contradict a true theorem).

There are published models which simply exploit the so-called detection loophole, but without making that explicit.

Imagine the standard Bell situation where two particles fly to two distant locations where they are each measured in one of two possible ways resulting in a binary outcome. Suppose the two particles, just before departing from the source, agree what pair of settings they would like to see and what pair of outcomes they will then generate. They then set off on their journey. Each of them arrives at a detector and sees that one particular setting has been chosen. If the setting which has been chosen by the experimenter is different from the setting which the two particles had agreed on in advance, then that particle decides to vanish. It's not detected at all.

If the real settings and the "guessed settings" are all chosen completely at random, half of the particles will fail to arrive at each detector. Both will be detected a quarter of the time, and on that quarter of the trials they can produce any correlation they like; for instance, they can go all the way to 4 in the Bell inequality (QM only goes to 2√2).

It's well known that even if only 10% or so of the particles fail to be detected, they can perfectly violate the Bell inequality at the 2√2 level predicted by QM.

In real experiments many, many particles are not detected. The non-detection rate is more like 95%.
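The vanishing-particle recipe above is easy to simulate. The sketch below is my own encoding of it (setting labels and structure are made up for illustration); it confirms that on the post-selected quarter of trials the CHSH expression reaches 4:

```python
import random

def run_trials(n=100_000, seed=0):
    """Detection-loophole model: each pair pre-selects a setting pair and a
    particle vanishes whenever its actual setting differs from the guess."""
    rng = random.Random(seed)
    sums = {(i, j): 0 for i in (0, 1) for j in (0, 1)}
    counts = {(i, j): 0 for i in (0, 1) for j in (0, 1)}
    for _ in range(n):
        # The pair agrees in advance on the setting pair it wants to see...
        ga, gb = rng.randrange(2), rng.randrange(2)
        # ...and on outcomes saturating CHSH: equal except on settings (2, 2).
        out_a = 1
        out_b = -1 if (ga, gb) == (1, 1) else 1
        # Experimenters choose the real settings independently at random.
        sa, sb = rng.randrange(2), rng.randrange(2)
        # Each particle vanishes if its guessed setting was wrong,
        # so only about 1/4 of the pairs yield a coincidence.
        if sa == ga and sb == gb:
            sums[(sa, sb)] += out_a * out_b
            counts[(sa, sb)] += 1
    E = {k: sums[k] / counts[k] for k in sums}
    return E[(0, 0)] + E[(0, 1)] + E[(1, 0)] - E[(1, 1)]

print(run_trials())  # → 4.0, beyond QM's 2*sqrt(2) ≈ 2.83
```

Each particle's decision to vanish depends only on its own setting and the instructions carried from the source, so the model is perfectly local; the post-selection on coincidences does all the work.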

gill1109
Gold Member
By the way, the link between Nick Herbert's proof and Bell's proof is to think of lambda as the set of four measurement outcomes of the two particles under each of the two settings. After all, Bell's encoding of "local realism" is that if you knew the values of the hidden variables located in the photons or in the measuring devices or anywhere else, then the outcome of either measurement, on either particle, is just a deterministic function of the values of all these variables.

Then in Bell's formulas for the correlation between the outcomes of two measurements, replace integration over the possible values of the hidden variables, weighted according to their probability density, by a sum over the 16 possible values of the four binary outcomes.

Well, it may still look like different mathematics, but in fact we are now getting very close to a simple combinatorial argument - just running through a finite number of different possibilities.
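That combinatorial run-through can be done explicitly. Assuming the standard CHSH combination (my notation), enumerating all 16 assignments of the four pre-fixed outcomes shows none exceeds 2:

```python
from itertools import product

# lambda = (a1, a2, b1, b2): A's outcomes under its two settings and B's
# outcomes under its two settings, each fixed in advance to +1 or -1.
best = 0
for a1, a2, b1, b2 in product((-1, 1), repeat=4):
    S = a1*b1 + a1*b2 + a2*b1 - a2*b2
    best = max(best, abs(S))
print(best)  # → 2
```

Averaging over any probability density on these 16 points cannot exceed the extreme values, so |S| ≤ 2 for every local-realist model.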

- I thought that models exist that reproduce the characteristic of QM of a greater "mismatch". However, according to Herbert's proof, that is not possible. What's going on?

You should be able to find examples of LR models that produce a nonlinear correlation between θ and P(a,b) via Google and arxiv.org searches.

Here are a couple of statements of the issues regarding Bell-type or Herbert-type proofs by a couple of physicists who think that the assumption of nonlocality in nature might be unwarranted.

Arthur Fine said:
One general issue raised by the debates over locality is to understand the connection between stochastic independence (probabilities multiply) and genuine physical independence (no mutual influence). It is the latter that is at issue in "locality," but it is the former that goes proxy for it in the Bell-like calculations.
David Mermin said:
How clearly and convincingly to exorcise nonlocality from the foundations of physics in spite of the violations of Bell inequalities. Nonlocality has been egregiously oversold. On the other hand, those who briskly dismiss it as a naive error are evading a direct confrontation with one of the central peculiarities of quantum physics. I would put the issue like this: what can one legitimately require of an explanation of correlations between the outcomes of independently selected tests performed on systems that no longer interact?

gill1109
Gold Member
Fine is wrong. Bell gave sufficient conditions for his inequality to hold, not necessary and sufficient conditions. The issue is not statistical independence. The issue is the possibility to add into the model the outcomes of the measurements which were not performed, alongside of those which were performed, in a way which respects locality. In the Bell-CHSH set-up (two parties, two measurements per party, two possible outcomes per measurement) all Bell-CHSH inequalities hold if and only if the unperformed measurements can also have outcomes attributed to them, in a local way.

gill1109
Gold Member
The only models which are correct and which reproduce P(a,b) are models exploiting the detection loophole.

Fine is wrong. Bell gave sufficient conditions for his inequality to hold, not necessary and sufficient conditions. The issue is not statistical independence. The issue is the possibility to add into the model the outcomes of the measurements which were not performed, alongside of those which were performed, in a way which respects locality. In the Bell-CHSH set-up (two parties, two measurements per party, two possible outcomes per measurement) all Bell-CHSH inequalities hold if and only if the unperformed measurements can also have outcomes attributed to them, in a local way.
Not sure what you're saying. Have Herbert and Bell proven that nature is nonlocal?

DrChinese
Gold Member
Not sure what you're saying. Have Herbert and Bell proven that nature is nonlocal?
Don't forget the realism requirement. If that is dropped, locality is possible.

Don't forget the realism requirement. If that is dropped, locality is possible.
What is the difference between local non-realism vs non-local non-realism?

DrChinese
Gold Member
What is the difference between local non-realism vs non-local non-realism?
To me (and not everyone has the exact same definitions): It is non-realistic if you deny existence of counterfactual outcomes. In EPR terms, you are essentially denying the existence of "elements of reality independent of the act of observation." EPR would say that perfect correlations are a manifestation of these elements of reality. While the Copenhagen view would be that perfect correlations are the mathematical outcome you get as a result of the cos²θ relationship.

So I guess the non-local version of the above adds in the idea that the measurement device settings are effectively in communication with the particles being observed.

What do you mean by "I thought that models exist that reproduce the characteristic of QM of a greater 'mismatch'"?
I think he's referring to LR models that produce a nonlinear angular dependence.

In real experiments many, many particles are not detected. The non-detection rate is more like 95%.
For the purposes of the OP, assuming 100% detection efficiency and 100% attribute-pairing efficiency, wouldn't the predictions of any Bell-type or Herbert-type LR model of entanglement (wrt a simple setup where you have two parties, one measurement per party per entangled pair, and two possible outcomes per measurement) still disagree with most of the QM predictions? That is, even in the ideal, an LR model necessarily produces a correlation between θ and P(a,b), from linear to something approaching cos²θ, that will, necessarily, be incongruent with the QM correlation.

Don't forget the realism requirement. If that is dropped, locality is possible.
The way I think about this is that the realism requirement is the association of the individual measurement outcomes (either +1 and -1, or 1 and 0, denoting detection and nondetection, respectively, wrt a coincidence interval) with a function describing individual detection, which includes both λ and the polarizer setting, a (or b), such that, as Bell wrote, A(a,λ) = ±1. While the locality requirement is the separation of the functions determining individual detection in the formulation of the function determining joint detection, such that, as Bell wrote,
P(a,b) = ∫ dλ ρ(λ) A(a,λ) B(b,λ).

So, if the realism requirement is dropped, then how would locality be expressed/encoded?
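For concreteness, Bell's separable expression can be evaluated for a toy local-realist model. The sign-of-cosine model below is a standard illustration of my own choosing (not something proposed in this thread): it reproduces perfect anticorrelation at θ = 0 but yields the linear correlation E(θ) = 2θ/π − 1 rather than the quantum −cos θ.

```python
import numpy as np

def E_local(a, b, n=200_000):
    # Monte Carlo average over the hidden variable, uniform on [0, 2*pi):
    lam = np.random.default_rng(0).uniform(0.0, 2 * np.pi, n)
    A = np.sign(np.cos(lam - a))     # A(a, lam) = +/-1
    B = -np.sign(np.cos(lam - b))    # B(b, lam) = +/-1
    return float((A * B).mean())     # approximates the integral over lambda

for theta in (0.0, np.pi / 3, np.pi / 2):
    print(f"theta={theta:.2f}  LR: {E_local(0.0, theta):+.2f}  "
          f"QM: {-np.cos(theta):+.2f}")
```

At θ = 60° the model gives −1/3 where QM gives −1/2: exactly the kind of "too small a mismatch" that Herbert's bound demands of any local model with 100% detection.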

Fine is wrong. ... The issue is not statistical independence.
Apparently, something gets lost (or confused) in the translation of the assumption of locality (independence) into a testable mathematical model.
Nick Herbert said:
Assuming a local reality means that, for each A photon, whatever hidden mechanism determines the output of Miss A's SPOT detector, the operation of that mechanism cannot depend on the setting of Mr B's distant detector. In other words, in a local world, any changes that occur in Miss A's coded message when she rotates her SPOT detector are caused by her actions alone.

And the same goes for Mr B. The locality assumption means that any changes that appear in the coded sequence B when Mr B rotates his SPOT detector are caused only by his actions and have nothing to do with how Miss A decided to rotate her SPOT detector.
J. S. Bell said:
The vital assumption is that the result B for particle 2 does not depend on the setting a, of the magnet for particle 1, nor A on b.
Arthur Fine said:
One general issue raised by the debates over locality is to understand the connection between stochastic independence (probabilities multiply) and genuine physical independence (no mutual influence). It is the latter that is at issue in “locality,” but
it is the former that goes proxy for it in the Bell-like calculations.

DrChinese
Gold Member
The way I think about this is that the realism requirement is the association of the individual measurement outcomes (either +1 and -1, or 1 and 0, denoting detection and nondetection, respectively, wrt a coincidence interval) with a function describing individual detection, which includes both λ and the polarizer setting, a (or b), such that, as Bell wrote, A(a,λ) = ±1. While the locality requirement is the separation of the functions determining individual detection in the formulation of the function determining joint detection, such that, as Bell wrote,
P(a,b) = ∫ dλ ρ(λ) A(a,λ) B(b,λ).

So, if the realism requirement is dropped, then how would locality be expressed/encoded?
I don't know, as it is essential to all Bell-type arguments including Herbert's.

To me, the locality requirement is tested by having the Alice and Bob measurement settings be determined while spacelike separated. This has the effect of proving that no classical communication is occurring between Alice and Bob.

- are there known issues with that proof?
Herbert goes from this:
Assuming a local reality means that, for each A photon, whatever hidden mechanism determines the output of Miss A's SPOT detector, the operation of that mechanism cannot depend on the setting of Mr B's distant detector. In other words, in a local world, any changes that occur in Miss A's coded message when she rotates her SPOT detector are caused by her actions alone.

And the same goes for Mr B. The locality assumption means that any changes that appear in the coded sequence B when Mr B rotates his SPOT detector are caused only by his actions and have nothing to do with how Miss A decided to rotate her SPOT detector.
To this:
Starting with two completely identical binary messages, if A's 30 degree turn introduces a 25% mismatch and B's 30 degree turn introduces a 25% mismatch, then the total mismatch (when both are turned) can be at most 50%. In fact the mismatch should be less than 50% because if the two errors happen to occur on the same photon, a mismatch is converted to a match.
What has he overlooked?
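Herbert's counting step can be checked mechanically. In the sketch below (flip probabilities chosen to match his 25% figures) each turn's error positions are generated locally; with both detectors turned, a bit mismatches exactly when one side but not the other flipped it, i.e. the symmetric difference of the two error sets, which can never exceed their combined size:

```python
import random

rng = random.Random(1)
n = 10_000
# Error positions introduced by A's +30-degree turn and by B's -30-degree
# turn; locality means each set is generated with no reference to the other.
flips_a = {i for i in range(n) if rng.random() < 0.25}
flips_b = {i for i in range(n) if rng.random() < 0.25}
# Bit i mismatches iff exactly one side flipped it: |A ^ B| <= |A| + |B|.
mismatch = len(flips_a ^ flips_b) / n
print(mismatch <= 0.50)  # → True (about 0.375 for independent errors)
```

When both sides flip the same bit, a mismatch is converted back into a match, which is why the independent-error expectation is 2 × 0.25 × 0.75 ≈ 37.5%, comfortably below the 50% ceiling.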

That's a perfectly good proof. Nothing wrong with it.
It was indeed my first impression that it is a perfect proof: it seems to be necessarily valid for all possible scenarios - that is, without any exceptions or "loopholes".
What do you mean by "I thought that models exist that reproduce the characteristic of QM of a greater 'mismatch'"?
I meant models that predict observations reproducing such a greater "mismatch" - and which therefore are impossible according to Herbert's proof. Consequently, either Herbert's proof has a weakness that I did not perceive, or such models are erroneous.
[..] There are published models which simply exploit the so-called detection loophole, but without making that explicit.

Imagine the standard Bell situation where two particles fly to two distant locations where they are each measured in one of two possible ways resulting in a binary outcome. Suppose the two particles, just before departing from the source, agree what pair of settings they would like to see and what pair of outcomes they will then generate. They then set off on their journey. Each of them arrives at a detector and sees that one particular setting has been chosen. If the setting which has been chosen by the experimenter is different from the setting which the two particles had agreed on in advance, then that particle decides to vanish. It's not detected at all.
Perhaps I misunderstood Herbert's proof; I thought that such particles participate in generating the differences that are recorded, on each side independently. Then it's not clear to me how that matters, or how his proof could be affected by it (see next). And evidently Herbert also did not realise that his proof could fail in such situations... He only considered the observed patterns, and his only condition is that "reality is local" - detection yield plays no role in his argument.
If the real settings and the "guessed settings" are all chosen completely at random, half of the particles will fail to arrive at each detector. Both will be detected a quarter of the time, and on that quarter of the trials they can produce any correlation they like; for instance, they can go all the way to 4 in the Bell inequality (QM only goes to 2√2).

It's well known that even if only 10% or so of the particles fail to be detected, they can perfectly violate the Bell inequality at the 2√2 level predicted by QM.

In real experiments many, many particles are not detected. The non-detection rate is more like 95%.
Thank you! Clearly I overlooked something. Regrettably I still don't get it, as Herbert's proof looks robust for such cases - your proposed model is just another possible hidden mechanism that "determines the output", as Herbert described.
His conclusion that "simple arithmetic and the assumption that Reality is Local leads one to confidently predict that the code mismatch at 60 degrees must be less than 50%" sounds rock solid.

Isn't the possible loss of particle detections included in the 25% mismatch? If so, how could 25% + ≤25% exceed 50%?

To me, the locality requirement is tested by having the Alice and Bob measurement settings be determined while spacelike separated. This has the effect of proving that no classical communication is occurring between Alice and Bob.
The experiments do rule out c or sub-c communications between spacelike separated events per coincidence interval.

The problem is that the independence encoded in LR formulations involves statistical independence, while also requiring coincidence detection to be expressed in terms of a variable (via the individual detection functions) that, as far as I can tell, doesn't determine it.

Also, the possibility remains of some sort of superluminal communication between whatever -- even though lower bounds have been calculated wrt various experiments. There's no way to falsify the assumption of >c transmissions wrt optical Bell tests, is there?

gill1109
Gold Member
Herbert's argument (which is informal) relies on 25% and 25% and 75% being percentages of the same photon pairs. In a model which violates Bell through the detection loophole, it would be different photon pairs which are not detected with each pair of detector settings. 25% of a smaller subset of photons, 25% of another small subset of photons, 75% of yet another small subset of photons.

He is silently assuming realism by imagining the same population of photon pairs being measured in different ways.
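This point can be made concrete with a sketch (the angles and probabilities are my own, chosen to match Herbert's numbers): the source "programs" a mismatch rate for each setting pair, and the particles vanish unless the real settings equal the pre-selected ones, so the 25%, 25% and 75% figures are measured on three disjoint subsets of pairs, and Herbert's arithmetic never gets to add them.

```python
import random

rng = random.Random(2)
# Mismatch probability the source programs for each detected setting pair.
target = {(0, 0): 0.0, (30, 0): 0.25, (0, -30): 0.25, (30, -30): 0.75}
mismatches = {k: 0 for k in target}
coincidences = {k: 0 for k in target}
for _ in range(400_000):
    # Each pair pre-selects the one setting pair it will respond to.
    ga, gb = rng.choice(list(target))
    bit = rng.randrange(2)                   # A's outcome, fixed at the source
    flip = rng.random() < target[(ga, gb)]   # whether B's outcome will differ
    out_a, out_b = bit, bit ^ flip
    sa, sb = rng.choice((0, 30)), rng.choice((0, -30))  # freely chosen settings
    if sa == ga and sb == gb:                # otherwise at least one side vanishes
        coincidences[(sa, sb)] += 1
        mismatches[(sa, sb)] += (out_a != out_b)
for k in target:
    print(k, round(mismatches[k] / coincidences[k], 2))
```

Each particle's detection and outcome depend only on local data (its own setting plus what it carried from the source), yet the detected subsets show 25%, 25% and 75% mismatch - violating Herbert's bound.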

gill1109
Gold Member
The only statistical independence required is that between the experimentally chosen measurement settings and the set of counterfactual measurement outcomes (the two pairs of outcomes of both possible measurements on both particles).

The only statistical independence required is that between the experimentally chosen measurement settings and the set of counterfactual measurement outcomes (the two pairs of outcomes of both possible measurements on both particles).
Herbert's argument (which is informal) relies on 25% and 25% and 75% being percentages of the same photon pairs. In a model which violates Bell through the detection loophole, it would be different photon pairs which are not detected with each pair of detector settings. 25% of a smaller subset of photons, 25% of another small subset of photons, 75% of yet another small subset of photons.

He is silently assuming realism by imagining the same population of photon pairs being measured in different ways.
I still don't get it... Herbert's proof doesn't even consider particles, let alone both particles or the same photon pairs.

Here is how I apply Herbert's proof to the scenario of incomplete detection, following his logic by the letter and adding my comments:

---------------------------------------------------------------------
Step One: Start by aligning both SPOT detectors. No errors are observed.

[harrylin: for example the sequences go like this:

A 10010110100111010010
B 10010110100111010010]

Step Two: Tilt the A detector till errors reach 25%. This occurs at a mutual misalignment of 30 degrees.

[harrylin: for example (a bit idealized) the sequences go like this:

A 10010100110110110110
B 10110100111010010010

This mismatch could be partly due to the detection of different photon pairs.]

Step Three: Return A detector to its original position (100% match). Now tilt the B detector in the opposite direction till errors reach 25%. This occurs at a mutual misalignment of -30 degrees.

[harrylin: for example the sequences go like this, for the same reasons:

A 10100100101011010011
B 10010101101011010101]

Step Four: Return B detector to its original position (100% match). Now tilt detector A by +30 degrees and detector B by -30 degrees so that the combined angle between them is 60 degrees.

What is now the expected mismatch between the two binary code sequences?

[..] Assuming a local reality means that, for each A photon, whatever hidden mechanism determines the output of Miss A's SPOT detector, the operation of that mechanism cannot depend on the setting of Mr B's distant detector. In other words, in a local world, any changes that occur in Miss A's coded message when she rotates her SPOT detector are caused by her actions alone.

[harrylin: apparently that includes whatever mechanism one could imagine - also non-detection of part of the photons]

And the same goes for Mr B. [..] So with this restriction in place (the assumption that reality is local), let's calculate the expected mismatch at 60 degrees.

Starting with two completely identical binary messages, if A's 30 degree turn introduces a 25% mismatch and B's 30 degree turn introduces a 25% mismatch, then the total mismatch (when both are turned) can be at most 50%. In fact the mismatch should be less than 50% because if the two errors happen to occur on the same photon, a mismatch is converted to a match.

[harrylin: and if the errors happen to occur on different photons that are compared, still sometimes a mismatch will be converted to a match. Thus now for example the sequences go like this, for the same reasons as +30 degrees and -30 degrees:

A 10101010110101010011
B 10100100101011010101]
----------------------------------------------------------------------------

So much for Herbert's proof, which simply compares binary code sequences. Nowhere is there any assumption about detection efficiency, since there is not even an assumption about what happens at the detectors or at the source. The only assumptions concern independent detections and the reproducibility of the percentage of matching in sufficiently long sequences.

Where is the error?

DrChinese
Gold Member
I still don't get it... Herbert's proof doesn't even consider particles, let alone both particles or the same photon pairs.

[..]

Where is the error?
Realism is assumed. That is because there are 3 detector positions: 0, +30, -30. In a Bell type proof, you are looking for a setup in which there is a counterfactual setting. These are exactly equivalent to the Mermin example, which I often use, which is 0/120/240 degrees.
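The 0/120/240 Mermin example can be brute-forced. Any local "instruction set" fixes a colour for each of the three settings; identical sets on both particles guarantee agreement at equal settings, and enumeration shows the overall agreement (with settings chosen independently at random) can never drop below 5/9, whereas QM predicts 1/2:

```python
from itertools import product

# An instruction set fixes a particle's answer ('R' or 'G') for each of the
# three detector settings 0, 120, 240; both particles carry identical sets.
min_match = 1.0
for instr in product('RG', repeat=3):
    # Fraction of the 9 equally likely setting pairs giving the same colour.
    same = sum(instr[i] == instr[j] for i in range(3) for j in range(3)) / 9
    min_match = min(min_match, same)
print(min_match)  # → 0.5555555555555556 (= 5/9)
```

The minimum 5/9 comes from the 2-1 split sets (e.g. RRG): 5 of the 9 setting pairs agree. That counterfactual bookkeeping over all three settings at once is exactly the realism assumption.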

Realism is assumed. That is because there are 3 detector positions: 0, +30, -30. In a Bell type proof, you are looking for a setup in which there is a counterfactual setting. These are exactly equivalent to the Mermin example, which I often use, which is 0/120/240 degrees.
Realism is definitely assumed in simulations that make use of detection time windows; and surely the detection times at A are not affected by the detection times at B. Such simulations demonstrate that there has to be a glitch in this nice-looking proof by Herbert... and as you and gill say, it's similar with Mermin's example as well as with Bell's calculation. There is thus a glitch in all these "proofs" that I just don't get...

DrChinese
Gold Member
Realism is definitely assumed in simulations that make use of detection time windows; and surely the detection times at A are not affected by the detection times at B. Such simulations demonstrate that there has to be a glitch in this nice-looking proof by Herbert... and as you and gill say, it's similar with Mermin's example as well as with Bell's calculation. There is thus a glitch in all these "proofs" that I just don't get...
No glitch. Realism has nothing to do with detection or efficiency of same.

Realism is essentially the requirement of EPR that there are elements of reality *independent* of the act of observation. So they believe in the reality of counterfactual cases, i.e. the probability of occurrence is in the range 0 to 100%.

No glitch. Realism has nothing to do with detection or efficiency of same.

Realism is essentially the requirement of EPR that there are elements of reality *independent* of the act of observation. So they believe in the reality of counterfactual cases, i.e. the probability of occurrence is in the range 0 to 100%.
Then please point out (if you have found it) where the error lies in applying Herbert's proof with the "detection loophole"; since his proof is not concerned at all with what happens at the detectors but only with the generated data strings, I obtain the exact same conclusion with or without it... Is the error in step 1, 2, 3 or 4, and where exactly?

DrChinese