# I Do these Bell inequalities rely on probability concept?

#### zonde

Gold Member
This question came up in another thread.
You miss the point - his [Nick Herbert's] proof, right or wrong, regardless of what you think of him, used probability.

If you want to discuss Eberhard's proof, post the full proof in another thread, at least at the I level, and then it can be discussed.
I will post again the link to Nick Herbert's proof here: https://www.physicsforums.com/threads/a-simple-proof-of-bells-theorem.417173/#post-2817138
I don't see where the probability shows up in Nick Herbert's proof.

For discussion of Eberhard's proof I will give it in this post:

Eberhard inequality for detection efficiency $\eta$ < 100%
Bell inequalities concern expectation values of quantities that can be measured in four different experimental setups, defined by specific values $\alpha_1$, $\alpha_2$, $\beta_1$, and $\beta_2$ of $\alpha$ and $\beta$. The setups will be referred to by the symbols $(\alpha_1, \beta_1)$, $(\alpha_1, \beta_2)$, $(\alpha_2, \beta_1)$, and $(\alpha_2, \beta_2)$, where the first index designates the value of $\alpha$ and the second index the value of $\beta$.

In each setup, the fate of the photon a and the fate of photon b is referred to by an index (o) for photon detected in the ordinary beam, (e) for photon detected in the extraordinary beam, or (u) for photon undetected. Therefore there are nine types of events: (o, o), (o, u), (o, e), (u, o), (u, u), (u, e), (e, o), (e, u), and (e, e), where the first index designates the fate of photon a and the second index the fate of photon b. Table I shows a display of boxes corresponding to the nine types of event in each setup. The value of $\alpha$, and the fate of photon a designate a row. The value of $\beta$ and the fate of photon b designate a column. Any event obtained in one of the setups corresponds to one box in Table I.

For a given theory, we consider all the possible sequences of N events that can occur in each setup. N is the same for the four setups and arbitrarily large. As in [Eberhard1977] and [Eberhard1978], a theory is defined as being "local" if it predicts that, among these possible sequences of events, one can find four sequences (one for each setup) satisfying the following conditions:
(i) The fate of photon a is independent of the value of $\beta$, i.e., is the same in an event of the sequence corresponding to setup $(\alpha_1, \beta_1)$ as in the event with the same event number k for $(\alpha_1, \beta_2)$; also same fate for a in $(\alpha_2, \beta_1)$ and $(\alpha_2, \beta_2)$; this is true for all k's for these carefully selected sequences.
(ii) The fate of photon b is independent of the value of $\alpha$, i.e., is the same in event k of sequences $(\alpha_1, \beta_1)$ and $(\alpha_2, \beta_1)$; also same fate for b in sequences $(\alpha_1, \beta_2)$ and $(\alpha_2, \beta_2)$.
(iii) Among all sets of four sequences that one has been able to find with conditions (i) and (ii) satisfied, there are some for which all averages and correlations differ from the expectation values predicted by the theory by less than, let us say, ten standard deviations.

These conditions are fulfilled by a deterministic local hidden-variable theory, i.e., one where the fate of photon a does not depend on $\beta$ and the fate of b does not depend on $\alpha$. For such a theory, these four sequences could be just four of the most common sequences of events generated by the same values of the hidden variables in the different setups. Conditions (i)-(iii) are also fulfilled by probabilistic local theories, which assign probabilities to various outcomes in each of the four setups and assume no "influence" of the angle $\beta$ on what happens to a and no "influence" of $\alpha$ on b. With such theories, one can generate sequences of events having properties (i) and (ii) by Monte Carlo, using an algorithm that decides the fate of a without using the value of $\beta$ and, for b, without using the value of $\alpha$. If the same random numbers are used for the four different setups, the sequences of events will automatically have properties (i) and (ii), and the vast majority of them will have property (iii).
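A minimal Python sketch of the Monte Carlo construction just described. It is an illustration only: the probabilities and the names `pick_fate`, `fate_a`, `fate_b`, `generate_sequences`, and `satisfies_i_and_ii` are hypothetical and do not come from Eberhard's paper. The only point is that the fate of a never sees $\beta$, the fate of b never sees $\alpha$, and the same random numbers are reused in all four setups.

```python
import random


def pick_fate(u, p_o, p_e):
    """Map a number u in [0, 1) to a fate: 'o', 'e', or 'u' (undetected)."""
    if u < p_o:
        return "o"
    if u < p_o + p_e:
        return "e"
    return "u"


def fate_a(alpha, lam):
    """Fate of photon a: uses only the local setting alpha and the shared numbers lam."""
    p_o, p_e = (0.5, 0.3) if alpha == 1 else (0.3, 0.5)  # toy values, assumed
    return pick_fate(lam[0], p_o, p_e)


def fate_b(beta, lam):
    """Fate of photon b: uses only the local setting beta and the shared numbers lam."""
    p_o, p_e = (0.45, 0.35) if beta == 1 else (0.35, 0.45)  # toy values, assumed
    return pick_fate(0.5 * (lam[0] + lam[1]), p_o, p_e)


def generate_sequences(n_events, seed=0):
    """Return four sequences of (fate of a, fate of b), one per setup (alpha, beta)."""
    rng = random.Random(seed)
    # The same random numbers are reused for event k in all four setups.
    shared = [(rng.random(), rng.random()) for _ in range(n_events)]
    return {
        (alpha, beta): [(fate_a(alpha, lam), fate_b(beta, lam)) for lam in shared]
        for alpha in (1, 2)
        for beta in (1, 2)
    }


def satisfies_i_and_ii(sequences):
    """Check conditions (i) and (ii) event by event."""
    n_events = len(sequences[(1, 1)])
    for k in range(n_events):
        # (i): the fate of a is the same for both values of beta.
        for alpha in (1, 2):
            if sequences[(alpha, 1)][k][0] != sequences[(alpha, 2)][k][0]:
                return False
        # (ii): the fate of b is the same for both values of alpha.
        for beta in (1, 2):
            if sequences[(1, beta)][k][1] != sequences[(2, beta)][k][1]:
                return False
    return True


sequences = generate_sequences(1000)
print(satisfies_i_and_ii(sequences))
```

With this construction the check passes for any seed, because events with the same number k are built from the same shared numbers. Condition (iii) is not tested here.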

Let us follow an argument first used in Ref. [Stapp1971]. When four sequences are found satisfying conditions (i) and (ii), the four events with the same event number k in the four different sequences will be called "conjugate events". Because of condition (i), two conjugate events in setups $(\alpha_1, \beta_1)$ and $(\alpha_1, \beta_2)$ fall into two boxes on the same row in Table I. The same thing applies for conjugate events in setups $(\alpha_2, \beta_1)$ and $(\alpha_2, \beta_2)$. Because of (ii), conjugate events for setups $(\alpha_1, \beta_1)$ and $(\alpha_2, \beta_1)$ lie in boxes in the same column; and so do conjugate events for $(\alpha_1, \beta_2)$ and $(\alpha_2, \beta_2)$. Let us select all the $n_{oo}(\alpha_1, \beta_1)$ events that fall into the box marked with a $\bullet$ in the section of Table I reserved for setup $(\alpha_1, \beta_1)$. None of these events falls into any other box for setup $(\alpha_1, \beta_1)$. Because of condition (i), their conjugate events in setup $(\alpha_1, \beta_2)$ fall into boxes on row o. Because of condition (ii), the conjugate events in setup $(\alpha_2, \beta_1)$ lie in boxes in column o. Therefore none of the boxes marked with a $*$ contains any of the events of this sample or any of their conjugates.

Now, from that sample, let us remove events with conjugates falling in one of the boxes marked with a $\otimes$ in setup $(\alpha_2, \beta_1)$. The number of events subtracted is smaller than or equal to the total number $n_{uo}(\alpha_2, \beta_1) + n_{eo}(\alpha_2, \beta_1)$ of events of all categories contained in those two boxes. Therefore the remaining sample contains $n_{oo}(\alpha_1, \beta_1) - n_{uo}(\alpha_2, \beta_1) - n_{eo}(\alpha_2, \beta_1)$ events or more. None of the events in the remaining sample has a conjugate falling in a box on rows u or e in setup $(\alpha_2, \beta_1)$; thus, because of condition (i), none has a conjugate on those rows in setup $(\alpha_2, \beta_2)$ either. None of the conjugate events falls in a box marked with an $\times$.

Let us further restrict the sample by removing events with conjugates in sequence $(\alpha_1, \beta_2)$ falling in boxes marked with a $\oplus$ in Table I. Using the same argument as in the preceding paragraph, the number of events left must be more than or equal to

$n_{oo}(\alpha_1, \beta_1) - n_{uo}(\alpha_2, \beta_1) - n_{eo}(\alpha_2, \beta_1) - n_{ou}(\alpha_1, \beta_2) - n_{oe}(\alpha_1, \beta_2)$

where $n_{ou}(\alpha_1, \beta_2) + n_{oe}(\alpha_1, \beta_2)$ is the total number of events of all categories falling into the boxes marked with a $\oplus$. None of the events in that restricted sample has a conjugate falling in columns u or e in setup $(\alpha_1, \beta_2)$; therefore, because of condition (ii), none has a conjugate in those columns in setup $(\alpha_2, \beta_2)$ either; therefore, none falls in any box marked with a $+$.

All events belonging to the latter sample must have conjugates in sequence $(\alpha_2, \beta_2)$ falling into the only remaining box for that setup, i.e., box (o,o). That is possible only if that most restricted sample contains a number of events less than or equal to the total number $n_{oo}(\alpha_2, \beta_2)$ of events of all categories in that box. Thus conditions (i) and (ii) can be satisfied by our four sequences only if

$n_{oo}(\alpha_1, \beta_1) - n_{uo}(\alpha_2, \beta_1) - n_{eo}(\alpha_2, \beta_1) - n_{ou}(\alpha_1, \beta_2) - n_{oe}(\alpha_1, \beta_2) \leq n_{oo}(\alpha_2, \beta_2)$ (12)

i.e.,

$\mathit{J_B} = n_{oe}(\alpha_1, \beta_2) + n_{ou}(\alpha_1, \beta_2) + n_{eo}(\alpha_2, \beta_1) + n_{uo}(\alpha_2, \beta_1) + n_{oo}(\alpha_2, \beta_2) - n_{oo}(\alpha_1, \beta_1) \geq 0$ (13)

For condition (iii) to be true no matter how large the number of events N is, inequality (13) also has to apply to the expectation values of these numbers. It is a form of the Bell inequality, which Eqs. (6)-(9) make equivalent to inequality (4) of Ref. [CH1974].
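The counting argument can be checked numerically. The Python sketch below is not from the paper: `conjugate_sequences` is a hypothetical helper that gives each event number k predetermined fates for both settings on each side, which is one simple way to obtain four sequences obeying (i) and (ii), and `eberhard_jb` evaluates the combination of counts called $J_B$ above.

```python
import random
from collections import Counter


def conjugate_sequences(n_events, seed=0):
    """Build four sequences whose events with the same number k are conjugate."""
    rng = random.Random(seed)
    sequences = {(i, j): [] for i in (1, 2) for j in (1, 2)}
    for _ in range(n_events):
        # Fate of a for alpha_1 and alpha_2, fate of b for beta_1 and beta_2.
        a = {1: rng.choice("oeu"), 2: rng.choice("oeu")}
        b = {1: rng.choice("oeu"), 2: rng.choice("oeu")}
        for (i, j), events in sequences.items():
            events.append((a[i], b[j]))
    return sequences


def eberhard_jb(sequences):
    """J_B built from event counts; keys are (alpha index, beta index)."""
    n = {setup: Counter(events) for setup, events in sequences.items()}
    return (
        n[(1, 2)][("o", "e")] + n[(1, 2)][("o", "u")]
        + n[(2, 1)][("e", "o")] + n[(2, 1)][("u", "o")]
        + n[(2, 2)][("o", "o")]
        - n[(1, 1)][("o", "o")]
    )


for seed in range(5):
    print(seed, eberhard_jb(conjugate_sequences(1000, seed)))
```

Every printed value is non-negative, as the argument requires: each event counted in $n_{oo}(\alpha_1, \beta_1)$ has a conjugate counted in at least one of the positive terms. Only counts are used; no probability is assigned anywhere in the check itself.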


#### zonde

Gold Member
Nick Herbert's proof uses CFD in the model - which thus fails.
Any theory that makes predictions based on some external input is CFD. So we can say that Nick Herbert's proof applies to all scientific models. And I don't consider this a failure. Besides, how would you define locality without some CFD-like reasoning?
Eberhard actually gives a different definition of locality without this CFD-like reasoning, but it is much harder to use in the analysis of hypothetical models.

But what do you think concerning the question of this thread? Does Nick Herbert's proof relay on probability?

#### Mentz114

Gold Member
Any theory that makes predictions based on some external input is CFD. So we can say that Nick Herbert's proof applies to all scientific models. And I don't consider this a failure. Besides, how would you define locality without some CFD-like reasoning?
Eberhard actually gives a different definition of locality without this CFD-like reasoning, but it is much harder to use in the analysis of hypothetical models.

But what do you think concerning the question of this thread? Does Nick Herbert's proof relay on probability?
Eberhard is talking about statistics
Any event obtained in one of the setups corresponds to one box in Table I
..
For a given theory, we consider all the possible sequences of N events that can occur in each setup. N is the same for the four setups and arbitrarily large.
Later he refers to frequencies like $n_{oo}(\alpha_1, \beta_1)$. He could use the same logic if one sets $p_{oo}(\alpha_1, \beta_1)=n_{oo}(\alpha_1, \beta_1)/N$ in the limit.
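To spell out that step (a sketch in notation that is not Eberhard's): dividing the counts in inequality (13) by $N > 0$ does not change the sign, so with $p_{xy}(\alpha_i, \beta_j) = n_{xy}(\alpha_i, \beta_j)/N$ the same inequality reads

$p_{oe}(\alpha_1, \beta_2) + p_{ou}(\alpha_1, \beta_2) + p_{eo}(\alpha_2, \beta_1) + p_{uo}(\alpha_2, \beta_1) + p_{oo}(\alpha_2, \beta_2) - p_{oo}(\alpha_1, \beta_1) \geq 0$

for every finite N, and therefore also for the limiting values, provided the limits exist.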

I think Herbert's proof also is talking about expected results from a long run of trials.
Tilt the A detector till errors reach 25%. This occurs at a mutual misalignment of 30 degrees.
Does this mean "Tilt the A detector till (for a long run of trials) errors reach 25%"?

Please change the word 'relay' in the title to 'rely'.

#### Peter Morgan

Gold Member
Another way to put @Mentz114's comment #4 is to say that Eberhard is implicitly using an ensemble interpretation of probability.

#### Derek P

But what do you think concerning the question of this thread? Does Nick Herbert's proof relay on probability?
Of course not. That was the point.

#### zonde

Gold Member
Eberhard is talking about statistics

Later he refers to frequencies like $n_{oo}(\alpha_1, \beta_1)$. He could use the same logic if one sets $p_{oo}(\alpha_1, \beta_1)=n_{oo}(\alpha_1, \beta_1)/N$ in the limit.
$n_{oo}(\alpha_1, \beta_1)$ is a number of events, not a frequency.

The concept of an event is more basic than the concept of frequency or probability. So you can talk about events without referring to probabilities and frequencies, but you can't talk about probability or frequency without implicitly talking about events.

I think Herbert's proof also is talking about expected results from a long run of trials.

Does this mean "Tilt the A detector till (for a long run of trials) errors reach 25%"?
Well, to make sense of Herbert's proof we have to restrict its application to models. With this fix in mind, the answer to your question is: "Change the angle of the A detector till the expectation value (error rate) predicted by the model reaches 25%".
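For concreteness, the numbers in Herbert's argument can be checked with a few lines of Python. This is only a sketch under stated assumptions: it takes the model's predicted error (mismatch) rate at relative misalignment $\theta$ to be $\sin^2\theta$, which reproduces the 25% at 30 degrees quoted above, and it uses the step, as the linked proof is commonly summarized, that under locality the error rate with both detectors tilted cannot exceed the sum of the two single-tilt error rates. The function names are illustrative.

```python
import math


def predicted_error_rate(misalignment_deg):
    """Assumed model prediction: mismatch rate sin^2(theta) at misalignment theta."""
    return math.sin(math.radians(misalignment_deg)) ** 2


def locality_bound(error_a_tilted, error_b_tilted):
    """Upper bound on the error rate when both detectors are tilted (assumed step)."""
    return error_a_tilted + error_b_tilted


e30 = predicted_error_rate(30)   # A tilted by 30 degrees, B untouched
e60 = predicted_error_rate(60)   # A tilted by +30, B tilted by -30
bound = locality_bound(e30, e30)

print(f"error rate at 30 degrees: {e30:.2f}")
print(f"locality bound at 60 degrees: {bound:.2f}")
print(f"predicted error rate at 60 degrees: {e60:.2f}")
print("bound exceeded:", e60 > bound)
```

The inputs here are expectation values predicted by a model, which is the reading suggested above; nothing in the arithmetic depends on a particular run of trials.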

#### Mentz114

Gold Member
$n_{oo}(\alpha_1, \beta_1)$ is a number of events, not a frequency.

The concept of an event is more basic than the concept of frequency or probability. So you can talk about events without referring to probabilities and frequencies, but you can't talk about probability or frequency without implicitly talking about events.
The accepted usage of 'frequency' is a count. Are the $n_{oo}(\alpha_1, \beta_1)$'s not counts of events happening with conditions $\alpha_1,\beta_1$ out of N trials?

Well, to make sense of Herbert's proof we have to restrict its application to models. With this fix in mind, the answer to your question is: "Change the angle of the A detector till the expectation value (error rate) predicted by the model reaches 25%".
I don't understand much of Herbert's 'proof'. But an example with specific settings is not a proof.

#### zonde

Gold Member
The accepted usage of 'frequency' is a count.
No, the accepted usage of 'frequency' is counts per unit time. Experimentalists use relative frequencies because one usually cannot know the total count in (optical) experiments. So they use a source with a stable count rate, measure the maximum frequency (counts per unit time), and compare it with the frequencies at different measurement settings. Think about it: how would you test Malus' law? You measure the count rate at its maximum (polarizer angle aligned with the photon polarization angle) and then compare this maximum rate with the count rates at other polarizer orientations.
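A toy Python simulation of that procedure, as a sketch only. It assumes a source that emits the same number of photons in every counting window and a polarizer that passes each photon with probability $\cos^2\theta$ (Malus' law); `count_photons` and `relative_rates` are made-up names. The analysis uses only the ratio of counts, never the number of emitted photons.

```python
import math
import random


def count_photons(angle_deg, n_emitted, rng):
    """Counts registered in one window with the polarizer at angle_deg."""
    p_pass = math.cos(math.radians(angle_deg)) ** 2
    return sum(1 for _ in range(n_emitted) if rng.random() < p_pass)


def relative_rates(angles_deg, n_emitted=100_000, seed=1):
    """Count rate at each angle divided by the maximum (aligned) count rate."""
    rng = random.Random(seed)
    max_count = count_photons(0.0, n_emitted, rng)  # polarizer aligned
    return {
        angle: count_photons(angle, n_emitted, rng) / max_count
        for angle in angles_deg
    }


for angle, ratio in relative_rates([0, 30, 45, 60, 90]).items():
    expected = math.cos(math.radians(angle)) ** 2
    print(f"{angle:3d} deg: measured ratio {ratio:.3f}, cos^2 = {expected:.3f}")
```

The measured ratios track $\cos^2\theta$ to within statistical scatter, which shrinks as the window (here `n_emitted`) grows.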
I don't understand much of Herbert's 'proof'. But an example with specific settings is not a proof.
Yes, a proof would usually address more general statements. So Herbert's proof is rather something between a counterexample and a proof.
But the target audience of this proof is laymen, and it is well adapted to this audience. It is mostly useless for experimentalists. Experimentalists use other proofs (say Eberhard's or the CH74 proof).

#### Mentz114

Gold Member
No, the accepted usage of 'frequency' is counts per unit time. Experimentalists use relative frequencies because one usually cannot know the total count in (optical) experiments. So they use a source with a stable count rate, measure the maximum frequency (counts per unit time), and compare it with the frequencies at different measurement settings. Think about it: how would you test Malus' law? You measure the count rate at its maximum (polarizer angle aligned with the photon polarization angle) and then compare this maximum rate with the count rates at other polarizer orientations.
No. The word has many meanings.
Look up frequentist probability.

If you want Bell without probability you should check out the Kochen-Specker theorem.


#### zonde

Gold Member
No. That is patronising.
Look up frequentist probability.
Ok, my mistake. The event count in frequentist probability is called the frequency and is used as an input to calculate relative frequencies.
But still, if I simply count events, it does not in any way mean that I am using some framework that also counts events and uses the counts as an input. If two approaches use the same input, it does not mean that they relay on each other.
If you want Bell without probability you should check out the Kochen-Specker theorem.
Ok, I think I understand where you see probabilities in these proofs. They are in predictions of QM, right?
In other words, why do we believe that a predicted probability means a particular event count in a set of N tests? Hmm, but this is how we test the predictions of QM. So it does not make sense to doubt this belief if we take QM as experimentally confirmed. So maybe I still do not understand you.

#### Peter Morgan

Gold Member
Ok, my mistake. The event count in frequentist probability is called the frequency and is used as an input to calculate relative frequencies.
But still, if I simply count events, it does not in any way mean that I am using some framework that also counts events and uses the counts as an input. If two approaches use the same input, it does not mean that they relay on each other.

Ok, I think I understand where you see probabilities in these proofs. They are in predictions of QM, right?
In other words, why do we believe that a predicted probability means a particular event count in a set of N tests? Hmm, but this is how we test the predictions of QM. So it does not make sense to doubt this belief if we take QM as experimentally confirmed. So maybe I still do not understand you.
I think you've pretty much got it, @zonde. I hope it's helpful to say that I'd put it as "We take Quantum Theory as experimentally confirmed because we can construct models of the theory that generate probability distributions that match well enough with observed statistics". In particular, well enough for us to engineer new experimental apparatus that will, with predictable statistical accuracy, do something we want it to do.
It's perhaps also worth noting, however, that Bayesian tools of mathematical probability/pseudo-probability are now quite commonly used. When a lab engineers a new state preparation apparatus to produce some desired statistics of events (to be, say, an exemplar of some desired quantum state), they will likely get quite close because they have lots of experience with similar apparatus, but if the actual statistics do not match closely enough we may choose to update the statistical parameters of the model (that is, we will say that the apparatus is an exemplar of some slightly different quantum state) to better characterize the state preparation apparatus. Sometimes this is reversed, in that a lab constructs an experiment to characterize a new measurement apparatus instead of a new state preparation apparatus. Some people take these uses of Bayesian tools to justify a subjectivist interpretation of probability; however, Bayesian methods can also be understood as just a sometimes esoteric way of mixing a priori probabilities and a posteriori statistics to an in-Bayesian-sense best effect. To take a specific example, if we construct an experiment to violate Bell inequalities, we might say that we have constructed precisely a state $\frac{1}{\sqrt{2}}(|+\rangle|-\rangle+|-\rangle|+\rangle)$, akin to saying that we can construct a perfectly unbiased coin, or we might go to the trouble of using observed statistics and good characterizations of the measurement devices we use to characterize more precisely what state has actually been prepared.
To be a little pompous, the details of the centuries-long trail of characterizations of different experiments, which in any well-managed lab goes back in a recordable way to National and International Standards bodies, and the gradually changing methods we use to update and adapt our models to reflect actual statistics of experimental results are the heart of the consistency of the scientific method. TBH, the account I've just suggested is inspired by Hasok Chang's "Inventing Temperature", which describes the century-long process of constructing international standards for how temperature should be measured, mostly in a historical way but also in a philosophically enlightening way. Hasok is not as well-known outside history and philosophy circles as he should be, IMO.
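The coin analogy can be made concrete with a small Python sketch of such an update. It assumes a Beta prior on the coin's bias, which is a standard textbook choice and not something specified above; the prior strength and the observed counts are invented numbers, and `update_beta` and `beta_mean` are illustrative names.

```python
def update_beta(prior_a, prior_b, heads, tails):
    """Conjugate update of a Beta(prior_a, prior_b) prior with observed counts."""
    return prior_a + heads, prior_b + tails


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution: the updated estimate of the bias."""
    return a / (a + b)


# A priori: "the coin is unbiased", held with the weight of 100 earlier tosses.
prior = (50, 50)
# A posteriori statistics from a new (hypothetical) run of 100 tosses.
posterior = update_beta(*prior, heads=60, tails=40)

print("prior estimate of bias:    ", beta_mean(*prior))
print("posterior estimate of bias:", beta_mean(*posterior))
```

The estimate moves from the idealized value toward the observed statistics, by an amount set by how much weight the prior experience is given. Characterizing a prepared state as "slightly different" from the intended one follows the same pattern.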

#### Mentz114

Gold Member
Ok, my mistake. The event count in frequentist probability is called the frequency and is used as an input to calculate relative frequencies.
But still, if I simply count events, it does not in any way mean that I am using some framework that also counts events and uses the counts as an input. If two approaches use the same input, it does not mean that they relay on each other.

Ok, I think I understand where you see probabilities in these proofs. They are in predictions of QM, right?
In other words, why do we believe that a predicted probability means a particular event count in a set of N tests? Hmm, but this is how we test the predictions of QM. So it does not make sense to doubt this belief if we take QM as experimentally confirmed. So maybe I still do not understand you.
The word you want is 'rely' not 'relay'.

I'm sure you have some valid point of view behind what you are saying, but I definitely do not understand you.
Let us agree to differ or just drop this unfruitful discussion.

#### zonde

Gold Member
I'm sure you have some valid point of view behind what you are saying, but I definitely do not understand you.
Let us agree to differ or just drop this unfruitful discussion.
I guess it's my fault, as I didn't give enough context from the other thread in which I wanted to discuss the question. So let's drop this discussion.
