Two questions about Bell that relate to things that I learned in philosophy and subsequently make it difficult to understand Bell's claims.

First, Bell says he has 3 assumptions and that one of them failed because his inequality was violated.

I gathered that proof by negation is invalid in practice because a person can never identify all of their non-trivial assumptions; there is an infinite number of plausible things that could have occurred which a person in such a situation assumes did not.

Further, according to what I have learned, the concept of plausibility, and therefore of a trivial assumption, is meaningless in a realm where we have little experience. Thus it is not trivial to assume anything, and there are infinitely many more assumptions that "implausible" things were not occurring.

Second, I was under the impression that, as a fundamental property of inductive reasoning, it is always possible to separate one event, and the rules that govern it, from other events and their rules.

This would mean for example, that the light speed barrier need not apply to entangled particles to still be able to apply to everything else.

A thought experiment demonstrating this in this particular case would consist of there being something realizing subatomic particles in ways which we can only observe as the world we live in (which is governed by the speed barrier), but whose unobservable properties result in superluminal travel in that one particular case.

Instead it seems to be claimed that entangled particles must obey the speed barrier for the theory and equations governing it (which are demonstrated by everything else in this world) to be preserved, and therefore the conclusion is reached that reason or objective reality fails?


vanesch
Staff Emeritus
Gold Member
A thought experiment demonstrating this in this particular case would consist of there being something realizing subatomic particles in ways which we can only observe as the world we live in (which is governed by the speed barrier), but whose unobservable properties result in superluminal travel in that one particular case.

But that is exactly one of the possible "resolutions" of the Bell "paradox"! In fact, it was even inspired by Bell's own favorite viewpoint, which was Bohmian mechanics.

If you take the stance that negative reasoning (reductio ad absurdum) cannot be used, then all of science, according to Popper, falls down. You will never be able to falsify a statement, because you could always be falsifying just a triviality which you failed to notice (like your experimental apparatus not being connected properly).

Bell's results are in fact quite universal, and the 3 hypotheses in it are quite well identified by now. They are:
1) locality (what's physically happening at B cannot depend on the choice of measurement at A)
2) no superdeterminism (things that are not "obviously" causally linked directly, or by common origin, are statistically independent)
3) single, objective reality (outcomes are unique, and objectively the same for all observers).

From these 3 assumptions, Bell derives his inequalities, which are in contradiction with quantum mechanical predictions. It's essentially mathematics. So it is difficult to keep 1), 2) and 3) and not obtain Bell's inequalities.

From this one can conclude that:
either 1), 2) or 3) is to be rejected - at least in this case - or 4) the predictions of QM are erroneous - at least in this case.

4) could be correct (it is the stance of the "local realists") but this is increasingly implausible by many experimental results.

So what gives ?

1) could be right. No strict locality. That's the case in Bohmian mechanics, for instance. But giving up 1) is a genuine pain in the a*** for relativists.

2) superdeterminism could be right. But that's a genuine pain in the a*** for just about any scientific theory, because then we can't use statistics anymore to prove or disprove any potential causal link (all of medical testing falls apart, and you can't argue anymore against astrology).

3) That's what some people think that quantum theory suggests. But it is a pain in the a*** of many philosophers, and other people alike.

If you can come up with yet another (hidden) assumption in the derivation of Bell's inequalities, be my guest. Of course they exist, but they are all even worse, like: logic doesn't work, or mathematics doesn't work, or "things just happen", or...

Science is limited in its possibilities of reasoning to those that admit the existence of science as a sensible activity. Maybe this is fundamentally misguided, and we live in a "demon-haunted world", but this is what science needs to limit itself to. So if the "hidden assumption" is that one should be able to understand rationally, mathematically, logically what goes on, well, that was the scientific assumption from the start! Maybe the world is finally not that way. But we limit ourselves to this frame of thinking, because it has proven successful elsewhere.

The three assumptions are very sweeping, and therefore I suppose it could be argued that they encompass a large number of other possible assumptions. However, proof by negation still isn't proof or logic at all. I will say what I mean in a minute but...

Why would non-locality in this one case be so bad for relativists? I mean, everything else could be completely preserved in that case. It seems just like the rules of mechanics not applying to electrons because of other forces involved. Or the rules of a hockey game not applying to football for that matter.

Also, what do you mean by superdeterminism? To be honest, I see a different assumption there every time someone describes this to me. By this am I to presume that you are a determinist and believe that this is the assumption that fails? (Non-determinists don't seem to understand enough about it to separate it into different kinds)

If so then perhaps you can see why 1 and 2 can be related. If future events affect past or present events, 2 would fail, right? But what if that order was just observed by us because something was capable of traveling faster than any possible observation tool, such that a faraway event could influence a closer event in time for us to observe the closer event before we could observe the faraway one?

Proof by Negation

Regarding proof by negation, you provide an argument. I hope this argument is not considered a type of reductio ad absurdum, because the result is not logically absurd. (If not reductio ad absurdum, science fails)

I think a significant difference between proof by negation and other science is that proof by negation is being passed off as logical deductive reasoning with a strength equal to that of math, and that is not at all the case. It shouldn't be called proof, or reductio ad absurdum, or anything that implies it is related to logic.

It is absolutely possible, and it frequently occurs, that experiments are faulty because of some mistake in reasoning by the experimenter, like sampling bias. A person sampling people off of a street corner for a heart experiment might be sampling 66% coffee drinkers if there is a Starbucks 2 blocks away.

Peer review and repeated experiments help reduce these kinds of uncertainties. That is because people can identify the biases and conduct their own tests where the same biases are not likely to be present. But it is always possible that an experiment produces results that are different from what a given population will experience, and only hindsight can allow us to pick up the pieces in that case. This approach can still be considered robust, because if it fails then the failure can be used to better interpret the results; thus failure reduces in frequency over time.

However, peer review is useless in this particular case. Other scientists cannot recognize assumptions when there is no way to observe them failing. It is absolutely wrong to call any such assumption trivial. You seem to be giving the false impression that a best-evidence approach to this will work, or that this system is robust, and that is not true. What we believe Bell's assumptions are now are not likely to be proven wrong even if they are wrong, rather the evidence is so removed that we need to disregard what we think Bell's assumptions are now in order to have a chance to interpret that logically remote evidence.

vanesch
Staff Emeritus
Gold Member
Why would non-locality in this one case be so bad for relativists?

It's a basic assumption of relativity. See, relativity makes the basic assumption that whatever represents "reality" must be definable over a spacetime manifold, and it gets terribly complicated to do so if you do not make the simplifying assumption of locality. Now, one could probably do with some restricted forms of non-locality, as long as you still obtain objects over spacetime. But it is going to be terribly difficult.

But in any case, the kind of non-locality needed in Bell's type of setups is not going to be definable over a spacetime manifold, simply because it would allow you, in principle, to make "kill your grandpa" setups. Not with a Bell type system in itself, but with the non-locality required by this kind of Bell system, if the explanation is to be non-locality.

In other words, the non-locality needed to serve as an explanation for a Bell setup, would in principle also allow you to construct a device that upon reception of signal A, sends out signal B, and upon reception of signal B, sends out signal A, and which receives the signal it sends out before it did send out the signal: a paradoxical machine.

Again, Bell's setup by itself can't do so, and quantum theory tells us why. But if non-locality is going to be the *explanation* for Bell's setup (namely that Alice's particle "gets to know" what Bob's measurement choice was before it interacts at Alice's measurement), then the *lock* which would avoid a paradoxical machine (namely, locality) is broken.

I mean, everything else could be completely preserved in that case. It seems just like the rules of mechanics not applying to electrons because of other forces involved. Or the rules of a hockey game not applying to football for that matter.

It is not so simple. In relativity, EVERYTHING breaks down if you find one single exception. It is quite possible that relativity is simply wrong, of course. But the whole point is that we have no compelling evidence for this, on the contrary. If there is ONE single way of finding out an absolute reference system, then the whole building of relativity falls down, because it is built upon a symmetry of nature which puts all reference systems on equal footing. And if that happens, then it leaves us wondering why there are so many instances where things happen AS IF there were this symmetry even though it isn't there.

You provide an argument. I hope this argument is not considered a type of reductio ad absurdum, because the result is not logically absurd. (If not reductio ad absurdum, science fails)

I use reductio ad absurdum in the following sense:

1) make assumptions A and B
2) assume that mathematics and logic hold
3) derive from A and B an inconsistency

Now, the assumptions are:
A: Bell's assumptions (locality, non-superdeterminism, objective single outcomes)
B: quantum theory

from A, Bell derives his inequalities (mathematically), and from B, one derives violations of those inequalities. Hence a contradiction.

All this is pretty simple mathematics. No experimental science is involved here: it is done on a piece of paper. The derivation of Bell's inequalities from the assumptions A is pretty straightforward. Now, it is true that there are some other assumptions, but they are so basic that if you doubt them, no scientific reasoning can ever be held anymore.

For instance, it is assumed that there is some "statistical regularity". If you do a certain experiment a large number of times, you will find averages for the outcomes, and if you do that same experiment again a large number of times, without any change in an essential parameter, you will find similar averages for the outcomes. If that's not true, then no experiment ever has any value, because the next day things can be totally different. If you measure the decay time of a radioactive substance today, then tomorrow this can be totally different.

Another (related) assumption is that there is some form of causality (not necessarily deterministic): that outcomes, or the statistics of outcomes, are a function of a certain number of "physical parameters" and not of others. The negation of this assumption is that "things just happen". All correlations ever observed are just fortuitous, and there is no cause-effect relationship ever.

If you put these assumptions in doubt, then you put in doubt about 99.99% of all scientific work ever. Nature might be like that, of course. But then one has to explain how it is that the scientific method has proved so successful in many fields, even though its basic principles are totally false.

The derivation of the violation of those inequalities in quantum theory is also very simple.
There is even less doubt here, because quantum theory is quite clear on what are the predictions of the statistics of the correlations.

So we have maybe two pages of mathematics.

I think a significant difference between proof by negation and other science is that proof by negation is being passed off as logical deductive reasoning with a strength equal to that of math, and that is not at all the case. It shouldn't be called proof, or reductio ad absurdum, or anything that implies it is related to logic.

Bell's problem is a purely formal thing: it shows you that the results predicted by quantum mechanics (as a theory) are incompatible with a set of normally held basic assumptions, whose application in any other field of science would at no point even be considered doubtful.

It is absolutely possible, and it frequently occurs, that experiments are faulty because of some mistake in reasoning by the experimenter, like sampling bias. A person sampling people off of a street corner for a heart experiment might be sampling 66% coffee drinkers if there is a Starbucks 2 blocks away.

I agree with you that *experimental* science is much more difficult and prone to hidden errors, but Bell's thing is a purely formal thing on paper.

However peer review is useless when you try to use proof by negation. Other scientists cannot recognize assumptions when there is no evidence that they could fail.

They simply have to look at the two pages of calculations. Everything is there. It's a formal thing.

What we believe Bell's assumptions are now are not likely to be proven wrong even if they are wrong, rather the evidence is so removed that we need to disregard what we think Bell's assumptions are now in order to have a chance to interpret that logically remote evidence.

I don't understand that. You can go through the reasoning step by step yourself, and ask yourself what hidden assumption is made now. This is not more difficult than analyzing a mathematical proof for correctness.

But our concept of time depends on the order in which we observe things, which depends on the speed of light.

Consider the following thought experiment, perfectly acceptable under current beliefs. Bob is blind and on a 500 ft/second train car with an open side door. He passes a man aiming a gun at a road sign a good distance away. As Bob is in between the sign and the man with the gun, the man fires at the road sign. Bob hears the road sign being hit before he hears the gun being fired.

This doesn't mean that the road sign actually was hit first, just that he observed it being hit first. Thus the kind of time line required for relativity is preserved.
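To put rough numbers on this (all distances and speeds below are made-up assumptions, purely for illustration): place the shooter at the origin, the sign 1000 ft down the line, and Bob between them, 900 ft from the shooter. A quick computation shows Bob indeed hears the impact before the shot, even though the shot came first.

```python
# Hypothetical numbers only, to show that observed order can differ
# from actual order. Bob's own 500 ft/s motion is ignored here for
# simplicity; it only shifts the numbers, not the conclusion.
V_SOUND = 1116.0   # ft/s, approximate speed of sound in air
V_BULLET = 3000.0  # ft/s, assumed muzzle velocity
SIGN = 1000.0      # ft, shooter-to-sign distance (assumed)
BOB = 900.0        # ft, shooter-to-Bob distance (assumed)

t_shot_heard = BOB / V_SOUND                        # muzzle blast reaches Bob
t_impact = SIGN / V_BULLET                          # bullet actually hits the sign
t_impact_heard = t_impact + (SIGN - BOB) / V_SOUND  # impact sound reaches Bob

print(f"shot fired at t=0, sign hit at t={t_impact:.3f} s")
print(f"Bob hears impact at t={t_impact_heard:.3f} s, shot at t={t_shot_heard:.3f} s")
assert t_impact_heard < t_shot_heard  # Bob observes the two events in reverse order
```

The actual order (shot, then impact) is unambiguous; only the observed order is reversed, which is the point being made.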

The name of what is now called relativity might change, but that doesn't mean what it claims would be useless. Where is a case where there HAS to be equal footing as opposed to there just not being any evidence for either reference point being privileged? It seems logically impossible for a theory to be dependent on there being no advantaged reference point.

Proof by negation is not a logical or formal proof at all. It seems more reasonable in a closed system with no connection to reality, but the "infinite assumption" problem is still there.

A) I ate the last cracker
B) The last cracker is still there
C) By proof by negation, one of the above is false

Well, this leaves out the infinite number of assumptions like A2) someone replaced the eaten cracker, A3) I dreamt I ate the cracker...

We could call one like A4) a space alien gave me the false memory of eating the cracker... implausible or trivial, but that has no meaning in quantum physics.

Also it is wrong to say that any other assumption failing would cause 99% of science to fail. Other assumptions could easily fail in this case and not in other scientific inquiries. Just because your microscope is maladjusted now doesn't mean it was in every other experiment.

Regarding peer review, you misunderstood my objection. For example, a peer cannot identify the assumption A4) a space alien gave me the false memory that I ate the cracker... any more than I can. It wouldn't matter, if no one could ever see any evidence of that assumption failing. But what if a brain scan could somehow show that my memory had been altered, but no one would ever think to try that because this (or any other situation where my memory was altered) was not considered one of the assumptions?

This is the type of situation that makes proof by negation so dangerous - when the logical result of an unrecognized assumption is not closely connected to that unrecognized assumption.

vanesch
Staff Emeritus
Gold Member
Just to show you how elementary Bell's thing is, consider the following. This is paraphrasing from Bell's own little book (Speakable and Unspeakable in Quantum Mechanics).

Consider a device with a central button ("start"), two opposite long arms, and at the end of each arm, 3 buttons A, B and C, an indicator that "start" has been pressed, and a red and a green light. In front of each of these terminals sits an experimenter (Alice at one end, Bob at the other).

When they see the "start" indicator light up, they pick at their leisure a button A, B or C, and when they push it, the red light or the green light lights up.
They write down their choice and the outcome (red or green), and continue the experiment. The idea is that they push a certain number of times each button (A, B and C), so that there is enough statistics for each. After a large number of these experiments, they come together and compare notes.

They classify their trials in 9 possible categories:
(A,A), (A,B), (A,C), (B,A), ... (C,C).

Assuming there are enough trials in each category (this must happen if they pick "randomly enough"), they can find the counts of the 4 different possible outcomes in each case:
(green,green), (green, red), (red, green) and (red, red).

They calculate a number for each case:

C(A,A) = { N(green,green) + N(red,red) - N(red,green) - N(green,red) }/
{ N(green,green) + N(red,red) + N(red,green) + N(green,red) }

Note that this is a number between -1 and 1. If it is -1, then it means that EACH TIME, they got opposite outcomes, if it is +1, it means that EACH time, they got the same outcomes.
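In code, this statistic is a one-liner over the tallies (a minimal sketch; the counts below are made-up illustrative numbers):

```python
def correlation(n_gg, n_rr, n_rg, n_gr):
    """C for one (X, Y) button pair: +1 if the two sides always
    agreed, -1 if they always disagreed, 0 if uncorrelated."""
    total = n_gg + n_rr + n_rg + n_gr
    return (n_gg + n_rr - n_rg - n_gr) / total

print(correlation(50, 50, 0, 0))    # perfect agreement    -> 1.0
print(correlation(0, 0, 48, 52))    # perfect disagreement -> -1.0
print(correlation(25, 25, 25, 25))  # no correlation       -> 0.0
```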

Our assumption of statistical regularity means that C(A,A) is going to be a number that is independent of the instance of the experiment, and gives each time the same outcome, if we have enough statistics. So C(A,A) is going to be a property of our machine.

We will have such a number for the 9 outcomes.

We will make some extra simplifying assumptions, which are symmetry assumptions. They don't have to be true, and they don't change the argument. But they limit the number of cases that have to be examined.

The first assumption is symmetry between Alice and Bob. C(X,Y) = C(Y,X).

The second assumption is this:
C(A,A) = C(B,B) = C(C,C).

The third assumption is that in each category, there are on average as many red lights as green lights, for Bob, and for Alice.

Again, things don't need to be that way, but without these assumptions the maths gets much harder. So let us restrict ourselves to this kind of case.

From the C-values, we will try to deduce some properties of the internal workings of our machine.

Let us first consider some "toy examples" to warm up.

Consider that for all 9 C's, we have C(A,A) = 1, C(A,B) = 1, ... C(C,C) = 1.

In other words, no matter what buttons Alice and Bob press, they always obtain the same outcome on each side. That's of course simple to implement: the knobs A, B and C are simply without function, and at the "start" signal, a signal is sent from the central box to the two extremities, which indicates whether the green or the red light should light up.

Other toy example: C(A,A) = C(A,B) = ... = -1.

Similar explanation, only opposite signals are now sent out.

Third toy example: C(A,A) = C(A,B) = ... C(C,C) = 0.

Of course this can be obtained with a random signal sent to both sides. But we could also think of it as no connection between the boxes at all, and independent random generators in each experimental box.

Now, Bell analysed to what extent there is a link between the different C(X,Y) values.

He assumed the following: in Alice's box, a random process is going to make the green or the red light go on, and the probabilities of this to happen depend upon two things:
1) a signal that comes from the central box, call it Lambda
2) whether Alice pushes A, B or C.

This is the assumption of locality: the fact that the green or the red lamp will go on, can only depend on what is locally available: Alice's choice and an eventual signal coming from the central box. It cannot depend on something remote that didn't come in (in time) to become "local".

So we have that P_red_alice is a mathematical function, depending on Alice's choice (A,B,C) and on lambda. Same for P_green_alice, but it is of course equal to 1-P_red_alice.

So we have 3 functions of lambda:
P_A(Lambda)
P_B(Lambda)
P_C(Lambda)

We didn't specify yet what Lambda was, but it is the signal that comes from the central box.

We have a similar reasoning at Bob's of course:
Q_A(Lambda)
Q_B(Lambda)
Q_C(Lambda)

We can include in Lambda the signal that will be sent both to Alice's side and to Bob's side. As such, Alice's side "knows" everything that the central box sent to Bob's side and vice versa. All this is included in Lambda.
Lambda will have a certain statistical distribution in the central box (as does any signal).

And now comes this famous assumption of no superdeterminism. We use it twice. It says essentially that random variables which have no clear causal link (direct or indirect) are statistically independent.

We assume that Lambda is not correlated in any means with Alice's or Bob's choices. Indeed, we assume that the picking of a Lambda when it is sent out from the central box outward, and, a bit later, when Alice and Bob pick their choices for their A, B or C, that these two phenomena are not indirectly causally related (that is, that Alice picking A has nothing to do with a previous phenomenon which also influenced the central box). We also assume that they are not directly causally related (that the central box has no direct influence on what Alice is going to choose).

The next thing we assume is again no superdeterminism, and this time we assume that the random process that picks locally the red or green light at Alice according to the above probability is going to be statistically independent from the random process that does the same at Bob's side.

From these assumptions, and the assumption of statistical regularity, we can derive that the probability, for a given Lambda, when Alice picked X and Bob picked Y, to have red-red, is given by:

P(X,Lambda) Q(Y,Lambda).

From this, we calculate the expectation value of the correlation for a given Lambda, which is nothing else but its value (+1 or -1) weighted with the probabilities of the 4 mutually exclusive events:
red-red, green-green, red-green and green-red.
We use here the third assumption: a single outcome at each side (mutually exclusive events). Note that the total probability is indeed 1.

D(X,Y,Lambda) = (P(X,Lambda)Q(Y,Lambda) + (1-P(X,Lambda)) (1 - Q(Y,Lambda)) - P(X,Lambda)(1-Q(Y,Lambda)) - (1-P(X,Lambda))Q(Y,Lambda))

which simplifies quickly (algebra) to:

D(X,Y,Lambda) = (2 P(X,Lambda) - 1) (2 Q(Y,Lambda) - 1)
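That bit of algebra is easy to double-check numerically; a quick sanity check over random values of the two probabilities:

```python
import random

def d_expanded(p, q):
    # (+1)*P(red,red) + (+1)*P(green,green) + (-1)*P(red,green) + (-1)*P(green,red)
    return p*q + (1 - p)*(1 - q) - p*(1 - q) - (1 - p)*q

def d_factored(p, q):
    return (2*p - 1) * (2*q - 1)

rng = random.Random(0)
for _ in range(10000):
    p, q = rng.random(), rng.random()
    assert abs(d_expanded(p, q) - d_factored(p, q)) < 1e-12
print("expanded and factored forms agree")
```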

Now, the correlation C(X,Y) is this expectation value, weighted over the distribution of Lambda, making again use of the assumption of statistical regularity. We know (no superdeterminism) that this Lambda distribution is independent of X or Y.

So:
C(X,Y) = < (2 P(X,Lambda) - 1) (2 Q(Y,Lambda) - 1) >_Lambda

Now, consider the specific case where C(A,A) = C(B,B) = C(C,C) = 1. That means, full correlation when Alice and Bob make the same choices.

As the number (2 P(X,Lambda) - 1) (2 Q(Y,Lambda) - 1) is a number between -1 and 1, the only way for its average over Lambda to be equal to 1 is for it to be equal to 1 for all values of Lambda.

Hence, we have for all lambda:
P(A,Lambda) = 1 and Q(A,Lambda) = 1 OR
P(A,Lambda) = 0 and Q(A,Lambda) = 0

and same for B and C.

As such, we can write:
P(A,Lambda) = Q(A,Lambda) = 1 or 0
P(B,Lambda) = Q(B,Lambda) = 1 or 0
P(C,Lambda) = Q(C,Lambda) = 1 or 0

Note the funny CONSEQUENCE of the full correlations: we have local probabilities of only 1 or 0. In other words, the outcome (green or red light) is ENTIRELY FIXED by lambda and by the choice of the user (A, B or C). This isn't a hypothesis, it is a consequence!

So this means that we have 8 classes of Lambda (two choices for each of the three buttons: 2^3 = 8):
Lambda in L1 means: P(A,Lambda) = Q(A,Lambda) = +1 AND
P(B,Lambda) = Q(B,Lambda) = +1 AND
P(C,Lambda) = Q(C,Lambda) = +1

Lambda in L2 means: P(A,Lambda) = Q(A,Lambda) = +1 AND
P(B,Lambda) = Q(B,Lambda) = +1 AND
P(C,Lambda) = Q(C,Lambda) = 0

Lambda in L3 means: P(A,Lambda) = Q(A,Lambda) = +1 AND
P(B,Lambda) = Q(B,Lambda) = 0 AND
P(C,Lambda) = Q(C,Lambda) = +1

etc... (running in binary order through all 2^3 combinations, down to L8, where all three values are 0).

So the a priori daunting distribution over potentially complicated signals reduces to a distribution over 8 classes, with probabilities p1, p2, p3, ... p8, such that p1 + p2 + ... + p8 = 1.

Note also that (no superdeterminism) we assume that p1, p2, ... p8 are fixed numbers, independent of the choices of alice and bob, and describe fully the setup.
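The whole machine can now be simulated in a few lines. A minimal sketch, with the eight lambda classes as deterministic answer tables in binary order (the weights chosen for p1..p8 below are an arbitrary illustrative distribution, not derived from anything):

```python
import random

# Each lambda class fixes the outcome (1 = green, 0 = red) for each of
# the three buttons, identically on both sides: L1=(1,1,1) ... L8=(0,0,0).
classes = [(a, b, c) for a in (1, 0) for b in (1, 0) for c in (1, 0)]
weights = [0.2, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.2]  # p1..p8, sum to 1

def run_trial(x, y, rng):
    """Alice presses button x, Bob presses y (0=A, 1=B, 2=C). Locality:
    each lamp depends only on the local button and the shared lambda."""
    lam = rng.choices(classes, weights)[0]
    return lam[x], lam[y]

def C(x, y, n=100000, seed=0):
    rng = random.Random(seed)
    s = sum(1 if a == b else -1
            for a, b in (run_trial(x, y, rng) for _ in range(n)))
    return s / n

A, B = 0, 1
print(C(A, A))  # exactly 1.0: same button, same lambda, on both sides
print(C(A, B))  # some number between -1 and 1, set by the weights
```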

We can calculate the 9 C(X,Y) values as a function of p1, ... p8. Of course, 3 of them are already fixed: C(A,A) = C(B,B) = C(C,C) = 1.

We find, for instance, that:

C(A,B) = p1 + p2 + p7 + p8 - p3 - p4 - p5 - p6
C(A,C) = p1 + p3 + p6 + p8 - p2 - p4 - p5 - p7

...

We also have the assumption of equal probability of having red or green light on each side individually, from which it follows:
p1 + p2 + p3 + p4 = p5 + p6 + p7 + p8 = 1/2
p1 + p3 + p5 + p7 = p2 + p4 + p6 + p8 = 1/2
p1 + p2 + p5 + p6 = p3 + p4 + p7 + p8 = 1/2

From this, it can be shown that there are 4 independent variables, for instance:
p1, p2, p3 and p7.

We then have:
p4 = 1/2 - p1 - p2 - p3
p5 = 1/2 - p1 - p3 - p7
p6 = -p2 + p3 + p7
p8 = p1 + p2 - p7.

This gives you:
C(A,B) = -1 + 4p1 + 4p2
C(B,C) = - 4p3 - 4p7 + 1
C(A,C) = -1 + 4p1 + 4p3
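These compact forms can be verified mechanically: draw the four free parameters at random, reconstruct p4..p8 from the constraints, and compare against the correlations summed directly over the eight classes (the sampling range below is just to keep the check simple):

```python
import random

rng = random.Random(42)
for _ in range(1000):
    p1, p2, p3, p7 = (rng.uniform(0, 0.1) for _ in range(4))
    # Dependent probabilities from the marginal constraints:
    p4 = 0.5 - p1 - p2 - p3
    p5 = 0.5 - p1 - p3 - p7
    p6 = -p2 + p3 + p7
    p8 = p1 + p2 - p7
    # Correlations summed directly over the eight lambda classes
    # (+p for classes where the two buttons agree, -p where they differ):
    c_ab = p1 + p2 + p7 + p8 - p3 - p4 - p5 - p6
    c_ac = p1 + p3 + p6 + p8 - p2 - p4 - p5 - p7
    c_bc = p1 + p4 + p5 + p8 - p2 - p3 - p6 - p7
    # ...must match the compact forms:
    assert abs(c_ab - (-1 + 4*p1 + 4*p2)) < 1e-12
    assert abs(c_ac - (-1 + 4*p1 + 4*p3)) < 1e-12
    assert abs(c_bc - (1 - 4*p3 - 4*p7)) < 1e-12
print("all three relations hold")
```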

This is hence the kind of relation that must hold for the correlations of our kind of device, no matter what happens inside, as long as the assumptions we set out hold, and as long as we have strict correlation C(A,A) = C(B,B) = C(C,C) = 1.

Now, for a simple quantum system with entangled spin-1/2 systems, one can easily calculate that:
Cqm(th1,th2) = 2 sin^2{(th1-th2)/2} - 1

So if we use such pairs of entangled particles in our machine (and invert the red and green light on one side, so that the correlations change sign), then we can find the quantum predictions for analyser settings:
A - 0 degrees
B - 45 degrees
C - 90 degrees

where now our C is minus Cqm (so that we still have C(A,A) = 1 etc...)

-Cqm(A,B) = -Cqm(B,C) = 1/sqrt(2)
-Cqm(A,C) = 0
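Plugging the three analyser settings into the quantum formula reproduces exactly these numbers (angles in degrees, per the settings listed above):

```python
import math

def c_qm(th1_deg, th2_deg):
    """Quantum correlation for entangled spin-1/2 pairs:
    Cqm(th1, th2) = 2 sin^2((th1 - th2)/2) - 1."""
    half = math.radians(th1_deg - th2_deg) / 2.0
    return 2 * math.sin(half)**2 - 1

A, B, C = 0.0, 45.0, 90.0  # analyser settings in degrees

print(-c_qm(A, A))  # 1.0, the perfect correlation C(A,A) = 1
print(-c_qm(A, B))  # ~0.70711 = 1/sqrt(2)
print(-c_qm(B, C))  # ~0.70711 = 1/sqrt(2)
print(-c_qm(A, C))  # ~0, i.e. C(A,C) = 0
```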

These 3 equations fix 3 of the 4 degrees of freedom we had: (p1,p2,p3 and p7).
p2 = 1/8 (2 + sqrt(2) - 8p1)
p3 = 1/4 (1 - 4 p1)
p7 = 1/8 (8 p1 - sqrt(2))

from this follows then:
p4 = p1 - 1/(4 sqrt(2))
p5 = 1/8 (2 + sqrt(2) - 8 p1)
p6 = p1 - 1/(2 sqrt(2))
p8 = 1/4(1+sqrt(2) - 4 p1)

Now, all these px are numbers between 0 and 1, and the point is that you can't find such a solution:

from p3 follows: p1 < 1/4

from p6 follows: p1 > 1/(2 sqrt(2))

But 1/(2 sqrt(2)) > 1/4, hence you can't find a p1 that makes that all our px are between 0 and 1.
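The clash can also be exhibited by brute force: express all eight weights in terms of the single remaining free parameter p1, using the formulas above, and scan the whole unit interval; no value of p1 keeps every weight inside [0, 1].

```python
import math

S2 = math.sqrt(2)

def weights(p1):
    """The eight lambda-class probabilities as functions of p1, for the
    quantum values C(A,B) = C(B,C) = 1/sqrt(2) and C(A,C) = 0."""
    p2 = (2 + S2 - 8*p1) / 8
    p3 = (1 - 4*p1) / 4
    p4 = p1 - 1/(4*S2)
    p5 = (2 + S2 - 8*p1) / 8
    p6 = p1 - 1/(2*S2)
    p7 = (8*p1 - S2) / 8
    p8 = (1 + S2 - 4*p1) / 4
    return [p1, p2, p3, p4, p5, p6, p7, p8]

def valid(p1):
    return all(0.0 <= p <= 1.0 for p in weights(p1))

# The weights always sum to 1, but positivity fails for every p1:
assert all(not valid(i / 100000) for i in range(100001))
print("no valid p1 exists:", 1/4, "<", 1/(2*S2))  # the p3 vs p6 constraints clash
```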

As such, the quantum prediction fails to satisfy the kind of correlations we derived using our black box machine.

vanesch
Staff Emeritus
Gold Member
It seems logically impossible for a theory to be dependent on there being no advantaged reference point.

Nevertheless, that's the starting point of many physical theories. It goes under the name of "symmetry". For instance, the law of conservation of momentum comes from the fact that we cannot make any distinction between space, and space that has undergone a translation. The day you find such a distinction, our REASON for conservation of momentum breaks down. The conservation law can still hold, but we would be puzzled as to WHY.

The assumption of the impossibility of distinguishing a certain thing over another - in other words, the assumption of a symmetry - has proven to be extremely powerful.

A) I ate the last cracker
B) The last cracker is still there
C) By proof by negation, one of the above is false

There are a few hidden assumptions:
1) there is only one last cracker
2) a last cracker that is eaten cannot be at the same time "there" (uniqueness of state of last cracker)

If THESE assumptions are included, then indeed the proof holds.

Well, this leaves out the infinite number of assumptions like A2) someone replaced the eaten cracker, A3) I dreamt I ate the cracker...

A2 would fail the "there is only one last cracker" assumption
A3 fails the first assumption that I ate it. I only dreamt that I did, but I didn't.

We could call one like A4) a space alien gave me the false memory of eating the cracker... implausible or trivial, but that has no meaning in quantum physics.

The false memory of me thinking I ate the cracker still doesn't mean that I did eat it, hence assumption A is false and the proof still holds (that A or B or 1 or 2 has to be false).

Also it is wrong to say that any other assumption failing would cause 99% of science to fail.

Superdeterminism, or no statistical regularity would make a lot of science hopeless.

Regarding peer review, you misunderstood my objection. For example, a peer cannot identify the assumption A4) a space alien gave me the false memory that I ate the cracker... any more than I can. It wouldn't matter, if no one could ever see any evidence of that assumption failing. But what if a brain scan could somehow show that my memory had been altered, but no one would ever think to try that because this (or any other situation where my memory was altered) was not considered one of the assumptions?

Yes, but the proof still holds: the cracker wasn't eaten, you only have a memory of it being eaten. You didn't put in A: I remember eating the cracker, you put there: the cracker was ontologically eaten.

Interesting discussion.

Bell said that the crucial assumption was locality. This was incorporated into the formulation by putting the probability of coincidental detection into factorable form.

But is Bell's locality condition really a locality condition?

Suppose that the locality assumption is just a statistical independence assumption.

Just to show you how elementary Bell's thing is, consider the following. This is paraphrasing from Bell's own little book …….
It may be elementary to some but the point sure escapes me. Easy enough to set up a few real numbers for the first half of your post. Maybe something like:

First, the (Alice, Bob) results without considering what the other is doing, either in how the other calibrates their personal box or in the A, B or C choice the other makes during the testing. While receiving the Lambda, each fine-tunes the three choices on their box independently until they achieve the following consistent results.

(P Alice, Bob): a probability of 1.0 of course means the result is achieved 100% of the time.
Alice & three choices;
(A,-) (1.0 Red, -) (0.0 Green, -)
(B,-) (0.2 Red, -) (0.8 Green, -)
(C,-) (0.5 Red, -) (0.5 Green, -)

Bob & three choices;
(-,A) (1.0 - , Red) (0.0 - , Green)
(-,B) (0.8 - , Red) (0.2 - , Green)
(-,C) (0.5 - , Red) (0.5 - , Green)

Once calibrated the settings remain the same throughout the test as the timed Lambda signals received and random local three button choices are the only inputs.

After collecting sufficient data to produce a fair sampling, all cataloged in order of the Lambda input received from the common distant source, the nine possible different outcomes produce the following results when correlated, along with a calculated number “C” ranging from 1.0 to -1.0. (Negative values are for calculation purposes only.)

1 (A,A) (1.0 Red, Red) (0.0 Red, Green) (0.0 Green, Red) (0.0 Green, Green) 1.0
2 (A,B) (0.8 Red, Red) (0.2 Red, Green) (0.0 Green, Red) (0.0 Green, Green) 0.6
3 (A,C) (0.0 Red, Red) (0.5 Red, Green) (0.5 Green, Red) (0.0 Green, Green) -1.0

4 (B,A) (0.2 Red, Red) (0.0 Red, Green) (0.8 Green, Red) (0.0 Green, Green) -0.6
5 (B,B) (0.0 Red, Red) (0.2 Red, Green) (0.8 Green, Red) (0.0 Green, Green) -1.0
6 (B,C) (0.0 Red, Red) (0.2 Red, Green) (0.5 Green, Red) (0.3 Green, Green) -0.4

7 (C,A) (0.5 Red, Red) (0.0 Red, Green) (0.5 Green, Red) (0.0 Green, Green) 0.0
8 (C,B) (0.3 Red, Red) (0.2 Red, Green) (0.5 Green, Red) (0.0 Green, Green) -0.4
9 (C,C) (0.0 Red, Red) (0.5 Red, Green) (0.5 Green, Red) (0.0 Green, Green) -1.0
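To make the bookkeeping concrete: the final number in each row is just the probability of agreeing outcomes minus the probability of disagreeing ones. A quick sketch (Python, with two of the rows above plugged in for illustration):

```python
# Correlation value for one (Alice, Bob) setting pair:
# C = P(red,red) + P(green,green) - P(red,green) - P(green,red)
def correlation(p_rr, p_rg, p_gr, p_gg):
    return p_rr + p_gg - p_rg - p_gr

# Row 2, setting pair (A,B): 0.8 - 0.2 = 0.6
print(round(correlation(0.8, 0.2, 0.0, 0.0), 3))  # 0.6
# Row 6, setting pair (B,C): 0.3 - 0.2 - 0.5 = -0.4
print(round(correlation(0.0, 0.2, 0.5, 0.3), 3))  # -0.4
```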

But I cannot get past the following:

…….. From these assumptions, and the assumption of statistical regularity, we can derive, that the probability, given a certain Lambda, when Alice picked X and Bob picked Y, to have red-red, is given by:

P(X,Lambda) Q(Y,Lambda).

From this, we calculate the expectation value of the correlation for a given Lambda, which is nothing else but its value (+1 or -1) weighted with the probabilities of the 4 mutually exclusive events:
red-red, green-green, red-green and green-red.

I assume we are attempting to define the range of, and values for, Lambda, along with detail which can be applied to six different functions (3 for Alice & 3 for Bob) to achieve the resulting probability distribution shown in “the 4 mutually exclusive events”. Your wording above seems to imply that the results of “the 4 mutually exclusive events” are somehow weighted into or with Lambda, which would open a window to allow superdeterminism, which is not allowed.
For now I’ll assume this is just unclear wording I cannot sort out.

My main problem is understanding “we calculate the expectation value of the correlation for a given Lambda, which is nothing else but its value (+1 or -1)”
Exactly what is the “it” in “nothing else but its value”?
If we are setting a limit on the range and values of Lambda as only being +1 or -1 that hardly seems fair.
Or are we talking about the “expectation value”?
Are you saying that it should be restricted to two possible values (+1 or -1)?
Or is that a range with values to match the “calculated number” defined as “C”, where my above example has values of (1.0, 0.0, 0.6, -0.4, -0.6, -1.0)?

It is not clear to me what is being produced here so that it “simplifies quickly” –
Apparently into functions that only have “1” or “0” results.
Nor is it clear what kind of limits or restrictions are placed on the values of Lambda or the calculated “expectation value”.

Is the point to get down to defining the “9 possible categories” with only 8 available probability classes?

You suggested that Lambda include a LA for Alice and a LB for Bob with both LA and LB being sent in both directions making both available to Alice and Bob. If we allow each of these to be independent variables accessible to the A, B, & C functions defined for both sides wouldn’t that mean more than 8 probability classes would be required here?

If the required addition of probability classes would ruin this argument or proof (even though I don’t understand it), is there anything to justify restricting LB to something directly derivable from LA to avoid that problem?

RB

Nevertheless, that's the starting point of many physical theories. It goes under the name of "symmetry". For instance, the law of conservation of momentum comes from the fact that we cannot make any distinction between space, and space that has undergone a translation. The day you can find such a distinction, our REASON for conservation of momentum breaks down. It can still hold, but we would be puzzled as to WHY.

The assumption of the impossibility of distinguishing a certain thing over another - in other words, the assumption of a symmetry - has proven to be extremely powerful.

There are a few hidden assumptions:
1) there is only one last cracker
2) a last cracker that is eaten cannot be at the same time "there" (uniqueness of state of last cracker)

If THESE assumptions are included, then indeed the proof holds.

A2 would fail the "there is only one last cracker" assumption
A3 fails the first assumption that I ate it. I only dreamt that I did, but I didn't.

The false memory of me thinking I ate the cracker still doesn't mean that I did eat it, hence assumption A is false and the proof still holds (that A or B or 1 or 2 has to be false).

Superdeterminism, or no statistical regularity would make a lot of science hopeless.

Yes, but the proof still holds: the cracker wasn't eaten, you only have a memory of it being eaten. You didn't put in A: I remember eating the cracker, you put there: the cracker was ontologically eaten.

I have an aversion to sorting through 2-page arguments that might end up not saying anything new anyway, but I will go back and do so when I have time, because you are an honest debater and took the time to write it. First, though, I wanted to comment on the cracker scenario.

You are right that some of the assumptions can fall under the ones I listed. More importantly, you are right that there are more assumptions that someone who thinks about it for more than a second after seeing the cracker still there can recognize.

But are we to now suddenly believe we have all the assumptions just like we did a minute ago before we recognized the "one last cracker" assumption? Do we just wait for additional situations to show us more assumptions that we forgot?

Or do we make a general statement about proof by negation: it is not a deductive argument, as it is subject to the limits of INDUCTION, and this exposure is not limited to the premises as in a real deductive argument?

Also consider the assumptions that fall under the first 2. True, if a space alien put images in our brain of eating the cracker, we never really ate it. But what if we defined assumption 1) eating the cracker as our memory instead of ACTUALLY eating the cracker? Then that assumption is separate. Thus our ability to list all assumptions depends on our ability to be self-aware enough to know exactly what we are assuming.

A bit abstract, but to tie it to Bell: the locality assumption is not called universal locality. It is just called locality. So we consider the likelihood of it failing based on how well relativity explains things. In reality, though, locality could still explain those things yet fail in only certain situations. Thus our assumption is not really accurate in that regard.

Just like I might value the assumption that I ate the cracker based on how well I remember eating it, and think it impossible that was the assumption that failed.

vanesch
Staff Emeritus
Gold Member
It may be elementary to some but the point sure escapes me. Easy enough to set up a few real numbers for the first half of your post. Maybe something like:

First, the (Alice, Bob) results without considering what the other is doing, either in how the other calibrates their personal box or in the A, B or C choice the other makes during the testing. While receiving the Lambda, each fine-tunes the three choices on their box independently until they achieve the following consistent results.

(P Alice, Bob): a probability of 1.0 of course means the result is achieved 100% of the time.
Alice & three choices;
(A,-) (1.0 Red, -) (0.0 Green, -)
(B,-) (0.2 Red, -) (0.8 Green, -)
(C,-) (0.5 Red, -) (0.5 Green, -)

Bob & three choices;
(-,A) (1.0 - , Red) (0.0 - , Green)
(-,B) (0.8 - , Red) (0.2 - , Green)
(-,C) (0.5 - , Red) (0.5 - , Green)

Once calibrated the settings remain the same throughout the test as the timed Lambda signals received and random local three button choices are the only inputs.

After collecting sufficient data to produce a fair sampling, all cataloged in order of the Lambda input received from the common distant source, the nine possible different outcomes produce the following results when correlated, along with a calculated number “C” ranging from 1.0 to -1.0. (Negative values are for calculation purposes only.)

C is not a probability, but the value of the correlation function, which is given by the expression you also use. It is the expectation value of the random variable which equals +1 for (green, green) and (red, red), and -1 for (red, green) and (green, red), as these are the 4 possible outcomes once we are within a category of choice such as (A,C).

You use it correctly, btw. UNDER THE ASSUMPTION - which is not necessarily true! - that Alice's and Bob's results are statistically INDEPENDENT. So the result you have is only true for a single Lambda!

1 (A,A) (1.0 Red, Red) (0.0 Red, Green) (0.0 Green, Red) (0.0 Green, Green) 1.0
2 (A,B) (0.8 Red, Red) (0.2 Red, Green) (0.0 Green, Red) (0.0 Green, Green) 0.6
3 (A,C) (0.0 Red, Red) (0.5 Red, Green) (0.5 Green, Red) (0.0 Green, Green) -1.0

4 (B,A) (0.2 Red, Red) (0.0 Red, Green) (0.8 Green, Red) (0.0 Green, Green) -0.6
5 (B,B) (0.0 Red, Red) (0.2 Red, Green) (0.8 Green, Red) (0.0 Green, Green) -1.0
6 (B,C) (0.0 Red, Red) (0.2 Red, Green) (0.5 Green, Red) (0.3 Green, Green) -0.4

7 (C,A) (0.5 Red, Red) (0.0 Red, Green) (0.5 Green, Red) (0.0 Green, Green) 0.0
8 (C,B) (0.3 Red, Red) (0.2 Red, Green) (0.5 Green, Red) (0.0 Green, Green) -0.4
9 (C,C) (0.0 Red, Red) (0.5 Red, Green) (0.5 Green, Red) (0.0 Green, Green) -1.0

But I cannot get past the following:

I assume we are attempting to define the range of, and values for, Lambda, along with detail which can be applied to six different functions (3 for Alice & 3 for Bob) to achieve the resulting probability distribution shown in “the 4 mutually exclusive events”. Your wording above seems to imply that the results of “the 4 mutually exclusive events” are somehow weighted into or with Lambda, which would open a window to allow superdeterminism, which is not allowed.
For now I’ll assume this is just unclear wording I cannot sort out.

The 4 mutually exclusive events are (red,red), (green,red), (red,green) and (green,green). You cannot have more than one of them at the same time (unless we take an MWI approach, hence the explicit condition of a unique outcome at each side!).

GIVEN a Lambda (whatever it is, and Alice and Bob will not be able to see Lambda, it is a hidden variable), and GIVEN a choice A, B or C at Alice for instance, this will give you a probability that Alice sees red, or green. Note that Lambda can be anything: a text file, an electromagnetic signal, whatever. But it is something which has been sent out by the central box, and the SAME Lambda is sent to Alice and to Bob. You could object here: why is not Lambda1 sent to Alice, and Lambda2 sent to Bob ? That's no problem: call in that case, Lambda the union of Lambda1 and Lambda2 (of which Alice's box is free just to use only the Lambda1 part). So if Lambda1 is a 5K text file, and Lambda2 is a 5K text file, call Lambda the 10K text file which is the concatenation of Lambda1 and Lambda2.

Alice's box receives 2 inputs: Lambda (from the central box) - of which it is free just to use a part, like Lambda1, and Alice's choice (A,B or C). It can also have local random processes, which, based upon the value of Lambda, and the value of the choice, will DRAW an outcome (red or green). We assume that the probability for red is determined by Lambda and Alice's choice: P(A,Lambda).

Note that we also assume that this probability is not a function of the previous history of Alice's choices, and outcomes. This is part of the assumption of "statistical regularity". Each event is supposed to be statistically independent of a previous event.

My main problem is understanding “we calculate the expectation value of the correlation for a given Lambda, which is nothing else but its value (+1 or -1)”
Exactly what is the “it” in “nothing else but its value”?

The correlation is a random variable (that is, it is a function over the space of all possible outcomes, in this case there are 4 of them: (red,red), (red,green), (green,red) and (green,green) ). It takes on the value +1 for (red,red) and (green,green), and it takes on the value -1 for the outcomes (green,red) and (red,green).

The expectation value of a random variable is the value of it for a given outcome, weighted by the probability of that outcome, and summed over all outcomes.

If we are setting a limit on the range and values of Lambda as only being +1 or -1 that hardly seems fair.

No, we are talking about the values of the random variable "correlation", not about Lambda. Lambda can be a long text file ! The "value" of lambda (a possible text file say) is just an argument in the probability function. So for each different Lambda (there can be many of them, as many as you can have different 5K text files), you have different values of the probabilities P(A,Lambda), and hence of the expectation value of the correlation function C, which we write < C >

For each different Lambda, we have a different value of < C >. I called this D.
D is a function of Alice's choice X (one of A,B,C), and Bob's choice Y (one of A,B,C), and Lambda.
Given an X, given a Y and given a Lambda, we have the probabilities P(X,Lambda) for Alice to see a red light, and Q(Y,Lambda) for Bob to see a red light.

We ASSUME (statistical independence: no superdeterminism) that whatever random process is going to determine the "drawing" of red/green at Alice (with probability P(X,Lambda)) is going to be statistically independent of a similar drawing at Bob (with probability Q(Y,Lambda)), and hence, the probability for having, say, (red,red), is given by P(X,Lambda) Q(Y,Lambda) exactly as you did this in your own calculation - with the exception that we now do the calculation FOR A GIVEN LAMBDA. As such, we can (as you did), calculate D:

D(X,Y,Lambda).
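As a sanity check of the independence step: for a fixed Lambda the joint probabilities factor into products, and the expectation of the +/-1 correlation variable collapses to a simple closed form. A sketch in Python (the probability values in the loop are arbitrary placeholders):

```python
# For one fixed hidden Lambda, with statistically independent local draws:
# p = P(X, Lambda) is Alice's probability of red, q = Q(Y, Lambda) is Bob's.
def D(p, q):
    # +1 for (red,red) and (green,green), -1 for the mixed outcomes,
    # each weighted by its factored probability.
    return p*q + (1-p)*(1-q) - p*(1-q) - (1-p)*q

# "Just some algebra": the sum above always equals (1 - 2p)(1 - 2q).
for p, q in [(0.0, 0.0), (1.0, 0.0), (0.3, 0.7), (0.5, 0.5)]:
    assert abs(D(p, q) - (1 - 2*p)*(1 - 2*q)) < 1e-12
```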

It is not clear to me what is being produced here so that it “simplifies quickly” –

Just some algebra!

Apparently into functions that only have “1” or “0” results.
Nor is it clear what kind of limits or restrictions are placed on the values of Lambda or the calculated “expectation value”.

Now, the idea is that Lambda (the text file) is unknown to Bob and Alice (only to their box). So Bob and Alice cannot "sort" their outcomes according to Lambda, they only see an AVERAGE over ALL Lambda values. So our D(X,Y,Lambda) must still be averaged over all possible Lambda values, which can be very many. We assume that Lambda has a statistical distribution over all of its possible values (the set of 5K text files, say). If we make that average, we will find the correlation that Alice and Bob CAN measure.

So we consider that there is a certain probability function over the set (huge) of all possible Lambda values (all possible text files), and we are going to calculate the expectation value of D over this set:

C(X,Y) = D(X,Y,Lambda_1) x P(Lambda_1) + D(X,Y,Lambda_2) x P(Lambda_2) + ...
+ D(X,Y,Lambda_205234) x P(Lambda_205234) + ...

This is a priori a very long sum!
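A toy version of that long sum, assuming (purely for illustration) just three possible Lambda values with made-up weights and made-up per-Lambda probabilities:

```python
# Average of D(X, Y, Lambda) over a hypothetical Lambda distribution.
# Each entry: weight P(Lambda), Alice's red probability P(X, Lambda),
# Bob's red probability Q(Y, Lambda), for some fixed choices X, Y.
lambdas = [
    (0.5, 1.0, 1.0),   # this Lambda forces (red, red)
    (0.3, 0.0, 0.0),   # this one forces (green, green)
    (0.2, 0.7, 0.2),   # this one leaves the outcomes genuinely random
]

def D(p, q):
    # per-Lambda expectation of the +/-1 correlation: (1 - 2p)(1 - 2q)
    return (1 - 2*p) * (1 - 2*q)

# The observable correlation is the weighted sum over all Lambda values.
C_XY = sum(w * D(p, q) for (w, p, q) in lambdas)
print(round(C_XY, 3))  # 0.752
```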

However, we show that *in the case of perfect correlations*, C(A,A) = 1 implies that ALL D(A,A,Lambda) values must be equal to 1!
Indeed, D is a number between -1 and 1, and P(Lambda) is a distribution with sum = 1.
The only way for such a weighted sum to be equal to 1 is that ALL D(A,A,Lambda) = 1. One single D(A,A,Lambda) less than 1, and the sum cannot be 1, but must be less.

So we know that ALL D(A,A,Lambda) = 1. But D(A,A,Lambda) = (1 - 2 P(A,Lambda)) (1 - 2 Q(A,Lambda)), and P and Q are probabilities (numbers between 0 and 1).

The only way to have (1 - 2x)(1 - 2y) = 1, with x between 0 and 1 and y between 0 and 1, is by having either x = y = 1 (then it is (-1)(-1)) or x = y = 0 (then it is (1)(1)).

All other values of x or y will give you a number that is less than 1 for (1-2x)(1-2y).
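This corner property is easy to verify numerically; a brute-force check over a grid of x, y values in [0, 1] (purely illustrative):

```python
# Find all grid points where (1 - 2x)(1 - 2y) equals 1 exactly.
hits = [(x / 10, y / 10)
        for x in range(11) for y in range(11)
        if abs((1 - 2 * x / 10) * (1 - 2 * y / 10) - 1.0) < 1e-12]
print(hits)  # [(0.0, 0.0), (1.0, 1.0)]
```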

So this means that from our requirement for all Lambda to have D(A,A,Lambda) = 1, it follows that for each Lambda, P(A,Lambda) = Q(A,Lambda), and moreover that for each Lambda, we have either P(A,Lambda) = 1 or P(A,Lambda) = 0.

This means that we can split up the big set of all Lambda (the big set of text files) in two pieces: a piece of all those Lambda which give P(A,Lambda) = 1 and a complementary piece which gives P(A,Lambda) = 0.
Concerning P(A,Lambda), we hence don't need any exact value of Lambda, but only to know in which of the two halves Lambda resides.

C(B,B) = 1 does the same for P(B,Lambda), but of course, the slicing up of the big set of Lambdas will be different now. So we now need to know, for a given Lambda, P(A,Lambda) and P(B,Lambda), in which of the 4 possible "slices" Lambda falls (2 slices for P(A), and 2 slices for P(B) gives in total 4 different "pieces of the Lambda-cake"). We can sum all probabilities over these 4 slices, and only need to know what is the probability for Lambda to be in the 1st slice, the second one, the third one and the fourth one, because within each of these slices, P(A,Lambda) and P(B,Lambda) are known (it is either 1 or 0).

Same for C(C,C), and hence, in the end, we only need in total 8 different "Lambda slices", with their summed probabilities: p1, p2, ... p8. In each slice, P(A,Lambda), P(B,Lambda), P(C, lambda) take on a well-defined value (either 1 or 0).
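The 8 slices can be enumerated directly: each slice is an assignment of 0 or 1 to (P(A), P(B), P(C)), with Q forced equal to P by the perfect correlations. A sketch in Python, including a Bell-type constraint such models then satisfy; note that the inequality |C(A,B) - C(A,C)| <= 1 - C(B,C) is my own illustration (the perfectly-correlated analogue of Bell's original inequality), not a formula quoted from the thread:

```python
from itertools import product

# The 8 Lambda slices: in each, P(A), P(B), P(C) are fixed to 0 or 1
# (and Q = P, forced by C(A,A) = C(B,B) = C(C,C) = 1).
slices = list(product([0, 1], repeat=3))
idx = {"A": 0, "B": 1, "C": 2}

def C(X, Y, weights):
    # weights[i] = summed probability p_i of Lambda landing in slice i.
    # Within a slice, the correlation is +1 if the two settings give the
    # same color, -1 otherwise.
    return sum(w * (1 if s[idx[X]] == s[idx[Y]] else -1)
               for s, w in zip(slices, weights))

w = [1 / 8] * 8                    # one hypothetical choice of p1..p8
print(round(C("A", "A", w), 3))    # 1.0 -- perfect correlation, automatically
print(round(C("A", "B", w), 3))    # 0.0
# A Bell-type constraint that holds for EVERY choice of weights:
assert abs(C("A", "B", w) - C("A", "C", w)) <= 1 - C("B", "C", w) + 1e-9
```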

Is the point to get down to defining the “9 possible categories” with only 8 available probability classes?

We have 9 possible values of the correlation function expectation value:
C(A,A), C(B,B), C(C,C), C(A,B), ... and we already fixed 3 of them: C(A,A) = C(B,B) = C(C,C) = 1. So only 6 remain, and we can calculate them as a function of p1,....p8.

You suggested that Lambda include a LA for Alice and a LB for Bob with both LA and LB being sent in both directions making both available to Alice and Bob. If we allow each of these to be independent variables accessible to the A, B, & C functions defined for both sides wouldn’t that mean more than 8 probability classes would be required here?

No, because the C(A,A)=C(B,B)=C(C,C)=1 already fix (see above) that P(A,Lambda) = Q(A,Lambda) etc..., which can, moreover, only be equal to 1 or 0. That leaves you with just 8 possibilities.

vanesch
Staff Emeritus
Gold Member
But are we to now suddenly believe we have all the assumptions just like we did a minute ago before we recognized the "one last cracker" assumption? Do we just wait for additional situations to show us more assumptions that we forgot?

But this is like checking a mathematical proof (whether by negation or just "straightforward"): you try to see whether each step is "logically complete" and doesn't have hidden assumptions. Of course, sometimes people get tricked into overlooking (collectively) a hidden assumption, but that's because all formal thinking is in the end a human activity prone to error! Usually, however, after sufficient study, one or other person finds out that something is overlooked.

It's exactly the same here: it is a formal reasoning on a few pages. You can check it yourself as much as you want. You go mentally through each step, and you wonder at each step if something is fishy.

Of course, if it would be a 5000 page reasoning, an error hidden somewhere is always possible. But not on a 2-page reasoning. It's written there explicitly. You can check it yourself as many times as you want.

I agree with you that an *experimental* situation is much more difficult to check completely. But a formal reasoning on 2 pages contains normally "everything there is to it".

Or do we make a general statement about proof by negation - It is not a deductive argument as it is subject to the limits of INDUCTION and that this exposure is not limited to the premises like in a real deductive argument?

I don't agree with that: a proof by negation is just as much a formal proof as any other. You make a list of starting assumptions, and you build a formal argument on that to arrive at a conclusion. It is only the NATURE of the conclusion that is different, namely in the form of a contradiction, instead of a statement. But the process is exactly the same as when, say, you prove Pythagoras' theorem from Euclid's axioms.

What we have here is that we make starting assumptions:
- statistical regularity and no superdeterminism
- locality
- uniqueness of outcomes

and from that we derive some inequalities on C(X,Y).

We then calculate C(X,Y) in quantum mechanics.

We find that C(X,Y) from quantum mechanics violates the previous inequalities.

But we could have stopped earlier, and just write the theorem:

"from statistical regularity, no superdeterminism, uniqueness of outcomes an locality, we can derive the following conditions on C(X,Y)".

That's "direct" proof.
The negation only comes from the fact that the quantum values don't satisfy these conditions.

It then follows logically that the quantum correlations cannot be the result of anything that satisfies statistical regularity, no superdeterminism, uniqueness of outcomes and locality.

That's Bell's theorem.

Also consider the assumptions that fall under the first 2. True, if a space alien put images in our brain of eating the cracker, we never really ate it. But what if we defined assumption 1) eating the cracker as our memory instead of ACTUALLY eating the cracker?

Then you will not be able to provide a proof that from "having the memory of having eaten the last cracker" and "the last cracker is still there" you find a contradiction!

You need to provide a *formal proof* where each step is *justified*.

vanesch
Staff Emeritus
Gold Member
Interesting discussion.

Bell said that the crucial assumption was locality. This was incorporated into the formulation by putting the probability of coincidental detection into factorable form.

But is Bell's locality condition really a locality condition?

Suppose that the locality assumption is just a statistical independence assumption.

Bell said that because Bell had a program. Bell was a proponent of Bohmian mechanics, and what one had against Bohmian mechanics is that in its mechanism, it is not local (it cannot be made Lorentz-invariant for instance). Following Einstein, people still thought that it was maybe possible to find a hidden variable theory that WAS local.

But Bell proved (in fact to his own surprise) that if ever there is to be a hidden-variable theory giving the same predictions as quantum theory, it is going to be non-local, just as Bohmian mechanics is. So one couldn't then blame Bohmian mechanics for being non-local, as EVERY hidden-variable theory that is compatible with quantum mechanics must be so.

However, a closer analysis of Bell's reasoning showed that there were indeed extra assumptions, such as statistical independence (no superdeterminism) and uniqueness of outcome. This weakened Bell's original argument in favor of Bohmian mechanics a bit, as a failure to comply with one of these other assumptions is also sufficient, and it is not necessarily *locality* which has to be given up. Nevertheless, giving up these other assumptions (meaning: accepting superdeterminism, or accepting "multiple outcomes", or accepting "no statistical regularity") is also something difficult to swallow. Although not impossible. For instance, the Many Worlds Interpretation escapes Bell's conclusion simply because there is no unique outcome at each side. As such, locality can be conserved in this view.

I don't agree with that: a proof by negation is just as much a formal proof as any other. You make a list of starting assumptions, and you build a formal argument on that to arrive at a conclusion. It is only the NATURE of the conclusion that is different, namely in the form of a contradiction, instead of a statement. But the process is exactly the same as when, say, you prove Pythagoras' theorem from Euclid's axioms.

What we have here is that we make starting assumptions:
- statistical regularity and no superdeterminism
- locality
- uniqueness of outcomes

and from that we derive some inequalities on C(X,Y).

We then calculate C(X,Y) in quantum mechanics.

We find that C(X,Y) from quantum mechanics violates the previous inequalities.

But we could have stopped earlier, and just write the theorem:

"from statistical regularity, no superdeterminism, uniqueness of outcomes an locality, we can derive the following conditions on C(X,Y)".

That's "direct" proof.
The negation only comes from the fact that the quantum values don't satisfy these conditions.

It then follows logically that the quantum correlations cannot be the result of anything that satisfies statistical regularity, no superdeterminism, uniqueness of outcomes and locality.

That's Bell's theorem.

Then you will not be able to provide a proof that from "having the memory of having eaten the last cracker" and "the last cracker is still there" you find a contradiction!

You need to provide a *formal proof* where each step is *justified*.

That doesn't really address what I have just shown, which is that the limits of induction limit your ability to recognize all of your assumptions for precisely what they are.

In a *formal proof*, all uncertainty is relegated to the premises. It could be wrong, but only if the premises are wrong. You always know where to look for a problem. In a proof by negation, anything about the argument could be wrong. It is just like when Johnny Cochran says "if the glove don't fit, you must acquit". Well, it rhymes... so it must make sense, right?

I think you are trying to say that it is possible for someone to get an argument wrong like

1) 1+1=2
2) 1+1+1+1=4
3) 2+2=2

Sure, it is possible to mess up a deductive reasoning step in a real *formal* proof. However, it is always blatantly obvious, and can always be checked by others who can take a single step of deductive reasoning at a time.

Whereas messing up an inductive step is, by the nature of induction, always possible, and the mistake can be repeated by everyone else who looks.

I think the part about it that confuses so many people is that it is a formal proof in a closed environment (not reality) where you define everything there is to know about what is going on. But we are not God and therefore can never define the whole situation to be what we want.

The last example was incredibly simple by design and yet I still showed how an average person could have left out assumptions.

Imagine one day you put 2 and 2 together and just got 2. What the hell happened?

Maybe you are at a park and just pulled 4 apples out of the bag, 2 at a time, and placed them on the table, yet only 2 are there. What the hell happened? You might reason that

2+2=4, but by proof by negation, the definition of 2 or the definition of plus must be wrong.

But in reality, in this case, it isn't that the definition of 2 or the definition of + is wrong. It is that the apples weren't really added even though we thought they were (2 of them rolled off the table or were stolen).

You could present this argument to be checked by a million people and no one could identify the error. At best they could try to repeat what happened and try to observe everything. If the same apple rolling off the table trick happened again, they might see it and claim maybe that's what happened to you.

But now imagine this: there is a can on the table which cannot be moved and cannot be looked into (but the contents can be removed). In reality, there is a hole at the bottom of the can and in the table, through which the apples randomly roll into the padded hollow inside the table.

People put 2 groups of 2 apples in there, and pull out random amounts of apples, from 0 to 4. Then they try to frame the same proof by negation as above.

Well, this time infinite people could replicate the experiment AND produce the same results, and still not be able to identify the problem. (The situation is a metaphor for a newer branch of physics, and thus people cannot just simply declare by fiat that there is a hole in the can because it has been shown billions of times that something like that is the only explanation.)

The issue is that our 2 + 2 = 4 does not represent what is going on inside that can. BUT we have no way of knowing that. And what happens in that can is not defined by our ignorance. So should we begin questioning things like the definition of plus or 2?

Proof by negation is NOT a formal proof because uncertainty is not relegated to the premises - it is all over the place.

C is not a probability, but the value of the correlation function, which is given by the expression you also use. ........ You use it correctly, btw. UNDER THE ASSUMPTION - which is not necessarily true! - that Alice's and Bob's results are statistically INDEPENDENT. So the result you have is only true for a single Lambda!
Since the “correlation function” comes directly from adding and subtracting probabilities, I thought it important to point out that just because the values were sometimes negative, it did not detract from their validity, that’s all. I wasn’t questioning their mathematical usefulness.

As to my results being “true for a single Lambda”? My results require averaging observations from thousands of pairs of samples, each sample taken from a single unique Lambda shared with both Alice's and Bob's machines. So there would be many different ‘Tables of Lambda Information’, one for each individual test. I will assume you mean one unique Lambda being shared with each device for each test. It is important to also note that Lambda itself remains “hidden” from Alice and Bob, as only the observed results from applying different device functions offer clues as to the true makeup of what Lambda might be.

As to how you piece together “probability classes” and “expectation value” into individual “D” values for each individual Lambda sent to the two machines, I think I understand in a black-box kind of way. It may be, as you say, “elementary”, but it is not simple enough for me to defend or to justify the assumptions made to allow for easier math. But I understand what you’ve said well enough to accept your conclusions as correct for a single Lambda per test.

However, on the issue of each Lambda consisting of two parts, LA and LB, creating enough variability in the analysis to finally render the negative proof against Einstein-local possibilities inconclusive, I disagree with you.
No, because the C(A,A)=C(B,B)=C(C,C)=1 already fix (see above) that P(A,Lambda) = Q(A,Lambda) etc..., which can, moreover, only be equal to 1 or 0. That leaves you with just 8 possibilities.

Lambda here is considered determinate, but with no superdeterminism: the Lambda values established at the source remain the same and unchanged until detected and used by the distant machines, and the values established at the source are complete, random, and indeterminate with respect to any past, future, or distant condition.

I also disagree with the implication by some here that negative proofs can never be considered complete. They can be, and have been, proven positively true by positively eliminating all possible contradictions to the negative proof. That logic can and has done that does not mean it is an easy thing to do. The question remains: has that been done in this case? As I’ve discussed elsewhere in these forums, I think not.

In this example I would apply that idea as follows:
Consider Lambda as consisting of at least two parts, LA and LB, just as determinate with no SuperDeterminism as Lambda is, but also indeterminately random with respect to each other. This would mean your calculation for D as D(X,Y,Lambda) would be incomplete and would need to be shown as:

D(X,Y, LA,LB)

That allows too many possibilities for probability class solutions for the negative proof to remain valid. Note: this does not refute the Bell Theorem or the example you have explained here; it only questions how conclusions drawn from them can be considered complete.

IMO, in order to retain Bell's negative-proof conclusions against Einstein, a logical proof and reason must be provided to reject the idea of Lambda consisting of two or more variables, such as LA and LB, that are indeterminately random with respect to each other.
I don’t believe the proof against Einstein’s Local Hidden Variables can be considered complete without that. I see plenty of effort given toward reconfirming the single-variable Lambda explanation. But I’ve seen no effort, experimentally or theoretically, to exclude the possibility of Lambda consisting of two or more independent hidden variables.

RandallB

vanesch
Staff Emeritus
Gold Member
As to my results “true for a single Lambda”? My results require averaging observations from thousands of pairs of samples.

What you can't assume, as you did, is that if P(A) is the probability for Alice to see red when she pushes A (and hence 1-P(A) the probability to see green), and if Q(B) is the probability for Bob to see red if he pushes B, that you can conclude from this that
P(A) x Q(B) is the probability to obtain the result (red,red) when Alice pushes A and Bob pushes B. This is ONLY the case if both events are statistically independent, but they aren't necessarily, as they can have partly a common origin (Lambda!).

Imagine this:
probability (red,red) = 0
probability (red,green) = 0.4
probability (green,red) = 0.6
probability (green,green) = 0

Now, the probability for alice to see red is 0.4, and the probability for bob to see red is 0.6, but the probability to have (red,red) is not 0.4 x 0.6 = 0.24, but rather 0.

However, ONCE we have taken into account the "common origin" (that is, a given lambda), THEN we can assume statistical independence (it was an assumption!).
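The failure of the naive product rule can be checked directly. Below is a minimal sketch of the example's arithmetic; the dictionary layout and variable names are my own:

```python
# Joint outcome distribution for (Alice, Bob) from the example above.
joint = {
    ("red", "red"): 0.0,
    ("red", "green"): 0.4,
    ("green", "red"): 0.6,
    ("green", "green"): 0.0,
}

# Marginals: probability that each side sees red, summed over the other side.
p_alice_red = sum(p for (a, b), p in joint.items() if a == "red")
p_bob_red = sum(p for (a, b), p in joint.items() if b == "red")

# The naive product rule assumes statistical independence; here it fails.
naive = p_alice_red * p_bob_red   # 0.4 * 0.6 = 0.24
actual = joint[("red", "red")]    # 0.0
print(p_alice_red, p_bob_red, naive, actual)
```

The marginals reproduce 0.4 and 0.6, yet the joint probability of (red,red) is zero, not 0.24: the outcomes share a common origin.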

Each sample taken from a single unique Lambda shared with both Alice's and Bob's machines. So there would be many different ‘Tables of Lambda Information’, one for each individual test. I will assume you mean one unique Lambda being shared with each device for each test.

Yes, what you call the "table of lambda information", I just called Lambda, and you could imagine it as a text file sent out by the central box to Alice's and Bob's box. As you say, at each trial, another text file is sent out. But the important part is that Alice and Bob never get to see that text file, it can only influence the behaviour of their box.

It is also important to note that Lambda itself remains “hidden” from Alice and Bob, as only the observed results from applying different device functions offer clues as to the true makeup of what Lambda might be.

yes.

As to how you piece together “probability classes” and “expectation value” into individual “D” values for each individual Lambda sent to the two machines, I think I understand in a black-box kind of way. It may be, as you say, “elementary”, but not simple enough for me to defend it or justify the assumptions made to allow for easier math.

The point is that there can be gazillions of different text files that can be sent out by the central box, but that they can (in the case of C(A,A) = C(B,B) = C(C,C) = 1) only have 8 different kinds of effects on the boxes! In other words, we've shown that there are only 8 different TYPES of textfiles, types (sets of text files) which we classified L1,L2...L8:
the first which does the following (L1):
if Alice presses A, then red goes on,
if Alice presses B, then red goes on,
if Alice presses C, then red goes on ;
if bob presses A, then red goes on,
if bob presses B then red goes on,
if bob presses C, then red goes on

The second type of text file (L2) does the following:
if Alice presses A, then green goes on,
if Alice presses B, then red goes on,
if Alice presses C, then red goes on ;
if bob presses A, then green goes on,
if bob presses B then red goes on,
if bob presses C, then red goes on

etc...

This comes about because we can only have P(X,Lambda) = 1 or 0, and this comes about because otherwise it is not possible to have C(A,A) = 1.

And now, the point is that (again, no superdeterminism) the statistical distribution of these text files (the random process in the central box that draws the text files) is independent of the choice of Alice and Bob.

That means that each of the gazillion possible text files has a probability to be sent out, and we can of course sum all the probabilities within one set L1, to have the probability that a text file from set L1 will be drawn. We called that probability p1, and it is independent of X and Y (Alice's and Bob's choices). So we have a probability p1 that one of the possible text files in L1 will be drawn, and this means we have a probability p1 that:
if Alice presses A, then red goes on,
if Alice presses B, then red goes on,
if Alice presses C, then red goes on ;
if bob presses A, then red goes on,
if bob presses B then red goes on,
if bob presses C, then red goes on

Same for p2 which will be the probability that the central box sends out a text file in the L2 set etc...

It's all we need to calculate the expectation value of the correlation function, because we only need the outcomes (red,red) ... and their probabilities. So even if myriads of different text files are sent out, if they have the same effect (the same outcomes (red,red)...), we don't need more information to calculate the correlation function.
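This bookkeeping (gazillions of text files, but only 8 effect classes, with correlations computed from the class probabilities p1...p8 alone) can be sketched in a few lines. The encoding of a class as a colour triple is my own choice, and the probability values are made up, chosen only to be exactly representable and to sum to 1:

```python
from itertools import product

# The 8 "types of text file": each class fixes a red/green answer for each of
# the three buttons, identically on Alice's and Bob's side (as forced by
# C(A,A) = C(B,B) = C(C,C) = 1).
classes = list(product(["red", "green"], repeat=3))   # indexed by button A, B, C

def correlation(x, y, probs):
    """C(X,Y) = sum over classes of p_i * (+1 if same colours, -1 if different)."""
    idx = {"A": 0, "B": 1, "C": 2}
    return sum(p * (1 if c[idx[x]] == c[idx[y]] else -1)
               for c, p in zip(classes, probs))

# An arbitrary illustrative distribution p1..p8 over the 8 classes.
probs = [0.25, 0.125, 0.0, 0.125, 0.125, 0.0, 0.125, 0.25]

print(correlation("A", "A", probs))  # 1.0: same-button correlation is automatic
print(correlation("A", "B", probs))  # 0.5 for this particular distribution
```

Whatever distribution is chosen, the same-button correlations come out as 1 by construction, and the crossed correlations depend only on the eight class probabilities.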

However, on the issue of whether each Lambda consisting of two parts LA and LB creates enough variability in the analysis to finally render the negative proof against Einstein Local possibilities inconclusive, I disagree with you.

I'm NOT saying that locality is the culprit!

Lambda here is considered determinate but with no SuperDeterminism, meaning the Lambda values established at the source remain the same and unchanged until detected and used by the distant machines, and that the values established at the source are complete, random, and indeterminate with respect to any past, future, or distant condition.

The fact that they are "complete" and that there is "determinism" (that is, that we can derive from Lambda, with certainty, whether the red or green light goes on when Alice pushes A), is not a hypothesis, but FOLLOWS from the C(A,A) =1 assumption. In other words, if there is any local randomness left, under the assumption of statistical independence (no superdeterminism), it is mathematically impossible to obtain C(A,A) = 1. So determinism FOLLOWS (is one of the consequences), and wasn't put in a priori. We FIND that the probabilities can only be 1 or 0. We didn't ASSUME it.

I this example I would apply that idea as follows;
Consider Lambda as consisting of at least two parts, LA and LB, just as determinate with no SuperDeterminism as Lambda is, but also indeterminately random with respect to each other. This would mean your calculation for D as D(X,Y,Lambda) would be incomplete and would need to be shown as:

D(X,Y, LA,LB)

Eh, no: call Lambda = {L_A, L_B}. I don't see why you see sending ALSO the L_B that will influence the Bob box to Alice, but without her box using it, as a RESTRICTION ?

If Alice's box only depends on L_A, but it receives L_A AND L_B, then this is no limitation of generality, no ?

That allows too many possibilities for probability class solutions for the negative proof to remain valid. Note: this does not refute the Bell Theorem or the example you have explained here; it only questions how conclusions drawn from them can be considered complete.

No, really, there is no difference. Lambda is a big information record (I took the example of a text file). But at each of the boxes, only a limited number of possibilities remain or we wouldn't have C(A,A) = 1.

If you want to split Lambda in LA and LB, be my guest, but it simply complicates notation. We now have to work with the probability distribution of the couples {LA,LB}. You will find that there are only 8 possibilities for these couples, corresponding to the 8 different classes L1, L2, ... L8. Because if this is not the case, you cannot obtain C(A,A) = C(B,B) = C(C,C) = 1.
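That counting can be sketched directly (the encoding of each half's "effect" as a colour triple is mine): even written as a couple {LA, LB}, the perfect same-button correlations prune the couples back down to 8.

```python
from itertools import product

# Each half of Lambda = {LA, LB}, after the C(A,A)=1 argument, reduces to a
# map from the three buttons to red/green: 8 possible effects per side.
effects = list(product(["red", "green"], repeat=3))

# Of the 8 x 8 = 64 conceivable couples (LA, LB), keep only those compatible
# with perfect same-button agreement, i.e. both halves act identically.
compatible = [(la, lb) for la in effects for lb in effects if la == lb]

print(len(effects) ** 2, "->", len(compatible))  # 64 couples collapse to 8
```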

IMO, in order to retain Bell's negative-proof conclusions against Einstein, a logical proof and reason must be provided to reject the idea of Lambda consisting of two or more variables, such as LA and LB, that are indeterminately random with respect to each other.
I don’t believe the proof against Einstein’s Local Hidden Variables can be considered complete without that. I see plenty of effort given toward reconfirming the single-variable Lambda explanation. But I’ve seen no effort, experimentally or theoretically, to exclude the possibility of Lambda consisting of two or more independent hidden variables.

But they consist of thousands of independent variables if you want ! Each character of the text file can be considered an independent variable. However, LA and LB must be strictly correlated, or it will be impossible to obtain C(A,A) = 1.

You see, THIS is the binding condition: we assume that C(A,A) = C(B,B) = C(C,C) =1.
In other words, each time that alice pushes A and bob also pushes A, they ALWAYS find the same outcome. We only consider machines that have this property. And we are analysing what are the eventual conditions on the "crossed" correlations C(A,B) etc... in this case, if we make the hypotheses of no superdeterminism, locality, statistical regularity, single outcomes...
This will mean that something is correlated between Alice and Bob, and the whole idea is that the only correlation we can have must come from the COMMON signal that they received from the central box - so in as much as you want to look at "different" signals from this common box to Alice and Bob, they must have a form of correlation.

So in as much as there are "independent" parts to the signal to Alice and to Bob, this would in fact be irrelevant because the independent parts could even be generated locally in Alice's and Bob's boxes. What counts is the common part. But nothing stops you from considering that in the Lambda message, only one part is to be addressed at Alice's, and another part at Bob's box, and that Alice's box discards the part of Lambda for Bob, and vice versa.

I really don't see your objection here as to how this limits the generality of the argument.

Last edited:
2) superdeterminism could be right. But that's a genuine pain in the a*** for about any scientific theory, because we can't use statistics anymore then to prove or disprove any potential causal link (all of medical testing falls apart, and you can't argue anymore against astrology).

Let's say that we have a superdeterministic universe that relates the emission of an entangled pair with the existence of two suitable oriented absorbers (say atoms or molecules) so that the statistical independence assumption fails. Now, please explain how your above statement follows from this hypothesis.

vanesch
Staff Emeritus
Gold Member
Let's say that we have a superdeterministic universe that relates the emission of an entangled pair with the existence of two suitable oriented absorbers (say atoms or molecules) so that the statistical independence assumption fails. Now, please explain how your above statement follows from this hypothesis.

If we cannot assume statistical independence of "a priori causally unrelated happenings", then we cannot prove ANY causal link, because the way to prove a causal link is to "change arbitrarily" the supposed "cause" and find that the "effect" is statistically correlated with this choice.

If we have a medicine to test, then we take 2000 patients with the target illness, and "flip coins" or something of the kind for each of them to decide whether or not we give them the medicine or a placebo. If, after a while, we notice a statistical difference between the group that took the medicine, and the other group, then we assume that there was a *causal* link between taking the medicine and "getting better".

But in superdeterminism, there can be a correlation between the flip of the coin and whatever OTHER cause may improve the health situation of the patient (like, I don't know, a genetic peculiarity or whatever). Then this correlation would simply be the one between the flipping of the coin and getting better, and taking the medicine or not wouldn't affect this. In other words, if there is superdeterminism, no procedure ever will allow you to do fair sampling, which is at the basis of most cause-effect deductions.
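How a correlated "coin flip" wrecks fair sampling can be illustrated with a toy simulation. Everything here (the hidden factor, the recovery rates, the sample size) is invented purely for illustration; the drug is deliberately made inert, yet the superdeterministic correlation manufactures an apparent causal effect:

```python
import random

random.seed(0)

# Toy model only: the hidden factor is invented, the drug is deliberately
# inert, and recovery depends solely on the hidden factor.
def trial(superdeterministic, n=20000):
    recovered = {"drug": 0, "placebo": 0}
    count = {"drug": 0, "placebo": 0}
    for _ in range(n):
        hidden = random.random() < 0.5            # e.g. a genetic peculiarity
        if superdeterministic:
            # the "coin flip" secretly tracks the hidden cause
            group = "drug" if hidden else "placebo"
        else:
            group = random.choice(["drug", "placebo"])  # fair sampling
        count[group] += 1
        recovered[group] += random.random() < (0.8 if hidden else 0.2)
    return recovered["drug"] / count["drug"], recovered["placebo"] / count["placebo"]

print(trial(False))  # similar recovery rates: no apparent effect (correct)
print(trial(True))   # a large spurious "drug effect" appears
```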

I just got to say Bell believed in a single objective reality we all perceive the same...
Doubting 3 is doubting all science and becoming a solipsist... All of reality tells us different.

This gives you:
C(A,B) = -1 + 4p1 + 4p2
C(B,C) = - 4p3 - 4p7 + 1
C(A,C) = -1 + 4p1 + 4p3

This is hence the kind of relation that must hold for the correlations of our kind of device, no matter what happens inside, as long as the assumptions we set out hold, and as long as we have strict correlation C(A,A) = C(B,B) = C(C,C) = 1.

Now, for a simple quantum system with entangled spin-1/2 systems, one can easily calculate that:
Cqm(th1,th2) = 2 sin^2{(th1-th2)/2} - 1

So if we use such pairs of entangled particles in our machine (and invert the red and green light on one side, so that the correlations change sign), then we can find the quantum predictions for analyser settings:
A - 0 degrees
B - 45 degrees
C - 90 degrees

where now our C is minus Cqm (so that we still have C(A,A) = 1 etc...)

-Cqm(A,B) = -Cqm(B,C) = 1/sqrt(2)
C(A,C) = 0

These 3 equations fix 3 of the 4 degrees of freedom we had: (p1,p2,p3 and p7).
p2 = 1/8 (2 + sqrt(2) - 8p1)
p3 = 1/4 (1 - 4 p1)
p7 = 1/8 (8 p1 - sqrt(2))

from this follows then:
p4 = p1 - 1/(4 sqrt(2))
p5 = 1/8 (2 + sqrt(2) - 8 p1)
p6 = p1 - 1/(2 sqrt(2))
p8 = 1/4(1+sqrt(2) - 4 p1)

now, all these px are numbers between 0 and 1, and the point is that you can't find such a solution:

from p3 follows: p1 < 1/4

from p6 follows: p1 > 1/(2 sqrt(2))

But 1/(2 sqrt(2)) > 1/4, hence you can't find a p1 that makes that all our px are between 0 and 1.

As such, the quantum prediction fails to satisfy the kind of correlations we derived using our black box machine.
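The clash can also be checked by brute force. This sketch simply plugs the expressions for p2...p8 derived above into a scan over p1 (the scan granularity is an arbitrary choice of mine):

```python
import math

s = math.sqrt(2)

# The expressions for p2..p8 in terms of p1, as derived above from the
# quantum correlation values at 0, 45 and 90 degrees.
def probabilities(p1):
    p2 = (2 + s - 8 * p1) / 8
    p3 = (1 - 4 * p1) / 4
    p4 = p1 - 1 / (4 * s)
    p5 = (2 + s - 8 * p1) / 8
    p6 = p1 - 1 / (2 * s)
    p7 = (8 * p1 - s) / 8
    p8 = (1 + s - 4 * p1) / 4
    return [p1, p2, p3, p4, p5, p6, p7, p8]

# Scan p1 over [0, 1] looking for a value that keeps every px in [0, 1].
feasible = [i / 10000 for i in range(10001)
            if all(0 <= p <= 1 for p in probabilities(i / 10000))]
print(feasible)  # []: no p1 works, exactly as the clash between
                 # p1 < 1/4 and p1 > 1/(2*sqrt(2)) shows
```

As a sanity check, the eight expressions sum to 1 for any p1, so normalisation is automatic; positivity is what fails.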

So what is the conclusion here: that a hidden variable theory (under the three reasonable assumptions) cannot give quantum mechanical correlations? Or are the assumptions not so reasonable?

What you can't assume, as you did, is that if P(A) is the probability for Alice to see red when she pushes A (and hence 1-P(A) the probability to see green), and if Q(B) is the probability for Bob to see red if he pushes B, that you can conclude from this that
P(A) x Q(B) is the probability to obtain the result (red,red) when Alice pushes A and Bob pushes B. This is ONLY the case if both events are statistically independent, but they aren't necessarily, as they can have partly a common origin (Lambda!).

Imagine this:
probability (red,red) = 0
probability (red,green) = 0.4
probability (green,red) = 0.6
probability (green,green) = 0

Now, the probability for alice to see red is 0.4, and the probability for bob to see red is 0.6, but the probability to have (red,red) is not 0.4 x 0.6 = 0.24, but rather 0.

This makes no sense.
What you describe is comparable to my probability example #5 (B,B) Alice selecting function B and Bob selecting his function B, but we can use yours just as well.
I see no problem in these functions causing Alice to see Red 40% of the time otherwise Green and Bob to see Red 60% of the time otherwise Green. And I see no problem at all in having the Alice machine on A function decide that if Lambda would cause the Bob machine to produce Green if Bob uses function B that it should always produce Red. That would be the definition of the Alice ‘A’ function. Which of course results in all the remaining possible Green results for Alice to come when Bob only has Red results available. The only way you can multiple P(A) x Q(B) together is when you randomly combing the results observed by Alice & Bob during function (A,B) observations without maintaining the time ordered pairings to ensure the paired observations are correlated for using a common Lambda.
I see no reason or logic for calculating 0.4 x 0.6 = 0.24 it serves no proposes nor justification for where it comes from.
However, ONCE we have taken into account the "common origin" (that is, a given lambda), THEN we can assume statistical independence (it was an assumption!).
Either you mistyped something or I clearly don’t understand your assumption here! If we ensure we are comparing results produced from a common individual given Lambda for each set of observations, we allow the opportunity for statistical interdependence. By what logic can you presume “statistical independence”?
The point is that there can be gazillions of different text files that can be sent out by the central box, ……. there are only 8 different TYPES of textfiles, types (sets of text files) which we classified L1,L2...L8:

I'm NOT saying that locality is the culprit!
But you are concluding that locality, or some other assumption that comes from a realistic “common sense” reality, must be false.

That includes the realistic assumption that there is no superdeterminism. I don’t know why people keep trying to refute your example using “superdeterminism”, BM, MWI, etc., since doing so only agrees with the claim being made: that not all of the Local Realism assumptions as you listed them can be true. They are just redundant minor examples of a Non-LR solution that only supports the claim you are making. A pointless effort IMO.

My point is you have not provided a satisfactory justification for assuming there are only “8 different TYPES of textfile sets” available in a complete analysis of your example.
Eh, no: call Lambda = {L_A, L_B}. I don't see why you see sending ALSO the L_B that will influence the Bob box to Alice, but without her box using it, as a RESTRICTION ?

If Alice's box only depends on L_A, but it receives L_A AND L_B, then this is no limitation of generality, no ?
Where do you get this from? I didn’t say this; reread post #9. I said “we allow each of these to be independent variables accessible to the A, B, & C functions defined for both sides”; clearly both Alice and Bob have LA and LB available to their machines even if “hidden” from their direct observation.

The point is you are making an unsupported assumption in your analysis to conclude that only “8 different TYPES of textfile sets” are available!

You reduce your probabilities down to:
P(Alice, Lambda)
P(Bob, Lambda)
And for correlated results: P(Alice,Bob,Lambda)
But these cannot exist without the assumption you made above
“call Lambda = {L_A, L_B}”

And you have provided no justification for such an assumption.

EXAMPLE:
Assume your Lambda text files include the details of an individual triangle shared with both devices, including at least the “Area” plus the lengths of “Side1” and “Side2” for that triangle. If only the area is used from the table to define the Red vs. Green outcome (after all, Side1 and Side2 help make up that defined area), I must agree your conclusions are acceptable.

However, if each of the functions selected by Alice or Bob uses the values for Side1 and Side2 instead of just the Area, it creates an entirely different situation. Many triangles can have a common Area with a variety of different S1 & S2 values. And there is no reason to expect that any S1 value would demand any particular S2 value. That is, two independent probabilities cannot be properly defined by a single probability set when the different functions selected by Alice or Bob mean the device can consider the interrelation between S1 and S2 differently depending on the A, B, or C choice made.

As long as this possibility is available the equation:

D(X,Y,Lambda)
Cannot be justified and must be replaced by;
D(X,Y, LA, LB)

Where LA and LB are variables independent of each other, each defined by one of the S1 or S2 values as modified by the function selection. Matters could even be made worse if the function selection applied to the S values is modified by yet another independent variable embedded in Lambda, such as the angle between the two sides (that would only be taking advantage of a more complete description of the triangle information embedded in the Lambda table of information).
No, really, there is no difference.
Lambda is a big information record (I took the example of a text file).

So in as much as there are "independent" parts to the signal to Alice and to Bob, this would in fact be irrelevant because the independent parts could even be generated locally in Alice's and Bob's boxes.

I really don't see your objection here as to how this limits the generality of the argument.
Yes, really; it makes a huge difference.
And you can prove me wrong by showing how “independent parts could even be generated locally” in a simple example.
Your use of a text file is fine; just randomly generate five S1 lengths, five S2 lengths, and five angles to be used between those two lengths. From this you can establish 5 unique triangles; just remember, no superdeterminism is allowed in setting the S1 value relative to the S2 value.

From this you can generate five text files describing each triangle, including “Area”, “Side1”, “Side2”, and “Angle”, plus, if you like, any factored multiplication of any two or more of these four variables to create a fifth piece of information in the text file.

Remember, your claim is that “independent parts could even be generated locally” by the Alice or Bob device from a single Lambda.
Just provide any function you can that uses a single Lambda, defined as any one of these five variables in the table, to reconstruct in correct detail all independent pieces of information contained in the text file. IMO no single independent value can completely describe the triangle so as to produce all the hidden separate values that could be made available to the device from the Lambda text file.

Let us know if you can come up with a workable function:

If you cannot, then you cannot justify eliminating the extra comma in:
D(X,Y, LA, LB)

And that extra comma means more than 8 options are possible!

And IMO the “generality of the argument” fails until removing that extra comma can be justified, leaving us with no complete proof against the possibility of all three Local Realism assumptions being potentially true.

Last edited:
If we cannot assume statistical independence of "a priori causally unrelated happenings", then we cannot prove ANY causal link, because the way to prove a causal link is to "change arbitrarily" the supposed "cause" and find that the "effect" is statistically correlated with this choice.

Well, what I say is that the emission of an entangled pair IS "causally related" with the existence of two suitable absorbers. That doesn't mean that one cannot use the "independence assumption" for other situations like a test of a new medicine.

If we have a medicine to test, then we take 2000 patients with the target illness, and "flip coins" or something of the kind for each of them to decide whether or not we give them the medicine or a placebo. If, after a while, we notice a statistical difference between the group that took the medicine, and the other group, then we assume that there was a *causal* link between taking the medicine and "getting better".

But in superdeterminism, there can be a correlation between the flip of the coin and whatever OTHER cause may improve the health situation of the patient (like, I don't know, a genetic peculiarity or whatever). Then this correlation would simply be the one between the flipping of the coin and getting better, and taking the medicine or not wouldn't affect this. In other words, if there is superdeterminism, no procedure ever will allow you to do fair sampling, which is at the basis of most cause-effect deductions.

I didn't say anything about coin flips, patients, doctors and the like. It is far from obvious that these correlations logically follow from the assumption I made (the emission of an entangled pair IS "causally related" with the existence of two suitable absorbers). On the contrary, I would expect that the "superdeterministic effects" are of little importance at macroscopic level, being hidden in the statistical noise just like the non-local effects of Bohm's interpretation or the uncertainties of CI. But, if you can prove the contrary, I am all ears.

My point is you have not provided a satisfactory justification for assuming there are only “8 different TYPES of textfile sets” available in a complete analysis of your example. Where do you get this from?

$$2^3=8$$ (case C(A,A)=C(B,B)=C(C,C)=1)--see vanesch's earlier post.

For this case
P(A,Lambda) = Q(A,Lambda) = 1 or 0
P(B,Lambda) = Q(B,Lambda) = 1 or 0
P(C,Lambda) = Q(C,Lambda) = 1 or 0
...so eight possibilities.

Last edited:
vanesch
Staff Emeritus
Gold Member
So what is the conclusion here: that a hidden variable theory (under the three reasonable assumptions) cannot give quantum mechanical correlations? Or are the assumptions not so reasonable?

That a theory (hidden variable or not) that satisfies the 3 assumptions cannot give the quantum-mechanical correlations. That's Bell's theorem. Bell limited himself to the single assumption of "locality" because he took the others for granted, and he thought he had proved that locality was not compatible with quantum theory, but a closer look at his proof showed that he needed statistical independence (no superdeterminism) and the others.

vanesch
Staff Emeritus
Gold Member
I see no problem in these functions causing Alice to see Red 40% of the time (otherwise Green) and Bob to see Red 60% of the time (otherwise Green). And I see no problem at all in having the Alice machine on the A function decide that if Lambda would cause the Bob machine to produce Green when Bob uses function B, it should always produce Red.

Ah, now, let us think:
Alice's box receives a certain lambda and bob receives that same lambda. In that lambda is written that Alice's box will draw with a probability of 40% red, and with a probability of 60%, green. That's done by "an innocent hand in Alice's box", right ?
Now, if Bob's box receives the same lambda, HOW CAN THIS BOX KNOW WHAT HAS BEEN DRAWN AT ALICE ?
This can only happen if Lambda contained that information! But if Lambda contained already the information of whether it was going to be red or green at Alice, then the probabilities (for the given lambda) are not 40% red and 60% green, but rather red FOR SURE or green FOR SURE. THEN Bob knows (through lambda) what has been drawn at Alice's as there wasn't a real draw, but the result was fixed. In other words, GIVEN that lambda, the remaining probabilities are 100% or 0%.

Now, OVERALL (for all lambdas), we can still get of course 40% red at alice, because 40% OF THE TIME, THE LAMBDA TOLD HER TO GET RED. This means that the 40% doesn't come from a single lambda, but from the DISTRIBUTION of lambdas. That's what we do here.

So 40% of the time, a lambda is emitted to bob and alice that tells alice to find red, and bob to find green. And 60% of the time, a lambda is emitted that tells Alice to find green, and bob to find red.

But for a SINGLE lambda, the probabilities are not 40% and 60%, but rather 100% or 0%.
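That distinction (deterministic per lambda, probabilistic only over the distribution of lambdas) can be simulated in a few lines; the 40/60 split is the number from the example, the sample size is arbitrary:

```python
import random

random.seed(1)

# Each lambda fixes both outcomes completely; the 40%/60% appears only in
# the distribution over lambdas, never within a single one.
def draw_lambda():
    # 40% of lambdas say "Alice red, Bob green"; 60% say the reverse.
    return ("red", "green") if random.random() < 0.4 else ("green", "red")

n = 100000
results = [draw_lambda() for _ in range(n)]
alice_red = sum(a == "red" for a, b in results) / n
red_red = sum(a == "red" and b == "red" for a, b in results) / n
print(alice_red)  # close to 0.4
print(red_red)    # exactly 0.0, despite 0.4 * 0.6 = 0.24
```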

That would be the definition of the Alice ‘A’ function. Which of course results in all the remaining possible Green results for Alice coming when Bob only has Red results available. The only way you can multiply P(A) x Q(B) together is when you randomly combine the results observed by Alice & Bob during function (A,B) observations without maintaining the time-ordered pairings that ensure the paired observations are correlated through a common Lambda.
I see no reason or logic for calculating 0.4 x 0.6 = 0.24; it serves no purpose, and no justification is given for where it comes from.

I simply said that because it is what YOU did in your first example !

Either you mistyped something or I clearly don’t understand your assumption here! If we ensure we are comparing results produced from a common individual given Lambda for each set of observations, we allow the opportunity for statistical interdependence. By what logic can you presume “statistical independence”?

If, after having taken into account all "common causes" (lambda) there is still some randomness in the outcome at Alice, this must be because the common cause only specifies a probability of local drawing. We assume that the "innocent hand" that draws from this at Alice is independent from the innocent hand that draws from this at Bob.
If it was a "common" random drawing, we can always include it in lambda, and it wouldn't appear anymore in the remaining probabilities.

That includes the realistic assumption that there is no superdeterminism. I don’t know why people keep trying to refute your example using “superdeterminism”, BM, MWI, etc., since doing so only agrees with the claim being made: that not all of the Local Realism assumptions as you listed them can be true. They are just redundant minor examples of a Non-LR solution that only supports the claim you are making. A pointless effort IMO.

I don't understand a word of this...

My point is you have not provided a satisfactory justification for assuming there are only “8 different TYPES of textfile sets” available in a complete analysis of your example. Where do you get this from? I didn’t say this; reread post #9. I said “we allow each of these to be independent variables accessible to the A, B, & C functions defined for both sides”; clearly both Alice and Bob have LA and LB available to their machines even if “hidden” from their direct observation.

The point is you are making an unsupported assumption in your analysis to conclude that only “8 different TYPES of textfile sets” are available!

As others pointed out, because there are only 8 different types of results possible!

You reduce your probabilities down to:
P(Alice, Lambda)
P(Bob, Lambda)
And for correlated results: P(Alice,Bob,Lambda)
But these cannot exist without the assumption you made above
“call Lambda = {L_A, L_B}”
And you have provided no justification for such an assumption.

That's not an assumption, it is a definition. This is like in the proof: x + 2 > x, we say:
"let y = x + 2", now blahblah
and you object to the proof because I made the *assumption* that y = x + 2

EXAMPLE:
Assume your Lambda text files include the details of an individual triangle shared with both devices, including at least the “Area” plus the lengths of “Side1” and “Side2” for that triangle. If only the area is used from the table to define the Red vs. Green outcome (after all, Side1 and Side2 help make up that defined area), I must agree your conclusions are acceptable.

However, if each of the functions selected by Alice or Bob uses the values for Side1 and Side2 instead of just the Area, it creates an entirely different situation. Many triangles can have a common Area with a variety of different S1 & S2 values. And there is no reason to expect that any S1 value would demand any particular S2 value. That is, two independent probabilities cannot be properly defined by a single probability set when the different functions selected by Alice or Bob mean the device can consider the interrelation between S1 and S2 differently depending on the A, B, or C choice made.

?? You can define a probability over about ANY thinkable (measurable) set. You can define a probability distribution over the set of all triangles for instance. You don't have to see Lambda as a real number, you know. The function P(A,Lambda) can be a function of all "real numbers" that are included in Lambda, like your sides and areas and everything.

But the point is, because of C(A,A) = C(B,B) = C(C,C) = 1, all these sides, areas etc... can only have 8 different possible effects.

As long as this possibility is available the equation:

D(X,Y,Lambda)
Cannot be justified and must be replaced by;
D(X,Y, LA, LB)

Where LA and LB are variables independent of each other, each defined by one of the S1 or S2 values as modified by the function selection.

But if I say: CALL lambda = {L_A,L_B} then that's the same, no ? That means that if I write D(X,Y,Lambda), I mean of course D(X,Y,L_A,L_B).

But this doesn't change the conclusions. You can include as many symbols as you want. You can write D(X,Y,side1,side2,side3,side4,area1,area2,....)

The point is that all these "random variables" together can only have 8 different effects. And it is sufficient to know the probabilities of the classes of combinations of side1,side2,...area2 that result in each of them.
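As a sketch of this last point: take triangle-style records like RandallB's, feed them through an arbitrary decision rule, and count the distinct effects that come out. The particular rule below is entirely made up; only the counting matters, and no enrichment of the record can push the count past 8.

```python
import random
from itertools import product

random.seed(2)

# However rich the lambda record (sides, angle, area, ...), a deterministic
# box compresses it into a definite red/green answer per button, hence into
# one of at most 2^3 = 8 effect classes.
effects = set(product(["red", "green"], repeat=3))

def box(s1, s2, angle):
    """A made-up decision rule over the triangle data; any rule would do."""
    return (
        "red" if s1 > s2 else "green",        # button A
        "red" if angle > 1.0 else "green",    # button B
        "red" if s1 * s2 > 4.0 else "green",  # button C
    )

# 1000 random triangle records (two sides and the angle between them).
records = [(random.uniform(0, 5), random.uniform(0, 5), random.uniform(0, 3))
           for _ in range(1000)]
classes = {box(*lam) for lam in records}
print(len(classes))  # never more than 8, no matter how the record is enriched
```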

Last edited: