RandallB said:
It may be elementary to some, but the point sure escapes me. It is easy enough to set up a few real numbers for the first half of your post. Maybe something like:
First, the (Alice, Bob) results without considering what the other is doing, either in how the other calibrates their personal box or in the A, B or C choice the other makes during the testing. While receiving the Lambda, each fine-tunes the three choices on their box independently until they achieve the following consistent results.
(P Alice, Bob): a probability of 1.0 of course means the result is achieved 100% of the time.
Alice & three choices:
(A, -): P(Red) = 1.0, P(Green) = 0.0
(B, -): P(Red) = 0.2, P(Green) = 0.8
(C, -): P(Red) = 0.5, P(Green) = 0.5
Bob & three choices:
(-, A): P(Red) = 1.0, P(Green) = 0.0
(-, B): P(Red) = 0.8, P(Green) = 0.2
(-, C): P(Red) = 0.5, P(Green) = 0.5
Once calibrated, the settings remain the same throughout the test; the timed Lambda signals received and the random local three-button choices are the only inputs.
After collecting sufficient data to produce a fair sampling, all cataloged in order of the Lambda input received from the common distant source, the nine possible setting combinations produce the following results when correlated, along with a calculated number "C" ranging from 1.0 to -1.0. (The negative values are for calculation purposes only; they are not probabilities.)
C is not a probability, but the value of the correlation function, which is given by the expression you also use. It is the expectation value of the random variable which equals +1 for (green, green) and (red, red), and -1 for (red, green) and (green, red), as these are the 4 possible outcomes once we are within a category of choice such as (A,C).
You use it correctly, btw. UNDER THE ASSUMPTION - which is not necessarily true! - that Alice's and Bob's results are statistically INDEPENDENT. So the result you have is only true for a single Lambda!
#   (X,Y)   P(Red,Red)   P(Red,Green)   P(Green,Red)   P(Green,Green)     C
1   (A,A)      1.0           0.0            0.0             0.0          1.0
2   (A,B)      0.8           0.2            0.0             0.0          0.6
3   (A,C)      0.0           0.5            0.5             0.0         -1.0
4   (B,A)      0.2           0.0            0.8             0.0         -0.6
5   (B,B)      0.0           0.2            0.8             0.0         -1.0
6   (B,C)      0.0           0.2            0.5             0.3         -0.4
7   (C,A)      0.5           0.0            0.5             0.0          0.0
8   (C,B)      0.3           0.2            0.5             0.0         -0.4
9   (C,C)      0.0           0.5            0.5             0.0         -1.0
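Just to make that concrete, here is a quick sanity check of the C column in Python (a sketch; the joint probabilities are simply the ones quoted in the table above):

```python
# C is the expectation value of a random variable equal to +1 for the
# outcomes (Red,Red) and (Green,Green), and -1 for the two mixed outcomes.
# The joint probabilities below are the ones quoted in the table above.
table = {
    ("A", "A"): (1.0, 0.0, 0.0, 0.0),
    ("A", "B"): (0.8, 0.2, 0.0, 0.0),
    ("A", "C"): (0.0, 0.5, 0.5, 0.0),
    ("B", "A"): (0.2, 0.0, 0.8, 0.0),
    ("B", "B"): (0.0, 0.2, 0.8, 0.0),
    ("B", "C"): (0.0, 0.2, 0.5, 0.3),
    ("C", "A"): (0.5, 0.0, 0.5, 0.0),
    ("C", "B"): (0.3, 0.2, 0.5, 0.0),
    ("C", "C"): (0.0, 0.5, 0.5, 0.0),
}
for (x, y), (p_rr, p_rg, p_gr, p_gg) in table.items():
    c = p_rr + p_gg - p_rg - p_gr
    print(f"({x},{y})  C = {c:+.1f}")
```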
But I cannot get past the following:
I assume we are attempting to define the range of, and the values for, Lambda, along with detail which can be applied to six different functions (3 for Alice & 3 for Bob), to achieve the resulting probability distribution shown in "the 4 mutually exclusive events". Your wording above seems to imply that the results of "the 4 mutually exclusive events" are somehow weighted into or with Lambda, which would open a window to allow superdeterminism, which is not allowed.
For now I’ll assume this is just unclear wording I cannot sort out.
The 4 mutually exclusive events are (red,red), (green,red), (red,green) and (green,green). You cannot have more than one of them occur in a single trial (unless we take an MWI approach, hence the explicit condition of a unique outcome at each side!).
GIVEN a Lambda (whatever it is; Alice and Bob will not be able to see Lambda, it is a hidden variable), and GIVEN a choice A, B or C at Alice, for instance, this will give you a probability that Alice sees red, or green. Note that Lambda can be anything: a text file, an electromagnetic signal, whatever. But it is something which has been sent out by the central box, and the SAME Lambda is sent to Alice and to Bob. You could object here: why isn't a Lambda1 sent to Alice, and a different Lambda2 sent to Bob? That's no problem: in that case, call Lambda the union of Lambda1 and Lambda2 (of which Alice's box is free to use only the Lambda1 part). So if Lambda1 is a 5K text file, and Lambda2 is a 5K text file, call Lambda the 10K text file which is the concatenation of Lambda1 and Lambda2.
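In the text-file picture, that union is literally just concatenation; for instance (a trivial sketch, the file contents being placeholders):

```python
# Both "halves" are sent to both sides; Alice's box is free to read only
# the Lambda1 part of the common Lambda.
lambda1 = b"...5K of text meant for Alice's box..."
lambda2 = b"...5K of text meant for Bob's box..."
lam = lambda1 + lambda2   # the single, common Lambda of the argument
```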
Alice's box receives 2 inputs: Lambda from the central box (of which it is free to use just a part, like Lambda1), and Alice's choice (A, B or C). It can also have local random processes which, based upon the value of Lambda and the value of the choice, will DRAW an outcome (red or green). We assume that the probability for red is determined by Lambda and Alice's choice: P(A,Lambda).
Note that we also assume that this probability is not a function of the previous history of Alice's choices and outcomes. This is part of the assumption of "statistical regularity": each event is supposed to be statistically independent of the previous events.
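A minimal sketch of such a box in Python. The particular probability function p_red below (a hash of the text file plus the choice) is purely an invented placeholder; the argument only requires that SOME function P(choice, Lambda) exists:

```python
import hashlib
import random

def p_red(choice: str, lam: bytes) -> float:
    # Invented placeholder for P(choice, Lambda): any deterministic map
    # from (choice, Lambda) to a number in [0, 1] would do.
    digest = hashlib.sha256(lam + choice.encode()).digest()
    return digest[0] / 255.0

def alice_box(choice: str, lam: bytes) -> str:
    # A local random process draws the outcome, red with probability
    # P(choice, Lambda), independently of any previous trials.
    return "red" if random.random() < p_red(choice, lam) else "green"

lam = b"...contents of the 10K text file..."
print(alice_box("A", lam))
```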
My main problem is understanding “we calculate the expectation value of the correlation for a given Lambda, which is nothing else but its value (+1 or -1)”
Exactly what is the “it” in “nothing else but its value”?
The correlation is a random variable (that is, it is a function over the space of all possible outcomes, in this case there are 4 of them: (red,red), (red,green), (green,red) and (green,green) ). It takes on the value +1 for (red,red) and (green,green), and it takes on the value -1 for the outcomes (green,red) and (red,green).
The expectation value of a random variable is its value for a given outcome, weighted by the probability of that outcome, and summed over all outcomes.
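As a worked instance, using row 2 of your own table: < C > = (+1)(0.8) + (-1)(0.2) + (-1)(0.0) + (+1)(0.0) = 0.6.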
If we are setting a limit on the range and values of Lambda as only being +1 or -1, that hardly seems fair.
No, we are talking about the values of the random variable "correlation", not about Lambda. Lambda can be a long text file! The "value" of Lambda (a possible text file, say) is just an argument in the probability function. So for each different Lambda (there can be many of them, as many as there are different 5K text files), you have different values of the probabilities P(A,Lambda), and hence of the expectation value of the correlation function C, which we write < C >.
For each different Lambda, we have a different value of < C >. I called this D.
D is a function of Alice's choice X (one of A,B,C), and Bob's choice Y (one of A,B,C), and Lambda.
Given an X, given a Y and given a Lambda, we have the probabilities P(X,Lambda) for Alice to see a red light, and Q(Y,Lambda) for Bob to see a red light.
We ASSUME (statistical independence: no superdeterminism) that whatever random process is going to determine the "drawing" of red/green at Alice (with probability P(X,Lambda)) is going to be statistically independent of a similar drawing at Bob (with probability Q(Y,Lambda)), and hence, the probability for having, say, (red,red), is given by P(X,Lambda) Q(Y,Lambda) exactly as you did this in your own calculation - with the exception that we now do the calculation FOR A GIVEN LAMBDA. As such, we can (as you did), calculate D:
D(X,Y,Lambda).
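In code, the independence assumption is just the product rule. A sketch, with p standing for P(X,Lambda) and q for Q(Y,Lambda) at one fixed Lambda; the assert confirms the "simplifies quickly" algebra that comes up below:

```python
def D(p: float, q: float) -> float:
    # Expectation value of the correlation for ONE given Lambda, assuming
    # the two local drawings are statistically independent.
    p_rr = p * q               # (red, red)
    p_rg = p * (1 - q)         # (red, green)
    p_gr = (1 - p) * q         # (green, red)
    p_gg = (1 - p) * (1 - q)   # (green, green)
    return p_rr + p_gg - p_rg - p_gr

# The sum of products collapses to (1 - 2p)(1 - 2q):
assert abs(D(0.3, 0.7) - (1 - 2 * 0.3) * (1 - 2 * 0.7)) < 1e-12
```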
It is not clear to me what is being produced here so that it "simplifies quickly".
Just some algebra!
Apparently into functions that only have "1" or "0" as results.
Nor is it clear what kind of limits or restrictions are placed on the values of Lambda or on the calculated "expectation value".
Now, the idea is that Lambda (the text file) is unknown to Bob and Alice (only to their box). So Bob and Alice cannot "sort" their outcomes according to Lambda, they only see an AVERAGE over ALL Lambda values. So our D(X,Y,Lambda) must still be averaged over all possible Lambda values, which can be very many. We assume that Lambda has a statistical distribution over all of its possible values (the set of 5K text files, say). If we make that average, we will find the correlation that Alice and Bob CAN measure.
So we consider that there is a certain probability function over the set (huge) of all possible Lambda values (all possible text files), and we are going to calculate the expectation value of D over this set:
C(X,Y) = D(X,Y,Lambda_1) P(Lambda_1) + D(X,Y,Lambda_2) P(Lambda_2) + ...
       + D(X,Y,Lambda_205234) P(Lambda_205234) + ...
This is a priori a very long sum!
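Continuing the D and p_red sketches above (the two-file distribution here is just a toy stand-in for the huge set of possible Lambda values):

```python
def C(x: str, y: str, lambda_dist: dict) -> float:
    # The correlation Alice and Bob can actually measure: D(X,Y,Lambda)
    # averaged over the hidden distribution of Lambda values.
    return sum(prob * D(p_red(x, lam), p_red(y, lam))
               for lam, prob in lambda_dist.items())

dist = {b"text file 1": 0.5, b"text file 2": 0.5}   # toy distribution
print(C("A", "A", dist))
```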
However, we show that *in the case of perfect correlations*, where C(A,A) = 1, ALL the D(A,A,Lambda) values must be equal to 1!
Indeed, D is a number between -1 and 1, and P(Lambda) is a distribution with sum = 1.
The only way for such a weighted sum to be equal to 1 is for ALL D(A,A,Lambda) to equal 1. A single D(A,A,Lambda) less than 1, occurring with nonzero probability, and the sum cannot be 1, but must be less.
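For instance, if a slice of Lambda values carrying total probability 0.1 had D = 0.8, the weighted sum could be at most 0.9 x 1 + 0.1 x 0.8 = 0.98 < 1.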
So we know that ALL D(A,A,Lambda) = 1. But D(A,A,Lambda) = (1 - 2 P(A,Lambda)) (1 - 2 Q(A,Lambda)), and P and Q are probabilities (numbers between 0 and 1).
The only way to have (1 - 2x)(1 - 2y) = 1, with x between 0 and 1 and y between 0 and 1, is by having either x = y = 1 (then it is (-1)(-1)) or x = y = 0 (then it is (1)(1)).
All other values of x or y will give you a number less than 1 for (1 - 2x)(1 - 2y).
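A brute-force check of that claim, just scanning a grid of (x, y) values:

```python
# Confirm that (1 - 2x)(1 - 2y) = 1 on [0,1] x [0,1] only at the corners
# x = y = 0 and x = y = 1.
hits = [(x / 100, y / 100)
        for x in range(101) for y in range(101)
        if abs((1 - 2 * x / 100) * (1 - 2 * y / 100) - 1.0) < 1e-9]
print(hits)   # [(0.0, 0.0), (1.0, 1.0)]
```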
So this means that from our requirement that D(A,A,Lambda) = 1 for all Lambda, it follows that for each Lambda, P(A,Lambda) = Q(A,Lambda), and moreover that for each Lambda, we have either P(A,Lambda) = 1 or P(A,Lambda) = 0.
This means that we can split up the big set of all Lambda (the big set of text files) in two pieces: a piece of all those Lambda which give P(A,Lambda) = 1 and a complementary piece which gives P(A,Lambda) = 0.
Concerning P(A,Lambda), we hence don't need any exact value of Lambda, but only to know in which of the two halves Lambda resides.
C(B,B) = 1 does the same for P(B,Lambda), but of course, the slicing up of the big set of Lambdas will be different now. So we now need to know, for P(A,Lambda) and P(B,Lambda) together, in which of the 4 possible "slices" a given Lambda falls (2 slices for P(A) and 2 slices for P(B) give in total 4 different "pieces of the Lambda-cake"). We can sum all probabilities over these 4 slices: we only need to know the probability for Lambda to be in the first slice, the second, the third and the fourth, because within each of these slices, P(A,Lambda) and P(B,Lambda) are known (each is either 1 or 0).
Same for C(C,C), and hence, in the end, we need in total only 8 different "Lambda slices", with their summed probabilities p1, p2, ..., p8. In each slice, P(A,Lambda), P(B,Lambda) and P(C,Lambda) take on a well-defined value (either 1 or 0).
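Those 8 slices are nothing but the 8 possible triples of pinned-down answers; a sketch:

```python
from itertools import product

# Within a slice, P(A,Lambda), P(B,Lambda) and P(C,Lambda) are each pinned
# to 0 or 1, so a slice is just a triple of bits (P_A, P_B, P_C).
slices = list(product([0, 1], repeat=3))
for i, s in enumerate(slices, start=1):
    print(f"slice {i}: P(A)={s[0]}  P(B)={s[1]}  P(C)={s[2]}")
```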
Is the point to get down to defining the “9 possible categories” with only 8 available probability classes?
We have 9 possible values of the correlation function expectation value:
C(A,A), C(B,B), C(C,C), C(A,B), ... and we already fixed 3 of them: C(A,A) = C(B,B) = C(C,C) = 1. So only 6 remain, and we can calculate them as a function of p1,...p8.
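Continuing the slice sketch above, here is how the six remaining correlations come out as functions of p1,...,p8 (the slice probabilities used below are toy values, purely to show the mechanics):

```python
idx = {"A": 0, "B": 1, "C": 2}

def C_from_slices(x: str, y: str, p: list) -> float:
    # Within slice s, D(X,Y) = (1 - 2 s[idx[X]]) * (1 - 2 s[idx[Y]]) = +/-1,
    # so C(X,Y) is just the p-weighted sum over the 8 slices.
    return sum(pi * (1 - 2 * s[idx[x]]) * (1 - 2 * s[idx[y]])
               for pi, s in zip(p, slices))

p = [1 / 8] * 8   # toy values for p1, ..., p8 (must sum to 1)
pairs = [("A", "B"), ("A", "C"), ("B", "A"),
         ("B", "C"), ("C", "A"), ("C", "B")]
for x, y in pairs:
    print(f"C({x},{y}) = {C_from_slices(x, y, p):+.3f}")
print(C_from_slices("A", "A", p))   # the perfect correlations stay fixed at 1
```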
You suggested that Lambda include an LA for Alice and an LB for Bob, with both LA and LB being sent in both directions, making both available to Alice and Bob. If we allow each of these to be independent variables accessible to the A, B & C functions defined for both sides, wouldn't that mean more than 8 probability classes would be required here?
No, because C(A,A) = C(B,B) = C(C,C) = 1 already fixes (see above) that P(A,Lambda) = Q(A,Lambda), etc., which can, moreover, only be equal to 1 or 0. That leaves you with just 8 possibilities.