billschnieder said:
JesseM said:
Uh, why should they need to make sure of that? Saying the correlations "are not conditional on any hidden elements of reality" does not mean they are not causally influenced by hidden elements of reality, it just means the correlation that's calculated is not a conditional one that controls for those elements. For example, suppose I have a large population of people, each of whom is either a smoker or nonsmoker, each of whom either has lung cancer or doesn't, and each of whom either has yellow teeth or doesn't. I can certainly calculate the correlation between yellow teeth and lung cancer alone, i.e. find the fraction of people who satisfy (yellow teeth AND lung cancer) and compare it to the product of the fraction that satisfy (yellow teeth) and the fraction that satisfy (lung cancer), even if it happens to be true that the correlation can be explained causally by the fact that smoking increases the chances of both. That's all it means to say that the correlation I'm calculating is not "conditioned on" the smoking variable, that I'm just not bothering to include it in my calculations, not that it isn't causally influencing the correlation I do see between yellow teeth and lung cancer.
You do not understand the difference between a theoretical exercise and an actual experiment. Suppose I am trying to study the relative effectiveness of two possible treatments T = (A, B) against a kidney stone disease. Theoretically it is okay to say that you randomly select two groups of people from the population of people with the disease, give treatment A to one group and B to the second group, and then calculate the relative frequency of those who recovered in group 1 after taking treatment A. Theoretically speaking, you can then compare that value with the relative frequency of those who recovered in group 2 after taking treatment B. This is fine as a theoretical exercise.
Now fast-forward to an actual experiment in which the experimenters do not know about all the hidden factors. What does "select groups at random" mean in a real experiment? Say the experimenters select the two groups according to their best understanding of what may be random. And then after calculating their relative frequencies, they find that treatment B is effective in 289 of the 350 people (83%), but treatment A is only effective in 273 of the 350 people (78%). So they conclude that treatment B is more effective than treatment A. Is this a reasonable conclusion according to you?
It depends on how "random" the selections were. If they were just looking at people who were
already on one of the two treatments, it might be that there are other factors which influence the likelihood that a person will choose A vs. B (for example, socioeconomic status) and these factors might also influence the chances of recovery independent of the influences of the treatments themselves. On the other hand, if they picked a large population and then used a truly random method to
decide which members received treatment A vs. treatment B, and no one chose to drop out of the experiment, then this would be a reasonable controlled experiment in which the only reason that other factors (like socioeconomic status) might vary between group A and group B would be a random statistical fluctuation, so the larger the population the less likely there'd be significant variation in other factors between the two groups, and thus any differences in recovery would be likely due to the treatment itself.
billschnieder said:
Now suppose the omniscient being, knowing full well that the size of the kidney stones is a factor, looks at the data and finds that if he divides the groups according to the size of kidney stones the patients had, the groups break down as follows:
Group 1 (those who received treatment A): 87 small stones, 263 large stones
Group 2 (those who received treatment B): 270 small stones, 80 large stones
This might be possible if the groups were self-selecting, for example if people with low socioeconomic status were both more likely to have large kidney stones (because of diet, say) and more likely to choose treatment A (because it's cheaper), but if the subjects were assigned randomly to group A or B by some process like a random number generator on a computer, there should be no correlation between P(computer algorithm assigns subject to treatment A) and P(subject has large kidney stones), so any difference in frequency in kidney stones between the two groups would be a matter of random statistical fluctuation, and such differences would be less and less likely the larger a population size was used.
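The point that random assignment makes imbalances in a hidden factor shrink as the population grows can be sketched in a quick simulation. This is a toy model: the 40% large-stone rate and the fair-coin assignment are illustrative assumptions, not numbers from the discussion above.

```python
import random

random.seed(0)  # fixed seed so the run is reproducible

# Toy model: 40% of subjects carry the hidden factor (large stones),
# and a fair coin assigns each subject to treatment A or B
# independently of that factor.
for n in (100, 10_000, 1_000_000):
    large = {"A": 0, "B": 0}
    total = {"A": 0, "B": 0}
    for _ in range(n):
        has_large = random.random() < 0.4       # hidden factor
        group = random.choice("AB")             # random assignment
        large[group] += has_large
        total[group] += 1
    fa, fb = large["A"] / total["A"], large["B"] / total["B"]
    print(f"n={n:>9}: large-stone fraction  A={fa:.3f}  B={fb:.3f}  |diff|={abs(fa - fb):.3f}")
```

The gap between the two groups' large-stone fractions is purely statistical fluctuation, and it shrinks roughly like 1/sqrt(n).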
billschnieder said:
He now finds that 81 of the 87 (93%) in group 1 who had small stones were cured by treatment A, and 192 of the 263 (73%) of those with large stones in group 1 were cured by treatment A.
For group 2, he finds that 234 of the 270 (87%) with small stones were cured and 55 of the 80 (69%) with large stones were cured.
The relevance of your example to what we were debating is unclear. I was making the simple point that there can be situations where there is a marginal correlation between two random variables, but the correlation disappears when you condition on some other set of facts (I won't say 'condition on another random variable' because that would probably lead to more semantic quibbling on your part--I'm just talking about a situation where if you condition on each
specific value of some other random variable, in each specific case the correlation disappears, as I illustrated at the end of post #21). But your example isn't like this--in your example there seem to be two measured variables, T which can take two values {received treatment A, received treatment B} and another one, let's call it U, which can also take two values {recovered from disease, did not recover from disease}. Then there is also a hidden variable we can call V, which can take two values {large kidney stones, small kidney stones}. In your example there is a marginal correlation between variables T and U, but there is still a correlation (albeit a different correlation) when we condition on either of the two specific values of V. So, let me modify your example with some different numbers. Suppose 40% of the population have large kidney stones and 60% have small ones. Suppose those with large kidney stones have a 0.8 chance of being assigned to group A, and a 0.2 chance of being assigned to group B. Suppose those with small kidney stones have a 0.3 chance of being assigned to group A, and a 0.7 chance of being assigned to B. Then suppose that the chances of recovery depend only on whether one had large or small kidney stones and
is not affected either way by what treatment one received, so P(recovers|large kidney stones, treatment A) = P(recovers|large kidney stones), etc. Suppose the probability of recovery for those with large kidney stones is 0.5, and the probability of recovery for those with small ones is 0.9. Then it would be pretty easy to compute P(treatment A, recovers, large stones)=P(recovers|treatment A, large stones)*P(treatment A, large stones)=P(recovers|large stones)*P(treatment A, large stones)=P(recovers|large stones)*P(treatment A|large stones)*P(large stones) = 0.5*0.8*0.4=0.16. Similarly P(treatment A, doesn't recover, small stones) would be P(doesn't recover|small stones)*P(treatment A|small stones)*P(small stones)=0.1*0.3*0.6=0.018, and so forth.
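That arithmetic can be sketched in a few lines, using exactly the probabilities assumed above (P(large)=0.4, P(treatment A | large)=0.8, P(treatment A | small)=0.3, P(recovers | large)=0.5, P(recovers | small)=0.9, with recovery independent of treatment once stone size is fixed):

```python
# Assumed conditional probabilities from the example above.
P_stone = {"large": 0.4, "small": 0.6}
P_A = {"large": 0.8, "small": 0.3}      # P(treatment A | stone size)
P_rec = {"large": 0.5, "small": 0.9}    # P(recovers | stone size)

# Build the full joint distribution over (treatment, recovered, stone size).
joint = {}
for stone in P_stone:
    for treat in ("A", "B"):
        p_t = P_A[stone] if treat == "A" else 1 - P_A[stone]
        for rec in (True, False):
            p_r = P_rec[stone] if rec else 1 - P_rec[stone]
            # Recovery depends only on stone size, so the factors multiply:
            joint[(treat, rec, stone)] = p_r * p_t * P_stone[stone]

print(round(joint[("A", True, "large")], 6))   # 0.16, as computed above
print(round(joint[("A", False, "small")], 6))  # 0.018, as computed above
```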
In a population of 1000, we might then have the following numbers for each possible combination of values for T, U, V:
1. Number(treatment A, recovers, large stones): 160
2. Number(treatment A, recovers, small stones): 162
3. Number(treatment A, doesn't recover, large stones): 160
4. Number(treatment A, doesn't recover, small stones): 18
1. Number(treatment B, recovers, large stones): 40
2. Number(treatment B, recovers, small stones): 378
3. Number(treatment B, doesn't recover, large stones): 40
4. Number(treatment B, doesn't recover, small stones): 42
If we don't know whether each person has large or small kidney stones, this becomes:
1. Number(treatment A, recovers) = 160+162 = 322
2. Number(treatment A, doesn't recover) = 160+18 = 178
3. Number(treatment B, recovers) = 40+378 = 418
4. Number(treatment B, doesn't recover) = 40+42=82
So here, the data shows that of the 500 who received treatment A, 322 recovered while 178 did not, and of the 500 who received treatment B, 418 recovered and 82 did not. There is a marginal correlation between receiving treatment B and recovery: P(treatment B, recovers)=0.418, which is larger than P(treatment B)*P(recovers)=(0.5)*(0.74)=0.37. But if you look at the correlation between receiving treatment B and recovery
conditioned on large kidney stones, there is no conditional correlation: P(treatment B, recovers|large stones) = P(treatment B|large stones)*P(recovers|large stones) [on the left side, there are 400 people with large stones and only 40 of these who also received treatment B and recovered, so P(treatment B, recovers|large stones) = 40/400 = 0.1; on the right side, there are 400 with large stones but only 80 of these received treatment B, so P(treatment B|large stones)=80/400=0.2, and there are 400 with large stones and 200 of those recovered, so P(recovered|large stones)=200/400=0.5, so the product of these two probabilities on the right side is also 0.1] The same would be true if you conditioned treatment B + recovery on
small kidney stones, or if you conditioned any other combination of observable outcomes (like treatment A + no recovery) on either large or small kidney stones.
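The marginal-versus-conditional claim can be checked directly from the table of 1000 people above (a quick numerical verification, not part of the original argument):

```python
# Counts for (treatment, recovered, stone size) from the table above.
counts = {
    ("A", True,  "large"): 160, ("A", True,  "small"): 162,
    ("A", False, "large"): 160, ("A", False, "small"): 18,
    ("B", True,  "large"): 40,  ("B", True,  "small"): 378,
    ("B", False, "large"): 40,  ("B", False, "small"): 42,
}
N = sum(counts.values())  # 1000

def prob(pred, given=lambda t, r, s: True):
    """P(pred | given), computed from the counts."""
    base = sum(c for k, c in counts.items() if given(*k))
    return sum(c for k, c in counts.items() if given(*k) and pred(*k)) / base

# Marginal: P(treatment B, recovers) != P(treatment B) * P(recovers)
p_joint = prob(lambda t, r, s: t == "B" and r)
p_prod = prob(lambda t, r, s: t == "B") * prob(lambda t, r, s: r)
print(p_joint, p_prod)  # 0.418 vs 0.37 -> marginally correlated

# Conditioned on large stones, the correlation disappears:
large = lambda t, r, s: s == "large"
lhs = prob(lambda t, r, s: t == "B" and r, given=large)
rhs = prob(lambda t, r, s: t == "B", given=large) * prob(lambda t, r, s: r, given=large)
print(lhs, rhs)  # 0.1 and 0.1 -> conditionally independent
```

The same check passes if `large` is swapped for the small-stone condition, or if any other pair of observable outcomes is used.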
So do you agree that with my numbers, we find a marginal correlation between the observable variable T (telling us which treatment a person received, A or B) and U (telling us whether they recovered or not), but
no correlation between T and U when we condition on any specific value of the "hidden" variable V (telling us whether the person has large or small kidney stones)?
Please give me a yes or no answer to this question. If you agree that this sort of thing is possible, why do you think the same couldn't be true in a local hidden variables theory where the two observable variables represented measurements (each under some specific detector setting) at different locations, and each specific value of the variable λ represents a specific combination of values for various local hidden variables?
billschnieder said:
As you can hopefully see here, not knowing about all the hidden factors at play, the experimenters cannot possibly collect a fair sample, therefore their results are not comparable to the theoretical situation in which all possible causal factors are included.
When you say "fair sample", "fair" in what respect? If your 350+350=700 people were randomly sampled from the set of all people receiving treatment A and treatment B, then this is a fair sample where the marginal correlation between the treatment and recovery outcome variables (T and U above) in your group should accurately reflect the marginal correlation that would exist between these same variables if you looked at every single person in the world receiving treatment A and B. The problem of Simpson's paradox is that this marginal positive correlation between B and recovery does not tell you anything about a
causal relation between these variables ('correlation is not causation'), because the positive correlation might become a negative correlation (as in your example) or zero correlation (as in mine) when you condition on some
other fact like having large kidney stones.
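The reversal in billschnieder's own numbers is easy to check (a quick verification of the Simpson's paradox structure: treatment B looks better marginally, yet treatment A does better within each stone-size stratum):

```python
# (treatment, stone size) -> (number cured, group size), from the
# counts quoted earlier in the thread.
cured = {
    ("A", "small"): (81, 87),   ("A", "large"): (192, 263),
    ("B", "small"): (234, 270), ("B", "large"): (55, 80),
}

def rate(pairs):
    """Pooled cure rate over a list of (cured, total) pairs."""
    return sum(c for c, n in pairs) / sum(n for c, n in pairs)

overall = {t: rate([cured[(t, "small")], cured[(t, "large")]]) for t in ("A", "B")}
print(f"overall:  A {overall['A']:.1%}  B {overall['B']:.1%}")
for s in ("small", "large"):
    print(f"{s} stones:  A {rate([cured[('A', s)]]):.1%}  B {rate([cured[('B', s)]]):.1%}")
# overall: B wins (82.6% vs 78.0%), but A wins in each stratum
# (93.1% vs 86.7% for small stones, 73.0% vs 68.8% for large).
```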
If you think this somehow suggests a problem with Bell's reasoning,
you are really missing the point of his argument entirely! Bell does
not just assume that since there is a marginal correlation between the results of different measurements on a pair of particles, there must be a causal relation between the measurements; instead his whole argument is based on
explicitly considering the possibility that this correlation would disappear when conditioned on other hidden variables, exactly analogous to my example where there was no correlation between treatment group and recovery outcome when you conditioned on large kidney stones (or when you conditioned on small kidney stones). The whole point is that in a local realist universe, marginal correlations between
any spacelike-separated events
cannot represent causal influences, so the correlations
must disappear when you condition on the state of all local variables in the past light cones of the two spacelike-separated events. And that assumption, that under local realism marginal correlations between spacelike-separated events cannot be causal influences, so that local realism predicts these correlations must disappear when conditioned on the values of other variables in the past light cones, is exactly what is represented by equation (2) in his paper (or by equation 10 on
page 243 of Speakable and Unspeakable in Quantum Mechanics). So criticizing Bell by comparing his argument to that of an imaginary fool who thinks the marginal correlation between treatments and recovery outcomes
does indicate a causal influence between the two,
failing to consider that the correlation might reverse or disappear when you condition for some other hidden variable like large kidney stones, indicates a complete lack of understanding of Bell's argument!
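For readers who want the formula being referred to: the locality condition described here is usually written as a factorization of the joint outcome probabilities once the hidden variable λ is conditioned on (this is the standard form of the condition; the exact notation varies between Bell's papers):

```latex
% Conditioned on the hidden variable \lambda, the outcomes factorize:
P(A, B \mid a, b, \lambda) = P(A \mid a, \lambda)\, P(B \mid b, \lambda)
% and the observed (marginal) correlation is recovered by averaging over \lambda:
P(A, B \mid a, b) = \int \rho(\lambda)\, P(A \mid a, \lambda)\, P(B \mid b, \lambda)\, d\lambda
```

Here a, b are the detector settings, A, B the outcomes, and ρ(λ) the distribution over hidden states, exactly analogous to conditioning treatment and recovery on kidney-stone size above.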
billschnieder said:
So again, do you have a reference to any Aspect type experiment in which they ensured randomness with respect to all possible hidden elements of reality causing the results? By comparing observed correlations to Bell's inequalities, you are claiming that they are in fact comparable.
Bell's inequalities deal with marginal correlations, the ones that are seen when you
don't condition on hidden variables (though of course they are derived from the assumption that any such marginal correlations
must disappear when conditioned on the proper hidden variables). Experiments also deal with the same marginal correlations. So, your request makes absolutely no sense.
billschnieder said:
Huh? The breakdown of a conclusion can only be taken to imply the failure of one of the premises of that conclusion. The argument usually goes as follows:
(1) Bell's inequalities accurately model local realistic universes
(2) Our universe is locally realistic
(3) Therefore actual experiments in our universe must obey Bell's inequalities.
No, it doesn't; no mainstream physicist argues that way. Either you're engaging in pure fantasy, or you've completely misread whatever papers gave you this idea (if you think any actual mainstream papers make this sort of argument, why don't you link to them and I can point out your mistake). The actual argument is as follows:
(1) The theoretical
postulate of local realism implies Bell's inequalities should be satisfied
(2) In real experiments, Bell's inequalities are violated
(3) Therefore, the theoretical postulate of local realism has been falsified in our real universe
Obviously you are not convinced of point (1), but do you dispute that if (1)
is theoretically sound and (2) is true of actual experiments, then (3) must follow?