JesseM
Science Advisor
(continued from previous post)
billschnieder said:
In Bell's equation (2) does he not integrate over all possible hidden elements of reality? Do you expect that the LHS of his equation (2) in his original paper will have the same value if the integral was not over the full set of possible realizations of hidden elements of reality? I need a yes or no answer here.

Yes, of course.
billschnieder said:
For example say n=10 (10 different possible λs) and Bell's integral was from λ1 to λ10. Do you expect an integral that is calculated only from λ1 to λ9 to give you the same result as Bell's integral? Please answer with a simple yes or no.

No, a partial integral wouldn't give the same results.
billschnieder said:
So then if in an experiment, only λ1 to λ9 were ever realized, will the observed frequencies obey Bell's inequalities? Yes or No please.

A simple yes or no is not possible here; there is some probability the actual statistics on a finite number of trials would obey Bell's inequalities, and some probability they wouldn't, and the law of large numbers says the more trials you do, the less likely it is your statistics will differ significantly from the ideal statistics that would be seen given an infinite number of trials (so the less likely a violation of Bell's inequalities would become in a local realist universe).
I'm fairly certain that the rate at which the likelihood of significant statistical fluctuations drops should not depend on the number of possible values of λ in the integral. For example, suppose you are doing the experiment in two simulated universes, one where there are only 10 possible states for λ and one where there are 10,000 possible states for λ. If you want to figure out the number N of trials needed so that there's only a 5% chance your observed statistics will differ from the true probabilities by more than one sigma, it should not be true that N in the second simulated universe is 1000 times bigger than N in the first simulated universe! In fact, despite the thousandfold difference in possible values for λ, I'd expect N to be exactly the same in both cases. Would you disagree?
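If it helps, here's a minimal simulation sketch of that comparison (Python; the equally likely λ values, the assignment of "heads" to exactly half of them, the 500-trial runs, and the ±0.05 tolerance are arbitrary choices of mine, made only so that P(H) = 0.5 in both universes):

```python
import random

def run_experiment(num_lambda, n_trials, rng):
    """One simulated experiment in a universe with num_lambda equally likely
    hidden states; "heads" is assigned to exactly half of the lambda values,
    so P(H) = 0.5 regardless of num_lambda."""
    heads = 0
    for _ in range(n_trials):
        lam = rng.randrange(num_lambda)           # hidden variable, never observed
        heads += lam < num_lambda // 2            # observable binary outcome
    return heads / n_trials

def fluctuation_rate(num_lambda, n_trials, eps, n_experiments=2000, seed=1):
    """Fraction of experiments whose observed frequency of heads deviates
    from the true probability 0.5 by more than eps."""
    rng = random.Random(seed)
    return sum(abs(run_experiment(num_lambda, n_trials, rng) - 0.5) > eps
               for _ in range(n_experiments)) / n_experiments

for num_lambda in (10, 10_000):
    print(num_lambda, fluctuation_rate(num_lambda, n_trials=500, eps=0.05))
# Both universes show essentially the same fluctuation rate, because the
# observable statistics depend only on P(H) = 0.5, not on how many hidden
# states lie behind it.
```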
To see why, remember that the experimenters are not directly measuring the value of λ on each trial, but are instead just measuring the value of some other variable which can only take two possible values, and which value it takes depends on the value of λ. So, consider a fairly simple simulated analogue of this type of situation. Suppose I am running a computer program that simulates the tossing of a fair coin: each time I press the return key, the output is either "T" or "H", with a 50% chance of each. But suppose the programmer has perversely written an over-complicated program to do this.

First, the program randomly generates a number from 1 to 1,000,000 (with equal probabilities for each), and each possible value is associated with some specific value of an internal variable λ; for example, it might be that a number from 1-20 corresponds to λ=1, while a number from 21-250 corresponds to λ=2 (so λ can have different probabilities of taking different values), and so forth up to some maximum λ=n. Then each possible value of λ is linked in the program to some value of another variable F, which can take only two values, 0 and 1; for example λ=1 might be linked to F=1, λ=2 might be linked to F=1, λ=3 might be linked to F=0, λ=4 might be linked to F=1, etc. Finally, on any trial where F=0, the program returns the result "H", and on any trial where F=1, the program returns the result "T".

Suppose the probabilities of each λ, along with the value of F each one is linked to, are chosen such that the sum over i from 1 to n of P(λ=i)*(value of F associated with λ=i) is exactly 0.5. Then despite the fact that there may be a very large number of possible values of λ, each with its own probability, the end result is that the probability of seeing "H" on a given trial is 0.5, and the probability of seeing "T" on a given trial is also 0.5.
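A minimal sketch of such a program might look like this (Python; the four λ values and their equal probabilities are my own simplification, chosen only so that the total probability of F=1 works out to exactly 0.5):

```python
import random

# A toy version of the over-complicated coin program described above. The
# particular lambda ranges and F assignments are made up for illustration;
# the only real constraint is that the total probability of F=1 is exactly 0.5.
# Each entry: (range of raw random numbers, lambda value, F value).
LAMBDA_TABLE = [
    (range(1, 250_001),         1, 1),   # lambda=1 -> F=1, probability 0.25
    (range(250_001, 500_001),   2, 1),   # lambda=2 -> F=1, probability 0.25
    (range(500_001, 750_001),   3, 0),   # lambda=3 -> F=0, probability 0.25
    (range(750_001, 1_000_001), 4, 0),   # lambda=4 -> F=0, probability 0.25
]

def flip():
    raw = random.randint(1, 1_000_000)        # step 1: raw random number
    for raw_range, lam, f in LAMBDA_TABLE:    # step 2: map it to hidden lambda
        if raw in raw_range:
            return "T" if f == 1 else "H"     # step 3: F=1 -> "T", F=0 -> "H"

print("".join(flip() for _ in range(30)))     # e.g. 'HTTHHT...' with P(H) = 0.5
```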
Now suppose that my friend is also using a coin-flipping program, where the programmer picked a much simpler design in which the computer's random number generator picks a digit from 1 to 2, and if it's 1 it returns the output "H" and if it's 2 it returns the output "T". Despite the differences in the internal workings of our two programs, there should be no difference in the probability either of us will see some particular statistics on a small number of trials! For example, if either of us did a set of 30 trials, the probability that we'd get 20 or more heads would be determined by the binomial distribution, which in this case says there is only a 0.049 chance of getting 20 or more heads (see the calculator http://stattrek.com/Tables/Binomial.aspx). Do you agree that in this example, the more complex internal set of hidden variables in my program makes no difference in the statistics of observable results, given that both of us can see the same two possible results on each trial, with the same probability of H vs. T in both cases?
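The 0.049 figure can also be checked directly from the binomial formula, without the calculator; a quick sketch:

```python
from math import comb

# Exact binomial probability of getting at least 20 heads in 30 flips of a
# fair coin; this reproduces the ~0.049 figure quoted above, and applies
# equally to both programs since each gives P(H) = 0.5 per trial.
n, p = 30, 0.5
p_at_least_20 = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(20, n + 1))
print(round(p_at_least_20, 3))   # -> 0.049
```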
For a somewhat more formal argument, just look at http://www.dartmouth.edu/~chance/teaching_aids/books_articles/probability_book/Chapter8.pdf, particularly the equation that appears on p. 3 after the sentence that starts "By Chebyshev's inequality ..." If you examine the equation and the definition of the terms above it, you can see that if we look at the average value of some random variable X after n trials (the S_n/n part), the probability that it will differ from the expectation value μ by an amount greater than or equal to ε must be smaller than or equal to σ²/(nε²), i.e. P(|S_n/n − μ| ≥ ε) ≤ σ²/(nε²), where σ² is the variance of the original random variable X. And since both the expectation value of X and the variance of X depend only on the probabilities that X takes its different possible values (like the variable F in the coin example, which has a 0.5 chance of taking F=0 and a 0.5 chance of taking F=1), it shouldn't matter if the value of X on each trial is itself determined by the value of some other variable λ which can take a huge number of possible values.
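Plugging the coin variable F into that bound (a sketch; the values μ = 0.5 and σ² = 0.25 follow from the distribution of F alone):

```python
# Chebyshev's bound P(|S_n/n - mu| >= eps) <= sigma^2 / (n * eps^2), applied
# to the coin variable F. Both mu and sigma^2 come from P(F=0) = P(F=1) = 0.5
# alone, so the bound is identical no matter how many lambda values sit behind F.
def chebyshev_bound(variance, n_trials, eps):
    return variance / (n_trials * eps**2)

mu = 0.5
sigma_sq = 0.5 * (0 - mu)**2 + 0.5 * (1 - mu)**2   # variance of F = 0.25
for n in (1_000, 10_000, 100_000):
    print(n, chebyshev_bound(sigma_sq, n, eps=0.05))
# -> 0.1, 0.01, 0.001: the bound shrinks like 1/n and never references lambda.
```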
billschnieder said:
How can an Aspect-type experimenter be expected to ensure a fair sample, one that represents all possible λs, without knowing the details of what λ is in the first place?! Is this too difficult for you to understand?

No more need for him to "represent all possible λs" than there is in the coin-flipping example. Even if the program has 3000 possible values of λ (determined by the value of the random number from 1 to 1,000,000), as long as the total probability of getting result "H" is 0.5, the probability of various numbers of H's and T's on a small set of trials (say, 50) should be given by the binomial distribution, and the more trials I do, the smaller the probability of any significant departure from a 50/50 ratio of H:T. Agree or disagree? If you agree in the coin-flipping example, it shouldn't be "too difficult for you to understand" why, similarly, in a local hidden variables theory the probability that your observed statistics differ by a given amount from the ideal probabilities will go down with the number of trials, and the rate at which it goes down should be independent of the number of possible values of λ.