billschnieder said:
Who said anything about a physical process? I've given you an abstract mathematical list, and you can't bring yourself to admit that you were wrong, to the point that you are making yourself look foolish. P(++) for the list I gave you is 1/4; even a caveman can understand that level of probability theory, Jesse! Are you being serious, really?
Yes, Bill. Would you deny, for example, that a physical process that had P(++)=0.3, P(+-)=0.2, P(-+)=0.15, and P(--)=0.35 (with all of these numbers being the frequentist probabilities that would represent the fraction of trials with each value in the limit as the number of trials goes to infinity) could easily generate the following results on 4 trials?
++
--
-+
+-
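Just to quantify "easily": here is a minimal sketch (my own illustration, using the hypothetical probabilities above) of the chance that 4 trials contain each of the four outcomes exactly once, checked by simulation:

```python
import math
import random
from collections import Counter

# Hypothetical "true" (frequentist) probabilities from the example above.
probs = {'++': 0.30, '+-': 0.20, '-+': 0.15, '--': 0.35}

# Exact chance that 4 trials show each outcome exactly once, in any order:
# 4! orderings, each occurring with probability 0.30 * 0.20 * 0.15 * 0.35.
p_exact = math.factorial(4) * math.prod(probs.values())
print(f"exact probability     = {p_exact:.4f}")  # ~0.0756

# Monte Carlo check: simulate many 4-trial experiments.
outcomes, weights = zip(*probs.items())
hits = 0
n_experiments = 200_000
for _ in range(n_experiments):
    sample = random.choices(outcomes, weights=weights, k=4)
    if Counter(sample) == Counter(outcomes):  # each outcome exactly once
        hits += 1
print(f"simulated probability = {hits / n_experiments:.4f}")
```

That comes out to about 0.076, i.e. roughly one 4-trial experiment in thirteen, so there is nothing remotely surprising about such a run.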
billschnieder said:
Who said anything about the frequentist view?
I did. It's the only notion of "probability" that I've been using the whole time; perhaps if you go back and look at some of the posts of mine you thought didn't make sense and read them in this light, you will understand them better (also, note that I'm not talking about 'finite frequentism', but 'frequentism' understood in terms of the limit as the number of trials goes to infinity--see below for a link discussing the difference between the two). For example, if we are talking about the frequentist view of probability, the mere fact that you got ++ once on a set of four trials does not imply P(++)=0.25...do you disagree?
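As a quick sketch of this point (with a "true" probability of 0.30 that I made up purely for illustration), note how often a 4-trial run shows ++ exactly once even when the true probability is not 0.25:

```python
import random
from collections import Counter

p_true = 0.30  # made-up "true" (limiting-frequency) probability of ++
counts = Counter()
n_experiments = 100_000
for _ in range(n_experiments):
    k = sum(random.random() < p_true for _ in range(4))  # ++ count in 4 trials
    counts[k] += 1

for k in sorted(counts):
    print(f"{k}/4 of the trials were ++ in {counts[k] / n_experiments:.1%} of runs")
```

With these numbers, roughly 41% of all 4-trial runs show ++ exactly once even though P(++) is 0.30 rather than 0.25, which is exactly why the observed frequency on four trials implies nothing about the true probability.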
billschnieder said:
All I did was point out to you a basic mainstream fact in probability theory:
Wikipedia (http://en.wikipedia.org/wiki/Law_of_large_numbers):
In probability theory, the law of large numbers (LLN) is a theorem that describes the result of performing the same experiment a large number of times. According to the law, the average of the results obtained from a large number of trials should be close to the expected value, and will tend to become closer as more trials are performed.
So you are way off base and I am right to say that you do not understand probability theory.
Note that the Wikipedia article says "close to the expected value", not "exactly equal to the expected value". And note that this is only said to be true in a large number of trials; the article does not suggest that if you have only four trials, the average on those four trials should be anywhere near the expectation value. Finally, note that in the forms section of the article they actually distinguish between the "sample average" and the "expected value", and say that the "sample average" only "converges to the expected value" in the limit as n (the number of samples) approaches infinity. So it seems pretty clear the Wikipedia article is using the frequentist definition as well.
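The point is easy to see numerically. Here is a minimal sketch (a made-up two-valued measurement of my own) of what the law of large numbers does and does not promise: at n=4 the sample average is routinely far from the expected value, and it only settles down as n grows:

```python
import random

# Made-up two-valued measurement with results R in {+1, -1}.
values, weights = [+1, -1], [0.65, 0.35]
expected = (+1) * 0.65 + (-1) * 0.35  # probability-weighted sum: E = 0.30

for n in [4, 100, 10_000, 1_000_000]:
    avg = sum(random.choices(values, weights=weights, k=n)) / n
    print(f"n = {n:>9,}: sample average = {avg:+.4f} (expected value = {expected:+.2f})")
```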
JesseM said:
So you deny that the "expectation value" for a test which can yield any of N possible results R_1, R_2, ..., R_N would just be 1/N \sum_{i=1}^N R_i * P(R_i)? (where P(R_i) is the probability distribution function that gives the probability for each possible R_i)
billschnieder said:
Again you are way off base. In probability theory, when using the probability of an R as a weight in calculating the expectation value, you do not need to divide the sum by N again. That will earn you an F grade. The correct expression should be:
\sum_{i=1}^{N} R_i * P(R_i)
Yes, here you did catch me in an error; I wrote down the expression too fast without really thinking carefully, and I guess I got confused by all the other sums, which did include 1/N on the outside. Before you brandish this as proof that I "don't know probability", note that in previous posts I did write it down correctly, for example in post #1205:
JesseM said:
In general, if you have some finite number N of possible results Ri for a given measurement, and you know the probability P(Ri) for each result, the "expectation value" is just:
E = \sum_{i=1}^N R_i * P(R_i )
If you perform a large number of measurements of this type, the average result over all measurements should approach this expectation value.
And in post #1218:
JesseM said:
Physical assumptions are peripheral to calculating averages from experimental data, it's true, and they're also peripheral to writing down expectation values in terms of the "true" probabilities as I did when I wrote E(R) = \sum_{i=1}^N R_i * P(R_i),
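In code, that correct form is just a probability-weighted sum with no extra division by N; a minimal sketch (the example numbers are mine):

```python
# Sketch of E = \sum_{i=1}^N R_i * P(R_i) for a discrete set of results.
def expectation(results, probabilities):
    """Probability-weighted sum over the possible results (no extra 1/N)."""
    assert abs(sum(probabilities) - 1.0) < 1e-9, "probabilities must sum to 1"
    return sum(r * p for r, p in zip(results, probabilities))

# Made-up example: results +1/-1 with hypothetical probabilities.
print(expectation([+1, -1], [0.65, 0.35]))  # 0.65 - 0.35 = 0.30
```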
Anyway, now that we seem to be agreed that the correct form for the expectation value is E = \sum_{i=1}^N R_i * P(R_i) (though I am sure we would disagree on the meaning of P(R_i), since I define it in frequentist terms as the fraction of trials that would give result R_i in the limit as the number of trials goes to infinity), can you tell me if you think I was incorrect to write the expectation value as follows?
E(a,b) = (+1*+1)*P(detector with setting a gets result +1, detector with setting b gets result +1)
+ (+1*-1)*P(detector with setting a gets result +1, detector with setting b gets result -1)
+ (-1*+1)*P(detector with setting a gets result -1, detector with setting b gets result +1)
+ (-1*-1)*P(detector with setting a gets result -1, detector with setting b gets result -1)
This equation does have the form E = \sum_{i=1}^N R_i * P(R_i), does it not? If you don't object to the claim that the above is at least one way of defining E(a,b), then why in post #1221 did you object as follows?
billschnieder said:
So when you say:
JesseM said:
This expectation value is understood as a sum of the different possible measurement outcomes weighted by their "true" probabilities:
E(a,b) = (+1*+1)*P(detector with setting a gets result +1, detector with setting b gets result +1)
+ (+1*-1)*P(detector with setting a gets result +1, detector with setting b gets result -1)
+ (-1*+1)*P(detector with setting a gets result -1, detector with setting b gets result +1)
+ (-1*-1)*P(detector with setting a gets result -1, detector with setting b gets result -1)
...
The comment above is completely misguided, since the basic definition of "expectation value" in this experiment has nothing at all to do with knowing the value of λ; it is just understood to be:
E(a,b) = (+1*+1)*P(detector with setting a gets result +1, detector with setting b gets result +1)
+ (+1*-1)*P(detector with setting a gets result +1, detector with setting b gets result -1)
+ (-1*+1)*P(detector with setting a gets result -1, detector with setting b gets result +1)
+ (-1*-1)*P(detector with setting a gets result -1, detector with setting b gets result -1)
It clearly shows that you do not understand probability or statistics. Clearly the definition of expectation value is based on a probability-weighted sum, and the law of large numbers is used as an approximation; that is why it says in the last sentence above that the expectation value is "almost surely the limit of the sample mean as the sample size grows to infinity".
billschnieder said:
JesseM said:
billschnieder said:
You are given a theoretical list of N pairs of real-valued numbers x and y. Write down the mathematical expression for the expectation value for the paired product.
It's impossible to write down the correct objective/frequentist expectation value unless we know the sample space of possible results (all possible pairs, which might include possibilities that don't appear on the list of N pairs) along with the objective probabilities of each result (which may be different from the frequency with which the result appears on your list, although you can estimate the objective probability based on the empirical frequency if N is large...it's better if you have some theory that gives precise equations for the probability like QM though).
Wow! The correct answer is <xy>
Wikipedia:
http://en.wikipedia.org/wiki/Mean
In statistics, mean has two related meanings:
* the arithmetic mean (and is distinguished from the geometric mean or harmonic mean).
* the expected value of a random variable, which is also called the population mean.
There are other statistical measures that use samples that some people confuse with averages - including 'median' and 'mode'. Other simple statistical analyses use measures of spread, such as range, interquartile range, or standard deviation. For a real-valued random variable X, the mean is the expectation of X.
You really do not know anything about probability.
Here the Wikipedia article is failing to adequately distinguish between the "mean" of a finite series of trials (or any finite sample) and the "mean" of a probability distribution (edit: see for example this book, which distinguishes the 'sample mean' \bar X from the 'population mean' \mu and says the sample mean 'may, or may not, be an accurate estimation of the true population mean \mu. Estimates from small samples are especially likely to be inaccurate, simply by chance.' You might also look at this book, which says 'We use \mu, the symbol for the mean of a probability distribution, for the population mean', or this book, which says 'The mean of a discrete probability distribution is simply a weighted average (discussed in Chapter 4) calculated using the following formula: \mu = \sum_{i=1}^n x_i P[x_i]'). If you think the expectation value is exactly equal to the average of a finite series of trials, regardless of whether the number of trials is large or small, then you are disagreeing with the very Wikipedia quote you posted earlier from the Law of Large Numbers page:
In probability theory, the law of large numbers (LLN) is a theorem that describes the result of performing the same experiment a large number of times. According to the law, the average of the results obtained from a large number of trials should be close to the expected value, and will tend to become closer as more trials are performed.
According to you, would it be more correct to write "the average of the results obtained from any number of trials would be exactly equal to the expected value"? If you do, then your view is in conflict with the quote above. And if you don't think the average from a finite number of trials is exactly equal to the expectation value, then you were incorrect to write "Wow! The correct answer is <xy>" above.
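The sample mean vs. population mean distinction is also easy to demonstrate numerically; a sketch with a made-up discrete distribution:

```python
import random
import statistics

# Population mean of a made-up discrete distribution: mu = sum_i x_i * P[x_i]
xs, ps = [0, 1, 2], [0.5, 0.3, 0.2]
mu = sum(x * p for x, p in zip(xs, ps))  # 0.7

# Sample means scatter around mu; small samples scatter far more widely.
for n in [4, 10_000]:
    means = [statistics.mean(random.choices(xs, weights=ps, k=n)) for _ in range(5)]
    print(f"n = {n:>6}: five sample means = {[round(m, 3) for m in means]} (mu = {mu})")
```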
JesseM said:
No, it doesn't mean that, because the ρ(λi) that appears in Bell's equations (along with the P(λi) that appears in the discrete version) is pretty clearly supposed to be an objective probability function of the frequentist type.
billschnieder said:
Oh, so now you are abandoning your law of large numbers again because it suits your argument.
Um, how am I doing that? I said "objective probability function of the frequentist type" above (and again, you can assume that all my comments about probabilities assumed a frequentist definition; it might help you avoid leaping to silly false conclusions about what I'm arguing). Do you understand that this would be a function where the "probability" it assigns to any outcome is equal to the fraction of trials where that outcome would occur in the limit as the number of trials went to infinity? And if I'm defining probabilities in terms of the limit as the number of trials goes to infinity, I'm pretty clearly making use of the law of large numbers, no?
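Here is what that definition looks like operationally; a minimal sketch (the "true" probability of 0.15 is made up by me):

```python
import random

# Frequentist reading: P(outcome) is the limiting fraction of trials showing it.
p_true = 0.15  # made-up "true" probability
count = 0
checkpoints = {10, 100, 10_000, 1_000_000}
for n in range(1, 1_000_001):
    count += random.random() < p_true
    if n in checkpoints:
        print(f"after {n:>9,} trials: fraction = {count / n:.4f} (limit = {p_true})")
```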
billschnieder said:
Remember the underlined text because it will haunt you later when you try to argue that expectation values calculated from three different runs of an experiment can be used as terms for comparing with Bell's inequality.
You can't calculate "expectation values" from three runs with a finite series of trials, not in my way of thinking (I have never said otherwise; if you think I did, you misread me). You can only calculate the sample average from a finite run of trials. However, by the law of large numbers, the bigger your sample, the smaller the probability that your sample average will differ significantly from the "true" expectation value determined by the "true" probabilities (again with 'true' probabilities defined in frequentist terms).
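To put a number on "smaller probability", here is a sketch (the same made-up ±1 variable as in my earlier sketch, E = 0.30) estimating how often the sample average lands more than 0.05 away from the true expectation value for two sample sizes:

```python
import random

# LLN, quantitatively: the chance that a sample average lands far from the
# expectation value shrinks as the sample grows.
values, weights = [+1, -1], [0.65, 0.35]
E, eps, n_experiments = 0.30, 0.05, 2_000

for n in [100, 10_000]:
    far = 0
    for _ in range(n_experiments):
        mean = sum(random.choices(values, weights=weights, k=n)) / n
        if abs(mean - E) > eps:
            far += 1
    print(f"n = {n:>6}: P(|sample avg - E| > {eps}) ~ {far / n_experiments:.3f}")
```

With these numbers the deviation probability drops from roughly 0.6 at n=100 to essentially zero at n=10,000.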
JesseM said:
Again, no one is asking you to agree that frequentist definitions are the "best" ones to use in ordinary situations where we are trying to come up with probability estimates from real data...
billschnieder said:
Right after arguing that the probabilities I got from real data are not the correct ones, you go right ahead and argue that the frequentist view (which, btw, is what I used in the statement you were objecting to) is the "best" one to use.
Again, the usual modern meaning of the "frequentist" view is that the probability of some outcome is just the fraction of trials with that outcome in the limit as the number of trials goes to infinity, not in any finite series of trials (see here and http://books.google.com/books?id=Q1AUhivGmyUC&lpg=PA80&dq=frequentism&pg=PA80#v=onepage&q=frequentism&f=false and p. 9 here, for example... the Stanford Encyclopedia of Philosophy article also refers to something called 'finite frequentism', but modern authors usually use 'frequentism' to mean the definition involving the limit as the number of trials approaches infinity, and in any case this is what I have always meant by 'frequentism'; I'm certainly not talking about finite frequentism).
And I am only arguing that the frequentist view is the "best" one to use for understanding the meaning of the probabilities in Bell's theoretical argument, not for estimating probabilities based on empirical data. All I want to know is whether you are willing to consider whether your argument about the limited applicability of Bell's proof (that it can't be applied to three separate lists of pairs which can't be resorted in the way you discussed in #1208) would not apply if we interpret the probabilities in Bell's argument in frequentist terms. Can you please tell me, yes or no: are you willing to consider whether Bell's proof might allow us to make broad predictions about three runs which each yield a distinct list of pairs, if we do indeed interpret the probabilities in his theoretical argument in (non-finite) frequentist terms?
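To make that concrete, here is a toy sketch, entirely my own construction (an arbitrary made-up distribution over eight predetermined-answer hidden states, with perfect anticorrelation at equal settings), showing three separate runs, each producing its own distinct finite list of pairs, whose sample correlations nonetheless satisfy the Bell-type inequality |E(a,b) - E(a,c)| <= 1 + E(b,c) to within sampling error, precisely because each run's sample average converges to the same frequentist expectation value:

```python
import random

# Toy local-hidden-variable model (my own construction, not Bell's notation).
# Each pair carries predetermined answers (A_a, A_b, A_c) in {+1,-1}^3 drawn
# from a fixed distribution rho(lambda); the second detector returns the
# opposite of the first (perfect anticorrelation at equal settings).
random.seed(0)
hidden_states = [(sa, sb, sc) for sa in (1, -1) for sb in (1, -1) for sc in (1, -1)]
weights = [random.random() for _ in hidden_states]  # arbitrary fixed rho(lambda)

def run(i, j, n):
    """One experimental run: n pairs measured at settings i and j."""
    total = 0
    for state in random.choices(hidden_states, weights=weights, k=n):
        total += state[i] * (-state[j])  # A on one side, B = -A on the other
    return total / n  # sample estimate of E(setting_i, setting_j)

n = 500_000  # large runs, so each sample average is near its expectation value
E_ab, E_ac, E_bc = run(0, 1, n), run(0, 2, n), run(1, 2, n)

# Bell's inequality, tested on three *separate* runs with distinct pair lists:
print(f"|E(a,b) - E(a,c)| = {abs(E_ab - E_ac):.4f} <= 1 + E(b,c) = {1 + E_bc:.4f}")
```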
billschnieder said:
From the number of times you have suddenly invoked the word "frequentist" in the latest post of yours, it seems you would rather we abandon this discussion and start one about definitions of probability, of which your favorite is frequentist.
I don't want a discussion of definitions of probability. Whenever I have been talking about probabilities I have been assuming frequentist definitions, and only lately have I noticed that your argument seems to depend critically on the fact that you are using non-frequentist definitions (or 'finite frequentist' definitions if you prefer), which is why I have started trying to be explicit about it. Even if you don't like the frequentist definition in general, all I'm asking is that you consider the possibility that Bell's own probabilities might have been intended to be interpreted in frequentist terms, and that the supposed problems with his argument might disappear if we do interpret symbols like ρ(λ) in this light.
billschnieder said:
I understand that you plan to argue next that unless the frequentist view is used, Bell's work can not be understood correctly. Even though I will not agree with such a narrow view, let me pre-empt that and save you a lot of effort by pointing you to the fact that in my arguments above explaining Bell's work, I have been using the frequentist view.
Your arguments may have been assuming the "finite frequentist" view, but as I said, that's not what I'm talking about. I'm talking about the more common "frequentist" view that defines objective probabilities in terms of the limit as the number of trials goes to infinity. Are you willing to discuss whether Bell's argument makes sense (and doesn't have the problem of limited applicability that you point to) if we assume the probabilities in his theoretical argument were also meant to be understood in the same "frequentist" sense that I'm talking about here?