I recently went through the exercise of using Bayesian probability to figure out the best estimate of the probability of "heads" given that [itex]H[/itex] tosses yielded heads out of [itex]N[/itex] trials. The derivation was enormously complicated, but the answer was very simple: [itex]p = \frac{H+1}{N+2}[/itex]. In the limit as [itex]N \rightarrow \infty[/itex] and [itex]H \rightarrow \infty[/itex], this approaches the relative frequency, [itex]\frac{H}{N}[/itex], but it is actually better behaved. Before you ever toss the coin, with [itex]N = H = 0[/itex], the Bayesian estimate gives [itex]p = \frac{1}{2}[/itex]. If you get heads on the first toss, this estimate gives [itex]p = \frac{2}{3}[/itex], rather than the relative frequency estimate, [itex]p = 1[/itex].
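(Not part of the original argument, but here is a quick Python sketch of how the two estimates compare as data accumulate. The true bias of 0.7 and the number of tosses are just illustrative choices.)

[code]
import random

random.seed(0)
true_bias = 0.7          # illustrative value for the simulation
heads = 0
for n in range(1, 21):
    heads += random.random() < true_bias   # simulate one toss
    bayes = (heads + 1) / (n + 2)          # Bayesian estimate (H+1)/(N+2)
    freq = heads / n                       # relative frequency H/N
    print(f"N={n:2d}  H={heads:2d}  Bayesian={bayes:.3f}  frequency={freq:.3f}")
[/code]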
I should probably explain what I mean by "the best estimate of the probability". I start off assuming that each coin has a parameter--I'm going to call it [itex]B[/itex], for bias--that characterizes the coin tosses. The model is that:
[itex]P(\text{heads} \mid B) = B[/itex]
So the bias is just the probability of heads on a single toss. But I'm treating it as a parameter of the model. As a parameter, it has a range of possible values, [itex]0 \leq B \leq 1[/itex]. If I have no idea what the value of [itex]B[/itex] is, I can use the least informative prior, which is to assume that [itex]B[/itex] is uniformly distributed on the interval [itex][0,1][/itex].
That's kind of an odd concept--we're talking about the probability of a probability--but let's go on.
So we toss the coin [itex]N[/itex] times and get [itex]H[/itex] heads. Then Bayesian updating tells us the posterior probability distribution for [itex]B[/itex], given that data. The rule is (letting [itex]E(H,N)[/itex] be the fact that I got [itex]H[/itex] heads when I flipped the coin [itex]N[/itex] times):
[itex]P(B | E(H,N)) = \frac{P(E(H,N)| B) P(B)}{P(E(H,N))}[/itex]
where [itex]P(E(H,N) | B)[/itex] is the probability of [itex]E(H,N)[/itex] given [itex]B[/itex], [itex]P(B)[/itex] is the prior probability density of [itex]B[/itex] (which is just 1 for the least informative prior), and [itex]P(E(H,N))[/itex] is the prior probability of [itex]E(H,N)[/itex], not knowing anything about [itex]B[/itex].
These can be computed readily enough:
[itex]P(B) = 1[/itex]
[itex]P(E(H,N) | B) = B^H (1-B)^{N-H} \frac{N!}{H! (N-H)!}[/itex]
[itex]P(E(H,N)) = \int dB P(B) P(E(H,N)|B) = \frac{N!}{H! (N-H)!} \int dB\ B^H (1-B)^{N-H}[/itex]
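(As an aside, you can check this update numerically. Here is a rough Python sketch that discretizes [itex]B[/itex] on a grid, multiplies the uniform prior by the likelihood above, and normalizes; the values [itex]N = 10[/itex], [itex]H = 7[/itex], and the grid size are just illustrative choices.)

[code]
from math import comb

N, H = 10, 7                                   # illustrative data: 7 heads in 10 tosses
steps = 1000
dB = 1 / steps
grid = [(i + 0.5) * dB for i in range(steps)]  # values of B in (0, 1)
prior = [1.0 for _ in grid]                    # uniform prior density
likelihood = [comb(N, H) * b**H * (1 - b)**(N - H) for b in grid]
evidence = sum(p * l for p, l in zip(prior, likelihood)) * dB   # approximates P(E(H,N))
posterior = [p * l / evidence for p, l in zip(prior, likelihood)]
print(evidence)   # should be close to 1/(N+1), per the exact result below
[/code]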
That last integral is hard to do, but it's done here:
https://math.stackexchange.com/questions/86542/prove-binomnk-1-n1-int-01xk1-xn-kdx-for-0-leq-k-le
[itex]\int dB\ B^H (1-B)^{N-H} = \frac{H! (N-H)!}{(N+1)!}[/itex]
That gives: [itex]P(E(H,N)) = \frac{1}{N+1}[/itex]
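(Another quick numerical check, not part of the derivation: the integral identity, and the resulting [itex]P(E(H,N)) = \frac{1}{N+1}[/itex], for a few small cases in Python.)

[code]
from math import comb, factorial

def beta_integral(H, N, steps=100000):
    """Riemann-sum approximation of the integral of B^H (1-B)^(N-H) over [0, 1]."""
    dB = 1 / steps
    return sum(((i + 0.5) * dB)**H * (1 - (i + 0.5) * dB)**(N - H)
               for i in range(steps)) * dB

for N, H in [(1, 0), (5, 2), (10, 7)]:
    exact = factorial(H) * factorial(N - H) / factorial(N + 1)
    # numeric integral, exact value, resulting P(E(H,N)), and 1/(N+1)
    print(N, H, beta_integral(H, N), exact, comb(N, H) * exact, 1 / (N + 1))
[/code]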
So our posterior probability distribution for [itex]B[/itex] is:
[itex]P(B|E(H,N)) = \frac{(N+1)!}{H! (N-H)!} B^H (1-B)^{N-H}[/itex]
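(For what it's worth, this posterior is exactly the Beta distribution with parameters [itex]H+1[/itex] and [itex]N-H+1[/itex]. That identification isn't needed for the argument, but it makes a nice check. Here is a short Python comparison, assuming SciPy is available; [itex]N[/itex] and [itex]H[/itex] are illustrative.)

[code]
from math import factorial
from scipy.stats import beta

N, H = 10, 7      # illustrative data

def posterior(B):
    """The posterior density derived above."""
    return factorial(N + 1) / (factorial(H) * factorial(N - H)) * B**H * (1 - B)**(N - H)

for B in (0.1, 0.5, 0.7, 0.9):
    print(B, posterior(B), beta.pdf(B, H + 1, N - H + 1))   # the two columns should agree
[/code]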
Now, we compute [itex]\langle B \rangle_{E(H,N)}[/itex], which is the expected value of [itex]B[/itex], given [itex]E(H,N)[/itex]. The formula for expectation values is:
[itex]\langle B \rangle_{E(H,N)} = \int dB\ B\ P(B | E(H,N)) = \frac{(N+1)!}{H! (N-H)!} \int dB\ B^{H+1} (1-B)^{N-H}[/itex]
We can write: [itex]\int dB\ B^{H+1} (1-B)^{N-H} = \int dB\ B^{H+1} (1-B)^{(N+1)-(H+1)} = \frac{(H+1)! (N-H)!}{(N+2)!}[/itex], using the same identity as above with [itex]H+1[/itex] in place of [itex]H[/itex] and [itex]N+1[/itex] in place of [itex]N[/itex]. So we can immediately write:
[itex]\langle B \rangle_{E(H,N)} = \frac{(N+1)!}{H! (N-H)!} \frac{(H+1)! (N-H)!}{(N+2)!} = \frac{H+1}{N+2}[/itex]
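(One last sanity check, again not part of the derivation: computing the posterior mean numerically on a grid and comparing it to [itex]\frac{H+1}{N+2}[/itex]. The values of [itex]N[/itex] and [itex]H[/itex] are illustrative.)

[code]
N, H = 10, 7
steps = 100000
dB = 1 / steps
grid = [(i + 0.5) * dB for i in range(steps)]
weights = [b**H * (1 - b)**(N - H) for b in grid]        # unnormalized posterior
mean = sum(b * w for b, w in zip(grid, weights)) / sum(weights)
print(mean, (H + 1) / (N + 2))                           # should agree closely
[/code]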
Like I said, very simple result that is very complicated to derive.