Insights Frequentist vs Bayesian Probability: What's the Difference?

Dale · Dec 30, 2020

Demystifier said:

But what is probability then about? About anything that satisfies the axioms of probability?

Yes. That is what axiomatization does. It abstracts a concept. Then the word “probability” (in that mathematical and axiomatic sense) itself becomes an abstraction representing anything which satisfies the axioms.

Demystifier said:

My view is that, if a set of axioms does not really capture the concept that people originally had in mind before proposing the axioms, then it is the axioms, not the concept, that needs to be changed.

I do sympathize with that view, but realistically it is too late in this case. The Kolmogorov axioms are already useful and well accepted, and using the word “probability” to refer to measures which satisfy those axioms is firmly established in the literature.

The best you can do is to recognize that the word “probability”, like so many other words, has multiple meanings. One is the mathematical meaning of anything which satisfies Kolmogorov’s axioms, and the other is the “concept that people originally had in mind”. Then you merely make sure that it is understood which meaning is being used, as you do with any other multiple-meaning word.
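
For reference, here is a sketch of the axioms in their usual measure-theoretic form. The symbols ##\Omega## (sample space), ##\mathcal{F}## (the collection of events), and ##A_i## are notation assumed here for illustration, not taken from the thread:

$$
\begin{aligned}
&P(A) \ge 0 \quad \text{for every event } A \in \mathcal{F}, \\
&P(\Omega) = 1, \\
&P\Big(\bigcup_{i=1}^{\infty} A_i\Big) = \sum_{i=1}^{\infty} P(A_i) \quad \text{for pairwise disjoint } A_1, A_2, \ldots \in \mathcal{F}.
\end{aligned}
$$

Anything that satisfies these three conditions counts as a “probability” in the mathematical sense, whether it is read as a frequency, a degree of belief, or something else.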

atyy · Dec 30, 2020

Dale said:

I tend to like the idea of uncertainty more than randomness, because I find randomness a lot harder to pin down. It seems to get jumbled up with determinism and other things that you don’t have to worry about for uncertainty.

But if a Bayesian draws samples from a distribution, then wouldn't the Bayesian be using the idea of randomness?

E.g.
https://en.wikipedia.org/wiki/Gibbs_sampling
http://www.mit.edu/~ilkery/papers/GibbsSampling.pdf

Dale · Dec 30, 2020

atyy said:

But if a Bayesian draws samples from a distribution, then wouldn't the Bayesian be using the idea of randomness?

Not necessarily. We are certainly uncertain about random things, but we are also uncertain about some non-random things. Both can be represented as a distribution from which we can draw samples. So the mere act of drawing from a distribution does not imply randomness.

A good example is a pseudorandom number generator. There is nothing actually random about it. But we are uncertain of its next value, so we can describe it using a distribution and draw samples from it.
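
A minimal Python sketch of this point, assuming the standard library's `random.Random` generator; the seed value, the sample size, and the helper name `draw_digits` are arbitrary choices made for illustration. The same seed reproduces the same sequence exactly, yet someone who does not know the seed can still describe the next digit with a uniform distribution.

```python
import random
from collections import Counter


def draw_digits(seed, n):
    """Return n digits (0-9) from a generator started at the given seed."""
    rng = random.Random(seed)  # deterministic once the seed is fixed
    return [rng.randrange(10) for _ in range(n)]


# Two runs with the same (arbitrary) seed give exactly the same sequence.
first = draw_digits(seed=12345, n=100_000)
second = draw_digits(seed=12345, n=100_000)
print("identical sequences:", first == second)

# Without knowledge of the seed, each digit is well described by a
# uniform distribution: every relative frequency is close to 0.1.
counts = Counter(first)
freqs = {digit: counts[digit] / len(first) for digit in range(10)}
for digit, freq in freqs.items():
    print(digit, round(freq, 4))
```

Nothing in the sequence is random in a physical sense. The distribution describes what an observer without the seed can say about the next value.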

Stephen Tashi · Dec 30, 2020

Demystifier said:

But what is probability then about? About anything that satisfies the axioms of probability? My view is that, if a set of axioms does not really capture the concept that people originally had in mind before proposing the axioms, then it is the axioms, not the concept, that needs to be changed.

It's fair to say that the concept of probability that people originally had in mind involves a situation where there are several "possible" outcomes of some physical phenomenon, but only one of the "possible" outcomes "actually" occurs. The concept of probability associated with such a situation involves a "tendency" for certain outcomes to actually happen that can be measured by a number, but with no absolute guarantee that this number will correspond to the observed frequencies of the outcomes that actually do happen. This is still how many people applying probability theory think of probability.

However, such thoughts involve the complicated metaphysical concepts of "possible" as distinct from "actual". There is not yet any (well-known) system of mathematics that formalizes these metaphysical concepts and also provides anything useful for applications that the Kolmogorov approach doesn't already supply.

The Kolmogorov approach (measure theory) provides a reliable basis for proving theorems about probabilities. The price of this approach is that probability theory is essentially circular. We have theorems that say if certain probabilities are such-and-such then the probabilities of other things are so-and-so. Any interpretation of probability theory as a guarantee of what will actually happen is outside this theory. It falls under whatever field of science deals with the problem to which the theory is applied.

It seems to me that in physics there is a long tradition of attempts to formulate theories of probability on the basis of actual frequencies of outcomes. For example, if we consider tossing a fair coin as a physical event, then such a theory would tell us to consider the "ensemble" of tossed coins. The ensemble must be an actual thing. It may involve all fair coins that have been tossed in the past and all that will be tossed in the future, and coins tossed on other planets etc. In this actual ensemble of fair coins there is an actual frequency of coins that have landed (or will land) heads. So this frequency is a specific number if the ensemble is finite. (If the ensemble isn't finite, we have more conceptual work to do.)

These ensemble theories do not explain taking independent samples from the ensemble unless we add further structure to the theory. (For example, why won't the sub-ensemble corresponding to one experimenter's tosses all come out heads?) So we need the ensemble to be distributed in space and time (e.g. among various labs and among various times-of-day) in some way that mimics the appearance of independent trials.

Wizard · Jan 1, 2021

WWGD said:

If I may offer a suggestion, or maybe you can reply here, on the two different interpretations of probabilistic statements such as: "There is a 60% chance of rain for (e.g.) Thursday." In the frequentist perspective, I believe this means that in previous times with a similar combination of conditions as the ones before Thursday, it rained 60% of the time. I have trouble finding a Bayesian interpretation for this claim. You may have a prior, but I can't see what data you would use to update it to a posterior probability.

It means that, based on the known distribution parameters and a model of how those parameters affect weather, there is a 60% chance of rain on Thursday. Those parameters include all the things a meteorologist might use to predict the weather. How the model is determined, I'm not quite sure. The model may itself be encoded by additional distribution parameters, which are updated according to observations. The Expectation-Maximisation method is all about determining unknown distribution parameters.
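
As a toy illustration of "parameters updated according to observations", here is a minimal Python sketch of a Beta-Binomial update. It is not a weather model: the uniform Beta(1, 1) prior, the counts of rainy and dry days, and the function names `update_beta` and `beta_mean` are all hypothetical, chosen only to show the mechanics of going from a prior to a posterior.

```python
def update_beta(alpha, beta, rainy_days, dry_days):
    """Conjugate update of a Beta(alpha, beta) belief about the rain probability."""
    return alpha + rainy_days, beta + dry_days


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Hypothetical prior: Beta(1, 1), i.e. uniform over the rain probability.
prior = (1.0, 1.0)

# Hypothetical data: 10 past days with conditions like Thursday's,
# 6 of which were rainy and 4 dry.
posterior = update_beta(*prior, rainy_days=6, dry_days=4)

print("prior mean:    ", beta_mean(*prior))
print("posterior mean:", beta_mean(*posterior))
```

In this sketch the data used for the update are the same past days a frequentist would count. The difference is that the result is a distribution over the rain probability rather than a single frequency.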

atyy · Jan 3, 2021

Dale said:

Not necessarily. We are certainly uncertain about random things, but we are also uncertain about some non-random things. Both can be represented as a distribution from which we can draw samples. So the mere act of drawing from a distribution does not imply randomness.

A good example is a pseudorandom number generator. There is nothing actually random about it. But we are uncertain of its next value, so we can describe it using a distribution and draw samples from it.

Isn't that the same in frequentist thinking?

Dale · Jan 4, 2021

atyy said:

Isn't that the same in frequentist thinking?

Isn’t what the same?

atyy · Jan 4, 2021

Dale said:

Isn’t what the same?

Isn't it the same in frequentist thinking that randomness can arise from determinism, ie. from our ignorance of the details of a deterministic process?

Stephen Tashi · Jan 4, 2021

"Frequentist thinking" is as vague a category of thinking as "liberal thinking" or "conservative thinking". R.A. Fisher is regarded as one of the famous frequentists. In the article https://www.cmu.edu/dietrich/philos...shers Fiducial Argument and Bayes Theorem.pdf we find the quotation from Fisher:

This fundamental requirement for the applicability to individual cases of the concept of classical probability shows clearly the role both of well- specified ignorance and of specific knowledge in a typical probability statement. . . . The knowledge required for such a statement refers to a well-defined aggregate, or population of possibilities within which the limiting frequency ratio must be exactly known. The necessary ignorance is specified by our inability to discriminate any of the different sub-aggregates having different frequency ratios, such as must always exist.

So we see a Frequentist discussing ignorance and knowledge in connection with the concept of probability. That view may not be statistically typical of the population of Frequentists, but it is a view that would allow probabilities to be assigned to the population of numbers generated by a deterministic random number generator - provided that when we take samples, we don't know how to distinguish sub-populations that have statistical characteristics different than the parent population.

Dale · Jan 4, 2021

Thanks @Stephen Tashi, that is a good quote.

atyy said:

Isn't it the same in frequentist thinking that randomness can arise from determinism, ie. from our ignorance of the details of a deterministic process?

So Fisher clearly thinks that it is not necessary to establish “randomness” but merely to have a sample population with a well defined frequency. That fits in well with the frequentist definition of probability as a population frequency. One thing that Fisher doesn’t address there is sampling individual values from the population. Can you still use frequentist probability if the sampling is non-random (e.g. a random number generator with a specified seed)? I suspect that Fisher would say yes, but I am not sure that all prominent frequentists would agree.

So potentially, depending on the individual, there is not much difference between the frequentist and Bayesian interpretation in a deterministic population where we have ignorance.

Where you get a difference is in situations where there is simply no sample population. For example, ##G## or ##\alpha##. Those quantities are not a population, there is only one value but we are uncertain about it. With a frequentist approach ##P(\alpha=1/137)## is somewhere between weird and impossible, whereas a Bayesian would have no qualms about such an expression.
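
A minimal Python sketch of what such a Bayesian statement can look like in practice, assuming a normal prior and normally distributed measurement errors. The constant, the prior, the three measurements, and the name `normal_update` are all hypothetical placeholders, not real data for ##G## or ##\alpha##. For a continuous quantity the posterior is a density, so in this sketch the statement is made about an interval rather than about a single exact value.

```python
import math


def normal_update(prior_mean, prior_var, measurement, meas_var):
    """Conjugate normal update for one noisy measurement of a fixed constant."""
    precision = 1.0 / prior_var + 1.0 / meas_var
    post_var = 1.0 / precision
    post_mean = post_var * (prior_mean / prior_var + measurement / meas_var)
    return post_mean, post_var


# Hypothetical broad prior for a single unknown constant.
mean, var = 0.0, 100.0

# Hypothetical measurements, each with error variance 0.04 (std dev 0.2).
for measurement in [9.8, 10.1, 10.0]:
    mean, var = normal_update(mean, var, measurement, meas_var=0.04)

# 95% credible interval for the constant under the posterior.
half_width = 1.96 * math.sqrt(var)
print("posterior mean:", mean)
print("95% interval:  ", (mean - half_width, mean + half_width))
```

There is only one value of the constant and no population to sample from. The distribution describes the uncertainty about that one value, and it narrows as measurements accumulate.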

atyy · Jan 5, 2021

Would one accept, as another piece of evidence that many frequentists consider randomness to arise from ignorance, the terminology in quantum mechanics that the density operator is sometimes "ignorance interpretable" and at other times "not ignorance interpretable"? In other words, it shows that standard quantum mechanics does use the idea that probability arises from ignorance, i.e. some cases in classical and quantum mechanics are "ignorance interpretable". Here I'm assuming that most of physics has used the frequentist interpretation of probability.

Here are two examples from Schlosshauer's review https://arxiv.org/abs/quant-ph/0312059.

"It is a well-known and important property of quantum mechanics that a superposition of states is fundamentally different from a classical ensemble of states, where the system actually is in only one of the states but we simply do not know in which (this is often referred to as an “ignorance-interpretable,” or “proper”ensemble). "

"Most prominently, the orthodox interpretation postulates a collapse mechanism that transforms a pure-state density matrix into an ignorance-interpretable ensemble of individual states (a “proper mixture”)."

Wizard · Jan 7, 2021

How does a frequentist rationalise irrational probabilities? ;)

atyy · Jan 8, 2021

Wizard said:

How does a frequentist rationalise irrational probabilities? ;)

http://www.cs.ru.nl/P.Lucas/teaching/CI/efron.pdf
Why Isn't Everyone a Bayesian? :oldbiggrin:

Author(s): B. Efron
Source: The American Statistician, Vol. 40, No. 1 (Feb., 1986), pp. 1-5

Just a note that "incoherent" is nowadays the more usual technical term in English.

jbergman · Jan 22, 2021

Wizard said:

How does a frequentist rationalise irrational probabilities? ;)

Maybe I'm dense, but that seems easy. :)

Wizard · Jan 23, 2021

Touché
