Frequentist vs Bayesian Probability: What's the Difference?

  • Context: Insights 
  • Thread starter: Dale
  • Tags: Bayesian Probability

Discussion Overview

The discussion centers on the differences between frequentist and Bayesian interpretations of probability, exploring their theoretical foundations, applications, and implications in statistical reasoning. Participants examine specific probabilistic statements, the nature of uncertainty, and the methodologies employed by each perspective.

Discussion Character

  • Debate/contested
  • Technical explanation
  • Conceptual clarification

Main Points Raised

  • Some participants suggest that a frequentist interpretation of a probabilistic statement, such as "There is a 60% chance of rain," relates to historical frequencies of similar conditions resulting in rain.
  • Others express difficulty in finding a Bayesian interpretation for the same statement, questioning what data would be used to update a prior probability to a posterior probability.
  • A participant identifies as a moderate Bayesian, indicating a preference for both Bayesian and frequentist methods depending on the context, and emphasizes that neither approach is inherently "right" or "wrong."
  • It is proposed that Bayesian interpretation involves uncertainty about the outcome, with a preference for betting on the likelihood of rain over a coin flip, contingent on having a model to update probabilities.
  • Discussion includes a technical explanation of how frequentist probabilities are determined through repeated trials and the concept of long-run frequency.
  • Some participants note that the notation used in frequentist probability conveys an intuitive belief rather than a precise mathematical definition, raising questions about its interpretation.
  • There are claims that Bayesian probabilities converge to frequentist probabilities under certain conditions, though the implications of this convergence are debated (a minimal numerical sketch of this convergence appears after this list).
  • Concerns are raised about whether defining probability in terms of probability is problematic for frequentist purists, with some asserting that frequentists do not disagree with the Law of Large Numbers.
  • The essential distinction between the two approaches is discussed, focusing on the interpretation of unknown quantities as either definite but unknown or outcomes of stochastic processes.
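To make the convergence claim concrete, here is a minimal sketch, assuming a simple coin-flip model with a uniform Beta prior (the 60% figure and all numbers are illustrative, not from the thread): as the number of trials grows, the Bayesian posterior mean approaches the frequentist long-run frequency.

```python
import random

# Beta-Binomial sketch: the Bayesian posterior mean for the success
# probability approaches the frequentist long-run frequency as n grows.
true_p = 0.6            # illustrative long-run frequency
a, b = 1.0, 1.0         # uniform Beta(1, 1) prior

random.seed(0)
for n in (10, 100, 10_000):
    heads = sum(random.random() < true_p for _ in range(n))
    posterior_mean = (a + heads) / (a + b + n)   # conjugate update
    print(f"n={n:6d}  frequency={heads / n:.3f}  posterior mean={posterior_mean:.3f}")
```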

Areas of Agreement / Disagreement

Participants express differing views on the interpretations of probability, with no consensus reached on the superiority of one approach over the other. The discussion remains unresolved regarding the implications of these interpretations and their applications.

Contextual Notes

Participants highlight limitations in definitions and assumptions underlying both frequentist and Bayesian approaches, particularly concerning the interpretation of probabilities and the conditions under which they apply.

  • #31
Demystifier said:
But what is probability then about? About anything that satisfies the axioms of probability?
Yes. That is what axiomatization does. It abstracts a concept. Then the word “probability” (in that mathematical and axiomatic sense) itself becomes an abstraction representing anything which satisfies the axioms.

Demystifier said:
My view is that, if a set of axioms does not really capture the concept that people originally had in mind before proposing the axioms, then it is the axioms, not the concept, that needs to be changed.
I do sympathize with that view, but realistically it is too late in this case. The Kolmogorov axioms are already useful and well accepted, and using the word “probability” to refer to measures which satisfy those axioms is firmly established in the literature.

The best you can do is to recognize that the word “probability”, like so many other words, has multiple meanings. One is the mathematical meaning of anything which satisfies Kolmogorov’s axioms, and the other is the “concept that people originally had in mind”. Then you merely make sure that it is understood which meaning is being used, as you do with any other multiple-meaning word.
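To make the abstraction concrete, here is a minimal sketch (illustrative Python, assuming a finite sample space; nothing here is specific to any one interpretation) of checking Kolmogorov's axioms. Anything that passes these checks is a "probability" in the mathematical sense, whatever it is "about".

```python
def satisfies_kolmogorov(sample_space, prob, tol=1e-12):
    """Check Kolmogorov's axioms on a finite sample space, where the
    probability of any event is the sum over its (disjoint) outcomes."""
    # Axiom 1: non-negativity of every outcome's probability.
    if any(prob[w] < 0 for w in sample_space):
        return False
    # Axiom 2: the whole sample space has probability 1.
    if abs(sum(prob[w] for w in sample_space) - 1.0) > tol:
        return False
    # Axiom 3 (countable additivity) reduces to finite additivity here,
    # which holds by construction when events are summed over outcomes.
    return True

# A fair die satisfies the axioms, and so does any other assignment
# that passes these checks -- that is the sense of the abstraction.
die = list(range(1, 7))
print(satisfies_kolmogorov(die, {w: 1 / 6 for w in die}))  # True
```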
 
  • #32
Dale said:
I tend to like the idea of uncertainty more than randomness, because I find randomness a lot harder to pin down. It seems to get jumbled up with determinism and other things that you don’t have to worry about for uncertainty.

But if a Bayesian draws samples from a distribution, then wouldn't the Bayesian be using the idea of randomness?

Eg.
https://en.wikipedia.org/wiki/Gibbs_sampling
http://www.mit.edu/~ilkery/papers/GibbsSampling.pdf
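For a self-contained picture of what such sampling looks like, here is a minimal Gibbs sampler for a standard bivariate normal with correlation rho; this is a textbook illustration, not code taken from the linked references.

```python
import random

# Gibbs sampler for a standard bivariate normal with correlation rho:
# each full conditional x | y and y | x is itself a normal distribution.
rho = 0.8
random.seed(0)
x = y = 0.0
samples = []
for i in range(20_000):
    x = random.gauss(rho * y, (1 - rho ** 2) ** 0.5)  # draw x | y
    y = random.gauss(rho * x, (1 - rho ** 2) ** 0.5)  # draw y | x
    if i >= 1_000:                # discard burn-in iterations
        samples.append((x, y))

n = len(samples)
print(f"E[x]  ~ {sum(s[0] for s in samples) / n:.3f} (expect ~0)")
print(f"E[xy] ~ {sum(s[0] * s[1] for s in samples) / n:.3f} (expect ~{rho})")
```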
 
  • #33
atyy said:
But if a Bayesian draws samples from a distribution, then wouldn't the Bayesian be using the idea of randomness?
Not necessarily. We are certainly uncertain about random things, but we are also uncertain about some non-random things. Both can be represented as a distribution from which we can draw samples. So the mere act of drawing from a distribution does not imply randomness.

A good example is a pseudorandom number generator. There is nothing actually random about it. But we are uncertain of its next value, so we can describe it using a distribution and draw samples from it.
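As a concrete sketch of that point, consider a toy linear congruential generator (the multiplier and increment are the common Numerical Recipes constants; everything else is illustrative). The sequence is fully deterministic, yet without knowledge of the state, its outputs are well described by a uniform distribution:

```python
# A toy linear congruential generator: completely deterministic.
def lcg(state, a=1664525, c=1013904223, m=2**32):
    return (a * state + c) % m

# Given the state, the next value is certain; without it, our
# uncertainty is well summarized by a (near-)uniform distribution.
state = 42
counts = [0] * 10
for _ in range(100_000):
    state = lcg(state)
    counts[state % 10] += 1       # bucket outputs into 10 bins

# Empirical bin frequencies come out close to 0.1 each, so sampling from
# a uniform distribution models our uncertainty about a non-random process.
print([round(c / 100_000, 3) for c in counts])
```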
 
  • #34
Demystifier said:
But what is probability then about? About anything that satisfies the axioms of probability? My view is that, if a set of axioms does not really capture the concept that people originally had in mind before proposing the axioms, then it is the axioms, not the concept, that needs to be changed.

It's fair to say that the concept of probability people originally had in mind involves a situation where there are several "possible" outcomes of some physical phenomenon, but only one of the "possible" outcomes "actually" occurs. The concept of probability associated with such a situation involves a "tendency" for certain outcomes to actually happen, which can be measured by a number, but no absolute guarantee that this number will correspond to the observed frequencies of the outcomes that actually do happen. This is still how many people applying probability theory think of probability.

However, such thoughts involve the complicated metaphysical concepts of "possible" as distinct from "actual". There is not yet any (well-known) system of mathematics that formalizes these metaphysical concepts and also provides anything useful for applications that the Kolmogorov approach doesn't already supply.

The Kolmogorov approach (measure theory) provides a reliable basis for proving theorems about probabilities. The price of this approach is that probability theory is essentially circular. We have theorems that say if certain probabilities are such-and-such, then the probabilities of other things are so-and-so. Any interpretation of probability theory as a guarantee of what will actually happen is outside the theory; it falls under whatever field of science deals with the problem to which the theory is applied.

It seems to me that in physics there is a long tradition of attempts to formulate theories of probability on the basis of actual frequencies of outcomes. For example, if we consider tossing a fair coin as a physical event, then such a theory would tell us to consider the "ensemble" of tossed coins. The ensemble must be an actual thing. It may involve all fair coins that have been tossed in the past and all that will be tossed in the future, coins tossed on other planets, etc. In this actual ensemble of fair coins there is an actual frequency of coins that have landed (or will land) heads. So this frequency is a specific number if the ensemble is finite. (If the ensemble isn't finite, we have more conceptual work to do.)

These ensemble theories do not explain taking independent samples from the ensemble unless we add further structure to the theory. (For example, why won't the sub-ensemble corresponding to one experimenter's tosses all come out heads?) So we need the ensemble to be distributed in space and time (e.g. among various labs and among various times of day) in some way that mimics the appearance of independent trials.
 
  • #35
WWGD said:
If I may offer a suggestion, or maybe you can reply here, on the two different interpretations of probabilistic statements such as: "There is a 60% chance of rain for (e.g.) Thursday." From the frequentist perspective, I believe this means that on previous occasions with a combination of conditions similar to the ones before Thursday, it rained 60% of the time. I have trouble finding a Bayesian interpretation for this claim. You may have a prior, but I can't see what data you would use to update it to a posterior probability.
It means that, based on the known distribution parameters and a model of how those parameters affect weather, there is a 60% chance of rain on Thursday. Those parameters include all the things a meteorologist might use to predict the weather. How the model is determined, I'm not quite sure. The model may itself be encoded by additional distribution parameters, which are updated according to observations. The Expectation-Maximisation method is all about determining unknown distribution parameters.
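Since the post above mentions Expectation-Maximisation, here is a minimal sketch of the idea on a one-dimensional two-component Gaussian mixture (the data, starting values, and known unit variances are all illustrative simplifications, not a weather model):

```python
import math, random

# EM for a two-component 1-D Gaussian mixture with known unit variances:
# alternate between computing responsibilities (E-step) and re-estimating
# the mixture weight and component means (M-step).
random.seed(0)
data = ([random.gauss(-2, 1) for _ in range(200)]
        + [random.gauss(3, 1) for _ in range(200)])

def normal_pdf(x, mu):
    return math.exp(-0.5 * (x - mu) ** 2) / math.sqrt(2 * math.pi)

mu1, mu2, w = -1.0, 1.0, 0.5      # crude starting guesses
for _ in range(50):
    # E-step: posterior responsibility of component 1 for each point.
    r = [w * normal_pdf(x, mu1)
         / (w * normal_pdf(x, mu1) + (1 - w) * normal_pdf(x, mu2))
         for x in data]
    # M-step: update the unknown distribution parameters.
    w = sum(r) / len(data)
    mu1 = sum(ri * x for ri, x in zip(r, data)) / sum(r)
    mu2 = sum((1 - ri) * x for ri, x in zip(r, data)) / (len(data) - sum(r))

print(f"estimated means: {mu1:.2f}, {mu2:.2f} (true: -2, 3)")
```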
 
  • #36
Dale said:
Not necessarily. We are certainly uncertain about random things, but we are also uncertain about some non-random things. Both can be represented as a distribution from which we can draw samples. So the mere act of drawing from a distribution does not imply randomness.

A good example is a pseudorandom number generator. There is nothing actually random about it. But we are uncertain of its next value, so we can describe it using a distribution and draw samples from it.

Isn't that the same in frequentist thinking?
 
  • #37
atyy said:
Isn't that the same in frequentist thinking?
Isn’t what the same?
 
  • #38
Dale said:
Isn’t what the same?

Isn't it the same in frequentist thinking that randomness can arise from determinism, ie. from our ignorance of the details of a deterministic process?
 
  • #39
"Frequentist thinking" is as vague a category of thinking as "liberal thinking" or "conservative thinking". R.A. Fisher is regarded as one of the famous frequentists. In the article https://www.cmu.edu/dietrich/philos...shers Fiducial Argument and Bayes Theorem.pdf we find the quotation from Fisher:

This fundamental requirement for the applicability to individual cases of the concept of classical probability shows clearly the role both of well-specified ignorance and of specific knowledge in a typical probability statement. . . . The knowledge required for such a statement refers to a well-defined aggregate, or population of possibilities within which the limiting frequency ratio must be exactly known. The necessary ignorance is specified by our inability to discriminate any of the different sub-aggregates having different frequency ratios, such as must always exist.

So we see a Frequentist discussing ignorance and knowledge in connection with the concept of probability. That view may not be statistically typical of the population of Frequentists, but it is a view that would allow probabilities to be assigned to the population of numbers generated by a deterministic random number generator - provided that when we take samples, we don't know how to distinguish sub-populations whose statistical characteristics differ from those of the parent population.
 
  • #40
Thanks @Stephen Tashi, that is a good quote.
atyy said:
Isn't it the same in frequentist thinking that randomness can arise from determinism, ie. from our ignorance of the details of a deterministic process?
So Fisher clearly thinks that it is not necessary to establish “randomness” but merely to have a sample population with a well defined frequency. That fits in well with the frequentist definition of probability as a population frequency. One thing that Fisher doesn’t address there is sampling individual values from the population. Can you still use frequentist probability if the sampling is non-random (e.g. a random number generator with a specified seed)? I suspect that Fisher would say yes, but I am not sure that all prominent frequentists would agree.

So potentially, depending on the individual, there is not much difference between the frequentist and Bayesian interpretation in a deterministic population where we have ignorance.

Where you get a difference is in situations where there is simply no sample population. For example, ##G## or ##\alpha##. Those quantities are not a population, there is only one value but we are uncertain about it. With a frequentist approach ##P(\alpha=1/137)## is somewhere between weird and impossible, whereas a Bayesian would have no qualms about such an expression.
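A minimal sketch of that Bayesian move, assuming a conjugate normal-normal model with purely illustrative numbers (this is not an actual determination of ##\alpha##): the constant has exactly one true value, but our uncertainty about it is carried by a distribution that a measurement then updates.

```python
# Bayesian inference about a single fixed constant: there is no
# population of values, only one unknown number, yet uncertainty
# about it can be expressed and updated as a distribution.
prior_mean, prior_var = 7.30e-3, 1e-8    # illustrative prior near 1/137
meas, meas_var = 7.29735e-3, 1e-10       # one noisy "measurement"

# Conjugate normal-normal update: precision-weighted combination.
post_var = 1 / (1 / prior_var + 1 / meas_var)
post_mean = post_var * (prior_mean / prior_var + meas / meas_var)
print(f"posterior: mean={post_mean:.6e}, sd={post_var ** 0.5:.2e}")
```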
 
  • #41
Would one accept, as another piece of evidence that many frequentists consider randomness to arise from ignorance, the terminology in quantum mechanics whereby the density operator is sometimes "ignorance interpretable" and at other times "not ignorance interpretable"? In other words, it shows that standard quantum mechanics does use the idea that probability arises from ignorance, i.e. that some cases in classical and quantum mechanics are "ignorance interpretable". Here I'm assuming that most physics has used the frequentist interpretation of probability.

Here are two examples from Schlosshauer's review https://arxiv.org/abs/quant-ph/0312059.

"It is a well-known and important property of quantum mechanics that a superposition of states is fundamentally different from a classical ensemble of states, where the system actually is in only one of the states but we simply do not know in which (this is often referred to as an “ignorance-interpretable,” or “proper”ensemble). "

"Most prominently, the orthodox interpretation postulates a collapse mechanism that transforms a pure-state density matrix into an ignorance-interpretable ensemble of individual states (a “proper mixture”)."
 
  • #42
How does a frequentist rationalise irrational probabilities? ;)
 
  • #43
Wizard said:
How does a frequentist rationalise irrational probabilities? ;)

http://www.cs.ru.nl/P.Lucas/teaching/CI/efron.pdf
Why Isn't Everyone a Bayesian? :oldbiggrin:
Author(s): B. Efron
Source: The American Statistician, Vol. 40, No. 1 (Feb., 1986), pp. 1-5

Just a note that "incoherent" is nowadays the more usual technical term in English.
 
  • #44
Wizard said:
How does a frequentist rationalise irrational probabilities? ;)
Maybe I'm dense, but that seems easy. :)
 
  • #45
Touché
 
