What motivates Bayes' Theorem?

Dale · Oct 7, 2024

Agent Smith said:

A frequentist takes a sample (the least). A Bayesian ___?

Sometimes there is no sample. So a Bayesian can still choose a prior that represents their uncertainty.

Dale · Oct 7, 2024

PeroK said:

There is only one universe, so the sample size is 1.

PeroK said:

The universe is either flat or it isn't. Probability doesn't apply.

This isn’t correct. Long term frequency doesn’t apply, but frequency isn’t probability.

What defines probability is the Kolmogorov axioms. Anything that fulfills those axioms is probability. Frequency is one example, and uncertainty is another.

There is no population of universes to draw a large sample from and form frequencies, but there is uncertainty about the universe’s flatness. So probability does apply.
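For reference, here are the axioms being invoked, in standard notation. The symbols ##\Omega## (sample space), ##\mathcal{F}## (collection of events) and ##P## are the usual textbook ones and are not taken from the posts above. $$P(A) \ge 0 \ \text{ for every } A \in \mathcal{F}, \qquad P(\Omega) = 1, \qquad P\!\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} P(A_i) \ \text{ for pairwise disjoint } A_1, A_2, \ldots \in \mathcal{F}$$ Nothing in these three statements mentions repeated trials, which is why both a long-run frequency and a degree of uncertainty can qualify.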

PeroK · Oct 7, 2024

Dale said:

This isn’t correct. Long term frequency doesn’t apply, but frequency isn’t probability.

What defines probability is the Kolmogorov axioms. Anything that fulfills those axioms is probability. Frequency is one example, and uncertainty is another.

There is no population of universes to draw a large sample from and form frequencies, but there is uncertainty about the universe’s flatness. So probability does apply.

The Kolmogorov axioms are a pure mathematical construction. Whether you can apply them to a given physical scenario is the question. The universe cannot, by itself, satisfy mathematical axioms. The universe is not a measure space.

If you conclude, for example, that the probability that the universe is flat is 90%, then (as a frequentist) I don't know what to make of that. It's a number and it may have some meaning, but I cannot relate it to what I understand as a probability.

Uncertainty, by itself, does not imply probabilities. One example is the Two-Envelope Problem:

https://en.wikipedia.org/wiki/Two_envelopes_problem

If you assume that there must be probabilities, then you end up with contradictory calculations and inconsistencies, until you realise that there must be a distribution. Without specifying the distribution, the numbers you calculate are not probabilities. To have probabilities relating to the universe, you must specify the distribution which applies to the universe(s).
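To make the role of the distribution concrete, here is a minimal sketch of the two-envelope setup once a distribution is actually specified. The prior over the smaller amount and the function names are hypothetical, chosen only for illustration. With the prior fixed, the expected gain from switching depends on the amount observed, and the overall expectations for keeping and switching agree.

```python
from fractions import Fraction

# Hypothetical prior over the SMALLER amount (an assumption for illustration).
PRIOR = {1: Fraction(1, 2), 2: Fraction(1, 4), 4: Fraction(1, 4)}


def expected_keep_and_switch(prior):
    """Overall expected payoff of always keeping vs. always switching."""
    half = Fraction(1, 2)
    keep = Fraction(0)
    switch = Fraction(0)
    for x, p in prior.items():
        # The envelopes hold x and 2x; each is picked with probability 1/2.
        keep += p * (half * x + half * 2 * x)
        switch += p * (half * 2 * x + half * x)
    return keep, switch


def gain_from_switching_given(observed, prior):
    """Expected gain from switching after seeing `observed` in your envelope."""
    w_small = prior.get(observed, Fraction(0))               # you hold the smaller
    w_large = prior.get(Fraction(observed, 2), Fraction(0))  # you hold the larger
    total = w_small + w_large
    if total == 0:
        raise ValueError("observed amount is impossible under this prior")
    # Switching gains `observed` if you hold the smaller, loses half of it otherwise.
    return (w_small * observed - w_large * Fraction(observed, 2)) / total


keep, switch = expected_keep_and_switch(PRIOR)
print(keep, switch)  # 3 3
for amount in (1, 2, 4, 8):
    print(amount, gain_from_switching_given(amount, PRIOR))
```

The naive argument assigns a gain of a quarter of the observed amount in every case. Under this particular prior the conditional gains are 1, 0, 1 and -4, so the paradox disappears once the distribution is written down.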

Dale · Oct 7, 2024

PeroK said:

The Kolmogorov axioms are a pure mathematical construction.

Yes. But they are the mathematical construction that defines probability.

PeroK said:

The universe cannot, by itself, satisfy mathematical axioms

Nobody claimed or even implied otherwise.

PeroK said:

If you conclude, for example, that the probability that the universe is flat is 90%, then (as a frequentist) I don't know what to make of that. It's a number and it may have some meaning, but I cannot relate it to what I understand as a probability.

Which is precisely why I mention the axioms. A lot of frequentists make the same mistake, but it is a mistake.

A lot of frequentists mistakenly believe that probability is defined by long-run frequencies, but that is not correct. Probability is defined by the axioms. Long run frequencies are an example of probability because long run frequencies satisfy the axioms.

This is similar to vectors. Many students are introduced to vectors as little arrows with a magnitude and direction. Later, they are surprised to learn that polynomials are also vectors. We don’t think of polynomials as arrows, but they satisfy the axioms of vectors. Little arrows are the easiest example of vectors, but not the only example. Similarly frequencies are the easiest example of probability, but not the only example.

So a statement that the probability that the universe is flat is 90% cannot reasonably be understood as a frequency, but it certainly can be understood as an uncertainty. And since uncertainties are also probabilities, it is a valid statement.

PeroK · Oct 7, 2024

Dale said:

Yes. But they are the mathematical construction that defines probability.

Nobody claimed or even implied otherwise.

Which is precisely why I mention the axioms. A lot of frequentists make the same mistake, but it is a mistake.

A lot of frequentists mistakenly believe that probability is defined by long-run frequencies, but that is not correct. Probability is defined by the axioms. Long run frequencies are an example of probability because long run frequencies satisfy the axioms.

This is similar to vectors. Many students are introduced to vectors as little arrows with a magnitude and direction. Later, they are surprised to learn that polynomials are also vectors. We don’t think of polynomials as arrows, but they satisfy the axioms of vectors. Little arrows are the easiest example of vectors, but not the only example. Similarly frequencies are the easiest example of probability, but not the only example.

So a statement that the probability that the universe is flat is 90% cannot reasonably be understood as a frequency, but it certainly can be understood as an uncertainty. And since uncertainties are also probabilities, it is a valid statement.

That, as I understand it, is the Bayesian interpretation. To say that the frequentist interpretation is a mistake is pushing your luck. At the very least, there is a Bayesian prior that the frequentist interpretation is not a mistake with nonzero probability!

Dale · Oct 7, 2024

PeroK said:

That the frequentist interpretation is a mistake is pushing your luck

That is not what I said was a mistake. What is a mistake is the belief that the frequentist interpretation defines probability. Long run frequencies and uncertainties are both valid examples of probabilities. But what defines probability is the axioms.

Agent Smith · Oct 7, 2024

Dale said:

Sometimes there is no sample. So a Bayesian can still choose a prior that represents their uncertainty.

Choose as opposed to compute? This would matter I feel.

Agent Smith · Oct 7, 2024

Dale said:

Which is precisely why I mention the axioms. A lot of frequentists make the same mistake, but it is a mistake.

So frequentism (I hope that's the correct term) is just one version of probability.

Dale · Oct 7, 2024

Agent Smith said:

Choose as opposed to compute? This would matter I feel.

Yes. Choose. With no data how would you compute it? This is one of the best things about Bayesian statistics, and it is also one of the things that most bothers people learning about it. It allows you to include nebulous things like expert opinion that may not be something that you can compute.

I dive into this topic a bit in one of my Insights articles, but I don’t remember which one.

Agent Smith said:

So frequentism (I hope that's the correct term) is just one version of probability.

Yes. Just like an arrow is one version of a vector.

Agent Smith · Oct 7, 2024

@Dale I'm a bit lost here, Bayes' theorem is ##\text{P(Hypothesis|Evidence)}##. Isn't the evidence a measurement with/without computation, a/the data? However I can see how the prior probability can be chosen (it's just a guess???). The posterior probability would be the result of a computation, using Bayes' Theorem.

Agent Smith · Oct 7, 2024

Also @Dale, can we reduce Bayesian uncertainty to frequentism, i.e. give it a frequentist interpretation? So if I'm 90% certain the earth is flat, it would "mean" that in 9 out of 10 scenarios I'm right and in 1 out of 10 I'm wrong.

Dale · Oct 8, 2024

Agent Smith said:

@Dale I'm a bit lost here, Bayes' theorem is ##\text{P(Hypothesis|Evidence)}##. Isn't the evidence a measurement with/without computation, a/the data? However I can see how the prior probability can be chosen (it's just a guess???). The posterior probability would be the result of a computation, using Bayes' Theorem.

So in this context Bayes' theorem is $$P(\text{Hypothesis} \mid \text{Evidence})=\frac{P(\text{Evidence} \mid \text{Hypothesis}) \ P(\text{Hypothesis})}{P(\text{Evidence})}$$ What is chosen is ##P(\text{Hypothesis})##. This is called the prior. It represents the uncertainty before looking at the evidence. It can be based on any prior studies and any prior data, but if there really is not any evidence then it can be based on things like expert opinion or rough estimation. Whatever your knowledge is, from whatever source you have, before looking at your new data, the prior is chosen to reflect that knowledge.
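As a small worked illustration of "the prior is chosen, the posterior is computed", here is a sketch in Python. The numbers and the function name are hypothetical. ##P(\text{Evidence})## is expanded with the law of total probability over the hypothesis being true or false, which is a standard step but is not spelled out in the post above.

```python
from fractions import Fraction


def posterior(prior, p_e_given_h, p_e_given_not_h):
    """Return P(H | E) from a chosen prior P(H) and the two likelihoods."""
    # P(E) = P(E | H) P(H) + P(E | not H) P(not H)
    evidence = p_e_given_h * prior + p_e_given_not_h * (1 - prior)
    return p_e_given_h * prior / evidence


# Hypothetical likelihoods, for illustration only.
P_E_GIVEN_H = Fraction(9, 10)      # P(Evidence | Hypothesis)
P_E_GIVEN_NOT_H = Fraction(3, 10)  # P(Evidence | not Hypothesis)

# Two different chosen priors, same evidence.
print(posterior(Fraction(1, 2), P_E_GIVEN_H, P_E_GIVEN_NOT_H))   # 3/4
print(posterior(Fraction(1, 10), P_E_GIVEN_H, P_E_GIVEN_NOT_H))  # 1/4
```

The same evidence moves both priors in the same direction, but the posterior still depends on where you started.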

Agent Smith said:

Also @Dale, can we reduce Bayesian uncertainty to frequentism, i.e. give it a frequentist interpretation? So if I'm 90% certain the earth is flat, it would "mean" that in 9 out of 10 scenarios I'm right and in 1 out of 10 I'm wrong.

In situations where frequentist probabilities exist, they will match Bayesian probabilities. The reverse is not necessarily true. There are situations in which there is no reasonable definition of a frequentist probability, but which a Bayesian probability models just fine.

Also, Bayesian statistical tests are usually equivalent to frequentist statistical tests when the Bayesian test is performed using an uninformative prior.
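One simple instance of this kind of agreement, for estimation rather than testing, is sketched below. It assumes a coin-flip (binomial) model with a conjugate Beta prior, which is a choice made for this example and not something stated in the thread. With the uniform prior Beta(1, 1) the posterior mode equals the frequentist maximum-likelihood estimate, while an informative prior pulls the estimate away from it.

```python
from fractions import Fraction


def mle(successes, trials):
    """Frequentist maximum-likelihood estimate of the success probability."""
    return Fraction(successes, trials)


def posterior_mode(successes, trials, a=1, b=1):
    """Mode of the Beta(a + k, b + n - k) posterior under a Beta(a, b) prior."""
    alpha = a + successes
    beta = b + trials - successes
    if alpha <= 1 or beta <= 1:
        # The interior-mode formula below only holds when both parameters exceed 1.
        raise ValueError("mode formula needs alpha > 1 and beta > 1")
    return Fraction(alpha - 1, alpha + beta - 2)


# Hypothetical data: 7 successes in 10 trials.
print(mle(7, 10))                        # 7/10
print(posterior_mode(7, 10))             # 7/10, uniform prior
print(posterior_mode(7, 10, a=5, b=5))   # 11/18, informative prior
```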

Agent Smith · Oct 8, 2024

@Dale muchas gracias.

Agent Smith · Oct 9, 2024

PeroK said:

The universe is either flat or it isn't. Probability doesn't apply.

@Dale

Not intending to start a fight, but in what way does Bayesian thinking (statistics/probability) inform Quantum Physics? Last I checked, from the very layman's discussions I've had, for a particle it isn't the case that it is "either flat or it isn't". Some say it's both here and there (position-wise). The (Bayesian) uncertainty is an actual feature of the world of particles, I believe.

Dale · Oct 9, 2024

Agent Smith said:

in what way does Bayesian thinking (statistics/probability) inform Quantum Physics?

I don’t know. I am not a big quantum mechanics guy. My physics knowledge runs to classical physics, relativity, and biomedical engineering.

I do know that there is a quantum Bayesian interpretation called QBism, but I don’t know the details. I tend to be interpretation-agnostic in many things, so I probably would not be a particularly strong adherent of it.

Agent Smith · Oct 10, 2024

@Dale , but per the axioms of probability (gracias for that), the sum of the probabilities of a particle's position must add up to ##1##. Doesn't that mean they're mutually exclusive? Perhaps you can answer from a mathematician's point of view.

Dale · Oct 10, 2024

Agent Smith said:

@Dale , but per the axioms of probability (gracias for that), the sum of the probabilities of a particle's position must add up to ##1##.

Yes. This means that the probability that the particle is somewhere is 1.

Agent Smith said:

Doesn't that mean they're mutually exclusive? Perhaps you can answer from a mathematician's point of view.

Again, I am not a big QM guy, but what are you referring to by mutually exclusive?

Agent Smith · Oct 10, 2024

Dale said:

Yes. This means that the probability that the particle is somewhere is 1.

Again, I am not a big QM guy, but what are you referring to by mutually exclusive?

It's ok, I'll try and work it out on my own. Gracias.
