Relationship between a posterior distribution and the LLN

scinoob · Mar 26, 2014

Hello everybody. This is my first post here and I hope I'm not asking a question that's been addressed already (I did try to use the search function, but couldn't find what I'm looking for).

Both Bayes' theorem and the law of large numbers (LLN) are mathematical theorems derived from Kolmogorov's axioms. I've been looking for a way to relate the two, and I'm not sure how that can be done.

Let's say I have a coin with an unknown bias towards heads θ. I start with a uniform prior distribution for θ (on [0, 1], obviously) and keep flipping the coin while updating the distribution using Bayes' theorem. Let's say after 300 flips the posterior looks like a relatively peaked unimodal distribution centered at 0.3. Then I decide to flip the coin another, say, 10k times without updating the posterior anymore, and somebody asks, "Approximately what percentage of heads do you expect to see in the 10k flips?" What I would most likely do (at least according to standard practice) is calculate the expected value of a single flip using the posterior distribution I obtained after the 300 flips. If I then go ahead and flip the coin 10k times, can I expect the percentage of heads to be close to the value I obtained in the analytic calculation? If I keep flipping it, can I expect that the % of heads will converge to the expected value by appealing to the LLN?

If the answer is 'no', then is there another way to relate Bayes' theorem to the LLN, say, using my coin example in particular?
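For concreteness, here is a minimal sketch of the "analytic calculation" in the coin example, assuming the standard conjugate Beta-binomial update. The counts (90 heads in 300 flips) and the function names are hypothetical, chosen only so that the posterior is centered near 0.3; they are not from the original post.

```python
from fractions import Fraction
from math import sqrt

def posterior_params(heads, flips, a=1, b=1):
    """Beta(a, b) prior plus binomial data gives Beta(a + heads, b + tails).

    a = b = 1 is the uniform prior on [0, 1].
    """
    return a + heads, b + (flips - heads)

def posterior_mean(alpha, beta):
    """Mean of Beta(alpha, beta): the posterior probability of heads on one flip."""
    return Fraction(alpha, alpha + beta)

def posterior_std(alpha, beta):
    """Standard deviation of Beta(alpha, beta), a measure of how peaked it is."""
    total = alpha + beta
    return sqrt(alpha * beta / (total ** 2 * (total + 1)))

# Hypothetical data: 90 heads observed in the first 300 flips.
alpha, beta = posterior_params(heads=90, flips=300)
mean = posterior_mean(alpha, beta)

print(alpha, beta)                 # 91 211
print(mean)                        # 91/302
print(round(float(mean), 4))       # 0.3013
print(posterior_std(alpha, beta))  # roughly 0.026
```

Under these assumptions the expected value of a single flip is 91/302, slightly above the raw frequency 90/300 because the uniform prior pulls the estimate toward 0.5.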

Stephen Tashi · Mar 26, 2014

scinoob said:

Let's say after 300 flips the posterior looks like a relatively peaked unimodal distribution centered at 0.3.

Bayesians vary in how they interpret the Bayesian approach to problems. One common way to think about the posterior distribution in your problem is that your particular coin was picked at random from a population of coins with various probabilities of landing heads, and that the posterior distribution gives the probabilities of picking such a coin. An expected value calculated using the posterior distribution is the expected fraction of heads over that entire population of coins, not the expected fraction of heads from tossing your particular coin repeatedly. So when you say that you continue to toss the coin 10k more times, you must be specific about whether you are tossing the same coin 10k times or whether you are imagining each of the 10k tosses to consist of two steps: a) pick a coin at random from the population of all coins, b) toss the coin you picked.
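The difference between the two tossing schemes can be sketched with a small simulation. Everything specific here is an assumption made for illustration: the population of coins is taken to be a Beta(91, 211) distribution (what a uniform prior and a hypothetical 90 heads in 300 flips would give), and the "true" bias of the single coin is set to a made-up value of 0.27.

```python
import random

def same_coin(theta, n, rng):
    """Toss one fixed coin with heads probability theta n times."""
    return sum(rng.random() < theta for _ in range(n)) / n

def fresh_coin_each_toss(alpha, beta, n, rng):
    """Each toss: (a) pick a coin from Beta(alpha, beta), (b) toss it once."""
    heads = 0
    for _ in range(n):
        theta = rng.betavariate(alpha, beta)  # step (a)
        heads += rng.random() < theta         # step (b)
    return heads / n

rng = random.Random(0)        # seeded so the run is reproducible
alpha, beta = 91, 211         # assumed population / posterior
post_mean = alpha / (alpha + beta)
theta_true = 0.27             # hypothetical bias of the one coin in hand
n = 10_000

f_same = same_coin(theta_true, n, rng)
f_fresh = fresh_coin_each_toss(alpha, beta, n, rng)

print(f"posterior mean:          {post_mean:.4f}")
print(f"same coin, 10k tosses:   {f_same:.4f}")   # tracks theta_true
print(f"fresh coin on each toss: {f_fresh:.4f}")  # tracks post_mean
```

In the two-step scheme each toss is heads with probability equal to the population mean, so the frequency settles near the posterior mean. With one fixed coin the frequency settles near that coin's own bias, whatever it happens to be.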

scinoob · Mar 26, 2014

Stephen Tashi said:

Bayesians vary in how they interpret the Bayesian approach to problems. One common way to think about the posterior distribution in your problem is that your particular coin was picked at random from a population of coins with various probabilities of landing heads, and that the posterior distribution gives the probabilities of picking such a coin. An expected value calculated using the posterior distribution is the expected fraction of heads over that entire population of coins, not the expected fraction of heads from tossing your particular coin repeatedly. So when you say that you continue to toss the coin 10k more times, you must be specific about whether you are tossing the same coin 10k times or whether you are imagining each of the 10k tosses to consist of two steps: a) pick a coin at random from the population of all coins, b) toss the coin you picked.

Good point. In this case I mean tossing the same coin 10k times. In other words, it's the same coin with some unknown bias θ. The posterior distribution represents the degree of belief in each value for θ.

Stephen Tashi · Mar 26, 2014

If I keep flipping it, can I expect that the % of heads will converge to the expected value by appealing to the LLN?

No, you can't justify that by the LLN.

FactChecker · Mar 28, 2014

Stephen Tashi said:

No, you can't justify that by the LLN.

I'm afraid I have to disagree here. I think this is exactly what the LLN says. The percentage of heads will converge to the true mean of the biased coin. It will not converge to the mean of the 300 samples unless that happens to have 0 error.

Stephen Tashi · Mar 28, 2014

But the true mean of the particular biased coin is not the mean calculated from the posterior distribution, which considers coins with a variety of biases in the calculation.

(Also "will converge" has a very technical definition - different than the definition of convergence used in calculus for the convergence of sequences or limits of functions.)
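The two points above can be written out as a short sketch. The notation is assumed for illustration and does not come from the thread: D denotes the data from the first 300 flips, μ the posterior mean, and the X_i are the later tosses of the same coin (1 for heads, 0 for tails).

```latex
% Same coin: conditional on its bias \theta, the tosses are i.i.d. Bernoulli(\theta),
% so the strong LLN gives convergence to \theta, not to the posterior mean.
\[
X_1, X_2, \dots \mid \theta \ \overset{\text{iid}}{\sim}\ \mathrm{Bernoulli}(\theta)
\quad\Longrightarrow\quad
\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i \ \xrightarrow{\ \text{a.s.}\ }\ \theta .
\]

% Posterior mean: an average over all biases that are still plausible given D.
\[
\mu = \mathbb{E}[\theta \mid D] = \int_0^1 \theta \, p(\theta \mid D)\, d\theta ,
\qquad
\bigl|\bar{X}_n - \mu\bigr| \ \xrightarrow{\ \text{a.s.}\ }\ |\theta - \mu| .
\]

% "Will converge" in the weak LLN means convergence in probability:
\[
\text{for every } \varepsilon > 0:\qquad
\lim_{n \to \infty} P\bigl(\,|\bar{X}_n - \theta| > \varepsilon \,\bigr) = 0 .
\]
```

So the long-run percentage of heads agrees with the posterior mean only if θ happens to equal μ. A peaked posterior says the gap |θ − μ| is probably small, but the LLN by itself does not make it vanish.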

FactChecker · Mar 29, 2014

Stephen Tashi said:

But the true mean of the particular biased coin is not the mean calculated from the posterior distribution, which considers coins with a variety of biases in the calculation.

(Also "will converge" has a very technical definition - different than the definition of convergence used in calculus for the convergence of sequences or limits of functions.)

Oh. I missed that there were multiple coins with different biases. Then you are right.
