Bayes, Hume, lotteries and trustworthy newspapers

  • Context: Undergrad 
  • Thread starter: haushofer

Discussion Overview

The discussion revolves around the application of Bayes' theorem to a lottery scenario, where a participant questions the reliability of a newspaper report claiming they have won. Participants explore the relationship between the probabilities of winning the lottery and the accuracy of the newspaper's reporting, drawing parallels to medical testing scenarios, particularly the interpretation of false positives in rare conditions.

Discussion Character

  • Exploratory
  • Technical explanation
  • Conceptual clarification
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • One participant introduces a lottery scenario with a low probability of winning and questions how to calculate the probability of actually winning given a potentially erroneous newspaper report.
  • Another participant suggests that the mantra "extraordinary claims need extraordinary evidence" may not apply, as winning the lottery is a common occurrence, yet they express uncertainty about calculating conditional probabilities.
  • Some participants discuss the independence of events, noting that the lottery outcome and the newspaper report are not independent, while the lottery being held and the newspaper being published are independent.
  • Several participants engage in applying Bayes' theorem, discussing the components of the formula and the challenges in estimating the probability of the data.
  • There is a debate about the meaning of the evidence in the context of the lottery versus the medical testing example, with participants expressing confusion over how to relate the two scenarios.
  • One participant emphasizes that the probability of the newspaper reporting a win is closely tied to the actual probability of winning, suggesting a high dependency between the two events.

Areas of Agreement / Disagreement

Participants do not reach a consensus on the application of Bayes' theorem to the lottery example, with differing views on the independence of events and the interpretation of probabilities. The relationship between the lottery and medical testing examples remains contested, with some participants expressing confusion and others attempting to clarify the distinctions.

Contextual Notes

Participants highlight limitations in estimating the probability of the newspaper report and the conditional probabilities involved. The discussion reflects uncertainty regarding the assumptions made in both the lottery and medical testing scenarios, particularly in how they relate to the application of Bayes' theorem.

Who May Find This Useful

This discussion may be of interest to those exploring Bayesian reasoning, probability theory, and the interpretation of statistical evidence in real-world scenarios, particularly in contexts involving rare events and conditional probabilities.

  • #31
haushofer said:
I regard the a priori probability ##P(H)## that unicorns exist as 1 in ten million, and I regard the trustworthiness of the person as ##P(D|H)=0.98##; if unicorns actually exist, the probability is 98% that this person saw it right. Can I now similarly conclude that it is highly likely that unicorns actually exist, i.e. ##P(H|D) \approx 1##? I'd say no, but I can't pinpoint why your reasoning would fail for the unicorn case.
His reasoning doesn’t fail, but he made an unstated assumption about ##P(D|\neg H)##. That assumption is the most reasonable one for the lottery, but perhaps not for the unicorn. You should evaluate ##P(D|\neg H)## for both cases and see how they differ.
 
  • #32
haushofer said:
Imagine someone is claiming that he saw a unicorn. I know, unicorns are not seen every day, while people win lotteries every day, but imagine that I put the same numbers on this case: I regard the a priori probability ##P(H)## that unicorns exist as 1 in ten million, and I regard the trustworthiness of the person as ##P(D|H)=0.98##; if unicorns actually exist, the probability is 98% that this person saw it right. Can I now similarly conclude that it is highly likely that unicorns actually exist, i.e. ##P(H|D) \approx 1##? I'd say no, but I can't pinpoint why your reasoning would fail for the unicorn case.

Well, you should work through the unicorn case. The Bayesian rule is:

##P(H|D) = \frac{P(D|H) P(H)}{P(D)}##

If we assume that the data is very accurate, then ##P(D|H) \approx 1##, and so it reduces approximately to:

##P(H|D) \approx \frac{P(H)}{P(D)}##

We can also write: ##P(D) = P(D|H) P(H) + P(D|\neg H) P(\neg H)##

If we assume that ##P(D|\neg H) \gg P(H)##, and ##P(\neg H) \approx 1## then we can make another approximation:

##P(D) \approx P(D|\neg H)##

So we have: ##P(H|D) \approx \frac{P(H)}{P(D|\neg H)}##

The difference between the unicorn case and the lottery case is the relative size of ##P(D|\neg H)##
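
As a rough numerical check of the formulas above, here is a minimal Python sketch. The function name `posterior` and the values chosen for ##P(D|\neg H)## are illustrative assumptions, not numbers given in this post: for the lottery it assumes a wrong report is spread evenly over the other tickets, and for the unicorn it assumes a false sighting is claimed 2% of the time.

```python
def posterior(prior, p_d_given_h, p_d_given_not_h):
    """Exact Bayes: P(H|D) = P(D|H) P(H) / P(D), with P(D) from total probability."""
    p_d = p_d_given_h * prior + p_d_given_not_h * (1.0 - prior)
    return p_d_given_h * prior / p_d

prior = 1e-7  # same prior for both cases, as in the quoted question

# Lottery (assumed): a wrong report lands on one specific losing ticket
# with probability 0.02 / (N - 1), which is far smaller than the prior.
n_tickets = 10_000_000
lottery = posterior(prior, 0.98, 0.02 / (n_tickets - 1))

# Unicorn (assumed): a false sighting is claimed 2% of the time,
# which is far larger than the prior.
unicorn = posterior(prior, 0.98, 0.02)

# The shortcut P(H|D) ~ P(H) / P(D|not H) only holds when P(D|not H) >> P(H),
# so it is applied to the unicorn case only.
unicorn_shortcut = prior / 0.02

print(f"lottery posterior: {lottery:.4f}")
print(f"unicorn posterior: {unicorn:.3e}")
print(f"unicorn shortcut:  {unicorn_shortcut:.3e}")
```

Under these assumed numbers the lottery posterior stays near the paper's reliability, while the unicorn posterior stays tiny, and the only input that differs between the two calls is ##P(D|\neg H)##.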
 
  • #33
There is a slight variant of the lottery example that makes a huge difference in the computed probability. Instead of the newspaper reporting the lottery winner with 98% accuracy, assume that there is a website where you can enter your ticket number, and it tells you, with 98% accuracy, whether you are the winner or not. Even though that sounds similar to the original problem, in this case, it's much more likely that the website is in error than that you are the actual winner.
 
  • #34
Dale said:
I agree. Fairness and random errors are the best assumptions for this problem.

It is just that your post sounded to me like you were trying to claim that the posterior odds must be 49:1 regardless of any other assumptions, simply because that is the rate of accurate calls by the paper.

My point is that your outcome requires more than just the newspaper accuracy, it also requires an assumption about ##P(D|\neg H)##. That assumption is the most natural one in my opinion and probably what was intended, but it is an additional assumption nonetheless.

Yes, the book I quoted is vague about this statement and I was a bit sloppy :P
 
  • #35
stevendaryl said:
Well, you should work through the unicorn case. The Bayesian rule is:

##P(H|D) = \frac{P(D|H) P(H)}{P(D)}##

If we assume that the data is very accurate, then ##P(D|H) \approx 1##, and so it reduces approximately to:

##P(H|D) \approx \frac{P(H)}{P(D)}##

We can also write: ##P(D) = P(D|H) P(H) + P(D|\neg H) P(\neg H)##

If we assume that ##P(D|\neg H) \gg P(H)##, and ##P(\neg H) \approx 1## then we can make another approximation:

##P(D) \approx P(D|\neg H)##

So we have: ##P(H|D) \approx \frac{P(H)}{P(D|\neg H)}##

The difference between the unicorn case and the lottery case is the relative size of ##P(D|\neg H)##
Yes, I will consider that term more closely.

I must say, working through these examples and the help here is really enlightening!
 
  • #36
stevendaryl said:
There is a slight variant of the lottery example that makes a huge difference in the computed probability. Instead of the newspaper reporting the lottery winner with 98% accuracy, assume that there is a website where you can enter your ticket number, and it tells you, with 98% accuracy, whether you are the winner or not. Even though that sounds similar to the original problem, in this case, it's much more likely that the website is in error than that you are the actual winner.
I have to think about that one.
 
  • #37
I think I worked out the lottery now, making some extra assumptions about the report. Maybe I'll put it here for completeness.

stevendaryl said:
There is a slight variant of the lottery example that makes a huge difference in the computed probability. Instead of the newspaper reporting the lottery winner with 98% accuracy, assume that there is a website where you can enter your ticket number, and it tells you, with 98% accuracy, whether you are the winner or not. Even though that sounds similar to the original problem, in this case, it's much more likely that the website is in error than that you are the actual winner.
Could you explain this in more detail?
 
  • #38
haushofer said:
I think I worked out the lottery now, making some extra assumptions about the report. Maybe I'll put it here for completeness. Could you explain this in more detail?

Imagine the following process:
  1. You call the lottery office.
  2. They spin a dial to get a real number between 0 and 1.
  3. If the number is less than 0.98, they tell you the truth about whether you won the lottery or not.
  4. If the number is between 0.98 and 1, they lie to you about it.
Now, you call up the office and ask whether you won the lottery, or not. Before you even talk to anyone, there are 4 possibilities:
  1. You won the lottery and they tell you the truth. The probability of this is ##10^{-10} \cdot 0.98##
  2. You won the lottery and they lie to you: The probability of this is ##10^{-10} \cdot 0.02##
  3. You did not win the lottery, and they tell you the truth. The probability of this is: ##(1-10^{-10}) \cdot 0.98##
  4. You did not win the lottery, and they tell you a lie. The probability of this is: ##(1-10^{-10}) \cdot 0.02##
After you talk to someone and they tell you that you won, you can eliminate possibilities 2 and 3. So that leaves 1 and 4. 4 is much more likely than 1.
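
The four possibilities above can be tabulated directly. This is a small Python sketch using the numbers from the list (a prior of ##10^{-10}## and a 98% chance of being told the truth); the variable names are made up for illustration.

```python
p_win = 1e-10    # prior probability of winning, as used in the list above
p_truth = 0.98   # probability the office tells the truth

# Joint probabilities of the four possibilities.
win_truth = p_win * p_truth              # 1: you are told "you won"
win_lie = p_win * (1 - p_truth)          # 2: you are told "you lost"
lose_truth = (1 - p_win) * p_truth       # 3: you are told "you lost"
lose_lie = (1 - p_win) * (1 - p_truth)   # 4: you are told "you won"

# Being told "you won" leaves only possibilities 1 and 4.
p_won_given_told_won = win_truth / (win_truth + lose_lie)

print(f"P(won | told 'you won') = {p_won_given_told_won:.2e}")
```

With these inputs the conditional probability comes out on the order of ##10^{-9}##, so a lie is overwhelmingly the more likely explanation for hearing "you won".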
 
  • #39
The newspaper equivalent would be "if you don't win and they print a wrong name, they always print your name".
 
  • #40
stevendaryl said:
Imagine the following process:
  1. You call the lottery office.
  2. They spin a dial to get a real number between 0 and 1.
  3. If the number is less than 0.98, they tell you the truth about whether you won the lottery or not.
  4. If the number is between 0.98 and 1, they lie to you about it.
Now, you call up the office and ask whether you won the lottery, or not. Before you even talk to anyone, there are 4 possibilities:
  1. You won the lottery and they tell you the truth. The probability of this is ##10^{-10} \cdot 0.98##
  2. You won the lottery and they lie to you: The probability of this is ##0^{-10} \cdot 0.02##
  3. You did not win the lottery, and they tell you the truth. The probability of this is: ##(1-10^{-10}) \cdot 0.98##
  4. You did not win the lottery, and they tell you a lie. The probability of this is: ##(1-10^{-10}) \cdot 0.02##
After you talk to someone and they tell you that you won, you can eliminate possibilities 2 and 3. So that leaves 1 and 4. 4 is much more likely than 1.
Thanks, that makes sense. (you missed a 1 in option 2, btw; ##0^{-10} \rightarrow 10^{-10}## ;) )

I must say I like this lottery stuff; it shows the subtleties involved in these calculations.
 
  • #41
So let me write down how I worked out this example. I assumed three things:

* (1) the lottery is fair, so a priori every number has the same probability of winning
* (2) every winning lottery number has the same probability of being printed
* (3) every losing lottery number has the same probability of being printed

So there are no biases concerning rightly or wrongly printing. Now imagine the lottery numbers range from 1 to 10,000,000. We (as PF-forum, happy to share) drew number 42. And indeed, the newspaper reports that number 42 won!

The a priori probability that lottery number ##x## wins is denoted ##P(x)##, and the event that the newspaper reports number ##y## as the winner is denoted ##\text{paper:}y##. So we have, according to (1), ##P(42)=\frac{1}{10000000}=10^{-7}##. Also, a reliability of 98% means that ##P(\text{paper:}42 | 42)=0.98##. And finally, from (3), we can deduce that ##P(\text{paper:}42 | \neg 42) = \frac{0.02}{9999999}##: each of the 9,999,999 losing numbers has the same probability of being printed wrongly. If I put these numbers into Bayes' theorem, I get

##P(42|\text{paper:}42) = \frac{P(\text{paper:}42|42) P(42)}{P(\text{paper:}42|42) P(42) + P(\text{paper:}42|\neg 42)P(\neg 42)}##

which gives me as answer

##P(42|\text{paper:}42) = 0.9998 = 99.98\%##

The only thing I'm not sure of is whether it makes sense that this number exceeds the initial reliability of 98%. But this should be right, right?

-edit: resolved, made an arithmetic error in converting percentages.
 
  • #42
mfb said:
The newspaper equivalent would be "if you don't win and they print a wrong name, they always print your name".

Yes. So I guess that's why

##P(paper:42 | \neg 42) = \frac{0.02}{9999999}##

instead of

##P(paper:42 | \neg 42) = 0.02 ##.
 
  • #43
Assume you only have one ticket and there is only one winner/number published...

If your number is in the paper your odds of having won = 49/50.

But it's also possible you won and the paper made a mistake = 1/10m x 1/50

Just add these together.
 
  • #44
haushofer said:
So let me write down how I worked out this example. I assumed three things:

* (1) the lottery is fair, so a priori every number has the same probability of winning
* (2) every winning lottery number has the same probability of being printed
* (3) every losing lottery number has the same probability of being printed

So there are no biases concerning rightly or wrongly printing. Now imagine the lottery numbers range from 1 to 10,000,000. We (as PF-forum, happy to share) drew number 42. And indeed, the newspaper reports that number 42 won!

The a priori probability that lottery number ##x## wins is denoted ##P(x)##, and the event that the newspaper reports number ##y## as the winner is denoted ##\text{paper:}y##. So we have, according to (1), ##P(42)=\frac{1}{10000000}=10^{-7}##. Also, a reliability of 98% means that ##P(\text{paper:}42 | 42)=0.98##. And finally, from (3), we can deduce that ##P(\text{paper:}42 | \neg 42) = \frac{0.02}{9999999}##: each of the 9,999,999 losing numbers has the same probability of being printed wrongly. If I put these numbers into Bayes' theorem, I get

##P(42|\text{paper:}42) = \frac{P(\text{paper:}42|42) P(42)}{P(\text{paper:}42|42) P(42) + P(\text{paper:}42|\neg 42)P(\neg 42)}##

which gives me as answer

##P(42|\text{paper:}42) = 0.9998 = 99.98\%##

The only thing I'm not sure of is whether it makes sense that this number exceeds the initial reliability of 98%. But this should be right, right?

It can't be as high as that if the newspaper is only 98% accurate.
 
  • #45
PeroK said:
It can't be as high as that if the newspaper is only 98% accurate.

Mmmm, so where do you think the mistake is, then?
 
  • #46
haushofer said:
Mmmm, so where do you think the mistake is, then?
You lost me, I'm afraid. If we let A be "you win the lottery" and B be "the newspaper prints your number", then ##P(A) = P(B) = 1/N##, where N is the number of lottery tickets. This, as others have pointed out, depends on the basic assumptions of no bias.

Then, Bayes' theorem reduces to:

##P(A|B) = P(B|A) = 0.98##
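
The step ##P(B) = 1/N## can be spelled out with the law of total probability. This is a sketch that assumes a wrong report is spread uniformly over the ##N-1## losing tickets, which is one way to read the no-bias assumption:

##P(B) = P(B|A) P(A) + P(B|\neg A) P(\neg A) = 0.98 \cdot \frac{1}{N} + \frac{0.02}{N-1} \cdot \frac{N-1}{N} = \frac{1}{N}##

##P(A|B) = \frac{P(B|A) P(A)}{P(B)} = \frac{0.98 \cdot 1/N}{1/N} = 0.98##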

Another approach, which I would recommend, is to use a probability tree.
 
  • #47
... for example, if we complicate matters by the assumption that sometimes the paper prints an invalid lottery number. Say, 0.5% of the time. Then 1.5% of the time it prints a valid but incorrect winning number. And 98% of the time it prints the correct winning number.

A probability tree is ideal for handling that and avoids the difficulties of the Bayes formula as the number of options increases.
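
A probability tree for this variant can be sketched in a few lines of Python. The number of tickets and the assumption that a valid-but-wrong number is uniform over the losing tickets are not stated in this post; they are filled in here only to make the branches concrete.

```python
from fractions import Fraction

n = 10_000_000                       # assumed number of tickets
p_my_ticket_wins = Fraction(1, n)

p_correct = Fraction(98, 100)        # paper prints the true winning number
p_valid_wrong = Fraction(15, 1000)   # paper prints a valid but wrong number
p_invalid = Fraction(5, 1000)        # paper prints a number nobody holds
assert p_correct + p_valid_wrong + p_invalid == 1

# Branches of the tree that end in "paper prints my number".
# Assumption: a valid-but-wrong number is uniform over the n - 1 losing tickets.
won_and_printed = p_my_ticket_wins * p_correct
lost_and_printed = (1 - p_my_ticket_wins) * p_valid_wrong * Fraction(1, n - 1)

p_won_given_printed = won_and_printed / (won_and_printed + lost_and_printed)
print(p_won_given_printed, float(p_won_given_printed))
```

Under these assumptions the invalid-number branch never ends at your ticket, so it drops out, and the posterior rises slightly above 0.98 to 0.98/0.995, about 0.985.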
 
  • #48
PeroK said:
You lost me, I'm afraid.
haushofer said:
Mmmm, so where do you think the mistake is, then?
Never mind, made an arithmetic error. The probability is just 0.98 if you fill in the numbers:

##P(42|\text{paper:}42) = 0.98 = 98\%##

Intuitively, I understand it as the probability

##P(paper:42 | \neg 42) = \frac{0.02}{9999999}##

being too small to affect the initial reliability of ##P(paper:x | x)=0.98## of the newspaper.
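
For anyone who wants to check the arithmetic, here is a short Python sketch that fills in the numbers with exact fractions. The function name `newspaper_posterior` is made up, and it uses the three assumptions from post #41 (fair lottery, uniform wrong prints).

```python
from fractions import Fraction

def newspaper_posterior(n_tickets, reliability=Fraction(98, 100)):
    """P(my number won | paper prints my number), for uniform wrong prints."""
    prior = Fraction(1, n_tickets)
    p_print_if_won = reliability
    # A wrong print is spread evenly over the n_tickets - 1 losing numbers.
    p_print_if_lost = (1 - reliability) / (n_tickets - 1)
    evidence = p_print_if_won * prior + p_print_if_lost * (1 - prior)
    return p_print_if_won * prior / evidence

for n in (10, 1000, 10_000_000):
    print(n, newspaper_posterior(n))
```

Each line prints 49/50, so under these assumptions the result is exactly 0.98 for any number of tickets. The wrong-print term is small, but it supplies the remaining 2% of the denominator; if it were zero, the posterior would be 1.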
 