I Bayes, Hume, lotteries and trustworthy newspapers

  • Thread starter haushofer
In a lottery scenario with a winning probability of 1 in 10 million, a skeptical participant questions the reliability of a newspaper that reports their win, given its historical error rate of 2%. The discussion revolves around applying Bayes' theorem to calculate the conditional probability of actually winning, considering the newspaper's accuracy. The key point is that while the newspaper's error rate is significant, the probability of it correctly reporting a win is much higher, leading to a complex interplay of probabilities. Ultimately, the conversation highlights the differences between lottery outcomes and medical testing scenarios, emphasizing the need for careful consideration of how evidence updates beliefs about improbable events. The conclusion suggests that the participant's trust in the newspaper's report hinges on understanding these probabilistic relationships.
  • #31
haushofer said:
I regard the a priori probability P(H) that unicorns exist as 1 in ten million, and I regard the trustworthiness of the person as P(D|H)=0.98; if unicorns actually exist, the probability is 98% that this person saw it right. Can I now similarly conclude that it is highly likely that unicorns actually exist, i.e. P(H|D)≈1? I'd say no, but I can't pinpoint why your reasoning would fail for the unicorn case.
His reasoning doesn’t fail, but he made an unstated assumption about ##P(D|\neg H)##. That assumption is the most reasonable one for the lottery, but perhaps not for the unicorn. You should evaluate ##P(D|\neg H)## for both cases and see how they differ.
 
  • #32
haushofer said:
Imagine someone is claiming that he saw a unicorn. I know, unicorns are not seen every day, while people win lotteries every day, but imagine that I put the same numbers on this case: I regard the a priori probability P(H) that unicorns exist as 1 in ten million, and I regard the trustworthiness of the person as P(D|H)=0.98; if unicorns actually exist, the probability is 98% that this person saw it right. Can I now similarly conclude that it is highly likely that unicorns actually exist, i.e. ##P(H|D) \approx 1##? I'd say no, but I can't pinpoint why your reasoning would fail for the unicorn case.

Well, you should work through the unicorn case. The Bayesian rule is:

##P(H|D) = \frac{P(D|H) P(H)}{P(D)}##

If we assume that the data is very accurate, then ##P(D|H) \approx 1##, and so it reduces approximately to:

##P(H|D) \approx \frac{P(H)}{P(D)}##

We can also write: ##P(D) = P(D|H) P(H) + P(D|\neg H) P(\neg H)##

If we assume that ##P(D|\neg H) \gg P(H)##, and ##P(\neg H) \approx 1## then we can make another approximation:

##P(D) \approx P(D|\neg H)##

So we have: ##P(H|D) \approx \frac{P(H)}{P(D|\neg H)}##

The difference between the unicorn case and the lottery case is the relative size of ##P(D|\neg H)##
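
To make the role of ##P(D|\neg H)## concrete, here is a minimal Python sketch that evaluates the exact Bayes rule for both cases. The prior and ##P(D|H)## are the numbers from the question; the two values used for ##P(D|\neg H)## are illustrative assumptions only (an error spread evenly over all other tickets for the lottery, a flat 2% false-sighting rate for the unicorn), and the helper name `posterior` is hypothetical.

```python
def posterior(prior, p_d_given_h, p_d_given_not_h):
    """Exact Bayes rule: P(H|D) = P(D|H) P(H) / P(D)."""
    evidence = p_d_given_h * prior + p_d_given_not_h * (1.0 - prior)
    return p_d_given_h * prior / evidence


prior = 1e-7        # P(H), the same in both cases
p_d_given_h = 0.98  # P(D|H), the same in both cases

# Assumed values of P(D|not H), for illustration only.
lottery_false_report = 0.02 / (10_000_000 - 1)  # misprint spread over all other tickets
unicorn_false_report = 0.02                     # witness is simply wrong 2% of the time

lottery_posterior = posterior(prior, p_d_given_h, lottery_false_report)
unicorn_posterior = posterior(prior, p_d_given_h, unicorn_false_report)

print(f"lottery: P(H|D) = {lottery_posterior:.4f}")
print(f"unicorn: P(H|D) = {unicorn_posterior:.2e}")
```

Under these assumptions the same prior and the same ##P(D|H)## give a posterior near 0.98 in one case and a tiny one in the other; only ##P(D|\neg H)## differs.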
 
  • #33
There is a slight variant of the lottery example that makes a huge difference in the computed probability. Instead of the newspaper reporting the lottery winner with 98% accuracy, assume that there is a website where you can enter your ticket number, and it tells you, with 98% accuracy, whether you are the winner or not. Even though that sounds similar to the original problem, in this case, it's much more likely that the website is in error than that you are the actual winner.
 
  • #34
Dale said:
I agree. Fairness and random errors are the best assumptions for this problem.

It is just that your post sounded to me like you were trying to claim that the posterior odds must be 49:1 regardless of any other assumptions, simply because that is the rate of accurate calls by the paper.

My point is that your outcome requires more than just the newspaper accuracy, it also requires an assumption about ##P(D|\neg H)##. That assumption is the most natural one in my opinion and probably what was intended, but it is an additional assumption nonetheless.

Yes, the book I quoted is vague about this statement and I was a bit sloppy :P
 
  • #35
stevendaryl said:
The difference between the unicorn case and the lottery case is the relative size of ##P(D|\neg H)##
Yes, I will consider that term more closely.

I must say, working through these examples and the help here is really enlightening!
 
  • #36
stevendaryl said:
There is a slight variant of the lottery example that makes a huge difference in the computed probability. Instead of the newspaper reporting the lottery winner with 98% accuracy, assume that there is a website where you can enter your ticket number, and it tells you, with 98% accuracy, whether you are the winner or not. Even though that sounds similar to the original problem, in this case, it's much more likely that the website is in error than that you are the actual winner.
I have to think about that one.
 
  • #37
I think I worked out the lottery now, making some extra assumptions about the report. Maybe I'll put it here for completeness.

stevendaryl said:
There is a slight variant of the lottery example that makes a huge difference in the computed probability. Instead of the newspaper reporting the lottery winner with 98% accuracy, assume that there is a website where you can enter your ticket number, and it tells you, with 98% accuracy, whether you are the winner or not. Even though that sounds similar to the original problem, in this case, it's much more likely that the website is in error than that you are the actual winner.
Could you explain this in more detail?
 
  • #38
haushofer said:
I think I worked out the lottery now, making some extra assumptions about the report. Maybe I'll put it here for completeness. Could you explain this in more detail?

Imagine the following process:
  1. You call the lottery office.
  2. They spin a dial to get a real number between 0 and 1.
  3. If the number is less than 0.98, they tell you the truth about whether you won the lottery or not.
  4. If the number is between 0.98 and 1, they lie to you about it.
Now, you call up the office and ask whether you won the lottery, or not. Before you even talk to anyone, there are 4 possibilities:
  1. You won the lottery and they tell you the truth. The probability of this is ##10^{-10} \cdot 0.98##
  2. You won the lottery and they lie to you: The probability of this is ##0^{-10} \cdot 0.02##
  3. You did not win the lottery, and they tell you the truth. The probability of this is: ##(1-10^{-10}) \cdot 0.98##
  4. You did not win the lottery, and they tell you a lie. The probability of this is: ##(1-10^{-10}) \cdot 0.02##
After you talk to someone and they tell you that you won, you can eliminate possibilities 2 and 3. So that leaves 1 and 4. 4 is much more likely than 1.
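
The four cases can be tabulated exactly. This is a minimal sketch, assuming the win probability is ##10^{-10}## in every case (as in the list above) and using exact fractions to avoid rounding; the variable names are illustrative.

```python
from fractions import Fraction

p_win = Fraction(1, 10**10)  # prior probability of winning, as in the list above
p_truth = Fraction(98, 100)  # dial lands below 0.98: the office tells the truth
p_lie = 1 - p_truth          # dial lands in [0.98, 1): the office lies

# Joint probabilities of the four possibilities.
cases = {
    1: p_win * p_truth,        # won, told the truth  -> you hear "you won"
    2: p_win * p_lie,          # won, lied to         -> you hear "you lost"
    3: (1 - p_win) * p_truth,  # lost, told the truth -> you hear "you lost"
    4: (1 - p_win) * p_lie,    # lost, lied to        -> you hear "you won"
}
assert sum(cases.values()) == 1

# Hearing "you won" eliminates cases 2 and 3, leaving 1 and 4.
p_won_given_told_won = cases[1] / (cases[1] + cases[4])

print(f"P(won | told 'you won') = {float(p_won_given_told_won):.2e}")
print(f"case 4 is {float(cases[4] / cases[1]):.2e} times more likely than case 1")
```

The posterior is tiny because a lie to a loser always produces the message "you won", so the full 2% error rate competes against the very small prior.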
 
  • #39
The newspaper equivalent would be "if you don't win and they print a wrong name, they always print your name".
 
  • #40
stevendaryl said:
  2. You won the lottery and they lie to you: The probability of this is ##0^{-10} \cdot 0.02##
Thanks, that makes sense. (you missed a 1 in option 2, btw; ##0^{-10} \rightarrow 10^{-10}## ;) )

I must say I like this lottery stuff; it shows the subtleties involved in these calculations.
 
  • #41
So let me write down how I worked out this example. I assumed three things:

* (1) the lottery is fair, so a priori every number has the same probability of winning
* (2) every winning lottery number has the same probability of being printed
* (3) every losing lottery number has the same probability of being printed

So there are no biases concerning rightly or wrongly printing. Now imagine the lottery numbers range from 1 to 10,000,000. We (as PF-forum, happy to share) drew number 42. And indeed, the newspaper reports that number 42 won!

The a priori probability that lottery number ##x## wins is denoted as ##P(x)##, and the probability that the newspaper reports that number ##y## won is denoted as ##P(paper:y)##. So we have, according to (1), ##P(42)=\frac{1}{10000000}=10^{-7}##. Also, a reliability of 98% means that ##P(paper:42 | 42)=0.98##. And finally, from (3), we can deduce that ##P(paper:42 | \neg 42) = \frac{0.02}{9999999}##: if 42 did not win, the paper prints a wrong number 2% of the time, and each of the 9,999,999 numbers other than the true winner has the same probability of being the one printed wrongly. If I put these numbers into Bayes, I get

##P(42|\text{paper:}42) = \frac{P(\text{paper:}42|42) P(42)}{P(\text{paper:}42|42) P(42) + P(\text{paper:}42|\neg 42)P(\neg 42)}##

which gives me as answer

##P(42|\text{paper:}42) = 0.9998 = 99.98\%##

The only thing I'm not sure of is whether it makes sense that this number exceeds the initial reliability of 98%. But this should be right, right?

-edit: resolved, made an arithmetic error in converting percentages.
 
  • #42
mfb said:
The newspaper equivalent would be "if you don't win and they print a wrong name, they always print your name".

Yes. So I guess that's why

##P(paper:42 | \neg 42) = \frac{0.02}{9999999}##

instead of

##P(paper:42 | \neg 42) = 0.02 ##.
 
  • #43
Assume you only have one ticket and there is only one winner/number published...

If your number is in the paper your odds of having won = 49/50.

But it's also possible you won and the paper made a mistake = 1/10m x 1/50

Just add these together.
 
  • #44
haushofer said:
##P(42|\text{paper:}42) = \frac{P(\text{paper:}42|42) P(42)}{P(\text{paper:}42|42) P(42) + P(\text{paper:}42|\neg 42)P(\neg 42)}##

which gives me as answer

##P(42|\text{paper:}42) = 0.9998 = 99.98\%##

The only thing I'm not sure of is whether it makes sense that this number exceeds the initial reliability of 98%. But this should be right, right?

It can't be as high as that if the newspaper is only 98% accurate.
 
  • #45
PeroK said:
It can't be as high as that if the newspaper is only 98% accurate.

Mmmm, so where do you think the mistake is, then?
 
  • #46
haushofer said:
Mmmm, so where do you think the mistake is, then?
You lost me, I'm afraid. If we let A be you win the lottery and B the newspaper prints your number, then ##P(A) = P(B) = 1/N##, where N is the number of lottery tickets. This, as others have pointed out, depends on the basic assumptions of no bias.

Then, Bayes' theorem reduces to:

##P(A|B) = P(B|A) = 0.98##
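
As a sanity check on ##P(A|B) = P(B|A)##, the no-bias setup can be simulated. This is only a sketch: the small lottery of 100 tickets, the trial count, the fixed seed and the function name `simulate` are all assumptions chosen to keep the run short, and a misprint is taken to be uniform over the losing numbers.

```python
import random


def simulate(n_tickets=100, accuracy=0.98, trials=200_000, seed=0):
    """Estimate P(I won | paper prints my number) by direct simulation."""
    rng = random.Random(seed)
    my_ticket = 0
    printed_mine = 0
    printed_mine_and_won = 0
    for _ in range(trials):
        winner = rng.randrange(n_tickets)  # fair lottery
        if rng.random() < accuracy:
            printed = winner               # paper is right
        else:
            # Paper is wrong: print one of the losing numbers, uniformly.
            printed = rng.randrange(n_tickets - 1)
            if printed >= winner:
                printed += 1
        if printed == my_ticket:
            printed_mine += 1
            if winner == my_ticket:
                printed_mine_and_won += 1
    return printed_mine_and_won / printed_mine


estimate = simulate()
print(f"estimated P(A|B) = {estimate:.3f}")
```

The estimate should land close to the paper's accuracy, independent of the number of tickets, because the printed number and the winning number are each uniform over the tickets.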

Another approach, which I would recommend, is to use a probability tree:
 
  • #47
... for example, if we complicate matters by the assumption that sometimes the paper prints an invalid lottery number. Say, 0.5% of the time. Then 1.5% of the time it prints a valid but incorrect winning number. And 98% of the time it prints the correct winning number.

A probability tree is ideal for handling that and avoids the difficulties of the Bayes formula as the number of options increases.
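
A sketch of that tree in Python, using exact fractions. Two things here are assumptions beyond the post: the lottery has 10,000,000 tickets as earlier in the thread, and a valid-but-incorrect number is drawn uniformly from the losing tickets. Only the branches that end with "the paper prints my number" are needed.

```python
from fractions import Fraction

n = 10_000_000                       # number of tickets (assumed, as earlier in the thread)
p_correct = Fraction(980, 1000)      # paper prints the true winning number
p_valid_wrong = Fraction(15, 1000)   # paper prints a valid but incorrect number
p_invalid = Fraction(5, 1000)        # paper prints a number that is not a ticket at all
assert p_correct + p_valid_wrong + p_invalid == 1

p_win = Fraction(1, n)

# Branch: I win and the paper prints the correct number (mine).
win_and_printed = p_win * p_correct

# Branch: I lose and the paper prints a valid wrong number that happens to be mine.
# Assumption: the wrong number is uniform over the n - 1 losing tickets.
lose_and_printed = (1 - p_win) * p_valid_wrong * Fraction(1, n - 1)

# The invalid-number branch can never show my number, so it contributes nothing.
posterior = win_and_printed / (win_and_printed + lose_and_printed)
print(posterior, float(posterior))
```

Under these assumptions the posterior is 0.98/0.995, slightly above 98%, because seeing a valid number in print already rules out the invalid-number branch.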
 
  • #48
PeroK said:
You lost me, I'm afraid.
haushofer said:
Mmmm, so where do you think the mistake is, then?
Never mind, made an arithmetic error. The probability is just 0.98 if you fill in the numbers:

##P(42|\text{paper:}42) = 0.98 = 98\%##

Intuitively, I understand it as the probability

##P(paper:42 | \neg 42) = \frac{0.02}{9999999}##

being too small to affect the initial reliability of ##P(paper:x | x)=0.98## of the newspaper.
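
For anyone who wants to check the arithmetic, here is a minimal sketch that fills the numbers from this thread into Bayes' rule with exact fractions (the variable names are just for illustration):

```python
from fractions import Fraction

n = 10_000_000                                   # lottery numbers 1..n
prior = Fraction(1, n)                           # P(42)
p_print_given_win = Fraction(98, 100)            # P(paper:42 | 42)
p_print_given_lose = Fraction(2, 100) / (n - 1)  # P(paper:42 | not 42)

numerator = p_print_given_win * prior
evidence = numerator + p_print_given_lose * (1 - prior)  # P(paper:42)
posterior = numerator / evidence                         # P(42 | paper:42)

print(posterior, float(posterior))
```

The two terms in the denominator are 0.98/n and 0.02/n, so the evidence is exactly 1/n and the posterior is exactly 49/50.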
 