Specific question re Bayesian statistics/analysis

phinds · Oct 11, 2018

I am reading "Thinking Fast and Slow" (fantastic book by the way) and I ran across a statement that has me flummoxed. The justification for the statement was said to be "Bayesian analysis" so I looked into that and frankly it's just more than I want to get into so I'm wondering if someone can give me some English language insight into the justification for the statement.

I realize that the answer may well be "Hey, guy, you've just got to learn the math if you really want to understand it" and if so, then so be it, but I thought I'd take a shot.

SO

1) 85% of all cars are blue and 15% are green
2) A witness to an accident says he saw the car involved and it was green.
3) The witness is known to be 80% accurate

Now, what I would take away from that is that the preponderance of blue cars would certainly lower the probability that the car in the accident really was green from 80% to more like maybe 65% or 70%. BUT ... the book says (based on Bayesian statistics) it's 41%. I just can't see how it could be cut in half like that and would appreciate any insight anyone can give me.

Thanks

jedishrfu · Oct 11, 2018

To get things going:

Veritasium talks about Bayesian interference, I mean inference.

mjc123 · Oct 11, 2018

Calculate the probabilities for the four cases:
Car was blue, witness says blue
Car was blue, witness says green
Car was green, witness says blue
Car was green, witness says green
(You have to make the assumption that the reliability of the witness is independent of the car colour.)
The probability (given that the witness says green) that it was green is
P(green, says green)/{P(green, says green) + P(blue, says green)}
The result of 41% is indeed surprising, but correct.
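One way to see the four cases without any formulas is to count them for a made-up batch of 100 accidents. The sketch below is only an illustration: the batch size of 100 and the variable names are assumptions, and it relies on the same independence assumption stated above.

```python
# Count the four cases for a hypothetical batch of 100 accidents.
TOTAL = 100
green = 15             # 15% of cars are green
blue = TOTAL - green   # 85% are blue

# The witness is right 80% of the time, whatever the true colour.
blue_says_blue = blue * 80 // 100            # blue cars reported correctly
blue_says_green = blue - blue_says_blue      # blue cars misreported as green
green_says_green = green * 80 // 100         # green cars reported correctly
green_says_blue = green - green_says_green   # green cars misreported as blue

# Only the "says green" reports matter for the question asked.
says_green = green_says_green + blue_says_green
p_green_given_says_green = green_says_green / says_green

print(f"{green_says_green} of {says_green} 'green' reports are right: "
      f"{p_green_given_says_green:.1%}")
```

The 17 mistaken "green" reports about blue cars outnumber the 12 correct ones about green cars, simply because there are so many more blue cars to be wrong about.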

kuruman · Oct 11, 2018

This is how I think it was calculated
$$P=\frac{0.8\times 0.15}{0.8\times 0.15+0.2\times 0.85}=0.414$$
The numerator is the probability that the car is green and he says "green". The denominator is the total probability that he says "green", whether the car he saw was actually green or blue.

On Edit: @mjc123 beat me to the same answer.

On second edit: The result may be surprising, but if one thinks about it for a moment, the probability that a "green" report is right should diminish as the percentage of green cars diminishes. To take it to the extreme, if the cars are 99% blue and the witness is correct 99% of the time, the probability that a car identified as green really is green is only 50%. Under the same assumptions, if the witness identifies the car as blue, the probability of correct identification is about 99.99%. Apparently, there is a surprising and counter-intuitive asymmetry in the witness's ability to identify car colors. Are lawyers aware of it?
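The same calculation can be wrapped in a small function to try other numbers, including the 99%/99% extreme. This is a rough sketch, not anything from the book: the function name `posterior` and its parameter names are made up for illustration, and it assumes two colours and a witness who is equally accurate for both.

```python
def posterior(prior: float, accuracy: float) -> float:
    """P(car really has the reported colour | witness reports that colour).

    prior    -- fraction of all cars that have the reported colour
    accuracy -- chance the witness names a colour correctly,
                assumed to be the same for both colours
    """
    right = accuracy * prior               # that colour, reported correctly
    wrong = (1 - accuracy) * (1 - prior)   # other colour, misreported
    return right / (right + wrong)


print(f"{posterior(0.15, 0.80):.3f}")   # the book's example, witness says green
print(f"{posterior(0.01, 0.99):.4f}")   # 99% blue, 99% accurate, says green
print(f"{posterior(0.99, 0.99):.4f}")   # same witness, but says blue
```

With these inputs the three results come out at roughly 0.414, 0.5 and a little above 0.999, so the same witness is far more believable when reporting the common colour than the rare one.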

Dale · Oct 11, 2018

phinds said:

1) 85% of all cars are blue and 15% are green
2) A witness to an accident says he saw the car involved and it was green.
3) The witness is known to be 80% accurate

Now, what I would take away from that is that the preponderance of blue cars would certainly lower the probability that the car in the accident really was green from 80% to more like maybe 65% or 70%. BUT ... the book says (based on Bayesian statistics) it's 41%. I just can't see how it could be cut in half like that

The thing is that the Bayesian approach looks at this the other way around. We are not starting with 80% belief and then decreasing that due to the fraction of green cars, instead we are starting with the 15% frequency of green cars. That is our so-called "prior", in other words without the witness statement we believe that it is 15% likely that the car was green. With no specific data on this particular event, all we can rely on is general knowledge, like the frequency of green cars.

Then, starting from that 15% generic prior belief, we acquire new data about this specific case. In this case the new data is in the form of a reliable witness. Because that witness is known to be quite reliable, after receiving his report we almost triple our belief that the car was green. His statement brings our belief up from 15% to 41%. Therefore, our prior belief is 15%, before hearing the witness, and then after hearing the witness our posterior belief increases to 41%, as it should when receiving information from a reliable witness.

The Bayesian approach is all about updating your prior belief in the face of new evidence. So the first thing that you need to do is to identify what the prior belief is. That is the thing that gets modified. The accuracy of the test is 80%, and that is unchanged in this procedure. What is changed is our belief that the car is green, and that increases dramatically.
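The "start from the prior, then update" idea can also be written in odds form, which is an equivalent restatement of Bayes' rule rather than the wording used in this post. In the sketch below the variable names are illustrative, and the likelihood ratio of 80:20 assumes the witness is equally accurate for both colours.

```python
from fractions import Fraction

# Belief before the witness speaks: 15 green cars for every 85 blue ones.
prior_odds = Fraction(15, 85)

# How much more likely a "green" report is when the car is green (80%)
# than when it is blue (20%).
likelihood_ratio = Fraction(80, 20)

# Updating is a single multiplication in odds form.
posterior_odds = prior_odds * likelihood_ratio

# Convert odds back to a probability.
posterior_prob = posterior_odds / (1 + posterior_odds)

print(posterior_odds)          # green : blue after the witness statement
print(posterior_prob)          # exact posterior probability
print(float(posterior_prob))   # the same value as a decimal
```

The witness multiplies the odds of green by 4, but starting from odds as low as 15:85 that only brings them to 12:17, which is 12/29, or about 41%.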

mjc123 · Oct 11, 2018

It's not the witness's ability to identify colours that is asymmetric (as I said, we assume that is independent of colour). It is the posterior reliability of the witness's testimony. If we label the 4 cases listed in post #3 as A, B, C, D respectively, then
Prior probability of correctly identifying green as green = D/(C+D)
Prior probability of correctly identifying blue as blue = A/(A+B) (assumed to be equal to the previous one)
Posterior probability that car was blue when witness says blue = A/(A+C)
Posterior probability that car was green when witness says green = D/(B+D)
You are using different probability spaces if you compare e.g. D/(B+D) with D/(C+D).
IIRC, this was essentially the basis of Hume's argument against miracles - that they were so unlikely that, however reliable the witness, P(no miracle, wrong) is much greater than P(miracle, right). (Not that I necessarily agree with him, just using the illustration.)
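The four ratios in terms of A, B, C and D can be checked numerically with the thread's figures. This is just a sketch using exact fractions; the variable names other than A, B, C, D are illustrative.

```python
from fractions import Fraction

p_blue, p_green = Fraction(85, 100), Fraction(15, 100)
acc = Fraction(80, 100)   # witness accuracy, assumed independent of colour

A = p_blue * acc          # car was blue, witness says blue
B = p_blue * (1 - acc)    # car was blue, witness says green
C = p_green * (1 - acc)   # car was green, witness says blue
D = p_green * acc         # car was green, witness says green

ratios = {
    "green identified as green, D/(C+D)": D / (C + D),
    "blue identified as blue,   A/(A+B)": A / (A + B),
    "was blue given says blue,  A/(A+C)": A / (A + C),
    "was green given says green, D/(B+D)": D / (B + D),
}

for label, value in ratios.items():
    print(f"{label} = {value} (about {float(value):.3f})")
```

The first two ratios both equal 4/5, which is the symmetric witness accuracy, while the last two differ sharply (68/71 versus 12/29) because they are conditioned on what the witness said rather than on the true colour.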

PeroK · Oct 11, 2018

phinds said:

I just can't see how it could be cut in half like that and would appreciate any insight anyone can give me.

Thanks

It wasn't cut in half. The probability the car was green before the witness statement was 15%. After the statement it went up to 41%. It was never 80% in the first place.

phinds · Oct 11, 2018

Wow. I KNEW there was some reason I liked PF

Lots of excellent answers and MUCH better ways of looking at things than I was, which was clearly backwards, so now I get it.

Thanks very much to all.

jedishrfu · Oct 11, 2018

I liked how Veritasium explained the counterintuitive result in his video. It's common sense, but you have to think about it the right way.

I still get confused, especially when I tried to explain it to my son, who was in law school at the time. They have the Defense Attorney's Fallacy and the Prosecutor's Fallacy, which are misapplications (or a lack of application) of Bayesian logic used to sway the jury.

https://en.wikipedia.org/wiki/Prosecutor's_fallacy

phinds · Oct 11, 2018

jedishrfu said:

I liked how Veritasium explained the counterintuitive result in his video. It's common sense, but you have to think about it the right way.

Yes, I agree. I think @Dale said it very well:

Dale said:

The thing is that the Bayesian approach looks at this the other way around. We are not starting with 80% belief and then decreasing that due to the fraction of green cars, instead we are starting with the 15% frequency of green cars.
.
.
.
The Bayesian approach is all about updating your prior belief in the face of new evidence. So the first thing that you need to do is to identify what the prior belief is. That is the thing that gets modified.

and of course @PeroK also pointed out my specific wrong way of looking at it:

PeroK said:

It wasn't cut in half. The probability the car was green before the witness statement was 15%. After the statement it went up to 41%. It was never 80% in the first place.
