Bayes' theorem: answer way too small, something wrong?

ilyas.h
I'm sure I followed it correctly, but my answer is unusually small...

1 in a thousand people have a disease. A company has discovered a new method for testing for the disease.
If a person has the disease, the test will return a +ve result 99% of the time.
If a person doesn't have the disease, the test will return a +ve result 2% of the time.

what is the probability of a person having the disease, given that they have a +ve result?

components:

P(D) = 0.001 (1 in 1000)
P(+ve | D) = 0.99
P(+ve | not D) = 0.02

asked to find: P(D | +ve).

Formula (Bayes' theorem):

P(D | +ve) =

P(D)P(+ve | D)
-----------------------------------------------------
P(D)P(+ve | D) + P(not D)P(+ve | not D)

If you plug in the values you get P(D | +ve) = 0.0472.

This is clearly too small: if the result was +ve, you'd expect a substantial fraction of the +ve cohort to actually have the disease. I've checked the formula 100 times and everything seems correct.
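For anyone who wants to check the arithmetic, here is a minimal Python sketch of the calculation in the post above. The function name `posterior` and its parameter names are illustrative labels, not part of the exercise.

```python
def posterior(prior, p_pos_given_d, p_pos_given_not_d):
    """P(D | +ve) from Bayes' theorem for a test with two outcomes."""
    true_pos = prior * p_pos_given_d               # P(D) P(+ve | D)
    false_pos = (1 - prior) * p_pos_given_not_d    # P(not D) P(+ve | not D)
    return true_pos / (true_pos + false_pos)


# Values taken from the problem statement.
p = posterior(prior=0.001, p_pos_given_d=0.99, p_pos_given_not_d=0.02)
print(round(p, 4))  # 0.0472
```

The numerator is 0.00099 and the denominator is 0.00099 + 0.01998 = 0.02097, so the small answer comes from the false-positive term dominating the denominator.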
 
phyzguy
Your answer is correct. Think of it this way. Suppose I test 1000 people, chosen at random. On average, 1 of these people will have the disease, and 999 will not. The test will probably (99% of the time) give a positive result for the one person who has the disease, but it will also give a positive result for 2% of the 999 people who don't have the disease. So there will be about 20 false positives (2% of 999). So, if you get a positive result, there is only about a 1/21 (4.7%) chance that you actually have the disease. This is the point of the exercise, since the result is so different from your intuition. Does this make sense?
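The head-count argument can also be written out as a short Python sketch. The cohort of 1000 and the rates come from the thread; the variable names are just illustrative. It works with expected (average) counts, so fractional people are fine.

```python
cohort = 1000                      # people tested, as in the post above
sick = cohort * 0.001              # expected number with the disease
healthy = cohort - sick            # expected number without it

true_positives = sick * 0.99       # sick people who test +ve
false_positives = healthy * 0.02   # healthy people who test +ve

# Share of all +ve results that belong to people who really are sick.
share_sick = true_positives / (true_positives + false_positives)
print(round(true_positives, 2), round(false_positives, 2))
print(round(share_sick, 4))        # 0.0472
```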

ilyas.h
Oh, I understand.

Only a small number of the thousand people will have a +ve result:

1 x 0.99 + 999 x 0.02 = [0.99] + [19.98]

So 0.99 people will have the disease and a +ve result (estimate).
19.98 people will not have the disease but still get a +ve result (estimate).

So of the +ve people:

0.99 / ([0.99] + [19.98]) = 0.0472

Is my logic correct? Thanks for your help.

PeroK
Yes, that's correct. It's interesting how unintuitive this is.

Here's another way of looking at it. Suppose no-one had the disease (or almost no-one). What would be the percentage of false positives out of all positives?

ilyas.h
Wouldn't it just be 20 in this case?

No one has the disease, so 0.02 x 1000 = 20.

The point is that 100% of the positives are false.
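To see how the share of false positives depends on how common the disease is, here is a small Python sketch that sweeps the prevalence while keeping the 99% and 2% rates from the original problem. The function name `false_share` and the particular prevalence values are arbitrary choices for illustration.

```python
def false_share(prevalence, p_pos_given_d=0.99, p_pos_given_not_d=0.02):
    """Fraction of all +ve results that come from people without the disease."""
    true_pos = prevalence * p_pos_given_d
    false_pos = (1 - prevalence) * p_pos_given_not_d
    return false_pos / (true_pos + false_pos)


# Prevalence 0.0 is the "no-one has the disease" case.
for prevalence in (0.0, 0.001, 0.01, 0.1, 0.5):
    share = false_share(prevalence)
    print(f"prevalence {prevalence}: {share:.1%} of positives are false")
```

With prevalence 0 every positive is false, and at the thread's 1 in 1000 roughly 95% still are. The share only drops substantially once the disease becomes much more common.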
 
As a side point: this is one of the reasons many older tests (for doping or drugs, say) were two-stage (or more) tests. A person is randomly selected and gives a sample; that sample is randomly split into two (for a two-stage test), and one portion is randomly selected and tested. If that test returns negative, the second portion is destroyed. If the first portion tests positive, the second portion is tested, and if it also tests positive there is a very high probability of drugs. Why? Because the only people who make it to the second round of testing are those whose first sample tested positive: a huge percentage of "clean" subjects are eliminated.
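A rough Python sketch of why a confirming second test helps. It reuses the disease-test numbers from earlier in the thread purely for illustration (they are not real doping-test figures), and it assumes the two tests err independently given the person's true status, which two portions of one sample may not satisfy. The function name `update` is hypothetical.

```python
def update(prior, p_pos_given_d=0.99, p_pos_given_not_d=0.02):
    """One Bayes update of P(D) after a +ve result."""
    true_pos = prior * p_pos_given_d
    false_pos = (1 - prior) * p_pos_given_not_d
    return true_pos / (true_pos + false_pos)


after_first = update(0.001)          # prior is the 1-in-1000 base rate
after_second = update(after_first)   # prior is now the first posterior
print(round(after_first, 3), round(after_second, 3))  # 0.047 0.71
```

Under these assumptions the probability rises from about 5% after one positive to about 71% after two, because the second test is applied only to a group in which the condition is far more common than in the general population.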
 
