# Conditional Probability Problem

1. Aug 27, 2013

### daneault23

1. The problem statement, all variables and given/known data

Suppose there are two different diagnostic tests (say test A and test B) for some disease of interest. Assume that the prevalence of this disease in a large population is 1%. Test A has a false negative rate of 10% (false negative means that the test result is negative when the test is applied to a person who has the disease). Similarly, the false negative rate of test B is 5%. The false positive rate of test A is 4% (false positive means that the test result is positive even though it is applied to a person who does not have the disease). Similarly, the false positive rate of test B is 6%. If both tests A and B are positive when administered to a person selected at random from the population, what is the probability this person has the disease?

2. Relevant equations

P(A|B)=P(A intersect B)/P(B) for conditional probability
P(B|A)=(P(A|B)*P(B))/P(A) for Bayes' Rule

3. The attempt at a solution

I have written down all of the outcomes for Tests A and B. For Test A, I have P(-|+)= Probability that the test results are negative when test is applied to someone who has the disease=.10. Similarly, I have P(+|-)=.04. Therefore P(+|+)=1-.10=.90 and P(-|-)=1-.04=.96. For Test B, I have P(-|+)=.05, P(+|-)=.06, P(+|+)=.95, and P(-|-)=.94.

I'm not sure what to do now. It seems as if you should do the following: let P(A) = the probability that a person from the population has the disease, and let P(B) = the probability that both tests A and B are positive.

I would then set up the problem like this: P(A|B)=(P(B|A)*P(A))/P(B), except that you can't figure out what P(B|A) is. Maybe I am making this too hard, but this is really stumping me.

2. Aug 28, 2013

### vela

Staff Emeritus
First, I wouldn't use the letters A and B to denote the events since you're also using them to label the tests. Instead, let's use the notation A="test A is positive," B="test B is positive," and D="has the disease." A', B', and D' will represent the complements.

P(A|D) = 0.90
P(A'|D) = 0.10 (false negative)
P(A|D') = 0.04 (false positive)
P(A'|D') = 0.96

P(B|D) = 0.95
P(B'|D) = 0.05 (false negative)
P(B|D') = 0.06 (false positive)
P(B'|D') = 0.94

You want to find
$$P(D|A \text{ and } B) = \frac{P(A \text{ and } B|D)P(D)}{P(A \text{ and } B)}.$$
So what specifically is keeping you from figuring out P(A and B|D)? I think you can assume here that the results of the tests, given that the individual has the disease, are independent of each other.

3. Aug 28, 2013

### HallsofIvy

Staff Emeritus
I would do this in a completely different way: suppose there are a total of 100,000,000 people in the population.
So a total of 1,000,000 people have the disease and 999,000,000 do not.

So 10% of the 1,000,000, that is, 100,000 will test negative with test A, even though they have the disease and 90%, 900,000, will test positive.

And 50,000 will test negative with test B even though they have the disease. Also, 5% of the 900,000 who test positive with test A, a total of 45,000, will test negative with test B. That means that 900,000 - 45,000 = 855,000 who have the disease will test positive on both tests.

So 4% of the 999,000,000, or 39,960,000 people, will test positive with test A even though they do not have the disease.

So 6% of the 999,000,000, or 59,940,000 people, will test positive with test B even though they do not have the disease.

And 6% of the 39,960,000 who tested positive with test A even though they did not have the disease will also test positive with test B. That is, 2,397,600 people will test positive with both tests even though they do not have the disease.

From above, there will be a total of 855,000 + 2,397,600 = 3,252,600 people who test positive on both tests, of whom 855,000 actually have the disease. If a person tests positive on both tests, the probability he actually has the disease is 855,000/3,252,600 = 0.26 or 26%.

Last edited: Aug 28, 2013

4. Aug 28, 2013

### MrAnchovy

No, 100,000,000 - 1,000,000 = 99,000,000 not 999,000,000. This error carries through all your calculations: I make the answer ≈78%.

This is counter-intuitive to many - you might think a false positive rate of 5% means that if you test positive you are 95% likely to have the disease, but consider a disease that has actually been eradicated. The chance that you have that disease is nil, but there is still a 5% chance that you test positive for it.
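For anyone who wants to check the corrected head count, here is a minimal Python sketch. It redoes the counting argument with 99,000,000 disease-free people and assumes, as the counting does, that the two tests err independently once disease status is fixed. The function names and the percent-based parameters are my own choices for illustration, not anything from the problem statement.

```python
def both_positive_counts(population, prevalence_pct,
                         fn_a_pct, fn_b_pct, fp_a_pct, fp_b_pct):
    """Head counts of people who test positive on BOTH tests.

    All rates are whole-number percentages so the arithmetic stays in integers.
    Assumes the tests err independently given disease status.
    Returns (diseased_and_both_positive, healthy_and_both_positive).
    """
    sick = population * prevalence_pct // 100
    healthy = population - sick
    # A diseased person is positive on a test unless it gives a false negative.
    true_pos = sick * (100 - fn_a_pct) * (100 - fn_b_pct) // 10_000
    # A healthy person is positive on a test only through a false positive.
    false_pos = healthy * fp_a_pct * fp_b_pct // 10_000
    return true_pos, false_pos


def share_with_disease(true_pos, false_pos):
    """Fraction of the double-positive group that actually has the disease."""
    total = true_pos + false_pos
    return true_pos / total if total else 0.0


counts = both_positive_counts(100_000_000, 1, 10, 5, 4, 6)
print(counts)                                  # (855000, 237600)
print(round(share_with_disease(*counts), 4))   # 0.7825

# The eradicated-disease case: prevalence 0, so every double positive is false.
eradicated = both_positive_counts(100_000_000, 0, 10, 5, 4, 6)
print(share_with_disease(*eradicated))         # 0.0
```

Setting the prevalence to 0 reproduces the eradicated-disease point: positives still occur, but none of them have the disease.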

5. Aug 28, 2013

### MrAnchovy

Nothing, but how do you figure out P(A and B)?

6. Aug 28, 2013

### Ray Vickson

You can use the method outlined by HallsOfIvy; however, you may not have noticed that "independence" was quietly assumed and used. That is, the assumption is that the probability of B giving a false positive/negative depends only on whether the person has the disease or not, and is not affected by whether A gave a false positive/negative, etc.
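Written out in vela's notation, the assumption as I read it is that the test results factor once the disease status is fixed:

$$P(A \text{ and } B|D) = P(A|D)\,P(B|D), \qquad P(A \text{ and } B|D') = P(A|D')\,P(B|D').$$

This says nothing about whether $P(A \text{ and } B)$ equals $P(A)P(B)$; that unconditional statement is not being assumed.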

7. Aug 28, 2013

### daneault23

Using HallsofIvy's approach, I too get about 78%. Does anyone know if this is the correct answer? I think I'm even more confused now.

8. Aug 28, 2013

### MrAnchovy

The point I was trying to make was that vela's hint at a 'solution' was not very useful because the calculation of P(A and B) is equivalent to calculating the solution to the original problem. The right approach is of course that used by HallsOfIvy, even if the arithmetic was wrong.

If the test events are not independent then the information provided in the question is incomplete and it cannot be solved; this assumption is therefore entirely appropriate and I don't think it is reasonable to infer that the OP or anyone else may not have noticed it.

9. Aug 28, 2013

### Ray Vickson

One can do it all using Bayes' formulas. Let A = "test A is positive", B = "test B is positive", D = "have the disease" and D' = "does not have the disease". Using the notation $XY$ instead of $X \cap Y$ for the intersection of events $X, Y$, the problem is to find $P(D|AB)$. This is
$$P(D|AB) = \frac{P(DAB)}{P(AB)},$$
and we have $P(AB) = P(AB|D) P(D) + P(AB|D') P(D').$

Using independence of the test results, given the actual disease status, we have
$$P(AB|D) = \frac{90}{100}\frac{95}{100} \text{ and}\\ P(AB|D') = \frac{4}{100}\frac{6}{100}$$
So, the denominator P(AB) is obtainable. The numerator $P(ABD) = P(AB|D)P(D)$ is also obtainable.

The final result I get this way is P(D|AB) = .7825370675 ≈ 0.78.

The method suggested by HallsofIvy is just all this done in disguise, where we deal with population numbers instead of ratios. I actually like his approach--and used to use it all the time back in the Stone Age when I was teaching this stuff--but sometimes people want to see it done more formally.
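For those who want to check the Bayes calculation above, here is a minimal Python sketch using exact fractions. The function name and argument names are hypothetical, chosen only for this illustration, and the conditional independence of the two tests given disease status is assumed throughout.

```python
from fractions import Fraction


def posterior_both_positive(p_d, p_a_d, p_b_d, p_a_nd, p_b_nd):
    """P(D | A and B), assuming A and B are conditionally independent
    given D and given D'.

    p_d     : P(D), the prevalence
    p_a_d   : P(A | D),  p_b_d  : P(B | D)
    p_a_nd  : P(A | D'), p_b_nd : P(B | D')
    """
    joint_d = p_a_d * p_b_d * p_d            # P(AB | D) P(D)  = P(ABD)
    joint_nd = p_a_nd * p_b_nd * (1 - p_d)   # P(AB | D') P(D')
    return joint_d / (joint_d + joint_nd)    # P(ABD) / P(AB)


result = posterior_both_positive(
    Fraction(1, 100),    # prevalence
    Fraction(90, 100),   # P(A | D)
    Fraction(95, 100),   # P(B | D)
    Fraction(4, 100),    # P(A | D')
    Fraction(6, 100),    # P(B | D')
)
print(result)                    # 475/607
print(round(float(result), 4))   # 0.7825
```

Working in `Fraction` rather than floats keeps the answer exact, so the 0.78 is not a rounding artifact.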

10. Aug 28, 2013

### vela

Staff Emeritus
I didn't provide a hint. I was asking the OP to point out specifically what difficulty he or she was running into, instead of simply saying "I don't know how to do this" and waiting for someone to provide the solution.

In my experience, probability is one of the subjects in which reading and understanding a solution is remarkably ineffective in helping one learn how to solve problems. You can read a solution and have it make perfect sense to you, but when you turn around and try to apply the ideas to another problem, you find yourself stuck again. You really need to struggle and work through these concepts on your own to get them straight in your head.

11. Aug 28, 2013

### Ray Vickson

Of course the problem is not doable without the independence assumption, but that is no reason to gloss over the issue. It is always a good idea to recognize and display the assumptions being made. Anyway, the test results are NOT independent---they are (presumably) conditionally independent, and that is a somewhat different concept, worth knowing about on its own.
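To see the difference in numbers, here is a minimal Python sketch that assumes conditional independence given disease status and then checks whether the unconditional product rule holds. The variable names are my own; the rates are the ones from the problem.

```python
from fractions import Fraction

p_d = Fraction(1, 100)                             # P(D)
p_a_d, p_a_nd = Fraction(90, 100), Fraction(4, 100)   # P(A | D), P(A | D')
p_b_d, p_b_nd = Fraction(95, 100), Fraction(6, 100)   # P(B | D), P(B | D')

# Marginal probabilities of each test being positive (law of total probability).
p_a = p_a_d * p_d + p_a_nd * (1 - p_d)
p_b = p_b_d * p_d + p_b_nd * (1 - p_d)

# Joint probability, built from the conditional-independence assumption.
p_ab = p_a_d * p_b_d * p_d + p_a_nd * p_b_nd * (1 - p_d)

print(float(p_ab))        # 0.010926
print(float(p_a * p_b))   # about 0.00335
print(p_ab == p_a * p_b)  # False
```

So under this assumption a positive result on A makes a positive result on B more likely overall, even though the tests do not influence each other once the disease status is known.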

If I were marking the question I would give full marks only if the solver had mentioned the assumption---it only takes about 3 or 4 words to do so.

Last edited: Aug 28, 2013