Conditional Probability Problem

  • Thread starter daneault23
  • Tags
    Probability
In summary: Assuming the two tests are conditionally independent given disease status, Bayes' rule gives a probability of about 78% that a person who tests positive on both tests actually has the disease. An initial head-count calculation arrived at 26% only because of an arithmetic slip in the number of disease-free people.
  • #1
daneault23

Homework Statement



Suppose there are two different diagnostic tests (say test A and test B) for some disease of interest. Assume that the prevalence of this disease in a large population is 1%. Test A has a false negative rate of 10% (false negative means that the test result is negative when the test is applied to a person who has the disease). Similarly, the false negative rate of test B is 5%. The false positive rate of test A is 4% (false positive means that the test result is positive even though it is applied to a person who does not have the disease). Similarly, the false positive rate of test B is 6%. If both tests A and B are positive when administered to a person selected at random from the population, what is the probability this person has the disease?

Homework Equations



P(A|B)=P(A intersect B)/P(B) for conditional probability
P(B|A)=(P(A|B)*P(B))/P(A) for Bayes' Rule

The Attempt at a Solution



I have written down all of the outcomes for Tests A and B. For Test A, I have P(-|+)= Probability that the test results are negative when test is applied to someone who has the disease=.10. Similarly, I have P(+|-)=.04. Therefore P(+|+)=1-.10=.90 and P(-|-)=1-.04=.96. For Test B, I have P(-|+)=.05, P(+|-)=.06, P(+|+)=.95, and P(-|-)=.94.

My question is that I'm not sure what to do now. It seems as if you should do the following: Let P(A)=Probability of population who have the disease, and let P(B)=Probability that both tests A and B are positive.

I would then set up the problem like this: P(A|B)=(P(B|A)*P(A))/P(B), except that you can't figure out what P(B|A) is. Maybe I am making this too hard, but this is really stumping me.
 
  • #2
First, I wouldn't use the letters A and B to denote the events since you're also using them to label the tests. Instead, let's use the notation A="test A is positive," B="test B is positive," and D="has the disease." A', B', and D' will represent the complements.

P(A|D) = 0.90
P(A'|D) = 0.10 (false negative)
P(A|D') = 0.04 (false positive)
P(A'|D') = 0.96

P(B|D) = 0.95
P(B'|D) = 0.05 (false negative)
P(B|D') = 0.06 (false positive)
P(B'|D') = 0.94

You want to find
$$P(D|A \text{ and } B) = \frac{P(A \text{ and } B|D)P(D)}{P(A \text{ and } B)}.$$ So what specifically is keeping you from figuring out P(A and B|D)? I think you can assume here that the results of the tests, given that the individual has the disease, are independent of each other.
 
  • #3
I would do this in a completely different way: suppose there are a total of 100,000,000 people in the population.
Assume that the prevalence of this disease in a large population is 1%.
So a total of 1,000,000 people have the disease and 999,000,000 do not.

Test A has a false negative rate of 10%
So 10% of the 1,000,000, that is, 100,000 will test negative with test A, even though they have the disease and 90%, 900,000, will test positive.

Similarly, the false negative rate of test B is 5%.
And 50,000 will test negative with test B even though they have the disease. Also 5% of the 900,000 who test positive with test A, a total of 45,000, will also test negative with test B. That means that 900,000 - 45,000 = 855,000 who have the disease will test positive on both tests.

The false positive rate of test A is 4%
So 4% of the 999,000,000, or 39,960,000 people, will test positive with test A even though they do not have the disease.

Similarly, the false positive rate of test B is 6%.
So 6% of the 999,000,000, or 59,940,000 people, will test positive with test B even though they do not have the disease.

And 6% of the 39,960,000 who tested positive with test A, even though they did not have the disease, will test positive with test B. That is, 2,397,600 people will test positive with both tests even though they do not have the disease.

If both tests A and B are positive when administered to a person selected at random from the population, what is the probability this person has the disease?
From above, there will be a total of 855,000 + 2,397,600 = 3,252,600 people who test positive on both tests, of whom 855,000 actually have the disease. If a person tests positive on both tests, the probability he actually has the disease is 855,000/3,252,600 = 0.26 or 26%.
 
  • #4
HallsofIvy said:
I would do this in a completely different way: suppose there are a total of 100,000,000 people in the population.

So a total of 1,000,000 people have the disease and 999,000,000 do not.

No, 100,000,000 - 1,000,000 = 99,000,000 not 999,000,000. This error carries through all your calculations: I make the answer ≈78%.
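For anyone who wants to check the corrected count, here is a minimal Python sketch of the head-count method with 99,000,000 disease-free people. It assumes the two tests err independently once disease status is fixed; the variable names are illustrative and not part of the problem statement.

```python
# Head-count version of the two-test problem, using integers only.
population = 100_000_000
diseased = population // 100          # 1% prevalence -> 1,000,000
healthy = population - diseased       # 99,000,000 (not 999,000,000)

# Diseased people: 90% test positive on A, and 95% of those also on B
# (assumes B's result does not depend on A's, given disease status).
pos_a_diseased = diseased * 90 // 100
both_pos_diseased = pos_a_diseased * 95 // 100

# Healthy people: 4% falsely test positive on A, and 6% of those also on B.
pos_a_healthy = healthy * 4 // 100
both_pos_healthy = pos_a_healthy * 6 // 100

both_pos_total = both_pos_diseased + both_pos_healthy
prob_disease = both_pos_diseased / both_pos_total

print(both_pos_diseased, both_pos_healthy, both_pos_total)
print(round(prob_disease, 4))
```

With these counts, 855,000 of the 1,092,600 double positives have the disease, which is where the figure of about 78% comes from.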

This is counter-intuitive to many - you might think a false positive rate of 5% means that if you test positive you are 95% likely to have the disease, but consider a disease that has actually been eradicated. The chance that you have that disease is nil, but there is still a 5% chance that you test positive for it.
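The eradicated-disease argument can be sketched numerically with Bayes' rule for a single test. The function name and the 95% detection rate below are assumptions made only for illustration; the post specifies just the 5% false positive rate.

```python
def prob_disease_given_positive(prevalence, sensitivity, false_positive_rate):
    """Bayes' rule for one positive test result."""
    true_pos = sensitivity * prevalence
    false_pos = false_positive_rate * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)

# Eradicated disease: prevalence is 0, so every positive result is a false one.
eradicated = prob_disease_given_positive(0.0, 0.95, 0.05)

# The same hypothetical test applied to a disease with 1% prevalence.
rare = prob_disease_given_positive(0.01, 0.95, 0.05)

print(eradicated, round(rare, 3))
```

Under these assumed numbers the probability of disease after a positive result is 0 for the eradicated disease and only about 16% at 1% prevalence, nowhere near 95%.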
 
  • #5
vela said:
So what specifically is keeping you from figuring out P(A and B|D)?
Nothing, but how do you figure out P(A and B)?
 
  • #6
MrAnchovy said:
Nothing, but how do you figure out P(A and B)?

You can use the method outlined by HallsOfIvy; however, you may not have noticed that "independence" was quietly assumed and used. That is, the assumption is that the probability of B giving a false positive/negative depends only on whether the person has the disease or not, and is not affected by whether A gave a false positive/negative, etc.
 
  • #7
Using HallsofIvy's approach, I too get about 78%. Does anyone know if this is the correct answer? I think I'm even more confused now.
 
  • #8
Ray Vickson said:
You can use the method outlined by HallsOfIvy

The point I was trying to make was that vela's hint at a 'solution' was not very useful because the calculation of P(A and B) is equivalent to calculating the solution to the original problem. The right approach is of course that used by HallsOfIvy, even if the arithmetic was wrong.

Ray Vickson said:
you may not have noticed that "independence" was quietly assumed and used. That is, the assumption is that the probability of B giving a false positive/negative depends only on whether the person has the disease or not, and is not affected by whether A gave a false positive/negative, etc.

If the test events are not independent then the information provided in the question is incomplete and it cannot be solved; this assumption is therefore entirely appropriate and I don't think it is reasonable to infer that the OP or anyone else may not have noticed it.
 
  • #9
daneault23 said:
Using HallsofIvy's approach, I too get about 78%. Does anyone know if this is the correct answer? I think I'm even more confused now.

One can do it all using Bayes' formulas. Let A = "test A is positive", B = "test B is positive", D = "have the disease" and D' = "does not have the disease". Using the notation ##XY## instead of ##X \cap Y## for the intersection of events ##X, Y##, the problem is to find ##P(D|AB)##. This is
[tex] P(D|AB) = \frac{P(DAB)}{P(AB)},[/tex]
and we have ##P(AB) = P(AB|D) P(D) + P(AB|D') P(D').##

Using independence of the test results, given the actual disease status, we have
[tex] P(AB|D) = \frac{90}{100}\frac{95}{100} \text{ and}\\
P(AB|D') = \frac{4}{100}\frac{6}{100}[/tex]
So, the denominator P(AB) is obtainable. The numerator ##P(ABD) = P(AB|D)P(D)## is also obtainable.

The final result I get this way is P(D|AB) = .7825370675 ≈ 0.78.
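As a cross-check of the figure above, the same steps can be carried out in exact arithmetic. This is only a sketch using Python's standard `fractions` module; the variable names are illustrative, and conditional independence of the tests given disease status is assumed as in the derivation.

```python
from fractions import Fraction

p_d = Fraction(1, 100)                 # P(D), prevalence
p_not_d = 1 - p_d                      # P(D')

# Conditional independence given disease status (an assumption):
p_ab_given_d = Fraction(90, 100) * Fraction(95, 100)     # P(AB|D)
p_ab_given_not_d = Fraction(4, 100) * Fraction(6, 100)   # P(AB|D')

# Law of total probability for the denominator P(AB).
p_ab = p_ab_given_d * p_d + p_ab_given_not_d * p_not_d

# Bayes' rule: P(D|AB) = P(AB|D) P(D) / P(AB).
p_d_given_ab = p_ab_given_d * p_d / p_ab

print(p_d_given_ab, float(p_d_given_ab))
```

The exact value reduces to 475/607, which agrees with the decimal quoted above.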

The method suggested by HallsofIvy is just all this done in disguise, where we deal with population numbers instead of ratios. I actually like his approach--and used to use it all the time back in the Stone Age when I was teaching this stuff--but sometimes people want to see it done more formally.
 
  • #10
MrAnchovy said:
The point I was trying to make was that vela's hint at a 'solution' was not very useful because the calculation of P(A and B) is equivalent to calculating the solution to the original problem. The right approach is of course that used by HallsOfIvy, even if the arithmetic was wrong.
I didn't provide a hint. I was asking the OP to point out specifically what difficulty he or she was running into instead of simply saying "I don't know how to do this" and waiting for someone to provide the solution.

In my experience, probability is one of the subjects in which reading and understanding a solution is remarkably ineffective in helping one learn how to solve problems. You can read a solution and have it make perfect sense to you, but when you turn around and try to apply the ideas to another problem, you find yourself stuck again. You really need to struggle and work through these concepts on your own to get them straight in your head.
 
  • #11
MrAnchovy said:
The point I was trying to make was that vela's hint at a 'solution' was not very useful because the calculation of P(A and B) is equivalent to calculating the solution to the original problem. The right approach is of course that used by HallsOfIvy, even if the arithmetic was wrong.



If the test events are not independent then the information provided in the question is incomplete and it cannot be solved; this assumption is therefore entirely appropriate and I don't think it is reasonable to infer that the OP or anyone else may not have noticed it.


Of course the problem is not doable without the independence assumption, but that is no reason to gloss over the issue. It is always a good idea to recognize and display the assumptions being made. Anyway, the test results are NOT independent---they are (presumably) conditionally independent, and that is a somewhat different concept, worth knowing about on its own.

If I were marking the question I would give full marks only if the solver had mentioned the assumption---it only takes about 3 or 4 words to do so.
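The distinction between independence and conditional independence can be seen directly from the problem's numbers. The sketch below assumes conditional independence given disease status and then checks whether the unconditional product rule P(AB) = P(A)P(B) holds; the variable names are illustrative.

```python
p_d = 0.01                      # prevalence, P(D)
p_not_d = 1 - p_d               # P(D')

# Marginal probability that each test is positive (law of total probability).
p_a = 0.90 * p_d + 0.04 * p_not_d
p_b = 0.95 * p_d + 0.06 * p_not_d

# Joint probability, assuming A and B are independent GIVEN disease status.
p_ab = 0.90 * 0.95 * p_d + 0.04 * 0.06 * p_not_d

print(round(p_ab, 6), round(p_a * p_b, 6))
```

Here P(AB) is about 0.0109 while P(A)P(B) is about 0.0033, so the test results are not independent unconditionally: a positive result on one test makes disease more likely, which in turn makes a positive result on the other more likely.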
 

1. What is conditional probability?

Conditional probability is the likelihood of an event occurring given that another event is known to have occurred. It is written P(A|B), read as the probability of event A given event B.

2. How is conditional probability calculated?

Conditional probability is calculated by dividing the probability of the joint events (A and B) by the probability of the event that is known to have occurred (B). This can also be written as P(A|B) = P(A and B) / P(B).
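As a worked instance, take the numbers from the thread's problem: 1% prevalence, with test A positive for 90% of people who have the disease and 4% of people who do not. Writing D for "has the disease" and A for "test A is positive", the probability of disease given a single positive result on test A would be

$$P(D|A) = \frac{P(D \text{ and } A)}{P(A)} = \frac{0.90 \times 0.01}{0.90 \times 0.01 + 0.04 \times 0.99} = \frac{0.009}{0.0486} \approx 0.185.$$

The denominator P(A) counts every way test A can come back positive, both true positives and false positives.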

3. What is the difference between conditional probability and regular probability?

The main difference between conditional probability and regular probability is that conditional probability takes into account the information that another event has occurred, while regular (unconditional) probability does not. When all outcomes are equally likely, regular probability is calculated as P(A) = number of desired outcomes / total number of possible outcomes, while conditional probability is calculated as P(A|B) = P(A and B) / P(B).

4. How is conditional probability used in real life?

Conditional probability is commonly used in fields such as statistics, economics, and genetics. In real life, it can be used to predict the likelihood of certain outcomes based on past events, such as the probability of rain given that it is cloudy outside, or the probability of a child inheriting a genetic disorder from their parents.

5. What are some common misconceptions about conditional probability?

One common misconception about conditional probability is that learning event B has occurred changes the unconditional probability of event A. In reality, P(A) remains the same; P(A|B) is a separate quantity, calculated with the additional information that event B has occurred. Another misconception is that conditional probability can only be used with two events, when in fact it extends to multiple events and, through Bayes' theorem, to more complex scenarios.
