Bayes' theorem and disease prevalence


Homework Help Overview

The discussion revolves around a diagnostic test for a disease, focusing on the application of Bayes' theorem to determine the prevalence of the disease, as well as the sensitivity and specificity of the test. Participants are analyzing the relationships between positive predictive values, negative predictive values, and the probabilities associated with test results.

Discussion Character

  • Exploratory, Assumption checking, Conceptual clarification

Approaches and Questions Raised

  • Participants explore how to calculate the prevalence of the disease using the provided probabilities and test results. There are discussions about the definitions of sensitivity and specificity, and some participants question the terminology and relationships between the probabilities involved.

Discussion Status

The discussion is ongoing, with participants providing insights and questioning each other's interpretations. Some suggest alternative methods, such as using a probability tree, to clarify the relationships between the variables. There is no explicit consensus on the approach to take, but various lines of reasoning are being explored.

Contextual Notes

Participants are working under the constraints of the problem as presented, with specific values for positive and negative predictive values, as well as the rate of positive test results. There is an acknowledgment of potential misunderstandings regarding the terminology and relationships between the probabilities.

BRN
Homework Statement
Calculate prevalence, sensitivity and specificity of the diagnostic test.
Relevant Equations
Bayes' theorem
Hello all!

I have to solve this exercise:
A tampon diagnostic test provides 1% positive results. The positive predictive values (probabilities of positive test disease) and negative (absence disease given negative test) are respectively 0.95 and 0.98.
  1. What is the prevalence of the disease?
  2. What are the sensitivity (probabilities of positive disease with disease) and specificity (negative test probability with disease absence) of the test?
With positive test (T+), negative test (T-), disease (D) and healthy (D-), I have this table:
        D          D-
T+      P(T+|D)    P(T+|D-)
T-      P(T-|D)    P(T-|D-)

## P(D|T+) = 0.95 ##, ## P(D-|T-) = 0.98 ## and ## P(T+) = 0.01 ##.

From Bayes theorem I can calculate sensitivity starting from:
$$ P(D|T+) = \frac{P(T+|D)P(D)}{P(T+)} $$

But how can I calculate the prevalence ## P(D) ##?
 
BRN said:
The positive predictive values (probabilities of positive test disease) and negative (absence disease given negative test) are respectively 0.95 and 0.98.

But how can I calculate the prevalence ## P(D) ##?
I hope I understand the terminology here. I assume this means that if someone has the disease then there is a 0.95 probability of a positive result (and hence a 0.05 probability of a false negative). And, if someone does not have the disease, then there is a 0.98 probability of a negative result (and a 0.02 probability of a false positive). And, I guess the prevalence means what fraction of people have the disease.

A tampon diagnostic test provides 1% positive results.

What you could do is assume that the prevalence is ##P(D) = p##, calculate the positive results (as a function of ##p##) and equate this to 1%.

PS Although that makes no sense, as with a false positive of 0.02, you must get at least 2% positive tests, even if no one has the disease. Perhaps I've misunderstood the terminology?

PPS I did misunderstand!
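
The inconsistency noted in the PS can be checked numerically. This is only a sketch of the misreading, where 0.95 is taken as the sensitivity and 0.98 as the specificity; the function name `positive_rate` and the 0.01 scan step are illustrative choices, not part of the exercise.

```python
def positive_rate(p, sensitivity=0.95, specificity=0.98):
    """P(T+) for prevalence p, using the law of total probability.

    The default values encode the MISREAD interpretation of the exercise
    (0.95 as sensitivity, 0.98 as specificity), not the intended one.
    """
    false_positive_rate = 1.0 - specificity
    return sensitivity * p + false_positive_rate * (1.0 - p)


# Scan prevalence from 0 to 1 in steps of 0.01 and find the smallest P(T+).
rates = [positive_rate(k / 100) for k in range(101)]
min_rate = min(rates)

# The minimum occurs at p = 0 and is about 0.02, so 1% positives is unreachable.
print(f"smallest possible P(T+): {min_rate:.4f}")
```

Under that reading the positive rate can never drop below about 2%, which is why the stated 1% signals that 0.95 and 0.98 must be predictive values instead.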
 
Your post seems a bit garbled. Do you mean
BRN said:
A tampon diagnostic test provides 1% positive results. The positive predictive values (probabilities of disease given positive test ) and negative (absence disease given negative test) are respectively 0.95 and 0.98.
  1. What is the prevalence of the disease?
  2. What are the sensitivity (probabilities of positive test given disease) and specificity (negative test probability with disease absence) of the test?
BRN said:
From Bayes theorem I can calculate sensitivity starting from:
$$ P(D|T+) = \frac{P(T+|D)P(D)}{P(T+)} $$

But how can I calculate the prevalence ## P(D) ##?
So it's not Bayes' Theorem you need.
What is the relationship between P(A), P(A|B) and P(A|~B)?
 
haruspex said:
Your post seems a bit garbled. Do you mean [...] So it's not Bayes' Theorem you need.
What is the relationship between P(A), P(A|B) and P(A|~B)?
How could you get 1% positive tests with a 2% false positive rate? That's how I read the question.
 
PeroK said:
How could you get 1% positive tests with a 2% false positive rate? That's how I read the question.
No, it's a false negative rate of 2%. P(absence of disease given negative test)=0.98.
 
This is interesting. The sensitivity and specificity you are asked to find are related to False Positives and False Negatives:

https://en.wikipedia.org/wiki/Sensitivity_and_specificity

To be precise:

True Positive (Sensitivity) = probability/proportion of positive test results for those who are positive (have the disease)
False Negative = (probability of) negative test for those who are positive

True Negative (Specificity) = probability/proportion of negative test results for those who are negative (do not have the disease)
False Positive = (probability of) positive test for those who are negative

The values we are given are:

Positive Predictive Value: probability that a person is positive given a positive result
Negative Predictive Value: probability that a person is negative given a negative result

https://en.wikipedia.org/wiki/Positive_and_negative_predictive_values

That seems to be the standard terminology.
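
To make the four definitions concrete, here is a minimal sketch that computes them from a 2x2 table of counts. The counts and the function name `diagnostic_metrics` are hypothetical and chosen only for illustration; they are not the numbers from the exercise.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return standard test metrics from counts of a 2x2 table.

    tp: diseased and test positive     fp: healthy and test positive
    fn: diseased and test negative     tn: healthy and test negative
    """
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),   # P(T+ | D)
        "specificity": tn / (tn + fp),   # P(T- | D-)
        "ppv": tp / (tp + fp),           # P(D  | T+)
        "npv": tn / (tn + fn),           # P(D- | T-)
        "prevalence": (tp + fn) / total, # P(D)
    }


# Hypothetical population of 1000 people (illustrative numbers only).
metrics = diagnostic_metrics(tp=90, fp=30, fn=10, tn=870)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Sensitivity and specificity condition on the true disease status (column totals of the table), while the predictive values condition on the test result (row totals), which is the distinction the thread turns on.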
 
Thanks for your help.


OK, the terminology seems right.

So I have:
P(D|T+) = 0.95 Positive Predictive Value
P(D|T-) = 0.05
P(D-|T-) = 0.98 Negative Predictive Value
P(D-|T+) = 0.02
P(T+) = 0.01 positive results
P(T-) = 0.99

and I want to calculate:
P(D) = prevalence
P(T+|D) = Sensitivity
P(T-|D-) = Specificity

PeroK said:
What you could do is assume that the prevalence is ##P(D) = p##, calculate the positive results (as a function of ##p##) and equate this to 1%.

I tried the following. From Bayes' theorem I have:

$$ P(D|T+) = \frac{P(T+|D)P(D)}{P(T+)} \Rightarrow P(T+|D)P(D)=P(D|T+)P(T+) $$

Now

$$ P(T+) = P(T+|D)P(D) + P(T+|D-)(1-P(D)) $$
$$ \Rightarrow P(T+) = P(D|T+)P(T+) + P(T+|D-)(1-P(D)) $$
$$ \Rightarrow P(T+) -P(D|T+)P(T+) = P(T+|D-)(1-P(D)) $$
$$ \Rightarrow P(T+) (1-P(D|T+)) = P(T+|D-)(1-P(D)) = FP $$
FP = False Positive.

Now I wouldn't know how to continue...
 
BRN said:
Now I wouldn't know how to continue...
Please try to answer my question in post #3.
 
BRN said:
So I have:
P(D|T+) = 0.95 Positive Predictive Value
P(D|T-) = 0.05
P(D-|T-) = 0.98 Negative Predictive Value
P(D-|T+) = 0.02
P(T+) = 0.01 positive results
P(T-) = 0.99
That doesn't look right. You must have ##P(D|T+) + P(D-|T+) = 1## etc.

BRN said:
and I want to calculate:
P(D) = prevalence
P(T+|D) = Sensitivity
P(T-|D-) = Specificity

$$ P(D|T+) = \frac{P(T+|D)P(D)}{P(T+)} \Rightarrow P(T+|D)P(D)=P(D|T+)P(T+) $$

Now

$$ P(T+) = P(T+|D)P(D) + P(T+|D-)(1-P(D)) $$
$$ \Rightarrow P(T+) = P(D|T+)P(T+) + P(T+|D-)(1-P(D)) $$
$$ \Rightarrow P(T+) -P(D|T+)P(T+) = P(T+|D-)(1-P(D)) $$
$$ \Rightarrow P(T+) (1-P(D|T+)) = P(T+|D-)(1-P(D)) = FP $$
FP = False Positive.

Now I wouldn't know how to continue...
Okay, so you're going round in circles perhaps.

I find it's always easier to work from a probability tree, which gives a better insight than Bayes' Theorem (although it's the same information).

Using the tree method, the prevalence (##D_+##) falls out without any effort. The other quantities of sensitivity and specificity you can practically just read off as well.
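
A sketch of how such a tree might be evaluated, assuming the first branching is on the test result and the second on disease status given that result. This layout is an assumption about the tree method, not a transcription of the attached diagram; the variable names are illustrative. Exact fractions are used to avoid rounding noise.

```python
from fractions import Fraction

# Given data from the exercise.
p_pos = Fraction(1, 100)   # P(T+)
ppv = Fraction(95, 100)    # P(D  | T+)
npv = Fraction(98, 100)    # P(D- | T-)

# Each leaf is the product of the probabilities along its path:
# first branch = test result, second branch = disease status given result.
leaves = {
    ("T+", "D"): p_pos * ppv,
    ("T+", "D-"): p_pos * (1 - ppv),
    ("T-", "D"): (1 - p_pos) * (1 - npv),
    ("T-", "D-"): (1 - p_pos) * npv,
}

# Prevalence: add up every leaf that ends in D.
prevalence = leaves[("T+", "D")] + leaves[("T-", "D")]

# Sensitivity and specificity: re-condition the leaves on disease status.
sensitivity = leaves[("T+", "D")] / prevalence
specificity = leaves[("T-", "D-")] / (1 - prevalence)

print("prevalence :", prevalence, "=", float(prevalence))
print("sensitivity:", sensitivity, "=", float(sensitivity))
print("specificity:", specificity, "=", float(specificity))
```

With the given numbers this yields a prevalence of 293/10000, a sensitivity of 95/293 (about 0.324) and a specificity of 9702/9707 (about 0.9995).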
 

Attachment: thumbnail_20210312_081259.jpg (probability tree diagram)
  • #10
PeroK said:
I find it's always easier to work from a probability tree, which gives a better insight than Bayes' Theorem
Finding P(D) from the given data is very simple and does not require Bayes' Theorem. See post #3,
 
  • #11
PeroK said:
Using the tree method, the prevalence (##D_+##) falls out without any effort. The other quantities of sensitivity and specificity you can practically just read off as well.

## P(T+) (PPV) + P(T+) (1-PPV) ## I can interpret it as the total number of positive tests observed (number of positive tests relating to the sick + number of positive tests related to healthy), right?

PeroK said:
That doesn't look right. You must have ##P(D|T+) + P(D-|T+) = 1## etc.

Yes! I agree. I made a mistake...

haruspex said:
What is the relationship between P(A), P(A|B) and P(A|~B)?

But isn't there a single relationship, or am I wrong?

$$ P(A|B) = \frac{P(A \cap B)}{P(B)} $$
$$ P(A|\tilde B) = \frac{P(A \cap \tilde B)}{1-P(B)} $$

I'm sorry, but I'm a beginner...
 
  • #12
BRN said:
## P(T+) (PPV) + P(T+) (1-PPV) ## I can interpret it as the total number of positive tests observed (number of positive tests relating to the sick + number of positive tests related to healthy), right?

I'm sorry, but I'm a beginner...
Yes: $$P(T+) (PPV) + P(T+) (1-PPV) = P(T+)$$
I'm going to suggest you learn the probability tree method. These numbers are all related to each other and the best way to see this is a simple probability tree. The answers drop out (like ripe fruit, as it were!).

Did you understand the diagram I posted above?
 
  • #13
BRN said:
But isn't there a single relationship, or am I wrong?
$$ P(A|B) = \frac{P(A \cap B)}{P(B)} $$
$$ P(A|\tilde B) = \frac{P(A \cap \tilde B)}{1-P(B)} $$
Sorry, the way I expressed it wasn't very clear. It's easier via joint probabilities:
Can you express P(A) in terms of P(A∩B) and P(A∩~B)?
Then P(A∩B) in terms of P(A|B) and P(B) etc?

Wrt post #1, it probably would have been more helpful to have drawn a table of joint probabilities rather than of conditional probabilities. You could fill in four unknowns for these, a, b, c, d, then write expressions for the given data in terms of them. You would have got four equations.
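
One possible way to write out the four equations suggested here, as a sketch. The assignment of letters to cells is an assumption, since the post does not fix it: ##a = P(D \cap T+)##, ##b = P(D- \cap T+)##, ##c = P(D \cap T-)##, ##d = P(D- \cap T-)##.

$$ a + b + c + d = 1 $$
$$ a + b = P(T+) = 0.01 $$
$$ \frac{a}{a+b} = P(D|T+) = 0.95 $$
$$ \frac{d}{c+d} = P(D-|T-) = 0.98 $$

The second and third equations give ##a = 0.0095## and ##b = 0.0005##. The first then gives ##c + d = 0.99##, so the fourth gives ##d = 0.9702## and ##c = 0.0198##. Under this labelling the requested quantities are

$$ P(D) = a + c = 0.0293, \qquad P(T+|D) = \frac{a}{a+c} \approx 0.324, \qquad P(T-|D-) = \frac{d}{b+d} \approx 0.9995 $$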
 