# False positive rate over two tests

1. Jan 5, 2016

### lavoisier

Hi everyone, and happy new year!
A colleague at work came up with this problem today, which I find very interesting.
We are both chemists, so our ability to solve it is limited - here I am asking for your help!

In a target discovery campaign that we run, we have two separate tests, let's call them A and B.
Each test takes a target as input and gives a positive or negative outcome. By previous validation of these tests, we know that the target is more likely to be active when the test is positive, and more likely to be inactive when the test is negative.

First, a number of targets get tested in A. Some of them are positive.
As it's very expensive to work on a target when it's not really active, we want to make sure that these are not false positives, so we submit the positives from test A to test B, and in the end we only keep those that are positive in B as well.

My colleague has doubts about the validity of this approach. In particular he thinks that, as both tests can also give false negatives, there's no point in running the second test, as we might end up removing as many active targets as inactive ones.

I told him that, at a very superficial glance, it's always better to run the second test, because it's much easier for an individual target to be a false positive once than twice - unless A and B are perfectly correlated (which they aren't, because positive targets do sometimes become negative on retesting).

On more careful consideration, though, it seems to me that there is much more to this problem.
First, the set of targets that are submitted to B is not a random one; it's pre-selected by A. So if I want to know how many false positives I have after B, I can't just multiply the false positive rate of A by that of B, can I?
Second, if I want to know how many false negatives I have after B, i.e. targets that are actually active but fail to show that in B, do I need to know the false negative rate of both A and B?

I tried to express this in probability terms.
Let 'A+' = 'test A is positive', 'A-' = 'test A is negative', and similarly for B, 'T+' = 'the target is active', 'T-' = 'the target is inactive'. Let P(X) be the probability of X.

Then I think what my colleague wants to know is whether:
P(T+ | (A+ ∧ B+)) > P(T+ | A+)

Intuitively I would say that it is the case, if the false positive rate of each test is less than 50%, but I don't know how to calculate the quantity on the left.

I tried to expand it using the classic formula:
P(T+ | (A+ ∧ B+)) = P(T+ ∧ A+ ∧ B+) / P(A+ ∧ B+)
which tells me that I need some information on the correlation between A and B, but I am stuck with P(T+ ∧ A+ ∧ B+). I thought of separating it into P((T+ ∧ A+) ∧ (T+ ∧ B+)), but I don't see how that helps.

Can anyone please suggest how I should do this? Maybe a decision tree?

If I had a method, I could also calculate stuff like P(T+ | (A+ ∧ B-)), because at the moment we throw away a target that is positive in A and negative in B. This probability would tell us how safe that is.

Thanks!
L

2. Jan 5, 2016

### Orodruin

Staff Emeritus
A few questions (I know very little chemistry): Is a target always active if it is active? I.e., is this a fixed property which does not change, just that the tests A and B may or may not be positive with some probability which depends on whether the target is active or not?

What is the probability that a target is active? (i.e., if you just take a random target and have not tested it, or in other words, what is the fraction of the total targets which are active?)

3. Jan 5, 2016

### Orodruin

Just focusing on the inequality P(T+ | (A+ ∧ B+)) > P(T+ | A+) for a second: it holds if the tests A and B are independent when conditioned on T+ and on T-, and if B is a test satisfying P(B+|T+) > P(B+|T-), which seems to be a reasonable thing to require.
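A short sketch of why this holds, written in odds form. It assumes only what is stated above, namely that A and B are conditionally independent given the true state of the target:

```latex
\frac{P(T^+ \mid A^+ \wedge B^+)}{P(T^- \mid A^+ \wedge B^+)}
  = \frac{P(T^+)\,P(A^+ \mid T^+)\,P(B^+ \mid T^+)}{P(T^-)\,P(A^+ \mid T^-)\,P(B^+ \mid T^-)}
  = \frac{P(T^+ \mid A^+)}{P(T^- \mid A^+)} \cdot \frac{P(B^+ \mid T^+)}{P(B^+ \mid T^-)}
```

The odds of being active after A+ are multiplied by the likelihood ratio of B. Since a probability increases whenever its odds increase, the posterior goes up exactly when that ratio exceeds 1, i.e. when P(B+|T+) > P(B+|T-).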

Note that this probability relation might not be the only thing of relevance when you decide what to do. It depends on how you value having to do test B on all A+ targets (with the risk of discarding false negative targets) compared to the benefit of not having to work with inactive targets which were false positives in A. To design a procedure you therefore need to define a cost/benefit function (in terms of time/money/benefit) which you wish to optimise and then select the procedure which minimises it.

4. Jan 5, 2016

### Staff: Mentor

You probably have the sensitivity and specificity for the test, which would be P(A+|T+) and P(A-|T-) and similarly for B. You probably also have the overall probability of a target being active P(T+). So I would try to express the probability of everything in terms of those.

Actually, it might be more valuable to consider the costs of everything and then just run a big monte carlo simulation to see how many valid targets you can test for a fixed amount of money under the different possible decision trees. I mean, there are only a few possible decision trees: no testing, just A, just B, first A then B, first B then A, always A and B.
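A minimal sketch of such a simulation, counting outcomes only (costs can be attached to the counts afterwards). Everything named here is hypothetical: the `simulate` helper, the rates given for A and B, and the 5% base rate are illustrative placeholders rather than validated figures. Each test is assumed to be conditionally independent of the other given whether the target is active.

```python
import random

def simulate(tests, n_targets=100_000, prior=0.05, seed=0):
    """Push n_targets through a sequence of tests.

    tests: list of (tpr, fpr) pairs applied in order; a target is
    dropped at its first negative result and not tested further.
    """
    rng = random.Random(seed)
    counts = {"kept_active": 0, "kept_inactive": 0, "lost_active": 0, "tests_run": 0}
    for _ in range(n_targets):
        active = rng.random() < prior
        kept = True
        for tpr, fpr in tests:
            counts["tests_run"] += 1
            p_positive = tpr if active else fpr
            if rng.random() >= p_positive:  # negative result
                kept = False
                break
        if kept:
            counts["kept_active" if active else "kept_inactive"] += 1
        elif active:
            counts["lost_active"] += 1
    return counts

def fraction_active(counts):
    """Share of the kept targets that are really active."""
    kept = counts["kept_active"] + counts["kept_inactive"]
    return counts["kept_active"] / kept if kept else float("nan")

# Hypothetical (TPR, FPR) pairs for the two tests.
A = (0.56, 0.19)
B = (0.56, 0.19)
strategies = {"no testing": [], "A only": [A], "A then B": [A, B]}
results = {name: simulate(tests) for name, tests in strategies.items()}
for name, counts in results.items():
    print(f"{name:10s} {counts}  active fraction among kept: {fraction_active(counts):.3f}")
```

The remaining decision trees (just B, B then A) are the same call with a different `tests` list.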

5. Jan 6, 2016

### Orodruin

I do not think you need a Monte Carlo for this. Each case is pretty simple to analyze analytically. The only thing is finding the "correct" benefit to cost ratio. This depends on how you value having tested more samples and how you value monetary cost vs cost in time etc.
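As a rough illustration of the analytical route, the expected outcome per target can be computed directly by tracking how much probability mass survives each test. The function name `analyse`, the rates, and both cost figures below are made-up placeholders; the sketch also assumes conditional independence of the tests and ignores any value attached to the active targets that are lost.

```python
def analyse(tests, prior, cost_per_test, cost_per_followup):
    """Expected per-target outcome of running `tests` in sequence.

    tests: list of (tpr, fpr) pairs; a target is dropped at its first negative.
    """
    active = prior            # probability mass: active and still in the pipeline
    inactive = 1.0 - prior    # probability mass: inactive and still in the pipeline
    expected_tests = 0.0
    for tpr, fpr in tests:
        expected_tests += active + inactive  # only surviving targets get tested
        active *= tpr
        inactive *= fpr
    cost = expected_tests * cost_per_test + (active + inactive) * cost_per_followup
    return {
        "kept_active": active,
        "kept_inactive": inactive,
        "lost_active": prior - active,
        "expected_tests": expected_tests,
        "cost_per_target": cost,
        "cost_per_kept_active": cost / active,
    }

# Hypothetical inputs: identical tests, 5% base rate, follow-up work 100x the cost of a test.
A = B = (0.56, 0.19)
for name, tests in [("no testing", []), ("A only", [A]), ("A then B", [A, B])]:
    summary = analyse(tests, prior=0.05, cost_per_test=1.0, cost_per_followup=100.0)
    print(name, {key: round(value, 4) for key, value in summary.items()})
```

Which strategy counts as best depends entirely on the chosen cost figures and on how lost active targets are valued, which is the "correct" benefit to cost ratio mentioned above.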

6. Jan 6, 2016

### lavoisier

Thank you both very much for your suggestions!
The cost/benefit aspect is indeed something we should consider along with the probabilities. It's very deep analytical thinking that would be very useful to managers / decision-makers.

As for the problem itself, in the meantime I tried a decision tree, and I think I got the answer.
I revised Bayes' theorem a bit, and indeed, as you pointed out, I found I needed to include the base rate, P(T+), and the false positive rate.
Here's a snapshot:

As you can see, the true positive rate is quite bad, but the false positive rate is not too high (at least by our standards).
I had to make assumptions, e.g. that A and B are independent, i.e. that the fact that a target was tested in A doesn't matter to the outcome of B. I also assumed equal TP and FP rates for A and B (which is probably not too far from the truth).
If my calculations are correct, it turns out that the base rate doesn't affect the original question of whether P(T+ | (A+ ∧ B+)) > P(T+ | A+).
Here's why:
P(T+ | (A+ ∧ B+)) > P(T+ | A+)
P(T+)⋅P(A+|T+)⋅P(B+|T+) / [P(T+)⋅P(A+|T+)⋅P(B+|T+)+P(T-)⋅P(A+|T-)⋅P(B+|T-)] > P(T+)⋅P(A+|T+) / [P(T+)⋅P(A+|T+)+P(T-)⋅P(A+|T-)]
As:
P(A+|T+)=P(B+|T+)=TPR
P(A+|T-)=P(B+|T-)=FPR
the above yields:
P(T+)⋅TPR² / [P(T+)⋅TPR²+P(T-)⋅FPR²] > P(T+)⋅TPR / [P(T+)⋅TPR+P(T-)⋅FPR]
and assuming P(T+)>0, P(T-)>0 and TPR>0 (taking the reciprocal of both sides, which reverses the inequality):
1 + P(T-)/P(T+)⋅FPR/TPR > 1 + P(T-)/P(T+)⋅FPR²/TPR²
FPR/TPR > FPR²/TPR²
So, assuming FPR>0:
TPR > FPR

It would seem that the risk of progressing an inactive target is reduced by running both A and B rather than A alone if the true positive rate is greater than the false positive rate (under the above assumptions and conditions).
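A quick numerical check of this condition, as a sketch under the same assumptions (identical tests, conditionally independent given the target's true state). The `posterior` helper and the grid of rates and base rates are hypothetical choices made for illustration.

```python
def posterior(prior, tpr, fpr, n_positive):
    """P(T+ | n_positive positive results) for identical, conditionally independent tests."""
    active = prior * tpr ** n_positive
    inactive = (1.0 - prior) * fpr ** n_positive
    return active / (active + inactive)

rates = [i / 10 for i in range(1, 10)]   # 0.1 ... 0.9
priors = [0.01, 0.05, 0.5]
mismatches = []
for prior in priors:
    for i, tpr in enumerate(rates):
        for j, fpr in enumerate(rates):
            if i == j:
                continue  # TPR == FPR: the test carries no information
            second_test_helps = posterior(prior, tpr, fpr, 2) > posterior(prior, tpr, fpr, 1)
            if second_test_helps != (tpr > fpr):
                mismatches.append((prior, tpr, fpr))

print("cases contradicting 'helps iff TPR > FPR':", len(mismatches))
```

On this grid the second positive raises the probability of being active exactly when TPR > FPR, whatever the base rate.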
Do you think this is a correct conclusion? It looks similar to Orodruin's one, although his is more to the point of my original question (i.e. if it's worth running B after A).

It's always amazing to me as a non-expert how counterintuitive statistics can be.
For instance, the FPR seems to affect our ability to detect true positives much more than the TPR.
With the same data as above, even if I set TPR to 100%, the probability that a target is really active given that both A and B are positive is only 59%. I would have expected FPR to affect more what happens to inactive targets, not to active ones.
On the other hand, if I decrease the FPR to 1%, the probability of correctly spotting an active target rockets to 75% for A only and 92% for A and B, even with such a poor TPR (56%).

Looking back at what I wrote in the original post, I now think I was wrong in assuming that A and B can be correlated in a random way.
So much so that I included a table with the probabilities of the two assays agreeing or disagreeing, as can be calculated from the decision tree.
E.g. P(A+ and B+) = P(T+ and A+ and B+) + P(T- and A+ and B+).
Do you think this is a correct approach?
Then if it is I would have to think a bit harder about what it means, because we do often look at the correlation between different tests, and I'm not sure we analyse in appropriate detail what they tell us.
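One way to build such a table, sketched with a hypothetical `joint_table` helper. The base rate of 5% and TPR of 56% are the figures quoted in this thread; the FPR of 19% is an assumed value, and A and B are taken to be identical and conditionally independent given the target's state.

```python
from itertools import product

def joint_table(prior, tpr, fpr):
    """Map each outcome pair, e.g. '+-' for A+ and B-, to (P(outcome), P(T+ | outcome))."""
    table = {}
    for a, b in product("+-", repeat=2):
        p_active = prior
        p_inactive = 1.0 - prior
        for outcome in (a, b):
            p_active *= tpr if outcome == "+" else 1.0 - tpr
            p_inactive *= fpr if outcome == "+" else 1.0 - fpr
        # P(A, B) = P(T+ and A and B) + P(T- and A and B)
        p_outcome = p_active + p_inactive
        table[a + b] = (p_outcome, p_active / p_outcome)
    return table

table = joint_table(prior=0.05, tpr=0.56, fpr=0.19)
for outcome, (p_outcome, p_active_given) in table.items():
    print(f"A{outcome[0]} B{outcome[1]}: P = {p_outcome:.4f}, P(T+ | outcome) = {p_active_given:.4f}")

# Marginal P(A+), to compare P(A+ and B+) with P(A+) * P(B+).
p_a_pos = table["++"][0] + table["+-"][0]
print(f"P(A+ and B+) = {table['++'][0]:.4f}  vs  P(A+)*P(B+) = {p_a_pos ** 2:.4f}")
```

With these inputs P(A+ and B+) comes out larger than P(A+)⋅P(B+): the two tests look correlated across the whole set of targets even though they are independent once the target's true state is fixed, because both respond to the same underlying activity.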

The other concern I have is that in fact P(T+) is a wild guess. I don't think we can ever know a priori what percentage of the tested targets should be active. So the absolute values of the probabilities we calculate are probably rather wild guesses, too. It doesn't seem to affect the original question, as discussed above, but may heavily affect cost/benefit calculations. Or am I wrong?
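A small sketch of how much the guess for P(T+) matters, using the odds form of Bayes' theorem. The helper `posterior_from_odds`, the likelihood ratio TPR/FPR = 0.56/0.19 and the list of base rates are assumptions chosen for illustration, with the tests again taken as identical and conditionally independent.

```python
def posterior_from_odds(prior, likelihood_ratio, n_positive):
    """Posterior P(T+) after n_positive positives, via odds = prior odds * LR**n."""
    prior_odds = prior / (1.0 - prior)
    post_odds = prior_odds * likelihood_ratio ** n_positive
    return post_odds / (1.0 + post_odds)

LR = 0.56 / 0.19  # assumed TPR / FPR of each test
rows = []
for prior in (0.01, 0.05, 0.20):
    after_a = posterior_from_odds(prior, LR, 1)
    after_ab = posterior_from_odds(prior, LR, 2)
    rows.append((prior, after_a, after_ab))
    print(f"P(T+) = {prior:.2f}: after A+ {after_a:.3f}, after A+ and B+ {after_ab:.3f}")
```

Each positive multiplies the odds by the same factor regardless of the base rate, so the direction of the inequality is unaffected. The absolute probabilities, which are what a cost/benefit calculation would use, shift considerably with the guess for P(T+).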

Thank you!
L

7. Jan 6, 2016

### Staff: Mentor

This is at the heart of the single most common error in probabilistic reasoning. People get this wrong in applications from screening job applicants for drugs to doctors explaining test results to patients.

This is because the underlying population is mostly negative, so the TPR is only relevant for 5% of the population. For the other 95%, which are negative, what matters is the FPR. That overwhelms the impact of the TPR.
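The effect is easy to see by counting expected positives per 1000 targets for a single test. In this sketch the 5% base rate, the 56% TPR and the 1% FPR are the figures quoted in the thread, while the 19% FPR and the `expected_counts` helper are assumptions made for illustration.

```python
def expected_counts(n_targets, prior, tpr, fpr):
    """Expected true positives, false positives and P(T+ | positive) for one test."""
    n_active = n_targets * prior
    n_inactive = n_targets - n_active
    true_pos = n_active * tpr
    false_pos = n_inactive * fpr
    return true_pos, false_pos, true_pos / (true_pos + false_pos)

scenarios = {
    "TPR 56%,  FPR 1%": (0.56, 0.01),
    "TPR 100%, FPR 19%": (1.00, 0.19),   # assumed FPR, perfect sensitivity
}
for label, (tpr, fpr) in scenarios.items():
    tp, fp, ppv = expected_counts(1000, 0.05, tpr, fpr)
    print(f"{label}: {tp:.1f} true positives, {fp:.1f} false positives, P(T+ | positive) = {ppv:.2f}")
```

With only 50 active targets per 1000, even a perfect TPR yields at most 50 true positives, whereas every percentage point of FPR adds about 9.5 false positives from the 950 inactive ones.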

Bayesian statistics makes this kind of reasoning easier by requiring you to think about your prior information, but it is not an easy way to think.

8. Jan 8, 2016

### h6ss

Yes! A commonly used example in introductory Bayesian statistics is the rare disease example. If a disease is very rare, then your probability of really having the disease given a positive test result remains low. Suppose a disease that affects 0.1% of the population is diagnosed with 95% accuracy: your probability of really having the disease, given a positive diagnosis, is only about 2%. It is indeed a very counter-intuitive result.
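The arithmetic behind that figure, assuming "95% accuracy" means both a 95% true positive rate and a 5% false positive rate (D denotes having the disease, + a positive test):

```latex
P(D \mid +) = \frac{P(+ \mid D)\,P(D)}{P(+ \mid D)\,P(D) + P(+ \mid \neg D)\,P(\neg D)}
            = \frac{0.95 \times 0.001}{0.95 \times 0.001 + 0.05 \times 0.999}
            = \frac{0.00095}{0.0509} \approx 0.019
```

Of every 100,000 people tested, about 95 of the 100 who have the disease test positive, but so do about 4,995 of the 99,900 who do not.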