@Dale : actually, what you describe is what is done in 'simple' HTS practice.
The 'top M' molecules from the initial screen are chosen to go into confirmation.
I think we ended up talking about hit classification because my initial question was how to compare the two assays, and that comes down to each assay's ability to discriminate well between activity and inactivity.
Lacking a true positive population, I suppose the replicate variability and the distribution of the negatives should be the main criteria: whatever threshold we use, a molecule with a given PIN has a higher probability of being categorised correctly if the PIN has a small SD and if it is 'far' from the negative set.
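To make that concrete, here is a rough Python sketch of such a score. It's entirely a toy metric of my own for illustration (close in spirit to a z-score or SSMD): the distance of a compound's mean PIN from the negative-control distribution, in units of the combined spread.

```python
import statistics

def discrimination_score(pin_replicates, negative_pins):
    """Toy metric: how far a compound's mean PIN sits from the
    negative-control distribution, scaled by the combined spread
    (replicate SD and negative-population SD). Higher = more
    confidently classified at any threshold."""
    mean_pin = statistics.mean(pin_replicates)
    sd_pin = statistics.stdev(pin_replicates)
    neg_mean = statistics.mean(negative_pins)
    neg_sd = statistics.stdev(negative_pins)
    # distance from the negatives, in units of combined spread
    return (mean_pin - neg_mean) / ((sd_pin ** 2 + neg_sd ** 2) ** 0.5)
```

A compound with tight replicates far from the negatives scores high; a noisy compound near the negative distribution scores low, matching the intuition above.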
There is however a subtler point, which I mentioned briefly earlier. We're testing molecules, for which various similarity scores can be calculated, and if two molecules are very similar there is a higher chance that they will have similar biological activity. So if the set we screen is not very diverse, the top M molecules may over-represent a certain type of molecule: by simply picking the top M we won't learn more about the active 'scaffolds', as we call them, we'll just get several copies of same-y stuff. To identify active scaffolds (i.e. groups of closely related molecules, usually sharing a central 'core'), it is sometimes better to cluster by similarity first, look at the local cluster hit rates, and select representative molecules from each high-scoring cluster for confirmation. This gives a more diverse confirmation set and increases the chances of obtaining valuable information.
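In case it helps, here is a toy sketch of that cluster-first selection. The clustering itself (e.g. Butina on fingerprints) is assumed done upstream, and the input format plus the hit-rate cut-off are made up for the example:

```python
from collections import defaultdict

def pick_confirmation_set(compounds, min_hit_rate=0.3):
    """Select one representative per 'active' cluster instead of the
    global top M. `compounds` is a list of
    (cpd_id, cluster_id, pin, is_hit) tuples; cluster assignments
    come from an upstream similarity clustering step."""
    clusters = defaultdict(list)
    for cpd_id, cluster_id, pin, is_hit in compounds:
        clusters[cluster_id].append((cpd_id, pin, is_hit))
    picks = []
    for members in clusters.values():
        hit_rate = sum(h for _, _, h in members) / len(members)
        if hit_rate >= min_hit_rate:
            # take the strongest member as the cluster's representative
            picks.append(max(members, key=lambda m: m[1])[0])
    return picks
```

The point is that a cluster where many members are hits is informative even if none of its members would individually make the global top M.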
@mfb :
Let's call A the more expensive assay, B the less expensive one.
We can't run either assay on the whole set before the 5K cpds have been run; that's what they call 'validation', and it hasn't been done yet.
Even if that weren't the case, I don't know if we should use B to select molecules for A before we know whether the two assays tell us the same thing about the molecules.
Suppose for instance that by some crazy effect the two assays give us very poorly correlated results (so there is no good linear or whatever function relating A and B), and from the outcome of the positive controls we are more confident that A is 'telling the truth'.
Then if we pre-screened all 100K compounds in B first, we would get a confirmation set that is much poorer in true actives (according to A), thus drastically reducing our chances of success.
I may be wrong, but for me it's important to know first how the two assays compare; if B really doesn't make sense and/or has a much larger replicate SD than A, we give up on saving money and run A.
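For the "how do the assays compare" part: once we have matched A/B measurements on the same compounds (the positive controls, or the 5K validation set when it's done), even a quick correlation would be a start. A minimal stdlib sketch, Pearson only (a rank correlation would also catch monotone but non-linear agreement):

```python
def pearson_r(a_scores, b_scores):
    """Pearson correlation between assay A and assay B readouts on
    the same compounds. Near +1 = good linear agreement; near 0 =
    the assays are telling us different things."""
    n = len(a_scores)
    mean_a = sum(a_scores) / n
    mean_b = sum(b_scores) / n
    cov = sum((a - mean_a) * (b - mean_b)
              for a, b in zip(a_scores, b_scores))
    var_a = sum((a - mean_a) ** 2 for a in a_scores)
    var_b = sum((b - mean_b) ** 2 for b in b_scores)
    return cov / (var_a * var_b) ** 0.5
```

If r comes out low and the positive controls say A is the trustworthy one, that's exactly the "crazy effect" scenario above where pre-screening in B would hurt us.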
In the long run it would cost us more to progress false positives and lose false negatives because of a bad assay than to run the more expensive assay upfront.