I spoke yesterday about going through lists of chemical structures, looking for ones that we might want to keep in our screening libraries and, simultaneously, for the ones that we never want to see again. There's a paper from last year in the Journal of Medicinal Chemistry (47, 4891) that's an embarrassing reminder of just how hard it is to do that consistently.
It's from an effort led by Michael Lajiness at what was then Pharmacia/Upjohn (and is now Pfizer, which might account for the lead author now being at Eli Lilly, if you can follow all that). They had about 22,000 compounds to sort through to see if they wanted to purchase them for the screening files, so they broke them out into 11 lists of 2000 compounds each. Thirteen medicinal chemists volunteered (or were volunteered, unless I miss my guess) to go over these lists. Eight members of the team reviewed two separate lists, and one wild man reviewed three.
The authors of the paper took a look at the list of rejected compounds from each reviewer, correctly (in my view) believing that this list is more significant than the list of what was accepted. After all, an ugly structure that makes it through may well never hit in an assay, and if it does it'll go through many more layers of scrutiny. A structure that's rejected, though, disappears from the company's screening universe forever. False negatives could have bigger consequences than false positives.
So, when more than one chemist went over the same list of 2000 compounds, how similar were their reject lists? Not very! On average, two medicinal chemists agreed on rejecting the same compounds only about 23% of the time. (I knew that the overlap wasn't going to be perfect, but that's a lot worse than I was expecting.)
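For anyone who wants to try this on their own compound reviews, here's a minimal sketch of one way to put a number on "agreement" between reject lists. I'm assuming a Jaccard-style overlap (compounds rejected by both reviewers, divided by compounds rejected by either); the paper may well define its figure differently. The reviewer names and compound IDs are made up for illustration.

```python
from itertools import combinations


def reject_overlap(rejects_a, rejects_b):
    """Jaccard overlap of two reject lists: |A and B| / |A or B|.

    This definition is an assumption for illustration, not necessarily
    the metric used in the paper.
    """
    a, b = set(rejects_a), set(rejects_b)
    union = a | b
    if not union:
        raise ValueError("neither reviewer rejected anything")
    return len(a & b) / len(union)


def mean_pairwise_overlap(rejects_by_reviewer):
    """Average the overlap over every pair of reviewers of the same list."""
    pairs = list(combinations(rejects_by_reviewer.values(), 2))
    if not pairs:
        raise ValueError("need at least two reviewers")
    return sum(reject_overlap(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy data: three reviewers looking at the same list.
rejects = {
    "reviewer_A": {"cpd_001", "cpd_002", "cpd_003", "cpd_004"},
    "reviewer_B": {"cpd_003", "cpd_004", "cpd_005"},
    "reviewer_C": {"cpd_004", "cpd_006"},
}

mean_overlap = mean_pairwise_overlap(rejects)
print(f"mean pairwise reject overlap: {mean_overlap:.0%}")
```

Note that only one compound in the toy data is rejected by all three reviewers, even though each pair shares at least one reject. Pairwise averages like this can look bad and still flatter the real level of consensus.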
To continue the punishment, the lists had each been, without the knowledge of the reviewers, seeded with the same set of 250 compounds, all of which had been rejected by a previous review. The chemist-to-chemist rejection overlap in this smaller set of potential losers was still only 28%. Not as much of an improvement as you'd hope for. . .
And now the whipped topping and chocolate sprinkles: recall that many of the reviewers did more than one list. That means that they got to see that same group of 250 compounds more than once, in the context of different lists. How did the same people do when they saw the exact same compounds? They only rejected them about 50% of the time, it pains me to report.
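The self-consistency check is easy to run on your own reviewers, too. Here's a minimal sketch, assuming you define it as the fraction of seeded compounds a reviewer rejected on the first pass that the same reviewer rejected again on the second. That definition is my assumption (the paper's exact calculation may differ), and the compound IDs are invented.

```python
def repeat_rejection_rate(first_pass_rejects, second_pass_rejects):
    """Fraction of first-pass rejects that were rejected again on the second pass.

    Both arguments are collections of seeded-compound IDs rejected by the
    SAME reviewer in two different lists. The definition is an assumption
    for illustration.
    """
    first, second = set(first_pass_rejects), set(second_pass_rejects)
    if not first:
        raise ValueError("no first-pass rejections to compare against")
    return len(first & second) / len(first)


# Hypothetical toy data: one reviewer, same seeded compounds, two lists.
first_pass = {"seed_01", "seed_02", "seed_03", "seed_04"}
second_pass = {"seed_02", "seed_04", "seed_05"}

rate = repeat_rejection_rate(first_pass, second_pass)
print(f"rejected again on second viewing: {rate:.0%}")
```

This measure is asymmetric: swapping the two passes in the toy data gives two out of three instead of two out of four, so it's worth fixing which pass counts as the baseline before comparing reviewers.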
It looks as if potential drug leads follow the same rule as Tolstoy's families in Anna Karenina: Good compounds are all alike, while bad compounds are each bad in their own way. It seems that the Pharmacia reviewers didn't reject many good structures, but they let varying (and inconsistent) numbers of bad ones through (with no particular correlation to their industrial experience, I should add). The possible reasons advanced for this variation include personal bias, inattention (and I wouldn't minimize that factor, not in a list of 2000 compounds), and a general human inability to sort through large, complex data sets.
And right at the end, the authors allude to a bigger problem: If this is how consistently our med-chem intuition works, how well does it serve us during drug development? In a research project, there are plenty of decisions to be made about what compounds to make, what structural series to emphasize and which ones to set aside. Just how bad at this are we, really? I'm afraid to find out.