Here's an interesting exercise carried out in the medicinal chemistry departments at J&J. The computational folks took all the molecules in the company's files, and then all the commercially available ones (over five million compounds), minus natural products, which were saved for another effort, and minus the obviously non-druglike stuff (multiple nitro groups, solid hydrocarbons with no functionality, acid chlorides, etc.). They then clustered everything down into (merely!) about 20,000 similarity clusters and asked the chemists to rate each one with an up, down, or neutral vote.
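If you're wondering what that clustering step looks like in practice, here's a minimal sketch using RDKit's Butina clustering over Morgan-fingerprint Tanimoto distances. To be clear, that's one common way to do it, not necessarily what the J&J folks ran, and the 0.6 distance cutoff is a placeholder, not their threshold:

```python
# A sketch of similarity clustering: Morgan fingerprints, Tanimoto
# distances, Butina clustering. One standard recipe, not necessarily
# the one used in the paper; the 0.6 cutoff is a placeholder.
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem
from rdkit.ML.Cluster import Butina

smiles_list = ["CCO", "CCN", "c1ccccc1O", "c1ccccc1N"]  # stand-in for 5M+ compounds
mols = [Chem.MolFromSmiles(s) for s in smiles_list]
fps = [AllChem.GetMorganFingerprintAsBitVect(m, 2, nBits=2048) for m in mols]

# Condensed lower-triangle distance matrix (1 - Tanimoto similarity)
dists = []
for i in range(1, len(fps)):
    sims = DataStructs.BulkTanimotoSimilarity(fps[i], fps[:i])
    dists.extend(1.0 - s for s in sims)

# Each cluster is a tuple of molecule indices; the first is the centroid
clusters = Butina.ClusterData(dists, len(fps), 0.6, isDistData=True)
print(len(clusters), "clusters")
```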
What they found was that the opinions of the med-chem staff seemed to match known drug-like properties very closely. Molecular weights in the 300 to 400 range were most favorably received, while the likelihood of a downvote increased below 250 or above 425 or so. Similar trends held for rotatable bonds, hydrogen bond donors and acceptors, clogP, and other classic physical property descriptors. Even the ones that are hard to eyeball, like polar surface area, fell into line.
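Every one of those descriptors is a one-liner in any cheminformatics toolkit, which is part of what makes the result so striking. A quick RDKit sketch of the profile the voters ended up recapitulating; the molecular weight window is the sweet spot reported above, while the other cutoffs are generic Lipinski/Veber-style values I've filled in for illustration, not numbers from the paper:

```python
# Property profile the crowd vote ended up reproducing. The MW window
# comes from the reported results; the remaining cutoffs are standard
# rule-of-thumb values, not the paper's.
from rdkit import Chem
from rdkit.Chem import Descriptors

def property_profile(smiles: str) -> dict:
    m = Chem.MolFromSmiles(smiles)
    return {
        "MW":    Descriptors.MolWt(m),
        "clogP": Descriptors.MolLogP(m),
        "HBD":   Descriptors.NumHDonors(m),
        "HBA":   Descriptors.NumHAcceptors(m),
        "RotB":  Descriptors.NumRotatableBonds(m),
        "TPSA":  Descriptors.TPSA(m),
    }

def looks_votable(p: dict) -> bool:
    # MW window per the reported voting behavior; everything else
    # is a generic Lipinski/Veber-style guess for illustration.
    return (250 <= p["MW"] <= 425 and p["clogP"] <= 5
            and p["HBD"] <= 5 and p["HBA"] <= 10
            and p["RotB"] <= 10 and p["TPSA"] <= 140)

print(looks_votable(property_profile("CC(=O)Oc1ccccc1C(=O)O")))  # aspirin
```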
It's worth asking whether that's a good thing, a bad thing, or nothing surprising at all. The authors themselves waffle a bit on that point:
The results of our experiment are fully consistent with prior literature on what confers drug- or lead-like characteristics to a chemical substance. Whether the strategy will yield the desired results in the long term with respect to quality, novelty, and number of hits/leads remains to be seen. It is also unclear whether this strategy can lead to sufficient differentiation from a competitive standpoint. In the meantime, the only undeniable benefits we can point to are that we harnessed our chemists’ opinions to select lead-like molecules that are totally within reasonable property ranges, that fill diversity holes, and that have been purchased for screening, and that we did so in a way that promoted greater transparency, greater awareness, greater collaboration, and a renewed sense of involvement and engagement of our employees.
I'll certainly give them the diversity-of-the-screening-deck point. But I'm not so sure about that renewed sense of involvement stuff. Apparently 145 chemists participated in total (the effort was open to everyone), but no mention is made of what fraction of the total staff that might be. People were advised to try to vote on at least 2,000 clusters (!), but fewer than half the participants even made it that far. Ten people made it halfway through the lot, and six lunatics actually voted on every single one of the 22,015 clusters, which makes me think that they had way too much time on their hands and/or have interesting and unusual personality features. A colleague's reaction to that figure was "Wow, they'll have to track those people down," to which my uncharitable reply was "Yeah, with a net."
So while this paper is interesting to read, I can't say that I would have been all that happy participating in it (although I've certainly had smaller-scale experiences of this type). And I'd like to know what the authors thought when they finally assembled all the votes and realized that they'd recapitulated a set of filters that they could have run in a few seconds, since those filters are surely already built into their software. And we should all reflect on how thoroughly we seem to have incorporated Lipinski's rules into our own software, between our ears. On balance, it's probably a good thing, but it's not without a price.