There's a new paper out in Nature Chemistry called "Quantifying the Chemical Beauty of Drugs". The authors are proposing a new "desirability score" for chemical structures in drug discovery, one that's an amalgam of physical and structural scores. To their credit, they didn't decide up front which of these things should be the most important. Rather, they took eight properties over 770 well-known oral drugs, and set about figuring out how much to weight each of them. (This was done, for the info-geeks among the crowd, by calculating the Shannon entropy for each possibility to maximize the information contained in the final model.) Interestingly, this approach tended to give zero weight to the number of hydrogen-bond acceptors and to the polar surface area, which suggests that those two measurements are already subsumed in the other factors.
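For anyone who wants to see what "weighting each property" looks like in practice, here's a minimal sketch. It assumes that each property has already been turned into a desirability between 0 and 1, and that the combined score is a weighted geometric mean of those values. The property names, desirabilities, and weights below are made up for illustration and are not the paper's fitted numbers.

```python
import math


def weighted_geometric_mean(desirabilities, weights):
    """Combine per-property desirabilities (each in (0, 1]) into one score.

    A property with weight 0 drops out of the result entirely.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("at least one weight must be positive")
    log_sum = sum(
        w * math.log(desirabilities[name])
        for name, w in weights.items()
        if w > 0
    )
    return math.exp(log_sum / total_weight)


# Hypothetical desirabilities for a single compound (illustrative only).
example_desirabilities = {
    "mol_weight": 0.8,
    "logp": 0.5,
    "h_bond_acceptors": 0.2,
    "polar_surface_area": 0.3,
}

# Hypothetical weights; the two zeros mimic properties that get no weight.
example_weights = {
    "mol_weight": 1.0,
    "logp": 1.0,
    "h_bond_acceptors": 0.0,
    "polar_surface_area": 0.0,
}

score = weighted_geometric_mean(example_desirabilities, example_weights)
print(f"combined score: {score:.3f}")
```

With zero weights on the last two properties, their desirabilities can be changed to anything without moving the score, which is what "zero weight" amounts to here.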
And that's all fine, but what does the result give us? Or, more accurately, what does it give us that we haven't had before? After all, there have been a number of such compound-rating schemes proposed before (and the authors, again to their credit, compare their new proposal with the others head-to-head). But I don't see any great advantage. The Lipinski "Rule of 5" is a pretty simple metric - too simple for many tastes - and what this gives you is a Rule of 5 with the pass and fail categories smeared out towards each other to give some continuous overlap. (See the figure below, which is taken from the paper.) That's certainly more in line with the real world, but in that real world, will people be willing to make decisions based on this method, or not?
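To make the "smeared out" idea concrete, here's a small sketch comparing a hard pass/fail cutoff with a smooth one for a single property. The 500 limit is the familiar Rule of 5 molecular weight cutoff. The logistic curve and its width of 50 are my own stand-ins, chosen only for illustration, and are not the functions the paper actually fits.

```python
import math


def hard_cutoff(value, limit):
    """Rule-of-5 style: full marks at or below the limit, nothing above it."""
    return 1.0 if value <= limit else 0.0


def soft_cutoff(value, limit, width):
    """Smooth alternative: a logistic curve centred on the same limit.

    `width` (hypothetical) controls how gradually the score falls off.
    """
    return 1.0 / (1.0 + math.exp((value - limit) / width))


# Two compounds sitting on either side of the molecular weight limit.
for mol_weight in (495.0, 505.0):
    hard = hard_cutoff(mol_weight, 500.0)
    soft = soft_cutoff(mol_weight, 500.0, width=50.0)
    print(f"MW {mol_weight:.0f}: hard = {hard:.1f}, soft = {soft:.3f}")
```

The hard version puts these two nearly identical compounds in opposite bins, while the smooth version scores them almost the same.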
The authors go for a bigger splash with the title of the paper, which refers to an experiment they tried. They had chemists across AstraZeneca's organization assess some 17,000 compounds (200 or so for each chemist) with a "Yes/No" answer to "Would you undertake chemistry on this compound if it were a hit?" Only about 30% of the list got a "Yes" vote, and the most common reason for rejecting the others was "Too complex", followed closely by "Too simple". (That last one really makes me wonder - doesn't AZ have a big fragment-based drug design effort?) Note also that this sort of experiment has been done before.
Applying their model, the mean score for the "Yes" compounds was 0.67 (s.d. 0.16), and the mean score for the "No" compounds was 0.49 (s.d. 0.23), which they say was statistically significant, although that must have been a close call. Overall, I wouldn't say that this test has an especially strong correlation with medicinal chemists' ideas of structural attractiveness, but then, I'm not so sure of the usefulness of those ideas to start with. I think that the two ends of the scale are hard to argue with, but there's a great mass of compounds in the middle that people decide that they like or don't like, without being able to back up those statements with much data. (I'm as guilty as anyone here.)
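One rough way to get a feel for how much those two score distributions overlap is to work from the reported means and standard deviations alone. The sketch below is a back-of-the-envelope calculation, not anything from the paper: it treats both groups as normally distributed (crude, since the scores are bounded between 0 and 1) and, lacking the exact group sizes, averages the two variances without weighting them.

```python
import math
from statistics import NormalDist

# Summary statistics as quoted above.
yes_mean, yes_sd = 0.67, 0.16
no_mean, no_sd = 0.49, 0.23

# Standardized mean difference (Cohen's d), using the unweighted average
# of the two variances because exact group sizes are not assumed here.
average_sd = math.sqrt((yes_sd**2 + no_sd**2) / 2)
cohens_d = (yes_mean - no_mean) / average_sd

# Under the normal assumption, the chance that a randomly picked "Yes"
# compound outscores a randomly picked "No" compound.
difference_sd = math.sqrt(yes_sd**2 + no_sd**2)
prob_yes_higher = NormalDist().cdf((yes_mean - no_mean) / difference_sd)

print(f"Cohen's d: {cohens_d:.2f}")
print(f"P(Yes score > No score): {prob_yes_higher:.2f}")
```

Under those assumptions the standardized difference comes out at around 0.9, and a random "Yes" compound outscores a random "No" compound roughly three times out of four. So the two groups are shifted apart, but a good chunk of the compounds sit where the distributions overlap.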
The last part of the paper tries to extend the model from hit compounds to the targets that they bind to - a druggability assessment. The authors looked through the ChEMBL database, and ranked the various targets by the scores of the ligands that are associated with them. They found that their mean ligand score for all the targets in there is 0.478. For the targets of approved drugs, it's 0.492, and for the orally active ones it's 0.539 - so there seems to be a trend, although the paper doesn't state whether those differences reached statistical significance.
So overall, I find nothing really wrong with this paper, but nothing spectacularly right with it, either. I'd be interested in hearing other calls on it as it gets out into the community. . .