There's a paper out in Nature with the provocative title of "Automated Design of Ligands to Polypharmacological Profiles". Admittedly, to someone outside my own field of medicinal chemistry, that probably sounds about as dry as the Atacama desert, but it got my attention.
It's a large multi-center contribution, but the principal authors are Andrew Hopkins at Dundee and Bryan Roth at UNC-Chapel Hill. Using James Black's principle that the best place to find a new drug is to start with an old drug, what they're doing here is taking known ligands and running through a machine-learning process to see if they can introduce new activities into them. Now, those of us who spend time trying to take out other activities might wonder what good this is, but there are some good reasons: for one thing, many CNS agents are polypharmacological to start with. And there certainly are situations where you want dual-acting compounds, CNS or not, which can be a major challenge. And read on - you can run things to get selectivity, too.
So how well does their technique work? The example they give starts with the cholinesterase inhibitor donepezil (sold as Aricept), which has a perfectly reasonable med-chem look to its structure. The groups' prediction, using their current models, was that it had a reasonable chance of having D4 dopaminergic activity, but probably not D2 (both of which were borne out by experiment, and which might have something to do with whatever activity Aricept has for Alzheimer's). I'll let them describe the process:
We tested our method by evolving the structure of donepezil with the dual objectives of improving D2 activity and achieving blood–brain barrier penetration. In our approach the desired multi-objective profile is defined a priori and then expressed as a point in multi-dimensional space termed ‘the ideal achievement point’. In this first example the objectives were simply defined as two target properties and therefore the space has two dimensions. Each dimension is defined by a Bayesian score for the predicted activity and a combined score that describes the absorption, distribution, metabolism and excretion (ADME) properties suitable for blood–brain barrier penetration (D2 score = 100, ADME score = 50). We then generated alternative chemical structures by a set of structural transformations using donepezil as the starting structure. The population was subsequently enumerated by applying a set of transformations to the parent compound(s) of each generation. In contrast to rules-based or synthetic-reaction-based approaches for generating chemical structures, we used a knowledge-based approach by mining the medicinal chemistry literature. By deriving structural transformations from medicinal chemistry, we attempted to mimic the creative design process.
Hmm. They rank these compounds in multi-dimensional space, according to distance from the ideal end point, filter them for chemical novelty, Lipinski criteria, etc., and then use the best structures as starting points for another round. This continues until you reach close enough to the desired point, or until you dead-end on improvement. In this case, they ended up with fairly active D2 compounds, by going to a lactam in the five-membered ring, lengthening the chain a bit, and going to an arylpiperazine on the end. They also predicted, though, that these compounds would hit a number of other targets, which they indeed did on testing.
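For the programmers in the audience, here's roughly what that loop looks like as I read it. This is a minimal sketch, not the authors' code: the `evolve` function, its parameters, and the toy stand-ins below are all my own hypothetical names, and the real scoring (Bayesian activity models, ADME scores) and the literature-mined transformations are replaced here by trivial placeholders on pairs of numbers. Only the ideal point of (100, 50) comes from the quoted passage.

```python
import math

# Ideal achievement point from the quoted passage: (activity score, ADME score).
IDEAL = (100.0, 50.0)


def distance(scores, ideal=IDEAL):
    """Euclidean distance from a compound's scores to the ideal point."""
    return math.dist(scores, ideal)


def evolve(parents, transform, score, passes_filters,
           ideal=IDEAL, keep=5, max_rounds=20, tol=1.0):
    """Rank children by distance to the ideal point, filter, and repeat.

    Stops when a compound is within `tol` of the ideal point, when a round
    fails to improve on the best compound so far (a dead end), or after
    `max_rounds` generations. All arguments are hypothetical stand-ins.
    """
    best = min(parents, key=lambda c: distance(score(c), ideal))
    for _ in range(max_rounds):
        # Enumerate the next generation and apply novelty/Lipinski-style filters.
        children = [child for p in parents for child in transform(p)
                    if passes_filters(child)]
        if not children:
            break
        children.sort(key=lambda c: distance(score(c), ideal))
        if distance(score(children[0]), ideal) >= distance(score(best), ideal):
            break  # no improvement this round: dead end
        best = children[0]
        if distance(score(best), ideal) <= tol:
            break  # close enough to the desired profile
        parents = children[:keep]  # best structures seed the next round
    return best


# Toy stand-ins: a "compound" is just a pair of numbers, not a structure.
def toy_transform(c):
    return [(c[0] + 10, c[1]), (c[0], c[1] + 10), (c[0] - 10, c[1])]


def toy_score(c):
    return (float(c[0]), float(c[1]))


def toy_filter(c):
    return c[0] >= 0 and c[1] >= 0


result = evolve([(20, 10)], toy_transform, toy_score, toy_filter)
print(result)
```

The interesting chemistry all lives in the parts I've stubbed out, of course: what counts as a legal transformation, and how good the activity predictions are.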
How about something a bit more. . .targeted? They tried taking these new compounds through another design loop, this time trying to get rid of all the alpha-adrenergic activity they'd picked up, while maintaining the 5-HT1A and dopamine receptor activity they now had. They tried it both ways - running the algorithms with filtration of the alpha-active compounds at each stage, and without. Interestingly, both optimizations came up with very similar compounds, differing only out on the arylpiperazine end. The alpha-active series wanted ortho-methoxyphenyl on the piperazine, while the alpha-inactive series wanted 2-pyridyl. These preferences were confirmed by experiment as well. Some of you who've worked on adrenergics might be saying "Well, yeah, that's what the receptors are already known to prefer, so what's the news here?" But keep in mind, what the receptors are known to prefer is what's been programmed into this process, so of course, that's what it's going to recapitulate. The idea is for the program to keep track of all the known activities - the huge potential SAR spreadsheet - so you don't have to try to do it yourself, with your own grey matter.
The last example asks whether, starting from donepezil, potent and selective D4 compounds could be evolved. I'm going to reproduce the figure from the paper here, to give an idea of the synthetic transformations involved:
So, donepezil (compound 1) is 614 nM against D4, and after a few rounds of optimization, you get structure 13, which is 9 nM. Not bad! Then if you take 13 as a starting point, and select for structural novelty along the way, you get 18 (five micromolar against D4), 20, 21, and (S)-27 (which is 90 nM at D4). All of these compounds picked up a great deal more selectivity for D4 compared to the earlier donepezil-derived scaffolds as well.
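To put those numbers on a common footing, here's a quick back-of-the-envelope calculation. The helper names are my own, and I'm simply treating the quoted values as concentrations in nM and converting them to the usual negative-log scale; nothing here comes from the paper beyond the three potencies mentioned above.

```python
import math


def fold_change(start_nm, end_nm):
    """How many times more potent the end compound is than the start."""
    return start_nm / end_nm


def p_value(nm):
    """Negative log10 of a concentration given in nM (converted to molar)."""
    return -math.log10(nm * 1e-9)


# Potencies against D4 as quoted above, in nM.
d4_nm = {"1 (donepezil)": 614.0, "13": 9.0, "(S)-27": 90.0}

for name, nm in d4_nm.items():
    fold = fold_change(d4_nm["1 (donepezil)"], nm)
    print(f"{name}: {nm:g} nM, p = {p_value(nm):.2f}, {fold:.1f}-fold vs. donepezil")
```

So going from 1 to 13 is a bit under two log units of improvement, which is a respectable result for a few rounds of optimization by anyone's standards.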
Well, then, are we all out of what jobs we have left? Not just yet. You'll note that the group picked GPCRs as a field to work in, partly because there's a tremendous amount known about their SAR preferences and cross-functional selectivities. And even so, of the 800 predictions made in the course of this work, the authors claim about a 75% success rate - pretty impressive, but not the All-Seeing Eye, quite yet. I'd be quite interested in seeing these algorithms tried out on kinase inhibitors, another area with a wealth of such data. But if you're dwelling among the untrodden ways, like Wordsworth's Lucy, then you're pretty much on your own, I'd say, unless you're looking to add in some activity in one of the more well-worked-out classes.
But knowledge piles up, doesn't it? This approach is the sort of thing that will not be going away, and should be getting more powerful and useful as time goes on. I have no trouble picturing an eventual future where such algorithms do a lot of the grunt work of drug discovery, but I don't foresee that happening for a while yet. Unless, of course, you do GPCR ligand drug discovery. In that case, I'd be contacting the authors of this paper as soon as possible, because this looks like something you need to be aware of.