About this Author
Derek Lowe, an Arkansan by birth, got his BA from Hendrix College and his PhD in organic chemistry from Duke before spending time in Germany on a Humboldt Fellowship on his post-doc. He's worked for several major pharmaceutical companies since 1989 on drug discovery projects against schizophrenia, Alzheimer's, diabetes, osteoporosis and other diseases. Twitter: Dereklowe


In the Pipeline


January 10, 2013

Automated Ligand Design?


Posted by Derek

There's a paper out in Nature with the provocative title of "Automated Design of Ligands to Polypharmacological Profiles". Admittedly, to someone outside my own field of medicinal chemistry, that probably sounds about as dry as the Atacama desert, but it got my attention.

It's a large multi-center contribution, but the principal authors are Andrew Hopkins at Dundee and Bryan Roth at UNC-Chapel Hill. Using James Black's principle that the best place to find a new drug is to start with an old drug, what they're doing here is taking known ligands and running them through a machine-learning process to see if they can introduce new activities into them. Now, those of us who spend time trying to take out other activities might wonder what good this is, but there are some good reasons: for one thing, many CNS agents are polypharmacological to start with. And there certainly are situations where you want dual-acting compounds, CNS or not, which can be a major challenge. And read on - you can run things to get selectivity, too.

So how well does their technique work? The example they give starts with the cholinesterase inhibitor donepezil (sold as Aricept), which has a perfectly reasonable med-chem look to its structure. The groups' prediction, using their current models, was that it had a reasonable chance of having D4 dopaminergic activity, but probably not D2 (predictions that were borne out by experiment, and might have something to do with whatever activity Aricept has for Alzheimer's). I'll let them describe the process:

We tested our method by evolving the structure of donepezil with the dual objectives of improving D2 activity and achieving blood–brain barrier penetration. In our approach the desired multi-objective profile is defined a priori and then expressed as a point in multi-dimensional space termed ‘the ideal achievement point’. In this first example the objectives were simply defined as two target properties and therefore the space has two dimensions. Each dimension is defined by a Bayesian score for the predicted activity and a combined score that describes the absorption, distribution, metabolism and excretion (ADME) properties suitable for blood–brain barrier penetration (D2 score = 100, ADME score = 50). We then generated alternative chemical structures by a set of structural transformations using donepezil as the starting structure. The population was subsequently enumerated by applying a set of transformations to the parent compound(s) of each generation. In contrast to rules-based or synthetic-reaction-based approaches for generating chemical structures, we used a knowledge-based approach by mining the medicinal chemistry literature. By deriving structural transformations from medicinal chemistry, we attempted to mimic the creative design process.

Hmm. They rank these compounds in multi-dimensional space, according to distance from the ideal end point, filter them for chemical novelty, Lipinski criteria, etc., and then use the best structures as starting points for another round. This continues until you reach close enough to the desired point, or until you dead-end on improvement. In this case, they ended up with fairly active D2 compounds, by going to a lactam in the five-membered ring, lengthening the chain a bit, and going to an arylpiperazine on the end. They also predicted, though, that these compounds would hit a number of other targets, which they indeed did on testing.
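For readers who think better in code, the loop described above can be sketched roughly as follows. This is only a toy under stated assumptions, not the authors' implementation: the `Candidate` class, `toy_transform`, `passes_filters`, and all the starting scores are hypothetical stand-ins. In the paper the scores come from Bayesian activity models and the transformations are mined from the medicinal chemistry literature. Only the ideal point (D2 score = 100, ADME score = 50) is taken from the quoted passage.

```python
import math
from dataclasses import dataclass

# Ideal achievement point from the quoted passage: D2 score = 100, ADME score = 50.
IDEAL = (100.0, 50.0)


@dataclass(frozen=True)
class Candidate:
    """Hypothetical stand-in for a compound: a label plus (activity, ADME) scores."""
    name: str
    scores: tuple


def distance_to_ideal(candidate, ideal=IDEAL):
    # Euclidean distance in score space; smaller means closer to the desired profile.
    return math.dist(candidate.scores, ideal)


def passes_filters(candidate):
    # Placeholder for the novelty / Lipinski-style filters; the toy accepts everything.
    return True


def toy_transform(candidate):
    # Hypothetical "structural transformations": each one nudges a single score.
    d2, adme = candidate.scores
    return [
        Candidate(candidate.name + "a", (d2 + 20.0, adme)),
        Candidate(candidate.name + "b", (d2, adme + 10.0)),
    ]


def evolve(parents, transform, generations=20, keep=2, tolerance=1.0):
    """Enumerate children, filter, rank by distance, and carry the best forward."""
    best = sorted(parents, key=distance_to_ideal)[:keep]
    for _ in range(generations):
        children = [child for parent in best for child in transform(parent)]
        pool = [c for c in children if passes_filters(c)] + best
        ranked = sorted(pool, key=distance_to_ideal)[:keep]
        if distance_to_ideal(ranked[0]) <= tolerance:
            return ranked  # close enough to the ideal point
        if distance_to_ideal(ranked[0]) >= distance_to_ideal(best[0]):
            return best  # dead end: no child improved on its parents
        best = ranked
    return best


start = Candidate("1", (20.0, 20.0))  # made-up starting scores
result = evolve([start], toy_transform)
```

In this toy the search walks steadily toward the ideal point because the fake transformations always help along one axis. The real process has no such guarantee, which is why the dead-end check matters.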

How about something a bit more. . .targeted? They tried taking these new compounds through another design loop, this time trying to get rid of all the alpha-adrenergic activity they'd picked up, while maintaining the 5-HT1A and dopamine receptor activity they now had. They tried it both ways - running the algorithms with filtration of the alpha-active compounds at each stage, and without. Interestingly, both optimizations came up with very similar compounds, differing only out on the arylpiperazine end. The alpha-active series wanted ortho-methoxyphenyl on the piperazine, while the alpha-inactive series wanted 2-pyridyl. These preferences were confirmed by experiment as well. Some of you who've worked on adrenergics might be saying "Well, yeah, that's what the receptors are already known to prefer, so what's the news here?" But keep in mind, what the receptors are known to prefer is what's been programmed into this process, so of course, that's what it's going to recapitulate. The idea is for the program to keep track of all the known activities - the huge potential SAR spreadsheet - so you don't have to try to do it yourself, with your own grey matter.

The last example asks whether, starting from donepezil, potent and selective D4 compounds could be evolved. I'm going to reproduce the figure from the paper here, to give an idea of the synthetic transformations involved:

So, donepezil (compound 1) is 614 nM against D4, and after a few rounds of optimization, you get structure 13, which is 9 nM. Not bad! Then if you take 13 as a starting point, and select for structural novelty along the way, you get 18 (five micromolar against D4), 20, 21, and (S)-27 (which is 90 nM at D4). All of these compounds picked up a great deal more selectivity for D4 compared to the earlier donepezil-derived scaffolds as well.
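To put those numbers in proportion, here is a quick back-of-the-envelope calculation using only the D4 values quoted above (614 nM for donepezil, 9 nM for compound 13). The function names are just illustrative, and I'm treating the figures as directly comparable potencies, which is an assumption on my part.

```python
import math


def fold_improvement(start_nm, end_nm):
    # How many times more potent the end compound is (lower nM = more potent).
    return start_nm / end_nm


def log_unit_gain(start_nm, end_nm):
    # The same improvement expressed in log10 units, as med-chem people often quote it.
    return math.log10(start_nm / end_nm)


donepezil_d4_nm = 614.0   # compound 1, from the post
compound_13_d4_nm = 9.0   # compound 13, from the post

fold = fold_improvement(donepezil_d4_nm, compound_13_d4_nm)
gain = log_unit_gain(donepezil_d4_nm, compound_13_d4_nm)
print(f"{fold:.0f}-fold more potent, about {gain:.1f} log units")
```

That works out to roughly a 68-fold gain, a bit under two log units, from a handful of design rounds.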

Well, then, are we all out of what jobs we have left? Not just yet. You'll note that the group picked GPCRs as a field to work in, partly because there's a tremendous amount known about their SAR preferences and cross-functional selectivities. And even so, of the 800 predictions made in the course of this work, the authors claim about a 75% success rate - pretty impressive, but not the All-Seeing Eye, quite yet. I'd be quite interested in seeing these algorithms tried out on kinase inhibitors, another area with a wealth of such data. But if you're dwelling among the untrodden ways, like Wordsworth's Lucy, then you're pretty much on your own, I'd say, unless you're looking to add in some activity in one of the more well-worked-out classes.

But knowledge piles up, doesn't it? This approach is the sort of thing that will not be going away, and should be getting more powerful and useful as time goes on. I have no trouble picturing an eventual future where such algorithms do a lot of the grunt work of drug discovery, but I don't foresee that happening for a while yet. Unless, of course, you do GPCR ligand drug discovery. In that case, I'd be contacting the authors of this paper as soon as possible, because this looks like something you need to be aware of.

Comments (12) + TrackBacks (0) | Category: Drug Assays | In Silico | The Central Nervous System


1. petros on January 10, 2013 11:59 AM writes...

Well it's rare for med chem to make Nature.

Having heard Andrew Hopkins talk on this a couple of times, it certainly sounds promising. It would be nice to see how well it works with non-monoamine GPCRs as well


2. Ricky Connolly on January 10, 2013 12:07 PM writes...

Holey moley, check out that supplementary file. One hundred and fifty-eight pages!


3. simpl on January 10, 2013 12:36 PM writes...

Once you have bacteria producing your molecules, you could get a similar effect by changing their living conditions;)


4. David Formerly Known as a Chemist on January 10, 2013 12:43 PM writes...

This looks like an extremely useful tool. I suspect this could be very valuable in brainstorming and selecting chemotypes to pursue. It by no means replaces the few medicinal chemists left on this continent, but can hopefully make them more productive. We'll see.


5. A Non Mousse on January 10, 2013 12:50 PM writes...

Roth is a world authority on GPCRs. I would listen if he is talking.


6. jbosch on January 10, 2013 5:00 PM writes...

reminds me a lot of this paper:
Lounkine et al. Large-scale prediction and testing of drug activity on side-effect targets. Nature (2012) vol. 486 (7403) pp. 361-7

Brian Shoichet gave a lecture here recently where he showed that you can predict which drugs might have an effect on a completely non-sequence-related GPCR target simply by building a ligand-homology tree.


7. Anonymous BMS Researcher on January 10, 2013 7:40 PM writes...

Kipling published a parody of Wordsworth's poem:

He wandered down the mountain-grade
Beyond the speed assigned —
A youth whom Justice often stayed
And generally fined.

He went alone, and none might know
If he could drive or steer;
Now he is in the ditch, and O!
The differential gear!


8. modeling-101 on January 11, 2013 1:24 PM writes...

this isn't anything new - capable practitioners in industry have leveraged this type of ligand-based design strategy for years, exploiting known overlap between pharmacophores associated with different targets ("chemical genomics"). there's abundant chemical overlap for scaffold-hopping from one target to another, or combining different targets. the linchpin is knowing what target selectivity or combination needs to be dialed-in to achieve desired pharmacology - which takes us full circle to having prior established (clinically validated) targets for therapeutic indications.


9. Old School on January 11, 2013 7:07 PM writes...

Maybe the reported evolution from known drug to lead matter could have been done in significantly less than 800 compounds by traditional judicious iteration, i.e. make changes, see what happens and respond to data vs 4 key parameters (D4 inverse agonism, D2 activity, antitarget activity, CNS penetration), with a bit of in vitro metabolic stability data thrown in for good measure?

The work could have then been written up in more prosaic form and published in JMC or BMCL. What do other commentators think?

Am also intrigued by the 18 month gap between submission (1 April 2011) and approval (19 Oct 2012).


10. postdoc on January 12, 2013 5:56 PM writes...

@Old School
Where did you see 800 compounds? From the paper it is 800 datapoints (compounds x targets)


11. Old School on January 13, 2013 7:08 PM writes...

Okay, when time permits (which it probably won't), I'll work out how many compounds the 800 datapoints correspond to (800/n, where n = no of discrete assays, assuming every cpd tested in every assay).

I'd still like to put forward that synthesis and testing effort comparable to the cerebral, computational and curational effort implicit in the paper and supplementary material could have given 800 judiciously chosen compounds with potential to take polypharmacology to a similar place, and maybe even a better place if the metabolic stability data that is routine in modern drug discovery in the real world was part of the programme of work.

Who knows, maybe the 800 minus 800/n new compounds that automated design didn't predict could have even given data that took the project in unforeseen directions (serendipity)?

Alternatively, the 800 compounds could turn out to be utterly useless - that's research for you. Sorry to be a grump, it's just that the words automated and design in one title are hard to bear (at least "predictive" wasn't there too).

Seriously, I should have said this before and congratulated the authors on getting a medchem paper published in Nature. That's some feat.


12. Healthy on January 14, 2013 11:39 AM writes...

Doh!? When I was practicing computer-driven docking like 6? years ago the approach was already a bit similar. I'm surprised that this is big news now. Basically letting the software do a bunch of iterations based on known structures... In fact protein structure calculation has taken this approach for over a decade.


