About this Author

Derek Lowe, an Arkansan by birth, got his BA from Hendrix College and his PhD in organic chemistry from Duke before spending time in Germany on a Humboldt Fellowship on his post-doc. He's worked for several major pharmaceutical companies since 1989 on drug discovery projects against schizophrenia, Alzheimer's, diabetes, osteoporosis and other diseases. To contact Derek email him directly: Twitter: Dereklowe

In the Pipeline

August 21, 2012

Four Billion Compounds At a Time

Posted by Derek

This paper from GlaxoSmithKline uses a technology that I find very interesting, but it's one that I still have many questions about. It's applied in this case to ADAMTS-5, a metalloprotease enzyme, but I'm not going to talk about the target at all; I'd rather focus on the techniques used to screen it. The paper's acronym for it is ELT, Encoded Library Technology, but that "E" could just as well stand for "Enormous".

That's because they screened a four billion member library against the enzyme. That is many times the number of discrete chemical species that have been described in the entire scientific literature, in case you're wondering. This is done, as some of you may have already guessed, by DNA encoding. There's really no other way; no one has a multibillion-member library formatted in screening plates and ready to go.

So what's DNA encoding? What you do, roughly, is produce a combinatorial diversity set of compounds while they're attached to a length of DNA. Each synthetic step along the way is marked by adding another DNA sequence to the tag, so (in theory) every compound in the collection ends up with a unique oligonucleotide "bar code" attached to it. You screen this collection, narrow down which compound (or compounds) are hits, and then use PCR and sequencing to figure out what their structures must have been.
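
To make the bar-code idea concrete, here is a toy sketch in Python. It is not the scheme from the GSK paper: the building-block names, the four-base tags, and the three-cycle layout are all made up for illustration, and real tags are longer and carry extra sequence for PCR. The point is only that if every cycle appends one tag from a known set, the final sequence can be read back into the list of building blocks that must have been used.

```python
from itertools import product

# Hypothetical building blocks for three synthesis cycles, each paired
# with an invented DNA tag. None of these names or tags come from the paper.
CYCLES = [
    {"amine_A": "ACGT", "amine_B": "TGCA"},
    {"acid_A": "GGAA", "acid_B": "CCTT", "acid_C": "AGAG"},
    {"aryl_A": "TTGG", "aryl_B": "CACA"},
]
TAG_LENGTH = 4  # every toy tag has the same length, so decoding is a fixed slice


def build_library(cycles):
    """Return {barcode: building blocks} for every combination of cycles."""
    library = {}
    for combo in product(*(cycle.items() for cycle in cycles)):
        names = tuple(name for name, _ in combo)
        barcode = "".join(tag for _, tag in combo)
        library[barcode] = names
    return library


def decode(barcode, cycles, tag_length=TAG_LENGTH):
    """Recover the building blocks from a sequenced barcode."""
    names = []
    for i, cycle in enumerate(cycles):
        tag = barcode[i * tag_length:(i + 1) * tag_length]
        tag_to_name = {t: n for n, t in cycle.items()}
        names.append(tag_to_name[tag])
    return tuple(names)


library = build_library(CYCLES)
print(len(library))                    # 2 * 3 * 2 combinations
print(decode("TGCAAGAGTTGG", CYCLES))  # the blocks behind one barcode
```

The library size is just the product of the number of choices at each cycle, which is how a handful of steps with many building blocks each can reach into the billions.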

As you can see, the only way this can work is through the magic of molecular biology. There are so many enzymatic methods for manipulating DNA sequences, and they work so well compared with standard organic chemistry, that ridiculously small amounts of DNA can be detected, amplified, sequenced, and worked with. And that's what lets you make a billion member library; none of the components can be present in very much quantity (!)

This particular library comes off of a 1,3,5-triazine, which is not exactly the most cutting-edge chemical scaffold out there (I well recall people making collections of such things back in about 1992). But here's where one of the big questions comes up: what if you have four billion of the things? What sort of low hit rate can you not overcome by that kind of brute force? My thought whenever I see these gigantic encoded libraries is that the whole field might as well be called "Return of Combichem: This Time It Works", and that's what I'd like to know: does it?
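
A rough back-of-the-envelope sketch of both points (tiny quantities, brute-force hit rates) is below. Only the four billion figure comes from the post; the one nanomole of total library and the hit rates are assumptions picked for illustration, and the calculation pretends every member is present in equal amount, which is unlikely to hold in a real synthesis.

```python
LIBRARY_SIZE = 4_000_000_000  # four billion members, as stated in the post
AVOGADRO = 6.02214076e23      # molecules per mole


def copies_per_member(total_moles, library_size=LIBRARY_SIZE):
    """Average number of molecules of each compound, assuming an equimolar mix."""
    return total_moles * AVOGADRO / library_size


def expected_hits(hit_rate, library_size=LIBRARY_SIZE):
    """Expected number of hits if each member is a hit with probability hit_rate."""
    return hit_rate * library_size


# Assumption: 1 nanomole of total library material goes into the screen.
print(f"copies of each compound: {copies_per_member(1e-9):.3g}")

# Assumption: a range of hypothetical hit rates, from poor to dismal.
for rate in (1e-6, 1e-8, 1e-9):
    print(f"hit rate {rate:g}: about {expected_hits(rate):.0f} expected hits")
```

Under those assumptions each compound is present at something on the order of a hundred thousand molecules, and even a one-in-a-billion hit rate still leaves a few hits to chase.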

There are other questions. I've always wondered about the behavior of these tagged molecules in screening assays, since I picture the organic molecule itself as about the size of a window air conditioner poking out from the side of a two-story house of DNA. It seems strange to me that these beasts can interact with protein targets in ways that can be reliably reproduced once the huge wad of DNA is no longer present, but I've been assured by several people that this is indeed the case.

In this example, two particular lineages of compounds stood out as hits, which makes you much happier than a collection of random singletons. When the team prepared a selection of these as off-DNA "real organic compounds", many of them were indeed nanomolar hits, although a few dropped out. Interestingly, none of the compounds had the sorts of zinc-binding groups that you'd expect against the metalloprotease target. The rest of the paper is a more traditional SAR exploration of these, leading to what one has to infer are more tool/target validation compounds rather than drug candidates per se.

I know that GSK has been doing this sort of thing for a while, and from the looks of it, this work itself was done a while ago. For one thing, it's in J. Med. Chem., which is not where anything hot off the lab bench appears. For another, several of the authors of the paper appear with "Present Address" footnotes, so there has been time for a number of people on this project to have moved on completely. And that brings up the last set of questions, for now: has this been a worthwhile effort for GSK? Are they still doing it? Are we just seeing the tip of a large and interesting iceberg, or are we seeing the best that they've been able to do? That's the drug industry for you; you never know how many cards have been turned over, or why.

Comments (24) + TrackBacks (0) | Category: Chemical Biology | Chemical News | Drug Assays | Drug Industry History


1. alig on August 21, 2012 11:39 AM writes...

That is Praecis technology which GSK bought in 2007, so I doubt that the program is older than that.

2. Derek Freyberg on August 21, 2012 12:04 PM writes...

And the other question is "can you achieve useful diversity using this method?" It's all very well and interesting to have four billion compounds, but if they're all of the "methyl, ethyl, propyl, butyl, futile" type around a very limited range of cores - and even as someone who left the bench long ago, I have to believe that there are significant constraints on what you can make if you have to keep the DNA label intact - there may be limited value in the technique.

3. Curious Wavefunction on August 21, 2012 12:15 PM writes...

Since my company does this kind of DNA-programmed library synthesis I am very interested in the paper. I don't see the link though; could you point to it?

About your point regarding the translation of the hits from the DNA-tethered library to the discrete compounds, that's a very legitimate question, but you would be surprised how often it actually works. The times when it doesn't you have to start thinking about what kind of interactions the attached DNA would preclude the target from having with your compound.

The point about diversity is also a good one, and we keep on trying to come up with various ways of maximizing the diversity of our libraries, none of which are definitive since diversity in some sense is really in the eye of the beholder. One rather obvious thing to note is that diversity can come not just from different R groups but even from the same set of R groups at different locations on the final compound.

4. Imaging guy on August 21, 2012 12:31 PM writes...

Could you give the link to the J. Med. Chem. article?

5. Anonymous on August 21, 2012 12:43 PM writes...

Indeed, that tiny molecule attached to all that DNA. It would be nice to show the actual size relationship. I'm thinking they're not going to get very deep into the enzyme [so who needs a zinc-binder?], but are probably looking at some surface interaction and allosteric effect - but quite a potent one. Would be nice to see some crystallography with these.

6. UKPI on August 21, 2012 12:56 PM writes...


@5 I wonder if they have tried finding protein-protein interaction inhibitors with this methodology.

7. exGlaxoid on August 21, 2012 1:13 PM writes...

The article is:

J. Med. Chem., Article ASAP
DOI: 10.1021/jm300449x
Publication Date (Web): August 14, 2012

This is the same type of chemistry series that Mario Geysen's group and Affymax also beat to death at GSK. The main issue is that most of the encoding schemes only work well with simple chemistry, so triazines and peptides are the main libraries ever made. And every time we got a triazine lead, most chemists tossed it out...

Also, previous arrays made in a similar manner suffer from steric effects that tend to favor certain R groups, so there may not be equal amounts of all 4 billion compounds, maybe more like a few million, which is still a lot, but when we looked at smaller arrays, each step had some chemicals which reacted well, and some that did not work well, so not all steps worked well, much like peptide synthesis.

So if your goal is to make triazines, this is great, but otherwise, not so good. Plus the assays for these don't all work so well. Praecis has worked on this for 5+ years, with a huge amount of resource, but I don't know of any real results leading towards the clinic. I would not count on this array "technology" to be much use in medicinal chemistry. It seems much better suited to molecular biology problems.

This is not to say that there are not good uses of solid phase synthesis, parallel synthesis, and high throughput synthetic methods; there are good uses for all of them, only very few Pharma companies seem to find them. They seem to want to apply each new technique to every problem, then abandon them when they don't work for all of them. It takes a lot of knowledge to apply the right solution to each problem, and they laid the people off that had that...

8. JC on August 21, 2012 1:13 PM writes...

4 billion? psssh try 4 trillion.

9. David Formerly Known as a Chemist on August 21, 2012 1:40 PM writes...

Nightmares of combichem starting to come back... How many papers did we see in the late 90s describing such whiz-bang results and then suddenly, the entire field imploded...

10. annonie on August 21, 2012 2:32 PM writes...

There seems to be very little disclosure by GSK in application of the acquired Praecis technology for providing chemical leads which ultimately precluded legitimate clinical development compounds. Seems that such success would be the type of progress that would be shared at meetings, conferences, etc to enhance R&D image toward enhancing the pipeline. Another grand investment, along the lines of Sirtris?

11. Anonymous on August 21, 2012 2:46 PM writes...

Annonie, what do you need chemical leads for when you fired all your chemists? I'm confused.

12. Anonymous on August 21, 2012 3:21 PM writes...

Yes, I can assure you that you just saw the tip of a large and interesting iceberg of ELT.

13. Angry Med Chem Guy on August 21, 2012 4:58 PM writes...

You will never outsmart traditional medicinal chemistry smarts. Only it's rare that pharma hires anyone with a degree in medicinal chemistry.

14. freddy on August 21, 2012 8:43 PM writes...

@7 Ex-Glaxoid:

If you have any former colleagues still at GSK, you should ask them whether all the ELT libraries are triazine and peptides. The ELT group has been focused on developing new chemistries to enable synthesis of more drug-like and lead-like libraries.

There is ample literature precedent for a variety of DNA-compatible chemistries, notably from the Liu group. They've shown alkylation, Wittig, cycloadditions, even cross coupling, which together with acylation represent a major chunk of the reactions listed in those yearly "what do med chemists make?" papers.

15. Chemical Diversity on August 21, 2012 10:43 PM writes...


The details of my life are quite inconsequential ... Very well, where do I begin? My father was a relentlessly self-improving boulangerie owner from Belgium with low-grade narcolepsy and a penchant for buggery. My mother was a 15-year-old French prostitute named Chloe with webbed feet. My father would womanize; he would drink. He would make outrageous claims like he invented the question mark. Sometimes, he would accuse chestnuts of being lazy. The sort of general malaise that only the genius possess and the insane lament ... My childhood was typical: summers in Rangoon ... luge lessons ... In the spring, we'd make meat helmets ... When I was insolent I was placed in a burlap bag and beaten with reeds — pretty standard, really. At the age of 12, I received my first scribe. At the age of 14, a Zoroastrian named Vilmer ritualistically shaved my testicles. There really is nothing like a shorn scrotum — it's breathtaking ... I suggest you try it.

16. Anon on August 22, 2012 3:06 AM writes...

I saw this very structure presented to me at an internal meeting at GSK in 2007. We all dubbed it 'the return of combichem'. They paid something like 60 million pounds for the company, and named it MDR Boston. I remember someone standing up and asking the presenter if he was worried that the compounds were not very drug-like. He had no answer to that. Just said it was med. chem's job to turn hits into leads. But turning a lemon into a peach is never easy.....

17. DrSnowboard on August 22, 2012 7:46 AM writes...

I saw a talk pre-2001 at GSK on how these triazine libraries (unencoded) were the dog's dangly bits as hits. Every single 'lead' they provided had zero PK in animals.
However, a nM binder to a novel target must be better than starting with a 10 uM or above, right?

18. Anonymous on August 22, 2012 9:32 AM writes...

Any new technology is viewed with suspicion in the beginning and look at oligonucleotide therapeutics. With a handful of chemists and biologists involved and reasonably less money spent in the act, the field has advanced quite a bit today. Of course the delivery is still a major hurdle and I am sure if we all spend 1% of our time/money, we will have a working and non-toxic delivery technology for not only oligos but for small molecule drugs as well.

19. noname on August 22, 2012 12:50 PM writes...

You mean like a....pill?

20. MoMo on August 23, 2012 8:00 AM writes...

Nice work GSK! Now do something useful!

Funny, not many GSK scientists at the recent ACS meeting in Philadelphia, they must be busy studying all those compounds!

Now all of you, go out and buy more TUMS to help support them in their quest to make 4 TRILLION compounds next!

And to treat your ulcers from no Pharm jobs.

21. Anonymous on August 23, 2012 2:11 PM writes...

#19 not a pill, but targeted nanoparticle delivery

22. Jon on August 27, 2012 3:19 PM writes...

Ah the joys of huge fishing exercises while wearing a blindfold! Parallel (no pun intended) to the usual problem of uninteresting chemistry is the 'did I actually make the compound I'm screening' question. I guess with a library this size, you're only interested in hits, but wouldn't it be useful to at least know what wasn't active? How is that being addressed by the ELT chaps?

24. Steve Young on March 28, 2013 7:05 AM writes...

The comments about combichem and poor properties may be valid in relation to the huge libraries made quickly to demonstrate the power of this technique; but as is always the case one can control and design in good properties if you take the time. This design approach is being pursued by the most recent implementations of this technology at both established (GSK) and new organisations ( Those who are aware of the recent published data on successes in difficult target classes will realise that this technology has huge potential.
