This paper from GlaxoSmithKline uses a technology that I find very interesting, but it's one that I still have many questions about. It's applied in this case to ADAMTS-5, a metalloprotease enzyme, but I'm not going to talk about the target at all; I'd rather talk about the techniques used to screen it. The paper's acronym for it is ELT, Encoded Library Technology, but that "E" could just as well stand for "Enormous".
That's because they screened a four-billion-member library against the enzyme. That is many times the number of discrete chemical species that have been described in the entire scientific literature, in case you're wondering. This is done, as some of you may have already guessed, by DNA encoding. There's really no other way; no one has a multibillion-member library formatted in screening plates and ready to go.
So what's DNA encoding? What you do, roughly, is produce a combinatorial diversity set of compounds while they're attached to a length of DNA. Each synthetic step along the way is marked by adding another DNA sequence to the tag, so (in theory) every compound in the collection ends up with a unique oligonucleotide "bar code" attached to it. You screen this collection, enrich for the compound (or compounds) that bind the target, and then use PCR and sequencing to read their bar codes and figure out what their structures must have been.
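For those who think better in code, here's a toy sketch of the bookkeeping behind that bar code idea. Everything in it is made up for illustration: the building-block names, the four-base tags, and the fixed tag length are my assumptions, not anything from the GSK paper, and real libraries use much longer tags and far more building blocks per cycle. The point is only that one tag per synthetic step, concatenated, identifies the whole compound, and that counting sequenced bar codes tells you which compounds were enriched.

```python
from collections import Counter
from itertools import product

TAG_LEN = 4  # hypothetical fixed tag length; real tags are longer

# Hypothetical building blocks for three synthetic cycles, keyed by DNA tag.
CYCLES = [
    {"AACG": "amine_A", "CTGA": "amine_B"},
    {"GGTA": "amine_C", "TCAG": "amine_D"},
    {"ACTT": "amine_E", "CGAT": "amine_F"},
]


def build_library(cycles):
    """Enumerate every combination of building blocks and its bar code."""
    library = {}
    for combo in product(*(cycle.items() for cycle in cycles)):
        barcode = "".join(tag for tag, _ in combo)
        library[barcode] = tuple(name for _, name in combo)
    return library


def decode(barcode, cycles, tag_len=TAG_LEN):
    """Split a sequenced bar code into per-cycle tags and look each one up."""
    tags = [barcode[i:i + tag_len] for i in range(0, len(barcode), tag_len)]
    if len(tags) != len(cycles):
        raise ValueError("bar code length does not match the number of cycles")
    return tuple(cycle[tag] for cycle, tag in zip(cycles, tags))


def top_hits(reads, cycles, n=1):
    """Count sequencing reads and decode the most frequent bar codes."""
    counts = Counter(reads)
    return [(decode(bc, cycles), hits) for bc, hits in counts.most_common(n)]


library = build_library(CYCLES)  # 2 x 2 x 2 = 8 encoded compounds

# Made-up sequencing reads after a selection: one bar code dominates.
reads = ["AACGTCAGCGAT"] * 5 + ["CTGAGGTAACTT"] * 2 + ["AACGGGTAACTT"]
best = top_hits(reads, CYCLES)
print(best)
```

Two building blocks at each of three steps gives only eight compounds here, but the same multiplication is what gets you into the billions: the library size is the product of the number of building blocks used at each cycle.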
As you can see, the only way this can work is through the magic of molecular biology. There are so many enzymatic methods for manipulating DNA sequences, and they work so well compared with standard organic chemistry, that ridiculously small amounts of DNA can be detected, amplified, sequenced, and worked with. And that's what lets you make a billion-member library; none of the components can be present in very much quantity (!).
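Just how little of each compound is there? Here's a back-of-the-envelope sketch. The one-nanomole figure for the total amount of library in a selection is purely my assumption for illustration (the paper's actual quantities aren't given here), so plug in your own number.

```python
AVOGADRO = 6.02214076e23     # molecules per mole (exact SI value)
LIBRARY_MEMBERS = 4e9        # four billion compounds, per the paper
TOTAL_LIBRARY_MOLES = 1e-9   # ASSUMPTION: 1 nmol of total library per selection


def moles_per_member(total_moles, members):
    """Average amount of each compound if the library is evenly distributed."""
    return total_moles / members


def copies_per_member(total_moles, members):
    """Average number of individual molecules of each compound."""
    return moles_per_member(total_moles, members) * AVOGADRO


moles = moles_per_member(TOTAL_LIBRARY_MOLES, LIBRARY_MEMBERS)
copies = copies_per_member(TOTAL_LIBRARY_MOLES, LIBRARY_MEMBERS)
print(f"{moles:.2e} mol of each compound")
print(f"{copies:,.0f} copies of each compound")
```

Under that assumption, each compound is present at a quarter of an attomole, or roughly 150,000 molecules. No conventional assay could work with that, which is why the readout has to go through PCR amplification and sequencing of the tag.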
This particular library comes off of a 1,3,5-triazine, which is not exactly the most cutting-edge chemical scaffold out there (I well recall people making collections of such things back in about 1992). But here's where one of the big questions comes up: what if you have four billion of the things? What sort of low hit rate can you not overcome by that kind of brute force? My thought whenever I see these gigantic encoded libraries is that the whole field might as well be called "Return of Combichem: This Time It Works", and that's what I'd like to know: does it?
There are other questions. I've always wondered about the behavior of these tagged molecules in screening assays, since I picture the organic molecule itself as about the size of a window air conditioner poking out from the side of a two-story house of DNA. It seems strange to me that these beasts can interact with protein targets in ways that can be reliably reproduced once the huge wad of DNA is no longer present, but I've been assured by several people that this is indeed the case.
In this example, two particular lineages of compounds stood out as hits, which is a much happier result than a collection of random singletons. When the team prepared a selection of these as off-DNA "real organic compounds", many of them were indeed nanomolar hits, although a few dropped out. Interestingly, none of the compounds had the sorts of zinc-binding groups that you'd expect against the metalloprotease target. The rest of the paper is a more traditional SAR exploration of these, leading to what one has to infer are tool/target validation compounds rather than drug candidates per se.
I know that GSK has been doing this sort of thing for a while, and from the looks of it, this work itself was done a while ago. For one thing, it's in J. Med. Chem., which is not where anything hot off the lab bench appears. For another, several of the authors of the paper appear with "Present Address" footnotes, so there has been time for a number of people on this project to have moved on completely. And that brings up the last set of questions, for now: has this been a worthwhile effort for GSK? Are they still doing it? Are we just seeing the tip of a large and interesting iceberg, or are we seeing the best that they've been able to do? That's the drug industry for you; you never know how many cards have been turned over, or why.