About this Author
Derek Lowe, an Arkansan by birth, got his BA from Hendrix College and his PhD in organic chemistry from Duke before spending time in Germany on a Humboldt Fellowship for his post-doc. He's worked for several major pharmaceutical companies since 1989 on drug discovery projects against schizophrenia, Alzheimer's, diabetes, osteoporosis and other diseases. To contact Derek, email him directly. Twitter: Dereklowe

In the Pipeline

May 22, 2013

How Many Binding Pockets Are There?

Posted by Derek

Just how many different small-molecule binding sites are there? That's the subject of this new paper in PNAS, from Jeffrey Skolnick and Mu Gao at Georgia Tech, which several people have sent along to me in the last couple of days.

This has a lot of bearing on questions of protein evolution. The paper's intro brings up two competing hypotheses of how protein function evolved. One, the "inherent functionality model", assumes that primitive binding pockets are a necessary consequence of protein folding, and that the effects of small molecules on these (probably quite nonspecific) motifs have been honed by evolutionary pressures since then. (The wellspring of this idea is this paper from 1976, by Jensen, and this paper will give you an overview of the field.) The other way it might have worked, the "acquired functionality model", would be the case if proteins tend, in their "unevolved" states, to be more spherical, in which case binding events must have been much rarer, but also much more significant. In that system, the very existence of binding pockets themselves is what's under the most evolutionary pressure.

The Skolnick paper references this work from the Hecht group at Princeton, which already provides evidence for the first model. In that paper, a set of near-random 4-helical-bundle proteins was produced in E. coli - the only patterning was a rough polar/nonpolar alternation in amino acid residues. Nonetheless, many members of this unplanned family showed real levels of binding to things like heme, and many even showed above-background levels of several types of enzymatic activity.

In this new work, Skolnick and Gao produce a computational set of artificial proteins (called the ART library in the text), made up of nothing but poly-leucine. These were modeled to the secondary structure of known proteins in the PDB, to produce natural-ish proteins (from a broad structural point of view) that have no functional side chain residues themselves. Nonetheless, they found that the small-molecule-sized pockets of the ART set actually match up quite well with those found in real proteins. But here's where my technical competence begins to run out, because I'm not sure that I understand what "match up quite well" really means here. (If you can read through this earlier paper of theirs at speed, you're doing better than I can). The current work says that "Given two input pockets, a template and a target, (our algorithm) evaluates their PS-score, which measures the similarity in their backbone geometries, side-chain orientations, and the chemical similarities between the aligned pocket-lining residues." And that's fine, but what I don't know is how well it does that. I can see poly-Leu giving you pretty standard backbone geometries and side-chain orientations (although isn't leucine a little more likely than average to form alpha-helices?), but when we start talking chemical similarities between the pocket-lining residues, well, how can that be?
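
To make that question concrete, here is a toy sketch in Python. It is emphatically not the authors' PS-score: the function names, the residue classes, the distance scale `d0` and the weight `w_chem` are all hypothetical, chosen only for illustration. It blends a geometric term with a chemical term over residue pairs that are assumed to be already aligned, so that a poly-Leu pocket can be scored against a chemically diverse pocket of identical shape.

```python
import math

# Hypothetical residue classes, for illustration only (not from the paper).
RESIDUE_CLASS = {
    "L": "hydrophobic", "I": "hydrophobic", "V": "hydrophobic", "F": "hydrophobic",
    "D": "acidic", "E": "acidic",
    "K": "basic", "R": "basic",
    "S": "polar", "T": "polar", "N": "polar",
}


def chemical_term(res_a, res_b):
    """1.0 for identical residues, 0.5 for the same class, 0.0 otherwise."""
    if res_a == res_b:
        return 1.0
    if RESIDUE_CLASS[res_a] == RESIDUE_CLASS[res_b]:
        return 0.5
    return 0.0


def geometric_term(xyz_a, xyz_b, d0=2.0):
    """Falls from 1.0 toward 0.0 as two aligned positions move apart.

    d0 (in angstroms) is an assumed distance scale, not a published value.
    """
    d = math.dist(xyz_a, xyz_b)
    return 1.0 / (1.0 + (d / d0) ** 2)


def toy_pocket_score(pocket_a, pocket_b, w_chem=0.3):
    """Weighted mean of geometric and chemical agreement.

    Each pocket is a list of (residue_letter, (x, y, z)) tuples, and the
    two lists are assumed to be aligned position by position.
    """
    if len(pocket_a) != len(pocket_b) or not pocket_a:
        raise ValueError("pockets must be non-empty and the same length")
    n = len(pocket_a)
    geom = sum(geometric_term(a[1], b[1]) for a, b in zip(pocket_a, pocket_b)) / n
    chem = sum(chemical_term(a[0], b[0]) for a, b in zip(pocket_a, pocket_b)) / n
    return (1.0 - w_chem) * geom + w_chem * chem


# Same coordinates, different lining residues.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0)]
poly_leu = [("L", xyz) for xyz in coords]
real_like = list(zip(["D", "S", "F"], coords))

print(toy_pocket_score(poly_leu, real_like))
print(toy_pocket_score(poly_leu, poly_leu))
```

In this toy version, identical geometry alone carries the poly-Leu pocket to a score of about 0.75 against a pocket lined with Asp, Ser and Phe. How heavily the real PS-score weights its chemical term is exactly the open question raised above.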

But I'm even willing to go along with the main point of the paper, which is that there are not-so-many types of small-molecule binding pockets, even if I'm not so sure about their estimate of how many there are. For the record, they're guessing not many more than about 500. And while that seems low to me, it all depends on what we mean by "similar". I'm a medicinal chemist, someone who's used to seeing "magic methyl effects" where very small changes in ligand structure can make big differences in binding to a protein. And that makes me think that I could probably take a set of binding pockets that Skolnick's people would call so similar as to be basically identical, and still find small molecules that would differentiate them. In fact, that's a big part of my job.

In general, I see the point they're making, but it's one that I've already internalized. There are a finite number of proteins in the human body. Fifty thousand? A couple of hundred thousand? Probably not a million. Not all of these have small-molecule binding sites, for sure, so there's a smaller set to deal with right there. Even if those binding sites were completely different from one another, we'd be looking at a set of binding pockets in the thousands/tens of thousands range, most likely. But they're not completely different, as any medicinal chemist knows: try to make a selective muscarinic agonist, or a really targeted serine hydrolase inhibitor, and you'll learn that lesson quickly. And anyone who's run their drug lead through a big selectivity panel has seen the sorts of off-target activities that come up: you hit some of the other members of your target's family to a greater or lesser degree. You hit the flippin' sigma receptor, not that anyone knows what that means. You hit the hERG channel, and good luck to you then. Your compound is a substrate for one of the CYP enzymes, or it binds tightly to serum albumin. Who has ever seen a compound that binds only to its putative target? And this is only with the counterscreens we have, which are a small subset of the things that are really out there in cells.

And that takes me to my main objection to this paper. As I say, I'm willing to stipulate, gladly, that there are only so many types of binding pockets in this world (although I think that it's more than 500). But this sort of thing is what I have a problem with:

". . .we conclude that ligand-binding promiscuity is likely an inherent feature resulting from the geometric and physical–chemical properties of proteins. This promiscuity implies that the notion of one molecule–one protein target that underlies many aspects of drug discovery is likely incorrect, a conclusion consistent with recent studies. Moreover, within a cell, a given endogenous ligand likely interacts at low levels with multiple proteins that may have different global structures."

"Many aspects of drug discovery" assume that we're only hitting one target? Come on down and try that line out in a drug company, and be prepared for rude comments. Believe me, we all know that our compounds hit other things, and we all know that we don't even know the tenth of it. This is a straw man; I don't know of anyone doing drug discovery that has ever believed anything else. Besides, there are whole fields (CNS) where polypharmacy is assumed, and even encouraged. But even when we're targeting single proteins, believe me, no one is naive enough to think that we're hitting those alone.

Other aspects of this paper, though, are fine by me. As the authors point out, this sort of thing has implications for drawing evolutionary family trees of proteins - we should not assume too much when we see similar binding pockets, since these may well have a better chance of being coincidence than we think. And there are also implications for origin-of-life studies: this work (and the other work in the field, cited above) implies that a random collection of proteins could still display a variety of functions. Whether these are good enough to start assembling a primitive living system is another question, but it may be that proteinaceous life has an easier time bootstrapping itself than we might imagine.

Comments (16) + TrackBacks (0) | Category: Biological News | In Silico | Life As We (Don't) Know It


1. sgcox on May 22, 2013 8:40 AM writes...

This paper is worth mentioning here:

2. anon on May 22, 2013 8:41 AM writes...

"I don't know of anyone doing drug discovery that has ever believed anything else."

Quite a few academic PIs fancy themselves doing drug discovery these days. Many (not all, mind you, but many) of them are not very knowledgeable of the actual pharmacology, let alone medicinal chemistry. Thus "notion of one molecule–one protein target that underlies many aspects of drug discovery" may be spot on, insofar as it concerns the predominant mentality in academic "drug discovery".

3. petros on May 22, 2013 9:02 AM writes...

This review focusing on cancer targets considers the multiplicity of potential drug sites on target proteins.

Nat Rev Drug Discov. 2013 Jan;12(1):35-50. doi: 10.1038/nrd3913

4. Curious Wavefunction on May 22, 2013 9:12 AM writes...

"This promiscuity implies that the notion of one molecule–one protein target that underlies many aspects of drug discovery is likely incorrect"

Maybe it simply means that even those compounds which we think are selective for one target are probably not? It certainly seems to be the case for drugs like Gleevec and maybe we will find it to be the case for others if we dig deeper.

5. PF9 on May 22, 2013 9:20 AM writes...

A couple of points to consider:
1) A lot will depend on how you define the extent of a binding site. Think about serine proteases: is it S1, how much of S1, do you add in more of the central cavity, etc.?
2) Just because a ligand shows binding in vitro, is there really a causal, PK/PD driven link to action in vivo? In many cases I doubt it (hERG aside)

6. anchor on May 22, 2013 9:22 AM writes...

#2-spot on! I moved into academia from big-pharma. I am exasperated at many levels with these PIs and I find most of them to be "one trick ponies." They all in their infinite wisdom believe that if you fix an "issue" that is their specialty and staple diet for their existence in academia (logP, blood curve, mouse model etc.) then the drug molecule will happen! I am very frustrated at their ignorance. I try to reason with them that it is not that simple but my suggestions and reasoning fall by the wayside. Call it their stupidity or naivety. More damaging these days is the ready availability of Scifinder search engines: they are getting even bolder, and as a medicinal chemist with modest success in the industry, I am simply flabbergasted.

7. Imaging guy on May 22, 2013 9:52 AM writes...

When do you call a hit a hit? What is the cutoff Kd below which it is no longer considered a hit? Since there are different interaction assays, what about cross-platform reproducibility?

8. littlegreenpills on May 22, 2013 10:24 AM writes...

The estimate of the number of binding pockets seems confusing. Are they only considering catalytic/active sites? What about allosteric sites?

If there are only about 500 sites then we are probably done finding "new" drugs and should just focus on tweaking the ones already out there to provide the desired effects.

9. a. nonymaus on May 22, 2013 10:29 AM writes...

#7 is onto something here. If something binds to two proteins that can be a problem unless the delta-Kd lets me dose so that one is 95% bound and the other is 5% bound.

What I find surprising about receptors is that subtypes exist at all. What selection pressure is there to maintain so many nicotinic receptor subtypes when they all bind nicotine? Is it that they have different nicotine binding constants? If so, why doesn't the cell just vary the receptor density? Is it that they have different effects on binding and the receptor binding-site differences are an incidental artifact of the structural changes required for the different effects?
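
The 95%/5% scenario in the first paragraph of this comment can be put in numbers with the simplest possible model: 1:1 equilibrium binding, the same free ligand concentration at both targets, and no ligand depletion. Those assumptions, and the function names below, are illustrative and not from the comment. Under them the fraction bound is [L] / ([L] + Kd), and a short Python sketch gives the selectivity window that would be needed:

```python
def fraction_bound(ligand_conc, kd):
    """Fractional occupancy for simple 1:1 binding (same units for both)."""
    return ligand_conc / (ligand_conc + kd)


def conc_for_occupancy(theta, kd):
    """Free ligand concentration that gives fractional occupancy theta."""
    if not 0.0 < theta < 1.0:
        raise ValueError("theta must be strictly between 0 and 1")
    return kd * theta / (1.0 - theta)


def required_kd_ratio(theta_on, theta_off):
    """Kd(off-target) / Kd(target) needed to reach both occupancies
    at one and the same free ligand concentration."""
    on = theta_on / (1.0 - theta_on)
    off = theta_off / (1.0 - theta_off)
    return on / off


# 95% occupancy on the intended target, 5% on the off-target.
print(required_kd_ratio(0.95, 0.05))

# Sanity check: dosing at 19 x Kd gives 95% occupancy.
print(fraction_bound(conc_for_occupancy(0.95, 1.0), 1.0))
```

Under these assumptions the two Kd values need to differ by a factor of about 361. Relaxing the target to 90%/10% drops the requirement to 81-fold.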

10. mausanony on May 22, 2013 11:04 AM writes...


Evolution is a blind watchmaker. If some minor variation to the function conveys fitness to an emerging sub-species, it will be selected for. It matters not how that minor variation was arrived at: receptor gene duplication and slight sequence divergence? Perturbation to transcriptional regulation of that same gene within a different cell type? Perturbation of intracellular signalling cascade due to a mutation who knows where, which results in altered receptor density? Over the aeons, many, if not all, of the possible mechanisms that give rise to phenotypic variation will have a shot at contributing something to the organism, and the complexity (such as receptor subtypes that all bind the same thing) will accumulate.

11. Johannes on May 22, 2013 3:06 PM writes...

I'm somewhat skeptical of their conclusion as well. For example, SGX523, according to Stephen Burley in a video posted on Coursera, showed binding affinity to only a single protein target, something others thought impossible. Could be a freak, likely not.

12. Yolo on May 22, 2013 4:18 PM writes...

This article touches on a concept very similar to this and applies it to library design:

13. Anonymous on May 22, 2013 4:28 PM writes...

Evolution is a blind watchmaker. If some minor variation to the function conveys fitness to an emerging sub-species, it will be selected for.

This is simply not true. While natural selection is dependent upon the difference in the number of offspring among variant phenotypes, the difference is the average difference in number of offspring among variant phenotypes and not the individual difference in number of offspring wit

14. Dr. Manhattan on May 22, 2013 4:40 PM writes...

"They all in their infinite wisdom believe that if you fix an "issue" that is their specialty and staple diet for their existence in academia (logP, blood curve, mouse model etc.) then the drug molecule will happen!"

Anchor, I totally agree, based on my own experience! In fact, I suspect the real goal is to get continued funding for their academic "drug discovery" efforts. It is virtually impossible to perform real drug discovery in the absence of a large, multidisciplinary team.

15. Cellbio on May 22, 2013 7:53 PM writes...

Yes, CW, when you take drugs with known MOA and presumed selectivity and screen them broadly in biology, you see activities not appreciated. When large collections, say a thousand molecules from one med chem campaign, or in another instance, 18 steroids of similar structure, are screened, it is closer to the truth that no two are alike than that there is evidence of a single target associated with the compound's pharmacology.

We can only adhere to our idea of pursuing the biology of a single target, as the target-centric biology era has done, if we measure little else than the intended impact. And I believe this has been propagated in pharma and is not unique to academia. I don't think the best or most experienced in pharma hold these beliefs, but neither do I think the best of pharma often rise to the top. It can do wonders for your career to populate the pipeline with paradigm-driven, metric-measured clinical candidates that fail spectacularly once in development. Helps with bonus too and lets the execs trot out wonderfully bloated pipeline charts, sometimes with dead molecules remaining as Ph1 or Ph2 zombies until other positive news allows for slipping in public notice of termination.

Career building around socially endorsed endeavors that deviate from good science is, in my opinion, rampant in big companies. It is rare to find a company culture where the voice of a skeptical scientist that urges caution carries the same weight as that of a charismatic business leader, even when the salient issue is technical in nature. That leader, especially when not from the scientific ranks, gives us all the organizational problems spoken of often on this blog, and represented well in fables like Emperor's new clothes.

16. simpl on May 27, 2013 11:45 AM writes...

After the finding reported in Nature on Nppb and receptors, make that 501? In fact, it reminded me of the old Beadle/Tatum idea - one gene complex = 1 protein - that would give you a maximum number of receptors of the order of 10000.
