About this Author
Derek Lowe, an Arkansan by birth, got his BA from Hendrix College and his PhD in organic chemistry from Duke before spending time in Germany on a Humboldt Fellowship on his post-doc. He's worked for several major pharmaceutical companies since 1989 on drug discovery projects against schizophrenia, Alzheimer's, diabetes, osteoporosis and other diseases. Twitter: Dereklowe


In the Pipeline

November 16, 2011

Proteins in a Living Cell

Posted by Derek

It's messy inside a cell. The closer we look, the more seems to be going on. And now there's a closer look than ever at the state of proteins inside a common human cell line, and it does nothing but increase your appreciation for the whole process.

The authors have run one of these experiments that (in the days before automated mass spec techniques and huge computational power) would have been written off as a proposal from an unbalanced mind. They took cultured human U2OS cells, lysed them to release their contents, and digested those with trypsin. This gave, naturally, an extremely complex mass of smaller peptides, but these, the lot of them, were fractionated out and run through the mass spec machines, with use of ion-trapping techniques and mass-label spiking to get quantification. The whole process is reminiscent of solving a huge jigsaw puzzle by first running it through a food processor. The techniques for dealing with such massive piles of mass spec/protein sequence data, though, have improved to the point where this sort of experiment can now be carried out, although that's not to say that it isn't still a ferocious amount of work.

What did they find? These cells are expressing on the order of at least ten thousand different proteins (well above the numbers found in previous attempts at such quantification). Even with that, the authors have surely undercounted membrane-bound proteins, which weren't as available to their experimental technique, but they believe that they've gotten a pretty good read of the soluble parts. And these proteins turn out to be expressed over a huge dynamic range, from a few dozen copies (or fewer) per cell up to tens of millions of copies.

As you'd figure, those copy numbers represent very different sorts of proteins. It appears, broadly, that signaling and regulatory functions are carried out by a host of low-expression proteins, while the basic machinery of the cell is made of hugely well-populated classes. Transcription, translation, metabolism, and transport are where most of the effort seems to be going - in fact, the most abundant proteins are there to deal with the synthesis and processing of proteins. There's a lot of overhead, in other words - it's like a rocket, in which a good part of the fuel has to be there in order to lift the fuel.

So that means that most of our favored drug targets are actually of quite low abundance - kinases, proteases, hydrolases of all sorts, receptors (most likely), and so on. We like to aim for regulatory choke points and bottlenecks, and these are just not common proteins - they don't need to be. In general (and this also makes sense) the proteins that have a large number of homologs and family members tend to show low copy numbers per variant. Ribosomal machinery, on the other hand - boy, is there a lot of ribosomal stuff. But unless it's bacterial ribosomes, that's not exactly a productive drug target, is it?

It's hard to picture what it's like inside a cell, and these numbers just make it look even stranger. What's strangest of all, perhaps, is that we can get small-molecule drugs to work under these conditions. . .

Comments (22) + TrackBacks (0) | Category: Analytical Chemistry | Biological News


1. luysii on November 16, 2011 10:49 AM writes...

Not only are there a lot of different proteins, but their aggregate amounts are quite high (the cytosol of E. coli contains 300 milliGrams of protein/Liter). The concentration of each protein is relatively tiny: 300 milligrams of a protein of mass as low as 10 kiloDaltons (a small protein) gives a concentration of 30 microMolar -- and that assumes that just one protein is present in the soup. The crowding helps (or forces) the proteins to fold into compact structures.

Whether concentration, as typically defined by chemists for ions and drugs, has any meaning in such a concentrated soup is unclear. You can forget the Debye-Hückel theory of electrolytes in the cell. As we used to say back in the day, it applies to slightly contaminated distilled water.


2. Sleepless in SSF on November 16, 2011 11:31 AM writes...

@Luysii: 300 mg/L (or 30uM) doesn't strike me as especially concentrated. From the way people talk about the bizarre conditions inside cells, I had always assumed that cytosolic proteins must be tens of mg/ml or more. Is 300 mg/L really correct?


3. Lester Freamon on November 16, 2011 11:44 AM writes...

It's 300mg protein per mL, not L.
And that doesn't include all of the lipids, carbohydrates, nucleic acids, metabolites.
Also, for molarity calculations, the volume of an E coli cell is 10^-15 L. In E coli, this means that a protein at 1 nanomolar is present at 1 molecule per cell. But given the size scaling, in a mammalian cell, 1 nanomolar is 1000 copies/cell.
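The arithmetic in this exchange can be checked with a short sketch. It assumes the 10 kDa protein mass and the 10^-15 L E. coli volume quoted in the comments; the 10^-12 L mammalian cell volume is an assumption inferred from the "1000 copies/cell" scaling, and the function names are illustrative only.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def molar_concentration(mass_g_per_l, mol_weight_g_per_mol):
    """Convert a mass concentration (g/L) into molarity (mol/L)."""
    return mass_g_per_l / mol_weight_g_per_mol


def copies_per_cell(conc_molar, cell_volume_l):
    """Number of molecules in one cell at a given molar concentration."""
    return conc_molar * cell_volume_l * AVOGADRO


# 10 kDa protein, treated as the only protein in the soup (as in comment 1)
per_litre = molar_concentration(0.3, 10_000)         # 300 mg/L  -> 3e-5 M (30 uM)
per_millilitre = molar_concentration(300.0, 10_000)  # 300 mg/mL -> 0.03 M (30 mM)

# 1 nM in an E. coli cell (1e-15 L, from the comment) and in a
# mammalian cell (1e-12 L, an assumed volume 1000x larger)
ecoli_copies = copies_per_cell(1e-9, 1e-15)
mammalian_copies = copies_per_cell(1e-9, 1e-12)

print(f"{per_litre:.1e} M vs {per_millilitre:.1e} M")
print(f"E. coli: {ecoli_copies:.2f} copies, mammalian: {mammalian_copies:.0f} copies")
```

With these inputs, 1 nM works out to roughly 0.6 molecules per E. coli cell and roughly 600 per mammalian cell, so "1 molecule" and "1000 copies" are order-of-magnitude figures.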


4. barry on November 16, 2011 11:58 AM writes...

There are a few outlier drug targets that are actually abundant. Tubulin (targeted by colchicine and taxol) and Hsp90 (targeted by a bunch of hopeful anti-cancer agents) are among the most abundant proteins in many mammalian cells.


5. Todd on November 16, 2011 12:26 PM writes...

@barry: It makes sense that a lot of cancer targets are abundant. The zeitgeist aims for cell regulatory proteins, and those proteins are needed in huge quantities to keep the cell running. Depending on your perspective, this may be a good thing or bad thing. The good news is that you're trying to throw a rock in the ocean with cancer drugs. The bad news is throwing enough rocks in the ocean creates sandbars and all sorts of nasty issues.

Also, 300mg/mL is super concentrated. No wonder small molecule drugs have become the dominant paradigm. I'm a biology guy by trade, but I remember enough chemistry to know that fitting any sort of drug in that space without hitting anything else is like needing a royal flush every time you play poker. Yikes!

Can I go get my MBA now? LOL


6. pete on November 16, 2011 1:03 PM writes...

The notion that regulatory proteins in eukaryotes generally show:
- increased "evolvability"
- low cellular abundance
maybe suggests a slick way of increasing the "gain" on sensitivity to environmental change. That is, a little change imposed on a regulatory gene might thereby have a big effect at various levels: gene family, cellular & (ultimately) organismal. Interesting stuff.


7. Curious Wavefunction on November 16, 2011 1:55 PM writes...

Luysii: It's still early, but people have started realistically looking at the effects of macromolecular crowding on protein folding and function using simulations and molecular dynamics. C & EN had a cover story on this a while back.


8. Bill on November 16, 2011 3:23 PM writes...

Did anyone else notice all of the typos in the paper? If you spent all that money and time running fancy mass spec experiments, you could at least read the paper before you submit it.

"yeast expands a significant fraction of its protein mass (~30%) on translation and protein sorting"

"this validation method relies only onto a single measurable value"

There are more examples, but you get the idea.


9. bacillus on November 16, 2011 4:46 PM writes...

Remember too, if you shake the flask a bit, you'll get a different proteome altogether.


10. Anonymous on November 16, 2011 5:05 PM writes...

The trick is not to think we 1) know everything 2) understand everything and 3) can work everything out at the molecular level from the ground up. Unfortunately this has been the main approach for the last ~15 years and is why (I believe) the pharma industry has failed to find many drugs. If you start out with 'we don't know s**t' then you end up going down the phenotypic ('black box') route which has historically been more successful. Plenty more drugs left in that locker. Come back and revisit the reductionist approach in 500 years when we know a little bit more.


11. daen on November 16, 2011 5:17 PM writes...

The analysis was done on an osteosarcoma cell line. So how about a comparative analysis of normal bone tissue, under the same conditions? By comparing the two, you could identify under- or over-expressed proteins in the U2OS line. That comparison could obviously help in the identification of some of those regulatory choke points and bottlenecks.


12. luysii on November 16, 2011 6:47 PM writes...

Sleepless in SSF -- My face is red ! That's 300 milliGrams/milliLiter (not per Liter). For a reference see J. Bacteriol. vol. 181 pp. 197 - 203 '99. Sorry.

The molarity concept for proteins automatically means concentrations must be low. Molar means Moles per Liter, which is the molecular weight in grams (10,000 in the case I mentioned above) per Liter (1,000 grams), so a 1 M concentration is physically impossible for even a protein of this relatively small size.

Sorry for the mistake.


13. Sleepless in SSF on November 17, 2011 12:39 PM writes...

@luysii: Thanks for the correction. However, I think you may have another error on your hands. It seems to me you are assuming that you can't have solubilities approaching 10000 g/l. That's clearly not true for some solutes: taking a quick spin through an online solubility table produced SbCl3 at 9100 g/l. SbCl3 clearly isn't a protein :) but I wonder if your assertion about physical impossibility is based on an assumption that solubilities of 10000 g/l are impossible in general and not just in the case of proteins, where it may be true (though I don't know).


14. Sleepless in SSF on November 17, 2011 1:14 PM writes...

@Bill: I did notice lots of typos, but the authors are all either Swiss or German and I believe that is likely to be the origin of many of the errors. It seems to me that it's more reasonable to expect journals to employ copy editors rather than expecting grammatical perfection from scientists writing in a second (or third?) language.

And as to the expense, excluding the AQUA peptides the incremental cost of this experiment was probably less than a couple of hundred dollars (cell culture medium, IPG strips and reagents, TCP/IAA/trypsin). I don't see that they've clearly specified the total amount of each AQUA peptide used, but from what I do see I might guess that they used something like $2000 worth. In total, not a very expensive experiment given the amount of data produced.


15. Sleepless in SSF on November 17, 2011 1:22 PM writes...

@daen: The type of differential proteomics experiment you describe is standard methodology. My lab does them every day as do many, many others. This paper was a demonstration of the benefits of combining two somewhat lesser used techniques: directed MS (in contrast to dynamic data acquisition MS/MS) and AQUA absolute quantitation (as opposed to labeled or label-free relative quantitation).


16. daen on November 17, 2011 5:39 PM writes...

@Sleepless in SSF: Thanks! BTW, where in SSF are you? That's where I'm working!


17. luysii on November 17, 2011 10:16 PM writes...

#13 Sleepless in SSF: That was my assumption. My example seemed to me like putting 10 quarts of water in a 1 quart milk carton. Probably still not possible for proteins. Here's why:

Figuring an average molecular mass of 100 Daltons/amino acid, a protein of mass 10,000 would have about 100 amino acids. Now put Avogadro's number of this protein into 1 liter of water which has 55+ Avogadro's number of water molecules, or less than one water molecule solubilizing each amino acid. Not going to happen.
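The water-to-residue estimate above can be written out as a small sketch. It uses the comment's own round numbers (100 Da per amino acid, a 10,000 Da protein, 1 M) and, like the comment, ignores the volume the protein itself would occupy; the variable names are illustrative.

```python
WATER_MW = 18.015            # g/mol
WATER_G_PER_L = 1000.0       # approximate mass of 1 L of water, in grams

PROTEIN_MW = 10_000.0        # g/mol, the 10 kDa protein from the thread
RESIDUE_MW = 100.0           # g/mol per amino acid, the comment's round figure
PROTEIN_MOLARITY = 1.0       # mol/L, the hypothetical 1 M solution

water_mol_per_l = WATER_G_PER_L / WATER_MW             # about 55.5 mol/L
residues_per_protein = PROTEIN_MW / RESIDUE_MW         # 100 residues
residue_mol_per_l = PROTEIN_MOLARITY * residues_per_protein

# Water molecules available per amino acid residue
waters_per_residue = water_mol_per_l / residue_mol_per_l

# Mass of protein that 1 M would require in a single litre
protein_kg_per_l = PROTEIN_MOLARITY * PROTEIN_MW / 1000.0

print(f"{waters_per_residue:.2f} waters per residue, {protein_kg_per_l:.0f} kg protein per litre")
```

Both figures point the same way: about half a water molecule per residue, and ten kilograms of protein for every kilogram of water.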

I must confess that my original approach was based simply on mass. Thanks for making me think it through.


18. Sleepless in SSF on November 18, 2011 7:26 AM writes...

@daen: That nick is old and dates from the days when I was at Exelixis, pre-implosion; I keep using it here for the sake of continuity. I'm actually in Florida nowadays.


19. Sleepless in SSF on November 18, 2011 7:43 AM writes...

@luysii: I suspect you are probably correct about 1M 10kDa protein solutions, but the situation still isn't quite as simple as your last rationale. A 10 kDa protein will almost certainly have tertiary structure, and may well have a core that isn't well solubilized. Thus the number of AAs that require solvent contact might well be much less than 100.

Not claiming it would ever really happen, just saying that the effective H2O:AA ratio could be much higher than 55:100.

It sort of raises the question you implied in your first post: What is a solution? The answer seems clear when thinking about the sort of "slightly contaminated water" solution you rightly say that we chemists are used to thinking about. But what the heck would you have if you did the reverse -- added one liter of water to 10 kg of dry protein (hypothetically assuming that the protein would fold correctly under those conditions)? Would it be a solution? How much water would you need to make it a solution, and how would you know when you had enough?


20. luysii on November 18, 2011 1:42 PM writes...

#19 Quite true -- the large class of globular proteins DO have a hydrophobic core in which the amino acid side chains essentially dissolve themselves. Huge medical problems arise when the hydrophobic amino acids of one protein 'dissolve' those of another protein, leading to insoluble protein aggregates. The aggregates are associated with (and probably are in some sense causative of) a variety of neurologic diseases I used to manage (treat is too strong a word): Huntingtin in Huntington's chorea, Abeta peptide in Alzheimer's, alpha-synuclein in Parkinsonism, superoxide dismutase type 1 (SOD1) in familial amyotrophic lateral sclerosis.

It would be an interesting calculation (which I've not done, but should have) to take a 100 amino acid protein of average composition, fold it into a ball, measure the surface area, and see how many water molecules it would take to cover (e.g. solubilize) it. I doubt that 50 would be enough.
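A very rough version of that calculation can be sketched by treating the folded protein as a smooth sphere. Everything numeric here beyond the 10 kDa mass is an assumption, not a figure from the thread: a typical protein density of about 1.35 g/cm^3 and a footprint of about 0.10 nm^2 per surface water molecule. A real protein surface is rougher than a sphere, so this is best read as a lower-bound estimate, not a measurement.

```python
import math

AVOGADRO = 6.02214076e23        # molecules per mole
PROTEIN_MW = 10_000.0           # g/mol, roughly a 100-residue protein
PROTEIN_DENSITY = 1.35          # g/cm^3, assumed typical value
WATER_FOOTPRINT_NM2 = 0.10      # nm^2 per surface water, assumed
NM3_PER_CM3 = 1e21              # 1 cm^3 = 1e21 nm^3

# Volume of one protein molecule, in nm^3
volume_nm3 = PROTEIN_MW / (PROTEIN_DENSITY * AVOGADRO) * NM3_PER_CM3

# Radius and surface area of a sphere with that volume
radius_nm = (3.0 * volume_nm3 / (4.0 * math.pi)) ** (1.0 / 3.0)
area_nm2 = 4.0 * math.pi * radius_nm ** 2

# Waters needed for a single covering layer on the idealized sphere
waters_to_cover = area_nm2 / WATER_FOOTPRINT_NM2

print(f"radius {radius_nm:.2f} nm, area {area_nm2:.1f} nm^2, about {waters_to_cover:.0f} waters")
```

Under these assumptions a single layer takes a few hundred water molecules, which fits the suspicion that 50 would not be enough.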


21. Nile on November 22, 2011 4:36 PM writes...

I like this research: admittedly, the results are messy compared with the neatly-labelled reagents in a pharma lab, but they're getting better. And that's the point: the protein repertoire of a living cell is a finite amount of information and we can come fairly close to cataloguing it down to the last peptide.


At which point, or close to it, the question "What do you mean, nobody knows what this one does?" will have gone through three phases of answers:

"Nobody knows what hardly any of 'em do, and it's no surprise we never saw or heard of half of the sequences in your bucket of gloop".

"Oh, so there's a distinct family of kinases that look like my pet drug target! I never would've heard of them... I wonder what they do?"

"What - an unknown enzyme? Young man, that's either a contaminant or you're the guy who found the first new genera of mammals discovered in three decades, out there in New Guinea, together with the Yeti and a hitherto-unnoticed species of rhinoceros unique to Brooklyn".


22. Mohammed Nader on December 6, 2011 12:09 AM writes...

Biotechnology is, in general, humanity's use of living organisms. What mankind does with these organisms has branched in various directions. The existence of these huge amounts of proteins in human cells, and of course other living organisms, serves not only in drug delivery and disease treatment, but in numerous other applications as well. For example, bacteria have been modified to produce insulin, which is the key factor in treating diabetes. On the other hand, the same bacteria have been modified to produce biofuel as a replacement for regular chemical fuel. As for these proteins, they can be invested in medicine and drug delivery, but they might also be more valuable and more accessible for other applications like crop engineering, for example. One polypeptide of these thousands might, for instance, end up being a key factor in doubling the efficiency of soil bacteria which help plants grow. Drug delivery is definitely very important and a significant part of biotechnology's challenges, but it is just as important to understand that it isn't the only one.
