It's messy inside a cell. The closer we look, the more seems to be going on. And now there's a closer look than ever at the state of proteins inside a common human cell line, and it does nothing but increase your appreciation for the whole process.
The authors have run one of those experiments that (in the days before automated mass spec techniques and huge computational power) would have been written off as a proposal from an unbalanced mind. They took cultured human U2OS cells, lysed them to release their contents, and digested those contents with trypsin. This gave, naturally, an extremely complex mass of smaller peptides, but the whole lot of them were fractionated and run through the mass spec machines, with ion-trapping techniques and mass-label spiking used to get quantification. The whole process is reminiscent of solving a huge jigsaw puzzle by first running it through a food processor. The techniques for dealing with such massive piles of mass spec/protein sequence data, though, have improved to the point where this sort of experiment can now be carried out, although that's not to say that it isn't still a ferocious amount of work.
What did they find? These cells are expressing at least ten thousand different proteins (well above the numbers found in previous attempts at such quantification). Even with that, the authors have surely undercounted membrane-bound proteins, which weren't as available to their experimental technique, but they believe that they've gotten a pretty good read of the soluble parts. And these proteins turn out to be expressed over a huge dynamic range, from a few dozen copies (or fewer) per cell up to tens of millions of copies.
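To put that dynamic range in perspective, here is a quick back-of-the-envelope sketch in Python. The specific numbers (30 copies standing in for "a few dozen", 20 million for "tens of millions") are illustrative placeholders, not figures from the paper, and the function name is invented for this example.

```python
import math


def orders_of_magnitude(low_copies: float, high_copies: float) -> float:
    """Return the log10 span between two per-cell copy numbers."""
    if low_copies <= 0 or high_copies <= 0:
        raise ValueError("copy numbers must be positive")
    return math.log10(high_copies / low_copies)


# Assumed stand-in values, chosen only to illustrate the scale:
LOW_COPIES = 30            # "a few dozen" copies per cell
HIGH_COPIES = 20_000_000   # "tens of millions" of copies per cell

span = orders_of_magnitude(LOW_COPIES, HIGH_COPIES)
print(f"Dynamic range: roughly {span:.1f} orders of magnitude")
```

With those stand-in values the span comes out at a bit under six orders of magnitude, which is the sort of spread a single quantification method has to cover in one experiment.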
As you'd figure, those copy numbers represent very different sorts of proteins. It appears, broadly, that signaling and regulatory functions are carried out by a host of low-expression proteins, while the basic machinery of the cell is made of hugely well-populated classes. Transcription, translation, metabolism, and transport are where most of the effort seems to be going - in fact, the most abundant proteins are there to deal with the synthesis and processing of proteins. There's a lot of overhead, in other words - it's like a rocket, in which a good part of the fuel has to be there in order to lift the fuel.
So that means that most of our favored drug targets are actually of quite low abundance - kinases, proteases, hydrolases of all sorts, receptors (most likely), and so on. We like to aim for regulatory choke points and bottlenecks, and these are just not common proteins - they don't need to be. In general (and this also makes sense) the proteins that have a large number of homologs and family members tend to show low copy numbers per variant. Ribosomal machinery, on the other hand - boy, is there a lot of ribosomal stuff. But unless it's bacterial ribosomes, that's not exactly a productive drug target, is it?
It's hard to picture what it's like inside a cell, and these numbers just make it look even stranger. What's strangest of all, perhaps, is that we can get small-molecule drugs to work under these conditions. . .