There's a curious paper (subscriber-only link) in the latest Nature that's getting some attention, titled "A linguistic model for the rational design of antimicrobial peptides". For non-subscribers, here's a synopsis of the work from the magazine's news site.
A group at MIT headed by Gregory Stephanopoulos has been studying various antimicrobial peptides, which are secreted by all kinds of organisms as antibiotics. They took the amino acid sequences of several hundred of these and fed them into a linguistic pattern-analysing program, which suggested some common features that they then used to synthesize 42 new unnatural candidates. The hit rate for these was about 50%, which is far, far more than you'd expect if you weren't tuning in to some sort of useful rules.
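For a rough sense of what "finding common features" in a pile of sequences can mean, here's a minimal sketch that counts short subsequences (k-mers) shared across several peptides. To be clear, this is not the MIT group's linguistic method, just about the simplest pattern-hunting I can think of; the function name `shared_kmers` and the toy sequences are made up for illustration and are not real antimicrobial peptides.

```python
from collections import Counter


def shared_kmers(sequences, k=3, min_support=2):
    """Return k-mers that occur in at least `min_support` distinct sequences.

    Hypothetical helper for illustration only; not the method from the paper.
    """
    support = Counter()
    for seq in sequences:
        # Use a set so a k-mer repeated inside one sequence counts only once.
        kmers = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        support.update(kmers)
    return {kmer: n for kmer, n in support.items() if n >= min_support}


# Invented toy sequences in one-letter amino acid code (assumption, not data).
toy_peptides = ["GKWKLF", "AKWKLA", "KWKGGG"]

# Which 3-residue patterns show up in all three toy sequences?
print(shared_kmers(toy_peptides, k=3, min_support=3))
```

Real homology and motif tools are far more sophisticated than this (they handle gaps, substitutions, and statistical significance), but the basic idea of pulling recurring patterns out of a set of active sequences is the same.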
It's the concept of "peptide grammar" that seems to be the news hook here. But I'm quite puzzled by all the fuss, because looking for homology among protein sequences is one of the basic bioinformatics tools. I have to wonder what the MIT group found with their linguistics program that they wouldn't have found with biology software. What they're doing is good old structure-activity relationship work, the lifeblood of every medicinal chemist. Well, it's perhaps better described as sequence-activity relationships, but sequence is just a code for structure. There's nothing here that any drug company's bioinformatics people wouldn't be able to do for you, as far as I can see.
So why haven't they? Well, despite the article's mention of a potential 50,000 further peptides of this type, the reason is probably that not many people care. After all, we're talking about small peptides here, of the sort that are typically just awful candidates for real-world drugs. And I'm not just babbling theory here: many people have tried for many years now to commercialize various antimicrobial peptides, and they've landed flat on their faces.
You won't see a mention of that history in the Nature news story, unfortunately. They do, to their credit, mention (albeit in the fourth paragraph from the end) that peptides are troublesome development candidates. That's where it also says that there are reports that bacteria can become resistant even to these proteins, which prompts me to remind everyone that bacteria can become resistant to everything short of freshly extruded magma. It's in the very last paragraph of the story, though, that Robert Hancock of UBC in Vancouver says just what I was thinking when I started reading:
(Hancock) questions how different the linguistics technique is from other computational methods used to find similarities between protein sequences. "What's new is the catchy title," he says.