About this Author
Derek Lowe, an Arkansan by birth, got his BA from Hendrix College and his PhD in organic chemistry from Duke before spending time in Germany on a Humboldt Fellowship on his post-doc. He's worked for several major pharmaceutical companies since 1989 on drug discovery projects against schizophrenia, Alzheimer's, diabetes, osteoporosis and other diseases. To contact Derek email him directly: Twitter: Dereklowe


In the Pipeline


September 6, 2012

Databases and Money


Posted by Derek

The NIH has been cutting back on its funding (via the National Libraries of Medicine) for a number of external projects. One of those on the chopping block is the Biological Magnetic Resonance Bank (BMRB), at Wisconsin:

The BMRB mission statement is to “collect, annotate, archive and disseminate (worldwide in the public domain)” NMR data on biological macromolecules and metabolites, to “empower scientists” and to “support further development of the field.” Despite its indisputable success in achieving these goals, the BMRB is facing serious funding challenges.

Since 1990, the BMRB has received continuous support from the National Library of Medicine (NLM), at the US National Institutes of Health, in the form of five-year grants. However, the BMRB obtained its latest grant renewal in 2009, accompanied by a sharp reduction in the funding level. It was also to be the last renewal, as the NLM announced that funding for all external centers would be phased out as their grants expire. Thus, as of today, the BMRB has no means of financial support after September 2014.

That editorial link above, from Nature Structural and Molecular Biology, also lists several other database projects formerly supported by the NLM. These are far enough outside my own field that I've never had call to use any of them as a medicinal chemist, but (as that last link shows) they are indeed used, and by plenty of researchers.

This problem won't be going away, since the volume of data produced these days shows no sign of any inflection points. Molecular genetics, protein biology, and structural biology in general are producing vast piles of material. Having as much of it as possible brought together and curated is clearly in the best interest of scientific research - but again, who pays?

Comments (19) + TrackBacks (0) | Category: Biological News


1. road on September 6, 2012 3:03 PM writes...

The link to the article is broken. The correct link is:


2. bacillus on September 6, 2012 4:15 PM writes...

Does the BMRB currently charge users, or is this precluded under the NIH contract? If reasonable user fees could be charged, investigators could build these charges into their grant budgets. The proponents of the BMRB in the accompanying article believe that its work is critical to them, so I'm sure they'd be ready to pony up! Talk is cheap, and a user fee system would better illustrate the usefulness or otherwise of the product. Who doesn't love low cost scientific services? However, the minute they have to start charging on a complete cost recovery basis, it's amazing how usage can plummet. I've seen this happen plenty of times when investigators start having to pay commercial rates for animal per diems. I do sympathize with them, since we had an NIH grant worth several million dollars that was yanked at the Council review stage having aced the study section review (a first according to the NIH program officer), clearly another example of the hard times at NIH and the hard decisions they are being forced to make.


3. Martin on September 6, 2012 4:35 PM writes...

Seems a shame, but one thing I have noticed that makes BMRB less useful to medicinal chemists is the fact that it doesn't actually warehouse NMR structures per se, but rather the raw chemical shift data, assignment and restraints data usually. So if for example you wanted to get hold of someone's NMR structure of a smallish peptide which falls below the size cutoff of what RCSB/PDB will accept, your chances of getting a set of 20 structures plus an average from BMRB are slim. You could of course get a friendly local NMR person to run the raw data through Xplor/CNS etc. for you, but that is not exactly convenient.


4. Anonymous on September 6, 2012 7:34 PM writes...

These are US taxpayer subsidized, correct? Yet they are open to the world, too? It would be nice if the other developed nations chipped in.


5. gippgig on September 6, 2012 9:23 PM writes...

Computer power is so cheap nowadays why would databases need funding in the first place?


6. Londonlad on September 6, 2012 11:17 PM writes...

'Computer power is so cheap nowadays why would databases need funding in the first place?'

Mitt, welcome to In the Pipeline! Great to have a politician interested in the sort of stuff we care about.


7. CB on September 7, 2012 12:00 AM writes...

The BMRB has been incredibly important for advancing basic developments in NMR.
For instance: Chemical shift prediction of secondary structure and dihedral angles.
The calculation of protein structures based entirely on NMR chemical shift assignments.
The BMRB aided these developments.

A fee system on the data acts as a wet blanket on basic research, innovation and discovery. If you want to test a speculative idea, or you're just curious and want to check the database for correlations or variations, the cost is an issue. Researchers aren't going to satisfy their curiosity about questions indirectly related to their research if a purchase order is involved. Serendipity is diminished.


8. Morten G on September 7, 2012 4:19 AM writes...

@4. Anonymous
Really hope you aren't a scientist. Hope even less that you are a politician. "Are you saying that a dirty European scientist used research funded by American taxpayers for some purpose I don't understand!? To the war chariots!"


9. MolecularGeek on September 7, 2012 8:22 AM writes...

@5 gippgig
Raw computing power is much more affordable than in the past, but it certainly isn't free, and the cost of building and maintaining a curated data repository on top of the raw platform is not trivial. It's easy to start serving raw data from an old PC beside your desk at work, but when people actually start depending on that data being there to do their work, their demands get expensive. Ask anyone here in the industry with a company that has been running more than 5 years how much they spend to keep their compound registries working.



10. SP on September 7, 2012 9:17 AM writes...

I assume PubChem is one of the most relevant to drug discovery; is it commonly used by med chemists or HTS biologists?


11. MolecularGeek on September 7, 2012 10:08 AM writes...

PubChem is an intramural project, and so won't be directly affected by this. It certainly gets around, but how much it is used directly by HTS and MedChem folks, as opposed to being grist for the modeling and data-mining crowds, is a harder question.


12. HTSguy on September 7, 2012 10:14 AM writes...

I'm in the academic world these days and when someone discusses a possible screen with me, the first thing I do is look in PubChem to see if a similar screen has already been run.


13. Anonymous on September 7, 2012 10:25 AM writes...


Basically you need to hire a full-time IT professional (or three) to keep the database running smoothly. Someone has to fix things when the computers crash at 2 AM, and upgrade hardware/software to keep up with a database with growing information and demand.


14. Tom Womack on September 7, 2012 12:42 PM writes...

@4: in the case of the (larger, older) Protein Data Bank, there is the PDB at Rutgers, PDB Europe at the EBI near Cambridge, and PDB Japan at the University of Osaka.

Salaries are always the expensive part of this kind of project. The BMRB database dump is four gigabytes long; you could mail out a memory stick with the whole database on it to every user annually for the price of one administrator, or host it on Amazon S3 for much less than the cost of providing liquid helium to one high-field NMR machine.

But what you can't do is sanity-check new entries that way. Huge databases of easily-checked Truths are trivial (see for a nice example), huge libraries of information that needs curating are very much not.


15. Anon on September 9, 2012 5:41 AM writes...

Just a nit -

It's the "National Library of Medicine", not Libraries.


16. Vlad on September 9, 2012 7:38 AM writes...

NIH should support development of data standards and enforce use of these standards by NIH-supported research. Private entities will assemble standardized data into databases and provide access either for a fee or supported by advertising. I am not familiar with this specific type of data, but working a lot with genomic data, I would argue that data cleaning/curation is the most time-consuming (i.e., expensive) part of database support.


17. Tom Womack on September 9, 2012 9:37 AM writes...

For-fee data availability would ruin a lot of these services, because one of the great uses for big amalgamated databases is in doing statistics on the whole lot. Advertising-based access isn't terribly helpful either in that context, since no advertiser will pay for hits produced by wget as you pull the database down.

This is more difficult than it looks - there are a number of protein-structure statistics papers which tell you much more about the REFMAC restraint function than about the geometry of proteins.

Inevitably the curatorial organisations are very busy; so an option where the curators of the data are the only institution that can work with the data cheaply enough to get the analyses to work well is not really acceptable.


18. Anonymous on September 9, 2012 3:11 PM writes...

Morten G:

I didn't say that. (BTW, I am in Europe for the week and thoroughly enjoying it). The snark here is incredible by many on this blog - although there are many good contributors, too. The point is, why not have all those benefiting (North American, European, Australasian developed countries) share the cost if there is the potential that the benefit is to be lost for all? That's a rhetorical question, but maybe not to the expert snarkologists. To me there is an obvious answer (although it may not be a politically easy answer).


19. DensityDuck on September 10, 2012 12:41 PM writes...

Orbital mechanics was developed by doing curve fits to recorded data of planetary positions--and that data was recorded over years, usually decades, sometimes centuries. And to fully understand that mass of data, we had to invent calculus.

Meaning that, if we hadn't had that huge repository of curated data, then we wouldn't have calculus.


