About this Author
Derek Lowe, an Arkansan by birth, got his BA from Hendrix College and his PhD in organic chemistry from Duke before spending time in Germany on a Humboldt Fellowship on his post-doc. He's worked for several major pharmaceutical companies since 1989 on drug discovery projects against schizophrenia, Alzheimer's, diabetes, osteoporosis and other diseases. To contact Derek, email him directly. Twitter: Dereklowe

In the Pipeline

April 10, 2012

Biomarker Caution

Posted by Derek

After that news of the Stanford professor who underwent just about every "omics" test known, I wrote that I didn't expect this sort of full-body monitoring to become routine in my own lifetime:

It's a safe bet, though, that as this sort of thing is repeated, that we'll find all sorts of unsuspected connections. Some of these connections, I should add, will turn out to be spurious nonsense, noise and artifacts, but we won't know which are which until a lot of people have been studied for a long time. By "lot" I really mean "many, many thousands" - think of how many people we need to establish significance in a clinical trial for something subtle. Now, what if you're looking at a thousand subtle things all at once? The statistics on this stuff will eat you (and your budget) alive.
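To put rough numbers on "a thousand subtle things all at once", here is a minimal sketch of the standard multiple-comparisons arithmetic. It assumes the tests are independent and that none of the effects is real, which real omics data will not satisfy exactly; the function names and the 0.05 threshold are illustrative choices, not anything taken from the studies discussed here.

```python
def familywise_error_rate(num_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive across independent null tests."""
    return 1.0 - (1.0 - alpha) ** num_tests


def expected_false_positives(num_tests: int, alpha: float = 0.05) -> float:
    """Average number of null tests that clear the threshold by luck alone."""
    return num_tests * alpha


def bonferroni_threshold(num_tests: int, alpha: float = 0.05) -> float:
    """Per-test threshold that holds the family-wise rate at or below alpha."""
    return alpha / num_tests


if __name__ == "__main__":
    for m in (1, 10, 100, 1000):
        print(
            f"{m:>5} tests: "
            f"P(at least one false hit) = {familywise_error_rate(m):.3f}, "
            f"expected false hits = {expected_false_positives(m):.1f}, "
            f"Bonferroni per-test threshold = {bonferroni_threshold(m):.2e}"
        )
```

Bonferroni is only the bluntest of the available corrections, but it shows why the per-marker bar rises so quickly as the number of markers grows.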

I can now adduce some evidence for that point of view. The Institute of Medicine has warned that a lot of biomarker work is spurious. The recent Duke University scandal has brought these problems into higher relief, but there are plenty of less egregious (and not even deliberate) examples that are still a problem:

The request for the IOM report stemmed in part from a series of events at Duke University in which researchers claimed that their genomics-based tests were reliable predictors of which chemotherapy would be most effective for specific cancer patients. Failure by many parties to detect or act on problems with key data and computational methods underlying the tests led to the inappropriate enrollment of patients in clinical trials, premature launch of companies, and retraction of dozens of research papers. Five years after they were first made public, the tests were acknowledged to be invalid.

Lack of clearly defined development and evaluation processes has caused several problems, noted the committee that wrote the report. Omics-based tests involve large data sets and complex algorithms, and investigators do not routinely make their data and computational procedures accessible to others who could independently verify them. The regulatory steps that investigators and research institutions should follow may be ignored or misunderstood. As a result, flaws and missteps can go unchecked.

So (Duke aside) the problem isn't fraud, so much as it is wishful thinking. And that's what statistical analysis is supposed to keep in check, but we've got to make sure that that's really happening. But to keep everyone honest, we also have to keep everything out there where multiple sets of eyes can check things over, and this isn't always happening:

Investigators should be required to make the data, computer codes, and computational procedures used to develop their tests publicly accessible for independent review and ensure that their data and steps are presented comprehensibly, the report says. Agencies and companies that fund omics research should require this disclosure and support the cost of independently managed databases to hold the information. Journals also should require researchers to disclose their data and codes at the time of a paper's submission. The computational procedures of candidate tests should be recorded and "locked down" before the start of analytical validation studies designed to assess their accuracy, the report adds.

This is (and has been for some years) a potentially huge field of medical research, with huge implications. But it hasn't been moving forward as quickly as everyone thought it would. We have to resist the temptation to speed things up by cutting corners, consciously or unconsciously.

Comments (14) + TrackBacks (0) | Category: Biological News | Clinical Trials


1. Notadukie on April 10, 2012 8:22 AM writes...

Get used to stories like this. Take a look at the 6000+ abstracts from the 2012 AACR national meeting last week. Wishful thinking is a mild description for that carnival sideshow. Maybe 3% of the work was worthwhile. Four darts and a genome map would provide a better chance of hitting a cancer response biomarker.

Crony capitalism is the term that comes to my mind. Tenure and grant funding have become so dependent on publications in "elite" journals, where review by one's "peers" too often simply means that they've gotten a mentor or former lab-mate to accept whatever experimental dross has come out of the previous two years in their lab.

IMO, there needs to be a complete overhaul of the rewards system in academic science. Tenure and grants need to somehow be linked to sustained support for one's ideas and publications. The alternative is more wasted money and lost lives.


2. Wile E. Coyote, Genius on April 10, 2012 8:48 AM writes...

This seems to tie in well with the previous post with the nutraceuticals and resveratrol. There were many offers of evidence for efficacy (Nrf, etc). IMHO, a lot of wishful thinking.


3. johnnyboy on April 10, 2012 9:54 AM writes...

None of this would happen if these academic medical researchers were still primarily concerned about doing academic medical research. Since it seems the first priority out there in the ivory towers is figuring out how to monetize whatever you're doing (patenting everything that moves, founding companies as soon as you have an embryo of an idea), we'll just get access to more unregulated, unverified garbage (sorry - Innovation).


4. clueless on April 10, 2012 10:09 AM writes...

Nothing to fuss about when the understanding of a disease and its biomarkers is still at an early stage.


5. Curious Wavefunction on April 10, 2012 10:52 AM writes...

-that's what statistical analysis is supposed to keep in check

Yes, and I would argue that it's often otherwise. Statistical analysis often misleads us into seeing causation where none exists. Statistical models often predict without explaining. In fact the heavy reliance on statistics in modern biomedical research is creating casualties whose impact will become apparent only when there's long-term damage. Perhaps it's time to heed Sydney Brenner's words and go back to studying the cell as the operational unit of life.


6. biff on April 10, 2012 11:39 AM writes...

johnnyboy wrote (#3): "None of this would happen if these academic medical researchers were still primarily concerned about doing academic medical research."

Sorry, I was doing research back when "academic medical researchers were still primarily concerned about doing academic medical research", back when any hint of commercial application was almost always regarded as career-ending selling out, and "unregulated, unverified garbage" was just as routine then as it is now. From my first day in grad school, I was appalled at how little the faculty at my top tier university medical school knew or cared about statistics. Find the primary author of a biology paper - any biology paper - and ask them why they chose a t-test instead of chi-square, and prepare for a look of befuddlement in at least half of the cases - and I'm being charitable there. Likewise, how many faculty authors are close enough to the laboratory or the point of data collection that they really know what their grad students or postdocs are doing to obtain the "right" result? How many "scientists" truly think deeply about experimental design and all the places that systematic error can creep in?

When grants are the primary/exclusive source of a scientist's funding, is there more or less pressure to cook the books in order to secure the next grant than there is to try to cook the books to launch a startup? Mental exercise: how often are experiments repeated after they are published in the academic world, and how critical is it for things to be reduced to repeatable practice in order to have a chance of commercial success? Arguably, hopes for commercialization impose *more* discipline on science than anything seen in a purely academic pursuit.


7. RKN on April 10, 2012 11:46 AM writes...

But to keep everyone honest, we also have to keep everything out there where multiple sets of eyes can check things over, and this isn't always happening

I agree in principle, and the first "eyes" should be the eyes of the reviewer(s) assessing the paper. But practically speaking, even if you publish your algorithm, provide a mathematical proof (where appropriate), and make the raw data and computer code available, few people, it seems to me, are going to take the time necessary to verify it's all correct.

I work in the area of biomarker discovery using integration of multiple -omics data. The bar we strive to reach with our analysis is cross-validation on independent data sets. I've also argued that the publication of novel computational methods be accompanied by positive and negative controls, just as researchers publishing in the area of, say, cell biology (wet-bench) have been required to do for years.

Otherwise, we run the risk of: Most Random Gene Expression Signatures Are Significantly Associated With Breast Cancer Outcome
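One way a negative control of that kind might look in practice is sketched below: score the candidate signature, score many random signatures of the same size with the same procedure, and report how often chance does at least as well. This is a generic empirical p-value sketch, not the method of the paper named above or of any particular group; `score_fn`, `gene_pool`, and the other names are hypothetical placeholders for whatever scoring pipeline is actually in use.

```python
import random
from typing import Callable, List, Sequence


def random_signature_scores(
    gene_pool: Sequence[str],
    signature_size: int,
    score_fn: Callable[[List[str]], float],
    n_random: int = 1000,
    seed: int = 0,
) -> List[float]:
    """Score n_random signatures drawn at random from gene_pool.

    score_fn is a hypothetical stand-in for the real analysis, e.g. the
    strength of association between a signature and patient outcome.
    """
    rng = random.Random(seed)
    return [
        score_fn(rng.sample(list(gene_pool), signature_size))
        for _ in range(n_random)
    ]


def empirical_p_value(candidate_score: float, null_scores: Sequence[float]) -> float:
    """Fraction of random signatures scoring at least as well as the candidate.

    The +1 in numerator and denominator keeps the estimate away from zero.
    """
    hits = sum(1 for score in null_scores if score >= candidate_score)
    return (hits + 1) / (len(null_scores) + 1)
```

If a sizeable share of random signatures match the candidate, the candidate's nominal significance is telling you about the data set and not about the biology.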


8. pete on April 10, 2012 11:59 AM writes...

I'd agree there's a fair bit of "wishful-thinking-science" mixed in the push toward biomarker-ba$ed diagno$tic$.

That said, biomarker discovery will continue to evolve. Consider that, for quite some time now, MDs have used blood chemistry / blood enzyme panels to guide their decisions.


9. c on April 10, 2012 12:03 PM writes...

The important question is whether the argument/analysis/model is justified.

The problem is that people do not appreciate that science is subjective.

The plausibility arguments, including models and algorithms, should be stated clearly but nonetheless recognized as arguments. This should not be a controversial point.

The value of a good biomarker hypothesis - as defined in a consistent manner by you - is its potential to maximise the information gained per dollar or sample spent.

Whether this argument is “locked down” before or after the data is recorded is completely irrelevant, and unfortunately creates a notion that there is an absolute interpretation of data.


10. DV Henkel-Wallace on April 10, 2012 12:05 PM writes...

The thing I wonder is if there's a way to start collecting as much "omics" (love that "word") and longitudinal data as possible so that, down the line, someone can crunch the data based on various future hypotheses.

I'm thinking of a kind of broad spectrum mass Framingham study where we just keep collecting as much data as we can (and as technology develops) in the hope that it will be valuable in future. It's hard work, and it has to be done rigorously, but since it will only be valuable to some unknown people in the future, what's the incentive to start now? In fact, as johnnyboy says, the incentives unfortunately _discourage_ collecting these data.


11. gippgig on April 10, 2012 11:05 PM writes...

The more biomarkers (or anything else) you study the larger the sample size must be to establish that a possible connection is real. The weaker the connection the larger the sample size must be. Eventually you will get to the point where proving that a real connection actually is real is impossible because it would require studying more people than the total human population. Welcome to the genetic version of the Heisenberg uncertainty principle.
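Both halves of that scaling can be sketched with the textbook normal-approximation formula for comparing two group means, n per group ≈ 2(z_α + z_power)² / d², with a Bonferroni-adjusted α standing in for "more biomarkers". Everything here is an assumption for illustration: the two-group design, the standardized effect size d, the 80% power target, and the function name.

```python
from math import ceil
from statistics import NormalDist


def per_group_sample_size(
    effect_size: float,
    num_markers: int = 1,
    alpha: float = 0.05,
    power: float = 0.8,
) -> int:
    """Approximate subjects per group for a two-sided, two-group comparison.

    effect_size is the standardized mean difference (Cohen's d). The
    significance level is Bonferroni-adjusted for num_markers tests.
    """
    adjusted_alpha = alpha / num_markers
    z_alpha = NormalDist().inv_cdf(1.0 - adjusted_alpha / 2.0)
    z_power = NormalDist().inv_cdf(power)
    return ceil(2.0 * (z_alpha + z_power) ** 2 / effect_size ** 2)


if __name__ == "__main__":
    for d in (0.5, 0.1, 0.02):
        for m in (1, 1000, 1_000_000):
            n = per_group_sample_size(d, num_markers=m)
            print(f"effect size {d:<5} markers {m:>9,}: about {n:,} per group")
```

Under these assumptions the required sample grows only slowly with the number of markers but with the inverse square of the effect size, so weak connections are what push the numbers up fastest.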


12. Jose on April 11, 2012 6:47 AM writes...

"Welcome to the genetic version of the Heisenberg uncertainty principle."

Or, the rocket problem- building a bigger rocket requires more fuel, which requires a bigger rocket, which....


13. Todd on April 11, 2012 3:40 PM writes...

With regard to #10, I think it's feasible with existing samples left over from tons of different studies. You'd be surprised what's available in the right facilities if you know who to call. While the logistics and the paperwork would be a nightmare, it's definitely something that's doable from a pure science perspective.

If you do that, you could get a lot of good quality information along with existing outcomes information. It would definitely be fun to watch.


14. chris on June 6, 2014 5:51 PM writes...

Kinemed is about to have an IPO; they are monetizing their biomarker tests. They are somehow detailing pathways with heavy water, etc. Sounds very novel. I have no idea whether intensive statistical processing / simulation is required on the back end to finalize any given patient's test results.
