About this Author

Derek Lowe, an Arkansan by birth, got his BA from Hendrix College and his PhD in organic chemistry from Duke before spending time in Germany on a Humboldt Fellowship for his post-doc. He's worked for several major pharmaceutical companies since 1989 on drug discovery projects against schizophrenia, Alzheimer's, diabetes, osteoporosis, and other diseases. To contact Derek, email him directly. Twitter: Dereklowe

In the Pipeline

November 26, 2013

Of Mice (Studies) and Men

Posted by Derek

Here's an article from Science on the problems with mouse models of disease.

For years, researchers, pharmaceutical companies, drug regulators, and even the general public have lamented how rarely therapies that cure animals do much of anything for humans. Much attention has focused on whether mice with different diseases accurately reflect what happens in sick people. But Dirnagl and some others suggest there's another equally acute problem. Many animal studies are poorly done, they say, and if conducted with greater rigor they'd be a much more reliable predictor of human biology.

The problem is that the rigor of animal studies varies widely. There are, of course, plenty of well-thought-out, well-controlled ones. But there are also a lot of studies with sample sizes that are far too small, that are poorly randomized, unblinded, etc. As the article mentions (just to give one example), sticking your gloved hand into the cage and pulling out the first mouse you can grab is not an appropriate randomization technique. They aren't lottery balls - although some of the badly run studies might as well have used those instead.
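The grab-a-mouse problem has a cheap fix. Here's a minimal sketch (in Python, with hypothetical function and group names) of what proper randomization plus blind-coding might look like: every animal gets a random slot, group sizes stay balanced, and the scorer only ever sees a neutral code.

```python
import random

def randomize_and_blind(animal_ids, groups=("vehicle", "treatment"), seed=None):
    """Randomly assign animals to groups and return blinded codes.

    Returns (assignments, key): `assignments` maps a blinded code to an
    animal ID, and `key` maps each code to its group. The key stays with
    a third party until all the scoring is finished.
    """
    rng = random.Random(seed)
    ids = list(animal_ids)
    rng.shuffle(ids)                         # every animal equally likely in every slot
    assignments, key = {}, {}
    for i, animal in enumerate(ids):
        code = f"M{i:03d}"                   # the scorer sees only this label
        assignments[code] = animal
        key[code] = groups[i % len(groups)]  # alternating slots keep group sizes balanced
    return assignments, key

assignments, key = randomize_and_blind(range(1, 21), seed=42)
```

Nothing about this is sophisticated, which is rather the point: a shuffle and a code sheet are all it takes to remove both the "easiest mouse to catch" bias and the scorer's knowledge of who got what.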

After lots of agitating and conversation within the National Institutes of Health (NIH), in the summer of 2012 [Shai] Silberberg and some allies went outside it, convening a workshop in downtown Washington, D.C. Among the attendees were journal editors, whom he considers critical to raising standards of animal research. "Initially there was a lot of finger-pointing," he says. "The editors are responsible, the reviewers are responsible, funding agencies are responsible. At the end of the day we said, 'Look, it's everyone's responsibility, can we agree on some core set of issues that need to be reported' " in animal research?

In the months since then, there's been measurable progress. The scrutiny of animal studies is one piece of an NIH effort to improve openness and reproducibility in all the science it funds. Several institutes are beginning to pilot new approaches to grant review. For an application based on animal results, this might mean requiring that the previous work describe whether blinding, randomization, and calculations about sample size were considered to minimize the risk of bias. . .
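The sample-size calculation that grant reviewers would be asking about is itself only a few lines. A sketch using the standard normal-approximation formula for a two-group comparison (this is textbook power analysis, not any specific NIH requirement; the function name is mine):

```python
from math import ceil
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate animals per group for a two-sample comparison.

    effect_size is Cohen's d (difference in means divided by the SD).
    Uses the normal approximation n = 2 * ((z_{1-a/2} + z_power) / d)^2.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)

large_effect = n_per_group(0.8)   # a "large" effect still needs 25 per group
medium_effect = n_per_group(0.5)  # a medium one needs 63
```

Run the numbers and the usual eight-mice-per-arm study turns out to be powered only for effects so large you could probably see them from across the room, which is exactly the kind of thing a reviewer would want declared up front.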

Not everyone thinks that these new rules are going to work, though, or are even the right way to approach the problem:

Some in the field consider such requirements uncalled for. "I am not pessimistic enough to believe that the entire scientific community is obfuscating results, or that there's a systematic bias," says Joseph Bass, who studies mouse models of obesity and diabetes at Northwestern University in Chicago, Illinois. Although Bass agrees that mouse studies often aren't reproducible—a problem he takes seriously—he believes that's not primarily because of statistics. Rather, he suggests the reasons vary by field, even by experiment. For example, results in Bass's area, metabolism, can be affected by temperature, to which animals are acutely sensitive. They can also be skewed if a genetic manipulation causes a side effect late in life, and researchers try to use older mice to replicate an effect observed in young animals. Applying blanket requirements across all of animal research, he argues, isn't realistic.

I think, though, that there must be some minimum requirements that could be usefully set, even with every field having its own peculiarities. After all, the same variables that Bass mentions above - which are most certainly real ones - could affect studies in completely different fields. This, of course, is one of the biggest reasons that drug companies restrict access to their animal facilities. There's always a separate system to open those doors, and if you don't have the card to do it, you're not supposed to be in there. Pace the animal rights activists, that's not because it's so terrible in there that the rest of us wouldn't be able to take it. It's because they don't want anyone coming in there and turning on lights, slamming doors, sneezing, or doing any of four dozen less obvious things that could screw up the data. This stuff is expensive, and it can be ruined quite easily. It's like waiting for a four-week-long soufflé to rise.

That brings up another question - how do the animal studies done in industry compare to those done in academia? The Science article mentions some work done recently by Lisa Bero of UCSF. She was looking at animal studies on the effects of statins, and found, actually, that industry-sponsored research was less likely to find that the drug under investigation was beneficial. The explanation she advanced is a perfectly good one: if your animal study is going to lead you to spend the big money in the clinic, you want to be quite sure that you can believe the data. That's not to say that there aren't animal studies in the drug industry that could be (or could have been) run better. It's just that there are, perhaps, more incentives to make sure that the answer is right, rather than just being interesting and publishable.

Doesn't the same reasoning apply to human studies? It certainly should. The main complicating factor I can think of is that once a company, particularly a smaller one, has made the big leap into human clinical trials, it also has an incentive to find something that's good enough to keep going with, and/or good enough to attract more investment. So perverse incentives are, I'd guess, more of a problem once you get to human trials, because it's such a make-or-break situation. People are probably more willing to get the bad news from an animal study and just groan and say "Oh well, let's try something else". Saying that after an unsuccessful Phase II trial is something else again, and takes a bit more sang-froid than most of us have available. (And, in fact, Bero's previous work on human trials of statins seems to show various forms of bias at work, although publication bias is surely not the least of them).

Comments (37) + TrackBacks (0) | Category: Animal Testing


1. Boghog on November 26, 2013 9:29 AM writes...

Another thing to watch out for: variations in gut flora can have a big impact on the outcome of an animal study. See for example PMID 21262117.

2. anon1 on November 26, 2013 9:30 AM writes...

In my experience, outbred animals are rarely used in animal models, sometimes/often by necessity (i.e., knockouts, Brattleboro rats, etc.) but also because inbred lines give "cleaner, more reproducible" results. If you have to stack the deck that much for your model to work in the lab, it probably won't work in the clinic.
Also, it seems to be the exception rather than the rule that the scientists/technicians scoring the model are blinded to drug vs. control.

3. Anonymous on November 26, 2013 9:47 AM writes...

As a scientist, I find the backlash against certain reporting requirements disturbing to say the least. People seem to think that such requirements mean that you must adhere to a certain protocol, rather than a requirement to have a methods section with sufficient detail. It's hard to reproduce conditions that you do not know. This must have something to do with the difficulty in repeating published work...

4. Anonymous on November 26, 2013 9:54 AM writes...

Could the over-reliance on animal models be attributed to your previous post on Lipinski's anchor?

If it were up to me, I would get rid of them all except for basic tox.

5. Charlie Kilian on November 26, 2013 10:10 AM writes...

"industry-sponsored research was less likely to find that the drug under investigation was beneficial"

That also tracks with the view that academia is morphing into a model where it is selling its intellectual property. It makes a certain amount of sense that if academics are under pressure to produce more intellectual property, they would be less likely to set up the more rigorous trials. The incentives just aren't there. If the university can make money licensing research that produced results at p = .05, then what incentive do they have to improve?

Disclaimer: I am neither in the industry, nor in academia. Just an interested outside observer. I could be way off base.

6. Hap on November 26, 2013 10:21 AM writes...

How do you know that something is due to species- and topic-dependent variability and not sloppiness in performing the studies if you haven't tried to eliminate the methodological sloppiness as a factor?

7. Erebus on November 26, 2013 10:28 AM writes...

@4: That's a totally inane comment. The early stages of drug discovery have no replacement for animal models. Clearly, some are better and some are worse, and we should certainly endeavor to work towards more relevant, more robust, more reproducible models... but we certainly can't do away with animal studies entirely. There's a lot to learn from them. Knockout mice are very useful tools, as well.

8. Jack Shaftoe on November 26, 2013 10:28 AM writes...

Another facet on top of animal studies not done with sufficient rigor (a serious problem - I agree), is the misrepresentation of pharmacological results from animal studies.
When labs claim validation of their target from treating their animals with a compound and seeing a result, too often I find that they have not accounted for the PK of the compound at all (they have no idea how much of the compound was present in the poor animal) nor the selectivity of the compound.
PK is important. If you dose a aldehyde containing compound with a half-life of 1 h only once in a 12 day study, I am not at all surprised that it gives a different result than a compound similarly dosed that has a half-life of 15 days, but I do not buy the stated conclusions that the results say anything at all about the targets the compounds are hitting.
As examples of overselling the selectivity of a molecule, I continue to see that massive doses of valproic acid seem to prove that HDACx was involved in driving the phenotype of a model when the selectivity of that compound is suspect at best. Once I saw resveratrol used as an antagonist of AhR.
Please pay attention to the properties of the small molecule tools that are used.
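Shaftoe's half-life point is worth putting into numbers. A sketch of the arithmetic (a simple one-compartment, first-order elimination model, ignoring absorption and distribution; the function name is mine):

```python
def fraction_remaining(hours_elapsed, half_life_hours):
    """Fraction of a single dose still present under first-order elimination."""
    return 0.5 ** (hours_elapsed / half_life_hours)

# A single dose of a compound with a 1 h half-life is essentially gone
# by the end of the first day:
short_lived = fraction_remaining(24, 1)             # roughly 6e-8 of the dose left
# The 15-day half-life comparator is still mostly on board at day 12:
long_lived = fraction_remaining(12 * 24, 15 * 24)   # roughly 0.57 of the dose left
```

Comparing those two dosing regimens head-to-head and attributing the difference to the targets, rather than to exposure, is exactly the mistake he's describing.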

9. Jim on November 26, 2013 10:34 AM writes...

@#6 Hap: I don't even think it's an issue of sloppiness, just a matter of reproducibility. As a behavioral pharmacologist, I have seen many "standard" assays run in fairly different ways. I don't think one way is necessarily "right", but if I don't understand how the assays differ, I cannot interpret data that are not equal (or similar, or close, or whatever).

Clearly, there is lots of sloppy research that is done out there - the standards should be a step towards improving that across the board and it wouldn't be prohibitive to implement them.

10. Anonymous on November 26, 2013 10:37 AM writes...

Why use animal models at all, when molecular dynamics simulations will tell you everything you need to know? In other words, just flip a coin, or consult the planets.

11. Anon on November 26, 2013 10:40 AM writes...

I think we should all be reading this as
"work done recently by [the technicians that are willing to work] for Lisa Bero of UCSF"
IMO, this is the same story we always hear. Reproducibility is an issue. We can always trace this back to the low pay, no career development, no benefits, borderline abused individuals that give results they are expected to give to their PI.
But that is never addressed and hand-waved away.

12. Virgil on November 26, 2013 10:41 AM writes...

What the post misses is that for many people (particularly outside the US), animal studies are difficult and expensive in the same way that clinical trials are.

Sure, if you're a physiologist in a big US University it's easy to say "make the knockout mouse already". Such people typically have no problem bemoaning their lack of access to clinical samples and the difficulties in translating their results to humans. They would do well to realize their privileged position and respect that those further down the ladder may be equally perplexed at the difficulties in translating INTO the whole animal, from cell culture.

There are a TON of scientists working with (and publishing on) cell culture or even simpler in-vitro biological systems. For them, doing a mouse study is a major undertaking. We're talking about people with degrees in very basic sciences such as biochemistry, who often have no training whatsoever in whole animal physiology. For them, "translational" means doing it in any in-vivo model. It requires dealing with the IACUC and other such regulatory entities. For a basic in-vitro scientist (e.g. cell biologist), this can be just as scary as the IRB is to a large mammalian physiologist. When you run a lab with all cell lines and engineered proteins and expression constructs, moving everything to a whole animal model is expensive, and not something you do on a whim.

So yes, animals don't often translate to humans, but let's not forget there's another massive obstacle further back down the chain, where cell culture results don't translate into a whole animal system. How many things kill cancer cells in a dish? How many of those have been proven in mouse tumor models?

13. Hap on November 26, 2013 10:49 AM writes...

Sloppiness was the wrong word to use, because I assumed that there was one right way to do the studies that wasn't being followed, when there's not.

It would still help for people to know how much variability is present after accounting for the basic methods before assuming that the variability is inherent in the nature of the study and the phenomenon.

14. MILFshake on November 26, 2013 12:08 PM writes...

Another compelling argument, made recently in a paper by Valen Johnson, is that our standards for significance are much too lax. He advocates a considerably stricter P-value threshold.

"Although it is difficult to assess the proportion of all
tested null hypotheses that are actually true, if one assumes that this proportion is approximately one-half, then these results suggest that between 17% and 25% of marginally significant scientific findings are false."

I may be dumb, but the null hypothesis is true for me waay more than 50% of the time, so this is probably a wild underestimate.....
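The commenter's intuition can be put into numbers with the standard false-positive-risk arithmetic (a simplified screening calculation, not Johnson's Bayes-factor analysis; the function name is mine):

```python
def false_positive_risk(prior_true=0.5, alpha=0.05, power=0.8):
    """Fraction of 'significant' results that are actually false positives.

    prior_true: proportion of tested hypotheses that are genuinely real.
    """
    false_pos = alpha * (1 - prior_true)   # true nulls that reach significance anyway
    true_pos = power * prior_true          # real effects that are detected
    return false_pos / (false_pos + true_pos)

# With half of tested hypotheses true and decent power, about 6% of
# "significant" hits are false:
well_powered = false_positive_risk(0.5, 0.05, 0.8)   # ~0.059
# At 20% power (common in small animal studies), it's 20%:
underpowered = false_positive_risk(0.5, 0.05, 0.2)   # 0.20
# And if real effects are rarer than 50/50, it climbs fast:
long_shot = false_positive_risk(0.1, 0.05, 0.2)      # ~0.69
```

So the commenter's point stands: if your true-hypothesis rate is well below one-half, the fraction of published "hits" that are wrong can easily exceed Johnson's 17 to 25%.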

15. NJBiologist on November 26, 2013 12:32 PM writes...

@5 Charlie Kilian: "If the university can make money licensing research that produced results at p = .05, then what incentive do they have to improve?"

I think you're on the right track, but you're underestimating the scope of the issue. Grants and tenure come from publications; publications come from positive results. The result is a massive bias for positive results.

16. Reverend J on November 26, 2013 12:40 PM writes...

Whoa, wait a minute...are you telling me that mice aren't people?

*Mind blown*

17. ScientistSailor on November 26, 2013 1:35 PM writes...

Lennie said, "I might jus' as well go away. George ain't gonna let me tend no rabbits now."

18. RKN on November 26, 2013 1:46 PM writes...

Whoa, wait a minute...are you telling me that mice aren't people?

*Mind blown*

It's worse -- mice held in captivity. The argument we hear so often is, "Surely we can learn something from all these mouse experiments."

No doubt, just very little in the way of translatable human biology. If you ask me the goal is (or ought to be) translation to humans, not merely reproducibility of the experiments.

It's like proposing to study human sexual psychology and your cohort is a prison population.

And if coming into the animal room and turning on lights, slamming doors, or sneezing is all it takes to foul an experiment, that doesn't say much for the robustness of any result, and even less for translatability. Can you say over-fitted?

19. Cellbio on November 26, 2013 2:10 PM writes...

Don't agree, Erebus. For the reasons mentioned above, and because most models are highly dynamic induced states of exaggerated physiology rather than a disease process, there is not a lot to learn about human disease, only something about the pressure points of the constructed dynamic process. Is an adjuvant-induced arthritis model anything more than an immunization model? No, not much more. Has curing mice of tumors helped pick winners? No. How about asthma models? Nope. On and on.

If successful model treatments fail to predict clinical success, why would we assume failure in the model (non-trivial failures) means we should not go forward into humans? If the mechanism is supported by human biology or genetics, then to hell with the animal models.

20. Lyle Langley on November 26, 2013 2:13 PM writes...

@#8, Jack Shaftoe..

Couldn't agree more. I can't tell you how many grants I've reviewed that do not have any PK data attached to their "stunning" in vivo efficacy slides. This isn't isolated to academia, either; I've reviewed many a grant for a major foundation, and industry is just as at fault (large and small). It is such an issue that even before I start my critique, the scientific officers know what my first criticism is going to be. It is especially painful in the CNS arena: people presenting no PK at all, or simply plasma levels with no other correlating piece of information, as proof that a compound works in the brain.

21. Jack Shaftoe on November 26, 2013 3:15 PM writes...

@#20 - Lyle - Please keep holding the grant writers accountable. PD without PK just doesn't mean much of anything.

Is it that PK analysis is not widely available?

22. newnickname on November 26, 2013 4:21 PM writes...

VERY interesting problem but I haven't read the cited articles yet. 1. Do they say anything about animals and tox studies? 2. In the Pipeline had a story about animals and immunology on February 13, 2013. "Mouse Models of Inflammation Are Basically Worthless. Now We Know." (Add a hotlink?) 3. I often bring up Gerald B Dermer who made a case about how worthless and misleading cancer screening in cell culture can be in "The Immortal Cell" (Avery Press, 1995). 4. @18 RKN: "It's like proposing to study human sexual psychology and your cohort is a prison population." I volunteer to be a non-captive control as long as I get to choose my personal cohort.

23. Erebus on November 26, 2013 4:22 PM writes...

@19: I believe that what you're suggesting would absolutely paralyze pharmaceutical research. We don't need to test everything that clears tox on humans (which cannot be done; it is impractical in the extreme); what we need is better models. There's lots of room for improvement in those models, the community realizes that we've got a problem with a lot of 'em, and it's an issue that's being worked on.

...Obviously the xenograft rodent tumor model is garbage, and I don't suppose that the OVA-induced asthma models are much better, but that doesn't mean that all animal models of disease are therefore worthless. They simply need to be improved for relevance towards human disease, and standards need to be set for future reproducibility.

24. Hap on November 26, 2013 4:44 PM writes...

If none of the animal models work, why do we use them? I am assuming that people's lack of desire to admit they don't know something and institutional inertia are significant causes, but not sufficient or exclusive ones.

If we don't get any useful data from animals (other than brute toxicity), then couldn't we just wave a dead chicken over a drug candidate and get the same success rate a lot more cheaply?

25. bacillus on November 26, 2013 5:13 PM writes...

What this article fails to address is why rodents have become the overwhelming models of choice for life scientists. I contend that it has been done in part to assuage the moderate animal welfare organizations, and to save money, rather than because rodent models are superior to all other alternatives. Therefore, I think it is more important to determine on a disease-by-disease basis which animal models are most likely to mimic the human disease and its response to treatment. Scientifically, if it turns out that dogs, cats, pigs, or monkeys are more appropriate for developing therapies against many human diseases, is the scientific community prepared for the consequences? I don't think so, since it was our own capitulation and convenience that got us into this mess in the first place. I shudder to think that this law of unintended consequences has meant that many millions of mice and rats have been squandered needlessly when exponentially fewer "higher" mammals could have led to better translational odds.

FTR. I posted this up for comment at Science 5 hours ago, so I assume it was binned there.

26. Anonymous on November 26, 2013 5:23 PM writes...

Sigh, drug discovery was so much easier when we could just inject random stuff into slaves and prisoners. Damn human rights folks!

27. Anonymous on November 26, 2013 5:27 PM writes...

"why rodents have become the overwhelming models of choice for life scientists?"


Because they breed, grow and die a lot faster than dogs, cats, pigs, or monkeys.

28. Hap on November 26, 2013 5:33 PM writes...

The whole point is to avoid testing in people first, without knowing as much as you can about what you'll get - it's expensive, and people are unpredictable. Animals are supposed to act like us, so we'd rather use them than people, but if they aren't, then we need something else - human cell lines or artificial organs or something that acts like us. I don't think anyone wants to carpet-bomb candidates into people, because then you won't have anyone to test your good candidates on.

Besides, slaves and prisoners probably don't look physiologically like the populations they would be used to represent, so that testing in them wouldn't be helpful (assuming helpful doesn't involve yet another 8th Amendment impalement)

29. Jim on November 26, 2013 6:04 PM writes...

Yes, animal models need improvement, and at least in my field have made significant improvements over the past decade. Clinical trials also need improvement, but that's another topic. For anyone who says they're worthless, the questions I have are 1.) how else do you select your lead candidate from the 10-100 NMEs that have met all criteria in your screening tree? and 2.) can you tell me of any drugs that have been shown to work in the clinic that failed in animal models? I know that seems counterintuitive, but there are lots of drugs that were approved for indication A and later were found to be effective for treating indication B, and then were also shown to be effective in animal models. (Take for example, gabapentin.) In the end, there may be some false positives with animal models, but if there aren't false negatives, their use is still appropriate.

30. Anonymous on November 26, 2013 6:07 PM writes...

"Animals are supposed to act like us, so we'd rather use them than people ... slaves and prisoners probably don't look physiologically like the populations they would be used to represent, so that testing in them wouldn't be helpful".

Are you saying that animals are more similar to human patients than slaves and prisoners are?

31. Anonymous on November 26, 2013 6:09 PM writes...

"can you tell me of any drugs that have been shown to work in the clinic that failed in animal models?"

That's a dumb question, given that nobody would ever try to test any compound in the clinic if it failed in animals.

32. Hap on November 26, 2013 6:26 PM writes...

No, but if you're skipping to testing in people, then the people you're testing had better look like the people you want to buy your drugs, otherwise you're wasting lots of time and money and people's lives doing something that doesn't help you.

33. Cellbio on November 26, 2013 7:36 PM writes...

The criteria for testing agents in clinical trials would not just be that a compound clears tox. There always has to be significant data supporting potential benefit to warrant testing in humans. Those data can, and should, and from a regulatory sense must include testing in animals, just not disease models. One can show a drug has a biochemical or pathway impact after dosing and associated with blood or tissue levels distinct from adverse events. Whether this pharmacological impact "cures" disease in animals is irrelevant to moving forward in humans.

35. hmmmmm.... on November 27, 2013 3:54 AM writes...

@#31 SSRIs are used to treat anxiety disorders in the clinic but don't work in most animal models of anxiety.

Another point: have the ARRIVE guidelines fallen into a black hole in the US? This whole area has been debated, published on, and signed up for by a broad range of journals, and now the NIH wants to do it again? I've not read the Science article (paywall), but if the author has not mentioned ARRIVE, then it will be a shoddy, poorly researched piece of work.
Yes, these problems need to be highlighted and discussed, but don't ignore the fact that this was all being seriously discussed and acted upon 3 years ago.

36. Pete Kissinger on November 27, 2013 3:27 PM writes...

I very much like what Jack Shaftoe has to say here. It is difficult to generalize across the landscape of diseases and toxic reactions to xenobiotics. Animals are excellent for many macroscopic features, but are less reliable on quantitative details. One area I have worked on for 20 years is the question of how to get a sample from a mouse or rat or pig or monkey without the stress response to sampling dramatically distorting the observables in the sample. This effect is large and often ignored, because avoiding it costs more than the traditional manual methods that were once "the only way" but are not today.
We get complete PK data in a single mouse with no human in sight - this was impossible prior to about 2005.

37. NJBiologist on December 2, 2013 12:28 PM writes...

@35 hmmmmmmmmmmm....: I've run some SSRIs in stress-induced hypothermia and marble burying; they worked beautifully. Better than benzos, in fact.

