Corante

About this Author

Derek Lowe, an Arkansan by birth, got his BA from Hendrix College and his PhD in organic chemistry from Duke before spending time in Germany on a Humboldt Fellowship for his post-doc. He's worked for several major pharmaceutical companies since 1989 on drug discovery projects against schizophrenia, Alzheimer's, diabetes, osteoporosis and other diseases. To contact Derek, email him directly: derekb.lowe@gmail.com. Twitter: Dereklowe


In the Pipeline

November 12, 2010

99% Yield? That, Friends, Is Deception

Posted by Derek

Here's an attention-getting paper from Tomas Hudlicky (and his co-author Martina Wernerova), and I'd like to help it get some more. It begins:

One who has been reading the literature concerned with organic synthesis in recent years, be it methodology, catalysis, or total synthesis of natural products, may have noticed considerable inflation in the values reported for isolated product yields, ratios of diastereomers, and enantiomeric excess values. A comparison of papers published during the period 1955 to 1980 with those published between 1980 and 2005 reveals that those from the more recent period frequently report isolated product yields of reactions >95%. Such large values were rarely found in the older literature and are all but absent in Organic Syntheses, a journal that only publishes procedures that have been independently reproduced. . .

There, does that sound like the chemical literature you know? Just a bit? Hudlicky has tackled this issue before, and the reasons he advances for the problem remain the same: pressure to make your methods stand out (to the scientific community, to the journal editors, to the granting agencies), a decrease in scale in reactions (making accuracy and precision more difficult), and, finally, what he refers to as "deliberate adjustment". That's well put; the rest of us know it as fraud.

He identifies the mid-1980s as roughly the period when things really started to go to pieces, saying that most procedures in reputable journals before that era are reproducible by, as they say, one skilled in the art, while the proportion of reproducible procedures has been decreasing since then. And he puts some numbers on the problem, performing a series of test experiments with extremely careful weighing and analysis.

These confirm what every working organic chemist knows: the more manipulations, the more sample you lose. Filtration through a plug of silica gel, into one flask, can give you pretty much complete recovery. But if you cut fractions, you're going to lose about 1%. And if you have to do a separation, even between two widely separated compounds on silica, you're going to lose about 2%. So people who report a >98% yield after chromatography from a real-world crude mixture are kidding themselves. The same goes for extractions and other common methods. In general, every manipulation of a reaction is going to cost you 1 to 2% of your material, even with careful technique. Hudlicky again:

Given that most academic groups do not subject day-to-day reactions to serious optimization or matrix-optimization [6] as is done in industry, it is reasonable to assume that the vast majority of the reactions reported in the literature do not proceed with quantitative conversions. Such aspect would approximate our experiments with mixtures of pure compounds. Because a minimum of three operations (extraction, filtration, and evaporation) is required in working up most reactions, we conclude that yields higher than ca. 94% obtained by work-up and chromatography of crude reaction mixtures are likely unrealistic and erroneous in nature. Such values may arise as a direct consequence of not following correct protocols, which would be expected in the fast-paced academic environment. (An astute student of the organic literature may discover that this very author has been guilty of reporting yields in this range from time to time!)
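As a back-of-the-envelope sketch (not taken from the paper), the arithmetic behind that ceiling can be checked by compounding a fixed fractional loss over each handling step. The function name and the assumption that every step independently loses the same fraction are illustrative only; the 1 to 2% per-operation figures and the three-operation minimum are the ones quoted above.

```python
def max_recovery(loss_per_step: float, n_steps: int) -> float:
    """Best-case fraction of material left after n_steps manipulations,
    assuming (for illustration) that each step loses the same fraction."""
    return (1.0 - loss_per_step) ** n_steps


# Three operations (extraction, filtration, evaporation), as in the quote.
for loss in (0.01, 0.02):
    pct = 100 * max_recovery(loss, 3)
    print(f"{loss:.0%} loss per step -> at most {pct:.1f}% recovered")
```

With a 2% loss per step the ceiling comes out at about 94%, which matches the figure in the quote, and that is before any chromatography or incomplete conversion is taken into account.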

He goes on to detail the limits of error in weighing, which depend greatly on the amount of sample and the size of the flask. (The smaller the sample-to-container ratio, the worse things get, as you'd figure.) And he turns to analyzing mixtures of diastereomers by NMR, LC, and the like. As it turns out, NMR is an excellent way to determine these up to about a ratio of 95:5, but past that, things get tricky. And "past that" is just where a lot of papers go these days, with a precision that is often completely spurious.
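The weighing point can be illustrated with a rough sketch. It assumes the product mass is obtained by difference (full flask minus empty flask) and that each weighing has the same standard deviation, so the two errors add in quadrature. The 0.1 mg figure is a placeholder chosen for illustration, not a number from the paper, and the function name is hypothetical.

```python
import math


def relative_mass_error(sample_mg: float, balance_sd_mg: float = 0.1) -> float:
    """Relative uncertainty of a mass obtained by difference.

    balance_sd_mg is an ASSUMED per-weighing standard deviation; two
    weighings (tare and gross) combine as sqrt(2) times that value.
    """
    combined_sd = math.sqrt(2) * balance_sd_mg
    return combined_sd / sample_mg


for sample in (5.0, 50.0, 500.0):
    print(f"{sample:6.1f} mg sample: +/- {relative_mass_error(sample):.2%}")
```

Under these assumptions a 5 mg sample already carries an uncertainty of nearly 3% from weighing alone, while a 500 mg sample is down in the hundredths of a percent. A real balance loaded with a heavy flask may well do worse than the placeholder value used here.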

Here's the bottom line:

The conclusion drawn from this set of experiments points to the prevalence of serious discrepancies in the reporting of values for yields and ratios in the current literature. We have demonstrated that the facilities and equipment available in a typical academic laboratory are not adequate to support the accuracy of claims frequently made in the literature. . .The current practice of reporting unrealistically high isolated product yields and stereoisomer ratios creates serious problems in reproducibility and hence leads to diminished credibility of the authors.

He recommends a rigorous disclosure of the spread of product yields over multiple experiments, calibration of LC and GC apparatus, or (failing that) at least admitting that no such analysis has been done. (He also recommends getting rid of the concepts of diastereomeric and enantiomeric excess, in line with my fellow Arkansan Robert Gawley's advice). But I think that these ideas, while perfectly reasonable, don't get at the underlying problems - the inflationary pressure to produce more and more noteworthy results. Hudlicky's rules should be adopted - but I fear that they might just push the self-deception (and outright fraud) into newer territories.
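For what reporting the spread of yields over multiple experiments could look like in practice, here is a minimal sketch. The function name, the output format, and the three yield values are all hypothetical; the paper may well prescribe a different presentation.

```python
from statistics import mean, stdev


def summarize_yields(yields_pct: list[float]) -> str:
    """Report replicate yields as mean +/- sample standard deviation, plus range."""
    if len(yields_pct) < 2:
        raise ValueError("need at least two runs to report a spread")
    avg = mean(yields_pct)
    sd = stdev(yields_pct)  # sample (n - 1) standard deviation
    return (f"{avg:.0f} +/- {sd:.0f}% "
            f"(n={len(yields_pct)}, range {min(yields_pct):.0f}-{max(yields_pct):.0f}%)")


# Hypothetical isolated yields from three runs of the same reaction.
print(summarize_yields([78.0, 84.0, 81.0]))
```

A line such as "81 +/- 3% (n=3, range 78-84%)" tells a reader far more about what to expect at the bench than a bare "84%".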

I'm glad he's published this paper, though. Because everyone knows that this is a real problem - we complain about it, we joke about it, we mutter and we grit our teeth. But "officially", in the published literature, it's never mentioned. Let's stop pretending, shall we?

Comments (67) + TrackBacks (0) | Category: The Dark Side | The Scientific Literature


COMMENTS

1. john on November 12, 2010 9:33 AM writes...

So this is something I've always felt strongly about. Yields are always published as a percent, no plus or minus. Well, that tells me that you never repeated the experiment or don't want to talk about the variance. Why don't journals start demanding a yield calculated from at least 3 replicates of the experiment? Give me some statistics; we all know yields are gonna vary day to day, heck, from moon cycle to moon cycle given how tricky some reactions are. If you tell me 80 and I try it and get 50, I'm gonna call your work crap. If you tell me 80 but give me a range I can expect, I'm happier if I get something in that range, and I'm less likely to think you're just trying to make your synthesis look better. It also saves me time and money by not using a route that isn't suited to what I'm trying to achieve.

2. Brian on November 12, 2010 9:35 AM writes...

99% yield means you forgot to remove your magnetic stir bar from your reaction product or you still have residual metal in your product (like a nugget of Pb). A few times when I thought I had a crystallization nailed down, it was an inorganic salt entrained in my product that had precipitated. Having done analysis in my past, I look for water content by Karl-Fischer, a comparative wt/wt assay and a determination of residual inorganics before I would say something like I had >90% yield. I take a more rigorous approach.

3. gyges on November 12, 2010 9:36 AM writes...

Two points

1) If he wanted to be read, why did he publish, perhaps more accurately, pseudopublish, in a closed source journal?

2) "But I think that these ideas, while perfectly reasonable, don't get at the underlying problems - the inflationary pressure to produce more and more noteworthy results."

One of the underlying problems is the ignorant manner in which professional chemists regard one another. I've seen it time and time again, someone reports an excellent piece of work - mass balances, good analysis, sensible decisions at bifurcation points - the end of which has been to quote yields in the 60 - 70 per cent range. His/her reward has been some ignorant whispering campaigning about how it could've been done better etc ... which is invariably utter and complete bollocks.

To explain this culture look to the chemists with hair triggers who shoot from the hip and are not held to account.

4. tuky tuky on November 12, 2010 9:58 AM writes...

seeing a 99% yield always makes me giggle

5. KC Nicolaou on November 12, 2010 10:01 AM writes...

You're all just jealous.

6. Donough on November 12, 2010 10:14 AM writes...

Different industry, similar story.

Membranes have been touted for energy savings for a while. With membranes, any number of polymer, zeolite, or ceramic materials, or mixes of them, can be used. In the literature it is 'normal' to see selective comparisons to previous work, whereby the new membranes are compared to an obviously worse membrane (which no one would use).
In one field that pretty much ended when most of the relevant membranes were reviewed and compared to each other (200-300 membranes). The number of novel membrane types being published suddenly dropped as the crap was cut. While this may seem anti-innovation, most of these novel membranes already had a good chance of being crap before testing.

However, not only are the researchers responsible, the publishers are responsible also. If I can identify crap results from the abstract, surely the reviewers, who are supposed to be experts, can also.

7. number cruncher on November 12, 2010 10:19 AM writes...

Very gratifying to see commentary like this appearing in the literature. I never understood why synthetic chemists are allowed to avoid reporting the sorts of statistical analyses without which papers in other fields aren't even publishable.

8. Nick K on November 12, 2010 10:37 AM writes...

Bravo! At long last, someone is prepared to tell the truth about reported yields! It's absolutely no coincidence that the reliability and trustworthiness of much of the organic literature went down at the same time as the yields went up. I rarely have problems with papers from the 50's to the 70's (yields 90%) often fails in my hands...

9. milkshake on November 12, 2010 10:43 AM writes...

I do get 99% occasionally, when all these factors conspire: 1) I run a very clean transformation on a multigram scale and 2) I isolate the product by filtration and it is very insoluble (a common thing in the kinase field).

Calculating the yield from 5 mg of evaporation residue in a 50 g flask is kind of a joke - I think it would be better in the case of small-scale reactions to provide the overall yield after several steps. (Then of course one has to consistently use the material from the previous step - and not from a big batch sitting in a freezer that was made by someone else).

Also, if you report 75% yield and someone else can reproduce your procedure with an 83% yield, it will not give you a bad reputation.

10. Nick K on November 12, 2010 10:46 AM writes...

An addendum to my previous comment: yield inflation may also have something to do with the recent decline in use of microanalysis. It's far easier to claim a 99% yield with a clear conscience if you're relying on HRMS rather than CHN.

11. Will on November 12, 2010 10:50 AM writes...

The greatest compliment anyone ever gave me on my chemistry was returning to grad school a few years after leaving and talking to current members of my PI's lab and hearing that the stuff in my thesis was reproducible in the reported yields. They also told me the current students/postdocs couldn't reproduce the work of my co-worker anywhere near the reported yields.

Of course, that coworker published in JACS whereas I left with a masters after 5+ years... (cue minuscule violins, the bitterness went away after about 4 years)

12. Anonymous on November 12, 2010 11:20 AM writes...

There'll be a few Nobel prize winners having to hand back their medals if this movement gets off the ground. We all know that in some groups a grad student's life expectancy is determined by their yields. We also all know that there are some groups whose published work is extremely difficult to reproduce. What do you think the correlation is like?

13. mad on November 12, 2010 11:39 AM writes...

Maybe they meant mass balance :)

14. Jordan on November 12, 2010 12:06 PM writes...

I had a chance to chat with Hudlicky a couple of years ago -- he is a pretty cantankerous guy (lots of "kids these days" type remarks) but also very pragmatic and very dedicated to organic chemistry.

15. partial agonist on November 12, 2010 12:19 PM writes...

This is not an issue I ever got too worked up about since I pretty much take all reported yields with a grain of salt anyway. If it says 95%, I think 80%. If it says 85%, I think 70%. Most of us do that every day and have done it for so long, it's just understood.

Back in the day, my advisor wanted us to put the average yield for a transformation in the papers, not the best. That mindset is rare, it seems.

16. Anonymous on November 12, 2010 12:42 PM writes...

THud gave a few classroom lectures and handled problem sessions where I went to school.

Much fun was had at the expense of us all.

17. azetidine on November 12, 2010 12:42 PM writes...

Hey #5, if I actually went back and tried to reproduce your chemistry, what percentage of the reactions would actually WORK as described, much less in the yields reported?

18. molecular architect on November 12, 2010 12:47 PM writes...

Reminds me of the work by a very prominent Nobel Prize Winner (HCB) who used to report "corrected yields" for organoboranes. The organoboranes were themselves rarely isolated and, as any organic chemist should know, can be converted to many different functional groups. His people would take known amounts of the end products, run them through the workup and calculate the % recovery. They then "corrected" the yield of the intermediates assuming 100% conversion to the end products. This was fully disclosed in the publications and "technically valid", nonetheless, I always thought it was cheating since most chemists were interested in the final products not the organoborane intermediates.

19. David Formerly Known as a Chemist on November 12, 2010 12:55 PM writes...

This is the type of article that reaffirms what you already know, like those published scientific studies that concluded prayer has no effect on outcome of sick patients (really, see for example http://www.ncbi.nlm.nih.gov/pubmed/19370557).

I was fortunate to never experience pressure to increase my reported yields, since I never worked on development of synthetic methodology. Development of new synthetic methods seems to be the papers where "yield inflation" would naturally be the most intense. Of course, working in process chemistry, there's always the pressure to increase your yields, but for purely economic reasons.

As bad as this problem is in the scientific literature, it doesn't hold a candle to the irreproducibility of procedures in patents.

20. Anonymous on November 12, 2010 1:46 PM writes...

This seems like arguing about semantics. Do you really disregard a reaction or favor one method over the other because it is 85% vs. 95% yield?

While the point Hudlicky brings up may apply to methodology papers, I don't see where it matters that much in regards to total synthesis. You cannot keep adding 5% to your yield in each step of a multistep synthesis; the solvents, impurities, etc. eventually average out throughout the steps.

21. processchemist on November 12, 2010 1:55 PM writes...

Those who work on the process side know far too well how much "yield" is a stressed word.
Milkshake talks about "stone-like" kinase inhibitors. Yes, many of these compounds can precipitate as rocks from a reaction mixture, so you can have very high yields. But high yields of what? Of a product with an HPLC grade exceeding 98.5%, with all impurities less than 0.5% by HPLC, total ashes under 1%, AND of the right crystalline form, with a particle size distribution good for further processing (milling/sieving/pharma manufacturing)?
And I totally agree about dropping enantiomeric and isomeric excess as parameters to evaluate the output of a reaction.

22. Spiny Norman on November 12, 2010 1:57 PM writes...

The problem starts with laboratory courses where all or part of the grade depends on yield x purity. QED, chumps.

23. MTK on November 12, 2010 2:20 PM writes...

Meh.

I'm with partial agonist on this one. It doesn't bother me that much really. I don't take the numbers that seriously. To me 95% yield means it's a good reaction.

24. petros on November 12, 2010 2:40 PM writes...

I think it started rather earlier than 1985, close to where Derek works, with it being the norm in JACS articles from a certain group.

25. Bruce Hamilton on November 12, 2010 2:51 PM writes...

Good subject to discuss. I'd also like to see the end of ee, it's unnecessary with the widespread use of chiral HPLC, and ignores other impurities.

I'd also like to see both achiral and chiral HPLC data for molecules claiming chiral purity. That alone will involve some extra effort, as a chiral system has to show it will separate all molecules of interest.

I've run published chiral separations with enantiomers producing a single peak, suggesting the author did not confirm the chiral column was appropriate for the separation.

I'm not so thrilled about the suggested calibration of analytical HPLC response factors, as it can be a major effort to make relevant "separate source" reference compounds. I'd just prefer evidence that potential impurities and precursors have been separated from the analyte by the submitted method.

Too often, people assume that what's loaded onto a chromatography system will be detected, which is only true for a few types ( eg TLC, Iatroscan (TLC-FID)), whereas the more common GC, Flash, and HPLC techniques obviously can only ever detect what is eluted. "xx% by chromatography" was introduced to describe such products, rather than just "xx%"

I also agree with the concern in a comment above about the increasing use of MS instead of elemental analysis. It complements the composition data, rather than replacing it.

26. Resveratrol Receptor on November 12, 2010 3:56 PM writes...

My reactions create matter, muahahaha.

27. iridium on November 12, 2010 4:15 PM writes...

Even if the author is formally correct, I am with 23 on this one:

" To me 95% yield means it's a good reaction. "

- If you are working in process development you will do your optimization on your substrate.
- If you are a medicinal chemist... you do not care about a difference of 10% in yield, and you are most likely using a slightly different substrate anyway.

Although it is obviously wrong, unethical and not educative to embellish the yields, I do not see the point of students wasting time calibrating everyday HPLC, calculating error of the balance, giving statistical analysis of the yield of their reactions...it is better they use their time for learning and thinking.

The reaction conditions are usually optimized on ONLY ONE substrate anyway, and if 10 entries are reported in a table, you can imagine that 9 of them were obtained under NOT optimized conditions.

If there is not intentional cheating...to me 95% yield means it's a good reaction.

28. paul on November 12, 2010 4:23 PM writes...

I want to see % conversion, %yield and %purity with some indication of reproducibility. This would give a good indication of how good the transformation is.

29. provocateur on November 12, 2010 6:53 PM writes...

So, you just proved ppl exaggerate! I do not need a paper to say this... that's why some procedures get popular and some don't... let the market decide, because we will never get the truth!
