Corante

About this Author

Derek Lowe, an Arkansan by birth, got his BA from Hendrix College and his PhD in organic chemistry from Duke before spending time in Germany on a Humboldt Fellowship for his post-doc. He's worked for several major pharmaceutical companies since 1989 on drug discovery projects against schizophrenia, Alzheimer's, diabetes, osteoporosis and other diseases. To contact Derek, email him directly: derekb.lowe@gmail.com. Twitter: Dereklowe


In the Pipeline


November 1, 2012

Hype, Malpractice, and Scientific Misconduct in Organic Synthesis


Posted by Derek

That's the word-for-word title of a provocative article by Rolf Carlson and Tomas Hudlicky in Helvetica Chimica Acta. That journal's usually not quite this exciting, but it is proud of its reputation for compound characterization and experimental accuracy. That probably helped this manuscript find a home there, where it's part of a Festschrift issue in honor of Dieter Seebach's 75th birthday.

The authors don't hold back much (and Hudlicky has not been shy about these issues, either, as some readers will know). So, for the three categories of malfeasance described in the title, the first (hype) includes the overblown titling of many papers:

As long as the foolish use of various metrics continues there is little hope of return to integrity. Young scientists entering academia and competing for resources and recognition are easily infected with the mantra of importance of publishing in 'high-impact journals' and, therefore, strive to make their work as noticeable as possible by employing excess hype.

It is the reader, not the author, of papers describing synthetic method who should evaluate its merits. Therefore, self-promoting words like 'novel', 'new', 'efficient', 'simple', 'high-yielding', 'versatile', 'optimum' should not be used in the title of the paper if such qualities are not covered by the actual content of the paper.

It also includes the inflation of reaction yields (see that link in the second paragraph above for more on that topic). This is another one that's going to be hard to fix:

Unfortunately, the community has chosen and continues to choose the yield values in submitted manuscripts as a measure of overall quality and/or utility of the report. This, of course, encourages the 'adjustment' in the values in order to avoid critique. An additional problem in the reported values is the fact that synthesis is performed on small scales, thanks to advances in NMR and other techniques available for structure determination. On milligram scales it is extremely difficult to accurately determine weight and content of a sample, given the equipment available in typical academic laboratory.

The second category, malpractice, is sloppy work, but not outright fraud:

Malpractice, as explained above, is usually not deliberate and derives primarily from ignorance or professional incompetence. The most frequent cases involve improper experimental protocols, improper methods used in characterization of compounds, and the lack of correct citations to previous work.

For example, the authors point out that very, very rarely are any new synthetic methods given a proper optimization. One-variable one-at-a-time changes are worthwhile, but they're not sufficient to explore a reaction manifold, not when these changes can interact with each other. As process chemists in industry know, the only way to explore such landscapes is with techniques such as Design of Experiments (DoE), which try to find out what factors in a multivariate system produce the greatest change in results. Here's an example; the process chemistry literature furnishes many more.
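To make the interaction point concrete, here is a minimal sketch comparing one-variable-at-a-time changes with a two-level full factorial design, the simplest DoE layout. Everything in it is an assumption for illustration: the `simulated_yield` response surface, its coefficients, and the two factors (temperature and catalyst loading, in coded -1/+1 levels) are made up and do not come from the paper.

```python
from itertools import product

def simulated_yield(temp, cat):
    """Hypothetical yield (%) at coded factor levels -1/+1.

    The temp * cat cross-term is the interaction between the two factors.
    """
    return 60 + 5 * temp + 3 * cat + 8 * temp * cat

def full_factorial_effects(response):
    """Estimate main and interaction effects from all four corner runs."""
    runs = [(t, c, response(t, c)) for t, c in product((-1, 1), repeat=2)]
    half = len(runs) / 2
    return {
        "temp": sum(t * y for t, c, y in runs) / half,
        "cat": sum(c * y for t, c, y in runs) / half,
        "temp*cat": sum(t * c * y for t, c, y in runs) / half,
    }

def one_at_a_time_effects(response):
    """Change one factor at a time, starting from the low/low corner."""
    base = response(-1, -1)
    return {
        "temp": response(1, -1) - base,
        "cat": response(-1, 1) - base,
    }

if __name__ == "__main__":
    print("factorial:", full_factorial_effects(simulated_yield))
    print("one at a time:", one_at_a_time_effects(simulated_yield))
```

On this invented surface, each single change from the low/low corner lowers the yield, so a one-at-a-time search would stop there, while the four-run factorial design picks up the large interaction term and the high/high corner that goes with it.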

And finally, you have outright scientific misconduct - fraud, poaching of ideas from grant applications and the like, plagiarism in publications, etc. It's hard to get a handle on these - they seem to be increasing, but the techniques to find and expose them are also getting better. Over time, though, these techniques might just have the effect of making fraud more sophisticated; that would be in line with human behavior as I understand it, and with selection pressure as well. The motives for such acts are with us still, and do not seem to be abating much, so I tend to think that determined miscreants will find ways to do what they want to do.

Thoughts? Some of this paper's points could be put in the "grumblings about the good old days" category, but I think that a lot of it is accurate. I'm not sure how good the old days were, myself, since they were also filled with human beings, but the pressures found today do seem to be bringing on a lot of behaviors we could do without.

Comments (71) + TrackBacks (0) | Category: Chemical News | The Dark Side | The Scientific Literature


COMMENTS

1. RB Woodweird on November 1, 2012 8:35 AM writes...

"A Novel and Efficient Synthesis" of just about anything is rarely novel and never efficient.

My own crusade used to involve taking a pen or marker or whatever defacing tool was to hand and crossing out the word "elegant" whenever I saw it in the chemical literature. The curve of Marilyn Monroe's butt was elegant. Chemical transformations may be interesting or unusual or surprising. They are not elegant.

So if you pick up a copy of some journal sometime and see a hand-redacted adjective, it might have been me.


2. Anonymous on November 1, 2012 8:41 AM writes...

This is a can of worms, but what are the votes for the worst in each category? My votes:

Hype: Baran
Misconduct: Corey
Fraud: Tie between JJ LaClair, Bengu Sezen and Dalibor Sames


3. grouchie on November 1, 2012 8:48 AM writes...

I'll go with the following:

hype: macmillan
misconduct: stork (never wrote a full paper)
fraud: agree with above


4. Mark on November 1, 2012 9:02 AM writes...

Hype: Baran/Macmillan tie
Misconduct: Nicolaou (his yields are always ridiculous)
Fraud: LaClair

Brutally Honest: Danishefsky and Trost (although I don't think Danishefskys work is the most efficient, I at least trust his work/yields which are sometimes good and sometimes crap.)


5. Curious Wavefunction on November 1, 2012 9:09 AM writes...

I think most good statisticians would be appalled if they were to encounter standard practices in synthetic organic chemistry.


6. luysii on November 1, 2012 10:00 AM writes...

The medical literature was full of hype in the 36+ years I had to read it professionally. Here's one example -- http://luysii.wordpress.com/2009/10/05/low-socioeconomic-status-in-the-first-5-years-of-life-doubles-your-chance-of-coronary-artery-disease-at-50-even-if-you-became-a-doc-or-why-i-hated-reading-the-medical-literature-when-i-had-to/

My nephew is just starting out in psychology as an undergraduate. He had the following to say after I sent him the above.

"As I've been learning about research methods and reading scientific articles on psychology, I have become more skeptical of "findings" in the field. In addition to the issues surrounding data interpretation, there are also problems with its validity and reliability.

These issues seem to be even more common in psychological research, which is akin to trying to measure the weight of smoke. Not only are there problems with the results, but the operationalization of the concepts being measured can be extremely tricky. Even if one does get significant results, it may not even mean anything for the conceptual variable of interest.

Pondering these issues has actually led me to become extremely interested in statistical analysis. The fact that mathematical concepts can help to address some of these questions is SO cool. I really enjoy using SPSS (a statistical analysis program) to sort through data and examine relationships. I had a meeting with a psychologist who just got her PhD, and it lasted over two hours as she showed me the ins and outs of SPSS. It was fascinating. I'm going to be working on a paper or two, and will hopefully have a published article in a journal by early next year."


7. anon the II on November 1, 2012 10:05 AM writes...

Much ado about nothing. The problem here is that a lot of people think that organic synthesis is some kind of scientific endeavor. It's really not. It's mostly a mixture of art and engineering. And selling it is how you get famous and make money.

You can use organic synthesis to do science, but that's not what our heroes mostly do. They make pretty statues to put in their personal galleries while training others to paint signs in the process.

And one day we'll learn to appreciate LaClair the same way I appreciate Jackson Pollock.


8. Aspirin on November 1, 2012 10:05 AM writes...

Hype: MacMillan, definitely. On his way to becoming the next KCN (the fact that he is a tyrant in lab helps). I would say that most of Baran's accolades are well-deserved, although he is no RBW.
Misconduct: Corey and H C Brown.
Fraud: LaClair, Sezen, Sames.

I fail to see how not writing a full paper constitutes "misconduct" (Stork). From all accounts that I have read Stork is a great chemist.


9. anonagain on November 1, 2012 10:27 AM writes...

it'd be ok to do all that stuff if organic chemists (that's unfair, i mean total synthesis people really) were actually doing something valuable.


10. Justice on November 1, 2012 10:32 AM writes...

^ Yeah, because nucleophilic addn to nitroalkanes and ring opening epoxides with fluoride are oh so valuable, lol. Most of us, across all sub fields, are doing research that matters only to a handful of people.


11. Richard Apodaca on November 1, 2012 10:54 AM writes...

Am I the only one who finds it ironic in the worst way that the authors of this paper chose to publish it in a journal I can't access?

Hype, malpractice, and misconduct thrive in an environment of limited access.

Contrary to popular belief, there are quite a few open venues in which this paper could have been published. Yet the authors chose not to - why not, I wonder? Could it be that the "foolish use of various metrics" (a journal's 'prestige') affects both practitioners and critics alike?

Shame on Carlson and Hudlicky - they should know better.


12. SDChemist on November 1, 2012 11:23 AM writes...

At the other end of the spectrum, I repeated several procedures reported by Seebach's group and my yields were always within 3% (plus or minus) of the report. Is this the experience of others?


13. Tomas Hudlicky on November 1, 2012 11:31 AM writes...

To: Richard Apodaca

From your comment I gather that you are confused...

1. This was an invited paper for a special issue honoring Dieter Seebach-hence Helvetica
2. The above named chemist has been one of the first to criticize the state of affairs in organic chemistry-hence the essay
3. I do not publish in "open" journals because no one reads them
4. If you want a PDF of the article let me know and I will send you one.
5. Elaborate as to what we should "know better". I am curious...

Hope my explanation helps
Cheers,
Tomas


14. Student on November 1, 2012 12:12 PM writes...

Dear Prof. Hudlicky,

You have to realize that many organic chemists agree with you, but are put off by your combative tone. Especially students - nobody likes acknowledging that their exciting results mean less, and we'll do the experiments necessary to prove that our methods are worthwhile.

Take your Synlett for example - it COULD have been an excellent tutorial to those of us who are unenlightened. You COULD have been the one championing a reform of contemporary organic synthesis. But by your tone, you'd rather accuse us of misconduct rather than guide us on the path to more reliable and applicable chemistry. I'm a religious guy so you can guilt me into all sorts of behavioral changes, but I don't think that works on most scientists. They need to be CONVINCED, not SHAMED.

This is what would convince me to follow your lead and start performing critical analyses on the efficiency of my methods: a clear, non-accusatory tutorial on critical analysis. You're right - I didn't get the kind of training in analytical chemistry that's required. I was a bio major before moving into synthesis. It would be nice if someone spelled it out for me, in a high profile journal. Collaborate with industry - I'm sure many process chemists feel strongly about it, and if they could be part of the change that makes academia useful to them, they'll be glad to help.

Maybe this is already published. Then publicize it! Make it visible, and make it attractive! Play the game a little. Get inside our heads and convince us. A lot of us WANT to be convinced. Show us examples where industry has adopted a method BECAUSE of the analysis you're supporting. I'm sure there are plenty. Give a summary of examples. Give quotes, like you normally do, but with a different tone - quotes along the lines of "this looked like a promising method, but fell short of the reported yield, and so we tried something older" that are specific.

I just hate the idea that you're getting a bad rep even though you're trying to make a positive change. I think you make excellent points, and I've met you and I can tell you're passionate about it, but I feel strongly that you're going about it the wrong way.


15. tomas Hudlicky on November 1, 2012 12:19 PM writes...

To: Student

Thanks for your advice. Sorry, I am not the touchy-feely type-that is too much like the 21st century.
The Synlett article spells out what is needed quite clearly.
Be well,
Tomas


16. David on November 1, 2012 12:44 PM writes...

Re # 10: At the time of discovery most reactions may not appear to be of value...but that does not necessarily mean they won't at some point. Look at several cross coupling methodologies for example - sure in the final scaled up process there may be a different way of doing things but for quickly synthesizing a range of compounds to generate SAR it represents an easy way. Maybe a bad example but my point is that often things may seem pointless at the time yet find innovative uses which allow Med Chem to synthesise their molecules which have a use. (Unless your comment was sarcastic?)
I also don't agree with the comment that organic synthesis isn't real science - without its developments there would be no Med Chem, ie a 'proper' use of chemistry.


17. Felix on November 1, 2012 1:48 PM writes...

The murky numbers synthetic organic chemists put out have been criticized since I was an undergrad. Many reactions are conducted on a small scale and in my experience, their isolated yields are very noisy. However, after reading Hudlicky's paper on the limits of calculating isolated yields and his finding that NMR tended to be fairly accurate, at least compared to other methods of quantitation, I have started to utilize more NMR to screen reactions using internal standards. In my experience, it tends to provide a lot of information in a very reproducible manner. I have tended to favor it over GC/MS now.

Student - Hudlicky also criticizes himself to some extent in the synlett paper: "An astute student of the organic literature may discover that this very author has been guilty of reporting yields in this range from time to time!"

I don't think it's that chemists are actively trying to commit fraud, it is that reactions are noisy and chemists do not want to undersell their research. I have often found that a crude NMR with internal standard often allows me to account for almost all the mass in a reaction, giving me a more solid point of comparison of two reactions. However, it really matters what type of standard is used! I've seen some use DMF, which is probably a poor idea in my experience.


18. bbooooooya on November 1, 2012 2:35 PM writes...

Organic synthesis as a field needs more Tomas Hudlicky's!


19. milkshake on November 1, 2012 2:37 PM writes...

Hudlicky has the guts to say publicly what many grumble about in private: the rigor of old solid German, US and UK synthetic chemistry got lost because of the grantsmanship and tenure pressure. The chemistry used to be much harder to do in terms of purification and characterization of compounds so people had to be more cautious; naturally there was less tolerance for hype and sloppiness


20. gippgig on November 1, 2012 2:43 PM writes...

A couple articles in Science News are worth reading:
Odds Are, It's Wrong Science fails to face the shortcomings of statistics; March 27, 2010 p. 26, bit.ly/aq1x28
Making Data Work Researchers pursue analogy between statistical evidence and thermodynamics, Sept. 8, 2012 p. 26
For an example of the practical application of organic synthesis techniques see Amphotericin primarily kills yeast by simply binding ergosterol, PNAS vol. 109 p. 2234, www.pnas.org/cgi/doi/10.1073/pnas.1117280109


21. Anonymous chemist on November 1, 2012 4:02 PM writes...

Hallelujah! Finally someone has the guts to call a cat a cat! As a postdoc I felt terribly traumatized when my boss forced me to doctor some spectra... Was physically sick afterwards but I was on a J1 visa and my career depended on his recommendation. I learned afterwards that my boss was a "mild" one. My colleague who lied the most on yields works now for a prominent law firm in Boston...


22. Eka-silicon on November 1, 2012 6:48 PM writes...

The experiment I would love to see? Send a range of academic labs a vial, and ask: please flash this, isolate the major product, and let us know the yield, with a clean spectrum. The dispersion would be staggering.

Keep on rocking T-Hud - the field desperately needs someone to call out the insanity of 90% yield on 1.2 mgs....


23. Mat Todd on November 1, 2012 7:15 PM writes...

Sounding like a broken record, but the open lab notebooks we use are intended to get around some of these issues by

a) placing things like yields for individual reactions in a context of repeats and/or failures - a 90% yield is perhaps more believable if you see that 5 previous attempts with slightly different conditions gave 50-70% yields, and
b) including raw spectroscopic data, so purity can always be checked - a 90% yield is not 20% ethyl acetate.

Incidentally it's often struck me as odd that results in tables in chemistry papers are usually results from single experiments. Biologists are usually more careful to include sample size and error bars. I think we as a community accept that a table of reaction results with small changes in S/M or catalyst structures (for example) count as a collection of experiments revealing a trend, but the individual points are usually single runs.


24. Martyn on November 1, 2012 7:45 PM writes...

Prof. Hudlicky,

Can you point to a (recent?) paper that, in your opinion, gets all this stuff right? I'd like to see an example of how it should be done.


25. Tomas Hudlicky on November 1, 2012 8:32 PM writes...

Well, I was not going to make any more responses but....I HAVE to do it. Here are a few...I am afraid this is a long one...::

To: Eka-silicon:

What a fantastic idea! I was going to say "novel" but caught myself just in time. Yes, indeed, you would get quite a spread. The highest yields would come from Harvard and MIT, of course, the rest of us do not have the star students who are free to violate the laws of Nature. I imagine some folks would actually isolate 1.4 mG out of 1.2 mG, not noticing the cat hairs in the sample...
But there already is a medium for this: Org. Syn. Have you ever seen 98% yields there? No.

To: Mat Todd:

Right. Results of single (i.e., the highest yielding) experiments. No one reports a RANGE anymore.
It's OK. Yet another addiction to quantification and large numbers (invented, no doubt, in the US). Europeans deal with low scores in soccer. Canadians (and Europeans) deal with low scores in hockey. Not Americans. We have to have high football scores and, yes, even higher basketball numbers! We can make six inches sound great if the inch is defined as very small! Funny world we live in....

To: Martyn:

I will have to dig. Send me an email and remind me to send you examples. There are still a few out there that have not been eaten up with self deception....Out of my head, I was impressed with the fact that Baran in his recent paper on germacrane terpenes (ACIE) reported 39%, 45%, etc., yields. There are still people who actually publish what they get-of course, most of the time they get crap from referees because the 70% yields are-oh, so low....Anyway, look up Clayton Heathcock's lycopodine synthesis (Full paper in JACS, 1980s-1982?) I know, that is not recent, but a good example of excellent chemistry and honest reporting. Who gives a F*&^ about yields anyway? If our stuff is at all useful, the process guys will make it >90% in no time. By correctly performed optimization. And from that source, I will believe it!
Optimization is not (and should not) be the province of academics. We should not even use that word because most of us do not know what it means anyway! We should spend our time teaching students how to crystallize solids to CONSTANT melting points!! (What's that?)
Anyway, I had to go on another rant here...

To:The student:
Have you recovered yet? Nothing good ever came from positive reinforcement except a sense of entitlement, which we all have to suffer now, from the young generation. My favorite saying in group meetings (which are still "the old days"): "Suck it up and get to work!"

Well, this will do for a while. Thanks for (most of) y'all's comments.
Tomas



26. Free Radical on November 1, 2012 8:47 PM writes...

One of the lesser forms of larceny is using separate experimental results for characterization and for yield reporting. We were taught that, if you're reporting a representative result, the % yield and the characterization should be taken from the same result. Or, if reporting a range of yields, that the yield for the material used to characterize should be in the representative range.


27. Nick K on November 1, 2012 10:28 PM writes...

I'm so glad this issue is out in the open. Frankly, why do people like KC even bother reporting yields? No one has ever believed them, apart from inexperienced grad students, of course...

I've repeated quite a lot of Seebach's work, and have always had excellent results, with yields (real, isolated, pure, solvent-free) within 10% of those he reported. Why can't ALL synthesis papers be like this?


28. American on November 2, 2012 12:14 AM writes...

"Not Americans. We have to have high football scores and, yes, even higher basketball numbers! We can make six inches sound great if the inch is defined as very small! Funny world we live in...."

Tom,

Considering the drop in Americans in graduate science programs over the time that yields have been increasing, I'd say this may not be an American problem...funny world we live in.

"Between 1980 and 2000, the percentage of Ph.D. scientists and engineers employed in the United States who were born abroad has increased from 24% to 37%"

And the numbers only get bigger in the foreign area.

At one point in my career, I had a professor tell me that he wished Americans were more like the foreigners, because they always gave him what he wanted...

Source: http://en.wikipedia.org/wiki/Foreign_born_scientists_and_engineers_in_the_United_States


29. Kumada on November 2, 2012 2:07 AM writes...

@12 and 27
one reason for the high reproducibility of these recipes is that Prof. Seebach accepted our new products only with a matching ELEMENTAL ANALYSIS, be it salts, resins, or oils. The correct HANDLING of chemicals and equipment, purification techniques like distillation and proper crystallization - the old-fashioned craft - was always part of the discussions.


30. Ricardo Ros on November 2, 2012 2:21 AM writes...

Thanks for the article, and nice debate.

As a former analytical chemist, who has done quite a lot of synthesis, I never understood why some of my peers were reporting false yields. There were very simple tricks to detect this, from the ones who always had x5's on their yields, to the ones who only had odd or even numbers ... there are more sophisticated analyses, employed by tax agencies, which could easily be used.

Someone has mentioned it, but one of the issues here is the total neglect of proper analytical chemistry being taught at most universities on the planet. I had to sit in front of experienced chemists in industry, listening, and then convincing them, that NMR is a quantitative technique (we would be talking here PhDs).

With the current hardware and software, it is possible that every single peak on every single NMR can be fully quantified, in an automatic manner. I am still confused as to why much of the population of chemists have decided to behave like cowboy scientists and relax the rigour with which they should be characterising their work.

The reference that this comes from the US is not a valid one. A scientist should be taught over their first few years in academia the tools to correctly conduct themselves in a scientific manner and as a chemist, and it is the neglect of many areas of chemistry which is the root of this problem.


31. Boekelheide on November 2, 2012 2:24 AM writes...

Whenever I give a talk in front of process chemists, I am citing the Synlett paper, and the audience is nodding - these points are not new to them, they reflect the indispensable and most fundamental requirements when you have to put multi kg's of a compound on a (potato) balance, and someone should repeat it later on. Thanks to your paper, Prof. Hudlicky, there is a higher chance to reach undergraduates with these arguments.


32. Anonymous on November 2, 2012 3:47 AM writes...

My faith in Orgsyn was depressingly reduced recently..
[Org. Synth. 2013, 90, 41-51: This material is purified by silica gel column chromatography (snip) to furnish 7.62 g (27.3 mmol, 92%) of (E)-N,N-diethyl-2-styrylbenzamide as a light yellow viscous oil (Note 17). The crude material was dissolved in toluene (10 mL) and then charged onto a column (diameter = 10 cm, height = 11 cm) of 425 g (1000 mL) of silica gel. The column was eluted with n-hexane/EtOAc = 8:1 (7.0 L) to 2:1 (3.6 L) and 100-mL fractions were collected. Fractions 73-108 were combined.]

Yep, nearly 11 litres of solvent to produce 8g material. 92% yield. Why column it then?


33. Eka-silicon on November 2, 2012 4:56 AM writes...

If anyone has excess undergrad labour available, I'd love love love to see Benford's law applied to a) certain labs and b) JACS or JOC by decade from say 1950 to now....


34. Anonymous on November 2, 2012 5:50 AM writes...

I don't think Benford's Law is considered applicable in small numbers of limited range with no powers associated with their generation.


35. Eka-silicon on November 2, 2012 7:02 AM writes...

Anon, see

Diekmann, A. "Not the First Digit! Using Benford's Law to Detect Fraudulent Scientific Data" Journal of Applied Statistics; 34(3), 2007; 321-329.

If it works for regression coefficients, it should work for the first, or better yet, *last* digit of yields. Easy enough to test!


36. Anonymous on November 2, 2012 7:23 AM writes...

If it applies, the last digit is definitely the one to go for. I would LOVE to see this done.
A while back I went through a complete custom synthesis lab book of mine.
Average yield = 62% IIRC. I guess I'm not KCN material.


37. Anonymous on November 2, 2012 7:37 AM writes...

When I read posts like this/comments, it makes me sad as all I can think of is Lance Armstrong (i.e. NOWHERE near as bad as that but still finding out that heroes are not all they seem)


38. Derek Lowe on November 2, 2012 7:58 AM writes...

As much as I'd like to see Benford's law used on reported yield data, I think it would be tricky. You really want underlying data that cover several orders of magnitude - the whole underlying mechanism is the mantissas of the logarithms of the raw numbers. Benford's is also inappropriate for normally distributed data, so a careless application would confuse real randomness with fraud.
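A rough sketch of the orders-of-magnitude point, using made-up data rather than anything from the literature: Benford's law predicts a leading digit d with probability log10(1 + 1/d), and that prediction only has a chance of holding when the values spread across many decades. The two data sets below are assumptions chosen purely for illustration: powers of two (which span hundreds of orders of magnitude) and a flat list of whole-number "yields" from 50 to 99.

```python
import math
from collections import Counter

def benford_expected(digit):
    """Benford's predicted frequency for a leading digit 1-9."""
    return math.log10(1 + 1 / digit)

def first_digit(n):
    """Leading decimal digit of a positive integer."""
    return int(str(n)[0])

def first_digit_freqs(values):
    """Observed frequency of each leading digit 1-9 in a list of positive integers."""
    counts = Counter(first_digit(v) for v in values)
    return {d: counts[d] / len(values) for d in range(1, 10)}

# Illustrative data only (hypothetical, not real measurements).
wide_range = [2 ** n for n in range(1000)]   # spans about 300 orders of magnitude
narrow_yields = list(range(50, 100))         # percent yields squeezed into one decade

if __name__ == "__main__":
    wide = first_digit_freqs(wide_range)
    narrow = first_digit_freqs(narrow_yields)
    for d in range(1, 10):
        print(d, round(benford_expected(d), 3), round(wide[d], 3), round(narrow[d], 3))
```

The narrow list never produces a leading 1 at all, so a naive first-digit test would flag perfectly honest yields as suspicious. That is the failure mode a careless application would run into.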


39. tt on November 2, 2012 9:08 AM writes...

While Prof. Hudlicky may be a rather brash and rude messenger (as most tenured professors tend to be), he raises some very valid points. As a rather experienced process chemist, I tend to view most academic reported yields as a suggestion (i.e.-we got some product, but we can't really say how much and what else is in there). I then assume that we can optimize it through DoE and reagent screening, unless there's some