Corante

About this Author

Derek Lowe, an Arkansan by birth, got his BA from Hendrix College and his PhD in organic chemistry from Duke before spending time in Germany on a Humboldt Fellowship on his post-doc. He's worked for several major pharmaceutical companies since 1989 on drug discovery projects against schizophrenia, Alzheimer's, diabetes, osteoporosis and other diseases. To contact Derek email him directly: derekb.lowe@gmail.com Twitter: Dereklowe

In the Pipeline

August 22, 2012

Watch that Little Letter "c"

Posted by Derek

Hang around a bunch of medicinal chemists (no, really, it's more fun than you'd think) and you're bound to hear discussion of cLogP. For the chemists in the crowd, I should warn you that I'm about to say nasty things about it.

For the nonchemists in the crowd, logP is a measure of how greasy (or how polar) a compound is. It's based on a partition experiment: shake up a measured amount of a compound with defined volumes of water and n-octanol, a rather greasy solvent which I've never seen referred to in any other experimental technique. Then measure how much of the compound ends up in each layer, and take the log of the octanol/water ratio. So if a thousand times as much compound goes into the octanol as goes into the water (which for drug substances is quite common, in fact, pretty good), then the logP is 3. The reason we care about this is that really greasy compounds (and one can go up to 4, 5, 6, and possibly beyond), have problems. They tend to dissolve poorly in the gut, have problems crossing membranes in living systems, get metabolized extensively in the liver, and stick to a lot of proteins that you'd rather they didn't stick to. Fewer high-logP compounds are capable of making it as drugs.
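For the programmers in the crowd, the arithmetic is about as simple as it gets. Here is a minimal sketch; the function name `log_p` and its arguments are made up for illustration, and it assumes both concentrations are reported in the same units.

```python
import math


def log_p(conc_octanol: float, conc_water: float) -> float:
    """Return log10 of the octanol/water concentration ratio."""
    if conc_octanol <= 0 or conc_water <= 0:
        raise ValueError("both concentrations must be positive")
    return math.log10(conc_octanol / conc_water)


# A thousandfold preference for the octanol layer, as in the example above.
print(f"logP = {log_p(1000.0, 1.0):.2f}")
```

A compound that splits evenly between the two layers comes out at a logP of 0, and anything that prefers the water comes out negative.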

So far, so good. But there are complications. For one thing, that description above ignores the pH of the water solution, and for charged compounds that's a big factor. logD is the term for the distribution of all species (ionized or not), and logD at pH 7.4 (physiological) is a valuable measurement if you've got the possibility of a charged species (and plenty of drug molecules do, thanks to basic amines, carboxylic acids, etc.). But there are bigger problems.
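To give a feel for how much pH can matter, here is a rough sketch using the usual textbook relation for a compound with a single ionizable group, on the assumption that only the neutral form goes into the octanol. For a base that works out to logD = logP - log10(1 + 10^(pKa - pH)), and for an acid the exponent flips sign. The function name and the example pKa are hypothetical, and real compounds (several ionizable groups, ion pairing) will not behave this cleanly.

```python
import math


def log_d(log_p: float, pka: float, ph: float = 7.4, kind: str = "base") -> float:
    """Estimate logD from logP for a monoprotic acid or base.

    Assumes the ionized species does not partition into octanol at all.
    """
    if kind == "acid":
        exponent = ph - pka   # acids ionize as pH rises above the pKa
    elif kind == "base":
        exponent = pka - ph   # bases ionize as pH falls below the pKa
    else:
        raise ValueError("kind must be 'acid' or 'base'")
    return log_p - math.log10(1.0 + 10.0 ** exponent)


# Hypothetical basic amine: logP of 3.0, pKa of 9.4, at physiological pH.
print(f"logD(7.4) = {log_d(3.0, 9.4, 7.4, 'base'):.2f}")
```

Under those assumptions, an amine sitting two pKa units above physiological pH loses about two log units, so a logP of 3 turns into a logD of roughly 1.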

You'll notice that the experiment outlined in the second paragraph could fairly be described as tedious. In fact, I have never seen it performed. Not once, and I'll bet that the majority of medicinal chemists never have, either. And it's not like it's just being done out of my sight; there's no roomful of automated octanol/water extraction machines clanking away in the basement. I should note that there are other higher-throughput experimental techniques (such as HPLC retention times) that also correlate with logP and have been used to generate real numbers, but even those don't account for the great majority of the numbers that we talk about all the time. So how do we manage to do that?

It has to do with a sleight of hand I've performed while writing the above sections, which some of you have probably already noticed. Most of the time, when we talk about logP values in early drug discovery, we're talking about cLogP. That "c" stands for calculated. There are several programs that estimate logP based on known values for different rings and functional groups, and with different algorithms for combining and interpolating them. In my experience, almost all logP numbers that get thrown around are from these tools; no octanol is involved.

And sometimes that worries me a bit. Not all of these programs will tell you how solid those estimates are. And even if they will, not all chemists will bother to check. If your structure is quite close to something that's been measured, then fine, the estimate is bound to be pretty good. But what if you feed in a heterocycle that's not in the lookup table? The program will spit out a number, that's what. But it may not be a very good number, even if it goes out to two decimal places. I can't even remember when I might have last seen a cLogP value with a range on it, or any other suggestion that it might be a bit fuzzy.
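Here is a toy illustration of that failure mode. It is a deliberately crude additive lookup, not the algorithm of any real cLogP program, and every fragment name and contribution in it is invented. The only point is that a number still comes out when a fragment is missing from the table, and that a tool could at least say so.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical fragment contributions, invented for illustration only.
FRAGMENT_LOGP = {
    "phenyl": 1.9,
    "methyl": 0.5,
    "hydroxyl": -1.0,
    "amide": -1.5,
}
# What a careless tool might silently assume for anything it has never seen.
DEFAULT_CONTRIBUTION = 0.0


@dataclass
class Estimate:
    value: float
    unknown_fragments: List[str]

    @property
    def trustworthy(self) -> bool:
        # Only trust the sum when every fragment was found in the table.
        return not self.unknown_fragments


def estimate_clogp(fragments: List[str]) -> Estimate:
    """Sum fragment contributions and record any that were not in the table."""
    total = 0.0
    unknown = []
    for frag in fragments:
        if frag in FRAGMENT_LOGP:
            total += FRAGMENT_LOGP[frag]
        else:
            total += DEFAULT_CONTRIBUTION
            unknown.append(frag)
    return Estimate(round(total, 2), unknown)


known = estimate_clogp(["phenyl", "methyl", "amide"])
novel = estimate_clogp(["phenyl", "odd_heterocycle", "hydroxyl"])
for name, est in (("known", known), ("novel", novel)):
    flag = "ok" if est.trustworthy else f"unverified: {est.unknown_fragments}"
    print(f"{name}: cLogP = {est.value:.2f} ({flag})")
```

Both structures come back with the same tidy two-decimal value here, and only the extra flag shows that one of them was partly guessed.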

There are more subtle problems, too - I've seen some oddities with substitutions on saturated heterocyclic rings (morpholine, etc.) that didn't quite seem to make sense. Many chemists get these numbers, look at them quizzically, and say "Hmm, I didn't know that those things sorted out like that. Live and learn!" In other words, they take the calculated values as reality. I've even had people defend these numbers by explaining to me patiently that these are, after all, calculated logP values, and the calculated log P values rank-order like so, and what exactly is my problem? And while it's hard to argue with that, we are not putting our compounds into the simulated stomachs of rationalized rodents. Real-world decisions can be made based on numbers that do not come from the real world.

Comments (39) + TrackBacks (0) | Category: Drug Assays | In Silico | Life in the Drug Labs


COMMENTS

1. petros on August 22, 2012 8:15 AM writes...

And then there is the debate as to whether it is better to use logP or logD values

2. mous on August 22, 2012 8:23 AM writes...

Not to mention that there are myriad different versions of that little "c", all of which give different values for the same compound....

I would argue that provided you are comparing apples with apples (i.e. within a chemical series) and using the same calculation software across the board, the rank order of cLogP is in fact relatively meaningful. After all, we are usually looking to increase or decrease logP relative to a series benchmark rather than target a defined value (if I can get the clogP up, I'll get the volume of distribution up...hey presto, once daily dosing!)

3. Curious Wavefunction on August 22, 2012 8:26 AM writes...

The other problem of course is that n-octanol is not always a good proxy for a membrane's greasy interior. It has dissolved water to the tune of 2 mM and presents hydrogen bonding groups.

4. Nick K on August 22, 2012 8:30 AM writes...

Back in the Dark Ages (1980s) we used to measure logPs by reverse phase TLC. I don't know how the values correlated with actual octanol/water partition coefficients, but the method was quick and easy and at least had some validity. Furthermore, a dozen compounds or more could be run in one go.

5. Watson on August 22, 2012 8:40 AM writes...

We had a promising compound in our lab with a cLogP a little lower than desired, around -0.5-1, not really that great for CNS. It had excellent selectivity, so we sent it forth to the magical pharmaceutics fairies for an actual experiment, where we learned that it was more around 5.

The source of this behavior was most likely due to an intramolecular hydrogen bond between what should have been a very basic amine and a carbonyl moiety. If there is flexibility and 5-6 heavy atom distance between the N and O, then hydrophobic and/or electrostatic collapse will muck up cLogP.

6. anon the II on August 22, 2012 8:41 AM writes...

I think there's some confusion here about when and where to use clogP's. Never use clogp's to guide your own research. Only use clogP's to criticize the work of others.

7. noko marie on August 22, 2012 8:50 AM writes...

Ah yes, those little "c"'s can be worrisome. My husband just recently got a Lipoprofile done and found out just how wrong the "LDLc" on a normal cholesterol report can be.

8. Anonymous on August 22, 2012 8:56 AM writes...

It's amazing what happens when there aren't any grad students around to run the tedious-yet-oh-so-important experiments!

9. RB Woodweird on August 22, 2012 9:00 AM writes...

There are analytical counter-current chromatography instruments available these days. Has anyone used these to generate data which would be analogous to logP?

10. Calvin on August 22, 2012 9:03 AM writes...

@6 anon the II LOL! Indeed

My own experience of cLogP is to use it sensibly. It's a useful ranking tool within a series (note within a series) to see how other trends might vary. Don't expect nice clean correlations. And don't expect there to be a real-world difference between a cLogP of 3.6 and 3.7 and in many cases between a cLogP of 3 and 4. But if you have a decent range of cLogPs you might actually see something useful. Or not. Always a good idea to plot cLogP against potency and hope you see a nice scatter plot with no correlation!

It's a great tool but I agree with @6 that its increasing use as a punitive tool to bash one compound versus another is unhelpful

11. PtX on August 22, 2012 9:11 AM writes...

So, why not come up with a new measure that would be easier to evaluate experimentally and more relevant for biological membranes?

12. Anonymous on August 22, 2012 9:17 AM writes...

Combination of RP and IAM chromatography should be used to get a good correlation for a particular class of compounds.

13. anchor on August 22, 2012 9:41 AM writes...


Chemdraw has, in its analytical program, a feature that calculates logP for a given structure. But in one of those programs we did both the Chemdraw version as well as a calculated logP (the compound had to cross the BBB). The difference was huge, and after that we junked the CD version.

14. newnickname on August 22, 2012 9:46 AM writes...

I measured real partition and distribution coefficients as an undergrad researcher (using radiolabeled compounds). I used Hansch's (RIP) EXTENSIVE tables to make comparisons. (There's a way to make a career!)

Years later, I had an interview at a major Big Pharma and a freshly minted PhD working there expressed disagreement with the use of logP for guiding drug disco - OK, it's debatable. But the kicker was when he said it was just an unreliable modern invention, only going back to the 1970s or 1980s. I (politely? pedagogically?) mentioned that Meyer and Overton had correlated anesthetic potency to logP back in the 1890s and it may be one of THE first mathematically useful drug correlations by which to make testable predictions. (Prior to M-O, people had made the general observation that greasier is better, without the math.)

It is definitely a useful CONCEPT of which to be aware, even if you don't want to buy into the numbers, in vitro or in silico.

15. CMCguy on August 22, 2012 9:48 AM writes...

I would venture to say the reason most medchemists have not seen actual logP measurement experiments is that they would probably break down and cry at the amount of material consumed after all the work put in to generate the product. Along with the surrogate HPLC and other systems to provide logP estimations, I think nowadays there are automated instruments that do require less compound, but I recall it took about 200-500 mg to run manual determinations. Typically these would be performed as part of the initial scale-up efforts towards more complex animal models and/or tox, so until that point guidance was by clogPs, which sometimes do not end up correlating well with the actual measured value.

16. John Wayne on August 22, 2012 10:05 AM writes...

There are some very decent LogD determination experiments that are done with a small scale octanol/water partition followed by HPLC estimates of concentration available at several CRO's. These experiments come with the usual caveats (not very good at extreme differences between water and organic, there is some DMSO in there, etc.), but the numbers have been pretty good for predicting ADME issues in my hands.

I worked for a fellow who loved cLogP's and kept asking for them in a series I worked on. That was a bad experiment because we were working on a macrocyclic lead with a basic amine, but he kept bugging me about it. I eventually showed him a graph of cLogP's vs. LogD values derived experimentally from the source mentioned above; it was a scatter plot. We tried several sources for the calculations looking for some basis set that could be predictive for our series; we had no luck, and Chemdraw generated the least usable data.

17. Anonymous on August 22, 2012 10:16 AM writes...

Such a machine does exist! It's called a Sirius T3 and measures logP/D in ~2 hours using only 0.5 -1 mg of compound. It does so by measuring the shift in pKa in the presence of varying amounts of octanol. It is medium throughput as the compound has to be accurately weighed out and the data analysed post-run but it is certainly not as tedious as the experiment described in the post!

18. One reason why... on August 22, 2012 10:52 AM writes...

One reason why this is so is because anybody in academia that bothers to do this tedious experiment won't get the next grant, fellowship, high-impact paper, etc. Doing solid science on basic biophysical properties is not going to get you ahead, you'll be beaten by idiots who get lucky results in high-impact systems. So all the careful biophysicists studying drug discovery never get the faculty position, tenure, etc.

Maybe pharma isn't too far off. What would your boss say if you purchased a few liters of octanol and started shaking things up?

19. lynn on August 22, 2012 11:46 AM writes...

I'm a biologist and I measured octanol/water partition coefficients early in my pharma days (in the 80s) since there was little data on PCs/LogPs of antibacterials (besides Hansch's anaesthetics and one or two other papers) and because they seemed to be important for thinking about gram negative permeability. I liked the idea of QSAR...but I couldn't get much buy-in from med chemists in those days.

20. industrial_medchemist on August 22, 2012 2:00 PM writes...

In my (Major Pharma) company we measure LogD routinely on most final compounds from projects. Calculated values are useful in deciding whether to make something - usually you can relate it to an analogue and correct accordingly - but for real SAR always use a measured value.

21. Anonymous on August 22, 2012 2:15 PM writes...

Of course, medicinal chemists go a step further with the "inaccurate" cLogP values - LipE. I never understood why so much emphasis was placed on a value derived from an assay result (pIC50), possibly out by ~0.5 Log units, then subtract another (cLogP) possibly out by who knows what?

22. Martin on August 22, 2012 5:02 PM writes...

There was the Hajduk JMC paper in 2010 about Rumsfeld's "known unknowns" in med chem. The estimate of mean error for cLogP was one log unit.

24. In vivo veritas on August 22, 2012 9:12 PM writes...

@17 -- Yes, this newer potentiometric method is both accurate, fast, and uses 1 mg compound. As you mentioned the bonus is getting both measured logP and logD7.4 (or other pH) at the same time. Ca. 25 of these determinations will allow you to see which (if any) "c"alculated method correlates best within your scaffold.

25. Kerry on August 22, 2012 9:17 PM writes...

I have always considered clogP to refer specifically to the Hansch/Leo method distributed by Biobyte to distinguish it from mlogP, xlogP etc.

26. researchfella on August 22, 2012 10:44 PM writes...

What's the big deal? cLogP is just a parameter (calculated), and experimental LogP (octanol/water partition coefficient) is just another parameter. Who's to say which one of these parameters is more relevant to the absorption of a compound in the gut of a rat or human, or to the plasma protein binding, or whatever? They are just parameters to give some kind of value for the lipophilicity of a compound. If useful trends and guidance can be established based on cLogP (e.g., Lipinski Rule of Five), then why is it not a valid parameter? How many trends/guidelines have been established using experimental LogP values? Not very many, I guess... so there's presumably less evidence to support the value of experimental LogP values than there is to support the value of cLogP values. Get over it.

How do you feel about tPSA values? Always calculated, highly dependent on the calculation method and/or the conformations that are sampled, no industry-standard methods that I'm aware of... but they can provide useful guidance so we use them.

27. TX Raven on August 23, 2012 1:12 AM writes...

You have got to be kidding me.... is this what drug hunting has become?

Last time I checked, the job of the medicinal chemist was to understand our compounds.

cLogP helps you understand your compounds like playing FIFA 2012 in the playstation makes you a better soccer player...

28. Lippy on August 23, 2012 2:25 AM writes...

@21, spot on. There's no dumber number in medchem than LLE (or any of the variants that involve combining lipophilicity and potency). As if you couldn't remember two numbers, or couldn't do the subtraction in your head if you really wanted to.

29. Pete on August 23, 2012 5:12 AM writes...

Wavefunction is bang on target with his comment. I will add that we also use (both calculated and measured octanol/water) logP values to quantify the energetics of moving molecules from water to hydrophobic binding pockets.

At the risk of being accused of being overly anal, we should probably be writing ClogP rather than cLogP (do we raise case for log, sin, etc. when writing equations?). Also there is ambiguity in the definition of LLE (check original article to see what I mean) and this is not the only efficiency metric to suffer from this problem.

There are discussions on logP versus logD and the relationship between lipophilicity and promiscuity in the LinkedIn FBDD group.

30. Kissthechemist on August 23, 2012 7:17 AM writes...

For logP prediction, we have used Chemdraw and Pipeline Pilot for comparison with the actual physical experiment carried out by the folks at Sirius on their T3 instrument (5mg of sample).

Chemdraw was pretty useless in this case and Pipeline Pilot was practically spot on. But then that's to be expected, given the price of P-P !!

31. Anonymous on August 23, 2012 8:06 AM writes...

Science used to be about doing experiments and getting data. Unfortunately, that is messy and takes time/people. So in looking for a short cut to 'speed things up' we use models that give us nice clean data really quickly.

Unfortunately, as soon as we venture away from the things that were used to build the models, the prediction accuracy of the models decreases rapidly.

If you think things are bad in chemistry, you should look at the Biology models. 70-80% of the parameters are estimates (for a good system), there is no understanding of the unknown unknowns and of the factors deemed to be data, most of these have been poorly measured in in vitro systems, in vivo models, yeast2 hybrids etc - not a piece of data from a patient. And yet the whole project may depend on this, not just the fate of a compound.

Now I wonder why we aren't finding many drugs........

32. Egon Willighagen on August 23, 2012 10:23 AM writes...

What I don't get about this discussion is that no one talks about the lack of data to train cLogP models... That's the only reason why expensive tools may give better results: they spend a lot of money on good data.

Had all that experimental logP/logD data only been available for people to reuse (Open Data), then we would have known in enough detail what drives the model, where the error comes from.

Not only would we then be able to give an error-of-prediction for the model, we could even do that for individual compounds: for compound x the cLogP is y +/- error. Really, this is trivial for multivariate statistics.

However, if you do not have access to proper training data, which accurately describes the used method (operating procedure, etc), and which quantifies experimental errors (error on concentrations, volumes, etc), then making a proper statistical model is made needlessly hard.

But I agree that currently the lack of such data has resulted in a lack of proper models. So, please complain about the lack of Open Data in chemistry first, before you start complaining about the model performance.

33. john delaney on August 25, 2012 7:46 AM writes...

...to paraphrase a colleague of mine - exactly how much octan-1-ol is there in the human body?

34. Objective_practitioner on August 26, 2012 6:18 AM writes...

I completely agree with Egon Willighagen. Putting some effort into getting good quality and adequately representative experimental data at the outset can save a lot of work in the future.

I don't know why chemists are so hung up on cLog P. We assessed the performance of several log P prediction s/w twice in 5 years, and each time we found that the freeware, Episuite/Kowwin log P gave best fit to experimental data and was more robust on an average. cLog P was not in the top 3 list the first time and came in 2nd the 2nd time. This was true even with in-house data, which were not part of the training set of these programs.

The way the reliability and scope of application for these s/w is improved over time is pretty messy - no statistician can love the models!

Our experimentalists doing log P determinations use the s/w prediction as a guideline for setting experimental protocol. So it has some use!

researchfella has a good point too...

35. Anonymous on August 26, 2012 9:08 PM writes...

#3 & #29: No, membrane biophysics is converging on n-octanol as actually being a nice proxy for membrane permeability. Biological membranes are fluid and are not slabs of uniform grease.

36. Anonymous on August 27, 2012 6:06 AM writes...

There are predictions for everything (tox, hERG, renal clearance, ...). But of course nobody is relying (only) on those predictions. So also in this case: just measure your logD, see post @17 for the machine.

37. clinicaltrialist on August 28, 2012 1:57 PM writes...

At the risk of sounding naive, let me ask something as a non-medicinal chemist. I understand how a molecule with high cLogP may be troublesome in terms of getting absorbed, crossing membranes, getting metabolized, etc.

Is there any data that shows that molecules with low cLogP are more likely to have biological effect and ultimately make it across the finish line as an approved drug?

In other words, I understand that high cLogP is a liability for a medicinal chemist, but will screening out molecules with high cLogP help find useful drugs or hurt the endeavor? I am asking about unintended consequences here, because as a biologist and a clinical trialist, what I would rather have are highly active molecules that are greasy rather than well absorbed molecules with no efficacy. PK and metabolic issues, I can design my clinical programs around. Lack of efficacy I cannot.

38. Dr. J on September 9, 2012 7:27 PM writes...

Sometime, for your own entertainment, draw a secondary amine-containing structure in ChemDraw and generate the cLogP value. Make sure the amine is drawn "NH". Then repeat the exercise drawing out the N-H bond. Different numbers, huh?

Evidently you are supposed to draw out the bond for a (closer to) real number, but I was in the Med Chem biz for years before someone told me this. Just a helpful hint for my colleagues out there! Here's another one: spend the money to outsource the LogD (7.4) assay.

39. Richard Prankerd on April 29, 2014 10:02 PM writes...

I have measured logP in the lab many times, along with other physico-chemical properties especially pKa and solubility. I distrust all computed values, and deprecate the research environment that demands quantity over quality. Thanks for highlighting the problem. Unfortunately, the debate is at least 20 years too late. The real physical organic chemists are a rapidly vanishing breed.
