About this Author
Derek Lowe, an Arkansan by birth, got his BA from Hendrix College and his PhD in organic chemistry from Duke before spending time in Germany on a Humboldt Fellowship for his post-doc. He's worked for several major pharmaceutical companies since 1989 on drug discovery projects against schizophrenia, Alzheimer's, diabetes, osteoporosis and other diseases. Twitter: Dereklowe


In the Pipeline


July 17, 2013

MedChemica: When One Compound Collection Isn't Enough


Posted by Derek

According to SciBX, here's another crack at computational solutions for drug discovery: MedChemica, a venture started by several ex-AstraZeneca scientists. They're going to be working with data from both AZ and Roche, using what sounds like a "matched molecular pair" approach:

Although other algorithms try to relate structure to biological function, most of the analyses look at modifications across a wide array of diverse structures. MedChemica's approach is to look at modifications in a set of similar structures and see how minor differences affect the compounds' biological activity.
Al Dossetter, managing director of MedChemica, said the advantage of the company's platform is the WizePairZ algorithm that looks at pairs of fragments that are similar in structure but differ by a chemical group, such as a change from chlorine to fluorine or the addition of a methyl group.
This platform, he told SciBX, captures the chemical environment of the fragment change. For example, it incorporates the fact that the effect of changing chlorine to fluorine on a molecule will depend on the surrounding structure. The result is a rule that is context dependent.
The MedChemica approach applies to small molecules and uses only partial chemical structures, thus keeping compound identities out of the picture.
Because the platform does not reveal compound identities, AstraZeneca and Roche can share knowledge without disclosing proprietary information.
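
To make the "context dependent" idea concrete, here is a minimal Python sketch of how a matched-pair rule might be keyed on the local environment as well as on the swap itself. This is not the WizePairZ algorithm, whose details the article doesn't give; the record layout, the `context_rules` function, the context labels, and every number below are invented for illustration.

```python
from collections import defaultdict
from itertools import combinations
from statistics import mean

# Hypothetical records: (scaffold, local_context, substituent, pIC50).
# All names and values are made up for illustration only.
records = [
    ("scaffold_A", "aryl", "Cl", 6.0),
    ("scaffold_A", "aryl", "F", 6.5),
    ("scaffold_B", "aryl", "Cl", 7.0),
    ("scaffold_B", "aryl", "F", 7.4),
    ("scaffold_C", "alkyl", "Cl", 5.0),
    ("scaffold_C", "alkyl", "F", 4.6),
]

def context_rules(records):
    """Return {(from, to, context): (mean activity change, pair count)}."""
    # Group compounds that share the same scaffold and attachment context,
    # so that members of a group differ only in the substituent.
    by_site = defaultdict(list)
    for scaffold, context, substituent, activity in records:
        by_site[(scaffold, context)].append((substituent, activity))

    # Every two members of a group form one matched pair; record the
    # activity difference under the transformation plus its context.
    deltas = defaultdict(list)
    for (_, context), members in by_site.items():
        for (sub_a, act_a), (sub_b, act_b) in combinations(sorted(members), 2):
            if sub_a != sub_b:
                deltas[(sub_a, sub_b, context)].append(act_b - act_a)

    return {key: (mean(vals), len(vals)) for key, vals in deltas.items()}

rules = context_rules(records)
for key, (avg_change, n_pairs) in sorted(rules.items()):
    print(key, round(avg_change, 2), n_pairs)
```

In this toy data the same chlorine-to-fluorine swap raises activity in one context and lowers it in the other, which is the kind of distinction a context-free rule would average away. Note also that only the substituent and a context label are needed to state the rule, not the full structure.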

The belief is that neither company's database on its own gives quite enough statistical power for this approach to work, so they're trying it on the pooled data:

smaller databases only allow researchers to extract one to five matched pairs, which have a low fidelity of prediction. Ten matched pairs are sufficient to draw a prediction, but reliability increases significantly with 20 matched pairs.
The MedChemica database contains 1.2 million datapoints, each of which represents a single molecule fragment in a single assay. It includes 31 different assays, although more are likely to be added in the future, and not all molecules have been tested in all assays.
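
The pooling argument can be sketched the same way. The snippet below is a rough Python illustration, not anything from MedChemica: the per-company counts are hypothetical, and the `reliability` tiers simply encode the thresholds quoted above, with the assumption (mine, since the article doesn't say) that anything under ten pairs counts as low fidelity.

```python
from collections import Counter

def reliability(n_pairs):
    """Map a matched-pair count to the tiers quoted in the article.

    Assumption: counts of 6 to 9 are not discussed in the quote, so
    everything below 10 is treated here as low fidelity.
    """
    if n_pairs >= 20:
        return "more reliable"
    if n_pairs >= 10:
        return "sufficient"
    return "low fidelity"

# Hypothetical matched-pair counts per transformation at each company.
company_a = Counter({("Cl", "F"): 4, ("H", "CH3"): 12})
company_b = Counter({("Cl", "F"): 7, ("H", "CH3"): 9})

# Pooling adds the counts for each shared transformation.
pooled = company_a + company_b

for transformation, n_pairs in sorted(pooled.items()):
    print(transformation, n_pairs, reliability(n_pairs))
```

With these invented numbers, neither company alone reaches ten chlorine-to-fluorine pairs, but the pooled set does. Whether pairs drawn from two companies' assays are really comparable is a separate question that a simple count does nothing to settle.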

The article says that AZ and Roche are in discussions with other companies about joining the collaboration. Everyone who joins will get a copy of the pooled database, in addition to being able to share in whatever insights MedChemica comes up with. A limitation is mentioned as well: this is all in vitro data, and its translation to animals or to the clinic provides room to argue.

That's a real concern, I'd say, although I can certainly see why they're doing things the way that they are. It's probably hard enough coming up with in vitro assays across the two companies that are run under similar enough conditions to be usefully paired. In vivo protocols are more varied still, and are notoriously tricky to compare across projects even inside the same company. Just off the top of my head, you have the dosing method (i.v., p.o., etc.), the level of compound given, the vehicle and formulation (a vast source of variability all in itself), the species and strain of animal, the presence of any underlying disease model (versus control animals), what time of day they were dosed and whether they were fed or fasted, whether they were male or female, how old the animals were, and so on and so on. And these factors would be needed just to compare things like PK data, blood levels and so on. If you're talking about toxicology or other effects, there's yet another list of stuff to consider. So yes, the earlier assays will be enough to handle for now.

But will they be enough to provide useful information? Here's where the arguing starts. Limitations of working with only in vitro data aside, you could also say that any trends that are subtle enough to need multi-company-sized pools of data might be too subtle to affect drug discovery very much. The counterargument to that is that some of these rules might still be quite real, but lost in the wilds of chemical diversity space due to lack of effective comparisons. (And the counterargument to that is that if you don't have very many examples, how are you so sure that it's a rule?) I'm not sure which side of that one I come down on - "skeptical but willing to listen to data" probably describes me here - but this is the key question that MedChemica will presumably answer, one way or another.

Even so, that in vitro focus is going to be a long-term concern. One of the founders is quoted in the article as saying that the goal is to learn how to predict which compounds shouldn't be made. Fine, but "shouldn't have been made" is a characteristic that's often assigned only after a compound has been dosed in vivo. In the nastier cases, the ones you want to avoid the most, it's only realized after a compound has been in hundreds or thousands of humans in the clinic. The kinds of rules that MedChemica will come up with won't have any bearing on efficacy failures (nor are they meant to), but efficacy failures - failures of biological understanding - are depressingly common. Perhaps they've got a better chance at cutting down the number of "unexplained tox" failures, but that's still a very tall order as well as a very worthy goal.

Falling short of that, I worry, will mean that the MedChemica approach might end up - even if it works - only slightly optimizing the shortest and cheapest part of the whole drug discovery process, preclinical med-chem. I sympathize - most of my own big ideas, when I get them, bear only on that part of the business, too. But is it the part that needs to be fixed the most? The hope is that there's a connection, but it takes quite a while to prove whether one exists.

Comments (8) + TrackBacks (0) | Category: In Silico


1. Pete on July 17, 2013 8:42 AM writes...

Generally, I would ensure that each matched molecular pair came from a single data set when doing this sort of analysis. Matched molecular pair analysis (MMPA) can be seen as a type of local QSAR modeling and one assumes (correctly or otherwise) that differences in values of properties are less sensitive to assay variation than the values of the properties themselves.


2. SAR screener on July 17, 2013 10:15 AM writes...

'It's probably hard enough coming up with in vitro assays across the two companies that are run under similar enough conditions to be usefully paired.'

This is the part I worry about. Screening technology, protein construct, buffer, substrate concentrations, pre-incubation times, etc. can have a huge impact on the measured IC50.


3. Sweden Calling on July 17, 2013 10:40 AM writes...

Speaking from within the walls...I am amazed. A tool was developed, which never really was used. A few good people were laid off, and they could take the (clunky) tool for free. The tool is then bought back from the laid-off people (in the context of getting noisy data in return). The noisy data will never really be used (and God forbid evaluated). Money well spent?! Just waiting for the same to happen with another of our (external) successes, autoQSAR...which never really is used. Wonder why?


4. Chrispy on July 17, 2013 10:43 AM writes...

Well, this seems in keeping with the "post chemistry" era of drug discovery we seem to have entered. It used to be that this was exactly the kind of work done by real chemists (who, incidentally, could remake compounds or design new compounds to test SAR theories). Perhaps, too, this signals the beginning of a "post target" era since the data will only be useful for those targets already carpet bombed by not just one but two drug companies. Have we given up on finding anything novel?


5. Anonymous on July 17, 2013 10:49 AM writes...

Have we given up on finding anything novel?

Yes, but only because we've given up on doing anything novel - everyone's business model seems to be "Get cheap people to come up with an idea and implement it, buy idea, fire all the people, profit."


6. JC on July 17, 2013 11:02 AM writes...

A giant load of tripe.


7. TX raven on July 17, 2013 11:16 AM writes...

If a given SAR trend works for your chemical series, does it matter whether the trend is a "universal rule" or not?


8. leeh on July 18, 2013 4:08 PM writes...

This approach is puzzling. The whole point of MMP analysis is to discover activity cliffs. These cliffs are very local, and in the case of a given chemical series in a particular assay very specific to a particular site on the molecule. These rules are not transferable across chemical series (unless you know how the scaffolds align in the binding site) or across assays (unless you have prior knowledge that a scaffold binds to two binding sites that are essentially identical in the area of that particular part of the molecule). It is possible that particular chemical changes result in a higher than average probability of increasing binding affinity (such as substituting chlorine for hydrogen on an aromatic), but these kinds of rules are rather trivial and tend to be obvious by inspection (especially for a particular assay). If you isolate the change (by obscuring much of the structure) the value is lost.

This approach is more general for some properties, such as physicochemical properties, but I'm guessing that's not what these guys are trying to do.


