This time last year I mentioned a particularly disturbing-looking compound, sold commercially as a so-called "selective inhibitor" of two deubiquitinase enzymes. Now, I have a fairly open mind about chemical structures, but that thing is horrible, and if it's really selective for just those two proteins, then I'm off to truck-driving school just like Mom always wanted.
Here's an enlightening look through the literature at this whole class of compound, which has appeared again and again. The trail seems to go back to this 2001 paper in Biochemistry. By 2003, you see similar motifs showing up as putative anticancer agents in cell assays, and in 2006 the scaffold above makes its appearance in all its terrible glory.
The problem, as Jonathan Baell points out in that HTSpains.com post, is that apparently no one has ever taken a proper look at this series' SAR or at its selectivity. It wanders through a series of publications full of on-again off-again cellular readouts, with a few tenuous conclusions drawn about its structure, and those are discarded or forgotten by the time the next paper comes around. As Baell puts it:
The dispiriting thing is that with or without critical analysis, this compound is almost certainly likely to end up with vendors as a “useful tool”, as they all do. Further, there will be dozens if not hundreds of papers out there where entirely analogous critical analyses of paper trails are possible.
The bottom line: people still don’t realize how easy it is to get a biological readout. The more subversive a compound, the more likely this is. True tools and most interesting compounds usually require a lot more medicinal chemistry and are often left behind or remain undiscovered.
Amen to that. There is way too much of this sort of thing in the med-chem literature already. I'm a big proponent of phenotypic screening, but setting up a good one is harder than setting up a good HTS, and working up the data from one is much harder than working up the data from an in vitro assay. The crazier or more reactive your "hit" seems to be, the more suspicious you should be.
The usual reply to that objection is "Tool compound!" But the standards for a tool compound, one used to investigate new biology and cellular pathways, are higher than usual. How are you going to unravel a biochemical puzzle if you're hitting nine different things, eight of which you're totally unaware of? Or skewing your assay readouts by some other effect entirely? This sort of thing happens all the time.
I can't help but think about such things when I read about a project like this one, where IBM's Watson software is going to be used to look at sequences from glioblastoma patients. That's going to be tough, but I think it's worth a look, and the Watson program seems to be just the correlation-searcher for the job. But the first thing they did was feed in piles of biochemical pathway data from the literature, and the problem is, a not insignificant proportion of that data is wrong. Statements like these are worrisome:
Over time, Watson will develop its own sense of what sources it looks at are consistently reliable. . .if the team decides to, it can start adding the full text of articles and branch out to other information sources. Between the known pathways and the scientific literature, however, IBM seems to think that Watson has a good grip on what typically goes on inside cells.
Maybe Watson can tell the rest of us, then. Because I don't know of anyone actually doing cell biology who feels that way, not if they're being honest with themselves. I wish the New York Genome Center and IBM luck in this, and I still think it's a worthwhile thing to at least try. But my guess is that it's going to be a humbling experience. Even if all the literature were correct in every detail, I think it would be one. And the literature is not correct in every detail. It contains compounds like the one at the top of this entry, and people seem to think that they can draw conclusions from them.