There's an article out from a group in Australia on the long-standing problem of "frequent hitter" compounds. Everyone who's had to work with high-throughput screening data has had to think about this issue, because it's clear that some compounds are nothing but trouble. They show up again and again as hits in all sorts of assays, and eventually someone gets frustrated enough to flag them or physically remove them from the screening deck (although that last option is often a lot harder than you'd think, and compound flags can proliferate to the point that they get ignored).
The larger problem is whether there are whole classes of compounds that should be avoided. It's not an easy one to deal with, because the question turns on how you're running your assays. Some things are going to interfere with fluorescent readouts, by absorbing or emitting light of their own, but that can depend on the wavelengths you're using. Others will mung up a particular coupled assay readout, but leave a different technology untouched.
And then there's the aggregation problem, which we've only really become aware of in the past few years. Some compounds just like to stick together into huge clumps, often taking the assay's protein target (or some other key component) with them. At first, everyone thought "Ah-hah! Now we can really scrub the screening plates of all the nasties!", but it turns out that aggregation itself is an assay-dependent phenomenon. Change the concentrations or added proteins, and whoomph: compounds that were horrible before suddenly behave reasonably, while a new set of well-behaved structures has suddenly gone over to the dark side.
This new paper is another attempt to find "Pan-Assay Interference" compounds, or PAINS, as they name them. (This follows a weird-acronym tradition in screening that goes back at least to Vertex's program to get undesirable structures out of screening collections, REOS, for "Rapid Elimination of, uh, Swill".) It will definitely be of interest to people using the AlphaScreen technology, since it's the result of some 40 HTS campaigns using it, but the lessons are worth reading about in general.
What they found was (as you'd figure) that while it's really hard to blackball compounds permanently with any degree of confidence, the effort needs to be made. Still, even using their best set of filters, 5% of marketed drugs get flagged as problematic screening hits - in fact, hardly any database gives you a warning rate below that, with the exception of a collection of CNS drugs, whose properties are naturally a bit more constrained. Interestingly, they also report the problematic-structure rate for the collections of nine commercial compound vendors, although (frustratingly) without giving their names. Several of them sit around that 5% figure, but a couple of them stand out with 11 or 12% of their compounds setting off alarms. This, the authors surmise, is linked to some of the facile combinatorial-type reactions used to prepare them, particularly ones that leave enones or exo-alkenes in the final structures.
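If you want to run the same sort of comparison on your own collections, the arithmetic is simple enough. Here's a minimal Python sketch, assuming you already have a yes/no flag for each compound from whatever filter set you trust. The vendor names and counts are made up for illustration, and the "twice the baseline" cutoff is my own arbitrary choice, not something from the paper.

```python
def flag_rate(flags):
    """Fraction of compounds flagged; `flags` is an iterable of booleans."""
    flags = list(flags)
    if not flags:
        raise ValueError("empty collection")
    return sum(flags) / len(flags)


def stand_outs(collections, baseline=0.05, factor=2.0):
    """Names of collections whose flag rate is at least factor * baseline.

    The 5% baseline echoes the marketed-drug figure quoted above; the
    factor of 2 is an assumption made only for this illustration.
    """
    return sorted(
        name
        for name, flags in collections.items()
        if flag_rate(flags) >= factor * baseline
    )


# Hypothetical collections: True means the compound tripped a filter.
collections = {
    "vendor_A": [True] * 5 + [False] * 95,
    "vendor_B": [True] * 12 + [False] * 88,
    "vendor_C": [True] * 11 + [False] * 89,
}

for name in sorted(collections):
    print(f"{name}: {flag_rate(collections[name]):.0%} flagged")
print("stand-outs:", stand_outs(collections))
```

The hard part, of course, is deciding what deserves a flag in the first place, and no amount of division will do that for you.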
So what kinds of compounds are the most worrisome? If you're going to winnow out anything, you should probably start with these: Rhodanines are bad, which doesn't surprise me. (Abbott and Bristol-Myers Squibb have also reported them as troublesome). Phenol Mannich compounds and phenolic hydrazones are poor bets. And all sorts of keto-heterocycles with conjugated exo-alkenes make the list. There are several other classes, but those are the worst of the bunch, and I have to say, I'd gladly cross any of them off a list of screening hits.
But not everyone does. As the authors show, there are nearly 800 literature references to rhodanine compounds showing biological effects. A conspicuous example is here, from the good folks at Harvard, which was shown to be rather nonspecifically ugly here. What does all this do for you? Not much:
"Rather than being privileged structures, we suggest that rhodanines are polluting the scientific literature. . .these results reflect the extent of wasted resources that these nuisance compounds are generally causing. We suggest that a significant proportion of screening-based publications and patents may contain assay interference hits and that extensive docking computations and graphics that are frequently produced may often be meaningless. In the case of rhodanines, the answer set represents some 60 patents and we have found patents to be conspicuously prevalent for other classes of PAINS. This collectively represents an enormous cost in protecting intellectual property, much of which may be of little value. . ."