"You like those scatterplots, don't you?" someone said to me the other day. And I can't deny it. On most projects that my lab has been assigned to, at some point I end up messing around with all the project data, plotting one thing against another and looking for correlations.
Often what I find is negative. Plotting liver microsome stability (a measure, in theory, of one of the major pathways for drug metabolism) against compound blood levels in animal dosing has rarely, in my perhaps unrepresentative experience, shown much of a correlation. In vivo blood levels are just too complicated, and influenced by too many other things. But I'm often surprised by how many people assume that there's a correlation - because, to a first approximation, it sort of makes sense that there might be - without actually having run the numbers.
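For anyone who wants to actually run the numbers, here is a minimal sketch of that kind of check in plain Python. Everything specific in it is an assumption for illustration: the variable names, the units, and above all the values, which are invented placeholders and not real project data. It computes a Pearson coefficient (linear relationship) and a Spearman coefficient (rank-based, so less thrown off by the odd outlier) for two columns of measurements.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sxx = sum(d * d for d in dx)
    syy = sum(d * d for d in dy)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    sxy = sum(a * b for a, b in zip(dx, dy))
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman rank correlation: Pearson applied to the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# Invented placeholder values, NOT real data: hypothetical microsome
# half-lives (minutes) and hypothetical in vivo exposures (arbitrary units).
microsome_half_life = [12.0, 45.0, 8.0, 60.0, 30.0, 22.0, 51.0, 15.0]
blood_exposure = [310.0, 150.0, 420.0, 290.0, 95.0, 510.0, 180.0, 260.0]

print("Pearson r:   ", round(pearson_r(microsome_half_life, blood_exposure), 3))
print("Spearman rho:", round(spearman_rho(microsome_half_life, blood_exposure), 3))
```

A coefficient near zero is the numerical version of the dropped-paintcan plot. It's still worth looking at the scatterplot itself, since a single number can hide clusters, outliers, or a relationship that isn't monotonic.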
That's a theme that keeps recurring: a fair amount of what people think they know about their project isn't true. I think it's because we keep reaching for simple explanations and rules of thumb, in hopes that we can get some sort of grip on the data. We give these too much weight, though, especially if we don't examine them every so often to see if they're still holding up (or if they ever did in the first place).
Another factor is good ol' fear. It's unnerving to face up to the fact that you don't know why your compounds are behaving the way that they are, and that you don't know what to do about it. It's no fun to plot your primary assay data against your secondary data and see a dropped-paintcan scatter instead of a correlation, because that kind of thing can set your whole project back months (or kill it altogether). One of the biggest problems in an information-driven field is that not everyone wants to know.
One time when I was giving the numbers a complete run-through, I noticed that one of the plots actually seemed to have a fairly good shape to it. The Y-axis was potency (plotted as -log), and there it was, actually increasing - broadly, messily, but undeniably - with the X-axis, which was... corporate compound number, the one assigned to each new compound as it was sent in for the assay. Oh, well. It showed that we were making progress, anyway. And at least nobody suggested that we attempt to give the compounds numbers from years in the future, in order to make them instant surefire winners. I've heard sillier suggestions.