The latest round in the fit-to-never-end saga of the Vioxx APPROVe trial and the New England Journal of Medicine is here. The journal today released a correction of the original paper, a perspective article on the statistics of the original study, and some inconclusive correspondence about the (recalculated) risks.
The correction is notable for removing the earlier statements that it appears to take 18 months for risk to develop in the study's Vioxx patient group. And since Merck's made a big deal out of that timing, this has already become the headline story. (I can recommend this overview by Matthew Herper at Forbes.)
The perspective article, by Stephen Lagakos of Harvard, may be fairly heavy going for someone who isn't statistically inclined. I include in that group - please correct me if I'm wrong here - the great majority of newspaper reporters who might be covering the issue (Herper and a few others excepted). I'm no statistician myself, but I spend more time with the subject than most people do, so I'll extract some highlights from Lagakos's piece.
He has a useful figure where he looks at the two incidence curves for the Vioxx and placebo groups. These are the curves that have been the source of so much controversy: whether there was an increased risk after 18 months of Vioxx therapy, or if the risk was clear from the outset, and so on. As Lagakos points out, in a slap at Merck's public treatment of the graphs:
"It may then be of interest to assess how the cumulative incidence curves might plausibly differ over time. Doing so by means of post hoc analyses based on visual inspection of the shapes of the Kaplan-Meier curves for the treatment groups can be misleading and should be avoided. A better approach is to create a confidence band for the difference between the cumulative incidence curves in the treatment and placebo groups - that is, for the excess risk in the treatment group."
He does just that, at the 95% confidence level. What it shows is that well past the disputed 18-month point, the 95% confidence band still contains the 0% difference line, and there's room around it on both sides. As he summarizes it:
"The graph shows that there are many plausible differences, including a separation of the curves at times both before and after 18 months, and a consistently higher or lower cumulative incidence in the rofecoxib group, relative to the placebo group, before 18 months."
In other words, the data don't really add much support to anyone's definitive statements about Vioxx risks before 18 months. The 95% band only widens out to a plus or minus 1% difference in cumulative incidence rates at a time between 18 and 24 months. At that point, the upper and lower bounds are both creeping up, but the band only rises to an all-positive difference between the two groups at the 30-month mark. By the 36-month point, the last in the study, the 95% confidence band is between a 1% and a 4.5% risk difference for Vioxx therapy compared to placebo.
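For anyone who wants to see the mechanics, here is a rough Python sketch of the general idea: estimate each arm's Kaplan-Meier cumulative incidence, subtract the two, and put an interval around the difference. Some loud caveats: the data are simulated with made-up hazard rates (these are not the APPROVe numbers), the function names are hypothetical, and this computes a simpler pointwise interval from Greenwood's variance formula, not the simultaneous confidence band Lagakos describes, which would generally be wider.

```python
import math
import random


def km_cumulative_incidence(times, events, t):
    """Return (1 - S(t), Greenwood variance) from Kaplan-Meier at time t.

    times  : follow-up time for each patient
    events : True if the patient had the event, False if censored
    """
    surv = 1.0
    greenwood_sum = 0.0
    event_times = sorted({ti for ti, e in zip(times, events) if e and ti <= t})
    for u in event_times:
        at_risk = sum(1 for ti in times if ti >= u)
        d = sum(1 for ti, e in zip(times, events) if e and ti == u)
        surv *= 1.0 - d / at_risk
        if at_risk > d:
            greenwood_sum += d / (at_risk * (at_risk - d))
    return 1.0 - surv, surv * surv * greenwood_sum


def excess_risk_ci(treated, placebo, t, z=1.96):
    """Pointwise CI for the difference in cumulative incidence at time t.

    This is NOT a simultaneous band over all times; it is the simpler
    one-time-point interval, used here only for illustration.
    """
    inc_t, var_t = km_cumulative_incidence(treated[0], treated[1], t)
    inc_p, var_p = km_cumulative_incidence(placebo[0], placebo[1], t)
    diff = inc_t - inc_p
    half_width = z * math.sqrt(var_t + var_p)
    return diff, diff - half_width, diff + half_width


def simulate_arm(n, monthly_hazard, rng, follow_up=36.0):
    """Simulated arm: exponential event times, everyone censored at follow_up."""
    times, events = [], []
    for _ in range(n):
        t = rng.expovariate(monthly_hazard)
        times.append(min(t, follow_up))
        events.append(t <= follow_up)
    return times, events


if __name__ == "__main__":
    rng = random.Random(0)
    # Hazard rates and arm sizes below are invented for the demo.
    treated = simulate_arm(1000, 0.0015, rng)
    placebo = simulate_arm(1000, 0.0008, rng)
    for month in (12, 18, 24, 30, 36):
        diff, lo, hi = excess_risk_ci(treated, placebo, month)
        verdict = "includes zero" if lo <= 0.0 <= hi else "excludes zero"
        print(f"{month:2d} months: excess risk {diff:+.3%} "
              f"(95% CI {lo:+.3%} to {hi:+.3%}), {verdict}")
```

The thing to watch in the output is whether the interval at each time point straddles zero. That is the same question Lagakos's figure answers, with the difference that his band is built to hold across all time points at once.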
This doesn't help Merck - in fact, since they've made such a lot of noise about this 18-month threshold, it does them quite a bit of damage. But it doesn't directly help the plaintiffs who are suing them, either - the good news for them is that Merck is looking bad again.
Lagakos goes on to talk about what these demonstrated long-term risks can tell us about short-term ones. Assuming that the risk for, say, 12 months of Vioxx is somewhere between the placebo group and the 36-month figure (a reasonable assumption), these figures will set the upper and lower bounds. The most optimistic outcome, then, is that 12 months of Vioxx does nothing to you at all, compared to placebo, even after another two years of observation. And the most pessimistic outcome is that the Vioxx you took continues to increase your risk the same as if you'd been taking it the whole three years (a damage-is-already-done scenario). Although Lagakos doesn't name these as such, you could call these two boundaries the Merck line and the Trial Lawyer line, because they correspond to what each side would fervently like to believe is true.
Combining this with his 95% confidence band plot, you end up with a figure that shows that, within 95% confidence, the excess risk for a 12-month treatment could still range anywhere from zero up to the worst that was seen in the full-term-treatment group. So, because this range still includes the no-effect outcome, you can't conclude that a shorter course of Vioxx was harmful. But because it includes the data of the out-to-three-year group, you can't conclude it's safe, either. And that's really the best you can do. If you're not willing to make those starting assumptions, you can't really say anything about the shorter courses of treatment at all.
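The bounding argument is simple enough to write down as a few lines of Python. This is only a sketch of the logic as described above, not Lagakos's actual calculation: the function names are hypothetical, and the inputs are the 36-month band limits quoted earlier (1% and 4.5%). The key assumption, labeled in the code, is that a shorter course lands somewhere between "no excess risk at all" and "the full-term excess risk."

```python
def short_course_range(full_term_lower, full_term_upper):
    """Plausible excess-risk range for a shorter course of treatment.

    ASSUMPTION: the short-course excess risk lies between zero (the
    optimistic "Merck line") and whatever the full-term excess risk is
    (the pessimistic "Trial Lawyer line"). The full-term excess risk is
    itself only known to lie within [full_term_lower, full_term_upper].
    """
    lower = min(0.0, full_term_lower)
    upper = max(0.0, full_term_upper)
    return lower, upper


def describe(lower, upper):
    """Say what the range does and does not let you conclude."""
    includes_no_effect = lower <= 0.0 <= upper
    includes_harm = upper > 0.0
    return includes_no_effect, includes_harm


if __name__ == "__main__":
    # 36-month band limits as quoted in the post: 1% to 4.5%.
    lower, upper = short_course_range(0.01, 0.045)
    no_effect, harm = describe(lower, upper)
    print(f"Short-course excess risk: {lower:.1%} to {upper:.1%}")
    print(f"No-effect outcome still possible: {no_effect}")
    print(f"Full-term-sized harm still possible: {harm}")
```

Both flags come out true for these inputs, which is the whole problem: the range is too wide to rule out either side's preferred story.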
This is, I think, a valid way of looking at the controversy, but in the end, it's not going to satisfy anyone. It makes me think that both Merck and the lawyers going after them will either: (a) pick their favorite sections from this article and beat each other with them like pig bladders, or (b) ignore it completely. (I think that the first one is already happening, with the advantage, for now, to the lawyers.) If Merck can make a successful counterattack that the data don't show that Vioxx was harmful for shorter courses, either, perhaps they can get something out of this. That depends, of course, on people believing a single word that they say. Which they're making more difficult all the time.
1. John Johnson on June 26, 2006 10:52 PM writes...
And, unfortunately from this statistician's perspective, you've really described the state of the art of drug safety analysis. Basically we generate thousands of lines of adverse event counts and laboratory analysis means and perhaps some vitals and physical exams, and try to graft some p-values on top of some or all of these. Because of the multiple comparisons issue, these p-values are meaningless, and I usually recommend against them. And adjusting your acceptable Type I error rate isn't any good, either, because it's not conservative to say that a drug is safe just because a p-value is greater than 0.05. So essentially we hand it off to an MD and hope they make sense out of it (and usually do so with a lot of time and pain).
We're slowly trying to climb out of this hole by using graphical and Bayesian methods (and the MD's job will never be replaced no matter how fancy we can get), but I don't think we'll be out of it anytime soon.