
## Friday, April 01, 2011

### Here's some boring wonkery . . .

. . . to give you some relief from all the April foolishness.

Xin Sun and a cast of thousands reviewed 469 clinical trials reported in both high impact and lower impact journals. Okay, here's the wonkish part. It's actually kind of important and profound, in ways that go well beyond the current context. So I'm using this as an occasion to talk about a big issue.

It is conventional in clinical science to consider results to be valid when they have an associated "p value" of .05 or less, the jargon for which is "statistical significance." The term is misleading, in both directions. Findings with p values <.05 aren't necessarily "significant" in the sense that they actually matter; and "statistical insignificance" doesn't mean that an association doesn't exist, or is trivial. It just means you haven't established it to the satisfaction of an arbitrary convention.

Most people misunderstand what a p value really means. It's usually interpreted as the probability that your result is not real, but arose just by chance. On that reading, if the p value is less than .05, you would say that there's at least a 95% chance your result is real. But that isn't what a p value means. It is the fallacy of the transposed conditional. The p value is actually the probability of getting results at least as extreme as yours given the hypothesis, in this case the null hypothesis that there is no real association. So, obviously, if you make a lot of comparisons you'll see a lot of low p values that are spurious.

The true probability that the drug is effective depends not only on the results of your study, but on the probability that the drug is effective based on what you knew before you did the study. If it is extremely unlikely -- say because there is no biological plausibility -- then a p value <.05 doesn't make it likely after all. It's probably just a coincidence. That's why, in order for p values to make sense, they have to be used to test plausible hypotheses that are specified before you undertake your trial.

You always have a lot of information about clinical trial subjects, so even if the results of the trial are negative overall, you can comb through your data and look for "significant" results for some group that you didn't define ahead of time -- say, women, or people of a certain age range, or people with a particular severity of disease, whatever you like. Chances are, you'll find some. And guess what -- people publish results of that kind. But they're basically BS. They might constitute hypotheses worth testing, but they aren't findings.
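For anyone who wants to see the arithmetic, here is a rough sketch in Python of the two points above. The function names are made up for illustration, and the inputs are assumptions rather than anything from the Sun paper: a trial with 80% power, the usual .05 threshold, and sub-group comparisons that are independent of one another (real sub-groups overlap, so treat the second number as a ballpark).

```python
def prob_effect_is_real(prior, power=0.8, alpha=0.05):
    """P(effect is real | result is 'significant'), by Bayes' rule.

    prior -- assumed probability the drug works, before the trial
    power -- assumed probability of p < alpha when the drug does work
    alpha -- significance threshold (chance of p < alpha when it doesn't)
    """
    true_positives = prior * power
    false_positives = (1.0 - prior) * alpha
    return true_positives / (true_positives + false_positives)


def prob_any_false_positive(n_comparisons, alpha=0.05):
    """P(at least one p < alpha) over independent comparisons
    when there is no real effect in any of them."""
    return 1.0 - (1.0 - alpha) ** n_comparisons


# How believable is a "significant" result at different prior plausibilities?
for prior in (0.5, 0.1, 0.01):
    print(f"prior {prior:>4}: P(real | p < .05) = {prob_effect_is_real(prior):.2f}")

# How easy is it to find a spurious "significant" sub-group?
for k in (1, 5, 10, 20):
    print(f"{k:>2} sub-groups: P(at least one p < .05) = {prob_any_false_positive(k):.2f}")
```

Under those assumptions, a hypothesis that started out with a 1% chance of being true is still only about 14% likely to be true after a "significant" result, and if you comb through 20 independent sub-groups of a drug that does nothing, you have roughly a 64% chance of turning up at least one p value below .05.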

What Sun and the gang found is that trials funded by drug manufacturers that did not have significant results for the pre-specified primary outcomes are much more likely to report sub-group analyses than trials that aren't funded by the manufacturer. What this means is that the manufacturers are trying to find a way to sell their product, even though it apparently doesn't work, by inventing some specific category of person for whom they claim it does work, when in fact they don't have credible evidence for that conclusion.

Whoo. Sorry about that. Anyway, we need to do things differently. Industry funding of clinical trials must be totally isolated from the conduct and reporting of the trials, not just because of this finding but because of a vast record of deception, spin and scandals going back decades. There are probably several plausible ways to organize medical research to bring this about. We could establish a government-sponsored institute to which the drug companies would transfer the funds for clinical trials. The institute would then accept competitive proposals from researchers, and award the funds to the most credible investigators, who would carry out the trial with no contact whatsoever with the interested parties.

Doesn't look exactly like the Free Market™ of conservative corporatist fantasy. But it would be a better world.

davidknz said...

Some while ago, I wrote a program for Factor Analysis that was of considerable interest to the "social scientists" who collected vast troves of data, confident that it contained immutable truth if only they knew how to ask the right questions of it. Part of the process is to reduce the data to a correlation matrix. On one occasion, just for fun, I substituted random numbers (<1.0) on the off-diagonal elements of this matrix. The academic concerned was very excited about the results, as they confirmed his intuitions about the nature of the process he was studying. As he was intending to present the results to a conference in a week, I quickly reran the program with the original data. No change in the enthusiasm; no change in the paper presented. The only change was that I became very sceptical about the claims of so-called "soft sciences."

roger said...

i am too jaded to even pretend to be shocked at the dishonesty of big business.

both my mother and mother-in-law have suffered discomfort and dangerous medical conditions from (foolishly? uninformed?) prescriptions. and we have had to intervene to get the "treatment" to stop.

Cervantes said...

Well now David, you don't give enough of the particulars for me to interpret your story. (I am quite familiar with principal components analysis, BTW.)

You lied to the man. Apparently, by coincidence, your random data ended up looking like something interpretable. It's not the professor's fault, you gave him particular results. You also don't say how the real data turned out different, or whether you explained your deception.

Why are you skeptical of the scientist when it is you who committed the fraud? I don't really get this.