Map of life expectancy at birth from Global Education Project.

Monday, August 15, 2005

Everything causes cancer . . .

Try googling that three-word phrase, and you will get a pretty good idea of where the general public is at these days on the science of medicine and public health. As one commenter on another blog (alldumb) wrote, about John Ioannidis's study finding that about 1/3 of widely cited studies are later overturned, "well no shit, all scientific studies are bullshit... thats why everything causes cancer nowadays... one day theyll tell you to drink 2 glasses of wine every day and then the next day theyll tell you that fishing causes cancer and that you shouldnt drink wine cuz it causes cancer and then theyll mention the report on 20/20 and all of a sudden everyone's wearing anti-fishing, anti-red wine bracelets."

Now pay attention! We are treading on the burning ground. Don't leave the pathway, this is very dangerous!

Ioannidis has an essay, which you may read on the Public Library of Science, entitled "Why Most Published Research Findings Are False." He writes:

Published research findings are sometimes refuted by subsequent evidence, with ensuing confusion and disappointment. . . . There is increasing concern that in modern research, false findings may be the majority, or even the vast majority of published research claims. . . It can be proven that most claimed research findings are false.

Uh oh! Is this the death of science? Does this mean that science is no better than faith, that we would do as well to follow the advice of that huckster with an infomercial as our M.D.? That scientific belief is just one more orthodoxy, created by consensus, not evidence?

No, it doesn't mean anything of the sort. But it does mean that health and science reporters, the FDA, and for that matter, doctors and professors, need to have a better understanding of the epistemological underpinnings of public health and medicine.

Ioannidis's argument depends on Bayes' Theorem. (If you didn't catch my explanation, it's here.) Bayes' Theorem is astoundingly simple, obvious once you see it, and yet disconcertingly counterintuitive. It's one of the most important tools of critical thinkers. Keep it ever in mind.

To make this as simple as I can, the bulk of research consists of looking for associations between two phenomena, e.g. eating twinkies and shooting city councillors, whatever. Now, let us say we decide that we won't call an association "statistically significant" unless we have a p value of .05 or less, i.e. the association would appear to be real purely by chance no more than 5% of the time. That's equivalent to a 95% specific test. But as we have seen, the positive predictive value -- the proportion of true positives -- of a test depends on the prior probability -- the proportion of true positives in the population -- as well as the specificity. Where the prior probability is low, even a highly specific test has a low positive predictive value.
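To make the arithmetic concrete, here is a minimal sketch (the numbers are hypothetical, chosen only for illustration) of how the positive predictive value collapses as the prior probability falls, even for a 95% specific test:

```python
def positive_predictive_value(prior, sensitivity, specificity):
    """Bayes' Theorem: P(truly positive | positive test).

    prior       -- proportion of true positives in the population
    sensitivity -- P(positive test | truly positive)
    specificity -- P(negative test | truly negative)
    """
    true_positives = prior * sensitivity
    false_positives = (1 - prior) * (1 - specificity)
    return true_positives / (true_positives + false_positives)

# A 95%-specific "test" (the analogue of p < .05), assuming 80% sensitivity:
for prior in (0.5, 0.1, 0.01):
    ppv = positive_predictive_value(prior, 0.80, 0.95)
    print(f"prior {prior:.0%}: PPV = {ppv:.0%}")
# prior 50%: PPV = 94%
# prior 10%: PPV = 64%
# prior 1%: PPV = 14%
```

When only one hypothesis in a hundred is real, roughly six out of seven "statistically significant" positives are false alarms, despite the respectable-sounding 95% specificity.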

In most fields of inquiry that we might define, we can think up a vast number of possible associations, the vast majority of which will not be real. Hence the prior probability for any given hypothesis is low, and most "statistically significant" findings will be false! Now, this is not at all fatal for science. It just means that a single study, in isolation, means little. It is only suggestive. We now need to go out and try to replicate the results, using if possible bigger samples and more rigorous methodology, specifically to test that hypothesis. If we continue to get positive results, the probability that the association is real goes up dramatically, and may well become a virtual certainty. Think of smoking and lung cancer.
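As a rough sketch of the replication point (again, the numbers are invented), each positive study can be treated as a Bayesian update: yesterday's posterior becomes today's prior, and a long-shot hypothesis climbs toward near-certainty after a few independent confirmations:

```python
def update(prior, power=0.80, alpha=0.05):
    """One Bayesian update after a positive study.

    power -- P(positive result | association is real)
    alpha -- P(positive result | association is not real)
    """
    return (prior * power) / (prior * power + (1 - prior) * alpha)

prob = 0.01  # a long-shot hypothesis: 1% prior probability
for study in range(1, 5):
    prob = update(prob)
    print(f"after positive study {study}: P(real) is about {prob:.2f}")
```

With these assumed error rates, the probability climbs from 1% to roughly 14%, then 72%, then 98% over three positive replications, which is the qualitative point: one study suggests, a string of them persuades.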

This fundamental problem is compounded by several features of the culture of science. One is the bias toward publication of positive findings. If six teams investigate one question, and one has a positive finding while 5 do not, what will the world hear about?

Another problem is the psychological, financial, and/or prestige investment of researchers in their hypotheses and earlier findings. This creates a bias in favor of confirmation, and entrenches orthodoxies that are difficult to overturn. Yet another is the ill-informed, credulous, and sensationalistic field of journalism, which gets hyperexcited over single studies that merely suggest places we might want to look further, rather than settling questions.

Ioannidis asks us to look for larger-scale studies, larger effect sizes, more rigorous and carefully defined methodologies, disinterested investigators (not a synonym for uninterested!), narrowly specified questions with substantial prior probabilities, and consideration of all studies done in a field rather than reliance on a single study (i.e., systematic review). Trying to get a better understanding of the prior probability of a hypothesis (which he expresses as the pre-study odds, R) and how that relates to the interpretation of statistical significance is his final recommendation, although that is something of an epistemological tangle, in my view.
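For the curious, Ioannidis's essay boils this down to a single formula: the post-study probability that a claimed finding is true, given the pre-study odds R, the significance threshold alpha, and the Type II error rate beta. A sketch of that formula as I read it, with hypothetical values plugged in:

```python
def ioannidis_ppv(R, alpha=0.05, beta=0.20):
    """Post-study probability a claimed finding is true.

    R     -- pre-study odds that the probed relationship is real
    alpha -- Type I error rate (the significance threshold)
    beta  -- Type II error rate (1 minus statistical power)

    PPV = (1 - beta) * R / (R - beta * R + alpha)
    """
    return (1 - beta) * R / (R - beta * R + alpha)

# Two hypothetical fields: even odds vs. 1-in-100 odds that a
# probed relationship is real.
for R in (1.0, 0.01):
    print(f"pre-study odds {R}: PPV is about {ioannidis_ppv(R):.2f}")
```

With even pre-study odds the PPV comes out around 0.94; with 1-in-100 odds it falls to around 0.14, which is the whole argument in two numbers.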

The bottom line for the perplexed citizen is this. There is a great deal that we do know. The reason you keep hearing that yesterday's conventional wisdom is false is not because scientific investigation of medical and public health problems doesn't work: it's that the wisdom becomes conventional prematurely. Ours is an impatient culture. We want answers, and we want them now. Where doctors and professors admit to ignorance or uncertainty, hucksters and charlatans will rush in gladly. The corporate media loves the narrative of heroic medical advances and triumph over sickness and death, and so does the drug industry. They'll hype it for all it's worth.

So be skeptical, but don't doubt everything. The truth really is out there, but it isn't always easy to find.

edit: Aarggh. This is what I get for trying to explain really complex ideas simply and in a hurry. I'm sure that some smartass out there (and you know who you are) will quibble with my explanation of the math. Yes yes, p values aren't exactly equivalent to the specificity of a test. The latter is plugged in a priori, based on giving the test to a bunch of people you know are not true positives. The p value is computed from your data, and we can only apply Bayesian reasoning to its use in hindsight, under certain idealized conditions. But qualitatively, the analogy is basically accurate, okay? I'm trying to make things easy for people, not confuse them.