Map of life expectancy at birth from Global Education Project.

Friday, July 22, 2005

Math attack! Math attack!

Don't worry folks, we're almost done. All that jive with the coin flips and the standard deviations and the p values is the ticket to half of the world of probability and statistics. You need one more ticket, and then we can travel together everywhere.

Suppose there's a rare but serious disease, Chimptastic Virus (CV), that makes you think you're Alexander the Great, causes you to lose the ability to speak in complete sentences, to stay upright on a bicycle, or to swallow pretzels. Sounds pretty serious, huh? Fortunately, only 1% of the population is infected.

There's a diagnostic test for CV which comes back positive 90% of the time when people actually have the virus. (That's called the "sensitivity" of the test, by the way.) If you don't have the virus, it comes back negative 90% of the time. (That's called the "specificity.") Sounds like a pretty good test, huh? The only bad news is that if you test positive, you have to be locked in a padded cell for the fourteen-year incubation period to see whether symptoms emerge.

Okay, you take the test. Oh no! It came back positive! What are the chances that you're infected? (You have one minute to think about it. I'm going out for a cup of coffee.)

Here's my answer. Or, well, actually, it's the answer of an English minister named Thomas Bayes who died in 1761. But the copyright has expired.

Before you took the test, we figured your chances were 1 out of 100, like everybody else's. 99 out of 100 people who get the test don't have CV. But 10% of those 99 people will test positive anyway. On average, that's 9.9 false positives out of 100 people tested. One person out of the 100 actually has CV. That person will test positive 90% of the time, so out of 100 people who take the test, on average, there will be .9 true positives. Add them up: if 100 people take the test, on average, there will be 10.8 positive tests. So you've got 9.9 false positive tests for every .9 true positives.

.9/10.8 = .083333 (or 8.333%).
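If you'd rather let a computer do the counting, here is a minimal sketch of the same head count in Python. The variable names are mine, made up for illustration, and I've used exact fractions so that no rounding sneaks in.

```python
from fractions import Fraction

# The numbers from the story above.
prevalence = Fraction(1, 100)    # 1% of the population is infected
sensitivity = Fraction(9, 10)    # positive 90% of the time if infected
specificity = Fraction(9, 10)    # negative 90% of the time if not infected
people = 100                     # size of the imaginary group taking the test

infected = people * prevalence                   # 1 person
healthy = people - infected                      # 99 people

true_positives = infected * sensitivity          # .9 on average
false_positives = healthy * (1 - specificity)    # 9.9 on average
all_positives = true_positives + false_positives # 10.8 on average

# Of everyone who tests positive, what share is actually infected?
share_infected = true_positives / all_positives

print("positive tests per 100 people:", float(all_positives))
print("chance a positive is real:", float(share_infected))
```

Changing `people` to 1,000 or 1,000,000 changes the head counts but not the final share, which is the whole point of counting this way.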

If you test positive, you have less than a 10% chance of having CV, even though the test is 90% specific!

We can write out a formula for this, but you don't have to remember it; you just have to remember the general idea. First, we calculated the overall probability of a positive test result, which turned out to be 10.8%, or .108. Let's call that probability P(+). We already knew that the probability of being infected, all other things being equal, is .01 (i.e., 1%). Let's call that P(inf). And we know that if you aren't infected, the probability of a positive test is .1 (10%). We'll use a vertical bar -- | -- to designate the probability of one thing given another, like this: P(+|~inf) means "the probability of a positive test given that you are not infected."

So, the probability that you are infected, given a positive test, is P(inf|+), and we find it by the formula

P(inf|+) = [P(+|inf) × P(inf)] / P(+).

We already figured out that P(+) is .108, and we know that P(+|inf) = .9 and that P(inf) = .01. So the equation becomes (.9 × .01)/.108 and sure enough, that equals .0833333333333333333, or 8 1/3%, just like I said before.
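For the programmers in the audience, the formula can be sketched as a small Python function. This is only an illustration under the assumptions of the CV story; the function name `posterior` and its parameter names are my own invention. It builds P(+) from its two pieces, the true positives and the false positives, and then divides.

```python
def posterior(p_inf, sensitivity, specificity):
    """Return P(inf|+), the chance of infection given a positive test."""
    p_pos_given_inf = sensitivity          # P(+|inf)
    p_pos_given_not_inf = 1 - specificity  # P(+|~inf)

    # Overall chance of a positive test:
    # P(+) = P(+|inf) * P(inf) + P(+|~inf) * P(~inf)
    p_pos = p_pos_given_inf * p_inf + p_pos_given_not_inf * (1 - p_inf)

    # Bayes' Theorem
    return p_pos_given_inf * p_inf / p_pos


p = posterior(p_inf=0.01, sensitivity=0.9, specificity=0.9)
print(round(p, 4))  # 0.0833
```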

Why did I inflict all this garbage on you? Because it's extremely important, when your doctor tries to sell you a screening test, that you understand what the test is really going to tell you. If the condition is uncommon in the population -- and just about everything we screen for is, including breast and prostate cancer, at least for people younger than 70 or so -- and given that very few tests have specificity much above 90%, most people who have positive results will not have the disease.

However, they will have to go through whatever happens next: fear, more tests, biopsies, expense, you name it. It isn't necessarily worth it. The usefulness of a test depends more on its specificity than on its sensitivity. The cost of a false negative is theoretically low. We didn't know before and we still don't know, but what have we lost? Of course it could give false reassurance and lead to complacency, as in the case of the very insensitive test for Lyme disease. But there is inevitably a monetary, emotional, and often a physical cost to a false positive. The rarer the condition, the more specific the test has to be to make it worthwhile.
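To see how much the rarity of the condition and the specificity of the test matter, here is a rough sketch that reruns the CV arithmetic with different numbers. The alternative prevalence, sensitivity, and specificity values are hypothetical ones I picked for illustration; they don't describe any real test.

```python
def chance_infected(prevalence, sensitivity, specificity):
    """Chance of actually being infected, given a positive test."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Start from the CV test, then improve one property at a time.
baseline = chance_infected(0.01, 0.90, 0.90)
better_sensitivity = chance_infected(0.01, 0.99, 0.90)
better_specificity = chance_infected(0.01, 0.90, 0.99)

print(f"CV test as described: {baseline:.1%}")
print(f"sensitivity 99%:      {better_sensitivity:.1%}")
print(f"specificity 99%:      {better_specificity:.1%}")

# Hypothetical prevalences, with sensitivity held at 90%.
print("prevalence -> chance infected at specificity 90%, 99%, 99.9%")
for prevalence in (0.1, 0.01, 0.001):
    row = [chance_infected(prevalence, 0.90, s) for s in (0.90, 0.99, 0.999)]
    print(prevalence, [f"{p:.1%}" for p in row])
```

Pushing sensitivity up to 99% only gets a positive result to about a 9% chance of being real, while pushing specificity up to 99% gets it to roughly 48%. And when the condition is ten times rarer, even that improved test leaves most positives false.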

Unfortunately, it has been found in study after study that most doctors, believe it or not, don't understand Bayes' Theorem. They think that if a test is 90% specific, somebody who tests positive probably has the disease. Now you know better.