
Thursday, July 29, 2021

On the bias of science: more on methods

Actually I could probably spend the next six months writing a book about this, but this will be my last post on methods, at least for a while, because I want to move on to the remaining sections of the generic research paper. I'm going to focus specifically on research involving psychiatric diagnoses, but much of what I will say applies more broadly to any sort of question in which people's subjective experiences are variables. That's a whole lot of published research in psychology, social science, and medicine.


The culture of science strongly favors quantification, which means counting, and in order to count phenomena, you first need to classify them. So if I'm doing research that concerns, say, depression or anxiety disorder, I need to be able to label people as having those disorders and maybe measure their severity. A big problem in psychiatry -- its dirty, not-so-secret secret -- is that nobody knows what causes psychiatric disorders, or even whether two people who get the same diagnostic label actually have the same "disease," if that's the word for it.

 

Most medical diagnoses correspond to some specific physical findings that can be established, if not with absolute certainty, at least with a known degree of accuracy. We can culture the pathogenic infectious organism, count the leukocytes, see the abnormal cells under the microscope. Psychiatric diagnoses aren't like that. Diagnosticians have to interview people, and they make the diagnosis based on picking from a laundry list of symptoms derived from what the people tell them. Often this has a Chinese menu quality: if you have at least three from column A and two from column B, you get the label. That means, first of all, that two people who are diagnosed with depression may have exactly zero symptoms in common. In fact, they may have exactly opposite symptoms in some cases, such as excessive sleep and inability to sleep. It also means that people who are evaluated by two or more clinicians may get different diagnoses, and that a person's diagnosis may change over time.

 

For purposes of research, however, proposal and paper reviewers demand that research subjects get a definite classification. This is done by administering relatively short standard questionnaires. For depression, commonly used examples are the Beck Depression Inventory, the Hamilton Depression Scale, and the CES-D. Some of these are copyrighted, and you have to pay a fee to use them. This one is called the PHQ-9:

 



In a research study, the score on this or whatever instrument is being used becomes the definition of depression and depression severity. As you can see, people who get the same high score may have absolutely nothing in common. In fact, three of the items consist of completely opposite experiences: poor appetite or overeating, trouble falling asleep or sleeping too much, and lethargy or hyperactivity. But everyone with the same score is treated as having the exact same baseline or outcome. There is no reason to believe the score corresponds to any actually existing, describable, explicable phenomenon. It's just tautological: I say this is what depression is, so that's what depression is. Ergo, treatments for depression are effective if they are associated with a statistically significant reduction in this score. QED. I will leave you to ponder this.
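If it helps to see the arithmetic, here is a minimal sketch of the point about identical scores. It assumes a nine-item questionnaire with each item scored 0 to 3, which is how the PHQ-9 is scored; the two response patterns and the function names are invented for illustration and do not come from any study.

```python
# Hypothetical answers to a nine-item questionnaire, each item scored 0-3.
# Both response patterns are made up for illustration.
person_a = [3, 3, 3, 3, 0, 0, 0, 0, 0]
person_b = [0, 0, 0, 0, 3, 3, 3, 3, 0]


def total_score(responses):
    """Sum the item scores; this single number is what the study analyzes."""
    return sum(responses)


def shared_items(x, y):
    """Return the 1-based numbers of items that both people endorsed (score > 0)."""
    return [i + 1 for i, (a, b) in enumerate(zip(x, y)) if a > 0 and b > 0]


print(total_score(person_a), total_score(person_b))  # 12 12
print(shared_items(person_a, person_b))              # []
```

Both people land on the same total, so an analysis of the score would treat them as interchangeable, even though they did not endorse a single item in common.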

