Map of life expectancy at birth from Global Education Project.

Monday, September 26, 2016

Social Psychology vs. Parapsychology

Hard to say nowadays which is less credible as science. Here's a lengthy post by Andrew Gelman about the so-called "replication crisis," which is a fancy way of saying that the entire field of social psychology is looking like mostly bunkum.

The defensiveness of its practitioners is to be expected, but let's check our own walls for glass before throwing stones. The main problems in the field of social psychology are not ones to which other social sciences or for that matter biomedical research are immune. It's a bit hard to explain if you haven't taken much in the way of statistics or research methods generally, but the keystone issue is the worship and misunderstanding of the concept of "statistical significance."

Suppose I compare two groups sampled from a given population, with every member having an equal (or at least known) probability of being selected at random, and those with characteristic A are more likely to have characteristic X than are those without A. I want to know how likely this is to just be a coincidence. If a difference at least as large as the one observed is expected to occur less than 5% of the time when there isn't really a difference in the total population, we say the "p value" is less than .05 and we declare the observation "statistically significant," which is presumed to be more or less synonymous with "true." If the probability is 6% we declare the observation "not statistically significant," which is presumed to be synonymous with "false."
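To make the 5% versus 6% cliff concrete, here is a minimal sketch of one common way to get a p value for this kind of comparison, a pooled two-proportion z-test. The function name and the counts are hypothetical, chosen only for illustration; nothing here comes from Gelman's post.

```python
import math

def two_proportion_p_value(x_a, n_a, x_b, n_b):
    """Two-sided p value for a difference in proportions (pooled z-test).

    x_a of n_a people with characteristic A have X;
    x_b of n_b people without A have X.
    """
    p_a = x_a / n_a
    p_b = x_b / n_b
    pooled = (x_a + x_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided tail probability under a standard normal distribution
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical data: 60 of 100 people with A have X.
# In the comparison group, either 46 or 47 of 100 have X.
p_just_under = two_proportion_p_value(60, 100, 46, 100)
p_just_over = two_proportion_p_value(60, 100, 47, 100)

print(f"60 vs 46: p = {p_just_under:.3f}")
print(f"60 vs 47: p = {p_just_over:.3f}")
```

With these made-up counts, one person answering differently in the comparison group moves the result from one side of .05 to the other, which is hardly the difference between "true" and "false."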

This is so wrong for so many reasons it makes one feel foolish to point them out. One is that the p value depends on sample size as much as it does on the magnitude of the effect. If my sample is too small, I will be likely to get a non-significant p value even if a meaningfully large effect exists. If the sample is large, I will likely get a "significant" p value for a meaninglessly small effect. A bigger problem is that if I make multiple comparisons I will likely find a "significant" value in there just by chance. To account for that, you have to multiply the p values by the number of comparisons (the Bonferroni correction) or adjust them in some similar way. Cherry picking the ones that are "significant" is basically fraudulent, although it seems most people who do it don't know that.
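Both problems are easy to see with a little arithmetic. The sketch below is illustrative only: the function names and numbers are hypothetical, the test is a simple pooled two-proportion z-test with equal group sizes, and the multiple-comparisons calculation assumes the comparisons are independent, which real ones often are not. Multiplying p values by the number of comparisons is what is usually called a Bonferroni correction.

```python
import math

def p_value_for_proportions(p_a, p_b, n_per_group):
    """Two-sided p value for observed proportions p_a vs p_b (pooled z-test)."""
    pooled = (p_a + p_b) / 2
    se = math.sqrt(pooled * (1 - pooled) * 2 / n_per_group)
    z = (p_a - p_b) / se
    return math.erfc(abs(z) / math.sqrt(2))

# A large effect (60% vs 40%) in a tiny sample of 20 per group
p_small_sample = p_value_for_proportions(0.60, 0.40, 20)
# A trivial effect (50.5% vs 49.5%) in a huge sample of 100,000 per group
p_large_sample = p_value_for_proportions(0.505, 0.495, 100_000)

def chance_of_false_positive(k, alpha=0.05):
    """Chance that at least one of k independent comparisons comes out
    'significant' when no real effect exists anywhere."""
    return 1 - (1 - alpha) ** k

def bonferroni(p_values):
    """Multiply each p value by the number of comparisons, capped at 1."""
    k = len(p_values)
    return [min(1.0, p * k) for p in p_values]

print(f"big effect, small sample:  p = {p_small_sample:.3f}")
print(f"tiny effect, huge sample:  p = {p_large_sample:.6f}")
print(f"20 comparisons, no real effects: "
      f"{chance_of_false_positive(20):.0%} chance of a 'finding'")
print(bonferroni([0.01, 0.04, 0.30]))
```

Under these assumptions the big effect fails the .05 test while the trivial one passes easily, and with 20 comparisons the odds of at least one spurious "finding" are better than even.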

Other problems are that in social psychology, dependent variables are typically quite subjectively measured and it may be difficult to detect observer bias; independent variables may be associated with other, unmeasured variables that are actually responsible for any effect; there are all sorts of rationales for excluding cases selectively after the fact; and samples are rarely representative of any broader group than (quite typically) undergraduates at a selective university from which they are drawn -- who, by the way, are quite likely to divine the research question and consciously or unconsciously alter their behavior in response.

Gelman points to all sorts of other design flaws but the overall lesson is that it's just too easy to find what you are looking for. These studies get a lot of press because they seem relatable and often directly relevant to our own lives and supposed behavioral predispositions and those of the people around us. But they're largely gahrbahzh. So sad.


Anonymous said...

I worked in social psychology in the late 80s and early 90s. What was most dismaying to me (besides the critiques you allude to and many others since 2000 or so) was that some excellent studies were being performed, others were junk, but it was very difficult to distinguish between the two, and certainly such a distinction was never discussed, owing to a kind of corporatist tribalism. (I’m not even considering the obvious ‘junk’ psy-pop studies; that is a whole other problem.) The worst was that some excellent work (planned or done) was discouraged / blocked / refused for publication and so on. Second worst: interesting studies of a more qualitative nature, circumscribed to a certain population, group, question, etc. and a specific historical-social context, were deemed ‘not scientific’. These poor authors, who did nothing but be honest and try to avoid the traps of poor stats and impossible so-called rigorous experimental design, were punished.


Cervantes said...

I wonder why peer review standards went so badly awry in the field.