Wednesday, July 09, 2008

The Epistemology Wonk Strikes Again

Something us science persons seldom like to admit is the goodly distance between reality and data. Data is the muck we run through the ol' statistical sausage grinder to give us the lapidary pictures we put in the journals, and the naive view (sorry if you think I'm talking about you) is that we're slicing and dicing the real world and telling it like it is. But in fact, by the time the input gets to the regression model, it's already been fattened, slaughtered, gutted, skinned, defatted, disinfected, flash frozen, slow thawed, braised, roasted, deboned, and selected.

Or, more specifically and non-analogously in the case of the sociolinguistic research I'm doing, first the doctor and the patient have to meet the eligibility criteria, come to interact with the recruitment efforts, agree to participate in the study, and show up on the appointed days. We had to make a number of decisions in advance that determine who, out of the millions of possible physician-patient dyads in the world, even has a chance of ending up at the top of a set of screens, and that determine or influence which ones will get through them and end up on the feedlot, as it were.

Then they go ahead and have their interaction, knowing that they are being recorded and will otherwise be interviewed, poked and prodded. Is their interaction changed in any way by this knowledge, or by their previous interactions with us? No doubt. We can try to minimize and account for these changes, or explain them away, but there is a kind of Heisenberg principle of social science (strictly speaking, an observer effect), as there is in physics and most scientific endeavors. It's impossible to observe something without changing it.

Then, some of the dialogue will be inaudible, drowned out by extraneous noises, lost to mechanical malfunctions, or lost because one or another of the parties decides she or he doesn't want to be recorded after all, possibly in the middle of the visit. Once we get the recording, we have to transcribe it. If you think it is physically possible to make an accurate written representation of natural human discourse, you are sadly mistaken. We have to invent an elaborate series of rules for doing this, some of which quite consciously involve throwing away information, others of which require judgments which not every transcriptionist will necessarily make in the same way every time.

To analyze this data we need to create a complex set of rules for dividing it into units of analysis, and then operationalize variables -- descriptions or measurements of the units -- by creating a list of possible values and definitions of the conditions or properties which correspond to each of those value labels. Again, the consistency with which different observers may make each of these judgments is variable, but almost never close to perfect. The concepts we use to create the variables, and the ways in which we operationalize them, fully and profoundly limit the possibilities for what we can observe.
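For the curious, here is roughly how that consistency between observers gets put into a number. This is only a sketch of one common choice, Cohen's kappa, which compares how often two coders agree with how often they would agree by chance given how frequently each uses each label. The function name, the category labels, and the codes below are all made up for illustration; they are not our actual coding scheme or data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labeled the same units.

    kappa = (observed agreement - chance agreement) / (1 - chance agreement)
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of units")

    n = len(rater_a)

    # Proportion of units on which the two raters chose the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Agreement expected by chance, from each rater's own label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used one identical label throughout; kappa is undefined,
        # so we treat it as perfect agreement by convention.
        return 1.0

    return (observed - expected) / (1 - expected)


# Hypothetical codes assigned by two coders to the same six utterances.
coder_1 = ["question", "directive", "question", "statement", "statement", "question"]
coder_2 = ["question", "statement", "question", "statement", "directive", "question"]

kappa = cohens_kappa(coder_1, coder_2)
print(f"kappa = {kappa:.3f}")
```

In this toy case the coders agree on four of six utterances, but because some of that agreement would happen by chance, kappa comes out well below the raw two-thirds. A value of 1 would mean perfect agreement and 0 would mean no better than chance.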

Although we are setting out on a journey of discovery, we have some idea of what we are looking for, and even what we hope it will be. To some extent those expectations and wishes will influence our choices unconsciously, but to an even greater extent, I would say, we are fully aware of what we are doing and why. We look for these specific properties of the dialogue because that's what we are interested in, that's what is meaningful to us, and that's what might support our theory, or show up our rivals, get us published, make us famous.

Not that there's anything wrong with that. Ultimately, our observations and our conclusions will be as credible as we can make them, and absolutely true. We're honest and ethical and pure of heart. But they will be true within the limited and carefully designed universe we create for purposes of the investigation. Whether you end up believing that they truly explain the real world -- the sloppy, messy, infinite, buzzing confusion in which you live -- depends on whether you agree that all those decisions we made along the way don't take us too far away from it.

And that's why it's so frustrating for us to argue with denialists -- creationists, global warming deniers, AIDS deniers, and the like. They see our honest and rigorous admission of our limitations as a fatal weakness, and as proof that our beliefs are arbitrary. But that's not it at all. You have to walk the path with us; that's the point. Look through the telescope, see those spots moving around Jupiter, test the telescope on comparatively nearby objects and see that it gives an honest picture, watch those spots for a while and see how we deduce that they are circling the planet, see the phases of Venus, see how they correspond to the hypothesis that Venus is illuminated by the sun and is circling it. Look through the telescope.

