Map of life expectancy at birth from Global Education Project.

Tuesday, November 08, 2022

A few words about polling

I'm not going to say anything in particular about the election going on today -- everybody knows where I stand, after all. I do want to talk about the "science," or at least the methods, of polling. Survey research is among the kinds of work I do for a living, and polls intended to predict election results are a category of survey. We'll see how accurate the polls turn out to be in this election, but it won't surprise me if they aren't so great. For technical and cultural reasons, political polling has become more difficult in recent years.


Accurate survey research -- i.e., research that gives an accurate picture of the population being studied -- has some technical requirements. Ideally, every person in the population of interest needs an equal, or at least a known, probability of being sampled. This makes it possible to use statistical theory to determine how likely the sample is to be similar to the entire population. You'll see a "margin of error" associated with poll reporting. This is based on something called the "standard error," and it conventionally means about two standard errors on either side of the poll's number: the interval that, 95% of the time, will contain the true number. (Sorry, I can't find a way to say it more simply.) People ask how it's possible to predict what tens of thousands or millions of voters will do based on interviews with 500 or 1,000 people, and that's the answer -- probability theory.
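For anyone who wants to see the arithmetic, here is a minimal sketch in Python. It assumes a simple random sample and uses the textbook formula for the standard error of a proportion; the function name and the 50% support figure are my own illustrative choices, and real pollsters adjust for their sampling design and weighting.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a proportion.

    Assumes a simple random sample of size n in which a share p
    of respondents back a candidate. z=1.96 is the usual
    "about two standard errors" multiplier.
    """
    standard_error = math.sqrt(p * (1 - p) / n)
    return z * standard_error

# Worst case is p = 0.5, where the standard error is largest.
for n in (500, 1000):
    moe = margin_of_error(0.5, n)
    print(f"n={n}: +/- {100 * moe:.1f} percentage points")
# n=500: +/- 4.4 percentage points
# n=1000: +/- 3.1 percentage points
```

Note that the size of the population never appears in the formula, which is why 1,000 interviews can stand in for millions of voters, as long as the sampling assumptions hold.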


However, this calculation depends on some assumptions which are almost certainly not true. In addition to the assumption of equal probability of every voter being sampled, the probability that people will refuse to answer the poll has to be the same for supporters of each candidate. If more than 80% or so of people sampled agree to be interviewed, this isn't a big problem. However, response rates to polls have fallen precipitously in recent years, to as low as 1%! That means it's really impossible for pollsters to know if they have a representative sample. They can try to weight responses by demographics, known voter registration, and whatever other fancy ideas they may have, but these methods aren't necessarily reliable.
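To show why unequal response rates matter so much, here is a small Python sketch. The numbers are made up for illustration (a dead-even race, with one candidate's supporters answering 1.0% of the time and the other's 1.2%), and the function name is hypothetical; it just computes the share of respondents you would expect to see for candidate A.

```python
def expected_poll_share(true_share_a, response_rate_a, response_rate_b):
    """Expected share of poll respondents who back candidate A.

    true_share_a: A's real share of the two-candidate vote.
    response_rate_a / response_rate_b: probability that a sampled
    supporter of A (or B) agrees to be interviewed.
    """
    responders_a = true_share_a * response_rate_a
    responders_b = (1 - true_share_a) * response_rate_b
    return responders_a / (responders_a + responders_b)

# Hypothetical numbers: a true 50/50 race, with B's supporters
# slightly more willing to answer the phone.
share = expected_poll_share(0.50, 0.010, 0.012)
print(f"Poll shows A at {100 * share:.1f}%")
# Poll shows A at 45.5%
```

A gap of a fifth of a percentage point in response rates turns a tie into an apparent nine-point deficit, and interviewing more people doesn't help, because the distortion is in who answers, not in how many.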

 

Then the respondents also have to actually go and vote, which not all will do; they have to tell you the truth; and they have to not change their minds before they vote. All of these factors are obviously uncertain.


So I'm not taking anything for granted about what will happen today. Go and vote, and we'll find out tomorrow.
