Map of life expectancy at birth from Global Education Project.

Friday, May 21, 2021

Data

I'm not talking about the android science officer, although that would be an interesting subject. I'm talking about facts or statistics gathered together for reference or analysis, as the dictionary would have it. 

 

We recently had a commenter who pointed out that the large majority of firearm homicides in the U.S. are not in the context of mass shootings, and that they are perpetrated with handguns, not long guns. This is true! But how did he know it? 

 

He knew it because CDC's National Center for Injury Prevention and Control maintains the National Violent Death Reporting System, which pools information from multiple sources. You will see that NCIPC also maintains other data systems. What these systems can tell us is of course limited -- it depends on what data elements exist in the questionnaires on which they depend, which are necessarily closed-ended, check-the-box type questions -- but this sort of quantitative information can tell us a lot about prevalence and patterns, and can be combined with more in-depth, qualitative inquiry to develop a rich understanding of social problems.

 

The founders recognized the need for good, quantitative information about the population, so they put the decennial census in the Constitution. One rationale was just to count people for purposes of apportioning the House and Electoral College, but from the earliest days the census asked about gender, some concept of race (which changed over the decades) and of course slavery or freedom. As data processing and analysis capabilities grew in the 20th century, the so-called long form was added to gather much more information from a sample of the population -- about housing, poverty, migration, commuting, detailed ancestry, and other subjects. A census of businesses was added. The long form is no longer used; the American Community Survey has replaced it, using sample surveys of various geographic entities on a rotating basis. 


Vital statistics -- registration of births, deaths and marriages -- go back to before the founding. The categories of causes of death used to be quite quaint, but as medical science evolved the concepts evolved with it. The National Vital Statistics System is where we get our information on life expectancy, causes of death, infant mortality, and many more essential indicators. Then there is the National Health Interview Survey, the National Health and Nutrition Examination Survey, and the National Health Care Survey. You can read more about these here.


Physicians must report many diseases to their state public health agency, and specific ones must be reported to CDC. That's how we know about the prevalence of disease, so we can target prevention and response efforts, and get early warning of outbreaks. Crime statistics come from the National Incident-Based Reporting System (formerly the Uniform Crime Reports), not wholly reliable but still useful to detect patterns and trends. The Bureau of Labor Statistics, established in 1884, is where we get data on unemployment and other important economic indicators (GDP comes from the Bureau of Economic Analysis).


I could go on and on -- there are innumerable data systems that are essential to the enactment and implementation of public policy, business planning, and social programs. The people who operate, analyze and report on these systems do not have bullshit jobs. They do an indispensable public service.

9 comments:

mojrim said...

As I said before, I have no quibble with the value of data in general, but you are conflating which type can be used for what. Quantifiable data can tell you what happens, where, and when, but it cannot tell you why. That is to say, what motivated this person to attack that person is not quantifiable and will never be discovered in a reporting system. With a large enough sample you can make generalized correlations to income, education, etc., but that won't help you fix the problem.

Let's take gun use as an example. We know it's almost exclusively handguns and that mass shootings account for less than 0.5% of homicides. The question that matters, however, is why do Americans kill each other in such numbers? Switzerland and Canada are awash in guns, too, but they don't have this problem. If you're going to untangle this you'll have to look outside the CDC/DoJ data.

NB: I got the data on firearm deaths from the old UCR tool, which I dearly miss. The new one is an obfuscationary nightmare.

Cervantes said...

Well sure, this sort of comprehensive quantitative data system usually can't go very far toward telling us why shit happens. That requires a different kind of investigation. You need both kinds of information to figure out how to respond.

Cervantes said...

And BTW I'm more on the qualitative investigation side myself than I am on the big-scale number crunching, although the rest of my department is much more the latter. But they're complementary.

mojrim said...

1. The why is the key to dealing with it. Nothing else counts except as triage.
2. They are only complementary when the quants acknowledge their limitations.
3. Because they don't, we get what Romer called "mathiness."
4. This spate of anti-Asian violence creates the perfect conditions for the above. That is: tiny in scale, upsetting to liberal sensibilities, and certain to end before anyone figures out anything.

But the mandate will never end. Never. We could go a decade without one Asian assaulted in America and it won't end. We'll have a half dozen people tracking anti-Asian violence until the heat death of the universe because that's how bullshit jobs work. They are immune to review because no one wants to be "that guy" unless they're a random internet misanthrope with nothing to lose. That's why almost everyone voted yea and only a couple evil jerks voted nay.

So here it is: I guarantee that not one useful thing will come from this. If shown wrong I will apologize naked in Times Square in December.

Perhaps I should start that blog, after all.

Cervantes said...

Hmm. If you are right, and the data on anti-Asian violence goes to zero, then that will be the information which is revealed to the world. Are you saying you don't want to know that? And obviously, the historical record is contrary to your prediction. There is no longer a reporting system for smallpox or polio in the U.S., and I can't think of a single reporting or surveillance system for a phenomenon that no longer exists. So there doesn't seem to be any basis for your prediction. Can you provide one single example of a precedent?

Woody Peckerwood said...


I will not be there in Times Square in December just in case mojrim is wrong.

There are some things you can't unsee.

mojrim said...

What I'm saying broadly is that these attacks are statistically insignificant, thus their termination will be undetectable - unlike your example of polio. The reporting requirement, in turn, won't end because the whole thing was undetectable to begin with. This is in stark contrast to tracking actual epidemics for obvious reasons. The reporting itself is generally harmless, though it will serve to generate more statistical noise; it's that "coordinate with..." part that gets us more useless mouths.

mojrim said...

Woody, I promise to warn you in advance if it comes to that.

Cervantes said...

Sorry Mo, but in the first place you don't know the meaning of the term "statistically insignificant."

And what makes you think the "whole thing was undetectable to begin with"? How do you know that? In fact, there is a non-profit called Stop AAPI Hate that has a reporting system. According to them, "nearly 3,800 incidents were reported over the course of roughly a year during the pandemic. It’s a significantly higher number than last year's count of about 2,600 hate incidents nationwide over the span of five months. Women made up a far higher share of the reports, at 68 percent, compared to men, who made up 29 percent of respondents. The nonprofit does not report incidents to police."

A properly run national system would collect police incident reports as well as provide a reliable way for people to report incidents which either don't rise to the level of a crime or are not reported to police. It's certainly not undetectable; it's a real phenomenon, and you're basically just slinging bullshit.