Last year was one dumpster fire after another. And the mostly unusable results from the U.S. Census Bureau are proof that the issues of yesteryear torment us still and have hit us where it will hurt the most.
The census results determine society-changing parameters — the redrawing of voting districts and the redistribution of congressional seats, Electoral College votes and an estimated $1.5 trillion a year in federal funding among the states, to name a few.
From the get-go, there were many red flags in the scramble to get the census up and running during a pandemic. The push to gather the majority of census data through new online means, despite the many serious bugs on the census site and iPhone app, was ill-advised, for one.
And then there was the “streamlining” of the census fact-checking phase, which cut many local fact-checkers and demographers out of the process in order to meet the Sept. 30, 2020, deadline. Bad move number two.
Despite all of this, it’s still surprising how bad the data turned out to be. The culprit is a new data-scrambling system, recently adopted by the Census Bureau’s Disclosure Avoidance System, that for all intents and purposes was not as thoroughly vetted at every data level as it should have been before its nationwide implementation.
According to the Census Bureau, “Differential privacy, first developed in 2006, … allows an agency like the Census Bureau to quantify the precise amount of statistical noise required to protect confidentiality. This precision allows (the Census Bureau) to calibrate and allocate precise amounts of statistical noise in a way that protects confidentiality while maintaining overall accuracy of the data in the aggregate.”
Based on the data pulled by demographers such as Washington state demographer Mike Mohrman, this allocation of precise “statistical noise” is nonsense. Not only does the injected noise prove to be “an unacceptable amount,” but “the Census Bureau’s differential privacy algorithms mess around with the numbers themselves,” reports the U-B’s Emry Dinman. How is that “maintaining overall accuracy of the data” when the more one zooms in, the more inaccurate the numbers are?
“The majority of the data output from the (Disclosure Avoidance System) appears to be unfit for most uses,” said Mohrman, who works for the Washington state Office of Financial Management.
This is unacceptable.
These mistakes have a very real impact on our community, especially our minority populations. At the very least, these inaccurate numbers mean inaccurately dispersed funds. They also mean a significant hit to the people’s already diminished trust in government data.
We understand the attempt to update privacy protection strategies at a time when cyberterrorism is on the rise, but in the face of such a big failure in balancing data protection and accuracy, did anyone think of a Plan B?