Abraham Lincoln once expressed the desire to preserve, in a time of civil war, a government that was “of the people, by the people, for the people.” What he did not say was that such a government has also always been of data, by data, and sometimes for data. Democratic governance has been fundamentally data-driven for a very long time. Representation in the United States depends on a constitutional requirement, introduced at the founding, of an “actual enumeration” of the population every 10 years: a census designed to ensure that the people are accurately represented, in their proper places and in proportion to their relative numbers.
A complete national census is always a monumental task, but the most recent count faced unprecedented challenges. The 2020 census first had to overcome the Trump administration’s ill-considered efforts to add a citizenship question. Then it spent half the year in the field, trying to count every person during a pandemic that made it particularly difficult to knock on strangers’ doors. A series of devastating hurricanes and wildfires added to the challenge. And yet, in late April 2021, the professional staff of the U.S. Census Bureau succeeded in fulfilling the Constitution’s mandate and revealed state-level populations, translating them into an apportionment of the 435 seats in the U.S. House and a corresponding number of votes in the Electoral College. (The apportionment was made automatically according to an algorithm called “equal proportions” or “Huntington-Hill,” which is prescribed by law.) Now, just last month, we learned that some of these numbers were most likely incorrect.
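For readers curious about how the “equal proportions” method turns populations into seats, here is a minimal sketch in Python. It assumes the standard textbook description of Huntington-Hill: every state starts with one seat, and each remaining seat goes to the state with the highest priority value, computed as its population divided by the square root of n(n + 1), where n is the number of seats it already holds. The function name `apportion` and the three toy states are hypothetical illustrations, not the Census Bureau's own code or data.

```python
import heapq
from math import sqrt


def apportion(populations, seats):
    """Distribute `seats` among states by the method of equal proportions.

    `populations` maps a state name to its population count.
    Illustrative sketch only; names and inputs here are hypothetical.
    """
    if seats < len(populations):
        raise ValueError("need at least one seat per state")

    # Every state is guaranteed one seat before any priorities are compared.
    allocation = {state: 1 for state in populations}

    # heapq is a min-heap, so priorities are negated to pop the largest first.
    # A state holding n seats has priority population / sqrt(n * (n + 1)).
    heap = [(-pop / sqrt(1 * 2), state) for state, pop in populations.items()]
    heapq.heapify(heap)

    for _ in range(seats - len(populations)):
        _, state = heapq.heappop(heap)
        allocation[state] += 1
        n = allocation[state]
        heapq.heappush(heap, (-populations[state] / sqrt(n * (n + 1)), state))

    return allocation


# Toy example with made-up populations and a ten-seat chamber.
toy = {"A": 6000, "B": 3000, "C": 1000}
print(apportion(toy, 10))
```

Because every seat after the guaranteed first is awarded by comparing priority values, the final seat can turn on a very small difference between two states' counts, which is why modest counting errors can matter so much.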
The Census Bureau’s Post-Enumeration Survey (PES) went back out into the field, re-interviewing a sample of people from across the country and then comparing the new, more in-depth survey with the results of the census. By analyzing this comparison, the agency now estimates that the 2020 census overcounted the population in eight states and undercounted it in six. To give a sense of the magnitude of these errors, the PES reported with 90 percent confidence that New York State’s population was overcounted by anywhere from about 400,000 to over 1 million people, or 1.89 to 4.99 percent of the population. Given the circumstances of the census, such low error rates should be considered impressive, and yet such differences can have major consequences when the last seat in the U.S. House has, since 1940, been decided by as few as 89 people and no more than 17,000. Much of the initial commentary on the PES results has focused on the horse-race implications of the errors, pointing out that several of the overcounted states were blue states, while several of the undercounted states were red. The errors that seem to favor one party over another have even been branded “a scandal,” and the census has been written off as “a bust.”
These are overreactions, and yet the question remains: What should we do about these small but statistically and politically significant errors?
This is a riddle that the leaders of our nation have struggled with since its inception. Over the last century, two different approaches have dominated. One relies on channeling money and energy into mobilizing fuller participation in the census and into other systemic reforms that prevent mistakes. The second relies on statisticians, who have worked to develop techniques that can measure errors accurately and then correct the counts. Both of these approaches remain important, and yet the scale of the 2020 census suggests that an older method of dealing with census errors should be revived: We should expand the House and the Electoral College so that few or no states lose representation on the strength of an uncertain figure. We should try to count better and correct the mistakes we can, but our democracy will be more robust if we also lower the stakes for each census. Representation does not have to be a zero-sum game.
The earliest known reference to an undercount came from Thomas Jefferson, then Secretary of State, who wrote in 1791 about the previous year’s census, the country’s first. Jefferson wrote to his correspondents in Europe, assuring them that the American population was a few percentage points larger than officially stated. It’s hard to say if this was really the case, but history makes it clear that concerns about omissions and undercounting began more than two centuries ago. In subsequent decades, disasters and administrative failures caused serious omissions, such as when the official charged with counting Alabama residents died in office before finishing his work on the 1820 census, or when many of California’s records (including all of San Francisco County’s) burned after the 1850 census.