Potential privacy lapse found in Americans’ 2010 census data

FILE - This March 23, 2018, file photo shows an envelope containing a 2018 census letter mailed to a U.S. resident as part of the nation's only test run of the 2020 Census. The Supreme Court will decide whether the 2020 census can include a question about citizenship that could affect the allocation of seats in the House of Representatives and the distribution of billions of dollars in federal money. (AP Photo/Michelle R. Smith, File)

WASHINGTON (AP) — An internal team at the Census Bureau found basic personal information collected from more than 100 million Americans during the 2010 head count could be reconstructed from obscured data, but with lots of mistakes, a top agency official disclosed Saturday.

The age, gender, location, race and ethnicity for 138 million people were potentially vulnerable. So far, however, only internal hacking teams have discovered such details at possible risk, and no outside groups are known to have grabbed data intended to remain private for 72 years, chief scientist John Abowd told a scientific conference.

The Census Bureau is now scrapping its old data shielding technique for a state-of-the-art method that Abowd claimed is far better than Google’s or Apple’s.

Some former agency chiefs fear the potential privacy problem will add to the worries that people will avoid answering or lie on the once-every-10-year survey because of the Trump administration’s attempt to add a much-debated citizenship question.

The Supreme Court on Friday announced it would rule on that proposed question, which has been criticized for being political and not properly tested in the field. The census count is hugely important, helping with the allocation of seats in the House of Representatives and distribution of billions of dollars in federal money.

The 8 billion pieces of statistics in census data are supposed to be jumbled so that what is released publicly for research cannot identify individuals for more than seven decades. In 2010, the Census Bureau did this by swapping similar household information from one city to another, according to Duke University statistics professor Jerome Reiter.

In the internal tests, Abowd said, officials were able to match 45 percent of the people who answered the 2010 census with information from public and commercial data sets such as Facebook. But errors in this technique meant that only data for 52 million people would be completely correct, little more than 1 in 6 of the U.S. population.

He said the 2010 census used the best possible privacy protection available, but hackers since then have become more skilled in reconstructing data. To counter their growing abilities, the agency has completely changed the system for 2020 and will offer the “gold standard” of privacy regardless of the fate of the citizenship question, Abowd said.

People “want to know that statistical tables aren’t going to come back and haunt them,” Abowd said at the American Association for the Advancement of Science’s annual meeting. “I promise the American people they will have the privacy that they deserve.”
