Estimating the Success of Re-Identifications in Incomplete Datasets Using Generative Models

Authors: Luc Rocher, Julien Hendrickx, Yves-Alexandre de Montjoye

Publisher: Nature Communications

Publication Year: 2019

Summary: Motivated by the UK National Health Service’s decision to share de-identified data with DeepMind, the authors set out to measure how difficult it is to re-identify an individual in a de-identified dataset. Using a generative model, they estimate that 99.98% of Americans could be correctly re-identified in any dataset using just 15 demographic attributes. The findings suggest that governments need to act promptly to enforce higher standards for data sharing and for the de-identification process.