Mathematics, Risk and Messy Survey Data
IASSIST Quarterly (2020)
Data de-identification, or anonymization, is a major ethical concern whenever survey data is to be shared, and one that data professionals may find themselves ill-equipped to deal with. This article is intended to provide an accessible and practical introduction to the theory and concepts behind data anonymization and risk assessment, to describe a couple of case studies that demonstrate how these methods were carried out on actual datasets requiring anonymization, and to discuss some of the difficulties encountered. Much of the literature on statistical risk assessment of anonymized data is abstract and aimed at computer scientists and mathematicians, while material aimed at practitioners often does not consider more recent developments in the theory of data anonymization. We hope that this article will help bridge this gap.
- data deidentification
- survey data
Publication Date
December 2020
Citation Information
Kristi Thompson and Carolyn Sullivan. "Mathematics, Risk and Messy Survey Data" IASSIST Quarterly Vol. 44 Iss. 4 (2020) ISSN: 2331-4141
Available at: http://works.bepress.com/kristi-thompson/19/
Creative Commons license
This work is licensed under a Creative Commons CC BY-NC International License.