Article
kb-anonymity: A model for anonymized behavior-preserving test and debugging data
PLDI 11: Proceedings of the 2011 ACM Conference on Programming Language Design and Implementation, San Jose, CA, June 4-8, 2011
  • Aditya BUDI, Singapore Management University
  • David LO, Singapore Management University
  • Lingxiao JIANG, Singapore Management University
  • Lucia Lucia, Singapore Management University
Publication Type
Conference Proceeding Article
Publication Date
6-2011
Abstract

It is often very expensive and practically infeasible to generate test cases that can exercise all possible program states in a program. This is especially true for a medium or large industrial system. In practice, industrial clients of the system often have a set of input data collected either before the system is built or after the deployment of a previous version of the system. Such data are highly valuable as they represent the operations that matter in a client's daily business and may be used to extensively test the system. However, such data often carry sensitive information and cannot be released to third-party development houses. For example, a healthcare provider may have a set of patient records that are strictly confidential and cannot be used by any third party. Simply masking sensitive values alone may not be sufficient, as the correlation among fields in the data can reveal the masked information. Also, masked data may exhibit different behavior in the system and become less useful than the original data for testing and debugging.

For the purpose of releasing private data for testing and debugging, this paper proposes the kb-anonymity model, which combines the k-anonymity model commonly used in the data mining and database areas with the concept of program behavior preservation. Like k-anonymity, kb-anonymity replaces some information in the original data to ensure privacy preservation so that the replaced data can be released to third-party developers. Unlike k-anonymity, kb-anonymity ensures that the replaced data exhibit the same kind of program behavior exhibited by the original data so that the replaced data may still be useful for the purposes of testing and debugging. We also provide a concrete version of the model under three particular configurations and have successfully applied our prototype implementation to three open source programs, demonstrating the utility and scalability of our prototype.
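
The combination of k-anonymity and behavior preservation described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the paper's keywords point to symbolic execution, whereas the sketch below simply brute-forces a small candidate pool. The program under test (`program_path`), the record layout `(age, income)`, the thresholds, and all helper names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def program_path(record):
    """Hypothetical program under test: return the branches taken as a tuple."""
    age, income = record
    return (
        "senior" if age >= 65 else "adult",
        "high" if income > 50000 else "low",
    )

def kb_groups(records, k):
    """Bucket records by program path; keep buckets with at least k records."""
    buckets = defaultdict(list)
    for record in records:
        buckets[program_path(record)].append(record)
    return {path: group for path, group in buckets.items() if len(group) >= k}

def synthesize(path, originals, candidates):
    """Pick a candidate that follows `path` but equals no original record.

    Assumption: a brute-force search stands in for constraint solving.
    """
    for candidate in candidates:
        if program_path(candidate) == path and candidate not in originals:
            return candidate
    return None

def anonymize(records, k, candidates):
    """Release one replacement record per path shared by >= k originals."""
    released = []
    for path, group in kb_groups(records, k).items():
        replacement = synthesize(path, group, candidates)
        if replacement is not None:
            released.append(replacement)
    return released

# Usage: two originals share the ("senior", "high") path; the third is alone
# on its path, so it is suppressed when k = 2.
originals = [(70, 60000), (68, 80000), (30, 20000)]
pool = [(a, i) for a in (25, 66) for i in (10000, 90000)]
released = anonymize(originals, 2, pool)
print(released)
```

In this sketch the released record drives the toy program down the same path as the originals it stands in for, while at least k originals map to it, which is the intuition the abstract gives for keeping anonymized data useful for testing and debugging.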

Keywords
  • k-anonymity
  • symbolic execution
  • third-party testing and debugging
  • behavior preservation
ISBN
9781450306638
Identifier
10.1145/1993316.1993551
Publisher
ACM
City or Country
New York
Copyright Owner and License
Authors
Creative Commons License
Creative Commons Attribution-Noncommercial-No Derivative Works 4.0
Additional URL
http://doi.org/10.1145/1993316.1993551
Citation Information
Aditya BUDI, David LO, Lingxiao JIANG and Lucia Lucia. "kb-anonymity: A model for anonymized behavior-preserving test and debugging data" PLDI 11: Proceedings of the 2011 ACM Conference on Programming Language Design and Implementation, San Jose, CA, June 4-8, 2011 (2011) p. 447 - 457
Available at: http://works.bepress.com/david_lo/72/