Presentation
Mining preconditions of APIs in large-scale code corpus
FSE 2014 Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering
  • Hoan Anh Nguyen, Iowa State University
  • Robert Dyer, Iowa State University
  • Tien N. Nguyen, Iowa State University
  • Hridesh Rajan, Iowa State University
Document Type
Conference Proceeding
Conference
FSE Foundations of Software Engineering
Publication Version
Accepted Manuscript
Link to Published Version
http://dx.doi.org/10.1145/2635868.2635924
Publication Date
11-11-2014
DOI
10.1145/2635868.2635924
Conference Title
The 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering
Conference Date
November 16-21, 2014
Abstract

Modern software relies on existing application programming interfaces (APIs) from libraries. Formal specifications for these APIs enable many software engineering tasks and help developers use them correctly. In this work, we mine large-scale repositories of existing open-source software to derive potential preconditions for API methods. Our key idea is that APIs’ preconditions would appear frequently in an ultra-large code corpus with a large number of API usages, while project-specific conditions will occur less frequently. First, we find all client methods invoking APIs. We then compute a control dependence relation from each call site and mine the potential conditions used to reach those call sites. We use these guard conditions as a starting point to automatically infer the preconditions for each API. We analyzed almost 120 million lines of code from SourceForge and Apache projects to infer preconditions for the standard Java Development Kit (JDK) library. The results show that our technique can achieve high accuracy, with recall of 75–80% and precision of 82–84%. We also found 5 preconditions missing from human-written specifications, all of which were confirmed by a specification expert. In a user study, participants found 82% of the mined preconditions to be a good starting point for writing specifications. Using our mining result, we also built a benchmark of more than 4,000 precondition-related bugs.
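
The key idea in the abstract, that true preconditions guard an API across many projects while project-specific conditions stay rare, can be illustrated with a small frequency filter. The following is a minimal sketch and not the authors' implementation: it skips the control dependence analysis entirely and assumes the guard conditions have already been extracted and normalized into strings. The `CALL_SITES` data, the `mine_preconditions` function, the choice to count distinct projects rather than call sites, and the 0.5 support threshold are all hypothetical choices made for illustration.

```python
from collections import defaultdict

# Hypothetical input: one record per call site, already reduced to
# (project, API method, set of normalized guard conditions).
# The paper derives guards from control dependence; here they are hand-written.
CALL_SITES = [
    ("projA", "String.substring(int)", {"arg0 >= 0", "arg0 <= rcv.length()"}),
    ("projB", "String.substring(int)", {"arg0 >= 0"}),
    ("projC", "String.substring(int)", {"arg0 >= 0", "debugEnabled"}),
    ("projD", "String.substring(int)", set()),
]


def mine_preconditions(call_sites, min_support=0.5):
    """Per API, keep guards used by at least min_support of the calling projects."""
    projects_per_api = defaultdict(set)
    projects_per_guard = defaultdict(lambda: defaultdict(set))

    for project, api, guards in call_sites:
        projects_per_api[api].add(project)
        for guard in guards:
            projects_per_guard[api][guard].add(project)

    mined = {}
    for api, projects in projects_per_api.items():
        total = len(projects)
        # Assumed filter: fraction of distinct projects that use the guard.
        mined[api] = sorted(
            guard
            for guard, supporters in projects_per_guard[api].items()
            if len(supporters) / total >= min_support
        )
    return mined


if __name__ == "__main__":
    for api, guards in mine_preconditions(CALL_SITES).items():
        print(api, "->", guards)
```

In this toy corpus, `arg0 >= 0` guards the call in three of four projects and survives the filter, while `debugEnabled` appears in only one project and is dropped as project-specific. Counting distinct projects rather than raw call sites is one way to keep a single large project from dominating the result; whether the paper uses the same measure is not stated in the abstract.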

Comments

This article is published as Nguyen, Hoan Anh, Robert Dyer, Tien N. Nguyen, and Hridesh Rajan. "Mining preconditions of APIs in large-scale code corpus." In Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering, pp. 166-177. ACM, 2014. doi: 10.1145/2635868.2635924. Posted with permission.

Rights
© ACM, 2014. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering, pp. 166-177. ACM, 2014. https://doi.org/10.1145/2635868.2635924
Copyright Owner
Association for Computing Machinery
Language
en
File Format
application/pdf
Citation Information
Hoan Anh Nguyen, Robert Dyer, Tien N. Nguyen and Hridesh Rajan. "Mining preconditions of APIs in large-scale code corpus." FSE 2014 Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering, Hong Kong, China (2014), pp. 166-177
Available at: http://works.bepress.com/hridesh-rajan/101/