"Mining preconditions of APIs in large-scale code corpus" by Hoan Anh Nguyen

Selected Works of Hridesh Rajan


Presentation

Mining preconditions of APIs in large-scale code corpus

FSE 2014 Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering

Hoan Anh Nguyen, Iowa State University
Robert Dyer, Iowa State University
Tien N. Nguyen, Iowa State University
Hridesh Rajan, Iowa State University


Document Type

Conference Proceeding

Disciplines

Conference

FSE Foundations of Software Engineering

Publication Version

Accepted Manuscript

Link to Published Version

http://dx.doi.org/10.1145/2635868.2635924

Publication Date

11-11-2014

DOI

10.1145/2635868.2635924

Conference Title

The 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering

Conference Date

November 16-21, 2014

Abstract

Modern software relies on existing application programming interfaces (APIs) from libraries. Formal specifications for the APIs enable many software engineering tasks and help developers use them correctly. In this work, we mine large-scale repositories of existing open-source software to derive potential preconditions for API methods. Our key idea is that APIs’ preconditions would appear frequently in an ultra-large code corpus with a large number of API usages, while project-specific conditions will occur less frequently. First, we find all client methods invoking APIs. We then compute a control dependence relation from each call site and mine the potential conditions used to reach those call sites. We use these guard conditions as a starting point to automatically infer the preconditions for each API. We analyzed almost 120 million lines of code from SourceForge and Apache projects to infer preconditions for the standard Java Development Kit (JDK) library. The results show that our technique can achieve high accuracy, with recall of 75–80% and precision of 82–84%. We also found 5 preconditions missing from human-written specifications, all of which were confirmed by a specification expert. In a user study, participants found 82% of the mined preconditions to be a good starting point for writing specifications. Using our mining result, we also built a benchmark of more than 4,000 precondition-related bugs.
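The frequency idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the guard conditions have already been extracted per call site (the paper derives them from a control dependence analysis) and normalized into strings, and it counts distinct projects as one plausible reading of "project-specific conditions will occur less frequently". The records in `CALL_SITES`, the function name `mine_preconditions`, and the 0.5 support threshold are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy input: one record per call site, as
# (project, api_method, guard conditions on the path to the call).
# These are hand-written strings with arguments already normalized;
# they are illustrative and do not come from the paper's data.
CALL_SITES = [
    ("projA", "String.substring(int)", {"arg0 >= 0", "arg0 <= rcv.length()"}),
    ("projB", "String.substring(int)", {"arg0 >= 0", "debugEnabled"}),
    ("projC", "String.substring(int)", {"arg0 >= 0", "arg0 <= rcv.length()"}),
    ("projD", "String.substring(int)", set()),
    ("projA", "Iterator.next()", {"rcv.hasNext()"}),
    ("projB", "Iterator.next()", {"rcv.hasNext()"}),
]


def mine_preconditions(call_sites, min_support=0.5):
    """Return {api: [(condition, support), ...]} for frequent guards.

    Support is the fraction of projects calling the API in which the
    condition guards at least one call site. The 0.5 threshold is an
    illustrative assumption, not a value from the paper.
    """
    projects_per_api = defaultdict(set)
    projects_per_guard = defaultdict(lambda: defaultdict(set))
    for project, api, guards in call_sites:
        projects_per_api[api].add(project)
        for cond in guards:
            projects_per_guard[api][cond].add(project)

    mined = {}
    for api, by_cond in projects_per_guard.items():
        total = len(projects_per_api[api])
        kept = [
            (cond, len(projs) / total)
            for cond, projs in by_cond.items()
            if len(projs) / total >= min_support
        ]
        # Highest support first; ties broken alphabetically.
        mined[api] = sorted(kept, key=lambda item: (-item[1], item[0]))
    return mined


if __name__ == "__main__":
    for api, conds in mine_preconditions(CALL_SITES).items():
        print(api, conds)
```

In this toy corpus the guard `debugEnabled` appears in only one of four projects and is filtered out, while the argument checks that recur across projects survive as candidate preconditions.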

Comments

This article is published as Nguyen, Hoan Anh, Robert Dyer, Tien N. Nguyen, and Hridesh Rajan. "Mining preconditions of APIs in large-scale code corpus." In Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering, pp. 166-177. ACM, 2014. doi: 10.1145/2635868.2635924. Posted with permission.

Rights

© ACM, 2014. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering, pp. 166-177. ACM, 2014. https://doi.org/10.1145/2635868.2635924

Association for Computing Machinery

2014

Language

File Format

application/pdf

Citation Information

Hoan Anh Nguyen, Robert Dyer, Tien N. Nguyen and Hridesh Rajan. "Mining preconditions of APIs in large-scale code corpus" FSE 2014 Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering, Hong Kong, China (2014) p. 166-177
Available at: http://works.bepress.com/hridesh-rajan/101/