Concern localization refers to the process of locating code units that match a particular textual description. It takes as input textual documents such as bug reports and feature requests and outputs a list of candidate code units that need to be changed to address the bug reports or feature requests. Many information retrieval (IR) based concern localization techniques have been proposed in the literature. These techniques typically represent code units and textual descriptions as a bag of tokens at one level of abstraction, e.g., each token is a word, or each token is a topic. In this work, we propose multi-abstraction concern localization. A code unit and a textual description is represented at multiple abstraction levels. Similarity of a textual description and a code unit, is now made by considering all these abstraction levels. We have evaluated our solution on AspectJ bug reports and feature requests from the iBugs benchmark dataset. The experiment shows that our proposed approach outperforms a baseline approach, in terms of Mean Average Precision, by up to 19.36%.
- Text Retrieval,
- Concern Localization,
- Topic Model,
- Latent Dirichlet Allocation
Available at: http://works.bepress.com/david_lo/125/