Article
Clustering classes in packages for program comprehension
Scientific Programming
  • Xiaobing SUN, Yangzhou University
  • Xiangyue LIU, Yangzhou University
  • Bin LI, Yangzhou University
  • Bixin LI, Southeast University
  • David LO, Singapore Management University
  • Lingzhi LIAO, Nanjing University
Publication Type
Journal Article
Version
publishedVersion
Publication Date
4-2017
Abstract

During software maintenance and evolution, developers must understand a system quickly and accurately. As an evolving system grows in size and complexity, program comprehension becomes increasingly difficult. Given a target system, developers may first focus on comprehending its packages, which vary in size. Small packages are easy to comprehend, but large packages are difficult to understand. In this article, we focus on these large packages and propose a novel program comprehension approach that uses the Latent Dirichlet Allocation (LDA) model to cluster the classes in each large package. Each large package is thereby separated into small clusters, which are easier for developers to comprehend. Empirical studies on four real-world software projects demonstrate the effectiveness of our approach. The results show that our approach is more effective than clustering approaches based on Latent Semantic Indexing (LSI) and Probabilistic Latent Semantic Analysis (PLSA). In addition, we find that the topic that labels each cluster is useful for program comprehension.
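
The idea summarized in the abstract, clustering the classes of a large package by topic, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each class has already been reduced to a list of identifier and comment tokens, that LDA is fitted with a plain collapsed Gibbs sampler, and that each class is assigned to its single dominant topic. The function name `cluster_classes`, the hyperparameters `alpha` and `beta`, and the toy class names are all hypothetical.

```python
import random
from collections import defaultdict


def cluster_classes(docs, num_topics, iterations=200, alpha=0.1, beta=0.01,
                    seed=0, top_n=3):
    """Cluster classes with LDA fitted by collapsed Gibbs sampling.

    docs maps a class name to its token list (an assumed preprocessing step).
    Returns (clusters, labels): topic id -> class names, topic id -> top words.
    """
    rng = random.Random(seed)
    names = list(docs)
    vocab = sorted({w for tokens in docs.values() for w in tokens})
    word_id = {w: i for i, w in enumerate(vocab)}
    corpus = [[word_id[w] for w in docs[name]] for name in names]
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in corpus]      # tokens per (class, topic)
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assignments = []

    # Start from a random topic for every token.
    for d, words in enumerate(corpus):
        row = []
        for w in words:
            z = rng.randrange(num_topics)
            row.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(row)

    # Resample each token's topic conditioned on all other assignments.
    for _ in range(iterations):
        for d, words in enumerate(corpus):
            for i, w in enumerate(words):
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    # Hard assignment (an assumption): each class joins its dominant topic.
    clusters = defaultdict(list)
    for d, name in enumerate(names):
        dominant = max(range(num_topics), key=lambda k: doc_topic[d][k])
        clusters[dominant].append(name)

    # Label each cluster with the most frequent words of its topic.
    labels = {}
    for k in clusters:
        ranked = sorted(range(vocab_size), key=lambda w: topic_word[k][w],
                        reverse=True)
        labels[k] = [vocab[w] for w in ranked[:top_n]]
    return dict(clusters), labels


# Toy "large package" with hypothetical classes and tokens.
example_docs = {
    "HttpClient": ["http", "request", "socket", "send", "request", "http"],
    "HttpServer": ["http", "socket", "listen", "request", "response"],
    "XmlParser": ["xml", "parse", "node", "tag", "parse", "xml"],
    "XmlWriter": ["xml", "node", "tag", "write", "xml"],
}
clusters, labels = cluster_classes(example_docs, num_topics=2)
for topic, members in clusters.items():
    print(topic, labels[topic], members)
```

The number of topics plays the role of the number of clusters and has to be chosen by the user in this sketch. The top words of each topic serve as a rough cluster label, in the spirit of the topic labels mentioned in the abstract.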

Keywords
  • Based clustering
  • Empirical studies
  • Latent Dirichlet allocation
  • Latent semantic indexing
  • Probabilistic latent semantic analysis
  • Program comprehension
  • Software maintenance and evolution
  • Software project
Identifier
10.1155/2017/3787053
Publisher
IOS Press / Hindawi Publishing Corporation
Copyright Owner and License
Authors
Creative Commons License
Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International
Additional URL
https://doi.org/10.1155/2017/3787053
Citation Information
Xiaobing SUN, Xiangyue LIU, Bin LI, Bixin LI, et al. "Clustering classes in packages for program comprehension" Scientific Programming Vol. 2017 (2017) p. 3787053: 1-15 ISSN: 1058-9244
Available at: http://works.bepress.com/david_lo/215/