Article
Condensing class diagrams by analyzing design and network metrics using optimistic classification
ICPC 2014: Proceedings of the 22nd International Conference on Program Comprehension: Hyderabad, India, June 2-3, 2014
  • Ferdian Thung, Singapore Management University
  • David LO, Singapore Management University
  • Mohd Hafeez Osman, Leiden University
  • Michel R.V. Chaudron, Chalmers University of Technology
Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
6-2014
Abstract

A class diagram of a software system enhances our ability to understand software design. However, this diagram is often unavailable. Developers usually reconstruct the diagram by reverse engineering it from source code. Unfortunately, the resultant diagram is often very cluttered, making it difficult to learn anything valuable from it. Thus, it would be very beneficial to condense the reverse-engineered class diagram so that it contains only the important classes depicting the overall design of a software system. Such a diagram would make program understanding much easier. A class can be important, for example, if its removal would break many connections between classes. In our work, we estimate this kind of importance by using design metrics (e.g., number of attributes, number of dependencies) and network metrics (e.g., betweenness centrality, closeness centrality). We use these metrics as features and input their values to our optimistic classifier, which predicts whether a class is important or not. Unlike standard classification, our newly proposed optimistic classification technique deals with the data scarcity problem by optimistically assigning labels to some of the unlabeled data and using them to train a better statistical model. We evaluated our approach by condensing the reverse-engineered diagrams of 9 software systems and compared it with the state-of-the-art work of Osman et al. Our experiments show that our approach can achieve an average Area Under the Receiver Operating Characteristic Curve (AUC) score of 0.825, which is a 9.1% improvement over the state-of-the-art approach.
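
The general idea in the abstract (metrics as features, plus tentative labels for unlabeled classes) can be illustrated with a minimal sketch. This is not the authors' optimistic classifier, whose details are not given on this page; it swaps in a generic pseudo-labeling round with a nearest-centroid model. The class names, the dependency graph, the choice of two features (degree and closeness centrality), and the `margin` threshold are all hypothetical.

```python
from collections import deque

# Hypothetical undirected dependency graph between classes (illustration only).
GRAPH = {
    "Order": ["Customer", "Item", "Invoice"],
    "Customer": ["Order"],
    "Item": ["Order"],
    "Invoice": ["Order", "Logger"],
    "Logger": ["Invoice"],
}

def closeness(graph, node):
    """Closeness centrality: reachable nodes divided by the sum of BFS distances."""
    dist = {node: 0}
    queue = deque([node])
    while queue:
        cur = queue.popleft()
        for nxt in graph[cur]:
            if nxt not in dist:
                dist[nxt] = dist[cur] + 1
                queue.append(nxt)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0

def features(graph, node):
    """Two example features: degree (a design-style count) and closeness."""
    return [float(len(graph[node])), closeness(graph, node)]

def centroid(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def pseudo_label_train(labeled, unlabeled, margin=0.5):
    """One pseudo-labeling round (an assumed stand-in, not the paper's method).

    labeled: list of (feature_vector, label) with label 1 = important, 0 = not.
    unlabeled: list of feature vectors.
    Returns the (unimportant, important) centroids after adopting confident guesses.
    """
    pos = [f for f, y in labeled if y == 1]
    neg = [f for f, y in labeled if y == 0]
    c1, c0 = centroid(pos), centroid(neg)
    for f in unlabeled:
        d1, d0 = sq_dist(f, c1), sq_dist(f, c0)
        # Adopt a tentative label only when one centroid is clearly closer.
        if abs(d1 - d0) >= margin:
            (pos if d1 < d0 else neg).append(f)
    return centroid(neg), centroid(pos)

def predict(model, f):
    c0, c1 = model
    return 1 if sq_dist(f, c1) < sq_dist(f, c0) else 0

# Only two classes carry labels here, mimicking scarce training data.
LABELED = [(features(GRAPH, "Order"), 1), (features(GRAPH, "Logger"), 0)]
UNLABELED = [features(GRAPH, n) for n in ("Customer", "Item", "Invoice")]
MODEL = pseudo_label_train(LABELED, UNLABELED)
```

Calling `predict(MODEL, features(GRAPH, name))` then gives a 0/1 guess for any class in the graph. The features are left unscaled for brevity, so degree dominates the distance; a real pipeline would normalize them and use a proper classifier.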

Keywords
  • Design Metrics
  • Network Metrics
  • Optimistic Classification
  • Important Classes
ISBN
9781450328791
Identifier
10.1145/2597008.2597157
Publisher
ACM
City or Country
New York
Copyright Owner and License
Authors
Creative Commons License
Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International
Additional URL
http://doi.org/10.1145/2597008.2597157
Citation Information
Ferdian Thung, David LO, Mohd Hafeez Osman and Michel R.V. Chaudron. "Condensing class diagrams by analyzing design and network metrics using optimistic classification." ICPC 2014: Proceedings of the 22nd International Conference on Program Comprehension: Hyderabad, India, June 2-3, 2014 (2014), pp. 110-121
Available at: http://works.bepress.com/david_lo/228/