Collective personalized change classification with multiobjective search
IEEE Transactions on Reliability
  • Xin XIA
  • David LO, Singapore Management University
  • Xinyu WANG
  • Xiaohu YANG
Publication Type
Journal Article
Publication Date

Many change classification techniques have been proposed to identify defect-prone changes. These techniques consider all developers' historical change data to build a global prediction model. In practice, since developers have their own coding preferences and behavioral patterns, which cause different defect patterns, a separate change classification model for each developer can help to improve performance. Jiang, Tan, and Kim refer to this problem as personalized change classification, and they propose PCC+ to solve it. A software project has a number of developers; for a given developer, building a prediction model based not only on his/her change data but also on other relevant developers' change data can further improve the performance of change classification. In this paper, we propose a more accurate technique named collective personalized change classification (CPCC), which leverages a multiobjective genetic algorithm. For a project, CPCC first builds a personalized prediction model for each developer based on his/her historical data. Next, for each developer, CPCC combines these models by assigning different weights to them with the purpose of maximizing two objective functions (i.e., F1-score and cost effectiveness). To further improve the prediction accuracy, we propose CPCC+ by combining CPCC with PCC proposed by Jiang, Tan, and Kim. To evaluate the benefits of CPCC+ and CPCC, we perform experiments on six large software projects from different communities: Eclipse JDT, Jackrabbit, Linux kernel, Lucene, PostgreSQL, and Xorg. The experiment results show that CPCC+ can discover up to 245 more bugs than PCC+ (468 versus 223 for PostgreSQL) if developers inspect the top 20% lines of code that are predicted buggy. In addition, CPCC+ can achieve F1-scores of 0.60-0.75, which are statistically significantly higher than those of PCC+ on all six projects.
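The weighting step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the weighted-average combination rule, the 0.5 decision threshold, and the way cost effectiveness is counted (bugs found while inspecting the highest-scored changes within 20% of total lines of code) are all assumptions made for illustration. The genetic search itself is omitted; the sketch only evaluates the two objectives for one candidate weight vector and shows the Pareto-dominance test that a multiobjective search would typically rely on.

```python
from typing import List, Sequence, Tuple


def combined_score(weights: Sequence[float], model_probs: Sequence[float]) -> float:
    """Weighted average of the per-developer models' buggy-probabilities
    for one change (assumed combination rule)."""
    total = sum(weights)
    if total == 0:
        return 0.0
    return sum(w * p for w, p in zip(weights, model_probs)) / total


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall for the buggy class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def bugs_found_within_budget(y_true: Sequence[int], scores: Sequence[float],
                             loc: Sequence[int], budget: float = 0.2) -> int:
    """Count buggy changes found when inspecting changes in descending score
    order until the next change would exceed `budget` of total lines of code
    (assumed definition of cost effectiveness)."""
    limit = budget * sum(loc)
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    inspected = 0
    found = 0
    for i in order:
        if inspected + loc[i] > limit:
            break
        inspected += loc[i]
        found += y_true[i]
    return found


def evaluate(weights: Sequence[float], probs: List[Sequence[float]],
             y_true: Sequence[int], loc: Sequence[int],
             threshold: float = 0.5, budget: float = 0.2) -> Tuple[float, int]:
    """Return both objectives (F1-score, bugs found) for one weight vector."""
    scores = [combined_score(weights, row) for row in probs]
    preds = [1 if s >= threshold else 0 for s in scores]
    return f1_score(y_true, preds), bugs_found_within_budget(y_true, scores, loc, budget)


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if objective vector a is at least as good as b everywhere
    and strictly better somewhere (both objectives are maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


# Hypothetical toy data: 4 changes, probabilities from 2 developer models.
probs = [[0.9, 0.2], [0.1, 0.8], [0.6, 0.6], [0.2, 0.1]]
y_true = [1, 0, 1, 0]       # 1 = buggy change
loc = [10, 50, 20, 20]      # lines of code per change

objs_a = evaluate([1.0, 0.0], probs, y_true, loc)
objs_b = evaluate([0.0, 1.0], probs, y_true, loc)
print(objs_a, objs_b, dominates(objs_a, objs_b))
```

A search procedure would generate many such weight vectors per developer and keep those that no other candidate dominates; how candidates are generated and how a final vector is chosen from that set is left out here.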

  • Cost effectiveness
  • developer
  • machine learning
  • multiobjective genetic algorithm
  • personalized change classification (PCC)
Institute of Electrical and Electronics Engineers (IEEE)
Creative Commons License
Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International
Additional URL
Citation Information
Xin XIA, David LO, Xinyu WANG and Xiaohu YANG. "Collective personalized change classification with multiobjective search" IEEE Transactions on Reliability Vol. 65 Iss. 4 (2016) p. 1810 - 1829 ISSN: 0018-9529
Available at: