Article
A Comparative Study of Threshold-based Feature Selection Techniques
Proceedings of 2010 IEEE International Conference on Granular Computing (GrC 2010) (2010)
  • Huanjing Wang, Western Kentucky University
  • Taghi M. Khoshgoftaar, Florida Atlantic University
  • Jason Van Hulse, Florida Atlantic University
Abstract
Given high-dimensional software measurement data, researchers and practitioners often use feature (metric) selection techniques to improve the performance of software quality classification models. This paper presents our newly proposed threshold-based feature selection techniques and compares their performance by building classification models with five commonly used classifiers. To assess the effectiveness of the different feature selection techniques, the models are evaluated separately using eight performance metrics, since a given performance metric usually captures only one aspect of classification performance. All experiments are conducted on three Eclipse data sets with different levels of class imbalance. The experiments demonstrate that the choice of performance metric may significantly influence the results. In this study, we found four distinct patterns when using the eight performance metrics to order the 11 threshold-based feature selection techniques. Moreover, the performance of the software quality models either improves or remains unchanged despite the removal of over 96% of the software metrics (attributes).
Keywords
  • performance metrics
  • threshold-based feature selection technique
  • software metrics
  • classification
Publication Date
August 2010
Citation Information
Huanjing Wang, Taghi M. Khoshgoftaar and Jason Van Hulse. "A Comparative Study of Threshold-based Feature Selection Techniques" Proceedings of 2010 IEEE International Conference on Granular Computing (GrC 2010) (2010)
Available at: http://works.bepress.com/huanjing_wang/15/