Article
Gene encoder: a feature selection technique through unsupervised deep learning-based clustering for large gene expression data
Neural Computing and Applications
  • Uzma, Ghulam Ishaq Khan Institute of Engineering Sciences and Technology
  • Feras Al-Obeidat, Zayed University
  • Abdallah Tubaishat, Zayed University
  • Babar Shah, Zayed University
  • Zahid Halim, Ghulam Ishaq Khan Institute of Engineering Sciences and Technology
Document Type
Article
Publication Date
1-1-2020
Abstract

© 2020, Springer-Verlag London Ltd., part of Springer Nature. Cancer is a severe condition of uncontrolled cell division that results in tumor formation that spreads to other tissues of the body. New medications and treatment methods are therefore in demand. Classification of microarray data plays a vital role in this effort, and selecting the relevant genes is an important step in that classification. This work presents gene encoder, an unsupervised two-stage feature selection technique for classifying cancer samples. The first stage aggregates three filter methods: principal component analysis, correlation, and spectral-based feature selection. The second stage applies a genetic algorithm that evaluates each chromosome using autoencoder-based clustering. The resulting feature subset is used for the classification task. Three classifiers (support vector machine, k-nearest neighbors, and random forest) are used to avoid dependence on any one classifier. Six benchmark gene expression datasets are used for performance evaluation, and a comparison is made with four state-of-the-art related algorithms. Three sets of experiments are carried out to evaluate the proposed method: evaluating the selected features through sample-based clustering, adjusting the parameters to optimal values, and selecting the better-performing classifier. The comparison is based on accuracy, recall, false positive rate, precision, F-measure, and entropy. The results suggest that the proposed method performs better than the compared algorithms.
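
The six comparison measures named in the abstract can be illustrated with a minimal sketch. The abstract does not give the formulas the authors used, so the code below assumes the standard two-class confusion-matrix definitions and a size-weighted Shannon entropy of class labels within each cluster. The function names `binary_metrics` and `cluster_entropy` are hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def binary_metrics(y_true, y_pred, positive=1):
    """Confusion-matrix metrics for a two-class problem (assumed definitions)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    def ratio(num, den):
        # Return 0.0 instead of dividing by zero when a denominator is empty.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "recall": recall,
        "false_positive_rate": ratio(fp, fp + tn),
        "precision": precision,
        # F-measure taken here as the harmonic mean of precision and recall.
        "f_measure": ratio(2 * precision * recall, precision + recall),
    }


def cluster_entropy(cluster_ids, class_labels):
    """Size-weighted mean entropy (in bits) of the class labels in each cluster.

    Assumed definition: 0.0 means every cluster contains a single class.
    """
    members = {}
    for cluster, label in zip(cluster_ids, class_labels):
        members.setdefault(cluster, []).append(label)

    n = len(cluster_ids)
    total = 0.0
    for labels in members.values():
        size = len(labels)
        counts = Counter(labels)
        h = -sum((k / size) * math.log2(k / size) for k in counts.values())
        total += (size / n) * h
    return total
```

As a usage example, `binary_metrics([1, 1, 1, 0, 0, 0, 0, 1], [1, 1, 0, 0, 0, 1, 0, 1])` has three true positives, three true negatives, one false positive, and one false negative, so accuracy, recall, precision, and F-measure are all 0.75 and the false positive rate is 0.25. Multi-class datasets would need a per-class or averaged variant, which this sketch does not cover.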

Publisher
Springer
Keywords
  • Clustering
  • Deep learning
  • Gene expression
  • Genetic algorithm
  • Unsupervised learning
Scopus ID
85086476667
Indexed in Scopus
Yes
Open Access
No
https://doi.org/10.1007/s00521-020-05101-4
Citation Information
Uzma, Feras Al-Obeidat, Abdallah Tubaishat, Babar Shah, et al. "Gene encoder: a feature selection technique through unsupervised deep learning-based clustering for large gene expression data" Neural Computing and Applications (2020) - 23 ISSN: 0941-0643
Available at: http://works.bepress.com/babar-shah/39/