Article
Statistical Comparative Analysis and Evaluation of Validation Indices for Clustering Optimization
2020 IEEE Symposium Series on Computational Intelligence, SSCI 2020
  • Thy Nguyen
  • Jason Viehman
  • Dacosta Yeboah
  • Gayla R. Olbricht, Missouri University of Science and Technology
  • Tayo Obafemi-Ajayi
Abstract

Clustering is a relevant exploratory tool for a broad range of machine learning applications, as it aids identification of meaningful subgroups. For a given clustering algorithm, multiple partitions can be obtained on the same data set by varying algorithmic parameters. Internal validation indices provide a means to objectively evaluate how well the groupings obtained from a clustering configuration partition the data, since there is no prior labeled data. This work presents a rigorous statistical evaluation framework that analyzes the performance of internal validation indices based on their correlation with external indices. A synthetic data generator that captures a wide range of complexity is proposed. Evaluation is conducted on a varied set of synthetic data types and real data sets to investigate the performance of the indices.
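
The idea of scoring an internal index by its correlation with an external index can be illustrated with a minimal sketch. This is not the authors' framework: the Gaussian-blob generator `make_blobs`, the plain k-means routine, and the choice of the silhouette width (internal), the Rand index (external), and Pearson correlation are all assumptions made here for illustration, since the abstract does not name the indices, generator, or statistic used.

```python
import math
import random
from itertools import combinations


def make_blobs(centers, n_per_cluster, spread, rng):
    """Hypothetical synthetic generator: isotropic 2-D Gaussian blobs."""
    points, labels = [], []
    for label, (cx, cy) in enumerate(centers):
        for _ in range(n_per_cluster):
            points.append((rng.gauss(cx, spread), rng.gauss(cy, spread)))
            labels.append(label)
    return points, labels


def kmeans(points, k, rng, n_iter=20):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    centroids = rng.sample(points, k)
    assign = [0] * len(points)
    for _ in range(n_iter):
        assign = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return assign


def silhouette(points, assign):
    """Internal index: mean silhouette width (uses no ground-truth labels)."""
    clusters = {}
    for p, a in zip(points, assign):
        clusters.setdefault(a, []).append(p)
    if len(clusters) < 2:
        return 0.0
    total = 0.0
    for p, a in zip(points, assign):
        own = clusters[a]
        if len(own) == 1:
            continue  # a singleton contributes a silhouette of 0
        a_dist = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b_dist = min(sum(math.dist(p, q) for q in other) / len(other)
                     for lab, other in clusters.items() if lab != a)
        total += (b_dist - a_dist) / max(a_dist, b_dist)
    return total / len(points)


def rand_index(truth, assign):
    """External index: fraction of point pairs on which two labelings agree."""
    pairs = list(combinations(range(len(truth)), 2))
    agree = sum((truth[i] == truth[j]) == (assign[i] == assign[j])
                for i, j in pairs)
    return agree / len(pairs)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Vary one algorithmic parameter (k) to obtain several partitions of the
# same labeled synthetic data, then correlate internal with external scores.
rng = random.Random(0)
points, truth = make_blobs([(0, 0), (6, 0), (3, 6)], 30, 0.8, rng)
internal, external = [], []
for k in range(2, 8):
    assign = kmeans(points, k, rng)
    internal.append(silhouette(points, assign))
    external.append(rand_index(truth, assign))
r = pearson(internal, external)
print(f"correlation between silhouette and Rand index: {r:.3f}")
```

Under these assumptions, a correlation near 1 would suggest that the internal index ranks partitions much as the ground-truth-based index does; a full evaluation would repeat this over many data sets, algorithms, and indices.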

Meeting Name
2020 IEEE Symposium Series on Computational Intelligence, SSCI (2020: Dec. 1-4, Canberra, ACT, Australia)
Department(s)
Mathematics and Statistics
Research Center/Lab(s)
Center for High Performance Computing Research
Keywords and Phrases
  • clustering
  • statistical analysis
  • validation indices
International Standard Book Number (ISBN)
978-172812547-3
Document Type
Article - Conference proceedings
Document Version
Citation
File Type
text
Language(s)
English
Rights
© 2020 Institute of Electrical and Electronics Engineers (IEEE), All rights reserved.
Publication Date
04 Dec 2020
Citation Information
Thy Nguyen, Jason Viehman, Dacosta Yeboah, Gayla R. Olbricht, et al. "Statistical Comparative Analysis and Evaluation of Validation Indices for Clustering Optimization" 2020 IEEE Symposium Series on Computational Intelligence, SSCI 2020 (2020) pp. 3081-3090
Available at: http://works.bepress.com/gayla-olbricht/64/