Article
Clustering for malware classification
Journal of Computer Virology and Hacking Techniques (2017)
  • Swathi Pai, San Jose State University
  • Fabio Di Troia, Università degli Studi del Sannio
  • Corrado Aaron Visaggio, Università degli Studi del Sannio
  • Thomas H. Austin, San Jose State University
  • Mark Stamp, San Jose State University
Abstract
In this research, we apply clustering techniques to the malware classification problem. We compute clusters using the well-known K-means and Expectation Maximization algorithms, with the underlying scores based on Hidden Markov Models. We compare the results obtained from these two clustering approaches, and we carefully consider how the dimension (i.e., the number of models used for clustering) and the number of clusters interact to affect the accuracy of the clustering.
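As a rough illustration of the kind of experiment the abstract describes, the sketch below clusters score vectors with K-means and uses the silhouette coefficient to compare different numbers of clusters. It is not the authors' code. Everything specific in it is an assumption made for illustration: the score vectors are synthetic stand-ins for Hidden Markov Model scores (dimension 2, i.e., two hypothetical models), the three "family" centers are made up, and the deterministic farthest-point initialization is a convenience for reproducibility. Expectation Maximization clustering is not shown.

```python
import math
import random


def make_scores(per_family=20, seed=1):
    """Synthetic 2-D score vectors: one coordinate per (hypothetical) HMM."""
    rng = random.Random(seed)
    centers = [(-1.0, -9.0), (-9.0, -1.0), (-5.0, -5.0)]  # made-up family means
    return [
        (rng.gauss(cx, 0.25), rng.gauss(cy, 0.25))
        for cx, cy in centers
        for _ in range(per_family)
    ]


def init_centroids(points, k):
    """Farthest-point initialization (an assumption of this sketch)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, iters=100):
    """Lloyd's algorithm; returns one cluster label per point."""
    centroids = init_centroids(points, k)
    labels = []
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                updated.append(centroids[c])  # keep an empty cluster's centroid
        if updated == centroids:
            break
        centroids = updated
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient; points in singleton clusters score 0."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1 or len(clusters) < 2:
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for m, other in clusters.items()
            if m != lab
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


points = make_scores()
scores_by_k = {}
for k in range(2, 6):
    scores_by_k[k] = silhouette(points, kmeans(points, k))
best_k = max(scores_by_k, key=scores_by_k.get)
```

On this synthetic data the silhouette coefficient should be highest at k = 3, matching the three made-up families. Using more models as score sources would correspond to longer tuples in `make_scores`; the clustering and silhouette functions work unchanged for any dimension.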
Keywords
  • Receiver Operating Characteristic Curve
  • Hidden Markov Model
  • Expectation Maximization
  • Silhouette Coefficient
Publication Date
May, 2017
DOI
10.1007/s11416-016-0265-3
Citation Information
Swathi Pai, Fabio Di Troia, Corrado Aaron Visaggio, Thomas H. Austin, et al. "Clustering for malware classification" Journal of Computer Virology and Hacking Techniques Vol. 13 Iss. 2 (2017) p. 95-107 ISSN: 2274-2042
Available at: http://works.bepress.com/thomas_austin/25/