Using runtime execution to identify whether code is malware, and to which malware family it belongs, is an established technique in the security domain. Traditionally, literature has relied on explicit features derived from network, file system, or registry interaction. While effective, the collection and analysis of these fine-granularity data points makes the technique quite computationally expensive. Moreover, the signatures/heuristics this analysis produces are often easily circumvented by subsequent malware authors.
To this end, we propose "Babble", a system that is concerned only with the *order* in which high-level system events take place. Individual events are mapped onto an alphabet and execution traces are captured via terse concatenations of those letters. Then, leveraging an analysis labeled corpus of malware, n-gram document classification techniques are applied to produce a classifier predicting malware family. This poster describes that technique and its proof-of-concept evaluation. This work concentrates only on network ordering and 3 malware families are highlighted. We show the technique achieves roughly 80% accuracy in isolation and makes non-trivial performance improvements when integrated with a baseline classifier of non-ordered features.