Skip to main content
Article
An evaluation of streaming algorithms for distinct counting over a sliding window
Fronties in ICT
  • Sneha Aman Singh, Iowa State University
  • Srikanta Tirthapura, Iowa State University
Document Type
Article
Publication Date
11-1-2015
DOI
10.3389/fict.2015.00023
Abstract

Counting the number of distinct elements in a data stream (distinct counting) is a fundamental aggregation task in database query processing, query optimization, and network monitoring. On a stream of elements, it is commonly needed to compute an aggregate over only the most recent elements, leading to the problem of distinct counting over a “sliding window” of the stream. We present a detailed experimental study of the performance of different algorithms for distinct counting over a sliding window. We observe that the performance of an algorithm depends on the basic method used, as well as aspects such as the hash function, the mix of query and updates, and the method used to boost accuracy. We compare the performance of prominent algorithms and evaluate the influence of these factors, leading to practical recommendations for implementation. To the best of our knowledge, this is the first detailed experimental study of distinct counting over a sliding window.

Comments

This is an article from Frontiers in ICT 2 (2015): Article 23, doi: 10.3389/fict.2015.00023. Posted with permission.

Rights
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright Owner
Singh and Tirthapura
Language
en
File Format
application/pdf
Citation Information
Sneha Aman Singh and Srikanta Tirthapura. "An evaluation of streaming algorithms for distinct counting over a sliding window" Fronties in ICT Vol. 2 (2015) p. Article 23
Available at: http://works.bepress.com/srikanta-tirthapura/2/