Skip to main content
Contribution to Book
PL2AP: Fast Parallel Cosine Similarity Search
IA3 '15: Proceedings of the 5th Workshop on Irregular Applications: Architectures and Algorithms (2015)
  • David C. Anastasiu, University of Minnesota - Twin Cities
  • George Karypis, University of Minnesota - Twin Cities
Abstract
Solving the AllPairs similarity search problem entails finding all pairs of vectors in a high dimensional sparse dataset that have a similarity value higher than a given threshold. The output form this problem is a crucial component in many real-world applications, such as clustering, online advertising, recommender systems, near-duplicate document detection, and query refinement. A number of serial algorithms have been proposed that solve the problem by pruning many of the possible similarity candidates for each query object, after accessing only a few of their non-zero values. The pruning process results in unpredictable memory access patterns that can reduce search efficiency. In this context, we introduce pL2AP, which efficiently solves the AllPairs cosine similarity search problem in a multi-core environment. Our method uses a number of cache-tiling optimizations, combined with fine-grained dynamically balanced parallel tasks, to solve the problem 1.5x-238x faster than existing parallel baselines on datasets with hundreds of millions of non-zeros.
Keywords
  • similarity search,
  • similarity join,
  • bounded cosine similarity graph,
  • cosine similarity
Publication Date
November, 2015
Publisher
ACM
ISBN
978-1-4503-4001-4
DOI
10.1145/2833179.2833182
Publisher Statement
This is the Accepted Manuscript of an article that appeared in Proceedings of the 5th Workshop on Irregular Applications: Architectures and Algorithms. The Version of Record is available at the following link.

Citation Information
David C. Anastasiu and George Karypis. "PL2AP: Fast Parallel Cosine Similarity Search" New York, NYIA3 '15: Proceedings of the 5th Workshop on Irregular Applications: Architectures and Algorithms (2015)
Available at: http://works.bepress.com/david-anastasiu/15/