Skip to main content
Article
L2Knng: Fast Exact K-Nearest Neighbor Graph Construction with L2-Norm Pruning
Proceedings of the 24th ACM International on Conference on Information and Knowledge Management (CIKM 2015) (2015)
  • David C. Anastasiu, University of Minnesota - Twin Cities
  • George Karypis, University of Minnesota - Twin Cities
Abstract
The k-nearest neighbor graph is often used as a building block in information retrieval, clustering, online advertising, and recommender systems algorithms. The complexity of constructing the exact k-nearest neighbor graph is quadratic on the number of objects that are compared, and most existing methods solve the problem
approximately. We present L2Knng, an efficient algorithm that finds the exact cosine similarity k-nearest neighbor graph for a set of sparse high-dimensional objects. Our algorithm quickly builds an approximate solution to the problem, identifying many of the most similar neighbors, and then uses theoretic bounds on the similarity of two vectors, based on the l2-norm of part of the vectors, to find each object’s exact k-neighborhood. We perform an extensive evaluation of our algorithm, comparing against both exact and approximate baselines, and demonstrate the efficiency of our method across a variety of real-world datasets and neighborhood sizes. Our approximate and exact L2Knng variants compute the k-nearest neighbor graph up to an order of magnitude faster than their respective baselines.
Keywords
  • k-nearest neighbor graph,
  • similarity search,
  • top-k,
  • cosine similarity
Publication Date
October, 2015
DOI
10.1145/2806416.2806534
Publisher Statement
This is the Accepted Manuscript of an article that appeared in the Proceedings of the 24th ACM International on Conference on Information and Knowledge Management. The Version of Record is available at the following link.

Citation Information
David C. Anastasiu and George Karypis. "L2Knng: Fast Exact K-Nearest Neighbor Graph Construction with L2-Norm Pruning" Proceedings of the 24th ACM International on Conference on Information and Knowledge Management (CIKM 2015) (2015) p. 791 - 800
Available at: http://works.bepress.com/david-anastasiu/7/