Mining Top-K Large Structural Patterns in a Massive NetworkProceedings of VLDB Endowment: 37th VLDB 2011, Seattle
Publication TypeConference Proceeding Article
AbstractWith ever-growing popularity of social networks, web and bio-networks, mining large frequent patterns from a single huge network has become increasingly important. Yet the existing pattern mining methods cannot oﬀer the eﬃciency desirable for large pattern discovery. We propose Spider- Mine, a novel algorithm to eﬃciently mine top-K largest frequent patterns from a single massive network with any user-speciﬁed probability of 1 − ϵ. Deviating from the existing edge-by-edge (i.e., incremental) pattern-growth framework, SpiderMine achieves its eﬃciency by unleashing the power of small patterns of a bounded diameter, which we call “spiders”. With the spider structure, our approach adopts a probabilistic mining framework to ﬁnd the top-k largest patterns by (i) identifying an aﬀordable set of promising growth paths toward large patterns, (ii) generating large patterns with much lower combinatorial complexity, and ﬁnally (iii) greatly reducing the cost of graph isomorphism tests with a new graph pattern representation by a multi-set of spiders. Extensive experimental studies on both synthetic and real data sets show that our algorithm outperforms existing methods.
City or CountrySaratoga, CA
Citation InformationFeida ZHU, Qiang QU, David LO, Xifeng YAN, et al.. "Mining Top-K Large Structural Patterns in a Massive Network" Proceedings of VLDB Endowment: 37th VLDB 2011, Seattle (2011) p. 807 - 818
Available at: http://works.bepress.com/david_lo/64/