With the increasing prevalence of data that model relationships between various entities, the use of a graph-based representation for real-world problems offers a logical strategy for organizing information and making knowledge-based decisions. In particular, often it is useful to identify the most frequent patterns or relationships amongst the data in a graph, which requires finding frequent subgraphs. Algorithms for addressing that problem have been proposed for over 15 years. In the worst case, all subgraphs in the graph must be examined, which is exponential in complexity, and subgraph isomorphisms must be computed, which is an NP-complete problem. Frequent subgraph algorithms may attempt to improve the actual runtime performance by reducing the size of the search space, avoiding duplicate comparisons, and/or minimizing the amount of memory required for compiling intermediate results. Herein we present a frequent subgraph mining algorithm that leverages mapping sets in order to eliminate the isomorphism computation during the search for non-edge-disjoint frequent subgraphs. Experimental results show that absence of isomorphism computation leads to much faster frequent subgraph detection when there is a need to identify all occurrences of those subgraphs.
- Algorithms,
- Artificial intelligence,
- Computational complexity,
- Graphic methods,
- Knowledge based systems,
- Learning systems,
- Mapping,
- Pattern recognition,
- Set theory,
- Frequent subgraph mining,
- Frequent subgraphs,
- Graph mining,
- Graph-based representations,
- Intermediate results,
- Model relationships,
- Run-time performance,
- Subgraph isomorphism,
- Data mining,
- Mapping sets
Available at: http://works.bepress.com/jennifer-leopold/25/