GraMi: Generalized Frequent Pattern Mining in a Single Large Graph

Handle URI:
http://hdl.handle.net/10754/189749
Title:
GraMi: Generalized Frequent Pattern Mining in a Single Large Graph
Authors:
Saeedy, Mohammed El; Kalnis, Panos ( 0000-0002-5060-1360 )
Abstract:
Mining frequent subgraphs is an important operation on graphs. Most existing work assumes a database of many small graphs, but modern applications, such as social networks, citation graphs or protein-protein interaction in bioinformatics, are modeled as a single large graph. Interesting interactions in such applications may be transitive (e.g., friend of a friend). Existing methods, however, search for frequent isomorphic (i.e., exact match) subgraphs and cannot discover many useful patterns. In this paper the authors propose GRAMI, a framework that generalizes frequent subgraph mining in a large single graph. GRAMI discovers frequent patterns, where a pattern is a graph whose edges are generalized to distance-constrained paths. Depending on the definition of the distance function, many instantiations of the framework are possible. Both directed and undirected graphs, as well as multiple labels per vertex, are supported. The authors developed an efficient implementation of the framework that models the frequency resolution phase as a constraint satisfaction problem, in order to avoid the costly enumeration of all instances of each pattern in the graph. The authors also implemented CGRAMI, a version that supports structural and semantic constraints; and AGRAMI, an approximate version that supports very large graphs. The experiments on real data demonstrate that the authors' framework is up to 3 orders of magnitude faster and discovers more interesting patterns than existing approaches.
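The idea of treating frequency resolution as a constraint satisfaction problem can be illustrated with a small sketch. This is not the authors' implementation. It assumes an undirected graph with one label per vertex, hop count as the distance function, and a support measure defined as the minimum, over pattern vertices, of the number of distinct graph nodes that take part in at least one match; the abstract specifies none of these choices. All names (`hop_distances`, `has_solution`, `is_frequent`, `max_hops`, `tau`) are hypothetical.

```python
from collections import deque


def hop_distances(adj, source, max_hops):
    """Breadth-first search: {node: hops} for nodes within max_hops of source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == max_hops:
            continue  # do not expand beyond the distance bound
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def has_solution(assignment, variables, domains, constraints, reach):
    """Backtracking: can this partial assignment be extended to a full match?"""
    if len(assignment) == len(variables):
        return True
    var = next(v for v in variables if v not in assignment)
    for node in domains[var]:
        if node in assignment.values():
            continue  # distinct pattern vertices map to distinct graph nodes
        trial = {**assignment, var: node}
        # Each pattern edge (a, b) requires b's node to be reachable from a's node.
        consistent = all(
            trial[b] in reach[trial[a]]
            for a, b in constraints
            if a in trial and b in trial
        )
        if consistent and has_solution(trial, variables, domains, constraints, reach):
            return True
    return False


def is_frequent(adj, labels, pattern_labels, pattern_edges, max_hops, tau):
    """True if every pattern vertex has at least tau valid graph nodes.

    Assumed support measure (not taken from the abstract): the minimum number
    of distinct valid nodes over all pattern vertices. The search stops as soon
    as tau valid nodes are found, so matches are never fully enumerated.
    """
    variables = list(pattern_labels)
    domains = {
        var: [n for n in adj if labels[n] == lab]
        for var, lab in pattern_labels.items()
    }
    reach = {n: set(hop_distances(adj, n, max_hops)) - {n} for n in adj}
    for var in variables:
        valid = 0
        for node in domains[var]:
            if has_solution({var: node}, variables, domains, pattern_edges, reach):
                valid += 1
                if valid >= tau:
                    break
        if valid < tau:
            return False
    return True


# Toy graph: two chains A - B - C. No A is directly connected to any C.
adj = {
    "a1": ["b1"], "b1": ["a1", "c1"], "c1": ["b1"],
    "a2": ["b2"], "b2": ["a2", "c2"], "c2": ["b2"],
}
labels = {"a1": "A", "b1": "B", "c1": "C", "a2": "A", "b2": "B", "c2": "C"}
pattern_labels = {"x": "A", "y": "C"}  # pattern: an A within max_hops of a C
pattern_edges = [("x", "y")]

print(is_frequent(adj, labels, pattern_labels, pattern_edges, max_hops=2, tau=2))
print(is_frequent(adj, labels, pattern_labels, pattern_edges, max_hops=1, tau=2))
```

In the toy graph, the pattern "an A connected to a C" has no exact-edge match, so the second call (`max_hops=1`) reports it as infrequent. Relaxing the edge to a path of at most two hops lets the first call find it, which mirrors the "friend of a friend" case in the abstract.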
KAUST Department:
Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
Issue Date:
Nov-2011
Type:
Technical Report
Appears in Collections:
Technical Reports; Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division

Full metadata record

DC Field | Value | Language
dc.contributor.author | Saeedy, Mohammed El | en
dc.contributor.author | Kalnis, Panos | en
dc.date.accessioned | 2011-11-15T23:08:42Z | -
dc.date.available | 2011-11-15T23:08:42Z | -
dc.date.issued | 2011-11 | en
dc.identifier.uri | http://hdl.handle.net/10754/189749 | en
dc.description.abstract | Mining frequent subgraphs is an important operation on graphs. Most existing work assumes a database of many small graphs, but modern applications, such as social networks, citation graphs or protein-protein interaction in bioinformatics, are modeled as a single large graph. Interesting interactions in such applications may be transitive (e.g., friend of a friend). Existing methods, however, search for frequent isomorphic (i.e., exact match) subgraphs and cannot discover many useful patterns. In this paper the authors propose GRAMI, a framework that generalizes frequent subgraph mining in a large single graph. GRAMI discovers frequent patterns, where a pattern is a graph whose edges are generalized to distance-constrained paths. Depending on the definition of the distance function, many instantiations of the framework are possible. Both directed and undirected graphs, as well as multiple labels per vertex, are supported. The authors developed an efficient implementation of the framework that models the frequency resolution phase as a constraint satisfaction problem, in order to avoid the costly enumeration of all instances of each pattern in the graph. The authors also implemented CGRAMI, a version that supports structural and semantic constraints; and AGRAMI, an approximate version that supports very large graphs. The experiments on real data demonstrate that the authors' framework is up to 3 orders of magnitude faster and discovers more interesting patterns than existing approaches. | en
dc.language.iso | en | en
dc.title | GraMi: Generalized Frequent Pattern Mining in a Single Large Graph | en
dc.type | Technical Report | en
dc.contributor.department | Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division | en
dc.contributor.affiliation | King Abdullah University of Science and Technology (KAUST) | en