Combining Vertex-centric Graph Processing with SPARQL for Large-scale RDF Data Analytics
KAUST DepartmentComputer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
Computer Science Program
Computer Science, King Abdullah University of Science and Technology, 127355 Thuwal, Makkah Saudi Arabia 23955-6900
MetadataShow full item record
AbstractModern applications, such as drug repositioning, require sophisticated analytics on RDF graphs that combine structural queries with generic graph computations. Existing systems support either declarative SPARQL queries, or generic graph processing, but not both. We bridge the gap by introducing Spartex, a versatile framework for complex RDF analytics. Spartex extends SPARQL to support programs that combine seamlessly generic graph algorithms (e.g., PageRank, Shortest Paths, etc.) with SPARQL queries. Spartex builds on existing vertex-centric graph processing frameworks, such as Graphlab or Pregel. It implements a generic SPARQL operator as a vertex-centric program that interprets SPARQL queries and executes them efficiently using a built-in optimizer. In addition, any graph algorithm implemented in the underlying vertex-centric framework, can be executed in Spartex. We present various scenarios where our framework simplifies significantly the implementation of complex RDF data analytics programs. We demonstrate that Spartex scales to datasets with billions of edges, and show that our core SPARQL engine is at least as fast as the state-of-the-art specialized RDF engines. For complex analytical tasks that combine generic graph processing with SPARQL, Spartex is at least an order of magnitude faster than existing alternatives.
CitationAbdelaziz I, Al-Harbi MR, Salihoglu S, Kalnis P (2017) Combining Vertex-centric Graph Processing with SPARQL for Large-scale RDF Data Analytics. IEEE Transactions on Parallel and Distributed Systems: 1–1. Available: http://dx.doi.org/10.1109/TPDS.2017.2720174.