Combining Vertex-centric Graph Processing with SPARQL for Large-scale RDF Data Analytics

Handle URI:
http://hdl.handle.net/10754/625160
Title:
Combining Vertex-centric Graph Processing with SPARQL for Large-scale RDF Data Analytics
Authors:
Abdelaziz, Ibrahim (0000-0003-1449-5115); Al-Harbi, Mohammad Razen; Salihoglu, Semih; Kalnis, Panos (0000-0002-5060-1360)
Abstract:
Modern applications, such as drug repositioning, require sophisticated analytics on RDF graphs that combine structural queries with generic graph computations. Existing systems support either declarative SPARQL queries or generic graph processing, but not both. We bridge the gap by introducing Spartex, a versatile framework for complex RDF analytics. Spartex extends SPARQL to support programs that seamlessly combine generic graph algorithms (e.g., PageRank and Shortest Paths) with SPARQL queries. Spartex builds on existing vertex-centric graph processing frameworks, such as GraphLab or Pregel. It implements a generic SPARQL operator as a vertex-centric program that interprets SPARQL queries and executes them efficiently using a built-in optimizer. In addition, any graph algorithm implemented in the underlying vertex-centric framework can be executed in Spartex. We present various scenarios where our framework significantly simplifies the implementation of complex RDF data analytics programs. We demonstrate that Spartex scales to datasets with billions of edges, and show that our core SPARQL engine is at least as fast as state-of-the-art specialized RDF engines. For complex analytical tasks that combine generic graph processing with SPARQL, Spartex is at least an order of magnitude faster than existing alternatives.
KAUST Department:
Computer Science, King Abdullah University of Science and Technology, 127355 Thuwal, Makkah Saudi Arabia 23955-6900
Citation:
Abdelaziz I, Al-Harbi MR, Salihoglu S, Kalnis P (2017) Combining Vertex-centric Graph Processing with SPARQL for Large-scale RDF Data Analytics. IEEE Transactions on Parallel and Distributed Systems: 1–1. Available: http://dx.doi.org/10.1109/TPDS.2017.2720174.
Publisher:
Institute of Electrical and Electronics Engineers (IEEE)
Journal:
IEEE Transactions on Parallel and Distributed Systems
Issue Date:
27-Jun-2017
DOI:
10.1109/TPDS.2017.2720174
Type:
Article
ISSN:
1045-9219
Additional Links:
http://ieeexplore.ieee.org/document/7959641/
Appears in Collections:
Articles

Full metadata record

DC Field | Value | Language
dc.contributor.author | Abdelaziz, Ibrahim | en
dc.contributor.author | Al-Harbi, Mohammad Razen | en
dc.contributor.author | Salihoglu, Semih | en
dc.contributor.author | Kalnis, Panos | en
dc.date.accessioned | 2017-07-06T09:43:05Z | -
dc.date.available | 2017-07-06T09:43:05Z | -
dc.date.issued | 2017-06-27 | en
dc.identifier.citation | Abdelaziz I, Al-Harbi MR, Salihoglu S, Kalnis P (2017) Combining Vertex-centric Graph Processing with SPARQL for Large-scale RDF Data Analytics. IEEE Transactions on Parallel and Distributed Systems: 1–1. Available: http://dx.doi.org/10.1109/TPDS.2017.2720174. | en
dc.identifier.issn | 1045-9219 | en
dc.identifier.doi | 10.1109/TPDS.2017.2720174 | en
dc.identifier.uri | http://hdl.handle.net/10754/625160 | -
dc.description.abstract | Modern applications, such as drug repositioning, require sophisticated analytics on RDF graphs that combine structural queries with generic graph computations. Existing systems support either declarative SPARQL queries or generic graph processing, but not both. We bridge the gap by introducing Spartex, a versatile framework for complex RDF analytics. Spartex extends SPARQL to support programs that seamlessly combine generic graph algorithms (e.g., PageRank and Shortest Paths) with SPARQL queries. Spartex builds on existing vertex-centric graph processing frameworks, such as GraphLab or Pregel. It implements a generic SPARQL operator as a vertex-centric program that interprets SPARQL queries and executes them efficiently using a built-in optimizer. In addition, any graph algorithm implemented in the underlying vertex-centric framework can be executed in Spartex. We present various scenarios where our framework significantly simplifies the implementation of complex RDF data analytics programs. We demonstrate that Spartex scales to datasets with billions of edges, and show that our core SPARQL engine is at least as fast as state-of-the-art specialized RDF engines. For complex analytical tasks that combine generic graph processing with SPARQL, Spartex is at least an order of magnitude faster than existing alternatives. | en
dc.publisher | Institute of Electrical and Electronics Engineers (IEEE) | en
dc.relation.url | http://ieeexplore.ieee.org/document/7959641/ | en
dc.rights | (c) 2017 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works. | en
dc.subject | RDF data | en
dc.subject | Graph Analytics | en
dc.subject | SPARQL | en
dc.subject | Vertex-centric | en
dc.title | Combining Vertex-centric Graph Processing with SPARQL for Large-scale RDF Data Analytics | en
dc.type | Article | en
dc.contributor.department | Computer Science, King Abdullah University of Science and Technology, 127355 Thuwal, Makkah Saudi Arabia 23955-6900 | en
dc.identifier.journal | IEEE Transactions on Parallel and Distributed Systems | en
dc.eprint.version | Post-print | en
dc.contributor.institution | Petroleum Engineering Application Services Department, Saudi Arabian Oil Co, 36516 Dhahran, Riyadh Saudi Arabia | en
dc.contributor.institution | David Cheriton School of Computer Science, University of Waterloo, 8430 Waterloo, Ontario Canada N2J 4Z1 | en
kaust.author | Abdelaziz, Ibrahim | en
kaust.author | Kalnis, Panos | en
All Items in KAUST are protected by copyright, with all rights reserved, unless otherwise indicated.