Scaling Techniques for Massive Scale-Free Graphs in Distributed (External) Memory

Handle URI:
http://hdl.handle.net/10754/599561
Title:
Scaling Techniques for Massive Scale-Free Graphs in Distributed (External) Memory
Authors:
Pearce, Roger; Gokhale, Maya; Amato, Nancy M.
Abstract:
We present techniques to process large scale-free graphs in distributed memory. Our aim is to scale to trillions of edges, and our research is targeted at leadership-class supercomputers and clusters with local non-volatile memory, e.g., NAND Flash. We apply an edge list partitioning technique, designed to accommodate high-degree vertices (hubs) that create scaling challenges when processing scale-free graphs. In addition to partitioning hubs, we use ghost vertices to represent the hubs to reduce communication hotspots. We present a scaling study with three important graph algorithms: Breadth-First Search (BFS), K-Core decomposition, and Triangle Counting. We also demonstrate scalability on BG/P Intrepid by comparing to the best known Graph500 results. We show results on two clusters with local NVRAM storage that are capable of traversing trillion-edge scale-free graphs. By leveraging node-local NAND Flash, our approach can process thirty-two times larger datasets with only a 39% performance degradation in Traversed Edges Per Second (TEPS). © 2013 IEEE.
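The hub-aware edge-list partitioning the abstract describes can be sketched roughly as follows. This is a minimal, serial Python illustration, not the paper's implementation: the degree threshold, the round-robin scattering of hub edges, and all function and variable names are assumptions chosen for clarity. The key idea it mirrors is that edges of high-degree vertices are spread across partitions, with each receiving partition holding a lightweight "ghost" copy of the hub so updates can be aggregated locally instead of all converging on one owner.

```python
from collections import defaultdict


def partition_edges(edges, num_parts, hub_threshold):
    """Distribute a directed edge list across num_parts partitions.

    Low-degree vertices keep all their out-edges on the single
    partition chosen by a hash of the source (owner-computes).
    Vertices whose out-degree reaches hub_threshold are treated as
    hubs: their edges are scattered round-robin across partitions,
    and every partition that receives a hub edge records the hub as
    a local ghost vertex, so per-hub traffic is spread out rather
    than concentrated on one partition.
    """
    # First pass: out-degrees, to identify hubs.
    degree = defaultdict(int)
    for src, _dst in edges:
        degree[src] += 1
    hubs = {v for v, d in degree.items() if d >= hub_threshold}

    parts = [[] for _ in range(num_parts)]
    ghosts = [set() for _ in range(num_parts)]
    rr = 0  # round-robin counter for hub-edge scattering
    for src, dst in edges:
        if src in hubs:
            p = rr % num_parts        # scatter hub edges evenly
            ghosts[p].add(src)        # partition p keeps a ghost of the hub
            rr += 1
        else:
            p = hash(src) % num_parts  # hash placement for non-hubs
        parts[p].append((src, dst))
    return parts, ghosts, hubs
```

On a small star graph (one vertex with many out-edges), the hub's edges end up balanced across partitions and every partition records a ghost for it, while low-degree vertices stay wholly on their owning partition.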
Citation:
Pearce R, Gokhale M, Amato NM (2013) Scaling Techniques for Massive Scale-Free Graphs in Distributed (External) Memory. 2013 IEEE 27th International Symposium on Parallel and Distributed Processing. Available: http://dx.doi.org/10.1109/IPDPS.2013.72.
Publisher:
Institute of Electrical and Electronics Engineers (IEEE)
Journal:
2013 IEEE 27th International Symposium on Parallel and Distributed Processing
KAUST Grant Number:
KUS-C1-016-04
Issue Date:
May-2013
DOI:
10.1109/IPDPS.2013.72
Type:
Conference Paper
Sponsors:
This work was partially performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344 (LLNL-CONF-588232). Funding was partially provided by LDRD 11-ERD-008. Portions of the experiments were performed using Livermore Computing facility resources. This research used resources of the Argonne Leadership Computing Facility (ALCF) at Argonne National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under contract DE-AC02-06CH11357. ALCF resources were provided through an INCITE 2012 award for the Fault-Oblivious Exascale Computing Environment project. This research was supported in part by NSF awards CNS-0615267, CCF-0833199, CCF-0830753, IIS-0917266, IIS-0916053, NSF/DNDO award 2008-DN-077-ARI018-02, by DOE awards DE-FC52-08NA28616, DE-AC02-06CH11357, B575363, B575366, by THECB NHARP award 000512-0097-2009, by Samsung, Chevron, IBM, Intel, Oracle/Sun, and by Award KUS-C1-016-04, made by King Abdullah University of Science and Technology (KAUST). Pearce is supported in part by a Lawrence Scholar fellowship at LLNL.
Appears in Collections:
Publications Acknowledging KAUST Support

Full metadata record

DC Field | Value | Language
dc.contributor.author | Pearce, Roger | en
dc.contributor.author | Gokhale, Maya | en
dc.contributor.author | Amato, Nancy M. | en
dc.date.accessioned | 2016-02-28T05:53:23Z | en
dc.date.available | 2016-02-28T05:53:23Z | en
dc.date.issued | 2013-05 | en
dc.identifier.citation | Pearce R, Gokhale M, Amato NM (2013) Scaling Techniques for Massive Scale-Free Graphs in Distributed (External) Memory. 2013 IEEE 27th International Symposium on Parallel and Distributed Processing. Available: http://dx.doi.org/10.1109/IPDPS.2013.72. | en
dc.identifier.doi | 10.1109/IPDPS.2013.72 | en
dc.identifier.uri | http://hdl.handle.net/10754/599561 | en
dc.description.abstract | We present techniques to process large scale-free graphs in distributed memory. Our aim is to scale to trillions of edges, and our research is targeted at leadership-class supercomputers and clusters with local non-volatile memory, e.g., NAND Flash. We apply an edge list partitioning technique, designed to accommodate high-degree vertices (hubs) that create scaling challenges when processing scale-free graphs. In addition to partitioning hubs, we use ghost vertices to represent the hubs to reduce communication hotspots. We present a scaling study with three important graph algorithms: Breadth-First Search (BFS), K-Core decomposition, and Triangle Counting. We also demonstrate scalability on BG/P Intrepid by comparing to the best known Graph500 results. We show results on two clusters with local NVRAM storage that are capable of traversing trillion-edge scale-free graphs. By leveraging node-local NAND Flash, our approach can process thirty-two times larger datasets with only a 39% performance degradation in Traversed Edges Per Second (TEPS). © 2013 IEEE. | en
dc.description.sponsorship | This work was partially performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344 (LLNL-CONF-588232). Funding was partially provided by LDRD 11-ERD-008. Portions of the experiments were performed using Livermore Computing facility resources. This research used resources of the Argonne Leadership Computing Facility (ALCF) at Argonne National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under contract DE-AC02-06CH11357. ALCF resources were provided through an INCITE 2012 award for the Fault-Oblivious Exascale Computing Environment project. This research was supported in part by NSF awards CNS-0615267, CCF-0833199, CCF-0830753, IIS-0917266, IIS-0916053, NSF/DNDO award 2008-DN-077-ARI018-02, by DOE awards DE-FC52-08NA28616, DE-AC02-06CH11357, B575363, B575366, by THECB NHARP award 000512-0097-2009, by Samsung, Chevron, IBM, Intel, Oracle/Sun, and by Award KUS-C1-016-04, made by King Abdullah University of Science and Technology (KAUST). Pearce is supported in part by a Lawrence Scholar fellowship at LLNL. | en
dc.publisher | Institute of Electrical and Electronics Engineers (IEEE) | en
dc.subject | big data | en
dc.subject | distributed computing | en
dc.subject | graph algorithms | en
dc.subject | parallel algorithms | en
dc.title | Scaling Techniques for Massive Scale-Free Graphs in Distributed (External) Memory | en
dc.type | Conference Paper | en
dc.identifier.journal | 2013 IEEE 27th International Symposium on Parallel and Distributed Processing | en
dc.contributor.institution | Texas A and M University, College Station, United States | en
dc.contributor.institution | Lawrence Livermore National Laboratory, Livermore, United States | en
kaust.grant.number | KUS-C1-016-04 | en
All Items in KAUST are protected by copyright, with all rights reserved, unless otherwise indicated.