With the advent of big data, processing large graphs quickly has become increasingly important. Most existing approaches either utilize in-memory processing techniques that can only process graphs that fit completely in RAM, or disk-based techniques that sacrifice performance. In this work, we propose a novel RAM-Disk hybrid approach to graph processing that can scale well from a single shared-memory node to large distributed-memory systems. It works by partitioning the graph into subgraphs that fit in RAM and using a paging-like technique to load them on demand. We show that without modifying the algorithms, this approach can scale from small memory-constrained systems (such as tablets) to large-scale distributed machines with 16,000+ cores.
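The partition-and-page idea can be illustrated with a minimal single-node sketch. It is not the implementation described in the paper: the class `PagedGraph`, its parameters `vertices_per_partition` and `max_resident`, the contiguous-range partitioning, the pickle files, and the least-recently-used eviction policy are all assumptions made for illustration. The graph is split into subgraphs stored on disk, at most a fixed number of subgraphs stay resident in RAM, and an unmodified breadth-first search reaches the data only through `neighbors`.

```python
import os
import pickle
import tempfile
from collections import OrderedDict, deque


class PagedGraph:
    """Hypothetical sketch: adjacency lists split into on-disk partitions."""

    def __init__(self, adjacency, vertices_per_partition, max_resident):
        # Assumption: vertices are integers, partitioned by contiguous ranges.
        self.vertices_per_partition = vertices_per_partition
        self.max_resident = max_resident  # subgraphs allowed in RAM at once
        self.directory = tempfile.mkdtemp(prefix="paged_graph_")
        self.resident = OrderedDict()  # partition id -> subgraph, in LRU order
        self.loads = 0  # number of times a subgraph was paged in

        partitions = {}
        for vertex, neighbors in adjacency.items():
            pid = self._partition_of(vertex)
            partitions.setdefault(pid, {})[vertex] = list(neighbors)
        for pid, subgraph in partitions.items():
            with open(self._path(pid), "wb") as f:
                pickle.dump(subgraph, f)

    def _partition_of(self, vertex):
        return vertex // self.vertices_per_partition

    def _path(self, pid):
        return os.path.join(self.directory, f"part_{pid}.pkl")

    def _page_in(self, pid):
        if len(self.resident) >= self.max_resident:
            # Evict the least recently used subgraph. The graph is read-only
            # in this sketch, so nothing has to be written back to disk.
            self.resident.popitem(last=False)
        path = self._path(pid)
        if os.path.exists(path):
            with open(path, "rb") as f:
                self.resident[pid] = pickle.load(f)
        else:
            self.resident[pid] = {}  # partition holds no adjacency entries
        self.loads += 1

    def neighbors(self, vertex):
        pid = self._partition_of(vertex)
        if pid in self.resident:
            self.resident.move_to_end(pid)  # mark as most recently used
        else:
            self._page_in(pid)
        return self.resident[pid].get(vertex, [])


def bfs_levels(graph, source):
    """Plain BFS; it is unaware of whether the graph is paged or in memory."""
    levels = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in graph.neighbors(v):
            if w not in levels:
                levels[w] = levels[v] + 1
                queue.append(w)
    return levels
```

Raising `max_resident` until every partition fits turns the same code into a purely in-memory traversal, while lowering it trades extra disk loads for a smaller memory footprint. The traversal itself does not change in either case.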
Harshvardhan, West B, Fidel A, Amato NM, Rauchwerger L (2015) A Hybrid Approach to Processing Big Data Graphs on Memory-Restricted Systems. 2015 IEEE International Parallel and Distributed Processing Symposium. Available: http://dx.doi.org/10.1109/IPDPS.2015.28.
We would like to thank Glen Hordemann for help with our initial design. We would also like to thank our anonymous reviewers. This research is supported in part by NSF awards CCF 0702765, CNS-0551685, CCF-0833199, CCF-1439145, CCF-1423111, CCF-0830753, IIS-0917266, by DOE awards DE-AC02-06CH11357, DE-NA0002376, B575363, by Samsung, IBM, Intel, and by Award KUS-C1-016-04, made by King Abdullah University of Science and Technology (KAUST). This research used resources of the National Energy Research Scientific Computing Center, which is supported by the Office of Science of the U.S. Dept. of Energy under Contract No. DE-AC02-05CH11231.