Show simple item record

dc.contributor.author: AbdulJabbar, Mustafa Abdulmajeed
dc.contributor.author: Markomanolis, Georgios
dc.contributor.author: Ibeid, Huda
dc.contributor.author: Yokota, Rio
dc.contributor.author: Keyes, David E.
dc.date.accessioned: 2017-06-13T07:28:10Z
dc.date.available: 2017-06-13T07:28:10Z
dc.date.issued: 2017-05-11
dc.identifier.citation: Abduljabbar M, Markomanolis GS, Ibeid H, Yokota R, Keyes D (2017) Communication Reducing Algorithms for Distributed Hierarchical N-Body Problems with Boundary Distributions. High Performance Computing: 79–96. Available: http://dx.doi.org/10.1007/978-3-319-58667-0_5.
dc.identifier.issn: 0302-9743
dc.identifier.issn: 1611-3349
dc.identifier.doi: 10.1007/978-3-319-58667-0_5
dc.identifier.uri: http://hdl.handle.net/10754/624979
dc.description.abstract: Reduction of communication and efficient partitioning are key issues for achieving scalability in hierarchical N-Body algorithms like Fast Multipole Method (FMM). In the present work, we propose three independent strategies to improve partitioning and reduce communication. First, we show that the conventional wisdom of using space-filling curve partitioning may not work well for boundary integral problems, which constitute a significant portion of FMM’s application user base. We propose an alternative method that modifies orthogonal recursive bisection to relieve the cell-partition misalignment that has kept it from scaling previously. Secondly, we optimize the granularity of communication to find the optimal balance between a bulk-synchronous collective communication of the local essential tree and an RDMA per task per cell. Finally, we take the dynamic sparse data exchange proposed by Hoefler et al. [1] and extend it to a hierarchical sparse data exchange, which is demonstrated at scale to be faster than the MPI library’s MPI_Alltoallv that is commonly used.
dc.description.sponsorship: This work was supported by JSPS KAKENHI Grant-in-Aid for Young Scientists A Grant Number 16H05859. This work is partially supported by “Joint Usage/Research Center for Interdisciplinary Large-scale Information Infrastructures” and “High Performance Computing Infrastructure” in Japan. The authors are grateful to the KAUST Supercomputing Laboratory for the use of the Shaheen XC40 system.
dc.publisher: Springer Nature
dc.relation.url: http://link.springer.com/chapter/10.1007/978-3-319-58667-0_5
dc.rights: The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-319-58667-0_5
dc.subject: N-body methods
dc.subject: Fast multipole method
dc.subject: Load balancing
dc.subject: Communication reduction
dc.title: Communication Reducing Algorithms for Distributed Hierarchical N-Body Problems with Boundary Distributions
dc.type: Book Chapter
dc.contributor.department: Applied Mathematics and Computational Science Program
dc.contributor.department: Computer Science Program
dc.contributor.department: Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
dc.contributor.department: Extreme Computing Research Center
dc.contributor.department: KAUST Supercomputing Laboratory (KSL)
dc.contributor.department: Supercomputing, Computational Scientists
dc.identifier.journal: Lecture Notes in Computer Science
dc.eprint.version: Post-print
dc.contributor.institution: Global Scientific Information and Computing Center (GSIC), Tokyo Institute of Technology (TITECH), Tokyo, Japan
dc.identifier.arxivid: arXiv:1702.05459
kaust.person: AbdulJabbar, Mustafa Abdulmajeed
kaust.person: Markomanolis, Georgios
kaust.person: Ibeid, Huda
kaust.person: Keyes, David E.
refterms.dateFOA: 2018-05-12T00:00:00Z


Files in this item

Name: 1702.05459.pdf
Size: 2.226Mb
Format: PDF
Description: Accepted Manuscript
