Communication Reducing Algorithms for Distributed Hierarchical N-Body Problems with Boundary Distributions
Authors: AbdulJabbar, Mustafa Abdulmajeed; Keyes, David E.
KAUST Department: Applied Mathematics and Computational Science Program
Computer Science Program
Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
Extreme Computing Research Center
KAUST Supercomputing Laboratory (KSL)
Supercomputing, Computational Scientists
Permanent link to this record: http://hdl.handle.net/10754/624979
Abstract: Reduction of communication and efficient partitioning are key issues for achieving scalability in hierarchical N-body algorithms such as the Fast Multipole Method (FMM). In the present work, we propose three independent strategies to improve partitioning and reduce communication. First, we show that the conventional wisdom of using space-filling curve partitioning may not work well for boundary integral problems, which constitute a significant portion of FMM's application user base. We propose an alternative method that modifies orthogonal recursive bisection to relieve the cell-partition misalignment that has kept it from scaling previously. Second, we optimize the granularity of communication to find the optimal balance between a bulk-synchronous collective communication of the local essential tree and an RDMA operation per task per cell. Finally, we take the dynamic sparse data exchange proposed by Hoefler et al. and extend it to a hierarchical sparse data exchange, which we demonstrate at scale to be faster than the commonly used MPI_Alltoallv.
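To make the baseline scheme concrete, the following is a minimal sketch (not the authors' code) of plain orthogonal recursive bisection: points are split at the median along the longest axis of their bounding box, recursively, until the requested number of partitions is reached. The function name `orb_partition` and the power-of-two partition count are assumptions for illustration; the paper's contribution is a modification of this scheme that aligns partition boundaries with FMM cell boundaries, which is not reproduced here.

```python
def orb_partition(points, num_parts):
    """Recursively bisect `points` (tuples of coordinates) into
    `num_parts` load-balanced groups; `num_parts` must be a power of 2."""
    if num_parts == 1:
        return [points]
    # Pick the axis along which the bounding box is widest.
    dim = len(points[0])
    extents = [max(p[d] for p in points) - min(p[d] for p in points)
               for d in range(dim)]
    axis = extents.index(max(extents))
    # A median split keeps the two halves balanced by point count.
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return (orb_partition(pts[:mid], num_parts // 2) +
            orb_partition(pts[mid:], num_parts // 2))
```

For boundary distributions (points concentrated on surfaces), the cut planes produced by this geometric bisection generally do not coincide with the octree cell boundaries used by FMM, which is the misalignment the proposed method addresses.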
Citation: Abduljabbar M, Markomanolis GS, Ibeid H, Yokota R, Keyes D (2017) Communication Reducing Algorithms for Distributed Hierarchical N-Body Problems with Boundary Distributions. High Performance Computing: 79–96. Available: http://dx.doi.org/10.1007/978-3-319-58667-0_5.
Sponsors: This work was supported by JSPS KAKENHI Grant-in-Aid for Young Scientists A Grant Number 16H05859. This work is partially supported by the “Joint Usage/Research Center for Interdisciplinary Large-scale Information Infrastructures” and “High Performance Computing Infrastructure” in Japan. The authors are grateful to the KAUST Supercomputing Laboratory for the use of the Shaheen XC40 system.