Accelerated Cyclic Reduction: A Distributed-Memory Fast Solver for Structured Linear Systems

Handle URI:
http://hdl.handle.net/10754/626403
Title:
Accelerated Cyclic Reduction: A Distributed-Memory Fast Solver for Structured Linear Systems
Authors:
Chávez, Gustavo; Turkiyyah, George; Zampini, Stefano (0000-0002-0435-0433); Ltaief, Hatem (0000-0002-6897-1095); Keyes, David E. (0000-0002-4052-7224)
Abstract:
We present Accelerated Cyclic Reduction (ACR), a distributed-memory fast solver for rank-compressible block tridiagonal linear systems arising from the discretization of elliptic operators, developed here for three dimensions. Algorithmic synergies between Cyclic Reduction and hierarchical matrix arithmetic operations result in a solver that has O(k N log N (log N + k^2)) arithmetic complexity and O(k N log N) memory footprint, where N is the number of degrees of freedom and k is the rank of a block in the hierarchical approximation, and which exhibits substantial concurrency. We provide a baseline for performance and applicability by comparing with the multifrontal method with and without hierarchical semi-separable matrices, with algebraic multigrid and with the classic cyclic reduction method. Over a set of large-scale elliptic systems with features of nonsymmetry and indefiniteness, the robustness of the direct solvers extends beyond that of the multigrid solver, and relative to the multifrontal approach ACR has lower or comparable execution time and size of the factors, with substantially lower numerical ranks. ACR exhibits good strong and weak scaling in a distributed context and, as with any direct solver, is advantageous for problems that require the solution of multiple right-hand sides. Numerical experiments show that the rank k patterns are of O(1) for the Poisson equation and of O(n) for the indefinite Helmholtz equation. The solver is ideal in situations where low-accuracy solutions are sufficient, or otherwise as a preconditioner within an iterative method.
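For orientation, the classic cyclic reduction that the abstract names as ACR's starting point can be sketched on a scalar tridiagonal system. This is only an illustration under stated assumptions: the function name `cyclic_reduction` and its list-based interface are hypothetical, the arithmetic is plain scalar floating point rather than hierarchical-matrix block arithmetic, there is no pivoting, and nothing here is taken from the authors' implementation.

```python
def cyclic_reduction(a, b, c, d):
    """Solve a tridiagonal system by classic (scalar) cyclic reduction.

    Row i reads: a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i],
    with a[0] == 0 and c[-1] == 0. Assumes no zero pivots occur
    (e.g. a diagonally dominant matrix). Illustrative sketch only.
    """
    n = len(b)
    if n == 1:
        return [d[0] / b[0]]

    # Reduction step: eliminate the even-indexed unknowns from the
    # odd-indexed equations, giving a tridiagonal system of size n // 2.
    ra, rb, rc, rd = [], [], [], []
    for i in range(1, n, 2):
        alpha = a[i] / b[i - 1]          # multiplier for row i-1
        new_a = -alpha * a[i - 1]
        new_b = b[i] - alpha * c[i - 1]
        new_c = 0.0
        new_d = d[i] - alpha * d[i - 1]
        if i + 1 < n:
            gamma = c[i] / b[i + 1]      # multiplier for row i+1
            new_b -= gamma * a[i + 1]
            new_c = -gamma * c[i + 1]
            new_d -= gamma * d[i + 1]
        ra.append(new_a)
        rb.append(new_b)
        rc.append(new_c)
        rd.append(new_d)

    # Recurse on the reduced system to obtain the odd-indexed unknowns.
    x_odd = cyclic_reduction(ra, rb, rc, rd)

    # Back-substitution: recover the even-indexed unknowns.
    x = [0.0] * n
    for j, i in enumerate(range(1, n, 2)):
        x[i] = x_odd[j]
    for i in range(0, n, 2):
        left = a[i] * x[i - 1] if i > 0 else 0.0
        right = c[i] * x[i + 1] if i + 1 < n else 0.0
        x[i] = (d[i] - left - right) / b[i]
    return x


# Usage: 1D Poisson stencil (-1, 2, -1) on 7 unknowns.
n = 7
a = [0.0] + [-1.0] * (n - 1)
b = [2.0] * n
c = [-1.0] * (n - 1) + [0.0]
d = [0.0] * (n - 1) + [8.0]
x = cyclic_reduction(a, b, c, d)
```

Each level halves the number of unknowns, so there are about log2(n) levels, and the eliminations within a level are independent of one another. That independence is the source of concurrency in cyclic reduction generally; in the block setting described in the abstract, the scalars would be replaced by matrix blocks.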
KAUST Department:
Extreme Computing Research Center
Citation:
Chávez G, Turkiyyah G, Zampini S, Ltaief H, Keyes D (2017) Accelerated Cyclic Reduction: A Distributed-Memory Fast Solver for Structured Linear Systems. Parallel Computing. Available: http://dx.doi.org/10.1016/j.parco.2017.12.001.
Publisher:
Elsevier BV
Journal:
Parallel Computing
Issue Date:
15-Dec-2017
DOI:
10.1016/j.parco.2017.12.001
Type:
Article
ISSN:
0167-8191
Sponsors:
We thank the anonymous reviewers for their detailed comments and suggestions for this manuscript. The authors would also like to thank Ronald Kriemann from the Max-Planck-Institute for Mathematics in the Sciences for development and continuous support of HLibPro, Alexander Litvinenko from the King Abdullah University of Science and Technology (KAUST) for the enlightening discussions and advice, and Pieter Ghysels from the Lawrence Berkeley National Laboratory for his recommendations on the use of STRUMPACK. Support from the KAUST Supercomputing Laboratory and access to Shaheen is gratefully acknowledged. The work of all authors was supported by the Extreme Computing Research Center at KAUST.
Additional Links:
http://www.sciencedirect.com/science/article/pii/S0167819117302041
Appears in Collections:
Articles; Extreme Computing Research Center

Full metadata record

DC Field | Value | Language
dc.contributor.author | Chávez, Gustavo | en
dc.contributor.author | Turkiyyah, George | en
dc.contributor.author | Zampini, Stefano | en
dc.contributor.author | Ltaief, Hatem | en
dc.contributor.author | Keyes, David E. | en
dc.date.accessioned | 2017-12-21T13:57:03Z | -
dc.date.available | 2017-12-21T13:57:03Z | -
dc.date.issued | 2017-12-15 | en
dc.identifier.citation | Chávez G, Turkiyyah G, Zampini S, Ltaief H, Keyes D (2017) Accelerated Cyclic Reduction: A Distributed-Memory Fast Solver for Structured Linear Systems. Parallel Computing. Available: http://dx.doi.org/10.1016/j.parco.2017.12.001. | en
dc.identifier.issn | 0167-8191 | en
dc.identifier.doi | 10.1016/j.parco.2017.12.001 | en
dc.identifier.uri | http://hdl.handle.net/10754/626403 | -
dc.publisher | Elsevier BV | en
dc.relation.url | http://www.sciencedirect.com/science/article/pii/S0167819117302041 | en
dc.rights | NOTICE: this is the author’s version of a work that was accepted for publication in Parallel Computing. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Parallel Computing, [, , (2017-12-15)] DOI: 10.1016/j.parco.2017.12.001. © 2017. This manuscript version is made available under the CC-BY-NC-ND 4.0 license http://creativecommons.org/licenses/by-nc-nd/4.0/ | en
dc.subject | Cyclic reduction | en
dc.subject | Hierarchical matrices | en
dc.subject | Fast direct solvers | en
dc.subject | Elliptic equations | en
dc.title | Accelerated Cyclic Reduction: A Distributed-Memory Fast Solver for Structured Linear Systems | en
dc.type | Article | en
dc.contributor.department | Extreme Computing Research Center | en
dc.identifier.journal | Parallel Computing | en
dc.eprint.version | Post-print | en
dc.contributor.institution | Department of Computer Science, American University of Beirut (AUB), Beirut, Lebanon | en
kaust.author | Chávez, Gustavo | en
kaust.author | Zampini, Stefano | en
kaust.author | Ltaief, Hatem | en
kaust.author | Keyes, David E. | en