Massively Parallel Polar Decomposition on Distributed-Memory Systems

Handle URI:
http://hdl.handle.net/10754/626359
Title:
Massively Parallel Polar Decomposition on Distributed-Memory Systems
Authors:
Ltaief, Hatem (0000-0002-6897-1095); Sukkari, Dalal E.; Esposito, Aniello; Nakatsukasa, Yuji; Keyes, David E. (0000-0002-4052-7224)
Abstract:
We present a high-performance implementation of the Polar Decomposition (PD) on distributed-memory systems. Building on the QR-based Dynamically Weighted Halley (QDWH) algorithm, the key idea lies in finding the best rational approximation for the scalar sign function, which also corresponds to the polar factor for symmetric matrices, to further accelerate the QDWH convergence. Based on the Zolotarev rational functions, introduced by Zolotarev (ZOLO) in 1877, this new PD algorithm, ZOLO-PD, converges within two iterations even for ill-conditioned matrices, instead of the original six iterations needed for QDWH. ZOLO-PD uses the property of Zolotarev functions that optimality is maintained when two functions are composed in an appropriate manner. The resulting ZOLO-PD has a convergence rate of up to seventeen, in contrast to the cubic convergence rate of QDWH. This comes at the price of higher arithmetic costs and a larger memory footprint. These extra floating-point operations can, however, be processed in an embarrassingly parallel fashion. We report performance using up to 102,400 cores on two supercomputers, and show that, in the presence of a large number of processing units, ZOLO-PD is able to outperform QDWH by up to 2.3X, especially in situations where QDWH runs out of work, for instance, in the strong scaling mode of operation.
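For orientation, the QDWH baseline that the abstract compares against can be sketched serially in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' distributed implementation and not ZOLO-PD itself: the function name `qdwh_polar`, the tolerance, and the use of a full SVD to obtain the scaling bounds are choices made here for brevity (a practical code would use cheap norm and condition estimates instead).

```python
import numpy as np

def qdwh_polar(A, tol=1e-8, max_iter=20):
    """Serial QDWH sketch (hypothetical helper): returns (U, H, iters) with A ~= U @ H."""
    n = A.shape[1]
    # Assumption for illustration: exact singular values give the scaling bounds.
    s = np.linalg.svd(A, compute_uv=False)
    alpha, l = s[0], s[-1] / s[0]      # alpha >= ||A||_2, l <= sigma_min(A / alpha)
    X = A / alpha
    I = np.eye(n)
    for k in range(1, max_iter + 1):
        l = min(l, 1.0)                # guard against rounding pushing l above 1
        # Dynamically weighted Halley coefficients a, b, c
        d = (4.0 * (1.0 - l**2) / l**4) ** (1.0 / 3.0)
        a = np.sqrt(1.0 + d) + 0.5 * np.sqrt(
            8.0 - 4.0 * d + 8.0 * (2.0 - l**2) / (l**2 * np.sqrt(1.0 + d)))
        b = (a - 1.0) ** 2 / 4.0
        c = a + b - 1.0
        # QR-based, inverse-free update: QR of the stacked matrix [sqrt(c) X; I]
        Q, _ = np.linalg.qr(np.vstack([np.sqrt(c) * X, I]))
        Q1, Q2 = Q[:-n], Q[-n:]
        X_new = (b / c) * X + (a - b / c) / np.sqrt(c) * (Q1 @ Q2.T)
        # Update the lower bound on the smallest singular value of the iterate
        l = l * (a + b * l**2) / (1.0 + c * l**2)
        converged = (np.linalg.norm(X_new - X, "fro")
                     <= tol * np.linalg.norm(X_new, "fro"))
        X = X_new
        if converged:
            break
    U = X
    H = U.T @ A
    H = 0.5 * (H + H.T)                # symmetrize the positive semidefinite factor
    return U, H, k

# Usage on a small, well-conditioned random matrix (fixed seed for repeatability)
rng = np.random.default_rng(0)
A = rng.standard_normal((60, 40))
U, H, iters = qdwh_polar(A)
print("iterations:", iters)
print("orthogonality error:", np.linalg.norm(U.T @ U - np.eye(40)))
print("reconstruction error:", np.linalg.norm(U @ H - A))
```

Each iteration here is a single QR factorization followed by a matrix product, which is why QDWH can run out of parallel work at scale. ZOLO-PD, as described in the abstract, trades fewer iterations for several independent factorizations of this kind per iteration.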
KAUST Department:
Extreme Computing Research Center; Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
Issue Date:
Jan-2018
Type:
Article
Sponsors:
Cray Inc. and Intel Corp
Appears in Collections:
Articles

Full metadata record

DC Field | Value | Language
dc.contributor.author | Ltaief, Hatem | en
dc.contributor.author | Sukkari, Dalal E. | en
dc.contributor.author | Esposito, Aniello | en
dc.contributor.author | Nakatsukasa, Yuji | en
dc.contributor.author | Keyes, David E. | en
dc.date.accessioned | 2018-01-30T08:22:35Z | -
dc.date.available | 2017-12-13T06:48:18Z | -
dc.date.available | 2018-01-30T08:22:35Z | -
dc.date.issued | 2018-01 | -
dc.identifier.uri | http://hdl.handle.net/10754/626359 | -
dc.description.abstract | We present a high-performance implementation of the Polar Decomposition (PD) on distributed-memory systems. Building on the QR-based Dynamically Weighted Halley (QDWH) algorithm, the key idea lies in finding the best rational approximation for the scalar sign function, which also corresponds to the polar factor for symmetric matrices, to further accelerate the QDWH convergence. Based on the Zolotarev rational functions, introduced by Zolotarev (ZOLO) in 1877, this new PD algorithm, ZOLO-PD, converges within two iterations even for ill-conditioned matrices, instead of the original six iterations needed for QDWH. ZOLO-PD uses the property of Zolotarev functions that optimality is maintained when two functions are composed in an appropriate manner. The resulting ZOLO-PD has a convergence rate of up to seventeen, in contrast to the cubic convergence rate of QDWH. This comes at the price of higher arithmetic costs and a larger memory footprint. These extra floating-point operations can, however, be processed in an embarrassingly parallel fashion. We report performance using up to 102,400 cores on two supercomputers, and show that, in the presence of a large number of processing units, ZOLO-PD is able to outperform QDWH by up to 2.3X, especially in situations where QDWH runs out of work, for instance, in the strong scaling mode of operation. | en
dc.description.sponsorship | Cray Inc. and Intel Corp | en
dc.subject | Polar Decomposition | en
dc.subject | Zolotarev Functions | en
dc.subject | Parallel Algorithms | en
dc.subject | Strong Scaling | en
dc.subject | Distributed-Memory Systems | en
dc.title | Massively Parallel Polar Decomposition on Distributed-Memory Systems | en
dc.type | Article | en
dc.contributor.department | Extreme Computing Research Center | en
dc.contributor.department | Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division | en
dc.contributor.institution | Cray EMEA Research Lab | en
dc.contributor.institution | Mathematical Institute, University of Oxford | en

Version History

Version | Item | Editor | Date | Summary
2 | 10754/626359 | grenzdm | 2018-01-30 08:16:43.369 | Accepted for publication.
1 | 10754/626359.1 | SUKKARDE | 2017-12-13 06:48:18.0 |