High Performance Polar Decomposition on Distributed Memory Systems
Type
Conference Paper
KAUST Department
Applied Mathematics and Computational Science Program
Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
Extreme Computing Research Center
Date
2016-08-09
Online Publication Date
2016-08-09
Print Publication Date
2016
Permanent link to this record
http://hdl.handle.net/10754/622144
Abstract
The polar decomposition of a dense matrix is an important operation in linear algebra. It can be calculated directly through the singular value decomposition (SVD) or iteratively using the QR-based dynamically weighted Halley algorithm (QDWH). The former is difficult to parallelize because memory-bound operations dominate the bidiagonal reduction. We investigate the latter approach, which performs more floating-point operations but exposes more parallelism and therefore runs closer to the theoretical peak performance of the system, thanks to more compute-bound matrix operations. Profiling results show the performance scalability of QDWH for calculating the polar decomposition using around 9200 MPI processes on well- and ill-conditioned matrices of size 100K×100K. We then study the performance impact of the QDWH-based polar decomposition as a pre-processing step toward calculating the SVD itself. The new distributed-memory implementation of the QDWH-SVD solver achieves up to a five-fold speedup over current state-of-the-art vendor SVD implementations. © Springer International Publishing Switzerland 2016.
Citation
Sukkari D, Ltaief H, Keyes D (2016) High Performance Polar Decomposition on Distributed Memory Systems. Lecture Notes in Computer Science: 605–616. Available: http://dx.doi.org/10.1007/978-3-319-43659-3_44.
Sponsors
For computer time, this research used the resources from the Swiss National Supercomputing Centre (CSCS) in Lugano, Switzerland.
Publisher
Springer Nature
Conference/Event name
22nd International Conference on Parallel and Distributed Computing, Euro-Par 2016
Additional Links
http://link.springer.com/chapter/10.1007%2F978-3-319-43659-3_44
DOI
10.1007/978-3-319-43659-3_44
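Illustrative sketch
The abstract contrasts two routes to the polar decomposition A = UH: directly from the SVD, or iteratively with QDWH. The serial NumPy sketch below shows both on a small matrix so the difference is concrete. It is not the authors' distributed-memory implementation. The function names polar_svd and polar_qdwh, the tolerance, and the iteration cap are hypothetical choices made for illustration, and the lower bound on the smallest singular value is computed here with an exact SVD purely for simplicity, where a practical code would use a cheap estimate. The weights a, b, c follow the QDWH iteration as it is usually stated.

```python
import numpy as np


def polar_svd(A):
    """Polar decomposition A = U @ H computed directly from the SVD."""
    W, s, Vh = np.linalg.svd(A, full_matrices=False)
    U = W @ Vh                      # orthogonal (unitary) polar factor
    H = (Vh.conj().T * s) @ Vh      # Hermitian positive semidefinite factor
    return U, H


def polar_qdwh(A, tol=1e-12, max_iter=30):
    """Polar decomposition via a QR-based dynamically weighted Halley sketch.

    Assumes A has full column rank. tol and max_iter are illustrative.
    """
    m, n = A.shape
    alpha = np.linalg.norm(A, 2)    # scale so that the largest singular value is 1
    X = A / alpha
    # Assumption for simplicity: exact smallest singular value of X as the
    # lower bound l; real implementations estimate it instead.
    l = min(np.linalg.svd(X, compute_uv=False)[-1], 1.0)
    I = np.eye(n)
    for _ in range(max_iter):
        # Dynamically chosen weights for this step.
        d = (4.0 * (1.0 - l**2) / l**4) ** (1.0 / 3.0)
        a = np.sqrt(1.0 + d) + 0.5 * np.sqrt(
            8.0 - 4.0 * d + 8.0 * (2.0 - l**2) / (l**2 * np.sqrt(1.0 + d))
        )
        b = (a - 1.0) ** 2 / 4.0
        c = a + b - 1.0
        # Inverse-free update: QR factorization of the stacked matrix.
        Q, _ = np.linalg.qr(np.vstack([np.sqrt(c) * X, I]))
        Q1, Q2 = Q[:m], Q[m:]
        X_new = (b / c) * X + (a - b / c) / np.sqrt(c) * (Q1 @ Q2.conj().T)
        # Updated lower bound on the smallest singular value, kept at most 1.
        l = min(l * (a + b * l**2) / (1.0 + c * l**2), 1.0)
        converged = np.linalg.norm(X_new - X, "fro") <= tol * np.linalg.norm(X_new, "fro")
        X = X_new
        if converged:
            break
    U = X
    H = U.conj().T @ A
    H = 0.5 * (H + H.conj().T)      # symmetrize to remove rounding asymmetry
    return U, H
```

For a nonsingular square matrix the polar factors are unique, so both functions should return the same U and H up to rounding. The QDWH route spends its time in QR factorizations and matrix products, which are the compute-bound kernels the abstract refers to.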