Portable and Efficient Dense Linear Algebra in the Beginning of the Exascale Era
Type
Conference Paper
Authors
Gates, Mark
YarKhan, Asim
Sukkari, Dalal
Akbudak, Kadir
Cayrols, Sebastien
Bielich, Daniel
Abdelfattah, Ahmad
Al Farhan, Mohammed
Dongarra, Jack
KAUST Department
KAUST, Thuwal, Saudi Arabia
Computer Science Program
Computer, Electrical and Mathematical Science and Engineering (CEMSE) Division
Date
2023-01-30
Permanent link to this record
http://hdl.handle.net/10754/687440
Abstract
The SLATE project is implementing a distributed dense linear algebra library for highly-scalable distributed-memory accelerator-based computer systems. The goal is to provide a library that can be easily ported to different hardware (CPUs, GPUs, accelerators) and will provide high performance for machines into the future. Current ports include CPUs, CUDA, ROCm, and oneAPI. We achieve both performance and portability by leveraging several layers and abstractions, including OpenMP tasks to track data dependencies, MPI for distributed communication, and the BLAS++ and LAPACK++ libraries developed as a portable layer across vendor-optimized CPU and GPU BLAS and LAPACK functionality. We rely on the C++ standard library and templating to reduce code duplication for better maintainability. The few kernels not present in BLAS are implemented in CUDA, HIP, and OpenMP target offload, and are easily ported to new platforms.
Citation
Gates, M., YarKhan, A., Sukkari, D., Akbudak, K., Cayrols, S., Bielich, D., Abdelfattah, A., Farhan, M. A., & Dongarra, J. (2022). Portable and Efficient Dense Linear Algebra in the Beginning of the Exascale Era. 2022 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC). https://doi.org/10.1109/p3hpc56579.2022.00009
Sponsors
This research was supported by the Exascale Computing Project (17-SC-20-SC), a joint project of the U.S. Department of Energy’s Office of Science and National Nuclear Security Administration, responsible for delivering a capable exascale ecosystem, including software, applications, and hardware technology, to support the nation’s exascale computing imperative. This research used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725.
Publisher
IEEE
Conference/Event name
2022 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC)
Additional Links
https://ieeexplore.ieee.org/document/10024624/
DOI
10.1109/p3hpc56579.2022.00009