High Productivity Programming of Dense Linear Algebra on Heterogeneous NUMA Architectures
Type
Thesis
Authors
Alomairy, Rabab M.
Advisors
Keyes, David E.
Committee members
Ltaief, Hatem
Moshkov, Mikhail
Shihada, Basem

Program
Computer Science
Date
2013-07
Embargo End Date
2014-07-30
Permanent link to this record
http://hdl.handle.net/10754/297194
Access Restrictions
At the time of archiving, the student author of this thesis opted to temporarily restrict access to it. The full text of this thesis became available to the public after the expiration of the embargo on 2014-07-30.
Abstract
High-end multicore systems with GPU-based accelerators are now ubiquitous in the hardware landscape. Besides dealing with the nontrivial heterogeneous environment, end users should often take into consideration the underlying memory architecture to decrease the overhead of data motion, especially when running on non-uniform memory access (NUMA) platforms. We propose the OmpSs parallel programming model approach using its Nanos++ dynamic runtime system to solve the two challenging problems aforementioned, through 1) an innovative NUMA node-aware scheduling policy to reduce data movement between NUMA nodes and 2) a nested parallelism feature to concurrently exploit the resources available from the GPU devices as well as the CPU host, without compromising the overall performance. Our approach features separation of concerns by abstracting the complexity of the hardware from the end users so that high productivity can be achieved. The Cholesky factorization is used as a benchmark representative of dense numerical linear algebra algorithms. Superior performance is also demonstrated on the symmetric matrix inversion based on Cholesky factorization, commonly used in covariance computations in statistics. Performance on a NUMA system with Kepler-based GPUs exceeds that of existing implementations, while the OmpSs-enabled code remains very similar to its original sequential version.
Citation
Alomairy, R. M. (2013). High Productivity Programming of Dense Linear Algebra on Heterogeneous NUMA Architectures. KAUST Research Repository. https://doi.org/10.25781/KAUST-J256D
10.25781/KAUST-J256D