
dc.contributor.author: Abdelfattah, Ahmad
dc.contributor.author: Keyes, David E.
dc.contributor.author: Ltaief, Hatem
dc.date.accessioned: 2017-06-12T10:24:00Z
dc.date.available: 2017-06-12T10:24:00Z
dc.date.issued: 2014-05-04
dc.identifier.uri: http://hdl.handle.net/10754/624931
dc.description.abstract: KBLAS (KAUST BLAS) is a small library that provides highly optimized BLAS routines on systems accelerated with GPUs. KBLAS is written entirely in CUDA C and targets NVIDIA GPUs with compute capability 2.0 (Fermi) or higher. The current focus is on level-2 BLAS routines, namely the general matrix-vector multiplication (GEMV) kernel and the symmetric/Hermitian matrix-vector multiplication (SYMV/HEMV) kernel. KBLAS provides these two kernels in all four precisions (s, d, c, and z), with support for multi-GPU systems. Through advanced optimization techniques that target latency hiding and push memory bandwidth to the limit, KBLAS outperforms state-of-the-art kernels by 20-90%. Competitors include CUBLAS-5.5, MAGMABLAS-1.4.0, and CULA R17. The SYMV/HEMV kernel from KBLAS has been adopted by NVIDIA and should appear in CUBLAS-6.0. KBLAS has been used in large-scale simulations of multi-object adaptive optics.
dc.title: Enabling High Performance Large Scale Dense Problems through KBLAS
dc.type: Poster
dc.contributor.department: Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
dc.contributor.department: Computer Science Program
dc.contributor.department: Applied Mathematics and Computational Science Program
dc.contributor.department: Extreme Computing Research Center
dc.contributor.department: Computer, Electrical and Mathematical Sciences & Engineering (CEMSE)
dc.conference.date: May 4-6, 2014
dc.conference.name: SHAXC-2 Workshop 2014
dc.conference.location: KAUST
kaust.person: Abdelfattah, Ahmad
kaust.person: Keyes, David E.
kaust.person: Ltaief, Hatem
refterms.dateFOA: 2018-06-14T05:58:12Z


Files in this item

Name: ahmad_abdelfattah.pdf
Size: 4.829Mb
Format: PDF
Description: Poster

