Open problems in CEM: Porting an explicit time-domain volume-integral-equation solver on GPUs with OpenACC

Handle URI:
http://hdl.handle.net/10754/563479
Title:
Open problems in CEM: Porting an explicit time-domain volume-integral-equation solver on GPUs with OpenACC
Authors:
Ergül, Özgür; Feki, Saber; Al-Jarro, Ahmed; Clo, Alain M.; Bagci, Hakan (0000-0003-3867-5786)
Abstract:
Graphics processing units (GPUs) are gradually becoming mainstream in high-performance computing, as their ability to speed up a large spectrum of scientific applications many-fold compared to multi-core CPUs has been clearly identified and proven. In this paper, implementation and performance-tuning details for porting an explicit marching-on-in-time (MOT)-based time-domain volume-integral-equation (TDVIE) solver onto GPUs are described. To this end, a high-level approach, utilizing the OpenACC directive-based parallel programming model, is used to address two often-faced challenges in GPU programming: developer productivity and code portability. The MOT-TDVIE solver code, originally developed for CPUs, is annotated with compiler directives to port it to GPUs in a fashion similar to how OpenMP targets multi-core CPUs. In contrast to CUDA and OpenCL, where significant modifications to CPU-based codes are required, this high-level approach requires minimal changes to the codes. In this work, we make use of two available OpenACC compilers, CAPS and PGI. Our experience reveals that different annotations of the code are required for each of the compilers, due to different interpretations of the fairly new standard by the compiler developers. Both versions of the OpenACC-accelerated code achieved significant performance improvements, with up to 30× speedup against the sequential CPU code using recent hardware technology. Moreover, we demonstrated that the GPU-accelerated fully explicit MOT-TDVIE solver achieved energy-consumption gains of the order of 3× against its CPU counterpart. © 2014 IEEE.
KAUST Department:
KAUST Supercomputing Laboratory (KSL)
Publisher:
Institute of Electrical and Electronics Engineers (IEEE)
Journal:
IEEE Antennas and Propagation Magazine
Issue Date:
Apr-2014
DOI:
10.1109/MAP.2014.6837098
Type:
Article
ISSN:
10459243
Appears in Collections:
Articles; KAUST Supercomputing Laboratory (KSL)

Full metadata record

DC Field | Value | Language
dc.contributor.author | Ergül, Özgür | en
dc.contributor.author | Feki, Saber | en
dc.contributor.author | Al-Jarro, Ahmed | en
dc.contributor.author | Clo, Alain M. | en
dc.contributor.author | Bagci, Hakan | en
dc.date.accessioned | 2015-08-03T11:52:30Z | en
dc.date.available | 2015-08-03T11:52:30Z | en
dc.date.issued | 2014-04 | en
dc.identifier.issn | 10459243 | en
dc.identifier.doi | 10.1109/MAP.2014.6837098 | en
dc.identifier.uri | http://hdl.handle.net/10754/563479 | en
dc.publisher | Institute of Electrical and Electronics Engineers (IEEE) | en
dc.subject | explicit marching-on-in-time scheme | en
dc.subject | graphics processing unit (GPU) | en
dc.subject | OpenACC | en
dc.subject | parallel processing | en
dc.subject | parallel programming | en
dc.subject | Time domain volume integral equation | en
dc.title | Open problems in CEM: Porting an explicit time-domain volume-integral-equation solver on GPUs with OpenACC | en
dc.type | Article | en
dc.contributor.department | KAUST Supercomputing Laboratory (KSL) | en
dc.identifier.journal | IEEE Antennas and Propagation Magazine | en
dc.contributor.institution | Department of Electrical and Electronics Engineering, Middle East Technical University, 06800 Ankara, Turkey | en
dc.contributor.institution | Division of Computer, Electrical and Mathematical Sciences and Engineering, University College London, London WC1E 7JE, United Kingdom | en
dc.contributor.institution | Photonics Group, Department of Electronic and Electrical Engineering, University College London, London WC1E 7JE, United Kingdom | en
kaust.author | Clo, Alain M. | en