Optimization of an Electromagnetics Code with Multicore Wavefront Diamond Blocking and Multi-dimensional Intra-Tile Parallelization

Handle URI:
http://hdl.handle.net/10754/622653
Title:
Optimization of an Electromagnetics Code with Multicore Wavefront Diamond Blocking and Multi-dimensional Intra-Tile Parallelization
Authors:
Malas, Tareq M.; Hornich, Julian; Hager, Georg; Ltaief, Hatem (0000-0002-6897-1095); Pflaum, Christoph; Keyes, David E. (0000-0002-4052-7224)
Abstract:
Understanding and optimizing the properties of solar cells is becoming a key issue in the search for alternatives to nuclear and fossil energy sources. A theoretical analysis via numerical simulations involves solving Maxwell's Equations in discretized form and typically requires substantial computing effort. We start from a hybrid-parallel (MPI+OpenMP) production code that implements the Time Harmonic Inverse Iteration Method (THIIM) with Finite-Difference Frequency Domain (FDFD) discretization. Although this algorithm has the characteristics of a strongly bandwidth-bound stencil update scheme, it is significantly different from the popular stencil types that have been exhaustively studied in the high performance computing literature to date. We apply a recently developed stencil optimization technique, multicore wavefront diamond tiling with multi-dimensional cache block sharing, and describe in detail the peculiarities that need to be considered due to the special stencil structure. Concurrency in updating the components of the electric and magnetic fields provides an additional level of parallelism. The dependence of the cache size requirement of the optimized code on the blocking parameters is modeled accurately, and an auto-tuner searches for optimal configurations in the remaining parameter space. We were able to completely decouple the execution from the memory bandwidth bottleneck, accelerating the implementation by a factor of three to four compared to an optimal implementation with pure spatial blocking on an 18-core Intel Haswell CPU.
KAUST Department:
Extreme Computing Research Center
Citation:
Malas TM, Hornich J, Hager G, Ltaief H, Pflaum C, et al. (2016) Optimization of an Electromagnetics Code with Multicore Wavefront Diamond Blocking and Multi-dimensional Intra-Tile Parallelization. 2016 IEEE International Parallel and Distributed Processing Symposium (IPDPS). Available: http://dx.doi.org/10.1109/IPDPS.2016.87.
Publisher:
Institute of Electrical and Electronics Engineers (IEEE)
Journal:
2016 IEEE International Parallel and Distributed Processing Symposium (IPDPS)
Conference/Event name:
30th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2016
Issue Date:
21-Jul-2016
DOI:
10.1109/IPDPS.2016.87
Type:
Conference Paper
Sponsors:
For computer time, this research used the resources of the Extreme Computing Research Center (ECRC) at KAUST. The authors thank the ECRC for supporting T. Malas. The authors gratefully acknowledge the support of the Erlangen Graduate School in Advanced Optical Technologies (SAOT) and the Cluster of Excellence “Engineering of Advanced Materials” at the University of Erlangen-Nuremberg, which are both funded by the German Research Foundation (DFG) in the framework of the German excellence initiative. The authors are also grateful for funding provided by the Energy Campus Nuremberg (EnCN, Project “Solarfabrik der Zukunft”).
Additional Links:
http://ieeexplore.ieee.org/document/7516010/
Appears in Collections:
Conference Papers; Extreme Computing Research Center
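
As a rough illustration of the diamond tiling mentioned in the abstract, the following Python sketch applies the idea to a much simpler problem: a 1D three-point Jacobi averaging stencil, not the THIIM/FDFD kernel of the paper. The function names (`diamond_schedule`, `jacobi_diamond`, `jacobi_naive`), the tile width, and the averaging update are all assumptions made for this example. The sketch runs sequentially and only shows the space-time traversal order; the multicore wavefront and intra-tile parallelization described in the paper are not reproduced here.

```python
from collections import defaultdict


def diamond_schedule(n, steps, width):
    """Group the updates (t, i) of a 1D three-point stencil into diamond tiles.

    A tile is bounded by the diagonals i + t = const and i - t = const.
    Hypothetical helper for illustration only.
    """
    tiles = defaultdict(list)
    for t in range(1, steps + 1):          # time level being computed
        for i in range(1, n - 1):          # interior points only
            a = (i + t) // width           # index along the i + t diagonal
            b = (i - t) // width           # index along the i - t diagonal
            # a - b grows with time, so it serves as the wave index.
            tiles[(a - b, a)].append((t, i))
    # Tiles are returned wave by wave; inside a tile the points stay
    # ordered by ascending t, which respects the stencil dependencies.
    return [tiles[key] for key in sorted(tiles)]


def jacobi_naive(u0, steps):
    """Reference: sweep the whole grid once per time step."""
    cur = list(u0)
    for _ in range(steps):
        nxt = list(cur)                    # boundary values are carried over
        for i in range(1, len(cur) - 1):
            nxt[i] = (cur[i - 1] + cur[i] + cur[i + 1]) / 3.0
        cur = nxt
    return cur


def jacobi_diamond(u0, steps, width):
    """Same update, traversed tile by tile with two ping-pong buffers."""
    n = len(u0)
    buf = [list(u0), list(u0)]             # both hold the fixed boundaries
    for tile in diamond_schedule(n, steps, width):
        for t, i in tile:
            src, dst = buf[(t - 1) % 2], buf[t % 2]
            dst[i] = (src[i - 1] + src[i] + src[i + 1]) / 3.0
    return buf[steps % 2]
```

Each point (t, i) reads only points whose i + t is no larger and whose i - t is no smaller than its own, so every value it needs lies either earlier in the same tile or in a tile with a smaller wave index. Tiles that share a wave index do not depend on each other, which is the property a parallel implementation could exploit.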

Full metadata record

DC Field | Value | Language
dc.contributor.author | Malas, Tareq M. | en
dc.contributor.author | Hornich, Julian | en
dc.contributor.author | Hager, Georg | en
dc.contributor.author | Ltaief, Hatem | en
dc.contributor.author | Pflaum, Christoph | en
dc.contributor.author | Keyes, David E. | en
dc.date.accessioned | 2017-01-09T11:52:21Z | -
dc.date.available | 2017-01-09T11:52:21Z | -
dc.date.issued | 2016-07-21 | en
dc.identifier.citation | Malas TM, Hornich J, Hager G, Ltaief H, Pflaum C, et al. (2016) Optimization of an Electromagnetics Code with Multicore Wavefront Diamond Blocking and Multi-dimensional Intra-Tile Parallelization. 2016 IEEE International Parallel and Distributed Processing Symposium (IPDPS). Available: http://dx.doi.org/10.1109/IPDPS.2016.87. | en
dc.identifier.doi | 10.1109/IPDPS.2016.87 | en
dc.identifier.uri | http://hdl.handle.net/10754/622653 | -
dc.description.abstract | Understanding and optimizing the properties of solar cells is becoming a key issue in the search for alternatives to nuclear and fossil energy sources. A theoretical analysis via numerical simulations involves solving Maxwell's Equations in discretized form and typically requires substantial computing effort. We start from a hybrid-parallel (MPI+OpenMP) production code that implements the Time Harmonic Inverse Iteration Method (THIIM) with Finite-Difference Frequency Domain (FDFD) discretization. Although this algorithm has the characteristics of a strongly bandwidth-bound stencil update scheme, it is significantly different from the popular stencil types that have been exhaustively studied in the high performance computing literature to date. We apply a recently developed stencil optimization technique, multicore wavefront diamond tiling with multi-dimensional cache block sharing, and describe in detail the peculiarities that need to be considered due to the special stencil structure. Concurrency in updating the components of the electric and magnetic fields provides an additional level of parallelism. The dependence of the cache size requirement of the optimized code on the blocking parameters is modeled accurately, and an auto-tuner searches for optimal configurations in the remaining parameter space. We were able to completely decouple the execution from the memory bandwidth bottleneck, accelerating the implementation by a factor of three to four compared to an optimal implementation with pure spatial blocking on an 18-core Intel Haswell CPU. | en
dc.description.sponsorship | For computer time, this research used the resources of the Extreme Computing Research Center (ECRC) at KAUST. The authors thank the ECRC for supporting T. Malas. The authors gratefully acknowledge the support of the Erlangen Graduate School in Advanced Optical Technologies (SAOT) and the Cluster of Excellence “Engineering of Advanced Materials” at the University of Erlangen-Nuremberg, which are both funded by the German Research Foundation (DFG) in the framework of the German excellence initiative. The authors are also grateful for funding provided by the Energy Campus Nuremberg (EnCN, Project “Solarfabrik der Zukunft”). | en
dc.publisher | Institute of Electrical and Electronics Engineers (IEEE) | en
dc.relation.url | http://ieeexplore.ieee.org/document/7516010/ | en
dc.subject | Electromagnetics | en
dc.subject | Parallel programming | en
dc.subject | Performance analysis | en
dc.subject | Stencils | en
dc.title | Optimization of an Electromagnetics Code with Multicore Wavefront Diamond Blocking and Multi-dimensional Intra-Tile Parallelization | en
dc.type | Conference Paper | en
dc.contributor.department | Extreme Computing Research Center | en
dc.identifier.journal | 2016 IEEE International Parallel and Distributed Processing Symposium (IPDPS) | en
dc.conference.date | 2016-05-23 to 2016-05-27 | en
dc.conference.name | 30th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2016 | en
dc.conference.location | Chicago, IL, USA | en
dc.contributor.institution | Department of Computer Science, Friedrich-Alexander University Erlangen-Nürnberg (FAU), Erlangen, Germany | en
dc.contributor.institution | Erlangen Graduate School in Advanced Optical Technologies, SAOT, Friedrich-Alexander University Erlangen-Nürnberg (FAU), Erlangen, Germany | en
dc.contributor.institution | Erlangen Regional Computing Center (RRZE), Friedrich-Alexander University Erlangen-Nürnberg (FAU), Erlangen, Germany | en
kaust.author | Malas, Tareq M. | en
kaust.author | Ltaief, Hatem | en
kaust.author | Keyes, David E. | en