Multicore-Optimized Wavefront Diamond Blocking for Optimizing Stencil Updates

Handle URI:
http://hdl.handle.net/10754/577336
Title:
Multicore-Optimized Wavefront Diamond Blocking for Optimizing Stencil Updates
Authors:
Malas, T.; Hager, G.; Ltaief, Hatem (0000-0002-6897-1095); Stengel, H.; Wellein, G.; Keyes, David E. (0000-0002-4052-7224)
Abstract:
The importance of stencil-based algorithms in computational science has focused attention on optimized parallel implementations for multilevel cache-based processors. Temporal blocking schemes leverage the large bandwidth and low latency of caches to accelerate stencil updates and approach theoretical peak performance. A key ingredient is the reduction of data traffic across slow data paths, especially the main memory interface. In this work we combine the ideas of multicore wavefront temporal blocking and diamond tiling to arrive at stencil update schemes that show large reductions in memory pressure compared to existing approaches. The resulting schemes show performance advantages in bandwidth-starved situations, which are exacerbated by the high bytes per lattice update case of variable coefficients. Our thread groups concept provides a controllable trade-off between concurrency and memory usage, shifting the pressure between the memory interface and the CPU. We present performance results on a contemporary Intel processor.
KAUST Department:
Extreme Computing Research Center
Citation:
Multicore-Optimized Wavefront Diamond Blocking for Optimizing Stencil Updates 2015, 37 (4):C439 SIAM Journal on Scientific Computing
Publisher:
Society for Industrial & Applied Mathematics (SIAM)
Journal:
SIAM Journal on Scientific Computing
Issue Date:
2-Jul-2015
DOI:
10.1137/140991133
Type:
Article
ISSN:
1064-8275; 1095-7197
Additional Links:
http://epubs.siam.org/doi/10.1137/140991133
Appears in Collections:
Articles; Extreme Computing Research Center

Full metadata record

DC Field | Value | Language
dc.contributor.author | Malas, T. | en
dc.contributor.author | Hager, G. | en
dc.contributor.author | Ltaief, Hatem | en
dc.contributor.author | Stengel, H. | en
dc.contributor.author | Wellein, G. | en
dc.contributor.author | Keyes, David E. | en
dc.date.accessioned | 2015-09-15T12:43:16Z | en
dc.date.available | 2015-09-15T12:43:16Z | en
dc.date.issued | 2015-07-02 | en
dc.identifier.citation | Multicore-Optimized Wavefront Diamond Blocking for Optimizing Stencil Updates 2015, 37 (4):C439 SIAM Journal on Scientific Computing | en
dc.identifier.issn | 1064-8275 | en
dc.identifier.issn | 1095-7197 | en
dc.identifier.doi | 10.1137/140991133 | en
dc.identifier.uri | http://hdl.handle.net/10754/577336 | en
dc.description.abstract | The importance of stencil-based algorithms in computational science has focused attention on optimized parallel implementations for multilevel cache-based processors. Temporal blocking schemes leverage the large bandwidth and low latency of caches to accelerate stencil updates and approach theoretical peak performance. A key ingredient is the reduction of data traffic across slow data paths, especially the main memory interface. In this work we combine the ideas of multicore wavefront temporal blocking and diamond tiling to arrive at stencil update schemes that show large reductions in memory pressure compared to existing approaches. The resulting schemes show performance advantages in bandwidth-starved situations, which are exacerbated by the high bytes per lattice update case of variable coefficients. Our thread groups concept provides a controllable trade-off between concurrency and memory usage, shifting the pressure between the memory interface and the CPU. We present performance results on a contemporary Intel processor. | en
dc.language.iso | en | en
dc.publisher | Society for Industrial & Applied Mathematics (SIAM) | en
dc.relation.url | http://epubs.siam.org/doi/10.1137/140991133 | en
dc.rights | Archived with thanks to SIAM Journal on Scientific Computing | en
dc.subject | temporal blocking | en
dc.subject | wavefront parallelization | en
dc.subject | diamond tiling | en
dc.subject | stencil computations | en
dc.subject | multicore | en
dc.subject | energy-efficient algorithms | en
dc.title | Multicore-Optimized Wavefront Diamond Blocking for Optimizing Stencil Updates | en
dc.type | Article | en
dc.contributor.department | Extreme Computing Research Center | en
dc.identifier.journal | SIAM Journal on Scientific Computing | en
dc.eprint.version | Publisher's Version/PDF | en
dc.contributor.institution | Erlangen Regional Computing Center (RRZE), Friedrich-Alexander University of Erlangen-Nuremberg, Erlangen 91058, Germany | en
dc.contributor.affiliation | King Abdullah University of Science and Technology (KAUST) | en
kaust.author | Malas, Tareq Majed Yasin | en
kaust.author | Ltaief, Hatem | en
kaust.author | Keyes, David E. | en