
dc.contributor.author: Malas, T.
dc.contributor.author: Hager, G.
dc.contributor.author: Ltaief, Hatem
dc.contributor.author: Stengel, H.
dc.contributor.author: Wellein, G.
dc.contributor.author: Keyes, David E.
dc.date.accessioned: 2015-09-15T12:43:16Z
dc.date.available: 2015-09-15T12:43:16Z
dc.date.issued: 2015-07-02
dc.identifier.citation: Multicore-Optimized Wavefront Diamond Blocking for Optimizing Stencil Updates 2015, 37 (4):C439 SIAM Journal on Scientific Computing
dc.identifier.issn: 1064-8275
dc.identifier.issn: 1095-7197
dc.identifier.doi: 10.1137/140991133
dc.identifier.uri: http://hdl.handle.net/10754/577336
dc.description.abstract: The importance of stencil-based algorithms in computational science has focused attention on optimized parallel implementations for multilevel cache-based processors. Temporal blocking schemes leverage the large bandwidth and low latency of caches to accelerate stencil updates and approach theoretical peak performance. A key ingredient is the reduction of data traffic across slow data paths, especially the main memory interface. In this work we combine the ideas of multicore wavefront temporal blocking and diamond tiling to arrive at stencil update schemes that show large reductions in memory pressure compared to existing approaches. The resulting schemes show performance advantages in bandwidth-starved situations, which are exacerbated by the high bytes per lattice update case of variable coefficients. Our thread groups concept provides a controllable trade-off between concurrency and memory usage, shifting the pressure between the memory interface and the CPU. We present performance results on a contemporary Intel processor.
dc.language.iso: en
dc.publisher: Society for Industrial & Applied Mathematics (SIAM)
dc.relation.url: http://epubs.siam.org/doi/10.1137/140991133
dc.rights: Archived with thanks to SIAM Journal on Scientific Computing
dc.subject: temporal blocking
dc.subject: wavefront parallelization
dc.subject: diamond tiling
dc.subject: stencil computations
dc.subject: multicore
dc.subject: energy-efficient algorithms
dc.title: Multicore-Optimized Wavefront Diamond Blocking for Optimizing Stencil Updates
dc.type: Article
dc.contributor.department: Extreme Computing Research Center
dc.identifier.journal: SIAM Journal on Scientific Computing
dc.eprint.version: Publisher's Version/PDF
dc.contributor.institution: Erlangen Regional Computing Center (RRZE), Friedrich-Alexander University of Erlangen-Nuremberg, Erlangen 91058, Germany
dc.contributor.affiliation: King Abdullah University of Science and Technology (KAUST)
kaust.person: Malas, Tareq Majed Yasin
kaust.person: Ltaief, Hatem
kaust.person: Keyes, David E.
refterms.dateFOA: 2018-06-13T12:17:26Z


Files in this item

Name: 140991133.pdf
Size: 1.706Mb
Format: PDF
Description: Main article

