dc.contributor.author: Kamalakkannan, Kamalavasan
dc.contributor.author: Mudalige, Gihan R.
dc.contributor.author: Reguly, Istvan Z.
dc.contributor.author: Fahmy, Suhaib A.
dc.date.accessioned: 2021-01-13T05:57:07Z
dc.date.available: 2021-01-13T05:57:07Z
dc.date.issued: 2021-05-17
dc.identifier.uri: http://hdl.handle.net/10754/666880
dc.description.abstract: This paper presents a workflow for synthesizing near-optimal FPGA implementations of structured-mesh-based stencil applications for explicit solvers. It leverages key characteristics of the application class and its computation-communication pattern and the architectural capabilities of the FPGA to accelerate solvers for high-performance computing applications. Key new features of the workflow are (1) the unification of standard state-of-the-art techniques with a number of high-gain optimizations such as batching and spatial blocking/tiling, motivated by increasing throughput for real-world workloads and (2) the development and use of a predictive analytical model to explore the design space, and obtain resource and performance estimates. Three representative applications are implemented using the design workflow on a Xilinx Alveo U280 FPGA, demonstrating near-optimal performance and over 85% predictive model accuracy. These are compared with equivalent highly optimized implementations of the same applications on modern HPC-grade GPUs (Nvidia V100), analyzing time to solution, bandwidth, and energy consumption. Performance results indicate comparable runtimes with the V100 GPU, with over 2× energy savings for the largest non-trivial application on the FPGA. Our investigation shows the challenges of achieving high performance on current generation FPGAs compared to traditional architectures. We discuss determinants for a given stencil code to be amenable to FPGA implementation, providing insights into the feasibility and profitability of a design and its resulting performance.
dc.publisher: IEEE
dc.rights: Archived with thanks to IEEE
dc.subject: FPGAs
dc.subject: Stencil Applications
dc.subject: Explicit solvers
dc.title: High-Level FPGA Accelerator Design for Structured-Mesh-Based Explicit Numerical Solvers
dc.type: Conference Paper
dc.contributor.department: King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
dc.conference.date: 17–21 May 2021
dc.conference.name: IEEE International Parallel and Distributed Processing Symposium
dc.conference.location: Portland, OR
dc.eprint.version: Post-print
dc.contributor.institution: Dept. of Computer Science, University of Warwick, UK
dc.contributor.institution: Faculty of Information Technology & Bionics, Pazmany Peter Catholic University, Hungary
pubs.publication-status: Accepted
dc.identifier.arxivid: arxiv.org/pdf/2101.01177
kaust.person: Fahmy, Suhaib A.
refterms.dateFOA: 2021-01-13T05:57:08Z


Files in this item

Name: ipdps2021-vasan.pdf
Size: 820.2Kb
Format: PDF
