High-Level FPGA Accelerator Design for Structured-Mesh-Based Explicit Numerical Solvers
Permanent link to this record: http://hdl.handle.net/10754/666880
Abstract: This paper presents a workflow for synthesizing near-optimal FPGA implementations of structured-mesh based stencil applications for explicit solvers. It leverages key characteristics of the application class and its computation-communication pattern and the architectural capabilities of the FPGA to accelerate solvers for high-performance computing applications. Key new features of the workflow are (1) the unification of standard state-of-the-art techniques with a number of high-gain optimizations such as batching and spatial blocking/tiling, motivated by increasing throughput for real-world workloads and (2) the development and use of a predictive analytical model to explore the design space, and obtain resource and performance estimates. Three representative applications are implemented using the design workflow on a Xilinx Alveo U280 FPGA, demonstrating near-optimal performance and over 85% predictive model accuracy. These are compared with equivalent highly-optimized implementations of the same applications on modern HPC-grade GPUs (Nvidia V100), analyzing time to solution, bandwidth, and energy consumption. Performance results indicate comparable runtimes with the V100 GPU, with over 2× energy savings for the largest non-trivial application on the FPGA. Our investigation shows the challenges of achieving high performance on current generation FPGAs compared to traditional architectures. We discuss determinants for a given stencil code to be amenable to FPGA implementation, providing insights into the feasibility and profitability of a design and its resulting performance.
Citation: Kamalakkannan, K., Mudalige, G. R., Reguly, I. Z., & Fahmy, S. A. (2021). High-Level FPGA Accelerator Design for Structured-Mesh-Based Explicit Numerical Solvers. 2021 IEEE International Parallel and Distributed Processing Symposium (IPDPS). doi:10.1109/ipdps49936.2021.00117
Sponsors: Gihan Mudalige was supported by the Royal Society Industry Fellowship Scheme (INF/R1/1800 12). István Reguly was supported by the National Research, Development and Innovation Fund of Hungary (PD 124905), under the PD 17 funding scheme. We are grateful to Jacques Du Toit and Tim Schmielau at NAG UK for the RTM application and to Xilinx for their hardware and software donation.
Conference/Event name: 2021 IEEE International Parallel and Distributed Processing Symposium (IPDPS)