Optimizing the performance of streaming numerical kernels on the IBM Blue Gene/P PowerPC 450 processor
Type
Article
KAUST Department
Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
Computer Science Program
KAUST Supercomputing Laboratory (KSL)
Applied Mathematics and Computational Science Program
Extreme Computing Research Center
Core Labs
Date
2012-05-21
Preprint Posting Date
2012-01-17
Online Publication Date
2012-05-21
Print Publication Date
2013-05
Permanent link to this record
http://hdl.handle.net/10754/562189
Abstract
Several emerging petascale architectures use energy-efficient processors with vectorized computational units and in-order thread processing. On these architectures the sustained performance of streaming numerical kernels, ubiquitous in the solution of partial differential equations, represents a challenge despite the regularity of memory access. Sophisticated optimization techniques are required to fully utilize the CPU. We propose a new method for constructing streaming numerical kernels using a high-level assembly synthesis and optimization framework. We describe an implementation of this method in Python targeting the IBM® Blue Gene®/P supercomputer's PowerPC® 450 core. This paper details the high-level design, construction, simulation, verification, and analysis of these kernels utilizing a subset of the CPU's instruction set. We demonstrate the effectiveness of our approach by implementing several three-dimensional stencil kernels over a variety of cached memory scenarios and analyzing the mechanically scheduled variants, including a 27-point stencil achieving a 1.7× speedup over the best previously published results. © The Author(s) 2012.
Citation
Malas, T., Ahmadia, A. J., Brown, J., Gunnels, J. A., & Keyes, D. E. (2012). Optimizing the performance of streaming numerical kernels on the IBM Blue Gene/P PowerPC 450 processor. The International Journal of High Performance Computing Applications, 27(2), 193–209. doi:10.1177/1094342012444795
Publisher
SAGE Publications
arXiv
1201.3496
Additional Links
http://arxiv.org/abs/arXiv:1201.3496v1
DOI
10.1177/1094342012444795