Fork-join and data-driven execution models on multi-core architectures: Case study of the FMM
Type
Conference Paper
KAUST Department
Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
Extreme Computing Research Center
Date
2013
Permanent link to this record
http://hdl.handle.net/10754/575764
Abstract
Extracting maximum performance of multi-core architectures is a difficult task primarily due to bandwidth limitations of the memory subsystem and its complex hierarchy. In this work, we study the implications of fork-join and data-driven execution models on this type of architecture at the level of task parallelism. For this purpose, we use a highly optimized fork-join based implementation of the FMM and extend it to a data-driven implementation using a distributed task scheduling approach. This study exposes some limitations of the conventional fork-join implementation in terms of synchronization overheads. We find that these are not negligible and their elimination by the data-driven method, with a careful data locality strategy, was beneficial. Experimental evaluation of both methods on state-of-the-art multi-socket multi-core architectures showed up to 22% speed-ups of the data-driven approach compared to the original method. We demonstrate that a data-driven execution of FMM not only improves performance by avoiding global synchronization overheads but also reduces the memory-bandwidth pressure caused by memory-intensive computations. © 2013 Springer-Verlag.
Citation
Amer, A., Maruyama, N., Pericàs, M., Taura, K., Yokota, R., & Matsuoka, S. (2013). Fork-Join and Data-Driven Execution Models on Multi-core Architectures: Case Study of the FMM. Supercomputing, 255–266. doi:10.1007/978-3-642-38750-0_19
Publisher
Springer Berlin Heidelberg
Conference/Event name
28th International Supercomputing Conference, ISC 2013
ISBN
9783642387494
9783642387500
Additional Links
http://link.springer.com/10.1007/978-3-642-38750-0_19
http://www.mcs.anl.gov/%7Eaamer/papers/isc13-fmm-multicore.pdf
DOI
10.1007/978-3-642-38750-0_19