Performance Characteristics of Hybrid MPI/OpenMP Scientific Applications on a Large-Scale Multithreaded BlueGene/Q Supercomputer

Handle URI:
http://hdl.handle.net/10754/599159
Title:
Performance Characteristics of Hybrid MPI/OpenMP Scientific Applications on a Large-Scale Multithreaded BlueGene/Q Supercomputer
Authors:
Wu, Xingfu; Taylor, Valerie
Abstract:
In this paper, we investigate the performance characteristics of five hybrid MPI/OpenMP scientific applications (two NAS Parallel Benchmarks, Multi-Zone SP-MZ and BT-MZ; an earthquake simulation, PEQdyna; an aerospace application, PMLB; and a 3D particle-in-cell application, GTC) on a large-scale multithreaded Blue Gene/Q supercomputer at Argonne National Laboratory, and quantify the performance gap that results from using different numbers of threads per node. We use the performance tools and MPI profiling and tracing libraries available on the supercomputer to analyze and compare the performance of these hybrid applications as the number of OpenMP threads per node increases, and find that beyond a certain point, adding threads saturates or worsens their performance. For the strong-scaling applications SP-MZ, BT-MZ, PEQdyna and PMLB, using 32 threads per node results in much better application efficiency than using 64 threads per node; as the number of threads per node increases, the FPU (floating-point unit) percentage decreases, while the MPI percentage (except for PMLB) and the IPC (instructions per cycle) per core (except for BT-MZ) increase. For the weak-scaling application GTC, the performance trend (relative speedup) with increasing numbers of threads per node is very similar regardless of how many nodes (32, 128, or 512) are used. © 2013 IEEE.
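For context, the following is a minimal illustrative sketch (not code from the paper) of the hybrid MPI/OpenMP structure such applications share: MPI ranks are distributed across nodes, while OpenMP threads (set via OMP_NUM_THREADS, e.g. 32 or 64 per node as in the runs described above) parallelize the node-local work. The problem size and loop body here are hypothetical placeholders.

    #include <mpi.h>
    #include <omp.h>
    #include <stdio.h>

    int main(int argc, char **argv) {
        int provided, rank, nranks;
        /* Request funneled threading: only the master thread makes MPI calls. */
        MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nranks);

        const long n = 1 << 20;      /* hypothetical per-rank problem size */
        double local_sum = 0.0;

        /* OpenMP threads share the node-local loop; the thread count comes
           from OMP_NUM_THREADS at run time. */
        #pragma omp parallel for reduction(+:local_sum)
        for (long i = 0; i < n; i++)
            local_sum += (double)i / (double)n;

        /* MPI combines the per-rank partial results across nodes. */
        double global_sum = 0.0;
        MPI_Reduce(&local_sum, &global_sum, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

        if (rank == 0)
            printf("ranks=%d threads=%d sum=%f\n",
                   nranks, omp_get_max_threads(), global_sum);

        MPI_Finalize();
        return 0;
    }

In this style of code, raising the thread count per node lowers the number of MPI ranks per node but increases sharing of node resources among threads, which is the trade-off the abstract examines.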
Citation:
Wu X, Taylor V (2013) Performance Characteristics of Hybrid MPI/OpenMP Scientific Applications on a Large-Scale Multithreaded BlueGene/Q Supercomputer. 2013 14th ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing. Available: http://dx.doi.org/10.1109/SNPD.2013.81.
Publisher:
Institute of Electrical and Electronics Engineers (IEEE)
Journal:
2013 14th ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing
KAUST Grant Number:
KUS-I1-010-01
Issue Date:
Jul-2013
DOI:
10.1109/SNPD.2013.81
Type:
Conference Paper
Sponsors:
This work was supported by NSF grant CNS-0911023 and by Award No. KUS-I1-010-01 made by King Abdullah University of Science and Technology (KAUST). The authors would like to acknowledge the Argonne Leadership Computing Facility for the use of BlueGene/Q under the DOE INCITE project “Performance Evaluation and Analysis Consortium End Station” and the BGQ Tools project. We would also like to thank Stephane Ethier from Princeton Plasma Physics Laboratory for providing the GTC code.
Appears in Collections:
Publications Acknowledging KAUST Support

Full metadata record

DC Field | Value | Language
dc.contributor.author | Wu, Xingfu | en
dc.contributor.author | Taylor, Valerie | en
dc.date.accessioned | 2016-02-25T13:54:00Z | en
dc.date.available | 2016-02-25T13:54:00Z | en
dc.date.issued | 2013-07 | en
dc.identifier.citation | Wu X, Taylor V (2013) Performance Characteristics of Hybrid MPI/OpenMP Scientific Applications on a Large-Scale Multithreaded BlueGene/Q Supercomputer. 2013 14th ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing. Available: http://dx.doi.org/10.1109/SNPD.2013.81. | en
dc.identifier.doi | 10.1109/SNPD.2013.81 | en
dc.identifier.uri | http://hdl.handle.net/10754/599159 | en
dc.description.abstract | In this paper, we investigate the performance characteristics of five hybrid MPI/OpenMP scientific applications (two NAS Parallel Benchmarks, Multi-Zone SP-MZ and BT-MZ; an earthquake simulation, PEQdyna; an aerospace application, PMLB; and a 3D particle-in-cell application, GTC) on a large-scale multithreaded Blue Gene/Q supercomputer at Argonne National Laboratory, and quantify the performance gap that results from using different numbers of threads per node. We use the performance tools and MPI profiling and tracing libraries available on the supercomputer to analyze and compare the performance of these hybrid applications as the number of OpenMP threads per node increases, and find that beyond a certain point, adding threads saturates or worsens their performance. For the strong-scaling applications SP-MZ, BT-MZ, PEQdyna and PMLB, using 32 threads per node results in much better application efficiency than using 64 threads per node; as the number of threads per node increases, the FPU (floating-point unit) percentage decreases, while the MPI percentage (except for PMLB) and the IPC (instructions per cycle) per core (except for BT-MZ) increase. For the weak-scaling application GTC, the performance trend (relative speedup) with increasing numbers of threads per node is very similar regardless of how many nodes (32, 128, or 512) are used. © 2013 IEEE. | en
dc.description.sponsorship | This work was supported by NSF grant CNS-0911023 and by Award No. KUS-I1-010-01 made by King Abdullah University of Science and Technology (KAUST). The authors would like to acknowledge the Argonne Leadership Computing Facility for the use of BlueGene/Q under the DOE INCITE project “Performance Evaluation and Analysis Consortium End Station” and the BGQ Tools project. We would also like to thank Stephane Ethier from Princeton Plasma Physics Laboratory for providing the GTC code. | en
dc.publisher | Institute of Electrical and Electronics Engineers (IEEE) | en
dc.subject | BlueGene/Q | en
dc.subject | hybrid MPI/OpenMP | en
dc.subject | multithreaded | en
dc.subject | Performance analysis | en
dc.title | Performance Characteristics of Hybrid MPI/OpenMP Scientific Applications on a Large-Scale Multithreaded BlueGene/Q Supercomputer | en
dc.type | Conference Paper | en
dc.identifier.journal | 2013 14th ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing | en
dc.contributor.institution | Texas A&M University, College Station, United States | en
kaust.grant.number | KUS-I1-010-01 | en