    Performance Characteristics of Hybrid MPI/OpenMP Scientific Applications on a Large-Scale Multithreaded BlueGene/Q Supercomputer

    Type
    Conference Paper
    Authors
    Wu, Xingfu
    Taylor, Valerie
    KAUST Grant Number
    KUS-I1-010-01
    Date
    2013-07
    Permanent link to this record
    http://hdl.handle.net/10754/599159
    
    Abstract
    In this paper, we investigate the performance characteristics of five hybrid MPI/OpenMP scientific applications (two NAS Parallel Benchmarks Multi-Zone codes, SP-MZ and BT-MZ; an earthquake simulation, PEQdyna; an aerospace application, PMLB; and a 3D particle-in-cell application, GTC) on a large-scale multithreaded Blue Gene/Q supercomputer at Argonne National Laboratory, and quantify the performance gap resulting from using different numbers of threads per node. We use the performance tools and the MPI profile and trace libraries available on the supercomputer to analyze and compare the performance of these hybrid scientific applications as the number of OpenMP threads per node increases, and find that increasing the number of threads beyond a certain point saturates or worsens the performance of these hybrid applications. For the strong-scaling hybrid scientific applications (SP-MZ, BT-MZ, PEQdyna, and PMLB), using 32 threads per node results in much better application efficiency than using 64 threads per node; as the number of threads per node increases, the FPU (floating-point unit) percentage decreases, while the MPI percentage (except for PMLB) and the IPC (instructions per cycle) per core (except for BT-MZ) increase. For the weak-scaling hybrid scientific application GTC, the performance trend (relative speedup) with an increasing number of threads per node is very similar regardless of how many nodes (32, 128, or 512) are used. © 2013 IEEE.
    Citation
    Wu X, Taylor V (2013) Performance Characteristics of Hybrid MPI/OpenMP Scientific Applications on a Large-Scale Multithreaded BlueGene/Q Supercomputer. 2013 14th ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing. Available: http://dx.doi.org/10.1109/SNPD.2013.81.
    Sponsors
    This work is supported by NSF grant CNS-0911023 and Award No. KUS-I1-010-01 made by King Abdullah University of Science and Technology (KAUST). The authors would like to acknowledge the Argonne Leadership Computing Facility for the use of BlueGene/Q under the DOE INCITE project “Performance Evaluation and Analysis Consortium End Station” and the BGQ Tools project. We would also like to thank Stephane Ethier from Princeton Plasma Physics Laboratory for providing the GTC code.
    Publisher
    Institute of Electrical and Electronics Engineers (IEEE)
    Journal
    2013 14th ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing
    DOI
    10.1109/SNPD.2013.81
    Collections
    Publications Acknowledging KAUST Support
