A Visual Approach to Investigating Shared and Global Memory Behavior of CUDA Kernels

Handle URI:
http://hdl.handle.net/10754/597436
Title:
A Visual Approach to Investigating Shared and Global Memory Behavior of CUDA Kernels
Authors:
Rosen, Paul
Abstract:
We present an approach to investigate the memory behavior of a parallel kernel executing on thousands of threads simultaneously within the CUDA architecture. Our top-down approach allows for quickly identifying any significant differences between the execution of the many blocks and warps. As interesting warps are identified, we allow further investigation of memory behavior by visualizing the shared memory bank conflicts and global memory coalescence, first with an overview of a single warp with many operations and, subsequently, with a detailed view of a single warp and a single operation. We demonstrate the strength of our approach in the context of a parallel matrix transpose kernel and a parallel 1D Haar Wavelet transform kernel. © 2013 The Author(s) Computer Graphics Forum © 2013 The Eurographics Association and Blackwell Publishing Ltd.
Citation:
Rosen P (2013) A Visual Approach to Investigating Shared and Global Memory Behavior of CUDA Kernels. Computer Graphics Forum 32: 161–170. Available: http://dx.doi.org/10.1111/cgf.12103.
Publisher:
Wiley-Blackwell
Journal:
Computer Graphics Forum
KAUST Grant Number:
KUS-C1-016-04
Issue Date:
Jun-2013
DOI:
10.1111/cgf.12103
Type:
Article
ISSN:
0167-7055
Sponsors:
We thank Kristi Potter for her feedback. This work was supported by DOE NETL and KAUST award KUS-C1-016-04.
Appears in Collections:
Publications Acknowledging KAUST Support

Full metadata record

DC Field | Value | Language
dc.contributor.author | Rosen, Paul | en
dc.date.accessioned | 2016-02-25T12:33:12Z | en
dc.date.available | 2016-02-25T12:33:12Z | en
dc.date.issued | 2013-06 | en
dc.identifier.citation | Rosen P (2013) A Visual Approach to Investigating Shared and Global Memory Behavior of CUDA Kernels. Computer Graphics Forum 32: 161–170. Available: http://dx.doi.org/10.1111/cgf.12103. | en
dc.identifier.issn | 0167-7055 | en
dc.identifier.doi | 10.1111/cgf.12103 | en
dc.identifier.uri | http://hdl.handle.net/10754/597436 | en
dc.description.abstract | We present an approach to investigate the memory behavior of a parallel kernel executing on thousands of threads simultaneously within the CUDA architecture. Our top-down approach allows for quickly identifying any significant differences between the execution of the many blocks and warps. As interesting warps are identified, we allow further investigation of memory behavior by visualizing the shared memory bank conflicts and global memory coalescence, first with an overview of a single warp with many operations and, subsequently, with a detailed view of a single warp and a single operation. We demonstrate the strength of our approach in the context of a parallel matrix transpose kernel and a parallel 1D Haar Wavelet transform kernel. © 2013 The Author(s) Computer Graphics Forum © 2013 The Eurographics Association and Blackwell Publishing Ltd. | en
dc.description.sponsorship | We thank Kristi Potter for her feedback. This work was supported by DOE NETL and KAUST award KUS-C1-016-04. | en
dc.publisher | Wiley-Blackwell | en
dc.title | A Visual Approach to Investigating Shared and Global Memory Behavior of CUDA Kernels | en
dc.type | Article | en
dc.identifier.journal | Computer Graphics Forum | en
dc.contributor.institution | University of Utah, Salt Lake City, United States | en
kaust.grant.number | KUS-C1-016-04 | en