Handle URI:
http://hdl.handle.net/10754/597607
Title:
Assembly of finite element methods on graphics processors
Authors:
Cecka, Cris; Lew, Adrian J.; Darve, E.
Abstract:
Recently, graphics processing units (GPUs) have had great success in accelerating many numerical computations. We present their application to computations on unstructured meshes such as those in finite element methods. Multiple approaches in assembling and solving sparse linear systems with NVIDIA GPUs and the Compute Unified Device Architecture (CUDA) are created and analyzed. Multiple strategies for efficient use of global, shared, and local memory, methods to achieve memory coalescing, and optimal choice of parameters are introduced. We find that with appropriate preprocessing and arrangement of support data, the GPU coprocessor using single-precision arithmetic achieves speedups of 30 or more in comparison to a well-optimized double-precision single-core implementation. We also find that the optimal assembly strategy depends on the order of polynomials used in the finite element discretization. © 2010 John Wiley & Sons, Ltd.
Citation:
Cecka C, Lew AJ, Darve E (2010) Assembly of finite element methods on graphics processors. Int J Numer Meth Engng 85: 640–669. Available: http://dx.doi.org/10.1002/nme.2989.
Publisher:
Wiley-Blackwell
Journal:
International Journal for Numerical Methods in Engineering
Issue Date:
23-Aug-2010
DOI:
10.1002/nme.2989
Type:
Article
ISSN:
0029-5981
Sponsors:
This work was partially supported by a research grant from the Academic Excellence Alliance program between King Abdullah University of Science and Technology (KAUST) and Stanford University.
Appears in Collections:
Publications Acknowledging KAUST Support

Full metadata record

DC Field | Value | Language
dc.contributor.author | Cecka, Cris | en
dc.contributor.author | Lew, Adrian J. | en
dc.contributor.author | Darve, E. | en
dc.date.accessioned | 2016-02-25T12:42:57Z | en
dc.date.available | 2016-02-25T12:42:57Z | en
dc.date.issued | 2010-08-23 | en
dc.identifier.citation | Cecka C, Lew AJ, Darve E (2010) Assembly of finite element methods on graphics processors. Int J Numer Meth Engng 85: 640–669. Available: http://dx.doi.org/10.1002/nme.2989. | en
dc.identifier.issn | 0029-5981 | en
dc.identifier.doi | 10.1002/nme.2989 | en
dc.identifier.uri | http://hdl.handle.net/10754/597607 | en
dc.description.abstract | Recently, graphics processing units (GPUs) have had great success in accelerating many numerical computations. We present their application to computations on unstructured meshes such as those in finite element methods. Multiple approaches in assembling and solving sparse linear systems with NVIDIA GPUs and the Compute Unified Device Architecture (CUDA) are created and analyzed. Multiple strategies for efficient use of global, shared, and local memory, methods to achieve memory coalescing, and optimal choice of parameters are introduced. We find that with appropriate preprocessing and arrangement of support data, the GPU coprocessor using single-precision arithmetic achieves speedups of 30 or more in comparison to a well-optimized double-precision single-core implementation. We also find that the optimal assembly strategy depends on the order of polynomials used in the finite element discretization. © 2010 John Wiley & Sons, Ltd. | en
dc.description.sponsorship | This work was partially supported by a research grant from the Academic Excellence Alliance program between King Abdullah University of Science and Technology (KAUST) and Stanford University. | en
dc.publisher | Wiley-Blackwell | en
dc.subject | CUDA | en
dc.subject | FEM | en
dc.subject | Finite element method | en
dc.subject | GPGPU | en
dc.subject | GPU | en
dc.subject | High-performance computing | en
dc.subject | Multicore | en
dc.title | Assembly of finite element methods on graphics processors | en
dc.type | Article | en
dc.identifier.journal | International Journal for Numerical Methods in Engineering | en
dc.contributor.institution | Stanford University, Palo Alto, United States | en
kaust.grant.program | Academic Excellence Alliance (AEA) | en