dc.contributor.author: Hu, Qi
dc.contributor.author: Gumerov, Nail A.
dc.contributor.author: Yokota, Rio
dc.contributor.author: Barba, Lorena A.
dc.contributor.author: Duraiswami, Ramani
dc.date.accessioned: 2015-08-24T09:27:05Z
dc.date.available: 2015-08-24T09:27:05Z
dc.date.issued: 2014-05
dc.identifier.citation: Hu, Q., Gumerov, N. A., Yokota, R., Barba, L., & Duraiswami, R. (2014). Scalable Fast Multipole Accelerated Vortex Methods. 2014 IEEE International Parallel & Distributed Processing Symposium Workshops. doi:10.1109/ipdpsw.2014.110
dc.identifier.isbn: 9780769552088
dc.identifier.issn: 15302075
dc.identifier.doi: 10.1109/IPDPSW.2014.110
dc.identifier.uri: http://hdl.handle.net/10754/575821
dc.description.abstract: The fast multipole method (FMM) is often used to accelerate the calculation of particle interactions in particle-based methods for simulating incompressible flows. To evaluate the most time-consuming kernels, the Biot-Savart equation and the stretching term of the vorticity equation, we mathematically reformulated them so that only two Laplace scalar potentials are used instead of six. This automatically ensures a divergence-free far-field computation. Based on this formulation, we developed a new FMM-based vortex method for heterogeneous architectures, which distributes the work between multicore CPUs and GPUs to best utilize the hardware resources and achieve excellent scalability. The algorithm uses new data structures that dynamically manage inter-node communication and load balance efficiently, with only a small parallel construction overhead. The algorithm scales to large clusters, showing both strong and weak scalability. A careful analysis of the error and timing trade-offs is also performed for the cutoff functions induced by the vortex particle method. Our implementation can perform one time step of the velocity+stretching calculation for one billion particles on 32 nodes in 55.9 seconds, which yields 49.12 Tflop/s.
dc.publisher: Institute of Electrical and Electronics Engineers (IEEE)
dc.subject: FMM
dc.subject: heterogeneous algorithm
dc.subject: GPGPU
dc.subject: vortex methods
dc.title: Scalable fast multipole accelerated vortex methods
dc.type: Conference Paper
dc.contributor.department: Extreme Computing Research Center
dc.identifier.journal: 2014 IEEE International Parallel & Distributed Processing Symposium Workshops
dc.conference.date: 19 May 2014 through 23 May 2014
dc.conference.name: 28th IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2014
dc.contributor.institution: University of Maryland, Institute for Advanced Computer Studies, United States
dc.contributor.institution: Department of Computer Science, University of Maryland, College Park, United States
dc.contributor.institution: Fantalgo LLC, Elkridge, MD, United States
dc.contributor.institution: Mechanical and Aerospace Engineering, George Washington University, United States
kaust.person: Yokota, Rio

