dc.contributor.advisor: Keyes, David E.
dc.contributor.author: Al-Harthi, Noha A.
dc.date.accessioned: 2013-06-12T13:40:05Z
dc.date.available: 2013-06-12T13:40:05Z
dc.date.issued: 2013-06
dc.identifier.doi: 10.25781/KAUST-347V2
dc.identifier.uri: http://hdl.handle.net/10754/293890
dc.description.abstract: Current trends in computer architecture are shifting toward smaller byte/flop ratios, while available parallelism is increasing at all levels of granularity: vector length, core count, and MPI process count. Intel’s Xeon Phi coprocessor, NVIDIA’s Kepler GPU, and IBM’s BlueGene/Q all have a byte/flop ratio close to 0.2, which makes it very difficult for most algorithms to extract a high percentage of the theoretical peak flop/s from these architectures. Popular algorithms in scientific computing such as the FFT are continuously evolving to keep up with this trend in hardware. In the meantime, it is also necessary to invest in novel algorithms that are more suitable for computer architectures of the future. The fast multipole method (FMM) was originally developed as a fast algorithm for approximating the N-body interactions that appear in astrophysics, molecular dynamics, and vortex-based fluid dynamics simulations. The FMM possesses a unique combination of being an efficient O(N) algorithm while having an operational intensity that is higher than that of matrix-matrix multiplication. In fact, the FMM can reduce the byte/flop requirement to around 0.01, which means that it will remain compute bound until 2020 even if the current trend in microprocessors continues. Despite these advantages, there have not been any benchmarks of FMM codes on modern architectures such as Xeon Phi, Kepler, and BlueGene/Q. This study aims to provide a comprehensive benchmark of a state-of-the-art FMM code, “exaFMM”, on the latest architectures, in hopes of providing a useful reference for deciding when the FMM will become useful as the computational engine in a given application code. It may also serve as a warning about certain problem size domains where the FMM will exhibit insignificant performance improvements. Such issues depend strongly on the asymptotic constants rather than the asymptotics themselves, and are therefore strongly implementation and hardware dependent. The primary objective of this study is to provide these constants on various computer architectures.
dc.language.iso: en
dc.subject: Fast Multipole Method
dc.subject: FMM
dc.subject: Benchmark
dc.subject: Scalability
dc.subject: load balancing
dc.subject: GPU
dc.title: Performance Benchmarking of Fast Multipole Methods
dc.type: Thesis
dc.contributor.department: Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
thesis.degree.grantor: King Abdullah University of Science and Technology
dc.contributor.committeemember: Bagci, Hakan
dc.contributor.committeemember: Ravasi, Timothy
thesis.degree.discipline: Computer Science
thesis.degree.name: Master of Science
refterms.dateFOA: 2014-06-12T00:00:00Z


Files in this item

Name: Noha Thesis.pdf
Size: 1.530Mb
Format: PDF
Description: Noha Thesis Approval Form
