# A Parallel Butterfly Algorithm

- Handle URI: http://hdl.handle.net/10754/597370
- Title: A Parallel Butterfly Algorithm
- Authors: Poulson, Jack; Demanet, Laurent; Maxwell, Nicholas; Ying, Lexing
- Abstract: The butterfly algorithm is a fast algorithm which approximately evaluates a discrete analogue of the integral transform (Equation Presented.) at large numbers of target points when the kernel, K(x, y), is approximately low-rank when restricted to subdomains satisfying a certain simple geometric condition. In d dimensions with O(N^d) quasi-uniformly distributed source and target points, when each appropriate submatrix of K is approximately rank-r, the running time of the algorithm is at most O(r^2 N^d log N). A parallelization of the butterfly algorithm is introduced which, assuming a message latency of α and per-process inverse bandwidth of β, executes in at most (Equation Presented.) time using p processes. This parallel algorithm was then instantiated in the form of the open-source DistButterfly library for the special case where K(x, y) = exp(iΦ(x, y)), where Φ(x, y) is a black-box, sufficiently smooth, real-valued phase function. Experiments on Blue Gene/Q demonstrate impressive strong-scaling results for important classes of phase functions. Using quasi-uniform sources, hyperbolic Radon transforms and an analogue of a three-dimensional generalized Radon transform were observed to strong-scale from 1 node (16 cores) up to 1024 nodes (16,384 cores) with greater than 90% and 82% efficiency, respectively. © 2014 Society for Industrial and Applied Mathematics.
- Citation: Poulson J, Demanet L, Maxwell N, Ying L (2014) A Parallel Butterfly Algorithm. SIAM Journal on Scientific Computing 36: C49–C65. Available: http://dx.doi.org/10.1137/130921544.
- Publisher: Society for Industrial & Applied Mathematics (SIAM)
- Journal: SIAM Journal on Scientific Computing
- Issue Date: 4-Feb-2014
- DOI: 10.1137/130921544
- Type: Article
- ISSN: 1064-8275; 1095-7197
- Sponsors: This work was partially supported by NSF CAREER grant 0846501 (L.Y.), DOE grant DE-SC0009409 (L.Y.), and KAUST. Furthermore, this research used resources of the Argonne Leadership Computing Facility at Argonne National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under contract DE-AC02-06CH11357.

- Appears in Collections: Publications Acknowledging KAUST Support
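
The low-rank property that the abstract relies on can be checked numerically. The following is a minimal sketch, not code from the DistButterfly library. It assumes the Fourier phase Φ(x, k) = 2πxk as a stand-in for the black-box phase function, and it assumes that the "simple geometric condition" is a bound on the product of the target-box width and the source-box width, which the abstract does not spell out. The helper names `kernel_block`, `numerical_rank`, and `rank_for_widths` are hypothetical.

```python
import numpy as np


def kernel_block(x_lo, x_hi, k_lo, k_hi, samples=128):
    """Sample K(x, k) = exp(i * Phi(x, k)) on a box of targets x and sources k.

    Assumption: Phi(x, k) = 2 * pi * x * k (Fourier kernel), chosen only
    as an illustrative smooth, real-valued phase function.
    """
    t = np.arange(samples) / samples      # uniform grid on [0, 1)
    x = x_lo + (x_hi - x_lo) * t          # target points in [x_lo, x_hi)
    k = k_lo + (k_hi - k_lo) * t          # source points in [k_lo, k_hi)
    return np.exp(2j * np.pi * np.outer(x, k))


def numerical_rank(matrix, tol=1e-8):
    """Count singular values larger than tol times the largest one."""
    s = np.linalg.svd(matrix, compute_uv=False)
    return int(np.count_nonzero(s > tol * s[0]))


def rank_for_widths(x_width, k_width, samples=128):
    """Numerical rank of the kernel on a box with the given side widths."""
    return numerical_rank(kernel_block(0.0, x_width, 0.0, k_width, samples))


if __name__ == "__main__":
    # The product of the two widths controls the numerical rank:
    # a product of 1 stays low-rank, a product of 128 does not.
    for x_width, k_width in [(1 / 16, 16.0), (1 / 4, 16.0), (1.0, 16.0), (1.0, 128.0)]:
        rank = rank_for_widths(x_width, k_width)
        print(f"width product {x_width * k_width:6.1f} -> numerical rank {rank}")
```

With a width product of 1 the block needs only a handful of singular vectors at this tolerance, whereas the last case samples a full 128 × 128 discrete Fourier matrix, which has full rank. Under the stated assumptions, this is the kind of submatrix compressibility that a rank-r butterfly scheme exploits.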

# Full metadata record

| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Poulson, Jack | en |
| dc.contributor.author | Demanet, Laurent | en |
| dc.contributor.author | Maxwell, Nicholas | en |
| dc.contributor.author | Ying, Lexing | en |
| dc.date.accessioned | 2016-02-25T12:31:48Z | en |
| dc.date.available | 2016-02-25T12:31:48Z | en |
| dc.date.issued | 2014-02-04 | en |
| dc.identifier.citation | Poulson J, Demanet L, Maxwell N, Ying L (2014) A Parallel Butterfly Algorithm. SIAM Journal on Scientific Computing 36: C49–C65. Available: http://dx.doi.org/10.1137/130921544. | en |
| dc.identifier.issn | 1064-8275 | en |
| dc.identifier.issn | 1095-7197 | en |
| dc.identifier.doi | 10.1137/130921544 | en |
| dc.identifier.uri | http://hdl.handle.net/10754/597370 | en |
| dc.description.abstract | The butterfly algorithm is a fast algorithm which approximately evaluates a discrete analogue of the integral transform (Equation Presented.) at large numbers of target points when the kernel, K(x, y), is approximately low-rank when restricted to subdomains satisfying a certain simple geometric condition. In d dimensions with O(N^d) quasi-uniformly distributed source and target points, when each appropriate submatrix of K is approximately rank-r, the running time of the algorithm is at most O(r^2 N^d log N). A parallelization of the butterfly algorithm is introduced which, assuming a message latency of α and per-process inverse bandwidth of β, executes in at most (Equation Presented.) time using p processes. This parallel algorithm was then instantiated in the form of the open-source DistButterfly library for the special case where K(x, y) = exp(iΦ(x, y)), where Φ(x, y) is a black-box, sufficiently smooth, real-valued phase function. Experiments on Blue Gene/Q demonstrate impressive strong-scaling results for important classes of phase functions. Using quasi-uniform sources, hyperbolic Radon transforms and an analogue of a three-dimensional generalized Radon transform were observed to strong-scale from 1 node (16 cores) up to 1024 nodes (16,384 cores) with greater than 90% and 82% efficiency, respectively. © 2014 Society for Industrial and Applied Mathematics. | en |
| dc.description.sponsorship | This work was partially supported by NSF CAREER grant 0846501 (L.Y.), DOE grant DE-SC0009409 (L.Y.), and KAUST. Furthermore, this research used resources of the Argonne Leadership Computing Facility at Argonne National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under contract DE-AC02-06CH11357. | en |
| dc.publisher | Society for Industrial & Applied Mathematics (SIAM) | en |
| dc.subject | Blue Gene/Q | en |
| dc.subject | Butterfly algorithm | en |
| dc.subject | Egorov operator | en |
| dc.subject | Parallel | en |
| dc.subject | Radon transform | en |
| dc.title | A Parallel Butterfly Algorithm | en |
| dc.type | Article | en |
| dc.identifier.journal | SIAM Journal on Scientific Computing | en |
| dc.contributor.institution | Stanford University, Palo Alto, United States | en |
| dc.contributor.institution | Massachusetts Institute of Technology, Cambridge, United States | en |
| dc.contributor.institution | University of Houston, Houston, United States | en |
