Numerical Mathematics Group
Research within the Numerical Mathematics Group at KAUST focuses on the design, analysis, and implementation of numerical methods for ordinary and partial differential equations, as well as application of numerical methods to problems in nonlinear wave propagation. Their work spans the numerical method development pipeline from theoretical numerical analysis, to computational experimentation, to software development and high performance computational science, in collaboration with application domain researchers. Particular emphasis is placed on robustness, wide applicability, and efficiency.
Recent Submissions

Twodimensional wave propagation in layered periodic media(SIAM Journal on Applied Mathematics, Society for Industrial & Applied Mathematics (SIAM), 20140916) [Article]We study twodimensional wave propagation in materials whose properties vary periodically in one direction only. High order homogenization is carried out to derive a dispersive effective medium approximation. Onedimensional materials with constant impedance exhibit no effective dispersion. We show that a new kind of effective dispersion may arise in two dimensions, even in materials with constant impedance. This dispersion is a macroscopic effect of microscopic diffraction caused by spatial variation in the sound speed. We analyze this dispersive effect by using highorder homogenization to derive an anisotropic, dispersive effective medium. We generalize to two dimensions a homogenization approach that has been used previously for onedimensional problems. Pseudospectral solutions of the effective medium equations agree to high accuracy with finite volume direct numerical simulations of the variablecoeffi cient equations.

Internal Error Propagation in Explicit RungeKutta Methods(SIAM Journal on Numerical Analysis, Society for Industrial & Applied Mathematics (SIAM), 20140911) [Article]In practical computation with RungeKutta methods, the stage equations are not satisfied exactly, due to roundoff errors, algebraic solver errors, and so forth. We show by example that propagation of such errors within a single step can have catastrophic effects for otherwise practical and wellknown methods. We perform a general analysis of internal error propagation, emphasizing that it depends significantly on how the method is implemented. We show that for a fixed method, essentially any set of internal stability polynomials can be obtained by modifying the implementation details. We provide bounds on the internal error amplification constants for some classes of methods with many stages, including strong stability preserving methods and extrapolation methods. These results are used to prove error bounds in the presence of roundoff or other internal errors.

Error Analysis of Explicit Partitioned Runge–Kutta Schemes for Conservation Laws(Journal of Scientific Computing, Springer Nature, 20140827) [Article]An error analysis is presented for explicit partitioned Runge–Kutta methods and multirate methods applied to conservation laws. The interfaces, across which different methods or time steps are used, lead to order reduction of the schemes. Along with cellbased decompositions, also fluxbased decompositions are studied. In the latter case mass conservation is guaranteed, but it will be seen that the accuracy may deteriorate.

A comparison of highorder explicit Runge–Kutta, extrapolation, and deferred correction methods in serial and parallel(Communications in Applied Mathematics and Computational Science, Mathematical Sciences Publishers, 20140613) [Article]We compare the three main types of highorder onestep initial value solvers: extrapolation, spectral deferred correction, and embedded Runge–Kutta pairs. We consider orders four through twelve, including both serial and parallel implementations. We cast extrapolation and deferred correction methods as fixedorder Runge–Kutta methods, providing a natural framework for the comparison. The stability and accuracy properties of the methods are analyzed by theoretical measures, and these are compared with the results of numerical tests. In serial, the eighthorder pair of Prince and Dormand (DOP8) is most efficient. But other highorder methods can be more efficient than DOP8 when implemented in parallel. This is demonstrated by comparing a parallelized version of the wellknown ODEX code with the (serial) DOP853 code. For an Nbody problem with N = 400, the experimental extrapolation code is as fast as the tuned Runge–Kutta pair at loose tolerances, and is up to two times as fast at tight tolerances.

Propagation of internal errors in explicit Runge–Kutta methods and internal stability of SSP and extrapolation methods(20140411) [Technical Report]In practical computation with RungeKutta methods, the stage equations are not satisfied exactly, due to roundoff errors, algebraic solver errors, and so forth. We show by example that propagation of such errors within a single step can have catastrophic effects for otherwise practical and wellknown methods. We perform a general analysis of internal error propagation, emphasizing that it depends significantly on how the method is implemented. We show that for a fixed method, essentially any set of internal stability polynomials can be obtained by modifying the implementation details. We provide bounds on the internal error amplification constants for some classes of methods with many stages, including strong stability preserving methods and extrapolation methods. These results are used to prove error bounds in the presence of roundoff or other internal errors.

On the absolute stability regions corresponding to partial sums of the exponential function(IMA Journal of Numerical Analysis, Oxford University Press (OUP), 20131203) [Article]Certain numerical methods for initial value problems have as stability function the nth partial sum of the exponential function. We study the stability region, i.e., the set in the complex plane over which the nth partial sum has at most unit modulus. It is known that the asymptotic shape of the part of the stability region in the left halfplane is a semidisk. We quantify this by providing disks that enclose or are enclosed by the stability region or its left halfplane part. The radius of the smallest disk centered at the origin that contains the stability region (or its portion in the left halfplane) is determined for 1 n 20. Bounds on such radii are proved for n 2; these bounds are shown to be optimal in the limit n ! +1. We prove that the stability region and its complement, restricted to the imaginary axis, consist of alternating intervals of length tending to , as n ! 1. Finally, we prove that a semidisk in the left halfplane with vertical boundary being the imaginary axis and centered at the origin is included in the stability region if and only if n 0 mod 4 or n 3 mod 4. The maximal radii of such semidisks are exactly determined for 1 n 20.

Spatially Partitioned Embedded RungeKutta Methods(SIAM Journal on Numerical Analysis, Society for Industrial & Applied Mathematics (SIAM), 20131030) [Article]We study spatially partitioned embedded RungeKutta (SPERK) schemes for partial differential equations (PDEs), in which each of the component schemes is applied over a different part of the spatial domain. Such methods may be convenient for problems in which the smoothness of the solution or the magnitudes of the PDE coefficients vary strongly in space. We focus on embedded partitioned methods as they offer greater efficiency and avoid the order reduction that may occur in nonembedded schemes. We demonstrate that the lack of conservation in partitioned schemes can lead to nonphysical effects and propose conservative additive schemes based on partitioning the fluxes rather than the ordinary differential equations. A variety of SPERK schemes are presented, including an embedded pair suitable for the time evolution of fifthorder weighted nonoscillatory spatial discretizations. Numerical experiments are provided to support the theory.

Optimal StrongStabilityPreserving Runge–Kutta Time Discretizations for Discontinuous Galerkin Methods(Journal of Scientific Computing, Springer Science + Business Media, 20131029) [Article]Discontinuous Galerkin (DG) spatial discretizations are often used in a methodoflines approach with explicit strongstabilitypreserving (SSP) Runge–Kutta (RK) time steppers for the numerical solution of hyperbolic conservation laws. The time steps that are employed in this type of approach must satisfy Courant–Friedrichs–Lewy stability constraints that are dependent on both the region of absolute stability and the SSP coefficient of the RK method. While existing SSPRK methods have been optimized with respect to the latter, it is in fact the former that gives rise to stricter constraints on the time step in the case of RKDG stability. Therefore, in this work, we present the development of new “DGoptimized” SSPRK methods with stability regions that have been specifically designed to maximize the stable time step size for RKDG methods of a given order in one space dimension. These new methods represent the best available RKDG methods in terms of computational efficiency, with significant improvements over methods using existing SSPRK time steppers that have been optimized with respect to SSP coefficients. Second, third, and fourthorder methods with up to eight stages are presented, and their stability properties are verified through application to numerical test cases.

Strong Stability Preserving Explicit RungeKutta Methods of Maximal Effective Order(SIAM Journal on Numerical Analysis, Society for Industrial & Applied Mathematics (SIAM), 20130723) [Article]We apply the concept of effective order to strong stability preserving (SSP) explicit RungeKutta methods. Relative to classical RungeKutta methods, methods with an effective order of accuracy are designed to satisfy a relaxed set of order conditions but yield higher order accuracy when composed with special starting and stopping methods. We show that this allows the construction of fourstage SSP methods with effective order four (such methods cannot have classical order four). However, we also prove that effective order five methodslike classical order five methodsrequire the use of nonpositive weights and so cannot be SSP. By numerical optimization, we construct explicit SSP RungeKutta methods up to effective order four and establish the optimality of many of them. Numerical experiments demonstrate the validity of these methods in practice.

Numerical Simulation of Cylindrical Solitary Waves in Periodic Media(Journal of Scientific Computing, Springer Nature, 20130714) [Article]We study the behavior of nonlinear waves in a twodimensional medium with density and stress relation that vary periodically in space. Efficient approximate Riemann solvers are developed for the corresponding variablecoefficient firstorder hyperbolic system. We present direct numerical simulations of this multiscale problem, focused on the propagation of a single localized perturbation in media with strongly varying impedance. For the conditions studied, we find little evidence of shock formation. Instead, solutions consist primarily of solitary waves. These solitary waves are observed to be stable over long times and to interact in a manner approximately like solitons. The system considered has no dispersive terms; these solitary waves arise due to the material heterogeneity, which leads to strong reflections and effective dispersion.

Optimized Explicit RungeKutta Schemes for the Spectral Difference Method Applied to Wave Propagation Problems(SIAM Journal on Scientific Computing, Society for Industrial & Applied Mathematics (SIAM), 20130410) [Article]Explicit RungeKutta schemes with large stable step sizes are developed for integration of highorder spectral difference spatial discretizations on quadrilateral grids. The new schemes permit an effective time step that is substantially larger than the maximum admissible time step of standard explicit RungeKutta schemes available in the literature. Furthermore, they have a small principal error norm and admit a lowstorage implementation. The advantages of the new schemes are demonstrated through application to the Euler equations and the linearized Euler equations.

HighOrder Wave Propagation Algorithms for Hyperbolic Systems(SIAM Journal on Scientific Computing, Society for Industrial & Applied Mathematics (SIAM), 20130122) [Article]We present a finite volume method that is applicable to hyperbolic PDEs including spatially varying and semilinear nonconservative systems. The spatial discretization, like that of the wellknown Clawpack software, is based on solving Riemann problems and calculating fluctuations (not fluxes). The implementation employs weighted essentially nonoscillatory reconstruction in space and strong stability preserving RungeKutta integration in time. The method can be extended to arbitrarily high order of accuracy and allows a wellbalanced implementation for capturing solutions of balance laws near steady state. This wellbalancing is achieved through the $f$wave Riemann solver and a novel waveslope WENO reconstruction procedure. The wide applicability and advantageous properties of the method are demonstrated through numerical examples, including problems in nonconservative form, problems with spatially varying fluxes, and problems involving nearequilibrium solutions of balance laws.

Optimal stability polynomials for numerical integration of initial value problems(Communications in Applied Mathematics and Computational Science, Mathematical Sciences Publishers, 20130108) [Article]We consider the problem of finding optimally stable polynomial approximations to the exponential for application to onestep integration of initial value ordinary and partial differential equations. The objective is to find the largest stable step size and corresponding method for a given problem when the spectrum of the initial value problem is known. The problem is expressed in terms of a general least deviation feasibility problem. Its solution is obtained by a new fast, accurate, and robust algorithm based on convex optimization techniques. Global convergence of the algorithm is proven in the case that the order of approximation is one and in the case that the spectrum encloses a starlike region. Examples demonstrate the effectiveness of the proposed algorithm even when these conditions are not satisfied.

PyClaw: Accessible, Extensible, Scalable Tools for Wave Propagation Problems(SIAM Journal on Scientific Computing, Society for Industrial & Applied Mathematics (SIAM), 20120815) [Article]Development of scientific software involves tradeoffs between ease of use, generality, and performance. We describe the design of a general hyperbolic PDE solver that can be operated with the convenience of MATLAB yet achieves efficiency near that of handcoded Fortran and scales to the largest supercomputers. This is achieved by using Python for most of the code while employing automatically wrapped Fortran kernels for computationally intensive routines, and using Python bindings to interface with a parallel computing library and other numerical packages. The software described here is PyClaw, a Pythonbased structured grid solver for general systems of hyperbolic PDEs [K. T. Mandli et al., PyClaw Software, Version 1.0, http://numerics.kaust.edu.sa/pyclaw/ (2011)]. PyClaw provides a powerful and intuitive interface to the algorithms of the existing Fortran codes Clawpack and SharpClaw, simplifying code development and use while providing massive parallelism and scalable solvers via the PETSc library. The package is further augmented by use of PyWENO for generation of efficient highorder weighted essentially nonoscillatory reconstruction code. The simplicity, capability, and performance of this approach are demonstrated through application to example problems in shallow water flow, compressible flow, and elasticity.

Optimized loworder explicit RungeKutta schemes for high order spectral difference method(Proceedings of the 11th Finnish Mechanics Days, University of Oulu, 2012) [Conference Paper]Optimal explicit RungeKutta (ERK) schemes with large stable step sizes are developed for methodoflines discretizations based on the spectral difference (SD) spatial discretization on quadrilateral grids. These methods involve many stages and provide the optimal linearly stable time step for a prescribed SD spectrum and the minimum leading truncation error coefficient, while admitting a lowstorage implementation. Using a large number of stages, the new ERK schemes lead to efficiency improvements larger than 60% over standard ERK schemes for 4th and 5thorder spatial discretization.

Shock dynamics in layered periodic media(Communications in Mathematical Sciences, International Press of Boston, 2012) [Article]Solutions of constantcoeffcient nonlinear hyperbolic PDEs generically develop shocks, even if the initial data is smooth. Solutions of hyperbolic PDEs with variable coeffcients can behave very differently. We investigate formation and stability of shock waves in a onedimensional periodic layered medium by a computational study of timereversibility and entropy evolution. We find that periodic layered media tend to inhibit shock formation. For small initial conditions and large impedance variation, no shock formation is detected even after times much greater than the time of shock formation in a homogeneous medium. Furthermore, weak shocks are observed to be dynamically unstable in the sense that they do not lead to significant longterm entropy decay. We propose a characteristic condition for admissibility of shocks in heterogeneous media that generalizes the classical Lax entropy condition and accurately predicts the formation or absence of shocks in these media.

Strong Stability Preserving Twostep Runge–Kutta Methods(SIAM Journal on Numerical Analysis, Society for Industrial & Applied Mathematics (SIAM), 20111222) [Article]We investigate the strong stability preserving (SSP) property of twostep Runge–Kutta (TSRK) methods. We prove that all SSP TSRK methods belong to a particularly simple subclass of TSRK methods, in which stages from the previous step are not used. We derive simple order conditions for this subclass. Whereas explicit SSP Runge–Kutta methods have order at most four, we prove that explicit SSP TSRK methods have order at most eight. We present explicit TSRK methods of up to eighth order that were found by numerical search. These methods have larger SSP coefficients than any known methods of the same order of accuracy and may be implemented in a form with relatively modest storage requirements. The usefulness of the TSRK methods is demonstrated through numerical examples, including integration of very high order weighted essentially nonoscillatory discretizations.

Step Sizes for Strong Stability Preservation with DownwindBiased Operators(SIAM Journal on Numerical Analysis, Society for Industrial & Applied Mathematics (SIAM), 20110804) [Article]Strong stability preserving (SSP) integrators for initial value ODEs preserve temporal monotonicity solution properties in arbitrary norms. All existing SSP methods, including implicit methods, either require small step sizes or achieve only first order accuracy. It is possible to achieve more relaxed step size restrictions in the discretization of hyperbolic PDEs through the use of both upwind and downwindbiased semidiscretizations. We investigate bounds on the maximum SSP step size for methods that include negative coefficients and downwindbiased semidiscretizations. We prove that the downwind SSP coefficient for linear multistep methods of order greater than one is at most equal to two, while the downwind SSP coefficient for explicit Runge–Kutta methods is at most equal to the number of stages of the method. In contrast, the maximal downwind SSP coefficient for second order Runge–Kutta methods is shown to be unbounded. We present a class of such methods with arbitrarily large SSP coefficient and demonstrate that they achieve second order accuracy for large CFL number.

PetClaw: A scalable parallel nonlinear wave propagation solver for Python(Proceedings of the High Performance Computing Symposium 2011, 19th High Performance Computing Symposium (HPC 2011), 2011) [Conference Paper]We present PetClaw, a scalable distributedmemory solver for timedependent nonlinear wave propagation. PetClaw unifies two wellknown scientific computing packages, Clawpack and PETSc, using Python interfaces into both. We rely on Clawpack to provide the infrastructure and kernels for timedependent nonlinear wave propagation. Similarly, we rely on PETSc to manage distributed data arrays and the communication between them.We describe both the implementation and performance of PetClaw as well as our challenges and accomplishments in scaling a Pythonbased code to tens of thousands of cores on the BlueGene/P architecture. The capabilities of PetClaw are demonstrated through application to a novel problem involving elastic waves in a heterogeneous medium. Very finely resolved simulations are used to demonstrate the suppression of shock formation in this system.

Using Python to Construct a Scalable Parallel Nonlinear Wave Solver(Proceedings of the 10th PYTHON IN SCIENCE CONF. (SCIPY 2011), THE 10th PYTHON IN SCIENCE CONF. (SCIPY 2011), 2011) [Conference Paper]Computational scientists seek to provide efficient, easytouse tools and frameworks that enable application scientists within a specific discipline to build and/or apply numerical models with uptodate computing technologies that can be executed on all available computing systems. Although many tools could be useful for groups beyond a specific application, it is often difficult and time consuming to combine existing software, or to adapt it for a more general purpose. Python enables a highlevel approach where a general framework can be supplemented with tools written for different fields and in different languages. This is particularly important when a large number of tools are necessary, as is the case for high performance scientific codes. This motivated our development of PetClaw, a scalable distributedmemory solver for timedependent nonlinear wave propagation, as a casestudy for how Python can be used as a highlevel framework leveraging a multitude of codes, efficient both in the reuse of code and programmer productivity. We present scaling results for computations on up to four racks of Shaheen, an IBM BlueGene/P supercomputer at King Abdullah University of Science and Technology. One particularly important issue that PetClaw has faced is the overhead associated with dynamic loading leading to catastrophic scaling. We use the walla library to solve the issue which does so by supplanting highcost filesystem calls with MPI operations at a low enough level that developers may avoid any changes to their codes.