Applied Mathematics and Computational Science Program
For more information visit: https://cemse.kaust.edu.sa/amcs
Recent Submissions

A simple approach to proving the existence, uniqueness, and strong and weak convergence rates for a broad class of McKeanVlasov equations(arXiv, 20210104) [Preprint]By employing a system of interacting stochastic particles as an approximation of the McKeanVlasov equation and utilizing classical stochastic analysis tools, namely It\^o's formula and KolmogorovChentsov continuity theorem, we prove the existence and uniqueness of strong solutions for a broad class of McKeanVlasov equations. Considering an increasing number of particles in the approximating stochastic particle system, we also prove the $L^p$ strong convergence rate and derive the weak convergence rates using the Kolmogorov backward equation and variations of the stochastic particle system. Our convergence rates were verified by numerical experiments which also indicate that the assumptions made here and in the literature can be relaxed.

Sum of Kronecker products representation and its Cholesky factorization for spatial covariance matrices from large grids(Computational Statistics & Data Analysis, Elsevier BV, 202101) [Article]The sum of Kronecker products (SKP) representation for spatial covariance matrices from gridded observations and a corresponding adaptivecrossapproximationbased framework for building the Kronecker factors are investigated. The time cost for constructing an dimensional covariance matrix is and the total memory footprint is , where is the number of Kronecker factors. The memory footprint under the SKP representation is compared with that under the hierarchical representation and found to be one order of magnitude smaller. A Cholesky factorization algorithm under the SKP representation is proposed and shown to factorize a onemillion dimensional covariance matrix in under 600 seconds on a standard scientific workstation. With the computed Cholesky factor, simulations of Gaussian random fields in one million dimensions can be achieved at a low cost for a wide range of spatial covariance functions.

The Relaxation Limit of Bipolar Fluid Models(arXiv, 20201228) [Preprint]This work establishes the relaxation limit from a bipolar EulerPoisson system with friction towards a bipolar driftdiffusion system. A weakstrong formalism is developed and, within this framework, a dissipative weak solution of the bipolar EulerPoisson system converges in the highfriction regime to a conservative, bounded away from vacuum, strong solution of the bipolar driftdiffusion system. This limiting process is based on a relative entropy identity for the bipolar fluid system.

Energyconserving 3D elastic wave simulation with finite difference discretization on staggered grids with nonconforming interfaces(arXiv, 20201227) [Preprint]In this work, we describe an approach to stably simulate the 3D isotropic elastic wave propagation using finite difference discretization on staggered grids with nonconforming interfaces. Specifically, we consider simulation domains composed of layers of uniform grids with different grid spacings, separated by planar interfaces. This discretization setting is motivated by the observation that wave speeds of earth media tend to increase with depth due to sedimentation and consolidation processes. We demonstrate that the layerwise finite difference discretization approach has the potential to significantly reduce the simulation cost, compared to its counterpart that uses holistically uniform grids. Such discretizations are enabled by summationbyparts finite difference operators, which are standard finite difference operators with special adaptations near boundaries or interfaces, and simultaneous approximation terms, which are penalty terms appended to the discretized system to weakly impose boundary or interface conditions. Combined with specially designed interpolation operators, the discretized system is shown to preserve the energyconserving property of the continuous elastic wave equation, and a fortiori ensure the stability of the simulation. Numerical examples are presented to corroborate these analytical developments.

Threedimensional Electromagnetic Void Space(arXiv, 20201225) [Preprint]We report a realization of threedimensional (3D) electromagnetic void space. Despite occupying a finite volume of space, such a medium is optically equivalent to an infinitesimal point where electromagnetic waves experience no phase accumulation. The 3D void space is realized by constructing alldielectric 3D photonic crystals such that the effective permittivity and permeability vanish simultaneously, forming a sixfold Diraclike point with Diraclike linear dispersions at the center of the Brillouin Zone. We demonstrate, both theoretically and experimentally, that such a 3D void space exhibits unique properties and rich functionalities absent in any other electromagnetic media, such as boundarycontrol transmission switching and 3D perfect wavesteering mechanisms. Especially, contrary to the photonic "doping" effect in its twodimensional counterpart, the 3D void space exhibits an amazing property of "impurityimmunity". Our work paves a road towards the realization of 3D void space where electromagnetic waves can be manipulated in unprecedented ways.

Existence and uniqueness for a viscoelastic KelvinVoigt model with nonconvex stored energy(arXiv, 20201218) [Preprint]We consider nonlinear viscoelastic materials of KelvinVoigt type with stored energies satisfying an AndrewsBall condition, allowing for non convexity in a compact set. Existence of weak solutions with deformation gradients in $H^1$ is established for energies of any superquadratic growth. In two space dimensions, weak solutions notably turn out to be unique in this class. Conservation of energy for weak solutions in two and three dimensions, as well as global regularity for smooth initial data in two dimensions are established under additional mild restrictions on the growth of the stored energy.

KAUST Metagenomic Analysis Platform (KMAP), Enabling Access to Massive Analytics of ReAnnotated Metagenomic Data.(Research Square, 20201214) [Preprint]Abstract Exponential rise of metagenomics sequencing is delivering massive functional environmental genomics data. However, this also generates a procedural bottleneck for ongoing reanalysis as reference databases grow and methods improve, and analyses need be updated for consistency, which require acceess to increasingly demanding bioinformatic and computational resources. Here, we present the KAUST Metagenomic Analysis Platform (KMAP), a new integrated open webbased tool for the comprehensive exploration of shotgun metagenomic data. We illustrate the capacities KMAP provides through the reassembly of ~27,000 public metagenomic samples captured in ~450 studies sampled across ~77 diverse habitats, resulting in 36 new habitatspecific gene catalogs, all based on fulllength (complete) genes. Extensive taxonomic and gene annotations are stored in Gene Information Tables (GITs), a simple tractable data integration format useful for analysis through command line or for database management. KMAP facilitates the exploration and comparison of microbial GITs across different habitats with over 275 million genes.

Moving source identification in an uncertain marine flow: Mediterranean Sea application(Ocean Engineering, Elsevier BV, 202012) [Article]Identifying marine pollutant sources is essential to assess, contain and minimize their risk. We propose a Lagrangian Particle Tracking algorithm (LPT) to study the transport of passive tracers advected by an uncertain flow field described by an ensemble of realizations of the ocean currents, and to identify the source parameters of the release in backward mode. Starting from a probability map describing the distribution of a pollutant, reverse tracking is used to generate probabilistic inverse maps by integrating it with the ensemble of flow fields backward in time. An objective function based on the probabilityweighted distance between the resulting inverse maps and the source trajectory is then minimized to identify the likely source of pollution. We conduct numerical experiments to demonstrate the efficiency of the proposed algorithm in the Mediterranean Sea. Passive tracers are released along the path of a ship and propagated with an ensemble of realistic flow fields to generate a probability map, which is then used for the inverse problem of source identification. Our results suggest that the proposed algorithm captures the release time and source of pollution, successfully pinpointing to the release parameters up to two weeks back in time in certain case studies.

Computational Design of Cold Bent Glass Façades(ACM Transactions on Graphics, ACM, 20201127) [Article]Cold bent glass is a promising and costefficient method for realizing doubly curved glass facades. They are produced by attaching planar glass sheets to curved frames and require keeping the occurring stress within safe limits. However, it is very challenging to navigate the design space of cold bent glass panels due to the fragility of the material, which impedes the formfinding for practically feasible and aesthetically pleasing cold bent glass facades. We propose an interactive, datadriven approach for designing cold bent glass facades that can be seamlessly integrated into a typical architectural design pipeline. Our method allows nonexpert users to interactively edit a parametric surface while providing realtime feedback on the deformed shape and maximum stress of cold bent glass panels. Designs are automatically refined to minimize several fairness criteria while maximal stresses are kept within glass limits. We achieve interactive frame rates by using a differentiable Mixture Density Network trained from more than a million simulations. Given a curved boundary, our regression model is capable of handling multistable configurations and accurately predicting the equilibrium shape of the panel and its corresponding maximal stress. We show predictions are highly accurate and validate our results with a physical realization of a cold bent glass surface.

Surfaceonly ferrofluids(ACM Transactions on Graphics, Association for Computing Machinery (ACM), 20201126) [Article]We devise a novel surfaceonly approach for simulating the three dimensional freesurface flow of incompressible, inviscid, and linearly magnetizable ferrofluids. A Lagrangian velocity field is stored on a triangle mesh capturing the fluid's surface. The two key problems associated with the dynamic simulation of the fluid's interesting geometry are the magnetization process transitioning the fluid from a nonmagnetic into a magnetic material, and the evaluation of magnetic forces. In this regard, our key observation is that for linearly incompressible ferrofluids, their magnetization and application of magnetic forces only require knowledge about the position of the fluids' boundary. Consequently, our approach employs a boundary element method solving the magnetization problem and evaluating the socalled magnetic pressure required for the force evaluation. The magnetic pressure is added to the Dirichlet boundary condition of a surfaceonly liquids solver carrying out the dynamical simulation. By only considering the fluid's surface in contrast to its whole volume, we end up with an efficient approach enabling more complex and realistic ferrofluids to be explored in the digital domain without compromising efficiency. Our approach allows for the use of physical parameters leading to accurate simulations as demonstrated in qualitative and quantitative evaluations.

The equations of polyconvex thermoelasticity(20201125) [Dissertation]
Advisor: Tzavaras, Athanasios
Committee members: Hoteit, Ibrahim; Markowich, Peter A.; Christoforou, Cleopatra; Dafermos Constantine, M.In my Dissertation, I consider the system of thermoelasticity endowed with poly convex energy. I will present the equations in their mathematical and physical con text, and I will explain the relevant research in the area and the contributions of my work. First, I embed the equations of polyconvex thermoviscoelasticity into an aug mented, symmetrizable, hyperbolic system which possesses a convex entropy. Using the relative entropy method in the extended variables, I show convergence from ther moviscoelasticity with Newtonian viscosity and Fourier heat conduction to smooth solutions of the system of adiabatic thermoelasticity as both parameters tend to zero and convergence from thermoviscoelasticity to smooth solutions of thermoelasticity in the zeroviscosity limit. In addition, I establish a weakstrong uniqueness result for the equations of adiabatic thermoelasticity in the class of entropy weak solutions. Then, I prove a measurevalued versus strong uniqueness result for adiabatic poly convex thermoelasticity in a suitable class of measurevalued solutions, de ned by means of generalized Young measures that describe both oscillatory and concentra tion e ects. Instead of working directly with the extended variables, I will look at the parent system in the original variables utilizing the weak stability properties of certain transportstretching identities, which allow to carry out the calculations by placing minimal regularity assumptions in the energy framework. Next, I construct a variational scheme for isentropic processes of adiabatic polyconvex thermoelasticity. I establish existence of minimizers which converge to a measurevalued solution that dissipates the total energy. Also, I prove that the scheme converges when the limit ing solution is smooth. Finally, for completeness and for the reader's convenience, I present the wellestablished theory for local existence of classical solutions and how it applies to the equations at hand. 
A Multilayer Nonlinear Elimination Preconditioned Inexact Newton Method for SteadyState Incompressible Flow Problems in Three Dimensions(SIAM Journal on Scientific Computing, Society for Industrial & Applied Mathematics (SIAM), 20201124) [Article]We develop a multilayer nonlinear elimination preconditioned inexact Newton method for a nonlinear algebraic system of equations, and a target application is the threedimensional steadystate incompressible NavierStokes equations at high Reynolds numbers. Nonlinear steadystate problems are often more difficult to solve than timedependent problems because the Jacobian matrix is less diagonally dominant, and a good initial guess from the previous time step is not available. For such problems, Newtonlike methods may suffer from slow convergence or stagnation even with globalization techniques such as line search. In this paper, we introduce a cascadic multilayer nonlinear elimination approach based on feedback from intermediate solutions to improve the convergence of Newton iteration. Numerical experiments show that the proposed algorithm is superior to the classical inexact Newton method and other single layer nonlinear elimination approaches in terms of the robustness and efficiency. Using the proposed nonlinear preconditioner with a highly parallel domain decomposition framework, we demonstrate that steady solutions of the NavierStokes equations with Reynolds numbers as large as 7,500 can be obtained for the liddriven cavity flow problem in three dimensions without the use of any continuation methods.

Energy Stability of Explicit RungeKutta Methods for Nonautonomous or Nonlinear Problems(SIAM Journal on Numerical Analysis, Society for Industrial & Applied Mathematics (SIAM), 20201124) [Article]Many important initial value problems have the property that energy is nonincreasing in time. Energy stable methods, also referred to as strongly stable methods, guarantee the same property discretely. We investigate requirements for conditional energy stability of explicit RungeKutta methods for nonlinear or nonautonomous problems. We provide both necessary and sufficient conditions for energy stability over these classes of problems. Examples of conditionally energy stable schemes are constructed, and an example is given in which unconditional energy stability is obtained with an explicit scheme.

Efficient Ensemble Data Assimilation and Forecasting of the Red Sea Circulation(20201123) [Dissertation]
Advisor: Hoteit, Ibrahim
Committee members: Knio, Omar; AlNaffouri, Tareq Y.; Iskandarani, MohamadThis thesis presents our efforts to build an operational ensemble forecasting system for the Red Sea, based on the Data Research Testbed (DART) package for ensemble data assimilation and the Massachusetts Institute of Technology general circulation ocean model (MITgcm) for forecasting. The Red Sea DARTMITgcm system efficiently integrates all the ensemble members in parallel, while accommodating different ensemble assimilation schemes. The promising ensemble adjustment Kalman filter (EAKF), designed to avoid manipulating the gigantic covariance matrices involved in the ensemble assimilation process, possesses relevant features required for an operational setting. The need for more efficient filtering schemes to implement a high resolution assimilation system for the Red Sea and to handle large ensembles for proper description of the assimilation statistics prompted the design and implementation of new filtering approaches. Making the most of our worldclass supercomputer, Shaheen, we first pushed the system limits by designing a faulttolerant scheduler extension that allowed us to test for the first time a fully realistic and high resolution 1000 ensemble members ocean ensemble assimilation system. In an operational setting, however, timely forecasts are of essence, and running large ensembles, albeit preferable and desirable, is not sustainable. New schemes aiming at lowering the computational burden while preserving reliable assimilation results, were developed. The ensemble Optimal Interpolation (EnOI) algorithm requires only a single model integration in the forecast step, using a static ensemble of preselected members for assimilation, and is therefore computationally significantly cheaper than the EAKF. To account for the strong seasonal variability of the Red Sea circulation, an EnOI with seasonallyvarying ensembles (SEnOI) was first implemented. To better handle intraseasonal variabilities and enhance the developed seasonal EnOI system, an automatic procedure to adaptively select the ensemble members through the assimilation cycles was then introduced. Finally, an efficient Hybrid scheme combining the dynamical flowdependent covariance of the EAKF and a static covariance of the EnOI was proposed and successfully tested in the Red Sea. The developed Hybrid ensemble data assimilation system will form the basis of the first operational Red Sea forecasting system that is currently being implemented to support Saudi Aramco operations in this basin. 
NodePy: A package for the analysis of numerical ODE solvers(Journal of Open Source Software, The Open Journal, 20201117) [Article]Ordinary differential equations (ODEs) are used to model a vast range of physical and other phenomena. They also arise in the discretization of partial differential equations. In most cases, solutions of differential equations must be approximated by numerical methods. The study of the properties of numerical methods for ODEs comprises an important and large body of knowledge. NodePy (available from https://github.com/ketch/nodepy, with documentation at https://nodepy.readthedocs.io/en/latest/) is a software package for designing and studying the properties of numerical ODE solvers. For the most important classes of methods, NodePy can automatically assess their stability, accuracy, and many other properties. NodePy has also been used as a catalog of coefficients for time integration methods in PDE solver codes.

PATHcre8: A Tool That Facilitates the Searching for Heterologous Biosynthetic Routes(ACS Synthetic Biology, American Chemical Society (ACS), 20201116) [Article]Developing computational tools that can facilitate the rational design of cell factories producing desired products at increased yields is challenging, as the tool needs to take into account that the preferred host organism usually has compounds that are consumed by competing reactions that reduce the yield of the desired product. On the other hand, the preferred host organisms may not have the native metabolic reactions needed to produce the compound of interest; thus, the computational tool needs to identify the metabolic reactions that will most efficiently produce the desired product. In this regard, we developed the generic tool PATHcre8 to facilitate an optimized search for heterologous biosynthetic pathway routes. PATHcre8 finds and ranks biosynthesis routes in a large number of organisms, including Cyanobacteria. The tool ranks the pathways based on feature scores that reflect reaction thermodynamics, the potentially toxic products in the pathway (compound toxicity), intermediate products in the pathway consumed by competing reactions (product consumption), and hostspecific information such as enzyme copy number. A comparison with several other similar tools shows that PATHcre8 is more efficient in ranking functional pathways. To illustrate the effectiveness of PATHcre8, we further provide case studies focused on isoprene production and the biodegradation of cocaine. PATHcre8 is free for academic and nonprofit users and can be accessed at https://www.cbrc.kaust.edu.sa/pathcre8/.

ParityTime Symmetry and Exceptional Points for FlexuralGravity Waves in Buoyant ThinPlates(Crystals, MDPI AG, 20201116) [Article]We derive and apply a transfer matrix method (Mmatrix) coupling liquid surface waves and flexuralgravity waves in buoyant thin elastic plates. We analyze the scattering matrix (Smatrix) formalism for such waves propagating within a FabryPerot like system, which are solutions of a sixth order partial differential equation (PDE) supplied with adequate boundary conditions. We develop a paritytime (PT)symmetry theory and its applications to thin elastic floating plates. The sixth order PDE governing the propagation of these waves leads to six by six M and S matrices, and results in specific physical properties of the PTsymmetric elastic plate systems. We show the effect of geometry and gain/loss on the asymmetric propagation of flexuralgravity waves, as well as a Fanolike lineshape of the reflection signature. Importantly, we show the possibility of obtaining coherent perfect absorberlaser (CPAL) using simple thin structures.

Local SGD: Unified Theory and New Efficient Methods(arXiv, 20201103) [Preprint]We present a unified framework for analyzing local SGD methods in the convex and strongly convex regimes for distributed/federated training of supervised machine learning models. We recover several known methods as a special case of our general framework, including LocalSGD/FedAvg, SCAFFOLD, and several variants of SGD not originally designed for federated learning. Our framework covers both the identical and heterogeneous data settings, supports both random and deterministic number of local steps, and can work with a wide array of local stochastic gradient estimators, including shifted estimators which are able to adjust the fixed points of local iterations for faster convergence. As an application of our framework, we develop multiple novel FL optimizers which are superior to existing methods. In particular, we develop the first linearly converging local SGD method which does not require any data homogeneity or other strong assumptions.

MultiIteration Stochastic Optimizers(arXiv, 20201103) [Preprint]We here introduce MultiIteration Stochastic Optimizers, a novel class of firstorder stochastic optimizers where the coefficient of variation of the mean gradient approximation, its relative statistical error, is estimated and controlled using successive control variates along the path of iterations. By exploiting the correlation between iterates, control variates may reduce the estimator's variance so that an accurate estimation of the mean gradient becomes computationally affordable. We name the estimator of the mean gradient MultiIteration stochastiC EstimatorMICE. In principle, MICE can be flexibly coupled with any firstorder stochastic optimizer given its nonintrusive nature. Optimally, our generic algorithm decides whether to drop a particular iteration out of the control variates hierarchy, restart it, or clip it somewhere in the hierarchy, discarding previous iterations. We present a simplified study of the convergence and complexity of MultiIteration Stochastic Optimizers for stronglyconvex and Lsmooth functions. Motivated by this analysis, we provide a generic stepsize choice for Stochastic Gradient Descent (SGD) with MICE, which, combined with the above characteristics, yields an efficient and robust stochastic optimizer. To assess the efficiency of MICE, we present several examples in which we use SGDMICE and AdamMICE. We include one example based on a stochastic adaptation of the Rosenbrock function and logistic regression training for various datasets. When compared to SGD, SAG, SAGA, SRVG, and SARAH methods, the MultiIteration Stochastic Optimizers reduced, without the need to tune parameters for each example, the gradient sampling cost in all cases tested, also lowering the total runtime in some cases.

Leveraging PaRSEC Runtime Support to Tackle Challenging 3D DataSparse Matrix Problems(IEEE, 20201101) [Conference Paper]The taskbased programming model associated with dynamic runtime systems has gained popularity for challenging problems because of workload imbalance, heterogeneous resources, or extreme concurrency. During the last decade, lowrank matrix approximations, where the main idea consists of exploiting data sparsity typically by compressing offdiagonal tiles up to an applicationspecific accuracy threshold, have been adopted to address the curse of dimensionality at extreme scale. In this paper, we create a bridge between the runtime and the linear algebra by communicating knowledge of the data sparsity to the runtime. We design and implement this synergistic approach with high user productivity in mind, in the context of the PaRSEC runtime system and the HiCMA numerical library. This requires to extend PaRSEC with new features to integrate rank information into the dataflow so that proper decisions can be taken at runtime. We focus on the tile lowrank (TLR) Cholesky factorization for solving 3D datasparse covariance matrix problems arising in environmental applications. In particular, we employ the 3D exponential model of Matern matrix kernel, which exhibits challenging nonuniform ´high ranks in offdiagonal tiles. We first provide a dynamic data structure management driven by a performance model to reduce extra floatingpoint operations. Next, we optimize the memory footprint of the application by relying on a dynamic memory allocator, and supported by a rankaware data distribution to cope with the workload imbalance. Finally, we expose further parallelism using kernel recursive formulations to shorten the critical path. Our resulting highperformance implementation outperforms existing datasparse TLR Cholesky factorization by up to 7fold on a largescale distributedmemory system, while minimizing the memory footprint up to a 44fold factor. This multidisciplinary work highlights the need to empower runtime systems beyond their original duty of task scheduling for servicing nextgeneration lowrank matrix algebra libraries.