For more information visit:

Recent Submissions

  • A simple approach to proving the existence, uniqueness, and strong and weak convergence rates for a broad class of McKean--Vlasov equations

    Haji-Ali, Abdul-Lateef; Hoel, Håkon; Tempone, Raul (arXiv, 2021-01-04) [Preprint]
    By employing a system of interacting stochastic particles as an approximation of the McKean--Vlasov equation and utilizing classical stochastic analysis tools, namely It\^o's formula and Kolmogorov--Chentsov continuity theorem, we prove the existence and uniqueness of strong solutions for a broad class of McKean--Vlasov equations. Considering an increasing number of particles in the approximating stochastic particle system, we also prove the $L^p$ strong convergence rate and derive the weak convergence rates using the Kolmogorov backward equation and variations of the stochastic particle system. Our convergence rates were verified by numerical experiments which also indicate that the assumptions made here and in the literature can be relaxed.
  • Sum of Kronecker products representation and its Cholesky factorization for spatial covariance matrices from large grids

    Cao, Jian; Genton, Marc G.; Keyes, David E.; Turkiyyah, George M. (Computational Statistics & Data Analysis, Elsevier BV, 2021-01) [Article]
    The sum of Kronecker products (SKP) representation for spatial covariance matrices from gridded observations and a corresponding adaptive-cross-approximation-based framework for building the Kronecker factors are investigated. The time cost for constructing an -dimensional covariance matrix is and the total memory footprint is , where is the number of Kronecker factors. The memory footprint under the SKP representation is compared with that under the hierarchical representation and found to be one order of magnitude smaller. A Cholesky factorization algorithm under the SKP representation is proposed and shown to factorize a one-million dimensional covariance matrix in under 600 seconds on a standard scientific workstation. With the computed Cholesky factor, simulations of Gaussian random fields in one million dimensions can be achieved at a low cost for a wide range of spatial covariance functions.
  • The Relaxation Limit of Bipolar Fluid Models

    Alves, Nuno Januario; Tzavaras, Athanasios (arXiv, 2020-12-28) [Preprint]
    This work establishes the relaxation limit from a bipolar Euler-Poisson system with friction towards a bipolar drift-diffusion system. A weak-strong formalism is developed and, within this framework, a dissipative weak solution of the bipolar Euler-Poisson system converges in the high-friction regime to a conservative, bounded away from vacuum, strong solution of the bipolar drift-diffusion system. This limiting process is based on a relative entropy identity for the bipolar fluid system.
  • Energy-conserving 3D elastic wave simulation with finite difference discretization on staggered grids with nonconforming interfaces

    Gao, Longfei; Ghattas, Omar; Keyes, David E. (arXiv, 2020-12-27) [Preprint]
    In this work, we describe an approach to stably simulate the 3D isotropic elastic wave propagation using finite difference discretization on staggered grids with nonconforming interfaces. Specifically, we consider simulation domains composed of layers of uniform grids with different grid spacings, separated by planar interfaces. This discretization setting is motivated by the observation that wave speeds of earth media tend to increase with depth due to sedimentation and consolidation processes. We demonstrate that the layer-wise finite difference discretization approach has the potential to significantly reduce the simulation cost, compared to its counterpart that uses holistically uniform grids. Such discretizations are enabled by summation-by-parts finite difference operators, which are standard finite difference operators with special adaptations near boundaries or interfaces, and simultaneous approximation terms, which are penalty terms appended to the discretized system to weakly impose boundary or interface conditions. Combined with specially designed interpolation operators, the discretized system is shown to preserve the energy-conserving property of the continuous elastic wave equation, and a fortiori ensure the stability of the simulation. Numerical examples are presented to corroborate these analytical developments.
  • Three-dimensional Electromagnetic Void Space

    Xu, Changqing; Chu, Hongchen; Luo, Jie; Hang, Zhi Hong; Wu, Ying; Lai, Yun (arXiv, 2020-12-25) [Preprint]
    We report a realization of three-dimensional (3D) electromagnetic void space. Despite occupying a finite volume of space, such a medium is optically equivalent to an infinitesimal point where electromagnetic waves experience no phase accumulation. The 3D void space is realized by constructing all-dielectric 3D photonic crystals such that the effective permittivity and permeability vanish simultaneously, forming a six-fold Dirac-like point with Dirac-like linear dispersions at the center of the Brillouin Zone. We demonstrate, both theoretically and experimentally, that such a 3D void space exhibits unique properties and rich functionalities absent in any other electromagnetic media, such as boundary-control transmission switching and 3D perfect wave-steering mechanisms. Especially, contrary to the photonic "doping" effect in its two-dimensional counterpart, the 3D void space exhibits an amazing property of "impurity-immunity". Our work paves a road towards the realization of 3D void space where electromagnetic waves can be manipulated in unprecedented ways.
  • Existence and uniqueness for a viscoelastic Kelvin-Voigt model with nonconvex stored energy

    Koumatos, Konstantinos; Lattanzio, Corrado; Spirito, Stefano; Tzavaras, Athanasios (arXiv, 2020-12-18) [Preprint]
    We consider nonlinear viscoelastic materials of Kelvin-Voigt type with stored energies satisfying an Andrews-Ball condition, allowing for non convexity in a compact set. Existence of weak solutions with deformation gradients in $H^1$ is established for energies of any superquadratic growth. In two space dimensions, weak solutions notably turn out to be unique in this class. Conservation of energy for weak solutions in two and three dimensions, as well as global regularity for smooth initial data in two dimensions are established under additional mild restrictions on the growth of the stored energy.
  • KAUST Metagenomic Analysis Platform (KMAP), Enabling Access to Massive Analytics of Re-Annotated Metagenomic Data.

    Alam, Intikhab; Kamau, Allan; Gugi, David; Gojobori, Takashi; Duarte, Carlos M.; Bajic, Vladimir B. (Research Square, 2020-12-14) [Preprint]
    Abstract Exponential rise of metagenomics sequencing is delivering massive functional environmental genomics data. However, this also generates a procedural bottleneck for on-going re-analysis as reference databases grow and methods improve, and analyses need be updated for consistency, which require acceess to increasingly demanding bioinformatic and computational resources. Here, we present the KAUST Metagenomic Analysis Platform (KMAP), a new integrated open web-based tool for the comprehensive exploration of shotgun metagenomic data. We illustrate the capacities KMAP provides through the re-assembly of ~27,000 public metagenomic samples captured in ~450 studies sampled across ~77 diverse habitats, resulting in 36 new habitat-specific gene catalogs, all based on full-length (complete) genes. Extensive taxonomic and gene annotations are stored in Gene Information Tables (GITs), a simple tractable data integration format useful for analysis through command line or for database management. KMAP facilitates the exploration and comparison of microbial GITs across different habitats with over 275 million genes.
  • Moving source identification in an uncertain marine flow: Mediterranean Sea application

    Hammoud, Mohamad Abed El Rahman; Lakkis, Issam; Knio, Omar; Hoteit, Ibrahim (Ocean Engineering, Elsevier BV, 2020-12) [Article]
    Identifying marine pollutant sources is essential to assess, contain and minimize their risk. We propose a Lagrangian Particle Tracking algorithm (LPT) to study the transport of passive tracers advected by an uncertain flow field described by an ensemble of realizations of the ocean currents, and to identify the source parameters of the release in backward mode. Starting from a probability map describing the distribution of a pollutant, reverse tracking is used to generate probabilistic inverse maps by integrating it with the ensemble of flow fields backward in time. An objective function based on the probability-weighted distance between the resulting inverse maps and the source trajectory is then minimized to identify the likely source of pollution. We conduct numerical experiments to demonstrate the efficiency of the proposed algorithm in the Mediterranean Sea. Passive tracers are released along the path of a ship and propagated with an ensemble of realistic flow fields to generate a probability map, which is then used for the inverse problem of source identification. Our results suggest that the proposed algorithm captures the release time and source of pollution, successfully pinpointing to the release parameters up to two weeks back in time in certain case studies.
  • Computational Design of Cold Bent Glass Façades

    Gavriil, Konstantinos; Guseinov, Ruslan; PÉREZ, JESÚS; PELLIS, DAVIDE; Henderson, Paul; Rist, Florian; Pottmann, Helmut; Bickel, Bernd (ACM Transactions on Graphics, ACM, 2020-11-27) [Article]
    Cold bent glass is a promising and cost-efficient method for realizing doubly curved glass facades. They are produced by attaching planar glass sheets to curved frames and require keeping the occurring stress within safe limits. However, it is very challenging to navigate the design space of cold bent glass panels due to the fragility of the material, which impedes the form-finding for practically feasible and aesthetically pleasing cold bent glass facades. We propose an interactive, data-driven approach for designing cold bent glass facades that can be seamlessly integrated into a typical architectural design pipeline. Our method allows non-expert users to interactively edit a parametric surface while providing real-time feedback on the deformed shape and maximum stress of cold bent glass panels. Designs are automatically refined to minimize several fairness criteria while maximal stresses are kept within glass limits. We achieve interactive frame rates by using a differentiable Mixture Density Network trained from more than a million simulations. Given a curved boundary, our regression model is capable of handling multistable configurations and accurately predicting the equilibrium shape of the panel and its corresponding maximal stress. We show predictions are highly accurate and validate our results with a physical realization of a cold bent glass surface.
  • Surface-only ferrofluids

    Huang, Libo; Michels, Dominik L. (ACM Transactions on Graphics, Association for Computing Machinery (ACM), 2020-11-26) [Article]
    We devise a novel surface-only approach for simulating the three dimensional free-surface flow of incompressible, inviscid, and linearly magnetizable ferrofluids. A Lagrangian velocity field is stored on a triangle mesh capturing the fluid's surface. The two key problems associated with the dynamic simulation of the fluid's interesting geometry are the magnetization process transitioning the fluid from a non-magnetic into a magnetic material, and the evaluation of magnetic forces. In this regard, our key observation is that for linearly incompressible ferrofluids, their magnetization and application of magnetic forces only require knowledge about the position of the fluids' boundary. Consequently, our approach employs a boundary element method solving the magnetization problem and evaluating the so-called magnetic pressure required for the force evaluation. The magnetic pressure is added to the Dirichlet boundary condition of a surface-only liquids solver carrying out the dynamical simulation. By only considering the fluid's surface in contrast to its whole volume, we end up with an efficient approach enabling more complex and realistic ferrofluids to be explored in the digital domain without compromising efficiency. Our approach allows for the use of physical parameters leading to accurate simulations as demonstrated in qualitative and quantitative evaluations.
  • The equations of polyconvex thermoelasticity

    Galanopoulou, Myrto Maria (2020-11-25) [Dissertation]
    Advisor: Tzavaras, Athanasios
    Committee members: Hoteit, Ibrahim; Markowich, Peter A.; Christoforou, Cleopatra; Dafermos Constantine, M.
    In my Dissertation, I consider the system of thermoelasticity endowed with poly- convex energy. I will present the equations in their mathematical and physical con- text, and I will explain the relevant research in the area and the contributions of my work. First, I embed the equations of polyconvex thermoviscoelasticity into an aug- mented, symmetrizable, hyperbolic system which possesses a convex entropy. Using the relative entropy method in the extended variables, I show convergence from ther- moviscoelasticity with Newtonian viscosity and Fourier heat conduction to smooth solutions of the system of adiabatic thermoelasticity as both parameters tend to zero and convergence from thermoviscoelasticity to smooth solutions of thermoelasticity in the zero-viscosity limit. In addition, I establish a weak-strong uniqueness result for the equations of adiabatic thermoelasticity in the class of entropy weak solutions. Then, I prove a measure-valued versus strong uniqueness result for adiabatic poly- convex thermoelasticity in a suitable class of measure-valued solutions, de ned by means of generalized Young measures that describe both oscillatory and concentra- tion e ects. Instead of working directly with the extended variables, I will look at the parent system in the original variables utilizing the weak stability properties of certain transport-stretching identities, which allow to carry out the calculations by placing minimal regularity assumptions in the energy framework. Next, I construct a variational scheme for isentropic processes of adiabatic polyconvex thermoelasticity. I establish existence of minimizers which converge to a measure-valued solution that dissipates the total energy. Also, I prove that the scheme converges when the limit- ing solution is smooth. Finally, for completeness and for the reader's convenience, I present the well-established theory for local existence of classical solutions and how it applies to the equations at hand.
  • A Multilayer Nonlinear Elimination Preconditioned Inexact Newton Method for Steady-State Incompressible Flow Problems in Three Dimensions

    Luo, Li; Cai, Xiao Chuan; Yan, Zhengzheng; Xu, Lei; Keyes, David E. (SIAM Journal on Scientific Computing, Society for Industrial & Applied Mathematics (SIAM), 2020-11-24) [Article]
    We develop a multilayer nonlinear elimination preconditioned inexact Newton method for a nonlinear algebraic system of equations, and a target application is the three-dimensional steady-state incompressible Navier--Stokes equations at high Reynolds numbers. Nonlinear steadystate problems are often more difficult to solve than time-dependent problems because the Jacobian matrix is less diagonally dominant, and a good initial guess from the previous time step is not available. For such problems, Newton-like methods may suffer from slow convergence or stagnation even with globalization techniques such as line search. In this paper, we introduce a cascadic multilayer nonlinear elimination approach based on feedback from intermediate solutions to improve the convergence of Newton iteration. Numerical experiments show that the proposed algorithm is superior to the classical inexact Newton method and other single layer nonlinear elimination approaches in terms of the robustness and efficiency. Using the proposed nonlinear preconditioner with a highly parallel domain decomposition framework, we demonstrate that steady solutions of the Navier--Stokes equations with Reynolds numbers as large as 7,500 can be obtained for the lid-driven cavity flow problem in three dimensions without the use of any continuation methods.
  • Energy Stability of Explicit Runge--Kutta Methods for Nonautonomous or Nonlinear Problems

    Ranocha, Hendrik; Ketcheson, David I. (SIAM Journal on Numerical Analysis, Society for Industrial & Applied Mathematics (SIAM), 2020-11-24) [Article]
    Many important initial value problems have the property that energy is nonincreasing in time. Energy stable methods, also referred to as strongly stable methods, guarantee the same property discretely. We investigate requirements for conditional energy stability of explicit Runge--Kutta methods for nonlinear or nonautonomous problems. We provide both necessary and sufficient conditions for energy stability over these classes of problems. Examples of conditionally energy stable schemes are constructed, and an example is given in which unconditional energy stability is obtained with an explicit scheme.
  • Efficient Ensemble Data Assimilation and Forecasting of the Red Sea Circulation

    Toye, Habib (2020-11-23) [Dissertation]
    Advisor: Hoteit, Ibrahim
    Committee members: Knio, Omar; Al-Naffouri, Tareq Y.; Iskandarani, Mohamad
    This thesis presents our efforts to build an operational ensemble forecasting system for the Red Sea, based on the Data Research Testbed (DART) package for ensemble data assimilation and the Massachusetts Institute of Technology general circulation ocean model (MITgcm) for forecasting. The Red Sea DART-MITgcm system efficiently integrates all the ensemble members in parallel, while accommodating different ensemble assimilation schemes. The promising ensemble adjustment Kalman filter (EAKF), designed to avoid manipulating the gigantic covariance matrices involved in the ensemble assimilation process, possesses relevant features required for an operational setting. The need for more efficient filtering schemes to implement a high resolution assimilation system for the Red Sea and to handle large ensembles for proper description of the assimilation statistics prompted the design and implementation of new filtering approaches. Making the most of our world-class supercomputer, Shaheen, we first pushed the system limits by designing a fault-tolerant scheduler extension that allowed us to test for the first time a fully realistic and high resolution 1000 ensemble members ocean ensemble assimilation system. In an operational setting, however, timely forecasts are of essence, and running large ensembles, albeit preferable and desirable, is not sustainable. New schemes aiming at lowering the computational burden while preserving reliable assimilation results, were developed. The ensemble Optimal Interpolation (EnOI) algorithm requires only a single model integration in the forecast step, using a static ensemble of preselected members for assimilation, and is therefore computationally significantly cheaper than the EAKF. To account for the strong seasonal variability of the Red Sea circulation, an EnOI with seasonally-varying ensembles (SEnOI) was first implemented. To better handle intra-seasonal variabilities and enhance the developed seasonal EnOI system, an automatic procedure to adaptively select the ensemble members through the assimilation cycles was then introduced. Finally, an efficient Hybrid scheme combining the dynamical flow-dependent covariance of the EAKF and a static covariance of the EnOI was proposed and successfully tested in the Red Sea. The developed Hybrid ensemble data assimilation system will form the basis of the first operational Red Sea forecasting system that is currently being implemented to support Saudi Aramco operations in this basin.
  • NodePy: A package for the analysis of numerical ODE solvers

    Ketcheson, David I.; Ranocha, Hendrik; Parsani, Matteo; Waheed, Umair bin; Hadjimichael, Yiannis (Journal of Open Source Software, The Open Journal, 2020-11-17) [Article]
    Ordinary differential equations (ODEs) are used to model a vast range of physical and other phenomena. They also arise in the discretization of partial differential equations. In most cases, solutions of differential equations must be approximated by numerical methods. The study of the properties of numerical methods for ODEs comprises an important and large body of knowledge. NodePy (available from, with documentation at is a software package for designing and studying the properties of numerical ODE solvers. For the most important classes of methods, NodePy can automatically assess their stability, accuracy, and many other properties. NodePy has also been used as a catalog of coefficients for time integration methods in PDE solver codes.
  • PATHcre8: A Tool That Facilitates the Searching for Heterologous Biosynthetic Routes

    Motwalli, Olaa Amin; Uludag, Mahmut; Mijakovic, Ivan; Alazmi, Meshari; Bajic, Vladimir B.; Gojobori, Takashi; Gao, Xin; Essack, Magbubah (ACS Synthetic Biology, American Chemical Society (ACS), 2020-11-16) [Article]
    Developing computational tools that can facilitate the rational design of cell factories producing desired products at increased yields is challenging, as the tool needs to take into account that the preferred host organism usually has compounds that are consumed by competing reactions that reduce the yield of the desired product. On the other hand, the preferred host organisms may not have the native metabolic reactions needed to produce the compound of interest; thus, the computational tool needs to identify the metabolic reactions that will most efficiently produce the desired product. In this regard, we developed the generic tool PATHcre8 to facilitate an optimized search for heterologous biosynthetic pathway routes. PATHcre8 finds and ranks biosynthesis routes in a large number of organisms, including Cyanobacteria. The tool ranks the pathways based on feature scores that reflect reaction thermodynamics, the potentially toxic products in the pathway (compound toxicity), intermediate products in the pathway consumed by competing reactions (product consumption), and host-specific information such as enzyme copy number. A comparison with several other similar tools shows that PATHcre8 is more efficient in ranking functional pathways. To illustrate the effectiveness of PATHcre8, we further provide case studies focused on isoprene production and the biodegradation of cocaine. PATHcre8 is free for academic and nonprofit users and can be accessed at
  • Parity-Time Symmetry and Exceptional Points for Flexural-Gravity Waves in Buoyant Thin-Plates

    Farhat, Mohamed; Guenneau, Sebastien; Chen, Pai-Yen; Wu, Ying (Crystals, MDPI AG, 2020-11-16) [Article]
    We derive and apply a transfer matrix method (M-matrix) coupling liquid surface waves and flexural-gravity waves in buoyant thin elastic plates. We analyze the scattering matrix (S-matrix) formalism for such waves propagating within a Fabry-Perot like system, which are solutions of a sixth order partial differential equation (PDE) supplied with adequate boundary conditions. We develop a parity-time (PT)-symmetry theory and its applications to thin elastic floating plates. The sixth order PDE governing the propagation of these waves leads to six by six M and S matrices, and results in specific physical properties of the PT-symmetric elastic plate systems. We show the effect of geometry and gain/loss on the asymmetric propagation of flexural-gravity waves, as well as a Fano-like line-shape of the reflection signature. Importantly, we show the possibility of obtaining coherent perfect absorber-laser (CPAL) using simple thin structures.
  • Local SGD: Unified Theory and New Efficient Methods

    Gorbunov, Eduard; Hanzely, Filip; Richtarik, Peter (arXiv, 2020-11-03) [Preprint]
    We present a unified framework for analyzing local SGD methods in the convex and strongly convex regimes for distributed/federated training of supervised machine learning models. We recover several known methods as a special case of our general framework, including Local-SGD/FedAvg, SCAFFOLD, and several variants of SGD not originally designed for federated learning. Our framework covers both the identical and heterogeneous data settings, supports both random and deterministic number of local steps, and can work with a wide array of local stochastic gradient estimators, including shifted estimators which are able to adjust the fixed points of local iterations for faster convergence. As an application of our framework, we develop multiple novel FL optimizers which are superior to existing methods. In particular, we develop the first linearly converging local SGD method which does not require any data homogeneity or other strong assumptions.
  • Multi-Iteration Stochastic Optimizers

    Carlon, Andre; Espath, Luis; Lopez, Rafael; Tempone, Raul (arXiv, 2020-11-03) [Preprint]
    We here introduce Multi-Iteration Stochastic Optimizers, a novel class of first-order stochastic optimizers where the coefficient of variation of the mean gradient approximation, its relative statistical error, is estimated and controlled using successive control variates along the path of iterations. By exploiting the correlation between iterates, control variates may reduce the estimator's variance so that an accurate estimation of the mean gradient becomes computationally affordable. We name the estimator of the mean gradient Multi-Iteration stochastiC Estimator-MICE. In principle, MICE can be flexibly coupled with any first-order stochastic optimizer given its non-intrusive nature. Optimally, our generic algorithm decides whether to drop a particular iteration out of the control variates hierarchy, restart it, or clip it somewhere in the hierarchy, discarding previous iterations. We present a simplified study of the convergence and complexity of Multi-Iteration Stochastic Optimizers for strongly-convex and L-smooth functions. Motivated by this analysis, we provide a generic step-size choice for Stochastic Gradient Descent (SGD) with MICE, which, combined with the above characteristics, yields an efficient and robust stochastic optimizer. To assess the efficiency of MICE, we present several examples in which we use SGD-MICE and Adam-MICE. We include one example based on a stochastic adaptation of the Rosenbrock function and logistic regression training for various datasets. When compared to SGD, SAG, SAGA, SRVG, and SARAH methods, the Multi-Iteration Stochastic Optimizers reduced, without the need to tune parameters for each example, the gradient sampling cost in all cases tested, also lowering the total runtime in some cases.
  • Leveraging PaRSEC Runtime Support to Tackle Challenging 3D Data-Sparse Matrix Problems

    Cao, Qinglei; Pei, Yu; Akbudak, Kadir; Bosilca, George; Ltaief, Hatem; Keyes, David E.; Dongarra, Jack (IEEE, 2020-11-01) [Conference Paper]
    The task-based programming model associated with dynamic runtime systems has gained popularity for challenging problems because of workload imbalance, heterogeneous resources, or extreme concurrency. During the last decade, lowrank matrix approximations, where the main idea consists of exploiting data sparsity typically by compressing off-diagonal tiles up to an application-specific accuracy threshold, have been adopted to address the curse of dimensionality at extreme scale. In this paper, we create a bridge between the runtime and the linear algebra by communicating knowledge of the data sparsity to the runtime. We design and implement this synergistic approach with high user productivity in mind, in the context of the PaRSEC runtime system and the HiCMA numerical library. This requires to extend PaRSEC with new features to integrate rank information into the dataflow so that proper decisions can be taken at runtime. We focus on the tile low-rank (TLR) Cholesky factorization for solving 3D data-sparse covariance matrix problems arising in environmental applications. In particular, we employ the 3D exponential model of Matern matrix kernel, which exhibits challenging nonuniform ´high ranks in off-diagonal tiles. We first provide a dynamic data structure management driven by a performance model to reduce extra floating-point operations. Next, we optimize the memory footprint of the application by relying on a dynamic memory allocator, and supported by a rank-aware data distribution to cope with the workload imbalance. Finally, we expose further parallelism using kernel recursive formulations to shorten the critical path. Our resulting high-performance implementation outperforms existing data-sparse TLR Cholesky factorization by up to 7-fold on a large-scale distributed-memory system, while minimizing the memory footprint up to a 44-fold factor. This multidisciplinary work highlights the need to empower runtime systems beyond their original duty of task scheduling for servicing next-generation low-rank matrix algebra libraries.

View more