Applied Mathematics and Computational Science Program
For more information visit: https://cemse.kaust.edu.sa/amcs
Recent Submissions

Random Reshuffling with Variance Reduction: New Analysis and Better Rates (arXiv, 2021-04-19) [Preprint] Virtually all state-of-the-art methods for training supervised machine learning models are variants of SGD enhanced with a number of additional tricks, such as mini-batching, momentum, and adaptive step sizes. One of the tricks that works so well in practice that it is used as the default in virtually all widely used machine learning software is random reshuffling (RR). However, the practical benefits of RR have until very recently eluded satisfactory theoretical explanation. Motivated by recent developments due to Mishchenko, Khaled and Richtárik (2020), in this work we provide the first analysis of SVRG under random reshuffling (RR-SVRG) for general finite-sum problems. First, we show that RR-SVRG converges linearly with the rate $\mathcal{O}(\kappa^{3/2})$ in the strongly convex case, which can be improved further to $\mathcal{O}(\kappa)$ in the big data regime (when $n > \mathcal{O}(\kappa)$), where $\kappa$ is the condition number. This improves upon the previous best rate $\mathcal{O}(\kappa^2)$ known for a variance-reduced RR method in the strongly convex case, due to Ying, Yuan and Sayed (2020). Second, we obtain the first sublinear rate for general convex problems. Third, we establish similar fast rates for Cyclic-SVRG and Shuffle-Once-SVRG. Finally, we develop and analyze a more general variance reduction scheme for RR, which allows for less frequent updates of the control variate. We corroborate our theoretical results with suitably chosen experiments on synthetic and real datasets.
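The combination of variance reduction with random reshuffling described above can be sketched in a few lines. This is a minimal illustrative implementation of SVRG where the inner loop samples indices without replacement (a random permutation per epoch) rather than uniformly at random; all names and the toy least-squares problem are ours, not the paper's code.

```python
import numpy as np

def rr_svrg(grad_i, x0, n, lr, epochs, seed=None):
    """SVRG with random reshuffling (illustrative sketch).

    grad_i(x, i) returns the gradient of the i-th component function.
    The control variate (full gradient at the epoch anchor) is refreshed
    once per epoch; within an epoch, indices are visited in a fresh
    random permutation -- the "random reshuffling" trick.
    """
    rng = np.random.default_rng(seed)
    x = x0.copy()
    for _ in range(epochs):
        anchor = x.copy()
        full_grad = sum(grad_i(anchor, i) for i in range(n)) / n
        for i in rng.permutation(n):  # sampling without replacement
            g = grad_i(x, i) - grad_i(anchor, i) + full_grad
            x = x - lr * g
    return x

# Toy strongly convex finite-sum problem:
# f(x) = (1/n) * sum_i 0.5 * (a_i . x - b_i)^2
rng = np.random.default_rng(0)
A = rng.normal(size=(50, 5))
b = rng.normal(size=50)
gi = lambda x, i: (A[i] @ x - b[i]) * A[i]
x = rr_svrg(gi, np.zeros(5), n=50, lr=0.02, epochs=300, seed=1)
x_star = np.linalg.lstsq(A, b, rcond=None)[0]  # exact least-squares solution
```

Because the variance-reduced gradient vanishes at the optimum, the iterates converge to the exact minimizer with a fixed step size, unlike plain SGD.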

Multi-index ensemble Kalman filtering (arXiv, 2021-04-15) [Preprint] In this work we marry multi-index Monte Carlo with ensemble Kalman filtering (EnKF) to produce the multi-index EnKF method (MIEnKF). The MIEnKF method is based on independent samples of four-coupled EnKF estimators on a multi-index hierarchy of resolution levels, and it may be viewed as an extension of the multilevel EnKF (MLEnKF) method developed by the same authors in 2020. Multi-index here refers to a two-index method, consisting of a hierarchy of EnKF estimators that are coupled in two degrees of freedom: time discretization and ensemble size. Under certain assumptions, the MIEnKF method is proven to be more tractable than EnKF and MLEnKF, and this is also verified in numerical examples.

Optimized Runge-Kutta Methods with Automatic Step Size Control for Compressible Computational Fluid Dynamics (arXiv, 2021-04-14) [Preprint] We develop error-control-based time integration algorithms for compressible fluid dynamics (CFD) applications and show that they are efficient and robust in both the accuracy-limited and stability-limited regimes. Focusing on discontinuous spectral element semi-discretizations, we design new controllers for existing methods and for some new embedded Runge-Kutta pairs. We demonstrate the importance of choosing adequate controller parameters and provide a means to obtain these in practice. We compare a wide range of error-control-based methods, along with the common approach in which step size control is based on the Courant-Friedrichs-Lewy (CFL) number. The optimized methods give improved performance and naturally adopt a step size close to the maximum stable CFL number at loose tolerances, while additionally providing control of the temporal error at tighter tolerances. The numerical examples include challenging industrial CFD applications.
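The error-control idea referred to above is typically realized with a proportional-integral (PI) step-size controller acting on the embedded pair's error estimate. The sketch below shows the classic PI update with common textbook default gains; the gains and limiter values here are generic illustrations, not the optimized controller parameters derived in the paper.

```python
def pi_step_size(dt, err, err_prev, order, beta1=0.7, beta2=0.4, safety=0.9):
    """One update of a PI step-size controller (illustrative sketch).

    err and err_prev are error estimates normalized against the
    tolerance (target value 1), e.g. a weighted RMS norm of the
    embedded-pair error estimate.  beta1/beta2 are common default
    controller gains, not the optimized values from the paper.
    Callers should clamp err away from zero before use.
    """
    k = order + 1  # error exponent for an order-p embedded pair
    factor = safety * err ** (-beta1 / k) * err_prev ** (beta2 / k)
    # Limit the step-size change per step to keep the integration robust.
    factor = min(2.0, max(0.2, factor))
    return dt * factor
```

With `err == err_prev == 1` the step shrinks only by the safety factor; an error above tolerance shrinks the step, and an error well below it grows the step, up to the limiter bounds.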

A class of high-order weighted compact central schemes for solving hyperbolic conservation laws (arXiv, 2021-04-09) [Preprint] We propose a class of weighted compact central (WCC) schemes for solving hyperbolic conservation laws. The linear version can be considered as a high-order extension of the central Lax-Friedrichs (LxF) scheme and the central conservation element and solution element (CESE) scheme. On every cell, the solution is approximated by a $P$th-order polynomial, of which all the degrees of freedom are stored and updated separately. The cell average is updated by a classical finite volume scheme, which is constructed on space-time staggered meshes such that the fluxes are continuous across the interfaces of adjacent control volumes; therefore, the local Riemann problem is bypassed. The $k$th-order spatial derivatives are updated by a central difference of the $(k-1)$th-order spatial derivatives at cell vertices. All the space-time information is calculated by the Cauchy-Kovalewski procedure. By doing so, the schemes are able to achieve arbitrarily uniform space-time high order on a super-compact stencil with only one explicit time step. In order to capture discontinuities without spurious oscillations, a weighted essentially non-oscillatory (WENO) type limiter is tailor-made for the schemes. The limiter preserves the compactness and high-order accuracy of the schemes. The accuracy, robustness, and efficiency of the schemes are verified by several numerical examples of scalar conservation laws and the compressible Euler equations.

Accelerated Bregman proximal gradient methods for relatively smooth convex optimization (Computational Optimization and Applications, Springer Science and Business Media LLC, 2021-04-07) [Article] We consider the problem of minimizing the sum of two convex functions: one is differentiable and relatively smooth with respect to a reference convex function, and the other can be non-differentiable but simple to optimize. We investigate a triangle scaling property of the Bregman distance generated by the reference convex function and present accelerated Bregman proximal gradient (ABPG) methods that attain an $O(k^{-\gamma})$ convergence rate, where $\gamma \in (0, 2]$ is the triangle scaling exponent (TSE) of the Bregman distance. For the Euclidean distance, we have $\gamma = 2$ and recover the convergence rate of Nesterov's accelerated gradient methods. For non-Euclidean Bregman distances, the TSE can be much smaller (say $\gamma \leq 1$), but we show that a relaxed definition of intrinsic TSE is always equal to 2. We exploit the intrinsic TSE to develop adaptive ABPG methods that converge much faster in practice. Although theoretical guarantees on a fast convergence rate seem to be out of reach in general, our methods obtain empirical $O(k^{-2})$ rates in numerical experiments on several applications and provide posterior numerical certificates for the fast rates.
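To make the Bregman (non-Euclidean) setting concrete, here is a minimal sketch of a single non-accelerated Bregman proximal gradient step on the probability simplex, where the negative-entropy reference function turns the update into a closed-form multiplicative step. This illustrates the geometry the abstract refers to; it is not the accelerated ABPG method itself, and the toy objective is our own choice.

```python
import numpy as np

def bpg_step_simplex(x, grad, step):
    """One Bregman proximal gradient step on the probability simplex
    (illustrative sketch; not the accelerated ABPG method).

    With the negative-entropy reference h(x) = sum_i x_i * log(x_i),
    the Bregman distance is the KL divergence, and the mirror update
    has the closed form below (a multiplicative-weights step).
    """
    y = x * np.exp(-step * grad)
    return y / y.sum()  # re-normalize back onto the simplex

# Minimize f(x) = 0.5 * ||x - c||^2 over the simplex; since c lies on
# the simplex, the minimizer is c itself.
c = np.array([0.2, 0.7, 0.1])
x = np.full(3, 1.0 / 3.0)
for _ in range(500):
    x = bpg_step_simplex(x, x - c, step=0.5)  # gradient of f is x - c
```

The same template with the Euclidean reference function recovers the ordinary projected gradient step, which is the $\gamma = 2$ case mentioned in the abstract.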

High Performance Multivariate Geospatial Statistics on Manycore Systems (IEEE Transactions on Parallel and Distributed Systems, Institute of Electrical and Electronics Engineers (IEEE), 2021-04-06) [Article] Modeling and inferring spatial relationships and predicting missing values of environmental data are some of the main tasks of geospatial statisticians. These routine tasks are accomplished using multivariate geospatial models and the cokriging technique, which requires the evaluation of the expensive Gaussian log-likelihood function. This large-scale cokriging challenge provides a fertile ground for supercomputing implementations for the geospatial statistics community, as it is paramount to scale computational capability to match the growth in environmental data. In this paper, we develop large-scale multivariate spatial modeling and inference on parallel hardware architectures. To tackle the increasing complexity in matrix operations and the massive concurrency in parallel systems, we leverage low-rank matrix approximation techniques with task-based programming models and schedule the asynchronous computational tasks using a dynamic runtime system. The proposed framework provides both dense and approximated computations of the Gaussian log-likelihood function. It demonstrates accuracy robustness and performance scalability on a variety of computer systems. Using both synthetic and real datasets, the low-rank matrix approximation shows better performance compared to exact computation, while preserving the application requirements in both parameter estimation and prediction accuracy. We also propose a novel algorithm to assess the prediction accuracy after the online parameter estimation.

A tangent linear approximation of the ignition delay time. I: Sensitivity to rate parameters (Combustion and Flame, Elsevier BV, 2021-04-02) [Article] A tangent linear approximation (TLA) is developed to estimate the sensitivity of the ignition delay time with respect to individual rate parameters in a detailed chemical mechanism. Attention is focused on a gas mixture reacting under adiabatic, constant-volume conditions. The uncertainty in the rates of elementary reactions is described in terms of uncertainty factors, which are parameterized using independent canonical random variables. The approach is based on integrating the linearized system of equations governing the evolution of the partial derivatives of the state vector with respect to individual random variables, and a linearized approximation is developed to relate the ignition delay sensitivity to the scaled partial derivatives of temperature. The efficiency of the approach is demonstrated through applications to chemical mechanisms of different sizes. In particular, the computations indicate that for detailed reaction mechanisms the TLA leads to robust local sensitivity predictions at a computational cost that is an order of magnitude smaller than that incurred by finite-difference approaches based on one-at-a-time rate perturbations.
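The core idea of integrating a linearized system alongside the state can be shown on a scalar toy problem. The sketch below computes the sensitivity of the solution of dy/dt = -k*y to the rate parameter k by co-integrating the tangent linear equation with forward Euler; this illustrates the general tangent-linear technique only, not the paper's ignition-delay formulation or its chemistry.

```python
import numpy as np

def tangent_linear_sensitivity(k, y0=1.0, T=1.0, dt=1e-4):
    """Sensitivity s = dy(T)/dk for dy/dt = -k*y via the tangent
    linear system (illustrative sketch of the general technique).

    Differentiating the ODE with respect to k gives the linearized
    equation ds/dt = -k*s - y, which is integrated alongside the
    state with forward Euler.
    """
    y, s = y0, 0.0  # sensitivity of the initial condition to k is zero
    for _ in range(int(round(T / dt))):
        y_new = y + dt * (-k * y)
        s = s + dt * (-k * s - y)  # uses the state at the old time level
        y = y_new
    return y, s

y, s = tangent_linear_sensitivity(k=2.0)
# Exact values for comparison: y(T) = exp(-k*T), dy/dk = -T*exp(-k*T).
```

A single extra linear ODE per parameter replaces one full nonlinear re-integration per finite-difference perturbation, which is where the cost advantage noted in the abstract comes from.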

The Arab world prepares the exascale workforce (Communications of the ACM, Association for Computing Machinery (ACM), 2021-04) [Article] The Arab world is currently host to eight supercomputers in the Top500 globally, including the current #10 and a former #7. Hardware can become a honeypot for talent attraction: senior talent from abroad, and rising talent from within. Good return on investment from leading-edge hardware motivates forging collaborative ties to global supercomputing leaders, which leads to integration into the global campaigns in which supercomputing excels, such as predicting climate change and developing sustainable energy resources for its mitigation, positing properties of new materials and catalysis by design, repurposing already-certified drugs and discovering new ones, and big data analytics and machine learning applied to science and to society. While the petroleum industry, with its workloads of seismic imaging and reservoir modeling, has been the historical motivation for supercomputing in the Arab world, the attraction today is universal.

On the depth of decision trees over infinite 1-homogeneous binary information systems (Array, Elsevier BV, 2021-04) [Article] In this paper, we study decision trees that solve problems defined over a specific subclass of infinite information systems, namely 1-homogeneous binary information systems. It is proved that the minimum depth of a decision tree (defined as a function of the number of attributes in a problem's description) grows, in the worst case, either logarithmically or linearly for each information system in this class. We consider a number of examples of infinite 1-homogeneous binary information systems, including one closely related to the decision trees constructed by the CART algorithm.

Coastal circulation and water transport properties of the Red Sea Project lagoon (Ocean Modelling, Elsevier BV, 2021-03-26) [Article] The Red Sea Project (RSP) is based on a coastal lagoon with over 90 pristine islands. The project intends to transform the Red Sea coast into a world-class tourist destination. To better understand the regional dynamics and water exchange scenarios in the lagoon, a high-resolution numerical model is implemented. The general and tidal circulation dynamics are then investigated with a particular focus on the response of the lagoon to strong wind jets. Significant variations in winter and summer circulation patterns are identified. The tidal amplitude inside the lagoon is greater than that outside, with strong tidal currents passing over its surrounding coral reef banks. The lagoon rapidly responds to the strong easterly wind jets that occur mainly in winter; it develops a reverse flow at greater depths, and the coastal water elevation is instantly affected. Lagrangian particle simulations are conducted to study the residence time of water in the lagoon. The results suggest that water renewal is slow in winter. Analysis of the Lagrangian coherent structures (LCS) reveals that water renewal is largely linked to the circulation patterns in the lagoon. In winter, the water becomes restricted in the central lagoon with only moderate exchange, whereas in summer, more circulation is observed with a higher degree of interaction between the central lagoon and external water. The results of LCS also highlight the tidal contribution to stirring and mixing while identifying the hotspots of the phenomenon. Our analysis demonstrates an effective approach for studying regional water mixing and connectivity, which could support coastal management in data-limited regions.

Critical Properties and Complexity Measures of Read-Once Boolean Functions (Annals of Mathematics and Artificial Intelligence, Springer Nature, 2021-03-22) [Article] In this paper, we define a quasi-order on the set of read-once Boolean functions and show that it is a well-quasi-order. This implies that every parameter measuring the complexity of such functions can be characterized by a finite set of minimal subclasses of read-once functions on which this parameter is unbounded. We focus on two parameters related to certificate complexity and characterize each of them in terms of minimal classes.

Numerical simulation and entropy dissipative cure of the carbuncle instability for the shallow water circular hydraulic jump (arXiv, 2021-03-17) [Preprint] We investigate the numerical artifact known as the carbuncle in the solution of the shallow water equations. We propose a new Riemann solver that is based on a local measure of the entropy residual and aims to avoid carbuncles while maintaining high accuracy. We also propose a challenging new test problem for shallow water codes, consisting of a steady circular hydraulic jump that can be physically unstable. We show that numerical methods are prone to either suppress the instability completely or form carbuncles. We test existing cures for the carbuncle; in our experiments, only the proposed method is able to avoid unphysical carbuncles without suppressing the physical instability.

Sample average approximation for risk-averse problems: A virtual power plant scheduling application (EURO Journal on Computational Optimization, Elsevier B.V., 2021-03-16) [Article] In this paper, we address the decision-making problem of a virtual power plant (VPP) involving a self-scheduling and market involvement problem under uncertainty in wind speed and electricity prices. The problem is modeled using one risk-neutral and two risk-averse two-stage stochastic programming formulations, where the conditional value-at-risk is used to represent risk. A sample average approximation methodology is integrated with an adapted L-shaped solution method, which can solve risk-neutral and specific risk-averse problems. This methodology provides a framework to understand and quantify the impact of the sample size on the variability of the results. The numerical results include an analysis of the computational performance of the methodology for two case studies, estimators for the bounds of the true optimal solutions of the problems, and an assessment of the quality of the solutions obtained. In particular, the numerical experiments indicate that when an adequate sample size is used, the solution obtained is close to the optimal one.
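The two ingredients above, sample average approximation and conditional value-at-risk, combine neatly in the Rockafellar-Uryasev formulation, where CVaR is the optimal value of a one-dimensional minimization that reduces to order statistics on a finite scenario sample. The sketch below estimates VaR and CVaR of a loss sample this way; it illustrates the SAA-for-CVaR idea only, not the paper's two-stage VPP model or its L-shaped decomposition.

```python
import numpy as np

def cvar_saa(losses, alpha=0.95):
    """Sample average approximation of VaR/CVaR at level alpha via the
    Rockafellar-Uryasev formula (illustrative sketch).

    CVaR_alpha = min_t  t + (1/(1-alpha)) * E[max(L - t, 0)],
    attained at t = VaR_alpha.  On a finite sample, the minimizer is
    an order statistic and the expectation becomes a sample mean.
    """
    losses = np.sort(np.asarray(losses))
    n = len(losses)
    t = losses[int(np.ceil(alpha * n)) - 1]  # empirical VaR_alpha
    cvar = t + np.maximum(losses - t, 0.0).mean() / (1.0 - alpha)
    return t, cvar

# Scenario losses drawn from a standard normal, so the estimates can be
# checked against the known values VaR_0.95 ~ 1.645, CVaR_0.95 ~ 2.063.
rng = np.random.default_rng(1)
sample = rng.normal(size=100_000)
var95, cvar95 = cvar_saa(sample, alpha=0.95)
```

Re-running this with independent samples of different sizes shows how the estimator's variability shrinks with the sample size, which is the kind of SAA analysis the abstract describes.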

Enhanced acoustic pressure sensors based on the coherent perfect absorber-laser effect (Journal of Applied Physics, AIP Publishing, 2021-03-14) [Article] Lasing is a well-established field in optics with several applications. Yet, achieving lasing or huge amplification in other wave systems remains an elusive goal. Here, we utilize the concept of the coherent perfect absorber-laser to realize an acoustic analog of the laser, with a proven amplification of more than $10^4$ in terms of the scattered acoustic signal at a frequency of a few kHz. The obtained acoustic laser (or coherent perfect absorber-laser) is shown to possess extremely high sensitivity and figure of merit with regard to ultra-small variations of the pressure (density and compressibility), suggesting evident potential for building future acoustic pressure devices such as precise sensors.

Some estimates for the planning problem with potential (Nonlinear Differential Equations and Applications, Springer Nature, 2021-03-11) [Article] In this paper, we study a priori estimates for a first-order mean-field planning problem with a potential. In the theory of mean-field games (MFGs), a priori estimates play a crucial role in proving the existence of classical solutions. In particular, uniform bounds for the density of the players' distribution and its inverse are of utmost importance. Here, we investigate a priori bounds for those quantities for a planning problem with a non-vanishing potential. The presence of a potential raises non-trivial difficulties, which we overcome by exploring a displacement-convexity property for the mean-field planning problem with a potential, together with Moser's iteration method. We show that if the potential satisfies a certain smallness condition, then a displacement-convexity property holds. This property enables $L^q$ bounds for the density. In the one-dimensional case, the displacement-convexity property also gives $L^q$ bounds for the inverse of the density. Finally, using these $L^q$ estimates and Moser's iteration method, we obtain $L^\infty$ estimates for the density of the distribution of the players and its inverse. We conclude with an application of our estimates to prove the existence and uniqueness of solutions for a particular first-order mean-field planning problem with a potential.

Broadband vectorial ultrathin optics with experimental efficiency up to 99% in the visible region via universal approximators (Light: Science & Applications, Springer Nature, 2021-03-04) [Article] Integrating conventional optics into compact nanostructured surfaces is the goal of flat optics. Despite the enormous progress in this technology, there are still critical challenges for real-world applications due to the limited operational efficiency in the visible region, on average lower than 60%, which originates from absorption losses in wavelength-thick (≈500 nm) structures. Another issue is the realization of on-demand optical components for controlling vectorial light at visible frequencies simultaneously in both reflection and transmission and with a predetermined wavefront shape. In this work, we developed an inverse design approach that allows the realization of highly efficient (up to 99%), ultrathin (down to 50 nm thick) optics for vectorial light control with broadband input-output responses in the visible and near-IR regions with a desired wavefront shape. The approach leverages suitably engineered semiconductor nanostructures, which behave as a neural network that can approximate a user-defined input-output function. Near-unity performance results from the ultrathin nature of these surfaces, which reduces absorption losses to near-negligible values. Experimentally, we discuss polarizing beam splitters, comparing their performance with the best results obtained from both direct and inverse design techniques, and new flat-optics components represented by dichroic mirrors and the basic unit of a flat-optics display that creates full colours by using only two subpixels, overcoming the limitations of conventional LCD/OLED technologies that require three subpixels for each composite colour. Our devices can be manufactured with a complementary metal-oxide-semiconductor (CMOS) compatible process, making them scalable for mass production at low cost.

Triple Decomposition of Velocity Gradient Tensor in Compressible Turbulence (Fluids, MDPI AG, 2021-03-02) [Article] The decomposition of the local motion of a fluid into straining, shearing, and rigid-body rotation is examined in this work for compressible isotropic turbulence by means of direct numerical simulations (DNS). The triple decomposition is closely associated with a basic reference frame (BRF), in which the extraction of the biasing effect of shear is maximized. In this study, a new, computationally inexpensive procedure is proposed to identify the BRF for a three-dimensional flow field. In addition, the influence of compressibility effects on some statistical properties of the turbulent structures is addressed. The direct numerical simulations are carried out at a Taylor-microscale Reynolds number of $Re_\lambda = 100$ for various turbulent Mach numbers ranging from $Ma_t = 0.12$ to $Ma_t = 0.89$. The DNS database is generated with an improved seventh-order accurate weighted essentially non-oscillatory scheme to discretize the nonlinear advective terms, and an eighth-order accurate centered finite difference scheme is retained for the diffusive terms. One of the major findings of this analysis is that regions featuring strong rigid-body rotations or straining motions are highly spatially intermittent, while most of the flow regions exhibit moderately strong shearing motions in the absence of rigid-body rotations and straining motions. The majority of compressibility effects can be estimated if the scaling laws for compressible turbulence are rescaled by considering only the solenoidal contributions.

A sharp interface method using enriched finite elements for elliptic interface problems (Numerische Mathematik, Springer Nature, 2021-03-02) [Article] We present an immersed boundary method for the solution of elliptic interface problems with discontinuous coefficients that provides a second-order approximation of the solution. The proposed method can be categorised as an extended or enriched finite element method. In contrast to other extended FEM approaches, the new shape functions are projected in order to satisfy the Kronecker-delta property with respect to the interface. The resulting combination of projection and restriction was already derived in Höllbacher and Wittum (TBA, 2019a) for application to particulate flows. The crucial benefits are the preservation of the symmetry and positive definiteness of the continuous bilinear operator. Besides, no additional stabilisation terms are necessary. Furthermore, since our enrichment can be interpreted as adaptive mesh refinement, standard integration schemes can be applied to the cut elements. Finally, small cut elements do not impair the conditioning of the scheme, and we propose a simple procedure to ensure good conditioning independent of the location of the interface. The stability and convergence of the solution are proven, and the numerical tests demonstrate optimal order of convergence.

On some singular mean-field games (Journal of Dynamics & Games, American Institute of Mathematical Sciences (AIMS), 2021-03) [Article] Here, we prove the existence of smooth solutions for mean-field games with a singular mean-field coupling; that is, a coupling in the Hamilton-Jacobi equation of the form $g(m) = -m^{-\alpha}$ with $\alpha > 0$. We consider stationary and time-dependent settings. The function $g$ is monotone, but it is not bounded from below. With the exception of the logarithmic coupling, this is the first time that MFGs whose coupling is not bounded from below are examined in the literature. This coupling arises in models where agents have a strong preference for low-density regions. Paradoxically, this causes the agents to move towards low-density regions and, thus, prevents the creation of those regions. To prove the existence of solutions, we consider an approximate problem for which the existence of smooth solutions is known. Then, we prove new a priori bounds for the solutions, which show that $1/m$ is bounded. Finally, using a limiting argument, we obtain the existence of solutions. The proof in the stationary case relies on a blow-up argument, and in the time-dependent case on new bounds for $m^{-1}$.

Parallel Hierarchical Matrix Technique to Approximate Large Covariance Matrices, Likelihood Functions and Parameter Identification (2021-03-01) [Presentation] We develop the HLIBCov package, which uses parallel hierarchical (H) matrices to: 1) approximate large dense inhomogeneous covariance matrices with log-linear computational cost and storage requirements; 2) compute the matrix-vector product, Cholesky factorization, and inverse with log-linear complexity; 3) identify unknown parameters of the covariance function (variance, smoothness, and covariance length). These unknown parameters are estimated by maximizing the joint Gaussian log-likelihood function. To demonstrate the numerical performance, we identify three unknown parameters in an example with 2,000,000 locations on a desktop PC.
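The objective being maximized above is the Gaussian log-likelihood of the observations under a parameterized covariance kernel. The dense sketch below shows that objective for an exponential kernel via a Cholesky factorization; this is the $O(n^3)$ baseline that HLIBCov replaces with a log-linear hierarchical-matrix factorization, and the kernel choice and function names here are our own illustration, not the package's API.

```python
import numpy as np

def gaussian_loglik(theta, locs, z):
    """Dense Gaussian log-likelihood for an exponential covariance
    (illustrative sketch; HLIBCov replaces the dense Cholesky below
    with a log-linear H-matrix factorization).

    theta = (variance, covariance length).  Smoothness is fixed here
    (exponential kernel, i.e. Matern with nu = 1/2) for brevity.
    """
    var, ell = theta
    # Pairwise distance matrix between observation locations.
    d = np.linalg.norm(locs[:, None, :] - locs[None, :, :], axis=-1)
    C = var * np.exp(-d / ell)
    L = np.linalg.cholesky(C)      # O(n^3) dense; H-matrix: log-linear
    alpha = np.linalg.solve(L, z)  # forward solve gives L^{-1} z
    logdet = 2.0 * np.log(np.diag(L)).sum()
    n = len(z)
    return -0.5 * (n * np.log(2.0 * np.pi) + logdet + alpha @ alpha)

# Evaluate on small synthetic data; parameter identification would
# maximize this function over theta.
locs = np.linspace(0.0, 1.0, 20).reshape(-1, 1)
rng = np.random.default_rng(0)
z = rng.normal(size=20)
ll = gaussian_loglik((1.0, 0.3), locs, z)
```

Every evaluation inside the optimizer requires a fresh factorization, which is why reducing the factorization cost from cubic to log-linear is the enabling step for the 2,000,000-location example.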