Recent Submissions

  • Random Reshuffling with Variance Reduction: New Analysis and Better Rates

    Malinovsky, Grigory; Sailanbayev, Alibek; Richtarik, Peter (arXiv, 2021-04-19) [Preprint]
    Virtually all state-of-the-art methods for training supervised machine learning models are variants of SGD enhanced with a number of additional tricks, such as minibatching, momentum, and adaptive stepsizes. One of the tricks that works so well in practice that it is used as default in virtually all widely used machine learning software is random reshuffling (RR). However, the practical benefits of RR have until very recently been eluding attempts at being satisfactorily explained using theory. Motivated by recent development due to Mishchenko, Khaled and Richtárik (2020), in this work we provide the first analysis of SVRG under Random Reshuffling (RR-SVRG) for general finite-sum problems. First, we show that RR-SVRG converges linearly with the rate $\mathcal{O}(\kappa^{3/2})$ in the strongly-convex case, and can be improved further to $\mathcal{O}(\kappa)$ in the big data regime (when $n > \mathcal{O}(\kappa)$), where $\kappa$ is the condition number. This improves upon the previous best rate $\mathcal{O}(\kappa^2)$ known for a variance reduced RR method in the strongly-convex case due to Ying, Yuan and Sayed (2020). Second, we obtain the first sublinear rate for general convex problems. Third, we establish similar fast rates for Cyclic-SVRG and Shuffle-Once-SVRG. Finally, we develop and analyze a more general variance reduction scheme for RR, which allows for less frequent updates of the control variate. We corroborate our theoretical results with suitably chosen experiments on synthetic and real datasets.
  • Multi-index ensemble Kalman filtering

    Hoel, Hakon; Shaimerdenova, Gaukhar; Tempone, Raul (arXiv, 2021-04-15) [Preprint]
    In this work we marry multi-index Monte Carlo with ensemble Kalman filtering (EnKF) to produce the multi-index EnKF method (MIEnKF). The MIEnKF method is based on independent samples of four-coupled EnKF estimators on a multi-index hierarchy of resolution levels, and it may be viewed as an extension of the multilevel EnKF (MLEnKF) method developed by the same authors in 2020. Multi-index here refers to a two-index method, consisting of a hierarchy of EnKF estimators that are coupled in two degrees of freedom: time discretization and ensemble size. Under certain assumptions, the MIEnKF method is proven to be more tractable than EnKF and MLEnKF, and this is also verified in numerical examples.
  • Optimized Runge-Kutta Methods with Automatic Step Size Control for Compressible Computational Fluid Dynamics

    Ranocha, Hendrik; Dalcin, Lisandro; Parsani, Matteo; Ketcheson, David I. (arXiv, 2021-04-14) [Preprint]
    We develop error-control based time integration algorithms for compressible fluid dynamics (CFD) applications and show that they are efficient and robust in both the accuracy-limited and stability-limited regime. Focusing on discontinuous spectral element semidiscretizations, we design new controllers for existing methods and for some new embedded Runge-Kutta pairs. We demonstrate the importance of choosing adequate controller parameters and provide a means to obtain these in practice. We compare a wide range of error-control-based methods, along with the common approach in which step size control is based on the Courant-Friedrichs-Lewy (CFL) number. The optimized methods give improved performance and naturally adopt a step size close to the maximum stable CFL number at loose tolerances, while additionally providing control of the temporal error at tighter tolerances. The numerical examples include challenging industrial CFD applications.
  • A class of high-order weighted compact central schemes for solving hyperbolic conservation laws

    Shen, Hua; Parsani, Matteo (arXiv, 2021-04-09) [Preprint]
    We propose a class of weighted compact central (WCC) schemes for solving hyperbolic conservation laws. The linear version can be considered as a high-order extension of the central Lax-Friedrichs (LxF) scheme and the central conservation element and solution element (CESE) scheme. On every cell, the solution is approximated by a Pth order polynomial of which all the DOFs are stored and updated separately. The cell average is updated by a classical finite volume scheme which is constructed based on space-time staggered meshes such that the fluxes are continuous across the interfaces of the adjacent control volumes and, therefore, the local Riemann problem is bypassed. The kth order spatial derivatives are updated by a central difference of (k-1)th order spatial derivatives at cell vertices. All the space-time information is calculated by the Cauchy-Kovalewski procedure. By doing so, the schemes are able to achieve arbitrarily uniform spacetime high order on a super-compact stencil with only one explicit time step. In order to capture discontinuities without spurious oscillations, a weighted essentially non-oscillatory (WENO) type limiter is tailor-made for the schemes. The limiter preserves the compactness and high order accuracy of the schemes. The accuracy, robustness, and efficiency of the schemes are verified by several numerical examples of scalar conservation laws and the compressible Euler equations.
  • Accelerated Bregman proximal gradient methods for relatively smooth convex optimization

    Hanzely, Filip; Richtarik, Peter; Xiao, Lin (Computational Optimization and Applications, Springer Science and Business Media LLC, 2021-04-07) [Article]
    We consider the problem of minimizing the sum of two convex functions: one is differentiable and relatively smooth with respect to a reference convex function, and the other can be nondifferentiable but simple to optimize. We investigate a triangle scaling property of the Bregman distance generated by the reference convex function and present accelerated Bregman proximal gradient (ABPG) methods that attain an $O(k^{-\gamma})$ convergence rate, where $\gamma \in (0, 2]$ is the triangle scaling exponent (TSE) of the Bregman distance. For the Euclidean distance, we have $\gamma = 2$ and recover the convergence rate of Nesterov’s accelerated gradient methods. For non-Euclidean Bregman distances, the TSE can be much smaller (say $\gamma \le 1$), but we show that a relaxed definition of intrinsic TSE is always equal to 2. We exploit the intrinsic TSE to develop adaptive ABPG methods that converge much faster in practice. Although theoretical guarantees on a fast convergence rate seem to be out of reach in general, our methods obtain empirical $O(k^{-2})$ rates in numerical experiments on several applications and provide posterior numerical certificates for the fast rates.
  • High Performance Multivariate Geospatial Statistics on Manycore Systems

    Salvaña, Mary Lai O.; Abdulah, Sameh; Huang, Huang; Ltaief, Hatem; Sun, Ying; Genton, Marc G.; Keyes, David E. (IEEE Transactions on Parallel and Distributed Systems, Institute of Electrical and Electronics Engineers (IEEE), 2021-04-06) [Article]
    Modeling and inferring spatial relationships and predicting missing values of environmental data are some of the main tasks of geospatial statisticians. These routine tasks are accomplished using multivariate geospatial models and the cokriging technique, which requires the evaluation of the expensive Gaussian log-likelihood function. This large-scale cokriging challenge provides a fertile ground for supercomputing implementations for the geospatial statistics community as it is paramount to scale computational capability to match the growth in environmental data. In this paper, we develop large-scale multivariate spatial modeling and inference on parallel hardware architectures. To tackle the increasing complexity in matrix operations and the massive concurrency in parallel systems, we leverage low-rank matrix approximation techniques with task-based programming models and schedule the asynchronous computational tasks using a dynamic runtime system. The proposed framework provides both the dense and approximated computations of the Gaussian log-likelihood function. It demonstrates accuracy robustness and performance scalability on a variety of computer systems. Using both synthetic and real datasets, the low-rank matrix approximation shows better performance compared to exact computation, while preserving the application requirements in both parameter estimation and prediction accuracy. We also propose a novel algorithm to assess the prediction accuracy after the online parameter estimation.
  • A tangent linear approximation of the ignition delay time. I: Sensitivity to rate parameters

    Almohammadi, Saja Mohammad; Hantouche, Mireille; Le Maître, Olivier P.; Knio, Omar (Combustion and Flame, Elsevier BV, 2021-04-02) [Article]
    A tangent linear approximation (TLA) is developed to estimate the sensitivity of the ignition delay time with respect to individual rate parameters in a detailed chemical mechanism. Attention is focused on a gas mixture reacting under adiabatic, constant-volume conditions. The uncertainty in the rates of elementary reactions is described in terms of uncertainty factors and is parameterized using independent canonical random variables. The approach is based on integrating the linearized system of equations governing the evolution of the partial derivatives of the state vector with respect to individual random variables, and a linearized approximation is developed to relate the ignition delay sensitivity to the scaled partial derivatives of temperature. The efficiency of the approach is demonstrated through applications to chemical mechanisms of different sizes. In particular, the computations indicate that for detailed reaction mechanisms the TLA leads to robust local sensitivity predictions at a computational cost that is an order of magnitude smaller than that incurred by finite-difference approaches based on one-at-a-time rate perturbations.
  • The Arab world prepares the exascale workforce

    Keyes, David E. (Communications of the ACM, Association for Computing Machinery (ACM), 2021-04) [Article]
    The Arab world is currently host to eight supercomputers in the Top500 globally, including the current #10 and a former #7. Hardware can become a honeypot for talent attraction: senior talent from abroad, and rising talent from within. Good return on investment from leading-edge hardware motivates forging collaborative ties to global supercomputing leaders, which leads to integration into the global campaigns that supercomputing excels in, such as predicting climate change and developing sustainable energy resources for its mitigation, positing properties of new materials and catalysis by design, repurposing already-certified drugs and discovering new ones, and big data analytics and machine learning applied to science and to society. While the petroleum industry has been the historical motivation for supercomputing in the Arab world with its workloads of seismic imaging and reservoir modeling, the attraction today is universal.
  • On the depth of decision trees over infinite 1-homogeneous binary information systems

    Moshkov, Mikhail (Array, Elsevier BV, 2021-04) [Article]
    In this paper, we study decision trees, which solve problems defined over a specific subclass of infinite information systems, namely: 1-homogeneous binary information systems. It is proved that the minimum depth of a decision tree (defined as a function on the number of attributes in a problem’s description) grows – in the worst case – logarithmically or linearly for each information system in this class. We consider a number of examples of infinite 1-homogeneous binary information systems, including one closely related to the decision trees constructed by the CART algorithm.
  • Coastal circulation and water transport properties of the Red Sea Project lagoon

    Zhan, Peng; Krokos, Georgios; Langodan, Sabique; Guo, Daquan; Dasari, Hari Prasad; Papadopoulos, Vassilis P.; Lermusiaux, Pierre F.J.; Knio, Omar; Hoteit, Ibrahim (Ocean Modelling, Elsevier BV, 2021-03-26) [Article]
    The Red Sea Project (RSP) is based on a coastal lagoon with over 90 pristine islands. The project intends to transform the Red Sea coast into a world-class tourist destination. To better understand the regional dynamics and water exchange scenarios in the lagoon, a high-resolution numerical model is implemented. The general and tidal circulation dynamics are then investigated with a particular focus on the response of the lagoon to strong wind jets. Significant variations in winter and summer circulation patterns are identified. The tidal amplitude inside the lagoon is greater than that outside, with strong tidal currents passing over its surrounding coral reef banks. The lagoon rapidly responds to the strong easterly wind jets that occur mainly in winter; it develops a reverse flow at greater depths, and the coastal water elevation is instantly affected. Lagrangian particle simulations are conducted to study the residence time of water in the lagoon. The results suggest that water renewal is slow in winter. Analysis of the Lagrangian coherent structures (LCS) reveals that water renewal is largely linked to the circulation patterns in the lagoon. In winter, the water becomes restricted in the central lagoon with only moderate exchange, whereas in summer, more circulation is observed with a higher degree of interaction between the central lagoon and external water. The results of LCS also highlight the tidal contribution to stirring and mixing while identifying the hotspots of the phenomenon. Our analysis demonstrates an effective approach for studying regional water mixing and connectivity, which could support coastal management in data-limited regions.
  • Critical Properties and Complexity Measures of Read-Once Boolean Functions

    Lozin, Vadim; Moshkov, Mikhail (Annals of Mathematics and Artificial Intelligence, Springer Nature, 2021-03-22) [Article]
    In this paper, we define a quasi-order on the set of read-once Boolean functions and show that this is a well-quasi-order. This implies that every parameter measuring complexity of the functions can be characterized by a finite set of minimal subclasses of read-once functions, where this parameter is unbounded. We focus on two parameters related to certificate complexity and characterize each of them in the terminology of minimal classes.
  • Numerical simulation and entropy dissipative cure of the carbuncle instability for the shallow water circular hydraulic jump

    Ketcheson, David I.; Quezada de Luna, Manuel (arXiv, 2021-03-17) [Preprint]
    We investigate the numerical artifact known as a carbuncle, in the solution of the shallow water equations. We propose a new Riemann solver that is based on a local measure of the entropy residual and aims to avoid carbuncles while maintaining high accuracy. We propose a new challenging test problem for shallow water codes, consisting of a steady circular hydraulic jump that can be physically unstable. We show that numerical methods are prone to either suppress the instability completely or form carbuncles. We test existing cures for the carbuncle. In our experiments, only the proposed method is able to avoid unphysical carbuncles without suppressing the physical instability.
  • Sample average approximation for risk-averse problems: A virtual power plant scheduling application

    Lima, Ricardo; Conejo, Antonio J.; Giraldi, Loic; Le Maître, Olivier; Hoteit, Ibrahim; Knio, Omar (EURO Journal on Computational Optimization, Elsevier B.V., 2021-03-16) [Article]
    In this paper, we address the decision-making problem of a virtual power plant (VPP) involving a self-scheduling and market involvement problem under uncertainty in the wind speed and electricity prices. The problem is modeled using one risk-neutral and two risk-averse two-stage stochastic programming formulations, where the conditional value at risk is used to represent risk. A sample average approximation methodology is integrated with an adapted L-Shaped solution method, which can solve risk-neutral and specific risk-averse problems. This methodology provides a framework to understand and quantify the impact of the sample size on the variability of the results. The numerical results include an analysis of the computational performance of the methodology for two case studies, estimators for the bounds of the true optimal solutions of the problems, and an assessment of the quality of the solutions obtained. In particular, numerical experiments indicate that when an adequate sample size is used, the solution obtained is close to the optimal one.
  • Enhanced acoustic pressure sensors based on coherent perfect absorber-laser effect

    Farhat, Mohamed; Ahmed, Waqas Waseem; Khelif, Abdelkrim; Salama, Khaled N.; Wu, Ying (Journal of Applied Physics, AIP Publishing, 2021-03-14) [Article]
    Lasing is a well-established field in optics with several applications. Yet, having lasing or huge amplification in other wave systems remains an elusive goal. Here, we utilize the concept of coherent perfect absorber-laser to realize an acoustic analog of laser with a proven amplification of more than $10^4$ in terms of the scattered acoustic signal at a frequency of a few kHz. The obtained acoustic laser (or the coherent perfect absorber-laser) is shown to possess extremely high sensitivity and figure of merit with regard to ultra-small variations of the pressure (density and compressibility) and suggests its evident potential to build future acoustic pressure devices such as precise sensors.
  • Some estimates for the planning problem with potential

    Bakaryan, Tigran; Ferreira, Rita; Gomes, Diogo A. (Nonlinear Differential Equations and Applications, Springer Nature, 2021-03-11) [Article]
    In this paper, we study a priori estimates for a first-order mean-field planning problem with a potential. In the theory of mean-field games (MFGs), a priori estimates play a crucial role to prove the existence of classical solutions. In particular, uniform bounds for the density of players’ distribution and its inverse are of utmost importance. Here, we investigate a priori bounds for those quantities for a planning problem with a non-vanishing potential. The presence of a potential raises non-trivial difficulties, which we overcome by exploring a displacement-convexity property for the mean-field planning problem with a potential together with Moser’s iteration method. We show that if the potential satisfies a certain smallness condition, then a displacement-convexity property holds. This property enables $L^q$ bounds for the density. In the one-dimensional case, the displacement-convexity property also gives $L^q$ bounds for the inverse of the density. Finally, using these $L^q$ estimates and Moser’s iteration method, we obtain $L^\infty$ estimates for the density of the distribution of the players and its inverse. We conclude with an application of our estimates to prove existence and uniqueness of solutions for a particular first-order mean-field planning problem with a potential.
  • Broadband vectorial ultrathin optics with experimental efficiency up to 99% in the visible region via universal approximators

    Getman, Fedor; Makarenko, M.; Burguete-Lopez, A.; Fratalocchi, Andrea (Light: Science & Applications, Springer Nature, 2021-03-04) [Article]
    Integrating conventional optics into compact nanostructured surfaces is the goal of flat optics. Despite the enormous progress in this technology, there are still critical challenges for real-world applications due to the limited operational efficiency in the visible region, on average lower than 60%, which originates from absorption losses in wavelength-thick (≈ 500 nm) structures. Another issue is the realization of on-demand optical components for controlling vectorial light at visible frequencies simultaneously in both reflection and transmission and with a predetermined wavefront shape. In this work, we developed an inverse design approach that allows the realization of highly efficient (up to 99%) ultrathin (down to 50 nm thick) optics for vectorial light control with broadband input–output responses in the visible and near-IR regions with a desired wavefront shape. The approach leverages suitably engineered semiconductor nanostructures, which behave as a neural network that can approximate a user-defined input–output function. Near-unity performance results from the ultrathin nature of these surfaces, which reduces absorption losses to near-negligible values. Experimentally, we discuss polarizing beam splitters, comparing their performance with the best results obtained from both direct and inverse design techniques, and new flat-optics components represented by dichroic mirrors and the basic unit of a flat-optics display that creates full colours by using only two subpixels, overcoming the limitations of conventional LCD/OLED technologies that require three subpixels for each composite colour. Our devices can be manufactured with a complementary metal-oxide-semiconductor (CMOS)-compatible process, making them scalable for mass production at low cost.
  • Triple Decomposition of Velocity Gradient Tensor in Compressible Turbulence

    Boukharfane, Radouan; Er-raiy, Aimad; Alzaben, Linda; Parsani, Matteo (Fluids, MDPI AG, 2021-03-02) [Article]
    The decomposition of the local motion of a fluid into straining, shearing, and rigid-body rotation is examined in this work for a compressible isotropic turbulence by means of direct numerical simulations. The triple decomposition is closely associated with a basic reference frame (BRF), in which the extraction of the biasing effect of shear is maximized. In this study, a new computational and inexpensive procedure is proposed to identify the BRF for a three-dimensional flow field. In addition, the influence of compressibility effects on some statistical properties of the turbulent structures is addressed. The direct numerical simulations are carried out with a Reynolds number that is based on the Taylor micro-scale of $Re_\lambda = 100$ for various turbulent Mach numbers that range from $Ma_t = 0.12$ to $Ma_t = 0.89$. The DNS database is generated with an improved seventh-order accurate weighted essentially non-oscillatory scheme to discretize the non-linear advective terms, and an eighth-order accurate centered finite difference scheme is retained for the diffusive terms. One of the major findings of this analysis is that regions featuring strong rigid-body rotations or straining motions are highly spatially intermittent, while most of the flow regions exhibit moderately strong shearing motions in the absence of rigid-body rotations and straining motions. The majority of compressibility effects can be estimated if the scaling laws in the case of compressible turbulence are rescaled by only considering the solenoidal contributions.
  • A sharp interface method using enriched finite elements for elliptic interface problems

    Hollbacher, Susanne; Wittum, Gabriel (Numerische Mathematik, Springer Nature, 2021-03-02) [Article]
    We present an immersed boundary method for the solution of elliptic interface problems with discontinuous coefficients which provides a second-order approximation of the solution. The proposed method can be categorised as an extended or enriched finite element method. In contrast to other extended FEM approaches, the new shape functions get projected in order to satisfy the Kronecker-delta property with respect to the interface. The resulting combination of projection and restriction was already derived in Höllbacher and Wittum (TBA, 2019a) for application to particulate flows. The crucial benefits are the preservation of the symmetry and positive definiteness of the continuous bilinear operator. Besides, no additional stabilisation terms are necessary. Furthermore, since our enrichment can be interpreted as adaptive mesh refinement, the standard integration schemes can be applied on the cut elements. Finally, small cut elements do not impair the condition of the scheme and we propose a simple procedure to ensure good conditioning independent of the location of the interface. The stability and convergence of the solution will be proven and the numerical tests demonstrate optimal order of convergence.
  • On some singular mean-field games

    Cirant, Marco; Gomes, Diogo A.; Pimentel, Edgard A.; Sánchez-Morgado, Héctor (Journal of Dynamics & Games, American Institute of Mathematical Sciences (AIMS), 2021-03) [Article]
    Here, we prove the existence of smooth solutions for mean-field games with a singular mean-field coupling; that is, a coupling in the Hamilton-Jacobi equation of the form $g(m) = -m^{-\alpha}$ with $\alpha > 0$. We consider stationary and time-dependent settings. The function $g$ is monotone, but it is not bounded from below. With the exception of the logarithmic coupling, this is the first time that MFGs whose coupling is not bounded from below are examined in the literature. This coupling arises in models where agents have a strong preference for low-density regions. Paradoxically, this causes the agents to move towards low-density regions and, thus, prevents the creation of those regions. To prove the existence of solutions, we consider an approximate problem for which the existence of smooth solutions is known. Then, we prove new a priori bounds for the solutions that show that $\frac{1}{m}$ is bounded. Finally, using a limiting argument, we obtain the existence of solutions. The proof in the stationary case relies on a blow-up argument and in the time-dependent case on new bounds for $m^{-1}$.
  • Parallel Hierarchical Matrix Technique to Approximate Large Covariance Matrices, Likelihood Functions and Parameter Identification

    Litvinenko, Alexander; Berikov, V.; Genton, Marc G.; Keyes, David E.; Kriemann, R.; Sun, Ying (2021-03-01) [Presentation]
    We develop the HLIBCov package, which uses parallel hierarchical (H-) matrices to: 1) Approximate large dense inhomogeneous covariance matrices with a log-linear computational cost and storage requirement. 2) Compute the matrix-vector product, Cholesky factorization and inverse with a log-linear complexity. 3) Identify unknown parameters of the covariance function (variance, smoothness, and covariance length). These unknown parameters are estimated by maximizing the joint Gaussian log-likelihood function. To demonstrate the numerical performance, we identify three unknown parameters in an example with 2,000,000 locations on a desktop PC.