Recent Submissions

• Smaller generalization error derived for deep compared to shallow residual neural networks

(arXiv, 2020-10-05) [Preprint]
Estimates of the generalization error are proved for a residual neural network with $L$ random Fourier features layers $\bar z_{\ell+1}=\bar z_\ell + \text{Re}\sum_{k=1}^K\bar b_{\ell k}e^{{\rm i}\omega_{\ell k}\bar z_\ell}+ \text{Re}\sum_{k=1}^K\bar c_{\ell k}e^{{\rm i}\omega'_{\ell k}\cdot x}$. An optimal distribution for the frequencies $(\omega_{\ell k},\omega'_{\ell k})$ of the random Fourier features $e^{{\rm i}\omega_{\ell k}\bar z_\ell}$ and $e^{{\rm i}\omega'_{\ell k}\cdot x}$ is derived. The derivation is based on the corresponding generalization error for approximating function values $f(x)$. The generalization error turns out to be smaller than the estimate ${\|\hat f\|^2_{L^1(\mathbb{R}^d)}}/{(LK)}$ of the generalization error for random Fourier features with one hidden layer and the same total number of nodes $LK$, in the case where the $L^\infty$-norm of $f$ is much smaller than the $L^1$-norm of its Fourier transform $\hat f$. This understanding of an optimal distribution for random features is used to construct a new training method for deep residual networks that shows promising results.
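For orientation, the shallow baseline in the comparison above, random Fourier features with a single hidden layer, can be sketched in a few lines. The target function and the standard-normal frequency distribution below are illustrative assumptions, not the optimal density derived in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative 1D target; the theory concerns general f with finite
# L^1-norm of the Fourier transform.
def f(x):
    return np.sin(x)

n, K = 200, 64                      # training samples, random features
x = rng.uniform(-np.pi, np.pi, n)
y = f(x)

# Frequencies of the features e^{i omega_k x}; a standard normal is an
# assumption here -- the paper derives an optimal sampling density.
omega = rng.normal(0.0, 1.0, K)

# Fit complex amplitudes b_k by least squares on Phi[j, k] = e^{i omega_k x_j}.
Phi = np.exp(1j * np.outer(x, omega))
b, *_ = np.linalg.lstsq(Phi, y.astype(complex), rcond=None)

# Shallow approximation: \bar f(x) = Re sum_k b_k e^{i omega_k x}.
x_test = np.linspace(-np.pi, np.pi, 400)
y_hat = np.real(np.exp(1j * np.outer(x_test, omega)) @ b)
mse = float(np.mean((y_hat - f(x_test)) ** 2))
print(f"test MSE with K={K} features: {mse:.3e}")
```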
• Lower Bounds and Optimal Algorithms for Personalized Federated Learning

(arXiv, 2020-10-05) [Preprint]
In this work, we consider the optimization formulation of personalized federated learning recently introduced by Hanzely and Richt\'arik (2020) which was shown to give an alternative explanation to the workings of local {\tt SGD} methods. Our first contribution is establishing the first lower bounds for this formulation, for both the communication complexity and the local oracle complexity. Our second contribution is the design of several optimal methods matching these lower bounds in almost all regimes. These are the first provably optimal methods for personalized federated learning. Our optimal methods include an accelerated variant of {\tt FedProx}, and an accelerated variance-reduced version of {\tt FedAvg}/Local {\tt SGD}. We demonstrate the practical superiority of our methods through extensive numerical experiments.
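For readers unfamiliar with the formulation, a minimal sketch of the Hanzely–Richtárik objective with illustrative quadratic local losses, solved by plain gradient descent rather than the accelerated methods of the paper, looks as follows.

```python
import numpy as np

# Personalized FL objective of Hanzely and Richtarik (2020):
#   min_{x_1,...,x_n} (1/n) sum_i f_i(x_i) + (lam/(2n)) sum_i ||x_i - xbar||^2,
# with xbar the average of the local models. The quadratic local losses
# f_i(x) = 0.5 ||x - a_i||^2 below are an illustrative assumption.
rng = np.random.default_rng(1)
n, d, lam = 5, 3, 2.0
a = rng.normal(size=(n, d))            # each client's local optimum

x = np.zeros((n, d))
step = 0.2
for _ in range(2000):                  # plain gradient descent (not optimal)
    xbar = x.mean(axis=0)
    x -= step * ((x - a) / n + lam * (x - xbar) / n)

# For quadratics the minimizer interpolates between the purely local models
# (lam -> 0) and the global average (lam -> infinity):
#   x_i* = (a_i + lam * abar) / (1 + lam),  abar = mean of the a_i.
x_star = (a + lam * a.mean(axis=0)) / (1 + lam)
print(np.max(np.abs(x - x_star)))
```

The closed form makes the "personalization knob" role of lam explicit: small lam keeps models local, large lam recovers the usual global federated model.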
• Joint seismic and electromagnetic inversion for reservoir mapping using a deep learning aided feature-oriented approach

(Society of Exploration Geophysicists, 2020-10-01) [Conference Paper]
As more and more types of geophysical measurements informing on different characteristics of subsurface formations become available, effectively synergizing the information from these measurements becomes critical to enhance deep reservoir characterization, determine interwell fluid distribution, and ultimately maximize oil recovery. In this study, we develop a feature-based model calibration workflow by combining the power of ensemble methods in data integration and deep learning techniques in feature segmentation. The performance of the developed workflow is demonstrated with a synthetic channelized reservoir model, in which crosswell seismic and electromagnetic (EM) data are jointly inverted.
• High-order accurate entropy-stable discontinuous collocated Galerkin methods with the summation-by-parts property for compressible CFD frameworks: Scalable SSDC algorithms and flow solver

(Journal of Computational Physics, Elsevier BV, 2020-09-22) [Article]
This work reports on the performance of a fully-discrete hp-adaptive entropy-stable discontinuous collocated Galerkin method for the compressible Navier–Stokes equations. The resulting code framework is denoted by SSDC, the first S for entropy, the second for stable, and DC for discontinuous collocated. The method is endowed with the summation-by-parts property, allows for arbitrary spatial and temporal order, and is implemented in an unstructured high-performance solver. The considered class of fully-discrete algorithms is systematically designed with mimetic and structure-preserving properties that allow the transfer of continuous proofs to the fully discrete setting. Our goal is to provide numerical evidence of the adequacy and maturity of these high-order methods as potential base schemes for the next generation of unstructured computational fluid dynamics tools. We provide a series of test cases of increasing difficulty, ranging from non-smooth to turbulent flows, in order to evaluate the numerical performance of the algorithms. Results on weak and strong scaling of the distributed-memory implementation demonstrate that the parallel SSDC solver can scale efficiently over 100,000 processes.
• Global sensitivity analysis of n-butanol ignition delay times to thermodynamics class and rate rule parameters

(Combustion and Flame, Elsevier BV, 2020-09-14) [Article]
We study the variability in the ignition delay time, τign, of n-butanol due to uncertainty in the enthalpies and entropies of the fuel and fuel radicals. A stoichiometric mixture reacting adiabatically at constant volume is considered, over a range of initial temperatures (700–1000 K) and pressures (10–80 bar). We develop a thermodynamic class approach to account for the variability in the thermodynamic properties of species of interest, and to define associated uncertainty ranges. To gain insight into the impact of the variability of the thermodynamic properties of individual species, a brute force sensitivity analysis is first performed. The results show that large variations in τign are mainly due to perturbations in the enthalpies of six species belonging to two thermodynamic classes. A refined 1D analysis is then conducted of the uncertain enthalpies and entropies of these six species. In particular, a complex, nonmonotonic dependence of τign on species enthalpies is observed, highlighting potential limitations in extrapolating local sensitivity results. The 1D analysis also shows that uncertainties in species entropies have a weaker impact on the variability in τign than the species enthalpies. A global sensitivity analysis of the impact of thermodynamic class uncertainties is then performed, namely using surrogates constructed using an adaptive pseudo-spectral method. The results indicate that the variability of τign is dominated by uncertainties in the classes associated with peroxy and hydroperoxide radicals. Lastly, we perform a combined sensitivity analysis of uncertainty in kinetic rates and thermodynamic properties. In particular, the results indicate that uncertainties in thermodynamic properties can induce variabilities in ignition delay time that are as large as those associated with kinetic rate uncertainties.
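The variance-based global sensitivity analysis can be illustrated with a toy pick-freeze Sobol estimator; the linear stand-in model and its coefficients are assumptions for illustration, and the study itself uses surrogates built with an adaptive pseudo-spectral method instead.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 200_000

def model(x1, x2):
    # Toy stand-in for tau_ign as a function of two uncertain thermodynamic
    # parameters; the true first-order Sobol index of x1 for this linear
    # model is 2^2 / (2^2 + 1^2) = 0.8.
    return 2.0 * x1 + 1.0 * x2

# Two independent sample sets (pick-freeze / Sobol sampling).
A1, A2 = rng.normal(size=N), rng.normal(size=N)
B2 = rng.normal(size=N)

fA = model(A1, A2)
fAB = model(A1, B2)            # keep x1 fixed, resample x2
S1 = np.cov(fA, fAB)[0, 1] / fA.var(ddof=1)
print(f"estimated first-order Sobol index of x1: {S1:.3f}")
```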
• Effects of Composition Heterogeneities on Flame Kernel Propagation: A DNS Study

(Fluids, MDPI AG, 2020-09-04) [Article]
In this study, a new set of direct numerical simulations is generated and used to examine the influence of mixture composition heterogeneities on the propagation of a premixed iso-octane/air spherical turbulent flame, with a representative chemical description. The dynamic effects of both turbulence and composition heterogeneities are considered, and their competition is assessed. The results of the turbulent homogeneous case are compared with those of heterogeneous cases, which are characterized by multiple stratification length scales and segregation rates in the wrinkled flame regime. The comparison reveals that stratification does not alter turbulent flame behaviors such as the preferential alignment of the convex flame front with the direction of compression. However, we find that the overall flame front propagation is slower in the presence of heterogeneities because of differences in propagation speed. Furthermore, an analysis of the different displacement speed components is performed, taking the multi-species formalism into account. This analysis shows that the global flame propagation front slows down due to the heterogeneities through the reaction mechanism and differential diffusion, accompanied by flame surface density variations. Quantification of the effects of each of these mechanisms shows that their intensity increases with increasing stratification length scale and segregation rate.
• Weak error rates for option pricing under the rough Bergomi model

(arXiv, 2020-09-02) [Preprint]
In quantitative finance, modeling the volatility structure of underlying assets is a key component in the pricing of options. Rough stochastic volatility models, such as the rough Bergomi model [Bayer, Friz, Gatheral, Quantitative Finance 16(6), 887-904, 2016], seek to fit observed market data based on the observation that the log-realized variance behaves like a fractional Brownian motion with small Hurst parameter, $H < 1/2$, over reasonable timescales. Both time series data of asset prices and option-derived price data indicate that $H$ often takes values close to $0.1$ or smaller, i.e., rougher than Brownian motion. This change greatly improves the fit to time series data of underlying asset prices as well as to option prices while maintaining parsimoniousness. However, the non-Markovian nature of the driving fractional Brownian motion in the rough Bergomi model poses severe challenges for theoretical and numerical analyses as well as for computational practice. While the explicit Euler method is known to converge to the solution of the rough Bergomi model, the strong rate of convergence is only $H$. We prove rate $H + 1/2$ for the weak convergence of the Euler method and, surprisingly, in the case of quadratic payoff functions we obtain rate one. Indeed, the problem of weak convergence for rough Bergomi is very subtle; we provide examples demonstrating that the rate of convergence for payoff functions well approximated by second-order polynomials, as weighted by the law of the fractional Brownian motion, may be hard to distinguish from rate one empirically. Our proof relies on Taylor expansions and an affine Markovian representation of the underlying and is further supported by numerical experiments.
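A sketch of the left-point Euler discretization in question, with illustrative parameter values, is given below; this is a plain Monte Carlo implementation for context, not the paper's analysis.

```python
import numpy as np

rng = np.random.default_rng(0)

# Rough Bergomi (Riemann-Liouville form): variance
#   v_t = xi0 * exp(eta * W^H_t - 0.5 * eta^2 * t^(2H)),
#   W^H_t = sqrt(2H) * int_0^t (t - s)^(H - 1/2) dW_s,
# and price dS = S * sqrt(v) * dZ with corr(W, Z) = rho.
# All parameter values are illustrative assumptions.
H, eta, rho, xi0 = 0.1, 1.0, -0.7, 0.04
S0, T, N, M = 1.0, 1.0, 100, 4000          # M paths, N time steps
dt = T / N
t = dt * np.arange(1, N + 1)

dW = rng.normal(0.0, np.sqrt(dt), (M, N))
dWp = rng.normal(0.0, np.sqrt(dt), (M, N))
dZ = rho * dW + np.sqrt(1 - rho**2) * dWp

# Left-point Riemann sum for the fractional driver on the grid:
#   W^H_{t_i} ~ sqrt(2H) * sum_{j < i} (t_i - t_j)^(H - 1/2) * dW_j.
KMat = np.zeros((N, N))
for i in range(N):
    lags = i - np.arange(i + 1) + 1        # lags i+1, i, ..., 1
    KMat[i, : i + 1] = np.sqrt(2 * H) * (lags * dt) ** (H - 0.5)
Wh = dW @ KMat.T                           # Wh[:, i] ~ W^H at t[i]

# Left-point variance on each step, then log-Euler for the price.
v = np.empty((M, N))
v[:, 0] = xi0
v[:, 1:] = xi0 * np.exp(eta * Wh[:, :-1] - 0.5 * eta**2 * t[:-1] ** (2 * H))

logS = np.log(S0) + np.cumsum(np.sqrt(v) * dZ - 0.5 * v * dt, axis=1)
price = float(np.mean(np.maximum(np.exp(logS[:, -1]) - 1.0, 0.0)))  # ATM call
print(f"MC price of ATM call: {price:.4f}")
```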
• Bound-preserving convex limiting for high-order Runge-Kutta time discretizations of hyperbolic conservation laws

(arXiv, 2020-09-02) [Preprint]
We introduce a general framework for enforcing local or global inequality constraints in high-order time-stepping methods for a scalar hyperbolic conservation law. The proposed methodology blends an arbitrary Runge-Kutta scheme and a bound-preserving (BP) first-order approximation using two kinds of limiting techniques. The first one is a predictor-corrector method that belongs to the family of flux-corrected transport (FCT) algorithms. The second approach constrains the antidiffusive part of a high-order target scheme using a new globalized monolithic convex (GMC) limiter. The flux-corrected approximations are BP under the time step restriction of the forward Euler method in the explicit case and without any time step restrictions in the implicit case. The FCT and GMC limiters can be applied to antidiffusive fluxes of intermediate RK stages and/or of the final solution update. Stagewise limiting ensures the BP property of intermediate cell averages. If the calculation of high-order fluxes involves polynomial reconstructions from BP data, these reconstructions can be constrained using a slope limiter to correct unacceptable input. The BP property of the final solution is guaranteed for all flux-corrected methods. Numerical studies are performed for one-dimensional test problems discretized in space using explicit weighted essentially nonoscillatory (WENO) finite volume schemes.
• Trajectory Planning for Autonomous Underwater Vehicles: A Stochastic Optimization Approach

(2020-08-30) [Dissertation]
Advisors: Knio, Omar; Shamma, Jeff S.
Committee members: Hoteit, Ibrahim; Lermusiaux, Pierre F.J.
In this dissertation, we develop a new framework for 3D trajectory planning of Autonomous Underwater Vehicles (AUVs) in realistic ocean scenarios. The work is divided into three parts. In the first part, we provide a new approach for deterministic trajectory planning in steady currents, described using Ocean General Circulation Model (OGCM) data. We apply a Non-Linear Programming (NLP) approach to the time-optimal trajectory planning problem. To demonstrate the effectiveness of the resulting model, we consider the time-optimal trajectory planning of an AUV operating in the Red Sea and the Gulf of Aden. In the second part, we generalize our 3D trajectory planning framework to time-dependent ocean currents. We also extend the framework to accommodate multi-objective criteria, focusing specifically on the Pareto front curve between time and energy. To assess the effectiveness of the extended framework, we initially test the methodology in idealized settings. The scheme is then demonstrated for time-energy trajectory planning problems in the Gulf of Aden. In the last part, we account for uncertainty in the ocean current field, which is described by an ensemble of flow realizations. The proposed approach is based on a non-linear stochastic programming methodology that uses a risk-aware objective function, accounting for the full variability of the flow ensemble. We formulate stochastic problems that aim to minimize a risk measure of the travel time or energy consumption, using a flexible methodology that enables the user to explore various objectives, ranging seamlessly from risk-neutral to risk-averse. The capabilities of the approach are demonstrated using steady and transient currents. Advanced visualization tools have also been designed to display the results.
• Fundamental Molecular Communication Modelling

(2020-08-25) [Thesis]
Committee members: Keyes, David E.; Parsani, Matteo
As the traditional communication technologies we use in our day-to-day lives reach their limitations, the international community is searching for new methods to communicate information. One such novel approach is the so-called molecular communication (MC) system. During the last few decades, molecular communication systems have become more and more popular. The main difference between traditional communication and molecular communication systems is that in the latter, information transfer occurs through chemical means, most often between microorganisms. This process already happens all around us naturally, for example, in the human body. Even though the molecular communication topic is attractive to researchers, and a lot of theoretical results are available, one cannot claim the same about the practical use of molecular communication. As for experimental results, a few studies have been done on the macroscale, but investigations at the micro- and nanoscale ranges are still lacking because they are a challenging task. In this work, a self-contained introduction to the underlying theory of molecular communication is provided, which includes knowledge from different areas such as biology, chemistry, communication theory, and applied mathematics. Two numerical methods are implemented for three well-studied partial differential equations of the MC field, where advection, diffusion, and reaction are taken into account. Numerical results for test cases in one and three dimensions are presented and discussed in detail. Conclusions and essential analytical and numerical future directions are then drawn.
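The advection-diffusion-reaction transport underlying such MC channel models can be sketched with a simple explicit finite-difference scheme. The grid, coefficients, and initial pulse below are illustrative assumptions; the thesis's actual methods and test cases may differ.

```python
import numpy as np

# 1D advection-diffusion-reaction for a molecule concentration c(x, t):
#   c_t + v * c_x = D * c_xx - k * c
# Upwind advection, central diffusion, explicit Euler in time, periodic BCs.
nx, L = 200, 1.0
dx = L / nx
v, D, k = 0.5, 1e-3, 0.1                    # illustrative coefficients
dt = 0.4 * min(dx / v, dx * dx / (2 * D))   # CFL-style stability bound

x = np.linspace(0.0, L, nx)
c = np.exp(-((x - 0.2) ** 2) / 0.002)       # initial pulse of molecules
mass0 = c.sum()

for _ in range(500):
    adv = -v * (c - np.roll(c, 1)) / dx                          # upwind, v > 0
    dif = D * (np.roll(c, -1) - 2 * c + np.roll(c, 1)) / dx**2   # central
    c = c + dt * (adv + dif - k * c)                             # reaction sink
```

With this time step the update is positivity-preserving, and the total mass decays only through the reaction term, which is a useful sanity check on such schemes.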
• Channel-Directed Gradients for Optimization of Convolutional Neural Networks

(arXiv, 2020-08-25) [Preprint]
We introduce optimization methods for convolutional neural networks that can be used to improve existing gradient-based optimization in terms of generalization error. The method requires only simple processing of existing stochastic gradients, can be used in conjunction with any optimizer, and has only a linear overhead (in the number of parameters) compared to computation of the stochastic gradient. The method works by computing the gradient of the loss function with respect to output-channel directed re-weighted L2 or Sobolev metrics, which has the effect of smoothing components of the gradient across a certain direction of the parameter tensor. We show that defining the gradients along the output channel direction leads to a performance boost, while other directions can be detrimental. We present the continuum theory of such gradients, its discretization, and application to deep networks. Experiments on benchmark datasets, several networks and baseline optimizers show that optimizers can be improved in generalization error by simply computing the stochastic gradient with respect to output-channel directed metrics.
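The channel-directed Sobolev idea can be sketched as follows; the path-graph Laplacian and the (I + lam*L)^{-1} smoothing below are one plausible discretization, assumed for illustration, and need not match the paper's exact metric.

```python
import numpy as np

def sobolev_smooth_channels(grad, lam=1.0):
    """Smooth a conv-weight gradient along its output-channel axis by
    applying (I + lam * L)^{-1}, where L is a 1D graph Laplacian over
    output channels -- a discrete Sobolev-type metric. This is a sketch
    of the idea in the abstract, not the paper's exact construction."""
    c_out = grad.shape[0]
    # Path-graph Laplacian over the output-channel index (symmetric PSD).
    L = 2 * np.eye(c_out) - np.eye(c_out, k=1) - np.eye(c_out, k=-1)
    L[0, 0] = L[-1, -1] = 1.0
    A = np.eye(c_out) + lam * L
    flat = grad.reshape(c_out, -1)
    return np.linalg.solve(A, flat).reshape(grad.shape)

# Example: a gradient for a conv layer with 16 output channels.
g = np.random.default_rng(0).normal(size=(16, 8, 3, 3))
gs = sobolev_smooth_channels(g)
```

Since the eigenvalues of I + lam*L are at least one, the operation only damps high-frequency variation across output channels and never amplifies the gradient.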
• A Hybrid Ensemble Adjustment Kalman Filter based High-resolution Data Assimilation System for the Red Sea: Implementation and Evaluation

(Quarterly Journal of the Royal Meteorological Society, Wiley, 2020-08-25) [Article]
A new hybrid ensemble data assimilation system is implemented with a Massachusetts Institute of Technology general circulation model (MITgcm) of the Red Sea. The system is based on the Data Assimilation Research Testbed (DART) and combines a time-varying ensemble generated by the Ensemble Adjustment Kalman filter (EAKF) with a pre-selected quasi-static (monthly varying) ensemble as used in an Ensemble Optimal Interpolation (EnOI) scheme. The goal is to develop an efficient system that enhances the state estimate and the model forecasting skill in the Red Sea with a reduced computational load compared to the EAKF. Observations of satellite sea surface temperature (SST), altimeter sea surface height (SSH), and in situ temperature and salinity profiles are assimilated to evaluate the new system. The performance of the hybrid scheme (hereafter Hybrid-EAKF) is assessed with respect to the EnOI and EAKF results. The comparisons are based on the daily averaged forecasts against satellite SST and SSH measurements and independent in situ temperature and salinity profiles. Hybrid-EAKF yields significant improvements in the ocean state estimates compared to both EnOI and EAKF, in particular mitigating the dynamical imbalances that affect EnOI. Hybrid-EAKF improves the SST and SSH root-mean-square differences by up to 20% compared to EAKF. High-resolution mesoscale eddy features, which dominate the Red Sea circulation, are also better represented in Hybrid-EAKF. A substantial reduction in computational cost, by about 75%, is also achieved with the Hybrid-EAKF system compared to the EAKF. These significant improvements were obtained after accounting for uncertainties in the atmospheric forcing and internal model physics in the time-varying ensemble.
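A much-simplified illustration of the hybrid idea blends a small time-varying ensemble covariance with a static one inside a single Kalman update; the actual system combines ensembles within DART's EAKF, so this is only a sketch on assumed toy data.

```python
import numpy as np

rng = np.random.default_rng(2)
nx, ne = 8, 5                      # state size, (small) dynamic ensemble

# Toy ensembles: a time-varying one and a pre-selected static one.
Xd = rng.normal(size=(nx, ne))
Xs = rng.normal(size=(nx, 40))

def cov(X):
    A = X - X.mean(axis=1, keepdims=True)
    return A @ A.T / (X.shape[1] - 1)

alpha = 0.5                        # blending weight (assumed)
B = alpha * cov(Xd) + (1 - alpha) * cov(Xs)   # hybrid background covariance

# Observe the first state component with error variance r; Kalman update.
H = np.zeros((1, nx)); H[0, 0] = 1.0
r = 0.1
xb = Xd.mean(axis=1)               # background mean
y = np.array([1.0])                # toy observation
K = B @ H.T @ np.linalg.inv(H @ B @ H.T + r)
xa = xb + (K @ (y - H @ xb)).ravel()
```

The static part stabilizes the covariance estimate when the dynamic ensemble is small, which is the computational trade-off the abstract describes.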
• Maximizing I/O bandwidth for reverse time migration on heterogeneous large-scale systems

(Springer International Publishing, 2020-08-17) [Conference Paper]
Reverse Time Migration (RTM) is an important scientific application for oil and gas exploration. The 3D RTM simulation generates terabytes of intermediate data that do not fit in main memory. In particular, RTM has two successive computational phases, i.e., the forward modeling and the backward propagation, that necessitate writing and then reading the state of the computed solution grid at specific time steps of the time integration. Advances in memory architecture have made it feasible and affordable to integrate hierarchical storage media on large-scale systems, from the traditional Parallel File System (PFS) to intermediate fast disk technologies (e.g., node-local and remote-shared Burst Buffers) and up to CPU main memory. To address the trend of heterogeneous HPC system deployment, we introduce an extension to our Multilayer Buffer System (MLBS) framework to further maximize RTM I/O bandwidth in the presence of GPU hardware accelerators. The main idea is to leverage the GPU's High Bandwidth Memory (HBM) as an additional storage media layer. The objective of MLBS is ultimately to hide the application's I/O overhead by enabling a buffering mechanism operating across all the hierarchical storage media layers. MLBS is therefore able to sustain the I/O bandwidth at each storage media layer. By asynchronously performing expensive I/O operations and creating opportunities for overlapping data motion with computations, MLBS may transform the original I/O-bound behavior of the RTM application into a compute-bound regime. In fact, the prefetching strategy of MLBS allows the RTM application to believe that it has access to a larger memory capacity on the GPU, while transparently performing the necessary housekeeping across the storage layers. 
We demonstrate the effectiveness of MLBS on the Summit supercomputer using 2048 compute nodes equipped with a total of 12288 GPUs by achieving up to 1.4X performance speedup compared to the reference PFS-based RTM implementation for large 3D solution grid.
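The buffering idea, a background thread staging data while the main loop computes, can be illustrated with a toy producer-consumer sketch; the queue of in-memory "snapshots" merely stands in for MLBS's multi-layer storage logic.

```python
import queue
import threading

# Double-buffered prefetch: a background thread stages "snapshots" into a
# bounded queue while the main loop consumes them, overlapping (simulated)
# I/O with computation -- the core idea behind MLBS, greatly simplified.
def reader(n_snapshots, buf):
    for i in range(n_snapshots):
        buf.put(("snapshot", i))      # stands in for an expensive read
    buf.put(None)                     # sentinel: no more data

buf = queue.Queue(maxsize=2)          # at most 2 snapshots staged at a time
t = threading.Thread(target=reader, args=(5, buf))
t.start()

consumed = []
while (item := buf.get()) is not None:
    consumed.append(item[1])          # "compute" on the prefetched snapshot
t.join()
```

The bounded queue is the key design point: it caps staging memory while still letting reads run ahead of the computation.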
• Extensions of dynamic programming for multi-stage combinatorial optimization

(Theoretical Computer Science, Elsevier BV, 2020-08-17) [Article]
We propose a dynamic programming framework for exact multi-stage (lexicographic) combinatorial optimization. Unlike conventional dynamic programming algorithms that return a single optimal solution, the two dynamic programming algorithms proposed in this paper cope with the whole set of optimal solutions or with an essential part of it. We describe the set of elements for optimization by a labeled directed acyclic graph, which, in some sense, is similar to the structure of subproblems of the considered problem. For a given cost function (objective), the first algorithm constructs a subgraph of this graph that describes the whole set of optimal elements or an essential part of it. This algorithm can be used for multi-stage optimization of elements relative to a sequence of cost functions. The second algorithm counts the elements before the optimization and after each optimization step. The considered labeled directed acyclic graph is a kind of circuit. This circuit builds the set of elements for optimization from one-element sets attached to input nodes. It uses the operation of union of sets attached to unifying nodes and functional operations attached to functional nodes. The algorithms for optimization and counting of elements are defined for an arbitrary circuit without repetitions, in which each element is generated exactly once. For a problem with a known conventional dynamic programming solution algorithm, it is usually easy to construct a corresponding circuit without repetitions. Once the circuit and cost functions are defined, our framework provides correctness proofs and a detailed time complexity analysis for the proposed algorithms. To make this approach more intuitive, we consider an illustrative example related to the maximum subarray problem. 
We tested our approach on the following nine combinatorial optimization problems: matrix chain multiplication, global sequence alignment, optimal paths in directed graphs, binary search trees, optimal bitonic tour, segmented least squares, convex polygon triangulation, one-dimensional clustering, and line breaking (text justification). We consider the last three problems in detail: we construct a circuit without repetitions, describe at least two cost functions, evaluate the number of operations and the time required by the algorithms, discuss an example, and report the results of computer experiments with randomly generated instances of the problem.
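The maximum subarray example can be made concrete with a brute-force two-stage (lexicographic) optimization: maximize the sum first, then, among all sum-optimal subarrays, minimize the length. The paper's framework achieves the same effect on a circuit without enumerating all candidates.

```python
def multi_stage_max_subarray(a):
    """Two-stage (lexicographic) optimization over contiguous subarrays:
    stage 1 maximizes the sum; stage 2, among the sum-optimal subarrays,
    minimizes the length. Brute force, for illustration only."""
    spans = [(i, j) for i in range(len(a)) for j in range(i + 1, len(a) + 1)]
    best_sum = max(sum(a[i:j]) for i, j in spans)
    optimal = [(i, j) for i, j in spans if sum(a[i:j]) == best_sum]
    min_len = min(j - i for i, j in optimal)
    return best_sum, [(i, j) for i, j in optimal if j - i == min_len]

# Both single-element subarrays [2] attain the optimal sum with minimal length.
print(multi_stage_max_subarray([2, -2, 2]))
```

Note how stage 2 operates only on the set of stage-1 optima, which is exactly why keeping the whole set of optimal solutions (rather than one) matters.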
DIRK schemes with high weak stage order

(Springer International Publishing, 2020-08-11) [Conference Paper]
Runge-Kutta time-stepping methods in general suffer from order reduction: the observed order of convergence may be less than the formal order when applied to certain stiff problems. Order reduction can be avoided by using methods with high stage order. However, diagonally-implicit Runge-Kutta (DIRK) schemes are limited to low stage order. In this paper we explore a weak stage order criterion, which for initial boundary value problems also serves to avoid order reduction, and which is compatible with a DIRK structure. We provide specific DIRK schemes of weak stage order up to 3, and demonstrate their performance in various examples.
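A minimal illustration of a DIRK scheme of the kind discussed is Alexander's two-stage, second-order SDIRK applied to a Prothero-Robinson-type problem; the check below confirms the classical order in a non-stiff regime, with illustrative parameter values.

```python
import numpy as np

# Two-stage, second-order, L-stable SDIRK (gamma = 1 - 1/sqrt(2)) applied to
#   y' = lam * (y - cos t) - sin t,  y(0) = 1,  exact solution y = cos t.
# Stage order is 1 and design order is 2; here we verify the classical
# (non-stiff) convergence rate empirically.
gam = 1.0 - 1.0 / np.sqrt(2.0)
lam = -2.0

def sdirk2_error(h, T=1.0):
    n = int(round(T / h))
    y = 1.0
    for k in range(n):
        t0 = k * h
        # For a linear ODE each implicit stage solves in closed form:
        #   k_i * (1 - h * gam * lam) = f evaluated at the stage point.
        t1 = t0 + gam * h
        k1 = (lam * (y - np.cos(t1)) - np.sin(t1)) / (1 - h * gam * lam)
        t2 = t0 + h
        k2 = (lam * (y + h * (1 - gam) * k1 - np.cos(t2))
              - np.sin(t2)) / (1 - h * gam * lam)
        y += h * ((1 - gam) * k1 + gam * k2)
    return abs(y - np.cos(T))

e1, e2 = sdirk2_error(0.1), sdirk2_error(0.05)
order = float(np.log2(e1 / e2))
print(f"observed order: {order:.2f}")
```

Repeating the same experiment with a very stiff lam is how order reduction is usually exhibited; the paper's weak-stage-order schemes are designed to avoid that degradation.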
• Next-Generation Sequencing at High Sequencing Depth as a Tool to Study the Evolution of Metastasis Driven by Genetic Change Events of Lung Squamous Cell Carcinoma

(Frontiers in Oncology, Frontiers Media SA, 2020-08-05) [Article]
Background: The aim of this study is to report tumoral genetic mutations observed at high sequencing depth in a lung squamous cell carcinoma (SqCC) sample. We describe the findings and differences in genetic mutations that were studied by deep next-generation sequencing methods on the primary tumor and liver metastasis samples. In this report, we also discuss how these differences may be involved in determining the tumor progression leading to the metastasis stage. Methods: We followed one lung SqCC patient who underwent FDG-PET scan imaging, before and after three months of treatment. We sequenced 26 well-known cancer-related genes, at an average of ~6,000 × sequencing coverage, in two spatially distinct regions, one from the primary lung tumor and the other from a distal liver metastasis, which was present before the treatment. Results: A total of 3,922,196 read pairs were obtained across the two samples' sequenced locations. Merged mapped reads showed several variants, from which we selected 36 with high-confidence calls. While we found 83% genetic concordance between the distal metastasis and the primary tumor, six variants presented substantial discordance. In the liver metastasis sample, we observed three de novo genetic changes, two on the FGFR3 gene and one on the CDKN2A gene, and the frequency of one variant found on the FGFR2 gene was increased. Two genetic variants in the HRAS gene, which were present initially in the primary tumor, were completely lost in the liver tumor. The discordant variants have coding consequences as follows: FGFR3 (c.746C>G, p.Ser249Cys), CDKN2A (c.47_50delTGGC, p.Leu16Profs*9), and HRAS (c.182A>C, p.Gln61Pro). The pathogenicity prediction scores, assessed using several databases, reported the acquired variants as pathogenic, with a gain of function for FGFR3 and a loss of function for CDKN2A. 
The patient follow-up using imaging with 18F-FDG PET/CT before and after four cycles of treatment shows discordant tumor progression in metastatic liver compared to primary lung tumor. Conclusions: Our results report the occurrence of several genetic changes between primary tumor and distant liver metastasis in lung SqCC, among which non-silent mutations may be associated with tumor evolution during metastasis.
• A note on tools for prediction under uncertainty and identifiability of SIR-like dynamical systems for epidemiology

(arXiv, 2020-08-04) [Preprint]
We provide an overview of the methods that can be used for prediction under uncertainty and data fitting of dynamical systems, and of the fundamental challenges that arise in this context. The focus is on SIR-like models, which are commonly used when attempting to predict the trend of the COVID-19 pandemic. In particular, we raise a warning flag about the identifiability of the parameters of SIR-like models; often, it might be hard to infer the correct values of the parameters from data, even for very simple models, making it non-trivial to use these models for meaningful predictions. Most of the points that we touch upon are, in fact, generally valid for inverse problems in more general setups.
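The identifiability warning can be made concrete: two SIR parameter pairs with the same early growth rate beta - gamma are nearly indistinguishable from early incidence data. The parameter values below are illustrative.

```python
import numpy as np

# Two SIR parameter sets with identical beta - gamma produce nearly
# identical early trajectories, so early data cannot pin down (beta, gamma)
# separately -- the identifiability issue flagged in the abstract.
def sir_infected(beta, gamma, days=10.0, dt=0.01, i0=1e-4):
    s, i = 1.0 - i0, i0
    traj = [i]
    for _ in range(int(days / dt)):       # simple forward-Euler integration
        ds = -beta * s * i
        di = beta * s * i - gamma * i
        s, i = s + dt * ds, i + dt * di
        traj.append(i)
    return np.array(traj)

iA = sir_infected(beta=0.50, gamma=0.25)  # R0 = 2.0
iB = sir_infected(beta=0.40, gamma=0.15)  # R0 ~ 2.7, same beta - gamma
rel_diff = float(np.max(np.abs(iA - iB)) / iA.max())
print(f"max relative difference over 10 days: {rel_diff:.2e}")
```

Despite very different basic reproduction numbers, the two infection curves stay within a small fraction of each other over the early phase.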
• Sequencing Effort Dictates Gene Discovery in Marine Microbial Metagenomes

(Environmental Microbiology, Wiley, 2020-08-03) [Article]
Massive metagenomic sequencing combined with gene prediction methods has previously been used to compile the gene catalogue of ocean and host-associated microbes. Global expeditions conducted over the past 15 years have sampled the ocean to build a catalogue of genes from pelagic microbes. Here we undertook a large sequencing effort of a perturbed Red Sea plankton community and found that the rate of gene discovery increases continuously with sequencing effort, with no indication that the retrieved 2.83 million non-redundant (complete) genes predicted from the experiment represented a nearly complete inventory of the genes present in the sampled community (i.e., no evidence of saturation). The underlying reason is the Pareto-like distribution of the abundance of genes in the plankton community, resulting in a very long tail of millions of genes present at remarkably low abundances, which can only be retrieved through massive sequencing. Microbial metagenomic projects retrieve a variable number of unique genes per Tera base-pair (Tbp), with a median value of 14.7 million unique genes per Tbp sequenced across projects. The increase in the rate of gene discovery in microbial metagenomes with sequencing effort implies that there is ample room for new gene discovery in further ocean and holobiont sequencing studies.
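The mechanism described in the abstract can be mimicked in a toy rarefaction experiment: sampling reads from a Pareto-like abundance distribution yields unique-gene counts that keep climbing with depth. The gene pool size and tail exponent are assumptions for illustration only.

```python
import numpy as np

# Heavy-tailed (Pareto-like) gene abundances: a long tail of rare genes
# means gene discovery does not saturate with sequencing depth.
rng = np.random.default_rng(3)
n_genes = 1_000_000
w = rng.pareto(1.0, n_genes) + 1.0          # power-law relative abundances
p = w / w.sum()

depths = [10_000, 100_000, 1_000_000]       # simulated read counts
discovered = [np.unique(rng.choice(n_genes, size=d, p=p)).size
              for d in depths]
print(dict(zip(depths, discovered)))
```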
• Characterization of pressure fluctuations within a controlled-diffusion blade boundary layer using the equilibrium wall-modelled LES.

(Scientific reports, Springer Science and Business Media LLC, 2020-07-31) [Article]
In this study, the generation of airfoil trailing edge broadband noise that arises from the interaction of turbulent boundary layer with the airfoil trailing edge is investigated. The primary objectives of this work are: (i) to apply a wall-modelled large-eddy simulation (WMLES) approach to predict the flow of air passing a controlled-diffusion blade, and (ii) to study the blade broadband noise that is generated from the interaction of a turbulent boundary layer with a lifting surface trailing edge. This study is carried out for two values of the Mach number, [Formula: see text] and 0.5, two values of the chord Reynolds number, [Formula: see text] and [Formula: see text], and two angles of attack, AoA [Formula: see text] and [Formula: see text]. To examine the influence of the grid resolution on aerodynamic and aeroacoustic quantities, we compare our results with experimental data available in the literature. We also compare our results with two in-house numerical solutions generated from two wall-resolved LES (WRLES) calculations, one of which has a DNS-like resolution. We show that WMLES accurately predicts the mean pressure coefficient distribution, velocity statistics (including the mean velocity), and the traces of Reynolds tensor components. Furthermore, we observe that the instantaneous flow structures computed by the WMLES resemble those found in the reference WMLES database, except near the leading edge region. Some of the differences observed in these structures are associated with tripping and the transition to a turbulence mechanism near the leading edge, which are significantly affected by the grid resolution. The aeroacoustic noise calculations indicate that the power spectral density profiles obtained using the WMLES compare well with the experimental data.
Nonlinear Wave Motion in Viscoelasticity and Free Surface Flows

(2020-07-24) [Dissertation]