Recent Submissions

  • Nonlinear Preconditioning Strategies for Two-Phase Flows in Porous Media Discretized by a Fully Implicit Discontinuous Galerkin Method

    Luo, Li; Cai, Xiao-Chuan; Keyes, David E. (SIAM Journal on Scientific Computing, Society for Industrial & Applied Mathematics (SIAM), 2021-06-09) [Article]
    We consider numerical simulation of two-phase flows in porous media using implicit methods. Because of the complex features involving heterogeneous permeability and nonlinear capillary effects, the nonlinear algebraic systems arising from the discretization are very difficult to solve. The traditional Newton method suffers from slow convergence in the form of a long stagnation or sometimes does not converge at all. In this paper, we develop nonlinear preconditioning strategies for the system of two-phase flows discretized by a fully implicit discontinuous Galerkin method. The preconditioners identify and approximately eliminate the local high nonlinearities that cause the Newton method to take small updates. Specifically, we propose two elimination strategies: one is based on exploring the unbalanced nonlinearities of the pressure and the saturation fields, and the other is based on identifying certain elements of the finite element space that have much higher nonlinearities than the rest of the elements. We compare the performance and robustness of the proposed algorithms with an existing single-field elimination approach and the classical inexact Newton method with respect to some physical and numerical parameters. Experiments on three-dimensional porous media applications show that the proposed algorithms are superior to other methods in terms of robustness and parallel efficiency.
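    To illustrate the general idea of nonlinear elimination preconditioning (not the paper's field- or element-based DG formulation), here is a minimal, hypothetical sketch: before each global Newton step, the "bad" unknowns with strong local nonlinearity are approximately eliminated by solving their sub-residual equations with the remaining unknowns frozen. The toy residual and all names are illustrative only.
```python
# Hypothetical sketch of nonlinear-elimination-preconditioned inexact Newton
# (illustration only; the paper's elimination for the DG-discretized two-phase
# flow system is far more elaborate).
import numpy as np

def F(x):
    # Toy residual with an artificially stiff second component.
    return np.array([x[0] + 0.1 * x[1] - 1.0,
                     np.exp(5.0 * x[1]) - x[0] - 1.0])

def jacobian(x, eps=1e-7):
    n = x.size
    J = np.zeros((n, n))
    f0 = F(x)
    for j in range(n):
        xp = x.copy(); xp[j] += eps
        J[:, j] = (F(xp) - f0) / eps
    return J

def eliminate_bad_components(x, bad, tol=1e-10, iters=20):
    """Approximately solve F_bad(x) = 0 for the 'bad' unknowns, others frozen."""
    x = x.copy()
    for _ in range(iters):
        r = F(x)[bad]
        if np.linalg.norm(r) < tol:
            break
        J = jacobian(x)[np.ix_(bad, bad)]
        x[bad] -= np.linalg.solve(J, r)
    return x

def preconditioned_newton(x0, bad, tol=1e-10, iters=50):
    x = x0.copy()
    for k in range(iters):
        x = eliminate_bad_components(x, bad)   # nonlinear preconditioning step
        r = F(x)
        print(f"it {k}: ||F|| = {np.linalg.norm(r):.3e}")
        if np.linalg.norm(r) < tol:
            break
        x -= np.linalg.solve(jacobian(x), r)   # (inexact) Newton update
    return x

print(preconditioned_newton(np.array([0.0, 0.0]), bad=[1]))
```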
  • Crustal and upper-mantle structure below Central and Southern Mexico

    Espindola-Carmona, A.; Peter, Daniel; Ortiz-Aleman, C. (Journal of Geophysical Research: Solid Earth, American Geophysical Union (AGU), 2021-06-03) [Article]
    The Mexican subduction zone is a complex convergent margin characterized by a slab with an unusual morphology, an abnormal location of the volcanic arc, and the existence of slow earthquakes (slow slip events and tectonic tremors). The number of seismic imaging studies of the Mexican subduction zone has increased over the past decade. These studies have revealed the seismic wave velocity structure beneath Central and Southern Mexico. However, the existing tomographic models are either confined to specific areas or have coarse resolution. As a consequence, they do not capture the lateral and finer-scale variations in the mantle and the crust structure. Here, we fit frequency-dependent traveltime differences between observed and synthetic seismograms in a three-dimensional model of Central and Southern Mexico to constrain crustal and upper mantle seismic velocity structures jointly. We use ∼3,300 seismic records, filtered between 5 and 50 s, from 74 regional earthquakes. We perform point spread function tests to assess the resolution and trade-offs between model parameters. Our tomographic model reveals several seismic wave velocity features. These features include the geometry of the Cocos slab, the partial melting zone beneath the Trans-Mexican Volcanic Belt, and the Yucatan slab diving below Los Tuxtlas Volcanic Field that are consistent with previous studies. We also identify horizontal structures in the lower crust and the shallow upper mantle that were not previously identified. These structures include slow seismic wave velocity anomalies that may be correlated with ultra-slow seismic velocity zones and high conductivity regions, where slow earthquakes have been identified.
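    As a rough illustration of the kind of misfit used here, the sketch below measures a traveltime shift between an observed and a synthetic trace as the lag maximizing their cross-correlation after band-pass filtering in the 5-50 s period range quoted in the abstract. It assumes SciPy and a synthetic test pulse; the actual workflow (3-D synthetics, adjoint kernels, point spread function tests) is far more involved.
```python
# Minimal sketch: band-pass both traces, then estimate the traveltime shift as
# the lag maximizing their cross-correlation (one common misfit in waveform
# tomography). Illustrative only.
import numpy as np
from scipy.signal import butter, filtfilt, correlate

def bandpass(trace, dt, tmin, tmax):
    """Zero-phase Butterworth band-pass between periods tmin and tmax (s)."""
    nyq = 0.5 / dt
    b, a = butter(4, [1.0 / tmax / nyq, 1.0 / tmin / nyq], btype="band")
    return filtfilt(b, a, trace)

def cc_traveltime_shift(obs, syn, dt, tmin=5.0, tmax=50.0):
    obs_f, syn_f = bandpass(obs, dt, tmin, tmax), bandpass(syn, dt, tmin, tmax)
    cc = correlate(obs_f, syn_f, mode="full")
    lag = np.argmax(cc) - (len(syn_f) - 1)
    return lag * dt  # positive: observed arrives later than the synthetic

# Synthetic test: a broad Ricker-like pulse delayed by 3 s.
dt, t = 0.1, np.arange(0.0, 200.0, 0.1)
wavelet = lambda t0: (1 - ((t - t0) / 10.0) ** 2) * np.exp(-((t - t0) / 10.0) ** 2)
print(cc_traveltime_shift(obs=wavelet(103.0), syn=wavelet(100.0), dt=dt))  # ~3.0 s
```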
  • Mapping full seismic waveforms to vertical velocity profiles by deep learning

    Kazei, Vladimir; Ovcharenko, Oleg; Plotnitskii, Pavel; Peter, Daniel; Zhang, Xiangliang; Alkhalifah, Tariq Ali (GEOPHYSICS, Society of Exploration Geophysicists, 2021-05-28) [Article]
    Building realistic and reliable models of the subsurface is the primary goal of seismic imaging. Here we construct an ensemble of convolutional neural networks (CNNs) to build velocity models directly from the data. Most other approaches attempt to map full data into 2D labels. We exploit the regularity of seismic acquisition and train CNNs to map gathers of neighboring common midpoints (CMPs) to vertical 1D velocity logs. This allows us to integrate well-log data into the inversion, simplify the mapping by using the 1D labels, and accommodate larger dips relative to using single CMP inputs. We dynamically generate the training data in parallel with training the CNNs, which reduces overfitting. Data generation and training of the CNNs is more computationally expensive than conventional full-waveform inversion (FWI). However, once the network is trained, data sets with similar acquisition parameters can be inverted much faster than with FWI. The multiCMP CNN ensemble is tested on multiple realistic synthetic models, performs well, and was combined with FWI for even better performance.
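    A hypothetical sketch of the kind of network involved is shown below: a small PyTorch CNN that takes a group of neighboring CMP gathers as input channels and regresses a 1-D velocity profile. The architecture, layer sizes, and tensor shapes are illustrative assumptions, not the ensemble used in the paper.
```python
# Hypothetical sketch of a CNN that maps a group of neighboring CMP gathers
# (channels = CMPs, height = time samples, width = offsets) to a 1-D vertical
# velocity profile. Architecture and shapes are illustrative, not the paper's.
import torch
import torch.nn as nn

class CMP2Velocity(nn.Module):
    def __init__(self, n_cmps=7, n_time=256, n_offsets=32, n_depths=100):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(n_cmps, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((8, 4)),
        )
        self.regressor = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 8 * 4, 256), nn.ReLU(),
            nn.Linear(256, n_depths),      # one velocity value per depth sample
        )

    def forward(self, x):                  # x: (batch, n_cmps, n_time, n_offsets)
        return self.regressor(self.features(x))

model = CMP2Velocity()
gathers = torch.randn(4, 7, 256, 32)       # a random mini-batch of CMP groups
print(model(gathers).shape)                # torch.Size([4, 100])
```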
  • A rotated characteristic decomposition technique for high-order reconstructions in multi-dimensions

    Shen, Hua; Parsani, Matteo (arXiv, 2021-05-21) [Preprint]
    When constructing high-order schemes for solving hyperbolic conservation laws, the corresponding high-order reconstructions are commonly performed in characteristic spaces to eliminate spurious oscillations as much as possible. For multi-dimensional finite volume (FV) schemes, we need to perform the characteristic decomposition several times in different normal directions of the target cell, which is very time-consuming. In this paper, we propose a rotated characteristic decomposition technique which requires only one-time decomposition for multi-dimensional reconstructions. The rotated direction depends only on the gradient of a specific physical quantity which is cheap to calculate. This technique not only reduces the computational cost remarkably, but also controls spurious oscillations effectively. We take a third-order weighted essentially non-oscillatory finite volume (WENO-FV) scheme for solving the Euler equations as an example to demonstrate the efficiency of the proposed technique.
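    The basic ingredient is cheap to compute: one direction per cell, obtained from the gradient of some physical quantity, defines a single rotated frame in which the one-time characteristic decomposition is performed. The sketch below uses the density gradient as that quantity, which is an assumption for illustration; the paper only states that a specific, cheap-to-compute quantity is used, and the WENO-FV reconstruction itself is not reproduced.
```python
# Sketch: pick one "rotated" direction per cell from the density gradient and
# build the rotation that aligns the first axis with it; the characteristic
# decomposition is then done once per cell in this frame instead of once per
# face-normal direction. (Illustrative; the paper's scheme is 3rd-order WENO-FV.)
import numpy as np

def rotated_direction(grad_rho, eps=1e-12):
    """Unit direction from the density gradient; fall back to the x-axis if flat."""
    norm = np.linalg.norm(grad_rho)
    return grad_rho / norm if norm > eps else np.array([1.0, 0.0])

def rotation_matrix_2d(n):
    """Rotation mapping the global frame to one whose first axis is n."""
    t = np.array([-n[1], n[0]])            # tangential direction
    return np.vstack([n, t])

grad_rho = np.array([0.3, 0.4])            # e.g. from a least-squares cell gradient
n = rotated_direction(grad_rho)
R = rotation_matrix_2d(n)
u = np.array([2.0, -1.0])                  # cell velocity in global coordinates
print(R @ u)                               # velocity components in the rotated frame
```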
  • Efficiency assessment of approximated spatial predictions for large datasets

    Hong, Yiping; Abdulah, Sameh; Genton, Marc G.; Sun, Ying (Spatial Statistics, Elsevier BV, 2021-05-14) [Article]
    Due to the well-known computational showstopper of the exact Maximum Likelihood Estimation (MLE) for large geospatial observations, a variety of approximation methods have been proposed in the literature, which usually require tuning certain inputs. For example, the recently developed Tile Low-Rank approximation (TLR) method involves many tuning parameters, including numerical accuracy. To properly choose the tuning parameters, it is crucial to adopt a meaningful criterion for the assessment of the prediction efficiency with different inputs. Unfortunately, the most commonly-used Mean Square Prediction Error (MSPE) criterion cannot directly assess the loss of efficiency when the spatial covariance model is approximated. Though the Kullback–Leibler Divergence criterion can provide the information loss of the approximated model, it cannot give more detailed information that one may be interested in, e.g., the accuracy of the computed MSE. In this paper, we present three other criteria, the Mean Loss of Efficiency (MLOE), Mean Misspecification of the Mean Square Error (MMOM), and Root mean square MOM (RMOM), and show numerically that, in comparison with the common MSPE criterion and the Kullback–Leibler Divergence criterion, our criteria are more informative, and thus more adequate to assess the loss of the prediction efficiency by using the approximated or misspecified covariance models. Hence, our suggested criteria are more useful for the determination of tuning parameters for sophisticated approximation methods of spatial model fitting. To illustrate this, we investigate the trade-off between the execution time, estimation accuracy, and prediction efficiency for the TLR method with extensive simulation studies and suggest proper settings of the TLR tuning parameters. We then apply the TLR method to a large spatial dataset of soil moisture in the area of the Mississippi River basin, and compare the TLR with the Gaussian predictive process and the composite likelihood method, showing that our suggested criteria can successfully be used to choose the tuning parameters that can keep the estimation or the prediction accuracy in applications.
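    A hedged sketch of the idea behind these criteria, for simple kriging of a zero-mean Gaussian field, is given below: the loss of efficiency compares the true-model prediction error of the approximated kriging weights with that of the optimal weights, and the misspecification of the MSE compares the declared (plug-in) error with the true one. The exponential covariances, the misspecified range, and the problem sizes are illustrative assumptions, not the paper's TLR setup.
```python
# Hedged sketch of the MLOE / MMOM / RMOM idea for simple kriging with a
# zero-mean Gaussian process. Illustrative only.
import numpy as np

def expcov(d, sigma2=1.0, beta=0.1):
    return sigma2 * np.exp(-d / beta)

def criteria(locs, pred_locs, cov_true, cov_approx):
    D = np.abs(locs[:, None] - locs[None, :])
    Ct, Ca = cov_true(D), cov_approx(D)
    loe, mom = [], []
    for s0 in pred_locs:
        d0 = np.abs(locs - s0)
        ct, ca = cov_true(d0), cov_approx(d0)
        wt = np.linalg.solve(Ct, ct)         # optimal weights (true model)
        wa = np.linalg.solve(Ca, ca)         # weights from the approximated model
        et_t = cov_true(0.0) - ct @ wt                      # E_t[(Yhat_t - Y)^2]
        et_a = cov_true(0.0) - 2 * wa @ ct + wa @ Ct @ wa   # E_t[(Yhat_a - Y)^2]
        ea_a = cov_approx(0.0) - ca @ wa                    # E_a[(Yhat_a - Y)^2]
        loe.append(et_a / et_t - 1.0)
        mom.append(ea_a / et_a - 1.0)
    loe, mom = np.array(loe), np.array(mom)
    return loe.mean(), mom.mean(), np.sqrt((mom ** 2).mean())  # MLOE, MMOM, RMOM

locs = np.linspace(0.0, 1.0, 50)
pred = np.linspace(0.05, 0.95, 10)
print(criteria(locs, pred,
               cov_true=lambda d: expcov(d, beta=0.10),
               cov_approx=lambda d: expcov(d, beta=0.15)))   # misspecified range
```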
  • Seismic Velocity Variations in a 3D Martian Mantle: Implications for the InSight Measurements

    Plesa, Ana-Catalina; Bozdağ, E.; Rivoldini, Attilio; Knapmeyer, Martin; McLennan, Scott; Padovan, S.; Tosi, Nicola; Breuer, Doris; Peter, Daniel; Staehler, Simon; Wieczorek, M. A.; van Driel, M.; Khan, A.; Spohn, T. (Journal of Geophysical Research: Planets, American Geophysical Union (AGU), 2021-05-14) [Article]
    We use a large data set of 3D thermal evolution models to predict the distribution of present-day seismic velocities in the Martian interior. Our models show a difference between maximum and minimum S-wave velocity of up to 10% either below the crust, where thermal variations are largest, or at the depth of the olivine to wadsleyite phase transition, located at around 1000 – 1200 km depth. Models with thick lithospheres on average have weak low-velocity zones that extend deeper than 400 km, and seismic velocity variations in the uppermost 400 – 600 km that closely follow the crustal thickness pattern. For these cases the crust contains more than half of the total amount of heat producing elements. Models with limited crustal heat production have thinner lithospheres and shallower but prominent low-velocity zones that are incompatible with InSight observations. Seismic events suggested to originate in Cerberus Fossae indicate the absence of S-wave shadow zones in 25° - 30° epicentral distance. This result is compatible with previous best-fit models that require a large average lithospheric thickness and a crust containing more than half of the bulk amount of heat producing elements to be compatible with geological and geophysical constraints. Ongoing and future InSight measurements that will determine the existence of a weak low-velocity zone will directly bear on the crustal heat production.
  • High-Performance Partial Spectrum Computation for Symmetric eigenvalue problems and the SVD

    Keyes, David E.; Ltaief, Hatem; Nakatsukasa, Y.; Sukkari, D. (arXiv, 2021-04-29) [Preprint]
    Current dense symmetric eigenvalue (EIG) and singular value decomposition (SVD) implementations may suffer from the lack of concurrency during the tridiagonal and bidiagonal reductions, respectively. This performance bottleneck is typical for the two-sided transformations due to the Level-2 BLAS memory-bound calls. Therefore, the current state-of-the-art EIG and SVD implementations may achieve only a small fraction of the system's sustained peak performance. The QR-based Dynamically Weighted Halley (QDWH) algorithm may be used as a pre-processing step toward the EIG and SVD solvers, while mitigating the aforementioned bottleneck. QDWH-EIG and QDWH-SVD expose more parallelism, while relying on compute-bound matrix operations. Both run closer to the sustained peak performance of the system, but at the expense of performing more FLOPS than the standard EIG and SVD algorithms. In this paper, we introduce a new QDWH-based solver for computing the partial spectrum for EIG (QDWHpartial-EIG) and SVD (QDWHpartial-SVD) problems. By optimizing the rational function underlying the algorithms only in the desired part of the spectrum, QDWHpartial-EIG and QDWHpartial-SVD algorithms efficiently compute a fraction (say 1-20%) of the corresponding spectrum. We develop high-performance implementations of QDWHpartial-EIG and QDWHpartial-SVD on distributed-memory systems and demonstrate their numerical robustness. Experimental results using up to 36K MPI processes show performance speedups for QDWHpartial-SVD up to 6X and 2X against PDGESVD from ScaLAPACK and KSVD, respectively. QDWHpartial-EIG outperforms PDSYEVD from ScaLAPACK up to 3.5X but remains slower compared to ELPA. QDWHpartial-EIG achieves, however, a better occupancy of the underlying hardware by extracting higher sustained peak performance than ELPA, which is critical moving forward with accelerator-based supercomputers.
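    For orientation, the sketch below shows the dynamically weighted Halley iteration for the polar factor in its explicit, inverse-based form, together with the spectral divide-and-conquer idea behind QDWH-EIG; the production solvers use the QR-based, inverse-free variant and the partial-spectrum optimization of the rational function described in the paper, neither of which is reproduced here.
```python
# Hedged sketch of the dynamically weighted Halley (DWH) iteration for the
# polar factor of a symmetric matrix (explicit form; QDWH uses a QR-based,
# inverse-free equivalent).
import numpy as np

def dwh_polar(A, tol=1e-12, maxit=30):
    alpha = np.linalg.norm(A, 2)                 # upper bound on the spectrum
    X = A / alpha
    l = 1.0 / np.linalg.cond(X, 2)               # lower bound on sigma_min(X)
    I = np.eye(A.shape[0])
    for _ in range(maxit):
        d = (4.0 * (1.0 - l**2) / l**4) ** (1.0 / 3.0)
        a = np.sqrt(1.0 + d) + 0.5 * np.sqrt(8.0 - 4.0 * d
                + 8.0 * (2.0 - l**2) / (l**2 * np.sqrt(1.0 + d)))
        b = (a - 1.0) ** 2 / 4.0
        c = a + b - 1.0
        X_new = X @ (a * I + b * X.T @ X) @ np.linalg.inv(I + c * X.T @ X)
        l = l * (a + b * l**2) / (1.0 + c * l**2)
        if np.linalg.norm(X_new - X) < tol:
            X = X_new
            break
        X = X_new
    return X                                     # orthogonal polar factor U

# Spectral divide-and-conquer idea behind QDWH-EIG: for symmetric A and a
# shift sigma, (U + I)/2 with U = polar factor of (A - sigma*I) projects onto
# the invariant subspace of eigenvalues above sigma.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 50)); A = (A + A.T) / 2
sigma = 0.0
U = dwh_polar(A - sigma * np.eye(50))
P = (U + np.eye(50)) / 2
print(int(round(np.trace(P))), np.sum(np.linalg.eigvalsh(A) > sigma))  # should match
```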
  • Optimized Runge-Kutta Methods with Automatic Step Size Control for Compressible Computational Fluid Dynamics

    Ranocha, Hendrik; Dalcin, Lisandro; Parsani, Matteo; Ketcheson, David I. (arXiv, 2021-04-14) [Preprint]
    We develop error-control based time integration algorithms for compressible fluid dynamics (CFD) applications and show that they are efficient and robust in both the accuracy-limited and stability-limited regime. Focusing on discontinuous spectral element semidiscretizations, we design new controllers for existing methods and for some new embedded Runge-Kutta pairs. We demonstrate the importance of choosing adequate controller parameters and provide a means to obtain these in practice. We compare a wide range of error-control-based methods, along with the common approach in which step size control is based on the Courant-Friedrichs-Lewy (CFL) number. The optimized methods give improved performance and naturally adopt a step size close to the maximum stable CFL number at loose tolerances, while additionally providing control of the temporal error at tighter tolerances. The numerical examples include challenging industrial CFD applications.
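    As a generic illustration of error-based step size control (not the paper's optimized pairs or controller coefficients), the sketch below advances a simple embedded Heun/Euler pair with a textbook PI controller: the step size reacts to the current and the previous normalized error estimate.
```python
# Hedged sketch of an embedded Runge-Kutta pair with a PI step-size controller.
# The pair (Heun/Euler, orders 2/1) and the controller gains are generic
# textbook choices, not the optimized methods and controllers of the paper.
import numpy as np

def heun_euler_step(f, t, y, h):
    k1 = f(t, y)
    k2 = f(t + h, y + h * k1)
    y_high = y + 0.5 * h * (k1 + k2)     # 2nd-order solution
    y_low = y + h * k1                   # 1st-order embedded solution
    return y_high, np.linalg.norm(y_high - y_low)

def integrate_pi(f, t0, y0, t_end, h0=1e-2, tol=1e-5, alpha=0.7/2, beta=0.4/2):
    t, y, h, eps_prev = t0, np.atleast_1d(np.asarray(y0, dtype=float)), h0, 1.0
    while t < t_end:
        h = min(h, t_end - t)
        y_new, err = heun_euler_step(f, t, y, h)
        eps = max(err / tol, 1e-10)      # normalized error estimate
        if eps <= 1.0:                   # accept the step
            t, y, eps_prev = t + h, y_new, eps
        # PI controller: react to the current error and the previous one
        h *= min(5.0, max(0.2, 0.9 * eps ** (-alpha) * eps_prev ** beta))
    return t, y

f = lambda t, y: -y                       # simple non-stiff test problem
print(integrate_pi(f, 0.0, [1.0], 5.0))   # y(5) should be close to exp(-5)
```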
  • A class of high-order weighted compact central schemes for solving hyperbolic conservation laws

    Shen, Hua; Parsani, Matteo (arXiv, 2021-04-09) [Preprint]
    We propose a class of weighted compact central (WCC) schemes for solving hyperbolic conservation laws. The linear version can be considered as a high-order extension of the central Lax-Friedrichs (LxF) scheme and the central conservation element and solution element (CESE) scheme. On every cell, the solution is approximated by a Pth-order polynomial whose degrees of freedom (DOFs) are all stored and updated separately. The cell average is updated by a classical finite volume scheme constructed on space-time staggered meshes such that the fluxes are continuous across the interfaces of adjacent control volumes and, therefore, the local Riemann problem is bypassed. The kth-order spatial derivatives are updated by a central difference of the (k-1)th-order spatial derivatives at cell vertices. All the space-time information is calculated by the Cauchy-Kovalewski procedure. By doing so, the schemes achieve an arbitrarily high, uniform space-time order of accuracy on a super-compact stencil with only one explicit time step. In order to capture discontinuities without spurious oscillations, a weighted essentially non-oscillatory (WENO) type limiter is tailor-made for the schemes. The limiter preserves the compactness and high-order accuracy of the schemes. The accuracy, robustness, and efficiency of the schemes are verified by several numerical examples of scalar conservation laws and the compressible Euler equations.
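    The staggered-mesh idea that lets these schemes bypass the local Riemann problem is easiest to see in its first-order form, sketched below for Burgers' equation: the new cell averages live on a mesh shifted by half a cell, so the fluxes are evaluated at the old cell centers where the solution is smooth. The high-order compact reconstruction, the stored derivatives, the Cauchy-Kovalewski expansion, and the WENO-type limiter of the paper are not reproduced.
```python
# Sketch of the first-order staggered central (Lax-Friedrichs-type) update that
# the WCC schemes build on: cell averages on a staggered mesh are advanced with
# fluxes evaluated at the old cell centers, so no Riemann solver is needed.
import numpy as np

def staggered_lxf_step(u, f, dt, dx):
    """One step u_{j+1/2}^{n+1} = (u_j + u_{j+1})/2 - dt/dx (f(u_{j+1}) - f(u_j))."""
    return 0.5 * (u + np.roll(u, -1)) - dt / dx * (f(np.roll(u, -1)) - f(u))

# Burgers' equation u_t + (u^2/2)_x = 0 on a periodic domain.
n, dx = 400, 1.0 / 400
x = (np.arange(n) + 0.5) * dx
u = np.sin(2 * np.pi * x)
f = lambda u: 0.5 * u ** 2
t, cfl = 0.0, 0.4
while t < 0.5:
    dt = cfl * dx / max(np.abs(u).max(), 1e-12)
    u = staggered_lxf_step(u, f, dt, dx)   # solution now lives on the staggered mesh
    t += dt
print(u.min(), u.max())                    # shock has formed and been captured
```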
  • High Performance Multivariate Geospatial Statistics on Manycore Systems

    Salvaña, Mary Lai O.; Abdulah, Sameh; Huang, Huang; Ltaief, Hatem; Sun, Ying; Genton, Marc G.; Keyes, David E. (IEEE Transactions on Parallel and Distributed Systems, Institute of Electrical and Electronics Engineers (IEEE), 2021-04-06) [Article]
    Modeling and inferring spatial relationships and predicting missing values of environmental data are some of the main tasks of geospatial statisticians. These routine tasks are accomplished using multivariate geospatial models and the cokriging technique, which requires the evaluation of the expensive Gaussian log-likelihood function. This large-scale cokriging challenge provides a fertile ground for supercomputing implementations for the geospatial statistics community as it is paramount to scale computational capability to match the growth in environmental data. In this paper, we develop large-scale multivariate spatial modeling and inference on parallel hardware architectures. To tackle the increasing complexity in matrix operations and the massive concurrency in parallel systems, we leverage low-rank matrix approximation techniques with task-based programming models and schedule the asynchronous computational tasks using a dynamic runtime system. The proposed framework provides both the dense and approximated computations of the Gaussian log-likelihood function. It demonstrates accuracy robustness and performance scalability on a variety of computer systems. Using both synthetic and real datasets, the low-rank matrix approximation shows better performance compared to exact computation, while preserving the application requirements in both parameter estimation and prediction accuracy. We also propose a novel algorithm to assess the prediction accuracy after the online parameter estimation.
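    The expensive kernel referred to here is the Gaussian log-likelihood itself. The sketch below evaluates it exactly with a dense Cholesky factorization for a small bivariate field whose covariance is an intrinsic-coregionalization exponential model; that covariance choice and the sizes are illustrative assumptions, and the paper's contribution is precisely to replace this dense factorization with tile-low-rank approximations scheduled by a dynamic runtime on manycore hardware.
```python
# Hedged sketch of the exact Gaussian log-likelihood evaluation underlying
# cokriging-based inference. Illustrative only.
import numpy as np

def cross_covariance(locs, beta, T):
    """Kronecker-style covariance: T[k, l] * exp(-||si - sj|| / beta)."""
    d = np.linalg.norm(locs[:, None, :] - locs[None, :, :], axis=-1)
    return np.kron(T, np.exp(-d / beta))

def gaussian_loglik(z, sigma):
    L = np.linalg.cholesky(sigma)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, z))  # sigma^{-1} z
    n = z.size
    return -0.5 * (n * np.log(2 * np.pi) + 2 * np.sum(np.log(np.diag(L))) + z @ alpha)

rng = np.random.default_rng(1)
locs = rng.random((300, 2))                        # 300 spatial locations
T = np.array([[1.0, 0.5], [0.5, 0.8]])             # cross-covariance of 2 variables
sigma = cross_covariance(locs, beta=0.2, T=T)
z = np.linalg.cholesky(sigma) @ rng.standard_normal(600)   # one simulated data vector
print(gaussian_loglik(z, sigma))
```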
  • The Arab world prepares the exascale workforce

    Keyes, David E. (Communications of the ACM, Association for Computing Machinery (ACM), 2021-04) [Article]
    The Arab world is currently host to eight supercomputers in the Top500 globally, including the current #10 and a former #7. Hardware can become a honeypot for talent attraction: senior talent from abroad, and rising talent from within. Good return on investment from leading-edge hardware motivates forging collaborative ties to global supercomputing leaders, which leads to integration into the global campaigns that supercomputing excels in, such as predicting climate change and developing sustainable energy resources for its mitigation, positing properties of new materials and catalysis by design, repurposing already-certified drugs and discovering new ones, and big data analytics and machine learning applied to science and to society. While the petroleum industry has been the historical motivation for supercomputing in the Arab world, with its workloads of seismic imaging and reservoir modeling, the attraction today is universal.
  • Constraints on the upper mantle structure beneath the Pacific from 3-D anisotropic waveform modelling

    Kendall, Elodie; Ferreira, A.M.G.; Chang, Sung-Joon; Witek, M.; Peter, Daniel (Journal of Geophysical Research: Solid Earth, American Geophysical Union (AGU), 2021-03-24) [Article]
    Seismic radial anisotropy is a crucial tool to help constrain flow in the Earth's mantle. However, Earth structure beneath the oceans imaged by current 3-D radially anisotropic mantle models shows large discrepancies. In this study, we provide constraints on the radially anisotropic upper mantle structure beneath the Pacific by waveform modelling and subsequent inversion. Specifically, we objectively evaluate three 3-D tomography mantle models which exhibit varying distributions of radial anisotropy through comparisons of independent real datasets with synthetic seismograms computed with the spectral-element method. The data require an asymmetry at the East Pacific Rise (EPR) with stronger positive radial anisotropy ξ = 1.13-1.16 at ∼100 km depth to the west of the East Pacific Rise than to the east (ξ = 1.11-1.13). This suggests that the anisotropy in this region is due to the lattice preferred orientation (LPO) of anisotropic mantle minerals produced by shear-driven asthenospheric flow beneath the South Pacific Superswell. Our new radial anisotropy constraints in the Pacific show three distinct positive linear anomalies at ∼100 km depth. These anomalies are possibly related to mantle entrainment at the Nazca-South America subduction zone, flow at the East Pacific Rise and from the South Pacific Superswell and SPO (shape-preferred orientation) of melt beneath Hawaii. Radial anisotropy reduces with lithospheric age to ξ < 1.05 in the west at ∼100 km depth, which possibly reflects a deviation from horizontal flow as the mantle is entrained with subducting slabs, a change in temperature or water content that could alter the anisotropic olivine fabric or the shape-preferred orientation of melt.
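    Assuming the standard definition of the radial anisotropy parameter, ξ = (V_SH/V_SV)², the reported values translate into a modest excess of horizontally over vertically polarized shear-wave speed, as the quick check below shows.
```python
# Quick check, assuming the conventional definition xi = (V_SH / V_SV)^2:
# xi = 1.13-1.16 west of the EPR corresponds to V_SH exceeding V_SV by ~6-8%.
import numpy as np

for xi in (1.11, 1.13, 1.16):
    print(f"xi = {xi:.2f}  ->  V_SH/V_SV = {np.sqrt(xi):.3f}")
```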
  • The influence of depth-varying elastic properties in controlling the dynamic rupture of megathrust earthquakes and upper-plate coseismic deformation

    Prada, Manel; Galvez, Percy; Sanchez-Linares, Carlos; Ampuero, Jean-Paul; Sallarès, Valentí; Macias, Jorge; Peter, Daniel (Copernicus GmbH, 2021-03-03) [Presentation]
    It has been recently proposed that the depth-varying rupture properties of megathrust earthquakes can be explained by the depth distribution of elastic properties of the rocks overlying the megathrust fault. Here we demonstrate that such a relationship is mechanically viable by using 3D dynamic rupture simulations. We compare results from two subduction zone scenarios with different depth distributions of elastic properties to explore the influence of realistic upper-plate elasticity on rupture characteristics such as slip, rupture duration, and frequency content. The first scenario has a homogeneous distribution of elastic properties, with values of Vp, Vs, and density typical of rocks overlying the megathrust fault at 25 km depth. The second scenario includes the typical depth distribution of elastic properties overlying the megathrust fault inferred from worldwide tomographic models of the upper plate. For both scenarios, we simulate three cases with ruptures confined to the shallow domain (0-5 km depth), the transitional domain (5-10 km depth), and the regular domain (10-25 km depth), respectively. We assume the same friction properties for both scenarios. Results show that the realistic distribution of elastic properties accounts for increasing slip and decreasing high-frequency content trenchwards, and that slip may be 8 times larger and corner frequency 2 times lower in the shallow domain than in the regular domain. Rupture times along depth show that a rupture through a realistic elastic model may be 2.5-3 times slower in the shallow domain than in the regular domain. Depth variations of slip, frequency content, and rupture time quantitatively agree with previous predictions, confirming that depletion of high-frequency content and slow rupture are inherent to ruptures propagating through the shallow domain, where elastic properties drop more rapidly than in the regular and transitional domains. Depth-dependent elastic properties also affect the dynamics of slip rate. Peak slip-rate values in the heterogeneous model anticorrelate with rigidity variations and are 3-4 times higher than those observed in the homogeneous model in the shallow domain. The increasing peak slip-rate difference trenchwards correlates with increasing local ground-motion differences between the models. We also find important differences in the permanent coseismic deformation of the upper plate. We show that coseismic deformation is significantly larger in the shallow domain in the heterogeneous model, where uplift ratios may be up to 2 times larger and along-dip displacement of the seafloor may be >6 times larger than the displacement values from the homogeneous model. We use the permanent seafloor uplift from both models to simulate the corresponding tsunamis with the Tsunami-HySEA software. The results show that, at the coast, the maximum amplitude of the tsunami generated by the heterogeneous model may be up to 25% larger than that excited by the homogeneous model. This study demonstrates the relevant role of upper-plate elasticity in controlling not only rupture characteristics but also coseismic upper-plate deformation and tsunamigenesis. Neglecting the distribution of these properties may result in important underestimation of slip, rupture time, and local ground motion, as well as of the coseismic seafloor deformation in the shallow domain, which in turn may lead to underestimation of tsunami size.
  • Triple Decomposition of Velocity Gradient Tensor in Compressible Turbulence

    Boukharfane, Radouan; Er-raiy, Aimad; Alzaben, Linda; Parsani, Matteo (Fluids, MDPI AG, 2021-03-02) [Article]
    The decomposition of the local motion of a fluid into straining, shearing, and rigid-body rotation is examined in this work for a compressible isotropic turbulence by means of direct numerical simulations. The triple decomposition is closely associated with a basic reference frame (BRF), in which the extraction of the biasing effect of shear is maximized. In this study, a new computational and inexpensive procedure is proposed to identify the BRF for a three-dimensional flow field. In addition, the influence of compressibility effects on some statistical properties of the turbulent structures is addressed. The direct numerical simulations are carried out with a Reynolds number that is based on the Taylor micro-scale of Reλ=100 for various turbulent Mach numbers that range from Mat=0.12 to Mat=0.89. The DNS database is generated with an improved seventh-order accurate weighted essentially non-oscillatory scheme to discretize the non-linear advective terms, and an eighth-order accurate centered finite difference scheme is retained for the diffusive terms. One of the major findings of this analysis is that regions featuring strong rigid-body rotations or straining motions are highly spatially intermittent, while most of the flow regions exhibit moderately strong shearing motions in the absence of rigid-body rotations and straining motions. The majority of compressibility effects can be estimated if the scaling laws in the case of compressible turbulence are rescaled by only considering the solenoidal contributions.
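    A hedged sketch of the elementwise form of such a triple decomposition is given below, following the Kolář-style construction and assuming the velocity gradient tensor is already expressed in the basic reference frame; the paper's actual contribution, an inexpensive procedure to identify the BRF in 3-D, is not reproduced.
```python
# Hedged sketch of an elementwise triple decomposition of a velocity gradient
# tensor into shear, elongation (straining), and rigid-body rotation parts,
# assuming the tensor is already expressed in the basic reference frame (BRF).
import numpy as np

def triple_decomposition(A):
    """Return (shear, elongation, rotation) parts of the gradient tensor A (in BRF)."""
    residual = np.sign(A) * np.minimum(np.abs(A), np.abs(A.T))
    shear = A - residual
    elongation = 0.5 * (residual + residual.T)   # symmetric straining part
    rotation = 0.5 * (residual - residual.T)     # antisymmetric rigid-body part
    return shear, elongation, rotation

A = np.array([[0.1, 0.8, 0.0],
              [-0.2, -0.1, 0.0],
              [0.0, 0.0, 0.0]])                  # a simple quasi-2-D gradient
for name, part in zip(("shear", "elongation", "rotation"), triple_decomposition(A)):
    print(name, "\n", part)
```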
  • A sharp interface method using enriched finite elements for elliptic interface problems

    Hollbacher, Susanne; Wittum, Gabriel (Numerische Mathematik, Springer Nature, 2021-03-02) [Article]
    We present an immersed boundary method for the solution of elliptic interface problems with discontinuous coefficients which provides a second-order approximation of the solution. The proposed method can be categorised as an extended or enriched finite element method. In contrast to other extended FEM approaches, the new shape functions get projected in order to satisfy the Kronecker-delta property with respect to the interface. The resulting combination of projection and restriction was already derived in Höllbacher and Wittum (TBA, 2019a) for application to particulate flows. The crucial benefits are the preservation of the symmetry and positive definiteness of the continuous bilinear operator. Besides, no additional stabilisation terms are necessary. Furthermore, since our enrichment can be interpreted as adaptive mesh refinement, the standard integration schemes can be applied on the cut elements. Finally, small cut elements do not impair the condition of the scheme and we propose a simple procedure to ensure good conditioning independent of the location of the interface. The stability and convergence of the solution will be proven and the numerical tests demonstrate optimal order of convergence.
  • Parallel Hierarchical Matrix Technique to Approximate Large Covariance Matrices, Likelihood Functions and Parameter Identification

    Litvinenko, Alexander; Berikov, V.; Genton, Marc G.; Keyes, David E.; Kriemann, R.; Sun, Ying (2021-03-01) [Presentation]
    We develop the HLIBCov package, which uses parallel hierarchical (H-) matrices to: 1) approximate large dense inhomogeneous covariance matrices with a log-linear computational cost and storage requirement; 2) compute the matrix-vector product, Cholesky factorization, and inverse with log-linear complexity; 3) identify unknown parameters of the covariance function (variance, smoothness, and covariance length). These unknown parameters are estimated by maximizing the joint Gaussian log-likelihood function. To demonstrate the numerical performance, we identify three unknown parameters in an example with 2,000,000 locations on a desktop PC.
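    For scale, the sketch below performs the same parameter-identification step with a dense Cholesky on a tiny problem: maximize the Gaussian log-likelihood over the Matérn variance, length, and smoothness using SciPy. The Matérn parameterization, optimizer, sizes, and starting values are illustrative assumptions; HLIBCov's point is to replace the dense linear algebra with H-matrix approximations so that millions of locations become tractable.
```python
# Hedged sketch of Matern parameter identification by maximum likelihood
# (dense Cholesky on a tiny problem; illustrative only).
import numpy as np
from scipy.special import gamma, kv
from scipy.optimize import minimize

def matern(d, variance, length, nu):
    s = np.sqrt(2.0 * nu) * np.maximum(d, 1e-12) / length
    c = variance * (2.0 ** (1.0 - nu) / gamma(nu)) * s ** nu * kv(nu, s)
    return np.where(d <= 1e-12, variance, c)   # exact value at zero distance

def neg_loglik(theta, locs, z):
    variance, length, nu = np.exp(theta)        # work with log-parameters
    d = np.linalg.norm(locs[:, None, :] - locs[None, :, :], axis=-1)
    C = matern(d, variance, length, nu) + 1e-8 * np.eye(len(z))
    L = np.linalg.cholesky(C)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, z))
    return 0.5 * (len(z) * np.log(2 * np.pi) + 2 * np.sum(np.log(np.diag(L))) + z @ alpha)

rng = np.random.default_rng(2)
locs = rng.random((200, 2))
d = np.linalg.norm(locs[:, None, :] - locs[None, :, :], axis=-1)
z = np.linalg.cholesky(matern(d, 1.0, 0.2, 1.0) + 1e-8 * np.eye(200)) @ rng.standard_normal(200)
res = minimize(neg_loglik, x0=np.log([0.5, 0.1, 0.5]), args=(locs, z), method="Nelder-Mead")
print(np.exp(res.x))        # estimated (variance, length, smoothness)
```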
  • KSPHPDDM and PCHPDDM: Extending PETSc with advanced Krylov methods and robust multilevel overlapping Schwarz preconditioners

    Jolivet, Pierre; Roman, Jose E.; Zampini, Stefano (Computers and Mathematics with Applications, Elsevier BV, 2021-01-22) [Article]
    Contemporary applications in computational science and engineering often require the solution of linear systems which may be of different sizes, shapes, and structures. The goal of this paper is to explain how two libraries, PETSc and HPDDM, have been interfaced in order to offer end-users robust overlapping Schwarz preconditioners and advanced Krylov methods featuring recycling and the ability to deal with multiple right-hand sides. The flexibility of the implementation is showcased and explained with minimalist, easy-to-run, and reproducible examples, to ease the integration of these algorithms into more advanced frameworks. The examples provided cover applications from eigenanalysis, elasticity, combustion, and electromagnetism.
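    A hypothetical petsc4py usage sketch is shown below, selecting the HPDDM Krylov solver and preconditioner by their type names on a toy serial matrix. It assumes a PETSc/petsc4py build configured with HPDDM (e.g. --download-hpddm); a production multilevel setup typically also supplies a local auxiliary (Neumann) matrix for the coarse correction, and further options such as -ksp_hpddm_type gcrodr select the recycling Krylov variants discussed in the paper.
```python
# Hedged usage sketch (petsc4py): selecting KSPHPDDM and PCHPDDM by type name.
# Assumes PETSc was configured with HPDDM support; intended as a serial toy run.
from petsc4py import PETSc

n = 100                                            # small 1-D Laplacian test matrix
A = PETSc.Mat().createAIJ([n, n])
A.setUp()
rstart, rend = A.getOwnershipRange()
for i in range(rstart, rend):
    A.setValue(i, i, 2.0)
    if i > 0:
        A.setValue(i, i - 1, -1.0)
    if i < n - 1:
        A.setValue(i, i + 1, -1.0)
A.assemble()

b = A.createVecLeft(); b.set(1.0)
x = A.createVecRight()

ksp = PETSc.KSP().create()
ksp.setOperators(A)
ksp.setType("hpddm")                               # KSPHPDDM (recycling-capable Krylov)
ksp.getPC().setType("hpddm")                       # PCHPDDM (overlapping Schwarz)
ksp.setFromOptions()                               # e.g. -ksp_hpddm_type gcrodr
ksp.solve(b, x)
print(ksp.getIterationNumber(), ksp.getConvergedReason())
```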
  • Exploiting low-rank covariance structures for computing high-dimensional normal and Student-t probabilities

    Cao, Jian; Genton, Marc G.; Keyes, David E.; Turkiyyah, George M. (Statistics and Computing, Springer Nature, 2021-01-12) [Article]
    We present a preconditioned Monte Carlo method for computing high-dimensional multivariate normal and Student-t probabilities arising in spatial statistics. The approach combines a tile-low-rank representation of covariance matrices with a block-reordering scheme for efficient quasi-Monte Carlo simulation. The tile-low-rank representation decomposes the high-dimensional problem into many diagonal-block-size problems and low-rank connections. The block-reordering scheme reorders between and within the diagonal blocks to reduce the impact of integration variables from right to left, thus improving the Monte Carlo convergence rate. Simulations up to dimension 65,536 suggest that the new method can improve the run time by an order of magnitude compared with the hierarchical quasi-Monte Carlo method and two orders of magnitude compared with the dense quasi-Monte Carlo method. Our method also forms a strong substitute for the approximate conditioning methods as a more robust estimation with error guarantees. An application study to wind stochastic generators is provided to illustrate that the new computational method makes the maximum likelihood estimation feasible for high-dimensional skew-normal random fields.
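    To fix ideas, the sketch below estimates a multivariate normal probability with the brute-force dense Monte Carlo baseline: sample with a Cholesky factor and count the fraction of samples inside the box. The paper's method replaces the dense factor with a tile-low-rank one, applies quasi-Monte Carlo after a block reordering, and scales to dimension 65,536; the covariance and sizes here are illustrative only.
```python
# Hedged sketch of the brute-force Monte Carlo baseline for P(a <= X <= b)
# under a multivariate normal. Illustrative only.
import numpy as np

def mvn_probability_mc(cov, a, b, n_samples=200_000, seed=0):
    rng = np.random.default_rng(seed)
    L = np.linalg.cholesky(cov)                       # dense factor; TLR in the paper
    z = rng.standard_normal((n_samples, cov.shape[0]))
    x = z @ L.T
    inside = np.all((x >= a) & (x <= b), axis=1)
    p = inside.mean()
    return p, np.sqrt(p * (1 - p) / n_samples)        # estimate and MC standard error

d = 16
dist = np.abs(np.subtract.outer(np.arange(d), np.arange(d)))
cov = np.exp(-dist / 5.0)                             # simple exponential correlation
a, b = -np.inf * np.ones(d), np.ones(d)               # P(X <= 1) componentwise
print(mvn_probability_mc(cov, a, b))
```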
  • Entropy–Stable No–Slip Wall Boundary Conditions for the Eulerian Model for Viscous and Heat Conducting Compressible Flows

    Sayyari, Mohammed; Dalcin, Lisandro; Parsani, Matteo (American Institute of Aeronautics and Astronautics (AIAA), 2021-01-11) [Conference Paper]
    Nonlinear (entropy) stability analysis is used to derive entropy–stable no–slip wall boundary conditions at the continuous and semi–discrete levels for the Eulerian model proposed by Svärd in 2018 (Physica A: Statistical Mechanics and its Applications, 2018). The spatial discretization is based on discontinuous Galerkin summation-by-parts operators of any order for unstructured grids. We provide a set of two–dimensional numerical results for laminar and turbulent flows simulated with both the Eulerian and classical Navier–Stokes models. These results are computed with a high-performance h–entropy–stable solver, which also features explicit and implicit entropy–stable time integration schemes.
  • A Comparison of Parallel Profiling Tools for Programs utilizing the FFT

    Leu, Brian; Aseeri, Samar; Muite, Benson (Association for Computing Machinery (ACM), 2021-01-06) [Conference Paper]
    Performance monitoring is an important component of code optimization. Performance monitoring is also important for the beginning user but can be difficult to configure appropriately. The overhead of the performance monitoring tools Craypat, FPMP, mpiP, Scalasca, and TAU is measured using default configurations likely to be chosen by a novice user and is shown to be small when profiling Fast Fourier Transform based solvers for the Klein-Gordon equation built on 2decomp&FFT and on FFTE. The performance measurements help explain why, despite FFTE having a more efficient parallel algorithm, it is not always faster than 2decomp&FFT: its compiled single-core FFT is not as fast as FFTW, which is used in 2decomp&FFT.

View more