## Search

Now showing items 1-10 of 27

A 1.8 trillion degrees-of-freedom, 1.24 petaflops global seismic wave simulation on the K computer

Tsuboi, Seiji; Ando, Kazuto; Miyoshi, Takayuki; Peter, Daniel; Komatitsch, Dimitri; Tromp, Jeroen (SAGE Publications, 2016-03-01)

We present high-performance simulations of global seismic wave propagation with an unprecedented accuracy of 1.2 s seismic period for a realistic three-dimensional Earth model using the spectral element method on the K computer. Our seismic simulations use a total of 665.2 billion grid points and resolve 1.8 trillion degrees of freedom. To realize these large-scale computations, we optimize a widely used community software code to efficiently address all hardware parallelization, especially thread-level parallelization to solve the bottleneck of memory usage for coarse-grained parallelization. The new code exhibits excellent strong scaling for the time stepping loop, that is, parallel efficiency on 82,134 nodes relative to 36,504 nodes is 99.54%. Sustained performance of these computations on the K computer is 1.24 petaflops, which is 11.84% of its peak performance. The obtained seismograms with an accuracy of 1.2 s for the entire globe should help us to better understand rupture mechanisms of devastating earthquakes.

Bayesian Parameter Estimation via Filtering and Functional Approximations

Matthies, Hermann G.; Litvinenko, Alexander; Rosic, Bojana V.; Zander, Elmar (arXiv, 2016-11-25)

The inverse problem of determining parameters in a model by comparing some output of the model with observations is addressed. This is a description of what has to be done to use the Gauss-Markov-Kalman filter for the Bayesian estimation and updating of parameters in a computational model. This is a filter acting on random variables, and while its Monte Carlo variant, the Ensemble Kalman Filter (EnKF), is fairly straightforward, we subsequently only sketch its implementation with the help of functional representations.

BDDC Deluxe for Isogeometric Analysis

da Veiga, L. Beirão; Pavarino, L. F.; Scacchi, S.; Widlund, O. B.; Zampini, Stefano (Springer Science + Business Media, 2016)

The main goal of this paper is to design, analyze, and test a BDDC (Balancing Domain Decomposition by Constraints, see [12, 23]) preconditioner for Isogeometric Analysis (IGA), based on a novel type of interface averaging, which we will denote by deluxe scaling, with either full or reduced set of primal constraints. IGA is an innovative numerical methodology, introduced in [17] and first analyzed in [1], where the geometry description of the PDE domain is adopted from a Computer Aided Design (CAD) parametrization usually based on Non-Uniform Rational B-Splines (NURBS) and the same NURBS basis functions are also used as the PDEs discrete basis, following an isoparametric paradigm; see the monograph [10]. Recent works on IGA preconditioners have focused on overlapping Schwarz preconditioners [3, 5, 7, 9], multigrid methods [16], and non-overlapping preconditioners [4, 8, 20].

Efficient Sphere Detector Algorithm for Massive MIMO using GPU Hardware Accelerator

Arfaoui, Mohamed-Amine; Ltaief, Hatem; Rezki, Zouheir; Alouini, Mohamed-Slim; Keyes, David E. (Elsevier BV, 2016-06-01)

To further enhance the capacity of next generation wireless communication systems, massive MIMO has recently appeared as a necessary enabling technology to achieve high performance signal processing for large-scale multiple antennas. However, massive MIMO systems inevitably generate signal processing overheads, which translate into an ever-increasing rate of complexity, and therefore, such systems may not maintain the inherent real-time requirement of wireless systems. We redesign the non-linear sphere decoder method to increase the performance of the system, cast most memory-bound computations into compute-bound operations to reduce the overall complexity, and maintain the real-time processing thanks to the GPU computational power. We show a comprehensive complexity and performance analysis on an unprecedented MIMO system scale, which can ease the design phase toward simulating future massive MIMO wireless systems.

Entropy Stable Staggered Grid Discontinuous Spectral Collocation Methods of any Order for the Compressible Navier–Stokes Equations

Parsani, Matteo; Carpenter, Mark H.; Fisher, Travis C.; Nielsen, Eric J. (Society for Industrial & Applied Mathematics (SIAM), 2016-10-04)

Staggered grid, entropy stable discontinuous spectral collocation operators of any order are developed for the compressible Euler and Navier–Stokes equations on unstructured hexahedral elements. This generalization of previous entropy stable spectral collocation work [M. H. Carpenter, T. C. Fisher, E. J. Nielsen, and S. H. Frankel, SIAM J. Sci. Comput., 36 (2014), pp. B835–B867; M. Parsani, M. H. Carpenter, and E. J. Nielsen, J. Comput. Phys., 292 (2015), pp. 88–113] extends the applicable set of points from tensor product Legendre–Gauss–Lobatto (LGL) points to a combination of tensor product Legendre–Gauss (LG) and LGL points. The new semidiscrete operators discretely conserve mass, momentum, and energy, and satisfy a mathematical entropy inequality for the compressible Navier–Stokes equations in three spatial dimensions. They are valid for smooth as well as discontinuous flows. The staggered LG and conventional LGL point formulations are compared on several challenging test problems. The staggered LG operators are significantly more accurate, although more costly from a theoretical point of view. The LG and LGL operators exhibit similar robustness, as is demonstrated using test problems known to be problematic for operators that lack a nonlinear stability proof for the compressible Navier–Stokes equations (e.g., discontinuous Galerkin, spectral difference, or flux reconstruction operators).

Convergence Analysis for the Multiplicative Schwarz Preconditioned Inexact Newton Algorithm

Liu, Lulu; Keyes, David E. (Society for Industrial & Applied Mathematics (SIAM), 2016-10-26)

The multiplicative Schwarz preconditioned inexact Newton (MSPIN) algorithm, based on decomposition by field type rather than by subdomain, was recently introduced to improve the convergence of systems with unbalanced nonlinearities. This paper provides a convergence analysis of the MSPIN algorithm. Under reasonable assumptions, it is shown that MSPIN is locally convergent, and desired superlinear or even quadratic convergence can be obtained when the forcing terms are picked suitably.

Estimation of uncertain parameters of large Matern covariance functions with using hierarchical matrix technique

Litvinenko, Alexander; Genton, Marc G.; Sun, Ying; Keyes, David E. (2016-01-09)

Entropy Stable Summation-by-Parts Formulations for Compressible Computational Fluid Dynamics

Carpenter, M.H.; Fisher, T.C.; Nielsen, E.J.; Parsani, Matteo; Svärd, M.; Yamaleev, N. (Elsevier BV, 2016-11-09)

A systematic approach based on a diagonal-norm summation-by-parts (SBP) framework is presented for implementing entropy stable (SS) formulations of any order for the compressible Navier–Stokes equations (NSE). These SS formulations discretely conserve mass, momentum, and energy, and satisfy a mathematical entropy equality for smooth problems. They are also valid for discontinuous flows provided sufficient dissipation is added at shocks and discontinuities to satisfy an entropy inequality. Admissible SBP operators include all centred diagonal-norm finite-difference (FD) operators and Legendre spectral collocation-finite element methods (LSC-FEM). Entropy stable multiblock FD and FEM operators follow immediately via nonlinear coupling operators that ensure conservation and accuracy and preserve the interior entropy estimates. Nonlinearly stable solid wall boundary conditions are also available. Existing SBP operators that lack a stability proof (e.g. weighted essentially nonoscillatory) may be combined with an entropy stable operator using a comparison technique to guarantee nonlinear stability of the pair. All capabilities extend naturally to a curvilinear form of the NSE provided that the coordinate mappings satisfy a geometric conservation law constraint. Examples are presented that demonstrate the robustness of current state-of-the-art entropy stable SBP formulations.

Adaptive Optics Simulation for the World's Largest Telescope on Multicore Architectures with Multiple GPUs

Ltaief, Hatem; Gratadour, Damien; Charara, Ali; Gendron, Eric (Association for Computing Machinery (ACM), 2016-06-02)

We present a high performance comprehensive implementation of a multi-object adaptive optics (MOAO) simulation on multicore architectures with hardware accelerators in the context of computational astronomy. This implementation will be used as an operational testbed for simulating the design of new instruments for the European Extremely Large Telescope project (E-ELT), the world's biggest eye and one of Europe's highest priorities in ground-based astronomy. The simulation corresponds to a multi-step multi-stage procedure, which is fed, near real-time, by system and turbulence data coming from the telescope environment. Based on the PLASMA library powered by the OmpSs dynamic runtime system, our implementation relies on a task-based programming model to permit an asynchronous out-of-order execution. Using modern multicore architectures associated with the enormous computing power of GPUs, the resulting data-driven compute-intensive simulation of the entire MOAO application, composed of the tomographic reconstructor and the observing sequence, is capable of coping with the aforementioned real-time challenge and stands as a reference implementation for the computational astronomy community.

An energy-stable time-integrator for phase-field models

Vignal, Philippe; Collier, N.; Dalcin, Lisandro; Brown, D.L.; Calo, V.M. (Elsevier BV, 2016-12-27)

We introduce a provably energy-stable time-integration method for general classes of phase-field models with polynomial potentials. We demonstrate how Taylor series expansions of the nonlinear terms present in the partial differential equations of these models can lead to expressions that guarantee energy-stability implicitly, which are second-order accurate in time. The spatial discretization relies on a mixed finite element formulation and isogeometric analysis. We also propose an adaptive time-stepping discretization that relies on a first-order backward approximation to give an error-estimator. This error estimator is accurate, robust, and does not require the computation of extra solutions to estimate the error. This methodology can be applied to any second-order accurate time-integration scheme. We present numerical examples in two and three spatial dimensions, which confirm the stability and robustness of the method. The implementation of the numerical schemes is done in PetIGA, a high-performance isogeometric analysis framework.
