    Now showing items 1-10 of 20
    Modeling droplet merging via the incompressible Navier-Stokes-Cahn-Hilliard equations

    Sarmiento, Adel; Espath, Luis; Dalcin, Lisandro; Parsani, Matteo; Calo, V.M. (2017-05-23) [Poster]

    Large Scale Computation of Direct Initiation of Cylindrical Detonations

    Shen, Hua; Parsani, Matteo (2017-05-23) [Poster]

    Likelihood Approximation With Parallel Hierarchical Matrices For Large Spatial Datasets

    Litvinenko, Alexander; Sun, Ying; Genton, Marc G.; Keyes, David E. (2017-03-13) [Poster]

    Batched Triangular DLA for Very Small Matrices on GPUs

    Charara, Ali; Keyes, David E.; Ltaief, Hatem (2017-03-13) [Poster]
    In several scientific applications, like tensor contractions in deep learning computation or data compression in hierarchical low rank matrix approximation, the bulk of computation typically resides in performing thousands of independent dense linear algebra operations on very small matrix sizes (usually less than 100). Batched dense linear algebra kernels are becoming ubiquitous for such scientific computations. Within a single API call, these kernels are capable of simultaneously launching a large number of similar matrix computations, removing the expensive overhead of multiple API calls while increasing the utilization of the underlying hardware.
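    The interface shape described above (one call that covers many small, independent problems) can be sketched as follows. This is a minimal pure-Python illustration, not the poster's GPU implementation: the names `solve_lower` and `batched_solve_lower` are hypothetical, and a real batched kernel would run the solves concurrently on the device rather than in a Python loop.

```python
def solve_lower(L, b):
    """Forward substitution for one lower-triangular system L x = b."""
    n = len(b)
    x = [0.0] * n
    for i in range(n):
        # Subtract the contributions of the unknowns already solved.
        s = sum(L[i][j] * x[j] for j in range(i))
        x[i] = (b[i] - s) / L[i][i]
    return x


def batched_solve_lower(Ls, bs):
    """Hypothetical batched entry point: a single call handles the whole batch."""
    if len(Ls) != len(bs):
        raise ValueError("batch sizes differ")
    return [solve_lower(L, b) for L, b in zip(Ls, bs)]


# Two independent 2x2 lower-triangular systems solved in one call.
Ls = [[[2.0, 0.0], [1.0, 1.0]],
      [[1.0, 0.0], [3.0, 2.0]]]
bs = [[4.0, 5.0], [1.0, 7.0]]
xs = batched_solve_lower(Ls, bs)
print(xs)  # [[2.0, 3.0], [1.0, 2.0]]
```

    The caller pays the call overhead once per batch instead of once per matrix, which is the saving the abstract refers to.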

    Earthquake Ground Motion Analysis and extreme computing on multi-Petaflops machine

    De Martin, Florent; Dupros, Fabrice; Thierry, Philippe; Paciucci, Gabriele; Sochala, Pierre; Boulahya, Faïza; Benaichouche, Abed; Chaljub, Emmanuel; Hadri, Bilel; Ltaief, Hatem; Keyes, David E. (2017-03-13) [Poster]

    Implicit Unstructured Aerodynamics on Emerging Multi- and Many-Core HPC Architectures

    Al Farhan, Mohammed; Kaushik, Dinesh K.; Keyes, David E. (2017-03-13) [Poster]
    PETSc-FUN3D, an unstructured tetrahedral mesh Euler code previously characterized for distributed memory Single Program, Multiple Data (SPMD) execution on thousands of nodes, is hybridized with shared memory Single Instruction, Multiple Data (SIMD) parallelism for hundreds of threads per node. We explore thread-level performance optimizations on state-of-the-art multi- and many-core Intel processors, including the second generation of Xeon Phi, Knights Landing (KNL). We study the performance on the KNL with different configurations of memory and cluster modes, with code optimizations to minimize indirect addressing and enhance cache locality. The optimizations employed are expected to be of value to other unstructured applications as many-core architectures evolve.

    d3f: Parallel Simulation of Large-scale Groundwater Flow with ug4

    Wittum, Gabriel; Logashenko, Dmitry; Hoffer, Michael; Lampe, Michael; Nägel, Arne; Reiter, Sebastian; Vogel, Andreas (2017-03-13) [Poster]

    Scalable Hierarchical Algorithms for stochastic PDEs and UQ

    Litvinenko, Alexander; Chavez Chavez, Gustavo Ivan; Keyes, David E.; Ltaief, Hatem; Yokota, Rio (2015-01-07) [Poster]
    H-matrices and the Fast Multipole Method (FMM) are powerful methods for approximating linear operators arising from partial differential and integral equations, and they reduce the computational cost from quadratic or cubic to log-linear (O(n log n)), where n is the number of degrees of freedom in the discretization. The storage is reduced to log-linear as well. This hierarchical structure is a good starting point for parallel algorithms. Parallelization on shared and distributed memory systems was pioneered by Kriemann [1,2]. Since 2005, the area of parallel architectures and software has been developing very fast. Progress in GPUs and many-core systems (e.g. Xeon Phi with 64 cores) motivated us to extend the work started in [1,2,7,8].

    Enabling High Performance Large Scale Dense Problems through KBLAS

    Abdelfattah, Ahmad; Keyes, David E.; Ltaief, Hatem (2014-05-04) [Poster]
    KBLAS (KAUST BLAS) is a small library that provides highly optimized BLAS routines on systems accelerated with GPUs. KBLAS is written entirely in CUDA C and targets NVIDIA GPUs with compute capability 2.0 (Fermi) or higher. The current focus is on level-2 BLAS routines, namely the general matrix-vector multiplication (GEMV) kernel and the symmetric/Hermitian matrix-vector multiplication (SYMV/HEMV) kernel. KBLAS provides these two kernels in all four precisions (s, d, c, and z), with support for multi-GPU systems. Through advanced optimization techniques that target latency hiding and push memory bandwidth to the limit, KBLAS outperforms state-of-the-art kernels by 20-90%. Competitors include CUBLAS-5.5, MAGMABLAS-1.4.0, and CULA R17. The SYMV/HEMV kernel from KBLAS has been adopted by NVIDIA and should appear in CUBLAS-6.0. KBLAS has been used in large scale simulations of multi-object adaptive optics.
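    As a reference for what the SYMV operation computes, the following pure-Python sketch forms y = A x for a symmetric A while reading only the lower triangle, so each off-diagonal entry is loaded once and used twice. It is an illustrative assumption about the operation's semantics only: the function name `symv_lower` is hypothetical and this is not the KBLAS CUDA kernel or its API.

```python
def symv_lower(A, x):
    """Compute y = A x for symmetric A, reading only entries A[i][j] with j <= i."""
    n = len(x)
    y = [0.0] * n
    for i in range(n):
        for j in range(i):
            a = A[i][j]
            y[i] += a * x[j]  # contribution of A[i][j]
            y[j] += a * x[i]  # mirrored contribution of A[j][i]
        y[i] += A[i][i] * x[i]  # diagonal entry
    return y


# Only the lower triangle is stored; the upper entries are never read.
A = [[2.0, None, None],
     [1.0, 3.0, None],
     [4.0, 5.0, 6.0]]
x = [1.0, 2.0, 3.0]
y = symv_lower(A, x)
print(y)  # [16.0, 22.0, 32.0]
```

    Halving the matrix reads in this way matters because level-2 BLAS routines are limited by memory bandwidth rather than arithmetic.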

    Pipelining Computational Stages of the Tomographic Reconstructor for Multi-Object Adaptive Optics on a Multi-GPU System

    Charara, Ali; Ltaief, Hatem; Gratadour, Damien; Keyes, David E.; Sevin, Arnaud; Abdelfattah, Ahmad; Gendron, Eric; Morel, Carine; Vidal, Fabrice (2014-05-04) [Poster]
    The European Extremely Large Telescope (E-ELT) is a high-priority project in ground-based astronomy that aims at constructing the largest telescope ever built. MOSAIC is an instrument proposed for the E-ELT that uses the Multi-Object Adaptive Optics (MOAO) technique for astronomical telescopes, which compensates for the effects of atmospheric turbulence on image quality and operates on patches across a large field of view (FoV).