dc.contributor.author: Dongarra, Jack
dc.contributor.author: Faverge, Mathieu
dc.contributor.author: Ltaief, Hatem
dc.contributor.author: Luszczek, Piotr R.
dc.date.accessioned: 2015-08-24T09:26:10Z
dc.date.available: 2015-08-24T09:26:10Z
dc.date.issued: 2012-01-01
dc.identifier.isbn: 9781614990406
dc.identifier.issn: 09275452
dc.identifier.doi: 10.3233/978-1-61499-041-3-429
dc.identifier.uri: http://hdl.handle.net/10754/575786
dc.description.abstract: The LU factorization is an important numerical algorithm for solving systems of linear equations. This paper proposes a novel approach for computing the LU factorization in parallel on multicore architectures. It improves overall performance while matching the numerical quality of the standard LU factorization with partial pivoting. While the update of the trailing submatrix is computationally intensive and highly parallel, the inherently problematic portion of the LU factorization is the panel factorization, because it is memory-bound and pivot selection must be atomic. We remedy this in our new approach to LU factorization of tall and narrow panel submatrices, which uses a parallel fine-grained recursive formulation of the factorization based on conflict-free partitioning of the data and lock-less synchronization mechanisms. Our implementation lets the overall computation flow naturally with limited contention. Our recursive panel factorization provides the necessary performance increase for the inherently problematic portion of the LU factorization of square matrices. As our experiments have revealed, a large panel width results in a larger Amdahl's fraction, which is consistent with related efforts. The performance results of our implementation reveal superlinear speedup and far exceed what can be achieved with equivalent MKL and/or LAPACK routines. © 2012 The authors and IOS Press. All rights reserved.
dc.subject: LU factorization
dc.subject: parallel linear algebra
dc.subject: recursion
dc.subject: shared-memory synchronization
dc.subject: threaded parallelism
dc.title: Exploiting fine-grain parallelism in recursive LU factorization
dc.type: Book Chapter
dc.contributor.department: KAUST Supercomputing Laboratory (KSL)
dc.contributor.department: Extreme Computing Research Center
dc.identifier.journal: Advances in Parallel Computing
dc.contributor.institution: Department of Electrical Engineering and Computer Science, University of Tennessee, Knoxville, United States
kaust.person: Ltaief, Hatem
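
The abstract refers to a recursive formulation of the panel factorization with partial pivoting. As a rough illustration of that general idea only, the sketch below factors a tall and narrow panel by recursively splitting its columns in half. It is sequential and written in plain Python; it is not the authors' parallel implementation and does not model their data partitioning or lock-less synchronization. The names `recursive_lu_panel` and `_factor_columns` are hypothetical and chosen for this example.

```python
from fractions import Fraction


def _factor_columns(a, piv, c0, c1):
    """Factor columns c0..c1-1 of `a` in place (hypothetical helper)."""
    m = len(a)
    if c1 - c0 == 1:
        # Base case: one column. Pick the largest entry on or below the diagonal.
        p = max(range(c0, m), key=lambda i: abs(a[i][c0]))
        if a[p][c0] == 0:
            raise ValueError("panel is rank deficient")
        piv[c0] = p
        # Swap whole rows so columns to the left and right stay consistent.
        a[c0], a[p] = a[p], a[c0]
        # Scale the entries below the pivot to form the column of L.
        for i in range(c0 + 1, m):
            a[i][c0] = a[i][c0] / a[c0][c0]
        return

    mid = c0 + (c1 - c0) // 2
    _factor_columns(a, piv, c0, mid)  # factor the left half

    # Triangular solve with the unit lower triangular block of the left half:
    # rows c0..mid-1 of the right half become rows of U.
    for i in range(c0, mid):
        for k in range(c0, i):
            for j in range(mid, c1):
                a[i][j] -= a[i][k] * a[k][j]

    # Schur complement update of the remaining rows of the right half.
    for i in range(mid, m):
        for k in range(c0, mid):
            for j in range(mid, c1):
                a[i][j] -= a[i][k] * a[k][j]

    _factor_columns(a, piv, mid, c1)  # factor the updated right half


def recursive_lu_panel(panel):
    """Return (lu, piv) for an m x n panel with m >= n, leaving `panel` untouched.

    `lu` stores U on and above the diagonal and the multipliers of the unit
    lower trapezoidal L below it. `piv[k]` is the row swapped with row k at
    step k, applied in order k = 0, 1, ..., n - 1.
    """
    m, n = len(panel), len(panel[0])
    if m < n:
        raise ValueError("expected a tall panel (rows >= columns)")
    lu = [list(row) for row in panel]
    piv = [0] * n
    _factor_columns(lu, piv, 0, n)
    return lu, piv


# Small 4 x 3 panel with exact rational arithmetic.
panel = [[Fraction(v) for v in row]
         for row in [[2, 1, 1], [4, 3, 3], [8, 7, 9], [6, 7, 9]]]
lu, piv = recursive_lu_panel(panel)
```

Exact `Fraction` arithmetic is used here only so that the factorization can be checked without rounding error; the same functions accept floating-point entries.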