Task Parallel Incomplete Cholesky Factorization using 2D Partitioned-Block Layout

01/22/2016
by Kyungjoo Kim, et al.

We introduce a task-parallel algorithm for sparse incomplete Cholesky factorization that uses a 2D sparse partitioned-block layout of the matrix. Using this block layout, the factorization follows the algorithms-by-blocks approach, which induces a task graph whose tasks are related to one another through the data dependences of the factorization. To process the tasks on various manycore architectures in a portable manner, we also present a tasking API built on Kokkos, an open-source framework for manycore platforms, that incorporates different tasking backends and device-specific features. We evaluate performance on Intel Sandy Bridge and Xeon Phi platforms using matrices from the University of Florida sparse matrix collection to illustrate the merits of the proposed task-based factorization. Using 56 threads on the Intel Xeon Phi processor, our task-parallel implementation delivers about 26.6x speedup (geometric mean) over single-threaded incomplete Cholesky-by-blocks and 19.2x speedup over serial Cholesky, which carries no tasking overhead, for sparse matrices arising from various application problems.
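To make the idea of a task graph induced by an algorithm-by-blocks concrete, the following is a minimal sketch in Python. It is not the authors' implementation and does not use Kokkos. The function name `build_task_graph`, the task labels (CHOL, TRSM, HERK, GEMM), and the representation of the block layout as a set of lower-triangular block coordinates are assumptions made for illustration. The sketch walks a right-looking blocked Cholesky, skips updates to blocks that are absent from the pattern (an assumed way of modelling the "incomplete" part), and records for each task the earlier tasks that last wrote any block it touches.

```python
def build_task_graph(nb, pattern):
    """Enumerate tasks of a right-looking Cholesky-by-blocks and their dependences.

    nb      : number of block rows/columns (hypothetical parameter).
    pattern : set of (i, j) block coordinates with i >= j that hold nonzeros;
              every diagonal block (i, i) is assumed to be present.
    Returns (tasks, deps): tasks[t] = (op, reads, write) and
    deps[t] = set of task indices that must finish before task t starts.
    """
    tasks = []
    deps = {}
    last_writer = {}  # block coordinate -> index of the task that last wrote it

    def add(op, reads, write):
        tid = len(tasks)
        tasks.append((op, tuple(reads), write))
        # Depend on the last writer of every block this task reads or updates.
        touched = list(reads) + [write]
        deps[tid] = {last_writer[b] for b in touched if b in last_writer}
        last_writer[write] = tid
        return tid

    for k in range(nb):
        # Factor the diagonal block.
        add("CHOL", [], (k, k))
        # Nonzero blocks below the diagonal in block column k.
        rows = [i for i in range(k + 1, nb) if (i, k) in pattern]
        for i in rows:
            # Triangular solve of the panel block against the factored diagonal.
            add("TRSM", [(k, k)], (i, k))
        for i in rows:
            for j in rows:
                if j > i:
                    break
                if (i, j) not in pattern:
                    continue  # incomplete factorization: no fill block is created
                if i == j:
                    add("HERK", [(i, k)], (i, i))
                else:
                    add("GEMM", [(i, k), (j, k)], (i, j))
    return tasks, deps


if __name__ == "__main__":
    # A fully populated 3x3 lower-triangular block pattern.
    dense = {(0, 0), (1, 0), (2, 0), (1, 1), (2, 1), (2, 2)}
    tasks, deps = build_task_graph(3, dense)
    for tid, (op, reads, write) in enumerate(tasks):
        print(tid, op, "reads", reads, "writes", write, "after", sorted(deps[tid]))
```

Tasks whose dependence sets are already satisfied can run concurrently. In the example pattern, the two TRSM tasks in the first block column depend only on the first CHOL task, so a tasking runtime would be free to schedule them in parallel.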

research
08/01/2019

GLU3.0: Fast GPU-based Parallel Sparse LU Factorization for Circuit Simulation

In this article, we propose a new GPU-based sparse LU factorization meth...
research
12/13/2018

Javelin: A Scalable Implementation for Sparse Incomplete LU Factorization

In this work, we present a new scalable incomplete LU factorization fram...
research
05/25/2023

Neural incomplete factorization: learning preconditioners for the conjugate gradient method

In this paper, we develop a novel data-driven approach to accelerate sol...
research
11/19/2016

A Case for Malleable Thread-Level Linear Algebra Libraries: The LU Factorization with Partial Pivoting

We propose two novel techniques for overcoming load-imbalance encountere...
research
08/27/2019

High Performance Block Incomplete LU Factorization

Many application problems that lead to solving linear systems make use o...
research
05/08/2023

Parallel Cholesky Factorization for Banded Matrices using OpenMP Tasks

Cholesky factorization is a widely used method for solving linear system...
research
03/03/2017

Decoupled Block-Wise ILU(k) Preconditioner on GPU

This research investigates the implementation mechanism of block-wise IL...
