In this paper, we describe a scalable parallel algorithm for sparse Cholesky factorization, analyze its performance and scalability, and present experimental results from its implementation on a 1024-processor nCUBE2 parallel computer. Through our analysis and experimental results, we demonstrate that our algorithm improves the state of the art in parallel direct solution of sparse linear systems by an order of magnitude, both in terms of speedups and in the number of processors that can be utilized effectively for a given problem size. This algorithm incurs strictly less communication overhead and is more scalable than any known parallel formulation of sparse matrix factorization. We show that our algorithm is optimally scalable on hypercube and mesh architectures and that its asymptotic scalability is the same as that of dense matrix factorization for a wide class of sparse linear systems, including those arising in all two- and three-dimensional finite element problems.
|Original language||English (US)|
|Number of pages||10|
|Journal||Proceedings of the ACM/IEEE Supercomputing Conference|
|State||Published - Jan 1 1994|
|Event||Proceedings of the 1994 Supercomputing Conference - Washington, DC, USA|
|Duration||Nov 14 1994 → Nov 18 1994|