TY - JOUR

T1 - Parallel matrix-vector product using approximate hierarchical methods

AU - Grama, Ananth

AU - Kumar, Vipin

AU - Sameh, Ahmed

PY - 1995/12/1

Y1 - 1995/12/1

N2 - Matrix-vector products (mat-vecs) form the core of iterative methods used for solving dense linear systems. Often, these systems arise in the solution of integral equations used in electromagnetics, heat transfer, and wave propagation. In this paper, we present a parallel approximate method for computing mat-vecs used in the solution of integral equations. We use this method to compute dense mat-vecs of hundreds of thousands of elements. The combined speedups obtained from the use of approximate methods and parallel processing represent an improvement of several orders of magnitude over exact mat-vecs on uniprocessors. We demonstrate that our parallel formulation incurs minimal parallel processing overhead and scales up to a large number of processors. We study the impact of varying the accuracy of the approximate mat-vec on overall time and on parallel efficiency. Experimental results are presented for 256-processor Cray T3D and Thinking Machines CM5 parallel computers. We have achieved computation rates in excess of 5 GFLOPS on the T3D.

UR - http://www.scopus.com/inward/record.url?scp=0029430861&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0029430861&partnerID=8YFLogxK

M3 - Conference article

AN - SCOPUS:0029430861

VL - 2

SP - 2065

EP - 2084

JO - Proceedings of the ACM/IEEE Supercomputing Conference

JF - Proceedings of the ACM/IEEE Supercomputing Conference

SN - 1063-9535

T2 - Proceedings of the 1995 ACM/IEEE Supercomputing Conference. Part 2 (of 2)

Y2 - 3 December 1995 through 8 December 1995

ER -