We propose several implementations of Gaussian elimination for solving banded linear systems on multiprocessors. Three simple architectures are considered: a multiprocessor ring, a grid array, and a hypercube. Our complexity analysis fully accounts for communication delays by using simple models that incorporate both latency and actual transfer times. When the number of processors is small relative to the bandwidth of the system, a row-interleaved implementation of the Gaussian elimination algorithm is attractive. Otherwise, a two-dimensional grid is essential for achieving higher speedup. The hypercube architecture gives the smallest communication latency.
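As an illustration only, the row-interleaved idea can be sketched in a serial simulation. The sketch below assumes a wrapped mapping in which row i is held by processor i mod p, elimination without pivoting (so the matrix is taken to be, for example, diagonally dominant), and a linear communication cost of latency plus a per-word transfer time. The names `owner`, `banded_solve`, and `message_time`, and the exact form of the cost model, are hypothetical and are not taken from the paper.

```python
def owner(row, p):
    """Assumed row-interleaved (wrapped) mapping: row i lives on processor i mod p."""
    return row % p


def message_time(words, latency, per_word):
    """Assumed linear cost model: fixed start-up latency plus per-word transfer time."""
    return latency + words * per_word


def banded_solve(A, b, beta, p):
    """Solve A x = b by Gaussian elimination without pivoting.

    A is a dense list of lists with half-bandwidth beta, i.e. A[i][j] == 0
    whenever |i - j| > beta. Returns (x, work), where work[q] counts the row
    updates that processor q would perform under the wrapped mapping.
    """
    n = len(A)
    A = [row[:] for row in A]  # work on copies, leave the inputs untouched
    b = b[:]
    work = [0] * p
    for k in range(n - 1):
        last = min(k + beta, n - 1)
        # In a parallel run, owner(k, p) would send pivot row k to the
        # owners of rows k+1 .. last before these updates take place.
        for i in range(k + 1, last + 1):
            m = A[i][k] / A[k][k]
            for j in range(k, last + 1):
                A[i][j] -= m * A[k][j]
            b[i] -= m * b[k]
            work[owner(i, p)] += 1
    # Back substitution on the banded upper-triangular factor.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        hi = min(i + beta, n - 1)
        s = sum(A[i][j] * x[j] for j in range(i + 1, hi + 1))
        x[i] = (b[i] - s) / A[i][i]
    return x, work
```

Under these assumptions, at most beta rows are updated at each elimination step, so having more than beta processors in a one-dimensional row mapping leaves some of them idle. This is consistent with the abstract's remark that the row-interleaved scheme suits the case where the number of processors is small relative to the bandwidth.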
Bibliographical note
Funding Information:
*Revised from August 1985 version. This work was supported in part by ONR grant N00014-82-K-0184 and in part by a joint study with IBM/Kingston.