Abstract
Recent work has shown that pipelining and multiple instruction issuing are architecturally equivalent in their abilities to exploit parallelism, but there has been little work directly comparing the performance of these fine-grain parallel architectures with that of coarse-grain multiprocessors. Using trace-driven simulations, the authors compare the performance of a superscalar processor and a pipelined processor, both using dynamic dependence checking, with that of a shared-memory multiprocessor. For very parallel programs, they find that the fine-grain processors must bypass an unrealistically large number of branches to match the performance of the multiprocessor. When executing programs with a wide range of potential parallelism, the best performance is obtained using a multiprocessor where each individual processor has a fine-grain parallelism of two to four.
Original language | English (US) |
---|---|
Article number | 183902 |
Pages (from-to) | 324-333 |
Number of pages | 10 |
Journal | Proceedings of the Annual Hawaii International Conference on System Sciences |
Volume | 1 |
State | Published - 1991 |
Externally published | Yes |
Event | 24th Annual Hawaii International Conference on System Sciences, HICSS 1991 - Kauai, United States. Duration: Jan 8 1991 → Jan 11 1991 |
Bibliographical note
Funding Information: This work was supported by the National Science Foundation under Grant No. NSF MIP-8410110, with additional support from NASA Ames Research Center Grant No. NASA NCC 2-559 (DARPA), National Science Foundation Grant No. NSF MIP-88-07775, and Department of Energy Grant No. DOE DE-FG02-85ER25001.
Publisher Copyright: © 1991 IEEE.