A Measure of Graceful Degradation in Parallel-Computer Systems

Vladimir Cherkassky, Miroslaw Malek

Research output: Contribution to journal › Article › peer-review

13 Scopus citations

Abstract

We study gracefully degrading multiprocessor systems using a multistage interconnection network-based parallel computer as an example. A measure of graceful degradation, viz., system functionality, is used and analyzed; it is proportional to the number of dataflow paths in a system in the presence of faults. Each path consists of a processor, switches, and a memory. Under this approach, graceful degradation of a multiprocessor system can be evaluated as a combination of individual degradations in each of the processor, memory, and network subsystems. System functionality is proportional to the mean benefit that a system provides during its use period. A detailed evaluation of graceful degradation of large, tightly coupled multiprocessor systems based on multistage interconnection networks is given. The results are based on realistic assumptions and approximations. The two major assumptions are: 1) the system size (number of components) is large, and 2) the mean number of component failures between successive instances of periodic maintenance (and repair) is small. Under these assumptions, we approximate the number of faults in the system by a Poisson distribution, and find a closed-form expression for the mean number of dataflow paths during the use period. The system functionality can be expressed as a separable function with respect to the number of faulty processors, memory units, and network switches. Thus the effect of faults in the processor, memory, and network subsystems on system degradation can be evaluated independently. As system size grows, the interconnection network can become a critical subsystem and dominate the system degradation.
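The path-counting idea in the abstract can be illustrated with a small sketch. It is not the paper's model or its closed-form expression. It assumes an omega network of 2x2 switches with destination-tag routing, N = 2^n processors and N memories, and independent component failures; none of these details are stated in the abstract. The function names `switches_on_path`, `count_paths`, and `expected_paths` are hypothetical. Under these assumptions every processor-memory pair has exactly one path through n switches, so the expected number of surviving paths factors into separate processor, memory, and switch terms.

```python
from itertools import product


def switches_on_path(src, dst, n):
    """Return the (stage, switch) pairs visited from processor src to memory dst.

    Assumed topology: omega network, n stages of 2x2 switches, N = 2**n lines.
    """
    size = 1 << n
    addr = src
    visited = []
    for stage in range(n):
        # Perfect shuffle: rotate the n-bit line address left by one bit.
        addr = ((addr << 1) | (addr >> (n - 1))) & (size - 1)
        # Two adjacent lines share one switch, so the switch index drops the low bit.
        visited.append((stage, addr >> 1))
        # Destination-tag routing: the low bit becomes the next bit of dst (MSB first).
        bit = (dst >> (n - 1 - stage)) & 1
        addr = (addr & ~1) | bit
    return visited


def count_paths(n, bad_procs=(), bad_mems=(), bad_switches=()):
    """Count processor-to-memory paths that avoid every faulty component."""
    bad_procs, bad_mems = set(bad_procs), set(bad_mems)
    bad_switches = set(bad_switches)
    size = 1 << n
    total = 0
    for src, dst in product(range(size), repeat=2):
        if src in bad_procs or dst in bad_mems:
            continue
        if any(sw in bad_switches for sw in switches_on_path(src, dst, n)):
            continue
        total += 1
    return total


def expected_paths(n, p_proc, p_mem, p_switch):
    """Expected surviving paths if each component fails independently.

    Assumption of this sketch: a path survives only if its processor,
    its memory, and all n of its switches are fault-free.
    """
    size = 1 << n
    return size * size * (1 - p_proc) * (1 - p_mem) * (1 - p_switch) ** n
```

For n = 3 (8 processors, 8 memories), `count_paths(3)` gives 64 paths, one faulty processor leaves 56, and one faulty switch leaves 48, because each switch in this topology carries 2N = 16 of the N^2 paths. The (1 - p_switch)^n factor in `expected_paths` grows in influence with the number of stages, which is consistent with the abstract's remark that the network can dominate degradation in large systems.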

Original language: English (US)
Pages (from-to): 76-81
Number of pages: 6
Journal: IEEE Transactions on Reliability
Volume: 38
Issue number: 1
State: Published - Apr 1989

Bibliographical note

Funding Information:
This work was supported in part by the Graduate School at the University of Minnesota, US DARPA Grant No. N00039-86-C-0167, IBM Corporation, and US Office of Naval Research Grant No. N00014-86-K-0554 under SDIOAST.

Keywords

  • Banyan
  • Failure rate
  • Graceful degradation
  • Multiprocessor system
  • Multistage interconnection network
  • Non-rectangular network
  • Poisson distribution
  • Separable function
  • System functionality level
