Performance-based path determination for interprocessor communication in distributed computing systems

Jun Seonq Kim; David J. Lilja

doi:10.1109/71.755832

Electrical and Computer Engineering

Research output: Contribution to journal › Article › peer-review

8 Scopus citations

Abstract

The different types of messages used by a parallel application program executing in a distributed computing system can each have unique characteristics so that no single communication network can produce the lowest latency for all messages. For instance, short control messages may be sent with the lowest overhead on one type of network, such as Ethernet, while bulk data transfers may be better suited to a different type of network, such as Fibre Channel or HiPPI. This work investigates how to exploit multiple heterogeneous communication networks that interconnect the same set of processing nodes using a set of techniques we call performance-based path determination (PBPD). The performance-based path selection (PBPS) technique selects the best (lowest latency) network among several for each individual message to reduce the communication overhead of parallel programs. The performance-based path aggregation (PBPA) technique, on the other hand, aggregates multiple networks into a single virtual network to increase the available bandwidth. We test the PBPD techniques on a cluster of SGI multiprocessors interconnected with Ethernet, Fibre Channel, and HiPPI networks using a custom communication library built on top of the TCP/IP protocol layers. We find that PBPS can reduce communication overhead in applications compared to using either network alone, while aggregating networks into a single virtual network can reduce communication latency for bandwidth-limited applications. The performance of the PBPD techniques depends on the mix of message sizes in the application program and the relative overheads of the networks, as demonstrated in our analytical models.
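The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' communication library or their analytical model: it assumes a simple linear cost model (fixed per-message overhead plus size divided by bandwidth), and the `Network` class, the `select_path` and `aggregate_split` functions, and all numeric values are hypothetical names and placeholders chosen only for illustration. The bandwidth-proportional split used for aggregation is likewise an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Network:
    """Hypothetical network description; the values used below are not measurements."""
    name: str
    overhead_us: float     # assumed fixed per-message cost, in microseconds
    bandwidth_mb_s: float  # assumed sustained bandwidth, in MB/s

    def latency_us(self, size_bytes: int) -> float:
        # Assumed linear model. 1 MB/s equals 1 byte per microsecond,
        # so size / bandwidth is already expressed in microseconds.
        return self.overhead_us + size_bytes / self.bandwidth_mb_s


def select_path(networks, size_bytes):
    """PBPS-style choice: pick the network with the lowest modeled latency."""
    return min(networks, key=lambda n: n.latency_us(size_bytes))


def aggregate_split(networks, size_bytes):
    """PBPA-style split: divide one message across all networks in
    proportion to bandwidth (an assumed policy, not taken from the paper)."""
    total_bw = sum(n.bandwidth_mb_s for n in networks)
    shares = {n.name: int(size_bytes * n.bandwidth_mb_s / total_bw) for n in networks}
    # Hand any bytes lost to rounding down to the highest-bandwidth network.
    fastest = max(networks, key=lambda n: n.bandwidth_mb_s)
    shares[fastest.name] += size_bytes - sum(shares.values())
    return shares


# Placeholder parameters: low overhead and low bandwidth versus the opposite.
ethernet = Network("Ethernet", overhead_us=100.0, bandwidth_mb_s=1.0)
hippi = Network("HiPPI", overhead_us=500.0, bandwidth_mb_s=80.0)
networks = [ethernet, hippi]

small_choice = select_path(networks, 64)         # short control message
bulk_choice = select_path(networks, 1_000_000)   # bulk data transfer
bulk_shares = aggregate_split(networks, 1_000_000)
```

Under these placeholder numbers the short message goes to the low-overhead network and the bulk transfer to the high-bandwidth one, which mirrors the abstract's observation that the outcome depends on message sizes and the relative overheads of the networks.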

Original language	English (US)
Pages (from-to)	316-327
Number of pages	12
Journal	IEEE Transactions on Parallel and Distributed Systems
Volume	10
Issue number	3
DOIs	https://doi.org/10.1109/71.755832
State	Published - 1999

Bibliographical note

Funding Information:
This work was supported in part by the U.S. National Science Foundation under Grant CDA-9414015 and by a University of Minnesota McKnight Land-Grant Professorship. Preliminary versions of this work were presented at the 1997 Heterogeneous Computing Workshop [9] and the 1997 International Symposium on High Performance Distributed Computing [10].

Cite this

@article{894c04ec75be44d69ac83c2f87d04b98,

title = "Performance-based path determination for interprocessor communication in distributed computing systems",

abstract = "The different types of messages used by a parallel application program executing in a distributed computing system can each have unique characteristics so that no single communication network can produce the lowest latency for all messages. For instance, short control messages may be sent with the lowest overhead on one type of network, such as Ethernet, while bulk data transfers may be better suited to a different type of network, such as Fibre Channel or HiPPI. This work investigates how to exploit multiple heterogeneous communication networks that interconnect the same set of processing nodes using a set of techniques we call performance-based path determination (PBPD). The performance-based path selection (PBPS) technique selects the best (lowest latency) network among several for each individual message to reduce the communication overhead of parallel programs. The performance-based path aggregation (PBPA) technique, on the other hand, aggregates multiple networks into a single virtual network to increase the available bandwidth. We test the PBPD techniques on a cluster of SGI multiprocessors interconnected with Ethernet, Fibre Channel, and HiPPI networks using a custom communication library built on top of the TCP/IP protocol layers. We find that PBPS can reduce communication overhead in applications compared to using either network alone, while aggregating networks into a single virtual network can reduce communication latency for bandwidth-limited applications. The performance of the PBPD techniques depends on the mix of message sizes in the application program and the relative overheads of the networks, as demonstrated in our analytical models.",

author = "Kim, {Jun Seonq} and Lilja, {David J.}",

note = "Funding Information: This work was supported in part by the U.S. National Science Foundation under Grant CDA-9414015 and by a University of Minnesota McKnight Land-Grant Professorship. Preliminary versions of this work were presented at the 1997 Heterogeneous Computing Workshop [9] and the 1997 International Symposium on High Performance Distributed Computing [10].",

year = "1999",

doi = "10.1109/71.755832",

language = "English (US)",

volume = "10",

pages = "316--327",

journal = "IEEE Transactions on Parallel and Distributed Systems",

issn = "1045-9219",

publisher = "IEEE Computer Society",

number = "3",

}

TY - JOUR

T1 - Performance-based path determination for interprocessor communication in distributed computing systems

AU - Kim, Jun Seonq

AU - Lilja, David J.

N1 - Funding Information: This work was supported in part by the U.S. National Science Foundation under Grant CDA-9414015 and by a University of Minnesota McKnight Land-Grant Professorship. Preliminary versions of this work were presented at the 1997 Heterogeneous Computing Workshop [9] and the 1997 International Symposium on High Performance Distributed Computing [10].

PY - 1999

Y1 - 1999

N2 - The different types of messages used by a parallel application program executing in a distributed computing system can each have unique characteristics so that no single communication network can produce the lowest latency for all messages. For instance, short control messages may be sent with the lowest overhead on one type of network, such as Ethernet, while bulk data transfers may be better suited to a different type of network, such as Fibre Channel or HiPPI. This work investigates how to exploit multiple heterogeneous communication networks that interconnect the same set of processing nodes using a set of techniques we call performance-based path determination (PBPD). The performance-based path selection (PBPS) technique selects the best (lowest latency) network among several for each individual message to reduce the communication overhead of parallel programs. The performance-based path aggregation (PBPA) technique, on the other hand, aggregates multiple networks into a single virtual network to increase the available bandwidth. We test the PBPD techniques on a cluster of SGI multiprocessors interconnected with Ethernet, Fibre Channel, and HiPPI networks using a custom communication library built on top of the TCP/IP protocol layers. We find that PBPS can reduce communication overhead in applications compared to using either network alone, while aggregating networks into a single virtual network can reduce communication latency for bandwidth-limited applications. The performance of the PBPD techniques depends on the mix of message sizes in the application program and the relative overheads of the networks, as demonstrated in our analytical models.

AB - The different types of messages used by a parallel application program executing in a distributed computing system can each have unique characteristics so that no single communication network can produce the lowest latency for all messages. For instance, short control messages may be sent with the lowest overhead on one type of network, such as Ethernet, while bulk data transfers may be better suited to a different type of network, such as Fibre Channel or HiPPI. This work investigates how to exploit multiple heterogeneous communication networks that interconnect the same set of processing nodes using a set of techniques we call performance-based path determination (PBPD). The performance-based path selection (PBPS) technique selects the best (lowest latency) network among several for each individual message to reduce the communication overhead of parallel programs. The performance-based path aggregation (PBPA) technique, on the other hand, aggregates multiple networks into a single virtual network to increase the available bandwidth. We test the PBPD techniques on a cluster of SGI multiprocessors interconnected with Ethernet, Fibre Channel, and HiPPI networks using a custom communication library built on top of the TCP/IP protocol layers. We find that PBPS can reduce communication overhead in applications compared to using either network alone, while aggregating networks into a single virtual network can reduce communication latency for bandwidth-limited applications. The performance of the PBPD techniques depends on the mix of message sizes in the application program and the relative overheads of the networks, as demonstrated in our analytical models.

UR - http://www.scopus.com/inward/record.url?scp=0032673224&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0032673224&partnerID=8YFLogxK

U2 - 10.1109/71.755832

DO - 10.1109/71.755832

M3 - Article

AN - SCOPUS:0032673224

SN - 1045-9219

VL - 10

SP - 316

EP - 327

JO - IEEE Transactions on Parallel and Distributed Systems

JF - IEEE Transactions on Parallel and Distributed Systems

IS - 3

ER -
