Scalable load balancing in the presence of heterogeneous servers

Kristen Gardner; Jazeem Abdul Jaleel; Alexander Wickeham; Sherwin Doroudi

doi:10.1016/j.peva.2020.102151

Research output: Contribution to journal › Article › peer-review

11 Scopus citations

Abstract

Heterogeneity is becoming increasingly ubiquitous in modern large-scale computer systems. Developing good load balancing policies for systems whose resources have varying speeds is crucial in achieving low response times. Indeed, how best to dispatch jobs to servers is a classical and well-studied problem in the queueing literature. Yet the bulk of existing work on large-scale systems assumes homogeneous servers; unfortunately, policies that perform well in the homogeneous setting can cause unacceptably poor performance in heterogeneous systems. We adapt the “power-of-d” versions of both the Join-the-Idle-Queue and Join-the-Shortest-Queue policies to design two corresponding families of heterogeneity-aware dispatching policies, each of which is parameterized by a pair of routing probabilities. Unlike their heterogeneity-unaware counterparts, our policies use server speed information both when choosing which servers to query and when probabilistically deciding where (among the queried servers) to dispatch jobs. Both of our policy families are analytically tractable: our mean response time and queue length distribution analyses are exact as the number of servers approaches infinity, under standard assumptions. Furthermore, our policy families achieve maximal stability and outperform well-known dispatching rules – including heterogeneity-aware policies such as Shortest-Expected-Delay – with respect to mean response time.
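As a rough illustration of the two ideas the abstract contrasts, the sketch below shows the classic heterogeneity-unaware power-of-d Join-the-Shortest-Queue rule next to a speed-aware variant. The variant is hypothetical and is not the policy family analyzed in the article: the function names, the speed-proportional querying, and the single routing parameter `p_fastest_idle` are assumptions made only to show where speed information can enter (when choosing which servers to query, and when choosing among the queried servers).

```python
import random


def power_of_d_jsq(queue_lengths, d, rng):
    """Classic power-of-d JSQ: query d distinct servers uniformly at random
    and dispatch to the queried server holding the fewest jobs."""
    queried = rng.sample(range(len(queue_lengths)), d)
    return min(queried, key=lambda i: queue_lengths[i])


def speed_aware_dispatch(queue_lengths, speeds, d, p_fastest_idle, rng):
    """Hypothetical speed-aware variant (an assumption, not the article's policy).

    Querying: draw d servers with probability proportional to speed
    (with replacement, so a server may be queried more than once).
    Dispatching: if any queried server is idle, send the job to the fastest
    idle one with probability p_fastest_idle, otherwise to a uniformly chosen
    idle one; if none is idle, fall back to the shortest queried queue.
    """
    queried = rng.choices(range(len(queue_lengths)), weights=speeds, k=d)
    idle = [i for i in queried if queue_lengths[i] == 0]
    if idle:
        if rng.random() < p_fastest_idle:
            return max(idle, key=lambda i: speeds[i])
        return rng.choice(idle)
    return min(queried, key=lambda i: queue_lengths[i])


if __name__ == "__main__":
    rng = random.Random(0)
    queues = [3, 0, 5, 2]          # jobs currently at each server
    speeds = [1.0, 1.0, 4.0, 4.0]  # relative service rates (illustrative)
    print(power_of_d_jsq(queues, d=2, rng=rng))
    print(speed_aware_dispatch(queues, speeds, d=2, p_fastest_idle=0.9, rng=rng))
```

Passing a seeded `random.Random` instance keeps runs reproducible; the printed server indices depend on the seed.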

Original language	English (US)
Article number	102151
Journal	Performance Evaluation
Volume	145
DOIs	https://doi.org/10.1016/j.peva.2020.102151
State	Published - Jan 2021
Externally published	Yes

Bibliographical note

Publisher Copyright:
© 2020 Elsevier B.V.

Keywords

Dispatching
Heterogeneity
Join-Idle-Queue
Join-the-Shortest-Queue
Load balancing
Power of d

Cite this

@article{aeb4fbd959794267885e457ce113155c,
  title = "Scalable load balancing in the presence of heterogeneous servers",
  abstract = "Heterogeneity is becoming increasingly ubiquitous in modern large-scale computer systems. Developing good load balancing policies for systems whose resources have varying speeds is crucial in achieving low response times. Indeed, how best to dispatch jobs to servers is a classical and well-studied problem in the queueing literature. Yet the bulk of existing work on large-scale systems assumes homogeneous servers; unfortunately, policies that perform well in the homogeneous setting can cause unacceptably poor performance in heterogeneous systems. We adapt the “power-of-d” versions of both the Join-the-Idle-Queue and Join-the-Shortest-Queue policies to design two corresponding families of heterogeneity-aware dispatching policies, each of which is parameterized by a pair of routing probabilities. Unlike their heterogeneity-unaware counterparts, our policies use server speed information both when choosing which servers to query and when probabilistically deciding where (among the queried servers) to dispatch jobs. Both of our policy families are analytically tractable: our mean response time and queue length distribution analyses are exact as the number of servers approaches infinity, under standard assumptions. Furthermore, our policy families achieve maximal stability and outperform well-known dispatching rules – including heterogeneity-aware policies such as Shortest-Expected-Delay – with respect to mean response time.",
  keywords = "Dispatching, Heterogeneity, Join-Idle-Queue, Join-the-Shortest-Queue, Load balancing, Power of d",
  author = "Kristen Gardner and {Abdul Jaleel}, Jazeem and Alexander Wickeham and Sherwin Doroudi",
  note = "Publisher Copyright: {\textcopyright} 2020 Elsevier B.V.",
  year = "2021",
  month = jan,
  doi = "10.1016/j.peva.2020.102151",
  language = "English (US)",
  volume = "145",
  journal = "Performance Evaluation",
  issn = "0166-5316",
  publisher = "Elsevier",
}

TY  - JOUR
T1  - Scalable load balancing in the presence of heterogeneous servers
AU  - Gardner, Kristen
AU  - Abdul Jaleel, Jazeem
AU  - Wickeham, Alexander
AU  - Doroudi, Sherwin
PY  - 2021/1
Y1  - 2021/1
AB  - Heterogeneity is becoming increasingly ubiquitous in modern large-scale computer systems. Developing good load balancing policies for systems whose resources have varying speeds is crucial in achieving low response times. Indeed, how best to dispatch jobs to servers is a classical and well-studied problem in the queueing literature. Yet the bulk of existing work on large-scale systems assumes homogeneous servers; unfortunately, policies that perform well in the homogeneous setting can cause unacceptably poor performance in heterogeneous systems. We adapt the “power-of-d” versions of both the Join-the-Idle-Queue and Join-the-Shortest-Queue policies to design two corresponding families of heterogeneity-aware dispatching policies, each of which is parameterized by a pair of routing probabilities. Unlike their heterogeneity-unaware counterparts, our policies use server speed information both when choosing which servers to query and when probabilistically deciding where (among the queried servers) to dispatch jobs. Both of our policy families are analytically tractable: our mean response time and queue length distribution analyses are exact as the number of servers approaches infinity, under standard assumptions. Furthermore, our policy families achieve maximal stability and outperform well-known dispatching rules – including heterogeneity-aware policies such as Shortest-Expected-Delay – with respect to mean response time.
KW  - Dispatching
KW  - Heterogeneity
KW  - Join-Idle-Queue
KW  - Join-the-Shortest-Queue
KW  - Load balancing
KW  - Power of d
UR  - http://www.scopus.com/inward/record.url?scp=85092651009&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=85092651009&partnerID=8YFLogxK
U2  - 10.1016/j.peva.2020.102151
DO  - 10.1016/j.peva.2020.102151
M3  - Article
AN  - SCOPUS:85092651009
SN  - 0166-5316
VL  - 145
JO  - Performance Evaluation
JF  - Performance Evaluation
M1  - 102151
ER  -
