TY - JOUR
T1 - Scalable load balancing in the presence of heterogeneous servers
AU - Gardner, Kristen
AU - Abdul Jaleel, Jazeem
AU - Wickeham, Alexander
AU - Doroudi, Sherwin
N1 - Publisher Copyright: © 2020 Elsevier B.V.
PY - 2021/1
Y1 - 2021/1
AB - Heterogeneity is becoming increasingly ubiquitous in modern large-scale computer systems. Developing good load balancing policies for systems whose resources have varying speeds is crucial in achieving low response times. Indeed, how best to dispatch jobs to servers is a classical and well-studied problem in the queueing literature. Yet the bulk of existing work on large-scale systems assumes homogeneous servers; unfortunately, policies that perform well in the homogeneous setting can cause unacceptably poor performance in heterogeneous systems. We adapt the “power-of-d” versions of both the Join-the-Idle-Queue and Join-the-Shortest-Queue policies to design two corresponding families of heterogeneity-aware dispatching policies, each of which is parameterized by a pair of routing probabilities. Unlike their heterogeneity-unaware counterparts, our policies use server speed information both when choosing which servers to query and when probabilistically deciding where (among the queried servers) to dispatch jobs. Both of our policy families are analytically tractable: our mean response time and queue length distribution analyses are exact as the number of servers approaches infinity, under standard assumptions. Furthermore, our policy families achieve maximal stability and outperform well-known dispatching rules – including heterogeneity-aware policies such as Shortest-Expected-Delay – with respect to mean response time.
KW - Dispatching
KW - Heterogeneity
KW - Join-Idle-Queue
KW - Join-the-Shortest-Queue
KW - Load balancing
KW - Power of d
UR - http://www.scopus.com/inward/record.url?scp=85092651009&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85092651009&partnerID=8YFLogxK
U2 - 10.1016/j.peva.2020.102151
DO - 10.1016/j.peva.2020.102151
M3 - Article
AN - SCOPUS:85092651009
SN - 0166-5316
VL - 145
JO - Performance Evaluation
JF - Performance Evaluation
M1 - 102151
ER -