Abstract
Interference is a key performance challenge faced by cloud users, and can significantly degrade application performance on virtual machines (VMs). For load-balanced cloud applications, a key question is how to distribute the load among VMs in the presence of interference. Using a Markov decision process (MDP) model, we investigate dynamic control policies for assigning jobs to a cluster of interference-prone VMs in a system with a central queue and an arbitrary number of VMs. We characterize the structural properties of the MDP optimality equation, and we prove that the optimal control policy is a threshold policy based on the queue length. The optimal policy is characterized by multiple thresholds that depend on the current conditions of the VMs, including the number of busy under-interference VMs. We discuss the existence of an ordering among these thresholds, and we prove the ordering for a two-VM system. Our numerical results show that the optimal dynamic policy can significantly improve performance compared to the commonly employed non-idling policy. For low-utilization systems, we observe improvements on the order of 20%. We further implement the optimal policy in a real-world testbed using the HAProxy load balancer, and show that it can reduce web server response times by as much as 40%-60%, even for time-varying request rates.
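As a rough illustration of what a queue-length threshold policy looks like next to the non-idling baseline, here is a minimal Python sketch. It is not the paper's implementation. The function names, the threshold values, and the direction of the rule (send a job to an idle under-interference VM only once the queue length reaches the threshold for the current number of busy under-interference VMs) are all assumptions made for illustration.

```python
from typing import Mapping


def use_interfered_vm(queue_length: int,
                      busy_interfered_vms: int,
                      thresholds: Mapping[int, int]) -> bool:
    """Decide whether to send the head-of-queue job to an idle VM that is
    currently under interference.

    ASSUMPTION: the job is dispatched only when the queue length has reached
    the threshold associated with the current number of busy
    under-interference VMs. The abstract states only that such thresholds
    exist; this exact form is hypothetical.
    """
    if queue_length <= 0:
        return False  # nothing is waiting, so there is nothing to dispatch
    return queue_length >= thresholds[busy_interfered_vms]


def non_idling(queue_length: int) -> bool:
    """Baseline policy: never leave a VM idle while a job is waiting."""
    return queue_length > 0


# Hypothetical thresholds, keyed by the number of busy under-interference VMs.
# These values are for illustration only and are not taken from the paper.
EXAMPLE_THRESHOLDS = {0: 3, 1: 2}

# Decisions for queue lengths 0..4 when no under-interference VM is busy.
decisions = [use_interfered_vm(q, 0, EXAMPLE_THRESHOLDS) for q in range(5)]
print(decisions)
```

Under these assumed values, the threshold rule holds jobs in the central queue at short queue lengths where the non-idling baseline would already dispatch them to the slower VM.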
Original language | English (US) |
---|---|
Title of host publication | Proceedings - 2019 IEEE 27th International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems, MASCOTS 2019 |
Publisher | IEEE Computer Society |
Pages | 295-308 |
Number of pages | 14 |
ISBN (Electronic) | 9781728149509 |
State | Published - Oct 2019 |
Event | 27th IEEE International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems, MASCOTS 2019 - Rennes, France; Duration: Oct 22 2019 → Oct 25 2019 |
Publication series
Name | Proceedings - IEEE Computer Society's Annual International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunications Systems, MASCOTS |
---|---|
Volume | 2019-October |
ISSN (Print) | 1526-7539 |
Conference
Conference | 27th IEEE International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems, MASCOTS 2019 |
---|---|
Country/Territory | France |
City | Rennes |
Period | 10/22/19 → 10/25/19 |
Bibliographical note
Funding Information: This work was supported by NSF CNS grants 1617046, 1717588, and 1750109.
Publisher Copyright: © 2019 IEEE.
Keywords
- Cloud Computing
- Markov Chains
- Markov Decision Process
- Optimal Control of Queues