To date, the study of dispatching, or load balancing, in server farms has focused primarily on minimizing response time. A server farm is typically modeled as a front-end router that uses a dispatching policy to route each arriving job to one of several servers, each of which serves all the jobs in its queue via Processor-Sharing. The common assumption has been that all jobs are equally important or valuable, in that they are equally sensitive to delay. Our work departs from this assumption: we model each arrival as having a random value parameter, independent of the arrival's service requirement (job size). Given such value heterogeneity, the appropriate metric is no longer the minimization of response time, but rather the minimization of value-weighted response time. In this context, we ask: what is a good dispatching policy for minimizing the value-weighted response time metric? We propose a number of new dispatching policies motivated by this goal. Via a combination of exact analysis, asymptotic analysis, and simulation, we deduce many unexpected results regarding dispatching.
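To make the setting concrete, the following sketch simulates a small farm of Processor-Sharing servers fed by a Poisson arrival stream and estimates the value-weighted mean response time E[V·T] under two illustrative dispatching policies (uniformly random routing and Join-the-Shortest-Queue). These policies, the two-point value distribution, and all parameter choices are assumptions for illustration only; they are not the policies proposed in the paper.

```python
import random

def dispatch_random(servers, size, value):
    # Route uniformly at random, ignoring queue state and job value.
    return random.randrange(len(servers))

def dispatch_jsq(servers, size, value):
    # Join-the-Shortest-Queue: pick the server with the fewest jobs.
    return min(range(len(servers)), key=lambda s: len(servers[s]))

def simulate_ps_farm(dispatch, lam=1.5, n_servers=2, n_jobs=20000, seed=1):
    """Simulate a farm of rate-1 Processor-Sharing servers fed by a
    Poisson(lam) stream; return the mean value-weighted response time."""
    random.seed(seed)
    t, arrivals, completed, weighted_sum = 0.0, 0, 0, 0.0
    servers = [[] for _ in range(n_servers)]  # jobs: [remaining, value, t_arr]
    next_arrival = random.expovariate(lam)

    def advance(dt):
        # Under PS, each of a server's k jobs receives dt/k service in dt time.
        for q in servers:
            if q:
                share = dt / len(q)
                for job in q:
                    job[0] -= share

    while completed < n_jobs:
        # Next completion: a server with k jobs finishes its smallest
        # remaining job after (min remaining) * k time units.
        next_comp, comp_s = float("inf"), None
        for s, q in enumerate(servers):
            if q:
                dt = min(job[0] for job in q) * len(q)
                if t + dt < next_comp:
                    next_comp, comp_s = t + dt, s
        if arrivals < n_jobs and next_arrival < next_comp:  # arrival event
            advance(next_arrival - t)
            t = next_arrival
            size = random.expovariate(1.0)        # job size, mean 1
            value = random.choice([1.0, 10.0])    # value independent of size
            servers[dispatch(servers, size, value)].append([size, value, t])
            arrivals += 1
            next_arrival = t + random.expovariate(lam)
        else:                                               # completion event
            advance(next_comp - t)
            t = next_comp
            q = servers[comp_s]
            i = min(range(len(q)), key=lambda j: q[j][0])
            _, value, t_arr = q.pop(i)
            weighted_sum += value * (t - t_arr)
            completed += 1
    return weighted_sum / n_jobs
```

Running `simulate_ps_farm(dispatch_random)` and `simulate_ps_farm(dispatch_jsq)` with the same seed lets one compare the value-weighted metric across policies; since neither of these baseline policies uses the value parameter, any policy that does (as studied in the paper) has room to improve on both.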
Bibliographical note and funding information:
The authors would like to thank the reviewers for their helpful comments. Special thanks to Bruno Gaujal and Gautam Iyer for their assistance in refining some of the paper's technical details. The second author's work has been supported by the Academy of Finland in the TOP-Energy project (grant no. 268992). The third author's work was funded by NSF CMMI-1334194 as well as a Computational Thinking grant from Microsoft Research.
- cμ rule
- Heterogeneous values
- Holding cost
- Server farms
- Task assignment