The increasing data requirements of Internet applications have driven a surge in new programming paradigms and complex scheduling algorithms for data-intensive workloads. Because of the expanding volume and variety of such flows, their raw data are often processed on Intermediate Processing Nodes (IPNs) before being sent to servers. However, this intermediate processing constraint is rarely considered in existing flow computing models. This paper aims to minimize the tardiness of data-intensive applications under the intermediate processing constraint. Motivating cases show that tardiness is affected by both IPN locations and flow dispatching strategies. Based on the observation that dispatching flows to IPNs essentially builds a matching between flows and IPNs, a novel solution is proposed based on matching theory. In the deployment phase, a tardiness-aware deferred acceptance algorithm is developed to optimize IPN locations. In the operation phase, the Power-of-D paradigm and matching theory are combined to dispatch flows efficiently. Evaluation results show that the solution effectively minimizes the total tardiness of data-intensive applications in heterogeneous systems.
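To make the dispatching idea concrete, the following is a minimal sketch of plain power-of-d-choices dispatching with a tardiness objective. It is not the paper's algorithm: the matching-theory step and the IPN placement phase are omitted, and every name here (`tardiness`, `dispatch_power_of_d`, the `(size, deadline)` flow tuples, `ipn_rates`) is a hypothetical chosen for illustration. Tardiness is taken in its usual scheduling sense, the amount by which a completion time exceeds the deadline, clipped at zero.

```python
import random


def tardiness(completion_time, deadline):
    """Usual scheduling definition: lateness clipped at zero."""
    return max(0.0, completion_time - deadline)


def dispatch_power_of_d(flows, ipn_rates, d=2, seed=0):
    """Send each flow to the best of d randomly sampled IPNs.

    flows: list of (size, deadline) tuples (hypothetical units).
    ipn_rates: processing rate of each IPN; unequal rates model a
        heterogeneous system.
    Returns (assignment, total_tardiness).
    """
    rng = random.Random(seed)
    busy_until = [0.0] * len(ipn_rates)  # time at which each IPN becomes free
    assignment = []
    total = 0.0
    for size, deadline in flows:
        # Probe only d IPNs instead of all of them.
        candidates = rng.sample(range(len(ipn_rates)), min(d, len(ipn_rates)))
        # Among the probed IPNs, pick the one that would finish this flow earliest.
        best = min(candidates, key=lambda i: busy_until[i] + size / ipn_rates[i])
        busy_until[best] += size / ipn_rates[best]
        assignment.append(best)
        total += tardiness(busy_until[best], deadline)
    return assignment, total


if __name__ == "__main__":
    flows = [(4, 5), (2, 3), (6, 4)]  # assumed (size, deadline) pairs
    print(dispatch_power_of_d(flows, ipn_rates=[1.0, 2.0], d=2))
```

Probing only d candidates per flow keeps the dispatch cost independent of the number of IPNs, which is the usual motivation for the power-of-d approach; how the paper combines this with matching is not shown here.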
|Original language|English (US)|
|Number of pages|15|
|Journal|IEEE Transactions on Parallel and Distributed Systems|
|State|Published - Jan 1 2020|
Bibliographical note
Funding Information:
Ke Xu’s work was supported in part by the National Key R&D Program of China under Grant 2018YFB0803405, in part by the China National Funds for Distinguished Young Scientists under Grant 61825204, and in part by the Beijing Outstanding Young Scientist Project. Meng Shen’s work was supported in part by the National Natural Science Foundation of China under Grant 61602039, in part by the Beijing Natural Science Foundation under Grant 4192050, and in part by the CCF-Tencent Open Fund WeBank Special Funding. Kun Yang’s work was supported by the UK EPSRC Project NIRVANA (EP/L026031/1).
© 1990-2012 IEEE.
Keywords
- Heterogeneous system
- data-intensive application
- matching theory