The scalability of a dynamic binary translation (DBT) system has become important due to the prevalence of multicore systems and large multi-threaded applications. Several recent efforts have addressed critical issues in extending a DBT system to run on multicore platforms for better scalability. In this paper, we present a distributed DBT framework, called DQEMU, that goes beyond a single-node multicore processor and can be scaled up to a cluster of multi-node servers. In this distributed DBT system, we integrate a page-level directory-based data coherence protocol, a hierarchical locking mechanism, a delegation scheme for system calls, and a remote thread migration approach, all of which are effective in reducing its overheads. We also propose several performance optimization strategies: page splitting to mitigate false data sharing among nodes, data forwarding for latency hiding, and a hint-based locality-aware scheduling scheme. Comprehensive experiments have been conducted on DQEMU with micro-benchmarks and the PARSEC benchmark suite. The results show that DQEMU can scale beyond a single-node machine with reasonable overheads. For "embarrassingly-parallel" benchmark programs, DQEMU can achieve near-linear speedup as the number of nodes increases, whereas the speedup of the current single-node, multicore version of QEMU flattens out for lack of computing resources.
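The abstract names a page-level directory-based coherence protocol without spelling out its states or messages. As a generic illustration of what such a protocol usually involves, and not DQEMU's actual design, the sketch below tracks each guest page in a home-node directory with assumed MSI-style states. The names `PageDirectory`, `Entry`, `State`, and the message kinds `"invalidate"` and `"downgrade"` are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class State(Enum):
    INVALID = "I"    # no remote node holds a valid copy
    SHARED = "S"     # one or more nodes hold read-only copies
    MODIFIED = "M"   # exactly one node holds a writable copy


@dataclass
class Entry:
    state: State = State.INVALID
    sharers: set = field(default_factory=set)  # node ids with read-only copies
    owner: Optional[int] = None                # node id with the writable copy


class PageDirectory:
    """Hypothetical home-node directory keyed by guest page number."""

    def __init__(self):
        self.entries = {}
        self.messages = []  # log of (kind, node, page) protocol messages

    def _entry(self, page):
        return self.entries.setdefault(page, Entry())

    def read(self, node, page):
        e = self._entry(page)
        if e.state is State.MODIFIED:
            if e.owner == node:
                return  # requester already owns a writable copy
            # The current writer keeps a read-only copy after downgrading.
            self.messages.append(("downgrade", e.owner, page))
            e.sharers = {e.owner}
            e.owner = None
        e.state = State.SHARED
        e.sharers.add(node)

    def write(self, node, page):
        e = self._entry(page)
        if e.state is State.MODIFIED and e.owner == node:
            return  # requester already owns a writable copy
        if e.state is State.MODIFIED:
            self.messages.append(("invalidate", e.owner, page))
        # Every other read-only copy must be dropped before the write.
        for sharer in sorted(e.sharers - {node}):
            self.messages.append(("invalidate", sharer, page))
        e.state = State.MODIFIED
        e.sharers = set()
        e.owner = node
```

Because ownership is tracked per page in this sketch, two nodes that write unrelated variables on the same page would keep invalidating each other's copies. That is the kind of false sharing the page-splitting optimization in the abstract is meant to mitigate.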
|Original language||English (US)|
|Title of host publication||Proceedings of the 49th International Conference on Parallel Processing, ICPP 2020|
|Publisher||Association for Computing Machinery|
|State||Published - Aug 17 2020|
|Event||49th International Conference on Parallel Processing, ICPP 2020 - Virtual, Online, Canada|
|Duration||Aug 17 2020 → Aug 20 2020|
|Name||ACM International Conference Proceeding Series|
|Conference||49th International Conference on Parallel Processing, ICPP 2020|
|Period||8/17/20 → 8/20/20|
Bibliographical note
Funding Information:
This work is partially supported by the National Natural Science Foundation of China (61702286), the National Key Research and Development Program of China (2018YFB1003405), the Natural Science Foundation of Tianjin, China (18JCYBJC15600), the CERNET Innovation Project (NGII20190514), and faculty startup funding from the University of Georgia.
© 2020 ACM.
- Dynamic binary translator
- distributed emulator
- distributed system