Most existing computer architecture simulators are cycle oriented, i.e., they are driven cycle by cycle. However, frequent switches among simulation contexts, excessive buffer accesses, and tight coupling between components often make such a simulator slow, difficult to parallelize, and hard to scale to large many-core systems. In this paper, we propose Prophet, a parallel instruction-oriented simulation framework for many-cores. Prophet adopts a general instruction-oriented model to simulate processor cores, in which a simulator is built from the perspective of each simulated instruction impacting a small number of relevant processor components, as opposed to that of a large number of processor components executing many instructions in each cycle, as in the cycle-oriented approach. Prophet determines the execution cycle of a simulated instruction based on the states of the relevant components impacted by the instruction, and updates the component states after the instruction executes. Prophet also adopts a speculative model to decouple private resources from shared resources (e.g., a shared cache), which avoids unnecessary interactions between them and pays a penalty only upon a rare mis-speculation. We have designed and implemented a prototype of Prophet that supports both user-level and full-system simulation. Experimental results show that Prophet can scale to thousands of simulated cores (4,096 cores in the current implementation) with good performance and small accuracy loss. It achieves average simulation speeds of about 98 and 235 MIPS (millions of simulated instructions per second) for full-system and user-level simulation, respectively, with only a 3 percent IPC error rate and negligible deviation in cache simulation results. When run on a many-core platform (i.e., Intel Xeon Phi), it achieves an average simulation speed of about 413 MIPS.
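The instruction-oriented idea described in the abstract can be illustrated with a small sketch. This is not Prophet's implementation: the `Instr` and `CoreState` types, the register and unit names, the latencies, and the in-order, one-issue-per-cycle assumption are all hypothetical and chosen only to show how an instruction's execution cycle might be derived from the state of the few components it touches, after which those components are updated.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Instr:
    # Hypothetical instruction record; field names are illustrative only.
    unit: str                  # functional unit the instruction occupies
    srcs: Tuple[str, ...]      # source registers it reads
    dst: Optional[str]         # destination register it writes, if any
    latency: int               # cycles from start to result


@dataclass
class CoreState:
    # State of the components an instruction can affect (assumed model).
    reg_ready: Dict[str, int] = field(default_factory=dict)  # reg -> cycle value is ready
    unit_free: Dict[str, int] = field(default_factory=dict)  # unit -> first free cycle
    next_issue: int = 0        # earliest cycle the next instruction may start


def simulate(core: CoreState, instr: Instr) -> Tuple[int, int]:
    """Return (start_cycle, done_cycle) for one instruction and update the core."""
    # The start cycle depends only on the components this instruction touches.
    constraints = [core.next_issue, core.unit_free.get(instr.unit, 0)]
    constraints += [core.reg_ready.get(r, 0) for r in instr.srcs]
    start = max(constraints)
    done = start + instr.latency

    # Update the affected components after the instruction executes.
    core.unit_free[instr.unit] = start + 1   # assumed pipelined unit
    if instr.dst is not None:
        core.reg_ready[instr.dst] = done
    core.next_issue = start + 1              # assumed in-order, one issue per cycle
    return start, done


def run(trace: List[Instr]) -> List[Tuple[int, int]]:
    """Simulate a trace instruction by instruction, with no per-cycle loop."""
    core = CoreState()
    return [simulate(core, i) for i in trace]


if __name__ == "__main__":
    trace = [
        Instr(unit="mem", srcs=(), dst="r1", latency=3),      # load r1
        Instr(unit="alu", srcs=("r1",), dst="r2", latency=1), # depends on the load
        Instr(unit="alu", srcs=(), dst="r3", latency=1),      # independent
    ]
    for start, done in run(trace):
        print(start, done)
```

Note that the loop advances per instruction rather than per cycle: the dependent add waits for `r1` simply because its start cycle is computed from `reg_ready`, not because idle cycles are stepped through one by one.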
|Original language||English (US)|
|Number of pages||14|
|Journal||IEEE Transactions on Parallel and Distributed Systems|
|State||Published - Oct 1 2017|
Bibliographical note
Funding Information:
This work is supported in part by the National Key Research and Development Program of China (No. 2016YFB0200501) and the National Natural Science Foundation of China (Nos. 61672160 and 61370081). We would like to thank all of the anonymous reviewers for their valuable comments and suggestions, which greatly improved the paper. Weihua Zhang is the corresponding author.
© 1990-2012 IEEE.