Abstract
Predicting the values that are likely to be produced by instructions has been suggested as a way of increasing the instruction-level parallelism available in a superscalar processor. One of the potential difficulties in cost-effectively predicting values for a given instruction, however, is selecting the proper type of predictor, such as a last-value predictor, a stride predictor, or a context-based predictor. We propose a compiler-directed classification scheme that statically partitions all of the instructions in a program into several groups, each of which is associated with a specific value predictability pattern. This value predictability pattern is encoded into the instructions to identify the type of value predictor that will be best suited for predicting the values that are likely to be produced by each instruction at runtime. Both a profile-based compiler implementation and an implementation based on the GCC compiler are studied to show the performance bounds for the proposed technique. Our simulations using an extension to the SimpleScalar tool set and the SPEC95 and SPEC2000 benchmark programs indicate that this approach can efficiently use the limited hardware resources in superscalar processors. This static partitioning approach produces better performance than a dynamically partitioned approach and a simple round-robin distribution approach for a given hardware configuration. Finally, this work further demonstrates the connection between value locality behavior and source-level program structures, thereby leading to a deeper understanding of the causes of this behavior.
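The idea of statically assigning each instruction to a predictor type can be illustrated with a minimal sketch. It is not the paper's implementation: the profile format, the `classify` function, the 0.5 accuracy threshold, and the instruction addresses are all assumptions made for illustration, and the context-based predictor is omitted for brevity. Each instruction's profiled value trace is scored against a last-value and a stride predictor, and the instruction is labeled with whichever predicts it best.

```python
class LastValuePredictor:
    """Predicts that an instruction produces the same value as last time."""

    def __init__(self):
        self.last = None

    def predict(self):
        return self.last

    def update(self, value):
        self.last = value


class StridePredictor:
    """Predicts the last value plus the most recently observed stride."""

    def __init__(self):
        self.last = None
        self.stride = 0

    def predict(self):
        return None if self.last is None else self.last + self.stride

    def update(self, value):
        if self.last is not None:
            self.stride = value - self.last
        self.last = value


def accuracy(predictor_cls, values):
    """Fraction of values in a trace that a fresh predictor gets right."""
    predictor = predictor_cls()
    hits = 0
    for v in values:
        if predictor.predict() == v:
            hits += 1
        predictor.update(v)
    return hits / len(values) if values else 0.0


def classify(profile, threshold=0.5):
    """Assign each instruction (keyed by a hypothetical PC) to a predictor type.

    `profile` maps an instruction address to the values it produced in a
    profiling run (an assumed format). Instructions that no predictor handles
    well are labeled "none" so they do not occupy predictor table entries.
    """
    candidates = {"last_value": LastValuePredictor, "stride": StridePredictor}
    labels = {}
    for pc, values in profile.items():
        scores = {name: accuracy(cls, values) for name, cls in candidates.items()}
        best = max(scores, key=scores.get)  # ties go to the first candidate
        labels[pc] = best if scores[best] >= threshold else "none"
    return labels


if __name__ == "__main__":
    # Made-up traces: a constant, an arithmetic sequence, and irregular values.
    profile = {
        0x400: [5, 5, 5, 5, 5],
        0x404: [0, 4, 8, 12, 16],
        0x408: [7, 3, 9, 1, 4],
    }
    for pc, label in classify(profile).items():
        print(hex(pc), label)
```

In a scheme like the one the abstract describes, the resulting label would be encoded into the instruction so that the hardware can steer it to the matching predictor at runtime. Here the label is simply returned in a dictionary.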
Original language | English (US) |
---|---|
Pages (from-to) | 929-944 |
Number of pages | 16 |
Journal | IEEE Transactions on Computers |
Volume | 53 |
Issue number | 8 |
State | Published - Aug 2004 |
Bibliographical note
Funding Information: This work was supported in part by US National Science Foundation grants no. MIP-9610379, EIA-9971666, and CCR-9900605, by KAI Software, a division of Intel America, Inc., and by the Minnesota Supercomputing Institute. David J. Lilja was supported by a Fulbright award from the Australian-American Education Foundation during portions of this work. A condensed version of this work was presented at the 2001 International Conference on Computer Design [9].