Code transformations for enhancing the performance of speculatively parallel threads

Shengyue Wang, Pen Chung Yew, Antonia Zhai

Research output: Contribution to journal › Article › peer-review

5 Scopus citations

Abstract

As technology advances, microprocessors that integrate multiple cores on a single chip are becoming increasingly common. How to use these processors to improve the performance of a single program has been a challenge. For general-purpose applications, it is especially difficult to create efficient parallel execution because of complex control flow and ambiguous data dependences. Thread-level speculation and transactional memory provide two hardware mechanisms that can optimistically parallelize potentially dependent threads. However, a compiler that performs detailed performance trade-off analysis is essential for generating efficient parallel programs for such hardware. This compiler must take into consideration the cost of intra-thread as well as inter-thread value communication. At the same time, the ubiquity of complex, input-dependent control flow and data dependence patterns in general-purpose applications makes it impossible for one technique to optimize all program patterns. In this paper, we propose three optimization techniques to improve thread performance: (i) scheduling instructions and generating recovery code to reduce the critical forwarding path introduced by synchronizing memory-resident values; (ii) identifying reduction variables and transforming the code to minimize serializing execution; and (iii) dynamically merging consecutive iterations of a loop to avoid stalls due to unbalanced workload. Detailed evaluation of the proposed mechanisms shows that each optimization technique improves a subset of the SPEC2000 benchmarks, but none improves all of them. On average, the proposed optimizations improve performance by 7% for the set of SPEC2000 benchmarks that have already been optimized for register-resident value communication.
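As a rough illustration of technique (ii), the sketch below contrasts a loop that updates a single reduction variable with a version in which each thread accumulates into a private partial result that is merged once after the loop. This is not the authors' compiler transformation: the function names, the `num_threads` parameter, and the round-robin assignment of iterations to threads are assumptions made for the example, and the "threads" are only simulated sequentially.

```python
def reduce_serial(values):
    """Original loop: every iteration reads and writes `total`, so
    speculative threads running consecutive iterations would have to
    wait on each other for its value."""
    total = 0
    for v in values:
        total += v
    return total


def reduce_privatized(values, num_threads=4):
    """Transformed loop (hypothetical sketch): each simulated thread
    accumulates into its own partial sum, removing the cross-iteration
    dependence; the partial sums are merged once after the loop."""
    partial = [0] * num_threads
    for i, v in enumerate(values):
        tid = i % num_threads  # assumption: iteration i runs on thread i mod num_threads
        partial[tid] += v
    return sum(partial)
```

The two functions agree here because integer addition is associative and commutative; for floating-point reductions, reordering the additions can change the rounding of the result.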

Original language: English (US)
Article number: 1240008
Journal: Journal of Circuits, Systems and Computers
Volume: 21
Issue number: 2
State: Published - Apr 2012

Bibliographical note

Funding Information:
This work is supported in part by grants from the National Science Foundation under CNS-0834599, CSR-0834599, and CPS-0931931, a contract from the Semiconductor Research Corporation under SRC-2008-TJ-1819, and gift grants from HP, IBM, and Intel.

Keywords

  • Thread-level speculation
  • compiler optimizations
  • multicore systems
  • parallelizing compiler
