TY - GEN
T1 - Accordion
T2 - 20th IEEE International Symposium on High Performance Computer Architecture, HPCA 2014
AU - Karpuzcu, Ulya
AU - Akturk, Ismail
AU - Kim, Nam Sung
N1 - Copyright:
Copyright 2014 Elsevier B.V., All rights reserved.
PY - 2014
Y1 - 2014
N2 - While more cores can find place in the unit chip area every technology generation, excessive growth in power density prevents simultaneous utilization of all. Due to the lower operating voltage, Near-Threshold Voltage Computing (NTC) promises to fit more cores in a given power envelope. Yet NTC prospects for energy efficiency disappear without mitigating (i) the performance degradation due to the lower operating frequency; (ii) the intensified vulnerability to parametric variation. To compensate for the first barrier, we need to raise the degree of parallelism - the number of cores engaged in computation. NTC-prompted power savings dominate the power cost of increasing the core count. Hence, limited parallelism in the application domain constitutes the critical barrier to engaging more cores in computation. To avoid the second barrier, the system should tolerate variation-induced errors. Unfortunately, engaging more cores in computation exacerbates vulnerability to variation further. To overcome NTC barriers, we introduce Accordion, a novel, light-weight framework, which exploits weak scaling along with inherent fault tolerance of emerging R(ecognition), M(ining), S(ynthesis) applications. The key observation is that the problem size not only dictates the number of cores engaged in computation, but also the application output quality. Consequently, Accordion designates the problem size as the main knob to trade off the degree of parallelism (i.e. the number of cores engaged in computation), with the degree of vulnerability to variation (i.e. the corruption in application output quality due to variation-induced errors). Parametric variation renders ample reliability differences between the cores. Since RMS applications can tolerate faults emanating from data-intensive program phases as opposed to control, variation-afflicted Accordion hardware executes fault-tolerant data-intensive phases on error-prone cores, and reserves reliable cores for control.
AB - While more cores can find place in the unit chip area every technology generation, excessive growth in power density prevents simultaneous utilization of all. Due to the lower operating voltage, Near-Threshold Voltage Computing (NTC) promises to fit more cores in a given power envelope. Yet NTC prospects for energy efficiency disappear without mitigating (i) the performance degradation due to the lower operating frequency; (ii) the intensified vulnerability to parametric variation. To compensate for the first barrier, we need to raise the degree of parallelism - the number of cores engaged in computation. NTC-prompted power savings dominate the power cost of increasing the core count. Hence, limited parallelism in the application domain constitutes the critical barrier to engaging more cores in computation. To avoid the second barrier, the system should tolerate variation-induced errors. Unfortunately, engaging more cores in computation exacerbates vulnerability to variation further. To overcome NTC barriers, we introduce Accordion, a novel, light-weight framework, which exploits weak scaling along with inherent fault tolerance of emerging R(ecognition), M(ining), S(ynthesis) applications. The key observation is that the problem size not only dictates the number of cores engaged in computation, but also the application output quality. Consequently, Accordion designates the problem size as the main knob to trade off the degree of parallelism (i.e. the number of cores engaged in computation), with the degree of vulnerability to variation (i.e. the corruption in application output quality due to variation-induced errors). Parametric variation renders ample reliability differences between the cores. Since RMS applications can tolerate faults emanating from data-intensive program phases as opposed to control, variation-afflicted Accordion hardware executes fault-tolerant data-intensive phases on error-prone cores, and reserves reliable cores for control.
UR - http://www.scopus.com/inward/record.url?scp=84903978220&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84903978220&partnerID=8YFLogxK
U2 - 10.1109/HPCA.2014.6835977
DO - 10.1109/HPCA.2014.6835977
M3 - Conference contribution
AN - SCOPUS:84903978220
SN - 9781479930975
T3 - Proceedings - International Symposium on High-Performance Computer Architecture
SP - 72
EP - 83
BT - 20th IEEE International Symposium on High Performance Computer Architecture, HPCA 2014
PB - IEEE Computer Society
Y2 - 15 February 2014 through 19 February 2014
ER -