Abstract
The authors propose automated algorithmic error resilience based on outlier detection. The approach exploits characteristic behavior of a class of applications to create metric functions that normally produce values according to a designed distribution or behavior and produce outlier values when computations are affected by errors. For a robust algorithm that employs such an approach, error detection becomes equivalent to outlier detection. As such, the authors use well-established, statistically rigorous techniques for outlier detection to effectively and efficiently detect errors, and subsequently correct them. The authors' error-resilient algorithms incur significantly lower overhead than traditional hardware and software error-resilience techniques (such as triple modular redundancy). In addition, compared to previous approaches to application-based error resilience, the authors' approaches parameterize the robustification process, making it easy to automatically transform large classes of applications into robust applications with the use of parser-based tools and minimal programmer effort. They demonstrate the use of automated error resilience based on outlier detection for dynamic programming problems. For error rates up to 10E-3, the error-resilient algorithms achieve the same output quality as their error-free counterparts with significantly lower overhead (less than 59 percent for monadic problems and 263 percent for polyadic problems, on average) than conventional hardware and software error-resilience techniques.
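The idea of turning error detection into outlier detection can be illustrated with a minimal sketch. It is not the authors' implementation, and every name in it (`exact_cell`, `noisy_cell`, `is_outlier`, `fill_table`) is hypothetical. The example uses the longest-common-subsequence table as a stand-in dynamic programming problem. Its metric is the increment of a cell over its upper and left neighbors, which always lies in {0, 1} in an error-free run. A simple range check on that metric replaces the statistical outlier tests the abstract refers to. Faults are simulated as single bit flips in bits 2 to 11, an assumption chosen so that every injected fault moves the metric outside its valid range.

```python
import random

def exact_cell(c, a, b, i, j):
    """Standard LCS recurrence for cell (i, j)."""
    if a[i - 1] == b[j - 1]:
        return c[i - 1][j - 1] + 1
    return max(c[i - 1][j], c[i][j - 1])

def noisy_cell(c, a, b, i, j, rng, fault_rate):
    """Recurrence with a simulated fault (assumption: one flip among bits 2-11)."""
    v = exact_cell(c, a, b, i, j)
    if rng.random() < fault_rate:
        v ^= 1 << rng.randrange(2, 12)
    return v

def is_outlier(c, i, j, v):
    """Metric check: increments over the upper and left neighbors must be 0 or 1."""
    return not (0 <= v - c[i - 1][j] <= 1 and 0 <= v - c[i][j - 1] <= 1)

def fill_table(a, b, fault_rate=0.0, protect=False, seed=0):
    """Fill the LCS table; with protect=True, recompute any cell flagged as an outlier."""
    rng = random.Random(seed)
    c = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    fixes = 0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            v = noisy_cell(c, a, b, i, j, rng, fault_rate)
            while protect and is_outlier(c, i, j, v):
                fixes += 1  # detected an erroneous value; correct by recomputation
                v = noisy_cell(c, a, b, i, j, rng, fault_rate)
            c[i][j] = v
    return c, fixes

if __name__ == "__main__":
    a, b = "ACCGGTCGAGTGCGCGGAAGCCGGCCGAA", "GTCGTTCGGAATGCCGTTGCTCTGTAAA"
    exact, _ = fill_table(a, b)
    robust, fixes = fill_table(a, b, fault_rate=0.05, protect=True, seed=1)
    print("protected table matches error-free table:", robust == exact)
    print("cells recomputed:", fixes)
```

In this sketch the only overhead is one comparison per cell plus the occasional recomputation, which hints at why such checks can be cheaper than full redundancy. A flip in bit 0 can slip past this particular range check, so a realistic metric would need to be designed more carefully.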
Original language | English (US) |
---|---|
Article number | 7006347 |
Pages (from-to) | 46-59 |
Number of pages | 14 |
Journal | IEEE Micro |
Volume | 36 |
Issue number | 1 |
State | Published - Jan 1 2016 |
Bibliographical note
Publisher Copyright: © 1981-2012 IEEE.
Keywords
- Algorithmic error resilience
- Application robustification
- Dynamic programming
- Outlier detection