Abstract
Since stochastic computing performs operations using streams of bits that represent probability values instead of deterministic values, it can tolerate a large number of failures in a noisy system. However, the simulation of a stochastic implementation is extremely time-consuming. In this paper, we investigate two approaches to speed up stochastic simulation: a GPU-based simulation and an OpenMP-based simulation. To compare these two approaches, we start with several basic stochastic computing elements (SCEs) and then use the stochastic implementation of a frame difference-based image segmentation algorithm as a case study to conduct extensive experiments. Measured results show that the GPU-based simulation with 448 processing elements achieves a speedup of up to 119x over the single-threaded CPU simulation and 17x over the OpenMP-based simulation with eight processor cores. In addition, we present several performance optimisations for the GPU-based simulation that significantly improve the performance of stochastic simulation.
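To make the bitstream idea concrete, the following is a minimal Python sketch of one textbook stochastic computing element: multiplication of two unipolar bitstreams with a bitwise AND. It is an illustration under stated assumptions, not the paper's implementation; the abstract does not specify which SCEs were simulated, and the function names (`encode`, `decode`, `sc_multiply`, `flip_bits`), the stream length, and the 1% error rate are all hypothetical choices made here.

```python
import random


def encode(p, length, rng):
    """Encode probability p as a unipolar bitstream: each bit is 1 with probability p."""
    return [1 if rng.random() < p else 0 for _ in range(length)]


def decode(stream):
    """Estimate the encoded value as the fraction of 1s in the stream."""
    return sum(stream) / len(stream)


def sc_multiply(a, b):
    """Multiply two independent unipolar streams with a bitwise AND."""
    return [x & y for x, y in zip(a, b)]


def flip_bits(stream, error_rate, rng):
    """Inject noise: flip each bit independently with probability error_rate."""
    return [(bit ^ 1) if rng.random() < error_rate else bit for bit in stream]


# Assumed parameters for illustration only.
rng = random.Random(0)      # fixed seed so the run is repeatable
n = 1 << 16                 # stream length; longer streams give lower variance

a = encode(0.5, n, rng)
b = encode(0.4, n, rng)
clean = sc_multiply(a, b)

product = decode(clean)                       # estimate of 0.5 * 0.4
noisy = decode(flip_bits(clean, 0.01, rng))   # same stream with 1% bit flips

print(f"clean estimate: {product:.4f}")
print(f"noisy estimate: {noisy:.4f}")
```

Two points from the abstract show up in this sketch. A small fraction of flipped bits shifts the decoded value only slightly, which is the source of the fault tolerance. The cost is that a single multiplication touches tens of thousands of bits, which is why simulating a full stochastic design is slow and why the per-bit work is a natural fit for parallel execution on a GPU or with OpenMP.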
Original language | English (US) |
---|---|
Pages (from-to) | 34-46 |
Number of pages | 13 |
Journal | International Journal of Computational Science and Engineering |
Volume | 8 |
Issue number | 1 |
State | Published - 2013 |
Keywords
- Fault-tolerance
- GPU computing
- Image processing
- Parallel computing
- Stochastic computing