TY - GEN
T1 - Parallel implementation of finite state machines for reducing the latency of stochastic computing
AU - Ma, Cong
AU - Lilja, David J.
PY - 2018/5/9
Y1 - 2018/5/9
N2 - Stochastic computing, which employs random bit streams for computations, has shown low hardware cost and high fault-tolerance compared to the computations using a conventional binary encoding. Finite state machine (FSM) based stochastic computing elements can compute complex functions, such as the exponentiation and hyperbolic tangent functions, more efficiently than those using combinational logic. However, the FSM, as a sequential logic, cannot be directly implemented in parallel like the combinational logic, so reducing the long latency of the calculation becomes difficult. Applications in the relatively higher frequency domain would require an extremely fast clock rate using FSM. This paper proposes a parallel implementation of the FSM, using an estimator and a dispatcher to directly initialize the FSM to the steady state. Experimental results show that the outputs of four typical functions using the parallel implementation are very close to those of the serial version. The parallel FSM scheme further shows equivalent or better image quality than the serial implementation in two image processing applications Edge Detection and Frame Difference.
AB - Stochastic computing, which employs random bit streams for computations, has shown low hardware cost and high fault-tolerance compared to the computations using a conventional binary encoding. Finite state machine (FSM) based stochastic computing elements can compute complex functions, such as the exponentiation and hyperbolic tangent functions, more efficiently than those using combinational logic. However, the FSM, as a sequential logic, cannot be directly implemented in parallel like the combinational logic, so reducing the long latency of the calculation becomes difficult. Applications in the relatively higher frequency domain would require an extremely fast clock rate using FSM. This paper proposes a parallel implementation of the FSM, using an estimator and a dispatcher to directly initialize the FSM to the steady state. Experimental results show that the outputs of four typical functions using the parallel implementation are very close to those of the serial version. The parallel FSM scheme further shows equivalent or better image quality than the serial implementation in two image processing applications Edge Detection and Frame Difference.
UR - http://www.scopus.com/inward/record.url?scp=85047944922&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85047944922&partnerID=8YFLogxK
U2 - 10.1109/ISQED.2018.8357309
DO - 10.1109/ISQED.2018.8357309
M3 - Conference contribution
AN - SCOPUS:85047944922
T3 - Proceedings - International Symposium on Quality Electronic Design, ISQED
SP - 335
EP - 340
BT - 2018 19th International Symposium on Quality Electronic Design, ISQED 2018
PB - IEEE Computer Society
T2 - 19th International Symposium on Quality Electronic Design, ISQED 2018
Y2 - 13 March 2018 through 14 March 2018
ER -