VLSI architectures for the restricted Boltzmann machine

Bo Yuan; Keshab K. Parhi

doi:10.1145/3007193

Electrical and Computer Engineering

Research output: Contribution to journal › Article › peer-review

4 Scopus citations

Abstract

Neural network (NN) systems are widely used in many important applications ranging from computer vision to speech recognition. To date, most NN systems are processed by general processing units like CPUs or GPUs. However, as the sizes of dataset and network rapidly increase, the original software implementations suffer from long training time. To overcome this problem, specialized hardware accelerators are needed to design high-speed NN systems. This article presents an efficient hardware architecture of restricted Boltzmann machine (RBM) that is an important category of NN systems. Various optimization approaches at the hardware level are performed to improve the training speed. As-soon-as-possible and overlapped-scheduling approaches are used to reduce the latency. It is shown that, compared with the flat design, the proposed RBM architecture can achieve 50% reduction in training time. In addition, an on-the-fly computation scheme is also used to reduce the storage requirement of binary and stochastic states by several hundreds of times. Then, based on the proposed approach, a 784-2252 RBM design example is developed for MNIST handwritten digit recognition dataset. Analysis shows that the VLSI design of RBM achieves significant improvement in training speed and energy efficiency as compared to CPU/GPU-based solution.

Original language	English (US)
Article number	35
Journal	ACM Journal on Emerging Technologies in Computing Systems
Volume	13
Issue number	3
DOIs	https://doi.org/10.1145/3007193
State	Published - May 2017

Bibliographical note

Publisher Copyright:
© 2017 ACM.

Keywords

Restricted Boltzmann machine (RBM)
VLSI
high-speed
memory reduction
neural network (NN)
overlapped-scheduling
reduced-latency

Cite this

@article{41a3450aa52a4f12ae01703875201563,

title = "VLSI architectures for the restricted Boltzmann machine",

abstract = "Neural network (NN) systems are widely used in many important applications ranging from computer vision to speech recognition. To date, most NN systems are processed by general processing units like CPUs or GPUs. However, as the sizes of dataset and network rapidly increase, the original software implementations suffer from long training time. To overcome this problem, specialized hardware accelerators are needed to design high-speed NN systems. This article presents an efficient hardware architecture of restricted Boltzmann machine (RBM) that is an important category of NN systems. Various optimization approaches at the hardware level are performed to improve the training speed. As-soon-as-possible and overlapped-scheduling approaches are used to reduce the latency. It is shown that, compared with the flat design, the proposed RBM architecture can achieve 50% reduction in training time. In addition, an on-the-fly computation scheme is also used to reduce the storage requirement of binary and stochastic states by several hundreds of times. Then, based on the proposed approach, a 784-2252 RBM design example is developed for MNIST handwritten digit recognition dataset. Analysis shows that the VLSI design of RBM achieves significant improvement in training speed and energy efficiency as compared to CPU/GPU-based solution.",

keywords = "Restricted Boltzmann machine (RBM), VLSI, high-speed, memory reduction, neural network (NN), overlapped-scheduling, reduced-latency",

author = "Bo Yuan and Parhi, {Keshab K.}",

note = "Publisher Copyright: {\textcopyright} 2017 ACM.",

year = "2017",

month = may,

doi = "10.1145/3007193",

language = "English (US)",

volume = "13",

journal = "ACM Journal on Emerging Technologies in Computing Systems",

issn = "1550-4832",

publisher = "Association for Computing Machinery (ACM)",

number = "3",

}

TY - JOUR

T1 - VLSI architectures for the restricted Boltzmann machine

AU - Yuan, Bo

AU - Parhi, Keshab K.

PY - 2017/5

Y1 - 2017/5

N2 - Neural network (NN) systems are widely used in many important applications ranging from computer vision to speech recognition. To date, most NN systems are processed by general processing units like CPUs or GPUs. However, as the sizes of dataset and network rapidly increase, the original software implementations suffer from long training time. To overcome this problem, specialized hardware accelerators are needed to design high-speed NN systems. This article presents an efficient hardware architecture of restricted Boltzmann machine (RBM) that is an important category of NN systems. Various optimization approaches at the hardware level are performed to improve the training speed. As-soon-as-possible and overlapped-scheduling approaches are used to reduce the latency. It is shown that, compared with the flat design, the proposed RBM architecture can achieve 50% reduction in training time. In addition, an on-the-fly computation scheme is also used to reduce the storage requirement of binary and stochastic states by several hundreds of times. Then, based on the proposed approach, a 784-2252 RBM design example is developed for MNIST handwritten digit recognition dataset. Analysis shows that the VLSI design of RBM achieves significant improvement in training speed and energy efficiency as compared to CPU/GPU-based solution.

AB - Neural network (NN) systems are widely used in many important applications ranging from computer vision to speech recognition. To date, most NN systems are processed by general processing units like CPUs or GPUs. However, as the sizes of dataset and network rapidly increase, the original software implementations suffer from long training time. To overcome this problem, specialized hardware accelerators are needed to design high-speed NN systems. This article presents an efficient hardware architecture of restricted Boltzmann machine (RBM) that is an important category of NN systems. Various optimization approaches at the hardware level are performed to improve the training speed. As-soon-as-possible and overlapped-scheduling approaches are used to reduce the latency. It is shown that, compared with the flat design, the proposed RBM architecture can achieve 50% reduction in training time. In addition, an on-the-fly computation scheme is also used to reduce the storage requirement of binary and stochastic states by several hundreds of times. Then, based on the proposed approach, a 784-2252 RBM design example is developed for MNIST handwritten digit recognition dataset. Analysis shows that the VLSI design of RBM achieves significant improvement in training speed and energy efficiency as compared to CPU/GPU-based solution.

KW - Restricted Boltzmann machine (RBM)

KW - VLSI

KW - high-speed

KW - memory reduction

KW - neural network (NN)

KW - overlapped-scheduling

KW - reduced-latency

UR - http://www.scopus.com/inward/record.url?scp=85019913374&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85019913374&partnerID=8YFLogxK

U2 - 10.1145/3007193

DO - 10.1145/3007193

M3 - Article

AN - SCOPUS:85019913374

SN - 1550-4832

VL - 13

JO - ACM Journal on Emerging Technologies in Computing Systems

JF - ACM Journal on Emerging Technologies in Computing Systems

IS - 3

M1 - 35

ER -
