As neural networks continue to infiltrate diverse application domains, computing will begin to move out of the cloud and onto edge devices, necessitating fast, reliable, and low-power (LP) solutions. To meet these requirements, we propose a time-domain core using one-shot delay measurements and a lightweight post-processing technique, dynamic threshold error correction (DTEC). This design differs from traditional digital implementations in that it uses the delay accumulated through a simple inverter chain distributed through an SRAM array to intrinsically compute resource-intensive multiply-accumulate (MAC) operations. Implemented in 65-nm LP CMOS, the core achieves an energy efficiency of 104.8 TOp/s/W at 0.7 V with 3b resolution, corresponding to 19.1 fJ/MAC.
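The two efficiency figures quoted above can be related with a short calculation. The sketch below assumes that one MAC is counted as two operations (one multiply and one add); the abstract does not state this convention, but it is a common one and it reproduces the quoted energy per MAC. The function and constant names are illustrative only.

```python
# Assumption (not stated in the abstract): one MAC = 2 operations
# (one multiply + one accumulate).
OPS_PER_MAC = 2


def energy_per_mac_fj(tops_per_watt: float, ops_per_mac: int = OPS_PER_MAC) -> float:
    """Convert an efficiency in TOp/s/W into an energy per MAC in femtojoules."""
    # Op/s/W is the same unit as Op/J, so this is operations per joule.
    ops_per_joule = tops_per_watt * 1e12
    joules_per_op = 1.0 / ops_per_joule
    # Scale by operations per MAC and convert joules to femtojoules.
    return joules_per_op * ops_per_mac * 1e15


e_mac = energy_per_mac_fj(104.8)
print(f"{e_mac:.1f} fJ/MAC")  # prints "19.1 fJ/MAC"
```

Under this assumption, 104.8 TOp/s/W works out to roughly 9.5 fJ per operation, which is consistent with the reported 19.1 fJ/MAC.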
Bibliographical note
Funding Information:
Manuscript received January 14, 2019; revised March 11, 2019 and April 25, 2019; accepted April 29, 2019. Date of publication May 20, 2019; date of current version September 24, 2019. This paper was approved by Guest Editor Chen-Hao Chang. This work was supported in part by the National Science Foundation under Award CCF-1763761 and in part by IGERT under Grant DGE-1069104. (Corresponding author: Chris H. Kim.) The authors are with the Electrical and Computer Engineering Department, University of Minnesota, Minneapolis, MN 55455 USA (e-mail: firstname.lastname@example.org; email@example.com).
- Machine learning (ML)
- neuromorphic computing
- time-domain computing
- time-to-digital converter (TDC)