NeuPart: Using Analytical Models to Drive Energy-Efficient Partitioning of CNN Computations on Cloud-Connected Mobile Clients

Susmita Dey Manasi, Farhana Sharmin Snigdha, Sachin S. Sapatnekar

Research output: Contribution to journal, Article, peer-review

Abstract

Data processing on convolutional neural networks (CNNs) places a heavy burden on energy-constrained mobile platforms. This article optimizes energy on a mobile client by partitioning CNN computations between in situ processing on the client and offloaded computations in the cloud. A new analytical CNN energy model is formulated, capturing all major components of the in situ computation, for ASIC-based deep learning accelerators. The model is benchmarked against measured silicon data. The analytical framework is used to determine the optimal energy partition point between the client and the cloud at runtime. On standard CNN topologies, partitioned computation is demonstrated to provide significant energy savings on the client over a fully cloud-based computation or fully in situ computation. For example, at 80 Mbps effective bit rate and 0.78 W transmission power, the optimal partition for AlexNet [SqueezeNet] saves up to 52.4% [73.4%] energy over a fully cloud-based computation and 27.3% [28.8%] energy over a fully in situ computation.
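The partitioning idea in the abstract can be illustrated with a minimal sketch. It is not the article's analytical model: it assumes the client energy for a partition after layer k is the in situ energy of the first k layers plus the energy to transmit that layer's output, that transmission energy is power times bits divided by bit rate, and that a fully in situ run transmits nothing. The function names and all per-layer numbers are hypothetical; only the 80 Mbps bit rate and 0.78 W transmission power come from the abstract.

```python
from typing import Sequence, Tuple


def transmission_energy_j(bits: float, bit_rate_bps: float, tx_power_w: float) -> float:
    """Energy (J) to send `bits` at `bit_rate_bps` while drawing `tx_power_w`."""
    return tx_power_w * bits / bit_rate_bps


def best_partition(
    layer_energy_j: Sequence[float],  # hypothetical in situ energy of each layer (J)
    output_bits: Sequence[float],     # hypothetical output size of each layer (bits)
    input_bits: float,                # size of the raw input (bits)
    bit_rate_bps: float,
    tx_power_w: float,
) -> Tuple[int, float]:
    """Return (k, client_energy_j), where the first k layers run on the client.

    k = 0 is fully cloud-based (the raw input is transmitted).
    k = len(layer_energy_j) is fully in situ (assumed: nothing is transmitted).
    """
    n = len(layer_energy_j)
    # Fully cloud-based: the client only pays to send the input.
    best_k = 0
    best_e = transmission_energy_j(input_bits, bit_rate_bps, tx_power_w)
    compute = 0.0
    for k in range(1, n + 1):
        compute += layer_energy_j[k - 1]  # cumulative in situ energy
        if k < n:
            total = compute + transmission_energy_j(
                output_bits[k - 1], bit_rate_bps, tx_power_w
            )
        else:
            total = compute  # fully in situ
        if total < best_e:
            best_k, best_e = k, total
    return best_k, best_e


# Hypothetical three-layer network; values are illustrative only.
k, energy = best_partition(
    layer_energy_j=[0.002, 0.003, 0.004],
    output_bits=[4e5, 1e5, 8e3],
    input_bits=1.2e6,
    bit_rate_bps=80e6,  # 80 Mbps effective bit rate (from the abstract)
    tx_power_w=0.78,    # 0.78 W transmission power (from the abstract)
)
print(k, energy)
```

With these illustrative numbers, an intermediate partition costs the client less than either extreme, which is the qualitative behavior the abstract reports. A real evaluation would replace the hypothetical per-layer values with the output of an energy model such as the one the article formulates.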

Original language: English (US)
Article number: 9113336
Pages (from-to): 1844-1857
Number of pages: 14
Journal: IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Volume: 28
Issue number: 8
State: Published - Aug 2020

Bibliographical note

Funding Information:
Manuscript received November 24, 2019; revised March 12, 2020; accepted April 28, 2020. Date of publication June 10, 2020; date of current version July 30, 2020. This work was supported in part by the National Science Foundation (NSF) under Award CCF-1763761. (Corresponding author: Susmita Dey Manasi.) The authors are with the Department of Electrical and Computer Engineering, University of Minnesota Twin Cities, Minneapolis, MN 55455 USA (e-mail: manas018@umn.edu; sharm304@umn.edu; sachin@umn.edu).

Keywords

  • Computation partitioning
  • convolutional neural networks (CNNs)
  • embedded deep learning
  • energy modeling
  • hardware acceleration
