DeepOpt: Optimized Scheduling of CNN Workloads for ASIC-based Systolic Deep Learning Accelerators

Susmita Dey Manasi, Sachin S. Sapatnekar

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

9 Scopus citations

Abstract

Scheduling computations in each layer of a convolutional neural network on a deep learning (DL) accelerator involves a large number of choices, each of which involves a different set of memory reuse and memory access patterns. Since memory transactions are the primary bottleneck in DL acceleration, these choices can strongly impact the energy and throughput of the accelerator. This work proposes an optimization framework, DeepOpt, for general ASIC-based systolic hardware accelerators for layer-specific and hardware-specific scheduling strategy for each layer of a CNN to optimize energy and latency. Optimal hardware allocation significantly reduces execution cost as compared to generic static hardware resource allocation, e.g., improvements of up to 50× in the energy-delay product for VGG-16 and 41× for GoogleNet-v1.

Original language: English (US)
Title of host publication: Proceedings of the 26th Asia and South Pacific Design Automation Conference, ASP-DAC 2021
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 235-241
Number of pages: 7
ISBN (Electronic): 9781450379991
State: Published - Jan 18 2021
Event: 26th Asia and South Pacific Design Automation Conference, ASP-DAC 2021 - Virtual, Online, Japan
Duration: Jan 18 2021 → Jan 21 2021

Publication series

Name: Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC

Conference

Conference: 26th Asia and South Pacific Design Automation Conference, ASP-DAC 2021
Country/Territory: Japan
City: Virtual, Online
Period: 1/18/21 → 1/21/21

Bibliographical note

Funding Information:
We thank Z. Wang and A. B. Kahng (UCSD) for helping in modeling SRAM area. This work is supported in part by NSF (CCF-1763761).

Publisher Copyright:
© 2021 Association for Computing Machinery.

Keywords

  • CNN
  • hardware accelerator
  • scheduling
