AutoWS: automate weights streaming in layer-wise pipelined DNN accelerators
File(s)
AutoWS_DATE2024.pdf (752.76 KB)
Accepted version
Author(s)
Yu, Z
Bouganis, CS
Type
Conference Paper
Abstract
With the great success of Deep Neural Networks (DNNs), the design of efficient hardware accelerators has triggered wide interest in the research community. Existing research explores two architectural strategies: sequential layer execution and layer-wise pipelining. While the former supports a wider range of models, the latter is favoured for its enhanced customization and efficiency. A challenge for the layer-wise pipelining architecture is its substantial demand for on-chip memory for weight storage, impeding the deployment of large-scale networks on resource-constrained devices. This paper introduces AutoWS, a pioneering memory management methodology that exploits both on-chip and off-chip memory to optimize weight storage within a layer-wise pipelining architecture, taking advantage of its static schedule. Through a comprehensive investigation of both the hardware design and the Design Space Exploration, our methodology is fully automated and enables the deployment of large-scale DNN models on resource-constrained devices, which was not possible in existing works that target layer-wise pipelining architectures. AutoWS is open-source: https://github.com/Yu-Zhewen/AutoWS.
Date Issued
2024-06-10
Date Acceptance
2023-11-14
Citation
2024 Design, Automation & Test in Europe Conference & Exhibition (DATE), 2024, pp.1-6
ISSN
1530-1591
Publisher
IEEE
Start Page
1
End Page
6
Journal / Book Title
2024 Design, Automation & Test in Europe Conference & Exhibition (DATE)
Copyright Statement
© 2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Source
DATE 2024
Publication Status
Published
Start Date
2024-03-25
Finish Date
2024-03-27
Coverage Spatial
Valencia, Spain