Optimizing DNN accelerator compression using tolerable accuracy loss
File(s)
fpt24zq23.pdf (293.76 KB)
Accepted version
Author(s)
Que, Zhiqiang
Zhao, Anyan
Coutinho, Jose GF
Guo, Ce
Luk, Wayne
Type
Conference Paper
Abstract
This paper proposes a novel nested-loop optimization approach that uses the maximum tolerable model accuracy loss as a hyperparameter to improve DNN compression for hardware accelerators. The process combines local inner-loop optimization with global outer-loop optimization driven by bottom-up feedback. Our multi-level approach encompasses optimization tasks distributed across different computational spaces, such as software and hardware (High-Level Synthesis, HLS). As an example of an optimization task, we introduce and detail the mixed-precision Quantization Heuristic Search (QHS), which adjusts numerical representations to reduce hardware complexity while keeping accuracy within user-defined tolerances. This approach offers a new perspective on model compression, leading to efficient and effective DNN hardware accelerators.
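The idea of searching per-layer precision under an accuracy-loss budget can be illustrated with a minimal sketch. This is not the paper's QHS algorithm, whose details are not given in this record; it is a simple greedy heuristic written under stated assumptions. The function name `greedy_bitwidth_search`, the layer names, the sensitivity values, and the synthetic accuracy model `toy_evaluate` are all hypothetical and stand in for a real quantize-and-evaluate step. Accuracy is expressed in integer hundredths of a percentage point to keep the arithmetic exact.

```python
from typing import Callable, Dict, List

# Hypothetical per-layer sensitivities: accuracy lost (in hundredths of a
# percentage point) for every bit removed below 8 bits. Purely illustrative.
SENSITIVITY = {"conv1": 50, "conv2": 10, "fc": 20}
BASELINE_ACC = 9000  # 90.00 %, an assumed full-precision accuracy


def toy_evaluate(bits: Dict[str, int]) -> int:
    """Synthetic stand-in for 'quantize the model and measure accuracy'."""
    penalty = sum(SENSITIVITY[name] * max(0, 8 - b) for name, b in bits.items())
    return BASELINE_ACC - penalty


def greedy_bitwidth_search(
    layers: List[str],
    evaluate: Callable[[Dict[str, int]], int],
    baseline_acc: int,
    max_loss: int,
    start_bits: int = 16,
    min_bits: int = 2,
) -> Dict[str, int]:
    """Lower each layer's bit width one bit at a time, keeping a step only
    if the total accuracy loss stays within max_loss (the tolerance)."""
    bits = {name: start_bits for name in layers}
    for name in layers:
        while bits[name] > min_bits:
            trial = dict(bits)
            trial[name] = bits[name] - 1
            if baseline_acc - evaluate(trial) <= max_loss:
                bits = trial  # still within tolerance: accept the narrower width
            else:
                break  # tolerance exceeded: keep the last accepted width
    return bits


def total_bits(bits: Dict[str, int]) -> int:
    """Crude proxy for hardware cost: the sum of per-layer bit widths."""
    return sum(bits.values())


result = greedy_bitwidth_search(
    ["conv1", "conv2", "fc"], toy_evaluate, BASELINE_ACC, max_loss=125
)
print(result, total_bits(result), BASELINE_ACC - toy_evaluate(result))
```

In this toy setting the tolerance (`max_loss`) plays the role of the hyperparameter described in the abstract: raising it lets the search accept narrower bit widths. A layer-by-layer greedy pass spends the budget on whichever layer it visits first, which is one reason a real flow would pair such a local search with an outer loop that uses feedback from hardware estimates.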
Date Issued
2025-08-18
Date Acceptance
2024-12-01
Citation
2024 International Conference on Field Programmable Technology (ICFPT), 2025, pp.1-2
ISSN
2837-0430
Publisher
IEEE
Start Page
1
End Page
2
Journal / Book Title
2024 International Conference on Field Programmable Technology (ICFPT)
Copyright Statement
Copyright © 2024, IEEE. This is the author’s accepted manuscript made available under a CC-BY licence in accordance with Imperial’s Research Publications Open Access policy (www.imperial.ac.uk/oa-policy)
License URL
Source
2024 International Conference on Field Programmable Technology (ICFPT)
Publication Status
Published
Start Date
2024-12-10
Finish Date
2024-12-12
Coverage Spatial
Sydney, Australia