Hardware aware Convolutional Neural Network (CNN) training acceleration
File | Description | Size | Format
---|---|---|---
Vink-D-2023-PhD-Thesis.pdf | Thesis | 5.11 MB | Adobe PDF
Title: | Hardware aware Convolutional Neural Network (CNN) training acceleration |
Authors: | Vink, Diederik |
Item Type: | Thesis or dissertation |
Abstract: | Convolutional Neural Networks (CNNs) have emerged as a powerful deep learning tool, revolutionizing various domains such as computer vision. From self-flying drones to self-driving cars, CNNs have demonstrated their effectiveness in enabling autonomous systems. As the demand for higher accuracy increases, CNN models are growing in complexity and in the time required to train. CNN training therefore requires increasingly sophisticated hardware to handle the computational and memory requirements associated with training and inference tasks. Moreover, CNNs are becoming increasingly intricate, with ever more specialised layers [3, 4, 5, 2, 1, 6, 7], and the variety of workloads and number representations used during training is growing accordingly [8, 9, 10, 11]. One approach to addressing long training times is the use of low-precision data representations and computations. In this thesis, a novel training strategy called MuPPET (Multi-Precision Policy Enforced Training) is proposed that encompasses multiple precisions, including low-precision fixed-point representations (a minimal illustrative sketch of precision-switched training appears after this record). Beyond long training times, CNNs present a large variety of workloads throughout training, and current accelerators struggle to provide a hardware architecture that efficiently utilizes the available resources across all of them [12, 13]. Field-programmable gate arrays (FPGAs) provide a high degree of flexibility, unlocking the potential to adapt designs to the incoming workload. Caffe Barista integrates FPGAs into CNN training frameworks, providing a custom convolution kernel to accelerate training. Building on Caffe Barista, FPGPT is a complete toolflow built around a state-of-the-art, high-performance FPGA convolution unit that supports runtime workload adaptation, allowing a variety of workloads to be executed on the same compiled design. Finally, the thesis culminates in an acceleration policy that builds on MuPPET, exploiting the synergy between MuPPET and FPGPT to address their target issues more effectively than either work could individually. |
Content Version: | Open Access |
Issue Date: | Aug-2023 |
Date Awarded: | Jun-2024 |
URI: | http://hdl.handle.net/10044/1/113383 |
DOI: | https://doi.org/10.25560/113383 |
Copyright Statement: | Creative Commons Attribution NonCommercial Licence |
Supervisor: | Bouganis, Christos-Savvas |
Sponsor/Funder: | Engineering and Physical Sciences Research Council |
Department: | Electrical and Electronic Engineering |
Publisher: | Imperial College London |
Qualification Level: | Doctoral |
Qualification Name: | Doctor of Philosophy (PhD) |
Appears in Collections: | Electrical and Electronic Engineering PhD theses |
This item is licensed under a Creative Commons License
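The abstract describes MuPPET only at a high level. The snippet below is a minimal, illustrative Python/NumPy sketch of the general idea of precision-switched fixed-point training, not the thesis's actual algorithm: the `PrecisionSwitchPolicy` class, its loss-plateau trigger, the bit-width ladder `(8, 12, 16, 32)`, and the `quantise_fixed_point` helper are all hypothetical names and choices introduced here for illustration only; MuPPET's real switching criterion and quantisation scheme are defined in the thesis itself.

```python
import numpy as np


def quantise_fixed_point(x, bits, frac_bits):
    """Simulate signed fixed-point quantisation of x with the given
    total bit-width and number of fractional bits (illustrative only)."""
    scale = 2.0 ** frac_bits
    qmin = -(2 ** (bits - 1))
    qmax = 2 ** (bits - 1) - 1
    q = np.clip(np.round(x * scale), qmin, qmax)
    return q / scale


class PrecisionSwitchPolicy:
    """Hypothetical policy: walk up a ladder of bit-widths, moving to the
    next (higher) precision when the epoch loss stops improving for
    `patience` epochs. This trigger is a stand-in, not the thesis's metric."""

    def __init__(self, ladder=(8, 12, 16, 32), patience=3):
        self.ladder = list(ladder)
        self.level = 0
        self.patience = patience
        self.best_loss = float("inf")
        self.stale = 0

    @property
    def bits(self):
        return self.ladder[self.level]

    def update(self, epoch_loss):
        if epoch_loss < self.best_loss - 1e-3:
            self.best_loss = epoch_loss
            self.stale = 0
        else:
            self.stale += 1
        if self.stale >= self.patience and self.level < len(self.ladder) - 1:
            self.level += 1          # switch to the next, higher precision
            self.stale = 0
            self.best_loss = float("inf")


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    weights = rng.standard_normal((64, 3, 3, 3)).astype(np.float32)
    policy = PrecisionSwitchPolicy()
    for epoch in range(30):
        bits = policy.bits
        # Quantised weights would feed the convolution layers here;
        # the final rungs of the ladder run in full FP32 precision.
        w = weights if bits >= 32 else quantise_fixed_point(weights, bits, frac_bits=bits // 2)
        # Stand-in for a real forward/backward pass: a synthetic loss
        # that flattens out, so the switching behaviour is visible.
        epoch_loss = 1.0 / (epoch + 1) + 0.01 * rng.random()
        policy.update(epoch_loss)
        print(f"epoch {epoch:2d}  bits {bits:2d}  loss {epoch_loss:.3f}")
```

The design point this sketch is meant to convey is simply that early, noisy epochs tolerate cheap low-precision arithmetic, and a runtime policy decides when the extra fidelity of a wider representation is worth paying for; the actual trigger, bit-widths, and quantisation details used in the thesis may differ.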