Hardware aware Convolutional Neural Network (CNN) training acceleration

File: Vink-D-2023-PhD-Thesis.pdf
Description: Thesis
Size: 5.11 MB
Format: Adobe PDF
Title: Hardware aware Convolutional Neural Network (CNN) training acceleration
Authors: Vink, Diederik
Item Type: Thesis or dissertation
Abstract: Convolutional Neural Networks (CNNs) have emerged as a powerful deep learning tool, revolutionizing domains such as computer vision. From self-flying drones to self-driving cars, CNNs have demonstrated their effectiveness in enabling autonomous systems. As the demand for higher accuracy grows, CNN models are increasing in complexity and in the time required to train them, and training demands increasingly sophisticated hardware to handle the associated computational and memory requirements. At the same time, CNNs are becoming more intricate, with increasingly specialized layers [3, 4, 5, 2, 1, 6, 7], and the variety of workloads and number representations is growing accordingly [8, 9, 10, 11]. One approach to addressing long training times is the use of low-precision data representations and computations. In this thesis, a novel training strategy called MuPPET (Multi-Precision Policy Enforced Training) is proposed that encompasses multiple precisions, including low-precision fixed-point representations. Beyond long training times, CNNs present a large variety of workloads throughout training, and current accelerators struggle to find a hardware architecture that efficiently utilizes the available resources across all of them [12, 13]. Field-programmable gate arrays (FPGAs) provide a high degree of flexibility, unlocking the potential to adapt designs to the incoming workload. Caffe Barista integrates FPGAs into CNN training frameworks, providing a custom convolution kernel to accelerate training. Building on Caffe Barista, FPGPT is a complete toolflow centred on a state-of-the-art, high-performance FPGA convolution unit; this unit supports runtime workload adaptation, allowing a variety of workloads to execute on the same compiled design. Finally, the thesis culminates in an acceleration policy that builds on MuPPET and exploits the synergy between these two works, addressing the issues targeted by MuPPET and FPGPT more effectively than either could individually.
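
The abstract characterizes MuPPET as training that moves through multiple precisions, including low-precision fixed-point formats. As a rough illustration only, the sketch below simulates precision-switched CNN training in PyTorch with a hard-coded bit-width schedule and a simple round-to-nearest fixed-point quantizer; the schedule, the quantization scheme, and the toy SmallCNN model are assumptions made for this example and do not reproduce MuPPET's actual precision-switching policy, which is defined in the thesis.

# Minimal sketch of precision-switched CNN training (PyTorch).
# The bit-width schedule and quantization scheme are illustrative assumptions,
# not MuPPET's actual policy.
import torch
import torch.nn as nn
import torch.nn.functional as F

def quantize_fixed_point(t, bits, frac_bits):
    # Round to the nearest value representable in a signed fixed-point
    # format with `bits` total bits and `frac_bits` fractional bits.
    scale = 2 ** frac_bits
    qmax = 2 ** (bits - 1) - 1
    return torch.clamp(torch.round(t * scale), -qmax - 1, qmax) / scale

class SmallCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(1, 8, 3, padding=1)
        self.fc = nn.Linear(8 * 28 * 28, 10)
    def forward(self, x):
        x = F.relu(self.conv(x))
        return self.fc(x.flatten(1))

model = SmallCNN()
opt = torch.optim.SGD(model.parameters(), lr=0.01)

# Illustrative schedule: start at 8-bit fixed point, switch to 16-bit mid-training.
schedule = {0: (8, 4), 5: (16, 8)}  # epoch -> (total bits, fractional bits)
bits, frac = schedule[0]

for epoch in range(10):
    if epoch in schedule:
        bits, frac = schedule[epoch]
    # Random batch standing in for a real dataset.
    x, y = torch.randn(32, 1, 28, 28), torch.randint(0, 10, (32,))
    # Quantize weights in place to the current format (naive simulation,
    # with no full-precision master copy of the weights).
    with torch.no_grad():
        for p in model.parameters():
            p.copy_(quantize_fixed_point(p, bits, frac))
    loss = F.cross_entropy(model(x), y)
    opt.zero_grad()
    loss.backward()
    opt.step()

In this toy version the switch point is fixed in advance; the thesis instead derives the switching decision from a policy evaluated during training.
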
Content Version: Open Access
Issue Date: Aug-2023
Date Awarded: Jun-2024
URI: http://hdl.handle.net/10044/1/113383
DOI: https://doi.org/10.25560/113383
Copyright Statement: Creative Commons Attribution NonCommercial Licence
Supervisor: Bouganis, Christos-Savvas
Sponsor/Funder: Engineering and Physical Sciences Research Council
Department: Electrical and Electronic Engineering
Publisher: Imperial College London
Qualification Level: Doctoral
Qualification Name: Doctor of Philosophy (PhD)
Appears in Collections: Electrical and Electronic Engineering PhD theses

This item is licensed under a Creative Commons License.