Automated methodologies for mapping convolutional neural networks on reconfigurable hardware

File: Venieris-S-2019-PhD-Thesis.pdf
Description: Thesis
Size: 3.56 MB
Format: Adobe PDF
Title: Automated methodologies for mapping convolutional neural networks on reconfigurable hardware
Authors: Venieris, Stylianos
Item Type: Thesis or dissertation
Abstract: Convolutional neural networks (ConvNets) are a family of machine learning models which have demonstrated state-of-the-art performance in a wide range of Artificial Intelligence (AI) tasks. To obtain accuracy gains, ConvNets have been typically enhanced either by designing deeper and wider models with a larger number of trainable parameters, or by designing novel components that introduce irregular dataflow. Both approaches are computationally expensive and pose challenges with respect to the deployment of ConvNets in real-life applications. ConvNet-enabled applications are also characterised by a variability across performance requirements, spanning from throughput-driven to latency-critical systems. This property calls for a model- and performance-aware design of computing systems in order to meet the diverse application-level specifications. Furthermore, in emerging complex AI systems, such as autonomous vehicles, ConvNets constitute mere building blocks of the overall system, leading to multi-ConvNet settings. Upon deployment, the different models have to run concurrently, meet their respective performance constraints and share the underlying resources. This thesis proposes design methodologies and hardware architectures targeting field-programmable gate arrays (FPGAs) that address the aforementioned challenges, aiming for the high-performance deployment of ConvNets.
The contributions of this work include: an analytical model for representing both ConvNet workloads and hardware mappings, together with a ConvNet-to-FPGA toolflow for the automated generation of ConvNet accelerators; a latency-driven methodology for the generation of latency-optimised hardware mappings which meet the stringent response-time constraints of modern ConvNet applications; novel architectural optimisations for state-of-the-art ConvNets with irregular connectivity, together with the corresponding mapping methodology; and a toolflow for the parallel deployment of multiple ConvNets on a single FPGA, enabling emerging multi-ConvNet applications. By applying the above methodologies to real-life workloads, it is shown that significant performance gains are achieved over existing state-of-the-art implementations on FPGAs and GPUs, enabling in this way the automated generation of ConvNet accelerators that are tailored to both the ConvNet-FPGA pair and the target performance requirements in single- and multi-ConvNet settings.
Content Version: Open Access
Issue Date: Sep-2018
Date Awarded: Feb-2019
URI: http://hdl.handle.net/10044/1/68017
DOI: https://doi.org/10.25560/68017
Copyright Statement: Creative Commons Attribution Non-Commercial No Derivatives licence
Supervisor: Bouganis, Christos-Savvas
Sponsor/Funder: Engineering and Physical Sciences Research Council
Funder's Grant Number: 1507723
Department: Electrical and Electronic Engineering
Publisher: Imperial College London
Qualification Level: Doctoral
Qualification Name: Doctor of Philosophy (PhD)
Appears in Collections: Electrical and Electronic Engineering PhD theses