Automated methodologies for mapping convolutional neural networks on reconfigurable hardware
File | Description | Size | Format |
---|---|---|---|
Venieris-S-2019-PhD-Thesis.pdf | Thesis | 3.56 MB | Adobe PDF |
Title: | Automated methodologies for mapping convolutional neural networks on reconfigurable hardware |
Authors: | Venieris, Stylianos |
Item Type: | Thesis or dissertation |
Abstract: | Convolutional neural networks (ConvNets) are a family of machine learning models which have demonstrated state-of-the-art performance in a wide range of Artificial Intelligence (AI) tasks. To obtain accuracy gains, ConvNets have typically been enhanced either by designing deeper and wider models with a larger number of trainable parameters, or by designing novel components that introduce irregular dataflow. Both approaches are computationally expensive and pose challenges for the deployment of ConvNets in real-life applications. ConvNet-enabled applications are also characterised by variability in performance requirements, spanning from throughput-driven to latency-critical systems. This property calls for a model- and performance-aware design of computing systems in order to meet the diverse application-level specifications. Furthermore, in emerging complex AI systems, such as autonomous vehicles, ConvNets constitute mere building blocks of the overall system, leading to multi-ConvNet settings. Upon deployment, the different models have to run concurrently, meet their respective performance constraints and share the underlying resources. This thesis proposes design methodologies and hardware architectures targeting field-programmable gate arrays (FPGAs) that address the aforementioned challenges, aiming for the high-performance deployment of ConvNets.
The contributions of this work include: an analytical model for representing both ConvNet workloads and hardware mappings, together with a ConvNet-to-FPGA toolflow for the automated generation of ConvNet accelerators; a latency-driven methodology for the generation of latency-optimised hardware mappings which meet the stringent response-time constraints of modern ConvNet applications; novel architectural optimisations for state-of-the-art ConvNets with irregular connectivity, together with the corresponding mapping methodology; and a toolflow for the parallel deployment of multiple ConvNets on a single FPGA, enabling emerging multi-ConvNet applications. By applying the above methodologies to real-life workloads, it is shown that significant performance gains are achieved over existing state-of-the-art implementations on FPGAs and GPUs, thereby enabling the automated generation of ConvNet accelerators that are tailored to both the ConvNet-FPGA pair and the target performance requirements in single- and multi-ConvNet settings. |
Content Version: | Open Access |
Issue Date: | Sep-2018 |
Date Awarded: | Feb-2019 |
URI: | http://hdl.handle.net/10044/1/68017 |
DOI: | https://doi.org/10.25560/68017 |
Copyright Statement: | Creative Commons Attribution Non-Commercial No Derivatives licence |
Supervisor: | Bouganis, Christos-Savvas |
Sponsor/Funder: | Engineering and Physical Sciences Research Council |
Funder's Grant Number: | 1507723 |
Department: | Electrical and Electronic Engineering |
Publisher: | Imperial College London |
Qualification Level: | Doctoral |
Qualification Name: | Doctor of Philosophy (PhD) |
Appears in Collections: | Electrical and Electronic Engineering PhD theses |