Authors: Montgomerie-Corcoran, Alexander; Yu, Zhewen; Bouganis, Christos-Savvas
Dates: 2024-03-05; 2022-02-13
Citation: 2022 32nd International Conference on Field-Programmable Logic and Applications (FPL), 2022, pp. 418-424
ISBN: 978-1-6654-7390-3
ISSN: 1946-1488
URI: http://hdl.handle.net/10044/1/109725

Abstract: Significant effort has been placed on the development of toolflows that map Convolutional Neural Network (CNN) models to Field Programmable Gate Arrays (FPGAs) with the aim of automating the production of high performance designs for a diverse set of applications. However, within these toolflows, the problem of finding an optimal mapping is often overlooked, with the expectation that the end user will tune their generated hardware for their desired platform. This is particularly prominent within Streaming Architecture toolflows, where there is a large design space to be explored. In this work, we establish the framework SAMO: a Streaming Architecture Mapping Optimiser. SAMO exploits the structure of CNN models and the common features that exist in Streaming Architectures, and casts the mapping optimisation problem under a unified methodology. Furthermore, SAMO explicitly explores the re-configurability property of FPGAs, allowing the methodology to overcome mapping limitations imposed by certain toolflows under resource-constrained scenarios, as well as improve on the achievable throughput. Three optimisation methods (Brute-Force, Simulated Annealing and Rule-Based) have been developed in order to generate valid, high performance designs for a range of target platforms and CNN models. Results show that SAMO-optimised designs can achieve 4x to 20x better performance compared to existing hand-tuned designs. The SAMO framework is open-source: https://github.com/AlexMontgomerie/samo.

Rights: Copyright © 2023 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Subjects: Computer Science; Computer Science, Hardware & Architecture; Computer Science, Software Engineering; Computer Science, Theory & Methods; neural network accelerator; optimisation; Science & Technology; streaming architecture; Technology
Title: SAMO: optimised mapping of convolutional neural networks to streaming architectures
Type: Conference Paper
DOI: https://www.dx.doi.org/10.1109/FPL57034.2022.00069
Publisher URL: https://ieeexplore.ieee.org/document/10035202