Data Reuse and Parallelism in Hardware Compilation
File | Description | Size | Format |
---|---|---|---|
Liu-Q-2009-PhD-Thesis.pdf | | 1.25 MB | Adobe PDF |
Title: | Data Reuse and Parallelism in Hardware Compilation |
Authors: | Liu, Qiang |
Item Type: | Thesis or dissertation |
Abstract: | This thesis presents a methodology for automatically determining, at compile time, a data memory organisation that exploits data reuse and loop-level parallelization, in order to achieve high-performance, low-power designs for data-dominated applications. Moore’s Law has enabled more and more heterogeneous components to be integrated on a single chip. However, extracting maximum performance from these hardware resources efficiently remains a challenge. Unlike previous approaches, which mainly focus on making efficient use of computational resources, our focus is on data memory organisation and input-output bandwidth considerations, which are typical stumbling blocks of existing hardware compilation schemes. To optimize accesses to large off-chip memories, an approach is adopted and formalized to identify data reuse opportunities in local scratch-pad memory. An approach is presented for evaluating different data reuse options in terms of the memory space required to buffer reused data and the execution time for loading the data into the local memories. Determining the data reuse design option that consumes the least power or performs operations quickest with respect to a memory constraint is an NP-hard problem. In this work, the problem of data reuse exploration for low-power designs is formulated as a Multiple-Choice Knapsack problem. Together with a proposed power model, the problem is solved efficiently. An integer geometric programming framework is presented for exploring data reuse and loop-level parallelization within a single step. The objective is to find the design that achieves the shortest execution time for an application. We describe our approaches based on formal optimization techniques, and present some results from applying these approaches to several benchmarks that show the advantages of optimizing data memory organisation and of exposing the interaction between data memory system design and parallelism extraction to the compiler. |
Issue Date: | Nov-2008 |
Date Awarded: | Mar-2009 |
URI: | http://hdl.handle.net/10044/1/4370 |
DOI: | https://doi.org/10.25560/4370 |
Supervisor: | Cheung, Peter; Constantinides, George Anthony; Masselos, Konstantinos |
Department: | Electrical and Electronic Engineering |
Publisher: | Imperial College London |
Qualification Level: | Doctoral |
Qualification Name: | Doctor of Philosophy (PhD) |
Appears in Collections: | Electrical and Electronic Engineering PhD theses |
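
The abstract casts low-power data reuse exploration as a Multiple-Choice Knapsack problem: each array reference offers several reuse options, exactly one option is chosen per reference, and the total buffer memory must fit a budget. The following is a minimal sketch of that problem shape only, solved with a standard pseudo-polynomial dynamic program. It is not the thesis's algorithm or power model; the name `solve_mckp`, the `Option` tuple, and the memory and power figures are all hypothetical.

```python
from typing import List, Optional, Tuple

# (memory cost, power cost) of one reuse option; units are hypothetical.
# Memory costs are assumed to be non-negative integers.
Option = Tuple[int, float]


def solve_mckp(
    classes: List[List[Option]], capacity: int
) -> Optional[Tuple[float, List[int]]]:
    """Pick exactly one option per class, minimising total power
    while keeping total memory within `capacity`.

    Returns (total power, chosen option index per class), or None
    if no selection fits.
    """
    inf = float("inf")
    # best[m] = (lowest power using exactly m memory units, chosen indices)
    best = [(inf, [])] * (capacity + 1)
    best[0] = (0.0, [])
    for options in classes:
        nxt = [(inf, [])] * (capacity + 1)
        for used, (power, picks) in enumerate(best):
            if power == inf:
                continue  # this memory level is unreachable so far
            for idx, (mem, cost) in enumerate(options):
                total = used + mem
                if total <= capacity and power + cost < nxt[total][0]:
                    nxt[total] = (power + cost, picks + [idx])
        best = nxt
    power, picks = min(best, key=lambda entry: entry[0])
    return None if power == inf else (power, picks)


# Two hypothetical array references, three reuse options each:
# no buffering, a small buffer, and a large buffer.
example_classes = [
    [(0, 10.0), (4, 6.0), (8, 3.0)],
    [(0, 8.0), (2, 4.0), (6, 2.0)],
]
print(solve_mckp(example_classes, capacity=8))  # (10.0, [1, 1])
```

With a budget of 8 memory units, the sketch picks the small buffer for both references, since the large buffer for either one leaves too little room for the other. The run time grows with the memory budget times the total number of options, which is practical only when the budget is expressed in coarse units.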