Achieving Low-overhead Fault Tolerance for Parallel Accelerators with Dynamic Partial Reconfiguration
File(s)
fpl14.pdf (147.91 KB)
Published version
Author(s)
Davis, J
Cheung, PYK
Type
Conference Paper
Abstract
While allowing for the fabrication of increasingly complex and efficient circuitry, transistor shrinkage and count-per-device expansion have major downsides: chiefly increased variation, degradation and fault susceptibility. For this reason, design-time consideration of fault tolerance will have to be given to increasing numbers of electronic systems in the future to ensure yields, reliabilities and lifetimes remain acceptably high. Many commonly implemented operators are suited to modification resulting in datapath error detection capabilities with low area overheads. FPGAs are uniquely placed to allow further area savings to be made when incorporating fault avoidance mechanisms thanks to their dynamic reconfigurability. In this paper, we examine the practicalities and costs involved in implementing hardware-software fault tolerance on a test platform: a parallel matrix multiplication accelerator in hardware, with controller in software, running on a Xilinx Zynq system-on-chip. A combination of 'bolt-on' error detection logic and software-triggered routing reconfiguration serves to provide low-overhead datapath fault tolerance at runtime. Rapid yet accurate fault diagnoses along with low hardware (area), software (configuration storage) and performance penalties are achieved.
Date Issued
2014-10-20
Date Acceptance
2014-06-03
Citation
2014, pp. 1-6
ISBN
978-3-00-044645-0
ISSN
1946-147X
Publisher
IEEE
Start Page
1
End Page
6
Journal / Book Title
Proceedings of the 2014 24th International Conference on Field Programmable Logic and Applications (FPL)
Copyright Statement
© 2014 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Identifier
http://ieeexplore.ieee.org/xpl/articleDetails.jsp?&arnumber=6927447
Source
International Conference on Field-programmable Logic and Applications (FPL) 2014
Publication Status
Published
Start Date
2014-09-02
Finish Date
2014-09-04
Coverage Spatial
Munich, Germany
Date Publish Online
2014-10-20