Architecture and performance of Devito, a system for automated stencil computation
File(s)
1807.03032v1.pdf (1.76 MB)
Working paper
Author(s)
Type
Working Paper
Abstract
Stencil computations are a key part of many high-performance computing applications, such as image processing, convolutional neural networks, and finite-difference solvers for partial differential equations. Devito is a framework that generates highly optimized code from symbolic equations expressed in Python; it specializes in, but is not limited to, affine (stencil) codes. The lowering process, from mathematical equations down to C++ code, is performed by the Devito compiler through a series of intermediate representations. Several performance optimizations are introduced, including advanced common sub-expression elimination, tiling, and parallelization. Some of these are obtained through well-established stencil optimizers integrated in the back-end of the Devito compiler. This paper presents the architecture of the Devito compiler and the performance optimizations applied when generating code. The effectiveness of these optimizations is demonstrated using operators drawn from seismic imaging applications.
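For readers unfamiliar with the terms used in the abstract, the following is a minimal hand-written sketch of a stencil computation and of tiling (loop blocking). It is not Devito's API and not code generated by Devito; the function names `laplacian_naive` and `laplacian_tiled`, the grid size, and the tile size are assumptions chosen purely for illustration. It applies a 5-point finite-difference Laplacian to a small 2D grid, once with plain nested loops and once with the iteration space split into blocks, and both versions are expected to produce identical values.

```python
# Illustrative sketch only: a hand-written 5-point Laplacian stencil.
# This is NOT Devito output; all names and sizes here are hypothetical.


def laplacian_naive(u, h):
    """Apply the 5-point finite-difference Laplacian at the interior points of u."""
    ny, nx = len(u), len(u[0])
    out = [[0.0] * nx for _ in range(ny)]
    inv_h2 = 1.0 / (h * h)  # loop-invariant factor computed once and reused
    for i in range(1, ny - 1):
        for j in range(1, nx - 1):
            out[i][j] = (u[i - 1][j] + u[i + 1][j]
                         + u[i][j - 1] + u[i][j + 1]
                         - 4.0 * u[i][j]) * inv_h2
    return out


def laplacian_tiled(u, h, tile=4):
    """Same stencil, with the interior iteration space split into tile x tile blocks."""
    ny, nx = len(u), len(u[0])
    out = [[0.0] * nx for _ in range(ny)]
    inv_h2 = 1.0 / (h * h)
    # Outer loops walk over blocks; inner loops walk over points within a block.
    for ii in range(1, ny - 1, tile):
        for jj in range(1, nx - 1, tile):
            for i in range(ii, min(ii + tile, ny - 1)):
                for j in range(jj, min(jj + tile, nx - 1)):
                    out[i][j] = (u[i - 1][j] + u[i + 1][j]
                                 + u[i][j - 1] + u[i][j + 1]
                                 - 4.0 * u[i][j]) * inv_h2
    return out


# Test field u(x, y) = x^2 + y^2, whose exact Laplacian is 4 everywhere.
h = 0.5
n = 10
u = [[(i * h) ** 2 + (j * h) ** 2 for j in range(n)] for i in range(n)]

naive = laplacian_naive(u, h)
tiled = laplacian_tiled(u, h, tile=4)
```

Tiling changes only the order in which grid points are visited, not the arithmetic performed at each point, which is why the two results can be compared for equality. In compiled code the reordering is intended to improve cache reuse; this pure-Python sketch shows the loop structure only and says nothing about performance.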
Copyright Statement
© The Author(s).
Sponsor
Engineering & Physical Sciences Research Council (EPSRC)
Intel Corporation
Engineering & Physical Sciences Research Council (EPSRC)
Identifier
http://arxiv.org/abs/1807.03032v1
Grant Number
EP/L000407/1
PESCI Donation
EP/R029423/1
Subjects
cs.MS
65N06, 68N20
Notes
Submitted to SIAM Journal on Scientific Computing