Some useful optimisations for unstructured computational fluid dynamics codes on multicore and manycore architectures
File(s)
1-s2.0-S0010465518302492-main.pdf (1.39 MB)
Published version
Author(s)
Hadade, IC
Wang, Feng
Carnevale, Mauro
Di Mare, Luca
Type
Journal Article
Abstract
This paper presents a number of optimisations for improving the performance of unstructured computational fluid dynamics codes on multicore and manycore architectures such as the Intel Sandy Bridge, Broadwell and Skylake CPUs and the Intel Xeon Phi Knights Corner and Knights Landing manycore processors. We discuss and demonstrate their implementation in two distinct classes of computational kernels: face-based loops, represented by the computation of fluxes, and cell-based loops, represented by updates to state vectors. We show the importance of making efficient use of the underlying vector units in both classes of computational kernels, with special emphasis on the changes required for vectorising face-based loops and their intrinsic indirect and irregular access patterns. We demonstrate the advantage of different data layouts for cell-centred as well as face data structures, and architecture-specific optimisations for improving the performance of gather and scatter operations, which are prevalent in unstructured mesh applications. The implementation of a software prefetching strategy based on auto-tuning is also shown, along with an empirical evaluation of the importance of multithreading for in-order architectures such as Knights Corner. We explore the various memory modes available on the Intel Xeon Phi Knights Landing architecture and present an approach whereby both traditional DRAM and MCDRAM interfaces are exploited for maximum performance. We obtain significant full-application speed-ups of between 2.8X and 3X across the multicore CPUs in two-socket node configurations, 8.6X on the Intel Xeon Phi Knights Corner coprocessor and 5.6X on the Intel Xeon Phi Knights Landing processor in an unstructured finite volume CFD code representative in size and complexity of an industrial application.
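The two kernel classes named in the abstract can be illustrated with a minimal sketch in C. This is not the authors' code: the function names `face_loop` and `cell_loop`, the array names, and the simple difference used as a "flux" are all assumptions chosen for illustration. The sketch only shows why face-based loops involve indirect reads (gathers) and indirect writes (scatters), while cell-based loops access memory with unit stride.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical face-based kernel. Each face f connects cell left[f] to cell
   right[f]. The flux is a plain difference, standing in for a real flux
   function. */
void face_loop(size_t n_faces, const int *left, const int *right,
               const double *q, double *res)
{
    for (size_t f = 0; f < n_faces; ++f) {
        int l = left[f];
        int r = right[f];
        assert(l != r);            /* a face joins two distinct cells */

        double ql = q[l];          /* gather: indirect reads through l and r */
        double qr = q[r];

        double flux = 0.5 * (qr - ql);

        res[l] += flux;            /* scatter: indirect writes; two faces may */
        res[r] -= flux;            /* share a cell, which hinders vectorising */
    }
}

/* Hypothetical cell-based kernel. Accesses are contiguous and iterations are
   independent, so compilers can typically vectorise this loop directly. */
void cell_loop(size_t n_cells, double dt, const double *res, double *q)
{
    for (size_t c = 0; c < n_cells; ++c) {
        q[c] += dt * res[c];
    }
}
```

For a one-dimensional chain of three cells joined by two faces, calling `face_loop` and then `cell_loop` accumulates equal and opposite contributions on either side of each face, so the residuals sum to zero. The indirect addressing through `left` and `right` is the access pattern that the data-layout and gather/scatter optimisations described in the abstract are aimed at.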
Date Issued
2019-02-01
Date Acceptance
2018-07-05
Citation
Computer Physics Communications, 2019, 235, pp.305-323
ISSN
0010-4655
Publisher
Elsevier
Start Page
305
End Page
323
Journal / Book Title
Computer Physics Communications
Volume
235
Copyright Statement
© 2018 The Author(s). Published by Elsevier B.V. This is an open access article under the CC-BY license
(http://creativecommons.org/licenses/by/4.0/)
Subjects
Nuclear & Particles Physics
01 Mathematical Sciences
02 Physical Sciences
08 Information and Computing Sciences
Publication Status
Published
Date Publish Online
2018-07-18