Accelerating the merge phase of sort-merge join
File(s)
FPL19mergejoinB.pdf (313.2 KB)
Accepted version
Author(s)
Papaphilippou, Philippos
Pirk, Holger
Luk, Wayne
Type
Conference Paper
Abstract
We present an efficient, high-throughput and scalable hardware design for accelerating the merge phase of the sort-merge join operation. Sort-merge join is one of the fundamental join algorithms and among the most frequently executed operations in relational databases. It has been the focus of several recent studies, each with different shortcomings and most focusing only on the sort phase. In this paper, we develop a new parallel sort-merge join architecture that provides data streaming functionality and high throughput. The key idea is a novel design for a co-grouping engine, with which the input data are summarised on-the-fly. In this way, the operation is performed on streams of data, which preserves the linear access pattern for faster data movement and eliminates the need for a replay buffer/cache. In contrast to related work, our approach makes no assumptions about the value distribution of the input data and applies to any input size and width. We evaluate the design on a Zynq UltraScale+ based platform and show a speedup of up to 3.1 times over the host processor, even without accelerating the sort phase, while still taking into account the data transfers to and from main memory in Linux.
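As a rough software illustration of the idea in the abstract, the sketch below merges two already-sorted key streams after collapsing each run of equal keys into a (key, count) pair. It is not the authors' hardware design: the function names `co_group` and `merge_grouped` are hypothetical, and summarising a run as a count is an assumption made here for illustration only. It shows why summarising on the fly lets each input be read once, in order, so no run of duplicates has to be replayed.

```python
from itertools import groupby


def co_group(sorted_keys):
    """Collapse each run of equal keys in a sorted stream into (key, count).

    Hypothetical helper: summarising a run as a count is an assumption,
    not necessarily what the paper's co-grouping engine emits.
    """
    for key, run in groupby(sorted_keys):
        yield key, sum(1 for _ in run)


def merge_grouped(left_keys, right_keys):
    """Merge two sorted key streams.

    Yields (key, left_count, right_count) for every key present on both
    sides. Each input is consumed strictly once, in order.
    """
    left = co_group(left_keys)
    right = co_group(right_keys)
    l = next(left, None)
    r = next(right, None)
    while l is not None and r is not None:
        if l[0] < r[0]:
            l = next(left, None)      # left key is smaller: advance left
        elif l[0] > r[0]:
            r = next(right, None)     # right key is smaller: advance right
        else:
            yield l[0], l[1], r[1]    # keys match: emit the summary
            l = next(left, None)
            r = next(right, None)


# Example inputs (made up for illustration).
matches = list(merge_grouped([1, 2, 2, 3, 5, 5], [2, 2, 2, 4, 5]))
# Number of joined tuples is the sum of left_count * right_count per key.
join_size = sum(lc * rc for _, lc, rc in matches)
print(matches, join_size)
```

Because every matching key is reduced to one pair of counts, the merge loop never needs to step back over a run of duplicates, which is the role a replay buffer would otherwise play in a plain merge join.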
Date Issued
2020-11-07
Date Acceptance
2019-05-17
Citation
2019 29th International Conference on Field Programmable Logic and Applications (FPL), 2020
Publisher
IEEE
Journal / Book Title
2019 29th International Conference on Field Programmable Logic and Applications (FPL)
Copyright Statement
© 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Sponsor
Dunnhumby Limited
Engineering & Physical Science Research Council (E
Engineering & Physical Science Research Council (E
Engineering & Physical Science Research Council (EPSRC)
Grant Number
PO: 250130008264
516075101 (EP/N031768/1)
PO 20131167
EP/P010040/1
Source
The International Conference on Field-Programmable Logic and Applications (FPL) 2019
Subjects
Sort-merge join
stream processing
operators
database acceleration
high-throughput
FPGA
MPSoC
big data
Publication Status
Published
Start Date
2019-09-09
Finish Date
2019-09-13
Coverage Spatial
Barcelona, Spain