Towards batch-processing on cold storage devices

File: ColdPack_HardBD___Camera_Ready.pdf
Description: Accepted version
Size: 2.25 MB
Format: Adobe PDF
Title: Towards batch-processing on cold storage devices
Authors: Hadian, A
Heinis, T
Item Type: Conference Paper
Abstract: A large share of the data in storage systems is cold, i.e., Written Once and Read Occasionally (WORO). The rapid growth of massive-scale archival and historical data increases the demand for cheap, petabyte-scale storage for such cold data. A Cold Storage Device (CSD) is a disk-based storage system designed to trade off performance for cost and power efficiency. Inevitably, the design restrictions used in CSDs result in performance limitations. These limitations are not a concern for WORO workloads. However, the very low price/performance characteristics of CSDs make them interesting for other applications too, e.g., batch processes. Such applications can be very slow on CSDs if they do not take the devices' characteristics into account. In this paper we design two strategies for data partitioning on CSDs, a crucial operation in many batch analytics tasks such as hash-join, near-duplicate detection, and data localization. We show that our strategies can efficiently use CSDs for batch processing of terabyte-scale data, accelerating data partitioning by 3.5x in our experiments.
Issue Date: 5-Jul-2018
Date of Acceptance: 16-Apr-2018
URI: http://hdl.handle.net/10044/1/57011
DOI: https://doi.org/10.1109/ICDEW.2018.00028
ISBN: 9781538663073
ISSN: 2473-3490
Publisher: IEEE
Start Page: 134
End Page: 139
Journal / Book Title: 34th IEEE International Conference on Data Engineering Workshops, ICDE Workshops 2018, Paris, France, April 16-20, 2018
Copyright Statement: © 2018 IEEE.
Sponsor/Funder: Engineering & Physical Science Research Council (E
European Research Office
Funder's Grant Number: EP/N023242/1
Conference Name: 34th International Conference on Data Engineering Workshops (ICDEW)
Keywords: Science & Technology
Computer Science, Information Systems
Computer Science, Theory & Methods
Engineering, Electrical & Electronic
Computer Science
Notes: timestamp: Tue, 31 Jul 2018 01:00:00 +0200; biburl: https://dblp.org/rec/bib/conf/icde/HadianH18; bibsource: dblp computer science bibliography, https://dblp.org
Publication Status: Published
Start Date: 2018-04-16
Finish Date: 2018-04-20
Conference Place: Paris, France
Online Publication Date: 2018-07-05
Appears in Collections: Computing
Faculty of Engineering