Java2SDG: Stateful big data processing for the masses

File: Java2SDG_ICDE16_demo.pdf
Description: Accepted version
Size: 453.26 kB
Format: Adobe PDF
Title: Java2SDG: Stateful big data processing for the masses
Authors: Fernandez, RC
Garefalakis, P
Pietzuch, P
Item Type: Conference Paper
Abstract: Big data processing is no longer restricted to specially-trained engineers. Instead, domain experts, data scientists and data users all want to benefit from applying data mining and machine learning algorithms at scale. Programming models are a considerable obstacle to this “democratisation of big data”: current scalable big data processing platforms such as Spark, Naiad and Flink require users to learn custom functional or declarative programming models, which differ fundamentally from popular languages such as Java, Matlab, Python or C++. An open challenge is how to provide a big data programming model for users who are not familiar with functional programming, while maintaining performance, scalability and fault tolerance. We describe JAVA2SDG, a compiler that translates annotated Java programs to stateful dataflow graphs (SDGs) that can execute on a compute cluster in a data-parallel and fault-tolerant fashion. Compared to existing distributed dataflow models, a distinguishing feature of SDGs is that their computational tasks can access distributed mutable state, thus allowing SDGs to capture the semantics of stateful Java programs. As part of the demonstration, we provide examples of machine learning programs in Java, including collaborative filtering and logistic regression, and we explain how they are translated to SDGs and executed on a large set of machines.
Issue Date: 16-May-2016
Date of Acceptance: 16-May-2016
URI: http://hdl.handle.net/10044/1/37490
DOI: http://dx.doi.org/10.1109/ICDE.2016.7498352
ISBN: 978-1-5090-2020-1
Publisher: IEEE
Start Page: 1390
End Page: 1393
Journal / Book Title: Proceedings of the 32nd IEEE International Conference on Data Engineering (ICDE)
Copyright Statement: © 2016 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Conference Name: 32nd IEEE International Conference on Data Engineering (ICDE)
Publication Status: Published
Start Date: 2016-05-16
Finish Date: 2016-05-20
Conference Place: Helsinki, Finland
Appears in Collections: Computing
Faculty of Engineering