An Uncertainty-Aware Approach to Optimal Configuration of Stream Processing Systems

Jamshidi Dermani, P; Casale, G

DOI:	dx.doi.org/10.5281/zenodo.56238

Title:	An Uncertainty-Aware Approach to Optimal Configuration of Stream Processing Systems
Authors:	Jamshidi Dermani, P; Casale, G
Item Type:	Dataset
Abstract:	The archive contains 10 comma-separated datasets of performance measurements (throughput and latency) for 3 different stream benchmark applications. The measurements were collected experimentally on 5 different cloud clusters over the course of 3 months (24/7). Each row in a dataset represents a different configuration setting for the application, and the last two columns give the average performance of the application measured over 10 minutes under that specific configuration setting. The datasets contain full factorial, exhaustive measurements for all possible settings, limited to a predetermined interval for each variable. Each dataset is named in the following format: "benchmark_application-dimensions-cluster_name". For example, "wc-6d-c1" refers to the WordCount benchmark application with 6 dimensions (i.e., we varied 6 configuration parameters), deployed on the c1 cluster (OpenNebula, see Appendix). This resulted in a dataset of size 2880, i.e., collecting the data took 2880 × 10 min = 480 h = 20 days.
Issue Date:	22-Jun-2016
URI:	http://hdl.handle.net/10044/1/40575
DOI:	dx.doi.org/10.5281/zenodo.56238
Sponsor/Funder:	Commission of the European Communities
Funder's Grant Number:	644869
Keywords:	Stream Processing System; DevOps; Big Data
Access Data Notes:	https://zenodo.org/record/56238
Appears in Collections:	Faculty of Engineering - Research Data
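
The naming convention and the collection-time arithmetic given in the abstract can be illustrated with a minimal Python sketch. The helper names `parse_dataset_name` and `collection_time` are hypothetical and are not part of the archive. The pattern assumes lowercase alphanumeric application and cluster labels with no file extension, which the record does not confirm. Only the "benchmark_application-dimensions-cluster_name" format, the 10-minute measurement window, and the 2880-row example come from the abstract.

```python
import re

# Assumed pattern for "benchmark_application-dimensions-cluster_name",
# e.g. "wc-6d-c1". Labels are assumed to be lowercase alphanumerics.
NAME_PATTERN = re.compile(
    r"^(?P<app>[a-z0-9]+)-(?P<dims>\d+)d-(?P<cluster>[a-z0-9]+)$"
)


def parse_dataset_name(name):
    """Split a dataset name into application, dimension count, and cluster."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unexpected dataset name: {name!r}")
    return {
        "application": match["app"],
        "dimensions": int(match["dims"]),
        "cluster": match["cluster"],
    }


def collection_time(rows, minutes_per_row=10):
    """Return (minutes, hours, days) needed to measure `rows` configurations.

    Each row is one configuration measured for `minutes_per_row` minutes,
    as stated in the abstract.
    """
    total_minutes = rows * minutes_per_row
    hours = total_minutes / 60
    days = hours / 24
    return total_minutes, hours, days


print(parse_dataset_name("wc-6d-c1"))
print(collection_time(2880))  # 2880 rows x 10 min = 28800 min = 480 h = 20 days
```
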

Unless otherwise indicated, items in Spiral are protected by copyright and are licensed under a Creative Commons Attribution NonCommercial NoDerivatives License.