Uncovering Bugs in Distributed Storage Systems during Testing (not in Production!)

File: FAST.pdf
Description: Accepted version
Size: 427.28 kB
Format: Adobe PDF
Title: Uncovering Bugs in Distributed Storage Systems during Testing (not in Production!)
Authors: Deligiannis, P
McCutchen, M
Thomson, P
Chen, S
Donaldson, AF
Erickson, J
Huang, C
Lal, A
Mudduluru, R
Qadeer, S
Schulte, W
Item Type: Conference Paper
Abstract: Testing distributed systems is challenging due to multiple sources of nondeterminism. Conventional testing techniques, such as unit, integration and stress testing, are ineffective in preventing serious but subtle bugs from reaching production. Formal techniques, such as TLA+, can only verify high-level specifications of systems at the level of logic-based models, and fall short of checking the actual executable code. In this paper, we present a new methodology for testing distributed systems. Our approach applies advanced systematic testing techniques to thoroughly check that the executable code adheres to its high-level specifications, which significantly improves coverage of important system behaviors. Our methodology has been applied to three distributed storage systems in the Microsoft Azure cloud computing platform. In the process, numerous bugs were identified, reproduced, confirmed and fixed. These bugs required a subtle combination of concurrency and failures, making them extremely difficult to find with conventional testing techniques. An important advantage of our approach is that a bug is uncovered in a small setting and witnessed by a full system trace, which dramatically increases the productivity of debugging.
Issue Date: 25-Feb-2016
Date of Acceptance: 7-Dec-2015
URI: http://hdl.handle.net/10044/1/31770
DOI: https://www.usenix.org/conference/fast16/technical-sessions/presentation/deligiannis
ISBN: 9781931971287
Publisher: USENIX
Journal / Book Title: 14th USENIX Conference on File and Storage Technologies
Copyright Statement: USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone.
Conference Name: 14th USENIX Conference on File and Storage Technologies
Publication Status: Published
Start Date: 2016-02-22
Finish Date: 2016-02-25
Conference Place: Santa Clara, CA, USA
Appears in Collections: Computing
Electrical and Electronic Engineering
Faculty of Engineering