Scalable techniques for analysing and testing asynchronous software systems
File(s)
Author(s)
Deligiannis, Pantazis
Type
Thesis or dissertation
Abstract
This thesis is about scalable analysis and testing techniques for asynchronous programs. Due to their highly-concurrent nature, the number of states that such programs can reach grows exponentially (in the worst case) with program size, leading to state-space explosion. For this reason, searching the state-space of real-world asynchronous programs to find bugs is computationally expensive, and thus it is important to develop analysis and testing techniques that can scale well. However, techniques typically scale by sacrificing precision. Detecting only true bugs - versus reporting a large number of false alarms - is crucial to increasing the productivity of developers. Alas, finding the fine balance between precision and scalability is a challenging problem. Motivated by this challenge, we focus on developing tools and techniques for analysing and testing two different kinds of real-world asynchronous programs: device drivers and distributed systems. In this thesis we make the following three original contributions:
1. A novel symbolic lockset analysis that can detect all potential data races in a device driver. Our analysis avoids reasoning about thread interleavings and thus scales well. Exploiting the race-freedom guarantees provided by our analysis, we achieve a sound partial-order reduction that significantly accelerates Corral, an industrial-strength bug-finder for concurrent programs.
2. The design and implementation of P#, a new programming language for building, analysing and testing asynchronous programs. The novelty of P# is that it is comes with a static data race analysis and controlled testing infrastructure, which makes a strong move towards building highly-reliable asynchronous programs.
3. A new approach for partially-modelling and testing production-scale asynchronous distributed systems using P#. We report our experience of applying P# on Azure Storage vNext, a cloud storage system developed inside Microsoft, and show how P# managed to detect a subtle liveness bug that had remained unresolved even after months of troubleshooting using conventional testing techniques.
1. A novel symbolic lockset analysis that can detect all potential data races in a device driver. Our analysis avoids reasoning about thread interleavings and thus scales well. Exploiting the race-freedom guarantees provided by our analysis, we achieve a sound partial-order reduction that significantly accelerates Corral, an industrial-strength bug-finder for concurrent programs.
2. The design and implementation of P#, a new programming language for building, analysing and testing asynchronous programs. The novelty of P# is that it is comes with a static data race analysis and controlled testing infrastructure, which makes a strong move towards building highly-reliable asynchronous programs.
3. A new approach for partially-modelling and testing production-scale asynchronous distributed systems using P#. We report our experience of applying P# on Azure Storage vNext, a cloud storage system developed inside Microsoft, and show how P# managed to detect a subtle liveness bug that had remained unresolved even after months of troubleshooting using conventional testing techniques.
Version
Open Access
Date Issued
2016-09
Date Awarded
2017-05
Advisor
Donaldson, Alastair F.
Cadar, Cristian
Sponsor
Intel Corporation
Engineering and Physical Sciences Research Council
Grant Number
EP/I020357/1
Publisher Department
Computing
Publisher Institution
Imperial College London
Qualification Level
Doctoral
Qualification Name
Doctor of Philosophy (PhD)