Efficient Verification of Fault-Tolerant Message-Passing Protocols.
Technische Universität, Darmstadt
[Ph.D. Thesis], (2011)
Available under Creative Commons Attribution Non-commercial No Derivatives, 2.5.
Download (1MB) | Preview
|Item Type:||Ph.D. Thesis|
|Title:||Efficient Verification of Fault-Tolerant Message-Passing Protocols|
This thesis deals with efficient formal verification of fault-tolerant distributed protocols. The main focus is on protocols that achieve fault-tolerance using replication in distributed systems. In addition, communication in the distributed system is abstracted using message-passing. However, most of the concepts and solutions discussed in the thesis apply beyond replication-based message-passing protocols.
The outcome of verification is two-fold: Either the verification proves that the system (protocol) satisfies its specification or it returns (or claims the existence) of a counterexample that witnesses that the system (protocol) violates the specification. Both the development and application of fault-tolerant message-passing protocols can benefit from verification. Firstly, these protocols can be complex conceptual designs and hard to implement due to (i) a rich variety of (even malicious) faults that should be tolerated by the protocol and (ii) the concurrency in the distributed system executing the protocol. Therefore, counterexamples returned by verification in the phase of development can help developers to get the conceptual protocol and its implementation right. Secondly, fault-tolerant messages-passing protocols specify strong guarantees such as atomic broadcast or diagnosis of malicious faults. Therefore, the verification of such protocols can avoid failures of highly available systems that build upon these protocols.
The complexity of verification strongly depends on the size and nature of the system. Therefore, verification should be efficient in terms of space, time, and human interaction (ranging from full automation to requiring intuitive human guidance). Due to (i) and (ii), the verification of fault-tolerant message-passing protocols faces with a large problem space. Hence, straightforward verification approaches are inefficient. The thesis enables efficient verification of fault-tolerant message-passing systems in several ways.
The input of verification is a model of the system. The model represents the system and can be of varying resolution, e.g., source code, binaries, or high-level executable pseudocode. The efficiency of verification can significantly differ for different models of the same system. In the first part of the thesis, models of fault-tolerant message-passing protocols are proposed to enable efficient verification.
The proposed models address different aspects of fault-tolerant message-passing protocols. These aspects include (A1) message traffic -- an equivalence is shown between different models of sending, delivering, and consuming messages, which allows to select the model that is most amenable to efficient verification; (A2) fault-model -- a surprisingly simple and sound model of the widely-applied crash fault assumption is proposed allowing efficient verification of systems with crash faults; (A3) symmetries -- an approach for finding symmetries in the model is introduced, which can be exploited by symmetry reduction; and (A4) partial orders -- an equivalent translation of models is proposed to improve on the efficiency of partial-order reduction. Symmetry and partial-order reductions are general tools for efficient verification whose efficiency strongly depends on the model in use.
The second part of the thesis improves on existing verification techniques to enhance their efficiency.
The first technique is partial-order reduction, whose performance is limited by the poor flexibility and usability, and high time overhead of existing implementations. A new implementation of partial-order reduction is presented, called LPOR, that improves on all these features at the same time. To demonstrate the features of LPOR, it is applied to general message-passing systems. Experiments with representative fault-tolerant message-passing protocols justify LPOR's improvements. In addition, LPOR is implemented as an open-source Java library. This implementation is applied in MP-Basset, a model checker for general message-passing systems developed in the context of this thesis.
The second verification technique is automated induction. The efficiency of induction is limited by its general incompleteness, i.e., induction might return spurious (wrong) counterexamples for systems that satisfy the specification. A solution against spurious counterexamples is using lemmas. However, the discovery of lemmas is, in general, hard to automate. The thesis proposes a classification of lemmas in fault-tolerant message-passing protocols to enable their automated discovery. An example protocol is verified using classified lemmas and machine-checked induction proofs. Finally, a new approach, called strengthened transitions, is proposed to combat spurious counterexamples. An application of strengthened transitions for general multi-process systems is also presented.
|Place of Publication:||Darmstadt|
|Classification DDC:||000 Allgemeines, Informatik, Informationswissenschaft > 004 Informatik|
|Divisions:||20 Department of Computer Science > Formal Methods in System Engineering|
|Date Deposited:||09 Dec 2011 10:57|
|Last Modified:||07 Dec 2012 12:02|
|Referees:||Suri, Prof. Neeraj and Ölveczky, Prof. Csaba|
|Refereed:||29 November 2011|