Formal Validation of Fault-tolerance Mechanisms
Cinzia Bernardeschi(1), Alessandro Fantechi(2),
Stefania Gnesi(3), Antonella Santone (1)
1. Dipartimento di Ingegneria della Informazione, Univ. di Pisa
2. Dipartimento di Sistemi e Informatica, Univ. di Firenze
3. Istituto di Elaborazione dell'Informazione, Consiglio Nazionale delle
Ricerche
The European project GUARDS
(Generic Upgradable Architecture for Real-time Dependable Systems) addresses
the development of architectures, methods, techniques, and tools to support
the design, the implementation and validation of critical systems [WBBP96].
The innovative approach of the project towards this objective lies on the
elaboration and exploitation of a coherent methodology integrating five
fundamental features: Genericity, to support reliability of hardware
and software components and architectures in multiple applications and domains,
in particular in space, nuclear and railway fields; Dependability,
to support the design and implementation of fault tolerant mechanisms complying
with safety and availability requirements; Real-time predictability,
to support the fulfillment of the stringent real-time constraints imposed
by the considered applications; Ability to be validated, to support
the definition of instances of the generic architecture that comply with
the requirements imposed by each of the target applications; Ability
to be certified, to meet the certification obligations imposed by various
standardisation authorities. The validation policy in GUARDS is based on
the a priori validation of selected critical components and mechanisms of
the architecture [Arlat97].
We report the experience done to formally validate
a particular fault-tolerance mechanism of GUARDS, the Inter-consistency
mechanism. This mechanism is responsible of the consistency of the outputs
produced by the channels (redundant processors), also in presence of faults
in a minority of them. The formal validation of the mechanism has been carried
out following an approach based on model checking. Model checking approaches
[CES86] work on a finite state model of the behavior of the system, derived
by the specification of the system following an operational approach. Verification
is carried out automatically by expressing a desired property of the system
as a temporal logic formula, and by checking the satisfiability of the formula
in a polynomial time complexity over the model of the system.
The formal verification of the Inter-consistency mechanism
requires that the given specification satisfies some correctness properties
also in presence of faults. Faults are formalised as actions, whose occurrence
simulates the occurrence of the fault in the system. In particular, the
specification of faults in terms of actions appears adherent to the original
description of the fault tolerant mechanism to be validated; and gives us
control over the occurrence of faults.
The automatic verification of the mechanism is done
according to the following steps:
1. The components of the mechanism are specified (as process algebra terms
or alternatively by means of Labelled Transition Systems (LTSs) and networks
of LTSs).
2. Processes that constraint the actual occurrence of fault actions, according
to the considered fault hypothesis, are added to the specification.
3. The global model of the behaviour of the Inter-consistency mechanism
under the fault hypothesis, as a single LTS, is generated.
4. The properties that express the correctness or fault tolerance requirements
on the mechanism are specified by ACTL temporal logic [DNV90] formulae.
The tools of the JACK ( Just Another Concurrency
Kit) environment [BGL94] support both the specification and the verification
phase. Detailed information
about JACK are available
The validation experience on the Inter-consistency
mechanism has first addressed an instance of this mechanism composed of
three communicating nodes. Covering a mechanism with four nodes case has
required writing a new specification of the nodes of the network. In this
case the number of messages and the number of variables increase for each
node. Moreover, since the number of nodes also increases, the increment
of the number of states of the global model of the mechanism is quite large.
This is a typical state explosion problem case, and indeed the
standard tools in the JACK environment fail to work on such a large model
for lack of memory. We have therefore employed methods of compactly representing
the state space by an implicit description based on Binary Decision Diagram
(BDD) in the development of new efficient model checking tools. In particular,
a model checker has been recently built for ACTL, based on symbolic algorithms
and implicit representation of the labeled transition relation of LTSs [BAMC].
Work is in progress on the application of these new model checking tools
to check the Agreement and Validity properties for the four node case.
Given the validation effort done for the three nodes
case, we could expect that the validation for the four nodes case could
exploit the already built model, by performing some kind of incremental
extension, that would avoid to generate again the state space global model
from scratch. Unfortunately, this is not possible due to the structure of
the mechanism itself: validation has to be repeated on each instance of
the Inter-consistency mechanism generic definition, generating each time
the global state space.
On the other hand, a validation of the generic, n-node,
Inter-consistency algorithm, could be performed using its recursive definition
and using validation approaches based on theorem proving [GPR95], rather
than model cheching.
In the framework of the GUARDS project, however, we
have considered that the GUARDS architecture is going to be exploited by
three different pilot applications, namely a nuclear submarin control subsystem,
a railway interlocking system, and a space-station payload control system.
Of the three cases, the first employs an instance of the GUARDS architecture
having just two channels, where the Inter-consistency mechanism does not
apply, while the second one employs three channels and the latter four channels.
Hence, for the purpose of this project, it has been considered more convenient
to perform a model checking based validation of the Inter-consistency algorithm
with three and four nodes.
This parallels the fact that during the implementation
phase, different running code will be written for the Inter-consistency
mechanism with three nodes and with four nodes, since the recursive definition
cannot be applied in an embedded system, where determinism of execution
time is of paramount importance. This means that the object of our model
checking based validation is much closer to the actually running code than
to the generic algorithm. This is in our opinion more appealing to end users
and regulatory bodies.
The experience carried out in the validation of the
fault-tolerance Inter-consistency mechanism confirms the primary role of
model checking approaches inside formal validation. On the basis of this
experience, we can envisage model checking as a basic approach to the validation
of particular instances of systems. In addition to having the role of automating
and speeding up subproofs in a theorem proving approach to formal validation,
as exemplified in [LR93]. On the other hand, theorem proving could be used
instead to validate ad hoc abstraction techniques that are often necessary
to deal with state explosion problems. The complementarity of the two approaches
will be exploited in the next future to gain a larger industrial applicability
of formal methods to cope with the ever more challenging validation requirements
of the future safety critical computer applications.
References
[Arlat97] J. Arlat. Preliminary definition of the
GUARDS validation strategy. GUARDS Deliverable D3O1, LAAS -CNRS Report 96378,
January 1997.
[BGL94] A. Bouali, S. Gnesi, S. Larosa. The integration
Project for the JACK Environment. Bulletin of the EATCS, 54, October 1994,
pp.\ 207--223.
[CES86] E.M. Clarke, E.A. Emerson, A.P. Sistla. Automatic
Verification of Finite--State Concurrent Systems Using Temporal Logic Specification.
ACM Transaction on Programming Languages and Systems, 8 (2), April 198 6,
pp. 244--263.
[DNV90] R. De Nicola, F.W. Vaandrager. Actions versus
State Based Logics for Transition Systems. In Proc. Ecole de Printemps
on Semantics of Concurrency, Lecture Notes in Computer Science vol.
469, Springer, Berlin, 1990, pp. 407-419.
[BAMC] A. Fantechi, S. Gnesi, F. Mazzanti, R. Pugliese,
E. Tronci. A Symbolic Model Checker for Action-based CTL. Submitted for
publication.
[GPR95] L. Gong, P. Lincoln, J. Rushby. Byzantine
Agreement with Authentication: Observations and Applications in Tolerating
Hybrid and Link Faults. Proc. 5th Conference on Dependable Computing for
Critical Applications (DCCA-5), Urbana-Champaign, Il, USA, 1995.
[LR93] P. Lincoln, J. Rushby. A formally verified
algorithm for Interactive Consistency under hybrid fault model. Proc. 523rd
International Symposium on Fault-Tolerant Computing (FTCS 23), Toulose,
France, pp. 402-411, June 1993.
[WBBP96] A. Wellings, L. Beus-Dukic, A. Burns, D.
Powell. Genericity and Upgradability in Ultra-Dependable Real-Time Architectures.
Work in Progress Proceedings, Real-Time Systems Symp., Washington D.C. ,
pp.15-18, IEEE Computer Society Press, December 1996.
|