On Automating Abstraction Proofs
Rob Sumners and Jacob A Abraham
Computer Engineering Research Center
The University of Texas at Austin
Introduction
Model checking methods have become a popular technique for the removal
of faults by checking correctness of temporal logic properties in finite
state hardware and software models. The difficulty with this approach is
the familiar state explosion problem. A common technique for alleviating
the state explosion problem is state space abstraction where irrelevant
state information is quotiented out of the state space. Abstraction is often
done manually and while abstract model analysis may provide important feedback,
the analysis does not necessarily correlate to the original system unless
the abstraction is shown to preserve the behavior of the original system.
In a sense, abstraction takes a behavioral correctness proof for some system
and breaks it into two steps, (1) abstract the atomic state model to reasonably
finite abstract state model and (2) check resulting abstract model for behavioral
correctness. Unless step (1) is guaranteed, the entire ``verification''
is in question. The problem is then how to show that step (1) is correct.
For common rich system implementation languages, it is undecidable and some
form of deduction may be required. Others have attempted to use theorem
provers to verify the abstraction[HS96], but the abstraction proofs can
often be as difficult if not more so than a complete proof of the property
on the original system to begin with. We present some simple techniques
to simplify these obligations so that a more-directed theorem prover like
ACL2[BM97], Otter, etc. can be used to automatically prove abstractions
correct and alleviate much of the need for user interaction. We have built
a simple preprocessing tool which takes a LISP definition of a systems transition
function and abstraction function and outputs (1) a simplified proof script
for ACL2 and (2) a finite model definition which can be model checked. We
have used this tool to automatically prove the abstraction for an infinite
state 2-cache system given in the figure below. The example system consists
of two caches with processor interfaces -- the latched address, data, command
-- and a pending bit used to denote that a load or store command is waiting
to be performed. Each processor has an unbounded cache and use a common
shared memory. A property we wish to show is that for any address $A$ and
data value D, if we perform the command store,A,D for some
processor and perform no other stores to A, then the next
$load$ should return the value D.

Abstractions
A state model is defined by an initial state q in Q and a next-state
transition function step : Q X I --> Q which takes the current
state from state set Q and an input from the set I and returns
a next state. An abstraction is defined by a total idempotent function rep
: Q --> Q which for every state returns a representative. Intuitively,
a representative state is chosen such that it has representative behavior
for all states which map to it relative to the other representative states
chosen. We also need to choose representative inputs, so we will also assume
an abstraction function repI : I --> I is defined. Formally, we
would like to show:
(1) for any states s,t and inputs i,j, (rep(s)
= rep(t) and repI(i) = repI(j)) implies (rep(step(s,i)) = rep(step(t,j))).
This would guarantee that states s and t were bisimilar
and we could safely coalesce these states together assuming the resulting
model -- defined by $rep(q)$ and next state function rep(step(rep(s),repI(i)))
-- would measure the state propositions in our temporal logic formula. In
practice it is difficult to get bisimulations with a finite number of representatives,
so we may need to weaken the definition by including another abstraction
function repN : Q --> Q, where repN(s) = repN(t) implies rep(s)
= rep(t), which defines representatives which agree on the next rep
states for repI equivalent inputs. Formally we need to show, for
all states s,t and inputs i,j, (repN(s) =
repN(t) and repI(i) = repI(j)) implies (rep(step(s,i)) = rep(step(t,j))).
This allows the abstract model on rep states to hide enough information
to be finite (with some cost of model correlation). We will focus on the
first case, nrep = rep, for presentation simplicity.
Proof Automation
We want to prove property (1) using a theorem prover. Proving
this directly will most likely require some amount of human-prover interaction.
This interaction will consist of doing some case splitting on the relationship
between s,t,i,j and in each of these cases use some reasoning about
equality, induction, etc. to show that for this case, the theorem holds.
Intuitively, though, we have defined all the cases in the abstract model.
So a viable alternative would be to somehow recast (1) such that
the case splitting required is defined by the abstract model. Assuming a
finite number of reachable representatives and a finite number of representative
inputs, we can construct the abstract model defined by initial state rep(q)
in R within Q and next state function rstep : R X E --> R,
where E within I and rstep := rep(step(rep(s),repI(i))) by
simply executing over representatives. With rstep, instead of proving
(1) monolithically, we can alternatively prove a simplified theorem
for each edge in rstep:
(2) for all edges (v,k,w) rstep: for all states
s and inputs i, (rep(s) = v and repI(i) = k) implies rep(step(s,i))
= w
More aggressive theorem provers such as ACL2 can deal much more effectively
with theorems of the form in (2) rather than (1). In fact,
for our cache example, ACL2 is able to prove every theorem in the set defined
by (2). Unfortunately, in practice, while the abstract model may
be an effective case splitter for automating proofs, the number of edges
in abstract models can make the number of proofs required too large. For
many cases though, including the cache example, we can decompose the problem
of abstraction proofs to specific types used in the function definition.
Finite types (such as booleans and {load,store}) do not need to be
abstracted. The non-finite types -- address, data, cache, memory -- do need
to be abstracted. Above these types though are composite types whose abstractions
simply compose the abstractions of the underlying types. It is preferable
to push the abstraction proof obligations down through the composite types
to the non-finite types. We can do this using simple variants of the following
observation. Given models step1 : Q1 X I --> Q1 and step2 :
Q2 X I --> Q2 with proven abstractions rep1,rep2 respectively
for input abstraction repI. Then, for input abstraction repI,
the function rep : Q1 X Q2 --> Q1 X Q2 defined by rep(<s,t>)
= <rep1(s),rep2(t) > is an abstraction for the model step
: Q1 X Q2 X I --> Q1 X Q2 defined by step(<s,t>,i) =
<step1(s,i),step2(t,i)>.
Conclusions
We wrote a simple tool which took a LISP definition of a next state transition
function and generated a set of proof obligations. The tool attempted to
find simple composite types from which it could push down proof obligations.
For the simple cache example in figure 1, the resulting proof obligations
were localized to the cache, memory, data, and address types -- the rest
of the types are finite composites or preserved completely by the abstraction.
The resulting 392 proof obligations are proven automatically using ACL2.
The resulting model can then be used to check our property of interest.
Our short presentation here is an attempt to show the usefulness in abstract
models not only for model checking but for efficient methods of case analysis.
Future work will include attempts to extend these constructions with well-foundedness
arguments and more efficient representations.
References
[HS96]K. Havelund, and N. Shankar. ``Experiments in Theorem Proving and
Model Checking for Protocol Verification,'' in FME 96.
[BM97]R.S.Boyer, and J.S. Moore. {\em A Computational Logic Handbook,
2nd ed.} Academic Press, 1997. |