Our society faces an increasing dependence on computing systems, even in areas where a failure can be critical for human safety. Fault injection has emerged as a viable solution for studying the behavior of computer-based systems when faults occur, and has been thoroughly investigated by both academia and industry. Several fault injection techniques have been proposed and evaluated in practice; they can be grouped into simulation-based techniques [1], software-implemented techniques [2], and hardware-based techniques [3].
The goal of this note is to present a fault injection system suited to embedded microprocessor-based boards. The system is based on a hybrid (software and hardware) approach: non-intrusive ad hoc hardware (called the controller) monitors the target board bus to perform crucial operations, such as activating the fault injection procedure at the proper fault injection time or triggering time-out conditions, without modifying the target system behavior. The very low intrusiveness of this additional hardware makes the system well suited to real-time applications.
We present a prototypical version of a tool implementing the proposed
approach on a commercial board based on an M68040 microprocessor.
2. Fault Injection System
The overall Fault Injection system runs on two different units connected
by a serial port interface: a host computer and the actual target board.
The system exploits the routines available through the built-in ROM Monitor
of the target board to implement the communication interface between the
two units, to download the code into the target board, and to analyze
the system behavior.
The adopted fault model is the transient single bit-flip fault. This model is frequently used in fault injection tools since it is highly representative of faults occurring in real systems [4]. Nevertheless, the approach can be easily extended to other fault models. Each fault is thus characterized by the following information:
Our technique is ideally suited to systems whose behavior, in the presence
of a given sequence of input stimuli, can be deterministically computed
and easily reproduced. 
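As an illustration of the transient single bit-flip model adopted here, the following minimal Python sketch inverts one bit of a stored word. The function name `flip_bit` and the default 32-bit word width are assumptions made for this example; they are not taken from the tool itself.

```python
def flip_bit(word: int, bit: int, width: int = 32) -> int:
    """Return `word` with exactly one bit inverted (single bit-flip).

    `width` is the assumed word size in bits; 32 is an illustrative default.
    """
    if not 0 <= bit < width:
        raise ValueError(f"bit index {bit} outside a {width}-bit word")
    mask = (1 << width) - 1
    # XOR with a one-hot mask inverts the selected bit and leaves the rest.
    return (word ^ (1 << bit)) & mask
```

Applying the same flip twice restores the original value, which is a convenient sanity check when validating an injection routine.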
The fault injection system can be divided into three modules (Fig. 1):
Fig. 1: The Fault Injection environment.
3. Fault Injection Manager
The Fault Injection Manager is the most crucial part of the whole Fault Injection System. A hardware controller monitors the processor in order to start the injection of a fault selected from the fault list, or to stop the program execution if a time-out condition has occurred. The following paragraphs describe the controller architecture and its tasks.
3.1 Controller architecture and programming
The hardware board is connected to the CPU bus and works as a peripheral from the processor's point of view. It is memory mapped, so that the CPU can program and control it through simple memory write and read instructions.
To correctly execute a single fault injection experiment the controller must receive some commands before starting the target program:
The host computer sends the commands to the controller using the serial interface.
The controller performs two kinds of operations:
A Programmable Logic Device (PLD) guarantees the re-programmability and the flexibility of the controller. The PLD must meet the strict timing requirements of the bus protocol needed to decode the address, read the command, and generate an interrupt. The controller has been realized with two Xilinx FPGAs and some extra logic mounted on a PCB connected to the target application bus.
3.2 Fault Injection
The controller counts the number of executed instructions by analyzing the processor status pins that indicate the internal execution unit's status. The controller has been designed for an M68040 microprocessor, but the approach is general, since pins of this kind are available in almost all processors.
As soon as the instruction counter matches the injection time of the fault that has to be injected, the controller sends an interrupt to the processor. The interrupt handling routine is in charge of injecting the fault. This is the only intrusiveness introduced by our fault injection system. The execution of this routine consists of a very limited number of instructions and can be generally well tolerated by a real-time system.
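The triggering mechanism described above can be sketched as a small software model. This is only an illustrative simulation under stated assumptions, not the controller's actual logic: the names `Fault`, `ControllerModel`, and `run_experiment` are hypothetical, the fault location is assumed to be a memory address, and a 32-bit word is assumed.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Fault:
    injection_time: int  # instruction count at which the fault is injected
    address: int         # assumed fault location (a memory address)
    bit: int             # assumed bit position to flip


class ControllerModel:
    """Software stand-in for the controller's instruction counter."""

    def __init__(self, fault: Fault, memory: Dict[int, int]) -> None:
        self.fault = fault
        self.memory = memory
        self.counter = 0
        self.injected = False

    def on_instruction(self) -> None:
        """Called once per executed instruction (models the status pins)."""
        self.counter += 1
        if not self.injected and self.counter == self.fault.injection_time:
            self._injection_interrupt()

    def _injection_interrupt(self) -> None:
        """Models the interrupt handler: flip one bit, then return."""
        word = self.memory[self.fault.address]
        flipped = (word ^ (1 << self.fault.bit)) & 0xFFFFFFFF
        self.memory[self.fault.address] = flipped
        self.injected = True


def run_experiment(n_instructions: int, fault: Fault,
                   memory: Dict[int, int]) -> ControllerModel:
    """Run a target program of `n_instructions` under the controller model."""
    controller = ControllerModel(fault, memory)
    for _ in range(n_instructions):
        controller.on_instruction()
    return controller
```

In this model the target program itself is untouched; the only extra work is the short handler, which mirrors the limited intrusiveness claimed for the real system.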
3.3 Time-out condition
The controller continuously monitors the internal instruction counter: if its value exceeds a user-defined limit, an interrupt is sent to the processor. The time-out interrupt handling routine terminates the experiment and sends a message on the serial interface.
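A minimal sketch of the time-out check, assuming the target program can be modeled as an iterable that yields one item per executed instruction. The function name `run_with_timeout` and the returned status strings are hypothetical and stand in for the message sent on the serial interface.

```python
from typing import Iterable, Tuple


def run_with_timeout(instructions: Iterable, limit: int) -> Tuple[str, int]:
    """Count executed instructions and stop once the count exceeds `limit`.

    Returns a (status, executed) pair; the status strings are illustrative.
    """
    executed = 0
    for _ in instructions:
        executed += 1
        if executed > limit:
            # Models the time-out interrupt: terminate the experiment.
            return ("timeout", executed)
    return ("completed", executed)
```

A fault that drives the program into an endless loop is then reported as a time-out instead of blocking the whole campaign; for example, `run_with_timeout(itertools.count(), 100)` stops as soon as the counter passes the limit.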
4. Conclusions
In this note we presented a fault injection environment suitable for fault coverage evaluation of microprocessor-based boards.
During the fault injection experiments, the target application program is executed at speed and faults are injected by an interrupt handler routine triggered by a low-cost extra board, without any modification in the target application code and with minimum intrusiveness in the system behavior. This allows a very high speed in the overall fault injection experiment and makes it suitable for real-time systems. The approach is quite general and flexible, as it is based on common features supported by most microprocessors.
To evaluate the feasibility of the approach in practice, a fault injection environment has been set up for a commercial board based on a Motorola 68040 processor, and it is currently being evaluated on some benchmark applications.
References
[1] E. Jenn, J. Arlat, M. Rimen, J. Ohlsson, J. Karlsson, Fault Injection into VHDL Models: the MEFISTO Tool, Proc. FTCS-24, 1994, pp. 66-75
[2] G.A. Kanawati, N.A. Kanawati, J.A. Abraham, FERRARI: A Flexible Software-Based Fault and Error Injection System, IEEE Trans. on Computers, Vol. 44, No. 2, Feb. 1995, pp. 248-260
[3] J. Arlat et al., Fault Injection for Dependability Validation: A Methodology and some Applications, IEEE Trans. on Software Engineering, Vol. 16, No. 2, Feb. 1990, pp. 166-182
[4] P.K. Lala, Fault Tolerant and Fault Testable Hardware Design, Prentice Hall Int., New York, 1985