Fast Abstracts Archives

Wrapping Windows NT Binary Executables for Failure Simulation 1

Anup K. Ghosh and Matt Schmid
 
Reliable Software Technologies
21515 Ridgetop Circle, #250
Sterling, VA 20166
{aghosh,mschmid}@rstcorp.com
www.rstcorp.com

Introduction

In this short paper, we describe a tool for testing the reliability and robustness of Windows NT software applications under stressful environmental conditions, i.e., under system resource failure conditions. Windows NT systems are increasingly being deployed in mission-critical applications such as command and control in US Navy ships [Bin 98]. However, as reported in July 1998, the Navy's Aegis missile cruiser USS Yorktown suffered a significant software problem in the Windows NT systems that control the "smart ship", which effectively left the ship dead in the water [Sla 98]. The ship had to be towed to the Norfolk Naval shipyard because a database overflow error (resulting from a divide-by-zero operation) caused the ship's propulsion system to fail.

The research approach and prototype tool described here are specifically designed to analyze commercial off-the-shelf (COTS) software for Win32 systems where source code is not released, but binary executables are available for dynamic analysis. The purpose of this research is to assess the robustness of software applications to failing system resources such as memory allocation functions and system I/O functions. The tool lets an analyst use simple toggle functions to artificially simulate stressful conditions (e.g., complete memory utilization) that a program may experience during its lifetime.

Approach

Because the analysis must work with binary executables without resorting to decompilation, the approach in this research project has been to instrument the interfaces between the application program under analysis and the operating system's shared libraries that the application uses. The approach is to "wrap" a binary executable with an instrumentation layer such that all interactions between the application and the operating system can be captured, observed, perturbed, and questioned.

Windows NT applications import hundreds of system functions from shared libraries called Dynamic-Link Libraries (DLLs). As implied by their name, these libraries are linked at run time. System DLLs make good candidates for studying the effect of system failures on applications because they typically contain the core functions within the operating system that applications require. As such, they can be a single point of failure in a system. If the core OS functions fail, the programs that use them may fail in turn. For this reason, selectively simulating failures of operating system resources (such as memory allocation/deallocation, file system operations, and other system I/O operations) can identify how robust, or conversely, how vulnerable, an application is to failing system resources. Studying these failure modes is important in critical applications where system resources may be unavailable during peak periods when they are most essential.
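To make the idea of a failure propagating from a system function into an application concrete, the following is a minimal sketch in Python rather than Win32 code. The allocator functions and the two copy routines are hypothetical stand-ins introduced only for illustration; they are not part of the tool described here. A failed allocation is modeled as a return value of None.

```python
from typing import Callable, Optional

# Hypothetical allocator type: returns a buffer, or None to signal failure.
Allocator = Callable[[int], Optional[bytearray]]


def working_alloc(size: int) -> Optional[bytearray]:
    """Stand-in for a system allocation call that succeeds."""
    return bytearray(size)


def failing_alloc(size: int) -> Optional[bytearray]:
    """Stand-in for the same call when the resource is exhausted."""
    return None


def fragile_copy(data: bytes, alloc: Allocator) -> bytes:
    """Uses the allocation result without checking it."""
    buf = alloc(len(data))
    buf[:] = data  # raises TypeError when the allocation failed
    return bytes(buf)


def robust_copy(data: bytes, alloc: Allocator) -> Optional[bytes]:
    """Checks the allocation result and reports failure to the caller."""
    buf = alloc(len(data))
    if buf is None:
        return None
    buf[:] = data
    return bytes(buf)
```

Both routines behave identically while the allocator works; only a simulated failure reveals that one of them does not handle the error path.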

Figure 1: Wrapping Executable Binaries

Wrapping Binary Executables

The approach taken to simulate failed system resources is to wrap binary executables with an instrumentation layer that simulates system failures. Figure 1 illustrates how program executables are wrapped. The application's Import Address Table (IAT) is used to look up the addresses of imported DLL functions. For each function that is wrapped, the corresponding IAT entry is modified to point to the wrapper DLL. For instance, in Figure 1, functions S1 and S3 are wrapped by modifying the IAT of the application. When functions S1 and S3 are called by the application, the wrapper DLL is called instead. The wrapper DLL then executes, providing the ability to modify, perturb, question, or simply log the request to the target DLL.
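The redirection through the import table can be sketched with a small simulation. This is not Win32 code and does not touch a real IAT: a Python dictionary stands in for the table, and the functions s1, s2, and s3 are hypothetical stand-ins for the functions labeled S1, S2, and S3 in Figure 1. The helper names make_wrapper and wrap_imports are likewise assumptions made for this illustration.

```python
from typing import Callable, Dict, List


# Stand-ins for functions exported by the target DLL.
def s1() -> str:
    return "S1 result"


def s2() -> str:
    return "S2 result"


def s3() -> str:
    return "S3 result"


# Record of every call that passed through a wrapper.
calls_seen: List[str] = []


def make_wrapper(name: str, target: Callable[[], str]) -> Callable[[], str]:
    """Build a wrapper that logs the call and forwards it to the target."""
    def wrapper() -> str:
        calls_seen.append(name)
        return target()
    return wrapper


def wrap_imports(iat: Dict[str, Callable[[], str]], names: List[str]) -> None:
    """Redirect the selected table entries to wrappers, leaving others alone."""
    for name in names:
        iat[name] = make_wrapper(name, iat[name])


def application(iat: Dict[str, Callable[[], str]]) -> List[str]:
    """The application resolves every import through the table."""
    return [iat["S1"](), iat["S2"](), iat["S3"]()]


# Simulated import table: function name -> address (here, a callable).
iat: Dict[str, Callable[[], str]] = {"S1": s1, "S2": s2, "S3": s3}
wrap_imports(iat, ["S1", "S3"])
```

Because the application only ever looks functions up in the table, it needs no changes of its own; the entry for S2 still refers directly to the original function.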

When calling the target DLL function, the wrapper DLL looks up the address of that function in its own IAT, then passes the request on to the target DLL. After the request executes, the results, if any, are returned through the wrapper DLL to the application that made the request. The wrapper DLL again has the opportunity to modify or question the data returned from the system DLL. It is at this point that system calls can be modified to simulate anomalous or failed behavior from the system. An alternative is not to pass the system call from the application to the system DLL at all, but simply to return a failure condition to the requesting program. Note that in Figure 1, a call to function S2 is not intercepted by the wrapper and goes directly to the target DLL.


Figure 2: Failure Simulation Tool

Failure Simulation Tool

The graphical interface of the prototype tool, shown in Figure 2, is an implementation of the wrapping procedure shown in Figure 1. The tool provides the ability to instrument as many of the interfaces from an application to the OS as desired. The window shows the memory functions that can be instrumented with failure or success functions. Other system functions are available for instrumentation via the System tab shown in the window in Figure 2.

The success/failure functions can be toggled on or off at any point during execution of the program. For instance, the GlobalReAlloc and LocalAlloc functions are both shown to be toggled for failure. A Success wrapping function indicates that calls are passed through without modification. The Success and Failure columns show the number of calls made to a particular function under the success or failure condition. For example, the LocalAlloc function was toggled to Failure at some point during the testing, after which six successive calls to LocalAlloc were forced to fail by the instrumentation wrapper. The success or failure of each call is logged during testing in the window on the right-hand side of the interface, as well as to a log file.
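The toggling and bookkeeping described above can be sketched as a small class. This is an illustrative model only, not the tool's implementation: the class name ToggleWrapper, its attributes, and the log message format are assumptions, and fake_local_alloc is a hypothetical stand-in for LocalAlloc that returns None on failure.

```python
from typing import Any, Callable, List


class ToggleWrapper:
    """Wraps one function with a success/failure switch and counters."""

    def __init__(self, name: str, target: Callable[..., Any]) -> None:
        self.name = name
        self.target = target
        self.fail = False        # current toggle state
        self.success_count = 0   # calls passed through unmodified
        self.failure_count = 0   # calls forced to fail
        self.log: List[str] = []

    def toggle(self, fail: bool) -> None:
        """Switch between the Success and Failure wrapping functions."""
        self.fail = fail

    def __call__(self, *args: Any) -> Any:
        if self.fail:
            self.failure_count += 1
            self.log.append(f"{self.name}: FAILURE")
            return None  # simulated failure; the target is not called
        self.success_count += 1
        self.log.append(f"{self.name}: SUCCESS")
        return self.target(*args)


def fake_local_alloc(size: int) -> bytearray:
    """Stand-in for the real allocation function."""
    return bytearray(size)


local_alloc = ToggleWrapper("LocalAlloc", fake_local_alloc)

# Two calls succeed, then the analyst toggles the function to Failure
# and the next six calls are forced to fail.
for _ in range(2):
    local_alloc(16)
local_alloc.toggle(True)
for _ in range(6):
    local_alloc(16)
```

Keeping the counters and the log inside the wrapper means the toggle can be flipped at any point during a run without losing the history of earlier calls.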

Summary and Future Directions

This brief paper has provided an overview of an approach and tool for simulating system failures for COTS application programs. The approach is to wrap the application program binaries with an instrumentation layer that can selectively fail particular system calls. The effect of these failures can be observed to study the robustness of applications under anomalous or stressful system conditions.

A prototype tool has been implemented that allows selective failure of system resources on the fly during testing. Future work will extend the tool so that more types of system resources can be failed. The tool will be used to study the effects of system failures on critical applications.

References

[Bin 98] M. Binderberger. "Re: Navy turns to off-the-shelf PCs to power ships," RISKS Digest, 19(76), May 25, 1998.

[Sla 98] G. Slabodkin. "Software Glitches Leave Navy Smart Ship Dead in the Water," GCN Network, Available online: www.gcn.com/gcn/1998/July13/cov2.htm.


1. This work is sponsored by the Air Force Research Laboratory and the Defense Advanced Research Projects Agency (DARPA) under Contract F30602-97-C-0117.