A Reflective Object-Oriented Framework for Developing Dependable Distributed Software based on Patterns and Metapatterns

 

Delano M. Beder

Cecília M.F. Rubira

University of Campinas
P.O.Box 6176, 13083-970 Campinas - SP - Brazil
e-mail: {delano,cmrubira}@dcc.unicamp.br
 

Fault tolerance represents a major challenge to software designers of modern computing systems. The construction of dependable distributed systems is not a simple task; it requires the use of appropriate techniques during the whole software development cycle. In general, these techniques are based on the provision of redundancy, both for error detection and for error recovery. However, the provision of software redundancy implies (i) an increase in the cost of software development, and (ii) an increase in the complexity of the system, caused by the addition of redundant components. Ideally, the additional software redundancy should be incorporated into the original system in a structured and non-intrusive manner, so that application designers can construct dependable systems more easily.

In this context, the Laboratory of Distributed Systems at Unicamp has concentrated its efforts on the definition of a reflective framework for the development of dependable distributed applications. This research includes four main areas:

1. high-level programming interfaces;

2. software structuring techniques (computational reflection, delegation, object frameworks, patterns and metapatterns);

3. fault tolerance techniques (error recovery, exception handling, atomic actions, coordinated atomic actions, recovery blocks, N-version programming);

4. support for cryptography.

In particular, our research has concentrated on items 2 and 3 above. An object framework is the design of a set of objects that collaborate to carry out a set of responsibilities. Frameworks are a way to reuse high-level design. A pattern [3,5] describes a particular recurring design problem that arises in specific design contexts, and presents a well-proven generic scheme for its solution. The solution scheme is specified by describing its constituent components, their responsibilities and relationships, and the ways in which they collaborate. Patterns are smaller architectural elements than frameworks. A framework can contain several patterns, but the reverse is never true. In the fault-tolerance domain, patterns can be used to present a collection of relatively independent solutions to common problems concerning non-functional properties, such as atomic actions, recovery points, replication policies and exception handling.

Reflection is defined as the ability to observe and manipulate the computational behavior of a system. In the object paradigm, this means representing all abstractions of the model in terms of the object model itself. This concept has already been captured as an architectural pattern called the reflection pattern [3]. In this pattern, an application is split into two parts. A meta level provides information about selected system properties and makes the software self-aware. A base level includes the application logic. Its implementation builds on the meta level. Changes to information kept in the meta level affect subsequent base-level behavior. A meta object protocol (MOP) serves as an external interface to the meta level, and makes the implementation of a reflective system accessible in a defined way. Clients of the meta object protocol can specify modifications to meta objects or their relationships using the base level. The meta object protocol itself is responsible for performing these changes. The proposed framework is decomposed into three layers, and a rough draft of it is illustrated in Figure 1.
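The split between base level and meta level described above can be sketched in Java with the standard java.lang.reflect.Proxy mechanism. This is only an illustration under stated assumptions, not the meta object protocol of the proposed framework: the names Account, FlakyAccount, RetryMetaObject and MetaLevel.reify are hypothetical, and retrying a failed call stands in for whatever fault-tolerance policy a real meta object would apply.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

// Base level: application logic only, with no fault-tolerance code.
interface Account {
    int deposit(int amount);
}

// Hypothetical base-level object that fails a fixed number of times.
class FlakyAccount implements Account {
    private int balance = 0;
    private int failuresLeft;

    FlakyAccount(int transientFailures) {
        this.failuresLeft = transientFailures;
    }

    @Override
    public int deposit(int amount) {
        if (failuresLeft > 0) {            // simulated transient fault
            failuresLeft--;
            throw new IllegalStateException("transient fault");
        }
        balance += amount;
        return balance;
    }
}

// Meta level: a meta object that intercepts every base-level call.
class RetryMetaObject implements InvocationHandler {
    private final Object baseObject;
    private volatile int maxAttempts;      // meta-level information

    RetryMetaObject(Object baseObject, int maxAttempts) {
        this.baseObject = baseObject;
        setMaxAttempts(maxAttempts);
    }

    // Stand-in for a MOP operation: changing this value changes
    // the behavior of subsequent base-level calls.
    void setMaxAttempts(int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        this.maxAttempts = maxAttempts;
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        int attempts = maxAttempts;
        Throwable last = null;
        for (int i = 0; i < attempts; i++) {
            try {
                return method.invoke(baseObject, args);
            } catch (InvocationTargetException e) {
                last = e.getCause();       // exception raised by the base object
            }
        }
        throw last;                        // every attempt failed
    }
}

final class MetaLevel {
    // Binds a meta object to a base-level interface and returns the
    // reference that clients use in place of the base object.
    static <T> T reify(Class<T> iface, InvocationHandler metaObject) {
        Object proxy = Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] { iface }, metaObject);
        return iface.cast(proxy);
    }
}
```

A client obtains its reference through MetaLevel.reify(Account.class, new RetryMetaObject(new FlakyAccount(2), 3)) and calls deposit as usual; the retry policy lives entirely in the meta object and can be adjusted at run time without touching FlakyAccount.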

 

Figure 1. The proposed reflective framework.

Interface Layer. This layer provides tools that support (and integrate) the analysis, design, implementation and management of reliable programs, and that also help to build the meta level (the implementation of meta objects). Such tools should be used during the software development cycle.

Middleware Layer. This layer provides mechanisms that make it easier for application designers to construct secure, dependable distributed systems. The mechanisms provided by this layer include atomic actions, coordinated atomic actions, remote procedure calls and cryptographic techniques. A system of patterns for cryptographic object-oriented software [2] is currently being designed.

Reflective Kernel Layer. This layer provides support for the development of dependable distributed programs. The services provided by this layer include the management of persistent objects and support for meta-level programming. A meta object protocol (MOP) called Guaraná and a library of meta objects [8] suitable for developing distributed systems have recently been designed.

Our research has concentrated on the middleware layer. The proposed framework supports mechanisms to tolerate both hardware and software faults. Hardware fault tolerance is obtained either through simple replication of objects or through techniques for saving and recovering computation states. In this case, services can be executed as atomic actions [1]. The object recovery [10] and memento [5] patterns provide recoverability of objects, and they are used in the context of distributed failure recovery of atomic actions. Two main schemes have been proposed for structuring a software system to provide software fault tolerance: N-version programming and recovery blocks [7]. These schemes have previously been implemented as the master-slave [3] and backup [11] patterns, respectively. The reliable hybrid pattern [4] allows reuse of these two patterns, as well as the construction of more complex hybrid solutions. This pattern is based on the use of the composite pattern [5] to recursively combine the master-slave and backup patterns.
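How a recovery block can rely on a memento for state restoration may be sketched as follows. This is a minimal single-process illustration, not the published backup or object recovery patterns; the class names Ledger and RecoveryBlock, and the choice to treat a runtime exception as a detected error, are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Originator whose state can be saved and restored (memento pattern).
class Ledger {
    private final List<Integer> entries = new ArrayList<>();

    void add(int amount) {
        entries.add(amount);
    }

    int total() {
        int sum = 0;
        for (int e : entries) {
            sum += e;
        }
        return sum;
    }

    // The memento holds a private copy of the internal state.
    Memento save() {
        return new Memento(new ArrayList<>(entries));
    }

    void restore(Memento m) {
        entries.clear();
        entries.addAll(m.state);
    }

    static final class Memento {
        private final List<Integer> state;

        private Memento(List<Integer> state) {
            this.state = state;
        }
    }
}

final class RecoveryBlock {
    // Tries the alternates in order and returns the index of the first
    // one whose effect on the ledger passes the acceptance test.
    static int run(Ledger ledger,
                   List<Consumer<Ledger>> alternates,
                   Predicate<Ledger> acceptanceTest) {
        Ledger.Memento checkpoint = ledger.save();   // recovery point
        for (int i = 0; i < alternates.size(); i++) {
            try {
                alternates.get(i).accept(ledger);
                if (acceptanceTest.test(ledger)) {
                    return i;
                }
            } catch (RuntimeException e) {
                // An exception is treated as a detected error.
            }
            ledger.restore(checkpoint);              // backward error recovery
        }
        throw new IllegalStateException("all alternates failed");
    }
}
```

Each alternate starts from the same checkpointed state, so a faulty primary cannot contaminate the attempt made by the next alternate. If every alternate fails, the ledger is left at the recovery point and the failure is reported to the caller.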

The distribution of applications imposes an important requirement. Distributed subsystems must collaborate, and therefore need a means of communicating with each other. The forwarder-receiver [3] pattern provides transparent inter-process communication with a peer-to-peer interaction model. It introduces forwarders and receivers to decouple peers from the underlying communication mechanisms. The client-dispatcher-server [3] pattern introduces an intermediate layer between clients and servers, the dispatcher component. It provides location transparency by means of a name service, and hides the details of establishing the communication connection between clients and servers.
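The role of the dispatcher as a name service can be sketched as follows. The example is deliberately in-process: a real dispatcher would hand back a communication channel to a remote server rather than an object reference. The names Service, Dispatcher and Client are hypothetical and are not taken from the pattern description in [3].

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Interface shared by all servers; clients never see concrete server classes.
interface Service {
    String handle(String request);
}

// Dispatcher: maps logical service names to the servers that provide them.
class Dispatcher {
    private final Map<String, Service> registry = new ConcurrentHashMap<>();

    void register(String name, Service server) {
        registry.put(name, server);
    }

    void unregister(String name) {
        registry.remove(name);
    }

    Service locate(String name) {
        Service server = registry.get(name);
        if (server == null) {
            throw new IllegalArgumentException("no server registered as " + name);
        }
        return server;
    }
}

// Client: knows only the dispatcher and the logical name of the service.
class Client {
    private final Dispatcher dispatcher;

    Client(Dispatcher dispatcher) {
        this.dispatcher = dispatcher;
    }

    String call(String serviceName, String request) {
        return dispatcher.locate(serviceName).handle(request);
    }
}
```

Because the client resolves the name on every call, a server can be replaced (for instance by a backup replica registered under the same name) without any change to client code.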

The proposed framework also supports mechanisms to tolerate faults in complex distributed systems. The coordinated atomic (CA) action [12] is a unified scheme for coordinating complex concurrent activities and achieving fault tolerance by extending and integrating two complementary concepts: conversations [9] and atomic actions. A common side effect of partitioning a system into a collection of cooperating objects is the need to maintain consistency between related objects. The publisher-subscriber [3] (or observer [5]) pattern defines a one-to-many dependency between objects so that when one object changes state, all its dependents are notified and updated automatically. The work of Lea [6] collects standard design techniques and new ideas from the research literature on OO concurrency and presents them in a way that is meant to be used and reused. Some of the patterns presented in this work are extensions or applications of common sequential patterns to concurrent programming problems.
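The one-to-many dependency of the publisher-subscriber pattern mentioned above can be sketched as follows. This is a minimal illustration with a hypothetical Publisher class that holds a single integer as its state; the use of CopyOnWriteArrayList is one possible way to let subscribers be added or removed while a notification is in progress, not a requirement of the pattern.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.IntConsumer;

// Publisher (subject): owns the state and the list of dependents.
class Publisher {
    private final List<IntConsumer> subscribers = new CopyOnWriteArrayList<>();
    private volatile int state;

    void subscribe(IntConsumer subscriber) {
        subscribers.add(subscriber);
    }

    void unsubscribe(IntConsumer subscriber) {
        subscribers.remove(subscriber);
    }

    int getState() {
        return state;
    }

    // Changing the state notifies every registered subscriber.
    void setState(int newState) {
        state = newState;
        for (IntConsumer subscriber : subscribers) {
            subscriber.accept(newState);
        }
    }
}
```

Subscribers register a callback once and are then kept consistent with the publisher without the publisher knowing anything about their concrete types.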

 

References

[1] P.A. Bernstein, V. Hadzilacos, and N. Goodman. Concurrency Control and Recovery in Database Systems. Addison-Wesley Publishing Company, 1987.

[2] A.M. Braga, C.M.F. Rubira, and R. Dahab. A System of Patterns to Cryptographic Object-Oriented Software. Technical Report, Institute of Computing, University of Campinas, text under construction.

[3] F. Buschmann, R. Meunier, H. Rohnert, P. Sommerlad, and M. Stal. A System of Patterns. John Wiley & Sons.

[4] F. Daniels, K. Kim, and M.A. Vouk. The Reliable Hybrid Pattern - A Generalized Software Fault Tolerant Design Pattern. PLOP'97, 1997.

[5] E. Gamma, R. Helm, R. Johnson, and J. Vlissides. Design Patterns - Elements of Reusable Object-Oriented Software. Addison-Wesley Publishing Company, 1995.

[6] D. Lea. Concurrent Programming in Java: Design Principles and Patterns. Addison-Wesley Publishing Company.

[7] P.A. Lee and T. Anderson. Fault Tolerance: Principles and Practice. Springer-Verlag, 2nd edition, 1990.

[8] A. Oliva and L.E. Buzato. An Overview of MOLDS: A Meta-Object Library for Distributed Systems. Technical Report 98-15, Institute of Computing, Univ. of Campinas, April 1998.

[9] B. Randell. System Structure for Software Fault Tolerance. IEEE Trans. on Soft. Engineering, SE-1(2), 1975.

[10] A.R. Silva, J. Pereira, and J.A. Marques. Object Recovery Pattern. PLOP'96, 1996.

[11] S. Subramanian and W. Tsai. Backup Pattern: Designing Redundancy in Object-Oriented Software. PLOP'96.

[12] J. Xu, B. Randell, A. Romanovsky, C.M.F. Rubira, R. Stroud, and Z. Wu. Fault Tolerance in Concurrent Object-Oriented Software through Coordinated Error Recovery. FTCS-25, 1995.