A Redundant Virtual Service Redirector for Computer Networks
Yinong Chen
Department of Computer Science
University of the Witwatersrand, Johannesburg
The reliability and availability
of network servers, including servers for internet data caching, web servers,
DNS servers and firewalls, are becoming a concern for service providers
and their clients. Many service providers use redundant servers as
backup spares in their systems to address this problem. Current implementations
of redundant servers are:
- User Configuration. The redundant servers
with different IP addresses are all connected to the network. The users
have to put the IP addresses of the primary and backup servers in their
client software configuration. When the primary server is not available,
the client software looks for a backup server's IP address.
The drawback of this implementation is that the service provider has to
rely on users to set up their configuration correctly.
- Manual Switch. The primary and backup servers
have the same IP address. When the primary server is down, a backup spare
can be manually switched on to replace the primary one.
- Hard service redirector. The IP address
of the server is in fact for the redirector that redirects each client
request to one of the servers available, as shown in fig.1.
Currently, service redirectors
are implemented in hardware, and one costs about 10 times as much
as a typical server, which is not acceptable to many service providers. Another
drawback is that there is still a single point of failure: the failure of
the redirector makes the entire service unavailable.
The aim of this research
is to explore, based on our previous research results [Chen 98],
an alternative way to implement service redirection that
overcomes the problems of current techniques. The objectives
of the research are to achieve low cost and high availability and reliability
through
- automatic fault detection and system reconfiguration
among the redundant servers,
- elimination of the single point of failure in the system,
or a reduction of its probability,
- flexible configuration to cope with different
kinds of applications and different levels of dependability requirements.
The proposed approach achieves
the aim and objectives of the project by means of software-implemented
fault-tolerant protocols in distributed systems [Chen 93].
As shown in fig.2, all redundant servers are given the same IP
address, so every request is received by all servers. However, a request is
handled by only a subset of the available servers, in the form of task
replication. A hash algorithm in each server decides which servers
handle an incoming request. A comparison protocol running among the
servers checks whether the outputs of the replicated servers agree. The
agreements and disagreements are logged in a syndrome table for fault diagnosis.
The syndrome table, together with the results of a heartbeat protocol, can
be used by a reconfiguration protocol to localise faults. Localised faulty
servers can then be excluded from the collaborative operation.
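The replica selection and syndrome logging described above can be illustrated with a minimal sketch. It is not the project's implementation: the text does not name the hash algorithm, so rendezvous (highest-random-weight) hashing is assumed here, and the function names, the server labels and the rule "suspect a server that disagrees with at least two peers" are all hypothetical choices made for illustration.

```python
import hashlib
from itertools import combinations


def select_replicas(request_id, servers, k=2):
    """Pick k servers for a request (assumed scheme: rendezvous hashing).

    Every server ranks the same candidates with the same hash, so each one
    can decide locally, without coordination, whether it must handle the
    request.
    """
    ranked = sorted(
        servers,
        key=lambda s: hashlib.sha256(f"{request_id}|{s}".encode()).hexdigest(),
    )
    return ranked[:k]


def update_syndrome(syndrome, outputs):
    """Log pairwise agreement (0) or disagreement (1) of replica outputs."""
    for a, b in combinations(sorted(outputs), 2):
        syndrome.setdefault((a, b), []).append(0 if outputs[a] == outputs[b] else 1)


def suspects(syndrome, min_peers=2):
    """Return servers whose latest comparison disagreed with >= min_peers peers."""
    disagreeing = {}
    for (a, b), results in syndrome.items():
        if results and results[-1] == 1:
            disagreeing.setdefault(a, set()).add(b)
            disagreeing.setdefault(b, set()).add(a)
    return sorted(s for s, peers in disagreeing.items() if len(peers) >= min_peers)


if __name__ == "__main__":
    servers = ["s1", "s2", "s3"]
    print(select_replicas("request-42", servers, k=2))
    table = {}
    # Hypothetical firewall decisions: s3 deviates from the other two replicas.
    update_syndrome(table, {"s1": "allow", "s2": "allow", "s3": "deny"})
    print(suspects(table))
```

With three replicas and one deviating output, only the deviating server disagrees with two peers, which is why the sketch needs at least three participants to single out a faulty one.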
The set of protocols that
distribute requests to a subset of fault-free servers and that handle faults
has the same function as a physical redirector and is called a virtual
redirector. The advantages of the virtual redirector are:
- It has a generic architecture that can be used for various kinds of service
redirection, e.g., for network data caching, web servers, DNS servers and
firewalls.
- It has a flexible configuration that can manage two or more
servers.
- Automatic fault detection and system reconfiguration are implemented
by a comparison protocol, a heartbeat protocol, and a reconfiguration protocol,
which reduce the delay of fault detection and fault removal.
- There is no single point of failure in a system with more than two
servers.
- The entire redundant system is not significantly more expensive than
the servers themselves.
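How a heartbeat protocol and a reconfiguration step might work together can be sketched as follows. This is an assumption-laden illustration, not the protocols used in the project: the class name `HeartbeatMonitor`, the timeout value and the explicit `now` timestamps (passed in so that the example is deterministic) are all hypothetical.

```python
class HeartbeatMonitor:
    """Track the last heartbeat of each server and exclude silent ones."""

    def __init__(self, servers, timeout):
        self.timeout = timeout                    # maximum tolerated silence, in seconds
        self.last_seen = {s: 0.0 for s in servers}
        self.active = set(servers)                # servers still taking part

    def beat(self, server, now):
        """Record a heartbeat received from `server` at time `now`."""
        if server in self.active:
            self.last_seen[server] = now

    def reconfigure(self, now, suspected=()):
        """Exclude servers that are silent too long or suspected by diagnosis."""
        silent = {s for s in self.active if now - self.last_seen[s] > self.timeout}
        excluded = silent | (set(suspected) & self.active)
        self.active -= excluded
        return sorted(excluded)


if __name__ == "__main__":
    monitor = HeartbeatMonitor(["s1", "s2", "s3"], timeout=2.0)
    monitor.beat("s3", now=1.0)   # s3 falls silent after this heartbeat
    monitor.beat("s1", now=5.0)
    monitor.beat("s2", now=5.0)
    print(monitor.reconfigure(now=6.0))
    print(sorted(monitor.active))
```

The `suspected` argument is where the outcome of a syndrome-table diagnosis could be fed in, so that servers which still send heartbeats but produce wrong outputs are excluded as well.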
A fault-tolerant
firewall for internet-intranet communication has been chosen for experimentation
[Chen 98]. The system in fig.2 is implemented on three
SGI stations connected by an ethernet, as shown in fig.3. Two extra stations
are used. One simulates an internet from which requests are sent to
the firewalls, and the other simulates an intranet that allows only controlled
access from the internet.
Fig.4 gives more details
of the software structure of the entire system. The fault-tolerant firewall
is implemented on a rather generic hierarchy of layers. The firewall programs
are at the application layer. Below it are the fault-tolerant protocols, which
handle various fault conditions to ensure that the application layer above works
correctly. The fault injector is used only in the testing stage, to ensure
that the system can be tested under all fault conditions that it should
tolerate [Chen 93]. Since an ethernet
does not ensure real-time communication, a token bus has been implemented
in the system [Mdakane 97].
A service redirector does not
need hard real-time properties. However, soft real-time behaviour, which allows
occasional missed deadlines but keeps real-time responses in most cases, is desirable.
Furthermore, the synchronisation among replicated tasks requires all results
to be available within a time constraint, so that a slow response can be
distinguished from one that never arrives. These requirements motivated us
to put a real-time scheduling level and a real-time communication level in the system
to ensure that the deadlines of real-time tasks are met. The other three layers, i.e., the
UNIX operating system, the ethernet communication, and the hardware system,
are off-the-shelf components.
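The role of the time constraint in synchronising replicated tasks can be illustrated with a small sketch. It only approximates the idea with ordinary threads and a wait timeout from the Python standard library; it does not provide the real-time guarantees of the scheduling and communication levels described above, and the function name and the deadline values are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


def collect_within_deadline(tasks, deadline_s):
    """Run one callable per replica and wait at most `deadline_s` seconds.

    Returns (results, missed): the outputs that arrived in time, and the
    names of the replicas whose output did not. A replica in `missed` is
    treated as not having answered, whether it is slow or has crashed.
    """
    pool = ThreadPoolExecutor(max_workers=len(tasks))
    futures = {pool.submit(fn): name for name, fn in tasks.items()}
    done, not_done = wait(futures, timeout=deadline_s)
    pool.shutdown(wait=False)  # do not block on replicas that missed the deadline
    results = {futures[f]: f.result() for f in done}
    missed = sorted(futures[f] for f in not_done)
    return results, missed


def slow_replica():
    """Hypothetical replica that answers only after the deadline has passed."""
    time.sleep(0.5)
    return "allow"


if __name__ == "__main__":
    tasks = {"s1": lambda: "allow", "s2": slow_replica}
    results, missed = collect_within_deadline(tasks, deadline_s=0.1)
    print(results, missed)
```

Once the deadline has passed, the comparison can proceed with the results that arrived, and the missed replicas can be reported to the fault diagnosis instead of blocking the others indefinitely.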
The research project investigating
dependable real-time distributed systems has been sponsored by the South
African Foundation for Research Development (FRD) since 1996. The experimental
system was first implemented in 1995, improved in 1996 and tested in 1997
with the help of Honours and Masters students in the Department of Computer
Science at the University of the Witwatersrand. The system is used as the laboratory
environment for the course Reliability of Computer Systems.
Applying the techniques
developed in the project to construct dependable virtual network service
redirectors is motivated by our industry partner, where such systems are
required not only for firewalls but also for data caching, web servers and
DNS servers. Further research and implementation work is under way
to meet their requirements.
References
[Chen 98] Y. Chen, On Development of a Dependable Local Area Network, IFIP
International Workshop on Dependable Computing and its Applications,
Johannesburg, January 1998, pp. 83-96.
[Chen 93] Y. Chen, Testing and Evaluating Fault-Tolerant Protocols by Deterministic
Fault Injection, VDI Series 10, No. 260, VDI Verlag, Duesseldorf, 1993.
[Mdakane 97] S. Mdakane, Token bus protocol on ethernet, Honours Research Report,
Department of Computer Science, University of the Witwatersrand, 1997.