Distributed Systems and the Client/Server Model (from May 1994)

A distributed system consists of multiple processors that share resources and communicate through a network. Each processor is autonomous and has its own memory and clock. This type of processing is called loosely-coupled, as opposed to tightly-coupled processing such as parallel processing or multiprocessing where processors share the same memory.

Distributed systems include the ability to share resources across a large geographic area and to easily upgrade the system by simply adding additional processors. This can be far more flexible than a centralized processing system. By keeping files replicated in different locations, access times may be reduced and files will be replicated in the event that files in one location are damaged by hardware failure. By including these built-in redundancies, speed can be optimized, and the system may be able to survive if most of the processors are undamaged.

Also, performance of the system may exceed centralized systems when computations are performed on several processors concurrently. Sharing the load across multiple processors can reduce the amount of time that processors will remain idle.

However, distributed systems are generally not as secure as centralized systems. Because the data is transmitted across a network, distributed systems must rely on encryption techniques to keep data secure. Also, the dependence on networks may be a problem if communications are interrupted or traffic is slow.

In the 1970s, time-sharing systems were used to run programs on a centralized computer from remote terminals. To enable this system to respond quickly to a large number of users, multiprocessing was used. With the development of microcomputers and computer networks, operating systems (such as UNIX) were developed that used distributed processing to share resources effectively. Shared access to files was one of the first features implemented in such systems.

These systems were developed extensively by Xerox from 1971 to 1980. The Xerox research team experimented with single-user computers to allow instantaneous interactive graphics and high performance without the delays caused by time-sharing systems. The development of the file server in this research had a strong influence on future developments.

Operating systems used for distributed systems require the ability to coordinate multiple tasks and share resources by utilizing communication between processors. Because each processor has its own memory and clock, protocols must be used that either elect a coordinator process or make decisions based on consensus. These decisions will determine task synchronization to ensure that computations or resource requests occur in the right order or with a given priority.

Because the processors are autonomous, programs that run on multiple processors must give special consideration to the order that events are performed if those events cannot be performed concurrently. One solution to this problem is to use message passing between processes using a 'Happened-Before' relation to signal other processors that an event has completed and the program can continue. This relation requires one specific event to happen before another specific event. This technique is known as partial ordering of events. Another similar approach is to keep an array of counter values maintained by message passing for synchronization. This technique of generating 'logical clocks' can keep events partially or fully ordered.

To share resources, a distributed system needs to allocate the resources to a process and include a locking mechanism to ensure that other users (processes) must wait for the resource to become available. Control lists can be set up to enforce security by providing authentication to verify right of access to the resource. The lists can then be accessed by the coordinator process, or some other consensus method, before access to the resource will be granted.

A situation called deadlock occurs when two processes are continually waiting for resources that are locked by the other process. A deadlock-detection algorithm can be used to detect this problem, however, one of the processes may be denied access to the requested resource and will release its resources before aborting. The coordinator process can build a resource-allocation graph that can keep track of resource requests. A coordinator process can then detect when deadlock occurs by checking the graph for cycles. The coordinator can then recover from the deadlock by terminating some of the resource allocations until the cycle is eliminated.

Consensus is used between the processors to synchronize clocks and to perform election algorithms to elect a coordinator process. Elections are sometimes done by all processors attempting to generate the highest (or lowest) number using a given algorithm. Eventually, one of the processors must win, and will be elected coordinator. As a result of the election, all processors must agree on exactly one coordinator process.

Another application of consensus involves distributed database management systems. If a transaction involves data residing on several processors, all of the processors involved must agree whether to commit changes to the database, or to abort. In this case, the abort option has a higher priority because, if any processor chooses to abort the transaction, then all of the processors must abort also.

By achieving consensus, processors are able to make agreements and tolerate certain types of processor failures. They can even continue wait-free, without being held up by other processors that have crashed, as long as consensus is possible. Using a Byzantine agreement, consensus can still be reached even if some of the processors are maliciously attempting to confuse the consensus process.

In the client/server model, certain processors, called servers, are designed to perform certain tasks at the request of other processors, called clients. These tasks may include: file storage, printing, electronic mail, or database access. This enables the resources in a distributed system to be concentrated at the servers. File servers can be given a large disk capacity, printers connected to the print servers, and direct access storage devices (DASDs) can be connected to the database servers. This model will ensure that valuable resources are available to multiple client processes.

A file server is a repository for data. The data can be accessed in a distributed system by simple file operations performed by client processor in the same way that local files are accessed. The underlying file service will forward requests to read files to the file server through the network, where the file will be returned to the client. Additionally, requests can be made to write, rename, or delete files. The fact that the files are remote should remain transparent to the client process.

Generally, a file server will also maintain a directory service, or a listing of all the files residing on the server. The directory service will also contain file attributes for access control, as well as storage of a user ID and timestamp when a file is created, updated, or deleted. Client processes can scan the directory listing to search for a desired file or to get attribute information about a file.

Most file servers also use a cache to store a copy of recently-accessed disk blocks in memory. File access performance can be significantly improved using this method if certain files are accessed repeatedly. When the file server cache memory is full, the least-recently used disk blocks in the cache will be released so the memory can be utilized effectively.

File servers can provide centralized access control in distributed systems with the ability to grant or revoke access for a list of users to particular files. The access can be set to various levels, such as read, write, delete, etc. Also, files can be archived in one central location, improving recoverability. Because file servers can be accessed by multiple users at the same time, a transaction service needs to be implemented to provide concurrency control and transaction ability to the client process.

Distributed database management systems allow databases to be kept on separate processors which may be geographically separated. This is different from a distributed file system because the database will act as one database, not as separate files in separate locations. These locations, called servers, are able to communicate through a network, and data can be kept closest to its point-of-use. When a client process accesses the database, it only needs to directly access one server.

The database will generally consist of tables of data each organized into fields (columns) and records (rows). To retrieve data in a distributed relational database, a request is made by the client process for a collection of fields from a specific set of records. The data can then be returned to the client process. If changes are made to the tables, a transaction will be created by the client and sent to the server. The transaction may include commands that delete, update, or insert records in the tables. This transaction is then stored by the server as a list of intentions, until changes are committed. Before committing the transaction, if the server finds that any step in the transaction cannot be performed, then the transaction will abort and any intended changes will be rolled back (cancelled).

When a request for data or transactions, expressed in SQL (structured query language), is passed from the client to a server, the request can then be performed on data that resides on any server. The server that holds the data can then perform the SQL immediately and send the results back to the client. This eliminates the need for the client to directly access more than one server.

Distributed database systems differ from file services because entire files do not need to be manipulated by the client process. Only the needed data fields will be transmitted across the network, not entire records or files. Also, while a lengthy SQL transaction is taking place on the server, the client processor is free to perform other tasks, such as updating graphic displays and running background applications.

References:

A. Silberschatz, J. Peterson and P.Galvin 1991
Operating System Concepts pp.437-508

George F. Coulouris and Jean Dollimore 1988
Distributed Systems - Concepts and Design pp.1-26

Joel M. Crichlow 1988
Distributed and Parallel Computing pp.2-11, 112-129

John Turek and Dennis Shasha
'The Many Faces of Consensus in Distributed Systems'
IEEE Computer Mag. Vol. 25 No. 6

Colin Fidge
'Logical Time in Distributed Computing Systems'
IEEE Computer Mag. Vol. 24. No. 8

M.Tamer Ozsu and Patrick Valduriez
'Distributed Database Systems: Where Are We Now?'
IEEE Computer Mag. Vol. 24. No. 8


Return to Dan's Home Page
email
Last Update: September 3, 2003