
A distributed system is a collection of independent computers that appears to the users of the system as a single coherent system.


System architecture: the machines are autonomous; this means they are computers which, in principle,
could work independently.
The user’s perception: the distributed system is perceived as a single system solving a certain problem
(even though, in reality, we have several computers placed in different locations).
The distributed system has the following characteristics:
 They do not share memory or a clock
 The computers communicate among themselves by exchanging messages over a
communication network
 Each computer has its own memory and operating system

Advantages of Distributed Systems


 Performance: very often a collection of processors can provide higher performance (and a better
price/performance ratio) than a centralized computer.
 Distribution: many applications involve, by their nature, spatially separated machines (banking,
commercial, and automotive systems).
 Reliability (fault tolerance): if some of the machines crash, the system can survive.
 Incremental growth: as requirements on processing power grow, new machines can be added
incrementally.
 Sharing of data/resources: shared data is essential to many applications (banking, computer-
supported cooperative work, reservation systems); other resources can also be shared (e.g., expensive printers).
 Communication: facilitates human-to-human communication.

Disadvantages of Distributed Systems


 Difficulties of developing distributed software: what should operating systems, programming
languages and applications look like?
 Networking problems: several problems are created by the network infrastructure, which have to be
dealt with: loss of messages, overloading, etc.
 Security problems: sharing generates the problem of data security.

Design Issues with Distributed Systems


Design issues that arise specifically from the distributed nature of the application:
 Transparency
 Communication
 Performance
 Scalability
 Heterogeneity
 Flexibility
 Reliability & fault tolerance
 Security

Transparency
Transparency is the concealment from the user and the application programmer of the separation of the
components of a distributed system (i.e., a single image view). Transparency is a strong property that is often
difficult to achieve. There are a number of different forms of transparency including the following:
 Access Transparency: Local and remote resources are accessed using identical operations.
 Location Transparency: Users are unaware of the location of resources.
 Migration Transparency: Resources can migrate without a name change.
 Replication Transparency: Users are unaware of the existence of multiple copies of resources.
 Failure Transparency: Users are unaware of the failure of individual components.
 Concurrency Transparency: Users are unaware of sharing resources with others.
 Performance Transparency: Load variation should not lead to performance degradation. This could
be achieved by automatic reconfiguration in response to load changes; it is difficult to achieve.
 Relocation Transparency: Hide that a resource may be moved to another location while in use (the
others don’t notice).
 Persistence Transparency: Hide whether a (software) resource is in memory or on disk.
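Access transparency, for instance, can be sketched as a common interface that hides whether a resource is local or remote, so callers use identical operations either way. This is a minimal Python illustration; the class and method names are hypothetical, and the "remote" fetch is a stand-in for a real network request:

```python
from abc import ABC, abstractmethod

class Resource(ABC):
    """Uniform interface: callers cannot tell local from remote."""
    @abstractmethod
    def read(self) -> bytes: ...

class LocalResource(Resource):
    def __init__(self, data: bytes):
        self._data = data
    def read(self) -> bytes:
        return self._data

class RemoteResource(Resource):
    def __init__(self, host: str, data: bytes):
        self._host = host   # where the resource really lives
        self._data = data   # stand-in for a network fetch
    def read(self) -> bytes:
        # A real system would issue a network request here;
        # the caller's code is unchanged either way.
        return self._data

def consume(r: Resource) -> bytes:
    # Identical operation regardless of resource location.
    return r.read()
```

Because `consume` depends only on the shared interface, resources can later be moved between machines without changing client code.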

Scalability
Scalability is important in distributed systems, particularly in wide-area distributed systems and those
expected to experience large growth. A system is scalable if it can handle the addition of users and
resources without suffering a noticeable loss of performance or increase in administrative complexity.
Adding users and resources causes a system to grow. This growth has three dimensions:
 Size: A distributed system can grow with regard to the number of users or resources (e.g.,
computers) that it supports. As the number of users grows, the system may become overloaded.
 Geography: A distributed system can grow with regard to geography or the distance between nodes.
An increased distance generally results in greater communication delays and the potential for
communication failure. Another aspect of geographic scale is the clustering of users in a particular
area. While the whole system may have enough resources to handle all users, when they are all
concentrated in a single area, the resources available there may not be sufficient to handle the load.
 Administration: As a distributed system grows, its various components (users, resources, nodes,
networks, etc.) will start to cross administrative domains. This means that the number of
organizations or individuals that exert administrative control over the system will grow. In a system
that scales poorly with regards to administrative growth this can lead to problems of resource usage,
reimbursement, security, etc.

Communication
Components of a distributed system have to communicate in order to interact. This implies support at two
levels:
1. Networking infrastructure (interconnections & network software).
2. Appropriate communication primitives and models, and their implementation:
• Communication primitives:
- send
- receive
- remote procedure call (RPC)
• Communication models:
- Client-server communication: implies a message exchange between two processes: the process
which requests a service and the one which provides it.
- Group multicast: the target of a message is a set of processes, which are members of a given group.
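The send/receive primitives and the client-server request-reply exchange can be sketched in Python with two threads and a pair of queues standing in for the network. This is an illustrative model only, not a real network implementation:

```python
import queue
import threading

# Two mailboxes model the network: one per message direction.
to_server = queue.Queue()
to_client = queue.Queue()

def server():
    request = to_server.get()        # receive: the client's request
    to_client.put(request.upper())   # send: the reply

t = threading.Thread(target=server)
t.start()
to_server.put("time?")               # client: send request
reply = to_client.get()              # client: receive reply
t.join()
print(reply)  # TIME?
```

The same two primitives, send and receive, underlie both interaction patterns; RPC is built on top of them by packaging a request, sending it, and blocking until the reply arrives.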

Performance
Any system should strive for maximum performance, but in the case of distributed systems this is a
particularly interesting challenge, since it directly conflicts with some other desirable properties. In
particular, transparency, security, dependability and scalability can easily be detrimental to performance.
Several factors influence the performance of a distributed system:
• The performance of individual workstations.
• The speed of the communication infrastructure.
• Extent to which reliability (fault tolerance) is provided (replication and preservation of coherence mean
large overheads).
• Flexibility in workload allocation: for example, idle processors (workstations) could be allocated
automatically to a user’s task.

Heterogeneity
Distributed applications are typically heterogeneous:
- Different hardware: mainframes, workstations, PCs, servers, etc.
- Different software: UNIX, MS Windows, IBM OS/2, Real-time OSs, etc.
- Unconventional devices: teller machines, telephone switches, robots, manufacturing systems, etc.;
- Diverse networks and protocols: Ethernet, FDDI, ATM, TCP/IP, Novell Netware, etc.
Flexibility
A flexible distributed system can be configured to provide exactly the services that a user or programmer
needs. A system with this kind of flexibility generally provides a number of key properties.
 Extensibility allows one to add or replace system components in order to extend or modify system
functionality.
 Openness means that a system provides its services according to standard rules regarding
invocation syntax and semantics. Openness allows multiple implementations of standard
components to be produced. This provides choice and flexibility.
 Interoperability ensures that systems implementing the same standards (and possibly even those
that do not) can interoperate.
Reliability and Fault Tolerance
One of the main goals of building distributed systems is the improvement of reliability.
Availability: if machines go down, the system should continue to work with the reduced set of resources.
• There should be a very small number of critical resources; critical resources are resources which have to be
up in order for the distributed system to work.
• Key pieces of hardware and software (critical resources) should be replicated ⇒ if one of them fails, another
one takes over (redundancy).
Data in the system must not be lost, and copies stored redundantly on different servers must be kept
consistent.
• The more copies kept, the better the availability; but keeping them consistent becomes more difficult.
Fault tolerance is a main issue related to reliability: the system has to detect faults and act in a reasonable
way:
• Mask the fault: continue to work, with possibly reduced performance, but without loss of data/information.
• Fail gracefully: react to the fault in a predictable way, possibly stopping functionality for a short period, but
without loss of data/information.

Security
Security of information resources:
1. Confidentiality: protection against disclosure to unauthorized persons.
2. Integrity: protection against alteration and corruption.
3. Availability: keeping the resource accessible.
The appropriate use of resources by different users has to be guaranteed.
Distributed systems should allow communication between programs/users/resources on different
computers.

Examples of Distributed System


 Web servers
 Intranet/ Network of workstations
 XT series of parallel computers by Cray
 Automatic banking (teller machine) system
 Automotive system (a distributed real-time system)
 New Cell processor (PlayStation 3)
 Amoeba

Web servers
The collection of Web servers—or more precisely, servers implementing the HTTP protocol—that jointly
provide the distributed database of hypertext and multimedia documents commonly known as the
World-Wide Web.

Intranet
The computers of a local network that provide a uniform view of a distributed file system, and the collection
of computers on the Internet that implement the Domain Name Service (DNS).

XT series of parallel computers by Cray


These are high-performance machines consisting of a collection of computing nodes linked by a high-
speed, low-latency network. The operating system, Cray Linux Environment (CLE) (also called UNICOS/lc),
presents users with a standard Linux environment upon login, but transparently schedules login sessions
over a number of available login nodes.

Automatic banking (teller machine) system

Automotive system (a distributed real-time system)


Amoeba
Amoeba is a general-purpose distributed operating system. It is designed to take a collection of machines and
make them act together as a single integrated system. Amoeba provides the necessary mechanism for doing
both distributed and parallel applications, but the policy is entirely determined by user-level programs.
Amoeba is a distributed system, in which multiple machines can be connected together. These machines need
not all be of the same kind. The machines can be spread around a building on a LAN. Amoeba uses the high
performance FLIP network protocol for LAN communication. If an Amoeba machine has more than one
network interface it will automatically act as a FLIP router between the various networks and thus connect
the various LANs together. Amoeba is intended for both ‘‘distributed’’ computing (multiple independent users
working on different projects) and ‘‘parallel’’ computing (e.g., one user using 50 CPUs to play chess in
parallel). Amoeba provides the necessary mechanism for doing both distributed and parallel applications.
Amoeba was designed with what is currently termed a microkernel architecture. This means that every
machine in an Amoeba system runs a small, identical piece of software called the kernel. The kernel supports
the basic process, communication, and object primitives. It also handles raw device I/O and memory
management. Everything else is built on top of these fundamentals, usually by user-space server processes.

Machines on which Amoeba Runs


Amoeba currently runs on the following architectures:
 Sun 4c and MicroSPARC SPARCstations
 Intel 386/486/Pentium/Pentium Pro (IBM AT bus, PCI bus)
 68030 VME-bus boards (Force CPU-30)
 Sun 3/60 & Sun 3/50 workstations

Weak Points In Amoeba


 Over 1000 pages of documentation supplied
 Not binary compatible with UNIX
 No virtual memory (for performance reasons)
 Works poorly when there is insufficient memory
 No NFS support
 While fine for experimenting, it is not a totally polished production system

System Models:
1) Architectural Model
2) Fundamental Model

The Architectural Model defines the way in which the components of the system interact with one another and
the way they are mapped onto the underlying network.
These models are basically concerned with the placement of the system's parts and the relationships that
exist between them. The goal of an architectural model is to meet current needs as well as the needs of the future.
a) Software Architecture
b) System Architecture
Software Architecture
It refers to the structuring of software as layers or modules in a single computer and in terms of services
offered and requested between processes located in the same or different computers. These process and
service oriented views can be expressed as service layers.

Application, services
Middleware
Operating system
Computer and network hardware
(the operating system and hardware layers together form the platform)

Platform
The lowest-level hardware and software layers are referred to as the platform for distributed systems and
their applications. These low-level layers provide services to the layers above them. Intel x86/Windows, Intel
x86/Solaris, Intel x86/Linux and PowerPC/Mac OS X are major examples of such hardware/software combinations.

Middleware
It is a layer of software whose purpose is to mask heterogeneity and to provide a convenient programming
model to application programmers. The goal of middleware is to create system-independent interfaces for
distributed applications.
The principal aim of middleware, namely raising the level of abstraction for distributed programming, is
achieved in three ways: (1) communication mechanisms that are more convenient and less error-prone than
basic message passing; (2) independence from OS, network protocol, programming language, etc.; and (3)
standard services (such as a naming service, transaction service, security service, etc.). To make the
integration of these various services easier, and to improve transparency and system independence,
middleware is usually based on a particular paradigm, or model, for describing distribution and
communication.
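Python's standard library ships a small RPC middleware, xmlrpc, which illustrates how middleware raises the abstraction level: the client invokes what looks like a local call, and the library handles marshalling and message passing underneath. A minimal sketch using a loopback server (the `add` function name is an arbitrary choice for the example):

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer

# Server side: register a function under a name clients can invoke.
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
server.register_function(lambda a, b: a + b, "add")
port = server.server_address[1]
threading.Thread(target=server.handle_request).start()  # serve one call

# Client side: the proxy hides the request/reply message exchange.
proxy = ServerProxy(f"http://127.0.0.1:{port}")
result = proxy.add(2, 3)   # looks like a local call; runs on the server
print(result)  # 5
```

Compare this with hand-written send/receive code: the programmer never touches sockets or wire formats, which is precisely the convenience middleware aims to provide.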

Fig: Middleware System

RPC (Sun RPC) and group communication systems such as Isis were amongst the earliest instances of
middleware. Object-oriented middleware products and standards are widely used, such as
 Java RMI (Remote Method Invocation)
 CORBA (Common Object Request Broker Architecture)
 Web services
 Microsoft DCOM (Distributed Component Object Model)

System Architecture
A distributed system is composed of a number of elements, the most important of which are software
components, processing nodes and networks. Some of these elements can be specified as part of a distributed
system’s design, while others are given. Typically when building a distributed system, the software is under
the designer’s control. Depending on the scale of the system, the hardware can be specified within the design
as well, or already exists and has to be taken as-is. The key, however, is that the software components must be
distributed over the hardware components in some way. The software of distributed systems can become
fairly complex—especially in large distributed systems—and its components can spread over many
machines. It is important, therefore, to understand how to organize the system. The software architecture of
distributed systems deals with how software components are organized and how they work together, i.e.,
communicate with each other. Typical software architectures include the layered, object-oriented, data-
centered, and event-based architectures. Once the software components are instantiated and placed on real
machines, the actual system architecture comes into the picture.

Client-Server
The client-server architecture is the most common and widely used model for communication between
processes. In this architecture one process takes on the role of a server, while all other processes take on the
roles of clients. The server process provides a service (e.g., a time service, a database service, a banking
service, etc.) and the clients are customers of that service. A client sends a request to a server, the request is
processed at the server and a reply is returned to the client.
A typical client-server application can be decomposed into three logical parts: the interface part, the
application logic part, and the data part. Implementations of the client-server architecture vary with regards
to how the parts are separated over the client and server roles. A thin client implementation will provide a
minimal user interface layer, and leave everything else to the server. A fat client implementation, on the other
hand, will include all of the user interface and application logic in the client, and only rely on the server to
store and provide access to data. Implementations in between will split up the interface or application logic
parts over the clients and server in different ways.
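The basic request-reply interaction between a client and a server can be sketched over TCP as follows. The message format and the single-request server are illustrative assumptions, not a prescribed protocol:

```python
import socket
import threading

def serve_once():
    """Start a server that handles a single request, then exits."""
    srv = socket.socket()
    srv.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    srv.listen(1)

    def handler():
        conn, _ = srv.accept()
        request = conn.recv(1024)            # request from the client
        conn.sendall(b"reply:" + request)    # process it and reply
        conn.close()
        srv.close()

    threading.Thread(target=handler).start()
    return srv.getsockname()[1]

port = serve_once()
cli = socket.socket()
cli.connect(("127.0.0.1", port))
cli.sendall(b"balance?")         # client sends its request
reply = cli.recv(1024)           # and blocks until the reply arrives
cli.close()
print(reply.decode())  # reply:balance?
```

Whether this counts as a thin or fat client depends only on how much interface and application logic sits on the client side of the socket; the exchange itself is the same.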

Fig: Client-Server Architecture

Vertical Distribution (Multi-Tier)


An extension of the client-server architecture, the vertical distribution, or multi-tier, architecture distributes
the traditional server functionality over multiple servers. A client request is sent to the first server. During
processing of the request this server will request the services of the next server, which will do the same, until
the final server is reached. In this way the various servers become clients of each other. Each server is
responsible for a different step (or tier) in the fulfillment of the original client request.

Fig: Multi-tier Architecture

Horizontal Distribution
While vertical distribution focuses on splitting up a server’s functionality over multiple computers, horizontal
distribution involves replicating a server’s functionality over multiple computers. A typical example, as
shown in Figure 4, is a replicated Web server. In this case each server machine contains a complete copy of all
hosted Web pages and client requests are passed on to the servers in a round robin fashion. The horizontal
distribution architecture is generally used to improve scalability (by reducing the load on individual servers)
and reliability (by providing redundancy).
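The round-robin dispatch described above can be sketched in a few lines (the server names are placeholders for real replica addresses):

```python
import itertools

class RoundRobinBalancer:
    """Passes each incoming request to the replicated servers in turn."""
    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def dispatch(self, request):
        server = next(self._cycle)   # pick the next replica in rotation
        return server, request

balancer = RoundRobinBalancer(["web1", "web2", "web3"])
assigned = [balancer.dispatch(f"GET /page{i}")[0] for i in range(6)]
print(assigned)  # ['web1', 'web2', 'web3', 'web1', 'web2', 'web3']
```

Because every replica holds a complete copy of the data, any of them can serve any request, which is what makes this simple rotation sufficient.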
Fig: Horizontal Distribution

Peer to Peer
The peer-to-peer (P2P) architecture takes the opposite approach and assumes that all processes play the
same role, and are therefore peers of each other. In this architecture, each process acts as both client and
server, both sending out requests and processing incoming requests. In the P2P model all processes provide
the same logical services. Well-known examples of the P2P model are file-sharing applications.
When a node wishes to send a message to an arbitrary other node, in this architecture it must first locate that
node by propagating a request along the links in the overlay network. Once the destination node is found, the
two nodes can typically communicate directly. There are two key types of overlay networks, the distinction
being based on how they are built and maintained. In all cases a node in the network will maintain a list of
neighbors.
In unstructured overlays the structure of the network often resembles a random graph. In order to keep the
network connected as nodes join and leave, all nodes periodically exchange their partial views with
neighbors, creating a new neighbor list for themselves. As long as nodes both push and pull this information
the network tends to stay well connected.
In the case of structured overlays the choice of a node’s neighbors is determined according to a specific
structure. In a distributed hash table, for example, nodes work together to implement a hash table. Each node
is responsible for storing the data associated with a range of identifiers. When joining a network, a node is
assigned an identifier, locates the node responsible for the range containing that identifier, and takes over
part of that identifier space. Each node keeps track of its neighbors in the identifier space.
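The identifier-range idea can be sketched as follows: a key is hashed into the identifier space, and the responsible node is the first one at or after that identifier, wrapping around the ring. This is a toy model; real structured overlays such as Chord additionally maintain routing tables for efficient lookup:

```python
import hashlib

class ToyDHT:
    """Each node owns the identifiers from its predecessor (exclusive)
    up to its own id (inclusive), on a ring of 1024 identifiers."""
    RING = 1024

    def __init__(self, node_ids):
        self.nodes = sorted(node_ids)

    def key_id(self, key: str) -> int:
        digest = hashlib.sha1(key.encode()).hexdigest()
        return int(digest, 16) % self.RING

    def responsible_node(self, key: str) -> int:
        h = self.key_id(key)
        for node in self.nodes:      # first node at or after the key
            if node >= h:
                return node
        return self.nodes[0]         # wrap around the ring

dht = ToyDHT([100, 400, 800])
owner = dht.responsible_node("user:alice")
print(owner in dht.nodes)  # True
```

When a node joins, it takes over part of the identifier range of its successor, so only the keys in that range need to move.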

Fig: Peer-to-Peer Architecture

Fundamental Model
a) Interaction Model
b) Failure Model
c) Security Model
Interaction Models describe how processes coordinate their actions. Processes interact with one another
by passing messages, resulting in communication (information flow) and coordination between processes.
The rate at which each process proceeds and the timing of the transmission of messages cannot be predicted.
The interacting processes perform all the activity in a distributed system. Each process has its own state,
consisting of the set of data that it can access and update. The state belonging to each process is completely
private, i.e. it cannot be accessed or updated by another process. In a distributed system it is hard to set
limits on the time taken for process execution, message delivery or clock drift.
There are two interaction models:
i) Synchronous Model
ii) Asynchronous Model
A Synchronous Model is one in which the following bounds are defined:
 The time to execute each step of a process has known upper and lower bounds
 Each message transmitted over a channel is received within a known bounded time
 Each process has a local clock whose drift rate from real time has a known bound
It is difficult to arrive at realistic values for these bounds on execution time, message delay and clock drift in
a distributed system, and to provide guarantees over the chosen values. Unless these values can be
guaranteed, any claim of reliability will be in question. However, in a synchronous model it is possible to use timeouts.
An Asynchronous Model is one in which there are no bounds on:
 Process execution speed
 Message transmission delay
 Clock drift rates
The asynchronous model allows no assumption about the time intervals involved in any execution. Actual
distributed systems are very often asynchronous because of the need for processes to share the processors
and for communication channels to share the network. If the execution speed of a process is unknown
because it shares a processor with many others, the system's behavior is effectively asynchronous.
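The role of timeouts can be sketched with a bounded receive: under synchronous-model assumptions a timeout is meaningful, whereas in an asynchronous system the same timeout cannot distinguish a slow process from a crashed one. Illustrative Python, with an arbitrary 0.05 s "processing delay" standing in for unknown server load:

```python
import queue
import threading
import time

channel = queue.Queue()

def server():
    time.sleep(0.05)         # processing delay, unknown to the client
    channel.put("reply")

threading.Thread(target=server).start()

try:
    # Synchronous-model assumption: every reply arrives within the bound.
    reply = channel.get(timeout=1.0)
except queue.Empty:
    # Without such a bound, a timeout cannot tell "slow" from "crashed";
    # here we simply give up and treat it as a failure.
    reply = None
print(reply)  # reply
```

Choosing the bound is the hard part: too short and slow-but-correct servers are declared failed, too long and real failures go undetected for a long time.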

The Failure Model defines the ways in which failure may occur, in order to provide an understanding of the
effects of failures. In a distributed system, both processes and communication channels may fail; these
failures fall into different categories:
 Omission Failure refers to cases where a process or communication channel fails to perform an
action that it is supposed to perform.
o Process Omission Failure: a process has crashed, i.e. it has halted and will not execute
any further. Such a process is called fail-stop if other processes can detect with certainty that
the process has crashed.
o Communication Omission Failure: the communication channel produces an omission
failure if it is not able to transport a message from the sender to the receiver’s buffer. This
is generally caused by lack of space in the buffer.
 Arbitrary Failures or Byzantine Failures: this term describes the worst possible failure semantics, in
which any type of error may occur. An arbitrary failure of a process is one in which it arbitrarily omits
intended processing steps or takes unintended processing steps. These failures cannot be detected
simply by checking whether the process responds to invocations. Arbitrary failures of communication
channels are rare, as the communication software is usually able to detect and reject faulty messages.
 Timing Failures: timing failures are applicable in synchronous distributed systems, where time
limits are set on process execution time, message delivery time and clock drift rate. In an
asynchronous distributed system, a slow response from a server cannot be counted as a timing failure.
Timing failures are therefore normally limited to synchronous systems, where the bounds are known
and it can be assessed whether a failure has occurred or not.
 Masking Failures: a service masks a failure either by hiding it altogether or by converting it into a
more acceptable type of failure.
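Masking an omission failure can be sketched by retransmission: the sender retries until the message gets through or a retry bound is exceeded. The channel below is a deterministic toy that drops the first two transmissions; a real channel would drop messages unpredictably:

```python
class FlakyChannel:
    """Omission-failure channel: silently drops the first two sends."""
    def __init__(self, drops=2):
        self.drops = drops

    def send(self, message):
        if self.drops > 0:
            self.drops -= 1
            return None              # message lost in transit
        return message

def reliable_send(channel, message, retries=5):
    """Mask omission failures by retransmitting until delivery."""
    for attempt in range(1, retries + 1):
        delivered = channel.send(message)
        if delivered is not None:
            return delivered, attempt
    raise TimeoutError("no delivery within the retry bound")

msg, attempts = reliable_send(FlakyChannel(), "transfer 10")
print(msg, attempts)  # transfer 10 3
```

This converts an omission failure into either successful (if slower) delivery or, past the retry bound, a detectable timing failure, which is exactly the "more acceptable type of failure" idea above.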

Security Model The security of a distributed system can be achieved by securing the processes and the
channels used for their interactions, and by protecting the objects that they encapsulate against unauthorized
access.
The threats from a potential enemy are threats to processes, threats to communication channels and denial of
service. The system can be secured against these threats by cryptographic techniques.
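As one concrete instance of such cryptographic protection, a message authentication code (MAC) lets the receiver detect alteration of a message in transit, using Python's standard hmac module. The shared key here is a placeholder; real systems need proper key distribution:

```python
import hashlib
import hmac

SECRET = b"shared-key"   # placeholder key shared by sender and receiver

def sign(message: bytes) -> bytes:
    """Attach a MAC so tampering on the channel can be detected."""
    return hmac.new(SECRET, message, hashlib.sha256).digest()

def verify(message: bytes, tag: bytes) -> bool:
    # compare_digest avoids timing side channels during comparison.
    return hmac.compare_digest(sign(message), tag)

msg = b"pay alice 10"
tag = sign(msg)
print(verify(msg, tag))              # True
print(verify(b"pay eve 9999", tag))  # False
```

A MAC addresses integrity only; confidentiality additionally requires encryption, and availability requires defenses against denial of service.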

Distributed operating system (DOS)


A distributed operating system (DOS) is an operating system that is built, from the ground up, to provide
distributed services. As such, a DOS integrates key distributed services into its architecture. These services
may include distributed shared memory, assignment of tasks to processors, masking of failures, distributed
storage, interprocess communication, transparent sharing of resources, distributed resource management,
etc. A key property of a distributed operating system is that it strives for a very high level of transparency,
ideally providing a single system image. That is, with an ideal DOS users would not be aware that they are, in
fact, working on a distributed system. Distributed operating systems generally assume a homogeneous
multicomputer. They are also generally more suited to LAN environments than to wide-area network
environments. In the earlier days of distributed systems research, distributed operating systems were the
main topic of interest. Most research focused on ways of integrating distributed services into the operating
system, or on ways of distributing traditional operating system services. Currently, however, the emphasis
has shifted more toward middleware systems. The main reason for this is that middleware is more flexible
(i.e., it does not require that users install and run a particular operating system), and is more suitable for
heterogeneous and wide-area multicomputers.

A Distributed Operating System


Network Operating Systems
In contrast to distributed operating systems, network operating systems do not assume that the underlying
hardware is homogeneous or that it should be managed as if it were a single system. Instead, they are
generally constructed from a collection of uniprocessor systems, each with its own operating system.

A Network Operating System
