
Challenges for a Distributed System

Designing a distributed system is neither easy nor straightforward. A number of challenges need to be
overcome in order to get a well-designed system. The major challenges in distributed systems are listed below:

1. Heterogeneity:

The Internet enables users to access services and run applications over a heterogeneous collection of computers
and networks. Heterogeneity (that is, variety and difference) applies to all of the following:

o Hardware devices: computers, tablets, mobile phones, embedded devices, etc.
o Operating systems: MS Windows, Linux, Mac, Unix, etc.
o Networks: local networks, the Internet, wireless networks, satellite links, etc.
o Programming languages: Java, C/C++, Python, PHP, etc.
o Different roles of software developers, designers, system managers
Different programming languages use different representations for characters and data structures such as arrays
and records. These differences must be addressed if programs written in different languages are to be able to
communicate with one another. Programs written by different developers cannot communicate with one another
unless they use common standards, for example, for network communication and the representation of primitive
data items and data structures in messages. For this to happen, standards need to be agreed and adopted – as have
the Internet protocols.
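
The need for an agreed representation of primitive data items can be sketched in a few lines of Python. The message layout below (a 32-bit item id followed by a 64-bit price) and the names WIRE_FORMAT, encode and decode are assumptions made for illustration only; the point is that both sides must agree on one byte order, here the network byte order used by the Internet protocols.

```python
import struct

# Hypothetical message layout (an assumption, not from the notes):
# a 32-bit unsigned item id followed by a 64-bit floating-point price.
# "!" selects network (big-endian) byte order with no padding.
WIRE_FORMAT = "!Id"

def encode(item_id, price):
    """Turn the values into bytes that any receiver can interpret."""
    return struct.pack(WIRE_FORMAT, item_id, price)

def decode(message):
    """Recover the values on the receiving side, whatever its hardware."""
    return struct.unpack(WIRE_FORMAT, message)

# The same integer has a different byte layout on different hardware.
little_endian = struct.pack("<I", 1)
big_endian = struct.pack(">I", 1)

# Reading big-endian bytes as if they were little-endian gives a wrong value.
misread = struct.unpack("<I", big_endian)[0]

round_trip = decode(encode(7, 12.5))
```

Without the agreed format, the receiver in this sketch would read the integer 1 as a completely different number, which is the kind of difference a common standard removes.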
Middleware: The term middleware applies to a software layer that provides a programming abstraction as well
as masking the heterogeneity of the underlying networks, hardware, operating systems and programming
languages. Most middleware is implemented over the Internet protocols, which themselves mask the differences
of the underlying networks, but all middleware deals with the differences in operating systems and hardware.
Heterogeneity and mobile code: The term mobile code is used to refer to program code that can be transferred
from one computer to another and run at the destination – Java applets are an example. Code suitable for running
on one computer is not necessarily suitable for running on another because executable programs are normally
specific both to the instruction set and to the host operating system.
2. Transparency:

Transparency is defined as the concealment from the user and the application programmer of the separation of
components in a distributed system, so that the system is perceived as a whole rather than as a collection of
independent components. In other words, distributed systems designers must hide the complexity of the systems
as much as they can. The main forms of transparency in distributed systems are:
o Access: hide differences in data representation and how a resource is accessed
o Location: hide where a resource is located
o Migration: hide that a resource may move to another location
o Relocation: hide that a resource may be moved to another location while in use
o Replication: hide that a resource may be copied in several places
o Concurrency: hide that a resource may be shared by several competing users
o Failure: hide the failure and recovery of a resource
o Persistence: hide whether a (software) resource is in memory or on disk
3. Openness

The openness of a computer system is the characteristic that determines whether the system can be extended and
reimplemented in various ways. The openness of distributed systems is determined primarily by the degree to
which new resource-sharing services can be added and be made available for use by a variety of client programs.
If the well-defined interfaces for a system are published, it is easier for developers to add new features or replace
sub-systems in the future. Example: Twitter and Facebook publish APIs that allow developers to build their own
software that interacts with these services.

4. Concurrency

Both services and applications provide resources that can be shared by clients in a distributed system. There is
therefore a possibility that several clients will attempt to access a shared resource at the same time. For example,
a data structure that records bids for an auction may be accessed very frequently when it gets close to the deadline
time. For an object to be safe in a concurrent environment, its operations must be synchronized in such a way that
its data remains consistent. This can be achieved by standard techniques such as semaphores, which are used in
most operating systems.
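
The auction example can be sketched as follows. This is a minimal illustration, assuming a hypothetical Auction class and client threads that stand in for remote clients; a binary semaphore guards the shared record of bids so that its data stays consistent (in everyday Python a threading.Lock would be the more usual choice).

```python
import threading

class Auction:
    """Hypothetical shared object that records bids for one item."""

    def __init__(self):
        self.highest = 0
        self.bid_count = 0
        # Binary semaphore: at most one thread inside the critical section.
        self._guard = threading.Semaphore(1)

    def place_bid(self, amount):
        # Acquire before touching shared data, release on leaving the block.
        with self._guard:
            self.bid_count += 1
            if amount > self.highest:
                self.highest = amount

def run_clients(auction, n_clients=8, bids_per_client=1000):
    """Start several clients that all bid on the same auction at once."""
    def client(client_id):
        for i in range(bids_per_client):
            auction.place_bid(client_id * bids_per_client + i)

    threads = [threading.Thread(target=client, args=(c,)) for c in range(n_clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

auction = Auction()
run_clients(auction)
```

Because every update happens while the semaphore is held, no bid is lost: the counter ends at 8000 and the highest bid is 7999 regardless of how the threads interleave.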

5. Security

Many of the information resources that are made available and maintained in distributed systems have a high
intrinsic value to their users. Their security is therefore of considerable importance. Security for information
resources has three components:
o Confidentiality (protection against disclosure to unauthorized individuals)
o Integrity (protection against alteration or corruption)
o Availability for the authorized (protection against interference with the means to access the resources)
6. Scalability

Distributed systems must remain scalable as the number of users increases.


A system is said to be scalable if it can handle the addition of users and resources without suffering a noticeable
loss of performance or increase in administrative complexity.

Scalability has three dimensions:

o Size: the number of users and resources to be processed. The associated problem is overloading.
o Geography: the distance between users and resources. The associated problem is communication reliability.
o Administration: as a distributed system grows, more and more of its parts need to be controlled. The
associated problem is administrative mess.
7. Failure Handling

Computer systems sometimes fail. When faults occur in hardware or software, programs may produce incorrect
results or may stop before they have completed the intended computation. Failures in a distributed system are
partial: some components fail while others continue to function, which makes the handling of failures particularly
difficult.
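
One common way to tolerate a transient fault, though by no means the only one, is to detect the failure and retry the operation a bounded number of times. The sketch below assumes a hypothetical ServiceUnavailable error and a stand-in for a remote call that fails a fixed number of times before succeeding; none of these names come from the notes.

```python
class ServiceUnavailable(Exception):
    """Hypothetical error raised when a remote component does not answer."""

def call_with_retries(operation, attempts=3):
    """Run operation(); on failure try again, up to `attempts` times in total."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return operation()
        except ServiceUnavailable as err:
            last_error = err   # remember the fault and try again
    raise last_error           # every attempt failed: report it to the caller

def make_flaky(failures_before_success):
    """Build a stand-in for a remote call that fails a fixed number of times."""
    state = {"calls": 0}

    def operation():
        state["calls"] += 1
        if state["calls"] <= failures_before_success:
            raise ServiceUnavailable("no reply")
        return "result"

    return operation

recovered = call_with_retries(make_flaky(2), attempts=3)
```

Retrying masks faults that go away on their own; a fault that persists beyond the allowed attempts is still reported to the caller, which then has to decide how to recover.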

Limitations of the Distributed System


The limitations of distributed systems affect both their design and their implementation. There are two main
limitations:
1. Absence of a Global Clock
2. Absence of Shared Memory
These two limitations are explained below.
1. Absence of a Global Clock:
In a distributed system there are many computers, and each one has its own clock. Each clock runs at a slightly
different rate or granularity, so the clocks are asynchronous. The clocks are regulated at the start to keep them
consistent, but after only one local clock cycle they are already out of synchronization, and no clock has the
exact time.
Time must be known to a certain precision because it is used for the following in a distributed system:
o Temporal ordering of events
o Collecting up-to-date information on the state of the integrated system
o Scheduling of processes
There are limits on the precision with which processes in a distributed system can synchronize their clocks,
because message passing is asynchronous. Every clock in a distributed system is synchronized with a more
reliable clock, but transmission and execution delays cause the clocks to differ. The absence of a global clock
makes it more difficult to design and debug algorithms for distributed systems.
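
The effect of clocks running at different rates can be illustrated with a small simulation. The DriftingClock class and the drift values (+50 and -30 parts per million) are assumptions chosen for illustration, not measurements; the sketch only shows that two clocks that agree at one instant move apart again as soon as time passes.

```python
class DriftingClock:
    """Hypothetical local clock whose rate differs slightly from real time."""

    def __init__(self, drift_ppm):
        # Drift in parts per million: +50 means the clock gains
        # 50 microseconds for every real second.
        self.rate = 1.0 + drift_ppm / 1_000_000
        self.offset = 0.0

    def read(self, real_time):
        """Local reading at the given real time (seconds since start)."""
        return real_time * self.rate + self.offset

    def set_to(self, real_time, value):
        """Adjust the clock so that it reads `value` at `real_time`."""
        self.offset = value - real_time * self.rate

def skew(a, b, real_time):
    """Difference between two local clocks at the same real instant."""
    return abs(a.read(real_time) - b.read(real_time))

fast = DriftingClock(drift_ppm=50)
slow = DriftingClock(drift_ppm=-30)

skew_at_start = skew(fast, slow, 0.0)          # both start consistent
skew_after_day = skew(fast, slow, 86_400.0)    # one day later

# Synchronize both clocks with a reference at the end of the day.
fast.set_to(86_400.0, 86_400.0)
slow.set_to(86_400.0, 86_400.0)
skew_after_resync = skew(fast, slow, 86_400.0)
skew_hour_later = skew(fast, slow, 90_000.0)   # drift resumes immediately
```

With these assumed rates the clocks differ by about 6.9 seconds after one day, agree again at the moment of synchronization, and are roughly 0.29 seconds apart an hour later.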
2. Absence of Shared Memory:
Distributed systems have no physically shared memory; each computer in the distributed system has its own
physical memory. Because the computers do not share a common memory, it is impossible for any one
computer to know the global state of the whole distributed system. A process in a distributed system obtains
a coherent view of the system, but in reality that view is only a partial view of the system.
Because a distributed system has no global state, it is challenging to recognize any global property of the
system. The global state of a distributed system is divided among many computers into smaller entities.
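
A small sketch shows why a view assembled from local states is only partial. The two nodes, their balances and the in-transit transfer are hypothetical; the observer reads each node's private memory while a message is still on its way, so the total it computes matches neither the state before the transfer nor the state after it.

```python
class Node:
    """Hypothetical computer holding one account balance in its private memory."""

    def __init__(self, name, balance):
        self.name = name
        self.balance = balance

a = Node("A", 100)
b = Node("B", 100)
channel = []                      # messages sent but not yet delivered

# A transfers 30 to B: A's memory changes now, B's only on delivery.
a.balance -= 30
channel.append(("B", 30))

# An observer asks each node for its local state while the message is in transit.
observed_total = a.balance + b.balance

# The message arrives and B updates its own memory.
_, amount = channel.pop()
b.balance += amount
total_after_delivery = a.balance + b.balance
```

The observer sees a total of 170 although the system as a whole always holds 200; the missing 30 exists only in the message, which no single node's memory contains.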
Middleware organization
This section looks at the organization of middleware itself, that is, independent of the overall organization of a
distributed system or application. There are two important types of design patterns that are often applied to the
organization of middleware: wrappers and interceptors. Each targets different problems, yet addresses the same goal
for middleware: achieving openness.

Wrappers

A wrapper (or adapter) is a special component that offers an interface acceptable to a client application, of which the
functions are transformed into those available at the component.

It solves the problem of incompatible interfaces.

Wrappers have always played an important role in extending systems with existing components.

When building a distributed system out of existing components, we immediately bump into a fundamental problem: the
interfaces offered by the legacy component are most likely not suitable for all applications.

The number of wrappers is typically reduced through middleware. One way of doing this is to implement a
so-called broker, which is logically a centralized component that handles all the accesses between different applications.

Figure: Wrapper and Broker
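
The two ideas can be sketched together. Everything named here (LegacyInventory, InventoryWrapper, Broker and the request format) is hypothetical: the wrapper transforms the interface a client expects into the calls the legacy component actually offers, and the broker is the single place through which every access passes.

```python
class LegacyInventory:
    """Hypothetical existing component with its own, unsuitable interface."""

    def lookup_item(self, code):
        return ("widget", 4) if code == "W1" else None

class InventoryWrapper:
    """Wrapper (adapter): offers the request(payload) interface clients expect."""

    def __init__(self, legacy):
        self._legacy = legacy

    def request(self, payload):
        # Transform the client's request into the legacy component's call.
        found = self._legacy.lookup_item(payload["code"])
        if found is None:
            return {"ok": False}
        name, quantity = found
        return {"ok": True, "name": name, "quantity": quantity}

class Broker:
    """Logically centralized component that handles all accesses."""

    def __init__(self):
        self._services = {}

    def register(self, name, wrapper):
        self._services[name] = wrapper

    def send(self, name, payload):
        # Applications only need to know the broker, not each other.
        return self._services[name].request(payload)

broker = Broker()
broker.register("inventory", InventoryWrapper(LegacyInventory()))
reply = broker.send("inventory", {"code": "W1"})
missing = broker.send("inventory", {"code": "??"})
```

In this arrangement each application needs one wrapper towards the broker instead of a separate wrapper for every other application it talks to.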

Interceptors

An interceptor is a software construct that breaks the usual flow of control and allows other (application-specific)
code to be executed.

Interceptors are a primary means for adapting middleware to the specific needs of an application. As such, they play an
important role in making middleware open.

Making interceptors generic may require a substantial implementation effort.
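
A minimal sketch of the idea, assuming a hypothetical Stub class through which every invocation passes: before the call reaches the target object, the usual flow of control is broken and each registered interceptor gets to run application-specific code, here a simple call log.

```python
class Calculator:
    """Hypothetical target object that the application wants to invoke."""

    def add(self, a, b):
        return a + b

class Stub:
    """Hypothetical middleware stub: every call passes through its interceptors."""

    def __init__(self, target):
        self._target = target
        self._interceptors = []

    def add_interceptor(self, interceptor):
        self._interceptors.append(interceptor)

    def invoke(self, method, *args):
        # Break the usual flow: run the application-specific code first.
        for interceptor in self._interceptors:
            method, args = interceptor(method, args)
        # Then continue with the (possibly modified) invocation.
        return getattr(self._target, method)(*args)

call_log = []

def logging_interceptor(method, args):
    """Application-specific code: record the call, then pass it on unchanged."""
    call_log.append((method, args))
    return method, args

stub = Stub(Calculator())
stub.add_interceptor(logging_interceptor)
result = stub.invoke("add", 2, 3)
```

The target object and the caller are unaware of the interceptor, which is what lets the middleware be adapted without changing either of them.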

Modifiable middleware

What wrappers and interceptors offer are means to extend and adapt the middleware.

The environment in which distributed applications are executed changes continuously.

Changes include those resulting from mobility, a strong variance in the quality of service of networks, failing hardware,
and battery drainage.
Rather than making applications responsible for reacting to changes, this task is placed in the middleware.

Moreover, as the size of a distributed system increases, changing its parts can rarely be done by temporarily shutting it
down.
These strong influences from the environment have brought many designers of middleware to consider the construction
of adaptive software.

System architecture

Distributed systems are actually organized by considering where software components are placed. Deciding on software
components, their interaction, and their placement leads to an instance of a software architecture, also known as a
system architecture.

Centralized organizations
Despite the lack of consensus on many distributed systems issues, there is one issue that many researchers and
practitioners agree upon: thinking in terms of clients that request services from servers helps in understanding and
managing the complexity of distributed systems.

Characteristics of Centralized System

Presence of global clock

One single central unit

Dependent failure of components

Advantages of Centralized System

Easy to secure physically

Smooth and elegant personal experience

Dedicated resources

More cost-efficient for small systems up to a certain limit

Easy detachment

Disadvantages of Centralized System

Highly dependent on the network connectivity

No graceful degradation of the system

Less possibility of data backup

Server maintenance is difficult

Client-server architecture
In the client-server model, processes in a distributed system are divided into two (possibly overlapping) groups.

A server is a process implementing a specific service, for example, a file system service or a database service.

A client is a process that requests a service from a server by sending it a request and subsequently waiting for the server's
reply. This client-server interaction is also known as request-reply behavior.
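
Request-reply behavior can be sketched with two threads and in-memory queues standing in for the network. The "uppercase" service and the names server and client_call are assumptions made for illustration; a real system would exchange messages over sockets or a middleware layer.

```python
import queue
import threading

def server(requests):
    """Server process: implements a tiny, hypothetical 'uppercase' service."""
    while True:
        message = requests.get()          # wait for the next request
        if message is None:               # shutdown signal for this sketch
            break
        text, reply_box = message
        reply_box.put(text.upper())       # send the reply back to the client

def client_call(requests, text, timeout=2.0):
    """Client: send a request, then block until the server's reply arrives."""
    reply_box = queue.Queue(maxsize=1)
    requests.put((text, reply_box))
    return reply_box.get(timeout=timeout)

requests = queue.Queue()
worker = threading.Thread(target=server, args=(requests,))
worker.start()

reply = client_call(requests, "hello")

requests.put(None)                        # stop the server
worker.join()
```

The timeout on the client side is a design choice of this sketch: without it, a client would wait forever if the server never replied.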

Advantages of Client Server Architecture

Simplicity and modularity

Flexibility

Concurrency

Fault-tolerance

Cost Effectiveness

Specialization

Extensibility

Disadvantages of Client Server Architecture

Security – Client-server development provides a lot of flexibility, and a client can connect from anywhere. This
makes it easier for attackers to break into the system.

Servers can be bottlenecks – Servers can turn out to be bottlenecks because many clients might try to connect to a server
at the same time. This problem arises from the flexibility that lets any client connect whenever required.

Compatibility – Clients and servers may not be compatible with each other, for example with respect to data types,
language, etc.

Inconsistency – Replication of servers is a problem as it can make data inconsistent.

Decentralized Organization

In a decentralized organization, the processes that constitute a distributed system often play equal roles. Such systems
are also known as peer-to-peer systems.

Characteristics of Decentralized Systems

Lack of a global clock: Every node is independent of the others and hence runs and follows its own clock.

Multiple central units (Computers/Nodes/Servers): More than one central unit can listen for connections from
other nodes.

Dependent failure of components: The failure of one central node causes part of the system to fail, not the whole system.

Advantages of Decentralized System

Minimal performance bottlenecks – The entire load is balanced across all the nodes, leading to few or no
bottleneck situations.

High availability – Some nodes (computers, mobiles, servers) are always available/online for work, leading to high
availability.

More autonomy and control over resources – As each node controls its own behavior, it has better autonomy, leading to
more control over resources.

Disadvantages of Decentralized System

Difficult to accomplish large global tasks – There is no chain of command through which nodes can be directed to
perform certain tasks.

No regulatory oversight

Difficult to know which node failed – Each node must be pinged to check its availability, and the work has to be
partitioned so that the failed node can be identified by comparing the expected output with what each node generated.

Difficult to know which node responded – When a request is served by a decentralized system, it is served by one of the
nodes in the system, but it is difficult to find out which node actually served the request.

Peer-to-peer architecture
In peer-to-peer systems, the processes are organized into an overlay network, which is a logical network in which every
process has a local list of other peers that it can communicate with.
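
An overlay network of this kind can be sketched as peers that each keep a local list of neighbors. The flooding search below is one simple technique used in unstructured peer-to-peer systems and is an assumption of this sketch, as are the Peer class, the file name and the hop limit; the notes themselves do not prescribe a search method.

```python
from collections import deque

class Peer:
    """Hypothetical process in the overlay network."""

    def __init__(self, name):
        self.name = name
        self.neighbors = []   # local list of peers this process can contact
        self.files = set()

def connect(a, b):
    """Create a logical (overlay) link between two peers."""
    a.neighbors.append(b)
    b.neighbors.append(a)

def search(start, filename, max_hops):
    """Flood a query through the overlay, breadth first, up to max_hops links."""
    seen = {start.name}
    frontier = deque([(start, 0)])
    while frontier:
        peer, hops = frontier.popleft()
        if filename in peer.files:
            return peer.name, hops
        if hops < max_hops:
            for neighbor in peer.neighbors:
                if neighbor.name not in seen:
                    seen.add(neighbor.name)
                    frontier.append((neighbor, hops + 1))
    return None               # not found within the hop limit

# Four peers linked in a chain: a - b - c - d
a, b, c, d = (Peer(n) for n in "abcd")
connect(a, b)
connect(b, c)
connect(c, d)
d.files.add("notes.txt")

found = search(a, "notes.txt", max_hops=3)
too_far = search(a, "notes.txt", max_hops=2)
```

No peer knows the whole network; a query reaches distant peers only by being forwarded along the local neighbor lists, which is why the hop limit decides whether the file is found.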

P2P architecture Advantages and Disadvantages

Advantages

No need for a network operating system

Does not need an expensive server because individual workstations are used to access the files

No need for specialist staff such as network technicians because each user sets their own permissions as to which
files they are willing to share

Much easier to set up than a client-server network - does not need specialist knowledge

If one computer fails it will not disrupt any other part of the network. It just means that those files aren't available to
other users at that time.

Disadvantages

Because each computer might be being accessed by others it can slow down the performance for the user

Files and folders cannot be centrally backed up

Files and resources are not centrally organised into a specific 'shared area'. They are stored on individual
computers and might be difficult to locate if the computer's owner doesn't have a logical filing system.

Ensuring that viruses are not introduced to the network is the responsibility of each individual user

There is little or no security besides the permissions. Users often don't need to log onto their workstations.
