
DEPARTMENT OF INFORMATION TECHNOLOGY

DISTRIBUTED SYSTEM ASSIGNMENT 1


2/25/2021

NAME ID
YONAS H\SELASSIE----------------------------------------------------------01830/10
SALMAN MOHAMMED----------------------------------------------------- 01840/10
YILKAL ENEYEW-------------------------------------------------------------01761/10
YITAYEW SALE----------------------------------------------------------------01842/10
NEBIYU SHIFERAW----------------------------------------------------------01816/10
MESAFINT AGIMAS----------------------------------------------------------01831/10

SUBMITTED TO
INST. DERSE
Table of Contents
1. Explain in detail about consistency models with examples
1.1. Data-Centric Consistency Models
Sequential consistency
Causal consistency
Entry consistency
1.2. Client-Centric Consistency Models
Monotonic Reads
Monotonic writes
Read Your Writes
Writes Follow Reads
2. Explain briefly about replica management and issues related to placement
Replica-Server Placement
Content replication and Placement
Permanent Replicas
Server-Initiated Replicas
Client-Initiated Replicas
3. What is fault tolerance in distributed systems? Explain the strategies to handle failures?
Some strategies to handle failure
4. Discuss about reliable client-server and group communication
Point-to-Point Communication
RPC Semantics in the Presence of Failures
Client is unable to locate the server
The client's request to the server is lost
Reliable Group Communication
Reference

1. Explain in detail about consistency models with
examples
1.1. Data-Centric Consistency Models
Traditionally, consistency has been discussed in the context of read and write operations on
shared data, available by means of (distributed) shared memory, a (distributed) shared
database, or a (distributed) file system. In this section, we use the broader term data store.

A consistency model is essentially a contract between processes and the data store. It says
that if processes agree to obey certain rules, the store promises to work correctly. Normally, a
process that performs a read operation on a data item expects the operation to return a value
that shows the results of the last write operation on that data.

Sequential consistency
Sequential consistency is an important data-centric consistency model, which was first
defined by Lamport (1979) in the context of shared memory for multiprocessor systems. The
result of any execution is the same as if the (read and write) operations by all processes on
the data store were executed in some sequential order, and the operations of each individual
process appear in this sequence in the order specified by its program.

- Sequential consistency is a slightly weaker consistency model than strict consistency


- a data store is said to be sequentially consistent when it satisfies the following
condition:
• The result of any execution is the same as if the (read and write) operations by
all processes on the data store were executed in some sequential order and the
operations of each individual process appear in this sequence in the order
specified by its program.
i.e., all processes see the same interleaving of operations; time does not play a role, and there
is no reference to the "most recent" write operation.
In the following, we will use a special notation in which we draw the operations of a process
along a time axis. The time axis is always drawn horizontally, with time increasing from left
to right. The symbols Wi(x)a and Ri(x)b mean that a write by process Pi to data item x with
the value a and a read from that item by Pi returning b have been done, respectively.

Sequential consistency is one of the consistency models used in the domain of concurrent
computing (e.g. in distributed shared memory, distributed transactions, etc.).
It was first defined as the property that requires that
"the result of any execution is the same as if the operations of all the processors were
executed in some sequential order, and the operations of each individual processor
appear in this sequence in the order specified by its program."
To understand this statement, it is essential to understand one key property of sequential
consistency: execution order of program in the same processor (or thread) is the same as
the program order, while execution order of program between processors (or threads) is
undefined. In an example like this:

processor 1: <-- A1 run --> <-- B1 run --> <-- C1 run -->
processor 2: <-- A2 run --> <-- B2 run -->
Time --------------------------------------------------------------------->

execution order between A1, B1 and C1 is preserved, that is, A1 runs before B1, and B1
before C1. The same holds for A2 and B2. But, as execution order between processors is
undefined, B2 might run before or after C1 (B2 might physically run before C1, but the
effect of B2 might be seen after that of C1, which is the same as "B2 runs after C1").
Sequential consistency can produce non-deterministic results. This is because the sequence of
sequential operations between processors can be different during different runs of the
program. All memory operations need to happen in the program order.
Linearizability (also known as atomic consistency) can be defined as sequential consistency
with the real-time constraint.
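
To make the definition concrete, here is a small sketch (our own illustration in Python, not taken from the source text) that checks whether a set of per-process histories admits a sequentially consistent explanation. It enumerates every interleaving that preserves each process's program order and replays it against a single shared store; operations are assumed to be encoded as ("W", item, value) or ("R", item, value returned), and the enumeration is exponential, so it is only meant for tiny examples like the ones above.

def interleavings(histories):
    # Yield every merge of the histories that preserves each program order.
    if all(len(h) == 0 for h in histories):
        yield []
        return
    for i, h in enumerate(histories):
        if h:
            rest = histories[:i] + [h[1:]] + histories[i + 1:]
            for tail in interleavings(rest):
                yield [h[0]] + tail

def is_sequentially_consistent(histories):
    for order in interleavings(list(histories)):
        store, ok = {}, True
        for op, item, value in order:
            if op == "W":
                store[item] = value
            elif store.get(item) != value:   # a read must return the last write in this order
                ok = False
                break
        if ok:
            return True                      # found one legal sequential order
    return False

# P1 writes x=1 then x=3; P2 reads x=1 and then writes x=2: explainable, hence sequentially consistent.
print(is_sequentially_consistent([
    [("W", "x", 1), ("W", "x", 3)],
    [("R", "x", 1), ("W", "x", 2)],
]))

Linearizability would additionally require the chosen order to respect the real-time order of non-overlapping operations.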

Causal consistency
The causal consistency model (Hutto and Ahamad, 1990) represents a weakening of
sequential consistency in that it makes a distinction between events that are potentially
causally related and those that are not. We already came across causality when discussing
vector timestamps in the previous chapter. If event b is caused or influenced by an earlier
event a, causality requires that everyone else first see a, then see b. Consider a simple
interaction by means of a distributed shared database. Suppose that process P1 writes a data
item x. Then P2 reads x and writes y. Here the reading of x and the writing of y are potentially
causally related because the computation of y may have depended on the value of x as read
by P2 (i.e., the value written by P1). On the other hand, if two processes spontaneously
and simultaneously write two different data items, these are not causally related. Operations
that are not causally related are said to be concurrent. For a data store to be considered
causally consistent, it is necessary that the store obeys the following condition:

• Writes that are potentially causally related must be seen by all processes in the
same order. Concurrent writes may be seen in a different order on different
machines.

- it is a weakening of sequential consistency, it distinguishes between events that are
potentially causally related and those that are not.
Example: a write on y that follows a read on x; the writing of y may have depended on the
value of x; e.g., y = x+5
- otherwise the two events are concurrent
- two processes write two different variables
- if event B is caused or influenced by an earlier event, A, causality requires that
everyone else must first see A, then B.
Causal consistency captures the potential causal relationships between operations, and
guarantees that all processes observe causally-related operations in a common order. In
other words, all processes in the system agree on the order of the causally-related
operations. They may disagree on the order of operations that are causally unrelated.
▪ Thus, a system provides causal consistency if this following condition holds: write
operations that are related by potential causality are seen by each process of the system
in their causal precedence order. Different processes may observe concurrent writes in
different orders.
▪ The Causal Consistency model is weaker than sequential consistency, which ensures
that all processes observe all write operations in common order, whether causally
related or not. However, causal consistency is stronger than PRAM consistency, which
requires only the write operations that are done by a single process to be observed in
common order by each other process. It follows that when a system is sequentially
consistent, it is also causally consistent. Additionally, causal consistency implies
PRAM consistency, but not vice versa.
Here is an example of causal consistency.
P1 : W(x)1 W(x)3
P2 : R(x)1 W(x)2
P3 : R(x)1 R(x)3 R(x)2
P4 : R(x)1 R(x)2 R(x)3

▪ Process P2 observes, reads, the earlier write W(x)1 that is done by process P1.
Therefore, the two writes W(x)1 and W(x)2 are causally related. Under causal
consistency, every process observes W(x)1 first, before observing W(x)2. Notice that
the two write operations W(x)2 and W(x)3, with no intervening read operations, are
concurrent, and processes P3 and P4 observe (read) them in different orders.
As in sequential consistency, reads do not need to reflect changes instantaneously, however,
they need to reflect all changes to a variable sequentially.
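
As a small, purely illustrative check of the example above (the causal dependency W(x)1 before W(x)2 is written in by hand, since P2 read x=1 before writing x=2), the sketch below verifies that every process observes causally related writes in their causal order, while the concurrent writes W(x)2 and W(x)3 may be seen in either order.

CAUSAL_ORDER = [("W(x)1", "W(x)2")]          # W(x)1 -> W(x)2 because P2 read x=1 first

observed = {                                 # order in which each process sees the writes
    "P3": ["W(x)1", "W(x)3", "W(x)2"],
    "P4": ["W(x)1", "W(x)2", "W(x)3"],
}

def causally_consistent(observed, causal_order):
    for proc, seq in observed.items():
        for earlier, later in causal_order:
            if seq.index(earlier) > seq.index(later):
                return False                 # a process saw an effect before its cause
    return True

print(causally_consistent(observed, CAUSAL_ORDER))   # True: the order of W(x)2 and W(x)3 may differ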

Entry consistency
- it requires an acquire and a release to be used at the start and end of a critical section; in
addition, each ordinary shared data item must be associated with some synchronization
variable such as a lock. If elements of an array are to be accessed independently in parallel,
then different array elements may be associated with different locks.
- synchronization variable ownership
• each synchronization variable has a current owner, the process that acquired it last.
• the owner may enter and exit critical sections repeatedly without sending messages.
• other processes must send a message to the current owner asking for ownership and
the current values of the data associated with that synchronization variable.
• several processes can also simultaneously own a synchronization variable, but only
for reading.
A data store exhibits entry consistency if it meets all the following conditions:

- An acquire access of a synchronization variable is not allowed to perform with respect


to a process until all updates to the guarded shared data have been performed with
respect to that process. (at an acquire, all remote changes to the guarded data must be
made visible).
- Before an exclusive mode access to a synchronization variable by a process is allowed
to perform with respect to that process, no other process may hold the synchronization
variable, not even in nonexclusive mode.
- After an exclusive mode access to a synchronization variable has been performed, any
other process's next nonexclusive mode access to that synchronization variable may
not be performed until it has performed with respect to that variable's owner. (it must
first fetch the most recent copies of the guarded shared data)
Data-centric consistency for distributed objects comes naturally in the form of entry
consistency. Recall that in this case, the goal is to group operations on shared data using
synchronization variables (e.g., in the form of locks). As objects naturally combine data and
the operations on that data, locking objects during an invocation serializes access and keeps
them consistent. Although conceptually associating a lock with an object is simple, it does
not necessarily provide a proper solution when an object is replicated.

In many cases, designing replicated objects is done by first designing a single object, possibly
protecting it against concurrent access through local locking, and subsequently replicating it.
If we were to use a primary-based scheme, then additional effort from the application
developer is needed to serialize object invocations. Therefore, it is often convenient to
assume that the underlying middleware supports totally-ordered multicasting, as this would
not require any changes at the clients, nor would it require additional programming effort
from application developers. Of course, how the totally ordered multicasting is realized by
the middleware should be transparent. For all the application knows, the implementation
may use a primary-based scheme, but it could equally well be based on Lamport clocks.
However, under entry consistency, every shared variable is assigned a synchronization
variable specific to it. This way, when the acquire is on variable x, only the operations related
to x need to be completed with respect to that process. This allows critical sections on
different shared variables to proceed concurrently, whereas critical operations on the same
shared variable remain serialized. Such a consistency model is useful when, for example,
different matrix elements can be processed at the same time.
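
The sketch below illustrates the per-item locking idea on a single machine with ordinary thread locks; the class and method names are invented for illustration, and the comments only indicate where a real replicated middleware would pull in and push out the guarded data at acquire and release time.

import threading

class EntryConsistentStore:
    def __init__(self):
        self._data = {}
        self._locks = {}                     # one synchronization variable per data item
        self._meta = threading.Lock()

    def _lock_for(self, item):
        with self._meta:
            return self._locks.setdefault(item, threading.Lock())

    def update(self, item, func):
        lock = self._lock_for(item)
        with lock:                           # "acquire": remote updates to item would be made visible here
            self._data[item] = func(self._data.get(item))
            # "release": local updates to item would be propagated here

store = EntryConsistentStore()
threads = [threading.Thread(target=store.update, args=("a", lambda v: (v or 0) + 1)),
           threading.Thread(target=store.update, args=("b", lambda v: (v or 0) + 1))]
for t in threads: t.start()
for t in threads: t.join()
print(store._data)                           # updates to "a" and "b" proceeded independently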

1.2. Client-Centric Consistency Models


Monotonic Reads
- The first client-centric consistency model is that of monotonic reads. A data store is said to
provide monotonic-read consistency if the following condition holds: if a process reads
the value of a data item x, any successive read operation on x by that process will always
return that same value or a more recent value. That is, a process never sees a version of the
data older than what it has already seen.

The read operations performed by a single process P at two different local copies of the same
data store:
a) a monotonic-read consistent data store
b) a data store that does not provide monotonic reads; there is no guarantee that, when
R(x2) is executed, WS(x2) also contains WS(x1)
In other words, monotonic-read consistency guarantees that if a process has seen a value of x
at time t, it will never see an older version of x at a later time. As an example where
monotonic reads are useful, consider a distributed e-mail database. In such a database, each
user's mailbox may be distributed and replicated across multiple machines. Mail can be
inserted in a mailbox at any location. However, updates are propagated in a lazy (i.e., on
demand) fashion.

Only when a copy needs certain data for consistency are those data propagated to that copy.
Suppose a user reads his mail in San Francisco. Assume that only reading mail does not
affect the mailbox, that is, messages are not removed, stored in subdirectories, or even tagged
as having already been read, and so on. When the user later flies to New York and opens his
mailbox again, monotonic-read consistency guarantees that the messages that were in the
mailbox in San Francisco will also be in the mailbox when it is opened in New York.
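
A rough sketch of how a client library might enforce monotonic reads is shown below. The version counter, the class names, and the policy of refusing a stale replica are assumptions made for illustration; a real system could instead forward the missing writes to the lagging replica before answering.

class Replica:
    def __init__(self):
        self.version = 0
        self.value = None

    def apply_write(self, version, value):
        self.version, self.value = version, value

class Client:
    def __init__(self):
        self.last_seen = 0                   # highest version of x this client has already read

    def read(self, replica):
        if replica.version < self.last_seen:
            raise RuntimeError("replica too stale; propagate the missing writes first")
        self.last_seen = replica.version
        return replica.value

sf, ny = Replica(), Replica()
sf.apply_write(1, "mail-1")                  # the update has reached San Francisco only
user = Client()
user.read(sf)                                # the user sees version 1 in San Francisco
try:
    user.read(ny)                            # the New York copy lags behind and must catch up
except RuntimeError as err:
    print(err)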

Monotonic writes
In many situations, it is important that write operations are propagated in the correct order to
all copies of the data store. This property is expressed in monotonic-write consistency.
In a monotonic-write consistent store, the following condition holds:
- A write operation by a process on a data item x is completed before any successive write
operation on x by the same process.
Thus completing a write operation means that the copy on which a successive operation is
performed reflects the effect of a previous write operation by the same process, no matter
where that operation was initiated. In other words, a write operation on a copy of item x is
performed only if that copy has been brought up to date by means of any preceding write
operation, which may have taken place on other copies of x. If need be, the new write must
wait for old ones to finish.
Bringing a copy of x up to date need not be necessary when each write operation completely
overwrites the present value of x. However, write operations are often performed on only part
of the state of a data item. Consider, for example, a software library. In many cases, updating
such a library is done by replacing one or more functions, leading to a next version. With
monotonic-write consistency, guarantees are given that if an update is performed on a copy of
the library, all preceding updates will be performed first. The resulting library will then
indeed become the most recent version and will include all updates that have led to previous
versions of the library.

- may not be necessary if a later write operation completely overwrites the present value of x, e.g.:

x = 78;

x = 90;

- no need to make sure that x has been first changed to 78
- it is important only if part of the state of the data item changes
- e.g., a software library, where one or more functions are replaced, leading to a new
version

The write operations performed by a single process P at two different local copies of the same
data store:
a) a monotonic-write consistent data store
b) a data store that does not provide monotonic-write consistency
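
The sketch below shows one way a replica could enforce monotonic writes: each write from a process carries a per-process sequence number, and a write is applied only after all of that process's earlier writes have been applied, with out-of-order arrivals buffered. The names and the buffering policy are illustrative assumptions rather than a specific protocol.

class MonotonicWriteReplica:
    def __init__(self):
        self.next_expected = {}              # process -> next sequence number to apply
        self.pending = {}                    # process -> {sequence number: value}
        self.value = None

    def receive(self, proc, seq, value):
        self.pending.setdefault(proc, {})[seq] = value
        expected = self.next_expected.get(proc, 0)
        while expected in self.pending[proc]:          # apply writes in program order
            self.value = self.pending[proc].pop(expected)
            expected += 1
            self.next_expected[proc] = expected

r = MonotonicWriteReplica()
r.receive("P", 1, "x = 90")                  # arrives first, but it is P's second write
print(r.value)                               # None: held back until write 0 arrives
r.receive("P", 0, "x = 78")
print(r.value)                               # "x = 90": both writes now applied, in order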

Read Your Writes


In a distributed system, the changes made at the master are not always instantaneously
available at every replica, although they eventually will be. In general, replicas not directly
involved in contributing to the acknowledgement of a transaction commit will lag behind
other replicas because they do not synchronize their commits with the master.

For this reason, you might want to make use of the read-your-writes consistency feature. This
feature allows you to ensure that a replica is at least current enough to have the changes made
by a specific transaction. Because transactions are applied serially, by ensuring a replica has a
specific commit applied to it, you know that all transaction commits occurring prior to the
specified transaction have also been applied to the replica.

You determine whether a transaction has been applied to a replica by generating a commit
token at the master. You then transfer this commit token to the replica, where it is used to
determine whether the replica is consistent enough relative to the master.

For example, suppose you have a web application where a replication group is
implemented within a load-balanced web server group. Each request to the web server
consists of an update operation followed by read operations (say, from the same client). The
read operations naturally expect to see the data from the updates executed by the same
request. However, the read operations might have been routed to a replica that did not
execute the update.

In such a case, the update request would generate a commit token, which would be
resubmitted by the browser, along with subsequent read requests. The read request could be
directed at any one of the available web servers by a load balancer. The replica which
services the read request would use that commit token to determine whether it can service the
read operation. If the replica is current enough, it can immediately execute the transaction and
satisfy the request.

What action the replica takes if it is not consistent enough to service the read request is up to
you as the application developer. You can do anything from blocking while you wait for the
transaction to be applied locally, to rejecting the read request outright.
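
A minimal sketch of this commit-token mechanism follows. The Master and ReadReplica classes, and the encoding of the token as a commit sequence number, are simplifications invented for illustration, not the API of any particular replication product.

class Master:
    def __init__(self):
        self.commits = []

    def update(self, op):
        self.commits.append(op)
        return len(self.commits)             # commit token = commit sequence number

class ReadReplica:
    def __init__(self):
        self.applied = 0

    def sync(self, master):
        self.applied = len(master.commits)   # lazy catch-up with the master

    def read(self, token):
        if self.applied < token:
            return None                      # not current enough: block, retry elsewhere, or reject
        return "query result"

master, replica = Master(), ReadReplica()
token = master.update("set x = 1")           # the browser stores the token with the session
print(replica.read(token))                   # None: the replica has not applied the commit yet
replica.sync(master)
print(replica.read(token))                   # "query result": read-your-writes now satisfied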

The absence of read-your-writes consistency is sometimes experienced when updating Web
documents and subsequently viewing the effects. Update operations frequently take place by
means of a standard editor or word processor, which saves the new version on a file system
that is shared by the Web server.

a) a data store that provides read-your-writes consistency, b) a data store that does not

Writes Follow Reads


The last client-centric consistency model is one in which updates are propagated as the result
of previous read operations. A data store is said to provide writes-follow-reads consistency, if
the following holds.

- A write operation by a process on a data item x following a previous read operation on x by
the same process is guaranteed to take place on the same or a more recent value of x that
was read.

a) a writes-follow-reads consistent data store
b) a data store that does not provide writes-follow-reads consistency

In other words, any successive write operation by a process on a data item x will
be performed on a copy of x that is up to date with the value most recently read by
that process.
Writes-follow-reads consistency can be used to guarantee that users of a network newsgroup
see a posting of a reaction to an article only after they have seen the original article (Terry et
al., 1994). To understand the problem, assume that a user first reads an article A. Then, he
reacts by posting a response B. By requiring writes-follow-reads consistency, B will be
written to any copy of the newsgroup only after A has been written as well. Note that users
who only read articles need not require any specific client-centric consistency model. The
writes-follow-reads consistency assures that reactions to articles are stored at a local copy
only if the original is stored there as well.
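
As an illustration of the newsgroup example, the sketch below holds back a reply until every article its author had read is stored locally; the read-set encoding and the class names are assumptions made for the example.

class NewsReplica:
    def __init__(self):
        self.articles = set()
        self.held = []                       # replies waiting for the articles they depend on

    def store_article(self, article_id):
        self.articles.add(article_id)
        self.held = [r for r in self.held if not self._try_store(*r)]

    def post_reply(self, reply_id, read_set):
        if not self._try_store(reply_id, read_set):
            self.held.append((reply_id, read_set))

    def _try_store(self, reply_id, read_set):
        if read_set <= self.articles:        # all articles the author read are already here
            self.articles.add(reply_id)
            return True
        return False

r = NewsReplica()
r.post_reply("B", {"A"})                     # the reaction to A arrives before A itself
print("B" in r.articles)                     # False: B is held back
r.store_article("A")
print("B" in r.articles)                     # True: B is stored only after the original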

2. Explain briefly about replica management and issues


related to placement
A key issue for any distributed system that supports replication is to decide where, when, and
by whom replicas should be placed, and subsequently which mechanisms to use for keeping
the replicas consistent. The placement problem itself should be split into two subproblems:
that of placing replica servers, and that of placing content. The difference is a subtle but
important one and the two issues are often not clearly separated. Replica-server placement
is concerned with finding the best locations to place a server that can host (part of) a data
store. Content placement deals with finding the best servers for placing content. Note that
this often means that we are looking for the optimal placement of only a single data item.
Obviously, before content placement can take place, replica servers will have to be placed
first.

Replica-Server Placement
The placement of replica servers is not an intensively studied problem for the simple reason
that it is often more of a management and commercial issue than an optimization problem.
Nonetheless, analysis of client and network properties are useful to come to informed
decisions.
There are various ways to compute the best placement of replica servers, but all boil down to
an optimization problem in which the best K out of N locations need to be selected (K < N).
These problems are known to be computationally complex and can be solved only through
heuristics. Distance can be measured in terms of latency or bandwidth. A typical greedy
heuristic selects one server at a time such that the average distance between that server and
its clients is minimal, given that k servers have already been placed (meaning that there are
N - k locations left).
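
A sketch of such a greedy heuristic is shown below; the distance matrix is invented for illustration and could just as well contain measured latencies or inverse bandwidths.

def greedy_placement(dist, candidates, k):
    # dist[c][s]: distance from client c to candidate site s.
    chosen = []
    for _ in range(k):
        best_site, best_cost = None, float("inf")
        for s in candidates:
            if s in chosen:
                continue
            # total distance from every client to its nearest chosen site, if s were added
            cost = sum(min(row[t] for t in chosen + [s]) for row in dist)
            if cost < best_cost:
                best_site, best_cost = s, cost
        chosen.append(best_site)
    return chosen

clients_to_sites = [                         # 4 clients x 3 candidate sites (made-up distances)
    {0: 1, 1: 9, 2: 5},
    {0: 2, 1: 8, 2: 5},
    {0: 9, 1: 1, 2: 5},
    {0: 8, 1: 2, 2: 5},
]
print(greedy_placement(clients_to_sites, [0, 1, 2], k=2))   # e.g. [0, 1]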

As an alternative, Radoslavov et al. (2001) propose to ignore the position of clients and only
take into account the topology of the Internet as formed by the autonomous systems. An
autonomous system (AS) can best be viewed as a network in which the nodes all run the same
routing protocol and which is managed by a single organization. As of January 2006, there
were just over 20,000 ASes. Radoslavov et al. first consider the largest AS and place a server
on the router with the largest number of network interfaces (i.e., links). This algorithm is then
repeated with the second-largest AS, and so on. As it turns out, client-unaware server
placement achieves similar results as client-aware placement, under the assumption that
clients are uniformly distributed across the Internet (relative to the existing topology). To
what extent this assumption is true is unclear. It has not been well studied.

Content replication and Placement


Let us now move away from server placement and concentrate on content placement. When it
comes to content replication and placement, three different types of replicas can be
distinguished: permanent replicas, server-initiated replicas, and client-initiated replicas.

Permanent Replicas
Permanent replicas can be considered as the initial set of replicas that constitute a distributed
data store. In many cases, the number of permanent replicas is small. Consider, for example,
a Web site. Distribution of a Web site generally comes in one of two forms. The first kind of
distribution is one in which the files that constitute a site are replicated across a limited
number of servers at a single location. Whenever a request comes in, it is forwarded to one of
the servers, for instance, using a round-robin strategy. The second form of distributed Web
sites is what is called mirroring. In this case, a Web site is copied to a limited number of
servers, called mirror sites, which are geographically spread across the Internet. In most
cases, clients simply choose one of the various mirror sites from a list offered to them.

▪ the initial set of replicas that constitute a distributed data store; normally a small number
of replicas

▪ e.g., a Web site: two forms

▪ the files that constitute a site are replicated across a limited number of servers on a LAN;
a request is forwarded to one of the servers

▪ mirroring: a Web site is copied to a limited number of servers, called mirror sites, which
are geographically spread across the Internet; clients choose one of the mirror sites

Server-Initiated Replicas
In contrast to permanent replicas, server-initiated replicas are copies of a data store that exist
to enhance performance and which are created at the initiative of the (owner of the) data
store. Consider, for example, a Web server placed in New York. Normally, this server can
handle incoming requests quite easily, but it may happen that over a couple of days a sudden
burst of requests comes in from an unexpected location far from the server. In that case, it may
be worthwhile to install a number of temporary replicas in regions where requests are coming
from.
The problem of dynamically placing replicas is also being addressed in Web hosting services.
These services offer a (relatively static) collection of servers spread across the Internet that
can maintain and provide access to Web files belonging to third parties. To provide optimal
facilities, such hosting services can dynamically replicate files to servers where those files are
needed to enhance performance, that is, close to demanding (groups of) clients.

Given that the replica servers are already in place, deciding where to place content is easier
than in the case of server placement. One such algorithm, designed to support Web pages,
assumes that updates are relatively rare compared to read requests. Using files as the unit of
data, the algorithm works as follows.
The algorithm for dynamic replication takes two issues into account. First, replication can
take place to reduce the load on a server. Second, specific files on a server can be migrated or
replicated to servers placed in the proximity of clients that issue many requests for those files.
In the following pages, we concentrate only on this second issue.

Client-Initiated Replicas
An important kind of replica is the one initiated by a client. Client-initiated replicas are more
commonly known as (client) caches. In essence, a cache is a local storage facility that is used
by a client to temporarily store a copy of the data it has just requested. In principle, managing
the cache is left entirely to the client. The data store from where the data had been fetched has
nothing to do with keeping cached data consistent. However, as we shall see, there are many
occasions in which the client can rely on participation from the data store to inform it when
cached data has become stale.
Client caches are used only to improve access times to data. Normally, when a client wants
access to some data, it connects to the nearest copy of the data store from where it fetches the
data it wants to read, or to where it stores the data it had just modified. When most operations
involve only reading data, performance can be improved by letting the client store requested
data in a nearby cache. Such a cache could be located on the client's machine, or on a separate
machine in the same local-area network as the client.

The next time that same data needs to be read, the client can simply fetch it from this local
cache. This scheme works fine as long as the fetched data have not been modified in the
meantime. Data are generally kept in a cache for a limited amount of time, for example, to
prevent extremely stale data from being used, or simply to make room for other data.
Whenever requested data can be fetched from the local cache, a cache hit is said to have
occurred. To improve the number of cache hits, caches can be shared between clients. The
underlying assumption is that a data request from client C1 may also be useful for a request
from another nearby client C2.

▪ to improve access time


▪ a cache is a local storage facility used by a client to temporarily store a copy of the data it
has just received
▪ placed on the same machine as its client or on a machine shared by clients on a LAN
▪ managing the cache is left entirely to the client; the data store from which the data have
been fetched has nothing to do with keeping cached data consistent
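
A minimal client-cache sketch along these lines is shown below; the time-to-live value and the fetch callback are assumptions chosen for illustration, and a real cache would also bound its size and possibly cooperate with the data store to invalidate stale entries.

import time

class ClientCache:
    def __init__(self, fetch, ttl_seconds=30):
        self.fetch = fetch                   # function that reads from the (remote) data store
        self.ttl = ttl_seconds               # bound on how stale a cached copy may become
        self.entries = {}                    # key -> (value, time stored)

    def get(self, key):
        entry = self.entries.get(key)
        if entry and time.time() - entry[1] < self.ttl:
            return entry[0]                  # cache hit: no contact with the data store
        value = self.fetch(key)              # cache miss or stale entry: fetch and remember
        self.entries[key] = (value, time.time())
        return value

store = {"page.html": "<html>...</html>"}
cache = ClientCache(fetch=store.get)
print(cache.get("page.html"))                # miss: goes to the store
print(cache.get("page.html"))                # hit: served from the local copy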

3. What is fault tolerance in distributed systems? Explain the


strategies to handle failures?
A characteristic feature of distributed systems that distinguishes them from single-machine
systems is the notion of partial failure. A partial failure may happen when one component in
a distributed system fails. This failure may affect the proper operation of other components,
while at the same time leaving yet other components totally unaffected. In contrast, a failure
in nondistributed systems is often total in the sense that it affects all components, and may
easily bring down the entire system. To understand the role of fault tolerance in distributed
systems, we first need to take a closer look at what it actually means for a distributed system
to tolerate faults. Being fault tolerant is strongly related to what are called dependable
systems. Dependability is a term that covers a number of useful requirements for distributed
systems, including the following:

Availability: defined as the property that a system is ready to be used immediately. In
general, it refers to the probability that the system is operating correctly.

Reliability: refers to the property that a system can run continuously without failure. In
contrast to availability, reliability is defined in terms of a time interval instead of an instant in
time.

Safety: refers to the situation that when a system temporarily fails to operate correctly,
nothing catastrophic happens. For example, many process control systems, such as those used
for controlling nuclear power plants or sending people into space, are required to provide a
high degree of safety.

Maintainability: refers to how easily a failed system can be repaired. A highly maintainable
system may also show a high degree of availability, especially if failures can be detected and
repaired automatically.

Some strategies to handle failure:


Masking failures: Some failures that have been detected can be hidden or made less
severe. Two examples of hiding failures:
1. Messages can be retransmitted when they fail to arrive.
2. File data can be written to a pair of disks so that if one is corrupted, the other may still be
correct. Just dropping a message that is corrupted is an example of making a fault less severe;
the corrupted message could then be retransmitted.

The reader will probably realize that the techniques described for hiding failures are not
guaranteed to work in the worst cases; for example, the data on the second disk may be
corrupted too, or the message may not get through in a reasonable time however often it is
retransmitted.
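
As a concrete illustration of example 1, the sketch below resends a UDP datagram until an acknowledgement arrives or a bounded number of attempts has been used; the address, port, and message format are placeholders, and in practice a reliable transport protocol such as TCP performs this kind of masking on the application's behalf.

import socket

def send_with_retries(payload, addr, attempts=3, timeout=1.0):
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout)
    try:
        for _ in range(attempts):
            sock.sendto(payload, addr)
            try:
                reply, _ = sock.recvfrom(4096)       # wait for an acknowledgement
                return reply
            except socket.timeout:
                continue                              # lost request or lost ack: retransmit
        raise TimeoutError("no acknowledgement after %d attempts" % attempts)
    finally:
        sock.close()

# Usage, assuming some service acknowledges datagrams on this (hypothetical) address:
# reply = send_with_retries(b"write block 7", ("replica.example.org", 9000))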

Tolerating failures: Most of the services in the Internet do exhibit failures – it would not be
practical for them to attempt to detect and hide all of the failures that might occur in such a
large network with so many components. Their clients can
might occur in such a large network with so many components. Their clients can
be designed to tolerate failures, which generally involves the users tolerating them as
well. For example, when a web browser cannot contact a web server, it does not
make the user wait forever while it keeps on trying.

It informs the user about the problem, leaving them free to try again later. Services that
tolerate failures are discussed in the paragraph on redundancy below.

Detecting failures: Some failures can be detected. For example, checksums can be
used to detect corrupted data in a message or a file. It is obvious that it is difficult or even
impossible to detect some other failures, such as a remote crashed server in the Internet. The
challenge is to manage in the presence of failures that cannot be detected but may be
suspected.
Recovery from failures: Recovery involves the design of software so that the state
of permanent data can be recovered or ‘rolled back’ after a server has crashed. In
general, the computations performed by some programs will be incomplete when a
fault occurs, and the permanent data that they update (files and other material
stored in permanent storage) may not be in a consistent state.
Redundancy: Services can be made to tolerate failures by the use of redundant
components. Consider the following examples:

1. There should always be at least two different routes between any two routers in the
Internet.
2. In the Domain Name System, every name table is replicated in at least two different
servers.
3. A database may be replicated in several servers to ensure that the data remains accessible
after the failure of any single server; the servers can be designed to detect faults in their
peers; when a fault is detected in one server, clients are redirected to the remaining servers.

4. Discuss about reliable client-server and group


communication
In many cases, fault tolerance in distributed systems concentrates on faulty processes.
However, we also need to consider communication failures. Most of the failure models
discussed previously apply equally well to communication channels. In particular, a
communication channel may exhibit crash, omission, timing, and arbitrary failures. In
practice, when building reliable communication channels, the focus is on masking crash and
omission failures.

Point-to-Point Communication
In many distributed systems, reliable point-to-point communication is established by making
use of a reliable transport protocol, such as TCP. TCP masks omission failures, which occur
in the form of lost messages, by using acknowledgments and retransmissions. Such failures
are completely hidden from a TCP client.
However, crash failures of connections are not masked. A crash failure may occur when (for
whatever reason) a TCP connection is abruptly broken so that no more messages can be
transmitted through the channel. In most cases, the client is informed that the channel has
crashed by raising an exception. The only way to mask such failures is to let the distributed
system attempt to automatically set up a new connection, by simply resending a connection
request. The underlying assumption is that the other side is still, or again, responsive to such
requests.

RPC Semantics in the Presence of Failures


The goal of RPC is to hide communication by making remote procedure calls look just like
local ones. With a few exceptions, so far we have come fairly close. Indeed, as long as both
client and server are functioning perfectly, RPC does its job well. The problem comes about
when errors occur. It is then that the differences between local and remote calls are not
always easy to mask.
To structure our discussion, let us distinguish between five different classes of failures that
can occur in RPC systems, as follows:
1. The client is unable to locate the server.
2. The request message from the client to the server is lost.
3. The server crashes after receiving a request.
4. The reply message from the server to the client is lost.
5. The client crashes after sending a request.

Client is unable to locate the server


- maybe the client is unable to locate a suitable server, or the server is down
- or an interface at the server has been changed, making the client’s stub obsolete (it was not
used for a long time), so the binder fails to match it with the server
- one solution: let the client raise an exception and have exception handlers inserted by the
programmer
- examples are exceptions in Java or signal handlers in C. But,
- not every language may provide exceptions or signals
- the major objective of having transparency is violated

The client’s request to the server is lost


let the OS or the client stub start a timer when sending a request; if the timer expires before a
reply or acknowledgement comes back, the message is retransmitted.

The server crashes after receiving a request

▪ the failure can happen before or after execution of the request; a server in client-server
communication:
(a) normal case
(b) crash after execution
(c) crash before execution
- in (b), let the system report failure back to the client (raise an exception)
- in (c) retransmit the request
- but the client’s OS doesn’t know which is which; only a timer has expired
✓ Three possible solutions:

- the client tries the operation again after the server has restarted (or rebooted) or rebinds to
a new server, called at-least-once semantics; it keeps on trying until a reply is received, which
guarantees that the RPC has been carried out at least one time, but possibly more
- the client gives up after the first attempt and reports an error, called at-most-once semantics;
it guarantees that the RPC has been carried out at most one time, but possibly not at all
- no guarantee at all: the RPC may have been carried out anywhere from 0 to N times
▪ none of the above is the right solution; what is required is exactly-once semantics, but this
is difficult to achieve
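
The sketch below illustrates these semantics with a toy in-process "RPC": at-least-once keeps retransmitting until a reply arrives, at-most-once gives up after the first timeout, and tagging each request with a unique identifier lets the server suppress duplicate executions when only the reply was lost. The class names and the lossy transport are invented purely for the illustration.

import uuid

class Server:
    def __init__(self):
        self.seen = {}                       # request id -> cached reply (duplicate filter)

    def handle(self, request_id, op):
        if request_id not in self.seen:
            self.seen[request_id] = "executed " + op   # execute at most once per request id
        return self.seen[request_id]

def call(server, op, transport, semantics="at-least-once", max_tries=3):
    request_id = str(uuid.uuid4())
    tries = 1 if semantics == "at-most-once" else max_tries
    for _ in range(tries):
        reply = transport(server, request_id, op)      # None models a lost request or reply
        if reply is not None:
            return reply
    raise TimeoutError("no reply; the operation may or may not have been executed")

drops = {"count": 0}
def flaky(server, request_id, op):           # drops the first reply to force a retransmission
    reply = server.handle(request_id, op)
    drops["count"] += 1
    return None if drops["count"] == 1 else reply

srv = Server()
print(call(srv, "transfer $10", flaky))      # retried, yet executed only once thanks to the id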

Reliable Group Communication


Considering how important process resilience by replication is, it is not surprising that
reliable multicast services are important as well. Such services guarantee that messages are
delivered to all members in a process group. Unfortunately, reliable multicasting turns out to
be surprisingly tricky. In this section, we take a closer look at the issues involved in reliably
delivering messages to a process group.

Basic Reliable-Multicasting Schemes


Although most transport layers offer reliable point-to-point channels, they rarely offer
reliable communication to a collection of processes. The best they can offer is to let each
process set up a point-to-point connection to each other process it wants to communicate
with. Obviously, such an organization is not very efficient as it may waste network
bandwidth. Nevertheless, if the number of processes is small, achieving reliability through
multiple reliable point-to-point channels is a simple and often straightforward solution.
To go beyond this simple case, we need to define precisely what reliable multicasting is.
Intuitively, it means that a message that is sent to a process group should be delivered to each
member of that group. However, what happens if during communication a process joins the
group? Should that process also receive the message? Likewise, we should also determine
what happens if a (sending) process crashes during communication.
To cover such situations, a distinction should be made between reliable communication in the
presence of faulty processes, and reliable communication when processes are assumed to
operate correctly. In the first case, multicasting is considered to be reliable when it can be
guaranteed that all nonfaulty group members receive the message. The tricky part is that
agreement should be reached on what the group actually looks like before a message can be
delivered, in addition to various ordering constraints. We return to these matters when we
discuss atomic multicasts below. A weaker solution, which assumes that all receivers are
known (and that their number is limited) and that none of them will fail, is for the sending
process to assign a sequence number to each message and to buffer all messages so that lost
ones can be retransmitted.

A simple solution to reliable multicasting when all receivers are known and are assumed
not to fail; (a) message transmission, (b) reporting feedback

The sending process assigns a sequence number to each message it multicasts. We assume
that messages are received in the order they are sent. In this way, it is easy for a receiver to
detect it is missing a message. Each multicast message is stored locally in a history buffer at
the sender. Assuming the receivers are known to the sender, the sender simply keeps the
message in its history buffer until each receiver has returned an acknowledgment. If a
receiver detects it is missing a message, it may return a negative acknowledgment,
requesting a retransmission from the sender.
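
A toy, in-memory version of this scheme is sketched below (no real network, invented class names): the sender numbers and buffers every message, and a receiver that notices a gap in the sequence numbers asks for the missing message before delivering the newer one.

class Sender:
    def __init__(self):
        self.next_seq = 0
        self.history = {}                    # seq -> message, kept until everyone has acknowledged

    def multicast(self, msg, group):
        seq = self.next_seq
        self.history[seq] = msg
        self.next_seq += 1
        for receiver in group:
            receiver.deliver(seq, msg, self)

    def retransmit(self, seq, receiver):     # answer a negative acknowledgement from the buffer
        receiver.deliver(seq, self.history[seq], self)

class Receiver:
    def __init__(self):
        self.expected = 0
        self.delivered = []

    def deliver(self, seq, msg, sender):
        while self.expected < seq:           # gap detected: NACK the missing message(s)
            sender.retransmit(self.expected, self)
        if seq == self.expected:
            self.delivered.append(msg)
            self.expected += 1

s, r = Sender(), Receiver()
s.history[0] = "m0"; s.next_seq = 1          # pretend m0 was multicast but lost on the way to r
s.multicast("m1", [r])                       # r sees seq 1, requests seq 0, then delivers both
print(r.delivered)                           # ['m0', 'm1']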

Reference
✓ Tanenbaum, A. S., and van Steen, M., Distributed Systems: Principles and Paradigms, 2nd Edition
✓ Kshemkalyani, A. D., and Singhal, M., Distributed Computing: Principles, Algorithms, and Systems
✓ www.wikipedia.org
✓ www.google.com

