Ds - cs8603 - III Yr - Unit II Digital Material
This document is confidential and intended solely for the educational purpose of RMK
Group of Educational Institutions. If you have received this document through email in
error, please notify the system manager. This document contains proprietary
information and is intended only to the respective group / learning community as
intended. If you are not the addressee you should not disseminate, distribute or copy
through e-mail. Please notify the sender immediately by e-mail if you have received
this document by mistake and delete this document from your system. If you are not
the intended recipient you are notified that disclosing, copying, distributing or taking
any action in reliance on the contents of this information is strictly prohibited.
CS8603 DISTRIBUTED SYSTEM
Department: Computer Science & Engineering
Batch/Year: 2019-2023 / III Year
Created by: Dr. P. Kavitha, Associate Professor
Date: 9/3/2022
Table of Contents
2. Course Objectives
Pre Requisites
To learn the characteristics of peer-to-peer and
distributed shared memory systems.
6. CO- PO/PSO Mapping
CO - PO Articulation Matrix with correlation levels: based on the contribution of each CO
to each PO, we set the correlation values between CO and PO.
CO  PO1   PO2   PO3   PO4   PO5   PO6   PO7   PO8   PO9   PO10  PO11  PO12  PSO1  PSO2  PSO3
1   2     1     1     -     -     -     -     -     -     -     -     -     -     -     -
2   2     1     1     -     -     -     -     -     -     -     -     -     -     -     -
3   2     1     1     -     -     -     -     -     -     -     -     -     -     -     -
4   2     1     1     -     -     -     -     -     -     -     -     -     -     -     -
5   2     1     2     -     -     -     -     -     -     -     -     -     -     -     -
7. Lecture Plan
Session No. | Topics to be covered | No. of Periods | Proposed Date | Actual Lecture Date | Pertaining CO | Taxonomy Level | Mode of Delivery
24.03.22
2  Message ordering paradigms  1  2  K1  PPT
25.03.22
3  Asynchronous execution with synchronous communication  1  2  K1  PPT
26.03.22
4  Synchronous program order on an asynchronous system  1  2  K1  PPT
28.03.22
1. Find out the correct relation between the different message ordering paradigms,
where SYNC, FIFO, A and CO denote the set of all possible executions ordered by
synchronous order, FIFO order, non-FIFO order and causal order respectively.
A. SYNC ⊂ FIFO ⊂ A ⊂ CO
Statement 2: Moving sequencer algorithms, which work with open groups, provide
total ordering.
https://youtu.be/Ifeoyhn7t9U
10. Assignments ( For higher level learning and
Evaluation - Examples: Case study, Comprehensive
design, etc.,)
Consider the following simple method to collect a global snapshot (it may not always
collect a consistent global snapshot): an initiator process takes its snapshot and broadcasts
a request to take snapshot. When some other process receives this request, it takes a
Prove that such a collected distributed snapshot will be consistent iff the following
holds (assume there are n processes in the system and Vti denotes the vector timestamp of
Asynchronous executions
FIFO executions
A FIFO execution is an A-execution in which
UNIT II – MESSAGE ORDERING & SNAPSHOTS
On any logical link in the system, messages are necessarily delivered in the
order in which they are sent. Although the logical link is inherently non-FIFO,
most network protocols provide a connection-oriented service at the transport
layer. A simple algorithm to implement a FIFO logical channel over a non-FIFO
channel would use a separate numbering scheme to sequence the messages on
each logical channel. The sender assigns and appends a (sequence_num,
connection_id) tuple to each message. The receiver uses a buffer to order the
incoming messages as per the sender’s sequence numbers, and accepts only the
“next” message in sequence.
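The receiver side of this scheme can be sketched as follows. This is a minimal illustration, not the notes' own algorithm listing: the class name FifoReceiver, the function receive_all, and the choice to start sequence numbers at 0 are assumptions made for the example.

```python
class FifoReceiver:
    """Reorders messages of one logical channel using the sender's sequence numbers."""

    def __init__(self):
        self.next_seq = 0   # sequence number of the "next" message to accept (assumed to start at 0)
        self.buffer = {}    # out-of-order arrivals, keyed by sequence number

    def on_arrival(self, seq, payload):
        """Buffer the arrival and return every message that can now be accepted in order."""
        self.buffer[seq] = payload
        accepted = []
        while self.next_seq in self.buffer:
            accepted.append(self.buffer.pop(self.next_seq))
            self.next_seq += 1
        return accepted


def receive_all(arrivals):
    """arrivals: list of ((connection_id, sequence_num), payload) in arrival order."""
    receivers = {}   # one reordering buffer per logical channel
    accepted = []
    for (conn_id, seq), payload in arrivals:
        rx = receivers.setdefault(conn_id, FifoReceiver())
        accepted.extend((conn_id, p) for p in rx.on_arrival(seq, payload))
    return accepted


# The underlying channel delivers "b" before "a"; the receiver restores send order.
arrivals = [(("c1", 1), "b"), (("c1", 0), "a"), (("c2", 0), "x"), (("c1", 2), "c")]
in_order = receive_all(arrivals)
```

Here message "b" is held in the buffer until "a" (sequence number 0) arrives, so the application sees a, b, x, c.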
If two send events s and s’ are related by causality ordering (not physical
time ordering), then a causally ordered execution requires that their
corresponding receive events r and r’ occur in the same order at all common
destinations.
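One way to test this definition on a recorded execution is sketched below. It assumes that each send event carries a vector timestamp, so that s causally precedes s’ exactly when the timestamp of s is strictly smaller; the function names and the data layout are hypothetical and chosen only for this illustration.

```python
def happened_before(u, v):
    """True if vector timestamp u is strictly smaller than v, i.e. u causally precedes v."""
    return all(a <= b for a, b in zip(u, v)) and u != v


def causal_order_violations(send_ts, delivery_order):
    """
    send_ts: message id -> vector timestamp of its send event.
    delivery_order: destination -> list of message ids in the order they were received.
    Returns (destination, sent_first, received_first) for every violating pair.
    """
    violations = []
    for dest, order in delivery_order.items():
        for i, first in enumerate(order):
            for later in order[i + 1:]:
                # 'later' was received after 'first'; CO is violated if
                # send(later) causally precedes send(first).
                if happened_before(send_ts[later], send_ts[first]):
                    violations.append((dest, later, first))
    return violations


# m1 is sent by P1; m2 is sent by P2 after P2 has heard from P1, so send(m1) precedes send(m2).
send_ts = {"m1": (1, 0, 0), "m2": (2, 2, 0)}
bad = causal_order_violations(send_ts, {"P3": ["m2", "m1"]})
good = causal_order_violations(send_ts, {"P3": ["m1", "m2"]})
```

Messages whose send events are concurrent never produce a violation, which matches the examples where no send events are related by causality.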
Examples
(b) shows an execution that satisfies CO. Only s1 and s2 are related by
causality but the destinations of the corresponding messages are different.
(c) shows an execution that satisfies CO. No send events are related by
causality.
(d) shows an execution that satisfies CO. s2 and s1 are related by causality
but the destinations of the corresponding messages are different. Similarly for s2
and s3.
A message m’ sent causally later than m is not received causally earlier than m at
the common destination. This ordering is known as message ordering (MO).
Synchronous execution
By assuming that a send event and its corresponding receive event are
viewed atomically i.e., s(M) ≺ r(M) and r(M) ≺ s(M) , it follows that for any events
ei and ej that are not the send event and the receive event of the same message,
ei ≺ ej ⟹ T(ei) < T(ej).
In the non-separated linear extension, if the adjacent send event and its
corresponding receive event are viewed atomically, then that pair of events shares
a common past and a common future with each other.
In a crown, the send event si and receive event ri+1 may lie on the same
process or may lie on different processes.
1. In an execution that is not CO, there must exist pairs (s, r) and (s’, r’)
such that s ≺ r’ and s’ ≺ r. It is possible to generalize this to state that a non-CO
execution must have a crown of size at least 2.
An execution is not RSC and its graph G→ contains a cycle if and only if in
the corresponding space– time diagram, it is possible to form a cycle by (i)
moving along message arrows in either direction, but (ii) always going left to right
along the time line of any process.
Simulations
Non-determinism
1. A receive call can receive a message from any sender who has sent a
message, if the expected sender is not specified.
2. Multiple send and receive calls which are enabled at a process can be
executed in an interchangeable order.
Rendezvous
2. Each process attempts to schedule only one send event at any time.
The message types used are: (i) M, (ii) ack(M), (iii) request(M), and (iv)
permission(M). A process blocks when it knows that it can successfully
synchronize the current message with the partner process. Each process maintains a
queue that is processed in FIFO order only when the process is unblocked. When
a process is blocked waiting for a particular message that it is currently
synchronizing, any other message that arrives is queued up.
Execution events in the synchronous execution are only the send of the
message M and the receive of the message M. The other message types, ack(M),
request(M), and permission(M), are control messages, and their send and receive
events are not events of the synchronous execution. We use capital SEND(M) and
RECEIVE(M) to denote the primitives in the application execution; the lower case
send and receive are used for the control messages.
The key rules to prevent cycles among the messages are summarized
as follows:
Also, while waiting for this permission, if a request(M’) from a lower priority
process arrives, a permission(M’) is returned and the process blocks until M’
actually arrives.
Group communication
Causal order has many applications such as updating replicated data, allocating
requests in a fair manner, and synchronizing multimedia streams.
The Propagation Constraints also imply that if either (I) or (II) is false, the
information “d ∈ M ” must not be stored or propagated, even to remember that
(I) or (II) has been falsified.
4. must not exist at e7, e8 because (I) and (II) are false.
Information about messages (i) not known to be delivered and (ii) not
guaranteed to be delivered in CO, is explicitly tracked by the algorithm using
(source, timestamp, destination) information. The information must be deleted as
soon as either (i) or (ii) becomes false. The key problem in designing an optimal
CO algorithm is to identify the events at which (i) or (ii) becomes false.
Multicast M4,3
At event (4, 3), the information “P6 ∈ M5,1 ” in Log4 is propagated on multicast
M4,3 only to process P6 to ensure causal delivery using the Delivery Condition. The
piggybacked information on message M4,3 sent to process P3 must not contain this
information because of constraint II.
Processing at P1
TOTAL ORDER
For each pair of processes Pi and Pj and for each pair of messages Mx and My
that are delivered to both the processes, Pi is delivered Mx before My if and only if
Pj is delivered Mx before My.
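The definition can be checked mechanically on recorded delivery sequences. The sketch below is only an illustration: the function name satisfies_total_order and the dictionary layout are assumptions, and message identifiers are assumed to be comparable values such as strings.

```python
from itertools import combinations


def satisfies_total_order(deliveries):
    """deliveries: process -> list of message ids in the order they were delivered."""
    # Position of every message in each process's delivery sequence.
    pos = {p: {m: i for i, m in enumerate(order)} for p, order in deliveries.items()}
    for pi, pj in combinations(pos, 2):
        common = set(pos[pi]) & set(pos[pj])   # messages delivered to both processes
        for mx, my in combinations(sorted(common), 2):
            # Pi delivers Mx before My if and only if Pj does.
            if (pos[pi][mx] < pos[pi][my]) != (pos[pj][mx] < pos[pj][my]):
                return False
    return True


ok = satisfies_total_order({"P1": ["a", "b", "c"], "P2": ["a", "c"], "P3": ["b", "c"]})
broken = satisfies_total_order({"P1": ["a", "b"], "P2": ["b", "a"]})
```

Only messages delivered to both processes of a pair are compared, so processes that receive different subsets of the messages can still satisfy total order.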
Complexity: Each message transmission takes two message hops and exactly n
messages in a system of n processes.
Drawbacks: A centralized algorithm has a single point of failure and congestion,
and is therefore not an elegant solution.
Three-phase distributed algorithm
A distributed algorithm that enforces total and causal order for closed groups is
given in Algorithm.
Sender
Phase 1 In the first phase, a process multicasts (line 1b) the message M with a
locally unique tag and the local timestamp to the group members.
Phase 2 In the second phase, the sender process awaits a reply from all the
group members who respond with a tentative proposal for a revised timestamp
for that message M. The await call in line 1d is non-blocking.
Phase 3 In the third phase, the process multicasts the final timestamp to the
group in line.
Receivers
Phase 1 In the first phase, the receiver receives the message with a
tentative/proposed timestamp. It updates the variable priority that tracks the
highest proposed timestamp (line 2a), then revises the proposed timestamp to
the priority, and places the message with its tag and the revised timestamp at the
tail of the queue temp_Q (line 2b). In the queue, the entry is marked as
undeliverable.
Phase 2 In the second phase, the receiver sends the revised timestamp (and
the tag) back to the sender (line 2c). The receiver then waits in a non-blocking
manner for the final timestamp.
Phase 3 In the third phase, the final timestamp is received from the multicaster
(line 3). The corresponding message entry in temp_Q is identified using the tag
(line 3a), and is marked as deliverable (line 3b) after the revised timestamp is
overwritten by the final timestamp (line 3c). The queue is then resorted using the
timestamp field of the entries as the key (line 3c).
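The three phases can be sketched in a few lines. This is a simplified, single-threaded illustration rather than the algorithm listing the text refers to: the names Receiver, on_revise_ts, on_final_ts and final_timestamp are hypothetical, the update priority = max(priority + 1, timestamp) is an assumption chosen to agree with the worked example in these notes, and ties are broken by tag.

```python
class Receiver:
    def __init__(self):
        self.priority = 0      # highest timestamp proposed so far
        self.temp_q = []       # entries: [timestamp, tag, message, deliverable]
        self.delivery_q = []   # messages in their final delivery order

    def on_revise_ts(self, message, tag, ts):
        """Receiver phases 1 and 2: queue as undeliverable, return the proposed timestamp."""
        self.priority = max(self.priority + 1, ts)   # assumed update rule
        self.temp_q.append([self.priority, tag, message, False])
        return self.priority

    def on_final_ts(self, tag, final_ts):
        """Receiver phase 3: overwrite the timestamp, mark deliverable, re-sort, deliver."""
        for entry in self.temp_q:
            if entry[1] == tag:
                entry[0] = final_ts
                entry[3] = True
        self.temp_q.sort(key=lambda e: (e[0], e[1]))   # tie-break on tag (assumption)
        while self.temp_q and self.temp_q[0][3]:
            self.delivery_q.append(self.temp_q.pop(0)[2])


def final_timestamp(proposals):
    """Sender phase 3: the final timestamp is the maximum of the proposals."""
    return max(proposals)


# Senders A and B multicast to receivers C and D with local timestamps 7 and 9.
C, D = Receiver(), Receiver()
a_props = [C.on_revise_ts("from A", "A1", 7)]        # C proposes 7
b_props = [D.on_revise_ts("from B", "B1", 9)]        # D proposes 9
b_props.append(C.on_revise_ts("from B", "B1", 9))    # C proposes 9
a_props.append(D.on_revise_ts("from A", "A1", 7))    # D proposes 10
final_a, final_b = final_timestamp(a_props), final_timestamp(b_props)
C.on_final_ts("A1", final_a)   # A's message waits behind B's undeliverable entry
D.on_final_ts("B1", final_b)
C.on_final_ts("B1", final_b)
D.on_final_ts("A1", final_a)
```

Both receivers end up delivering B's message before A's, even though the control messages reached them in different orders.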
Complexity
This algorithm uses three phases, and, to send a message to n − 1 processes, it uses 3(n – 1) messages and
Example
1. A sends a REVISE_TS(7) message, having timestamp 7. B sends a REVISE_TS(9) message, having
timestamp 9.
2. C receives A’s REVISE_TS(7), enters the corresponding message in temp_Q, and marks it as undeliverable;
3. D receives B’s REVISE_TS(9), enters the corresponding message in temp_Q, and marks it as undeliverable;
4. C receives B’s REVISE_TS(9), enters the corresponding message in temp_Q, and marks it as undeliverable;
5. D receives A’s REVISE_TS(7), enters the corresponding message in temp_Q, and marks it as undeliverable;
priority = 10. D assigns a tentative timestamp value of 10, which is greater than all of the timestamps on REVISE_TSs
seen so far, and then sends PROPOSED_TS(10) message to A.
6. When A receives PROPOSED_TS(7) from C and PROPOSED_TS(10) from D, it computes the final timestamp
7. When B receives PROPOSED_TS(9) from C and PROPOSED_TS(9) from D, it computes the final timestamp as
8. C receives FINAL_TS(10) from A, updates the corresponding entry in temp_Q with the timestamp, resorts the
queue, and marks the message as deliverable. As the message is not at the head of the queue, and some entry ahead
9. D receives FINAL_TS(9) from B, updates the corresponding entry in temp_Q by marking the corresponding
message as deliverable, and resorts the queue. As the message is at the head of the queue, it is moved to delivery_Q.
10. When C receives FINAL_TS(9) from B, it will update the corresponding entry in temp_Q by marking the
corresponding message as deliverable. As the message is at the head of the queue, it is moved to the delivery_Q, and
the next message (of A), which is also deliverable, is also moved to the delivery_Q.
11. When D receives FINAL_TS(10) from A, it will update the corresponding entry in temp_Q by marking the
corresponding message as deliverable. As the message is at the head of the queue, it is moved to the delivery_Q.
The system consists of a collection of n processes, p1, p2, ..., pn, that are
connected by channels. There is no globally shared memory and processes
communicate solely by passing messages. There is no physical global clock in the
system. Message send and receive is asynchronous. Messages are delivered
reliably with finite but arbitrary time delay. The system can be described as a
directed graph in which vertices represent the processes and edges represent
unidirectional communication channels. Let Cij denote the channel from process pi
to process pj .
Processes and channels have states associated with them. The state of a
process at any time is defined by the contents of processor registers, stacks, local
memory, etc., and may be highly dependent on the local context of the distributed
application. The state of channel Cij, denoted by SCij, is given by the set of
messages in transit in the channel.
At any instant, the state of process pi, denoted by LSi, is a result of the
sequence of all the events executed by pi up to that instant. A channel is a
distributed entity and its state depends on the local states of the processes on
which it is incident. For a channel Cij , the following set of messages can be
defined based on the local states of the processes pi and pj.
In the FIFO model, each channel acts as a first-in first-out message queue and,
thus, message ordering is preserved by a channel. In the non-FIFO model, a
channel acts like a set in which the sender process adds messages and the
receiver process removes messages from it in a random order. A system that
supports causal delivery of messages satisfies the following property: for any two
messages mij and mkj, if send(mij) → send(mkj), then rec(mij) → rec(mkj).
The global state of a distributed system is a collection of the local states of the
processes and the channels. Notationally, global state GS is defined as
GS = {∪i LSi, ∪i,j SCij}.
A cut is a line joining an arbitrary point on each process line that slices the
space–time diagram into a PAST and a FUTURE. Recall that every cut corresponds
to a global state and every global state can be graphically represented by a cut in
the computation’s space–time diagram.
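Whether a given cut corresponds to a consistent global state can be tested with the standard condition that every message whose receive event lies in the PAST must also have its send event in the PAST. The sketch below illustrates that check; the function name, the event-index encoding of a cut, and the message tuples are assumptions made for the example.

```python
def is_consistent_cut(cut, messages):
    """
    cut: process -> number of events of that process that lie in the PAST.
    messages: list of (sender, send_index, receiver, recv_index), using
              1-based event indices along each process line.
    """
    for sender, send_idx, receiver, recv_idx in messages:
        received_in_past = recv_idx <= cut[receiver]
        sent_in_past = send_idx <= cut[sender]
        # A message received in the PAST but sent in the FUTURE makes the cut inconsistent.
        if received_in_past and not sent_in_past:
            return False
    return True


# p1 sends a message at its 2nd event; p2 receives it at its 1st event.
messages = [("p1", 2, "p2", 1)]
inconsistent = is_consistent_cut({"p1": 1, "p2": 1}, messages)   # receive without send
consistent = is_consistent_cut({"p1": 2, "p2": 1}, messages)     # send and receive in PAST
in_transit = is_consistent_cut({"p1": 2, "p2": 0}, messages)     # sent but not yet received
```

The third case is still consistent: a message sent in the PAST and received in the FUTURE is simply in transit and belongs to the channel state.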
2. Any message that is sent by a process after recording its snapshot, must
not be recorded in the global snapshot (from C2).
A process pj must record its snapshot before processing a message mij that
was sent by process pi after recording its snapshot.
Chandy–Lamport algorithm
Since all messages that follow a marker on channel Cij have been sent by
process pi after pi has taken its snapshot, process pj must record its snapshot no
later than when it receives a marker on channel Cij .
The algorithm
The recorded local snapshots can be put together to create the global snapshot in
several ways. One policy is to have each process send its local snapshot to the initiator of
the algorithm. Another policy is to have each process send the information it records
along all outgoing channels, and to have each process receiving such information for the
first time propagate it along its outgoing channels. All the local snapshots get
disseminated to all other processes and all the processes can determine the global state.
Multiple processes can initiate the algorithm concurrently. If multiple processes initiate
the algorithm concurrently, each initiation needs to be distinguished by using unique
markers. Different initiations by a process are identified by a sequence number.
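A small single-initiator simulation of the marker rules is sketched below for a money-transfer system. It is an illustration under assumptions, not the algorithm listing from these notes: the Network class, its method names, and the MARKER sentinel are hypothetical, channels are modelled as FIFO queues, and the local state of a process is just its balance.

```python
from collections import deque

MARKER = object()   # hypothetical sentinel standing in for the marker message


class Network:
    def __init__(self, balances):
        self.state = dict(balances)     # application state (balance) of each process
        pids = list(balances)
        self.chan = {(i, j): deque() for i in pids for j in pids if i != j}
        self.snap = {}                  # pid -> recorded local state
        self.chan_snap = {}             # (i, j) -> messages recorded as in transit
        self.open = {}                  # pid -> incoming channels still being recorded

    def send(self, i, j, amount):
        self.state[i] -= amount
        self.chan[(i, j)].append(amount)

    def record(self, i):
        """Marker sending rule: record the local state, then send a marker on every outgoing channel."""
        self.snap[i] = self.state[i]
        self.open[i] = {c for c in self.chan if c[1] == i}
        for c in self.open[i]:
            self.chan_snap[c] = []
        for c in self.chan:
            if c[0] == i:
                self.chan[c].append(MARKER)

    def deliver(self, i, j):
        """Deliver the head of FIFO channel (i, j) to process j."""
        msg = self.chan[(i, j)].popleft()
        if msg is MARKER:
            if j not in self.snap:
                self.record(j)                 # first marker seen: record state now
            self.open[j].discard((i, j))       # state of channel (i, j) is complete
        else:
            if j in self.snap and (i, j) in self.open[j]:
                self.chan_snap[(i, j)].append(msg)   # arrived after recording, before the marker
            self.state[j] += msg


net = Network({"p1": 100, "p2": 50})
net.send("p1", "p2", 10)
net.send("p2", "p1", 20)
net.record("p1")           # p1 initiates the snapshot
net.deliver("p2", "p1")    # 20 arrives after p1 recorded: recorded as in transit
net.deliver("p1", "p2")    # 10 arrives before p2 has recorded
net.deliver("p1", "p2")    # marker: p2 records its state and sends its own marker
net.deliver("p2", "p1")    # marker: recording on channel (p2, p1) stops
recorded_total = sum(net.snap.values()) + sum(sum(v) for v in net.chan_snap.values())
```

In this run the recorded process states (90 and 40) plus the recorded channel state (20 in transit from p2 to p1) add up to the initial total of 150, which is what a consistent snapshot of a money-transfer system should preserve.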
Correctness
To prove the correctness of the algorithm, we show that a recorded snapshot satisfies
conditions C1 and C2. Due to FIFO property of channels, it follows that no message sent
after the marker on that channel is recorded in the channel state. Thus, condition C2 is
satisfied. When a process pj receives message mij that precedes the marker on channel Cij,
it acts as follows: if process pj has not taken its snapshot yet, then it includes mij in its
recorded snapshot. Otherwise, it records mij in the state of the channel Cij . Thus,
condition C1 is satisfied.
Complexity
The recording part of a single instance of the algorithm requires O(e) messages and
O(d) time, where e is the number of edges in the network and d is the diameter of the
network.
The recorded global state is a valid state in an equivalent execution and if a stable
property (i.e., a property that persists such as termination or deadlock) holds in the
system before the snapshot algorithm begins, it holds in the recorded global snapshot.
Therefore, a recorded global state is useful in detecting stable properties.
Spezialetti–Kearns algorithm
A key notion used by the optimizations is that of a region in the system. A region
encompasses all the processes whose master field contains the identifier of the same
initiator. A region is identified by the initiator’s identifier. When the initiator’s identifier in
a marker received along a channel is different from the value in the master variable, a
concurrent initiation of the algorithm is detected and the sender of the marker lies in a
different region. The identifier of the concurrent initiator is recorded in a local variable
id-border-set. The process receiving the marker does not take a snapshot for this marker
and does not propagate this marker.
Snapshot recording at a process is complete after it has received a marker along each
of its channels. After every process has recorded its snapshot, the system is partitioned
into as many regions as the number of concurrent initiations of the algorithm. The
variable id-border-set at a process contains the identifiers of the neighboring regions.
When the initiator receives the locally recorded states of all its descendants
from its children processes, it assembles the snapshot for all the processes in its
region and the channels incident on these processes. The initiator exchanges the
snapshot of its region with the initiators in adjacent regions in rounds. In each
round, an initiator sends to initiators in adjacent regions, any new information
obtained from the initiator in the adjacent region during the previous round of
message exchange. A round is complete when an initiator receives information, or
the blank message (signifying no new information will be forthcoming) from all
initiators of adjacent regions from which it has not already received a blank
message.
The incremental snapshot algorithm assumes bidirectional FIFO channels, the presence
of a single initiator, a fixed spanning tree in the network, and four types of control
messages: init_snap, snap_completed, regular, and ack. init_snap and snap_completed
messages traverse the spanning tree edges. regular and ack messages, which serve to
record the state of non-spanning edges, are not sent on those edges on which no
computation message has been sent since the previous snapshot.
The algorithm works as follows: snapshots are assigned version numbers and all
algorithm messages carry this version number. The initiator notifies all the processes of the
version number of the new snapshot by sending init_snap messages along the spanning
tree edges. A process follows the “marker sending rule” when it receives this notification
or when it receives a regular message with a new version number. The “marker sending
rule” is modified so that the process sends regular messages along only those channels
on which it has sent computation messages since the previous snapshot, and the process
waits for ack messages in response to these regular messages. When a leaf process in
the spanning tree receives all the ack messages it expects, it sends a snap_completed
message to its parent process. When a non-leaf process in the spanning tree receives all
the ack messages it expects, as well as a snap_completed message from each of its child
processes, it sends a snap_completed message to its parent process. The algorithm
terminates when the initiator has received all the ack messages it expects, as well as a
snap_completed message from each of its child processes.
In Helary’s algorithm, the “marker sending rule” is executed when a control message
belonging to the wave flow visits the process. The process then forwards a control
message to other processes, depending on the wave traversal structure, to continue the
wave’s progression. The “marker receiving rule” is modified so that if the process has not
recorded its state when a marker is received on some channel, the “marker receiving
rule” is not executed and no messages received after the marker on this channel are
processed until the control message belonging to the wave flow visits the process. Thus,
each process follows the “marker receiving rule” only after it is visited by a control
message belonging to the wave.
Open Group
9. What are the key rules to prevent cycles among the messages? (K2, CO2)
The key rules to prevent cycles among the
messages are summarized as follows:
To send to a lower priority process, messages M
and ack(M) are involved in that order. The sender
issues send(M) and blocks until ack(M) arrives.
To send to a higher priority process, messages
request(M), permission(M), and M are involved, in
that order. The sender issues send(request(M)),
does not block, and awaits permission. When
permission(M) arrives, the sender issues send(M).
10. Define global state. (K1, CO2)
The global state of a distributed system is a collection of the
local states of its components.
A global state of the distributed system (called a checkpoint) is
periodically saved, and recovery from a processor failure is done
by restoring the system to the last saved global state. For
debugging distributed software, the system is restored to a
consistent global state and the execution resumes from there in
a controlled manner.
1. i) Design FIFO and non-FIFO executions. ii) Discuss causally ordered executions. (K1, CO2)
2. Show, with an equivalent timing diagram, a synchronous execution on an asynchronous system. (K1, CO2)
3. Show, with an equivalent timing diagram, an asynchronous execution on a synchronous system. (K1, CO2)
4. Explain the Chandy–Lamport algorithm. (K1, CO2)
5. Examine the two possible executions of the snapshot algorithm for money transfer. (K1, CO2)
6. Examine the necessary and sufficient conditions for causal ordering. (K2, CO2)
7. Analyze in detail the centralized algorithm to implement total order and causal order of messages. (K1, CO2)
8. Discuss in detail the distributed algorithm to implement total order and causal order of messages. (K2, CO2)
9. Explain snapshot algorithms for FIFO channels. (K1, CO2)
10. Compare the Chandy–Lamport algorithm with the Spezialetti–Kearns algorithm. (K2, CO2)
13. Supportive online Certification courses
(NPTEL, Swayam, Coursera, Udemy, etc.,)
Online Repository:
1. Google Drive 2. GitHub 3. Code Guru
2 NPTEL Swayam
https://onlinecourses.nptel.ac.in
http://engineeringvideolectures.com
14. Real time Applications in day to day life
and to Industry
Examples of distributed systems and applications of
distributed computing include the following:
ONLINE GAMES
Telecommunication networks: telephone networks and
cellular networks, ...
network applications: World Wide Web and peer-to-peer
networks, ...
real-time process control: aircraft control systems, ...
parallel computation:
15.Contents beyond the Syllabus ( COE related Value
added courses)
1. Bankers Algorithm
16. Assessment Schedule ( Proposed Date & Actual
Date)
3 Model Examination
17. Prescribed Text Books & Reference Books
TEXT BOOKS:
1. Kshemkalyani, Ajay D., and Mukesh Singhal. Distributed Computing: Principles,
Algorithms, and Systems. Cambridge University Press, 2011.
2. George Coulouris, Jean Dollimore and Tim Kindberg, “Distributed Systems:
Concepts and Design”, Fifth Edition, Pearson Education, 2012.
REFERENCES: