MJF Distributed Computing Series 1


DC Series 1

Important topics
The snapshot recording algorithm is not in this PDF.

Primitives for distributed communication


1. Synchronous Primitives:

Both Send and Receive primitives are synchronous if they engage in a
handshake.

Processing for the Send primitive finishes only when the invoking processor
confirms that the corresponding Receive primitive has been invoked and the
receive operation has completed.

2. Asynchronous Primitives:

A Send primitive is asynchronous if control returns to the invoking process
after the data item has been copied from the user-specified buffer.

3. Blocking Primitives:

A primitive is blocking if control returns to the invoking process only after
the primitive's processing completes, regardless of whether it is synchronous
or asynchronous.

Blocking Send: Waits until the sent message is received by the receiver and
acknowledged.

Blocking Receive: Remains blocked until a message is received.

4. Non-blocking Primitives:

A primitive is non-blocking if control returns to the invoking process
immediately after invocation, even if the operation has not completed.

Non-blocking Send: Continues execution after initiating the send, without
waiting for the message to be delivered.

Non-blocking Receive: Continues execution regardless of whether a message
has been received.
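The contrast between blocking and non-blocking behaviour can be sketched with Python threads, using a bounded queue as a stand-in for a communication channel (the `channel` object and the message are illustrative, not part of any distributed-systems API):

```python
import queue
import threading

# A bounded queue stands in for a communication channel.
channel = queue.Queue(maxsize=1)

def receiver():
    msg = channel.get()       # blocking receive: waits until a message arrives
    print("received:", msg)

t = threading.Thread(target=receiver)
t.start()

channel.put("hello")          # blocking send: waits if the channel buffer is full
t.join()

# The non-blocking variants return immediately and signal failure
# instead of waiting when the operation cannot complete.
try:
    channel.get_nowait()      # non-blocking receive: the channel is empty here
except queue.Empty:
    print("no message available")
```

Note that a real blocking send would additionally wait for delivery (and, in the synchronous case, for the matching receive to complete), which the queue alone does not model.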

5. Processor Synchrony:

Processor synchrony means all processors execute at the same time with
synchronized clocks.

Achieving perfect synchrony in a distributed system is not feasible.

Instead, synchronization occurs at a higher level of code, known as a "step."

Barrier synchronization methods are used to ensure that no processor
proceeds to the next step until all processors have completed the current
step.
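The step-wise execution described above can be sketched with Python's `threading.Barrier`; the worker function, step labels, and process count are illustrative:

```python
import threading

NUM_PROCS = 3
barrier = threading.Barrier(NUM_PROCS)
results = []

def worker(pid):
    results.append(("step1", pid))  # every process finishes step 1...
    barrier.wait()                  # ...before any process may start step 2
    results.append(("step2", pid))

threads = [threading.Thread(target=worker, args=(i,)) for i in range(NUM_PROCS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# All step-1 entries precede all step-2 entries.
assert all(step == "step1" for step, _ in results[:NUM_PROCS])
assert all(step == "step2" for step, _ in results[NUM_PROCS:])
```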

Characteristics of distributed computing

Distributed System
A distributed system is a collection of independent entities that cooperate to
solve a problem that no single entity can solve on its own.

A distributed system can be characterized in several ways:

1. Crash tolerance: A single machine failure does not halt the system's ability to
perform work.

2. Lack of shared memory and physical clock: Computers in the system
communicate via message passing over a network, each with its own memory
and operating system.

3. Single coherent system: Despite being independent, the collection of
computers appears as a unified entity to users.

4. Wide range of configurations: Encompasses systems from loosely coupled
wide-area networks to tightly coupled local area networks.

Features of Distributed Systems include:

1. No common physical clock: Computers may have different time references.

2. No shared memory: Each computer operates independently with its own
memory.

3. Geographical separation: Computers can be located in different physical
locations.

4. Autonomy and heterogeneity: Processors have varying speeds and may run
different operating systems, yet cooperate by offering services or solving
problems jointly.

Motivation for Distributed Systems (Advantages)


1. Resource sharing: Distributed systems allow for sharing resources among
multiple entities.

2. Access to geographically remote data and resources: Users can access data
and resources located in different geographical locations.

3. Enhanced reliability: Reliability encompasses availability, integrity, and
fault tolerance.

4. Increased performance/cost ratio: By sharing resources and accessing
remote data, distributed systems improve performance relative to cost.

5. Scalability: Distributed systems can scale to accommodate growing demands.

6. Modularity and incremental expandability: Distributed systems are modular
and can be expanded incrementally.

7. Primitives for distributed communication: Various communication primitives
facilitate communication between distributed entities.

Scalar and Vector time


Scalar time
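The scalar time rules (Lamport's logical clock) are: increment the local clock before every event, piggyback the clock value on each message, and on receive set the clock to the maximum of the local and message values plus one. A minimal sketch, with illustrative class and method names:

```python
class ScalarClock:
    """Lamport scalar clock for a single process (illustrative sketch)."""

    def __init__(self):
        self.time = 0

    def internal_event(self):
        self.time += 1                            # R1: tick before every event

    def send(self):
        self.time += 1                            # R1: tick, then...
        return self.time                          # ...piggyback the timestamp

    def receive(self, msg_time):
        self.time = max(self.time, msg_time) + 1  # R2: take max, then tick

p1, p2 = ScalarClock(), ScalarClock()
p1.internal_event()   # p1.time == 1
ts = p1.send()        # p1.time == 2; message carries timestamp 2
p2.internal_event()   # p2.time == 1
p2.receive(ts)        # p2.time == max(1, 2) + 1 == 3
```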

Vector Time
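Vector time extends the scalar rules: each process keeps one clock component per process, ticks its own component before each event, and on receive takes the component-wise maximum with the message's vector before ticking. A minimal sketch with illustrative names:

```python
class VectorClock:
    """Vector clock for a single process (illustrative sketch)."""

    def __init__(self, pid, n):
        self.pid = pid
        self.clock = [0] * n

    def internal_event(self):
        self.clock[self.pid] += 1

    def send(self):
        self.clock[self.pid] += 1
        return list(self.clock)          # a copy travels with the message

    def receive(self, msg_clock):
        # Component-wise maximum, then tick this process's own component.
        self.clock = [max(a, b) for a, b in zip(self.clock, msg_clock)]
        self.clock[self.pid] += 1

p0, p1 = VectorClock(0, 3), VectorClock(1, 3)
ts = p0.send()       # p0.clock == [1, 0, 0]
p1.receive(ts)       # p1.clock == [1, 1, 0]
```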

Bully Algorithm

NOTE: For Understanding

The Bully Algorithm determines the coordinator in a distributed system.

The process with the highest ID becomes the coordinator.

There are three types of messages in the algorithm: election, answer, and
coordinator messages.

A process with the highest ID can declare itself as the coordinator by sending
a coordinator message to processes with lower IDs.

Conversely, a process with a lower ID initiates an election by sending an
election message to higher-ID processes and awaits answer messages.

If no response arrives within time T, the initiating process becomes the
coordinator and notifies lower-ID processes.

Otherwise, it waits for a coordinator message from the new coordinator.

Upon receiving a coordinator message, a process updates its coordinator
variable and acknowledges the new coordinator.

Upon receiving an election message, a process responds with an answer
message and initiates another election if one is not already in progress.

When a process notices the coordinator is unresponsive, it starts an election
by sending an ELECTION message to higher-ID processes.

If no response arrives, the initiating process becomes the coordinator; if a
higher-ID process responds, that process takes over the election.

Upon receiving an ELECTION message from a lower-ID process, a process
responds with an OK message and may initiate an election if one is not
already underway.

Eventually, only one process continues the election and becomes the new
coordinator.

The new coordinator announces its victory by informing all processes.

If a previously down process comes back, it initiates an election.

If it has the highest ID, it wins the election and becomes the coordinator.

The "biggest guy" always wins, hence the name "bully" algorithm.
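The election cascade above can be sketched as a single-round simulation; the function and its arguments are illustrative, and a real implementation would exchange election/answer/coordinator messages with timeouts rather than recurse:

```python
def bully_election(alive_ids, initiator):
    """Simulate one Bully election among the currently alive process IDs,
    started by `initiator` (the process that noticed the coordinator failed)."""
    higher = [pid for pid in alive_ids if pid > initiator]
    if not higher:
        # No higher-ID process answered: the initiator wins and would now
        # broadcast a coordinator message to all lower-ID processes.
        return initiator
    # Otherwise a higher-ID process answers and takes over the election;
    # the cascade continues until the highest alive ID wins.
    return bully_election(alive_ids, min(higher))

print(bully_election([1, 2, 4, 6], initiator=1))  # the highest alive ID, 6, wins
```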

YouTube video link: https://youtu.be/2DUq-9yNSls

Design issues and challenges of distributed systems
The following functions must be addressed when designing and building a
distributed system:

1. Communication:

Designing mechanisms like RPC, ROI, message-oriented vs. stream-oriented
communication.

2. Processes:

Managing processes/threads, code migration, designing software and mobile
agents.

3. Naming:

Creating robust schemes for names, identifiers, and addresses for resource
location.

4. Synchronization Mechanisms:

Mutual exclusion, clock synchronization, logical clocks for time passage.

5. Data storage and access:

Efficient data storage and access schemes across the network.

6. Consistency and replication:

Replication of data objects to avoid bottlenecks and provide scalability.

7. Fault tolerance:

Ensuring correct operation despite failures of links, nodes, or processes.

8. Security:

Cryptography, secure channels, access control, key management,
authorization, and group management.

9. API and transparency:

Access, location, migration, relocation, replication, concurrency, and
failure transparency.

10. Scalability and modularity:

Distributed algorithms, data, and services, along with techniques like
replication and caching for scalability.

Models of Communication Channels


1. FIFO (first-in, first-out): each channel acts as a FIFO message queue.

2. Non-FIFO (N-FIFO): a channel acts like a set in which the sender process
adds messages and the receiver removes messages in arbitrary order.

3. Causal Ordering (CO): Based on Lamport's happened-before (causality)
relation. This property ensures that causally related messages destined for
the same destination are delivered in an order that is consistent with their
causality relation. It simplifies the design of distributed algorithms
because it provides built-in synchronization.
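Causal ordering is commonly enforced with vector timestamps on messages, in the style of the Birman–Schiper–Stephenson protocol; the sketch below shows only the delivery test, with illustrative names:

```python
def deliverable(sender, msg_vc, delivered):
    """Causal-ordering delivery test (Birman-Schiper-Stephenson style).
    msg_vc is the message's vector timestamp; delivered[k] counts the
    messages already delivered from process k at this receiver."""
    # The message must be the next one expected from its sender...
    if msg_vc[sender] != delivered[sender] + 1:
        return False
    # ...and the receiver must already have delivered everything the
    # sender had seen from the other processes.
    return all(msg_vc[k] <= delivered[k]
               for k in range(len(msg_vc)) if k != sender)

delivered = [0, 0, 0]
assert deliverable(0, [1, 0, 0], delivered)      # first message from P0: deliver
assert not deliverable(1, [1, 1, 0], delivered)  # depends on P0's message: buffer
```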

Models of distributed computation


A distributed system comprises processors connected by a communication
network.

The communication network facilitates information exchange between processors.

Processors operate independently without a shared global memory,
communicating solely via message passing.

Distributed Program:

Comprises asynchronous processes communicating via message passing.

Each process runs on a separate processor.

Processes do not share memory and communicate only through
messages.

Global state includes process states and communication channels.

Execution and message transfer are asynchronous.

Message transmission delay is finite and unpredictable.

Model of Distributed Executions:

Process execution involves sequential actions.

Actions are atomic, categorized into:

1. Internal events

2. Message send events

3. Message receive events.

Spanning Tree based Termination Detection
Algorithm
Initially, each leaf process is given a token.

Each leaf process, after it has terminated, sends its token to its parent.

When a parent process terminates and after it has received a token from each
of its children, it sends a token to its parent.

This way, each process indicates to its parent process that the subtree below
it has become idle.

In a similar manner, the tokens get propagated to the root.

The root of the tree concludes that termination has occurred, after it has
become idle and has received a token from each of its children.

NOTE: For Understanding

N processes, Pi, 0 ≤ i < N, are represented as nodes in a fixed connected
undirected graph.

Communication channels between processes are represented by edges in the
graph.

A fixed spanning tree of the graph has process P0 as its root, responsible for
termination detection.

Process P0 communicates with other processes using signals to determine
their states.

Leaf nodes report termination to their parents.

Parent nodes report termination after they have completed and all their
immediate children have reported termination.

The root concludes termination if it and all its immediate children have
terminated.

The termination detection algorithm involves two waves of signals: inward and
outward through the spanning tree.

Initially, a contracting wave of signals (tokens) moves inward from leaves to
the root.

If the token wave reaches the root without detecting termination, the root
initiates an outward wave of repeat signals.

As the repeat wave reaches leaves, the token wave reforms and moves inward
again.

This process repeats until termination is detected.
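The inward token wave can be sketched as a recursive check over the spanning tree; this single-wave sketch assumes processes never become active again, and the tree layout and names are illustrative:

```python
def collect_tokens(tree, node, terminated):
    """Return True if `node` can forward a token for its whole subtree:
    it has terminated and has received a token from each of its children."""
    return terminated[node] and all(
        collect_tokens(tree, child, terminated)
        for child in tree.get(node, [])
    )

tree = {0: [1, 2], 1: [3, 4]}   # spanning tree with P0 as root
done = {pid: True for pid in range(5)}
assert collect_tokens(tree, 0, done)      # root concludes termination

done[4] = False                           # one leaf is still active
assert not collect_tokens(tree, 0, done)  # token wave stalls below P1
```

The repeat-wave refinement described above handles processes that become active again after sending their token; this sketch covers only the final, successful wave.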

YouTube video link: https://youtu.be/aJEIoKxdqFQ

Termination detection by weight throwing
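In weight throwing, a controlling agent starts with weight 1, gives out portions of that weight to active processes (which in turn split their weight across the messages they send), and concludes termination once all weight has returned, i.e. no process is active and no message is in transit. A sketch of the controller side, with illustrative names:

```python
class Controller:
    """Controlling agent for weight-throwing termination detection
    (illustrative sketch; real weights are split further by processes)."""

    def __init__(self):
        self.weight = 1.0

    def hand_out(self, share):
        # Weight assigned to a newly activated process or outgoing message.
        self.weight -= share
        return share

    def take_back(self, share):
        # An idle process (or delivered message) returns its weight.
        self.weight += share

    def terminated(self):
        return self.weight == 1.0

ctrl = Controller()
w1 = ctrl.hand_out(0.5)     # activate process 1
w2 = ctrl.hand_out(0.25)    # activate process 2
assert not ctrl.terminated()
ctrl.take_back(w1)          # process 1 becomes idle
ctrl.take_back(w2)          # process 2 becomes idle
assert ctrl.terminated()
```

Using exact binary fractions (halves, quarters) keeps the floating-point bookkeeping exact; practical implementations often use integer counters or rationals instead.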

YouTube video link: https://youtu.be/b3S2pQ6oR60

