

DC QB

1. Describe code migration issues in detail. *


Ans:
 Code migration in distributed systems can involve moving an entire program
from one machine to another, which is known as process migration.
 This can be a complex task that should only be done for a good reason.
 Code migration in distributed systems can indeed pose several challenges.
Here are some common issues and considerations:

1) Data Consistency:
Ensuring that data remains consistent during migration is crucial. When
migrating code that interacts with databases or other data stores, you need
to handle data synchronization to avoid inconsistencies or data loss.

2) Versioning and Compatibility:
Ensuring that the migrated code is compatible with the existing versions of
software libraries, APIs, and dependencies across different nodes can be
challenging. Version mismatches can lead to runtime errors and
compatibility issues.

3) Network Latency and Bandwidth:
Distributed systems often span multiple nodes across a network. Migrating
code can lead to increased network traffic, latency, and bandwidth
consumption, especially if large amounts of data need to be transferred
between nodes.

4) Security:
Ensuring the security of the migrated code and data during transit and
execution is critical. Unauthorized access, data breaches, or injection attacks
can compromise system integrity and confidentiality.


2. Discuss and differentiate various client consistency models. *


Ans:
 Client consistency models define the guarantees provided to clients accessing a
distributed system regarding the visibility and ordering of data updates.
 These models ensure that clients observe a consistent view of the system
despite its distributed nature.
 Let's discuss and differentiate some of the key client consistency models:

1) Strong Consistency:
 In a strongly consistent system, all clients observe the same order of
updates and see the most recent version of data at all times.
 This model provides the strongest guarantee of consistency but may
incur higher latency due to synchronization requirements.

2) Sequential Consistency:
 Sequential consistency ensures that the order of operations performed
by each individual client is preserved and consistent with a global total
order.
 However, operations from different clients may be interleaved arbitrarily
as long as they respect each client's order.

3) Read Your Writes Consistency:
 Read your writes consistency ensures that a client always observes its
own writes (a small sketch of this model follows the list).
 Any write operation performed by a client will be immediately visible to
subsequent read operations from the same client.

4) Causal Consistency:
 Causal consistency preserves the causal relationships between
operations.
 If operation A causally precedes operation B, all clients must observe
operation A before operation B. However, concurrent operations may be
observed in different orders by different clients.
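
As a small illustration of model 3, here is a hypothetical sketch of how a client session can enforce read-your-writes consistency by remembering the version of its own last write and reading only from replicas that have already applied it. The Replica and Session classes and the version scheme are invented for this example, not a real library API.

```python
# Hypothetical sketch of read-your-writes consistency (illustrative names only).
class Replica:
    def __init__(self):
        self.version = 0          # highest update version this replica has applied
        self.data = {}

    def apply(self, key, value, version):
        self.data[key] = value
        self.version = max(self.version, version)

    def read(self, key):
        return self.data.get(key)


class Session:
    """Client session that remembers the version of its own last write."""
    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = replicas
        self.last_write_version = 0

    def write(self, key, value):
        version = self.primary.version + 1        # assume the primary orders writes
        self.primary.apply(key, value, version)   # replicas are updated lazily
        self.last_write_version = version

    def read(self, key):
        # Read-your-writes: skip replicas that have not yet seen this session's last write.
        for replica in self.replicas + [self.primary]:
            if replica.version >= self.last_write_version:
                return replica.read(key)
        raise RuntimeError("no replica has caught up with this session's writes yet")


# Usage: the stale replica is skipped until it catches up with the session's write.
primary, stale = Replica(), Replica()
s = Session(primary, [stale])
s.write("x", 42)
print(s.read("x"))    # 42, served by the primary; other clients may still read stale data
```

Other clients, which carry no session state tied to these writes, are free to read stale replicas; the guarantee only covers a client's own updates.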


3. Explain Raymond’s Algorithm for mutual exclusion. *


Ans:
 Raymond's Algorithm is a token-based distributed algorithm used for achieving
mutual exclusion in distributed systems; processes are arranged in a logical
tree along which a single token travels.
 The algorithm ensures that only one process can enter its critical section at a
time, even in a distributed environment where processes are located on
different nodes.
 Here's how Raymond's Algorithm works (a small sketch follows the steps):

1) Initialization: Processes are arranged in a logical tree. Each process maintains a
local FIFO queue of pending requests for the resource it wants to access and a
holder pointer toward the neighbor that is nearer to the token. Initially, the
queue is empty and a single process holds the token.

2) Requesting Access: When a process wants to access the shared resource, it
places its request in its local queue and sends a request message toward the
token holder via its holder neighbor in the tree.

3) Token Passing: A token is passed among processes to control access to the
shared resource. The token indicates which process has the right to access
the resource.

4) Handling Requests: When a process receives a request message from a
neighbor, it determines whether to grant or defer the request based on its
own local state and whether it currently holds the token.

5) Granting Access: If a process holds the token and can grant access to the
requesting process without violating mutual exclusion (i.e., no earlier request
is pending in its queue), it forwards the token toward the requesting process.

6) Deferring Access: If a process cannot grant access immediately (e.g., it is
currently using the token or has earlier requests pending), it defers the
request by placing it in its local queue.

7) Releasing Access: When a process finishes using the shared resource, it
releases the token by passing it toward the next process in its local queue (if
any); otherwise it simply keeps the token until a new request arrives.
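
As a rough illustration of these steps, here is a minimal single-process simulation. Direct method calls stand in for network messages, and the Node class and its fields are assumptions of this sketch, not part of Raymond's original formulation.

```python
# Minimal sketch of Raymond's tree-based token algorithm (illustrative only).
from collections import deque

class Node:
    """One node in the logical tree; direct calls simulate message passing."""
    def __init__(self, name, holder=None):
        self.name = name
        self.holder = holder or self   # neighbor in the direction of the token
        self.queue = deque()           # local FIFO queue of pending requests
        self.asked = False             # whether a request was already sent upstream
        self.using = False             # whether this node is in its critical section

    def request_cs(self):
        self.queue.append(self)        # steps 1-2: enqueue this node's own request
        self._assign_or_ask()

    def receive_request(self, neighbor):
        self.queue.append(neighbor)    # step 4: queue a neighbor's request
        self._assign_or_ask()

    def receive_token(self):
        self.holder = self             # step 3: the token has arrived here
        self.asked = False
        self._assign_or_ask()

    def release_cs(self):
        self.using = False             # step 7: done with the resource
        self._assign_or_ask()

    def _assign_or_ask(self):
        if self.holder is self and not self.using and self.queue:
            head = self.queue.popleft()
            if head is self:
                self.using = True                    # step 5: enter the critical section
                print(f"{self.name} enters its critical section")
            else:
                self.holder = head                   # step 5: pass the token downwards
                head.receive_token()
                if self.queue and not self.asked:    # step 6: others still waiting here
                    self.asked = True
                    self.holder.receive_request(self)
        elif self.holder is not self and self.queue and not self.asked:
            self.asked = True
            self.holder.receive_request(self)        # step 2: ask toward the token

# Usage: the root initially holds the token; a and b point toward it.
root = Node("root")
a, b = Node("a", holder=root), Node("b", holder=root)
a.request_cs(); a.release_cs()   # token travels root -> a
b.request_cs()                   # request travels b -> root -> a, token comes back to b
```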


4. What is Fault tolerance? Explain Failure models. *


Ans:
 Fault tolerance in distributed systems refers to the ability of a system to
continue operating properly in the presence of faults, such as hardware
failures, software errors, or network issues.
 Fault tolerance provides:
Availability: the system remains available for use.
Reliability: the system can run continuously for long periods without failing.
Safety: even when parts of the system fail, nothing catastrophic happens and
the system does not enter an incorrect or dangerous state.
 It involves designing systems that can detect, isolate, and recover from faults
without causing disruptions to the overall service.
 There are several approaches to achieve fault tolerance in distributed systems:

1) Redundancy:
Redundancy involves replicating components or data across multiple nodes
in the system. If one node fails, another replica can take over its
responsibilities.

2) Checkpointing and Recovery:
Checkpointing involves periodically saving the state of a distributed system
to stable storage. In the event of a failure, the system can recover by
restoring from the latest checkpoint (see the sketch after this list).

3) Isolation and Containment:
Isolation mechanisms ensure that faults or failures in one part of the system
do not propagate to other parts, minimizing the impact of failures on the
overall system.

4) Failover:
Failover mechanisms automatically redirect traffic or workload from a failed
component to a backup or standby component. This ensures that the
system remains operational even if individual components fail.
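
To make the checkpointing idea above concrete, here is a minimal sketch; the file name, state shape, and checkpoint interval are assumptions of the example, not a standard design.

```python
# Minimal checkpoint/recovery sketch (illustrative, not a production design).
import json
import os
import tempfile

CHECKPOINT_FILE = "service_state.ckpt"   # assumed name for stable storage

def save_checkpoint(state, path=CHECKPOINT_FILE):
    """Atomically write the current state to stable storage."""
    fd, tmp = tempfile.mkstemp(dir=".", prefix="ckpt_")
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)      # atomic rename so a crash never leaves a torn file

def load_checkpoint(path=CHECKPOINT_FILE):
    """Recover the last saved state, or start fresh if no checkpoint exists."""
    if not os.path.exists(path):
        return {"processed": 0}
    with open(path) as f:
        return json.load(f)

# Usage: checkpoint periodically; after a crash and restart, work resumes from
# the last saved checkpoint instead of starting over.
state = load_checkpoint()
for item in range(state["processed"], 1000):
    state["processed"] = item + 1          # do some work, then record progress
    if state["processed"] % 100 == 0:
        save_checkpoint(state)
```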


 Failure models

 When a failure occurs in one node, it can ripple through the network,
impacting other nodes and potentially leading to a cascade of failures.
 Failure models describe different types of failures that can occur in distributed
systems.
 There are four main types of failure models in distributed systems:

1) Crash-stop faults:
Nodes or components fail abruptly without any warning or indication. Crash
failures are common in distributed systems due to hardware faults,
software bugs, or network issues.

2) Omission faults:
A node or communication channel fails to perform an action it should, most
commonly failing to send or receive a message (for example, a request or
reply that is silently dropped). Detecting and handling omission faults
typically involves mechanisms such as watchdog timers, redundancy, and
error detection and correction codes.

3) Crash-recovery faults:
The crash-recovery model assumes that crashed nodes may eventually
recover and resume operation. Recovery could involve rebooting the
hardware, restarting the software process, or restoring the node's state
from a checkpoint or backup.

4) Byzantine Failures:
Nodes or components exhibit malicious behavior, such as sending incorrect
or misleading messages, in an attempt to disrupt the system. Byzantine
failures are more challenging to tolerate than crash failures and require
specialized fault-tolerant algorithms.


5. List desirable features of a distributed file system. How are modifications
propagated in file caching schemes?
Ans:
 Scalability:
The DFS should easily scale to accommodate growing amounts of data and
increasing numbers of users or clients without significant degradation in
performance.

 Transparency:
The DFS should hide where files are physically stored and how they are
replicated, so that access keeps working even if a particular server or disk
fails. This makes the system easier to use: users don't need to know where
data is stored.

 High Availability:
Data should be readily accessible even in the presence of node failures or
network issues.

 Security:
Robust authentication, authorization, and encryption mechanisms to protect
data confidentiality, integrity, and availability.

 Performance:
Efficient data access and transfer mechanisms are essential for good
performance.


 Modifications are propagated in file caching schemes in the following ways (a
small write-through vs. write-behind sketch follows the list):

1) Write-Through Caching:
 In write-through caching, modifications are immediately written both to
the cache and the underlying storage system such as a disk.
 This ensures that the cache and the storage system are always
synchronized, but it may introduce higher latency for write operations
since data must be written to both locations.

2) Write-Behind Caching:
 In write-behind caching, modifications are initially written only to the
cache. The updates are then asynchronously propagated to the
underlying storage system at a later time.
 This approach can improve write performance since writes are initially
handled by the cache, but it introduces the risk of data loss if the cached
data is lost before being written to the storage system.

3) Invalidation-Based Caching:
 In invalidation-based caching, modifications are propagated by
invalidating (marking as stale) the relevant cache entries when changes
are made to the original data source.
 When a client requests data that has been invalidated, the cache fetches
the updated data from the source and updates the cache accordingly.
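
The following sketch contrasts the first two schemes. The WriteThroughCache and WriteBehindCache classes and the backing_store dict are illustrative stand-ins for a real cache and file server, not an existing API.

```python
# Hypothetical sketch contrasting write-through and write-behind propagation.
class WriteThroughCache:
    def __init__(self, backing_store):
        self.cache = {}
        self.store = backing_store

    def write(self, key, value):
        self.cache[key] = value
        self.store[key] = value        # propagate immediately: cache and store stay in sync


class WriteBehindCache:
    def __init__(self, backing_store):
        self.cache = {}
        self.dirty = set()             # keys modified but not yet flushed
        self.store = backing_store

    def write(self, key, value):
        self.cache[key] = value        # fast path: only the cache is updated now
        self.dirty.add(key)

    def flush(self):
        for key in self.dirty:         # propagate asynchronously, at a later time
            self.store[key] = self.cache[key]
        self.dirty.clear()


# Usage: the write-behind store lags until flush() runs; a crash before flush loses data.
disk = {}
wb = WriteBehindCache(disk)
wb.write("/tmp/report.txt", "draft v2")
print(disk)        # {} -- not yet propagated
wb.flush()
print(disk)        # {'/tmp/report.txt': 'draft v2'}
```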


6. Explain file caching schemes.


Ans:
 In distributed computing, file caching schemes play a crucial role in optimizing
performance and reducing latency by storing frequently accessed data closer
to the clients or computation nodes.
 Here are some common file caching schemes used in distributed computing
environments (a small client-side cache sketch follows the list):

1) Client-Side Caching:
 In client-side caching, clients cache frequently accessed files locally,
reducing the need to fetch them from the remote server or storage
system repeatedly.
 This scheme can significantly reduce latency for read-heavy workloads
and improve overall system performance, especially in scenarios with
limited network bandwidth or high latency.

2) Server-Side Caching:
 In server-side caching, caching mechanisms are implemented on the
server or storage nodes to cache frequently accessed files or data blocks.
 This approach reduces the load on backend storage systems and
improves data access latency for clients by serving cached data directly
from the server's local storage.

3) Proxy Caching:
 Proxies intercept client requests, check if the requested data is available
in the cache, and serve it directly to clients without forwarding the
request to the origin server.
 This scheme reduces network traffic, minimizes server load, and
improves response times for clients, especially in scenarios with a large
number of clients accessing common data resources.


4) Distributed Caching:
 Distributed caching schemes distribute cache storage and management
across multiple nodes in a distributed system.
 Distributed caching improves scalability, fault tolerance, and load
balancing by distributing cache load across multiple nodes and ensuring
high availability of cached data.
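
As a small illustration of the client-side scheme, here is a toy client file cache with LRU eviction. The ClientFileCache class and fetch_from_server placeholder are invented for this sketch; a real DFS client would also handle consistency and invalidation.

```python
# Illustrative client-side file cache with LRU eviction (invented names).
from collections import OrderedDict

def fetch_from_server(path):
    """Placeholder for an expensive remote read over the network."""
    return f"<contents of {path}>"

class ClientFileCache:
    def __init__(self, capacity=128):
        self.capacity = capacity
        self.entries = OrderedDict()      # path -> cached contents, in LRU order

    def read(self, path):
        if path in self.entries:
            self.entries.move_to_end(path)        # cache hit: refresh recency
            return self.entries[path]
        contents = fetch_from_server(path)        # cache miss: go to the server
        self.entries[path] = contents
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)      # evict the least recently used file
        return contents

# Usage: the second read is served locally, avoiding a network round trip.
cache = ClientFileCache(capacity=2)
cache.read("/data/a.txt")     # miss: fetched from the server
cache.read("/data/a.txt")     # hit: served from the local cache
```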

7. How does Lamport synchronize logical clocks? Which events are said to be
concurrent under Lamport's timestamps?
Ans:
 A Lamport logical clock is a numerical software counter value maintained in
each process.
 The basic idea behind Lamport clocks is that each process maintains a counter
representing its own notion of time. Every time an event occurs, the process
increments its counter.
 When a process receives a message, it updates its logical clock using the
timestamp carried in the message, so that its clock stays causally consistent
with the sender's.
 Simplified explanation of how Lamport clocks are synchronized (a small
sketch follows the steps):

1) Initialization: Each process starts with its own logical clock initialized to 0.

2) Event Occurrence: When an event happens at a process, it increments its
logical clock by 1.

3) Message Sending: When a process sends a message, it includes its current
logical timestamp with the message.

4) Message Reception: When a process receives a message, it sets its own
logical clock to one more than the maximum of its current logical time and
the timestamp received in the message (i.e., max(local, received) + 1). Then,
it processes the message/event.
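
A minimal sketch of these four rules follows; the LamportProcess class and the example run are illustrative, not a standard API.

```python
# Minimal Lamport logical clock sketch (illustrative only).
class LamportProcess:
    def __init__(self, name):
        self.name = name
        self.clock = 0                       # rule 1: start at 0

    def local_event(self):
        self.clock += 1                      # rule 2: increment on every local event
        return self.clock

    def send(self):
        self.clock += 1                      # sending is itself an event
        return self.clock                    # rule 3: the timestamp travels with the message

    def receive(self, msg_timestamp):
        # rule 4: jump past both the local clock and the received timestamp
        self.clock = max(self.clock, msg_timestamp) + 1
        return self.clock


# Usage: P1 sends to P2; P2's clock jumps ahead of the sender's timestamp.
p1, p2 = LamportProcess("P1"), LamportProcess("P2")
ts = p1.send()          # ts == 1
p2.local_event()        # p2.clock == 1
p2.receive(ts)          # p2.clock == max(1, 1) + 1 == 2
```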


 Concurrent events in Lamport's timestamps

 In Lamport's logical clock system, two events are considered concurrent if
neither event happened before the other, i.e., there is no chain of local
events and messages leading from one to the other.
 This notion of concurrency is based solely on the partial ordering of events
established by the happened-before relation and doesn't necessarily mean
that the events actually occurred simultaneously in real time.
 It just means that, from the perspective of the distributed system, there is no
causal relationship between these events; neither event causally depends on
the other.
 If two distinct events carry equal Lamport timestamps, they must be
concurrent; however, concurrent events can still have different timestamps,
so Lamport timestamps alone cannot be used to detect concurrency.


8. Explain different load estimation and progress transfer policies used by load
balancing algorithms. *
Ans:
 Load estimation and progress transfer policies are essential components of
load balancing algorithms in distributed systems.
 Load balancing algorithms are used to improve the execution of a distributed
application. They assign tasks to each processor and minimize the program's
execution time.
 Load balancing algorithms try to keep work queues similar in length and
reduce average response time.

Load Estimation Policies:

1) Centralized Monitoring: In this approach, a central controller gathers
information about the load on each processing unit or node by monitoring
resource-usage metrics such as CPU utilization, memory usage, or network
traffic.

2) Decentralized Monitoring: Each processing unit or node independently
monitors its own load and periodically exchanges load information with
neighboring units, so that each unit can make autonomous load-balancing
decisions without relying on a central controller.

3) Feedback-Based Estimation: This approach involves using feedback from
the processing units or nodes to estimate their current and future loads.
Feedback can be in the form of performance metrics, response times, or
completion rates of tasks.

4) Predictive Estimation: Predictive models are used to forecast future load
patterns based on historical data and current trends. These models can
anticipate load fluctuations and adjust the load distribution preemptively to
prevent overloads or underutilization of resources.


Progress Transfer Policies:

1) Task Migration: When a processing unit becomes overloaded or
underutilized, tasks can be migrated from heavily loaded units to lightly
loaded ones to balance the workload.

2) Work Stealing: In systems with task parallelism, work-stealing algorithms
allow idle processing units to "steal" tasks from other units' queues (see the
sketch after this list).

3) Task Replication: Instead of moving tasks between units, task replication
involves duplicating tasks and executing them on multiple units
concurrently. This approach can also improve fault tolerance.

4) Load-Balanced Routing: In distributed storage systems or networks,
load-balanced routing algorithms route data packets or requests to the least
loaded or most suitable nodes or paths.
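
As a toy illustration of the work-stealing policy mentioned above, here is a single-threaded simulation. The Worker class and the "steal from the busiest peer" heuristic are assumptions of this sketch, not the API of any particular scheduler.

```python
# Illustrative work-stealing sketch (single-threaded simulation, invented names).
from collections import deque

class Worker:
    def __init__(self, name):
        self.name = name
        self.tasks = deque()                 # this worker's own task queue

    def run_one(self, all_workers):
        if not self.tasks:
            self.steal(all_workers)          # idle: try to take work from a busy peer
        if self.tasks:
            task = self.tasks.popleft()
            task()                           # execute one task

    def steal(self, all_workers):
        victims = [w for w in all_workers if w is not self and len(w.tasks) > 1]
        if victims:
            victim = max(victims, key=lambda w: len(w.tasks))   # steal from the busiest
            self.tasks.append(victim.tasks.pop())               # take from the tail


# Usage: worker b starts idle and steals tasks from a's queue.
a, b = Worker("a"), Worker("b")
for i in range(4):
    a.tasks.append(lambda i=i: print(f"task {i} ran"))
for _ in range(4):
    b.run_one([a, b])
    a.run_one([a, b])
```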


9. Explain absolute ordering and causal ordering with the help of an example
for many-to-many communication. *
Ans:
Absolute Ordering:
 Absolute ordering, also known as total ordering, ensures that all events in a
distributed system are totally ordered, meaning that every process in the
system agrees on the same order of events.
 In an absolutely ordered system, there is a global sequence of events, and
every process observes the same sequence.

Process:
 Every event is assigned a unique, global timestamp.
 Events are ordered based on these timestamps, ensuring a total order of
events across all processes.
 Processes must synchronize their clocks to ensure that timestamps are
accurate and consistent.

Example:
 Consider a distributed system where events represent messages being sent
and received between processes.
 Each message is assigned a unique timestamp representing the time at
which it was sent.
 By comparing timestamps (breaking ties by sender id), all processes can agree
on the order in which messages were sent and received, even if they were
processed on different machines, as shown in the sketch below.
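
For instance, a minimal sketch of how every process can derive the same absolute order from (timestamp, sender id) pairs; the message tuples are invented for the example.

```python
# Illustrative sketch: total (absolute) ordering of multicast messages by
# (timestamp, sender_id); every process sorting this way delivers the same order.
messages = [
    (3, "P2", "debit  $50"),
    (3, "P1", "credit $20"),
    (1, "P3", "open account"),
    (2, "P1", "deposit $100"),
]

# Sort by timestamp, breaking ties by sender id, so all processes agree on
# one global delivery sequence regardless of arrival order.
for ts, sender, payload in sorted(messages, key=lambda m: (m[0], m[1])):
    print(ts, sender, payload)
```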


Causal Ordering:
 Causal ordering focuses on preserving the causal relationships between events.
 It ensures that if event A causally affects event B, then event A must precede
event B in the ordering.
 However, events that are concurrent can be ordered arbitrarily.

Process:
 Each event is associated with a causal relationship specifying its
dependencies on other events.
 Events are ordered based on their causal relationships, ensuring that events
causally affecting each other are ordered accordingly.
 Concurrent events may be ordered arbitrarily, as long as the causal
relationships are preserved.

Example:
 Suppose a distributed system tracks transactions between bank accounts.
 If a withdrawal from one account precedes a deposit into another account,
the withdrawal event causally affects the deposit event.
 Even if two deposits into the same account occur concurrently, their order
in the sequence is not significant as long as their causal relationships with
other events are maintained.


10. Explain Ricart-Agrawala Algorithm.


Ans:
 It was proposed by Glenn Ricart and Ashok Agrawala in 1981.
 The primary goal of the Ricart-Agrawala algorithm is to ensure that only one
process can access a critical section at any given time while allowing multiple
processes to request access concurrently.
 The algorithm guarantees safety, meaning that no two processes can
simultaneously execute their critical sections, and it ensures liveness, meaning
that if a process requests access to the critical section, it eventually obtains it.

Algorithm Steps (a small sketch follows these steps):
1) Initialization: Each process maintains a logical clock used to timestamp its
requests, plus two pieces of state: a requesting flag to indicate whether it wants
to access the critical section, and a replied record to keep track of replies from
other processes.

2) Requesting Access: When a process wants to enter the critical section, it sends
a request message to all other processes, indicating its intention to access the
critical section. After sending the request, the process waits for replies from all
other processes.

3) Receiving Requests: If the receiving process is not currently requesting access
(requesting = false), it immediately sends a reply. If it is also requesting access
(requesting = true), it compares the timestamp of the incoming request with its
own: it replies if the incoming request has the smaller timestamp (ties broken by
process id), and otherwise defers its reply until it leaves the critical section.

4) Entering the Critical Section: Once a process receives replies from all other
processes, and none of them have a higher priority request, it enters the
critical section.

5) Exiting the Critical Section: After finishing execution in the critical section, the
process resets its requesting flag to false, indicating it no longer requires access
to the critical section. It also broadcasts a release message to inform other
processes that the critical section is now available.
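
Here is a compact single-process simulation of these steps. Direct method calls stand in for REQUEST and REPLY messages, and the RAProcess class is an assumption of this sketch rather than the authors' original pseudocode.

```python
# Simplified simulation of the Ricart-Agrawala algorithm (illustrative only).
class RAProcess:
    def __init__(self, pid):
        self.pid = pid
        self.clock = 0                # Lamport clock used to timestamp requests
        self.requesting = False
        self.request_ts = None        # (timestamp, pid) of our outstanding request
        self.deferred = []            # processes whose replies we postponed
        self.pending_replies = 0

    def request_cs(self, peers):
        self.clock += 1
        self.requesting = True
        self.request_ts = (self.clock, self.pid)
        self.pending_replies = len(peers)
        for p in peers:                       # step 2: send REQUEST to everyone else
            p.receive_request(self, self.request_ts)

    def receive_request(self, sender, ts):
        self.clock = max(self.clock, ts[0]) + 1
        # Step 3: reply immediately unless our own request has the smaller timestamp.
        if self.requesting and self.request_ts < ts:
            self.deferred.append(sender)      # defer: our request has priority
        else:
            sender.receive_reply()

    def receive_reply(self):
        self.pending_replies -= 1
        if self.pending_replies == 0:         # step 4: all replies collected
            print(f"P{self.pid} enters its critical section")

    def release_cs(self):
        self.requesting = False               # step 5: leave and answer deferred requests
        for p in self.deferred:
            p.receive_reply()
        self.deferred = []


# Usage: P1 enters first; P2's request is deferred until P1 releases.
p1, p2 = RAProcess(1), RAProcess(2)
p1.request_cs([p2])     # P2 replies immediately, P1 enters its critical section
p2.request_cs([p1])     # P1 is still inside, so it defers its reply
p1.release_cs()         # the deferred reply now lets P2 enter
```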
