Distributed Computing QB Answers
1) Data Consistency:
Ensuring that data remains consistent during migration is crucial. When
migrating code that interacts with databases or other data stores, you need
to handle data synchronization to avoid inconsistencies or data loss.
2) Security:
Ensuring the security of the migrated code and data during transit and
execution is critical. Unauthorized access, data breaches, or injection attacks
can compromise system integrity and confidentiality.
© Hritik Ghatge
1) Strong Consistency:
In a strongly consistent system, all clients observe the same order of
updates and see the most recent version of data at all times.
This model provides the strongest guarantee of consistency but may
incur higher latency due to synchronization requirements.
2) Sequential Consistency:
Sequential consistency ensures that the order of operations performed
by each individual client is preserved and consistent with a global total
order.
However, operations from different clients may be interleaved arbitrarily
as long as they respect each client's order.
3) Causal Consistency:
Causal consistency preserves the causal relationships between
operations.
If operation A causally precedes operation B, all clients must observe
operation A before operation B. However, concurrent operations may be
observed in different orders by different clients.
1) Initiation: Each process maintains a local queue of pending requests for the
resource it wants to access. Initially, the queue is empty.
1) Redundancy:
Redundancy involves replicating components or data across multiple nodes
in the system. If one node fails, another replica can take over its
responsibilities.
2) Failover:
Failover mechanisms automatically redirect traffic or workload from a failed
component to a backup or standby component. This ensures that the
system remains operational even if individual components fail.
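The redundancy-plus-failover idea above can be sketched in a few lines. Everything here is illustrative: the node names, the `fetch` function, and the `NodeDown` exception are assumptions standing in for a real replication client, not an actual API.

```python
# Hypothetical sketch of redundancy with failover: read from the primary
# replica, and redirect to a standby replica when it fails.

class NodeDown(Exception):
    pass

def fetch(node, key, failed_nodes):
    # Simulated remote read: raises NodeDown if the node has failed.
    if node in failed_nodes:
        raise NodeDown(node)
    return f"value-of-{key}@{node}"

def read_with_failover(nodes, key, failed_nodes):
    # Failover: try each replica in order until one answers.
    for node in nodes:
        try:
            return fetch(node, key, failed_nodes)
        except NodeDown:
            continue  # this replica is down; fall back to the next one
    raise RuntimeError("all replicas failed")

print(read_with_failover(["primary", "replica-1", "replica-2"], "x",
                         failed_nodes={"primary"}))
# value-of-x@replica-1 — the standby transparently took over
```

Because every replica holds the same data, the redirect is invisible to the caller; only when every replica is down does the request fail.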
Failure models
When a failure occurs in one node, it can ripple through the network,
impacting other nodes and potentially leading to a cascade of failures.
Failure models describe different types of failures that can occur in distributed
systems.
The main failure models in distributed systems include:
1) Crash-stop faults:
Nodes or components fail abruptly without any warning or indication. Crash
failures are common in distributed systems due to hardware faults,
software bugs, or network issues.
2) Omission faults:
Omission faults occur when a component fails to perform a required action,
most commonly failing to send or receive a message it was expected to
handle. Detecting and handling omission faults typically involves
mechanisms such as watchdog timers, redundancy, and error detection and
correction codes.
3) Byzantine faults:
Nodes or components exhibit malicious behavior, such as sending incorrect
or misleading messages, in an attempt to disrupt the system. Byzantine
failures are more challenging to tolerate than crash failures and require
specialized fault-tolerant algorithms.
Transparency:
The DFS presents files in a single, uniform namespace, so users do not need
to know where data is physically stored, and access continues transparently
even if a server or disk fails. This makes the system easier to use.
High Availability:
Data should be readily accessible even in the presence of node failures or
network issues.
Security:
Robust authentication, authorization, and encryption mechanisms to protect
data confidentiality, integrity, and availability.
Performance:
Efficient data access and transfer mechanisms are essential for good
performance.
1) Write-Through Caching:
In write-through caching, modifications are immediately written both to
the cache and the underlying storage system such as a disk.
This ensures that the cache and the storage system are always
synchronized, but it may introduce higher latency for write operations
since data must be written to both locations.
2) Write-Behind Caching:
In write-behind caching, modifications are initially written only to the
cache. The updates are then asynchronously propagated to the
underlying storage system at a later time.
This approach can improve write performance, since writes are initially
absorbed by the cache, but it introduces the risk of data loss if the cached
data is lost before being written to the storage system.
3) Invalidation-Based Caching:
In invalidation-based caching, modifications are propagated by
invalidating (marking as stale) the relevant cache entries when changes
are made to the original data source.
When a client requests data that has been invalidated, the cache fetches
the updated data from the source and refreshes its entry accordingly.
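The write-through and write-behind policies above can be contrasted in a small sketch. The dict standing in for the backing store and the explicit `flush` call are simplifying assumptions; real caches flush asynchronously in the background.

```python
# Sketch of the two write policies. `store` is a dict standing in for disk.

class WriteThroughCache:
    def __init__(self, store):
        self.cache, self.store = {}, store

    def write(self, key, value):
        self.cache[key] = value
        self.store[key] = value      # synchronous write: cache and disk stay in sync

class WriteBehindCache:
    def __init__(self, store):
        self.cache, self.store, self.dirty = {}, store, set()

    def write(self, key, value):
        self.cache[key] = value      # fast path: cache only
        self.dirty.add(key)          # remember what must be flushed later

    def flush(self):
        for key in self.dirty:       # deferred propagation, simulated here
            self.store[key] = self.cache[key]
        self.dirty.clear()

disk = {}
wb = WriteBehindCache(disk)
wb.write("a", 1)
print(disk)        # {} — not yet on "disk": data is lost if the cache dies now
wb.flush()
print(disk)        # {'a': 1}
```

The gap between `write` and `flush` is exactly the data-loss window the text warns about; write-through closes that window at the cost of slower writes.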
1) Client-Side Caching:
In client-side caching, clients cache frequently accessed files locally,
reducing the need to fetch them from the remote server or storage
system repeatedly.
This scheme can significantly reduce latency for read-heavy workloads
and improve overall system performance, especially in scenarios with
limited network bandwidth or high latency.
2) Server-Side Caching:
In server-side caching, caching mechanisms are implemented on the
server or storage nodes to cache frequently accessed files or data blocks.
This approach reduces the load on backend storage systems and
improves data access latency for clients by serving cached data directly
from the server's local storage.
3) Proxy Caching:
Proxies intercept client requests, check if the requested data is available
in the cache, and serve it directly to clients without forwarding the
request to the origin server.
This scheme reduces network traffic, minimizes server load, and
improves response times for clients, especially in scenarios with a large
number of clients accessing common data resources.
4) Distributed Caching:
Distributed caching schemes distribute cache storage and management
across multiple nodes in a distributed system.
Distributed caching improves scalability, fault tolerance, and load
balancing by distributing cache load across multiple nodes and ensuring
high availability of cached data.
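A minimal sketch of how a distributed cache might shard keys across nodes. The node names are hypothetical, and the plain modulo mapping is a simplification: production systems usually prefer consistent hashing so that adding a node relocates only a few keys.

```python
import hashlib

# Illustrative node names for a three-node cache cluster.
NODES = ["cache-node-0", "cache-node-1", "cache-node-2"]

def node_for(key):
    # Map a key deterministically onto one cache node by hashing it.
    digest = hashlib.sha256(key.encode()).hexdigest()
    return NODES[int(digest, 16) % len(NODES)]

# Every client computes the same mapping, so requests for a given key
# always land on the same node, spreading load across the cluster.
assert node_for("user:42") == node_for("user:42")
print(node_for("user:42"))
```

Deterministic placement is what gives the scheme its load balancing: no coordination is needed for clients to agree on which node caches which key.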
7. How does Lamport synchronize logical clocks? Which events are said to be
concurrent under Lamport's timestamps?
Ans:
A Lamport logical clock is a numerical software counter value maintained in
each process.
The basic idea behind Lamport clocks is that each process maintains a counter
representing its own notion of time. Every time an event occurs, the process
increments its counter.
When a process receives a message, it resynchronizes its logical clock by
setting it to the maximum of its own counter and the timestamp carried in the
message, then incrementing the result by one.
Simplified explanation of how Lamport clocks synchronize:
1) Initialization: Each process starts with its own logical clock initialized to 0.
2) Local event or send: The process increments its clock before the event and
attaches the new value to any message it sends.
3) Receive: The receiving process sets its clock to max(local clock, message
timestamp) + 1.
Two events are said to be concurrent when neither happened-before the other,
that is, when neither could have causally influenced the other. Lamport
timestamps alone cannot detect concurrency, since C(a) < C(b) does not imply
that a happened before b.
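The counter rules can be sketched as a small class; this is a minimal illustration of the clock logic only, not a full message-passing implementation.

```python
# Minimal sketch of Lamport's logical-clock rules:
# start at 0, increment on every local event or send,
# and on receive take max(local, received) + 1.

class LamportClock:
    def __init__(self):
        self.time = 0                              # initialization

    def tick(self):
        self.time += 1                             # local event or send
        return self.time

    def receive(self, msg_time):
        self.time = max(self.time, msg_time) + 1   # resynchronize on receive
        return self.time

p, q = LamportClock(), LamportClock()
t_send = p.tick()           # P sends a message stamped 1
q.tick(); q.tick()          # Q has already seen two local events (time 2)
t_recv = q.receive(t_send)  # Q resynchronizes: max(2, 1) + 1 = 3
print(t_recv)  # 3
```

Note that the receive rule guarantees the receive event is stamped later than both the send and all of Q's earlier events, which is exactly the happened-before relation the clock is meant to respect.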
9. Explain absolute ordering and causal ordering, with the help of an
example, for many-to-many communication. *
Ans:
Absolute Ordering:
Absolute ordering, also known as total ordering, ensures that all events in a
distributed system are totally ordered, meaning that every process in the
system agrees on the same order of events.
In an absolutely ordered system, there is a global sequence of events, and
every process observes the same sequence.
Process:
Every event is assigned a unique, global timestamp.
Events are ordered based on these timestamps, ensuring a total order of
events across all processes.
Processes must synchronize their clocks to ensure that timestamps are
accurate and consistent.
Example:
Consider a distributed system where events represent messages being sent
and received between processes.
Each message is assigned a unique timestamp representing the time at
which it was sent.
By comparing timestamps, all processes can agree on the order in which
messages were sent and received, even if they were processed on different
machines.
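One common way to realize such a total order (an assumption here, since the text does not fix a mechanism) is to stamp each message with a (timestamp, sender-id) pair and break timestamp ties by sender id, so every process sorts its delivery queue into the same global sequence:

```python
# Illustrative events: (timestamp, sender, payload). The data is made up.
events = [
    (2, "P1", "msg-a"),
    (1, "P3", "msg-b"),
    (2, "P0", "msg-c"),
]

# Sorting by (timestamp, sender) yields the identical sequence at every
# process, even for messages with equal timestamps.
total_order = sorted(events)
print([e[2] for e in total_order])  # ['msg-b', 'msg-c', 'msg-a']
```

The tie-break on sender id is what makes the order truly total: two messages can share a timestamp, but never a (timestamp, sender) pair.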
Causal Ordering:
Causal ordering focuses on preserving the causal relationships between events.
It ensures that if event A causally affects event B, then event A must precede
event B in the ordering.
However, events that are concurrent can be ordered arbitrarily.
Process:
Each event is associated with a causal relationship specifying its
dependencies on other events.
Events are ordered based on their causal relationships, ensuring that events
causally affecting each other are ordered accordingly.
Concurrent events may be ordered arbitrarily, as long as the causal
relationships are preserved.
Example:
Suppose a distributed system tracks transactions between bank accounts.
If a withdrawal from one account precedes a deposit into another account,
the withdrawal event causally affects the deposit event.
Even if two deposits into the same account occur concurrently, their order
in the sequence is not significant as long as their causal relationships with
other events are maintained.
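Causal relationships of this kind are commonly tracked with vector clocks (an assumption here; the text does not name a mechanism). A sketch for a two-process system:

```python
def happened_before(a, b):
    # A causally precedes B iff A's vector is <= B's componentwise and A != B.
    return all(x <= y for x, y in zip(a, b)) and a != b

withdrawal = (1, 0)   # first event at process 0
deposit    = (1, 1)   # process 1 acted after seeing the withdrawal
other_dep  = (2, 0)   # a later, independent event back at process 0

print(happened_before(withdrawal, deposit))   # True  — this order is fixed
print(happened_before(other_dep, deposit))    # False
print(happened_before(deposit, other_dep))    # False — concurrent: any order is valid
```

When neither vector dominates the other, the events are concurrent, and a causally consistent delivery may order them either way, exactly as the text describes.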
Algorithm Steps:
1) Initialization: Each process maintains two flags: requesting to indicate if it
wants to access the critical section and replied to keep track of replies from
other processes
2) Requesting Access: When a process wants to enter the critical section, it sends
a request message to all other processes, indicating its intention to access the
critical section. After sending the request, the process waits for replies from all
other processes.
3) Handling Requests: When a process receives a request while it is itself
requesting, it replies immediately only if the incoming request has higher
priority (an earlier timestamp, with ties broken by process id); otherwise it
defers its reply until it exits the critical section. A process that is not
requesting replies immediately.
4) Entering the Critical Section: Once a process receives replies from all other
processes, and none of them holds a higher-priority request, it enters the
critical section.
5) Exiting the Critical Section: After finishing execution in the critical section, the
process resets its requesting flag to false, indicating it no longer requires access
to the critical section. It also broadcasts a release message to inform other
processes that the critical section is now available.
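The steps above can be sketched as a single-machine simulation. This is a simplification of a Ricart-Agrawala-style scheme: message passing is replaced by direct method calls, and request priority is ranked by a (timestamp, process id) pair.

```python
# Sketch of the request/reply logic for distributed mutual exclusion.

class Process:
    def __init__(self, pid):
        self.pid = pid
        self.requesting = False
        self.request_time = None

    def request(self, time):
        self.requesting = True         # announce intent to enter the CS
        self.request_time = time

    def reply_ok(self, other):
        # Grant permission unless we hold an older (higher-priority) request;
        # timestamp ties are broken by process id.
        if not self.requesting:
            return True
        return (other.request_time, other.pid) < (self.request_time, self.pid)

    def release(self):
        self.requesting = False        # leave the CS; deferred replies go out

p1, p2 = Process(1), Process(2)
p1.request(time=5)
p2.request(time=7)
print(p1.reply_ok(p2))  # False — p1's request is older, so p2 must wait
print(p2.reply_ok(p1))  # True  — p1 gets all replies and enters the CS
p1.release()
print(p1.reply_ok(p2))  # True  — now p2 can enter
```

Because both processes rank requests by the same (timestamp, pid) key, at most one of them can ever collect replies from everyone at a time, which is what guarantees mutual exclusion.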