DS IAT 3 answer key
CO5: Understand peer to peer and distributed shared memory systems in distributed systems
PART A
1. Timing failure:
A timing failure, also known as a performance failure, occurs when a node in a system correctly sends a response, but the response arrives earlier or later than anticipated.
2. Response failure:
A response failure occurs when a server's response is flawed: the value of the response may be wrong, or the response may be delivered through the wrong control flow.
3. Omission failure:
An omission failure, sometimes described as an "infinitely late" timing failure, occurs when the node's answer never appears to have been sent at all.
4. Crash failure:
A crash failure occurs when a node encounters an omission failure once and then stops responding entirely.
5. Arbitrary failure:
An arbitrary (Byzantine) failure occurs when a server may produce arbitrary responses at arbitrary times.
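The failure classes above can be summarized in a small, purely illustrative sketch. The function and enum names below are assumptions made for this example, not part of any standard library:

```python
from enum import Enum, auto

class Failure(Enum):
    NONE = auto()      # correct value, on time
    TIMING = auto()    # correct value, but outside the expected time window
    RESPONSE = auto()  # wrong value (or wrong control flow)
    OMISSION = auto()  # no reply at all ("infinitely late")

def classify_reply(reply, expected_value, elapsed_ms, deadline_ms):
    """Classify one reply from a server against what the client expected."""
    if reply is None:
        return Failure.OMISSION
    if reply != expected_value:
        return Failure.RESPONSE
    if elapsed_ms > deadline_ms:
        return Failure.TIMING
    return Failure.NONE
```

In these terms, a crash failure would show up as an omission on every subsequent request, and an arbitrary (Byzantine) failure can produce any of these symptoms unpredictably.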
PART B
6. Identify the role of distributed shared memory in distributed systems
DSM is a mechanism that manages memory across multiple nodes and makes inter-process communication transparent to end users: applications behave as if they were running on shared memory, and user processes can access shared data without explicit inter-process communication. In DSM every node has its own memory, provides memory read and write services, and participates in consistency protocols. Distributed shared memory thus implements the shared-memory model in a distributed system even though no physical shared memory exists; all nodes share the virtual address space provided by the shared-memory model, and data moves between the main memories of the different nodes.
On-Chip Memory:
The data resides in the CPU portion of the chip, and the memory is directly connected to the address lines. On-chip memory DSM is expensive and complex.
Bus-Based Multiprocessors:
A set of parallel wires called a bus connects the CPUs to memory. Simultaneous access to the same memory by multiple CPUs is prevented by arbitration algorithms, and cache memory is used to reduce bus traffic.
Ring-Based Multiprocessors:
There is no global centralized memory in ring-based DSM. All nodes are connected via a token-passing ring, and a single address space is divided so that part of it forms the shared area.
Object DSM: In this class of approaches, shared data are objects, i.e. variables with access functions. The user only has to define which data (objects) are shared; the whole management of the shared objects (creation, access, modification) is handled by the DSM system. Unlike SVM systems, which work at the operating-system layer, object DSM systems offer a programming model that is an alternative to classical message passing.
In any case, implementing a DSM system requires addressing the problems of data location, data access, sharing and locking of data, and data coherence. Such problems are not specific to parallelism; they have connections with distributed and replicated database management systems (the transactional model), networks (data migration), uniprocessor operating systems (concurrent programming), and distributed systems in general.
Advantages of Distributed Shared Memory
Simpler abstraction: The programmer need not be concerned with data movement; since the address space is the same everywhere, DSM is easier to use than RPC.
Easier portability: The access protocols used in DSM allow a natural transition from sequential to distributed programs, and DSM programs are portable because they use a common programming interface.
Locality of data: Data moves in large blocks, so data near the memory location currently being fetched, which may be needed in the future, is fetched along with it.
On-demand data movement: Data moves only when it is needed, which eliminates a separate data-exchange phase.
Larger memory space: DSM provides a large virtual memory space whose total size is the sum of the memory sizes of all the nodes, so paging activity is reduced.
Better performance: DSM can improve performance and efficiency by speeding up access to data.
Flexible communication environment: Nodes can join and leave the DSM system without affecting the others, because the sender and receiver need not exist at the same time.
Simplified process migration: Since all processes share the same address space, a process can easily be moved to a different machine.
Apart from the advantages mentioned above, DSM is also:
Less expensive than a multiprocessor system.
Free of bottlenecks in data access.
Scalable, i.e. it scales well to a large number of nodes.
DSM Architecture
· Each node of the system consists of one or more CPUs and a memory unit
· Nodes are connected by a high-speed communication network
· A simple message-passing system allows nodes to exchange information
· The main memory of each node is used to cache pieces of the shared memory space
· A memory-mapping manager routine maps local memory onto the shared virtual memory
· The shared memory space is partitioned into blocks
· The shared memory of DSM exists only virtually
· Data caching is used in DSM systems to reduce network latency; the basic unit of caching is a memory block
· If data is not available in local memory, a network block fault is generated
· On a block fault, the missing block migrates from the remote node to the requesting process's node, and the operating system maps it into the application's address space
· Data blocks keep migrating from one node to another on demand, but no communication is visible to the user processes
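The block-fault and migration behaviour described above can be sketched as a toy, single-process model. The class and field names are illustrative assumptions for the example, not a real DSM implementation:

```python
BLOCK_SIZE = 4  # words per block; real DSM systems typically use page-sized blocks

class Node:
    """One node of a toy DSM: a local block cache plus a shared directory."""
    def __init__(self, directory):
        self.directory = directory   # shared map: block_id -> owning Node
        self.blocks = {}             # blocks currently cached on this node

    def read(self, address):
        block_id = address // BLOCK_SIZE
        if block_id not in self.blocks:               # local miss: "block fault"
            owner = self.directory.get(block_id)
            if owner is not None:                     # migrate block from its owner
                self.blocks[block_id] = owner.blocks.pop(block_id)
            else:                                     # first touch: allocate zeros
                self.blocks[block_id] = [0] * BLOCK_SIZE
            self.directory[block_id] = self           # this node now owns the block
        return self.blocks[block_id][address % BLOCK_SIZE]

    def write(self, address, value):
        self.read(address)                            # fault the block in first
        self.blocks[address // BLOCK_SIZE][address % BLOCK_SIZE] = value
```

Writing through one node and then reading the same address through another triggers a block fault on the reader, migrating the whole block rather than just the requested word, which is the locality benefit described earlier.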
7. Illustrate how memory consistency models are implemented in Distributed
systems.
At the heart of any distributed system lies the very principle that makes it a delight to use: consistency. Consistency in a distributed system ensures that all users view the same data at the same time, irrespective of their location or the part of the system they are accessing.
To understand this better, consider this - You're at an online auction, and you're locked in a
fierce bidding war for a cherished piece of artwork. The stakes are high, the competition is
stiff, and your eyes are glued to the screen, waiting for the next bid. Suddenly, you find that
the auction has ended, and someone else won the item because your screen did not update
in real-time. How would that make you feel? Frustrated, right? That's exactly what can
happen in a distributed system without a proper consistency model.
Consistency ensures that all updates in the system are reflected across all nodes (servers) in
real-time or near-real-time, giving all users a uniform view and experience of the system.
Without consistency, every interaction with a distributed system, like the auction we
discussed, can turn into an unpredictable and frustrating experience.
Now, let's imagine our distributed system as an orchestra. Each computer (or server) is an
instrument, and the data they hold are the musical notes. The audience (users) are here for a
delightful symphony. But, what happens if each instrument starts playing its own tune, with
no coordination with the others? The result would be far from harmonious!
Consistency not only improves the user experience but also ensures the overall reliability
and trustworthiness of the system. Consider an online banking application. You've just paid
your bills and want to check your updated balance. You expect your balance to
immediately reflect the deducted amount. But what if it doesn't? What if the balance
remains unchanged or takes hours to update? Such inconsistencies could result in distrust
and dissatisfaction, impacting the system's reputation.
Consistency, therefore, plays a pivotal role in keeping systems reliable, trustworthy, and
user-friendly. It's one of the fundamental principles ensuring that our interaction with
technology is smooth and intuitive, despite the complex mechanics at work behind the
scenes.
Consistency, while highly desirable, is not always easy to achieve in distributed systems. It
requires careful design, thorough testing, and continuous monitoring. The very nature of
distributed systems, with their various independent nodes spread across different
geographical locations, introduces latency. This latency, or delay, can sometimes make
maintaining consistency a challenging task.
Yet, the challenge is worth taking on. The seamless, enjoyable, and reliable experience that
a consistent distributed system provides is invaluable. It is what keeps users coming back,
and what enables distributed systems to support critical applications that we rely on daily,
from online shopping and banking to social media and entertainment.
The following sections look at the various models or 'patterns' of consistency used in distributed systems. Each comes with its unique set of benefits and challenges, and the goal is to understand these patterns well enough to choose the most suitable one for a given use case.
To start with, what is a consistency model? Picture this - a consistency model is like the
rulebook for a game. It defines how the players (servers, in our case) should play, how they
interact with each other, and how they present a unified front to the audience (users).
In the realm of distributed systems, a consistency model lays down the rules about reading
and writing data across multiple servers. It determines how updates are propagated,
ensuring all users see the same data at the same time. The type of consistency model
chosen can greatly impact the performance, scalability, and reliability of the system.
The three commonly used consistency models are Eventual Consistency, Strong Consistency, and Causal Consistency. Each one provides a different level of consistency and suits different kinds of applications. Let's explore each of these models in detail.
Eventual Consistency
Think of Eventual Consistency as the cool, laid-back player in the game. This model
follows the principle of 'relaxed consistency', where the system is allowed to be in an
inconsistent state temporarily, with the promise that it will eventually become consistent.
Imagine you're playing a game of 'Telephone' with friends. You whisper a phrase to the
person next to you, who then whispers it to the next person, and so on. By the time it
reaches the last person, the phrase might have changed. However, if everyone repeats the
process enough times, eventually, everyone will hear the same phrase. That's the essence of
eventual consistency!
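A minimal sketch of eventual consistency under a last-writer-wins rule. The timestamps and the anti-entropy `sync_with` pass are simplifying assumptions for illustration, not a specific system's protocol:

```python
class Replica:
    """A replica that accepts writes locally and converges via state exchange."""
    def __init__(self):
        self.store = {}          # key -> (timestamp, value)

    def write(self, key, value, timestamp):
        # Last-writer-wins: keep the entry with the newest timestamp.
        current = self.store.get(key)
        if current is None or timestamp > current[0]:
            self.store[key] = (timestamp, value)

    def read(self, key):
        entry = self.store.get(key)
        return entry[1] if entry else None

    def sync_with(self, other):
        """Anti-entropy pass: merge both stores so the replicas converge."""
        for key, (ts, val) in list(other.store.items()):
            self.write(key, val, ts)
        for key, (ts, val) in list(self.store.items()):
            other.write(key, val, ts)
```

Between writes and the next sync, the replicas can disagree; that temporary disagreement, followed by guaranteed convergence, is exactly the 'relaxed consistency' described above.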
Strong Consistency
Now, let's meet the strict disciplinarian of the lot - Strong Consistency. In this model, every
read operation must return the most recent write operation. This model insists on 'absolute
consistency', meaning all changes to the system are instantly seen by all servers and,
subsequently, all users.
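As a single-machine illustration of this guarantee (not a distributed implementation), a register guarded by a lock ensures every read returns the most recent completed write:

```python
import threading

class StrongRegister:
    """A register where every read observes the latest completed write."""
    def __init__(self, initial=None):
        self._lock = threading.Lock()
        self._value = initial

    def write(self, value):
        with self._lock:          # writes are serialized
            self._value = value

    def read(self):
        with self._lock:          # a read never sees a half-applied write
            return self._value
```

In a real distributed system the lock is replaced by a much more expensive coordination mechanism (a single master or a consensus protocol), which is where strong consistency's latency cost comes from.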
Causal Consistency
Last but not least, meet the balanced player, Causal Consistency. This model finds a middle
ground between eventual and strong consistency. It ensures that related events are seen by
all servers in the same order, while unrelated events can be seen in any order.
While each model has its own strengths and weaknesses, it's important to remember that no
one model is inherently 'better' than the others. The choice depends largely on the specific
requirements of the system. Some applications might prioritize data availability and can
afford to have temporary inconsistencies, making eventual consistency a good fit. Other
applications might require strict consistency at all times, in which case, strong consistency
is the way to go. Yet, some might need a balance of the two, making causal consistency an
attractive option.
Each model has its own advantages, disadvantages, and real-world applications, along with strategies used to implement it; some of these are illustrated below with real systems.
Despite the latency and coordination costs it imposes, Strong Consistency has its place in the world of distributed systems, particularly in applications where data consistency is of paramount importance. Google's Bigtable and Spanner databases are prime examples of systems that implement Strong Consistency.
Bigtable provides a consistent view of the data by using a single-master design, where one
server coordinates all write operations. In contrast, Spanner uses a global clock (TrueTime)
to synchronize updates across all servers, providing strong consistency while maintaining
high availability.
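The single-master idea can be sketched in miniature. This is an illustration of the pattern only, not Bigtable's actual implementation: the master assigns every write a position in a total order and replicates it to all followers before acknowledging, so any replica read is up to date:

```python
class Follower:
    """A replica that applies writes in the order the master assigns."""
    def __init__(self):
        self.store = {}

    def apply(self, entry):
        _, key, value = entry
        self.store[key] = value

    def read(self, key):
        return self.store.get(key)

class Master:
    """Coordinates all writes, giving them a single total order."""
    def __init__(self, followers):
        self.log = []                 # totally ordered write log
        self.followers = followers

    def write(self, key, value):
        entry = (len(self.log), key, value)
        self.log.append(entry)
        for f in self.followers:      # replicate synchronously before the ack
            f.apply(entry)
        return entry[0]               # sequence number assigned to this write
```

The synchronous replication loop is what buys consistency at the price of write latency; Spanner's TrueTime approach instead bounds clock uncertainty so that ordering can be established without funnelling every write through one server.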
At the heart of Causal Consistency is a focus on preserving the order of related events. In a
causally consistent system, if there's a cause-effect relationship between operations, they
appear in the same order to all nodes. But for unrelated operations? Well, they could appear
in any order, and that's perfectly fine.
Imagine a string of dominoes toppling one after the other. The fall of each domino causes
the next one to fall, creating a causal relationship. Now, if we had multiple such strings
falling in parallel, it wouldn't matter which string fell first as long as the order within each
string was maintained. That's causal consistency for you!
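The 'order within each string' idea is what vector clocks capture in practice. A minimal sketch, using list-based clocks indexed by node number as an illustrative simplification:

```python
def increment(vc, node):
    """A node ticks its own entry before each local event."""
    vc = list(vc)
    vc[node] += 1
    return vc

def merge(vc_a, vc_b):
    """On message receipt: take the element-wise maximum of the two clocks."""
    return [max(a, b) for a, b in zip(vc_a, vc_b)]

def happens_before(vc_a, vc_b):
    """True if the event stamped vc_a causally precedes the event stamped vc_b."""
    return all(a <= b for a, b in zip(vc_a, vc_b)) and vc_a != vc_b
```

Events whose clocks compare as neither before nor after are concurrent, the 'parallel strings of dominoes', and a causally consistent system is free to deliver them to different nodes in different orders.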
PREPARED BY
VERIFIED BY
APPROVED BY