Introduction To DSM: Unit - III Essay Questions
Introduction to DSM
ESSAY QUESTIONS
1. What is meant by DSM and explain the design and implementation of DSM Systems?
1) Distributed shared memory (DSM) implements the shared memory model in
distributed systems, which have no physical shared memory.
2) The shared memory model provides a virtual address space shared between all nodes.
To overcome the high cost of communication in distributed systems, DSM systems move
data to the location of access. DSM is also known as DSVM.
DSM provides a virtual address space shared among processes on loosely coupled processors.
DSM is basically an abstraction that integrates the local memory of different machines into a
single logical entity.
1) Shared by cooperating processes.
2) Each node of the system consists of one or more CPUs and a memory unit.
3) Nodes are connected by a high speed communication network.
4) A simple message passing system lets nodes exchange information.
5) The main memory of individual nodes is used to cache pieces of the shared memory space.
6) A memory mapping manager routine maps local memory to the shared virtual memory.
7) The shared memory of DSM exists only virtually.
8) The shared memory space is partitioned into blocks.
9) Data caching is used in DSM systems to reduce network latency.
10) The basic unit of caching is a memory block.
11) The missing block is migrated from the remote node to the client process’s node, and
the operating system maps it into the application’s address space.
12) Data blocks keep migrating from one node to another on demand, but no communication
is visible to the user processes.
13) If data is not available in local memory, a network block fault is generated.
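The block-fault handling described above (fault, locate, migrate, map) can be sketched as a toy simulation. This is a minimal sketch, not a real DSM implementation: the `Node` class, `owner_map`, and the migrate-on-every-fault policy are illustrative assumptions.

```python
# Toy sketch of DSM block migration (illustrative, not a real DSM):
# each node caches blocks of a shared address space; on a block fault
# the missing block is migrated from the node that currently holds it.

BLOCK_SIZE = 4  # words per block (illustrative)

class Node:
    def __init__(self, name):
        self.name = name
        self.cache = {}  # block number -> list of words

    def read(self, addr, owner_map):
        block = addr // BLOCK_SIZE
        if block not in self.cache:                      # block fault
            owner = owner_map[block]                     # locate the holder
            self.cache[block] = owner.cache.pop(block)   # migrate the block
            owner_map[block] = self                      # this node now holds it
        return self.cache[block][addr % BLOCK_SIZE]

# Usage: node B faults on address 5 and block 1 migrates from node A.
a, b = Node("A"), Node("B")
a.cache[1] = [10, 20, 30, 40]   # block 1 covers addresses 4..7
owners = {1: a}
value = b.read(5, owners)       # -> 20, after migrating block 1 to B
```

After the fault, node B holds the block locally, so subsequent reads hit its cache with no network traffic, mirroring point 9 above.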
2. Explain Granularity?
The most visible parameter in the design of a DSM system is the block size.
Factors influencing block size selection: Sending a large packet of data is not much more
expensive than sending a small one.
Paging overhead: A process is likely to access a large region of its shared address space in a
small amount of time.
Therefore the paging overhead is less for a large block size as compared to the paging
overhead for a small block size.
Directory size: The larger the block size, the smaller the directory, which ultimately
results in reduced directory management overhead for larger block sizes.
Thrashing: The problem of thrashing may occur when data items in the same data block are
being updated by multiple nodes at the same time.
The problem may occur with any block size, but it is more likely with larger block sizes.
False sharing:
Fig: False Sharing (two processes access unrelated data items that lie in the same data
block; process P1 accesses one area of the block and process P2 another)
False sharing occurs when two different processes access two unrelated variables that reside in the same
data block. The larger the block size, the higher the probability of false sharing. False sharing of
a block may lead to a thrashing problem.
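A toy count of block migrations shows why false sharing leads to thrashing. The `transfers` helper and the migrate-on-every-write policy are simplifying assumptions for illustration only.

```python
# Illustrative count of block transfers under false sharing: two
# processes alternately update unrelated variables. When both
# variables map to the same block, the block ping-pongs between nodes.

def transfers(var_block, accesses):
    """Count block migrations for a sequence of (process, variable) accesses."""
    holder = {}   # block -> process currently holding it
    moves = 0
    for proc, var in accesses:
        blk = var_block[var]
        if holder.get(blk) not in (None, proc):
            moves += 1            # block must migrate to `proc`
        holder[blk] = proc
    return moves

pattern = [("P1", "x"), ("P2", "y")] * 4   # unrelated variables, alternating

same_block  = transfers({"x": 0, "y": 0}, pattern)   # false sharing: 7 migrations
diff_blocks = transfers({"x": 0, "y": 1}, pattern)   # separate blocks: 0 migrations
```

With a smaller block size the two variables would land in different blocks, eliminating the migrations, which is exactly the trade-off discussed above.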
Using page size as block size: The relative advantages and disadvantages of small and
large block sizes make it difficult for a DSM designer to decide on a proper block size.
Using the page size as the block size has the following advantages: It allows the use of existing
page fault schemes to trigger a DSM page fault.
It allows the use of existing access right control. A page-sized block does not impose undue
communication overhead at the time of a network page fault. Page size is a suitable data entity
unit with respect to memory contention.
The two types of consistency models are Data – Centric and Client – Centric consistency models.
1. Data – Centric Consistency Models:
A data store may be physically distributed across multiple machines. Each process that can
access data from the store is assumed to have a local or nearby copy of the entire store
available.
i. Strict Consistency Model:
1) Any read on a data item X returns a value corresponding to the result of the most
recent write on X.
2) This is the strongest form of memory coherence which has the most stringent
consistency requirement.
Example: Assume three operations read (R1), write (W1), read (R2) are performed in some order
on a memory address. Then (R1, W1, R2), (R1, R2, W1), (W1, R1, R2), and (R2, W1, R1) are acceptable
orderings provided all processes see the same ordering.
iii. Linearizability:
1) It is a model that is weaker than strict consistency, but stronger than sequential
consistency.
2) A data store is said to be linearizable when each operation is timestamped and
the result of any execution is the same as if the (read and write) operations by
all processes on the data store were executed in some sequential order.
3) The operations of each individual process appear in this sequence in the order
specified by its program.
5) If a write (w2) operation is causally related to another write (w1), the only acceptable
order is (w1, w2).
v. FIFO Consistency:
1) It is weaker than causal consistency.
2) This model ensures that all write operations performed by a single process are
seen by all other processes in the order in which they were performed like a
single process in a pipeline.
3) This model is simple and easy to implement, and has good performance because
writes from a single process simply travel through the pipeline in order.
3) Release consistency affects all shared data, whereas entry consistency affects only
those shared data items associated with a synchronization variable.
2. Client – Centric Consistency Models:
1) Data – Centric consistency models aim at providing a system wide view on a data
store.
2) A Client – Centric model instead concentrates on consistency from the perspective
of a single mobile client.
3) Client – Centric consistency models are generally used for applications that lack
simultaneous updates, where most operations involve reading data.
a) Eventual Consistency:
1) In systems that tolerate a high degree of inconsistency, if no updates take place
for a long time, all replicas will gradually and eventually become consistent. This
form of consistency is called eventual consistency.
2) Eventual consistency only requires that updates are guaranteed to propagate to
all replicas.
3) Eventually consistent data stores work fine as long as clients always access the
same replica.
4) Write conflicts are often relatively easy to solve when assuming that only a small
group of processes can perform updates. Eventual consistency is therefore often
cheap to implement.
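The convergence idea can be sketched as a toy anti-entropy pass. The `(version, value)` scheme and the `anti_entropy` helper are illustrative assumptions, not a real replication protocol.

```python
# Toy sketch of eventual consistency: replicas accept writes locally
# and exchange updates in the background; once updates stop and an
# anti-entropy pass runs, all replicas converge to the same state.

def anti_entropy(replicas):
    """Merge all replicas by keeping the highest-versioned value per key."""
    merged = {}
    for rep in replicas:
        for key, (version, value) in rep.items():
            if key not in merged or version > merged[key][0]:
                merged[key] = (version, value)
    for rep in replicas:
        rep.clear()
        rep.update(merged)

r1 = {"x": (1, "old")}
r2 = {"x": (2, "new")}        # a later write landed only on replica 2
anti_entropy([r1, r2])
# Both replicas now agree that x holds the later value.
```

Until the pass runs, a client reading r1 sees the stale value, which is why eventually consistent stores work best when clients stick to one replica (point 3 above).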
b) Monotonic Reads Consistency:
a) Physical Clocks: The instantaneous difference between the readings of two computer
clocks is known as skew. The rate at which a clock diverges from true time is known as
drift. Computer clock manufacturers specify a maximum drift rate in their products.
Computer clocks are among the least accurate modern timepieces. Inside every computer is a chip
containing a quartz crystal oscillator that is used to record time.
Physical Clocks – UTC: Coordinated Universal Time (UTC) is the international time standard. UTC
is the current term for what was commonly referred to as Greenwich Mean Time (GMT). Zero hours
UTC is midnight in Greenwich, England which lies on the zero longitudinal meridians. UTC is based
on a 24-hour clock.
(i) Physical Clocks – Cristian’s Algorithm:
Assume there is one time server with UTC. Each node in the distributed system
periodically polls the time server.
The client records the time T0 when it sends the request and T1 when the reply arrives; the
new time is estimated as Stime + (T1 – T0)/2, where Stime is the time reported by the server.
This process is repeated several times and an average is taken.
The machine then attempts to adjust its time to the estimate.
Disadvantages: The algorithm must take the delay between the client and the time server into
account, and the time server is a single point of failure.
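The estimate above can be written out as a small helper. The function name and the sample timestamps are illustrative.

```python
# Sketch of Cristian's clock estimate: the client records T0 when it
# sends the request and T1 when the reply (carrying server time Stime)
# arrives, then assumes the one-way delay is half the round trip.

def cristian_estimate(t0, t1, server_time):
    return server_time + (t1 - t0) / 2

# Usage: request sent at local time 100.0, reply received at 100.8,
# and the server reported 205.0; the client sets its clock to 205.4.
est = cristian_estimate(100.0, 100.8, 205.0)   # -> 205.4
```

The halving of (T1 – T0) is exactly the delay compensation the notes mention as a weakness: if the outbound and return delays are asymmetric, the estimate is off.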
(ii) Physical Clocks – Network Time Protocol (NTP): NTP enables clients across the Internet
to be synchronized accurately to UTC, and overcomes large and variable message delays.
Employs statistical techniques for filtering, based on past quality of servers and several
other measures.
Can survive lengthy losses of connectivity: Redundant servers. Redundant paths to servers.
Provides protection against malicious interference through authentication techniques.
Uses a hierarchy of servers located across the Internet. Primary servers are directly connected to a
UTC time source.
Fig: Hierarchy in NTP (stratum 1 servers at the top are directly connected to a UTC
source; servers lower in the hierarchy are progressively less accurate)
NTP has three modes: Multicast Mode: Suitable for user workstations on a LAN. One or more
servers periodically multicast the time to other machines on the network. Procedure Call
Mode: Similar to Cristian’s algorithm. Provides higher accuracy than Multicast Mode because
delays are compensated for.
Symmetric Mode: Pairs of servers exchange pairs of timing messages that contain time
stamps of recent message events. The most accurate, but also the most expensive mode.
(b) Logical Clocks: Often, it is not necessary for a computer to know the exact time, only
relative time. This is known as “logical time”.
Logical time is not based on timing but on the ordering of events. Logical clocks can only
advance forward, not in reverse. Non – interacting processes need not share a logical clock.
Computers generally obtain logical time using interrupts to update a software clock. The
more interrupts (the more frequently time is updated), the higher the overhead.
(i) Logical Clocks – Lamport’s Logical Clock Synchronization Algorithm:
The most common logical clock synchronization algorithm for distributed systems is
Lamport’s Algorithm. It is used in situations where ordering is important but global time is not
required.
Based on the “happens – before” relation: Event A “happens – before” Event B when all
processes involved in a distributed system agree that event A occurred first, and B subsequently
occurred.
This DOES NOT mean that Event A actually occurred before Event B in absolute clock time.
A distributed system can use the “happens – before” relation when: Events A and B are observed by
the same process, or by multiple processes with the same global clock.
Event A is the sending of a message and Event B is the receipt of it, since a
message cannot be received before it is sent. If two events do not communicate via messages, they
are concurrent, because their order cannot be determined and it does not matter. Concurrent events
can be ignored.
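Lamport's clock rules can be sketched as follows. The class and method names are illustrative; only the two update rules come from the algorithm itself.

```python
# Sketch of Lamport's logical clock rules: increment before each local
# event or send; on receive, take max(local, message stamp) + 1 so the
# receive is ordered after the send.

class LamportClock:
    def __init__(self):
        self.time = 0

    def tick(self):               # local event or message send
        self.time += 1
        return self.time

    def receive(self, msg_time):  # message carries the sender's timestamp
        self.time = max(self.time, msg_time) + 1
        return self.time

# Usage: P sends at logical time 1; Q (still at 0) receives and jumps to 2,
# preserving the "happens-before" order of send and receive.
p, q = LamportClock(), LamportClock()
stamp = p.tick()
q.receive(stamp)
```

Note that the converse does not hold: two events with different timestamps may still be concurrent, which is consistent with the remark above that concurrent events can be ignored.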
Fig: Centralized mutual exclusion with a coordinator – (a) a process requests entry and the
coordinator replies Ok; (b) a second process requests while the critical section is occupied,
gets no reply, and its request is queued; (c) on release, the coordinator sends Ok to the
queued process and the queue is empty.
When a process wants to enter a critical section, it builds a message containing the name of
the critical section, its process number and the current time. It then sends the message to all other
processes, as well as to itself.
When a process receives a request message, the action it takes depends on its state with
respect to the critical section named in the message. There are three cases: if the receiver is not in
the critical section and does not want to enter it, it sends an ok message to the sender.
If the receiver is in the critical section, it does not reply. It instead queues the request.
If the receiver also wants to enter the same critical section, it compares the time stamp in
the incoming message with the time stamp in the message it has sent out. The lowest time stamp
wins. If its own message has a lower time stamp it does not reply and queues the request from the
sending process.
When a process has received OK messages from all other processes, it enters the critical
section. Upon exiting the critical section, it sends OK messages to all processes in its queue and
deletes them all from the queue.
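The three receiver cases above can be sketched as a single decision function. The state names, the `(timestamp, process id)` tuples used for tie-breaking, and the `on_request` signature are illustrative assumptions, not part of the algorithm's specification.

```python
# Sketch of the receiver-side decision in the distributed mutual
# exclusion algorithm described above (Ricart-Agrawala style).

def on_request(state, own_stamp, req_stamp, queue):
    """Return 'OK' or 'DEFER' for an incoming critical-section request.

    state: 'RELEASED' (not using, not wanting), 'HELD' (in the critical
    section), or 'WANTED' (has sent its own request).
    own_stamp / req_stamp: (timestamp, process id) pairs; the lower pair
    wins, with the process id breaking timestamp ties.
    """
    if state == "RELEASED":
        return "OK"                       # case 1: not interested, reply Ok
    if state == "HELD" or (state == "WANTED" and own_stamp < req_stamp):
        queue.append(req_stamp)           # cases 2 and 3 (own request wins):
        return "DEFER"                    # no reply; queue the request
    return "OK"                           # case 3: incoming request wins

# Usage: a request stamped (3, 2) arrives at a process whose own
# outstanding request is stamped (5, 1).
q = []
reply_idle = on_request("RELEASED", (5, 1), (3, 2), q)   # -> 'OK'
reply_held = on_request("HELD", (5, 1), (3, 2), q)       # -> 'DEFER', queued
```

Deferred requests are answered with Ok when the process exits the critical section, exactly as described in the last paragraph above.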
(iii) Token – Based Algorithms: Another approach is to create a logical or physical ring.
Each process knows the identity of the process succeeding it. When the ring is initialized,
process 0 is given the token. The token circulates around the ring in order, from process k to
process k+1.
When a process receives the token from its neighbor, it checks to see if it is attempting to
enter a critical section. If so, the process enters the critical section and does its work, keeping the
token the whole time.
After the process exits the critical section, it passes the token to the next process in the ring.
It is not permitted to enter a second critical section using the same token.
If a process is handed the token and is not interested in entering a critical section, it passes the
token to the next process.
Fig: (a) An unordered group of processes on a network, from which a logical ring is
constructed in software so that each process knows its successor.
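The token circulation described above can be sketched as a toy single-lap simulation. The `circulate` helper and its parameters are illustrative; real implementations pass the token as a network message.

```python
# Minimal token-ring sketch: the token circulates in ring order; a
# process holding the token enters its critical section if it wants
# to, then passes the token to its successor.

def circulate(n, wants, start=0, laps=1):
    """Return the order in which processes enter the critical section."""
    entered = []
    for i in range(n * laps):
        proc = (start + i) % n
        if proc in wants:            # holds the token and wants the CS
            entered.append(proc)     # enter the CS, work, then release
        # otherwise the token is simply passed to (proc + 1) % n
    return entered

# Usage: in a ring of 8 processes, processes 2 and 5 want the critical
# section; the token starting at 0 reaches 2 first, then 5.
order = circulate(8, wants={5, 2})   # -> [2, 5]
```

The sketch also shows the scheme's fairness: entry order is ring order, never request order, and an uninterested process merely forwards the token.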
2. Given, for each process, the request that it is waiting for, one can check whether the
current system state is deadlocked or not.
In single – processor systems, the OS can maintain this information and periodically execute a
deadlock detection algorithm.
What to do if a deadlock is detected:
1. Kill a process involved in the deadlocked set
2. Inform the users, etc.
6. The global wait – for graph can be obtained by taking the union of the edges in all the local copies.
(iii) Deadlock Prevention:
1. Hierarchical ordering of resources avoids cycles
2. Time – stamp ordering approach:
c) This prevents deadlocks since, for every edge (P, Q) in the wait – for graph, P has a
higher priority than Q. Thus a cycle cannot exist.
3. Otherwise, Pi waits for T’ time units to hear from the new coordinator, and if there is no
response, starts from step (1) again.
Algorithm for the other processes (also called Pi):
If Pi is not the coordinator, then Pi may receive either of these messages from Pj:
If Pj sends “Elected Pj” [this message is only received if i < j],
Pi updates its records to say that Pj is the coordinator.
Else if Pj sends an “election” message (i > j),
DSM has to keep track of the locations of all copies of data objects. Examples of
implementations:
1. IVY: The owner node of a data object knows all nodes that have copies.
2. PLUS: A distributed linked list tracks all nodes that have copies.
Advantage: Read replication can lead to substantial performance improvements if the ratio of
reads to writes is large.
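IVY-style copy tracking can be sketched as a toy directory kept by the owner. The `Directory` class and the invalidate-on-write policy are an illustrative simplification of how an owner node might know all nodes holding copies.

```python
# Sketch of IVY-style copy tracking: the owner of each block keeps the
# set of nodes holding read copies; a write invalidates all of them so
# the writer holds the only valid copy.

class Directory:
    def __init__(self):
        self.copies = {}   # block -> set of nodes holding a copy

    def read(self, block, node):
        self.copies.setdefault(block, set()).add(node)   # replicate for reads

    def write(self, block, node):
        invalidated = self.copies.get(block, set()) - {node}
        self.copies[block] = {node}     # writer keeps the only valid copy
        return invalidated              # nodes that must drop their copies

# Usage: three nodes read block 7, then node A writes it; the copies
# held by B and C must be invalidated.
d = Directory()
d.read(7, "A"); d.read(7, "B"); d.read(7, "C")
stale = d.write(7, "A")     # -> {"B", "C"}
```

Reads cheaply grow the copy set while writes pay the invalidation cost, which is why read replication wins exactly when the read-to-write ratio is large, as stated above.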