

Lecture Notes
Dr. Tong Lai Yu, March 2010

 0. Review and Overview
 1. B-Trees
 2. An Introduction to Distributed Systems
 3. Deadlocks
 4. Distributed Systems Architecture
 5. Processes
 6. Communication
 7. Distributed OS Theories
 8. Distributed Mutual Exclusions
 9. Agreement Protocols
10. Distributed Scheduling
11. Distributed Resource Management
12. Recovery and Fault Tolerance
13. Security and Protection
Distributed Mutual Exclusion
Life consists not in holding good cards but in playing those you hold well.

Josh Billings

1. Introduction

A Centralized Algorithm
One process is elected as the coordinator.
Whenever a process wants to access a shared resource, it sends a request
to the coordinator to ask for permission.

The coordinator may queue requests.
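The scheme above can be sketched in a few lines of Python ( a simulation only; `Coordinator`, `request`, and `release` are illustrative names, not from the notes ):

```python
import queue
import threading

class Coordinator:
    """One elected process grants CS access; requests that arrive
    while the resource is busy are queued and granted in FIFO order."""

    def __init__(self):
        self.waiting = queue.Queue()      # queued requests
        self.busy = False
        self.lock = threading.Lock()
        self.granted = []                 # order in which grants were issued

    def request(self, pid):
        """A process asks the coordinator for permission to enter the CS."""
        with self.lock:
            if self.busy:
                self.waiting.put(pid)     # coordinator queues the request
            else:
                self.busy = True
                self.granted.append(pid)  # GRANT sent to pid

    def release(self, pid):
        """A process leaving the CS notifies the coordinator."""
        with self.lock:
            if not self.waiting.empty():
                self.granted.append(self.waiting.get())  # grant next in line
            else:
                self.busy = False

c = Coordinator()
c.request(1)      # granted immediately
c.request(2)      # queued
c.request(3)      # queued
c.release(1)      # 2 is granted
c.release(2)      # 3 is granted
print(c.granted)  # [1, 2, 3]
```

Note the single point of failure: if the coordinator dies, no further grants are possible until a new one is elected.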

Decentralized
non-token-based
token-based

Requirements of Mutual Exclusion Algorithms


Only one request accesses the CS at a time ( primary goal )
Freedom from deadlocks
Freedom from starvation
Fairness
Fault Tolerance

Performance of a mutual exclusion algorithm


System throughput S ( rate at which the system executes requests for the CS )

        1
S = ---------
     Sd + E

Sd = synchronization delay
E = average execution time


low-load and high-load performance
best and worst case performance; if performance fluctuates statistically, take the average
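For concreteness, the throughput formula with illustrative values ( Sd = 2 ms, E = 8 ms; these numbers are assumptions, not from the notes ):

```python
# S = 1 / (Sd + E): smaller synchronization delay or execution time
# means higher throughput of CS requests.
Sd = 0.002          # synchronization delay, seconds (illustrative)
E = 0.008           # average CS execution time, seconds (illustrative)
S = 1 / (Sd + E)
print(S)            # 100.0 CS requests per second
```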

8. Election Algorithms

Principle

An algorithm requires that some process acts as a coordinator. The question

is how to select this special process dynamically.

Note

In many systems the coordinator is chosen by hand (e.g. file servers). This
leads to centralized solutions with a single point of failure.

After a network partition, the leader-less partition must elect a leader.

Election by bullying

Principle

Each process has an associated priority (weight). The process with

the highest priority should always be elected as the coordinator.

Issue

How do we find the heaviest process?

Any process can just start an election by sending an election

message to all other processes with higher numbers.


If a process Pheavy receives an election message from a lighter

process Plight, it sends a take-over message to Plight. Plight is out of

the race.
If a process doesn't get a take-over message back, it wins, and

sends a victory message to all other processes.

(a) Process 4 holds an election. (b) Processes 5 and 6 respond, telling 4 to stop.
(c) Now 5 and 6 hold an election. (d) Process 6 tells 5 to stop.
(e) Process 6 wins and tells everyone.

Issue

Suppose a crashed node comes back online:

It sends a new election message to higher-numbered processes

Repeat until only one process is left standing
The winner announces victory by sending a message saying that it is coordinator (if not already coordinator)
The existing (lower-numbered) coordinator yields

Hence the term 'bully'
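The election logic above can be sketched as follows ( message passing is collapsed into direct calls, so the sketch models only who wins, not the actual message flow; `bully_election` is an illustrative name ):

```python
def bully_election(alive, initiator):
    """Sketch of the bully algorithm over a set of process ids.
    `alive` is the set of reachable (live) processes."""
    higher = [p for p in alive if p > initiator]
    if not higher:
        return initiator              # nobody heavier: initiator wins
    # each higher process sends a take-over message and starts its own
    # election; the chain ends at the highest-numbered live process
    return bully_election(alive, max(higher))

# process 4 starts an election; processes 5 and 6 bully it out, 6 wins
print(bully_election({1, 2, 4, 5, 6}, 4))  # 6
```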

Election in a ring

Principle

Process priority is obtained by organizing processes into a (logical)

ring. Process with the highest priority should be elected as

coordinator.

Any process can start an election by sending an election message

to its successor. If a successor is down, the message is passed

on to the next successor.


If a message is passed on, the sender adds itself to the list. When
it gets back to the initiator, everyone has had a chance to make its
presence known.
The initiator sends a coordinator message around the ring

containing a list of all living processes. The one with the highest

priority is elected as coordinator.

Processes 2 and 5 start election messages independently. Both messages continue to circulate.


Eventually, both messages will go all the way around.
2 and 5 will then convert their ELECTION messages to COORDINATOR messages.

All processes recognize highest numbered process as new coordinator.
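One trip of the election message around the ring can be sketched as follows ( crashed processes are assumed to be simply absent from the list; `ring_election` is an illustrative name ):

```python
def ring_election(ring, initiator):
    """Sketch of ring election. `ring` lists live processes in
    successor order. The ELECTION message circulates once, collecting
    ids; the initiator then announces the highest id as COORDINATOR."""
    n = len(ring)
    start = ring.index(initiator)
    members = [ring[(start + k) % n] for k in range(n)]  # one full trip
    return max(members)       # contents of the COORDINATOR message

print(ring_election([2, 5, 1, 7, 3], 2))  # 7 is elected
# a second, concurrent initiator collects the same membership,
# so it announces the same coordinator -- duplicate elections are harmless
print(ring_election([2, 5, 1, 7, 3], 5))  # also 7
```

This also answers the first question below: two concurrent initiators both announce the same highest-numbered process.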

Question

Does it matter if two processes initiate an election?

Question

What happens if a process crashes during the election?

19. Non-token-based algorithms

Lamport's Algorithm

Si -- site, N sites

each site maintains a request set

Ri = { S1, S2, ..., SN }


request-queuei containing mutual exclusion requests ordered by their timestamps

use the ⇒ total order relation ( with Lamport's clock )

tsi -- timestamp of site i

Assume
messages are received in the same order as they are sent
eventually every message is received

1. To request entering the CS, process Pi sends a REQUEST( tsi, i ) message to every
   process ( including itself ) and puts the request on request-queuei

2. When process Pj receives REQUEST( tsi, i ), it places it on its request-queuej and sends a
   timestamped REPLY ( acknowledgement ) to Pi

3. Process Pi enters the CS when the following 2 conditions are satisfied:
   Pi's request is at the head of request-queuei
   Pi has received a REPLY message from every other process time-stamped later than tsi

4. When exiting the CS, process Pi removes its request from the head of its request-queuei and
   sends a timestamped RELEASE to every other process

5. When Pj receives a RELEASE from Pi, it removes Pi's request from its request-queuej

See video: Lamport Mutual Exclusion Algorithm
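Steps 1-3 can be sketched as follows ( a single-threaded simulation; `LamportSite` and its methods are illustrative names, and real message delivery is replaced by direct calls ):

```python
import heapq

class LamportSite:
    """Sketch of one site in Lamport's algorithm."""

    def __init__(self, site_id, n_sites):
        self.id = site_id
        self.n = n_sites
        self.clock = 0                # Lamport logical clock
        self.queue = []               # request-queue ordered by (ts, id)
        self.last_reply = {}          # sender -> timestamp of latest REPLY

    def request_cs(self):
        """Step 1: timestamp the request, queue it locally, broadcast it."""
        self.clock += 1
        self.my_request = (self.clock, self.id)
        heapq.heappush(self.queue, self.my_request)
        return self.my_request        # REQUEST(ts, id) to every site

    def on_request(self, ts, sender):
        """Step 2: queue the foreign request, return a timestamped REPLY."""
        self.clock = max(self.clock, ts) + 1
        heapq.heappush(self.queue, (ts, sender))
        return (self.clock, self.id)

    def on_reply(self, ts, sender):
        self.last_reply[sender] = ts

    def can_enter_cs(self):
        """Step 3: own request at the head of the queue, and a
        later-stamped REPLY received from every other site."""
        at_head = bool(self.queue) and self.queue[0] == self.my_request
        replied = all(self.last_reply.get(j, -1) > self.my_request[0]
                      for j in range(self.n) if j != self.id)
        return at_head and replied

p0, p1 = LamportSite(0, 2), LamportSite(1, 2)
ts, i = p0.request_cs()              # P0 broadcasts REQUEST(1, 0)
p0.on_reply(*p1.on_request(ts, i))   # P1 queues it and REPLYs
print(p0.can_enter_cs())             # True
```

Because every site orders its queue by (timestamp, site id), all sites agree on which request is at the head.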

Performance

for each CS invocation


(N-1) REQUEST

(N-1) REPLY

(N-1) RELEASE

total 3(N-1) messages

synchronization delay Sd = average delay

Ricart and Agrawala optimized Lamport's algorithm by
merging the RELEASE and REPLY messages, reducing the cost to 2(N-1) messages per CS invocation.

(See example below.)

Example:

(a) Two processes want to access a shared resource at the same time

(b) Process 0 has the lowest timestamp, so it wins

(c) When process 0 is done, it sends an OK as well, so 2 can now go ahead

Maekawa's Voting Algorithm

Voting Algorithms:

Lamport's algorithm requires a process to get permission from all other processes, which is overkill.
A different approach is to let processes compete for votes. If a process has received more votes than any other process, it can
enter the CS. If it does not have enough votes, it waits until the process in the CS is done and releases its votes.
Quorums have the property that any two groups have a non-empty intersection.
Simple majorities are quorums: any 2 sets whose sizes are simple majorities must have at least one element in common.

12 nodes, so majority is 7

Grid quorum: arrange nodes in logical grid (square). A quorum is all of a row and all of a column. Quorum size is 2 √ N - 1.
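A sketch of grid quorums ( assuming N is a perfect square; `grid_quorum` is an illustrative name ):

```python
import math

def grid_quorum(node, n):
    """Nodes 0..n-1 arranged in a sqrt(n) x sqrt(n) grid; a node's
    quorum is its full row plus its full column."""
    side = math.isqrt(n)
    assert side * side == n, "sketch assumes n is a perfect square"
    r, c = divmod(node, side)
    row = {r * side + j for j in range(side)}
    col = {i * side + c for i in range(side)}
    return row | col              # size = 2*sqrt(n) - 1

q = grid_quorum(5, 16)            # node 5 in a 4x4 grid
print(len(q))                     # 7, i.e. 2*sqrt(16) - 1
# any two grid quorums intersect: the row of one always meets
# the column of the other
assert all(grid_quorum(a, 16) & grid_quorum(b, 16)
           for a in range(16) for b in range(16))
```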

Principles:

To get access to the CS, not all processes have to agree

It suffices to split the set of processes up into subsets ("voting sets") that
overlap
It suffices that there is consensus within every subset
When a process wishes to enter the CS, it sends a vote request to every member of its voting district.
When the process receives replies from all the members of the district, it can enter the CS.
When a process receives a vote request, it responds with a "YES" vote if it has not already cast its vote.
When a process exits the CS, it informs the voting district, which can then vote for other candidates.
May have deadlock.

Request sets

N = { 1, 2, ..., N }

Ri ∩ Rj ≠ ∅   all i, j ∈ N

A site can send a REPLY ( LOCKED ) message only if it is not already LOCKED (i.e. has not cast its vote).

Properties:

1. Ri ∩ Rj ≠ ∅
2. Si ∈ Ri
3. |Ri| = K     for all i ∈ N
4. any site Si is contained in exactly K of the Ri's

Maekawa found that:

N=K(K-1)+1

or K = |Ri| ≈ √N

Message exchange:

Failed -- F,   Sj cannot grant permission to Sk because Sj has granted permission to a site with
higher request priority.

Inquire -- I,   Sj wants to find out if Sk has successfully locked all sites. ( the outstanding
grant to Sk has a lower priority than the new request )

Yield -- Y,   Sj yields to Sk.
( Sj has received a FAILED message from some other site, or Sj has sent a YIELD to some other
site but has not received a new grant )

( The request's priority is determined by its sequence number ( timestamp ); the smaller the
sequence number, the higher the priority; if the sequence numbers are the same, the one with the
smaller site number has higher priority )

Algorithm:

1. A site Si requests access to the CS by sending REQUEST(i) messages to all the sites in its
   request set Ri
2. When a site Sj receives the REQUEST(i) message, it sends a REPLY(j) message to Si provided it
   hasn't sent a REPLY to any site since the last RELEASE. Otherwise, it queues up the REQUEST.
3. Site Si can access the CS only after it has received a REPLY from all sites in Ri
Deadlock Handling:

1. When a REQUEST(i) from Si blocks at site Sj because Sj has currently granted permission to site
   Sk, then Sj sends a FAILED(j) message to Si if Si has lower priority. Otherwise Sj sends an
   INQUIRE(j) message to Sk.
2. In response to an INQUIRE(j) from Sj, site Sk sends YIELD(k) to Sj, provided Sk has received a
   FAILED message or has sent a YIELD to another site but has not received a new REPLY from it.
3. In response to a YIELD(k) message from Sk, site Sj assumes it has been released by Sk, places
   the request of Sk at the appropriate location in the request queue, and sends a REPLY(j) to the
   site of the top request in the queue.

Example

13 nodes, 13 = 4(4-1) + 1, thus K = 4

R1 = { 1, 2, 3, 4 }

R2 = { 2, 5, 8, 11 }

R3 = { 3, 6, 8, 13 }

R4 = { 4, 6, 10, 11 }

R5 = { 1, 5, 6, 7 }

R6 = { 2, 6, 9, 12 }

R7 = { 2, 7, 10, 13 }

R8 = { 1, 8, 9, 10 }

R9 = { 3, 7, 9, 11 }

R10 = { 3, 5, 10, 12 }

R11 = { 1, 11, 12, 13 }

R12 = { 4, 7, 8, 12 }

R13 = { 4, 5, 9, 13 }
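The request sets above can be checked mechanically against properties 1-4 ( the sets below are transcribed from the example ):

```python
from collections import Counter

R = {
    1: {1, 2, 3, 4},     2: {2, 5, 8, 11},    3: {3, 6, 8, 13},
    4: {4, 6, 10, 11},   5: {1, 5, 6, 7},     6: {2, 6, 9, 12},
    7: {2, 7, 10, 13},   8: {1, 8, 9, 10},    9: {3, 7, 9, 11},
    10: {3, 5, 10, 12},  11: {1, 11, 12, 13}, 12: {4, 7, 8, 12},
    13: {4, 5, 9, 13},
}
K = 4
# property 1: every pair of request sets intersects
assert all(R[i] & R[j] for i in R for j in R)
# properties 2 and 3: Si is in Ri and |Ri| = K
assert all(i in R[i] and len(R[i]) == K for i in R)
# property 4: each site appears in exactly K request sets
counts = Counter(s for r in R.values() for s in r)
assert all(counts[s] == K for s in R)
print("all Maekawa properties hold for N = 13, K = 4")
```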

Suppose sites 11, 8, 7 want to enter CS; they all send requests
with sequence number 1. ( 7 has highest priority, 8 next, 11 lowest )

1. site 11 wants to enter; its requests have arrived at 12 and 13; the REQUEST to 1 is on the way
2. 7 wants to enter the CS; its REQUEST arrived at 2 and 10, but the REQUEST to 13 is on its way
3. 8 also wants to enter the CS; it sends REQUESTs to 1, 9, 10, but fails to lock 10 because 10
   has been locked by 7, which has higher priority
4. the REQUEST from 11 finally arrives at 1 and the REQUEST from 7 arrives at 13

11, 7, 8 are circularly locked:

8 receives FAILED and cannot enter the CS
11 receives FAILED and cannot enter the CS
7 cannot enter the CS because it has not received all REPLY ( LOCKED ) messages

8. 13 is locked by 11 ( which has lower priority than 7 ) and receives the request from 7, so it
   sends an INQUIRE to 11 to ask it to yield
9. When 11 receives the INQUIRE, it knows that it cannot enter the CS; therefore it sends a YIELD
   to 13
10. then 13 can send a LOCKED to 7, which enters the CS
11. when 7 finishes, it sends RELEASEs
12. then 8 locks all members, ... , sends RELEASEs
13. then 11 enters
35. Token-based algorithms

Principles
one token, shared among all sites
a site can enter its CS iff it holds the token
the major difference among algorithms is the way the token is searched for
use sequence numbers instead of timestamps

o used to distinguish requests from the same site

o kept independently for each site

o used to distinguish between old and current requests

The proof of mutual exclusion is trivial


The proof of other issues (deadlock and starvation) may be less so

(a) An unordered group of processes on a network.

(b) A logical ring connected in software.

a) Suzuki-Kasami's Broadcast Algorithm

TOKEN -- a special PRIVILEGE message

the node that owns the TOKEN can enter the CS
initially node 1 has the TOKEN
the node holding the TOKEN can execute the CS repeatedly if no request
from others comes
if a node wants the TOKEN, it broadcasts a REQUEST message to all
other nodes

node:

REQUEST(j, n)

  node j requesting its n-th CS invocation

n = 1, 2, 3, ... , sequence #

when node i receives REQUEST(j, n) from node j, it updates

RNi[j] = max ( RNi[j], n )

RNi[j] = largest seq # received so far from node j

TOKEN:

TOKEN(Q, LN ) ( suppose at node i )

Q -- queue of requesting nodes

LN -- array of size N such that

   LN[j] = the seq # of the request of node j granted most recently

When node i finishes executing the CS, it does the following:

1. set LN[i] = RNi[i] to indicate that the current request of node i has been granted ( executed )
2. every node k such that

   RNi[k] > LN[k]

   ( i.e. node k is requesting ) is appended to Q if it is not already there

When these updates are complete, if Q is not empty, the front node is deleted from Q and the TOKEN
is sent to it

FCFS
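The token-handling steps above can be sketched as follows ( a two-node simulation with direct calls; `SKNode` is an illustrative name ):

```python
class SKNode:
    """Sketch of a Suzuki-Kasami node. RN[j] is the highest sequence
    number seen from node j; the token is a pair (LN, Q)."""

    def __init__(self, node_id, n):
        self.id = node_id
        self.RN = [0] * n
        self.token = None                   # (LN, Q) when held

    def request_cs(self):
        self.RN[self.id] += 1
        return (self.id, self.RN[self.id])  # broadcast REQUEST(j, n)

    def on_request(self, j, n):
        self.RN[j] = max(self.RN[j], n)     # keep only the newest seq #
        # an idle token holder would forward the token to j here

    def release_cs(self):
        """Steps done on leaving the CS; returns the id the token is
        sent to, or None if it stays put."""
        LN, Q = self.token
        LN[self.id] = self.RN[self.id]      # own request is now granted
        for k in range(len(self.RN)):
            if self.RN[k] > LN[k] and k not in Q:
                Q.append(k)                 # node k has an ungranted request
        if Q:
            dest = Q.pop(0)                 # FCFS: front of Q gets the token
            self.token = None
            return dest
        return None

n0, n1 = SKNode(0, 2), SKNode(1, 2)
n0.token = ([0, 0], [])                     # node 0 starts with the token
n0.on_request(*n1.request_cs())             # node 1 broadcasts REQUEST(1, 1)
dest = n0.release_cs()                      # node 0 leaves the CS
print(dest)                                 # 1 -- the token goes to node 1
```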

Example:

There are three processes, p1, p2, and p3.


p1 and p3 seek mutually exclusive access to a shared resource.

Initially: the token is at p2 and the token's state is LN = [0, 0, 0] and Q empty;

p1's state is: n1 ( seq # ) = 0, RN1 = [0, 0, 0];


p2's state is: n2 = 0, RN2 = [0, 0, 0];
p3's state is: n3 = 0, RN3 = [0, 0, 0];

p1 sends REQUEST(1, 1) to p2 and p3; p1: n1 = 1, RN1 = [ 1, 0, 0 ]

p3 sends REQUEST(3, 1) to p1 and p2; p3: n3 = 1, RN3 = [ 0, 0, 1 ]

p2 receives REQUEST(1, 1) from p1; p2: n2 = 1, RN2 = [ 1, 0, 0 ], holding token

p2 sends the token to p1

p1 receives REQUEST(3, 1) from p3: n1 = 1, RN1 = [ 1, 0, 1 ]


p3 receives REQUEST(1, 1) from p1; p3: n3 = 1, RN3 = [ 1, 0, 1 ]

p1 receives the token from p2

p1 enters the critical section

p1 exits the critical section and sets the token's state to LN = [ 1, 0, 0 ] and Q = ( 3 )

p1 sends the token to p3; p1: n1 = 1, RN1 = [ 1, 0, 1 ], no longer holding the token; token's state is LN = [ 1, 0, 0 ] and Q empty

p3 receives the token from p1; p3: n3 = 1, RN3 = [ 1, 0, 1 ], holding token


p3 enters the critical section

p3 exits the critical section and sets the token's state to LN = [ 1, 0, 1 ] and Q empty

Performance:

It requires at most N message exchanges per CS execution ( (N-1) REQUEST messages + 1 TOKEN message ),
or 0 messages if the TOKEN is already at the site
synchronization delay is 0 or T
deadlock free ( because of the TOKEN requirement )
no starvation ( i.e. a requesting site enters the CS in finite time )

Comparison of the Lamport and Suzuki-Kasami Algorithms


The essential difference is in who keeps the queue. In Lamport's algorithm every site keeps its own local copy of the
queue; in Suzuki-Kasami the queue is passed around within the token.
