Chap. 6 Consistency & Replication: Distributed Systems
Objectives
• Reasons for replication
• Relationship between replication and scalability:
▪ Scaling in numbers – increasing the number of processes that need to access a single server.
▪ Scaling in geographical area – placing a copy of the data in the proximity of the processes using it.
▪ Caveats / cautions
▪ Gain in performance
▪ Fault tolerance – correctness concerns about the freshness of the data supplied to the client and the effects of the client's operations on the data, e.g. air traffic control (correct data is needed on a short timescale).
(Figure omitted.) Components of the basic architectural model for the management of replicated data: clients (C) send requests and receive replies via front ends (FE), which communicate with the replica managers (RM).
Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 4
© Pearson Education 2007
Figure 6.2
Services provided for process groups (figure omitted): processes may join a process group, leave it, and send messages to it.
FACULTY OF INFORMATION & COMMUNICATION TECHNOLOGY
Passive (Primary-Backup) Replication
• There is a single primary replica manager and one or more secondary replica managers – the 'backups' (or 'slaves').
• The sequence of events is as follows:
1. Request – The front end issues the request, containing a unique identifier, to the primary replica manager.
2. Coordination – The primary takes the request and checks the unique identifier; if the request has already been executed, it re-sends the stored response.
3. Execution – The primary executes the request and stores the response.
4. Agreement – If the request is an update, the primary sends the updated state, the response, and the unique identifier to all the backups. The backups send acknowledgements.
5. Response – The primary responds to the front end, which passes the response back to the client.
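The five steps above can be sketched in Python. This is a minimal in-memory toy, assuming synchronous acknowledgements; the class and method names are illustrative, not from the textbook.

```python
# Sketch of passive (primary-backup) replication with in-memory replicas.

class ReplicaManager:
    def __init__(self):
        self.state = {}
        self.seen = {}   # request id -> cached response (detects duplicates)

class Backup(ReplicaManager):
    def apply(self, req_id, key, value, response):
        # Agreement phase (backup side): install state and acknowledge.
        self.state[key] = value
        self.seen[req_id] = response
        return "ack"

class Primary(ReplicaManager):
    def __init__(self, backups):
        super().__init__()
        self.backups = backups

    def handle(self, req_id, key, value):
        # Coordination: if already executed, re-send the stored response.
        if req_id in self.seen:
            return self.seen[req_id]
        # Execution: apply the update and build the response.
        self.state[key] = value
        response = ("ok", key, value)
        # Agreement: send state + response + id to every backup, await ACKs.
        for b in self.backups:
            assert b.apply(req_id, key, value, response) == "ack"
        # Response: remember it, then reply to the front end.
        self.seen[req_id] = response
        return response

backups = [Backup(), Backup()]
primary = Primary(backups)
print(primary.handle("r1", "x", 42))   # ('ok', 'x', 42)
print(primary.handle("r1", "x", 42))   # duplicate id: same cached response
```

Re-sending the cached response in step 2 is what makes a retried request safe after a lost reply.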
(Figure omitted.) Active replication: front ends multicast their requests to the group of replica managers; all replica managers process each request independently but identically, and all reply.
Active Replication
• The sequence of events is as follows:
1. Request – The front end issues the request (containing a unique identifier) and multicasts it to the group of replica managers. It will not issue the next request until it receives a response.
2. Coordination – The group communication system delivers the request to every correct replica manager in the same order.
3. Execution – Every replica manager executes the request. Correct replica managers all process the request identically. The response contains the client's unique request identifier.
4. Agreement – No agreement phase is needed, because of the multicast delivery semantics.
5. Response – Each replica manager sends its response to the front end; the front end passes the first response to arrive back to the client and discards the rest.
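The steps above can be sketched in Python. Here the totally ordered multicast of step 2 is simulated by simply iterating over the replica list in a fixed order; the names are illustrative, not from the textbook.

```python
# Sketch of active replication: every RM executes every request in the
# same order, so all copies of the state remain identical.

class ReplicaManager:
    def __init__(self):
        self.state = {}

    def execute(self, req_id, op, key, value=None):
        # Execution: deterministic processing, response tagged with req_id.
        if op == "update":
            self.state[key] = value
            return (req_id, "ok")
        return (req_id, self.state.get(key))

def front_end(replicas, req_id, op, key, value=None):
    # Request + Coordination: the (simulated) ordered multicast delivers
    # the request to every correct replica manager.
    responses = [rm.execute(req_id, op, key, value) for rm in replicas]
    # Response: pass the first reply back to the client, discard the rest.
    return responses[0]

rms = [ReplicaManager() for _ in range(3)]
front_end(rms, "u1", "update", "x", 7)
print(front_end(rms, "q1", "query", "x"))   # ('q1', 7)
```

Because delivery order is identical at every replica, no separate agreement phase is required, exactly as step 4 states.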
(Figure omitted.) The gossip service provides two basic types of operation on the replica managers: queries and updates. Front ends send queries and updates to any replica manager they choose – any that is available and can provide a reasonable response time. A query carries the front end's timestamp prev and returns a value val with a new timestamp; an update carries prev and returns an update id. Replica managers exchange gossip messages among themselves.
Gossip Architecture
• The sequence of events is as follows:
1. Request – The front end normally issues the request to a single replica manager at a time. If that replica manager fails or is unreachable, it may try another RM.
2. Update response – If the request is an update, the RM replies as soon as it has received the update.
3. Coordination – The RM that receives a request does not process it until it can apply the request according to the required ordering constraints.
4. Execution – The RM executes the request.
5. Query response – If the request is a query, the RM replies at this point.
6. Agreement – The RMs update one another by exchanging gossip messages, which contain the most recent updates they have received.
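The gossip exchange of step 6 can be sketched in Python. This toy tags each update with a vector timestamp and merges logs pairwise; it is deliberately simplified (a real gossip RM also keeps executed-operation tables and enforces causal ordering before applying updates), and the names are illustrative.

```python
# Toy sketch of gossip-style update propagation with vector timestamps.

class GossipRM:
    def __init__(self, rm_id, n):
        self.rm_id = rm_id
        self.ts = [0] * n        # replica vector timestamp
        self.log = []            # entries: (timestamp, key, value)
        self.state = {}

    def update(self, key, value):
        # Accept an update from a front end and timestamp it.
        self.ts[self.rm_id] += 1
        self.log.append((list(self.ts), key, value))
        self.state[key] = value
        return list(self.ts)     # returned to the FE as its new `prev`

    def gossip_from(self, other):
        # Agreement: merge any log entries we have not yet seen, and
        # advance our vector timestamp component-wise.
        known = [entry[0] for entry in self.log]
        for ts, key, value in other.log:
            if ts not in known:
                self.log.append((ts, key, value))
                self.state[key] = value
                self.ts = [max(a, b) for a, b in zip(self.ts, ts)]

a, b = GossipRM(0, 2), GossipRM(1, 2)
a.update("x", 1)
b.gossip_from(a)       # b learns the update via gossip
print(b.state)         # {'x': 1}
```

The vector timestamp returned to the front end is what the FE later sends as prev, so an RM can tell whether it is up to date enough to answer a query.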
(Figure omitted.) Each front end keeps a vector timestamp that reflects the version of the latest data values accessed by that FE; the replica managers exchange gossip messages with one another.
Figure 6.8
A gossip replica manager, showing its main state components (figure omitted). Front ends submit operations tagged with an id, the update itself, and prev, where:
• Prev – the latest value (timestamp) accessed by the FE.
• Update id – the unique update identifier returned by the RM.
Figure 6.9
Committed and tentative updates in Bayou, which provides data replication for high availability with weaker guarantees (figure omitted). The update log holds a committed sequence c0, c1, c2, …, cN followed by a tentative sequence t0, t1, t2, …, ti, ti+1.
Figure 6.10
Transactions on replicated data (figure omitted). Client transactions issue operations such as getBalance(A), getBalance(B), deposit(A,3) and deposit(B,3) against groups of replica managers (X, Y, P, M, N), each holding copies of the bank accounts A and B.
Quorum-Based Protocols
• Basic idea – require clients to request and acquire the permission of multiple servers before either reading or writing a replicated data item.
• The read quorum NR and write quorum NW must satisfy, for N = number of replicas:
1. NR + NW > N (every read quorum overlaps every write quorum)
2. NW > N/2 (any two write quora overlap)
• Figure 6-22 (figure omitted): a correct choice, known as ROWA (read one, write all), sets NR = 1 and NW = N.
• "Network partition" refers to a barrier that divides the replica managers (e.g. X, V, Y, Z) into several parts.
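The two quorum constraints can be checked with a small helper. This is a sketch of the arithmetic only, with illustrative names; it does not model the actual locking of replicas.

```python
# Check the quorum constraints NR + NW > N and NW > N/2.

def valid_quorum(n_r, n_w, n):
    """Return True iff read quorum n_r and write quorum n_w are a correct
    choice for n replicas: reads overlap writes, and writes overlap writes."""
    return n_r + n_w > n and n_w > n / 2

print(valid_quorum(1, 12, 12))   # True  -> ROWA: read one, write all
print(valid_quorum(3, 10, 12))   # True  -> a balanced choice
print(valid_quorum(6, 6, 12))    # False -> NR + NW is not > N
```

The overlap guaranteed by constraint 1 is what lets a reader always see the latest committed write; constraint 2 prevents two disjoint write quora from committing conflicting updates during a partition.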
Figure 6.14
Virtual partition
(Figure omitted: replica managers X, V, Y, Z.)
E.g. V keeps trying to contact Y and Z until one of them replies, e.g. when Y becomes accessible. The group of replica managers V, X and Y then comprises a virtual partition, because they are sufficient to form read and write quora.
When a new virtual partition is created during a transaction that has performed an operation at one of the RMs (e.g. transaction T), the transaction must be aborted, and the replicas within the new partition must be brought up to date.
Figure 6.15
Two overlapping virtual partitions (figure omitted), each containing some of the replica managers V, X, Y, Z.
References
These slides are adapted from Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2nd ed., © 2007 Prentice-Hall, Inc. All rights reserved. ISBN 0-13-239227-5.
Figures are from the Instructor's Guide for Coulouris, Dollimore and Kindberg, Distributed Systems: Concepts and Design, 4th ed., © Pearson Education 2007.
KEY POINTS
1. Reasons for Replication
2. Types of Replication
3. Consistency
4. Network Partition vs. Virtual Partition
Questions: Replication
Three computers together provide a replicated service. The manufacturers claim that each computer has a mean time between failures of five days; a failure typically takes four hours to fix. What is the availability of the replicated service?
Answer:
Formula to use: availability = 1 − p^n, where p is the probability that a single replica is unavailable and n = number of replicas.
p = repair time / MTBF = 4 hours / (5 × 24 hours) ≈ 0.033
availability = 1 − 0.033^3 ≈ 1 − 3.7 × 10⁻⁵ ≈ 0.99996
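The arithmetic in the worked answer can be reproduced directly (the service is unavailable only when all three computers are down at once, assuming independent failures):

```python
# Availability of a 3-way replicated service: MTBF = 5 days, repair = 4 hours.

mtbf_hours = 5 * 24                    # mean time between failures, per computer
repair_hours = 4
p_down = repair_hours / mtbf_hours     # probability one computer is unavailable
availability = 1 - p_down ** 3         # service fails only if all three are down
print(round(p_down, 3))                # 0.033
print(round(availability, 5))          # 0.99996
```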
End of Lecture