Chap 15.


 Processes need to agree on a single bit

 No link failures
 A process can fail by crashing (no malicious behavior)
 Messages take finite (though unbounded) time to be delivered
 Looks easy; can this be solved?
Consensus in Asynchronous systems
 Impossible even if just one process can fail!
(Fischer, Lynch, Paterson: the FLP result)

 N (N ≥ 2) processes
 Each process starts with an initial value {0,1}
that is modeled as the input register x
 Making a decision is modeled by writing to
the output register y
 Output registers are write once

 Initial independence
 Processes can choose their input independently
 Commute property :
 If events e and f are on different processes, they commute:
applying them to a global state G in either order gives
the same state, e(f(G)) = f(e(G))
Assumptions (contd.)

 Asynchrony of events:
 Any receive event can be arbitrarily delayed
 Every message is eventually delivered
 If e is a receive event enabled at G, and s is a
sequence of events applicable to G that does not contain e,
then s followed by e (written se) is also enabled at G

 Agreement
 Two non-faulty processes cannot commit on
different values
 Non-triviality
 Both 0 and 1 should be possible outcomes
 Termination
 A non-faulty process terminates in finite time
Informal proof of the impossibility result

 We show that no protocol can satisfy
agreement, non-triviality and termination in
the presence of even 1 failure

 We show that :
 There is an initial global state in which the system
is non-decisive
 There exists a way to keep the system non-decisive
 Let G.V be the set of decision values reachable from
a global state G
 Since a non-faulty process terminates, G.V is non-empty
 G is:
 Bivalent: G.V = { 0 ,1 } – indecisive
 0-Valent: G.V = { 0 } – always leads to deciding 0
 1-Valent: G.V = { 1 } – always leads to deciding 1
 We show that there exists a bivalent initial state
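The valency definitions can be made concrete on a toy state graph. This is a minimal sketch, assuming a hypothetical finite transition system: the dictionary `successors` (state to next states) and the dictionary `decided` (state to decision value) are illustrative inputs, not part of the protocol model in these notes.

```python
def decision_values(state, successors, decided):
    """Return G.V: the set of decision values reachable from `state`."""
    seen, stack, values = set(), [state], set()
    while stack:
        g = stack.pop()
        if g in seen:
            continue
        seen.add(g)
        if g in decided:                 # a decision has been written in g
            values.add(decided[g])
        stack.extend(successors.get(g, ()))
    return values


def valency(state, successors, decided):
    """Classify a state as 'bivalent', '0-valent' or '1-valent'."""
    values = decision_values(state, successors, decided)
    if values == {0, 1}:
        return "bivalent"
    if values == {0}:
        return "0-valent"
    if values == {1}:
        return "1-valent"
    raise ValueError("no decision reachable from this state")


# Hypothetical toy graph: from G the run can still go either way.
successors = {"G": ["A", "B"], "A": ["D0"], "B": ["D1"]}
decided = {"D0": 0, "D1": 1}
```

In this toy graph `valency("G", successors, decided)` classifies G as bivalent, while A and B are already committed to 0 and 1 respectively.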
Claim: Every consensus protocol has a
bivalent initial state
 Assume claim is false
 Non-triviality : The initial set of global states must
contain 0-valent and 1-valent states
 Adjacent global states: If they differ in the state of
exactly one process
 There must be adjacent 0-valent and 1-valent states
which differ in the state of, say, p
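 Apply to both states the same sequence in which p does not take
any steps (p may have crashed); both runs reach the same decision
 Contradiction: one state is 0-valent and the other is 1-valent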
Claim: There exists a method to keep the
system indecisive
 Event e (on process p) is applicable to G
 Let 𝒢 be the set of global states reachable from
G without applying e
 H = e(𝒢), the set of states obtained by applying e
to each state in 𝒢

 Claim : H contains a bivalent global state

 Assume that H contains no bivalent states
 Claim 1: H contains both 0-valent and 1-
valent states

 Neighbors: 2 global states are neighbors if
one results from the other in a single step
 Claim 2: There exist neighbors G0, G1 such that
 H0 = e(G0) is 0-valent and
 H1 = e(G1) is 1-valent
Claim 2: There exist neighbors G0, G1 such that
H0 = e(G0) is 0-valent and
H1 = e(G1) is 1-valent
 Let t be the smallest sequence of events
applied to G without applying e such that
e(t(G)) has a different valency from e(G)
 Such a sequence exists
 The last two global states in the sequence give us
the required neighbors
 w.l.o.g. let G1 = f(G0) where f is an event on
process q.
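 Case 1: p is different from q
 By the commute property, f is applicable to H0, resulting in H1
 But H0 is 0-valent and H1 is 1-valent: contradiction
 Case 2: p = q
 Consider a deciding run s from G0 in which p takes no steps
(p may have crashed)
 By the commute property, s is also applicable to H0 and H1, so
both s(H0) (0-valent) and s(H1) (1-valent) are reachable from
s(G0), although a decision was already made in s(G0): contradiction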
Application: Terminating Reliable
Broadcast (TRB)
 There are N processes in the system and P0 wants
to broadcast a message to all processes.
 Termination: Every correct process eventually delivers
some message
 Validity: If the sender is correct and broadcasts m then all
correct processes deliver m
 Agreement: If a correct process delivers m then all correct
processes deliver m
 Integrity: Every correct process delivers at most one
message, and if it delivers m (and m ≠ ‘sender faulty’) then
the sender must have broadcast m
TRB is impossible in asynchronous systems
 Can use TRB to solve consensus
 If a process receives ‘sender faulty’ it decides
on 0
 Else it decides on the value of the message
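The reduction above amounts to a one-line decision rule applied to whatever TRB delivers. A minimal sketch, assuming P0 broadcasts its own input bit through TRB; the sentinel `SENDER_FAULTY` and the function name `decide_from_trb` are hypothetical names chosen for illustration.

```python
SENDER_FAULTY = "sender faulty"  # hypothetical sentinel delivered when the sender is faulty


def decide_from_trb(delivered):
    """Map the single message delivered by TRB to a consensus decision."""
    if delivered == SENDER_FAULTY:
        return 0                 # default decision when the sender crashed
    return delivered             # otherwise decide the broadcast bit
```

Because TRB agreement makes all correct processes deliver the same message, they all compute the same decision. A TRB protocol for asynchronous systems would therefore give a consensus protocol, which the impossibility result rules out.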
Faults in a distributed system
 Crash: Processor halts, does not perform any
other action and does not recover
 Crash+Link: Either processor crashes or the
link fails and remains inactive. The network
may get partitioned
 Omission: Process sends or receives only a
proper subset of messages required for
correct operation
 Byzantine: Process can exhibit arbitrary behavior
Consensus in synchronous systems

 There is an upper bound on the
message delay and the durations of actions
performed by the processes

 Consensus under crash failures

 Consensus under Byzantine faults

Consensus under crash failures

 Requirements :
 Agreement: Non faulty processes cannot decide
on different values
 Validity: If all processes propose the same value,
v, then the decided value should be v
 Termination: A non-faulty process decides in a
finite time
 f denotes the maximum number of failures
 Each process maintains V the set of values
proposed by other processes (initially it
contains only its own value)
 In every round a process:
 Sends to all other processes the values from V
that it has not sent before
 After f+1 rounds each process decides on the
minimum value in V
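The rounds described above can be simulated directly. The following is a sketch under stated assumptions, not a definitive implementation: rounds are simulated in lockstep, and the `crashes` argument is a hypothetical fault schedule (process id mapped to the round in which it crashes and the set of processes its last messages still reach).

```python
def flooding_consensus(inputs, f, crashes=None):
    """Simulate the (f+1)-round protocol under crash failures.

    inputs  : dict pid -> proposed value
    f       : maximum number of crashes tolerated
    crashes : hypothetical schedule, pid -> (round, receivers); in that
              round pid's messages reach only `receivers`, then pid halts
    Returns dict pid -> decision for the processes that never crash.
    """
    crashes = crashes or {}
    V = {p: {v} for p, v in inputs.items()}     # values known to p
    sent = {p: set() for p in inputs}           # values p has already sent
    alive = set(inputs)
    for rnd in range(1, f + 2):                 # rounds 1 .. f+1
        inbox = {p: set() for p in inputs}
        for p in sorted(alive):
            new = V[p] - sent[p]                # only values not sent before
            sent[p] |= new
            crash_round, receivers = crashes.get(p, (None, None))
            targets = receivers if crash_round == rnd else set(inputs) - {p}
            for q in targets:
                inbox[q] |= new
        alive -= {p for p, (r, _) in crashes.items() if r == rnd}
        for p in alive:
            V[p] |= inbox[p]
    return {p: min(V[p]) for p in alive}        # decide on the minimum value
```

For example, with `inputs = {0: 0, 1: 1, 2: 1}`, `f = 1` and process 0 crashing in round 1 after reaching only process 1 (`crashes = {0: (1, {1})}`), both survivors decide 0. Running the same schedule with `f = 0` (a single round) leaves process 2 without the value 0, so the survivors disagree, which illustrates why f+1 rounds are used.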
Proof: Agreement

 If value x is in Vi at correct process i then x
belongs to the V of all correct processes
 If x was added to Vi in round k<f+1, all correct
processes will receive that value in round k+1
 If x was added to Vi in the last round (f+1)
then there exists a chain of f+1 processes
that have x in their V. At least one of them is
non-faulty and will broadcast the value to
other correct processes

 Message complexity:
 O((f+1)N^2)
 If each value needs b bits then the total bits
communicated per round is O(bN^3)
 Time:
 Needs f+1 rounds
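A rough count behind these bounds, assuming every process sends one message to each of the other N-1 processes in every round and that a message carries at most N values of b bits each:

```latex
\[
\underbrace{(f+1)}_{\text{rounds}} \cdot \underbrace{N(N-1)}_{\text{messages per round}}
  = O\big((f+1)N^{2}\big) \quad \text{messages in total}
\]
\[
\underbrace{N(N-1)}_{\text{messages per round}} \cdot \underbrace{N b}_{\text{bits per message}}
  = O\big(bN^{3}\big) \quad \text{bits per round}
\]
```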
Consensus under Byzantine faults

 Story:
 N Byzantine generals out to repel an attack by a
Turkish Sultan
 Each general has a preference – attack or retreat
 Coordinated attack or retreat by loyal generals
necessary for victory
 Treacherous Byzantine generals could conspire
together and send conflicting messages to
mislead loyal generals
Byzantine General Agreement
 Reliable messages
 Possible to show that no protocol can tolerate
f failures if N ≤ 3f

 Let's assume N > 4f

BGA Algorithm
 Takes f+1 rounds
 Rotating coordinator processes (kings)
 Pi is the king in round i
 Phase 1:
 Exchange V with other processes
 Based on V decide myvalue (majority value)
 Phase 2:
 Receive the king's value (kingvalue)
 If V has more than N/2 + f copies of myvalue then
V[i]=myvalue else V[i]= kingvalue

 After f+1 rounds decide on V[i]
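The two phases can be sketched as a lockstep simulation for binary values. This is an illustration under assumptions, not the definitive algorithm: rounds are numbered from 0 so that P_k is king in round k, and the `faulty` set and the `lie` function are a hypothetical adversary model introduced only for this example.

```python
from collections import Counter


def default_lie(s, r, k):
    """Hypothetical adversary: value faulty s sends to r in round k."""
    return (s + r + k) % 2


def phase_king(inputs, f, faulty=frozenset(), lie=default_lie):
    """Simulate the rotating-king protocol for binary values, assuming N > 4f.

    inputs : list of proposed values, one per process P0..P(N-1)
    faulty : ids of Byzantine processes (hypothetical fault model)
    Returns dict pid -> decision for the non-faulty processes.
    """
    n = len(inputs)
    value = list(inputs)                        # value[i] plays the role of V[i]
    for k in range(f + 1):                      # f+1 rounds, king is P_k

        def said(s, r):
            return lie(s, r, k) if s in faulty else value[s]

        # Phase 1: exchange values, pick the majority value and its count.
        my, count = {}, {}
        for i in range(n):
            tally = Counter(said(j, i) for j in range(n))
            my[i], count[i] = tally.most_common(1)[0]
        # Phase 2: keep myvalue only with more than N/2 + f copies,
        # otherwise adopt the value received from the king.
        for i in range(n):
            if i in faulty:
                continue
            king_value = lie(k, i, k) if k in faulty else my[k]
            value[i] = my[i] if count[i] > n / 2 + f else king_value
    return {i: value[i] for i in range(n) if i not in faulty}
```

For instance, `phase_king([0, 1, 1, 0, 1], 1, faulty={0})` runs two rounds; the first king (P0) is faulty and splits the correct processes, and the second king (P1) is correct and brings them back to a common value.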

Informal proof argument
 If correct processes agree on a value at the beginning of a round
they continue to do so at the end
 N>4f
 N-N/2 > 2f
 N-f > N/2 +f
 Each process will receive > N/2+f identical messages

 At least one non-faulty process becomes the king (f+1 rounds)

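 In a round with a correct king, if any process chooses myvalue then it
received more than N/2+f myvalue messages
 At most f of those came from faulty processes, so more than N/2
correct processes sent myvalue
 Therefore the king received more than N/2 myvalue messages, i.e.,
kingvalue = myvalue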
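The chain of inequalities used in the first step of the argument can be written out as follows (a worked restatement of the bullets above; the `align*` environment assumes the amsmath package):

```latex
\begin{align*}
N > 4f &\implies \tfrac{N}{2} > 2f \\
       &\implies N - \tfrac{N}{2} > 2f \\
       &\implies N - f > \tfrac{N}{2} + f .
\end{align*}
```

Since at least N - f processes are correct, a value held by all correct processes arrives in more than N/2 + f identical messages at every process, so every correct process keeps it.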

 Knowledge about the system can be
increased by communicating with other processes
 Can use the notion of knowledge to prove
fundamental results, e.g. agreement is
impossible in asynchronous unreliable systems
Notations and definitions
 Ki(b) : process i in group G of processors
knows b
 Someone knows b: Ki(b) holds for some i in G
 Everyone knows b: E(b), i.e., Ki(b) holds for every i in G
 Everyone knows E(b): E(E(b))
 E^k(b), k ≥ 0:
 E^0(b) = b and E^(k+1)(b) = E(E^k(b))
Notations and definitions (contd.)

 Common knowledge C(b): E^k(b) holds for every k
 Hence for any k
C(b) ⇒ E^k(b)
Application: Two generals problem
 The situation:
 Enemy camped in valley
 Two generals on hills separated by the enemy
 Communication by messengers who have to pass through
enemy territory … may be delayed or caught
 Generals need to agree whether to attack or retreat
 A protocol that always solves the problem is impossible
 Can we design a protocol that can lead to
agreement in some run?
Application: Two generals problem (contd.)
 Solution: Don’t start a war if your enemy controls the
 Agreement not possible
 Let r be the run corresponding to the least number
of messages that lead to common knowledge
 Let m be the last message, say it was sent from P
to Q
 Since the channel is unreliable, P does not know whether m
was received, hence P can assert C(b) before m
was sent
 Contradiction: a run with fewer messages than r would then lead
to common knowledge, but r is the minimal run

