Professional Documents
Culture Documents
Da10 Byzantine
Da10 Byzantine
Nikola Knezevic
knl
Monday, 13 December 2010
Different Types of Failures
Crash / Fail-stop
Receive Omissions Send Omissions
General Omission
Arbitrary failures,
authenticated messages
Arbitrary failures
Monday, 13 December 2010
Arbitrary Byzantine Failures
Some subset of the processes may fail:
Send fake messages
Not send any messages
Try to disrupt the computation
Monday, 13 December 2010
Why Byzantine Fault Tolerance?
Does this happen in the real world?
Malfunctioning hardware
Buggy software
Compromised system due to hackers
Assumptions are vulnerabilities
Is the cost worth it?
Hardware is always getting cheaper
Protocols are getting more and more efficient.
Monday, 13 December 2010
Byzantine Fault Tolerance
Seen in the fail-stop model:
Question: How do you build a reliable service?
Answer: State machine replication solves the
problem of crash failures.
This week:
What if we want to build a service that can
tolerate buggy software, hackers, etc... How do
we build a service that tolerates arbitrary
failures?
Monday, 13 December 2010
Consensus
Key building block for state machine
replication.
Agreement: not two processes decide on
different values.
Validity: every decision is the initial value of
some process.
Termination: every correct process eventually
decides.
Monday, 13 December 2010
Validity
(Strong) Validity: If all correct processes
start with the same initial value v, then
value v is the only decision value of
correct processes.
(Weak) Validity: If there are no failures,
then every decision is the initial value of
some process.
Monday, 13 December 2010
Achieving Fault Tolerance
How many arbitrary failures can we
tolerate?
Assume a system of n processes.
For crash failures:
If failure-detector P, then n-1 failures.
If failure-detector <>P, then < n/2 failures.
For arbitrary failures:
If failure-detector P, then < n/3 failures.
Monday, 13 December 2010
Number of Possible Failures
Lemma 1: In a system with n=3 and 1 failure,
there is no algorithm that solves consensus.
p1
p2
p3
Assume there exists some such protocol A.
fo
c
u
s
o
n
th
e
w
h
ite
b
o
a
rd
Monday, 13 December 2010
Another look on why n>3t
read-write register
t faulty, liveness => n-t responses
W set: n-t responses
R set: n-t responses
W and R have n-2t in common
t may not be faulty
W and R have n-3t in common
=> n>3t
10
Monday, 13 December 2010
Number of Possible Failures
Lemma 1: In a system with n processes & t = n/3
failures, there is no algorithm that solves
consensus.
Proof: By reduction. Given an algorithm A that
solves consensus for system (3t,t) where n=3t, we
show how to solve consensus in a system (3,1)
where n=3 and t=1. Since we know this is
impossible, we get a contradiction.
Monday, 13 December 2010
Number of Possible Failures
Lemma 1: In a system with n processes and t
= n/3 failures, there is no algorithm that solves
consensus.
Proof: (Continued)