Heartbeat Protocols: - Protocol
• Protocol:
– fixed (maximum) number of participants
– participants exchange heartbeats continuously
• Goal:
– inactivation of one participant leads to inactivation of all
(within a time limit)
Applications
• Core component of Linux-HA (High-Availability)
(http://www.linux-ha.org/Heartbeat)
– Detection of death-of-node
– Cluster management.
• Economical Fault-Tolerant Systems.
• Failure Detectors
• Alert communication primitives in TCP.
Protocols
• Heartbeat:
A message without any specific content.
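[Figure: binary heartbeat protocol between two processes connected by channels, with tmax = 12 and tmin = 3. The timeout period t starts at tmax and shrinks over successive timeouts (t = 12, 6, 3) until the process becomes non-voluntarily inactive; t = tmax marks a reset to the full period. A process may also become voluntarily inactive.]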
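The timeout values in the figure (12, 6, 3) suggest that the waiting period is halved after every timeout in which no heartbeat arrived. The sketch below works through that arithmetic. It assumes real-valued halving and inactivation as soon as the halved period drops below tmin; the function name `waiting_schedule` is hypothetical and not taken from the slides.

```python
def waiting_schedule(tmin, tmax):
    """Timeout periods p[0] waits through when no heartbeat ever arrives.

    Assumption: t starts at tmax, is halved after each silent timeout,
    and the process becomes inactive once the halved t is below tmin.
    """
    if tmin <= 0 or tmax < tmin:
        raise ValueError("expected 0 < tmin <= tmax")
    schedule = []
    t = float(tmax)
    while t >= tmin:
        schedule.append(t)  # wait t time units, then time out
        t = t / 2           # no heartbeat received: halve the period
    return schedule


if __name__ == "__main__":
    periods = waiting_schedule(3, 12)
    print(periods)       # [12.0, 6.0, 3.0]
    print(sum(periods))  # 21.0, which equals 2*tmax - tmin for these values
```

With tmax = 12 and tmin = 3 the total silent time before inactivation is 21, matching the 2tmax - tmin label that appears in the later counterexample figures.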
Two-phase and revised versions of heartbeat protocols.
The static heartbeat protocol
• p[0] and n other processes p[1], ..., p[n]
(n is fixed and known a priori)
• A binary heartbeat protocol runs between
p[0] and p[i] (for each 1 ≤ i ≤ n)
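One possible reading of the static protocol, sketched below, is that p[0] keeps a separate waiting period for each p[i] and uses the smallest one as its next timeout. This is an assumption for illustration only: the class `StaticCoordinator`, its fields, and the per-participant halving rule are hypothetical and not confirmed by the slides.

```python
class StaticCoordinator:
    """Hypothetical sketch of p[0] talking to p[1..n] in the static protocol."""

    def __init__(self, n, tmin, tmax):
        self.tmin = tmin
        self.tmax = tmax
        # rcvd[i]: did a heartbeat from p[i] arrive in the current round?
        self.rcvd = {i: False for i in range(1, n + 1)}
        # tm[i]: current waiting period associated with p[i]
        self.tm = {i: float(tmax) for i in range(1, n + 1)}
        self.active = True

    def on_beat(self, i):
        """Record a heartbeat from p[i]."""
        if self.active:
            self.rcvd[i] = True

    def on_timeout(self):
        """Return the next waiting period, or None if p[0] becomes inactive."""
        for i in self.tm:
            # Reset the period for responsive participants, halve it otherwise.
            self.tm[i] = float(self.tmax) if self.rcvd[i] else self.tm[i] / 2
            self.rcvd[i] = False
        t = min(self.tm.values())
        if t < self.tmin:
            self.active = False  # non-voluntary inactivation
            return None
        return t  # the caller would now send a heartbeat to every p[i]


if __name__ == "__main__":
    p0 = StaticCoordinator(n=2, tmin=3, tmax=12)
    p0.on_beat(1)
    p0.on_beat(2)
    print(p0.on_timeout())  # 12.0: both participants answered
    p0.on_beat(1)
    print(p0.on_timeout())  # 6.0: p[2] was silent once
```

A single silent participant is enough to drive the shared timeout down, which is how the failure of one process propagates to p[0] in this sketch.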
• [R1]:
Failure of p[1] =>
p[0] also terminates
(within 2*tmax).
General Requirements (cont’d)
• “…If every process in the network
continues to choose to remain active, then all
processes remain active indefinitely”
• [R2]:
If p[0] is active &&
every sent message is received =>
p[i] will never be inactivated.
• [R3]:
If p[i] is active &&
every sent message is received =>
p[0] will never be inactivated.
Results for (revised) binary, two-phase and static protocols
• Different data sets for tmin and tmax for each protocol (T = requirement holds, F = requirement fails).

tmin    1    4    5    9   10
tmax   10   10   10   10   10
R1      F    F    F    T    T
R2      T    T    T    T    F
R3      T    T    T    T    F
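The R1 row can be reproduced by hand under a deliberately simplified model, sketched below. It assumes that p[1] fails right after replying, so p[0] first completes one full round of tmax and then runs through the halving schedule with no further replies. This is not the model that was checked for the slides, and the helper names are hypothetical; it only illustrates why R1 (inactivation within 2*tmax) fails when 2*tmin <= tmax.

```python
def waiting_schedule(tmin, tmax):
    """Silent timeout periods of p[0]: tmax, tmax/2, ... while still >= tmin."""
    schedule = []
    t = float(tmax)
    while t >= tmin:
        schedule.append(t)
        t = t / 2
    return schedule


def detection_delay(tmin, tmax):
    """Assumed worst case: one full round in which the last reply is still
    credited, followed by the whole silent schedule."""
    return tmax + sum(waiting_schedule(tmin, tmax))


def r1_holds(tmin, tmax):
    """R1 in this simplified model: p[0] is inactive within 2*tmax."""
    return detection_delay(tmin, tmax) <= 2 * tmax


if __name__ == "__main__":
    row = ["T" if r1_holds(tmin, 10) else "F" for tmin in (1, 4, 5, 9, 10)]
    print(row)  # ['F', 'F', 'F', 'T', 'T']
```

In this model the requirement holds exactly when the first halving already falls below tmin, that is, when 2*tmin > tmax.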
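Counter examples for R1
[Figure: timeline for p[0] and p[1] with 2tmin < tmax. p[1] becomes voluntarily inactive; p[0] goes through successive timeouts (intervals labeled tmax, tmax and 2tmax - tmin) before it becomes non-voluntarily inactive.]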
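Counter examples for R1 (cont’d)
[Figure: timeline for p[0] and p[1] with 2tmin <= tmax. p[1] becomes voluntarily inactive; p[0] goes through successive timeouts (intervals labeled tmax, tmax and tmax/2) before it becomes non-voluntarily inactive.]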
Counter examples for R2
[Figure: timeline for p[0] and p[1] with tmin = tmax. Intervals are labeled tmax, tmax and 3tmax - tmin; both processes end up non-voluntarily inactive.]
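The 3tmax - tmin label in the figure is read here as the longest time p[1] waits for a heartbeat from p[0] before it becomes non-voluntarily inactive. The sketch below encodes only that rule. The class `Responder` and its method names are hypothetical, and the interpretation of the label is an assumption rather than something the slides state explicitly.

```python
class Responder:
    """Hypothetical sketch of p[1]: reply to heartbeats, give up after a
    silent window of 3*tmax - tmin (assumed meaning of the figure label)."""

    def __init__(self, tmin, tmax):
        self.window = 3 * tmax - tmin
        self.last_beat = 0.0
        self.active = True

    def on_beat(self, now):
        """A heartbeat from p[0] arrived at time `now`."""
        if not self.active:
            return None
        self.last_beat = now
        return "reply"  # p[1] answers with its own heartbeat

    def on_tick(self, now):
        """Check the silent window; return whether p[1] is still active."""
        if self.active and now - self.last_beat >= self.window:
            self.active = False  # non-voluntary inactivation
        return self.active


if __name__ == "__main__":
    p1 = Responder(tmin=10, tmax=10)  # window = 20
    print(p1.on_beat(5))   # reply
    print(p1.on_tick(24))  # True: only 19 time units of silence
    print(p1.on_tick(25))  # False: the window is over
```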
Counter examples for R3
[Figure: timeline for p[0] and p[1] with tmin = tmax. After two timeout intervals of tmax, a process becomes non-voluntarily inactive.]
Results for expanding and dynamic protocols

tmin    1    4    5    9   10
tmax   10   10   10   10   10
R1      F    F    F    T    T
R2      T    T    F    F    F
R3      T    T    T    T    F
Counter examples for R2
[Figure: timeline for p[0] and p[1] with tmin = 9 and tmax = 10. Intervals are labeled tmax and tmin; the counterexample ends when the 3tmax - tmin waiting time is over.]
Q&A