Heartbeat Protocols

• Protocol:
– fixed (maximum) number of participants
– exchanging heartbeats continuously

• Goal:
– inactivation of one participant ⇒ inactivation of all participants (within a time limit)
Applications
• Core component of Linux-HA (High-Availability) (http://www.linux-ha.org/Heartbeat)
– Detection of death-of-node
– Cluster management.
• Economical Fault-Tolerant Systems.
• Failure Detectors
• Alert communication primitives in TCP.
Protocols
• Heartbeat:
A message without any specific content.

• Types of heartbeat protocols:
1. Binary
2. Two-phase
3. Static
4. Expanding
5. Dynamic
The binary heartbeat protocol
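[Figure: p[0] and p[1] exchange heartbeats over two channels; the legend distinguishes the heartbeats of p[0] from those of p[1], and a round of length tmax is marked on the timeline.]

• tmax = maximum waiting time for a round
• tmin = minimum waiting time for a round
• t is initially tmax and is recalculated in each round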
Inactivation of p[1] and its effect
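[Figure: timeline of p[0] and p[1] with tmax = 12 and tmin = 3 (initially t = tmax). After a first round with t = 12, p[1] becomes voluntarily inactive. p[0] then times out with t = 12, t = 6 and t = 3, and becomes non-voluntarily inactive.]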
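
The waiting times in this example (12, 6, 3) are consistent with p[0] halving t after every round in which no heartbeat arrives, and becoming inactive once t would drop below tmin. The following is a minimal sketch under that assumption; the halving rule and the function name `waiting_times` are assumptions for illustration and are not stated on the slides.

```python
def waiting_times(tmax, tmin):
    """Return the waiting times p[0] uses in consecutive rounds without a reply.

    Assumption (not stated on the slides): t starts at tmax, is halved
    after each silent round, and p[0] becomes inactive once the halved
    value would fall below tmin.
    """
    t = tmax
    rounds = []
    while t >= tmin:
        rounds.append(t)
        t = t / 2  # accelerate: wait half as long in the next round
    return rounds


if __name__ == "__main__":
    # Values from the example: tmax = 12, tmin = 3
    print(waiting_times(12, 3))
```

With tmax = 12 and tmin = 3 the sketch yields three silent rounds before p[0] gives up, matching the three timeouts that follow p[1]'s inactivation in the figure.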
Two-phase and revised versions of heartbeat protocols
The static heartbeat protocol
• p[0] and n other processes (n is fixed and known a priori)
• Binary heartbeat protocol between p[0] and p[i] (for each 1 ≤ i ≤ n)

• tm[i] = waiting time for the i-th process

• Waiting time for the next round = min(tm[1], …, tm[n])
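
The rule for the next round can be sketched in a few lines. This is only an illustration: the dictionary layout and the name `next_round_wait` are assumptions, and the example values are made up.

```python
def next_round_wait(tm):
    """Return the waiting time for the next round.

    tm maps a process index i (1..n) to its current waiting time tm[i].
    The next round waits for the smallest of them, so the process
    closest to being declared failed is re-checked first.
    """
    if not tm:
        raise ValueError("the static protocol needs at least one p[i]")
    return min(tm.values())


if __name__ == "__main__":
    # Hypothetical state: p[2] missed a round, so its waiting time is shorter.
    tm = {1: 10, 2: 5, 3: 10}
    print(next_round_wait(tm))
```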
The expanding heartbeat protocol
• Extension of the static heartbeat protocol.
• At the start, only p[0] is present.
• Other processes join later.
• To join, p[i] sends a heartbeat.
• The static protocol then runs between p[0] and all joined p[i].
The dynamic heartbeat protocol
• Like the expanding protocol, but other processes
– can join
– can leave (permanently) at will.
• To join, p[i] sends heartbeat(“true”).
• To leave, p[i] sends heartbeat(“false”).
• The expanding heartbeat protocol then runs among the processes that have not left.
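
The join and leave rules can be illustrated with a small membership tracker on p[0]'s side. This is a sketch only: the class `Membership` and its method names are hypothetical, and timing is left out entirely.

```python
class Membership:
    """Tracks which p[i] currently take part, as seen by p[0]."""

    def __init__(self):
        self.joined = set()  # processes that joined and have not left
        self.left = set()    # processes that left; leaving is permanent

    def on_heartbeat(self, i, flag):
        """Handle a heartbeat from p[i]: True means join/stay, False means leave."""
        if i in self.left:
            return  # a process that left cannot rejoin
        if flag:
            self.joined.add(i)
        else:
            self.joined.discard(i)
            self.left.add(i)

    def active(self):
        """Return the indices of the processes that have not left."""
        return sorted(self.joined)


if __name__ == "__main__":
    m = Membership()
    m.on_heartbeat(1, True)
    m.on_heartbeat(2, True)
    m.on_heartbeat(1, False)  # p[1] leaves permanently
    m.on_heartbeat(1, True)   # ignored
    print(m.active())
```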
General Requirements
• “… if p[0] does not receive any beat
message for a period of 2tmax, then p[0]
becomes inactive”

[R1]: failure of p[1] ⇒ p[0] also terminates (within 2*tmax).
General Requirements cont’d
• “… If every process in the network continues to choose to remain active, then all processes remain active indefinitely”

[R2]: p[0] is active && every sent message is received ⇒ p[i] will never be inactivated.

[R3]: p[i] is active && every sent message is received ⇒ p[0] will never be inactivated.
Results for (revised) binary, two-phase and static protocols
• Different data sets for tmin and tmax for each protocol.

tmin    1    4    5    9   10
tmax   10   10   10   10   10
R1      F    F    F    T    T
R2      T    T    T    T    F
R3      T    T    T    T    F
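
The five columns of this table are consistent with a simple reading: R1 holds when 2*tmin > tmax, and R2 and R3 hold when tmin < tmax. The sketch below encodes the table and checks that reading against it. The conditions are an observation over these five data points, not a proof, and the names `RESULTS`, `predicted` and `all_hold` are introduced here for illustration.

```python
TMAX = 10

# tmin -> (R1, R2, R3), copied from the table for the
# (revised) binary, two-phase and static protocols.
RESULTS = {
    1: (False, True, True),
    4: (False, True, True),
    5: (False, True, True),
    9: (True, True, True),
    10: (True, False, False),
}


def predicted(tmin, tmax):
    """Conjectured outcome: R1 needs 2*tmin > tmax, R2 and R3 need tmin < tmax."""
    r1 = 2 * tmin > tmax
    r2 = tmin < tmax
    r3 = tmin < tmax
    return (r1, r2, r3)


def all_hold():
    """Return the tested tmin values for which R1, R2 and R3 all hold."""
    return [tmin for tmin, res in RESULTS.items() if all(res)]


if __name__ == "__main__":
    for tmin, res in RESULTS.items():
        print(tmin, res, predicted(tmin, TMAX) == res)
    print(all_hold())
```

Among the tested values, only tmin = 9 satisfies all three requirements.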
Counter examples for R1
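[Figure: timeline of p[0] and p[1] for the case 2tmin < tmax. p[1] becomes voluntarily inactive; p[0] goes through a series of timeouts (intervals labelled tmax, tmax and 2tmax-tmin) and then becomes non-voluntarily inactive.]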
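Counter examples for R1 cont’d
[Figure: timeline of p[0] and p[1] for the case 2tmin <= tmax. p[1] becomes voluntarily inactive; p[0] goes through a series of timeouts (intervals labelled tmax, tmax and tmax/2) and then becomes non-voluntarily inactive.]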
Counter examples for R2
[Figure: timeline of p[0] and p[1] for the case tmin = tmax. Intervals labelled tmax and 3tmax-tmin end in timeouts, after which both processes become non-voluntarily inactive.]
Counter examples for R3
[Figure: timeline of p[0] and p[1] for the case tmin = tmax. Two intervals labelled tmax end in timeouts, after which a process becomes non-voluntarily inactive.]
Results for expanding and dynamic protocols

tmin    1    4    5    9   10
tmax   10   10   10   10   10
R1      F    F    F    T    T
R2      T    T    F    F    F
R3      T    T    T    T    F
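
To read this table column by column, the sketch below lists which requirements fail for each tested tmin. It only restates the tabulated results; the names `RESULTS`, `failing` and `all_hold` are introduced here for illustration.

```python
# tmin -> (R1, R2, R3), copied from the table for the
# expanding and dynamic protocols (tmax = 10 in every column).
RESULTS = {
    1: (False, True, True),
    4: (False, True, True),
    5: (False, False, True),
    9: (True, False, True),
    10: (True, False, False),
}

NAMES = ("R1", "R2", "R3")


def failing(tmin):
    """Return the names of the requirements that fail for this tmin."""
    return [name for name, ok in zip(NAMES, RESULTS[tmin]) if not ok]


def all_hold():
    """Return the tested tmin values for which R1, R2 and R3 all hold."""
    return [tmin for tmin, res in RESULTS.items() if all(res)]


if __name__ == "__main__":
    for tmin in RESULTS:
        print(tmin, failing(tmin))
    print(all_hold())
```

None of the five tested settings satisfies all three requirements for these two protocols: every column has at least one F.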
Counter examples for R2
[Figure: timeline of p[0] and p[1] with tmin = 9 and tmax = 10. Intervals labelled tmax and tmin end in timeouts, and the 3tmax-tmin time limit runs out.]
Q&A
