9.DS Clock Synchronization


DISTRIBUTED SYSTEMS (CS 3006)

National Institute of Technology Rourkela


Measuring Time
Traditionally, time was measured astronomically
Transit of the sun (highest point in the sky)
Solar day and solar second

Problem: Earth’s rotation is slowing down
Days get longer and longer
300 million years ago there were about 400 days in the year

The modern way to measure time is the atomic clock
Based on transitions in the Cesium-133 atom
Still need to correct for Earth’s rotation

Result: Coordinated Universal Time (UTC)
UTC is available via radio signal, telephone line, satellite (GPS)

Hw/Sw Clocks
• Physical clocks in computers are realized as crystal oscillation
counters at the hardware level
– Correspond to counter register H(t)
– Used to generate interrupts
• Usually scaled to approximate physical time t, yielding software clock
C(t), C(t) = αH(t) + β
– C(t) measures time relative to some reference event, e.g., 64 bit counter for
# of nanoseconds since last boot
– Simplification: C(t) carries an approximation of real time
– Ideally, C(t) = t (never 100% achieved)
– Note: Values given by two consecutive clock queries will differ only if the clock
resolution is smaller than the time between the two queries
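
The scaling C(t) = αH(t) + β can be illustrated with a small sketch. The function name `software_clock` and the numbers used (a 32768 Hz counter and a reference event at 100 s) are assumptions chosen for illustration, not values from these slides.

```python
def software_clock(h_ticks, alpha, beta):
    """Return C(t) = alpha * H(t) + beta for a raw counter reading H(t)."""
    return alpha * h_ticks + beta


# Assumed example: the counter ticks 32768 times per second,
# so one tick is worth 1/32768 seconds (alpha).
ALPHA = 1 / 32768
# Assumed example: the reference event happened at t = 100 s (beta).
BETA = 100.0

# After 65536 ticks (two seconds of counting) the software clock has advanced by 2 s.
reading = software_clock(65536, ALPHA, BETA)
print(reading)
```

In a real system α and β would be tuned continuously so that C(t) stays close to t.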

Problems with H/W or S/W Clock
• Skew: Disagreement in the reading of two clocks

• Drift: Difference in the rate at which two clocks count the time
– Due to physical differences in crystals, plus heat, humidity, voltage, etc.
– Accumulated drift can lead to significant skew

• Clock drift rate: Difference in rate between a perfect reference clock and a physical clock
– Usually about 10^-6 sec/sec for ordinary quartz clocks, and 10^-7 to 10^-8 sec/sec for high-precision clocks
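
To see how accumulated drift turns into skew, the following sketch computes the worst-case skew between two clocks and how often they must be resynchronized. It assumes the two clocks drift in opposite directions at the same maximum rate; the function names are hypothetical.

```python
def max_skew(drift_rate, elapsed_seconds):
    """Worst-case skew between two clocks that drift in opposite directions."""
    return 2 * drift_rate * elapsed_seconds


def resync_interval(drift_rate, max_allowed_skew):
    """Longest time between synchronizations that keeps the skew within the bound."""
    return max_allowed_skew / (2 * drift_rate)


# Two ordinary quartz clocks (drift rate 10^-6 sec/sec) left alone for one day.
print(max_skew(1e-6, 24 * 60 * 60))

# How often to resynchronize them to stay within 1 millisecond of each other.
print(resync_interval(1e-6, 0.001))
```

With these assumed numbers the clocks can diverge by roughly 0.17 seconds per day, so a 1 ms bound needs a resynchronization about every 500 seconds.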

Challenges
• Two clocks do not agree perfectly
• Skew: The time difference between two clocks
• Quartz oscillators vibrate at different rates
• Drift: The difference in rates of two clocks
• If we had two perfect clocks
– Skew = 0
– Drift = 0

Clock Skew
• When we detect that a clock has a skew
• E.g., it is 5 seconds behind or 5 seconds ahead
• What can we do?

Clock Skew: Impacts & Solutions
• When we detect that a clock has a skew
• E.g., it is 5 seconds behind: we can advance it by 5 seconds to correct it, or run it faster until it catches up
• Or it is 5 seconds ahead: pushing it back 5 seconds is a bad idea; instead, run it slower until real time catches up
• This does not guarantee a correct clock in the future
– Need to check and adjust periodically
• Problems due to skew:
– Message was received before it was sent
– Document closed before it was saved, etc.
– We want monotonicity: time always increases

How do clocks synchronize?
Obtain the time from a time server:

Client -> Server: Request time
Server -> Client: Reply, Time: 00:05:20

A dedicated time server is allocated for clock synchronization

Causes of Inaccurate Time
• Delays in message transmission
• Delays due to processing
• Server’s time may be inaccurate

Clock Inaccuracies
• Clock inaccuracies cause problems and make it harder to solve tasks in distributed systems.
• The clocks of different nodes need to be synchronized to limit errors.
• Synchronized clocks are needed for efficient communication and resource sharing.
• Clocks need to be monitored and adjusted continuously; otherwise, they drift apart.
• Similarly, clock skew introduces a mismatch between the time values of two clocks.
• Both drift and skew must be addressed to make efficient use of the features of distributed systems.

• Example: Clock synchronization using token ring


Clock Synchronization
Clock synchronization aims to coordinate the independent clocks of the individual nodes.

Even when initially set accurately, real clocks will differ after some amount of time due to clock drift,
which is caused by clocks counting time at slightly different rates.

Solutions
A synchronization solution using a central server is trivial: the server dictates the system
time. (But the server is a single point of failure.)

Due to the lack of a global time/clock, achieving clock synchronization in distributed systems is difficult.

Two Solutions: (Physical & Logical Clock Synchronization)

(1) Popular algorithms for physical clock synchronization in distributed systems:
(a) Cristian’s algorithm & (b) Berkeley algorithm

(2) Logical clock concepts for logical clock synchronization in distributed systems:
(a) Lamport timestamps & (b) Vector clocks

Solutions
Wired Distributed Systems: Internet, LAN, MAN, WAN, PAN, etc.
Network Time Protocol (NTP): Works on a client-server architecture
Uses User Datagram Protocol (UDP) message passing

Wireless Distributed Systems: WSN, VANET, MANET, FANET, ANET, etc.
The problem becomes even more challenging due to the
possibility of collision of the synchronization packets on the
wireless medium and the higher drift rate of clocks on low-cost
wireless devices.

Wired-Cum-Wireless Distributed Systems: Wireless Internet
Collision of synchronization packets
Prediction of delay is challenging
Traditional protocols (Latency & Bandwidth) may not be suitable for real-time applications

Cristian's algorithm (Physical Clock Synchronization)

Introduced by Flaviu Cristian in 1989

Primarily used in low-latency intranets.

Though the algorithm is simple, the obtained clock value is probabilistic:
it only achieves synchronization if the round-trip time (RTT) of the request is short
compared with the required accuracy.

It also suffers in implementations using a single server, making it unsuitable for many
distributed applications where redundancy may be crucial.

Cristian's algorithm

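[Figure: the client sends a request (Req) at time T0 and receives the reply at time T1; the time server, after its interrupt handling time, replies with Cutc.]

Cutc = Current UTC of Time Server

Best estimate of message propagation time = (T1 - T0)/2
(If the server's interrupt handling time I is known, the estimate becomes (T1 - T0 - I)/2)

Both T0 and T1 are measured using the same clock

Tnew = Tserver + (T1 - T0)/2, i.e., Cutc + message propagation time
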
Cristian's algorithm
Cristian's algorithm works between a process P and a time server S connected to a time reference source.

Step 1: P requests the time from S
Step 2: S receives the request from P
Step 3: S prepares a response and appends the time T from its own clock
Step 4: S sends the time to P
Step 5: P then sets its time to T + RTT/2, where RTT is the round-trip time (request time + response time)
Stop

Assumption: Request time = response time (may be reasonable for a LAN, but not always)
Further accuracy can be gained by making multiple requests to S and using the response with the shortest RTT.

We can estimate the accuracy of the system as follows.

Let min be the minimum time to transmit a message one way.
Transmission time includes the message preparation time, i.e., the time a node needs to get ready to send a message.
The earliest point at which S can place the time T in the reply is min after P sent its request; the latest is min before P receives the reply.
Therefore, the time at S when P receives the reply is in the range (T + min) to (T + RTT - min).
The width of this range is (RTT - 2*min).
This gives an accuracy of ±(RTT/2 - min).
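
The steps and the accuracy bound above can be sketched as follows. This is a minimal illustration that works on already measured values; the function names and the sample numbers are assumptions, and no real network call is made.

```python
def cristian_estimate(t0, t1, server_time, min_delay=0.0):
    """Estimate the new client time from one request/reply exchange.

    t0, t1: client clock when the request was sent and the reply received.
    server_time: the time T that the server placed in its reply.
    min_delay: assumed minimum one-way transmission time (min).
    Returns (new_time, accuracy, rtt).
    """
    rtt = t1 - t0
    new_time = server_time + rtt / 2   # T + RTT/2
    accuracy = rtt / 2 - min_delay     # +/- (RTT/2 - min)
    return new_time, accuracy, rtt


def best_of(samples, min_delay=0.0):
    """Among several (t0, t1, server_time) samples, use the one with the shortest RTT."""
    estimates = [cristian_estimate(t0, t1, ts, min_delay) for t0, t1, ts in samples]
    return min(estimates, key=lambda estimate: estimate[2])


# Assumed sample: request sent at 10.0, reply received at 10.5, server said 100.0.
print(cristian_estimate(10.0, 10.5, 100.0, min_delay=0.125))
```

Picking the sample with the shortest RTT narrows the range in which the server time can lie, which is why repeated requests improve accuracy.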

Berkeley algorithm (Physical Clock Synchronization)
• Developed by Gusella and Zatti at the University of California, Berkeley in 1989.
• Assumes no machine has an accurate time source.
• Intended for use within intranets.
• The server process (called the leader) periodically polls the other follower processes, requesting their time.
• Based on the answers, it computes an average time and tells all the other nodes to advance
their clocks to the new time, or to slow their clocks down until the specified reduction has
been achieved.
• The time daemon’s time must be set manually by the operator periodically.

Example
The time daemon sends its own clock value (i.e., 3:00) and asks all other nodes
for their clock values.

[Figure: the Time Daemon (3:00) is connected over the network to P1 (2:50) and P2 (3:25).]

Example
The nodes answer with the difference between their time and the time at the Time
Daemon (i.e., -10 and +25 minutes).

[Figure: Time Daemon at 3:00; P1 at 2:50 replies -10 (2:50 - 3:00); P2 at 3:25 replies +25 (3:25 - 3:00).]

Example
The time daemon computes the average of the times of all the nodes, including itself, i.e.,
(3:00 + 2:50 + 3:25)/3 = 9:15/3 = 3:05. Equivalently, the average offset is (0 - 10 + 25)/3 = +5 minutes.

[Figure: after adjustment, the Time Daemon (3:00 + 5), P1 (2:50 + 15) and P2 (3:25 - 20) all show 3:05.]

The time daemon tells the other nodes to adjust their clock values by sending each one the
difference to apply (i.e., +15 and -20) instead of the average value.

Berkeley algorithm
A leader is chosen via an election process such as Chang and Roberts algorithm.

The leader polls the followers who reply with their time in a similar way to Cristian's algorithm.

The leader observes the round-trip time (RTT) of the messages and estimates the time of
each follower and its own.

The leader then averages the clock times, ignoring any values it receives far outside the values
of the others.

Instead of sending the updated current time back to the other process, the leader then sends
out the amount (positive or negative) that each follower must adjust its clock.

This avoids further uncertainty due to RTT at the follower processes.

With this method, the average cancels out the individual clocks' tendencies to drift.
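
The averaging step can be sketched as follows, using the numbers from the example slides expressed in minutes (3:00 = 180, 2:50 = 170, 3:25 = 205). The function name and the `tolerance` parameter are assumptions for illustration, and RTT compensation is left out to keep the sketch short.

```python
def berkeley_adjustments(leader_time, follower_times, tolerance=None):
    """Return the adjustment each node must apply to its own clock.

    follower_times maps a node name to the time it reported.
    Offsets larger than `tolerance` are left out of the average,
    but those nodes still receive an adjustment.
    """
    offsets = {"leader": 0.0}
    for name, reported in follower_times.items():
        offsets[name] = reported - leader_time

    used = [o for o in offsets.values() if tolerance is None or abs(o) <= tolerance]
    average_offset = sum(used) / len(used)

    # Each node moves by the gap between the average and its own offset.
    return {name: average_offset - offset for name, offset in offsets.items()}


print(berkeley_adjustments(180, {"P1": 170, "P2": 205}))
```

Sending the adjustment rather than the new absolute time means a follower does not need to know how long the reply took to arrive.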
Limitations of the Berkeley algorithm
Gusella and Zatti released results involving 15 computers whose clocks were synchronized to
within about 20-25 milliseconds using their protocol.

Computer systems normally avoid rewinding their clock when they receive a negative clock
alteration from the leader. This would break the property of monotonic time, which is a
fundamental assumption in certain algorithms in the system itself or in programs such
as make.

A simple solution to this problem is to halt the clock for the duration specified by the leader,
but this simplistic solution can also cause problems, although they are less severe. For minor
corrections, most systems slow the clock (known as "clock slew"), applying the correction over
a longer period of time.

Often, any client whose clock differs by a value outside of a given tolerance is disregarded
when averaging the results. This prevents the overall system time from being drastically
skewed due to one erroneous clock.
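
Clock slew can be illustrated with a small simulation. This is a sketch under assumed parameters (a clock that ticks once per second and may be slowed or sped up by at most 25 percent); the function name is hypothetical and real operating systems use much smaller slew rates.

```python
def slewed_readings(start, offset, ticks, tick=1.0, max_slew=0.25):
    """Return successive clock readings while `offset` is absorbed gradually.

    offset < 0 means the clock is ahead and must lose time.
    max_slew is the largest fraction of a tick that may be added or removed.
    """
    readings = []
    now = start
    remaining = offset
    limit = max_slew * tick
    for _ in range(ticks):
        # Apply as much of the remaining correction as the slew limit allows.
        step = max(-limit, min(limit, remaining))
        now += tick + step
        remaining -= step
        readings.append(now)
    return readings


# A clock that is 0.5 s ahead loses the excess over two ticks and never runs backwards.
print(slewed_readings(0.0, -0.5, 4))
```

Because each tick still advances the clock by a positive amount, the readings stay monotonic even for a negative correction.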
Logical Clock
Due to the lack of a global physical clock in a distributed system and the limitations of the Berkeley
algorithm, Lamport introduced the concept of a logical clock, based on event ordering instead of
physical time.

Two event-ordering clocks:
(i) Lamport's Clock (also known as Scalar Clock), for partial ordering of events
(ii) Vector Clocks (a modification of Lamport clocks)

Lamport's Logical Clock: Can be considered a counter/integer value

Lamport defined rules for incrementing the counter values, which are assigned to
events in the processes of a distributed system

The clock increment per event (called the drift rate d in these slides) is usually assumed to be 1 unit; however, a larger value can be assumed

The network delay is usually assumed to be 1 unit; however, a larger value can be assumed

Each process has n instructions or tasks

What is an event?

Send, Receive, Print, etc.

Three Conditions proposed by Lamport:
(1) If a and b are events in the same process and a occurs before b, then a -> b (happened-before relation) and C(a) < C(b)

(2) If a is the sending event of message m and b is the receive event of message m, then a -> b and C(a) < C(b)

(3) If a -> b and b -> c, then a -> c (transitive relation)

where a, b and c are events that may be executed in the same or in different processes, and

C(x) = timestamp of event x

Examples of events: Sending, Receiving, Executing, Printing, etc.

Logical Clocks
Physical clocks are physical entities that assign physical times to events,
whereas logical clocks are simply a conceptualization of a mathematical function that
assigns numbers to events. These numbers act as timestamps that help in ordering events.

Logical clocks order events logically by assigning logical timestamps, instead of relying on
physical time.

In effect, the logical clock determines the order in which events of different parallel, concurrent or
independent processes are considered to have occurred.

In practice, this means implementing a protocol on all machines within the distributed system, so that the
machines are able to maintain a consistent ordering of events within some virtual timespan.

Logical Clocks
More formally, each process Pi has a clock Ci which is a function from events to the integers.

The timestamp of an event e in Pi is Ci(e).

The system clock C is the function from events to the integers with C(e) = Ci(e), where e is an event in Pi.

Causal ordering property:

Given two events e1 and e2 where one is caused by the other (e1 contributes to e2 occurring), the
timestamp of the causing event (e1) is less than the timestamp of the other event (e2).

Implementation Rules
To provide this functionality any Logical Clock must provide 2 rules:

Rule 1: this determines how a local process updates its own clock when an event occurs.

Before executing an event (excluding the event of receiving a message), increment the local clock by 1.
local_clock = local_clock + 1

Rule 2: this determines how a local process updates its own clock when it receives a message from another
process. This can be described as how the process brings its local clock in line with information about the
global time.

When receiving a message (the message must include the sender's local clock value), set your local
clock to the maximum of the received clock value and the local clock value. After this, increment
your local clock by 1.

1. local_clock = max(local_clock, received_clock)

2. local_clock = local_clock + 1

3. message becomes available.
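
The two rules can be sketched as a small class. This is a minimal illustration with an increment of 1; the class and method names are assumptions, and message transport is left out.

```python
class LamportClock:
    """Scalar logical clock following Rule 1 and Rule 2 with an increment of 1."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """Rule 1: increment the local clock before a local event."""
        self.time += 1
        return self.time

    def send(self):
        """Sending is an event: tick, then attach the timestamp to the message."""
        return self.tick()

    def receive(self, received_clock):
        """Rule 2: take the maximum of both clocks, then increment by 1."""
        self.time = max(self.time, received_clock)
        self.time += 1
        return self.time


p1, p2 = LamportClock(), LamportClock()
p1.tick()                             # local event at P1, clock 1
stamp = p1.send()                     # send event at P1, clock 2
p2.tick()                             # local event at P2, clock 1
p2.tick()                             # local event at P2, clock 2
receive_time = p2.receive(stamp)      # receive event at P2
print(stamp, receive_time)
```

The receive event always gets a timestamp larger than the send event, which is exactly the clock condition for messages.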

Lamport's Logical Clocks
Key Idea:
(1) Processes exchange messages
(2) Messages must be sent before they are received
(3) Send/Receive is used to order the events & synchronize the logical clocks

Let
Pi be process i
a, b, c, ... be events in processes
Ci(a) be the timestamp of event ‘a’ of process Pi
IR be the implementation rule

The logical clocks are evaluated against the following clock condition (correctness criterion):

1. ∀a,b. a → b ⟹ C(a) < C(b) (happened-before relation denoted by →)
2. [C1]: Ci(a) < Ci(b) applies to events a → b in the same process
3. [C2]: Ci(a) < Cj(b) applies to events a → b in different processes
4. [IR1]: If a → b then Ci(b) = Ci(a) + d {d > 0}, where d is the drift rate of the clock (applies to the same process)
5. [IR2]: Cj = max(Cj, tm + d), where tm is the timestamp Ci(a) carried by the message
(applies to process j when there is an incoming arrow to the current process j)
Clock Values

a, b, c, d, e, f, g, h, i, j, k, l, m are events

1, 2, 3, 4, 5, 6, 7 are clock values or timestamps for the above events

No proper ordering of events

Ordering of events
Every process P0, P1 and P2 in a distributed system orders its events for execution

Process P0 has 7 events a, b, c, d, e, f and g; timestamps are 1, 2, 3, 4, 5, 6, 7

Process P1 has 3 events h, i and j; timestamps are 1, 2 and 3

Process P2 has 3 events k, l and m; timestamps are 1, 2 and 3

C(d) = 4 and C(m) = 3: since 4 > 3, this does not satisfy Lamport's happened-before relationship

Similarly for C(f) = 6 and C(j) = 3

Clock Values
Rule Applied: 1 1 1 1 2 ? ?
Clock Values: (1) (2) (3) (4) ? Incoming Arrow Encountered ?
P1 e11 e12 e13 e14 e15 e16 e17

P2 e21 e22 e23 e24 e25


Clock Values: (1) (2) (3) max (3,3) ? ?
Rule Applied: 1 1 2 ? ?
(1) When an incoming arrow is detected with respect to a process, Rule 2 needs to be followed, i.e.,
Max(local clock + 1, sending process clock value + n/w delay 1)

(2) Drift rate d is assumed to be 1

Clock Values
Rule No Applied: 1 1 1 1 2 1 ?
Clock Values: (1) (2) (3) (4) (5) max (5, 3) (6)
P1 e11 e12 e13 e14 e15 e16 e17

(2+1)
(2+1) (6+1)

P2 e21 e22 e23 e24 e25


Clock Values: (1) (2) (3) max (3,3) (4) 7:max(5,7)
Rule Applied: 1 1 2 1 2

Clock Value of e25: Max(C(e16) + 1, C(e24) + 1) = Max(6+1, 4+1) = Max(7, 5) = 7

Clock Values
Clock Values: (1) (2) (3) (4) (5) max (5, 3) (6) 7: max(5,7)
P1 e11 e12 e13 e14 e15 e16 e17

P2 e21 e22 e23 e24 e25


Clock Values: (1) (2) (3) max (3,3) (4) 7:max(5,7)

Clock Value of e17: Max(C(e24) + 1, C(e16) + 1) = Max(4+1, 6+1) = Max(5, 7) = 7

Another Example

Clock Values: 1 2 7 8
P1 e11 e12 e13 e14

Clock Values: 1 2 3 5 6
P2 e21 e22 e23 e24 e25

Clock Values: 1 2 3 4 5 6
P3 e31 e32 e33 e34 e35 e36

Limitations of Lamports Clock or Scalar Clock
With respect to implementation rules 1 and 2:

If a → b then C(a) < C(b): True
If C(a) < C(b) then a → b: May or may not hold

Limitation: From the clock values alone it is difficult to tell whether e11 happened before e31 or not (both have clock value 1).
This is why Lamport clocks give only a partial ordering of events (events with the same counter value cannot be ordered).
Rule No Applied: 1 1
Clock Values: (1) (2)
P1 e11 e12 Works globally & Communicate; Causal dependency

Space

P2 e21 e22 Works globally & Communicate ; Causal dependency


Clock Values: (1) (3)
Rule Applied: 1 1
P3 e31 e32 e33 Independent Process works locally; No incoming edges
Clock Values: (1) (2) (3)
Rule Applied: 1 1 1
Time
Two Types of Event Ordering
Partial Ordering of Events: Supported by Lamport's logical clock
Clock values are obtained for each event within the process
The order of execution of events in concurrent processes cannot be predicted
The problem arises because a single number is used to represent time

Total Ordering of Events: Used to solve the problem of partial ordering of events by applying an arbitrary
mathematical function

Example: Multiply the clock value by 10 and add the process number i of Pi, so that the values are different from each other

Find the clock values of every event in concurrent processes

Resolves the issue of having the same counter values in different processes

Total Ordering of Events

Clock Val*10+1: 11 21 71 81
P1 e11 e12 e13 e14

Clock Val*10+2: 12 22 32 52 62
P2 e21 e22 e23 e24 e25

Clock Val*10+3: 13 23 33 43 53 63
P3 e31 e32 e33 e34 e35 e36

Math Function: Clock Value * 10 + process number
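
The math function above can be sketched as follows, using the clock values from this example. The function name is an assumption. Note that multiplying by 10 only keeps keys unique while process numbers stay below 10; comparing the pair (clock value, process number) is a common alternative that avoids this limit.

```python
def total_order_key(clock_value, process_number, base=10):
    """Combine a Lamport clock value and a process number into one unique key."""
    return clock_value * base + process_number


# Clock values per process, taken from the example above.
clock_values = {
    1: [1, 2, 7, 8],
    2: [1, 2, 3, 5, 6],
    3: [1, 2, 3, 4, 5, 6],
}

keys = sorted(
    total_order_key(value, process)
    for process, values in clock_values.items()
    for value in values
)
print(keys)

# Equivalent ordering without the base-10 limit: sort by (clock value, process number).
pairs = sorted((value, process) for process, values in clock_values.items() for value in values)
```

Events with the same clock value, such as e11, e21 and e31, now get distinct keys 11, 12 and 13, so every pair of events can be ordered.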
Vector Clocks
Vector Clocks extend Lamport's Scalar Time to provide a causally consistent view
of the world.

By comparing the clock vectors of two events, we can tell whether one event could have caused the other.

Provides a partial ordering of events

Each process keeps a vector (a list of integers) with one integer for the local clock
of every process within the system.

For N processes, each process maintains a vector of size N.

Vector Clocks
Given a process (Pi) with a vector (v), Vector Clocks implement the Logical Clock rules as follows:

Rule 1: Before executing an event (excluding the event of receiving a message) process Pi
increments the value v[i] within its local vector by 1.

This is the element in the vector that refers to Node(i)’s local clock.

local_vector[i] = local_vector[i] + 1

Rule 2: When receiving a message (the message must include the sender's vector), loop through
each element in the vector sent and compare it to the local vector, updating the local vector
to be the element-wise maximum of the local and received clock values.

Then increment your local clock within the vector by 1.

1. For k = 1 to N: local_vector[k] = max(local_vector[k], sent_vector[k])
2. local_vector[i] = local_vector[i] + 1
3. message becomes available.

where k is the process number, i.e., Pk is the kth process
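
The two vector clock rules can be sketched as a small class. The class and function names are assumptions, processes are indexed from 0 (so P1 is index 0), and the demo assumes three processes that exchange the messages e12 to e22, e31 to e23 and e24 to e13.

```python
class VectorClock:
    """Vector clock for process index i in a system of n processes."""

    def __init__(self, n, i):
        self.i = i
        self.v = [0] * n

    def tick(self):
        """Rule 1: increment the own entry before a local or send event."""
        self.v[self.i] += 1
        return list(self.v)

    def receive(self, sent_vector):
        """Rule 2: element-wise maximum, then increment the own entry."""
        self.v = [max(a, b) for a, b in zip(self.v, sent_vector)]
        self.v[self.i] += 1
        return list(self.v)


def happened_before(a, b):
    """True if the event with vector a causally precedes the event with vector b."""
    return all(x <= y for x, y in zip(a, b)) and a != b


p1, p2, p3 = VectorClock(3, 0), VectorClock(3, 1), VectorClock(3, 2)
e11 = p1.tick()
e12 = p1.tick()          # sent to P2
e21 = p2.tick()
e22 = p2.receive(e12)
e31 = p3.tick()          # sent to P2
e23 = p2.receive(e31)
e24 = p2.tick()          # sent to P1
e13 = p1.receive(e24)
print(e22, e23, e13)
```

Two events are concurrent when neither vector precedes the other, which a scalar Lamport clock cannot detect.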
Advantages/Disadvantages of Vector Clocks
Advantage: Provide a causally consistent ordering of events

Disadvantages: Costly due to the need to send the entire vector with every
message in order to keep the vector clocks in sync.

When there are a large number of processes, this technique can become
extremely expensive, as the vector sent is extremely large.

Gives only a partial order of events for processes in a distributed system

There have been many improvements over the initial vector clock implementation mentioned
Improvements over Vector Clocks
(1) Singhal–Kshemkalyani’s differential technique: This approach improves the
message passing mechanism by only sending the entries of the vector clock that
have changed since the last message sent from Process(i) → Process(j).

This drastically reduces the message size being sent,
but a direct implementation requires O(n²) storage at each process.

(2) Fowler–Zwaenepoel direct-dependency technique: Further reduces the
message size by only sending the single clock value of the sending process with
a message.
However, this means processes cannot know their transitive dependencies
when looking at the causality of events.
In order to gain a full view of all dependencies that lead to a specific event,
an offline search must be made across processes.
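
The differential idea can be sketched as follows. This is an illustration, not the authors' exact formulation: it assumes FIFO channels between processes and keeps two extra vectors per process (here called `last_update` and `last_sent`, hypothetical names) to decide which entries have changed since the last send to a given peer.

```python
class DiffVectorClock:
    """Vector clock that sends only the entries changed since the last send to a peer."""

    def __init__(self, n, i):
        self.i = i
        self.v = [0] * n
        self.last_update = [0] * n  # local time at which v[k] last changed
        self.last_sent = [0] * n    # local time of the last send to process j

    def _tick(self):
        self.v[self.i] += 1
        self.last_update[self.i] = self.v[self.i]

    def send(self, j):
        """Return {index: value} for entries that changed since the last send to j."""
        self._tick()
        diff = {k: self.v[k] for k in range(len(self.v))
                if self.last_sent[j] < self.last_update[k]}
        self.last_sent[j] = self.v[self.i]
        return diff

    def receive(self, diff):
        """Tick first so merged entries are stamped with the receive event's local time."""
        self._tick()
        for k, value in diff.items():
            if value > self.v[k]:
                self.v[k] = value
                self.last_update[k] = self.v[self.i]


p0, p1 = DiffVectorClock(3, 0), DiffVectorClock(3, 1)
first = p0.send(1)
p1.receive(first)
second = p0.send(1)
p1.receive(second)
print(first, second, p1.v)
```

Ticking before merging gives the same vector as merging first, because a sender never knows more about the receiver's own entry than the receiver does.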
Vector Clocks
Clock Val: (1 0 0) (2 0 0) (3 4 1)
P1 e11 e12 e13

Clock Val: (0 1 0) (2 2 0) (2 3 1) (2 4 1)
P2 e21 e22 e23 e24

Clock Val: (0 0 1) (0 0 2)
P3 e31 e32

TIME ------

P2 -> e22 -> Max[(2 0 0), (0 2 0)] = (2 2 0)
P2 -> e23 -> Max[(2 3 0), (0 0 1)] = (2 3 1)
P1 -> e13 -> Max[(3 0 0), (2 4 1)] = (3 4 1)
Vector Clocks
Clock Val: (1 0 0) (2 0 0) (3 0 0) (4 0 0)
P1 e11 e12 e13 e14

Clock Val: (0 1 0) (0 2 1) (2 3 1) (2 4 1)
P2 e21 e22 e23 e24

Clock Val: (0 0 1) (0 0 2) (4 0 3)
P3 e31 e32 e33

TIME ------

P2 -> e22 -> max[(0 2 0), (0 0 1)] = (0 2 1)
P2 -> e23 -> max[(2 0 0), (0 3 1)] = (2 3 1)
P3 -> e33 -> max[(0 0 3), (4 0 0)] = (4 0 3)
Applications of Vector Clocks

For updating data during transactions in distributed databases

Transactions can be assigned logical timestamps

Provides a consistent view of transactions and correct updating of
data in distributed databases
