Download as pdf or txt
Download as pdf or txt
You are on page 1of 41

1

Cache Coherence Protocols

for

Sequential Consistency

Arvind

Computer Science and Artificial Intelligence Lab

M.I.T.

Based on the material prepared by

Arvind and Krste Asanovic

November 14, 2005


6.823 L18- 2
Arvind

Systems view
snooper
(WbReq, InvReq, InvRep)

load/store
buffers pushout (WbRep) Memory
CPU Cache

(I/Sh/Ex) (ShRep, ExRep)

(ShReq, ExReq)
Blocking caches CPU/Memory
In order, one request at a time + CC ⇒ SC Interface
Non-blocking caches
Multiple requests (different addresses) concurrently + CC
⇒ Relaxed memory models
CC ensures that all processors observe the same
order of loads and stores to an address
November 14, 2005
6.823 L18- 3
Arvind

A System with Multiple Caches


P P P P
L1 L1 L1 L1
P
P L1
L1 L2 L2

Interconnect
M aka Home

Assumptions: Caches are organized in a hierarchical manner

• Each cache has exactly one parent but can have

zero or more children

• Only a parent and its children can communicate directly


• Inclusion property is maintained between a parent
and its children, i.e.,

a ∈ Li ⇒ a ∈ Li+1

November 14, 2005


6.823 L18- 4
Arvind

Maintaining Cache Coherence

Hardware support is required such that


• only one processor at a time has write
permission for a location
• no processor can load a stale copy of
the location after a write


write request:

The address is invalidated in all other caches before


the write is performed

read request:

If a dirty copy is found in some cache, a write-back


is performed before the memory is read

November 14, 2005


6.823 L18- 5
Arvind

State Encoding
P P P P
(Sh, ∈) a L1 L1 L1
P
6 7 8 9
P 5 L1
(Sh, ∈) 2 a 3 L2 4 a (Sh, R(6))

Interconnect
1 a (Ex, R(2,4))

Each address in a cache keeps two types of state


info
• sibling info: do my siblings have a copy of address a
- Ex (means no), Sh (means may be)
• children info: has this address been passed on to
any of my children
- W(id) means child id has a writable version
- R(dir) means only children named in the directory
dir have copies
November 14, 2005
6.823 L18- 6
Arvind

Cache State Implications

Sh ⇒ cache’s siblings and decedents can only


have Sh copies

Ex ⇒ each ancestor of the cache must be in Ex

⇒ either all children can have Sh copies

or one child can have an Ex copy

• Once a parent gives an Ex copy to a child, the


parent’s data is considered stale
• A processor cannot overwrite data in Sh state
in L1
• By definition all addresses in the home are in
the Ex state

November 14, 2005


6.823 L18- 7
Arvind

Cache State Transitions

Inv
invalidate flush
store
load
optimizations
Sh Ex
store

write-back

This state diagram is helpful as long as one


remembers that each transition involves
cooperation of other caches and the main
memory.

November 14, 2005


6.823 L18- 8
Arvind

High-level Invariants in Protocol


Design

November 14, 2005


6.823 L18- 9
Arvind

Guarded Atomic Actions

• Rules specified using guarded atomic


actions:
<guard predicate>
→ {set of state updates that must occur
atomically with respect to other rules}
• E.g.:
m.state(a) == R(dir) & idc ∉ dir
→ m.setState(a, R(dir+ idc)),
c.setState(a, Sh); c.setData(a, m.data(a));

November 14, 2005


6.823 L18- 10
Arvind

Data Propagation Between Caches

Child c Child c

Parent m Parent m

Caching rules De-caching rules


• Read caching rule • Write-back rule
• Write caching rule • Invalidate rule

November 14, 2005


6.823 L18- 11
Arvind

Caching Rules: Parent to Child

Child c idc

Parent m idp

• Read caching rule


R(dir) == m.state(a) & idc ∉ dir
→ m.setState(a, R(dir+ idc))
c.setState(a, Sh); c.setData(a, m.data(a));

• Write caching rule


ε == m.state(a)
→ m.setState(a, W(idc))
c.setState(a, Ex); c.setData(a, m.data(a));

November 14, 2005


6.823 L18- 12
Arvind

De-caching Rules: Child to Parent


Child c idc

Parent m idp

• Writeback rule
W(idc) == m.state(a) & Ex == c.state(a)
→ m.setState(a, R({idc}))

msetData(a, c.data(a));

c.setState(a, Sh);

• Invalidate rule
R(dir) == m.state(a) & idc ∈ dir & Sh == c.state(a)
→ m.setState(a, R(dir - idc))

c.invalidate(a);

November 14, 2005


6.823 L18- 13
Arvind

Making the Rules Local & Reactive

Child c idc

Parent m idp

• Some rules require observing and changing the state


of multiple caches simultaneously (atomically).
– very difficult to implement, especially if caches are separated
by a network
• Each rule must be triggered by some action
• Split rules are into multiple rules – “request for an
action” followed by “an action and an ack”.
– ultimately all actions are triggered by some processor

November 14, 2005


6.823 L18- 14
Arvind

Protocol Design

*Note*

We will not be able to finish this part today.

(The rest of the material will be covered during the next lecture.)

November 14, 2005


6.823 L18- 15
Arvind

Protocol Processors an abstract view

P P

p2m m2p m2p p2m


c2m c2m
L1 PP interconnect PP L1
m2c m2c

in out

PP m
• Each cache has 2 pairs of queues
– one pair (c2m, m2c) to communicate with the memory
– one pair (p2m, m2p) to communicate with the processor
• Messages format:
msg(idsrc,iddest,cmd,priority,a,v)
• FIFO messages passing between each (src,dest) pair
except a Low priority (L) msg cannot block a high
priority (H) msg
November 14, 2005
6.823 L18- 16
Arvind

H and L Priority Messages

• At the memory unprocessed requests cannot


block the result messages. Hence all
messages are classified as H or L priority.
– all messages carrying results are classified as high

priority

• Accomplished by having separate paths for H


and L priority
– In Theory: separate networks

– In Practice:
H
• Separate Queues L
• Shared buses for both networks

November 14, 2005


6.823 L18- 17

A Protocol for a system with two


Arvind

memory levels (L1 + M)


Cache states: Sh, Ex, Pending, Nothing

Memory states: R(dir), W(id), TR (dir), TW(id)

If dir is empty then R(dir) and TR(dir)


represent the same state
Messages:
Cache to Memory requests: ShReq, ExReq
Memory to Cache requests: WbReq, InvReq, FlushReq

Cache to Memory responses: WbRep(v), InvRep, FlushRep(v)


Memory to Cache responses: ShRep(v), ExRep(v)

Operations on cache:
cache.state(a) – returns state s
cache.data(a) - returns data v
cache.setState(a,s), cache.setData(a,v), cache.invalidate(a)

inst = first(p2m); msg= first(m2c); mmsg = first(in)


November 14, 2005
6.823 L18- 18
Arvind

Voluntary rules: Cache must be able to


evict values to create space
It would be good to have
Invalidate rule “silent drops” but difficult in
cache.state(a) is Sh a directory-based protocol
→ cache.invalidate(a)
c2m.enq (Msg(id, Home, InvRep, a))

Flush rule
cache.state(a) is Ex
→ cache.invalidate(a)
c2m.enq (Msg(id, Home, FlushRep, a, cache.data(v))

Writeback rule
cache.state(a) is Ex
→ cache.setState(a, Sh)
c2m.enq (Msg(id, Home, WbRep, a, cache.data(v)))
This rule may be applied if the cache/processor knows it
is the “last store” operation to the location.

Such voluntary rules can be used to construct


adaptive protocols.
November 14, 2005
6.823 L18- 19
Arvind

Voluntary rules: Memory should be able


to send more values than requested

Cache Rule
m.state(a) is R(dir) & id ∉ dir
→ m.setState(a, R(id+dir))
out.enq(Msg(Home,id,ShRep, a,m.data(a)))

It is a rule like this that allows us to fetch locations a+1,


a+2, ... when a processor requests address a.

November 14, 2005


20

Five-minute break to stretch your legs

November 14, 2005


6.823 L18- 21
Arvind

Load Rules (at cache)

• Load-hit rule

Load(a)==inst
& cache.state(a) is Sh or Ex
→ p2m.deq

m2p.enq(cache.data(a))

• Load-miss rule
Load(a)==inst
& cache.state(a) is Nothing
→ c2m.enq(Msg(id, Home, ShReq, a)
cache.setState(a,Pending)

This is blocking cache because the Load miss


rule does not remove the request from the
input queue (p2m) ... more later
November 14, 2005
6.823 L18- 22
Arvind

Store Rules (at cache)

• Store-hit rule
Store(a,v)==inst

& cache.state(a) is Ex

→ p2m.deq;

m2p.enq(Ack)

cache.setData(a, v)

• Store-miss rules

Store(a,v)==inst

& cache.state(a) is Nothing

→ c2m.enq(Msg(id, Home, ExReq, a);


cache.setState(a,Pending) Already covered
by the
Store(a,v)==inst Invalidate
& cache.state(a) is Sh voluntary rule
→ c2m.enq(Msg(id, Home, InvRep, a);

cache.setState(a,Nothing)

November 14, 2005


6.823 L18- 23
Arvind

Processing ShReq Messages (at Home)

Uncached or Outstanding Shared Copies


Msg(id,Home,ShReq,a) ==mmsg
& m.state(a) is R(dir) & id ∉ dir
→ in.deq;
m.setState(a, R(dir+{id}));
out.enq(Msg(Home,id,ShRep, a,m.data(a)))

Outstanding Exclusive Copy

Msg(id,Home,ShReq,a) ==mmsg
& m.state(a) is W(id’) & (id’ is not id)
→ m.setState(a, TW(id’));

out.enq(Msg(Home,id’,WbReq, a))

November 14, 2005


6.823 L18- 24
Arvind

Processing ExReq Messages (at home)

Uncached or cached only at the requester cache


Msg(id,Home,ExReq,a) ==mmsg
& m.state(a) is R(dir) & (dir is empty or has only id)
→ in.deq
m.setState(a, W(id))
out.enq(Msg(Home,id,ExRep, a, m.data(a))
Outstanding Shared Copies

Msg(id,Home,ExReq,a) ==mmsg
& m.state(a) is R(dir) & !(dir is empty or has only id)
→ m.setState(a, TR(dir-{id}))
out.enq(multicast(Home,dir-{id},InvReq, a)
Outstanding Exclusive Copy

Msg(id,Home,ExReq,a) ==mmsg
& m.state(a) is W(id’) & (id’ is not id)
→ m.setState(a, TW(id’))
out.enq(Msg(Home,id’,FlushReq, a)
November 14, 2005
6.823 L18- 25
Arvind

Processing Reply Messages (at cache)

ShRep
Msg(Home, id, ShRep, a, v) == msg
-- cache.state(a) must be Pending or Nothing
→ m2c.deq

cache.setState(a, Sh)

cache.setData(a, v)

ExRep
Msg(Home, id, ExRep, a, v) == msg
-- cache.state(a) must be Pending or Nothing
→ m2c.deq

cache.setState(a, Ex)

cache.setData(a, v)

-- In general only a part of v will be


overwritten by the Store instruction.

November 14, 2005


6.823 L18- 26
Arvind

Processing InvReq Message (at cache)

InvReq
Msg(Home,id,InvReq,a) == msg
& cache.state(a) is Sh
→ m2c.deq
cache.invalidate(a)
c2m.enq (Msg(id, Home, InvRep, a))

Msg(Home,id,InvReq,a) == msg
& cache.state(a) is Nothing or Pending
→ m2c.deq

November 14, 2005


6.823 L18- 27
Arvind

Processing WbReq Message (at cache)

WbReq
Msg(Home,id,WbReq,a) == msg
& cache.state(a) is Ex
→ m2c.deq
cache.setState(a, Sh)
c2m.enq (Msg(id, Home, WbRep, a, cache.data(v)))

Msg(Home,id,WbReq,a) == msg
& cache.state(a) is Sh or Nothing or Pending
→ m2c.deq

November 14, 2005


6.823 L18- 28
Arvind

Processing FlushReq Message (at cache)

FlushReq
Msg(Home,id,FlushReq,a) == msg

& cache.state(a) is Ex

→ m2c.deq
cache.invalidate(a)
c2m.enq (Msg(id, Home, FlushRep, a, cache.data(v)))

Msg(Home,id,FlushReq,a) == msg

& cache.state(a) is Sh

→ m2c.deq

cache.invalidate(a)

c2m.enq (Msg(id, Home, InvRep, a))

Msg(Home,id,FlushReq,a) == msg

& cache.state(a) is Nothing or Pending

→ m2c.deq

November 14, 2005


6.823 L18- 29
Arvind

Processing Reply InvRep Messages

(at home)

InvRep
Msg(id,Home,InvRep,a) == mmsg
& m.state(a) is TR(dir)
→ deq mmsg;

m.setState(a, TR(dir-{id}))

Msg(id,Home,InvRep,a) == mmsg
& m.state(a) is R(dir)


deq mmsg;

m.setState(a, R(dir-{id}))

November 14, 2005


6.823 L18- 30
Arvind

Processing Reply WbRep Messages

(at home)
WbRep
Msg(id,Home,WbRep,a,v) == mmsg
-- m.state(a) must be TW(id) or W(id)
→ deq mmsg;

m.setState(a, R(id))

m.setData(a,v)

FlushRep
Msg(id,Home,FlushRep,a,v) == mmsg
-- m.state(a) must be TW(id) or W(id)
→ deq mmsg;

m.setState(a, R(Empty))

m.setData(a,v)

November 14, 2005


6.823 L18- 31
Arvind

Non-Blocking Caches

new reqs
• Non-blocking caches are enq
needed to tolerate large p2m
memory latencies

incomingQ
• To get non-blocking property

deferQ
we implement p2m with 2
FIFOs (deferQ, incomingQ)

• Requests moved to deferQ


when:
– address not there deq
– needed for consistency

Handle
Req.
November 14, 2005
6.823 L18- 32
Arvind

Conclusion

• This protocol with its voluntary rules


captures many other protocols that are
used in practice.
– we will discuss a bus-based version of this protocol
in the next lecture
• We need policies and mechanisms to
invoke voluntary rules to build truly
adaptive protocols.
– search for such policies and mechanisms in an
active area of research
• Quantitative evaluation of protocols or
protocol features is extremely difficult.

November 14, 2005


33

Thank you !

November 14, 2005


6.823 L18- 34
Arvind

Protocol Diagram

Cache 1 Cache 2 Cache N


Pen: a ...

ShReq
a

Dir a: Sh {}
Dir a: Sh {1}
Main Memory

November 14, 2005


6.823 L18- 35
Arvind

Protocol Diagram

Cache 1 Cache 2 Cache N


Pen: a
Sh: a ...

ShResp
<a,v>

Dir a: Sh {1}
Main Memory

November 14, 2005


6.823 L18- 36
Arvind

Protocol Diagram

Cache 1 Cache 2 Cache N


Sh: a Pen: a ...

ShReq
a

Dir a: Sh {1}
Dir a: Sh {1,2}
Main Memory

November 14, 2005


6.823 L18- 37
Arvind

Protocol Diagram

Cache 1 Cache 2 Cache N


Pen: a
Sh: a Sh: a ...

ShResp
<a,v>

Dir a: Sh {1,2}
Main Memory

November 14, 2005


6.823 L18- 38
Arvind

Protocol Diagram

Cache 1 Cache 2 Cache N


Sh: a Sh: a ... Pen: a

InvReq InvReq ExReq


a a a

Dir a: Sh {1,2}
Main Memory

November 14, 2005


6.823 L18- 39
Arvind

Protocol Diagram

Cache 1 Cache 2 Cache N


Pen: a
Sh: a Sh: a ... Ex: a

Inv Inv ExResp


a a <a,v>

Dir a: Sh {}
Dir a: Sh {1,2} Main Memory
Dir a: Ex {N}

November 14, 2005


6.823 L18- 40
Arvind

Protocol Diagram

Cache 1 Cache 2 Cache N


Pen: a ... Ex: a

ShReq WBReq
a a

Dir a: Ex {N}
Main Memory

November 14, 2005


6.823 L18- 41
Arvind

Protocol Diagram

Cache 1 Cache 2 Cache N


Pen: a Ex: a
Sh: a ... Sh: a

ShResp WBResp
<a,v'> <a,v'>

Dir a: Ex {N}
Dir a: Sh {1,N}
Main Memory

November 14, 2005

You might also like