
Cache Memory

Basic Philosophy
Temporal Locality
Spatial Locality

Basic Terms
Cache Block
Miss/Hit
Miss Rate/Hit Rate
Miss Penalty
Hit Time
3-Cs of caches: Conflict, Compulsory, Capacity
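Hit time, miss rate, and miss penalty are usually combined into the average memory access time: AMAT = hit time + miss rate X miss penalty. The slides do not state this relation, so the sketch below is only an illustration, and the numbers in it are made-up assumptions.

```python
def average_memory_access_time(hit_time, miss_rate, miss_penalty):
    """AMAT = hit time + miss rate * miss penalty (times in the same unit)."""
    if not 0.0 <= miss_rate <= 1.0:
        raise ValueError("miss_rate must be between 0 and 1")
    return hit_time + miss_rate * miss_penalty


# Illustrative numbers only: 1-cycle hit, 5% miss rate, 20-cycle miss penalty.
amat = average_memory_access_time(hit_time=1, miss_rate=0.05, miss_penalty=20)
print(amat)
```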

Direct Mapped Cache


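Assume a 5-bit address bus and a cache with 8 entries.

[Figure: direct mapped cache. The processor address splits into a 2-bit TAG (D4 D3) and a 3-bit Index (D2 - D0). Each of the 8 entries (Index 000 to 111) holds a Valid bit, a TAG, and DATA. The stored TAG of the indexed entry is compared with the address TAG to produce HIT, and the entry's DATA drives the Data Bus.]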
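The address split used in this example can be sketched in a few lines. The function name `split_address` is a hypothetical helper chosen for illustration; the bit widths come from the 5-bit address and 8-entry cache assumed above.

```python
INDEX_BITS = 3   # 8 entries -> 3 index bits (D2 - D0)
TAG_BITS = 2     # remaining bits of the 5-bit address (D4 D3)


def split_address(address):
    """Return (tag, index) for a 5-bit address."""
    if not 0 <= address < (1 << (TAG_BITS + INDEX_BITS)):
        raise ValueError("address must fit in 5 bits")
    index = address & ((1 << INDEX_BITS) - 1)   # low 3 bits
    tag = address >> INDEX_BITS                 # high 2 bits
    return tag, index


tag, index = split_address(0b01010)
print(format(tag, "02b"), format(index, "03b"))  # prints: 01 010
```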

Direct Mapped Cache


First Load

LD R1, (01010) ; remember the 5-bit address bus; assume data is 8-bit and AA (hex) is stored at this location

TAG (D4 D3) = 01, Index (D2 - D0) = 010

Index  000 001 010 011 100 101 110 111
Valid   0   0   0   0   0   0   0   0

The first access causes a MISS: the data is loaded from memory into entry 010 and the Valid bit of that entry is set to 1.

Direct Mapped Cache


After First Load

LD R1, (01010) ; AA (hex) is stored at this location

TAG (D4 D3) = 01, Index (D2 - D0) = 010

Index  000 001 010 011 100 101 110 111
Valid   0   0   1   0   0   0   0   0
TAG     -   -   01  -   -   -   -   -
DATA    -   -   AA  -   -   -   -   -

Entry 010 now holds TAG 01 and DATA AA with its Valid bit set to 1, so repeating this load is a HIT.

Direct Mapped Cache


Second Load

LD R1, (11010) ; assume 99 is stored at address 11010

TAG (D4 D3) = 11, Index (D2 - D0) = 010

Index  000 001 010 011 100 101 110 111
Valid   0   0   1   0   0   0   0   0
TAG     -   -   01  -   -   -   -   -
DATA    -   -   AA  -   -   -   -   -

Same index but a different TAG (11 against the stored 01) causes a MISS; the data is loaded from memory.

Direct Mapped Cache


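After Second Load

LD R1, (11010) ; remember the 5-bit address bus; assume 99 is stored at this location

TAG (D4 D3) = 11, Index (D2 - D0) = 010

Index  000 001 010 011 100 101 110 111
Valid   0   0   1   0   0   0   0   0
TAG     -   -   11  -   -   -   -   -
DATA    -   -   99  -   -   -   -   -

Because the index was the same but the TAG differed, the access was a MISS and the data was loaded from memory: entry 010 now holds TAG 11 and DATA 99 in place of TAG 01 and DATA AA.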
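The load sequence walked through on these slides can be replayed with a small simulation. This is a minimal sketch, not a model of any real hardware: the class name `DirectMappedCache`, the `memory` dictionary, and reading "99" as a hex value are all assumptions made for illustration.

```python
class DirectMappedCache:
    """8-entry direct mapped cache for 5-bit addresses (one byte per entry)."""

    def __init__(self, index_bits=3):
        self.index_bits = index_bits
        entries = 1 << index_bits
        self.valid = [False] * entries
        self.tag = [0] * entries
        self.data = [0] * entries

    def load(self, address, memory):
        index = address & ((1 << self.index_bits) - 1)
        tag = address >> self.index_bits
        if self.valid[index] and self.tag[index] == tag:
            return "HIT", self.data[index]
        # Miss: fetch from memory and overwrite whatever the entry held.
        self.valid[index] = True
        self.tag[index] = tag
        self.data[index] = memory[address]
        return "MISS", self.data[index]


# Hypothetical memory contents taken from the slide comments.
memory = {0b01010: 0xAA, 0b11010: 0x99}
cache = DirectMappedCache()
results = [cache.load(a, memory) for a in (0b01010, 0b01010, 0b11010)]
print(results)  # [('MISS', 170), ('HIT', 170), ('MISS', 153)]
```

The third load misses even though entry 010 is valid, because the stored TAG (01) differs from the address TAG (11).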

Cache Size Example Direct Mapped


Processor address bus (A) = 32 bits; data word = 32 bits
Number of blocks in cache (entries) = 32K = 2^N with N = 15
Cache data storage = 32K words X 4 bytes = 128KB
Index = A16 - A2 (15 bits); byte offset = A1 - A0 (2 bits)
Tag = A31 - A17; Tag Size = A - N - 2 (byte offset) = 32 - 15 - 2 = 15 bits

[Figure: a 32K X 48-bit memory. Each of the 32K entries holds Valid (1 bit), TAG (15 bit), and DATA (32 bit). Address lines A16 - A2 select one entry; its stored 15-bit TAG is compared with A31 - A17, and the entry's DATA goes to Data Out.]

Cache Size = 128KB (data) + 32K X 15-bit (tag) + 32K X 1-bit (valid bit) = 192KB
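The arithmetic on this slide can be checked with a short calculation. The function name and parameter names below are hypothetical; the sketch assumes one 32-bit word per entry, one valid bit per entry, and a power-of-two number of entries, as in the example.

```python
def direct_mapped_size(address_bits=32, entries=32 * 1024, word_bits=32):
    """Return (tag_bits, bits_per_entry, total_size_in_KB)."""
    index_bits = entries.bit_length() - 1             # log2(entries)
    offset_bits = (word_bits // 8).bit_length() - 1   # log2(bytes per word)
    tag_bits = address_bits - index_bits - offset_bits
    bits_per_entry = word_bits + tag_bits + 1         # data + tag + valid
    total_kb = entries * bits_per_entry / 8 / 1024
    return tag_bits, bits_per_entry, total_kb


tag_bits, bits_per_entry, total_kb = direct_mapped_size()
print(tag_bits, bits_per_entry, total_kb)  # 15 48 192.0
```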

Cache Size Example (1) Two-Way Set Associative

Assume the same processor (A = 32, D = 32).
Assume the same total storage of data = 128KB.
Two sets (usually called ways) means we will have two direct mapped caches with 64KB (128/2) each.
64KB = 16K words.
To address a 16K X 32-bit memory we need a 14-bit index.
Hence Tag Size = 32 - 14 - 2 = 16 bits.

Cache Size Example (1) Two-Way Set Associative


[Figure: two 16K X 49-bit memories, SET 1 and SET 2. Each entry holds Valid (1 bit), TAG (16 bit), and DATA (32 bit). Address lines A15 - A2 (14-bit index) select the same entry in both memories. Each stored 16-bit TAG is compared with A31 - A16, and a 2:1 MUX passes the data from the matching side to Data Out.]

Size = 2 (Sets) X 16K X (32-bit + 16-bit + 1-bit) = 196KB
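The parallel tag comparison and 2:1 MUX selection can be sketched as follows. This is an illustrative model under stated assumptions, not the slide author's design: each side is represented by a dictionary from index to (tag, data), a missing index stands for Valid = 0, and the helper names `fields`, `fill`, and `lookup` are hypothetical.

```python
INDEX_BITS = 14   # 16K entries per side
OFFSET_BITS = 2   # byte offset within a 32-bit word


def fields(address):
    """Split a 32-bit address into (tag, index)."""
    index = (address >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = address >> (OFFSET_BITS + INDEX_BITS)
    return tag, index


def fill(sides, side_number, address, data):
    """Store data for this address in the chosen side."""
    tag, index = fields(address)
    sides[side_number][index] = (tag, data)


def lookup(sides, address):
    """Compare the tag in every side; the match selects the MUX input."""
    tag, index = fields(address)
    for side_number, side in enumerate(sides):
        entry = side.get(index)          # None models Valid = 0
        if entry is not None and entry[0] == tag:
            return True, side_number, entry[1]
    return False, None, None


sides = [{}, {}]
a = 0x00000008             # index 2, tag 0
b = a + (1 << 16)          # same index, tag 1
fill(sides, 0, a, "A")
fill(sides, 1, b, "B")
print(lookup(sides, a), lookup(sides, b))
```

Both addresses share an index, so they would evict each other in the direct mapped cache; here each side keeps one of them and both lookups hit.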

Cache Size Example (1) 4-Way Set Associative

Assume the same processor (A = 32, D = 32).
Assume the same total storage of data = 128KB.
Four sets (usually called ways) means we will have four direct mapped caches with 32KB (128/4) each.
32KB = 8K words.
To address an 8K X 32-bit memory we need a 13-bit index.
Hence Tag Size = 32 - 13 - 2 = 17 bits.

Cache Size Example (1) 4-Way Set Associative


[Figure: four 8K X 50-bit memories, SET 1 to SET 4. Each entry holds V (1 bit), TAG (17 bit), and DATA (32 bit). Address lines A14 - A2 (13-bit index) select the same entry in all four memories. Each stored 17-bit TAG is compared with A31 - A15, and a 4:1 MUX passes the data from the matching side as Data Out to the processor.]

Size = 4 (Sets) X 8K X (32-bit + 17-bit + 1-bit) = 200KB
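The three size examples follow one pattern: with the data storage fixed at 128KB, each doubling of associativity removes one index bit and adds one tag bit. A small sketch of that pattern, with a hypothetical function name and assuming one 32-bit word and one valid bit per entry:

```python
def set_associative_size(ways, data_kb=128, address_bits=32, word_bits=32):
    """Return (index_bits, tag_bits, total_size_in_KB) for a given associativity."""
    word_bytes = word_bits // 8
    entries_per_way = data_kb * 1024 // word_bytes // ways
    index_bits = entries_per_way.bit_length() - 1   # log2(entries per way)
    offset_bits = word_bytes.bit_length() - 1       # log2(bytes per word)
    tag_bits = address_bits - index_bits - offset_bits
    total_bits = ways * entries_per_way * (word_bits + tag_bits + 1)
    return index_bits, tag_bits, total_bits / 8 / 1024


for ways in (1, 2, 4):
    print(ways, set_associative_size(ways))
# 1 (15, 15, 192.0)
# 2 (14, 16, 196.0)
# 4 (13, 17, 200.0)
```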

Alpha 21264 Processor: 44-Bit Physical Address

Organization of the data cache Alpha 21264


[Figure: two-way set associative data cache, SET 1 and SET 2, with 512 entries each. Block Size = 64 bytes, so the byte offset is A5 - A0 (6 bits). Each entry holds Valid (1 bit), TAG (29 bit), and DATA (one 64-byte block). Address lines A14 - A6 (9-bit index) select one of the 512 entries in both sets. Each stored 29-bit TAG is compared with A43 - A15, and a 2:1 MUX passes the data from the matching set to Data Out.]

Size = 2 (Sets) X 512 X (512-bit + 29-bit + 1-bit) = 67.75KB, of which 64KB is data
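The field widths in this figure can be derived from the cache geometry. The sketch below assumes a 44-bit address and 64-byte blocks (the block size is inferred from the 6-bit byte offset A5 - A0); the function name is hypothetical.

```python
def cache_geometry(address_bits, ways, entries_per_way, block_bytes):
    """Return (offset_bits, index_bits, tag_bits, data_capacity_in_KB)."""
    offset_bits = block_bytes.bit_length() - 1       # log2(block size)
    index_bits = entries_per_way.bit_length() - 1    # log2(entries per way)
    tag_bits = address_bits - index_bits - offset_bits
    data_kb = ways * entries_per_way * block_bytes / 1024
    return offset_bits, index_bits, tag_bits, data_kb


print(cache_geometry(address_bits=44, ways=2, entries_per_way=512, block_bytes=64))
# (6, 9, 29, 64.0)
```

The 6 + 9 + 29 = 44 bits account for the whole address, which matches the 29-bit tag shown in the figure.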

Four Memory Hierarchy Questions

Where can a block be placed?
Anywhere from Direct Mapped to Fully Associative

How is a block found?
Tag Comparison

Which block should be replaced on a cache miss? (only for set associative and fully associative caches)
LRU, Random, FIFO
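As an illustration of one of these policies, LRU replacement within a single set can be sketched with an ordered dictionary. The class name `LRUSet` and the string tags are hypothetical; real hardware usually tracks recency with a few status bits per set rather than an ordered list.

```python
from collections import OrderedDict


class LRUSet:
    """One cache set that evicts the least recently used tag when full."""

    def __init__(self, ways):
        self.ways = ways
        self.tags = OrderedDict()   # least recently used tag comes first

    def access(self, tag):
        """Return ("HIT", None) or ("MISS", evicted_tag_or_None)."""
        if tag in self.tags:
            self.tags.move_to_end(tag)     # mark as most recently used
            return "HIT", None
        victim = None
        if len(self.tags) == self.ways:
            victim, _ = self.tags.popitem(last=False)   # evict the LRU tag
        self.tags[tag] = True
        return "MISS", victim


two_way_set = LRUSet(ways=2)
trace = [two_way_set.access(t) for t in ("A", "B", "A", "C")]
print(trace)
# [('MISS', None), ('MISS', None), ('HIT', None), ('MISS', 'B')]
```

The last access evicts B rather than A because A was touched more recently; a FIFO policy would have evicted A instead.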

4 Qs (Contd..)
What Happens on a Write?
Write Back: Main memory is updated only when the data is replaced from the cache.
Write Through: The information is updated in the upper as well as the lower level.
Write Allocate: Allocate the data in the cache on a write.
Write No-Allocate: Only write to the next level.
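The difference between the two write policies shows up in how often main memory is written. The sketch below is a deliberately small model with hypothetical names (`TinyCache`, `evict`): it uses write allocate for both policies, has no capacity limit, and relies on an explicit `evict` call to stand in for replacement.

```python
class TinyCache:
    """Counts main-memory writes under write-back or write-through."""

    def __init__(self, memory, write_back):
        self.memory = memory
        self.write_back = write_back
        self.lines = {}            # address -> [value, dirty]
        self.memory_writes = 0

    def _write_memory(self, address, value):
        self.memory[address] = value
        self.memory_writes += 1

    def write(self, address, value):
        if self.write_back:
            # Update only the cache and mark the line dirty.
            self.lines[address] = [value, True]
        else:
            # Write through: update the cache and the lower level together.
            self.lines[address] = [value, False]
            self._write_memory(address, value)

    def evict(self, address):
        value, dirty = self.lines.pop(address)
        if dirty:                  # write back only when the line is replaced
            self._write_memory(address, value)


counts = {}
for name, policy in (("write-through", False), ("write-back", True)):
    cache = TinyCache(memory={}, write_back=policy)
    for value in (1, 2, 3):
        cache.write(0x10, value)
    cache.evict(0x10)
    counts[name] = (cache.memory_writes, cache.memory[0x10])
print(counts)  # {'write-through': (3, 3), 'write-back': (1, 3)}
```

Both policies leave the same final value in memory; write-back reaches it with a single memory write at eviction time.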

Classifying Misses: 3 Cs

3 Cs of Caches

Compulsory: The first access to a block is not in the cache, so the block must be brought into the cache. Also called cold start misses or first reference misses. (Misses that occur even in an infinite cache.)

Capacity: If the cache cannot contain all the blocks needed during execution of a program, capacity misses will occur due to blocks being discarded and later retrieved. (Misses that occur in a fully associative cache of the same size.)

Conflict: If the block-placement strategy is set associative or direct mapped, conflict misses (in addition to compulsory and capacity misses) will occur because a block can be discarded and later retrieved if too many blocks map to its set. Also called collision misses or interference misses.
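These definitions translate into a simple classification procedure: a miss is compulsory if the block was never referenced before, capacity if a fully associative cache of the same size would also miss, and conflict otherwise. The sketch below applies that rule to a direct mapped cache; the function name is hypothetical, and it assumes LRU replacement for the fully associative reference cache and a trace given as block addresses.

```python
from collections import OrderedDict


def classify_misses(block_addresses, num_blocks):
    """Count hits and the 3 Cs for a direct mapped cache of num_blocks blocks."""
    seen = set()                     # models an infinite cache
    fully_assoc = OrderedDict()      # same size, fully associative, LRU
    direct = [None] * num_blocks     # the cache being classified
    counts = {"hit": 0, "compulsory": 0, "capacity": 0, "conflict": 0}

    for block in block_addresses:
        # Reference cache: fully associative with LRU replacement.
        fa_hit = block in fully_assoc
        if fa_hit:
            fully_assoc.move_to_end(block)
        else:
            if len(fully_assoc) == num_blocks:
                fully_assoc.popitem(last=False)
            fully_assoc[block] = True

        # Actual cache: direct mapped, index = block address mod num_blocks.
        index = block % num_blocks
        dm_hit = direct[index] == block
        direct[index] = block

        if dm_hit:
            counts["hit"] += 1
        elif block not in seen:
            counts["compulsory"] += 1
        elif not fa_hit:
            counts["capacity"] += 1
        else:
            counts["conflict"] += 1
        seen.add(block)
    return counts


# Blocks 0 and 4 share an index in a 4-block cache: pure conflict misses.
print(classify_misses([0, 4, 0, 4], num_blocks=4))
# Five references through a 2-block cache: the last miss is a capacity miss.
print(classify_misses([0, 1, 2, 3, 0], num_blocks=2))
```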
