Basics of Memories
1 / 33
The Memory Hierarchy
2 / 33
How does this help?
3 / 33
Cache Memories
4 / 33
Fully Associative Cache
Any block in (physical or virtual) memory can be mapped into any
block frame in the cache (just as in paged virtual memory). Memory
addresses are formed as ⟨n, d⟩, where n is the memory tag and d is
the address (offset) within the block.
[Figure: fully associative cache. Each of the N cache block frames holds a tag and may hold any of the M memory blocks.]
Lookup Algorithm
if ∃α : 0 ≤ α < N ∧ cache[α].tag = n
then return cache[α].word[d]
else cache-miss
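The lookup above can be modeled in software. A minimal Python sketch (the class and field names are illustrative, not from the slides):

```python
# Software model of a fully associative cache lookup (illustrative names).
class FullyAssociativeCache:
    def __init__(self, num_blocks, block_words):
        # Every block frame holds a tag and a list of words.
        self.blocks = [{"tag": None, "words": [0] * block_words}
                       for _ in range(num_blocks)]

    def lookup(self, n, d):
        """Return the word for memory tag n, offset d, or None on a miss."""
        for block in self.blocks:      # a block may sit in ANY frame,
            if block["tag"] == n:      # so every frame must be searched
                return block["words"][d]
        return None                    # cache miss
```

In hardware the search over all frames happens in parallel via one comparator per frame, which is why fully associative caches are expensive at large sizes.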
5 / 33
Direct Mapped Cache
If the cache is partitioned into N fixed-size block frames, then cache
block frame k will contain only the (physical or virtual) memory blocks
k + nN, where n = 0, 1, .... Memory addresses are formed as ⟨n, k, d⟩,
where n is the memory tag, k is the cache block index, and d is the
address (offset) within the block.
[Figure: direct-mapped cache. Cache block frame k holds exactly one of the memory blocks k, k + N, k + 2N, ....]
Lookup Algorithm
if cache[k].tag = n
then return cache[k].word[d]
else cache-miss
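Because each memory block has exactly one candidate frame, the lookup needs only a single tag comparison. A Python sketch (names illustrative):

```python
# Software model of a direct-mapped cache lookup (illustrative names).
class DirectMappedCache:
    def __init__(self, num_blocks, block_words):
        self.num_blocks = num_blocks
        self.blocks = [{"tag": None, "words": [0] * block_words}
                       for _ in range(num_blocks)]

    def index_of(self, block_number):
        """Memory block b can live only in frame b mod N."""
        return block_number % self.num_blocks

    def lookup(self, n, k, d):
        block = self.blocks[k]         # exactly one candidate frame
        if block["tag"] == n:
            return block["words"][d]
        return None                    # cache miss
```

The single comparison makes direct mapping fast and cheap, at the cost of conflict misses when two hot blocks map to the same frame.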
6 / 33
Set Associative Cache
A midpoint between direct-mapped and fully associative (at the extremes it
degenerates to each). Basic idea: divide the cache into S sets with E = N/S
block frames per set (N total block frames in the cache). Memory addresses
are formed as ⟨n, k, d⟩, where n is the memory tag, k is the cache set
index, and d is the address (offset) within the block.
[Figure: set associative cache. The cache is divided into S sets of E block frames each; memory block b maps to set b mod S and may occupy any frame within that set.]
Lookup Algorithm
if ∃α : 0 ≤ α < E ∧ cache[k].block[α].tag = n
then return cache[k].block[α].word[d]
else cache-miss
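The lookup combines both previous schemes: indexing selects one set, then an associative search runs over only that set's E frames. A Python sketch (names illustrative):

```python
# Software model of a set associative cache lookup (illustrative names).
class SetAssociativeCache:
    def __init__(self, num_sets, ways, block_words):
        # S sets, each with E (= ways) block frames.
        self.sets = [[{"tag": None, "words": [0] * block_words}
                      for _ in range(ways)]
                     for _ in range(num_sets)]

    def lookup(self, n, k, d):
        for block in self.sets[k]:     # search only the E frames of set k
            if block["tag"] == n:
                return block["words"][d]
        return None                    # cache miss
```

With S = N the model degenerates to direct-mapped (E = 1); with S = 1 it degenerates to fully associative (E = N).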
7 / 33
Cache Policies
▶ Read/Write Policies
  ▶ Read through
  ▶ Early restart
  ▶ Critical word first
  ▶ Write through
  ▶ Write back
  ▶ Way prediction
  ▶ Write allocate/no-allocate
▶ Structure
  ▶ Unified or Split Caches
▶ Contents across Cache Levels
  ▶ Multilevel inclusion
  ▶ Multilevel exclusion
[Figure: cache hierarchy from the CPU/core with split I-cache and D-cache, through the L2 and L3 caches, to main memory and disk.]
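The contrast between write through and write back on a write hit can be sketched as follows (a software model with illustrative names; real hardware defers the write-back until the dirty block is evicted):

```python
# Illustrative write-hit handling under the two write policies.
def write_hit(block, d, value, policy, memory, addr):
    block["words"][d] = value          # update the cached copy either way
    if policy == "write-through":
        memory[addr] = value           # propagate to memory immediately
    else:                              # write-back
        block["dirty"] = True          # mark dirty; memory updated only
                                       # when the block is later evicted
```

Write through keeps memory consistent at the cost of memory traffic on every write; write back batches updates but requires a dirty bit and an eviction-time write.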
8 / 33
Cache Miss Rates by Size
9 / 33
Cache Miss Rates by Type
10 / 33
Lecture break; continue next class
11 / 33
Virtual Memory
[Figure: the CPU issues a virtual address; the address translation unit converts it to the physical address used to access main memory.]
12 / 33
Demand Paged Virtual Memory
13 / 33
Paged Virtual Memory: Structure
[Figure: the pages of the program space mapped, page by page, into the page frames of main memory.]
14 / 33
Paged Virtual Memory: Address Translation
[Figure: the virtual address is split into a page number and a page offset; the page number indexes the page table to obtain the page frame.]
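The field split can be illustrated with bit arithmetic. A Python sketch assuming 4 KiB pages (the page size and the dict-based page table are assumptions, not from the slides):

```python
PAGE_SIZE = 4096                             # assumed page size: 4 KiB
OFFSET_BITS = PAGE_SIZE.bit_length() - 1     # 12 offset bits

def translate(vaddr, page_table):
    """Translate a virtual address via a (dict-based) page table."""
    page_number = vaddr >> OFFSET_BITS       # upper bits: page number
    offset = vaddr & (PAGE_SIZE - 1)         # lower bits: page offset
    frame = page_table[page_number]          # page fault if unmapped
    return (frame << OFFSET_BITS) | offset   # frame number ++ offset
```

The offset bits pass through translation unchanged; only the page number is replaced by the frame number.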
15 / 33
Segmented Virtual Memory
16 / 33
Segmented Virtual Memory: Structure
[Figure: the segments of the program space mapped into main memory as variable-size contiguous regions.]
17 / 33
Seg. Virt. Memory: Address Translation
[Figure: the virtual address is split into a segment number and a segment offset; the segment number indexes the segment table, and the segment base plus the offset forms the physical address.]
18 / 33
Segmented-Paged Virtual Memory
19 / 33
Segmented-Paged: Address Translation
[Figure: the virtual address is split into a segment number, a page number, and a page offset; the segment number indexes the segment table to locate that segment's page table, and the page number indexes that page table to obtain the page frame for the physical address.]
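The two-level lookup can be sketched in Python (field widths and the dict-based tables are illustrative assumptions, not from the slides):

```python
def translate_seg_paged(vaddr, segment_table, page_bits, offset_bits):
    """Segment table -> per-segment page table -> frame (illustrative)."""
    offset = vaddr & ((1 << offset_bits) - 1)                # pg. offset
    page = (vaddr >> offset_bits) & ((1 << page_bits) - 1)   # page #
    seg = vaddr >> (offset_bits + page_bits)                 # seg #
    page_table = segment_table[seg]    # each segment has its own page table
    frame = page_table[page]
    return (frame << offset_bits) | offset
```

Note the cost: a naive implementation makes two table lookups per memory reference, which is why real MMUs cache recent translations.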
20 / 33
Hardware Implications of Virtual Memory
21 / 33
Lecture break; continue next class
22 / 33
The Memory Hierarchy: Who does what?
[Figure: which component (CPU, MMU, operating system, DMA controller) manages each level of the memory hierarchy.]
▶ DMA: operates concurrently w/ other tasks
23 / 33
Notes
24 / 33
Satisfying A Memory Request
Satisfied in L1 cache:
1. MMU: translate address
2. MMU: search I or D cache as indicated by CPU, success
(sometimes simultaneously with translation)
3. MMU: read/write information to/from CPU
25 / 33
Satisfying A Memory Request
Satisfied in L2 cache:
1. MMU: translate address
2. MMU: search I or D cache as indicated by CPU, failure
3. MMU: search L2 cache, success
4. MMU: place block from L2 → L1, critical word first?
5. MMU: read/write information to/from CPU
26 / 33
Satisfying A Memory Request
Satisfied in L3 cache:
1. MMU: translate address
2. MMU: search I or D cache as indicated by CPU, failure
3. MMU: search L2 cache, failure
4. MMU: search L3 cache, success
5. MMU: place block from L3 → L2, critical word first?
6. MMU: place block from L2 → L1, critical word first?
7. MMU: read/write information to/from CPU
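The step sequences on these slides amount to probing each cache level in order and, on a hit, placing the block into every faster level on the way back. A Python sketch (dictionaries stand in for cache levels; critical word first is a hardware refinement not modeled here):

```python
def access(addr, levels, memory):
    """Probe cache levels fastest-first; fill inward on a hit.

    'levels' is a list of dicts, e.g. [l1, l2, l3]; 'memory' backs them.
    """
    for i, level in enumerate(levels):
        if addr in level:                 # hit at level i
            value = level[addr]
            for faster in levels[:i]:     # place block into L(i-1) ... L1
                faster[addr] = value
            return value
    value = memory[addr]                  # every cache missed: go to memory
    for level in levels:                  # fill all levels on the way back
        level[addr] = value
    return value
```

A subsequent access to the same address then hits in L1 directly, which is the entire point of the fill.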
27 / 33
Satisfying A Memory Request
28 / 33
Satisfying A Memory Request
29 / 33
Memory Performance Impact: Naive
30 / 33
Average Memory Access Time
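The standard definition, AMAT = hit time + miss rate × miss penalty, can be worked through with assumed numbers (all three values below are illustrative, not from the slides):

```python
# Assumed, illustrative values for a single cache level.
hit_time = 1.0        # cycles to service a hit
miss_rate = 0.05      # fraction of accesses that miss
miss_penalty = 100.0  # extra cycles to fetch from the next level

# AMAT = hit time + miss rate * miss penalty
amat = hit_time + miss_rate * miss_penalty   # 1 + 0.05 * 100 = 6 cycles
```

Even a 5% miss rate multiplies the effective access time sixfold here, which motivates the cache optimizations that follow.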
31 / 33
Cache Misses
▶ Miss Rate
Local miss rate = (# misses in a cache) / (total # of memory accesses to this cache)
Global miss rate = (# misses in a cache) / (total # of memory accesses generated by the CPU)
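A worked example with hypothetical counts shows why the two rates differ:

```python
# Hypothetical counts: the CPU issues 1000 accesses; L1 misses 40 of
# them, so L2 sees only those 40 accesses and misses 20 of them.
cpu_accesses = 1000
l2_accesses = 40          # = number of L1 misses
l2_misses = 20

local_miss_rate = l2_misses / l2_accesses     # 20/40   = 0.5  (50%)
global_miss_rate = l2_misses / cpu_accesses   # 20/1000 = 0.02 (2%)
```

A lower-level cache can thus have an alarming local miss rate while still contributing only a small global one, because it is asked to handle only the accesses the levels above it already failed.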
32 / 33
Basic Cache Memory Optimizations
33 / 33