Chapter 3
COMPUTER MEMORY
Part 1
Contents
02/03/2019
Hardware review
Overview of computer systems: the CPU executes instructions; memory stores data.
To execute an instruction, the CPU must: fetch the instruction; fetch the data used by the instruction; and, finally, execute the instruction on the data, which may result in writing data back to memory.
[Figure: the CPU and memory connected by a bus, along with disks, network, USB, etc.]
Introduction
Capacity
Word size: the natural unit of organisation. Capacity is expressed in terms of words or bytes.
Number of words: common word lengths are 8, 16, 32 bits, etc.
Unit of Transfer
Internal: for internal memory, the unit of transfer is equal to the number of data lines into and out of the memory module.
External: for external memory, data are transferred in blocks, which are usually larger than a word.
Addressable unit: the smallest location which can be uniquely addressed; a word internally, a cluster on magnetic disks.
Access Method
Sequential access: data are accessed in a fixed linear sequence. Example: tape.
Direct access: individual blocks or records have a unique address based on physical location. Access is accomplished by jumping (direct access) to the general vicinity plus a sequential search to reach the final location. Example: disk.
Random access: any location can be addressed and accessed directly, in constant time. Example: RAM.
Associative access: a word is retrieved by comparing its contents rather than its address. Example: cache.
Performance
Access time: the time to perform a read or write operation.
Memory cycle time: access time plus any additional time required before a second access can begin.
Transfer rate: the rate at which data can be transferred into or out of a memory unit.
Physical Types
Semiconductor: RAM
Magnetic: disk and tape
Optical: CD and DVD
Others
Memory Hierarchy
Smaller, faster, and costlier (per byte) storage devices sit near the CPU; larger, slower, and cheaper (per byte) devices sit below:
L0: CPU registers hold words retrieved from the L1 cache.
L1: on-chip L1 cache (SRAM, ~1 ns) holds cache lines retrieved from the L2 cache.
L2: off-chip L2 cache (SRAM, ~5-10 ns) holds cache lines retrieved from the L3 cache.
L3: L3 cache (SRAM) holds cache lines retrieved from main memory.
L4: main memory (DRAM) holds disk blocks retrieved from local disks.
L5: local secondary storage (local disks).
Example: L2 unified cache: 256 KiB, 8-way set associative, access: 11 cycles; backed by main memory.
Processor-Memory Gap
[Figure: DRAM performance improves at roughly 7%/year (2X/10 yrs), illustrating the widening gap between processor and memory speeds.]
Example cache
[Figure: a smaller, faster, more expensive cache memory caches a subset of the blocks of main memory.]
Cache Memories
[Figure: the cache sits between the CPU and the bus; the bus interface connects through the I/O bridge to main memory.]
General cache organization:
S = 2^s sets, E lines per set; each line holds a valid bit, a tag, and B = 2^b bytes of data (the cache block).
Cache size: C = S x E x B data bytes.
Address of a word: t tag bits | s set-index bits | b block-offset bits.
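The geometry relations above can be sketched directly in code; this is an illustrative helper (not from the slides), with parameter values borrowed from the direct-mapped example later in the chapter.

```python
# A sketch of the cache-geometry relations: C = S x E x B, and a
# t/s/b split of the address bits.

def cache_geometry(S, E, B, addr_bits):
    """Return (cache size C, tag bits t, set bits s, offset bits b)."""
    C = S * E * B              # C = S x E x B data bytes
    s = S.bit_length() - 1     # s = log2(S) set-index bits
    b = B.bit_length() - 1     # b = log2(B) block-offset bits
    t = addr_bits - s - b      # remaining address bits form the tag
    return C, t, s, b

# Direct-mapped (E = 1): 2^14 sets of 4-byte blocks, 24-bit addresses
C, t, s, b = cache_geometry(2**14, 1, 4, 24)
print(C, t, s, b)  # 65536 (= 64 KB), 8 tag, 14 set, 2 offset bits
```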
Overview cache
• Smaller, faster, more expensive memory.
• Caches a subset of the blocks (a.k.a. lines).
Hit: the requested block b is already in the cache, e.g. memory holds blocks 0-15, the cache holds blocks 7, 9, 14, 3, and block 14 is requested. Hit!
Cache
Miss: the requested block b is not in the cache, e.g. block 12 is requested while the cache holds 7, 9, 14, 3. Miss!
Block b is then fetched from memory and stored in the cache:
• Placement policy: determines where b goes.
• Replacement policy: determines which block gets evicted (the victim).
Operations of cache
Elements of cache design:
Addressing
Size
Mapping Function
Replacement Algorithm
Write Policy
Block Size
Number of Caches
Cache Addressing
Cache size
Mapping function
Cache of 64 KB
Cache block of 4 bytes, i.e. the cache has 16K (2^14) lines of 4 bytes each.
16 MB main memory, with a 24-bit address (2^24 = 16M).
Thus, for mapping purposes, we can consider main memory to consist of 4M blocks of 4 bytes each.
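The arithmetic in this example can be checked directly; a quick sanity check (not part of the original slides):

```python
# Verifying the 64 KB cache example's arithmetic.
cache_size = 64 * 1024            # 64 KB cache
block_size = 4                    # 4-byte blocks
assert cache_size // block_size == 2**14   # 16K lines

memory_size = 16 * 1024 * 1024    # 16 MB main memory
assert memory_size == 2**24       # 24-bit addresses
assert memory_size // block_size == 4 * 1024 * 1024  # 4M blocks
```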
Cache organization
Direct Mapping
Address structure: Tag (8 bits) | Line (14 bits) | Word (2 bits)
24-bit address
2-bit word identifier (4-byte block)
22-bit block identifier
8-bit tag (= 22 - 14)
14-bit slot or line number
No two blocks that map to the same line have the same Tag field
Check contents of cache by finding the line and checking the Tag
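The 8/14/2 split above can be sketched as a small decoder; the function name and the example address are illustrative, not from the slides.

```python
# Decoding a 24-bit address under the slide's direct mapping:
# 8-bit tag | 14-bit line | 2-bit word.

def split_address(addr):
    word = addr & 0x3             # low 2 bits: word within 4-byte block
    line = (addr >> 2) & 0x3FFF   # next 14 bits: cache line (slot)
    tag  = (addr >> 16) & 0xFF    # top 8 bits: tag
    return tag, line, word

tag, line, word = split_address(0x16339C)
print(hex(tag), hex(line), word)  # tag=0x16, line=0xce7, word=0
```

A lookup then indexes the cache with `line` and compares the stored tag against `tag`, matching the last two bullets above.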
Cache line    Main memory blocks assigned
0             0, m, 2m, ..., 2^s - m
1             1, m+1, 2m+1, ..., 2^s - m + 1
...
m-1           m-1, 2m-1, 3m-1, ..., 2^s - 1
Direct cache example
Direct-mapped cache (E = 1 line per set), B = 8 bytes per block.
Address of int: t tag bits | set index (e.g. 0...01) | block offset (100).
The set-index bits select one of the S = 2^s sets; each set is a single line holding a valid bit, a tag, and bytes 0-7.
Lookup: if the selected line is valid and the tags match, it is a hit and the block offset selects the word; otherwise it is a miss.
Example state (4 sets, 2-byte blocks): Set 0 holds M[8-9] (tag 1, having replaced M[0-1], tag 0); Sets 1 and 2 are empty; Set 3 holds M[6-7] (tag 0).
Direct-Mapped Cache - example
Memory has 16 blocks (block addresses 00 00 to 11 11); the cache has 4 entries, indexed 00 to 11, each holding a tag and block data (e.g. index 00 currently holds the block with tag 00, index 01 the block with tag 11, index 10 the block with tag 01, and index 11 the block with tag 01).
Hash function: (block address) mod (# of blocks in cache).
Each memory address maps to exactly one index in the cache.
Fast (and simpler) to find an address.
Direct-Mapped Cache - Problem
Addresses 8 = 00 10 00 and 24 = 01 10 00 (fields: tag t, index s, offset k) map to the same cache index, 10, with different tags.
What happens if we access the addresses 8, 24, 8, 24, 8, ...?
Every access conflicts in the cache (misses!), while the rest of the cache goes unused.
Solution?
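The conflict behaviour above can be reproduced with a small simulation; this is an illustrative sketch (function name and parameters are assumptions) of a four-line direct-mapped cache with 4-byte blocks, matching the 2/2/2 address fields on the slide.

```python
# A sketch of the conflict misses described above:
# index = (block address) mod (# of lines in the cache).

def direct_mapped_misses(addresses, num_lines=4, block_bytes=4):
    lines = {}                       # index -> tag currently stored
    misses = 0
    for addr in addresses:
        block = addr // block_bytes
        index = block % num_lines
        tag = block // num_lines
        if lines.get(index) != tag:  # not present: miss, then replace
            misses += 1
            lines[index] = tag
        # a matching tag means a hit; nothing to update
    return misses

# 8 (= 00 10 00) and 24 (= 01 10 00) share index 10 but differ in tag,
# so the sequence 8, 24, 8, 24, 8 misses every time:
print(direct_mapped_misses([8, 24, 8, 24, 8]))  # 5
```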
Associative Mapping
Associative Mapping
Example
Address structure: Tag (22 bits) | Word (2 bits)
Address structure: Tag (9 bits) | Set (13 bits) | Word (2 bits)
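This 9/13/2 split can be decoded the same way as in the direct-mapped case; the helper below is an illustrative sketch (its name and example address are assumptions).

```python
# Decoding a 24-bit address for the two-way set-associative cache:
# 9-bit tag | 13-bit set | 2-bit word.

def split_set_assoc(addr):
    word = addr & 0x3                  # 2-bit word offset
    set_index = (addr >> 2) & 0x1FFF   # 13-bit set number (2^13 = 8K sets)
    tag = (addr >> 15) & 0x1FF         # 9-bit tag
    return tag, set_index, word

print(split_set_assoc(0xFFFFFF))  # (0x1FF, 0x1FFF, 3)
```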
E-way set-associative cache (here E = 2 lines per set, B = 8 bytes per block): the set index selects one set, and the tags of both lines in the set are compared in parallel; the block offset then selects the word.
If a valid line in the selected set has a matching tag: hit; the block offset selects the word.
No match: one line in the set is selected for eviction and replacement.
Replacement policies: random, least recently used (LRU), ...
Example of a 2-way set-associative cache:
t=2, s=1, b=1; M = 16 byte addresses, B = 2 bytes/block, S = 2 sets, E = 2 blocks/set.
Example state: Set 0 holds M[0-1] (tag 00, valid) and M[8-9] (tag 10, valid); Set 1 holds M[6-7] (tag 01, valid) in one line, with the other line invalid.
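The two-way example can be sketched as a tiny simulator with LRU replacement within each set; this is an illustrative sketch under the slide's parameters (t=2, s=1, b=1), with hypothetical function and variable names.

```python
# Two-way set-associative cache: 16-byte address space,
# B = 2 bytes/block, S = 2 sets, E = 2 lines/set, LRU within a set.
from collections import OrderedDict

def access(sets, addr):
    """Return 'hit' or 'miss' and update the cache state."""
    block = addr // 2            # b = 1 offset bit
    s = block % 2                # s = 1 set-index bit
    tag = block // 2             # remaining t = 2 tag bits
    lines = sets[s]              # OrderedDict of tags in LRU order
    if tag in lines:
        lines.move_to_end(tag)   # refresh LRU position
        return 'hit'
    if len(lines) == 2:          # set full: evict least recently used
        lines.popitem(last=False)
    lines[tag] = True
    return 'miss'

sets = {0: OrderedDict(), 1: OrderedDict()}
print([access(sets, a) for a in [0, 1, 8, 0]])
# ['miss', 'hit', 'miss', 'hit']: blocks M[0-1] and M[8-9] coexist in set 0
```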
Replacement algorithm
LRU (least recently used)
Worked example: 3 memory frames, reference string B E E R B A R E B E A R. A * marks a hit; on a miss with all frames full, the least recently used page is evicted.

Reference:  B  E  E  R  B  A  R  E  B  E  A  R
Frame 1:    B           *        E     *
Frame 2:       E  *        A        B        R
Frame 3:             R        *           A

Result: 8 page faults.
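The LRU count above can be reproduced in a few lines; this is a sketch (function name is an assumption), keeping the most recently used page at the end of a list.

```python
# LRU page replacement: 3 frames, reference string B E E R B A R E B E A R.

def lru_faults(refs, frames=3):
    mem = []                      # most recently used page at the end
    faults = 0
    for page in refs:
        if page in mem:
            mem.remove(page)      # hit: refresh recency below
        else:
            faults += 1
            if len(mem) == frames:
                mem.pop(0)        # evict the least recently used page
        mem.append(page)
    return faults

print(lru_faults(list("BEERBAREBEAR")))  # 8
```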
LFU (least frequently used)
Worked example: same 3 frames and reference string. Each resident page keeps a use count (shown in the table when it increases on a hit); on a miss with all frames full, the page with the lowest count is evicted and its count is discarded.

Reference:  B  E  E  R  B  A  R  E  B  E  A  R
Frame 1:    B           2           3
Frame 2:       E  2              3     4
Frame 3:             R     A  R           A  R

Result: 7 page faults.
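The LFU count above can likewise be reproduced; a sketch (function name is an assumption) where a page's use count resets when it is evicted, as in the table.

```python
# LFU page replacement: 3 frames, reference string B E E R B A R E B E A R.

def lfu_faults(refs, frames=3):
    counts = {}                      # resident page -> use count
    faults = 0
    for page in refs:
        if page in counts:
            counts[page] += 1        # hit: bump the use count
        else:
            faults += 1
            if len(counts) == frames:
                victim = min(counts, key=counts.get)
                del counts[victim]   # evict least frequently used
            counts[page] = 1         # count restarts for a reloaded page
    return faults

print(lfu_faults(list("BEERBAREBEAR")))  # 7
```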
Write Through
All write operations are made to main memory as well as to the cache, so main memory is always valid: anytime a word in the cache is changed, it is also changed in main memory, and both copies always agree.
Other CPUs can monitor traffic to main memory to keep their own caches up to date when needed.
Drawbacks: generates lots of write traffic to main memory, slows down writes, and may create a bottleneck.
Remember bogus write-through caches!
Write back
During a write, only the contents of the cache are changed. When an update occurs, an UPDATE (dirty) bit associated with that slot is set, so when the block is replaced it is written back to main memory first.
Main memory is updated only when the cache line is replaced.
This causes "cache coherency" problems: different values for the contents of an address exist in the cache and in main memory. Complex circuitry is needed to avoid this problem, and accesses by I/O modules must occur through the cache.
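The write-back idea above can be sketched with a single cache line and a dirty bit; a minimal illustration (all names and values are assumptions), not a realistic cache.

```python
# Write-back sketch: a dirty (UPDATE) bit per line defers the memory
# write until the line is replaced. Single-line "cache" for clarity.

memory = {0: 10, 4: 20}
line = {'addr': None, 'data': None, 'dirty': False}

def write(addr, value):
    global line
    if line['addr'] != addr:          # miss: replace the line
        if line['dirty']:             # write the old block back first
            memory[line['addr']] = line['data']
        line = {'addr': addr, 'data': memory.get(addr), 'dirty': False}
    line['data'] = value              # only the cache is updated...
    line['dirty'] = True              # ...and the line is marked dirty

write(0, 99)
print(memory[0])   # still 10: main memory is stale until eviction
write(4, 77)       # replacing the line writes 99 back to memory
print(memory[0])   # now 99
```

Note how memory and cache briefly disagree about address 0: that window is exactly the coherency problem the slide describes.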