L10 Cache Memory
Cache Memory
Note: The hit ratio of a cache depends on the design and size of the cache, as well as on the behavior of the program in terms of locality of reference. Hit ratios exceeding 0.9 are not uncommon. Higher-level caches generally have higher hit ratios because of their larger size.
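The effect of the hit ratio can be made concrete with a small sketch of the effective access time of a single-level cache; the timing values and the miss-penalty model (cache checked first, then main memory) are illustrative assumptions, not figures from the slides.

```python
# Effective (average) memory access time for a single-level cache.
# Assumed model: every access checks the cache first; on a miss the
# word is then fetched from main memory. All times are illustrative.
def effective_access_time(hit_ratio, t_cache, t_main):
    """Average access time = h*Tc + (1 - h)*(Tc + Tm)."""
    return hit_ratio * t_cache + (1 - hit_ratio) * (t_cache + t_main)

# A hit ratio above 0.9 keeps the average close to the cache speed:
print(effective_access_time(0.95, 2, 50))  # ns, high hit ratio
print(effective_access_time(0.80, 2, 50))  # ns, lower hit ratio
```

With a 2 ns cache and 50 ns main memory, raising the hit ratio from 0.80 to 0.95 cuts the average access time by more than half, which is why the locality behavior of the program matters so much.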
Example Scenarios of Single Level Cache:
Cache Memory - Mapping
• There are far fewer cache lines than main memory blocks
• A method for mapping blocks to cache lines is needed, i.e. to determine which block goes into which line
[Figure: a memory address split into Tag, Line, and Word fields (Word bits at the LSB end), with main memory block j mapped to a cache line]
Cache Memory – Direct Mapping
• Memory address length = (t + l + w) bits
• No. of addressable units = 2^(t+l+w) words or bytes
• Length of a block = 2^w words or bytes
• No. of blocks in main memory, M = 2^(t+l)
• No. of cache lines, C = 2^l
• Size of cache = 2^(l+w) words or bytes (excluding Tags & Control bits)
• No. of blocks mapped to each cache line = M/C = 2^t
• Length of tag = t bits

Cache Line | Main Memory Blocks Mapped
0          | 0, C, 2C, …, M−C
1          | 1, C+1, 2C+1, …, M−C+1
2          | 2, C+2, 2C+2, …, M−C+2
…          | …
C−1        | C−1, 2C−1, 3C−1, …, M−1
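The direct-mapping scheme above can be sketched as a short address-decomposition routine; the field widths and the sample address used below are illustrative assumptions, not values from the slides.

```python
# Direct mapping: split a memory address into (tag, line, word) fields.
# The field widths t, l, w follow the slide's (t + l + w)-bit address.
def direct_map_fields(addr, t, l, w):
    word = addr & ((1 << w) - 1)          # low w bits select the word
    line = (addr >> w) & ((1 << l) - 1)   # next l bits select the line
    tag  = addr >> (w + l)                # remaining t bits are the tag
    return tag, line, word

# Block j always lands in cache line j mod C, where C = 2**l:
t, l, w = 8, 14, 2        # assumed example widths
C = 1 << l
addr = 0x123456           # assumed example address
tag, line, word = direct_map_fields(addr, t, l, w)
block_no = addr >> w
assert line == block_no % C
```

The final assertion is the defining property of direct mapping: the line field of the address is exactly the block number modulo the number of cache lines, so no search is needed to locate a block.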
Cache Memory – Direct Mapping
• Example: with 4-byte cache lines, the Word field selects one of the 4 bytes in a line; cache line C−1 holds the main memory blocks at addresses 00FFFC, 01FFFC, 02FFFC, …, FFFFFC
Cache Memory – Direct Mapping
[Figure: the (t+l+w)-bit memory address split into Tag (t bits), Line (l bits), and Word (w bits) fields]
Cache Memory – Direct Mapping
[Figure: the t-bit Tag from the address is compared with the tag stored in the selected cache line]
Cache Memory – Associative Mapping
• Advantages:
  • Can support a large number of simultaneous localities in the program
  • Flexibility as to which block to replace when a new block is read into the cache
• Disadvantages:
  • Requires expensive circuitry for parallel examination of the tags in all the cache lines
  • Requires a hardware implementation of a suitable Cache Replacement Algorithm to maximize the hit ratio
Cache Memory – Set Associative Mapping
• Combines the strengths of both the Direct and Associative mapping approaches while overcoming their disadvantages
• The cache lines are grouped into ν sets, each consisting of α lines, such that:
  Total cache lines, C = ν × α
  and i = j modulo ν
  where, i = set no. of the cache
         j = main memory block no.
         C = no. of lines in the cache
         ν = no. of cache line sets
         α = no. of lines in each set
• This is an α-way set-associative mapping
• A block Bj can be loaded into any of the α lines in set i
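The placement rule i = j modulo ν can be sketched in a few lines; the cache dimensions used here are illustrative assumptions.

```python
# Set-associative placement sketch: block j maps to set i = j mod v,
# and may occupy any of the alpha lines in that set.
def set_for_block(j, num_sets):
    return j % num_sets

C = 8192           # total cache lines (assumed example value)
alpha = 8          # lines per set -> 8-way set associative
v = C // alpha     # number of sets
assert C == v * alpha

j = 123456         # assumed main memory block number
i = set_for_block(j, v)
# Block j can go into any of the lines i*alpha .. i*alpha + alpha - 1:
candidate_lines = list(range(i * alpha, i * alpha + alpha))
assert len(candidate_lines) == alpha
```

With α = 1 this degenerates to direct mapping (one candidate line per block), and with ν = 1 it becomes fully associative (every line is a candidate), which is why set-associative mapping sits between the two.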
Cache Memory – Set Associative Mapping
[Figure: each set of the cache consists of α lines, L0 to Lα−1]
Cache Memory – Set Associative Mapping
• Cache control logic interprets a memory address as three fields: Tag, Set, and Word
• The Tag in a memory address is much smaller and is compared only with the α tags within a single set
• Address length = (t + s + w) bits
• No. of addressable units = 2^(t+s+w) words or bytes
• Block size = Line size = 2^w words or bytes
• No. of blocks in main memory = 2^(t+s)
[Figure: the memory address divided into Tag (t bits), Set (s bits), and Word (w bits) fields]
Example: Direct Mapping
• A 64KB size cache memory with 8-byte cache lines uses Direct Mapping.
• The main memory size is 64MB
• The memory address length is 26 bits [2^26 = 64M]
• The Word field length is 3 bits [2^3 = 8]
• No. of cache lines = 64 KB / 8 B = 8K
• The Line field length is 13 bits [2^13 = 8K]
• The Tag field length = 26 − 13 − 3 = 10 bits
  Tag (bits 16-25) | Line (bits 3-15) | Word (bits 0-2)
• Total Tag size = Tag field length × No. of cache lines = 10 × 8K bits = 80 Kbits = 10 KB
• Comparator size = Tag length = 10 bits
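A short script can verify the field lengths derived for this direct-mapped configuration:

```python
# Verify the field lengths of the direct-mapped example:
# 64 MB main memory, 64 KB cache, 8-byte cache lines.
from math import log2

MAIN_MEM   = 64 * 2**20   # 64 MB
CACHE_SIZE = 64 * 2**10   # 64 KB
LINE_SIZE  = 8            # bytes per cache line

addr_bits = int(log2(MAIN_MEM))                 # 26
word_bits = int(log2(LINE_SIZE))                # 3
lines     = CACHE_SIZE // LINE_SIZE             # 8K lines
line_bits = int(log2(lines))                    # 13
tag_bits  = addr_bits - line_bits - word_bits   # 10

assert (addr_bits, word_bits, line_bits, tag_bits) == (26, 3, 13, 10)
# Total tag storage: 10 bits per line over 8K lines = 81920 bits (10 KB)
print(tag_bits * lines, "bits")
```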
Example: Associative Mapping
• A 64KB cache memory with 8-byte cache lines uses Associative mapping.
• The main memory size is 64MB
• The memory address length is 26 bits and the Word field length is 3 bits, as before
• The Tag field length = 26 − 3 = 23 bits (there is no Line field)
• No. of cache lines = 64 KB / 8 B = 8K
• Total Tag size = Tag field length × No. of cache lines = 23 × 8K bits = 184 Kbits = 23 KB
• Comparator size = Tag length × No. of cache lines = 23 × 8K = 188,416 bits
Example: Set-Associative Mapping
• A 64KB size 8-way set-associative cache memory uses 8-byte cache lines.
• The main memory size is 64MB
• The memory address length is- 26 bits [226 = 64M]
• The Word field length is- 3 bits [23 = 8]
• No. of cache lines = 64 KBytes/ 8 Bytes = 8K
• No. of cache line sets = 8K/ 8 = 1K
• The Set field length is- 10 bits [210 = 1K]
• The Tag field length = 26 − 10 − 3 = 13 bits
  Tag (bits 13-25) | Set (bits 3-12) | Word (bits 0-2)
• Total Tag size = Tag field length × No. of cache lines = 13 × 8K bits = 104 Kbits = 13 KB
• Comparator size = Tag length × No. of cache lines per set = 13 × 8 = 104 bits
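The tag and comparator arithmetic of the associative and set-associative examples (same 64 MB memory, 64 KB cache, 8-byte lines) can be checked the same way:

```python
# Check the associative and 8-way set-associative examples.
from math import log2

addr_bits, word_bits, lines = 26, 3, 8192   # from the shared setup

# Fully associative: the tag is the whole address minus the word bits.
tag_assoc = addr_bits - word_bits           # 23 bits
assert tag_assoc * lines == 188416          # total tag bits (= 23 KB)

# 8-way set associative: 8K lines grouped into 1K sets of 8 lines.
sets = lines // 8
set_bits = int(log2(sets))                  # 10 bits
tag_sa = addr_bits - set_bits - word_bits   # 13 bits
assert tag_sa * lines == 106496             # total tag bits (= 13 KB)
assert tag_sa * 8 == 104                    # comparator bits per set
```

The assertions show the trade-off numerically: moving from fully associative to 8-way set-associative shrinks the tag from 23 to 13 bits and the parallel comparison from 8K tags to just 8.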
A Comparison of the Mapping Techniques
Parameter                   | Direct       | Associative                           | Set-Associative
No. of Localities Supported | Limited      | Very Large                            | Large Enough
Replacement Algorithm       | Not Required | Need selection of 1 among C           | Need selection of 1 among α
No. of Parallel Comparisons | 1            | As many as the no. of cache lines (C) | Equal to no. of lines per Set (α)
Tag Length (t)              | Short        | Very Long                             | Short
Comparator Size             | Small        | Very Big                              | Not Big
• Direct Mapping is simple and less costly, but does not support multiple simultaneous localities and is prone to Thrashing
• Associative Mapping supports a very large number of localities, but its Comparator Complexity and Tag Space requirements are very high
• Set-Associative Mapping supports a sufficient number of localities, and its Comparator Complexity and Tag Space requirements are within acceptable limits
Questions?
Cache Replacement Algorithm
• The size of cache memory is generally much smaller than the main memory.
• Only a portion of a program, which may contain one or more localities in it,
can be accommodated in the cache.
• As program execution shifts from one locality to another, the new localities must be accommodated in the cache, replacing the older ones.
• For this, the cache uses Replacement Algorithms.
• The most commonly used cache replacement algorithm, Least Recently Used (LRU), assumes that the location referenced farthest back in time is the least likely to be referenced in the near future.
• Among the location contents held in the cache, the one referenced farthest back in time is replaced with the newly referenced location's content.
Cache Memory – Replacement Algorithms
• When the cache is full and a new block needs to be brought into the cache, one of the existing blocks must be replaced
• For direct mapping, there is only one possible line for any
particular block – no choice is possible
• For the associative and set-associative mapping, a replacement
algorithm is needed to choose the cache line
• Some common replacement algorithms are:
• Least Recently Used (LRU)
• First in First Out (FIFO)
• Least Frequently Used (LFU)
• Random
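As a sketch of how LRU replacement behaves, here is a minimal software model built on Python's OrderedDict; a real cache implements this in hardware, typically per set, but the eviction order is the same.

```python
# Minimal LRU replacement sketch: on a hit the line moves to the
# most-recently-used end; on a miss with a full cache the least
# recently used line is evicted.
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()   # block number -> block data

    def access(self, block, data=None):
        """Return True on a hit, False on a miss (loading the block)."""
        if block in self.lines:               # hit
            self.lines.move_to_end(block)
            return True
        if len(self.lines) >= self.capacity:  # miss, cache full
            self.lines.popitem(last=False)    # evict the LRU block
        self.lines[block] = data              # bring the new block in
        return False

cache = LRUCache(2)
cache.access(1); cache.access(2)
cache.access(1)                  # hit: block 2 becomes least recent
cache.access(3)                  # miss: evicts block 2, not block 1
assert list(cache.lines) == [1, 3]
```

FIFO would differ here: it would evict block 1 (the first one loaded) regardless of the intervening hit, which is exactly the behavior LRU is designed to avoid.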
Unified & Split Cache
• When a cache is split into two parts – one dedicated to instructions and the other to data – it is a split cache
• Both exist at the same level, typically as two L1 caches
• Advantage
• Eliminates contention for the cache between the instruction
fetch/decode unit and the execution unit
Unified & Split Cache
• When no such splitting is done, it is a unified cache
• Advantages of unified cache:
• For a given size, a unified cache has a higher hit rate than split
caches because it balances the load between instruction and
data fetches automatically
• Only one cache needs to be designed and implemented
• Usually split caches are used for L1 and unified caches for
higher levels
Cache Memory Operation