
CACHE MEMORY

• Main memory is much slower than the processor. If the CPU spends too much
time accessing main memory during the execution of a program, the efficiency of
the whole system suffers. Therefore a small, fast memory module is interposed
between the large main memory and the CPU. This small fast memory module is
referred to as cache memory.
• The effectiveness of the cache memory is based on a property called Locality of
Reference.
Locality of Reference
• Most of the processor's execution time is spent on routines in which
many instructions are executed repeatedly.
• These instructions may constitute a simple loop, nested loops, or a few
procedures that repeatedly call each other.
• That is, many instructions in localized areas of the program are executed
repeatedly during some period, while the remainder of the program is accessed
relatively infrequently. This is referred to as locality of reference.
• There are two aspects of locality of reference: temporal and spatial.
• The temporal aspect assumes that recently executed instructions are likely
to be executed again very soon. The spatial aspect assumes that instructions
in close proximity to a recently executed instruction are likely to be
executed soon.
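Both aspects can be seen in an ordinary loop. The following minimal Python sketch (not part of the original notes; the function name and data are illustrative) shows where the two kinds of locality come from:

```python
# Temporal locality: the loop body's instructions and the variable `total`
# are reused on every iteration.
# Spatial locality: data[i] touches consecutive elements, i.e. consecutive
# memory addresses, so words near a recently used word are used next.
def sum_array(data):
    total = 0
    for i in range(len(data)):   # same instructions executed repeatedly
        total += data[i]         # consecutive elements: nearby addresses
    return total

print(sum_array([1, 2, 3, 4]))  # -> 10
```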
• When a read request is received from the processor for a particular word in
main memory, the contents of the block of memory containing the requested
word are transferred to a cache block.
• Subsequently, if any of the words from this block is requested by the
processor, the memory controller can supply the word from this cache block
without accessing main memory again.

• The memory controller treats main memory as a group of blocks of words. Cache
memory is also organized as a group of blocks. When a word in main
memory is requested, the main memory block containing that word is
transferred to a particular cache block. The sizes of a main memory block and a
cache block must be the same.

• The correspondence between main memory blocks and the blocks in cache is
specified by a mapping function.
• When the cache is full and the requested word is not present in any of the
cache blocks, the requested word is accessed from main memory.
• The main memory block containing that word must then be loaded into a
cache block, so one existing cache block has to be removed.

• The cache control circuitry must decide which cache block is to be
removed to create space for the new main memory block that contains the
referenced word. The collection of rules for making this decision constitutes
the replacement algorithm.

• If the word requested by the processor exists in the cache, a cache hit is
said to have occurred; otherwise a cache miss is said to have occurred.
• In a read operation, if a cache hit (read hit) occurs, main memory is not
involved at all.
– If a read miss occurs, the block of words that contains the requested word is
copied from main memory into a cache block. After the entire block is
transferred to the cache, the requested word is sent to the processor.
– Alternatively, the requested word can be sent to the processor as soon as it is
read from main memory. This approach is called load-through or early-restart.
It reduces the processor's waiting time for the word, but at the expense
of more complex circuitry.

• For a write operation the system can proceed in two ways.
– The cache location and the main memory location are updated simultaneously.
This technique is called the write-through protocol. It can be implemented with
simpler circuitry but causes a number of unnecessary memory write operations.

– Only the cache location is updated, and it is marked as updated by setting a bit
associated with that block, termed the dirty bit or modified bit. The
main memory location of that word is updated later, when the block
containing the word is removed from the cache. This technique is called the
write-back or copy-back protocol.
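The two write policies can be contrasted with a minimal Python sketch for a single cached word (not part of the original notes; `memory`, `cache`, and the function names are illustrative stand-ins):

```python
# `memory` stands in for main memory; `cache` holds the cached copy.
# The dirty bit marks a block updated in the cache but not yet in memory.
memory = {10: 0}   # word address -> value in main memory
cache = {10: 0}    # cached copy of the same word
dirty = False

def write_through(addr, value):
    # Write-through: update cache and main memory simultaneously.
    cache[addr] = value
    memory[addr] = value

def write_back(addr, value):
    # Write-back: update only the cache and set the dirty bit;
    # main memory stays stale until the block is removed.
    global dirty
    cache[addr] = value
    dirty = True

def evict(addr):
    # On removal, a dirty block must be copied back to main memory.
    global dirty
    if dirty:
        memory[addr] = cache[addr]
        dirty = False

write_back(10, 42)
print(memory[10])  # 0  -- main memory still holds the stale value
evict(10)
print(memory[10])  # 42 -- written back only on eviction
```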
Mapping Functions

• The correspondence between main memory blocks and the blocks in cache is
specified by a mapping function.

• Consider a cache consisting of 128 blocks of 16 words each. The main
memory has 4K (4096) blocks of 16 words each, i.e. 64K words in total. So the
number of bits in a main memory address is 16.

• Assume also that consecutive words have consecutive addresses.


• The mapping functions can be classified into three types.

1. Direct Mapping

2. Associative Mapping

3. Set Associative Mapping


Direct Mapping
• In direct mapping, block j of main memory is mapped onto cache block
(j modulo 128).
Thus when main memory blocks 0, 128, 256, … are loaded into the cache, they
will be loaded into cache block 0.
Main memory blocks 1, 129, etc. will be mapped to cache block 1.
• More than one main memory block is thus mapped onto a single cache block.
• Even when the cache is not full, the existing cache block needs to be removed
when a new block that maps onto it arrives.
• The memory address generated by the processor can be divided into 3 fields.
→ The low-order 4 bits select one of the 16 words in a block.
→ The 7-bit cache block field determines one of the 128 cache blocks.
→ Since there are 4096 main memory blocks and 128 cache blocks, 32 memory
blocks can be mapped onto each cache block. To determine which memory block
is currently mapped onto a particular cache block, a 5-bit tag field is used.
• The 7-bit cache block field of each address generated by the processor points to a
particular cache block. The high-order 5 bits are then compared with the tag field
of that cache block; if they match, the requested word is present in the block,
otherwise it is not.
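The 16-bit address split described above (5-bit tag | 7-bit block | 4-bit word) can be sketched in Python (not part of the original notes; the function name is illustrative):

```python
# Split a 16-bit address for the direct-mapped cache in the example:
# 128 cache blocks of 16 words, 4096 main memory blocks.
def split_direct(addr):
    word = addr & 0xF            # low 4 bits: word within the block
    block = (addr >> 4) & 0x7F   # next 7 bits: cache block index
    tag = addr >> 11             # high 5 bits: tag
    return tag, block, word

# Main memory block j maps to cache block j % 128, so memory blocks
# 0, 128, 256, ... all land in cache block 0, distinguished by their tags.
for j in (0, 128, 256):
    addr = j * 16                # address of the first word of memory block j
    print(split_direct(addr))    # block field is 0 each time; tag differs
```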
Associative Mapping

• In associative mapping, a main memory block can be placed in any cache block.
• Considering the previous example, any of the 4096 main memory blocks can be
placed in any one of the 128 cache blocks.
• So one cache block can hold any one of the 4096 main memory blocks. To identify
which of the 4096 (2^12) main memory blocks is mapped onto a cache block, a
12-bit tag field is used.
• If the tag field of the generated address matches the tag field of a particular
cache block, the remaining 4-bit word field is used to select one among the 16
words in that block.
• If the tag field of the generated address does not match with any of the Tag fields
of the cache blocks, a cache miss is said to occur.
• The space in the cache is utilized more efficiently in this approach. A
particular cache block has to be replaced only when the entire cache is full.
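The associative split (12-bit tag | 4-bit word) and the tag comparison against every cache block can be sketched as follows (not part of the original notes; the function names are illustrative):

```python
# Split a 16-bit address for the fully associative cache in the example.
def split_assoc(addr):
    word = addr & 0xF   # low 4 bits select one of the 16 words
    tag = addr >> 4     # high 12 bits identify the memory block
    return tag, word

# A hit requires comparing the address tag with the tag of every cache
# block, since the block may reside anywhere in the cache.
def lookup(cache_tags, addr):
    tag, word = split_assoc(addr)
    return "hit" if tag in cache_tags else "miss"

print(split_assoc(0x1234))       # tag 0x123, word 4
print(lookup({0x123}, 0x1234))   # hit
print(lookup(set(), 0x1234))     # miss
```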
Set Associative Mapping

• This approach is a combination of the direct and associative mapping
techniques.
• Blocks in the cache are grouped into sets.
• A main memory block is mapped onto a specific cache set (like direct
mapping). The memory block can reside in any cache block within that
set (like associative mapping).
• Continuing the previous example, suppose 2 cache blocks form one cache
set. There will then be 64 sets.
• Each main memory block will be mapped to a specific cache set like the
process done in direct mapping. Here main memory block j will be
mapped onto cache set j modulo 64.
• Within that set 2 blocks are there, so the memory block can reside in any
of these 2 cache blocks.
• The address generated by the processor is divided into 3 parts. One
part identifies one of the 16 words in a block (4 bits); the second
identifies one of the 64 sets in the cache (6 bits).
• Each cache set can hold any one of 64 main memory blocks. To identify
which main memory block is placed in a cache set, the high-order 6 bits
are used as the tag field.
• If the cache instead had 8 blocks per set, there would be 16 sets and the
set field would have 4 bits.
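The set-associative split (6-bit tag | 6-bit set | 4-bit word for the 2-way example) can be sketched in Python (not part of the original notes; the function name and parameter are illustrative):

```python
# Split a 16-bit address for a set-associative cache of 128 blocks.
# With 2 blocks per set there are 64 sets, so set_bits = 6.
def split_set_assoc(addr, set_bits=6):
    word = addr & 0xF                             # low 4 bits: word in block
    set_index = (addr >> 4) & ((1 << set_bits) - 1)  # middle bits: set number
    tag = addr >> (4 + set_bits)                  # remaining high bits: tag
    return tag, set_index, word

# Memory block j maps to set j % 64, so blocks 5 and 69 share set 5
# and are distinguished by their tags (0 and 1).
for j in (5, 69):
    print(split_set_assoc(j * 16))

# With 8 blocks per set there are only 16 sets: the set field shrinks
# to 4 bits and the tag grows to 8 bits.
print(split_set_assoc(5 * 16, set_bits=4))
```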

• Another control bit associated with each cache block, the valid bit, is used
to indicate whether the cache block contains valid data.

• The valid bit of a particular cache block is set to 1 when a new memory
block is loaded into that cache block.

• When a main memory block is updated by a unit that bypasses the cache
(such as DMA), a check is made to determine whether that memory block is
mapped to any cache block. If it is, its valid bit is set to zero.

• In a multiprocessing environment, each processor may have its own
cache, so different copies of the same memory block may exist in
different locations. This problem is referred to as the cache coherence
problem.
