
Miss Rate versus Block Size

[Figure: miss rate (0%–25%) versus block size (16–256 bytes), with one curve per cache size: 1K, 4K, 16K, 64K, and 256K.]
Review

CS501 Advanced Computer Architecture

Lecture 39

Dr. Noor Muhammad Sheikh
Performance versus Year

[Figure: processor and DRAM performance, 1980–2000, on a log scale from 1 to 1000. Microprocessor performance grows about 60% per year (“Moore’s Law”) while DRAM performance grows about 7% per year, so the processor–memory performance gap grows about 50% per year.]
Block Diagram of Cache
[Figure: block diagram of a cache, consisting of a fast memory, a tag RAM, and control logic containing a determine-and-comparison unit.]
int ALFA[100], SUM, i;
SUM = 0;
for ( i = 0; i < 100; i++ )      /* visit the 100 elements in order: sequential, cache-friendly access */
{ SUM = SUM + ALFA[i]; }

Associative Cache
fig. 7.31 (Jordan)

[Figure 7.31: a fully associative cache. The tag memory holds a 13-bit tag field and a valid bit for each of the 256 cache lines (8 bytes per line); any main memory block (0–8191) may be placed in any cache line, e.g. main memory blocks 421 and 119 held in cache blocks 0 and 2. The main memory address is split into a 13-bit tag field and a 3-bit byte field.]
• Main memory address references have two fields:

• 3-bit word field

• 13-bit tag field
• The 3-bit word field becomes a “cache address”.

• The cache address specifies where to find the word in the cache.

• The 13-bit field must be compared against every 13-bit tag in the tag memory.
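A minimal C sketch (not from the lecture) of how this fully associative lookup could be modelled in software; the sizes (256 lines of 8 bytes, 13-bit tag, 3-bit byte field) follow fig. 7.31, while the structure and function names are illustrative assumptions.

#include <stdint.h>

typedef struct {
    uint16_t tag;      /* 13-bit tag of the memory block held in this line */
    int      valid;    /* valid bit */
    uint8_t  data[8];  /* one cache line, 8 bytes */
} CacheLine;

CacheLine cache[256];

/* Returns the addressed byte on a hit, -1 on a miss (16-bit address). */
int assoc_lookup(uint16_t addr)
{
    uint16_t tag  = addr >> 3;       /* upper 13 bits: tag field */
    uint16_t byte = addr & 0x7;      /* lower 3 bits: byte within the line */
    for (int i = 0; i < 256; i++)    /* the hardware compares all tags in parallel */
        if (cache[i].valid && cache[i].tag == tag)
            return cache[i].data[byte];
    return -1;                       /* miss */
}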
Associative Cache Mechanism
fig. 7.32 (Jordan)
[Figure 7.32: associative cache mechanism. The 13-bit tag field of the main memory address is placed in an argument register and compared against every entry of the associative tag memory; a match, qualified by the valid bit, selects one of the 256 cache blocks (8-byte lines), and a selector uses the 3-bit byte field to deliver the requested byte to the CPU.]
Direct-mapped cache
The main memory address is partitioned into three fields:

• Word field

• Group field

• Tag field
Direct-mapped cache
• Cache address is composed of two fields:

• Group field

• Word field
Figure 7.34 (Jordan)
Continued

• The data cache RAM is a block of fast memory, usually a static RAM, and it stores copies of data or instructions frequently requested by the CPU.
• The tag RAM contains the part of the memory address, called the tag, of the data stored in the data cache RAM.
Continued
• Associative memories are considerably more expensive in terms of gates than ordinary access-by-address memories.
• Each bit comparison is made with an XOR gate, whose output will be 0 if there is a match between the two bits.
• A 1 output from the NOR gate indicates a word match.
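A small C sketch (illustrative, not from the lecture) of the match logic just described: each pair of bits is compared with XOR, and a NOR across all the XOR outputs yields 1 only when every bit matched.

#include <stdint.h>

/* Returns 1 if a stored 13-bit tag matches the argument tag, 0 otherwise. */
int tag_match(uint16_t stored_tag, uint16_t arg_tag)
{
    uint16_t diff = (stored_tag ^ arg_tag) & 0x1FFF;  /* per-bit XOR: 0 where bits match */
    return diff == 0;                                 /* NOR of all XOR outputs: 1 only on a full match */
}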
Continued

• The valid bit specifies that the information in the selected block is valid.
Figure 7.33 (Jordan)
Direct mapped cache
fig. 7.33 (Jordan)
[Figure 7.33: a direct-mapped cache. The tag memory holds a 5-bit tag field and a valid bit for each of the 256 cache lines (8 bytes per line). A main memory block can be placed only in the line whose group number equals the block number modulo 256; for example, group 0 can hold one of blocks 0, 256, 512, …, 7936 (tag numbers 0–31). The cache address is an 8-bit group field plus a 3-bit byte field; the main memory address is split into a 5-bit tag, an 8-bit group and a 3-bit byte field.]
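A minimal C sketch (illustrative, not the lecture's code) of a lookup in the direct-mapped organization of fig. 7.33, using the 5/8/3 field split shown above; the structure and function names are assumptions.

#include <stdint.h>

typedef struct {
    uint8_t tag;      /* 5-bit tag field */
    int     valid;    /* valid bit */
    uint8_t data[8];  /* one cache line, 8 bytes */
} Line;

Line cache[256];      /* one line per group */

/* Returns the addressed byte on a hit, -1 on a miss (16-bit address, 5-8-3 split). */
int direct_lookup(uint16_t addr)
{
    uint16_t byte  = addr & 0x7;          /* 3-bit byte field */
    uint16_t group = (addr >> 3) & 0xFF;  /* 8-bit group field selects the line directly */
    uint16_t tag   = (addr >> 11) & 0x1F; /* 5-bit tag field */
    if (cache[group].valid && cache[group].tag == tag)
        return cache[group].data[byte];
    return -1;                            /* miss */
}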
Direct-mapped cache

• Imposes a considerable amount of rigidity on the cache organization.

• Relies on the principle of locality.
Direct mapped cache

• Advantage:
simplicity

• Disadvantage:
only a single block from a given group can be present in the cache at any time.
2-Way Set-Associative Cache
fig. 7.35 (Jordan)

[Figure 7.35: a two-way set-associative cache. The cache has 256 sets, each holding two 8-byte lines with a 5-bit tag per line. A main memory block may occupy either line of the set given by its block number modulo 256; for example, set 0 can hold any two of blocks 0, 256, 512, …, 7936. The cache group address is an 8-bit set field plus a 3-bit byte field; the main memory address is split into a 5-bit tag, an 8-bit set and a 3-bit byte field.]
Continued

• The cache hardware is a combination of direct and associative mapping.
Block Replacement

2-Way Set-Associative Cache

• Similar to the direct-mapped cache.

• There are twice as many blocks in the cache, so that a set of any two blocks from each main memory group can be stored in the cache.
2-Way-Set-Associative Cache

The main memory address is divided into two fields:

• 8-bit set field

• 5-bit tag field
Continued
• The group field is called the set field.
• The set field is decoded and directs the search to the correct group.
• The tags in the selected group are then searched.
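A short C sketch (illustrative, not from the lecture) of the set-associative lookup just described: the 8-bit set field selects a set directly, and the two tags in that set are then compared associatively. The 5-8-3 split follows the earlier figures; everything else is an assumption.

#include <stdint.h>

typedef struct {
    uint8_t tag;       /* 5-bit tag field */
    int     valid;     /* valid bit */
    uint8_t data[8];   /* 8-byte line */
} Line;

Line cache[256][2];    /* 256 sets, two ways per set */

/* Returns the addressed byte on a hit, -1 on a miss (16-bit address, 5-8-3 split). */
int set_assoc_lookup(uint16_t addr)
{
    uint16_t byte = addr & 0x7;
    uint16_t set  = (addr >> 3) & 0xFF;   /* decoded set (group) field */
    uint16_t tag  = (addr >> 11) & 0x1F;
    for (int way = 0; way < 2; way++)     /* search both tags in the selected set */
        if (cache[set][way].valid && cache[set][way].tag == tag)
            return cache[set][way].data[byte];
    return -1;                            /* miss */
}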
Continued

• Multiple copies of the same data can exist in the memory hierarchy simultaneously.
• The cache needs an updating mechanism to prevent old data values from being used; this is the problem of cache coherence.
• The write policy is the method used by the cache to keep the main memory updated.
2-Way-Set-Associative Cache

• There are two possible places in which a block can reside.

• Both places must be searched associatively.

• The cache group address is the same as that of the direct-mapped cache.
Continued
• The dirty bit is a status bit which indicates whether the block in the cache is dirty (has been modified).
• If the block is clean, it is not written back on a miss, since the lower level contains the same information as the cache.
• Writing to the cache is not as easy as reading from it; e.g. modifying a block cannot begin until the tag has been checked to see whether the address is a hit.
Continued
• In the case of write through, also called store through, the information is written both to the block in the cache and to the block in the next lower-level memory, which is the main memory.
• Read misses never result in a write to the lower level.
• The next lower level holds the most current copy of the information at all times.
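A minimal C sketch (illustrative, not the lecture's code) of the write-through policy just described: every write goes to main memory, and the cache line is also updated when the address hits. The direct-mapped layout and all names are assumptions carried over from the earlier sketches.

#include <stdint.h>

typedef struct { uint8_t tag; int valid; uint8_t data[8]; } Line;

extern Line    cache[256];          /* direct-mapped cache from the earlier sketch */
extern uint8_t main_memory[65536];  /* 8192 blocks of 8 bytes (assumed) */

/* Write through: main memory is always updated; the cached copy is updated on a hit. */
void write_through(uint16_t addr, uint8_t value)
{
    uint16_t byte  = addr & 0x7;
    uint16_t group = (addr >> 3) & 0xFF;
    uint16_t tag   = (addr >> 11) & 0x1F;

    if (cache[group].valid && cache[group].tag == tag)
        cache[group].data[byte] = value;   /* keep the cached copy current */
    main_memory[addr] = value;             /* the lower level always holds the latest value */
}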
Continued
• Write stall:
For a write to complete in write through, the CPU has to wait. This wait is called a write stall.
• Write buffer:
A write buffer reduces the write stall by permitting the processor to continue as soon as the data has been written into the buffer, thus allowing the instruction execution to overlap with the memory update.
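A small C sketch (illustrative only) of the write-buffer idea: the processor deposits a write into a FIFO and continues, and the memory system drains the buffer later. The buffer depth and all names are assumptions.

#include <stdint.h>

#define WB_SIZE 4                      /* assumed buffer depth */

typedef struct { uint16_t addr; uint8_t value; } WriteEntry;

static WriteEntry buf[WB_SIZE];
static int head = 0, tail = 0, count = 0;

extern uint8_t main_memory[65536];

/* Processor side: returns 1 and continues as soon as the write is buffered,
   0 if the buffer is full and the CPU must stall. */
int buffer_write(uint16_t addr, uint8_t value)
{
    if (count == WB_SIZE)
        return 0;                      /* write stall: buffer full */
    buf[tail].addr  = addr;
    buf[tail].value = value;
    tail = (tail + 1) % WB_SIZE;
    count++;
    return 1;                          /* CPU overlaps execution with the memory update */
}

/* Memory side: drain one buffered write into main memory. */
void drain_one(void)
{
    if (count > 0) {
        main_memory[buf[head].addr] = buf[head].value;
        head = (head + 1) % WB_SIZE;
        count--;
    }
}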
Continued
Write back:
• The information is written only to the block in the cache when it is modified.
• The modified block is written to the lower level only when it is replaced in the cache.
• Writes occur at the speed of the cache memory.
• Multiple writes within a block require only one write to the lower-level memory.
• It uses less memory bandwidth.
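A minimal C sketch (illustrative, not the lecture's code) of write back with a dirty bit: a write hit updates only the cache and marks the line dirty, and a dirty block is copied to main memory only when it is replaced. The line structure extends the earlier direct-mapped sketch; all names are assumptions.

#include <stdint.h>

typedef struct {
    uint8_t tag;
    int     valid;
    int     dirty;                        /* set when the cached block has been modified */
    uint8_t data[8];
} Line;

Line cache[256];
extern uint8_t main_memory[65536];

/* Write back: a write hit updates only the cache, at cache speed. */
void write_back_store(uint16_t addr, uint8_t value)
{
    uint16_t byte  = addr & 0x7;
    uint16_t group = (addr >> 3) & 0xFF;
    uint16_t tag   = (addr >> 11) & 0x1F;

    if (cache[group].valid && cache[group].tag == tag) {
        cache[group].data[byte] = value;  /* main memory is now stale */
        cache[group].dirty = 1;
    }
    /* On a miss, write allocate or no-write allocate decides what happens (next slide). */
}

/* When a line is replaced, a dirty block is first written to main memory. */
void evict(uint16_t group, uint8_t new_tag)
{
    if (cache[group].valid && cache[group].dirty) {
        uint16_t base = ((uint16_t)cache[group].tag << 11) | (group << 3);
        for (int b = 0; b < 8; b++)
            main_memory[base + b] = cache[group].data[b];  /* one write covers all the block's updates */
    }
    cache[group].tag   = new_tag;         /* the new block will be filled in next */
    cache[group].valid = 1;
    cache[group].dirty = 0;
}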
Continued
Write allocate:
The block is loaded into the cache, followed by the write. This action is similar to a read miss. It is used in write-back caches, since subsequent writes to that particular block will be captured by the cache.

No-write allocate:
The block is modified in the lower level and not loaded into the cache. This method is generally used in write-through caches, as subsequent writes to that block still have to go to the lower level.
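An illustrative C fragment (assumed helper functions, not from the lecture) contrasting the two write-miss policies just described.

#include <stdint.h>

extern uint8_t main_memory[65536];

/* Assumed helpers, in the spirit of the earlier sketches. */
extern int  cache_hit(uint16_t addr);              /* 1 on hit, 0 on miss */
extern void load_block(uint16_t addr);             /* fill the line from main memory */
extern void cache_store(uint16_t addr, uint8_t v); /* write into the cached line */

/* Write allocate: on a miss, load the block first (like a read miss), then write. */
void store_write_allocate(uint16_t addr, uint8_t value)
{
    if (!cache_hit(addr))
        load_block(addr);            /* subsequent writes to this block will hit in the cache */
    cache_store(addr, value);
}

/* No-write allocate: on a miss, only the lower level is updated; the cache is untouched. */
void store_no_write_allocate(uint16_t addr, uint8_t value)
{
    if (cache_hit(addr))
        cache_store(addr, value);
    else
        main_memory[addr] = value;   /* the block is not brought into the cache */
}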
