Unit 4 - Memory Organization
Memory Organization
Table 4.1
Key Characteristics of Computer Memory Systems
◼ Capacity
◼ Memory is typically expressed in terms of bytes
◼ Unit of transfer
◼ For internal memory the unit of transfer is equal to the number of
electrical lines into and out of the memory module
[Figure: The memory hierarchy, showing outboard storage (disk, CD-ROM, CD-RW, DVD-RAM, Blu-Ray) and off-line storage (magnetic tape)]
[Figure: Performance of a simple two-level memory: average access time (ranging from T1 down toward T1 and up to T1 + T2) versus the fraction of accesses involving only level 1 (hit ratio), from 0 to 1]
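The hit-ratio relationship can be made concrete with the standard two-level average-access-time formula, Ts = H*T1 + (1 - H)*(T1 + T2). A minimal sketch; the timing values below are illustrative, not taken from the text:

```python
def avg_access_time(h, t1, t2):
    """Average access time of a two-level memory.

    h  -- hit ratio: fraction of accesses satisfied by level 1
    t1 -- level-1 access time
    t2 -- additional level-2 access time paid on a miss
    """
    return h * t1 + (1 - h) * (t1 + t2)

# As the hit ratio approaches 1, the average approaches t1 alone
print(avg_access_time(0.95, 1, 100))   # roughly 6 time units
```

At a hit ratio of 0 the average is T1 + T2; at a hit ratio of 1 it is T1, which is the shape of the curve in the figure.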
◼ Disk cache
◼ A portion of main memory can be used as a buffer to temporarily hold data that are to be read out to disk
◼ A few large transfers of data can be used instead of many small
transfers of data
◼ Data can be retrieved rapidly from the software cache rather than
slowly from the disk
[Figure: Multilevel cache organization: levels range from fastest to fast, less fast, and slow]
[Figure: Cache/main-memory structure: (a) cache of C lines (0 to C-1), block length K words; (b) main memory of 2^n addressable words of a given word length, viewed as blocks 0 to M-1]
[Flowchart: Cache read operation: receive address RA from CPU; if the block containing RA is in the cache, deliver the RA word to the CPU; otherwise load the main memory block into a cache line and then deliver the word; DONE]
[Figure: Typical cache organization: the processor and cache attach to the system bus through an address buffer, a data buffer, and control lines]
Table 4.2
Elements of Cache Design
© 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved.
Cache Addresses
Virtual Memory
◼ Virtual memory
◼ Facility that allows programs to address memory from a logical
point of view, without regard to the amount of main memory
physically available
◼ When used, the address fields of machine instructions contain
virtual addresses
◼ For reads from and writes to main memory, a hardware memory
management unit (MMU) translates each virtual address into a
physical address in main memory
[Figure: Logical and physical cache: in a logical (virtual) cache the cache sits between the processor and the MMU; in a physical cache it sits between the MMU and main memory]
[Figure: Mapping from main memory to cache: (a) direct mapping: the first m blocks of main memory (equal to the size of the cache) map one-to-one onto the m cache lines L0 to Lm-1, where b = length of block in bits and t = length of tag in bits; (b) associative mapping: any one block of main memory can load into any line of the cache]
[Figure: Direct-mapping cache organization: the s-bit block address splits into an (s-r)-bit tag and an r-bit line number, plus a w-bit word field; the stored tag of the selected line is compared with the address tag: 1 if match (hit in cache), 0 if no match (miss in cache)]
[Figure: Direct mapping example: 16-MByte main memory, 16-Kline cache, 8-bit tags and 32-bit data per line. Note: memory address values are in binary representation; other values are in hexadecimal]
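The example above uses a 16-MByte memory (24-bit addresses), a 16K-line cache, and 4-byte blocks, so each address splits into an 8-bit tag, a 14-bit line number, and a 2-bit word field. A minimal sketch of that decomposition, with the field widths from the example:

```python
def split_direct(addr, tag_bits=8, line_bits=14, word_bits=2):
    """Split a 24-bit address for a direct-mapped cache into
    (tag, line, word), most to least significant."""
    word = addr & ((1 << word_bits) - 1)
    line = (addr >> word_bits) & ((1 << line_bits) - 1)
    tag = addr >> (word_bits + line_bits)
    return tag, line, word

# 0x160004 = 000101100000000000000100 in binary
print(split_direct(0x160004))   # tag 0x16, line 1, word 0
```

Every address with the same line field competes for the same cache line, which is why direct mapping leaves no replacement choice.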
[Figure: Fully associative cache organization: the s-bit tag of the address is compared simultaneously with the tag of every cache line L0 to Lm-1: 1 if match (hit in cache), 0 if no match (miss in cache)]
[Figure: Associative mapping example: 16-Kline cache with 22-bit tags and 32-bit data; Main Memory Address = Tag (22 bits) + Word (2 bits)]
[Figure: Set-associative mapping: the cache is divided into v sets of k lines each (k-way); the first v blocks of main memory (equal to the number of sets) map to consecutive sets; the s+w-bit address splits into an (s-d)-bit tag, a d-bit set field, and a word field, and the tag is compared only within the selected set to determine hit (match) or miss (no match)]
◼ Number of sets = v = 2^d
[Figure: Two-way set-associative mapping example: 16-MByte main memory; each set number selects two tag/data pairs. Note: memory address values are in binary representation; other values are in hexadecimal]
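For a two-way version of the same cache (16K lines, so 8K sets), the usual breakdown of a 24-bit address is a 9-bit tag, a 13-bit set number, and a 2-bit word field; block j maps to set j mod 2^13. A sketch, with those assumed field widths:

```python
def split_set_assoc(addr, tag_bits=9, set_bits=13, word_bits=2):
    """Split a 24-bit address for a two-way set-associative cache
    into (tag, set, word); the tag is matched against both lines
    of the selected set."""
    word = addr & ((1 << word_bits) - 1)
    set_no = (addr >> word_bits) & ((1 << set_bits) - 1)
    tag = addr >> (word_bits + set_bits)
    return tag, set_no, word

print(split_set_assoc(0x160004))   # tag 0x02C, set 1, word 0
```

Note the tag is one bit wider than in direct mapping because two lines now share each set index.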
[Figure: Varying associativity over cache size: hit ratio versus cache size (1k to 1M bytes) for direct mapping and 2-way, 4-way, 8-way, and 16-way set-associative organization]
◼ Once the cache has been filled, when a new block is brought
into the cache, one of the existing blocks must be replaced
◼ For direct mapping there is only one possible line for any
particular block and no choice is possible
◼ First-in-first-out (FIFO)
◼ Replace that block in the set that has been in the cache longest
◼ Easily implemented as a round-robin or circular buffer technique
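The round-robin implementation mentioned above can be sketched for a single set: a pointer marks the line that has been resident longest, and each replacement advances it circularly. This is an illustrative sketch, not any particular hardware design:

```python
class FIFOSet:
    """FIFO replacement for one cache set, implemented as a
    circular buffer: the victim pointer advances round-robin."""
    def __init__(self, k):
        self.lines = [None] * k   # tags currently resident
        self.next = 0             # round-robin victim pointer

    def access(self, tag):
        if tag in self.lines:
            return True           # hit: FIFO order is unchanged
        # miss: replace the line that has been resident longest
        self.lines[self.next] = tag
        self.next = (self.next + 1) % len(self.lines)
        return False
```

For the access sequence A, B, A, C on a 2-line set, A and B miss, A then hits, and C evicts A, the oldest resident.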
◼ If at least one write operation has been performed on a word in that line of the cache, then main memory must be updated by writing the line of cache out to the block of memory before bringing in the new block
◼ A more complex problem occurs when multiple processors are attached to the same bus and each processor has its own local cache - if a word is altered in one cache, it could conceivably invalidate a word in other caches
◼ Write back
◼ Minimizes memory writes
◼ Updates are made only in the cache
◼ Portions of main memory are invalid and hence accesses by I/O
modules can be allowed only through the cache
◼ This makes for complex circuitry and a potential bottleneck
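The write-back policy can be sketched with a dirty bit per line: a write touches only the cache and sets the bit, and main memory is updated only when a dirty line is evicted. A minimal model; the dictionary-based cache and line layout here are illustrative:

```python
def write(cache, line, value):
    """Write under write-back: update the cached copy only and mark it dirty."""
    block_addr, _, _ = cache[line]
    cache[line] = (block_addr, value, True)

def evict(cache, memory, line):
    """Evict a line; main memory is written only if the line is dirty."""
    block_addr, data, dirty = cache.pop(line)
    if dirty:
        memory[block_addr] = data
```

Between the write and the eviction, main memory holds a stale value, which is why accesses by I/O modules must go through the cache.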
◼ The on-chip cache reduces the processor's external bus activity,
speeding up execution and increasing overall system performance
◼ When the requested instruction or data is found in the on-chip cache, the bus
access is eliminated
◼ On-chip cache accesses will complete appreciably faster than would even
zero-wait state bus cycles
◼ During this period the bus is free to support other transfers
◼ Two-level cache:
◼ Internal cache designated as level 1 (L1)
◼ External cache designated as level 2 (L2)
◼ Potential savings due to the use of an L2 cache depends on the hit rates
in both the L1 and L2 caches
◼ The use of multilevel caches complicates all of the design issues related
to caches, including size, replacement algorithm, and write policy
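The dependence on both hit rates can be sketched directly: an access misses the hierarchy only if it misses L1 and then also misses L2, so the total hit ratio is H1 + (1 - H1) * H2, where H2 is the local L2 hit ratio measured on L1 misses. The numbers below are illustrative:

```python
def total_hit_ratio(h1, h2_local):
    """Fraction of accesses satisfied by either cache level.
    h2_local is the L2 hit ratio over L1 misses only."""
    return h1 + (1 - h1) * h2_local

print(total_hit_ratio(0.90, 0.80))   # roughly 0.98
```

Even a modest local L2 hit ratio lifts the total substantially, because it only has to cover the small fraction of accesses that miss L1.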
[Figure: Total hit ratio (0.78 to 0.98) versus L2 cache size (1k to 2M), plotted for L1 = 8k and L1 = 16k]
Figure 4.17 Total Hit Ratio (L1 and L2) for 8 Kbyte and 16 Kbyte L1
Table 5.1
Semiconductor Memory Types
[Figure: Memory cell structures: (a) Dynamic RAM (DRAM) cell: address line, transistor, storage capacitor; (b) Static RAM (SRAM) cell: six transistors T1 to T6]
◼ Dynamic cell (DRAM)
◼ Simpler to build, smaller
◼ More dense (smaller cells = more cells per unit area)
◼ Less expensive
◼ Requires the supporting refresh circuitry
◼ Tends to be favored for large memory requirements
◼ Used for main memory
◼ Static cell (SRAM)
◼ Faster
◼ Used for cache memory (both on and off chip)
Read-Only Memory (ROM)
◼ Contains a permanent pattern of data that cannot be
changed or added to
◼ EPROM: Erasable programmable read-only memory
◼ EEPROM: Electrically erasable programmable read-only memory
◼ Flash memory: Intermediate between EPROM and EEPROM in both cost and functionality
[Figure: Typical DRAM organization: refresh counter and refresh circuitry, MUX, row and column address buffers, column decoder, and data input/output buffers (D1 to D4)]
[Figure: 256-KByte memory organization: eight chips of 512 words by 512 bits; a 9-bit row address and a 9-bit column address from the memory address register (MAR) drive 1-of-512 decode and 1-of-512 bit-sense selection on each chip, with data assembled in the memory buffer register (MBR)]
If consecutive words of memory are stored in different banks, the transfer of a block of memory is sped up
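With low-order interleaving, the bank number is simply the address modulo the number of banks, so consecutive words land in different banks and their accesses can be overlapped. A minimal sketch; the eight-bank configuration is assumed for illustration:

```python
def bank_of(addr, num_banks=8):
    """Low-order interleaving: consecutive word addresses rotate through the banks."""
    return addr % num_banks

# Eight consecutive words fall in eight distinct banks
print([bank_of(a) for a in range(8)])   # [0, 1, 2, 3, 4, 5, 6, 7]
```

A block transfer can therefore keep all banks busy at once instead of waiting on a single bank's cycle time.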
[Figure: SDRAM read timing: a READ command followed by NOP commands on successive CLK cycles]
Table 5.4
DDR Characteristics
◼ Principle of locality
◼ Even in a large process, execution may be confined to a small section of
the program (e.g., a subroutine) at any one time
◼ It is better use of memory to load in just a few pages
◼ If the program references data or branches to an instruction on a page not in main
memory, a page fault is triggered which tells the OS to bring in the desired page
◼ Advantages:
◼ More processes can be maintained in memory
◼ Time is saved because unused pages are not swapped in and out of memory
◼ Disadvantages:
◼ When one page is brought in, another page must be thrown out (page replacement)
◼ If a page is thrown out just before it is about to be used the OS will have to go get the
page again
◼ Thrashing
◼ When the processor spends most of its time swapping pages rather than
executing instructions
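Thrashing can be made visible by replaying a page-reference string against a fixed number of frames and counting faults; the sketch below uses LRU replacement, an illustrative policy choice:

```python
from collections import OrderedDict

def count_faults(refs, frames):
    """Count page faults for reference string `refs` with `frames`
    physical frames under LRU replacement."""
    resident = OrderedDict()   # page -> None, kept in LRU order
    faults = 0
    for page in refs:
        if page in resident:
            resident.move_to_end(page)        # mark most recently used
        else:
            faults += 1                       # page fault
            if len(resident) == frames:
                resident.popitem(last=False)  # evict least recently used
            resident[page] = None
    return faults

# Cycling through four pages with only three frames
print(count_faults([0, 1, 2, 3] * 3, 3))   # 12: every reference faults
```

That worst case, where every reference forces a swap, is exactly the thrashing pattern described above.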
[Figure: Paging: pages of process A loaded into nonconsecutive main memory frames (13, 14, ..., 18), with the page table of process A recording which frame holds each page]
[Figure: Inverted page table structure (one entry for each physical memory frame): the real address is formed from the frame # and the offset (m bits)]
[Flowchart: on a page fault the CPU activates the I/O hardware and the page is transferred from disk to main memory; if memory is full, page replacement is performed first; the page tables are updated, the TLB is updated, and the CPU generates the physical address]
Figure 8.18 Operation of Paging and Translation Lookaside Buffer (TLB) [FURH87]
[Figure: TLB and cache operation: the page # of the virtual address (Page # + Offset) is looked up in the TLB; a TLB hit yields the real address directly, while a TLB miss consults the page table; the real address then drives the cache operation and, on a cache miss, main memory, returning the requested value]
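The TLB-then-page-table lookup can be sketched as follows; the dictionary-based tables and 4-KByte pages (12 offset bits) are illustrative assumptions:

```python
def translate(vaddr, tlb, page_table, offset_bits=12):
    """Translate a virtual address using a TLB in front of the page table.
    tlb and page_table both map page# -> frame#."""
    page = vaddr >> offset_bits
    offset = vaddr & ((1 << offset_bits) - 1)
    if page in tlb:
        frame = tlb[page]            # TLB hit: no page-table access
    else:
        frame = page_table[page]     # TLB miss: walk the page table
        tlb[page] = frame            # cache the translation in the TLB
    return (frame << offset_bits) | offset
```

After the first miss the translation is cached, so a repeat access to the same page skips the page-table walk.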
◼ An executing program may only access data segments for which its
clearance level is lower than or equal to the privilege level of the data
segment
[Figure: x86 memory-management formats. Segment descriptor fields (flattened in the original): Base 31...24 | G | D/B | L | AVL | segment limit 19...16 | P | DPL | S | Type | Base 23...16]
(d) Page directory entry: page frame address 31...12 | AVL | PS | 0 | A | PCD | PWT | US | RW | P
AVL: Available for systems programmer use; PS: Page size; A: Accessed; PCD: Cache disable; PWT: Write through; US: User/supervisor; RW: Read-write; P: Present (fields marked 0 are reserved)
(e) Page table entry: page frame address 31...12 | AVL | D | A | PCD | PWT | US | RW | P
D: Dirty
Base
Defines the starting address of the segment within the 4-GByte linear address space.
D/B bit
In a code segment, this is the D bit and indicates whether operands and addressing modes
are 16 or 32 bits.
Descriptor Privilege Level (DPL)
Specifies the privilege level of the segment referred to by this segment descriptor.
Granularity bit (G)
Indicates whether the Limit field is to be interpreted in units of one byte or 4 KBytes.
Limit
Defines the size of the segment. The processor interprets the limit field in one of two ways,
depending on the granularity bit: in units of one byte, up to a segment size limit of 1
MByte, or in units of 4 KBytes, up to a segment size limit of 4 GBytes.
S bit
Determines whether a given segment is a system segment or a code or data segment.
Segment Present bit (P)
Used for nonpaged systems. It indicates whether the segment is present in main memory.
For paged systems, this bit is always set to 1.
Type
Distinguishes between various kinds of segments and indicates the access attributes.
Table 8.5
x86 Memory Management Parameters (page 2 of 2)
Page Directory Entry and Page Table Entry
Accessed bit (A)
This bit is set to 1 by the processor in both levels of page tables when a read or write
operation to the corresponding page occurs.
Dirty bit (D)
This bit is set to 1 by the processor when a write operation to the corresponding page
occurs.
Page Frame Address
Provides the physical address of the page in memory if the present bit is set. Since page
frames are aligned on 4K boundaries, the bottom 12 bits are 0, and only the top 20 bits are
included in the entry. In a page directory, the address is that of a page table.
Page Cache Disable bit (PCD)
Indicates whether data from page may be cached.
Page Size bit (PS)
Indicates whether page size is 4 KByte or 4 MByte.
Page Write Through bit (PWT)
Indicates whether write-through or write-back caching policy will be used for data in the
corresponding page.
Present bit (P)
Indicates whether the page table or page is in main memory.
Read/Write bit (RW)
For user-level pages, indicates whether the page is read-only access or read/write access for
user-level programs.
User/Supervisor bit (US)
Indicates whether the page is available only to the operating system (supervisor level) or is
available to both operating system and applications (user level).
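The Limit and Granularity entries above combine into a segment size as follows; a minimal sketch of that calculation:

```python
def segment_size(limit, g):
    """Segment size in bytes from the 20-bit Limit field.
    g = 0: units of one byte (up to 1 MByte);
    g = 1: units of 4 KBytes (up to 4 GBytes)."""
    unit = 4096 if g else 1
    return (limit + 1) * unit

print(segment_size(0xFFFFF, 0))   # 1048576 bytes = 1 MByte
print(segment_size(0xFFFFF, 1))   # 4294967296 bytes = 4 GBytes
```

The same 20-bit field thus covers both small, byte-granular segments and segments spanning the full 4-GByte linear address space.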
Paging
[Figure: ARM memory system: the ARM core issues virtual addresses to the cache, with a write buffer and cache line fetch hardware alongside the cache]
Domain
Collection of memory regions. Access control can be applied on the basis of domain.
Shared (S)
Determines whether the translation is for not-shared (0), or shared (1) memory.
SBZ
Should be zero.
◼ The region can be privileged access only, reserved for use by the OS and not by
applications
Chapter outline:
◼ Magnetic disk
◼ Magnetic read and write mechanisms
◼ RAID
◼ RAID level 0
◼ RAID level 1
Magnetic Read and Write Mechanisms
◼ Write: Electric pulses are sent to the write head and the resulting magnetic patterns are recorded on the surface below, with different patterns for positive and negative currents
◼ The write head itself is made of easily magnetizable material and is in the shape of a rectangular doughnut with a gap along one side and a few turns of conducting wire along the opposite side
[Figure: Inductive write / magnetoresistive read head: an MR sensor behind a shield and an inductive write element carrying the write current, positioned over a recording medium whose track-width regions carry alternating N/S magnetization]
[Figure: Disk data layout: each surface is divided into concentric tracks, and each track into sectors S1 ... SN separated by inter-sector gaps]
[Figure: Disk components: platters on a spindle with one read-write head per surface mounted on a boom; the aligned tracks across surfaces form a cylinder; the arrow indicates the direction of arm motion]
[Figure: Disk sector format, 600 bytes/sector: gap 1 (17 bytes), ID field (7 bytes), gap 2 (41 bytes), data field (515 bytes), gap 3 (20 bytes); the ID field comprises sync (1), track (2), head (1), sector (1), and CRC (2) bytes, and the data field comprises sync (1), 512 data, and CRC (2) bytes]
Table 6.1
Physical Characteristics of Disk Systems
◼ To read or write the head must be positioned at the desired track and at the beginning
of the desired sector on the track
◼ Track selection involves moving the head in a movable-head system or electronically
selecting one head on a fixed-head system
◼ Once the track is selected, the disk controller waits until the appropriate sector rotates to
line up with the head
◼ Seek time
◼ On a movable–head system, the time it takes to position the head at the track
◼ Access time
◼ The sum of the seek time and the rotational delay
◼ The time it takes to get into position to read or write
◼ Transfer time
◼ Once the head is in position, the read or write operation is then performed
as the sector moves under the head
◼ This is the data transfer portion of the operation
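These three components combine into the usual average-access-time estimate: average seek time, plus half a revolution of rotational delay, plus a transfer time of b / (rN) for b bytes on a track of N bytes at r revolutions per second. A sketch with illustrative parameters:

```python
def disk_access_time(seek_s, rpm, bytes_to_read, bytes_per_track):
    """Average time (seconds) to read a run of bytes:
    seek + rotational delay (half a revolution) + transfer."""
    r = rpm / 60.0                        # revolutions per second
    rotational_delay = 1.0 / (2.0 * r)    # on average, half a revolution
    transfer = bytes_to_read / (r * bytes_per_track)
    return seek_s + rotational_delay + transfer

# e.g. 4 ms average seek, 7200 rpm, one 512-byte sector, 512-KByte track
t = disk_access_time(0.004, 7200, 512, 512 * 1000)
```

At 7200 rpm the rotational delay alone averages about 4.2 ms, typically dwarfing the microsecond-scale transfer of a single sector.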
RAID Level 2
Characteristics:
◼ Makes use of a parallel access technique
◼ In a parallel access array all member disks participate in the execution of every I/O request
◼ Spindles of the individual drives are synchronized so that each disk head is in the same position on each disk at any given time
◼ Data striping is used
◼ Strips are very small, often as small as a single byte or word
Performance:
◼ An error-correcting code is calculated across corresponding bits on each data disk and the bits of the code are stored in the corresponding bit positions on multiple parity disks
◼ Typically a Hamming code is used, which is able to correct single-bit errors and detect double-bit errors
◼ The number of redundant disks is proportional to the log of the number of data disks
◼ Would only be an effective choice in an environment in which many disk errors occur
RAID Level 3
Redundancy:
◼ Requires only a single redundant disk, no matter how large the disk array
◼ Employs parallel access, with data distributed in small strips
◼ Instead of an error-correcting code, a simple parity bit is computed for the set of individual bits in the same position on all of the data disks
◼ Can achieve very high data transfer rates
Performance:
◼ In the event of a drive failure, the parity drive is accessed and data is reconstructed from the remaining devices
◼ Once the failed drive is replaced, the missing data can be restored on the new drive and operation resumed
◼ In the event of a disk failure, all of the data are still available in what is referred to as reduced mode
◼ Return to full operation requires that the failed disk be replaced and the entire contents of the failed disk be regenerated on the new disk
◼ In a transaction-oriented environment performance suffers
RAID Level 4
Characteristics:
◼ Makes use of an independent access technique
◼ In an independent access array, each member disk operates independently so that separate I/O requests can be satisfied in parallel
◼ Data striping is used
◼ Strips are relatively large
◼ To calculate the new parity the array management software must read the old user strip and the old parity strip
Performance:
◼ Involves a write penalty when an I/O write request of small size is performed
◼ Each time a write occurs the array management software must update not only the user data but also the corresponding parity bits
◼ Thus each strip write involves two reads and two writes
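The two-reads-two-writes penalty follows from the parity update rule: new parity = old parity XOR old data XOR new data, which is why the old data strip and old parity strip must be read before the new ones are written. A bytewise sketch:

```python
def update_parity(old_parity, old_data, new_data):
    """RAID 4/5 small-write parity update:
    new_parity = old_parity ^ old_data ^ new_data (bytewise XOR)."""
    return bytes(p ^ od ^ nd
                 for p, od, nd in zip(old_parity, old_data, new_data))

# Sanity check with a two-strip stripe
d0, d1 = b'\x0f\x0f', b'\xf0\x01'
parity = bytes(a ^ b for a, b in zip(d0, d1))   # parity over the stripe
new_d0 = b'\xaa\x55'
new_parity = update_parity(parity, d0, new_d0)
```

The result equals the XOR of the new strip with the untouched strips, so the parity disk stays consistent without rereading the whole stripe.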
◼ Durability
◼ Longer lifespan
Table 6.5
Comparison of Solid State Drives and Disk Drives
[Figure: SSD architecture: a host interface feeding the SSD controller and addressing logic, which drive multiple banks of flash memory components]
Table 6.6 Optical Disk Products
CD-ROM
Compact Disk Read-Only Memory. A nonerasable disk used for storing computer data. The standard system uses 12-cm disks and can hold more than 650 Mbytes.
CD-R
CD Recordable. Similar to a CD-ROM. The user can write to the disk only once.
CD-RW
CD Rewritable. Similar to a CD-ROM. The user can erase and rewrite to the disk multiple times.
DVD
Digital Versatile Disk. A technology for producing digitized, compressed representations of video information, as well as large volumes of other digital data. Both 8 and 12 cm diameters are used, with a double-sided capacity of up to 17 Gbytes. The basic DVD is read-only (DVD-ROM).
DVD-R
DVD Recordable. Similar to a DVD-ROM. The user can write to the disk only once. Only one-sided disks can be used.
DVD-RW
DVD Rewritable. Similar to a DVD-ROM. The user can erase and rewrite to the disk multiple times. Only one-sided disks can be used.
Blu-Ray DVD
High-definition video disk. Provides considerably greater data storage density than DVD, using a 405-nm (blue-violet) laser. A single layer on a single side can store 25 Gbytes.
Compact Disk Read-Only Memory (CD-ROM)
◼ Audio CD and the CD-ROM share a similar technology
◼ The main difference is that CD-ROM players are more rugged and
have error correction devices to ensure that data are properly transferred
◼ Production:
◼ The disk is formed from a resin such as polycarbonate
◼ Digitally recorded information is imprinted as a series of microscopic pits on
the surface of the polycarbonate
◼ This is done with a finely focused, high intensity laser to create a master disk
◼ The master is used, in turn, to make a die to stamp out copies onto
polycarbonate
◼ The pitted surface is then coated with a highly reflective surface, usually
aluminum or gold
◼ This shiny surface is protected against dust and scratches by a top
coat of clear acrylic
◼ Finally a label can be silkscreened onto the acrylic
[Figure: CD operation: a laser transmit/receive unit reads pits and lands through the polycarbonate plastic, reflecting off the aluminum layer]
[Figure: CD-ROM block format, 2352 bytes per sector: a SYNC field (00, a run of FF bytes, 00), a header (MIN, SEC, Sector, Mode), the data, and ECC]
[Figure: CD, DVD, and Blu-ray geometry: a 1.2-mm-thick disk with a protective acrylic layer, a reflective aluminum layer, and a polycarbonate (plastic) substrate; the laser focuses on pits in the data layer in front of the reflective layer. The beam spot, pit size, and track pitch shrink as the laser wavelength drops from 780 nm (CD) to 650 nm (DVD) to 405 nm (Blu-ray)]
◼ Serial recording
◼ Data are laid out as a sequence of bits along each track
[Figure (a): Serial recording on tape: tracks 0 and 1 run along the length of the tape, with the direction of read/write shown relative to the bottom edge of the tape]
(b) Block layout for a system that reads/writes four tracks simultaneously:
Track 3: 4 8 12 16 20
Track 2: 3 7 11 15 19
Track 1: 2 6 10 14 18
Track 0: 1 5 9 13 17
(blocks numbered in the direction of tape motion)