
Lecture 12: Memory Hierarchy and Introduction to Cache

Rose Gomar
Department of Systems and Computer Engineering
Textbook/Copyright
• Hennessy, John L., and David A. Patterson. Computer Architecture: A Quantitative Approach. Elsevier, 6th edition, 2017, Chapter 2.
• Hennessy, John L., and David A. Patterson. Computer Architecture: A Quantitative Approach. Elsevier, 6th edition, 2017, Appendix B.
• Patterson, David A., and John L. Hennessy. Computer Organization and Design: RISC-V Edition, Chapter 5.
• Some of the slides are provided by Elsevier (Copyright © 2019, Elsevier Inc. All rights reserved)

What will we learn in this lecture?
• Memory technologies
• Memory Hierarchy
• Motivation for cache
• Caches

The Three Main Memory Categories
• Generally, there are three categories of memory inside a computer system:
➢ CPU memory
➢ Main memory
➢ Secondary memory
• CPU memory is the collection of registers inside the CPU. Registers are very fast because they are fabricated on the same silicon die (chip) as the CPU. However, they are also very expensive, so only a small number of registers are provided.
• Main memory is the system RAM and ROM. Compared to CPU registers, it is larger and less expensive, but slower to access.
• An example of secondary memory is a hard disk. Compared to main memory, it is larger and cheaper, but much slower.
Memory Technologies
• Random Access Memory (RAM)
– SRAM (Static RAM)
– DRAM (Dynamic RAM)
• Read Only Memory (ROM)
• Flash Storage
– A type of EEPROM
• Magnetic Disk
• Why do we need different memory technologies?
– Designers always want an unlimited amount of the fastest memory!
– SRAM is the fastest but also the most expensive.
– DRAM is slower but less expensive.
– Flash and magnetic disk provide even more capacity.

A Simple Memory Model
• Not a good model for big capacities
• Very slow (why?)
[Figure: simple memory model. A clocked (CLK) memory with read address, write address, and write data inputs, Read and Write control signals, an address decoder, and a read data output.]

Main Memory Arrays
• Efficiently store large amounts of data
• 3 common types:
– Dynamic random-access memory (DRAM)
– Static random-access memory (SRAM)
– Read only memory (ROM)
• M-bit data value read/written at each unique N-bit address
• 2-dimensional array of bit cells
• Each bit cell stores one bit
• N address bits and M data bits:
– 2^N rows and M columns
– Depth: number of rows (number of words)
– Width: number of columns (size of word)
– Array size: depth × width = 2^N × M
[Figure: array with an N-bit address input and an M-bit data port. Example with N = 2 and M = 3 (depth 4, width 3):]
Address   Data
11        0 1 0
10        1 0 0
01        1 1 0
00        0 1 1
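The sizing rule above (depth × width = 2^N × M) can be checked with a few lines of Python. This is a minimal sketch: the function name `array_dimensions` is made up for illustration, and the second call uses sizes that do not come from the slides.

```python
def array_dimensions(address_bits: int, data_bits: int) -> tuple[int, int, int]:
    """Return (depth, width, total_bits) of a memory array.

    depth = 2**N rows (words), width = M columns (bits per word).
    """
    depth = 2 ** address_bits
    width = data_bits
    return depth, width, depth * width


# The 2-bit address, 3-bit data array: 4 rows x 3 columns = 12 bit cells.
print(array_dimensions(2, 3))    # (4, 3, 12)

# A hypothetical 10-bit address, 32-bit data array (sizes chosen for illustration).
print(array_dimensions(10, 32))  # (1024, 32, 32768)
```

Each extra address bit doubles the depth, which is why capacity grows quickly with N.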

Memory Cell Access
• Cells are accessed using a wordline and a bitline
• When the wordline is enabled, the bitline takes the stored cell value
• When the wordline is disabled, the bitline is in the high-impedance (open-circuit) state, Z
[Figure: a bit cell with its wordline, bitline, and stored bit]
(a) wordline = 1: stored bit = 0 gives bitline = 0; stored bit = 1 gives bitline = 1
(b) wordline = 0: stored bit = 0 gives bitline = Z; stored bit = 1 gives bitline = Z
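The wordline/bitline behaviour can be modelled in a few lines. This is only a sketch: `Z`, `cell_output`, and `bitline` are illustrative names, and Python's `None` stands in for the high-impedance state. It also shows why the Z state matters: several cells can share one bitline as long as at most one wordline is enabled.

```python
from typing import Optional, Sequence

Z = None  # stand-in for the high-impedance (undriven) state


def cell_output(wordline: int, stored_bit: int) -> Optional[int]:
    """A cell drives the bitline only while its wordline is 1."""
    return stored_bit if wordline == 1 else Z


def bitline(wordlines: Sequence[int], stored_bits: Sequence[int]) -> Optional[int]:
    """Resolve one bitline shared by several cells in the same column."""
    driven = [v for v in map(cell_output, wordlines, stored_bits) if v is not Z]
    if len(driven) > 1:
        raise ValueError("more than one wordline enabled on a shared bitline")
    return driven[0] if driven else Z


print(bitline([0, 1, 0, 0], [1, 0, 1, 1]))  # 0    (only the second cell drives)
print(bitline([0, 0, 0, 0], [1, 0, 1, 1]))  # None (no cell drives: Z)
```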

Memory Array Layout
• Memory array with address decoder
[Figure: a 2:4 decoder turns the 2-bit address into wordline3..wordline0; each row holds three stored bits on bitline2..bitline0, which drive the outputs Data2..Data0.]
Address   Wordline    bitline2   bitline1   bitline0
11        wordline3   0          1          0
10        wordline2   1          0          0
01        wordline1   1          1          0
00        wordline0   0          1          1
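A read from this 4 × 3 array can be sketched as a one-hot decoder followed by a per-column selection. The stored bits are copied from the figure; the names `STORED_BITS`, `decode_2to4`, and `read_word` are illustrative, not taken from the slides.

```python
# Row contents from the figure: (bitline2, bitline1, bitline0) for each wordline.
STORED_BITS = [
    (0, 1, 1),  # wordline0, address 00
    (1, 1, 0),  # wordline1, address 01
    (1, 0, 0),  # wordline2, address 10
    (0, 1, 0),  # wordline3, address 11
]


def decode_2to4(address: int) -> list[int]:
    """Return one-hot wordlines [wordline0, ..., wordline3] for a 2-bit address."""
    if not 0 <= address <= 3:
        raise ValueError("address must fit in 2 bits")
    return [1 if row == address else 0 for row in range(4)]


def read_word(address: int) -> tuple[int, ...]:
    """Return (Data2, Data1, Data0) for the addressed row."""
    wordlines = decode_2to4(address)
    columns = zip(*STORED_BITS)  # one tuple of four cells per bitline
    # Only the cell on the enabled wordline contributes to each bitline.
    return tuple(
        max(wl & bit for wl, bit in zip(wordlines, column)) for column in columns
    )


print(read_word(0b10))  # (1, 0, 0)
print(read_word(0b00))  # (0, 1, 1)
```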

Memory Arrays
[Figure: memory array showing one memory cell, the row decoder, and the column decoder]

Main Memory: Static RAM
• Static RAM is the type of RAM in which data is held until power is removed. One memory cell (bit) of SRAM consists of at least 6 transistors (6T memory cell).
• SRAM data is organized into cells.
• Cells are organized into arrays, where address decoders determine the row and column of the desired information.
[Figures: one SRAM cell; general logic of SRAM]

Main Memory: Static RAM
• Advantages: SRAM is built with the same technology as the CPU and can run at comparable speed, so its most important use is as ‘cache memory’.
• Because SRAM is expensive, caches are much smaller in size (storage capacity) than regular main memory.
• Disadvantages: SRAM is more expensive and less dense than DRAM, because each cell needs at least six transistors.

Static RAM Read Cycles
• The steps of a read cycle of SRAM:
➢ Place the address to be read on the address bus.
➢ Ensure that the chip is activated by driving CS low.
➢ Activate the OE pin. This enables the outputs so that data can be read.
➢ The required data then appears on the data bus.
• TAA is the read access time: the time from the instant the address is placed on the address bus to the point when the required data is available on the data bus.
• TRC is the read cycle time: the minimum time between two read cycles.
[Figure: a read cycle of SRAM]

Static RAM Write Cycles
• The steps of a write cycle of SRAM:
➢ Place the address to be written to on the address bus.
➢ Ensure that the chip is activated by driving CS low.
➢ Place the data to be written on the data bus.
➢ Activate the WR line. The data is written only once WR is active.
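The read and write steps can be exercised with a small behavioural model. This is a sketch under assumptions: the class `SramChip`, its `cycle` method, and the `_n` suffix for active-low pins are invented here, all three control pins are treated as active low (0 = asserted), and timing (TAA, TRC) is not modelled.

```python
from typing import Optional


class SramChip:
    """Toy SRAM with active-low control pins (0 = asserted). Timing is ignored."""

    def __init__(self, address_bits: int, data_bits: int) -> None:
        self._cells = [0] * (2 ** address_bits)
        self._mask = (1 << data_bits) - 1

    def cycle(self, address: int, cs_n: int, oe_n: int = 1, wr_n: int = 1,
              data_in: Optional[int] = None) -> Optional[int]:
        """Return the value the chip drives on the data bus, or None (high-Z)."""
        if cs_n == 1:
            return None                  # chip not selected: ignore everything
        if wr_n == 0:                    # write: the data bus is an input
            if data_in is None:
                raise ValueError("a write needs data on the data bus")
            self._cells[address] = data_in & self._mask
            return None
        if oe_n == 0:                    # read: outputs enabled
            return self._cells[address]
        return None                      # selected, but neither reading nor writing


sram = SramChip(address_bits=4, data_bits=8)
sram.cycle(0x3, cs_n=0, wr_n=0, data_in=0xAB)  # write cycle
print(hex(sram.cycle(0x3, cs_n=0, oe_n=0)))    # 0xab (read cycle)
print(sram.cycle(0x3, cs_n=1, oe_n=0))         # None (CS high: not selected)
```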

DRAM Technology
• Data stored as a charge in a capacitor
• Single transistor used to access the charge
• Must periodically be refreshed
– Read contents and write back
– Performed on a DRAM “row”
• DRAMs are organized in banks (up to 16 for DDR4)
[Figure: a DRAM cell. Very economical compared to SRAM, which has 6 or more transistors per cell.]


DRAM Technology
• Dynamic RAM: it is called dynamic because its content does not remain unchanged (static) as in SRAM, and hence frequent ‘refreshing’ is necessary.
• The capacitors do not hold their charge indefinitely and need to be recharged. This is done by ‘refreshing’ the cell at regular intervals.
• One important merit of DRAM is that its packing density is very high compared to SRAM.
Main Memory: Dynamic RAM
• Read cycle of DRAM: when addressing memory, a processor sends the complete address on its address pins.
• Between the processor and a DRAM chip there is a memory controller, whose function is to split the address into two parts: a row address and a column address.
• A DRAM chip has fewer address pins than the number of address bits supplied by the processor, because the address lines of the DRAM chip are multiplexed (in time) between the row and column addresses.
• DRAM chips are large, rectangular arrays of memory cells with support logic for reading and writing data in the arrays, and refresh circuitry to maintain the integrity of stored data.
[Figure: memory controller for a DRAM]
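The address split performed by the memory controller can be sketched as a shift and a mask. The layout below (row in the upper bits, column in the lower bits) is an assumption for illustration, since real controllers choose their own mapping; `split_address` is a made-up name.

```python
def split_address(address: int, row_bits: int, col_bits: int) -> tuple[int, int]:
    """Split a full address into (row, column).

    Assumed layout: the upper row_bits select the row, the lower col_bits
    select the column.
    """
    if not 0 <= address < (1 << (row_bits + col_bits)):
        raise ValueError("address does not fit in row_bits + col_bits")
    row = address >> col_bits
    col = address & ((1 << col_bits) - 1)
    return row, col


# A hypothetical 20-bit address split into 10 row bits and 10 column bits.
row, col = split_address(0xABCDE, row_bits=10, col_bits=10)
print(hex(row), hex(col))  # 0x2af 0xde
```

With this split the chip needs only 10 multiplexed address pins instead of 20, because the row and the column are sent one after the other on the same lines.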

DRAM Read Cycle
[Figure: the addressing structure of DRAM]
• Dynamic RAM timing. The steps of a read cycle of DRAM are given below:
1) The row address is placed on the address lines and given sufficient time to stabilize and be latched.
2) The Row Address Strobe (RAS) signal is then activated.
3) The Row Address Decoder selects the proper row.
4) Next, the column address is placed on the same address lines and allowed to stabilize and be latched.
5) The Column Address Strobe (CAS) signal is then activated.
6) The CAS pin also serves as the Output Enable; so, once the CAS signal has stabilized, the sense amps place the data from the selected row and column on the data bus.
7) With this, the data in the selected address is available at the output buffers of the chip, and it is transferred to the data bus.
8) Before the read cycle can be considered complete, CAS and RAS must return to their previous state.
• Note that this is a conventional asynchronous read, because the timing signals are not tied to a common system clock.
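The ordering of the steps above can be captured in a small behavioural model. This is a sketch, not a timing-accurate simulation: `DramChip`, `assert_ras`, `assert_cas`, `release`, and `read` are illustrative names, the strobes are modelled as active low, and refresh is left out.

```python
class DramChip:
    """Toy DRAM that enforces the RAS-then-CAS order of a read cycle."""

    def __init__(self, row_bits: int, col_bits: int) -> None:
        self.cells = [[0] * (1 << col_bits) for _ in range(1 << row_bits)]
        self.ras_n = 1        # active-low strobes start inactive (high)
        self.cas_n = 1
        self.open_row = None  # row captured by the sense amps

    def assert_ras(self, address_lines: int) -> None:
        """Steps 1-3: latch the row address and select that row."""
        self.ras_n = 0
        self.open_row = list(self.cells[address_lines])

    def assert_cas(self, address_lines: int) -> int:
        """Steps 4-7: latch the column address and output the selected bit."""
        if self.ras_n != 0:
            raise RuntimeError("CAS asserted before RAS: no row is selected")
        self.cas_n = 0
        return self.open_row[address_lines]

    def release(self) -> None:
        """Step 8: return both strobes to their inactive state."""
        self.ras_n = 1
        self.cas_n = 1
        self.open_row = None


def read(chip: DramChip, row: int, col: int) -> int:
    chip.assert_ras(row)         # row address on the shared address lines
    data = chip.assert_cas(col)  # column address on the same lines
    chip.release()
    return data


chip = DramChip(row_bits=4, col_bits=4)
chip.cells[5][9] = 1
print(read(chip, 5, 9))  # 1
print(read(chip, 5, 8))  # 0
```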
DRAM Timing
• Dynamic RAM timing: the access time (tRAC) is the time from the activation of the RAS signal to the moment the data is available on the data bus. The access time is also referred to as latency.
• The read cycle time (tRC) is also shown in the diagram.
• Observe that another time, tRP, is included within this read cycle time. The total read cycle time is the sum of the ‘RAS active time’ and the ‘RAS precharge time’. The first corresponds to the time during which the RAS signal is active (low).
• tRP is the additional time needed before a new read (or write) cycle can be started by lowering the RAS signal. This is because there is a parasitic capacitance for each cell. This parasitic capacitance must be precharged high before any operation can commence.
• The DRAM controller takes care of scheduling the refreshes and making sure that they do not interfere with regular reads and writes.
[Figure: a read cycle of DRAM]
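The relation between the cycle time and its two parts is simple addition, which the sketch below makes explicit. The numbers (60 ns active, 40 ns precharge) are placeholders chosen for easy arithmetic, not values from the slides or from any datasheet, and the function names are illustrative.

```python
def read_cycle_time_ns(t_ras_ns: int, t_rp_ns: int) -> int:
    """tRC = RAS active time + RAS precharge time (tRP)."""
    return t_ras_ns + t_rp_ns


def max_reads_per_second(t_rc_ns: int) -> float:
    """Upper bound on back-to-back reads when each one takes a full tRC."""
    return 1e9 / t_rc_ns


# Hypothetical timings, for illustration only.
t_rc = read_cycle_time_ns(t_ras_ns=60, t_rp_ns=40)
print(t_rc)                        # 100
print(max_reads_per_second(t_rc))  # 10000000.0
```

The cycle time, not the access time, limits how often a new access can start, because the precharge must finish first.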

DRAM Refreshing
• DRAM refreshing hints:
• Rate: it varies, but manufacturers typically specify that each row should be refreshed every 64 ms.
• How is refreshing done? By activating each row using the RAS signal.
• When is refreshing done? The DRAM controller takes care of scheduling the refreshes and making sure that they do not interfere with regular reads and writes.
• To keep the data in the DRAM chip from leaking away, the DRAM controller periodically sweeps through all of the rows, cycling repeatedly and placing a series of row addresses on the address bus.
• To reduce the number of refresh cycles, one method is to split the address such that there are fewer rows and more columns.
[Figure: the addressing structure of DRAM]
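A short calculation shows how the 64 ms requirement translates into a refresh schedule, and why fewer rows means fewer refresh cycles. The row counts (8192 and 4096) and the 100 ns per-row refresh time are assumptions for illustration; only the 64 ms figure comes from the slide.

```python
def refresh_interval_us(rows: int, retention_ms: float = 64.0) -> float:
    """Average time between row refreshes if every row is refreshed once per period."""
    return retention_ms * 1000.0 / rows


def refresh_overhead(rows: int, row_refresh_ns: float, retention_ms: float = 64.0) -> float:
    """Fraction of time the chip spends refreshing instead of serving accesses."""
    return rows * row_refresh_ns / (retention_ms * 1_000_000)


print(refresh_interval_us(8192))    # 7.8125 (one row refresh every ~7.8 us)
print(refresh_interval_us(4096))    # 15.625 (half the rows, half as many refreshes)
print(refresh_overhead(8192, 100))  # about 0.0128, i.e. 1.28% of the time
```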
DRAM Advantages/Disadvantages
• Advantage: the packing density of DRAM is very high compared to SRAM.
• Advantage: it is cheaper than SRAM.
• Disadvantage: sensing a small charge on the memory cell capacitor is challenging due to noise from “coupling capacitance”.
[Figure: a DRAM cell]

Advanced DRAM Organization
• Bits in a DRAM are organized as a rectangular array
– DRAM accesses an entire row
• Synchronous DRAM
– Allows for consecutive accesses in bursts without needing to send each address
– Improves bandwidth
• Double data rate (DDR) DRAM
– Transfer on rising and falling clock edges
• Quad data rate (QDR) DRAM
– Separate DDR inputs and outputs

An animated video on DRAM: https://www.youtube.com/watch?v=7J7X7aZvMXQ
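The effect of transferring on both clock edges can be seen in a peak-bandwidth calculation. The example figures (a 1600 MHz I/O clock and a 64-bit data bus, as in a DDR4-3200 module) are assumptions added here, not values from the slides, and the function name is illustrative.

```python
def peak_bandwidth_bytes_per_s(clock_hz: int, bus_bits: int,
                               transfers_per_clock: int = 2) -> int:
    """Peak transfer rate; DDR moves data on both clock edges (2 per clock)."""
    return clock_hz * transfers_per_clock * bus_bits // 8


# Assumed example: 1600 MHz I/O clock, 64-bit bus.
sdr = peak_bandwidth_bytes_per_s(1_600_000_000, 64, transfers_per_clock=1)
ddr = peak_bandwidth_bytes_per_s(1_600_000_000, 64)
print(sdr)  # 12800000000 (12.8 GB/s with one transfer per clock)
print(ddr)  # 25600000000 (25.6 GB/s with two transfers per clock)
```

This is a peak figure: it ignores row activation, precharge, and refresh, so sustained bandwidth is lower.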
Dynamic vs. Static RAM
• SRAM vs. DRAM Summary

• Typical Memory Types and corresponding Latencies

ROM (Read Only Memory)
• Main Memory: ROM is ‘Read Only Memory’. A ROM does not lose its contents when power is switched off. ROM is a type of ‘programmable’ memory: it has internal fuses which, when blown, create a permanent bit pattern that can be read whenever needed. However, if it is an OTP (one-time programmable) ROM, its contents can never be changed again.
• EPROMs are ‘Erasable and Programmable’ ROMs; they are erased by exposing them to ultraviolet radiation.
• EEPROM: this is ‘Electrically Erasable’ PROM, and erasure can be done while the chip is on the circuit board. EEPROM technology is used for BIOS ROMs in personal computers.
• Flash ROM: this is a special type of EEPROM that can be erased and reprogrammed in blocks instead of one byte at a time. This feature gives flash memory a speed advantage over EEPROM. Flash ROM technology is used in microcontrollers and embedded computer systems.
[Figure: categorization of main memory types]

Flash Storage
• Nonvolatile semiconductor storage
– 100× – 1000× faster than disk
– Smaller, lower power, more robust
– But more $/GB (between disk and DRAM)
• Popular in personal mobile devices
Flash Types
• NOR flash: bit cell like a NOR gate
– Random read/write access
– Used for instruction memory in embedded systems
• NAND flash: bit cell like a NAND gate
– Denser (bits/area), but block-at-a-time access
– Cheaper per GB
– Used for USB keys, media storage, …
• Flash bits wear out after 1000s of write accesses
– Not suitable for direct RAM or disk replacement
– Wear leveling: remap data to less used blocks

More information on: https://www.youtube.com/watch?v=YtBysgPOKx4
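Wear leveling can be sketched as a remapping table that sends every write to the least-worn available physical block. This is a deliberately simplified illustration, not how any particular flash controller works: the class `WearLeveler` and its fields are made up, and garbage collection, bad blocks, and pages within blocks are ignored.

```python
class WearLeveler:
    """Toy wear leveling: each write goes to the least-erased available block."""

    def __init__(self, physical_blocks: int) -> None:
        self.erase_counts = [0] * physical_blocks
        self.data = [None] * physical_blocks
        self.mapping = {}  # logical block -> physical block

    def write(self, logical_block: int, payload: bytes) -> int:
        old = self.mapping.get(logical_block)
        in_use = set(self.mapping.values())
        # Free blocks, plus the block this logical block already occupies.
        candidates = [p for p in range(len(self.data)) if p not in in_use or p == old]
        target = min(candidates, key=lambda p: self.erase_counts[p])
        if old is not None and old != target:
            self.data[old] = None           # the stale copy is dropped
        self.erase_counts[target] += 1      # one erase/program cycle on the target
        self.data[target] = payload
        self.mapping[logical_block] = target
        return target

    def read(self, logical_block: int) -> bytes:
        return self.data[self.mapping[logical_block]]


flash = WearLeveler(physical_blocks=4)
for i in range(8):
    flash.write(0, f"version {i}".encode())  # rewrite the same logical block

print(flash.erase_counts)  # [2, 2, 2, 2]
print(flash.read(0))       # b'version 7'
```

Without remapping, the single physical block behind logical block 0 would have absorbed all eight writes.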


Disk Storage
• Nonvolatile, rotating magnetic storage
• Big capacity
• Awfully slow
• Average read time about 6.2 ms

More information on: https://www.youtube.com/watch?v=NtPc0jI21i0
A bit on SSD: https://www.youtube.com/watch?v=5Mh3o886qpg
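An average read time of this size is the sum of seek time, rotational latency, transfer time, and controller overhead. The slide does not list its parameters, so the values below (4 ms average seek, 15,000 RPM, 512-byte sector, 100 MB/s transfer rate, 0.2 ms controller overhead) are assumptions that happen to give about 6.2 ms; the function name is illustrative.

```python
def average_read_time_ms(seek_ms: float, rpm: float, sector_bytes: int,
                         transfer_mb_per_s: float, controller_ms: float) -> float:
    """Seek + average rotational latency + transfer + controller overhead."""
    rotational_ms = 0.5 * 60_000 / rpm  # half a rotation on average
    transfer_ms = sector_bytes / (transfer_mb_per_s * 1_000_000) * 1000
    return seek_ms + rotational_ms + transfer_ms + controller_ms


# Assumed parameters, for illustration only.
t = average_read_time_ms(seek_ms=4.0, rpm=15_000, sector_bytes=512,
                         transfer_mb_per_s=100, controller_ms=0.2)
print(round(t, 1))  # 6.2
```

Seek and rotation dominate; the data transfer itself is only a few microseconds, which is why disks are so much slower than semiconductor memory for small random reads.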
Summary
• Memory technologies
• RAM
– SRAM
– DRAM
• ROM
• Secondary storage
Future lecture
• Principles of locality
• Caches
