Professional Documents
Culture Documents
Chapter 12
Chapter 12
Circuits
A Design Perspective
Jan M. Rabaey
Anantha Chandrakasan
Borivoje Nikolic
Semiconductor
Memories
December 20, 2002
Non-Volatile
Read-Write Memory Read-Write Read-Only Memory
Memory
DRAM LIFO
Shift Register
CAM
READ
Write cycle
Read access Read access
WRITE
Write access
Data valid
DATA
Data written
S0 S0
Word 0 Word 0
S1
Word 1 A0 Word 1
S2 Storage Storage
Word 2 A1 Word 2
cell cell
AK 21
SN 2 2
Word N 2 2 Word N 2 2
SN 2 1
Word N 2 1 Word N 2 1
K 5 log2 N
Input-Output Input-Output
(M bits) (M bits)
Intuitive architecture for N x M memory Decoder reduces the number of select signals
Too many select signals:
N words == N select signals
K = log2N
AK
Row Decoder
A K1 1 Word line
A L2 1
M.2K
A0
Column decoder Selects appropriate
A K2 1 word
Input-Output
(M bits)
Row
address
Column
address
Block
address
I/O
Advantages:
1. Shorter wires within blocks
2. Block address activates only 1 block => power savings
© Digital Integrated Circuits2nd Memories
Block Diagram of 4 Mbit SRAM
Clock Z-address X-address
generator buffer buffer
Transfer gate
Column decoder
Sense amplifier and write driver
I/O Buffers
Commands
Comparand
Mask
2 Validity Bits
Priority Encoder
Address Decoder
CAM Array
Control Logic R/W Address (9 bits) 9
2 words 3 64 bits
9
© Digital Integrated Circuits2nd Memories
Memory Timing: Approaches
Address
bus Row Address Column Address
RAS Address
Bus Address
Address transition
CAS initiates memory operation
BL BL BL
WL WL
WL
0
GND
WL[0]
V DD
WL[1]
WL[2]
V DD
WL[3]
V bias
Pull-down loads
WL[0]
GND
WL [1]
WL [2]
GND
WL [3]
Polysilicon
Metal1
Diffusion
Metal1 on Diffusion
Programmming using
the Contact Layer Only
Polysilicon
Metal1
Diffusion
Metal1 on Diffusion
WL [0]
WL [1]
WL [2]
WL [3]
Programmming using
the Metal-1 Layer Only
Polysilicon
Diffusion
Metal1 on Diffusion
Programmming using
Implants Only
Polysilicon
Threshold-altering
implant
Metal1 on Diffusion
BL
rword
WL Cbit
cword
V DD
Model for NAND ROM
BL
CL
r bit
cbit
r word
WL
cword
Metal bypass
Precharge devices
WL [0]
GND
WL [1]
WL [2]
GND
WL [3]
tox G
tox
S
n+ p n+_
Substrate
20 V 0V 5V
10 V 5V 20 V 0V 5V
25V 2 2.5 V
S D S D S D
“0”-state “1”-state
“ON ”
DV T
“OFF ”
V WL V GS
Source Drain
20–30 nm -10 V V GD
10 V
n1 n1
Substrate
p
10 nm
Fowler-Nordheim
FLOTOX transistor
I-V characteristic
WL
Absolute threshold control
is hard
VDD Unprogrammed transistor
might be depletion
2 transistor cell
Control gate
Floating gate
n 1 source n 1 drain
programming
p- substrate
Flash EPROM
© Digital Integrated Circuits2nd Courtesy Intel Memories
Basic Operations in a NOR Flash Memory―
Erase
cell array
BL 0 BL 1
G
12 V
0V WL 0
S D
0V WL 1
open open
0V WL 1
6V 0V
0V WL 1
1V 0V
Gate FG
Oxide
Source line
(Diff. Layer)
Active area
STI
DYNAMIC (DRAM)
Periodic refresh required
Small (1-3 transistors/cell)
Slower
Single Ended
WL
V DD
M2 M4
Q
M5 Q M6
M1 M3
BL BL
V DD
BL M4
BL
Q= 0
Q= 1 M6
M5
V DD M1 V DD V DD
Cbit Cbit
0.8
0.6
0.4
0.2
0
0 0.5 1 1.2 1.5 2 2.5 3
Cell Ratio (CR)
Q= 0 M6
M5 Q= 1
M1
V DD
BL = 1 BL = 0
Q Q
M1 M3
GND
M5 M6 WL
BL BL
Q Q
M3 M4
BL M1 M2 BL
WWL
RWL WWL
M3 RWL
M1 X X V DD 2 V T
M2
V DD
CS BL 1
BL 2 V DD 2 V T DV
RWL
M3
M2
WWL
M1
M1
X GND V DD 2 V T
CS
V DD
BL
V DD /2 V
sensing
C BL
V BL V(1)
V PRE
DV(1)
V(0)
Sense amp activated t
Word line activated
M 1 word
line
Metal word line
SiO2
Poly
n+ n+ Field Oxide Diffused
bit line
Inversion layer
Poly
induced by Polysilicon
Polysilicon
plate bias gate plate
Cross-section Layout
Cell Plate Si
Word
Word S S
int
CAM ••• CAM M3 M2
Match
M1
CAM SRAM
ARRAY ARRAY
Decoders
Sense Amplifiers
Input/Output Buffers
Control / Timing Circuitry
NOR Decoder
WL 1
WL 0
A 0A 1 A 0A 1 A 0A 1 A 0A 1 A 2A 3 A 2A 3 A 2A 3 A 2A 3
•••
NAND decoder using
2-input pre-decoders
A1 A0 A0 A1 A3 A2 A2 A3
WL3
VDD
WL 3
WL 2
WL 2 VDD
WL 1
WL 1
V DD
WL 0
WL 0
VDD A0 A0 A1 A1
A0 A0 A1 A1
S0
A0
S1
S2
A1 S3
Advantages: speed (tpd does not add to overall memory access time)
Only one extra transistor in signal path
Disadvantage: Large transistor count
A0
A0
A1
A1
D
Number of devices drastically reduced
Delay increases quadratically with # of sections; prohibitive for large decoders
Solutions: buffers
progressive sizing
combination of tree and pass transistor approaches
V DD V DD V DD V DD V DD V DD
WL 0 WL 1 WL 2
f f f f f f
• • •
R f f R f f R f f
V DD
large small
small
transition s.a.
input output
M3 M4
y Out
bit M1 M2 bit
SE M5
Directly applicable to
SRAMs
BL BL V DD V DD
EQ
M3 M4
WL i
x M1 M2 2x x 2x
SE M5 SE
SE
SRAM cell i
V DD
Diff.
x Sense 2x Output
Amp
SE
Output
(a) SRAM sensing scheme (b) two stage differential amplifier
SE
SE
VL VS
M1
C small
M2 M3 C large
Transient Response
2.5
Concept 2.0 VS
V in
1.5
V
VL
1.0
0.5
V ref 5 3V
0.0
0.0 1.00 2.00 3.00
time (nsec)
SE M4 Load
Out
Cout Cascode
V casc M3 device
Ccol
Column
WLC M2 decoder
BL
M1 CBL EPROM
WL array
Output
L L1 L0 V DD R0 R1 L
SE
BLL BLR
… …
CS CS CS CS CS CS
SE
2 2
BL BL
1 1
BL BL
0 0
0 1 2 3 0 1 2 3
t (ns) t (ns)
reading 0 reading 1
3
EQ WL
SE
1
0
0 1 2 3
t (ns)
control signals
© Digital Integrated Circuits2nd Memories
Voltage Regulator
VDD
Mdrive
VREF VDL
Equivalent Model
Vbias
VREF
-
Mdrive
+
VDL
0V
k
Data k 3l memory
bus array
Column
demux packet dec.
Row
demux packet dec.
DELAY
A0 td
ATD ATD
DELAY
A1 td
DELAY …
A N2 1 td
Q S (1C)
100
CS(1F)
10
V DD (V)
Q S 5 C S V DD / 2
V smax 5 Q S / (C S 1 C D )
WL
leakage
CS
electrode
Ccross
EQ
WL 1 WL 0 WL WL D WL 0 WL 1
C WBL D C WBL
BL BL
C BL Sense C BL
C C C Amplifier C C C
WL 1 WL 1 WL 0 C WL 0 WL D WL D
WBL
BL C BL x y
C C C Sense
C C C
EQ Amplifier
BL C BL x
C WBL
C cross
BL9
BL
SA
BL
BL99
e.g. B3 Wrong
with
1
1 =3
nC DE V INT f m
selected mi act
C PT V INT f
I DCP
n
mC DE V INT f
PERIPHERY
COLUMN DEC
V SS
1.10u
0.13m m CMOS
900n
700n
Ileakage
500n Factor 7
100n
VDD
V SS,int
sleep
100
I ACT
21
10
I AC
Current (A)
22
10
102 3
Cycle time : 150 ns
102 4 T 5 75 C, S
I DC
25
10
102 6
15M 64M 255M 1G 4G 15G 64G
Capacity (bit)
IDENTICAL TO ROM!
Main difference
ROM: fully populated
PLA: one element per minterm
GND
GND
V DD X0 X0 X1 X1 X2 X2 f0 f1
AND-plane OR-plane
GND V DD
f OR
f OR
f AND
V DD X0 X0 X1 X1 X2 X2 f0 f1 GND
AND-plane OR-plane
f f
f AND
Dummy AND row
f AND
tpre teval
f AND Dummy AND row f OR
f OR
x0 x0 x1 x1 x2 x2 f0 f1
Pull-up devices Pull-up devices
BEQ
Local WL
Memory cell
B/T B/T
CD CD
CD I/O
I/O line
I/O
Sense amplifier
ATD
SEQ Block
select ATD
BS BEQ
Vdd
SA BS SA I/O Lines
SEQ GND
SEQ
SEQ
SEQ SEQ Vdd
DATA SA, SA
Dei GND
DATA
BS
Data-cut
SGD
Word Line Driver
108
Number of memory cells
106
104
102
Result of 4 times 100
program 0V 1V 2V 3V 4V
0V 1V 2V 3V 4V Vt of memory cells
Vt of memory cells
(a)
Evolution of thresholds Final Distribution
11.7mm