New Homework To Be Posted This Weekend Project Phase 3 Launched Next We

EE141
EE141
EECS141
Lecture #24
New
homework to be posted this weekend
Last one to be graded

Project
Phase 3 Launched Next We
Some insights today
EE141
EECS141
Lecture #24
EE141
Last
lecture
Adders
Todays
lecture
Multipliers
Introduction to Memory
Reading
(Ch 11)
EE141
EECS141
Lecture #24
EE141
EECS141
Lecture #24
EE141
EE141
EECS141
Lecture #24
EE141
EECS141
Lecture #24
EE141
EE141
EECS141
Lecture #24
Critical Path 1 & 2
EE141
EECS141
Lecture #24
EE141
Balanced tsum and tcarry
EE141
EECS141
Lecture #24
EE141
EECS141
Lecture #24
10
EE141
EE141
EECS141
Lecture #24
11
EE141
EECS141
Lecture #24
12
EE141
EE141
EECS141
Lecture #24
13
EE141
EECS141
Lecture #24
14
EE141
Optimization
constraints different than in
binary adder
Once again:
Need to identify critical path
And find ways to use parallelism to reduce it
Other
possible techniques
Logarithmic versus linear (Wallace Tree Mult)

Data encoding (Booth)
Pipelining
First glimpse at system level optimization

EE141
EECS141
Lecture #24
15
EE141
EECS141
Lecture #24
16
EE141
Area Dominated by Wiring

EE141
EECS141
Lecture #24
17
Widthbarrel ~ 2 pm M
EE141
EECS141
Lecture #24
18
EE141
EE141
EECS141
Lecture #24
19
Out3
Out2
Out1
EE141
EECS141
Out0
Lecture #24
20
10
EE141
EE141
EECS141
Lecture #24
21
EE141
EECS141
Lecture #24
22
22
11
EE141
Intel 45nm Core 2
EE141
EECS141
23
Lecture #24
Read-Write Memory
Random
Access
Non-Random
Access
SRAM
FIFO
DRAM
LIFO
Non-Volatile
Read-Write
Memory
EPROM
2
E PROM
Read-Only Memory
Mask-Programmed
Programmable (PROM)
FLASH
Shift Register
CAM
EE141
EECS141
Lecture #24
24
12
EE141
STATIC (SRAM)
Data stored as long as supply is applied
Larger (6 transistors/cell)
Fast
Differential (usually)
DYNAMIC (DRAM)
Periodic refresh required
Smaller (1-3 transistors/cell)
Slower
Single Ended
EE141
EECS141
Lecture #24
25
25
Conceptual: linear array

Each box holds some data
But this does not lead to a nice layout shape
Too long and skinny
Create a 2-D array

Decode Row and Column
address to get data
EE141
EECS141
Lecture #24
26
13
EE141
M bits
S0
S1
S2
N
words
SN-2
SN -1
M bits
S0
Word 0
Word 1
Word 2
Storage
cell
A0
A1
A K-1
Word N-2
Word N-1
D
e
c
o
d
e
r
Word 0
Word 1
Word 2
Storage
cell
Word N-2
Word N-1
K = log2N
Input-Output
( M bits)
Input-Output
( M bits)
Intuitive architecture for N x M memory

Too many select signals:
N words == N select signals
Decoder reduces the number of select signals
K = log2N
EE141
EECS141
Lecture #24
27
EE141
EECS141
Lecture #24
28
14
EE141
Collection of 2M complex logic gates

Organized in regular and dense fashion
(N)AND Decoder
NOR Decoder
EE141
EECS141
Lecture #24
29
Look
at decoder for 256x256 memory

block (8KBytes)
EE141
EECS141
Lecture #24
30
15
EE141
Goal:
Build fastest possible decoder with

static CMOS logic
What
we know
Basically need 256 AND

gates, each one of them
drives one word line
N=8
EE141
EECS141
Lecture #24
31
Each
word line has 256 cells connected to it

Total output load is 256*Ccell + Cwire
Assume that decoder input capacitance is
Caddress=4*Ccell
Each
address drives 28/2 AND gates
A0 drives of the gates, A0_b the other of the

gates
Neglecting
Cwire, the fan-out on each one of the

16 address wires is:
B
EE141
EECS141
Lecture #24
32
16
EE141
FB
of at least 213 means that we will want to

use more than log4(213) = 6.5 stages to
implement the AND8
Need
many stages anyways
So what is the best way to implement the AND

gate?
Will see next that its the one with the most stages
and least complicated gates
EE141
EECS141
LE=10/3
1
LE = 10/3
P= 8 + 1
EE141
EECS141
Lecture #24
LE=2
5/3
LE = 10/3
P=4 + 2
33
LE=4/3
5/3
4/3
1
LE = 80/27
P= 2 + 2 + 2 + 1
Lecture #24
34
17
EE141
Using
2-input NAND gates
8-input gate takes 6 stages

Total
LE is (4/3)3 2.4
So PE is 2.4*213 optimal N of ~7.1
EE141
EECS141
256
Lecture #24
35
8-input AND gates
Each built out of

tree of NAND gates
and inverters
Issue:
Every address line has

to drive 128 gates (and
wire) right away
Cant build gates small enough - Forces us
to add buffers just to drive address inputs
EE141
EECS141
Lecture #24
36
18
EE141
EE141
EECS141
Lecture #24
37
Use
a single gate for each of the shared

terms
E.g., from A0, A0, A1, and A1, generate four
signals: A0A1, A0A1, A0A1, A0A1
In
other words, we are decoding smaller

groups of address bits first
And using the predecoded outputs to do
the rest of the decoding
EE141
EECS141
Lecture #24
38
19
EE141
EE141
EECS141
Lecture #24
39
20

New Homework To Be Posted This Weekend Project Phase 3 Launched Next We

Uploaded by

Document Information

Original Description:

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

New Homework To Be Posted This Weekend Project Phase 3 Launched Next We

Uploaded by

Copyright:

Available Formats

EE141

homework to be posted this weekend

Last one to be graded

Phase 3 Launched Next We

Some insights today

Critical Path 1 & 2

Balanced tsum and tcarry

constraints different than in

Logarithmic versus linear (Wallace Tree Mult)

First glimpse at system level optimization

Area Dominated by Wiring

Intel 45nm Core 2

Conceptual: linear array

Create a 2-D array

Intuitive architecture for N x M memory

Decoder reduces the number of select signals

Collection of 2M complex logic gates

at decoder for 256x256 memory

Build fastest possible decoder with

Basically need 256 AND

word line has 256 cells connected to it

address drives 28/2 AND gates

A0 drives of the gates, A0_b the other of the

Cwire, the fan-out on each one of the

of at least 213 means that we will want to

many stages anyways

So what is the best way to implement the AND

2-input NAND gates

8-input gate takes 6 stages

8-input AND gates

Each built out of

Every address line has

a single gate for each of the shared

other words, we are decoding smaller

You might also like