Download as pdf or txt
Download as pdf or txt
You are on page 1of 20

EE141

EE141
EECS141

Lecture #24

New

homework to be posted this weekend

Last one to be graded


Project

Phase 3 Launched Next We

Some insights today

EE141
EECS141

Lecture #24

EE141

Last

lecture

Adders
Todays

lecture

Multipliers
Introduction to Memory
Reading

(Ch 11)

EE141
EECS141

Lecture #24

EE141
EECS141

Lecture #24

EE141

EE141
EECS141

Lecture #24

EE141
EECS141

Lecture #24

EE141

EE141
EECS141

Lecture #24

Critical Path 1 & 2

EE141
EECS141

Lecture #24

EE141

Balanced tsum and tcarry

EE141
EECS141

Lecture #24

EE141
EECS141

Lecture #24

10

EE141

EE141
EECS141

Lecture #24

11

EE141
EECS141

Lecture #24

12

EE141

EE141
EECS141

Lecture #24

13

EE141
EECS141

Lecture #24

14

EE141

Optimization

constraints different than in

binary adder
Once again:
Need to identify critical path
And find ways to use parallelism to reduce it
Other

possible techniques

Logarithmic versus linear (Wallace Tree Mult)


Data encoding (Booth)
Pipelining

First glimpse at system level optimization


EE141
EECS141

Lecture #24

15

EE141
EECS141

Lecture #24

16

EE141

Area Dominated by Wiring


EE141
EECS141

Lecture #24

17

Widthbarrel ~ 2 pm M
EE141
EECS141

Lecture #24

18

EE141

EE141
EECS141

Lecture #24

19

Out3

Out2

Out1

EE141
EECS141

Out0

Lecture #24

20

10

EE141

EE141
EECS141

Lecture #24

21

EE141
EECS141

Lecture #24

22

22

11

EE141

Intel 45nm Core 2

EE141
EECS141

23

Lecture #24

Read-Write Memory

Random
Access

Non-Random
Access

SRAM

FIFO

DRAM

LIFO

Non-Volatile
Read-Write
Memory
EPROM
2

E PROM

Read-Only Memory

Mask-Programmed
Programmable (PROM)

FLASH

Shift Register
CAM

EE141
EECS141

Lecture #24

24

12

EE141

STATIC (SRAM)
Data stored as long as supply is applied
Larger (6 transistors/cell)
Fast
Differential (usually)

DYNAMIC (DRAM)
Periodic refresh required
Smaller (1-3 transistors/cell)
Slower
Single Ended
EE141
EECS141

Lecture #24

25

25

Conceptual: linear array


Each box holds some data
But this does not lead to a nice layout shape
Too long and skinny

Create a 2-D array


Decode Row and Column
address to get data

EE141
EECS141

Lecture #24

26

13

EE141

M bits
S0
S1
S2

N
words

SN-2
SN -1

M bits
S0

Word 0
Word 1
Word 2

Storage
cell

A0
A1
A K-1

Word N-2
Word N-1

D
e
c
o
d
e
r

Word 0
Word 1
Word 2

Storage
cell

Word N-2
Word N-1

K = log2N
Input-Output
( M bits)

Input-Output
( M bits)

Intuitive architecture for N x M memory


Too many select signals:
N words == N select signals

Decoder reduces the number of select signals

K = log2N

EE141
EECS141

Lecture #24

27

EE141
EECS141

Lecture #24

28

14

EE141

Collection of 2M complex logic gates


Organized in regular and dense fashion
(N)AND Decoder

NOR Decoder

EE141
EECS141

Lecture #24

29

Look

at decoder for 256x256 memory


block (8KBytes)

EE141
EECS141

Lecture #24

30

15

EE141

Goal:

Build fastest possible decoder with


static CMOS logic

What

we know

Basically need 256 AND


gates, each one of them
drives one word line

N=8

EE141
EECS141

Lecture #24

31

Each

word line has 256 cells connected to it


Total output load is 256*Ccell + Cwire
Assume that decoder input capacitance is
Caddress=4*Ccell
Each

address drives 28/2 AND gates

A0 drives of the gates, A0_b the other of the


gates
Neglecting

Cwire, the fan-out on each one of the


16 address wires is:
B

EE141
EECS141

Lecture #24

32

16

EE141

FB

of at least 213 means that we will want to


use more than log4(213) = 6.5 stages to
implement the AND8

Need

many stages anyways

So what is the best way to implement the AND


gate?
Will see next that its the one with the most stages
and least complicated gates

EE141
EECS141

LE=10/3
1
LE = 10/3
P= 8 + 1

EE141
EECS141

Lecture #24

LE=2
5/3
LE = 10/3
P=4 + 2

33

LE=4/3
5/3
4/3
1
LE = 80/27
P= 2 + 2 + 2 + 1

Lecture #24

34

17

EE141

Using

2-input NAND gates

8-input gate takes 6 stages


Total

LE is (4/3)3 2.4
So PE is 2.4*213 optimal N of ~7.1
EE141
EECS141

256

Lecture #24

35

8-input AND gates

Each built out of


tree of NAND gates
and inverters
Issue:

Every address line has


to drive 128 gates (and
wire) right away
Cant build gates small enough - Forces us
to add buffers just to drive address inputs
EE141
EECS141

Lecture #24

36

18

EE141

EE141
EECS141

Lecture #24

37

Use

a single gate for each of the shared


terms
E.g., from A0, A0, A1, and A1, generate four
signals: A0A1, A0A1, A0A1, A0A1

In

other words, we are decoding smaller


groups of address bits first
And using the predecoded outputs to do
the rest of the decoding

EE141
EECS141

Lecture #24

38

19

EE141

EE141
EECS141

Lecture #24

39

20

You might also like