
EE5311 – Final Project

Design and layout of a pipelined signed 8-bit Carry Save Multiplier

Group 10
Jayanth B
Bhargav
Summary of Results

SPECIFICATION                           RESULT
Unpipelined frequency                   2.28 GHz
Pipelined frequency                     4.31 GHz
Unpipelined frequency (RC extracted)    1.19 GHz
Pipelined frequency (RC extracted)      2.25 GHz
DRC                                     Clean
LVS                                     Clean
Area                                    213.2 um^2
No. of test samples                     8
Components used:

NAND2 – 3x
NOT – 3x
SUM – 3x
COUT – 1x
XOR – 3x

Schematic and Layout of all subcircuits:

Schematic of NAND2 (3x)


Layout of NAND2
Schematic of Inverter (3x)

Schematic of XOR (3x)

Schematic of Full Adder used

Layout of Full Adder


Schematic of CSM (without pipelining):

Layout of CSM (without pipelining):


Clean DRC of layout

Clean NCC Layout

Area of layout = 4128.5 λ x 516.5 λ = 2132370.25 λ^2

Schematic of D Flip Flop:


D Flip Flop I/P for given input D:

Schematic of pipelined CSM:


The complete Schematic of pipelined CSM

Critical Path of pipelined CSM

Sample Simulations:
1. A = b’01110111 (119); B = b’10111011 (-69) => S = b’1101111111101101 (-8211)

2. A = b’11111111 (-1); B = b’10100000 (-96) => S = b’0000000001100000 (96)

3. A = b’00000001 (1); B = b’00000000 (0) => S = b’0000000000000000 (0)

4. A = b’10101010 (-86); B = b’01010101 (85) => S = b’1110001101110010 (-7310)

5. A = b’11111111 (-1); B = b’11111111 (-1) => S = b’0000000000000001 (1)

6. A = b’01111111 (127); B = b’10000000 (-128) => S = b’1100000010000000 (-16256)

7. A = b’10000000 (-128); B = b’00000001 (1) => S = b’1111111110000000 (-128)

8. A = b’00010001 (17); B = b’00011100 (28) => S = b’0000000111011100 (476)
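
The expected products for these test vectors can be cross-checked in software. The following is a minimal sketch, not part of the original design flow: it treats both operands as 8-bit two's-complement numbers and prints the 16-bit two's-complement product. The helper names (`to_signed`, `to_bits`, `signed_multiply`) are hypothetical and chosen only for illustration.

```python
def to_signed(bits: str) -> int:
    """Interpret a binary string as a two's-complement signed integer."""
    value = int(bits, 2)
    if bits[0] == "1":
        # The MSB carries negative weight in two's complement.
        value -= 1 << len(bits)
    return value


def to_bits(value: int, width: int) -> str:
    """Encode a signed integer as a two's-complement string of the given width."""
    return format(value & ((1 << width) - 1), f"0{width}b")


def signed_multiply(a_bits: str, b_bits: str):
    """Return (16-bit product string, decimal product) for two 8-bit operands."""
    product = to_signed(a_bits) * to_signed(b_bits)
    return to_bits(product, 16), product


# A few of the sample vectors listed above.
samples = [
    ("01110111", "10111011"),
    ("10101010", "01010101"),
    ("01111111", "10000000"),
    ("10000000", "00000001"),
]
for a, b in samples:
    s_bits, s_dec = signed_multiply(a, b)
    print(f"A={a} ({to_signed(a)}), B={b} ({to_signed(b)}) => S={s_bits} ({s_dec})")
```

Comparing the printed strings with the simulated waveforms is a quick way to catch a dropped or misread output bit.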


Delay of CSM:
Maximum delay of NAND and INVERTER:

INVERTER Delay

Rise Delay (s)    Fall Delay (s)
3.56E-11          3.01E-11

NAND Delay

A             B             Rise delay (s)
Transition    0             NA
0             Transition    NA
1             Transition    3.38E-11
Transition    1             3.78E-11

Delays from various sample inputs:

A (binary)  A (decimal)  B (binary)  B (decimal)  Z (binary)        Z (decimal)  Delay (s)
11111111    -1           11111111    -1           0000000000000001  1            4.15E-10
00000001    1            10000000    -128         1111111110000000  -128         2.31E-10
00010001    17           00011100    28           0000000111011100  476          2.27E-10
00010101    21           00110101    53           0000010001011001  1113         2.27E-10
10000000    -128         11111111    -1           0000000010000000  128          2.54E-10
11110101    -11          11110101    -11          0000000001111001  121          4.04E-10
01111111    127          10000000    -128         1100000010000000  -16256       2.54E-10

Maximum Clocking Frequency:


1. Unpipelined CSM from Schematic:

NAND maximum delay = 3.78E-11 s (t_and)


NOT maximum delay = 3.56E-11 s (t_not)
Tpd = 4.15E-10 s (highest from -1 * -1)

From the falling characteristics,


Setup time, Tsetup = 1.46E-11 s
Tcq = 8.32E-12 s
=> Minimum clock delay, Tc = Tpd + Tsetup + Tcq = 4.38E-10 s = 0.438 ns
=> Maximum clock frequency, fc = 1/Tc = 2.28 GHz
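
The timing budget above can be reproduced with a few lines of Python. This is a minimal sketch for checking the arithmetic only; the function name `max_clock_frequency` is hypothetical, and the delay values are the ones quoted above.

```python
def max_clock_frequency(t_pd, t_setup, t_cq):
    """Return (minimum clock period in s, maximum clock frequency in Hz)."""
    # The clock period must cover the clock-to-Q delay of the launching
    # flip-flop, the combinational delay, and the setup time of the
    # capturing flip-flop.
    t_c = t_pd + t_setup + t_cq
    return t_c, 1.0 / t_c


t_c, f_c = max_clock_frequency(t_pd=4.15e-10, t_setup=1.46e-11, t_cq=8.32e-12)
print(f"Tc = {t_c * 1e9:.3f} ns, fc = {f_c / 1e9:.2f} GHz")
# prints: Tc = 0.438 ns, fc = 2.28 GHz
```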

2. Unpipelined CSM from RC extracted:

Tc = 8.35E-10 s = 0.835 ns
=> Maximum clock frequency, fc = 1/Tc = 1.19 GHz

3. Pipelined CSM from Schematic:

A single pipeline register stage is inserted. Ideally, both stages of the pipeline should
have approximately equal combinational delays, i.e. Tcd1 = Tcd2 = Tpd/2 = 2.07E-10 s.
From the critical path simulation of the pipelined schematic, flip-flops were
placed at S5 to split the delay approximately equally.

=> Tcd1 = 2.09E-10 s; Tcd2 = 1.95E-10 s

The inputs to the second stage of the pipeline are delayed by one clock
cycle.

=> Minimum clock delay, Tc = max(Tcd1, Tcd2) + Tsetup + Tcq = 2.31E-10 s = 0.26 ns

=> Maximum clocking frequency, fc = 1/Tc = 3.84 GHz
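
The expression Tc = max(Tcd1, Tcd2) + Tsetup + Tcq generalises to any number of stages: the slowest stage sets the clock period. The sketch below evaluates only that expression, using the stage delays and flip-flop timings listed above; it makes no attempt to model any other contribution to the reported clock period. The function name `pipelined_clock` is hypothetical.

```python
def pipelined_clock(stage_delays, t_setup, t_cq):
    """Return (minimum clock period in s, maximum frequency in Hz) for a pipeline.

    stage_delays holds the combinational delay of each stage in seconds.
    """
    # Every stage shares the same clock, so the slowest stage limits it.
    t_c = max(stage_delays) + t_setup + t_cq
    return t_c, 1.0 / t_c


t_c, f_c = pipelined_clock([2.09e-10, 1.95e-10], t_setup=1.46e-11, t_cq=8.32e-12)
print(f"Tc = {t_c * 1e12:.1f} ps, fc = {f_c / 1e9:.2f} GHz")
```

Because the two stage delays differ by only 14 ps, moving the flip-flops to a neighbouring node would be expected to lengthen one stage by more than it shortens the other.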

4. Pipelined CSM from RC extraction:

Tc = 4.43E-10 s
=> Maximum clocking frequency, fc = 1/Tc = 2.25 GHz
Results:
Unpipelined
Tpd 415 ps
Tsetup 14.6 ps
Tcq 8.32 ps
fc (schematic) 2.28 GHz
fc (RC extracted) 1.19 GHz

Pipelined
Tcd1 209 ps
Tcd2 195 ps
Tsetup 14.6 ps
Tcq 8.32 ps
fc (schematic) 3.84 GHz
fc (RC extracted) 2.25 GHz

Pipeline Gain from SPICE Model = 1.69


Pipeline Gain from RC Extraction = 1.89
fc (RC extracted)/ fc (schematic) [Unpipelined] = 0.52
fc (RC extracted)/ fc (schematic) [Pipelined] = 0.58
Layout efficiency = 1.915

Alternate Vector Merge implementation:

For the vector merge stage, a carry lookahead adder (CLA) has been implemented as an
alternative to the ripple carry adder. The architecture speeds up the computation by
generating all the carry-ins simultaneously, which reduces the time spent waiting for the
results of the higher-order bits.
For each bit position, the carry lookahead adder determines beforehand whether that position
will propagate an incoming carry and whether it will generate a carry of its own.
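
The generate/propagate idea can be illustrated with a small behavioural model. This is a sketch under stated assumptions, not the schematic used in this project: the 8-bit default width, the unsigned operands, and the names `cla_add`, `carry_into` and `all_ones` are all chosen for illustration. Each carry is written in flattened form, so it depends only on the generate and propagate signals and the carry-in, never on another carry.

```python
def all_ones(bits):
    """Return 1 if every bit in the list is 1 (an empty list counts as 1)."""
    return int(all(bits))


def cla_add(a, b, width=8, carry_in=0):
    """Add two unsigned width-bit integers using lookahead carry equations.

    Returns (sum modulo 2**width, carry out).
    """
    # Per-bit generate and propagate signals, least significant bit first.
    g = [(a >> i) & (b >> i) & 1 for i in range(width)]
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(width)]

    # Carry into bit i in flattened form:
    #   c[i] = g[i-1] | p[i-1]g[i-2] | ... | p[i-1]...p[0]carry_in
    def carry_into(i):
        carry = carry_in & all_ones(p[:i])
        for j in range(i):
            carry |= g[j] & all_ones(p[j + 1:i])
        return carry

    carries = [carry_into(i) for i in range(width + 1)]
    total = sum((p[i] ^ carries[i]) << i for i in range(width))
    return total, carries[width]


print(cla_add(0b10110101, 0b01101011))  # 181 + 107 = 288 -> (32, 1)
```

In hardware the flattened terms are evaluated in parallel, which is where the speed-up comes from; the Python loop only mirrors the logic equations, not the timing.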
Complete Schematic of CSM with CLA

CLA

CLA for CSM


Sample Outputs:

A (binary)  A (decimal)  B (binary)  B (decimal)  Z (binary)        Z (decimal)  Delay (s)   Delay (s)
                                                                                 (with CLA)  (no CLA)
11111111    -1           11111111    -1           0000000000000001  1            4.56E-10    4.15E-10
00000001    1            10000000    -128         1111111110000000  -128         2.37E-10    2.31E-10
00010001    17           00011100    28           0000000111011100  476          3.45E-10    2.27E-10
00010101    21           00110101    53           0000010001011001  1113         2.10E-10    2.27E-10
10000000    -128         11111111    -1           0000000010000000  128          1.78E-10    2.54E-10
11110101    -11          11110101    -11          0000000001111001  121          3.99E-10    4.04E-10
01111111    127          10000000    -128         1100000010000000  -16256       1.77E-10    2.54E-10

For the majority of the sample inputs (four of seven), we can see a decrease in the
delay, so for these inputs the carry lookahead adder is faster than the ripple carry adder.
In the remaining cases the delay does not improve, since those inputs do not make
use of the advantages of the CLA.
In conclusion, the circuit has been optimised, but there is scope for further
optimisation.

Rubric Results:

No. of test patterns – 8
Cell DRC and LVS – Clean
CSM DRC – Clean
CSM LVS – Clean
Use of inverting adders – Yes
Extracted netlist delay vs schematic delay – Yes
Pipelining – Yes
Max clocking frequency with and w/o pipelining – 3.84 GHz, 2.28 GHz
Alternate Vector Merge – Yes
