Project Final - Pipelined CSM

EE5311 – Final Project
Design and layout of a pipelined signed 8-bit Carry Save Multiplier
Group 10
Jayanth B
Bhargav
Summary of Results
SPECIFICATION                           RESULT
Unpipelined frequency                   2.28 GHz
Pipelined frequency                     4.31 GHz
Unpipelined frequency (RC extracted)    1.19 GHz
Pipelined frequency (RC extracted)      2.25 GHz
DRC                                     Clean
LVS                                     Clean
Area                                    213.2 um^2
No. of test samples                     8
Components used:
NAND2 – 3x
NOT – 3x
SUM – 3x
COUT – 1x
XOR – 3x
Schematic and Layout of all subcircuits:
Schematic of NAND2 (3x)

Layout of NAND2
Schematic of Inverter (3x)
Schematic of XOR (3x)
Schematic of Full Adder used
Layout of Full Adder

Schematic of CSM (without pipelining):
Layout of CSM (without pipelining):

Clean DRC of layout
Clean NCC Layout
Area of layout = 4128.5 λ × 516.5 λ = 2132370.25 λ²
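The area arithmetic can be checked with a short sketch. The value λ = 0.01 um is an assumption here, inferred only from the two reported figures (2132370.25 λ² and 213.2 um^2); it is not stated in the report.

```python
# Assumption: lambda = 0.01 um, inferred from the two reported area figures.
LAMBDA_UM = 0.01


def layout_area(width_lambda, height_lambda, lambda_um=LAMBDA_UM):
    """Return the layout area in lambda^2 and in um^2."""
    area_lambda2 = width_lambda * height_lambda
    area_um2 = area_lambda2 * lambda_um ** 2
    return area_lambda2, area_um2


area_l2, area_um2 = layout_area(4128.5, 516.5)
print(f"{area_l2:.2f} lambda^2 = {area_um2:.1f} um^2")
```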
Schematic of D Flip Flop:

D Flip Flop I/P for given input D:
Schematic of pipelined CSM:

The complete Schematic of pipelined CSM
Critical Path of pipelined CSM
Sample Simulations:
1. A = b’01110111 (119); B = b’10111011 (-69) => S = b’1101111111101101 (-8211)
2. A = b’11111111 (-1); B = b’10100000 (-96) => S = b’0000000001100000 (96)
3. A = b’00000001 (1); B = b’00000000 (0) => S = b’0000000000000000 (0)
4. A = b’10101010 (-86); B = b’01010101 (85) => S = b’1110001101110010 (-7310)
5. A = b’11111111 (-1); B = b’11111111 (-1) => S = b’0000000000000001 (1)
6. A = b’01111111 (127); B = b’10000000 (-128) => S = b’1100000010000000 (-16256)
7. A = b’10000000 (-128); B = b’00000001 (1) => S = b’1111111110000000 (-128)
8. A = b’00010001 (17); B = b’00011100 (28) => S = b’0000000111011100 (476)
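The expected outputs for the eight test patterns can be reproduced with a small behavioural reference model. This is only a sketch of signed 8-bit two's-complement multiplication for checking results; it does not model the carry-save array itself, and the helper names (`to_signed`, `to_bits`, `reference_multiply`) are hypothetical.

```python
def to_signed(bits):
    """Interpret a bit string as a two's-complement integer."""
    value = int(bits, 2)
    if bits[0] == "1":
        value -= 1 << len(bits)
    return value


def to_bits(value, width):
    """Encode an integer as a two's-complement bit string of the given width."""
    return format(value & ((1 << width) - 1), f"0{width}b")


def reference_multiply(a_bits, b_bits):
    """Signed 8-bit x 8-bit multiply returning (16-bit string, decimal value)."""
    product = to_signed(a_bits) * to_signed(b_bits)
    return to_bits(product, 16), product


# The eight test patterns from the sample simulations.
VECTORS = [
    ("01110111", "10111011"),
    ("11111111", "10100000"),
    ("00000001", "00000000"),
    ("10101010", "01010101"),
    ("11111111", "11111111"),
    ("01111111", "10000000"),
    ("10000000", "00000001"),
    ("00010001", "00011100"),
]

for a, b in VECTORS:
    s_bits, s = reference_multiply(a, b)
    print(f"A={a} ({to_signed(a)})  B={b} ({to_signed(b)})  S={s_bits} ({s})")
```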

Delay of CSM:
Maximum delay of NAND and INVERTER:
INVERTER Delay:
Rise Delay (s)    Fall Delay (s)
3.56E-11          3.01E-11

NAND Delay:
A                 B                 Rise delay (s)
X (transition)    0                 NA
0                 X (transition)    NA
1                 X (transition)    3.38E-11
X (transition)    1                 3.78E-11
Delays from various sample inputs:
A (binary)  A (decimal)  B (binary)  B (decimal)  Z (binary)        Z (decimal)  Delay (s)
11111111    -1           11111111    -1           0000000000000001  1            4.15E-10
00000001    1            10000000    -128         1111111110000000  -128         2.31E-10
00010001    17           00011100    28           0000000111011100  476          2.27E-10
00010101    21           00110101    53           0000010001011001  1113         2.27E-10
10000000    -128         11111111    -1           0000000010000000  128          2.54E-10
11110101    -11          11110101    -11          0000000001111001  121          4.04E-10
01111111    127          10000000    -128         1100000010000000  -16256       2.54E-10
Maximum Clocking Frequency:

1. Unpipelined CSM from Schematic:
NAND maximum delay = 3.78E-11 s (t_and)

NOT maximum delay = 3.56E-11 s (t_not)
Tpd = 4.15E-10 s (highest from -1 * -1)
From the falling characteristics,

Setup time, Tsetup = 1.46E-11 s
Clock-to-Q delay, Tcq = 8.32E-12 s
=> Minimum clock period, Tc = Tpd + Tsetup + Tcq = 4.38E-10 s = 0.438 ns
=> Maximum clock frequency, fc = 1/Tc = 2.28 GHz
2. Unpipelined CSM from RC extracted:
Tc = 8.35E-10 s = 0.835 ns
=> Maximum clock frequency, fc = 1/Tc = 1.19 GHz
3. Pipelined CSM from Schematic:
With a single pipeline register (two pipeline stages), both stages should ideally
have approximately equal combinational delays, i.e. Tcd1 = Tcd2 = Tpd/2 = 2.07E-10 s.
From the critical path simulation of the pipelined schematic, flip-flops were
placed at S5 to split the delay approximately equally.
=> Tcd1 = 2.09E-10 s; Tcd2 = 1.95E-10 s
The inputs to the second stage of the pipeline are delayed by one clock cycle.
=> Minimum clock period, Tc = max(Tcd1, Tcd2) + Tsetup + Tcq = 2.31E-10 s = 0.26 ns
=> Maximum clocking frequency, fc = 1/Tc = 3.84 GHz
4. Pipelined CSM from RC extraction:
Tc = 4.43E-10 s
=> Maximum clocking frequency, fc = 1/Tc = 2.25 GHz
Results:
Unpipelined
Tpd 415 ps
Tsetup 14.6 ps
Tcq 8.32 ps
fc (schematic) 2.28 GHz
fc (RC extracted) 1.19 GHz
Pipelined
Tcd1 209 ps
Tcd2 195 ps
Tsetup 14.6 ps
Tcq 8.32 ps
fc (schematic) 3.84 GHz
fc (RC extracted) 2.25 GHz
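The schematic clock budget, Tc = max(stage delay) + Tsetup + Tcq, can be recomputed from the tabulated values with a minimal sketch such as the one below (the function name `max_clock` is hypothetical). With these inputs the sum comes to roughly 438 ps (about 2.28 GHz) unpipelined and roughly 232 ps (about 4.31 GHz) pipelined; the 3.84 GHz figure corresponds to a 0.26 ns period.

```python
def max_clock(stage_delays, t_setup, t_cq):
    """Return (minimum clock period in s, maximum clock frequency in Hz)."""
    t_c = max(stage_delays) + t_setup + t_cq
    return t_c, 1.0 / t_c


T_SETUP = 1.46e-11  # flip-flop setup time (s), from the report
T_CQ = 8.32e-12     # flip-flop clock-to-Q delay (s), from the report

# Unpipelined: one combinational block with Tpd = 4.15E-10 s.
tc_unpiped, fc_unpiped = max_clock([4.15e-10], T_SETUP, T_CQ)

# Pipelined: two stages with Tcd1 = 2.09E-10 s and Tcd2 = 1.95E-10 s.
tc_piped, fc_piped = max_clock([2.09e-10, 1.95e-10], T_SETUP, T_CQ)

print(f"unpipelined: Tc = {tc_unpiped * 1e12:.0f} ps, fc = {fc_unpiped / 1e9:.2f} GHz")
print(f"pipelined:   Tc = {tc_piped * 1e12:.0f} ps, fc = {fc_piped / 1e9:.2f} GHz")
```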
Pipeline Gain from SPICE Model = 1.69

Pipeline Gain from RC Extraction = 1.89
fc (RC extracted)/ fc (schematic) [Unpipelined] = 0.52
fc (RC extracted)/ fc (schematic) [Pipelined] = 0.58
Layout efficiency = 1.915
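The gain and ratio figures follow from the four reported frequencies. The sketch below recomputes them from the rounded GHz values in the results table, so it agrees with the figures above only to within rounding (about 0.01); the layout efficiency is not recomputed because its definition is not given.

```python
# Reported maximum clock frequencies in GHz, taken from the results table.
FC = {
    "unpipelined": {"schematic": 2.28, "rc": 1.19},
    "pipelined": {"schematic": 3.84, "rc": 2.25},
}

# Pipeline gain = pipelined fc / unpipelined fc, for each netlist type.
gain_schematic = FC["pipelined"]["schematic"] / FC["unpipelined"]["schematic"]
gain_rc = FC["pipelined"]["rc"] / FC["unpipelined"]["rc"]

# Slowdown caused by extracted parasitics = fc (RC extracted) / fc (schematic).
ratio_unpiped = FC["unpipelined"]["rc"] / FC["unpipelined"]["schematic"]
ratio_piped = FC["pipelined"]["rc"] / FC["pipelined"]["schematic"]

print(f"pipeline gain (schematic)    = {gain_schematic:.3f}")
print(f"pipeline gain (RC extracted) = {gain_rc:.3f}")
print(f"RC/schematic (unpipelined)   = {ratio_unpiped:.3f}")
print(f"RC/schematic (pipelined)     = {ratio_piped:.3f}")
```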
Alternate Vector Merge implementation:
For the vector merge stage, a carry lookahead adder (CLA) has been implemented as an
alternative to the ripple carry adder in order to speed up the final addition. All the
carry-ins are generated simultaneously, which reduces the time the higher-order bits wait
for their carries.
The carry lookahead adder determines beforehand, for each bit position, whether that
position generates a carry or propagates an incoming carry.
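The generate/propagate idea can be illustrated with a behavioural sketch. It is not the transistor-level CLA used in the design: the 16-bit width and the names `lookahead_carry` and `cla_add` are assumptions made for illustration. Each carry is written as a flat expression of the generate (g), propagate (p) and carry-in signals, so no carry depends on a previously computed carry.

```python
def lookahead_carry(g, p, c0, i):
    """Carry into bit i as a flat expression:
    c_i = g[i-1] | p[i-1]&g[i-2] | ... | p[i-1]&...&p[0]&c0."""
    carry = 0
    for j in range(i):
        term = g[j]                  # bit j generates a carry ...
        for k in range(j + 1, i):
            term &= p[k]             # ... and bits j+1 .. i-1 propagate it
        carry |= term
    term = c0                        # carry-in propagated through bits 0 .. i-1
    for k in range(i):
        term &= p[k]
    return carry | term


def cla_add(a, b, width=16, c0=0):
    """Add two unsigned `width`-bit values; return (sum, carry_out)."""
    g = [(a >> i) & (b >> i) & 1 for i in range(width)]    # generate: a_i AND b_i
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(width)]  # propagate: a_i XOR b_i
    total = 0
    for i in range(width):
        total |= (p[i] ^ lookahead_carry(g, p, c0, i)) << i
    return total, lookahead_carry(g, p, c0, width)


print(cla_add(0x00FF, 0x0001))
print(cla_add(0xFFFF, 0x0001))
```

In hardware the inner loops become parallel AND-OR terms, which is why all carries are available at the same time; the cost is that the terms grow wider with bit position.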
Complete Schematic of CSM with CLA
CLA
CLA for CSM

Sample Outputs:
A (binary)  A (decimal)  B (binary)  B (decimal)  Z (binary)        Z (decimal)  Delay with CLA (s)  Delay without CLA (s)
11111111    -1           11111111    -1           0000000000000001  1            4.56E-10            4.15E-10
00000001    1            10000000    -128         1111111110000000  -128         2.37E-10            2.31E-10
00010001    17           00011100    28           0000000111011100  476          3.45E-10            2.27E-10
00010101    21           00110101    53           0000010001011001  1113         2.10E-10            2.27E-10
10000000    -128         11111111    -1           0000000010000000  128          1.78E-10            2.54E-10
11110101    -11          11110101    -11          0000000001111001  121          3.99E-10            4.04E-10
01111111    127          10000000    -128         1100000010000000  -16256       1.77E-10            2.54E-10
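For four of the seven inputs the delay decreases with the carry lookahead adder, so
it is faster than the ripple carry adder for the majority of the tested inputs. The
largest improvements are for -128 × -1 and 127 × -128 (2.54E-10 s down to about
1.78E-10 s). For the remaining inputs, including the worst case -1 × -1 (4.15E-10 s
without CLA, 4.56E-10 s with CLA), the delay is higher, since these inputs do not
make use of the advantages of the CLA.
In conclusion, the circuit has been optimised, but there is scope for further
optimisation.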
Rubric Results:
#Test patterns – #8
Cell DRC and LVS – Clean
CSM DRC – Clean
CSM LVS – Clean
Use of inverting adders – Yes
Extracted netlist delay vs schematic delay – Yes
Pipelining – Yes
Max clocking frequency with and w/o pipelining – 3.84 GHz, 2.28 GHz
Alternate Vector Merge – Yes
