Chapter 10 Arithmetic Ckts Presentation.V
Arithmetic circuits
Basic principle of pipelining
Traditional approach
[Figure: Input Data -> Process (< 100 ns, clocked by clk) -> Throughput 10 MHz]
Pipelining approach
• Throughput increases considerably.
• Chip area also increases.
• Latency comes into effect.
[Figure: Input Data -> Proc. 1 -> Reg. 1 -> ... -> Proc. 10 -> Reg. 10 (each stage < 10 ns, clocked by clk) -> Throughput 100 MHz]
Processing order
Time (ns)   Input   In process
0           Data1
10          Data2   Proc.1_1
20          Data3   Proc.1_2, Proc.2_1
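The timing figures on these slides can be checked with a short calculation. The following is a minimal Python sketch, not part of the original design; the function names and the assumption that every stage has the same delay are illustrative only.

```python
def throughput_mhz(stage_delay_ns):
    """Results per microsecond when a new input is accepted every stage delay."""
    return 1000.0 / stage_delay_ns


def latency_ns(stage_delay_ns, stages):
    """Time for one input to travel through every stage."""
    return stage_delay_ns * stages


def schedule(num_data, stages, stage_delay_ns):
    """Map each time (ns) to the (stage, data) pairs completed at that time."""
    table = {}
    for data in range(1, num_data + 1):
        for stage in range(1, stages + 1):
            t = (data - 1 + stage) * stage_delay_ns
            table.setdefault(t, []).append((stage, data))
    return table


# Traditional approach: one 100 ns block.
print(throughput_mhz(100), "MHz, latency", latency_ns(100, 1), "ns")
# Pipelined approach: ten 10 ns stages (assumed equal delays).
print(throughput_mhz(10), "MHz, latency", latency_ns(10, 10), "ns")
```

Under these assumptions the pipelined version accepts data ten times as often, while each individual result still takes 100 ns to emerge, which is the latency the slide refers to.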
Partitioning of a design
Partition of functionality
• Conventional approach of addition/subtraction uses all the 12 bits together.
________________________________
__________________________
ADDER CAN BE REALIZED IN TWO DIFFERENT WAYS:
[Figure: serial adder block serial_adder12s with inputs n [11:0] (n0 – n7), enable and clk, and output sum [14:0]]
module serial_adder12s (
clk,
enable,
n,
sum,
sum_valid,
result
);
input clk ;
input enable ;
input [11:0] n;
begin
end
begin
end
endmodule
________________________________
__________________________
`define clkperiodby2 10
`include "serial_adder12s.v"
module serial_adder12s_test (
sum,
sum_valid,
result
);
reg clk ;
reg enable ;
reg [11:0] n;
serial_adder12s u1(
.clk(clk),
.enable(enable),
.n(n),
.sum(sum),
.sum_valid(sum_valid),
.result(result)
);
initial
begin
clk = 1'b0 ;
n = 12'h0 ; // n0 @ 0 ns.
enable =0;
#20 enable =1;
#20 enable = 1 ;
// Apply the next set of inputs.
n =100 ; // n0
#20 n = 200 ;
#20 n = 300 ;
#20 n = 400 ;
#20 n = 500 ;
#20 n = 100 ;
#20 n = 200 ;
#20 n = 247 ; // n7
#20 enable = 0 ;
#100
$stop ;
end
always
#`clkperiodby2 clk <= ~clk ; // Toggle clk every half period.
endmodule
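The body of serial_adder12s is not reproduced on the slide, so the following Python sketch only models what a serial adder of this kind is expected to compute: one signed 12-bit sample is accumulated per clock while enable is high, into a 15-bit signed sum. The function names and the accumulate-and-wrap behavior are assumptions for illustration, not the author's implementation.

```python
def to_signed(value, width):
    """Interpret the low `width` bits of value as a two's complement number."""
    value &= (1 << width) - 1
    return value - (1 << width) if value >> (width - 1) else value


def serial_add(samples, in_width=12, out_width=15):
    """Accumulate one signed sample per clock into an out_width-bit register."""
    acc = 0
    for n in samples:  # one sample per clock while enable is high
        acc = (acc + to_signed(n, in_width)) & ((1 << out_width) - 1)
    return to_signed(acc, out_width)


# n0 to n7 as applied by the test bench.
stimulus = [100, 200, 300, 400, 500, 100, 200, 247]
print(serial_add(stimulus))  # 2047
```

Eight 12-bit operands need three extra bits of headroom, which is why the sum is 15 bits wide: the extreme case 8 x (-2048) = -16384 still fits.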
________________________________
__________________________
Synplify results
I/O primitives:
IBUF 13 uses
OBUF 31 uses
BUFGP 1 use
Mapping Summary:
Total LUTs: 18 (0%)
Mapper successful!
Xilinx P&R Results
Design Summary:
Number of errors: 0
Number of warnings: 0
Number of Slices: 11 out of 6,912 1%
Number of Slices containing unrelated logic: 0 out of 11 0%
Number of Slice Flip Flops: 19 out of 13,824 1%
Number of 4 input LUTs: 18 out of 13,824 1%
Number of bonded IOBs: 44 out of 158 27%
IOB Flip Flops: 15
Number of GCLKs: 1 out of 4 25%
Number of GCLKIOBs: 1 out of 4 25%
_______________________________________
____________________________
[Figure: parallel adder block adder12s with inputs n0 [11:0] to n7 [11:0]]
Complement evaluation (shortcut)
[8].....[0]
Step  Value     Action
      11110000  Data
1        10000  Retain first 1 followed by 0s
2     00010000  Invert other bits
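The shortcut can be checked against the usual "invert and add 1" rule. This is a small Python sketch written for illustration (the function name is hypothetical): scanning from the LSB, bits are copied up to and including the first 1, and every bit above it is inverted.

```python
def twos_complement_shortcut(value, width):
    """Two's complement via: keep the first 1 and the 0s below it, invert the rest."""
    seen_one = False
    result = 0
    for i in range(width):  # i = 0 is the LSB
        bit = (value >> i) & 1
        if seen_one:
            bit ^= 1        # invert every bit above the first 1
        elif bit == 1:
            seen_one = True  # retain the first 1 itself
        result |= bit << i
    return result


print(format(twos_complement_shortcut(0b11110000, 8), "08b"))  # 00010000
```

The shortcut avoids a carry chain, which is the reason it is attractive when the complement is evaluated by hand or in simple logic.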
Extend Sign
[8].....[0]
  111111111     -1
+ 111111111     -1
  _________   ____
  111111110     -2

  111111111     -1
+ 000000001     +1
  _________   ____
  000000000      0
Ignore Carry.

  001111111   +127
+ 001111111   +127
  _________   ____
  011111110   +254

  110000000   -128
+ 110000000   -128
  _________   ____
  100000000   -256
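The four sums above can be reproduced with a short Python sketch. It assumes 8-bit signed operands that are sign-extended to 9 bits before the addition, with the carry out of bit 8 discarded; the function names are illustrative.

```python
def sign_extend(value, width, new_width):
    """Copy the sign bit of a width-bit value into the added upper bits."""
    value &= (1 << width) - 1
    if value >> (width - 1):
        value |= ((1 << new_width) - 1) & ~((1 << width) - 1)
    return value


def add_extended(a, b, width=8):
    """Add two width-bit signed values after extending both by one bit."""
    new_width = width + 1
    total = sign_extend(a, width, new_width) + sign_extend(b, width, new_width)
    return total & ((1 << new_width) - 1)  # ignore the carry out of bit 8


for a, b in [(0xFF, 0xFF), (0xFF, 0x01), (0x7F, 0x7F), (0x80, 0x80)]:
    print(format(add_extended(a, b), "09b"))
```

Without the extra sign bit, 127 + 127 and -128 + -128 would both overflow an 8-bit result; with it, the 9-bit patterns 011111110 and 100000000 represent +254 and -256 correctly.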
Pipelined design partition
[Figure: inputs n0 [11:0] to n7 [11:0] are added in pairs. Each stage splits its additions into an LSB part and an MSB part, with clk-driven registers between them. First stage, Second stage and Third stage lead to Result.]
module adder12s(
clk,
n0,n1,n2,n3,n4,n5,n6,n7,
sum
) ;
input clk ;
reg s20_lsbreg5cy ;
reg [6:0] s20_lsbreg5 ;
// First stage addition
assign s00_lsb[7:0] = n0[6:0]+n1[6:0] ;
assign s01_lsb[7:0] = n2[6:0]+n3[6:0] ;
assign s02_lsb[7:0] = n4[6:0]+n5[6:0] ;
assign s03_lsb[7:0] = n6[6:0]+n7[6:0] ;
always @ (posedge clk)
begin
s00_lsbreg1[7:0] <= s00_lsb[7:0] ;
// Preserve all lsb sum.
// s00_lsbreg1[7] is the registered carry from lsb addition.
s01_lsbreg1[7:0] <= s01_lsb[7:0] ;
s02_lsbreg1[7:0] <= s02_lsb[7:0] ;
s03_lsbreg1[7:0] <= s03_lsb[7:0] ;
end
assign s00_msb[5:0] = {n0_reg1[11], n0_reg1[11:7]}+
                      {n1_reg1[11], n1_reg1[11:7]}+s00_lsbreg1[7];
// s00_msb[6] is ignored.
assign s01_msb[5:0] = {n2_reg1[11], n2_reg1[11:7]}+
                      {n3_reg1[11], n3_reg1[11:7]}+s01_lsbreg1[7];
assign s02_msb[5:0] = {n4_reg1[11], n4_reg1[11:7]}+
                      {n5_reg1[11], n5_reg1[11:7]}+s02_lsbreg1[7];
assign s03_msb[5:0] = {n6_reg1[11], n6_reg1[11:7]}+
                      {n7_reg1[11], n7_reg1[11:7]}+s03_lsbreg1[7];
always @ (posedge clk)
begin
s00_msbreg2[5:0] <= s00_msb[5:0] ;
// Preserve all msb sum.
s01_msbreg2[5:0] <= s01_msb[5:0] ;
s02_msbreg2[5:0] <= s02_msb[5:0] ;
s03_msbreg2[5:0] <= s03_msb[5:0] ;
s00_lsbreg2[6:0] <= s00_lsbreg1[6:0] ;
// Preserve all lsb sum.
s01_lsbreg2[6:0] <= s01_lsbreg1[6:0] ;
s02_lsbreg2[6:0] <= s02_lsbreg1[6:0] ;
s03_lsbreg2[6:0] <= s03_lsbreg1[6:0] ;
end
// Second stage addition
assign s10_lsb[7:0] = s00_lsbreg2[6:0] + s01_lsbreg2[6:0] ;
assign s11_lsb[7:0] = s02_lsbreg2[6:0] + s03_lsbreg2[6:0] ;
always @ (posedge clk)
begin
s10_lsbreg3[7:0] <= s10_lsb[7:0] ;
// Preserve all lsb sum.
s11_lsbreg3[7:0] <= s11_lsb[7:0] ;
s00_msbreg3[5:0] <= s00_msbreg2[5:0] ;
// Preserve all msb sum.
s01_msbreg3[5:0] <= s01_msbreg2[5:0] ;
s02_msbreg3[5:0] <= s02_msbreg2[5:0] ;
s03_msbreg3[5:0] <= s03_msbreg2[5:0] ;
end
assign s10_msb[6:0] = {s00_msbreg3[5], s00_msbreg3[5:0]}+
                      {s01_msbreg3[5], s01_msbreg3[5:0]}+
                      s10_lsbreg3[7] ;
assign s11_msb[6:0] = {s02_msbreg3[5], s02_msbreg3[5:0]}+
                      {s03_msbreg3[5], s03_msbreg3[5:0]}+
                      s11_lsbreg3[7] ;
always @ (posedge clk)
begin
s10_lsbreg4[6:0] <= s10_lsbreg3[6:0] ;
// Preserve all lsb sum.
s11_lsbreg4[6:0] <= s11_lsbreg3[6:0] ;
s10_msbreg4[6:0] <= s10_msb[6:0] ;
// Preserve all msb sum.
s11_msbreg4[6:0] <= s11_msb[6:0] ;
end
// Third stage addition.
assign s20_lsb[7:0] = s10_lsbreg4[6:0]+s11_lsbreg4[6:0] ;
always @ (posedge clk)
begin
s10_msbreg5[6:0] <= s10_msbreg4[6:0] ;
// Preserve all msb sum.
s11_msbreg5[6:0] <= s11_msbreg4[6:0] ;
s20_lsbreg5cy <= s20_lsb[7];
// Preserve all lsb sum.
s20_lsbreg5[6:0] <= s20_lsb[6:0];
end
assign sum[14:0] =
{({s10_msbreg5[6],
s10_msbreg5[6:0]}+
{s11_msbreg5[6],
s11_msbreg5[6:0]}+
s20_lsbreg5cy),
s20_lsbreg5[6:0]};
endmodule
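The arithmetic of the partition can be checked without a simulator. The Python sketch below is a bit-level model written for illustration, not the author's code: each addition is split into a 7-bit LSB add that produces a carry, followed by a sign-extended MSB add that consumes the carry, and three such levels reduce eight 12-bit inputs to one 15-bit sum. The pipeline registers are omitted because they delay the values without changing them; the function names are hypothetical.

```python
def to_signed(value, width):
    """Interpret the low `width` bits of value as a two's complement number."""
    value &= (1 << width) - 1
    return value - (1 << width) if value >> (width - 1) else value


def split_add(a, b, width, lsb_bits=7):
    """Add two width-bit signed patterns as an LSB add plus an MSB add."""
    lsb_mask = (1 << lsb_bits) - 1
    msb_bits = width - lsb_bits
    lsb_sum = (a & lsb_mask) + (b & lsb_mask)   # like s00_lsb[7:0]
    carry = lsb_sum >> lsb_bits                 # like s00_lsbreg1[7]
    a_hi = to_signed(a >> lsb_bits, msb_bits)   # sign-extended upper bits
    b_hi = to_signed(b >> lsb_bits, msb_bits)
    msb_sum = (a_hi + b_hi + carry) & ((1 << (msb_bits + 1)) - 1)
    return (msb_sum << lsb_bits) | (lsb_sum & lsb_mask)  # width + 1 bits


def add8(inputs, width=12):
    """Reduce eight inputs pairwise; the result grows one bit per level."""
    level = [n & ((1 << width) - 1) for n in inputs]
    while len(level) > 1:
        level = [split_add(level[i], level[i + 1], width)
                 for i in range(0, len(level), 2)]
        width += 1
    return level[0]


print(to_signed(add8([0x7FF] * 8), 15))  # 16376
print(to_signed(add8([0x800] * 8), 15))  # -16384
```

Splitting each addition this way halves the carry chain per clock, which is what allows the pipelined adder to run at a higher clock rate at the cost of extra registers.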
________________________________
__________________________
TEST BENCH FOR PARALLEL SIGNED
ADDER DESIGN
`define clkperiodby2 10
`include "adder12s_banno.v" // Use back annotated file.
module adder12s_test (
sum
);
reg clk ;
reg [11:0] n0 ;
reg [11:0] n1 ;
reg [11:0] n2 ;
reg [11:0] n3 ;
reg [11:0] n4 ;
reg [11:0] n5 ;
reg [11:0] n6 ;
reg [11:0] n7 ;
adder12s u1(
.clk(clk),
.n0(n0),
.n1(n1),
.n2(n2),
.n3(n3),
.n4(n4),
.n5(n5),
.n6(n6),
.n7(n7),
.sum(sum)
);
initial
begin
clk = 1'b0 ;
n0 = 12'h0 ;
n1 = 12'h0 ;
n2 = 12'h0 ;
n3 = 12'h0 ;
n4 = 12'h0 ;
n5 = 12'h0 ;
n6 = 12'h0 ;
n7 = 12'h0 ;
#17 n0 = 12'hfff ;
n1 = 12'hfff ;
n2 = 12'hfff ;
n3 = 12'hfff ;
n4 = 12'hfff ;
n5 = 12'hfff ;
n6 = 12'hfff ;
n7 = 12'hfff ;
#20 n0 = 12'h7ff ;
n1 = 12'h7ff ;
n2 = 12'h7ff ;
n3 = 12'h7ff ;
n4 = 12'h7ff ;
n5 = 12'h7ff ;
n6 = 12'h7ff ;
n7 = 12'h7ff ;
#20 n0 = 12'h800 ;
n1 = 12'h800 ;
n2 = 12'h800 ;
n3 = 12'h800 ;
n4 = 12'h800 ;
n5 = 12'h800 ;
n6 = 12'h800 ;
n7 = 12'h800 ;
#20 n0 = 12'h001 ;
n1 = 12'h001 ;
n2 = 12'h001 ;
n3 = 12'h001 ;
n4 = 12'h001 ;
n5 = 12'h001 ;
n6 = 12'h001 ;
n7 = 12'h001 ;
#20 n0 = 12'h001 ;
n1 = 12'hfff ;
n2 = 12'h001 ;
n3 = 12'hfff ;
n4 = 12'h001 ;
n5 = 12'hfff ;
n6 = 12'h001 ;
n7 = 12'hfff ;
#20 n0 = 12'h7ff ;
n1 = 12'h7ff ;
n2 = 12'h7ff ;
n3 = 12'h7ff ;
n4 = 12'h801 ;
n5 = 12'h801 ;
n6 = 12'h801 ;
n7 = 12'h801 ;
#20 n0 = 12'haaa ;
n1 = 12'h555 ;
n2 = 12'haaa ;
n3 = 12'h555 ;
n4 = 12'haaa ;
n5 = 12'h555 ;
n6 = 12'haaa ;
n7 = 12'h555 ;
#20 n0 = 12'h0 ;
n1 = 12'h0 ;
n2 = 12'h0 ;
n3 = 12'h0 ;
n4 = 12'h0 ;
n5 = 12'h0 ;
n6 = 12'h0 ;
n7 = 12'h0 ;
#400
$stop ;
end
always
#`clkperiodby2 clk <= ~clk ; // Toggle clk every half period.
endmodule
________________________________
__________________________
@I::"D:\user\ram\verilog_latest\dvlsi_des_verilog\adder12s.v"
Verilog syntax check successful!
Selecting top level module adder12s
Synthesizing module adder12s
Performance Summary
*******************
Starting Clock    Requested Frequency    Estimated Frequency
------------------------------------------------------------
clk               100.0 MHz              112.8 MHz
============================================================
Requested Period    Estimated Period    Slack    Clock Type
------------------------------------------------------------
10.000              8.864               1.136    inferred
============================================================
I/O primitives:
IBUF 96 uses
OBUF 15 uses
BUFGP 1 use
Mapping Summary:
Total LUTs: 95 (0%)
Mapper successful!
________________________________
__________________________
Xilinx P&R Results
Design Summary:
Number of errors: 0
Number of warnings: 0
Number of Slices: 97 out of 6,912 1%
Number of Slices containing unrelated logic: 0 out of 97 0%
Mapping completed.
Timing summary:
---------------
Design statistics:
Minimum period: 6.563ns (Maximum frequency: 152.369MHz)
Minimum input arrival time before clock: 4.259ns
Minimum output required time after clock: 11.083ns
Running DRC.
DRC detected 0 errors and 0 warnings.
Creating bit map...
Saving bit stream in "adder12s.bit".
Creating bit mask...
Saving mask bit stream in "adder12s.msk".
Bitstream generation is complete.
________________________________
__________________________
----------------------------------------------------
Type of Adder                     Serial    Parallel
----------------------------------------------------
No. of i/p clk cycles             8         1
----------------------------------------------------
No. of o/p clk cycles             9         1
----------------------------------------------------
Gate count                        464       2810
JTAG gate                         2,160     5376
----------------------------------------------------
Max. freq. of operation in MHz    174       152
----------------------------------------------------
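The table can be turned into a rough throughput comparison. The Python sketch below assumes that "No. of o/p clk cycles" is the number of clocks between successive results, which is an interpretation of the table rather than something it states; the dictionary keys and function name are illustrative.

```python
# Figures copied from the comparison table above.
adders = {
    "serial":   {"fmax_mhz": 174, "clk_per_result": 9},
    "parallel": {"fmax_mhz": 152, "clk_per_result": 1},
}


def results_per_us(fmax_mhz, clk_per_result):
    """Sums produced per microsecond at the maximum clock frequency."""
    return fmax_mhz / clk_per_result


for name, a in adders.items():
    rate = results_per_us(a["fmax_mhz"], a["clk_per_result"])
    print(f"{name}: {rate:.1f} results per microsecond")
```

Under that assumption the parallel adder delivers several times the throughput of the serial one despite its lower clock rate, in exchange for roughly six times the gate count.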
________________________________
__________________________
[Figure: multiplier block mult11sx8s with inputs n1 [10:0], n2 [7:0] and clk; 8 pipeline stages]
Example:
1023 x -128 = -130944
01111111111 x 10000000 = 1100000000010000000

n1 (magnitude) x n2 (magnitude)
       01111111111
     x    10000000
__________________
       00000000000   P1
      00000000000    P2
     00000000000     P3
    00000000000      P4
   00000000000       P5
  00000000000        P6
 00000000000         P7
01111111111          P8
__________________
011111111110000000   (magnitude)
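The example can be reproduced with a short behavioral model. The Python sketch below assumes the flow suggested by the slide: take the magnitudes of both operands, sum the eight partial products, then restore the sign and force the result to zero when either operand is zero. The function names are hypothetical, and the partial products are stored already shifted, which is a modeling convenience rather than a description of the hardware.

```python
def to_signed(value, width):
    """Interpret the low `width` bits of value as a two's complement number."""
    value &= (1 << width) - 1
    return value - (1 << width) if value >> (width - 1) else value


def multiply_sign_magnitude(n1, n2, w1=11, w2=8):
    """Signed multiply via magnitudes; returns a (w1 + w2)-bit pattern."""
    a = to_signed(n1, w1)
    b = to_signed(n2, w2)
    if a == 0 or b == 0:          # either operand zero: result is +0
        return 0
    a_mag, b_mag = abs(a), abs(b)
    # P1..P8: n1 magnitude gated by each bit of n2 magnitude, shifted in place.
    partials = [(a_mag << i) if (b_mag >> i) & 1 else 0 for i in range(w2)]
    magnitude = sum(partials)
    negative = (a < 0) != (b < 0)  # signs differ: negative product
    result = -magnitude if negative else magnitude
    return result & ((1 << (w1 + w2)) - 1)


r = multiply_sign_magnitude(0b01111111111, 0b10000000)
print(format(r, "019b"), to_signed(r, 19))  # 1100000000010000000 -130944
```

Working on magnitudes keeps every partial product non-negative, so the adder tree needs no sign handling; the sign is applied once at the end.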
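[Figure: adder tree for the partial products. First stage: P1 + P2 (LS 1 b) -> S11, P3 + P4 (LS 1 b) -> S12, P5 + P6 (LS 1 b) -> S13, P7 + P8 (LS 1 b) -> S14. Second stage: S11 + S12 (LS 2 b) -> S21, S13 + S14 (LS 2 b) -> S22.]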
module mult11sx8s(
clk,
n1,
n2,
result
) ;
input clk ;
input [10:0] n1 ;
input [7:0] n2 ;
output [18:0] result ;
wire n1orn2z ;
wire [10:0] p1 ;
wire [10:0] p2 ;
wire [10:0] p3 ;
wire [10:0] p4 ;
wire [10:0] p5 ;
wire [10:0] p6 ;
wire [10:0] p7 ;
wire [10:0] p8 ;
wire res_sign ;
wire [18:0] res ;
reg [6:0] s11a_reg2 ;
reg [6:0] s12a_reg2 ;
reg [6:0] s13a_reg2 ;
reg [6:0] s14a_reg2 ;
reg n1_reg1;
reg n1_reg2;
reg n1_reg3;
reg n1_reg4;
reg n1_reg5;
reg n1_reg6;
reg n1_reg7;
reg n2_reg1;
reg n2_reg2;
reg n2_reg3;
reg n2_reg4;
reg n2_reg5;
reg n2_reg6;
reg n2_reg7;
reg n1orn2z_reg1 ;
reg n1orn2z_reg2 ;
reg n1orn2z_reg3 ;
reg n1orn2z_reg4 ;
reg n1orn2z_reg5 ;
reg n1orn2z_reg6 ;
reg n1orn2z_reg7 ;
reg [7:0] s21a_reg4 ;
reg [7:0] s22a_reg4 ;
reg [14:0] s21_reg5 ;
reg [14:0] s22_reg5 ;
reg [8:0] s31a_reg6 ;
always @(n1)
begin
if(n1[10] == 1'b0)
n1_mag = n1[10:0];
else
n1_mag = ~n1[10:0] + 1; // Evaluate twos complement.
end
always @(n2)
begin
if(n2[7] == 1'b0)
n2_mag = n2[7:0];
else
n2_mag = ~n2[7:0] + 1; // Evaluate twos complement.
end
assign n1orn2z = ((n1 == 11'b0)||(n2 == 8'b0)) ? 1'b1:1'b0;
// If n1 or n2 is zero, make final result +0.
// n1 multiplied by n2 bit '0', etc.
begin
end
// LSB is added here.
assign s12a[6:0] = p3_reg1[6:1] + p4_reg1[5:0];
// for p1, p3, p5 and p7.
// p1_reg1[0], etc. will be processed at the clk (2).
// s11a[6], etc. are the carry bits.
begin
end
p2_reg2[10:6] +
s11a_reg2[6];
p4_reg2[10:6] +
s12a_reg2[6];
p6_reg2[10:6] +
s13a_reg2[6];
p8_reg2[10:6] +
s14a_reg2[6];
p1_reg2[0]};
// MSB, LSB, '0' th bit respectively.
p3_reg2[0]};
assign s13[12:0] = {s13b, s13a_reg2[5:0], p5_reg2[0]};
p7_reg2[0]};
always @ (posedge clk)
begin
s12_reg3[6:0];
// s21a[7] is the carry.
s14_reg3[6:0];
// LSB sum, 2nd stage.
begin
s11_reg4[12:9] <=
s11_reg3[12:9];
end
s12_reg4[12:7] +
s21a_reg4[7];
assign s22b[6:0] = {2'b0, s13_reg4[12:9]} + s14_reg4[12:7] + s22a_reg4[7];
s11_reg4[1:0]} ;
// {MSB, LSB, [1:0]}
begin
end
assign s31a[8:0] =
s21_reg5[11:4] +
s22_reg5[7:0];
always @ (posedge clk)
begin
s21_reg6[14:12] <= s21_reg5[14:12]; // Preserve MSB.
s22_reg6[14:8] <= s22_reg5[14:8];
s21_reg6[3:0] <= s21_reg5[3:0];
s31a_reg6 <= s31a; // 3rd stage LSB registered here.
end
s22_reg6[14:8] +
s31a_reg6[8];
s21_reg6[3:0]} ;
begin
end
assign res_sign = n1_reg7^n2_reg7; // '1' means a -ve no.
begin
if (n1orn2z_reg7 == 1'b1)
else
________________________________
__________________________
`define clkperiodby2 10
`include "mult11sx8s_banno.v"
module mult11sx8s_test (
result
);
reg clk ;
reg [10:0] n1 ;
reg [7:0] n2 ;
mult11sx8s u1(
.clk(clk),
.n1(n1),
.n2(n2),
.result(result)
);
initial
begin
clk = 1'b0 ;
n1 = 11'h0 ;
n2 = 8'h0 ;
#17 n1 = 11'h555 ;
n2 = 8'h55;
#20 n1 = 11'h2aa ;
n2 = 8'haa;
#20 n1 = 11'h7ff ;
n2 = 8'h80;
#20 n1 = 11'h555 ;
n2 = 8'hff;
#20 n1 = 11'h7ff ;
n2 = 8'h81;
#20 n1 = 11'h555 ;
n2 = 8'h81;
#20 n1 = 11'h2aa ;
n2 = 8'h81;
#20 n1 = 11'h7ff ;
n2 = 8'h00;
#20 n1 = 11'h7ff ;
n2 = 8'h7f;
#20 n1 = 11'h000 ;
n2 = 8'hff;
#20 n1 = 11'h000 ;
n2 = 8'h7f;
#400
$stop ;
end
always
#`clkperiodby2 clk <= ~clk ; // Toggle clk every half period.
endmodule
________________________________
__________________________
Synplify results
@I::"D:\user\ram\verilog_latest\dvlsi_des_verilog\mult11sx8s.v"
Verilog syntax check successful!
Selecting top level module mult11sx8s
Synthesizing module mult11sx8s
@N:"D:\user\ram\verilog_latest\dvlsi_des_verilog\mult11sx8s.v":346:0:346:5|Found seqShift n1orn2z, depth=7, width=1
@N:"D:\user\ram\verilog_latest\dvlsi_des_verilog\mult11sx8s.v":346:0:346:5|Found seqShift n1, depth=6, width=1
@N:"D:\user\ram\verilog_latest\dvlsi_des_verilog\mult11sx8s.v":346:0:346:5|Found seqShift n2, depth=6, width=1
@W:"D:\user\ram\verilog_latest\dvlsi_des_verilog\mult11sx8s.v":202:0:202:5|Register bit s14a_reg2[6] is always 0, optimizing ...
@END
Performance Summary
*******************
Starting Clock    Requested Frequency    Estimated Frequency
------------------------------------------------------------
clk               50.0 MHz               125.1 MHz
============================================================
Requested Period    Estimated Period    Slack     Clock Type
------------------------------------------------------------
20.000              7.991               12.009    inferred
============================================================
I/O primitives:
IBUF 19 uses
OBUF 19 uses
BUFGP 1 use
SRL primitives:
SRL16 9 uses
Mapping Summary:
Total LUTs: 181 (1%)
Mapper successful!
Design Summary:
Number of errors: 0
Number of warnings: 0
Number of Slices: 201 out of 6,912 2%
Number of Slices containing unrelated logic: 0 out of 201 0%
Number of Slice Flip Flops: 292 out of 13,824 2%
Total Number 4 input LUTs: 178 out of 13,824 1%
Number used as LUTs: 161
Number used as a route-thru: 8
Number used as Shift registers: 9
Number of bonded IOBs: 38 out of 158 24%
IOB Flip Flops: 22
Number of GCLKs: 1 out of 4 25%
Number of GCLKIOBs: 1 out of 4 25%
Mapping completed.
Timing summary:
---------------
Design statistics:
Minimum period: 12.132ns (Maximum frequency: 82.427MHz)
Minimum input arrival time before clock: 10.150ns
Minimum output required time after clock: 5.617ns
Running DRC.
DRC detected 0 errors and 0 warnings.
Creating bit map...
Saving bit stream in "mult11sx8s.bit".
Creating bit mask...
Saving mask bit stream in "mult11sx8s.msk".
Bitstream generation is complete.
________________________________
__________________________