

Design of Approximate Compressors for Multiplication

Article in ACM Journal on Emerging Technologies in Computing Systems · April 2017


DOI: 10.1145/3007649



Design of Approximate Compressors for Multiplication
ANUSHA GORANTLA, Government College of Technology in Coimbatore, Tamilnadu, India
DEEPA P, Government College of Technology in Coimbatore, Tamilnadu, India

Approximate computing is a promising technique for energy-efficient Very Large Scale Integration (VLSI)
system design. It is best suited for error-resilient applications such as signal processing and multimedia.
Approximate computing trades some accuracy for faster results and lower power consumption while still
producing meaningful output, which makes it attractive for arithmetic circuits. In this article, novel
approximate 4-2 and 5-2 compressor designs are proposed for reducing the partial product stages in
multiplication. Three approximate 8 × 8 Dadda multiplier designs using three novel approximate 4-2
compressors and two approximate 8 × 8 Dadda multiplier designs using two novel approximate 5-2 compressors
are proposed. The synthesis results show that the proposed designs achieve significant accuracy
improvement together with power and delay reductions compared to the existing approximate designs.
CCS Concepts: • Computing methodologies; • Hardware → Very large scale integration design
Additional Key Words and Phrases: Approximate computing, 4-2 compressor, 5-2 compressor, Dadda
multiplier
ACM Reference Format:
Anusha Gorantla and Deepa P. 2017. Design of approximate compressors for multiplication. J. Emerg.
Technol. Comput. Syst. 13, 3, Article 44 (April 2017), 17 pages.
DOI: http://dx.doi.org/10.1145/3007649
1. INTRODUCTION
Many scientific and engineering problems are solved using deterministic and precise
algorithms. However, some applications, such as image and video processing, can
tolerate errors [Han et al. 2013]. Human perception is limited in identifying
imprecision in processed images or video. Hence, precise algorithms and models
are inefficient for these applications. Approximate computation increases the
performance of digital logic circuits or systems by decreasing the logic
complexity with a tradeoff in accuracy [Gupta et al. 2011; Han et al. 2013; Gupta et al.
2013; Swagath et al. 2013; Li et al. 2015; Nair et al. 2010]. Approximate computing is
an emerging approach to energy-efficient Very Large Scale Integration (VLSI) designs
and can be applied at different levels of abstraction.
This article introduces approximate computing at the logic level by
introducing a minimal number of errors in the truth table and simplifying the logic using
a Karnaugh map (K-map).
Multiplication is an elementary arithmetic operation and crucial in applications
like digital signal processing. The implementation of multipliers includes generation

Authors’ addresses: A. Gorantla, Electronics and Communication Engineering Department, Government
College of Technology, Coimbatore-641013, Tamilnadu, India; email: anushagorantla3@gmail.com; P. Deepa,
Assistant Professor, Department of Electronics and Communication Engineering, Government College of
Technology, Coimbatore-641013, Tamilnadu, India; email: deepap05@gmail.com.
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted
without fee provided that copies are not made or distributed for profit or commercial advantage and that
copies show this notice on the first page or initial screen of a display along with the full citation. Copyrights for
components of this work owned by others than ACM must be honored. Abstracting with credit is permitted.
To copy otherwise, to republish, to post on servers, to redistribute to lists, or to use any component of this
work in other works requires prior specific permission and/or a fee. Permissions may be requested from
Publications Dept., ACM, Inc., 2 Penn Plaza, Suite 701, New York, NY 10121-0701 USA, fax +1 (212)
869-0481, or permissions@acm.org.
© 2017 ACM 1550-4832/2017/04-ART44 $15.00
DOI: http://dx.doi.org/10.1145/3007649

ACM Journal on Emerging Technologies in Computing Systems, Vol. 13, No. 3, Article 44, Publication date: April 2017.

of partial products, reduction of partial products using a Carry-Save Adder (CSA), and
carry propagation for computing the final result. Thus, speeding up the CSA circuit and
lowering its power dissipation are crucial for keeping the performance of the multiplier
competitive. To reduce partial products, multi-operand adders are required,
and hence a different design method is needed for multi-operand adders [Parhami
et al. 2010; Ercegovac et al. 2004; Koren et al. 1993]. A different structure, known as
the compressor, can be adopted for multi-operand addition. Wallace and Dadda were
the first to explain the usage of compressors and counters, respectively, for
partial product reduction trees in multipliers [Wallace et al. 1964; Dadda et al. 1965].
Early designs of the CSA tree used Dadda’s column compression technique with
3-2 counters or, equivalently, full adders to reduce the partial product stages.
To reduce the partial product stages further, 4-2 and 5-2 compressors are now
employed in high-speed multipliers.
An Error Tolerant Multiplier (ETM) divides the input operands into two parts,
accurate and inaccurate. In the accurate part, exact multiplication is performed on
the higher order bits, and the inaccurate part is built without multiplication and admits a
certain amount of error [Kyaw et al. 2010]. A novel 2×2-bit Under Designed Multiplier
(UDM) is proposed as a building block for larger multipliers [Kulkarni et al. 2011]. Mahdiani et al.
[2010] present a 6×6-bit Broken Array Multiplier that is faster than an accurate array
multiplier. The 4×4 Imprecise Counter-based Multiplier (ICM), which uses 4:2 inaccurate
compressors to reduce the partial product stages of a Wallace tree multiplier, is
a power-efficient design for implementing multipliers of large sizes [Lin et al. 2013].
Four different approaches of the Approximate Wallace Tree Multiplier (AWTM) are
presented in Bhardwaj et al. [2014]. This design uses a carry-in prediction method,
resulting in reduced hardware, lower power, smaller area, and decreased delay
compared to the accurate Wallace tree multiplier. A fast multiplier based on an
approximate adder can process data in parallel by cutting the carry propagation
chain [Liu et al. 2014]. However, there is still a need to develop efficient adders
and multipliers for recent applications [Momeni et al. 2015].
Two approximate 4-2 compressor architectures and four 8×8 approximate Dadda
multipliers are proposed in Momeni et al. [2015]. Most approximate multipliers
settle for a tradeoff among accuracy, power, delay, and area. This article proposes several
approximate 4-2 compressors, 5-2 compressors, and approximate 8×8 Dadda multipliers
to improve performance and accuracy. The proposed
approximate 8×8 Dadda multipliers provide better results
than the approximate 8×8 Dadda multipliers proposed in Momeni et al. [2015].
This article is organized as follows.

—Section 2 reviews the exact 4-2 compressor, the existing approximate 4-2 compressors,
the exact 5-2 compressor, and the proposed approximate 4-2 and 5-2 compressors.
—Section 3 presents the conventional 8×8 Dadda multiplier, the proposed approximate 4-2
compressor-based approximate 8×8 Dadda multipliers, and the proposed approximate
5-2 compressor-based approximate 8×8 Dadda multipliers.
—Section 4 discusses the synthesis results, the error metrics for the approximate compressors
and approximate 8×8 Dadda multipliers, and an application of the 8×8 Dadda
multipliers in image processing.
—Finally, Section 5 concludes the article.

2. EXISTING DESIGNS
Compressors are the key building blocks used for reducing the partial product
stages during multiplication. Therefore, improving the power efficiency of

Fig. 1. Exact 4-2 compressor.

these architectures can lead to significant savings of the power consumed by the entire
multiplier. In parallel multiplication, (n-2) compressors are used.
2.1. Exact and Proposed Approximate 4-2 Compressors
An exact 4-2 compressor has five inputs and three outputs, as shown in Figure 1. It
produces a sum of the same order for the next stage and a carry one order higher
for the next stage. In addition, the carry out (Cout) becomes the carry in (Cin) of the next
higher order compressor. An exact 4-2 compressor design uses three Exclusive OR
(EX-OR)/Exclusive NOR (EX-NOR) gates, one EX-OR gate, and two 2:1
multiplexers [Chang et al. 2004].
Table I shows the exact 4-2 compressor truth table and the logic equations for outputs
of the exact 4-2 compressor as follows:
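\[ \mathrm{Sum} = X_1 \oplus X_2 \oplus X_3 \oplus X_4 \oplus C_{in} \tag{1} \]

\[ C_{out} = (X_1 \oplus X_2)\,X_3 + \overline{(X_1 \oplus X_2)}\,X_1 \tag{2} \]

\[ \mathrm{Carry} = (X_1 \oplus X_2 \oplus X_3 \oplus X_4)\,C_{in} + \overline{(X_1 \oplus X_2 \oplus X_3 \oplus X_4)}\,X_4 \tag{3} \]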
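As a quick sanity check, the exact 4-2 compressor equations can be evaluated over all 32 input combinations. The sketch below is illustrative only: the function names are hypothetical, and it assumes the usual form of Equations (2) and (3) in which the second product term uses the complement of the XOR expression. Under that assumption, Sum + 2·(Carry + Cout) should equal the number of ones among the five inputs.

```python
from itertools import product


def exact_compressor_4_2(x1, x2, x3, x4, cin):
    """Exact 4-2 compressor on single bits (0 or 1).

    Returns (sum, carry, cout). Assumes the complemented second terms
    in Equations (2) and (3); names are illustrative, not from the paper.
    """
    x12 = x1 ^ x2
    x1234 = x12 ^ x3 ^ x4
    s = x1234 ^ cin                              # Equation (1)
    cout = (x12 & x3) | ((1 - x12) & x1)         # Equation (2)
    carry = (x1234 & cin) | ((1 - x1234) & x4)   # Equation (3)
    return s, carry, cout


def is_exact():
    """True if the outputs encode the input count for every combination."""
    for x1, x2, x3, x4, cin in product((0, 1), repeat=5):
        s, carry, cout = exact_compressor_4_2(x1, x2, x3, x4, cin)
        if s + 2 * (carry + cout) != x1 + x2 + x3 + x4 + cin:
            return False
    return True


if __name__ == "__main__":
    print("exact for all 32 inputs:", is_exact())
```

Both Carry and Cout carry weight two, which is why they are added before doubling.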
Two approximate 4-2 compressors, approximate compressor1 and approximate
compressor2, are presented in Momeni et al. [2015]. These compressors optimize the logic
and the performance with a tradeoff in accuracy. The gate-level
implementation of approximate compressor1 has a critical path delay of
3, and the corresponding truth table is given in Momeni et al. [2015]. The difference
between the exact 4-2 compressor output and the approximate 4-2 compressor1
output measures the inaccuracy; compressor1 produces 12 errors.
Various approximate 4-2 compressors and 5-2 compressors are proposed and simplified
using a K-map to further reduce errors and to increase the performance compared
to the exact 4-2 and 5-2 compressors. They are named approximate 4-2 compressor3,
approximate 4-2 compressor4, approximate 4-2 compressor5, approximate 5-2 compressor1,
and approximate 5-2 compressor2. To simplify the proposed approximate 4-2 and
5-2 compressor designs, Cin and Cout are
removed from the circuit.
Approximate 4-2 compressor2 was proposed to further reduce the critical path delay
and the number of errors and to increase the performance compared to approximate compressor1.
It simplifies the circuit and gives better results in terms of accuracy. The gate-level
implementation of approximate compressor2 has a critical path delay of 2,
which is lower than that of approximate 4-2 compressor1. Four errors are
possible in the 4-2 compressor2.
2.1.1. Proposed Approximate 4-2 Compressor3. The logic equations given below are used
for designing the approximate 4-2 compressor3 and the corresponding truth table is
given in Table II. This proposed design produces three errors as specified by the term
difference in the truth table.

Table I. Truth Table for Exact and Existing Approximate 4-2 Compressors
Column groups: inputs (Cin, X4, X3, X2, X1); outputs of the exact 4-2 compressor (Sum, Carry), existing
approximate 4-2 compressor1 (Sum1, Carry1), and existing approximate 4-2 compressor2 (Sum2, Carry2);
difference (number of errors) for the exact compressor, compressor1, and compressor2.
Cin X4 X3 X2 X1 Sum Carry Sum1 Carry1 Sum2 Carry2 Exact Compressor1 Compressor2
0 0 0 0 0 0 0 1 0 1 0 0 1 1
0 0 0 0 1 1 0 1 0 1 0 0 0 0
0 0 0 1 0 1 0 1 0 1 0 0 0 0
0 0 0 1 1 0 0 1 0 1 0 0 −1 −1
0 0 1 0 0 1 0 1 0 1 0 0 0 0
0 0 1 0 1 0 0 0 0 0 1 0 0 0
0 0 1 1 0 0 0 0 0 0 1 0 0 0
0 0 1 1 1 1 0 1 0 1 1 0 0 0
0 1 0 0 0 1 0 1 0 1 0 0 0 0
0 1 0 0 1 0 1 0 0 0 1 0 0 0
0 1 0 1 0 0 1 0 0 0 1 0 0 0
0 1 0 1 1 1 0 1 0 1 1 0 0 0
0 1 1 0 0 0 1 1 0 1 0 0 −1 −1
0 1 1 0 1 1 0 1 0 1 1 0 0 0
0 1 1 1 0 1 0 1 0 1 1 0 0 0
0 1 1 1 1 0 1 1 0 1 1 0 −1 −1
1 0 0 0 0 1 0 0 1 – – 0 1 –
1 0 0 0 1 0 1 0 1 – – 0 0 –
1 0 0 1 0 0 1 0 1 – – 0 0 –
1 0 0 1 1 1 0 0 1 – – 0 −1 –
1 0 1 0 0 0 1 0 1 – – 0 0 –
1 0 1 0 1 1 0 0 1 – – 0 1 –
1 0 1 1 0 1 0 0 1 – – 0 1 –
1 0 1 1 1 0 1 0 1 – – 0 0 –
1 1 0 0 0 0 1 0 1 – – 0 0 –
1 1 0 0 1 1 1 0 1 – – 0 1 –
1 1 0 1 0 1 1 0 1 – – 0 1 –
1 1 0 1 1 0 1 0 1 – – 0 0 –
1 1 1 0 0 1 1 0 1 – – 0 −1 –
1 1 1 0 1 0 1 0 1 – – 0 0 –
1 1 1 1 0 0 1 0 1 – – 0 0 –
1 1 1 1 1 1 1 0 1 – – 0 −1 –

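The logic equations for the outputs of the proposed approximate 4-2 compressor3 are as follows:
\[ \mathrm{Sum3} = X_1 X_2 + X_3 X_4 + \overline{X_1}\,\overline{X_2}\,(X_3 + X_4) + \overline{X_3}\,\overline{X_4}\,(X_1 + X_2) \tag{4} \]

\[ \mathrm{Carry3} = \overline{\overline{X_1}\,\overline{X_2} + \overline{X_3}\,\overline{X_4}} \tag{5} \]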
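The error count quoted for approximate 4-2 compressor3 can be checked with a short script. The sketch below is an assumption-laden illustration: the function names are hypothetical, and the complemented literals in Sum3 and Carry3 are placed in the way that reproduces the Sum3 and Carry3 columns of Table II, since the complement bars are not legible in every copy of Equations (4) and (5). It compares the approximate value Sum3 + 2·Carry3 with the true number of ones among X1 to X4.

```python
from itertools import product


def approx_compressor3(x1, x2, x3, x4):
    """Approximate 4-2 compressor3 on single bits; returns (sum3, carry3).

    Complement placement is an assumption chosen to match Table II.
    """
    n1, n2, n3, n4 = 1 - x1, 1 - x2, 1 - x3, 1 - x4
    sum3 = (x1 & x2) | (x3 & x4) | (n1 & n2 & (x3 | x4)) | (n3 & n4 & (x1 | x2))
    carry3 = 1 - ((n1 & n2) | (n3 & n4))
    return sum3, carry3


def error_patterns():
    """List ((X4, X3, X2, X1), difference) for every erroneous input."""
    errors = []
    for x4, x3, x2, x1 in product((0, 1), repeat=4):
        s, c = approx_compressor3(x1, x2, x3, x4)
        diff = (s + 2 * c) - (x1 + x2 + x3 + x4)
        if diff != 0:
            errors.append(((x4, x3, x2, x1), diff))
    return errors


if __name__ == "__main__":
    for pattern, diff in error_patterns():
        print(pattern, diff)
```

With this reading, the three erroneous input patterns are 0011, 1100, and 1111 (written as X4 X3 X2 X1), each with a difference of −1, which agrees with the Compressor3 difference column of Table II.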


2.1.2. Proposed Approximate 4-2 Compressor4. Logic equations (6) and (7) are used for
designing approximate 4-2 compressor4, and the corresponding truth table is
given in Table II. From Table II, it is observed that the proposed design produces two
errors, as specified by the difference column of the truth table.
The logic equations for the outputs of approximate 4-2 compressor4 are as follows:
Sum4 = X4X3+X4X1 X2+X1 X2 X3+ X1X2X4+ X1X2X3+ X1X2 X3 X4 + X1 X2+ X3 X4
(6)
Carry4 = ((X1X2) + (X3X4) ) (7)


Table II. Truth Table for Proposed Approximate 4-2 Compressors
Column groups: inputs (X4, X3, X2, X1); outputs of proposed approximate 4-2 compressor3 (Sum3, Carry3),
compressor4 (Sum4, Carry4), and compressor5 (Sum5, Carry5); difference (number of errors) for
compressor3, compressor4, and compressor5.
X4 X3 X2 X1 Sum3 Carry3 Sum4 Carry4 Sum5 Carry5 Compressor3 Compressor4 Compressor5
0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 1 1 0 1 0 1 0 0 0 0
0 0 1 0 1 0 1 0 1 0 0 0 0
0 0 1 1 1 0 0 0 0 0 −1 0 0
0 1 0 0 1 0 1 0 1 0 0 0 0
0 1 0 1 0 1 0 1 0 1 0 0 0
0 1 1 0 0 1 0 1 0 1 0 0 0
0 1 1 1 1 1 1 1 1 1 0 0 0
1 0 0 0 1 0 1 0 1 0 0 0 0
1 0 0 1 0 1 0 1 0 1 0 0 0
1 0 1 0 0 1 0 1 0 1 0 0 0
1 0 1 1 1 1 1 1 1 1 0 0 0
1 1 0 0 1 0 1 0 0 0 −1 −1 0
1 1 0 1 1 1 1 1 1 1 0 0 0
1 1 1 0 1 1 1 1 1 1 0 0 0
1 1 1 1 1 1 1 1 1 1 −1 −1 −1

Fig. 2. Exact 5-2 compressor.

2.1.3. Proposed Approximate 4-2 Compressor5. Logic equations (8) and (9) are used
for designing approximate 4-2 compressor5, and the corresponding truth table is given
in Table II. The proposed design produces one error, as specified by the difference
column of the truth table.
The logic equations for the outputs of approximate 4-2 compressor5 are as follows:
Sum5 = X1X3X4 + X2X3X4 + X1 X2 X3 X4 + X1 X2 X3X4 + X1 X2X3 X4 (8)
 
Carry5 = ( X1X2) + ( X3X4) (9)

2.2. Exact and Proposed Approximate 5-2 Compressors


An exact 5-2 compressor has seven inputs and four outputs, as shown in Figure 2. The
primary inputs are X1, X2, X3, X4, and X5, and the other two inputs are the carry bits Cin1 and
Cin2. It produces two outputs, sum and carry, and two output carry bits, Cout1 and Cout2.
The input carry bits are the outputs of the preceding, less significant compressor block,
and the output carry bits are passed on to the next, more significant compressor block.
An exact 5-2 compressor uses six XOR gates and three 2:1 multiplexers [Chang et al.
2004].


The logic equations for outputs of exact 5-2 compressor are as follows:
Sum = X1 ⊕ X2 ⊕ X3 ⊕ X4 ⊕ X5 ⊕ Cin1 ⊕ Cin2 (10)
Cout1 = (X1 + X2)(X3 + X4) (11)

Cout2 = ((((X1 ⊕ X2 ⊕ X3 ⊕ X4) (X1X2)) + ((X1 ⊕ X2 ⊕ X3 ⊕ X4)Cin1)) (12)
  
Carry = ( X1 ⊕ X2 ⊕ X3 ⊕ X4 ⊕ Cin1) X5 + (( X1 ⊕ X2 ⊕ X3 ⊕ X4 ⊕ X5 ⊕ Cin1) Cin2)
(13)
2.2.1. Proposed Approximate 5-2 Compressor1. The logic equations given below are used
for designing the approximate 5-2 compressor1, and the corresponding truth table is
given in Table III. The proposed design produces seven errors, as specified by the
difference column of the truth table.
The logic equations for the outputs of approximate 5-2 compressor1 are as follows:
Sum1 = ((X2X3)X5 ) + (X2X3 X5) + (X1 X2 X3X4 X5 ) + (X1 X2 X3X4X5)
+ (X1X2 X3 X4 X5 ) + (X1X2 X3 X4 X5) + (X1X2X3X4X5) (14)
Cout1 = Cout2 = (( X1 + X2) X3) + ( X1X2) + ( X4 + X5) + ( X4X5) (15)
2.2.2. Proposed Approximate 5-2 Compressor2. The logic equations given below are used
for designing the approximate 5-2 compressor2, and the corresponding truth table is
given in Table III. From Table III, it is observed that the proposed design produces five
errors as specified by the term difference in the truth table.
The logic equations for the outputs of the proposed approximate 5-2 compressor2 are
as follows:
Sum2 = X2X3 + X1 X2X3X4 + X3X4X5X1 + X1X2X4X5 + X1X2 X3X4
+ X1X3X4 X5 (16)
Carry2 = ((X1 + X2)X3 + X1X2 + (X4 + X5) + X4X5 (17)

3. DADDA MULTIPLIERS
In a conventional parallel multiplier, partial products are generated by multiplying
the multiplicand with each bit of the multiplier. These partial products are then
added together to generate the product. The multiplication process is thus divided into
two parts, partial product generation and partial product accumulation.
The number of partial products to be added plays an important role in determining the
performance of a parallel multiplier.
The key objective is to design and implement a multiplier with a focus on
decreasing the power consumption and minimizing the overall delay. These parameters
trade off against each other, and improving one comes at the cost of the other.
The two well-known fast multipliers presented by Wallace and Dadda [Wallace et al.
1964; Dadda et al. 1965] consist of three stages. In the first stage, a partial product
matrix is formed. In the second stage, the partial product matrix is reduced to a height
of two. In the final stage, these two rows are combined using an adder.
The dot diagram shown in Figure 3 represents the conventional Dadda algorithm
implemented for an 8×8-bit multiplier. Four reduction levels are required, with matrix
heights of 6, 4, 3, and 2. Two dots joined by a diagonal line indicate that these dots
are the outputs of a 3-2 compressor. Similarly, two dots joined by a crossed diagonal
indicate that these dots are the outputs of a 2-2 compressor. Sixty-four AND gates,
thirty-five 3:2 compressors, seven 2:2 compressors, and a 14-bit carry-propagating
adder are required to form the 16-bit product.
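The reduction schedule and component counts quoted above can be reproduced with a small column-height simulation. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, the target heights follow the standard Dadda rule (2, 3, 4, 6, ..., each the floor of 1.5 times the previous), and each column uses only as many 3:2 compressors (full adders) and 2:2 compressors (half adders) as needed to reach the target height.

```python
def dadda_targets(n):
    """Target matrix heights for an n x n Dadda reduction, largest first."""
    targets = [2]
    while targets[-1] * 3 // 2 < n:
        targets.append(targets[-1] * 3 // 2)
    return targets[::-1]


def dadda_reduce(n):
    """Simulate column heights only; returns (full_adders, half_adders, heights)."""
    # Column i of the partial product matrix initially holds min(i+1, 2n-1-i) dots.
    heights = [min(i + 1, 2 * n - 1 - i) for i in range(2 * n - 1)]
    full = half = 0
    for target in dadda_targets(n):
        carries = 0          # carries arriving from the previous column
        reduced = []
        for h in heights:
            h += carries
            carries = 0
            while h > target:
                if h - target >= 2:
                    full += 1    # 3:2 compressor: column height drops by 2
                    h -= 2
                else:
                    half += 1    # 2:2 compressor: column height drops by 1
                    h -= 1
                carries += 1     # each compressor sends one carry to the next column
            reduced.append(h)
        if carries:
            reduced.append(carries)
        heights = reduced
    return full, half, heights


if __name__ == "__main__":
    full, half, heights = dadda_reduce(8)
    print(dadda_targets(8), full, half, heights.count(2))
```

For n = 8 the simulation yields the heights 6, 4, 3, 2, then 35 full adders, 7 half adders, and 14 columns of height two, which corresponds to the 14-bit carry-propagating adder.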


Table III. Truth Table for Exact and Proposed Approximate 5-2 Compressors
Column groups: inputs (Cin1, Cin2, X5, X4, X3, X2, X1); outputs of the exact 5-2 compressor (Sum, Carry),
proposed approximate 5-2 compressor1 (Sum1, Cout1), and proposed approximate 5-2 compressor2 (Sum2, Carry2);
difference (number of errors) for the exact compressor, compressor1, and compressor2.
Cin1 Cin2 X5 X4 X3 X2 X1 Sum Carry Sum1 Cout1 Sum2 Carry2 Exact Compressor1 Compressor2
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
0 0 0 0 0 0 1 1 0 1 0 1 1 0 0 0
0 0 0 0 0 1 0 1 0 1 1 1 1 0 0 0
0 0 0 0 0 1 1 0 1 0 1 0 0 0 0 0
0 0 0 0 1 0 0 1 0 0 1 1 1 0 1 1
0 0 0 0 1 0 1 0 1 0 0 0 0 0 0 0
0 0 0 0 1 1 0 0 0 0 1 0 0 0 0 0
0 0 0 0 1 1 1 1 0 1 0 1 1 0 0 0
0 0 0 1 0 0 0 1 0 1 0 0 1 0 1 1
0 0 0 1 0 0 1 0 1 0 0 0 0 0 0 0
0 0 0 1 0 1 0 0 0 0 1 1 0 0 0 0
0 0 0 1 0 1 1 1 0 1 1 1 1 0 0 0
0 0 0 1 1 0 0 0 0 0 0 1 0 0 0 0
0 0 0 1 1 0 1 1 0 1 1 1 1 0 1 0
0 0 0 1 1 1 0 1 0 1 1 0 1 0 0 0
0 0 0 1 1 1 1 0 1 0 0 1 0 0 0 0
0 1 0 0 0 0 0 1 0 1 1 0 1 0 0 0
0 1 0 0 0 0 1 0 1 0 1 0 0 0 0 0
0 1 0 0 0 1 0 0 1 0 1 0 0 0 0 0
0 1 0 0 0 1 1 1 0 0 0 0 1 0 1 1
0 1 0 0 1 0 0 0 1 0 0 1 0 0 0 0
0 1 0 0 1 0 1 1 1 1 1 1 1 0 0 0
0 1 0 0 1 1 0 1 0 0 0 0 1 0 1 1
0 1 0 0 1 1 1 0 1 0 1 0 0 0 0 0
0 1 0 1 0 0 0 0 0 0 1 1 0 0 0 0
0 1 0 1 0 0 1 1 1 1 0 1 1 0 0 0
0 1 0 1 0 1 0 1 0 1 1 0 1 0 1 1
0 1 0 1 0 1 1 0 1 0 0 1 0 0 0 0
0 1 0 1 1 0 0 1 1 1 1 1 1 0 0 0
0 1 0 1 1 0 1 0 0 0 0 0 0 0 0 0
0 1 0 1 1 1 0 0 1 0 1 0 0 0 0 0
0 1 0 1 1 1 1 1 1 1 1 1 1 0 0 0
1 0 0 0 0 0 0 1 1 – – – – 0 – –
1 0 0 0 0 0 1 0 1 – – – – 0 – –
1 0 0 0 0 1 0 0 0 – – – – 0 – –
1 0 0 0 0 1 1 1 1 – – – – 0 – –
1 0 0 0 1 0 0 0 1 – – – – 0 – –
1 0 0 0 1 0 1 1 1 – – – – 0 – –
1 0 0 0 1 1 0 1 0 – – – – 0 – –
1 0 0 0 1 1 1 0 1 – – – – 0 – –
1 0 0 1 0 0 0 0 0 – – – – 0 – –
1 0 0 1 0 0 1 1 1 – – – – 0 – –
1 0 0 1 0 1 0 1 1 – – – – 0 – –
1 0 0 1 0 1 1 0 1 – – – – 0 – –
1 0 0 1 1 0 0 1 1 – – – – 0 – –
1 0 0 1 1 0 1 0 0 – – – – 0 – –
1 0 0 1 1 1 0 0 0 – – – – 0 – –
1 0 0 1 1 1 1 1 1 – – – – 0 – –
1 1 0 0 0 0 0 0 0 – – – – 0 – –
1 1 0 0 0 0 1 1 1 – – – – 0 – –
1 1 0 0 0 1 0 1 1 – – – – 0 – –
1 1 0 0 0 1 1 0 0 – – – – 0 – –
1 1 0 0 1 0 0 1 1 – – – – 0 – –
1 1 0 0 1 0 1 0 1 – – – – 0 – –
1 1 0 0 1 1 0 0 0 – – – – 0 – –
1 1 0 0 1 1 1 1 1 – – – – 0 – –
1 1 0 1 0 0 0 1 1 – – – – 0 – –
1 1 0 1 0 0 1 0 1 – – – – 0 – –
1 1 0 1 0 1 0 0 0 – – – – 0 – –
1 1 0 1 0 1 1 1 1 – – – – 0 – –
1 1 0 1 1 0 0 0 0 – – – – 0 – –
1 1 0 1 1 0 1 1 1 – – – – 0 – –
1 1 0 1 1 1 0 1 1 – – – – 0 – –
1 1 0 1 1 1 1 0 0 – – – – 0 – –


Fig. 3. Conventional 8×8 Dadda multiplier.

3.1. Existing Approximate 8×8 Dadda Multipliers


Approximate 4-2 compressors are used to increase the performance of Dadda multipliers.
Two approximate 8×8 Dadda multipliers are designed based on the approximate
4-2 compressor1 and approximate 4-2 compressor2. The design of these approximate 8×8
Dadda multipliers, whose reduction circuitry uses approximate 4-2 compressor1
and approximate 4-2 compressor2, is given in Momeni et al. [2015]. Approximate 8×8
Dadda multiplier2 uses the approximate compressor2 design and therefore
involves fewer stages than
approximate 8×8 Dadda multiplier1.

3.2. Proposed Approximate 8×8 Dadda Multipliers


The conventional 8×8 Dadda multiplier design involves exact 4-2 and 5-2 compressors.
In this section, various approximate 8×8 Dadda multipliers are designed based on the
proposed approximate 4-2 compressor3, compressor4, and compressor5. These are named
proposed approximate 8×8 Dadda multiplier3, multiplier4, and multiplier5. In addition,
approximate 8×8 Dadda multiplier6 is designed based on the proposed approximate
4-2 compressor3 and the proposed approximate 5-2 compressor1, and approximate 8×8 Dadda
multiplier7 is designed based on the proposed approximate 4-2 compressor3 and the proposed
approximate 5-2 compressor2. Figure 4 shows the reduction circuitry of the proposed approximate
8×8 Dadda multiplier3, multiplier4, and multiplier5. This reduction circuit uses
approximate 4-2 compressors at stage 1 (d = 4) and at stage 2 (d = 2),
which gives better performance than the
existing approximate 8×8 Dadda multipliers. Figure 5 shows the reduction circuitry of
the proposed approximate 8×8 Dadda multiplier6 and multiplier7. This reduction circuit
uses the proposed approximate 4-2 compressor3 and the proposed 5-2 compressors
at stage 1 (d = 3) and the proposed approximate 4-2 compressor3
at stage 2 (d = 2). It gives better performance
than the conventional 8×8 Dadda multiplier based on exact 4-2 and 5-2
compressors.

3.3. Design Features of Approximate 8×8 Dadda Multipliers


Table IV lists the features of the existing and proposed approximate 8×8 Dadda
multipliers. In any digital circuit design, the Most Significant Bits (MSBs) carry more
weight than the Least Significant Bits (LSBs), so errors in the MSBs affect accuracy the most.
Approximate computing-based designs involve errors. To improve the accuracy of the proposed
approximate 8×8 Dadda multipliers, the proposed approximate compressors are used

Fig. 4. The reduction circuitry approach of proposed approximate 8×8 Dadda multiplier3, multiplier4, and
multiplier5.

Fig. 5. Reduction circuitry approach of proposed approximate 8×8 Dadda multiplier6 and multiplier7.

at the LSBs and exact compressors at the MSBs. As discussed in Section 2, the proposed
4-2 compressor5 has the fewest errors. Among all existing and proposed approximate 8×8
Dadda multiplier designs, the proposed approximate 8×8 Dadda multiplier5 design
has the fewest errors, as it uses the proposed approximate 4-2 compressor5.

4. RESULTS AND DISCUSSIONS


In this section, the proposed 4-2 compressors, 5-2 compressors, and
approximate 8×8 Dadda multipliers explained in Sections 2 and 3 are evaluated. The
proposed designs are implemented in Verilog Hardware Description Language (HDL)
and synthesized in the Cadence Register Transfer Level (RTL) Compiler. For analysing
the power, area, and delay, the proposed designs are synthesized in 180nm, 90nm, and
45nm technologies using the Cadence RTL Compiler. Scaling down of technology to
nanometer dimensions has provided many advantages to circuit designers with
respect to area, power, and speed optimisation, leading to advancements in chip
design. Scaling realizes the following advantages: reduced capacitances due to the


Table IV. Approximate 8×8 Dadda Multipliers and Their Features


Design: Feature
Approximate 8×8 Dadda multiplier1 [Momeni et al. 2015]: Approximate 4-2 compressor1 in LSBs and exact
4-2 compressor in MSBs at stages d = 4 and d = 2.
Approximate 8×8 Dadda multiplier2 [Momeni et al. 2015]: Approximate 4-2 compressor2 in LSBs and exact
4-2 compressor in MSBs at stages d = 4 and d = 2.
Proposed approximate 8×8 Dadda multiplier3: Proposed approximate 4-2 compressor3 in LSBs and
exact 4-2 compressor in MSBs at stages d = 4 and d = 2.
Proposed approximate 8×8 Dadda multiplier4: Proposed approximate 4-2 compressor4 in LSBs and
exact 4-2 compressor in MSBs at stages d = 4 and d = 2.
Proposed approximate 8×8 Dadda multiplier5: Proposed approximate 4-2 compressor5 in LSBs and
exact 4-2 compressor in MSBs at stages d = 4 and d = 2.
Proposed approximate 8×8 Dadda multiplier6: Proposed approximate 4-2 compressor3 in the least
LSB, proposed approximate 5-2 compressor1 in other LSBs, an exact 5-2 compressor in MSBs at
stage d = 3. Proposed approximate compressor3 in LSBs and exact 4-2 compressor in MSBs at stage d = 2.
Proposed approximate 8×8 Dadda multiplier7: Proposed approximate 4-2 compressor3 in the least
LSB, proposed approximate 5-2 compressor2 in other LSBs, an exact 5-2 compressor in MSBs at
stage d = 3. Proposed approximate compressor3 in LSBs and exact 4-2 compressor in MSBs at stage d = 2.

Table V. Comparison of Various Approximate 4-2 Compressors


Design Power (nanowatts) Delay (ps) Area Number of Errors
@180 nm (Voltage = 0.9V, Operating Frequency = 1GHz)
Exact 4-2 compressor [Momeni et al. 2015] 10,676 1,445 156 0
Approximate 4-2 compressor1 [Momeni et al. 2015] 2,838 426 77 12
Approximate 4-2 compressor2 [Momeni et al. 2015] 3,124 362 83 4
Proposed approximate 4-2 compressor3 2,695 442 73 3
Proposed approximate 4-2 compressor4 4,444 899 110 2
Proposed approximate 4-2 compressor5 6,581 737 173 1
@90 nm (Voltage = 1V, Operating Frequency = 1GHz)
Exact 4-2 compressor [Momeni et al. 2015] 3,903 223 32 0
Approximate 4-2 compressor1 [Momeni et al. 2015] 902 104 20 12
Approximate 4-2 compressor2 [Momeni et al. 2015] 1163 88 22 4
Proposed approximate 4-2 compressor3 745 116 18 3
Proposed approximate 4-2 compressor4 1,674 242 29 2
Proposed approximate 4-2 compressor5 1,783 157 39 1
@45 nm (Voltage = 1.1 V, Operating Frequency = 1GHz)
Exact 4-2 compressor [Momeni et al. 2015] 636 145 14 0
Approximate 4-2 compressor1 [Momeni et al. 2015] 213 68 9 12
Approximate 4-2 compressor2 [Momeni et al. 2015] 219 60 9 4
Proposed approximate 4-2 compressor3 170 59 7 3
Proposed approximate 4-2 compressor4 280 97 11 2
Proposed approximate 4-2 compressor5 439 105 18 1

smaller devices, which reduce the switching power, and smaller gate oxide thickness, which improves
gate control. Smaller technology nodes offer considerable benefits for high-speed VLSI
circuits. Tables V, VI, VII, and VIII compare the design parameters of the existing and
proposed approximate designs at the three technology nodes.


Table VI. Comparison of Various Approximate 5-2 Compressors


Design Power (nanowatts) Delay (ps) Area Number of Errors
@180 nm (Voltage = 0.9V, Operating Frequency = 1GHz)
Exact 5-2 compressor [Chang et al. 2004]. 17,776 1,630 263 0
Proposed approximate 5-2 compressor1 6,050 473 173 7
Proposed approximate 5-2 compressor2 5,519 653 153 5
@90 nm (Voltage = 1V, Operating Frequency = 1GHz)
Exact 5-2 compressor [Chang et al. 2004]. 5,180.8 395 58 0
Proposed approximate 5-2 compressor1 1,743.8 159 43 7
Proposed approximate 5-2 compressor2 1,718.5 189 44 5
@45 nm (Voltage = 1.1 V, Operating Frequency = 1GHz)
Exact 5-2 compressor [Chang et al. 2004]. 1,213.5 185 25 0
Proposed approximate 5-2 compressor1 432.6 78 17 7
Proposed approximate 5-2 compressor2 345.9 88 18 5

4.1. Approximate 4-2 Compressors


Table V compares the power, delay, area, and number of errors of the
exact 4-2 compressor, the existing approximate 4-2 compressors, and the proposed
approximate 4-2 compressors. Among the existing designs, approximate 4-2 compressor1 gives
the best results in terms of power consumption and area, and approximate 4-2
compressor2 has the lowest delay compared to the exact 4-2 compressor
and approximate 4-2 compressor1. Among all proposed approximate 4-2 compressors,
proposed approximate 4-2 compressor3 has the lowest power
consumption and area; its values are also below those of the existing approximate
compressors at all three technology nodes. The lower power consumption and area result
from the logic simplification performed in the proposed approximate circuits. In terms of
accuracy, the proposed approximate 4-2 compressor5 design has the fewest errors (one)
among the proposed and existing approximate 4-2
compressors.
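To make the comparison concrete, the relative savings implied by Table V can be computed directly from its entries. The sketch below is illustrative: the dictionary and function names are hypothetical, and the figures are the 45 nm rows of Table V, given as (power in nanowatts, delay in ps, area). Negative values indicate that a design is worse than the exact compressor on that metric.

```python
# 45 nm rows of Table V: (power in nanowatts, delay in ps, area).
TABLE_V_45NM = {
    "exact": (636, 145, 14),
    "compressor1": (213, 68, 9),
    "compressor2": (219, 60, 9),
    "compressor3": (170, 59, 7),
    "compressor4": (280, 97, 11),
    "compressor5": (439, 105, 18),
}


def savings(design, baseline="exact"):
    """Percentage reduction in (power, delay, area) relative to the baseline."""
    return tuple(
        round(100 * (base - value) / base, 1)
        for value, base in zip(TABLE_V_45NM[design], TABLE_V_45NM[baseline])
    )


if __name__ == "__main__":
    for name in TABLE_V_45NM:
        if name != "exact":
            print(name, savings(name))
```

For example, at 45 nm the proposed approximate 4-2 compressor3 uses about 73.3% less power, has about 59.3% less delay, and occupies 50% less area than the exact 4-2 compressor.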

4.2. Approximate 5-2 Compressors


Table VI compares the power, delay, area, and number of errors of the
exact 5-2 compressor and the proposed approximate 5-2 compressors. From Table VI,
the proposed approximate 5-2 compressor2 design has the lowest power consumption
at all three technology nodes and the lowest area at 180 nm; at 90 nm and 45 nm its area
is within one unit of that of compressor1. The proposed approximate 5-2 compressor1
design has the lowest delay compared to the exact 5-2 compressor and the proposed
approximate 5-2 compressor2. The lower power consumption and area relative to the exact
design result from the logic simplification performed in the proposed approximate circuits.
In terms of accuracy, the proposed approximate 5-2 compressor2 design has fewer
errors (five) than the proposed approximate 5-2 compressor1 (seven).

4.3. 4-2 Compressor-Based Approximate 8x8 Dadda Multipliers


Table VII compares the power, delay, area, and number of errors of the
conventional 8×8 Dadda multiplier, the existing approximate 8×8 Dadda multipliers
given in Momeni et al. [2015], and the proposed approximate 8×8 Dadda multipliers
based on the proposed approximate 4-2 compressors. Most of the previous approaches
are included in the comparison; some are omitted because of their differing bit widths and
design complexity. From Table VII, it is observed that the approximate 8×8 Dadda
multiplier1 design provides the best results in terms of power consumption and area.
The approximate 8×8 Dadda multiplier2 design has the lowest delay compared to
the conventional 8×8 Dadda multiplier and approximate 8×8 Dadda multiplier1. Among the three


Table VII. Comparison of Several 4-2 Compressor-Based Approximate 8×8 Dadda Multipliers

| Design | Power (nanowatts) | Delay (ps) | Area | Number of Errors |
|---|---|---|---|---|
| @180 nm (Voltage = 0.9 V, Operating Frequency = 1 GHz) | | | | |
| 2×2-bit UDM [Kulkarni et al. 2011] | 52,000 | 710 | 52 | 1 |
| 4×4 ICM [Lin et al. 2013] | 230,000 | 1,990 | 130 | 1 |
| Conventional 8×8 Dadda multiplier based on exact 4-2 and 5-2 compressors | 48,083 | 3,400 | 1,785 | 0 |
| Approximate 8×8 Dadda multiplier1 [Momeni et al. 2015] | 44,725 | 2,300 | 1,529 | 108 |
| Approximate 8×8 Dadda multiplier2 [Momeni et al. 2015] | 46,311 | 2,100 | 1,538 | 36 |
| Proposed approximate 8×8 Dadda multiplier3 | 43,212 | 2,500 | 1,506 | 27 |
| Proposed approximate 8×8 Dadda multiplier4 | 45,064 | 4,100 | 1,656 | 18 |
| Proposed approximate 8×8 Dadda multiplier5 | 47,165 | 3,750 | 1,890 | 9 |
| @90 nm (Voltage = 1 V, Operating Frequency = 1 GHz) | | | | |
| 2×2-bit UDM [Kulkarni et al. 2011] | 19,240 | 262 | 25 | 1 |
| 4×4 ICM [Lin et al. 2013] | 85,100 | 990 | 65 | 1 |
| Conventional 8×8 Dadda multiplier based on exact 4-2 and 5-2 compressors | 17,791 | 1,688 | 862 | 0 |
| Approximate 8×8 Dadda multiplier1 [Momeni et al. 2015] | 16,548 | 1,035 | 820 | 108 |
| Approximate 8×8 Dadda multiplier2 [Momeni et al. 2015] | 17,135 | 1,419 | 812 | 36 |
| Proposed approximate 8×8 Dadda multiplier3 | 15,988 | 1,432 | 726 | 27 |
| Proposed approximate 8×8 Dadda multiplier4 | 16,674 | 1,516 | 749 | 18 |
| Proposed approximate 8×8 Dadda multiplier5 | 18,309 | 1,464 | 767 | 9 |
| @45 nm (Voltage = 1.1 V, Operating Frequency = 1 GHz) | | | | |
| 2×2-bit UDM [Kulkarni et al. 2011] | 3,078 | 183 | 14 | 1 |
| 4×4 ICM [Lin et al. 2013] | 13,616 | 495 | 33 | 1 |
| Conventional 8×8 Dadda multiplier based on exact 4-2 and 5-2 compressors | 2,847 | 798 | 388 | 0 |
| Approximate 8×8 Dadda multiplier1 [Momeni et al. 2015] | 2,648 | 717 | 369 | 108 |
| Approximate 8×8 Dadda multiplier2 [Momeni et al. 2015] | 2,742 | 709 | 365 | 36 |
| Proposed approximate 8×8 Dadda multiplier3 | 2,558 | 682 | 645 | 27 |
| Proposed approximate 8×8 Dadda multiplier4 | 2,668 | 748 | 741 | 18 |
| Proposed approximate 8×8 Dadda multiplier5 | 2,929 | 753 | 757 | 9 |

proposed approximate 8×8 Dadda multiplier designs, the proposed approximate 8×8
Dadda multiplier3 design achieves the lowest power consumption and area, owing to its
use of the proposed approximate 4-2 compressor3. In terms of accuracy, the proposed
approximate 8×8 Dadda multiplier5 design has the fewest errors among the proposed
and existing approximate 8×8 Dadda multipliers.
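
The relative savings implied by Table VII can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming the 180 nm rows of Table VII as input; the names `TABLE_VII_180NM`, `percent_reduction`, and `savings` are illustrative and do not come from the original work.

```python
# Values copied from the 180 nm rows of Table VII:
# (power in nanowatts, delay in ps, area).
TABLE_VII_180NM = {
    "conventional": (48083, 3400, 1785),
    "multiplier3": (43212, 2500, 1506),
    "multiplier4": (45064, 4100, 1656),
    "multiplier5": (47165, 3750, 1890),
}


def percent_reduction(baseline, value):
    """Percentage reduction of value relative to baseline.

    A negative result means the value is larger than the baseline.
    """
    return 100.0 * (baseline - value) / baseline


def savings(design, table=TABLE_VII_180NM, reference="conventional"):
    """Return (power, delay, area) reductions in percent, rounded to 2 places."""
    base = table[reference]
    return tuple(
        round(percent_reduction(b, v), 2) for b, v in zip(base, table[design])
    )


if __name__ == "__main__":
    for name in ("multiplier3", "multiplier4", "multiplier5"):
        power, delay, area = savings(name)
        print(f"{name}: power {power}%, delay {delay}%, area {area}%")
```

Under these inputs, the proposed approximate 8×8 Dadda multiplier3 shows a positive reduction in all three quantities relative to the conventional design at 180 nm, whereas multiplier5 gains accuracy at the cost of a larger area and delay.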

4.4. 5-2 Compressor-Based Approximate 8×8 Dadda Multipliers


Table VIII compares the power, delay, area, and number of errors of the 8×8 Dadda
multiplier based on the exact 5-2 compressor and the proposed


Table VIII. Comparison of Various 5-2 Compressor-Based Approximate 8×8 Dadda Multipliers

| Design | Power (nanowatts) | Delay (ps) | Area | Number of Errors |
|---|---|---|---|---|
| @180 nm (Voltage = 0.9 V, Operating Frequency = 1 GHz) | | | | |
| Conventional 8×8 Dadda multiplier based on exact 4-2 and 5-2 compressors | 48,083 | 3,400 | 1,785 | 0 |
| Proposed approximate 8×8 Dadda multiplier6 | 47,034 | 3,512 | 765 | 46 |
| Proposed approximate 8×8 Dadda multiplier7 | 45,481 | 3,790 | 731 | 38 |
| @90 nm (Voltage = 1 V, Operating Frequency = 1 GHz) | | | | |
| Conventional 8×8 Dadda multiplier based on exact 4-2 and 5-2 compressors | 17,791 | 1,688 | 862 | 0 |
| Proposed approximate 8×8 Dadda multiplier6 | 13,289 | 1,891 | 443 | 46 |
| Proposed approximate 8×8 Dadda multiplier7 | 12,871 | 1,712 | 412 | 38 |
| @45 nm (Voltage = 1.1 V, Operating Frequency = 1 GHz) | | | | |
| Conventional 8×8 Dadda multiplier based on exact 4-2 and 5-2 compressors | 2,847 | 798 | 388 | 0 |
| Proposed approximate 8×8 Dadda multiplier6 | 2,615 | 987 | 275 | 46 |
| Proposed approximate 8×8 Dadda multiplier7 | 2,472 | 912 | 247 | 38 |

approximate 8×8 Dadda multipliers based on the proposed approximate 4-2 compressor3,
approximate 5-2 compressor1, and approximate 5-2 compressor2. From Table VIII, it
is observed that the proposed approximate 8×8 Dadda multiplier7 design gives the
best results in terms of power consumption and area, since this design uses the
proposed approximate 5-2 compressor2. The proposed approximate 8×8 Dadda
multiplier6 design has a lower delay than the proposed approximate 8×8 Dadda
multiplier7 at 180 nm, although neither design is faster than the conventional 8×8
Dadda multiplier in Table VIII. In terms of accuracy, the proposed approximate 8×8
Dadda multiplier7 design has fewer errors than the proposed approximate 8×8 Dadda
multiplier6.

4.5. Error Metrics


The Error Distance (ED), Error Rate (ER), Pass Rate (PR), Mean Error Distance (MED),
and Normalized Error Distance (NED) [Liang et al. 2013] are used as accuracy metrics
for the approximate 4-2 and 5-2 compressors and the approximate 8×8 Dadda
multipliers. Table IX compares the ED, ER, PR, MED, and NED of these designs. The
Root Mean Square Error (RMSE) is related to the MED, and the MED is an important
metric for approximate computing design; hence, the MED is computed for the proposed
designs. The NED, defined as the ratio of the MED to the maximum possible error, is
a comprehensive metric for comparing accuracy across different designs. The ED, ER,
MED, and NED of the proposed approximate 4-2 compressor5 and the proposed
approximate 8×8 Dadda multiplier5 are low relative to the other approximate 4-2
compressors and the other 4-2 compressor-based approximate 8×8 Dadda multipliers,
at the cost of a larger area. The approximate 4-2 compressor1 and approximate 8×8
Dadda multiplier1 designs have the highest error metrics among the approximate 4-2
compressors and approximate 8×8 Dadda multipliers.
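
The metric definitions above can be illustrated by exhaustive enumeration over all input combinations. The following is a minimal sketch under stated assumptions: `toy_approx_sum` is a hypothetical four-input approximation that saturates its output at 3 (it is not one of the proposed compressors), and the normalizing constant for the NED is assumed to be the largest exact output value.

```python
from itertools import product


def exact_sum(bits):
    """Exact arithmetic value of the inputs: the number of ones."""
    return sum(bits)


def toy_approx_sum(bits):
    """Hypothetical approximation with two output bits (sum + 2*carry).

    The value saturates at 3, so only the all-ones input is wrong.
    """
    return min(sum(bits), 3)


def error_metrics(exact, approx, n_inputs, max_error):
    """Enumerate all 2**n_inputs input patterns and compute the metrics.

    max_error is the normalizing constant used for the NED (an assumption
    of this sketch: the largest exact output value).
    """
    distances = [
        abs(exact(bits) - approx(bits))
        for bits in product((0, 1), repeat=n_inputs)
    ]
    total = len(distances)
    error_rate = sum(1 for d in distances if d != 0) / total
    mean_ed = sum(distances) / total
    return {
        "total_ED": sum(distances),
        "ER": error_rate,
        "PR": 1 - error_rate,
        "MED": mean_ed,
        "NED": mean_ed / max_error,
    }


if __name__ == "__main__":
    print(error_metrics(exact_sum, toy_approx_sum, n_inputs=4, max_error=4))
```

The same function can be applied to an 8×8 multiplier model by enumerating all operand pairs instead of bit patterns; only the input generator and the normalizing constant change.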


Table IX. Comparison of the Error Metrics of Approximate Compressors and Approximate 8×8 Dadda Multipliers

| Design | Error Distance | Error Rate | Pass Rate | MED | NED |
|---|---|---|---|---|---|
| Approximate 4-2 compressor1 [Momeni et al. 2015] | 12 | 0.375 | 0.625 | 0.34 | 0.024 |
| Approximate 4-2 compressor2 [Momeni et al. 2015] | 4 | 0.25 | 0.75 | 0.19 | 0.016 |
| Proposed approximate 4-2 compressor3 | 3 | 0.187 | 0.812 | 0.17 | 0.014 |
| Proposed approximate 4-2 compressor4 | 2 | 0.125 | 0.875 | 0.13 | 0.012 |
| Proposed approximate 4-2 compressor5 | 1 | 0.0625 | 0.937 | 0.09 | 0.01 |
| Proposed approximate 5-2 compressor1 | 7 | 0.109 | 0.89 | 0.9 | 0.021 |
| Proposed approximate 5-2 compressor2 | 5 | 0.078 | 0.921 | 0.08 | 0.0091 |
| Approximate 8×8 Dadda multiplier1 [Momeni et al. 2015] | 204 | 6.375 | 10.625 | 3.25 | 0.342 |
| Approximate 8×8 Dadda multiplier2 [Momeni et al. 2015] | 68 | 4.25 | 11 | 1.67 | 0.265 |
| Proposed approximate 8×8 Dadda multiplier3 | 51 | 3.187 | 12 | 1.23 | 0.199 |
| Proposed approximate 8×8 Dadda multiplier4 | 34 | 2.125 | 13 | 0.83 | 0.132 |
| Proposed approximate 8×8 Dadda multiplier5 | 17 | 1.0625 | 14 | 0.62 | 0.066 |
| Proposed approximate 8×8 Dadda multiplier6 | 49 | 0.52 | 15 | 0.28 | 0.029 |
| Proposed approximate 8×8 Dadda multiplier7 | 35 | 0.24 | 16 | 0.21 | 0.021 |

Table X. Average NED Values for Image Sharpening

| Design | Average NED (×10⁻²) |
|---|---|
| Approximate 8×8 Dadda multiplier1 [Momeni et al. 2015] | 0.096 |
| Approximate 8×8 Dadda multiplier2 [Momeni et al. 2015] | 0.082 |
| Proposed approximate 8×8 Dadda multiplier3 | 0.071 |
| Proposed approximate 8×8 Dadda multiplier4 | 0.053 |
| Proposed approximate 8×8 Dadda multiplier5 | 0.024 |
| Proposed approximate 8×8 Dadda multiplier6 | 0.187 |
| Proposed approximate 8×8 Dadda multiplier7 | 0.134 |

4.6. Application
In this section, the application of the proposed approximate 8×8 Dadda multipliers to
image processing is illustrated. An image sharpening approach [Burger et al. 2009] is
chosen to analyze the quality of the 8×8 Dadda multipliers. The peak signal-to-noise
ratio (PSNR), based on the mean squared error (MSE), is computed to assess the quality
of the output image. Figure 6 shows the performance of the exact, existing, and
proposed approximate 8×8 Dadda multipliers in terms of PSNR. The output image of each
approximate 8×8 Dadda multiplier is compared with the output image of the
conventional 8×8 Dadda multiplier. From Figure 6, it is observed that the proposed
approximate 8×8 Dadda multiplier5 has a higher PSNR than the other approximate 8×8
Dadda multipliers. The average NED values for image sharpening are given in Table X.
From Table X, it is observed that the proposed approximate 8×8 Dadda multiplier5 has
the lowest average NED among the approximate 8×8 Dadda multipliers. As discussed
previously, among the proposed approximate 4-2 and 5-2 compressors, the approximate
4-2 compressor5 has the smallest probability of error. Therefore, the image quality
of the proposed approximate 8×8 Dadda multiplier5 is higher than that of the other
approximate 8×8 Dadda multipliers.
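
The PSNR computation used for this comparison can be sketched as follows. This is a minimal illustration, assuming 8-bit grayscale images stored as nested lists and a peak value of 255; the function names `mse` and `psnr` and the sample pixel values are hypothetical and are not taken from the original experiments.

```python
import math


def mse(reference, test):
    """Mean squared error between two equally sized grayscale images."""
    total = 0
    count = 0
    for ref_row, test_row in zip(reference, test):
        for ref_pixel, test_pixel in zip(ref_row, test_row):
            total += (ref_pixel - test_pixel) ** 2
            count += 1
    return total / count


def psnr(reference, test, peak=255):
    """Peak signal-to-noise ratio in dB; infinite when the images match."""
    error = mse(reference, test)
    if error == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / error)


if __name__ == "__main__":
    # Reference: output of an exact multiplier; test: output of an
    # approximate one (tiny made-up 2x2 images for illustration only).
    exact_image = [[0, 255], [128, 64]]
    approx_image = [[0, 255], [128, 60]]
    print(f"MSE  = {mse(exact_image, approx_image)}")
    print(f"PSNR = {psnr(exact_image, approx_image):.2f} dB")
```

Here the image produced with the conventional multiplier plays the role of the reference, so a higher PSNR indicates that the approximate multiplier's output is closer to the exact result.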


Fig. 6. Output images of size 512×512. (a) Original image. (b) Conventional 8×8 Dadda multiplier. (c)
Approximate 8×8 Dadda multiplier1. (d) Approximate 8×8 Dadda multiplier2. (e) Proposed approximate
8×8 Dadda multiplier3. (f) Proposed approximate 8×8 Dadda multiplier4. (g) Proposed approximate 8×8
Dadda multiplier5. (h) Proposed approximate 8×8 Dadda multiplier6. (i) Proposed approximate 8×8 Dadda
multiplier7.

For audio processing, where the output quality required for images is not needed
[Han et al. 2013], these designs are even more suitable.

5. CONCLUSIONS
In this article, three novel approximate 4-2 compressors, three approximate 8×8
Dadda multipliers based on these approximate 4-2 compressors, two approximate 5-2
compressors, and two approximate 8×8 Dadda multipliers based on approximate 4-2
compressor3 and the approximate 5-2 compressors have been proposed. The proposed
approximate 4-2 compressors give better results in power, delay, area, and accuracy
than the approximate 4-2 compressors presented in Momeni et al. [2015]. In addition,
the proposed approximate 5-2 compressors give better results in power, delay, and
area than an exact 5-2 compressor. Among all approximate 4-2 and 5-2 compressors, the
proposed approximate 4-2 compressor3 and approximate 5-2 compressor2 are well suited
for designing energy-efficient digital circuits. Thus, an approximate design approach
in compressors offers a significant advantage in terms of both circuit-level and
error metrics. Accordingly, the proposed approximate 8×8 Dadda multiplier3 and
approximate 8×8 Dadda multiplier7 are the most suitable for energy-efficient VLSI
architectures. Future


work should optimize the logic for high-level designs, explore approximate 7-2
compressors to further improve the design metrics, and apply suitable approximate
compressors to DSP applications.

ACKNOWLEDGMENTS
The authors thank the ECE Department of Government College of Technology for providing necessary
support for project implementation.

REFERENCES
K. Bhardwaj, P. S. Mane, and J. Henkel. 2014. Power- and area-efficient approximate Wallace tree multiplier
for error-resilient systems. In Proceedings of the 15th International Symposium on Quality Electronic
Design (ISQED’14). 263–269. DOI:http://dx.doi.org/10.1109/ISQED.2014.6783335
Wilhelm Burger and Mark James Burge. 2009. Principles of Digital Image Processing: Fundamental
Techniques (1st ed.). Springer.
C. H. Chang, J. Gu, and M. Zhang. 2004. Ultra-low-voltage, low-power CMOS 4-2 and 5-2
compressors for fast arithmetic circuits. IEEE Trans. Circ. Syst. 51, 10, 85–97.
DOI:http://dx.doi.org/10.1109/TCSI.2004.835683.
L. Dadda. 1965. Some schemes for parallel multipliers. Alta Freq. 34, 349–356.
Milos Ercegovac and Tomas Lang. 2004. Digital Arithmetic. Morgan Kaufmann.
V. Gupta, D. Mohapatra, S. P. Park, A. Raghunathan, and K. Roy. 2011. IMPACT: Imprecise adders for
low-power approximate computing. In Proceedings of the IEEE/ACM International Symposium on Low-Power
Electronics and Design. DOI:http://dx.doi.org/10.1109/ISLPED.2011.5993675.
Vaibhav Gupta, Debabrata Mohapatra, Anand Raghunathan, and Kaushik Roy. 2013. Low-power digital signal
processing using approximate adders. IEEE Trans. Comput.-Aid. Des. Integr. Circ. Syst. 32, 1, 87–97.
DOI:http://dx.doi.org/10.1109/TCAD.2012.2217962.
J. Han and M. Orshansky. 2013. Approximate computing: An emerging paradigm for energy-efficient
design. In Proceedings of the IEEE European Test Symposium (ETS’13), 1–6. DOI:http://dx.doi.org/10.1109/ETS.2013.6569370.
I. Koren. 1993. Computer Arithmetic Algorithms. Prentice Hall, Englewood Cliffs, NJ.
P. Kulkarni, P. Gupta, and M. Ercegovac. 2011. Trading accuracy for power with an underdesigned
multiplier architecture. In Proceedings of the 24th International Conference on VLSI Design. 346–351.
DOI:http://dx.doi.org/10.1109/VLSID.2011.51
K. Y. Kyaw, W. L. Goh, and K. S. Yeo. 2010. Low-power high-speed multiplier for error-tolerant
application. In Proceedings of the IEEE International Conference on Electron Devices and Solid-State Circuits
(EDSSC’10). 1–4. DOI:http://dx.doi.org/10.1109/EDSSC.2010.5713751.
Chaofan Li, Wei Luo, S. S. Sapatnekar, and Jiang Hu. 2015. Joint precision optimization and high-level
synthesis for approximate computing. In Proceedings of the Annual Design Automation
Conference (DAC’15), 1–6. DOI:http://dx.doi.org/10.1145/2744769.2744863
J. Liang, J. Han, and F. Lombardi. 2013. New metrics for the reliability of approximate and probabilistic
adders. IEEE Trans. Comput. 63, 9, 1760–1771. DOI:http://dx.doi.org/10.1109/TC.2012.146.
C. H. Lin and I. C. Lin. 2013. High accuracy approximate multiplier with error correction. In Proceedings
of the IEEE 31st International Conference on Computer Design (ICCD), 33–38.
DOI:http://dx.doi.org/10.1109/ICCD.2013.6657022
C. Liu, J. Han, and F. Lombardi. 2014. A low-power, high-performance approximate multiplier with
configurable partial error recovery. In Proceedings of DATE 2014, Dresden, Germany.
H. R. Mahdiani, A. Ahmadi, S. M. Fakhraie, and C. Lucas. 2010. Bio-inspired imprecise computational blocks
for efficient VLSI implementation of soft-computing applications. IEEE Trans. Circ. Syst. 57, 850–862.
DOI:http://dx.doi.org/10.1109/TCSI.2009.2027626
A. Momeni, J. Han, P. Montuschi, and F. Lombardi. 2015. Design and analysis of approximate compressors
for multiplication. IEEE Trans. Comput. 984–994. DOI:http://dx.doi.org/10.1109/TC.2014.
Ravi Nair. 2010. Models for energy-efficient approximate computing. In Proceedings of the 16th
ACM/IEEE International Symposium on Low Power Electronics and Design (ISLPED’10), 359–360.
DOI:http://dx.doi.org/10.1145/1840845.1840921.


B. Parhami. 2010. Computer Arithmetic: Algorithms and Hardware Designs (2nd ed.). Oxford University
Press, New York, NY.
Swagath Venkataramani, Srimat T. Chakradhar, Kaushik Roy, and Anand Raghunathan. Approximate
computing and the quest for computing efficiency. In Proceedings of the 50th Annual Design
Automation Conference (DAC’13). DOI:http://dx.doi.org/10.1145/2744769.2751163
C. S. Wallace. 1964. A suggestion for a fast multiplier. IEEE Trans. Electron. Comp. 13, 1, 14–17.
DOI:http://dx.doi.org/10.1109/PGEC.1964.263830

Received December 2015; revised September 2016; accepted October 2016

