"Digit-Recurrence Algorithms For Division and Square Root With Limited Precision Primitives" Literature Survey
LITERATURE SURVEY:
We extend the digit-recurrence algorithm for division (F-DIV) presented in earlier work to
the square-root operation and discuss a combined divide/square-root scheme. The algorithms
in this class are characterized by: digit-by-digit use of the operands, similar to on-line
algorithms; short-precision residuals of no more than 2 radix-r digits, independent of the
step; selection using a short reciprocal and rounding; and compensation, needed because of
the on-line mode, which requires a short-precision sum of digit-by-digit products.
In on-line algorithms the error in the residual due to the incremental use of operands
is compensated in each step by adding a term missed in previous steps. We now introduce an
algorithm for square root which has characteristics similar to F-division. Let s = √c. We
obtain a short-precision approximation a by rounding s to its p most-significant digits. Let b
correspond to the remaining n - p digits of s, so that s = a + b. To compute part b, we apply
the F-division algorithm: since (a + b)² = c gives b = (c - a²)/(2a + b), the dividend in this
case is y = c - a², and the (full-precision) divisor is 2a + b. Since F-DIV uses a short
divisor, it suffices to take d* = 2a. The digits of b are produced in on-line mode, beginning
with the most-significant digit of b. As in F-DIV, the dividend y = c - a² is produced and
applied digit-serially. The first two digits of y are in signed-digit form after the
subtraction of a²; for the rest, y_j = c_j. The correction terms are computed from the digits
of b as they are produced. The part a, a short-precision approximation to √c, is obtained
from a table TSQR using the most-significant (MS) digits of c. The F-DIV selection function
uses a short reciprocal g = 1/d*. This reciprocal can be stored in TSQR or obtained from TREC
using 2a as the input, with an extra delay. We assume the former approach.
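The reduction of square root to division described above can be checked with a small sketch. This is not the F-DIV algorithm itself: it uses exact rational arithmetic and an exhaustive digit search instead of short residuals and reciprocal-based selection, and it obtains a by truncation instead of a TSQR table lookup. The function name `sqrt_by_division` and its parameters are illustrative assumptions.

```python
from fractions import Fraction
from math import isqrt


def sqrt_by_division(c: Fraction, r: int, p: int, n: int):
    """Return (a, digits_of_b, s) with s = a + b = sqrt(c) truncated to n radix-r digits.

    Assumption: a is sqrt(c) truncated (not rounded) to p digits, so b >= 0
    and nonnegative digits are enough for this sketch.
    """
    scale = r ** p
    # Stand-in for the TSQR table: floor(sqrt(c) * r^p) / r^p.
    a = Fraction(isqrt(c.numerator * scale * scale // c.denominator), scale)
    y = c - a * a              # dividend y = c - a^2
    b = Fraction(0)
    digits = []
    for j in range(p + 1, n + 1):
        weight = Fraction(1, r ** j)
        # Largest digit d such that t * (2a + t) <= y for t = b + d * weight,
        # i.e. the partial root a + t still satisfies (a + t)^2 <= c.
        for d in range(r - 1, -1, -1):
            t = b + d * weight
            if t * (2 * a + t) <= y:
                break
        digits.append(d)
        b = t
    return a, digits, a + b


a, digits, s = sqrt_by_division(Fraction(1, 2), r=10, p=2, n=6)
print(a, digits, s)
```

A small radix is used here only to keep the digit search readable; the scheme in the text targets high radices such as 512, where selection by exhaustive search would be replaced by the short-reciprocal selection function.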
This paper presents an algorithm for square root which uses radix-r digit-recurrence division
with short-precision primitives. A combined division/square root implementation is
described. The proposed scheme for radix-512 has a cycle time of (8.4 + 1.9)T = 10.3T,
compared with a cycle time of 8.15T for radix-512 division with prescaling. However,
the proposed scheme uses short multipliers, which may have an advantage at the layout
level. Regarding cost, the proposed scheme uses 38% less area than a combined
division/square root scheme with prescaling. This makes combined F-DIV/SQR interesting
in low-power designs. Since all primitive modules are digit-by-digit, this class of algorithms
is suitable for higher-radix implementation. Since the precision of all modules is short,
designs with nonredundant outputs may be faster/simpler than implementations with
redundant outputs. The scheme discussed uses the modules defined in [5],
repeated here for ease of reference. The delays are expressed in terms of T, the delay of a
full adder. The cost is given in terms of K, the cost of a full adder. The delay and cost of
modules are estimated for k = 9 (r = 2^k = 512).
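The cycle-time comparison quoted above is simple arithmetic in units of T and can be reproduced as follows. The variable names are illustrative, and the text does not say what the two terms 8.4 and 1.9 stand for, so the sketch only adds them.

```python
from decimal import Decimal

k = 9
radix = 2 ** k                     # r = 2^k

# Cycle times in units of T (the delay of a full adder), as quoted in the text.
cycle_fdiv_sqrt = Decimal("8.4") + Decimal("1.9")
cycle_prescaled = Decimal("8.15")

# Ratio of the proposed cycle time to that of division with prescaling.
slowdown = cycle_fdiv_sqrt / cycle_prescaled
print(radix, cycle_fdiv_sqrt, round(slowdown, 3))
```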
Bottom-up development
Primitives
+ Addition/subtraction
+ Multi operand addition
+ Arithmetic shifts
+ Multiplication by digit
+ Result-digit selection (PLA)
+ Table look-up
+ Multiplication
a) Arithmetic level
+ Packaging
+ Interconnection complexity
+ Number of pins
+ Area
+ Power dissipation
+ Power consumption
The objective of this paper is to propose new metrics for assessing adder designs
with respect to reliability and power efficiency for inexact computing. A new figure of
merit referred to as error distance (ED) is initially proposed to characterize the reliability of
an output of an adder. ED is then used to obtain two new metrics: the mean error distance
(MED) and the normalized error distance (NED). The MED and NED can be obtained using
sequential probability transition matrices (SPTMs) and are able to evaluate the reliability of
both probabilistic and deterministic adders. It is shown that the MED is an effective metric
in evaluating the implementation of a multiple-bit adder. The NED is a stable metric that is
almost independent of the size of an implementation; this feature brings a new perspective
for the evaluation and comparison of different adder designs. The power and NED product is
further used to evaluate the power and precision tradeoff. An adder implementation with
reduced precision, referred to as the lower-bit ignored adder (LIA), is investigated as a
baseline design for assessing the lower-part-OR adder (LOA), approximate mirror adders (AMAs) and probabilistic full adders (PFAs). A detailed analysis and simulation
results are presented to assess the reliable performance of these adders using the proposed
new metrics.
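The three metrics can be illustrated with a brute-force sketch. It does not use SPTMs; it simply enumerates all inputs of a small adder. The LIA model (zeroing the ignored low bits of both operands before adding) and the choice of maximum error used for normalization are assumptions made for this example, not definitions taken from the paper.

```python
from itertools import product


def lia_add(x: int, y: int, n_bits: int, ignored: int) -> int:
    """Assumed LIA model: drop the `ignored` low bits of each operand, then add."""
    mask = ((1 << n_bits) - 1) & ~((1 << ignored) - 1)
    return (x & mask) + (y & mask)


def med_ned(approx_add, n_bits: int, max_error: int):
    """Return (MED, NED) over all pairs of n-bit unsigned operands."""
    # Error distance (ED) of one output: |approximate sum - exact sum|.
    eds = [abs(approx_add(x, y) - (x + y))
           for x, y in product(range(1 << n_bits), repeat=2)]
    med = sum(eds) / len(eds)      # mean error distance
    return med, med / max_error    # normalized error distance


n = 4
# Assumption: normalize by the largest exact sum of two n-bit operands.
med, ned = med_ned(lambda x, y: lia_add(x, y, n, ignored=2), n, 2 * ((1 << n) - 1))
print(med, ned)
```

Multiplying the NED by a measured power figure would give the power and NED product mentioned in the text; no power values are assumed here.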
This paper presents the following novel contributions. Adaptive pruning schemes are
analyzed in detail for four different scenarios of the dividend and divisor. Based on this
analysis, new division strategies are proposed to avoid the possible occurrence of overflow
found in a previously proposed approximate divider. Finally, an error correction circuit using OR gates is
utilized for achieving a high accuracy at a very small hardware overhead.
Compared with the exact 16/8 array divider, the proposed adaptive approximation-based
divider (denoted as AAXD) using an 8/4 divider achieves a speedup of 60.51%, a
reduction in power dissipation of 65.88% and in area of 38.63%. For a more accurate
configuration using a 12/6 divider, the AAXD is 26.54% faster and 34.13% more power
efficient than the accurate design. Two image processing applications, change detection and
foreground extraction, show that a higher image quality is obtained by using the proposed
design than using other approximate dividers.
In the simulation, the outputs Q and R of the AXDr and EXDr are simulated
exhaustively for the 8-bit and the 16-bit AXD; the 32-bit AXD has been simulated using 1
million randomly generated input patterns. The MEDs of Q and R are obtained by
calculating the average of the output error distances (i.e. the error distance corresponding to
each input combination). The NED is calculated as the ratio of the MED over the maximum
possible error distance (i.e. 2^(N-1) - 1, where N is the bit width). The simulation results
for the static and dynamic powers are presented next. The dynamic power is measured at a
frequency of 250 MHz.
To evaluate the tradeoff between computation accuracy and power consumption (including
both static and dynamic power) of the AXDs, the MED-power product of the AXDs is
calculated and plotted. Although the power saving of a truncation scheme is larger than that
of a replacement scheme, this is accomplished at the expense of the error; so, the triangle
replacement schemes with AXDr1 have the smallest MED-power product. Hence AXDr1 is
very promising for an approximate divider design requiring both high accuracy and low
power consumption. AXD2 is again shown to be the worst design among the proposed
designs regardless of the type of replacement. Compared to a replacement scheme, a
truncation scheme is not suitable when an AXD design must offer both high accuracy and
low power.