Professional Documents
Culture Documents
MISR Polynomial Equations
MISR Polynomial Equations
MISR Polynomial Equations
Authorized licensed use limited to: Chin-Yi University of Technology. Downloaded on October 29, 2008 at 07:40 from IEEE Xplore. Restrictions apply.
1032 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 19, NO. 9, SEPTEMBER 2000
Let be the content of the -th FF, then the state of the MISR
(i.e., the sequence ) can be represented by the
state polynomial
(2)
(3)
Fig. 2. A typical MISR configuration.
The signature obtained by any SISR or MISR is defined as the
final state of the register after the input sequence has been
of the single-input signature register (SISR) or multiple-input
entered into the register. Consider an IE-SISR with a character-
signature register (MISR), we propose two partial-input LUT
istic polynomial . The signature obtained after the input bit
methods. The first uses the memory-driven approach to gain
stream ( ) is entered can be represented as [2]
a certain speedup with less memory requirement as compared
with the linear compaction algorithm. The second uses the (4)
timing-driven approach to gain a higher speedup with the
same memory requirement as the linear compaction algorithm. i.e., the signature is equivalent to the remainder taken from
Significant speedup can be achieved by exploring the sparsity the division of by . Equation (4) generally pro-
of the error-domain inputs, i.e., long runs of contiguous zeros in vides an easy way for signature computation and aliasing
the error syndromes from the CUT. The idea is to skip long runs probability analysis. Since an IE-MISR can be reduced to an
of zeros in the error-domain input sequence in one iteration. equivalent SISR, it is a widely used technique to simulate a
Operating on the error-domain inputs, our flying-state LUT MISR by its corresponding SISR form. Parallel simulation for
method can significantly reduce the signature computation external-EXOR signature registers can be found in [21].
time with an appropriate stride. However, the speedup of the To reduce memory requirement during signature computa-
method may be small if the stride is not properly selected. tion, the state and input polynomials can be partitioned into sub-
Consequently, a modification of the flying-state method using polynomials of equal length [22]. Specifically, for the -bit state
the pivot-checking (PC) technique is proposed to guarantee polynomial, we let
the speedup irrespective of the stride. Since the sparsity of
the error-domain inputs depends on the faults of the CUT, we
(5)
finally propose a dynamic pivot-checking (DPC) technique,
which may stretch the pivot length dynamically according to
the sparsity. It improves significantly the speedup in a general where
situation. As a result, our signature computation algorithm
using the memory-driven approach and the (dynamic) PC
technique is fast and economical in terms of memory space. (6)
Mathematical analysis and simulation results on ISCAS85
is the -th partition (with length ) of the state polynomial, as-
benchmark circuits show that, with the same memory require-
suming is a multiple of . Similarly, the -bit input sequence
ment, this algorithm has an order of magnitude speedup over
can be partitioned into -bit subpolynomials, i.e.,
the linear compaction algorithm.
The rest of the paper is organized as follows. In Section II, we
briefly review the linear compaction algorithm. We propose sev- (7)
eral improved MISR simulation algorithms in Section III, fol-
lowed by their complexity analysis in Section IV. Experimental
where
results are presented and discussed in Section V. Finally, Sec-
tion VI concludes this work.
(8)
II. PRELIMINARIES is the -th partition of the input polynomial and is assumed
to be a multiple of .
Figure 2 shows a typical MISR configuration—the in-
Consider a linear compactor with initial state polynomial
ternal-EXOR MISR (IE-MISR). In the figure, denotes the
. Let represent the state polynomial after an
length of the MISR, i.e., the number of FFs in the register.
-bit input polynomial is entered, i.e.,
Also, are the binary coefficients of the
characteristic polynomial of the MISR, i.e., (9)
where denotes the operator which shifts the -bit input se-
(1) quence into a linear compactor of length . Also, let
Authorized licensed use limited to: Chin-Yi University of Technology. Downloaded on October 29, 2008 at 07:40 from IEEE Xplore. Restrictions apply.
LIN et al.: A FAST SIGNATURE COMPUTATION ALGORITHM FOR LFSR AND MISR 1033
(10)
(11)
Fig. 3. The original linear compaction algorithm.
and
(12)
Authorized licensed use limited to: Chin-Yi University of Technology. Downloaded on October 29, 2008 at 07:40 from IEEE Xplore. Restrictions apply.
1034 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 19, NO. 9, SEPTEMBER 2000
than . Therefore, it increases the speed of the linear dense (nonsparse) input sequence. For a sparse input sequence
compaction algorithm. (i.e., an input sequence with a high percentage of zero seg-
Since table references are eliminated for those input partitions ments), the linear compaction algorithm can skip input table
with degrees less than , the memory storage for the look-up operations due to the null effect of the zero partitions. It
input LUTs is, therefore, reduced from to reduces the computation time of the input contribution just like
-bit words. the direct substitution of the least significant input subsequence
This memory-driven approach is especially useful for a long in the memory-driven approach—the memory-driven approach
MISR with , where is the degree of each thus has a fair speedup under such a situation. On the other
input sequence partition. In such a case, the input look-up op- hand, the speedup of the timing-driven approach mainly results
erations are completely eliminated, so a better performance im- from the reduction of the number of iterations, so the speedup
provement can be observed. is pattern independent and guaranteed.
(20)
(18)
where and are the output sequences of the CUT
and the fault-free circuit, respectively. Generally, a circuit
where is the least significant part of the input polynomial may contain many untestable faults and hard-to-detect faults.
with degree less than , and , Therefore, in pseudorandom-pattern based BIST, is
i.e., represents a sequence of bits with trailing usually sparse, i.e., it usually contains only a very small
zero bits. percentage of nonzero bits. Let denote the probability that
By (18), we use the input LUTs for the input bits an input partition is nonzero. It is clear that by using
(with trailing zero bits), i.e., our algorithm can virtually retire can be greatly reduced as compared with that of . Table
bits in one iteration without requiring extra memory. This I shows the comparison of among the bigger ISCAS-85
approach outperforms the linear compaction algorithm in that benchmark circuits. In this experiment, 8-bit partitions and
the number of iterations is reduced from to single stuck-at faults were assumed, and 10 000 pseudorandom
for an input sequence of bits in total, at the cost of only patterns were simulated for each fault. From the table, we see
an extra EXOR operation for in each iteration. The table that a significant reduction of is observed if we switch from
look-ups for also are eliminated since its degree is less to , especially for large circuits.
than . Now let the error-domain signature of the MISR be
Generally, this approach gains higher speedup than the
memory-driven approach discussed above, especially for a (21)
Authorized licensed use limited to: Chin-Yi University of Technology. Downloaded on October 29, 2008 at 07:40 from IEEE Xplore. Restrictions apply.
LIN et al.: A FAST SIGNATURE COMPUTATION ALGORITHM FOR LFSR AND MISR 1035
TABLE I
COMPARISON OF THE PROBABILITY OF
NONZERO PARTITIONS BETWEEN I (x) AND I (x)
(22)
Authorized licensed use limited to: Chin-Yi University of Technology. Downloaded on October 29, 2008 at 07:40 from IEEE Xplore. Restrictions apply.
1036 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 19, NO. 9, SEPTEMBER 2000
Authorized licensed use limited to: Chin-Yi University of Technology. Downloaded on October 29, 2008 at 07:40 from IEEE Xplore. Restrictions apply.
LIN et al.: A FAST SIGNATURE COMPUTATION ALGORITHM FOR LFSR AND MISR 1037
If the pivot is nonzero, then -bit input polynomial in each iteration, the time complexity
of our algorithm, including the LUT operation time and the BZC
if operation time, is calculated as follows.
(25)
otherwise
1) LUT operation time:
a) Non-flying-state operation:
G. Fast Signature Computation Algorithm If there is a nonzero partition Hit, since the parti-
Our fast signature computation algorithm results from the tions from Head to Done have been checked and
preceding speedup approaches. Keeping the memory require- known to be zero in the BZC process, only au-
ment as small as the linear compaction algorithm, our signa- tonomous responses are accumulated during this
ture computation algorithm operates on the error-domain inputs interval. The memory-driven operations are exe-
. The algorithm is a combination of the memory-driven cuted from Done to Hit, and the time required is
approach and the flying-state approach with BZC (for practical Done Head Hit Done
cases, DPC also can be used).
Although the discussion has been made on the IE-SISR, ap- (31)
plying our algorithm to a MISR can be done by at least the fol-
lowing two different ways. where
1) Convert the MISR into an equivalent IE-SISR and then
apply our algorithm to the SISR to obtain the signature (32)
[17], [24].
is the time for autonomous response accumulation
2) Run our algorithm for each nonzero connection of the
in the memory-driven approach, and
MISR and then combine the individual signatures ac-
cording to the location of the nonzero connection to the
MISR.
(33)
IV. COMPLEXITY ANALYSIS
A. Memory Complexity Note that and represent the average time
required to perform a table look-up and a logical
The memory space requirement of our algorithm includes the
operation, respectively, represents the av-
input LUTs, state LUTs, flying-state LUTs, and input queue,
erage time to check if a subpolynomial is zero, and
with their memory sizes denoted as , , , and
is the word length. In general,
, respectively. For BZC, the memory sizes in terms of
.
-bit words are, respectively,
b) Flying-state operation:
(26) If consecutive zero partitions are found,
then we perform the flying-state operation on this
(27) -bit input sequence. The operation time is
(28) (34)
Authorized licensed use limited to: Chin-Yi University of Technology. Downloaded on October 29, 2008 at 07:40 from IEEE Xplore. Restrictions apply.
1038 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 19, NO. 9, SEPTEMBER 2000
(38)
where
Done Head
Hit Done
(40)
For an -bit input sequence, the average total time thus is
Let Head Tail be the number of partitions to be pro-
cessed in an iteration, then . Assume Done is uni- (46)
formly distributed between Head and Tail, then we have the
probability It can be seen that a significant speedup can be achieved when
we use the error-domain inputs.
Done Head (41)
V. EXPERIMENTAL RESULTS
for all . Obviously the mean is Done We use a 32-bit LFSR with characteristic polynomial
Head . Now let Tail Done. By (39), in our simulations [22]. The ISCAS-85 benchmark
. Also circuits are used as the functional circuits under test, and the
fault model assumed is the single stuck-at fault. For each fault,
Hit Done (42) 10 000 random patterns are applied. For ease of discussion, the
results shown below were taken from the sequence of the first
and
primary output of each functional circuit. The results are similar
(43) if we pick a random primary output. We also assume that
32, 8, and 8. All the experiments were done on a SUN
Therefore, we obtain Sparc-20/71 workstation with 256 M-byte of RAM. The time
was measured in ticks (a tick is 1/60 s), and the memory space
was measured in 32-bit words.
Tables II and III list the experimental results for the orig-
inal inputs and error-domain inputs , respectively.
Both the memory-driven and timing-driven approaches were
simulated. Improvement can be observed for both approaches as
compared with the linear compaction algorithm [22]. Note that
(44) there is no performance improvement for the memory-driven
approach on the error-domain inputs (though memory space is
If it is a flying-state iteration, bits are processed, else the reduced by half). This is because for the error-domain inputs
partitions from Head to Hit are processed. On the average, the the linear compaction algorithm also neglects as many table
number of bits processed in each iteration is look-ups (for the zero input partitions) due to small , so the
check time for each partition is approximately that for the as-
signment operation on the partition in the memory-driven ap-
proach. From the tables, we can see that the timing-driven ap-
(45)
proach is faster than the memory-driven approach, especially for
Authorized licensed use limited to: Chin-Yi University of Technology. Downloaded on October 29, 2008 at 07:40 from IEEE Xplore. Restrictions apply.
LIN et al.: A FAST SIGNATURE COMPUTATION ALGORITHM FOR LFSR AND MISR 1039
REFERENCES
[1] P. H. Bardell, W. H. McAnney, and J. Savir, Built-In Test for
TABLE V VLSI. New York, NY: John Wiley & Sons, 1987.
COMPUTATION TIME OF THE PC PROCEDURE OVER A RANGE OF P (PIVOT
LENGTH), ASSUMING n 30 = [2] M. Abramovici, M. A. Breuer, and A. D. Friedman, Digital Systems
Testing and Testable Design. New York, NY: Computer Science Press,
1990.
[3] S. Pilarski and T. Kameda, A Probabilistic Analysis of Test-Response
Compaction. Los Alamitos, CA: IEEE Computer Society Press, 1995.
[4] M. Abramovici, P. R. Menon, and D. T. Miller, “Critical path tracing:
An alternative to fault simulationy,” IEEE Design & Test of Computers,
vol. 1, no. 1, pp. 83–93, Feb. 1984.
[5] J. A. Waicukauski, E. B. Eichelberger, D. O. Forlenza, E. Lindbloom,
and T. McCarthy, “Fault simulation for structured VLSI,” VLSI Systems
Design, vol. 6, no. 12, pp. 20–32, Dec. 1985.
[6] K. J. Antreich and M. H. Schulz, “Accelerated fault simulation and fault
grading in combinational circuits,” IEEE Transactions on Computer-
Aided Design of Integrated Circuits, vol. 6, no. 9, pp. 704–712, Sept.
the error-domain inputs. Also, it is input-pattern independent as 1987.
expected. [7] P. A. Duba, R. K. Roy, J. A. Abraham, and W. A. Rogers, “Fault simula-
Table IV lists the computation time (in terms of ticks) of our tion in a distributed environment,” in Proc. IEEE/ACM Design Automa-
tion Conf. (DAC), 1988, pp. 686–691.
algorithm over a range of stride . Compared with that of the [8] W.-T. Cheng and M.-L. Yu, “Differential fault simulation—A fast
linear compaction algorithm shown in Table II, a great reduc- method using minimal memory,” in Proc. IEEE/ACM Design Automa-
tion in computation time can be seen—the speedup ranges from tion Conf. (DAC), June 1989, pp. 424–428.
[9] B. Underwood and J. Ferguson, “The parallel-test-detect fault simulation
about 5 to 20 for . It is interesting to note that the algorithm,” in Proc. Int. Test Conf. (ITC), 1989, pp. 712–717.
speedup tends to saturate (and even decay) for large . This is [10] F. Maamari and J. Rajski, “A method of fault simulation based on stem
because 1) the check time dominates when is very region,” IEEE Transactions on Computer-Aided Design of Integrated
Circuits, vol. 9, no. 2, pp. 212–220, Feb. 1990.
small if we use and the BZC technique, and 2) the spar- [11] H.-K. Lee and D.-S. Ha, “An efficient forward fault simulation algorithm
sity of has been exploited. The memory requirement is the based on the parallel single fault propagation,” in Proc. Int. Test Conf.
same as the linear compaction algorithm. (ITC), 1991, pp. 946–955.
[12] T. M. Niermann, W.-T. Cheng, and J. H. Patel, “PROOFS: A fast,
Table V shows the computation time for a range of (pivot memory efficient sequential circuit fault simulator,” IEEE Transactions
length) when we use the PC technique, assuming . From on Computer-Aided Design of Integrated Circuits, vol. 11, no. 2, pp.
the table, we can see that the performance is very dependent on 198–207, Feb. 1992.
[13] F. Maamari and J. Rajski, “The dynamic reduction of fault simulation,”
the value of . A larger means more intermediate states will IEEE Transactions on Computer-Aided Design of Integrated Circuits,
be removed when we detect a zero pivot. Consequently, for a vol. 12, no. 1, pp. 137–148, Jan. 1993.
sparse input sequence, a large should be used. For a sparse [14] S. Parkes, P. Banerjee, and J. Patel, “A parallel algorithm for fault sim-
ulation based on PROOFS,” in Proc. IEEE Int. Conf. Computer Design
input sequence and a fixed , PC is about as good as BZC if (ICCD), 1995, pp. 616–621.
. [15] I. Pomeranz and S. M. Reddy, “On fault simulation for synchronous
sequential circuits,” IEEE Transactions on Computers, vol. 44, no. 2,
pp. 335–340, Feb. 1995.
VI. CONCLUSION [16] M. B. Amin and B. Vinnakota, “ZAMBEZI: A parallel pattern parallel
fault sequential circuit fault simulator,” in Proc. IEEE VLSI Test Symp.
We have proposed several approaches to improving the (VTS), 1996, pp. 438–443.
LFSR/MISR signature computation performance. Based on the [17] C. F. See and K. K. Saluja, “An efficient method for computation of
table look-up linear compaction algorithm and the modularity signatures,” in Proc. Int. Conf. VLSI Design, Jan. 1992, pp. 245–250.
[18] C.-P. Kung, C.-J. Huang, and C.-S. Lin, “Fast fault simulation for BIST
property of SISR, new accelerating schemes—partial input applications,” in Proc. Fourth Asian Test Symp. (ATS), Bangalore, Kar-
look-up tables and flying-state look-up tables—have been de- nataka, Nov. 1995, pp. 93–99.
veloped to boost the signature computation speed. We proposed [19] G. Griffiths and G. C. Stones, “The tea-leaf reader algorithm: An effi-
cient implementation of CRC-16 and CRC-32,” Communications of the
two partial-input LUT methods—the memory-driven approach ACM, vol. 30, no. 7, pp. 617–620, July 1987.
and the timing-driven approach. Significant speedup also has [20] D. V. Sarwate, “Computation of cyclic redundancy checks via table
been observed when we explore the sparsity of the error-domain look-up,” Communications of the ACM, vol. 31, no. 8, pp. 1008–1013,
Aug. 1988.
inputs. Operating on the error-domain inputs, the flying-state [21] N. Narendran, M. Franklin, and K. K. Saluja, “Parallel computation of
LUT method can significantly reduce the signature compu- LFSR signatures,” in Proc. Second Asian Test Symp. (ATS), Beijing,
tation time with an appropriate stride which can be kept by Nov. 1993, pp. 75–80.
[22] D. Lambidonis, A. Ivanov, and V. K. Agarwal, “Fast signature compu-
using the DPC technique. Our signature computation algorithm tation for BIST linear comparators,” IEEE Transactions on Computer-
using the memory-driven approach and the DPC technique is Aided Design of Integrated Circuits, vol. 14, pp. 1037–1044, Aug. 1995.
Authorized licensed use limited to: Chin-Yi University of Technology. Downloaded on October 29, 2008 at 07:40 from IEEE Xplore. Restrictions apply.
1040 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 19, NO. 9, SEPTEMBER 2000
[23] S. Lin and D. J. Costello, Error Control Coding: Fundamentals and Ap- Cheng-Wen Wu (S’86–M’87–SM’95) received the
plications. Englewood Cliffs, NJ: Prentice-Hall Inc., 1983. B.S.E.E. degree in 1981 from National Taiwan Uni-
[24] K. K. Saluja and C. F. See, “An efficient signature computation method,” versity, Taipei, Taiwan, and the M.S. and Ph.D. de-
IEEE Design & Test of Computers, vol. 9, no. 4, pp. 22–26, Dec. 1992. grees, both in electrical and computer engineering,
in 1985 and 1987, respectively, from the University
of California, Santa Barbara. Since 1988 he has been
with the Department of Electrical Engineering, Na-
Bin-Hong Lin received the B.S. and M.S. degrees tional Tsing Hua University, Hsinchu, Taiwan, where
both in electrical engineering, in 1990 and 1992, he is currently a professor. He also has served as the
respectively, from National Tsing Hua University, director of the university’s Computer and Communi-
Hsinchu, Taiwan. He is currently a principal engineer cations Center from 1996 to 1998, and the director of
at TSMC, Hsinchu, Taiwan. Mr. Lin’s research the university’s Technology Service Center from 1998 to 1999. He is a Guest
interests include VLSI design and testing, CAD Editor of the Journal of Information Science and Engineering, Special Issue on
tools for VLSI, and memory testing. VLSI Testing.
Dr. Wu was the Technical Program Chair of the IEEE Fifth Asian Test Sym-
posium (ATS’96), and is the General Chair of the Ninth ATS (ATS’00). He re-
ceived the Distinguished Teaching Award from NTHU in 1996 and the Out-
standing Electrical Engineering Professor Award from the Chinese Institute of
Electrical Engineers (CIEE) in 1997. He is interested in design and test of high
performance VLSI circuits and systems. He is a member of CIEE and a senior
Shao-Hui Shieh received the B.S. degree in elec- member of IEEE.
tronic engineering and the M.S. degree in automatic
control engineering from Feng Chia University,
Taichung, Taiwan, in 1977 and 1981, respectively.
He is currently pursing the Ph.D. degree in electrical
engineering at National Tsing Hua University. His
research interests include VLSI design and testing,
computer arithmetic and architecture, and redundant
number representations.
Authorized licensed use limited to: Chin-Yi University of Technology. Downloaded on October 29, 2008 at 07:40 from IEEE Xplore. Restrictions apply.