Generic Built-In Self-Repair Architectures for SoC Logic Cores
Marcel Balaz, Stefan Kristofik, Maria Fischerova
Institute of Informatics
Slovak Academy of Sciences
Bratislava, Slovak Republic
E-mail: {marcel.balaz, stefan.kristofik, maria.fischerova}@savba.sk
Abstract—The built-in self-repair (BISR) concept is utilized and proven by industry mainly in the regular structures of system-on-chip (SoC) memory cores. On the other hand, the self-repair concept for logic cores, introduced and developed in several papers, is relatively new, as the irregular structure of these types of cores represents a serious limitation. However, there is a need for a complex BISR architecture that can be widely used on different types of logic cores in order to further support the reliability of SoCs. This paper presents a generic BISR architecture based on reconfigurable logic blocks (RLBs) applicable to any logic core inside an SoC, together with basic requirements, defined in detail, that guide the architecture development, and algorithms handling the fault detection and localization procedure.

Index Terms—built-in self-repair, reconfigurable logic, logic core, system-on-chip, reliability.

I. INTRODUCTION

According to predictions and experience with complex systems-on-chip (SoCs), reliability has become important, and self-repair is required for manufacturing long-term dependable systems [1], [2]. The dominant silicon area of today's SoCs is occupied by memory cores of different types; therefore, investigation and development during the last decade were mostly targeted at improving the reliability of these cores [3]. Several redundant design methods based on duplicate elements are widely used to detect and repair a variety of fault types by discovering faulty memory elements early and substituting them with backup elements. However, other types of cores should not be neglected either, in order to guarantee the fault-tolerant behavior of the whole SoC. Defects occurring in logic cores caused by new technologies also need to be handled. Logic cores with the ability of self-repair offer the opportunity to target these issues.

In order to support the reliability of SoCs also from the perspective of logic cores, fully automated methods and generic architectures are required, although the random (irregular) structure of logic cores raises the complexity of built-in self-repair (BISR) to a much higher level than for memory elements. Several papers have presented basic principles and suggestions for BISR architecture design, or dealt with core-specific implementations of ad hoc methods.

In this paper a generic BISR architecture for logic cores is presented. The main motivation was to provide a capability for improving the reliability parameters of any logic core inside SoCs. The work identifies four basic requirements which guide a core designer in developing a BISR architecture with minimal area overhead that could be implemented in any logic core. Moreover, four algorithms that are able to handle the fault detection and localization procedure are proposed. Each algorithm aims at fulfilling different goals; therefore a variety of uses is covered by the proposed architecture.

The rest of the paper is organized as follows. Section II contains the related work. Section III introduces a set of requirements to achieve feasible reliability characteristics of the generic BISR architecture, together with algorithms driving the fault detection and localization procedure. Section IV summarizes the achieved results and Section V concludes the paper.

II. RELATED WORK

The BISR principle for logic cores was published in [4]. The logic core gains BISR capabilities by being transformed into several reconfigurable logic blocks (RLBs). Each RLB ensures the repair of its assigned part of the logic core. The assigned part consists of identical functional blocks (FBs), for which a backup block (BB) with the same function is added to the RLB. One BB is considered for several identical FBs in order to reduce the area overhead. The most common configuration is 3+1: three FBs and one BB. If one of the FBs becomes faulty, all inputs and outputs of that block are redirected to the BB, so that the BB replaces the faulty FB, i.e. the logic core is repaired by partial reconfiguration. The replacement and isolation of the faulty FB are performed by the switches assigned to each input and output port of the FB. The given RLB should also contain a simple controller that stores the current block configuration (which FB is faulty or, otherwise, that all FBs are fault-free) and drives the control signals for the switches. The test generation, test response evaluation and reconfiguration are governed by the global control unit [4].

An analogy between BISR of logic cores and memory cores can be found easily. Rows/columns of the memory matrix can be interpreted as FBs and the backup row/column (assigned to several rows/columns) as the BB. However, the random (irregular) structure of logic cores makes the implementation of BISR capabilities more difficult. This is the main reason why only ad hoc methods exist in this area and the BISR architecture was introduced only for specific
Figure 3: RLB of structure 3+1

Figure 4: State controller – RLB control logic

Requirement 4. Testing should be performed on a core as a whole.

The aim of the requirement is to simplify the entire repair procedure and consequently to reduce area overhead. In respect to low area overhead the proposed RLB does not contain test inputs and outputs for testing. The regular testing of the core is performed as for the core without repair capability. The BISR logic should be able to start the test and to get the

Figure 5: Circuitry to preserve repaired state in RLB

Reconfigurable logic blocks (RLBs) were introduced in [1]. This architecture enables fault isolation by using switching elements at both inputs and outputs. RLB consists of a number of identical functional blocks (FB) with one of them set up as backup block (BB). In case any FB is faulty, BB can substitute its functionality. Maximum number of repairable blocks in each RLB is 1. Repair function cannot be guaranteed when more FBs are faulty.

Fig. 1. (a) RLB 3+1 architecture, (b) input switch, (c) output switch.

We use RLB architecture with 3 functional blocks (FB1, FB2 and FB3) and 1 backup block (BB), i.e. RLB 3+1, which is shown in Fig. 1 (a). RLB 3+1 has 4 logic states as is shown in Table I. In state 0, backup block is unused, whereas in states 1, 2 and 3, backup block is used instead of FB1, FB2 and FB3, respectively.

Table I: States of RLB 3+1

RLB state   FB1      FB2    FB3    BB       Control bits
                                            c1   c2   c3
0           used     used   used   unused   0    0    0
1           unused   used   used   used     1    0    0
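The four RLB 3+1 states of Table I can be illustrated with a small behavioural model. This is only a sketch, not the authors' hardware: the names `RlbState`, `Rlb3Plus1`, `repair`, `usage` and `control_bits` are hypothetical, and the one-hot encoding of c1 to c3 for states 2 and 3 is an assumption (only the control bits for states 0 and 1 are visible in this text).

```python
from enum import IntEnum


class RlbState(IntEnum):
    """Logic states of an RLB 3+1; numbering follows Table I."""
    NO_REPAIR = 0     # FB1, FB2 and FB3 used, BB unused
    FB1_REPLACED = 1  # BB used instead of FB1
    FB2_REPLACED = 2  # BB used instead of FB2
    FB3_REPLACED = 3  # BB used instead of FB3


class Rlb3Plus1:
    """Hypothetical software model of one RLB with three FBs and one BB."""

    def __init__(self) -> None:
        self.state = RlbState.NO_REPAIR

    def repair(self, faulty_fb: int) -> bool:
        """Try to substitute FB number `faulty_fb` (1..3) with the BB.

        Returns False when the BB already replaces a different FB,
        because at most one block per RLB is repairable.
        """
        if faulty_fb not in (1, 2, 3):
            raise ValueError("faulty_fb must be 1, 2 or 3")
        if self.state == RlbState.NO_REPAIR:
            self.state = RlbState(faulty_fb)
            return True
        return self.state == faulty_fb

    def usage(self) -> dict[str, str]:
        """Report which blocks are in use, as in the rows of Table I."""
        blocks = {f"FB{i}": "used" for i in (1, 2, 3)}
        blocks["BB"] = "unused"
        if self.state != RlbState.NO_REPAIR:
            blocks[f"FB{int(self.state)}"] = "unused"
            blocks["BB"] = "used"
        return blocks

    def control_bits(self) -> tuple[int, int, int]:
        """Return (c1, c2, c3).

        ASSUMPTION: one-hot encoding, i.e. bit ci is 1 exactly when the BB
        replaces FBi. Only states 0 and 1 are confirmed by the table.
        """
        return tuple(int(self.state == i) for i in (1, 2, 3))
```

For example, after `rlb = Rlb3Plus1()` and `rlb.repair(1)`, `rlb.usage()` reports FB1 as unused and the BB as used, and a later `rlb.repair(2)` returns `False` because the single backup block is already taken.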
Table IV: Comparison of number of test launches for proposed localization algorithms

# of RLBs   Serial approach      Parallel approach    Parallel-then-serial   Partitioning approach
(n)         # of test runs       # of test runs       approach               # of test runs
                                                      # of test runs
            min.  max.  avg.     min.  max.  avg.     min.  max.  avg.       min.  max.  avg.
4           1     12    6.5      1     3     2        2     7     4.5        2     8     5
6           1     18    9.5      1     3     2        2     9     5.5        3     9     5.83
7           1     21    11       1     3     2        2     10    6          3     9     5.71
8           1     24    12.5     1     3     2        2     11    6.5        3     9     6
10          1     30    15.5     1     3     2        2     13    7.5        3     9     6.5
12          1     36    18.5     1     3     2        2     15    8.5        4     10    6.83
16          1     48    24.5     1     3     2        2     19    10.5       4     10    7
24          1     72    36.5     1     3     2        2     27    14.5       5     11    7.83
32          1     96    48.5     1     3     2        2     35    18.5       5     11    8
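The serial-approach columns of Table IV can be reproduced with a few lines of arithmetic. The sketch below is an illustration under stated assumptions: the function name `serial_test_runs` is hypothetical, the average uses the expression (n × k + 1)/2 given in the text, and k = 3 is inferred from the table (the maximum is always 3n), not taken from a definition in this excerpt.

```python
from fractions import Fraction


def serial_test_runs(n: int, k: int = 3) -> tuple[int, int, Fraction]:
    """Return (min, max, avg) test runs for the serial approach.

    n: number of RLBs in the core.
    k: ASSUMED number of test runs needed per RLB (3 matches Table IV).
    """
    if n < 1 or k < 1:
        raise ValueError("n and k must be positive")
    minimum = 1                        # best case: a single test run
    maximum = n * k                    # worst case: k runs for each RLB
    average = Fraction(n * k + 1, 2)   # mean of 1..n*k, i.e. (n*k + 1)/2
    return minimum, maximum, average


# Print the serial columns for the RLB counts listed in Table IV.
for n in (4, 6, 7, 8, 10, 12, 16, 24, 32):
    lo, hi, avg = serial_test_runs(n)
    print(n, lo, hi, float(avg))
```

The average grows linearly with n, which is consistent with the paper's remark that the serial approach suits smaller BISR architectures.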
of test runs can be computed as $(n \times k + 1)/2$. This approach is suitable for smaller BISR architectures where the time of the repair process is not crucial. For bigger architectures the last two approaches are more suitable. The parallel-then-serial approach has an average number of test runs at the level of 30–40 % of the serial approach for bigger architectures. The partitioning approach is not very suitable for smaller BISR architectures due to its higher area, but it is the best solution for the biggest BISR architectures, where the repair process needs to be performed quickly.

Transistor count

# of RLBs   Serial approach                                     Partitioning approach
(n)         SC           B           BISR Controller   Total    SC           B           BISR Controller   Total
            (n × A_SC)   (n × A_B)   A_FDLP                     (n × A_SC)   (n × A_B)   A_FDLP
7           938          448         850               2236     938          448         1184              2570
12          1608         768         982               3358     1608         768         1554              3930
16          2144         1024        1078              4246     2144         1024        1606              4774
24          3216         1536        1172              5924     3216         1536        1800              6552
32          4288         2048        1416              7752     4288         2048        1984              8320

Transistor count

# of RLBs   Same for all 4 approaches   Serial            Parallel          Parallel-then-serial   Partitioning
(n)         SC           B              FDLP              FDLP              FDLP                   FDLP
            (n × A_SC)   (n × A_B)      A_FDLP   Total    A_FDLP   Total    A_FDLP   Total         A_FDLP   Total
7           938          448            914      2300     456      1842     1284     2670          1570     2956
12          1608         768            1014     3390     456      2832     1380     3756          1780     4156
16          2144         1024           1164     4332     456      3624     1432     4600          1790     4958
24          3216         1536           1244     5996     456      5208     1668     6420          2152     6904
32          4288         2048           1488     7824     456      6792     1872     8208          2428     8764

The reliability of the core is highly dependent on the additional area and the chosen number of RLBs inside the core. Therefore the reliability improvement could not be investigated on the generic architecture. However, all our implementations have 100–200 % overhead, which is noticeably lower than the previously predicted 250–300 % [11].

V. CONCLUSION

The generic BISR architecture for logic cores was presented in the paper. The main motivation was to provide the possibility to improve reliability parameters for any logic core inside an SoC. Four basic requirements were proposed, which guide a core designer in developing a simple BISR architecture with minimal area overhead that is implementable for any logic core. Moreover, four simple algorithms to handle the fault detection and localization procedure were presented. Each algorithm aims to fulfill different goals; therefore the proposed architecture can be implemented in a variety of cores with different design restrictions.

This work has been supported by Slovak national project VEGA 2/0034/12.

REFERENCES

[1] “International technology roadmap for semiconductors, 2012 update,” 2012, http://www.itrs.net/Links/2012ITRS/Home2012.htm [Online, accessed: 1.14.2014].
[2] M. L. Shooman, Reliability of Computer Systems and Networks: Fault Tolerance, Analysis, and Design. Wiley-Interscience, 2001.
[3] M. Fischerová and E. Gramatová, “Memory testing and self-repair,” in Design and Test Technology for Dependable Systems-on-Chip, R. Ubar, J. Raik, and H. T. Vierhaus, Eds. IGI Global, 2011, pp. 155–174, doi: 10.4018/978-1-60960-212-3.ch007.
[4] T. Koal, D. Scheit, and H. T. Vierhaus, “A concept for logic self repair,” in 12th Euromicro Conf. Digital System Design: Architectures, Methods and Tools, DSD 2009, 2009, pp. 621–624, doi: 10.1109/DSD.2009.238.
[5] A. Benso, S. Di Carlo, G. Di Natale, and P. Prinetto, “Online self-repair of FIR filters,” IEEE Des. Test, vol. 20, no. 3, pp. 50–57, 2003, doi: 10.1109/MDT.2003.1198686.
[6] T. Koal and H. T. Vierhaus, “A software-based self-test and hardware reconfiguration solution for VLIW processors,” in 13th IEEE Int. Symp. Design and Diagnostics of Electronic Circuits and Systems, DDECS 2010, 2010, pp. 40–43, doi: 10.1109/DDECS.2010.5491821.
[7] M. Ulbricht, M. Schölzel, T. Koal, and H. T. Vierhaus, “A new hierarchical built-in self-test with on-chip diagnosis for VLIW processors,” in 14th IEEE Int. Symp. Design and Diagnostics of Electronic Circuits and Systems, DDECS 2011, 2011, pp. 143–146, doi: 10.1109/DDECS.2011.5783067.
[8] T. Koal and H. T. Vierhaus, “Optimal spare utilization for reliability and mean lifetime improvement of logic built-in self-repair,” in 14th IEEE Int. Symp. Design and Diagnostics of Electronic Circuits and Systems, DDECS 2011, 2011, pp. 219–224, doi: 10.1109/DDECS.2011.5783083.
[9] C. Gleichner, T. Koal, and H. T. Vierhaus, “Effective logic self repair based on extracted logic clusters,” in 14th IEEE Conf. Signal Processing: Algorithms, Architectures, Arrangements, and Applications, SPA 2010, 2010, pp. 10–15.
[10] R. Dobai, M. Balaz, and M. Fischerova, “Automated generation of built-in self-repair architectures for random logic SoC cores,” in Digital System Design (DSD), 2012 15th Euromicro Conference on, 2012, pp. 73–78, doi: 10.1109/DSD.2012.29.
[11] T. Koal and H. T. Vierhaus, “Built-in self repair for logic structures,” in Design and Test Technology for Dependable Systems-on-Chip, R. Ubar, J. Raik, and H. T. Vierhaus, Eds. IGI Global, 2011, pp. 216–240, doi: 10.4018/978-1-60960-212-3.ch010.
[12] R. J. Baker, CMOS Circuit Design, Layout, and Simulation, 3rd ed. Wiley-IEEE Press, 2010.