
A Fast Scrubbing Method Based on Triple Modular Redundancy for SRAM-Based FPGAs
Rong-Sheng Zhang, Li-Yi Xiao*, Xue-Bing Cao, Jie Li, Jia-Qiang Li, and Lin-Zhe Li
Microelectronics Center, Harbin Institute of Technology, Harbin 150001, China
* Email: xiaoly@hit.edu.cn

Abstract

In recent years, SRAM-based FPGAs (Field Programmable Gate Arrays) have become popular in space applications because of their high speed and reconfigurability. The SRAM (Static Random Access Memory) cell at the core of their structure speeds up not only the user design but also configuration and reconfiguration. However, SRAM is very sensitive to space particles, which can change the value stored in it. To improve the reliability of user designs in SRAM-based FPGAs, this paper proposes a scrubbing method based on TMR (Triple Modular Redundancy). The method improves the reliability of a TMR design by reducing the possibility of SEU accumulation. We test the proposed scrubbing method through fault injection, and the experimental results indicate that it can improve the reliability of TMR designs in SRAM-based FPGAs.

Keywords: SRAM-based FPGAs, reliability, scrubbing, fault injection.

1. Introduction

FPGAs are becoming more and more popular because they can be configured flexibly. SRAM-based FPGAs in particular can be configured many times in a very short time. However, there is a problem when SRAM-based FPGAs are used in space applications: their memory cells are very sensitive to SEUs (Single Event Upsets), which restricts their application [1, 2].

The SEU effects in SRAM-based FPGAs differ from those in ASICs (Application Specific Integrated Circuits) [3]. Configuration memory is an important part of an SRAM-based FPGA because it controls the function of the FPGA. If an SEU occurs in configuration memory, the function of the user design may change permanently unless the FPGA is reconfigured. Therefore, it is important to design fault-tolerant techniques that improve the reliability of SRAM-based FPGAs in space applications.

Redundancy is the most common fault-tolerant technique and includes hardware redundancy and time redundancy. One representative redundancy technique is TMR. The user design is implemented as three identical circuits and a majority voter, as shown in figure 1. TMR is suitable for SRAM-based FPGAs too. If an SEU occurs in the FPGA, one incorrect redundant circuit cannot change the final correct output. TMR is an SEU mitigation technique.

[Figure 1 diagram labels: INPUT, Module 1, Module 2, Module 3, Voter, OUTPUT]
Figure 1. The TMR scheme

TMR has some obvious disadvantages. The three identical circuits necessarily increase area and power consumption [4]. The majority voter is the key element of TMR. If an SEU occurs in the majority voter, its function may change, and an incorrect majority voter leads to incorrect TMR output.

For SRAM-based FPGAs, TMR has a non-negligible problem: the accumulation of SEUs in the FPGA. Usually, the configuration memory does not change after an SRAM-based FPGA has been configured, so an SEU that occurs remains in the FPGA. When two or three of the redundant TMR circuits are wrong, the majority voter gives an incorrect output. Therefore, it is necessary to repair the SEUs in configuration memory at intervals to ensure that the TMR design runs correctly.

Scrubbing is a very effective method to resolve the accumulation of SEUs in configuration memory. Scrubbing is an SEU recovery technique. It can be divided into two types, blind scrubbing and detecting scrubbing [5]. Blind scrubbing is a simple method that repairs SEUs in configuration memory by reloading the original correct frames into the configuration memory at set intervals. The scrubbing rate is the key problem of blind scrubbing because it is restricted by the configuration clock frequency and the size of the configuration memory. Detecting scrubbing repairs SEUs in configuration memory by reading back the

978-1-5386-4441-6/18/$31.00 ©2018 IEEE


Authorized licensed use limited to: Cornell University Library. Downloaded on September 01,2020 at 09:02:18 UTC from IEEE Xplore. Restrictions apply.
bitstream, detecting an upset, and reloading the original correct frames.

The combination of detecting scrubbing and ECC (Error Correction Code) is an effective scrubbing method. Its advantage is reduced area consumption, because it does not need the large storage space otherwise required to hold the original correct frames [6, 7].

For SRAM-based FPGAs, TMR alone or scrubbing alone is not enough to ensure the reliability of the user design. Combining TMR and scrubbing is a good choice. To combine them more effectively, this paper proposes a scrubbing method based on TMR that can quickly repair the SEUs in a TMR design.

2. The proposed scrubbing method

According to the type of configuration port, scrubbing can be divided into two types, external scrubbing and internal scrubbing. External scrubbing reloads the correct frames through an external configuration port, such as SelectMAP. Internal scrubbing reloads the correct frames through the internal configuration port, ICAP (Internal Configuration Access Port). This paper adopts internal scrubbing because it is faster than external scrubbing and because the whole design can be implemented in one FPGA.

This paper combines detecting scrubbing and ECC. FRAME_ECC is internal Xilinx logic that can detect single or double SEUs in configuration memory. The FRAME_ECC logic calculates a syndrome value from the bits in one frame, including the ECC bits. When frames are read from the configuration memory, the syndrome value tells us whether an SEU has occurred. This method does not need any additional storage space because the original correct frames do not have to be stored.

The entire scrubbing platform is composed of the scrubber and the user design, as shown in figure 2. The FSM (Finite State Machine) is the core of the scrubber and controls the readback and rewrite steps of scrubbing. The FRAME_ECC decoder decodes the output of the FRAME_ECC logic. The scrubbing address controller selects the frame addresses to scrub according to the output of the majority voter.

Usually, the scrubber and the user design do not need any interconnection. In this paper, however, we put the majority voter and the scrubber together. The majority voter is optimized so that it not only filters out the output of an erroneous module but also identifies which module is wrong.

The scrubber runs from the moment the FPGA starts. The scrubbing address controller has two modes, normal mode and error mode. In normal mode, the scrubbing frame addresses are traversed in sequence. If the majority voter identifies an erroneous module, the scrubbing address controller switches to error mode, in which the scrubbing frame addresses move to the region of the erroneous module. When the scrubbing of the erroneous module's region is complete, the scrubbing address controller returns to normal mode. The three identical modules are routed in advance. The proposed scrubbing is executed according to the scrubbing algorithm illustrated in figure 3.

[Figure 2 diagram labels: FPGA; Scrubber: FSM, Decoder of FRAME_ECC, Scrubbing address controller, Majority voter; User design: Module 1, Module 2, Module 3]
Figure 2. The scheme of scrubbing platform

    /* Initialization */
    scrub_initialization();

    /* Normal mode scrubbing: walk through all frames in sequence */
    while(1)
    {
        scrub_current_frame(frame_address);
        if(majority voter error)
        {
            goto scrub_error_mode();
        }
        else
        {
            frame_address = next_normal_mode_frame_address;
        }
    }

    /* Error mode scrubbing: walk through the frames of the erroneous module only */
    scrub_error_mode()
    {
        frame_address = first_error_module_frame_address;
        while(1)
        {
            scrub_current_frame(frame_address);
            if(error repaired or frame_address == last_error_module_frame_address)
            {
                /* Return to normal mode */
                frame_address = next_normal_mode_frame_address;
                goto scrub_current_frame(frame_address);
            }
            else
            {
                frame_address = next_error_module_frame_address;
            }
        }
    }

Figure 3. The scrubbing algorithm

3. Test of the proposed scrubbing method
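The injection schedule this section describes (a random interval between injections and a uniformly random frame address) could be sketched roughly as follows. This is only an illustration under stated assumptions: the interval of "1-5 times length of entire frames" is read here as a count of scrubbed frames, and every frame index is treated as a valid target, whereas the address generator of [9] also filters for effective addresses. The names injection_schedule, total_frames, and n_injections are hypothetical and do not come from the authors' tool.

```python
import random


def injection_schedule(total_frames, n_injections, seed=0):
    """Return a list of (wait_frames, frame_address) pairs.

    wait_frames   -- number of scrubbed frames to wait before the next
                     injection, drawn uniformly from 1x to 5x total_frames
                     (assumed reading of the interval in the text)
    frame_address -- uniformly random frame index in [0, total_frames)
                     (assumption: every index is a valid target)
    """
    rng = random.Random(seed)  # seeded so that a campaign can be repeated
    schedule = []
    for _ in range(n_injections):
        wait_frames = rng.randint(total_frames, 5 * total_frames)
        frame_address = rng.randrange(total_frames)
        schedule.append((wait_frames, frame_address))
    return schedule


if __name__ == "__main__":
    # Hypothetical frame count, chosen only to keep the output readable.
    for wait, addr in injection_schedule(total_frames=100, n_injections=3):
        print(f"wait {wait} frames, then flip a bit in frame {addr}")
```

Seeding the generator is a design choice of this sketch, made so that an injection campaign can be replayed; the paper does not say whether the authors' tool does this.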

This paper adopts fault injection to verify the proposed scrubbing method, because fault injection is a fast and effective evaluation method [8]. Our fault injection tool also uses ICAP. While a fault is being injected, scrubbing pauses. Once the SEU has been injected, scrubbing resumes.

One important factor that may influence the experimental results is the interval between injected SEUs. Firstly, the interval must be random. Secondly, the interval can be neither too long nor too short. A long interval makes the experiment very time-consuming, and a short interval cannot show the randomness, which would influence the experimental results. We consider an interval of 1 to 5 times the length of the entire set of frames to be appropriate.

Another important factor is the injected frame address. The injected frame addresses must be uniformly random. Therefore, we design a fault injection address generator that generates effective, uniformly random addresses in real time [9]. Generating effective addresses in real time speeds up the fault injection flow because no time is spent waiting for an effective address.

The fault injection tool can not only inject an SEU but also repair the SEU it has just injected. If the injected SEU changes the function of one or more modules, it is a critical SEU. Otherwise, it is not a critical SEU. An injected SEU that is not critical is repaired by the fault injection tool, because its repair time does not influence the experimental result.

4. Experimental results

We test the proposed scrubbing method on the XC6VLX240T Virtex-6 FPGA of the ML605 evaluation board. Two ISCAS85 benchmark circuits, c7552 and c6288, are chosen as the user designs. The input vectors are generated by an ATPG (Automatic Test Pattern Generation) tool named atalanta to achieve a high fault coverage rate.

A critical SEU is regarded as an error. In this paper, we measure the average number of frames scrubbed to repair an error. We repeat the experiment a large number of times to improve the accuracy of the results. The experimental data are shown in table 1.

Table 1. The experimental data

user design    average scrubbing frames of repairing an error
               normal scrubbing    proposed scrubbing
c7552          10772               2445
c6288          10834               1179

This paper tests two scrubbing methods, normal scrubbing and the proposed scrubbing. Normal scrubbing traverses all frames in sequence. Table 1 shows that the proposed scrubbing method repairs accumulated SEUs in a TMR design faster: the average number of frames scrubbed to repair an error drops by a factor of roughly 4.4 for c7552 and roughly 9.2 for c6288. Because an SEU stays in the TMR design for a shorter time, the reliability of the TMR design in SRAM-based FPGAs is improved.

5. Conclusion

Because SRAM-based FPGAs are sensitive to SEUs, fault-tolerant techniques are necessary if the FPGAs are used in space applications. The combination of TMR and scrubbing is a good choice. This paper proposes a scrubbing method that repairs the SEUs in a TMR design faster. That is, the proposed scrubbing method can improve the reliability of TMR designs in SRAM-based FPGAs.

Acknowledgments

This work was supported by the Fundamental Research Funds for the Central Universities (Grant No.HIT.KISTP.201404), Harbin science and innovation research special fund (2015RAXXJ003), and Special fund for development of Shenzhen strategic emerging industries (JCYJ20150625142543456).

References

[1] Y. Y. Fan, X. D. Cai, C. H. He, and D. Liu, IEEE Transactions on Nuclear Science, 65(5), p.1140 (2018).
[2] J. Li, V. Choutko, and L. Xiao, Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment, 885, p.98 (2018).
[3] J. R. Azambuja, G. Nazar, P. Rech, L. Carro, F. L. Kastensmidt, T. Fairbanks, and H. Quinn, IEEE Transactions on Nuclear Science, 60(6), p.4243 (2013).
[4] P. K. Samudrala, J. R. Ramos, and S. Katkoori, IEEE Transactions on Nuclear Science, 51(5), p.2957 (2004).
[5] I. Herrera-Alzu and M. López-Vallejo, IEEE Transactions on Nuclear Science, 60(1), p.376 (2013).
[6] G.-H. Asadi and M. B. Tahoori, 23rd IEEE VLSI Test Symposium (VTS'05), p.207 (2005).
[7] U. Legat, A. Biasizzo, and F. Novak, IEEE Transactions on Nuclear Science, 59(5), p.2562 (2012).
[8] J. Tarrillo, J. Tonfat, L. Tambara, F. L. Kastensmidt, and R. Reis, 16th Latin-American Test Symposium (LATS), p.1 (2015).
[9] R. Zhang, L. Xiao, and J. Li, IEEE 12th International Conference on ASIC (ASICON), p.359 (2017).
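As a supplementary illustration of the two-mode scrubbing address controller of Section 2 (figure 3), the following is a minimal Python sketch that counts how many frames are scrubbed before a faulty frame is rewritten, with and without error mode. It is a toy model under loud assumptions, not the authors' implementation: the voter is assumed to flag the faulty module immediately after the upset, only a single critical upset is present, and the frame counts and module region are hypothetical values rather than those of the XC6VLX240T. The names frames_to_repair and average_frames are invented for this sketch, and it does not attempt to reproduce the numbers in table 1.

```python
import random


def frames_to_repair(n_frames, region, upset, pointer, error_mode):
    """Count frames scrubbed until the upset frame is rewritten.

    n_frames   -- total number of configuration frames (hypothetical)
    region     -- (first, last) frame indices of the faulty module
    upset      -- index of the frame holding the SEU, inside region
    pointer    -- frame the scrubber is at when the voter flags the error
    error_mode -- True: jump to the module region (proposed scrubbing)
                  False: keep walking all frames (normal scrubbing)
    """
    first, last = region
    if not (0 <= first <= upset <= last < n_frames):
        raise ValueError("upset must lie inside the module region")
    addr = pointer % n_frames
    in_error_mode = False
    scrubbed = 0
    while True:
        scrubbed += 1                    # scrub_current_frame(addr)
        if addr == upset:                # faulty frame rewritten
            return scrubbed
        if error_mode and not in_error_mode:
            in_error_mode = True         # voter has flagged the module
            addr = first                 # first frame of the faulty module
        elif in_error_mode:
            addr += 1                    # next frame of the faulty module
        else:
            addr = (addr + 1) % n_frames  # next frame in normal mode


def average_frames(n_frames, region, error_mode, trials=10000, seed=1):
    """Average frames_to_repair over random upset and pointer positions."""
    rng = random.Random(seed)
    first, last = region
    total = 0
    for _ in range(trials):
        upset = rng.randint(first, last)
        pointer = rng.randrange(n_frames)
        total += frames_to_repair(n_frames, region, upset, pointer, error_mode)
    return total / trials


if __name__ == "__main__":
    region = (100, 299)  # hypothetical frame range of one TMR module
    print("normal  :", average_frames(1000, region, error_mode=False))
    print("proposed:", average_frames(1000, region, error_mode=True))
```

In this toy model the benefit of error mode comes entirely from restricting the walk to one module's region, so the gain grows as that region becomes a smaller share of the whole configuration memory.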
