Single Event Effect Criticality Analysis

Sunil S Pillai

Department of Electronics & Communications
Government Engineering College, Sector-26, Gandhinagar

Effect of Radiation on Electronic Components / Circuits

Total Ionizing Dose (TID).

Single Event Effect (SEE).


Single Event Effects (SEEs)

Individual events which occur when a single incident ionizing particle deposits enough energy to cause an effect in a device.

Two types of SEEs:

Soft Errors

Hard Errors
Single Event Effects (SEEs)
Single Event Upset (SEU)

Multiple Bit Upset (MBU)

Single Hard Error (SHE)

Single Event Functional Interrupt (SEFI)

Single Event Latchup (SEL)

Single Event Burnout (SEB)

Single Event Gate Rupture (SEGR)


Single Event Upset (SEU)

A change of state or transient induced by an ionizing particle in a device.

These are "soft" bit errors in that a reset or rewriting of the device causes normal behavior thereafter.
Multiple Bit Upset (MBU)

An event induced by a single energetic particle that causes multiple upsets or transients during its path through a device or system.

MBUs are a problem for single-bit error detection and correction (EDAC).
Single Hard Error (SHE)

An SEU which causes a permanent change to the operation of a device.

An example is a permanent stuck bit in a memory device.
Single Event Functional Interrupt (SEFI)
A condition where the device stops normal functions, and usually requires a power reset to resume normal operations.

A special case of SEU changing an internal control signal.
Single Event Latchup (SEL)
A potentially destructive condition involving parasitic circuit elements.

The device current may exceed the device maximum specification and destroy the device if not current limited.
Single Event Burnout (SEB)

A highly localized burnout of the drain-source in power MOSFETs due to a high current state.

SEB susceptibility has been shown to decrease with increasing temperature.
Single Event Gate Rupture (SEGR)
The formation of a conducting path in the gate oxide resulting in burnout of a gate insulator in a power MOSFET.

Single Event Dielectric Rupture (SEDR) occurs in CMOS.
The Criticality Analysis
The Criticality Analysis examines the degree of contribution that each individual failure mode of a component has with respect to system safety.

Its results can be used to:

justify a development option,

establish the safety-related criteria for the selection of an appropriate component for the required functionality,

suggest suitable protective measures,

provide a basis for the final system certification.


Linear Energy Transfer (LET)
A measure of the energy transferred to the device per unit length as an ionizing particle travels through a material.

Unit: MeV·cm²/mg of material

LET threshold (LETth): minimum LET to cause an effect.

SEE-immune device: LETth > 100 MeV·cm²/mg
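
To give the LET unit a physical feel, the sketch below converts an LET value into charge deposited per micron of track and applies the immunity threshold quoted above. The silicon density (2.33 g/cm³) and the 3.6 eV per electron-hole pair figure are assumptions brought in for illustration, not values from the slides, and the function names are hypothetical.

```python
# Assumed material constants for silicon (not taken from the slides).
SI_DENSITY_MG_PER_CM3 = 2330.0   # 2.33 g/cm^3 expressed in mg/cm^3
EV_PER_PAIR = 3.6                # mean energy to create one electron-hole pair
ELECTRON_CHARGE_C = 1.602e-19    # coulombs

SEE_IMMUNE_LET_TH = 100.0        # MeV*cm^2/mg, threshold quoted in the text


def let_to_pc_per_um(let_mev_cm2_per_mg: float) -> float:
    """Convert LET (MeV*cm^2/mg) to deposited charge per micron of silicon (pC/um)."""
    mev_per_cm = let_mev_cm2_per_mg * SI_DENSITY_MG_PER_CM3
    ev_per_um = mev_per_cm * 1e6 / 1e4            # MeV -> eV, cm -> um
    pairs_per_um = ev_per_um / EV_PER_PAIR
    return pairs_per_um * ELECTRON_CHARGE_C * 1e12  # C -> pC


def is_see_immune(let_threshold: float) -> bool:
    """Apply the 'LETth > 100 MeV*cm^2/mg' immunity criterion."""
    return let_threshold > SEE_IMMUNE_LET_TH


print(f"{let_to_pc_per_um(1.0):.4f} pC/um per unit LET")
print(is_see_immune(120.0), is_see_immune(37.0))
```

Under these assumptions, one LET unit corresponds to roughly 0.01 pC per micron of silicon, which is the figure to compare against a cell's critical charge.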


Functional Analysis and Criticality

SEE requirements depend on the functions devices perform.
Memories will exhibit different conditions than power converters, so the function the device performs is critical to the analysis.

SEEs propagate through the design and impact other areas.

These two conditions make each single event problem different in terms of failure mode and effect.
Systems Engineering Process
The first box represents the input requirements for the system being considered.

With the known performance requirements, one then identifies the required functions to achieve performance, termed "functional analysis".

Potential mechanisms to fulfill the functions, or design options, are explored and evaluated.

A decision is made, leading to the system description.


A Case Study:
Far Ultraviolet Spectroscopic Explorer mission

Function 4, mission operations:

Contingency operations, Deployment & Initialization, Target acquisition & Tracking, Science Data Acquisition, Science Data Processing
A Case Study:
Far Ultraviolet Spectroscopic Explorer mission

Function 4.5, target acquisition & tracking:

Inertial attitude determination, Sun acquisition, Target selection, Inertial attitude processing, Slew specification, Instrument alignment, Relative attitude processing, Sensor configuration
Functional Criticality Classes
Three criticality groups for Single Event Upset:

Error-functional,

Error-vulnerable,

Error-critical
Ionizing Radiation Environment Concerns
1) What is the "normal" radiation environment under which the system must operate?

2) What is the "worst case" radiation environment that the mission will encounter?
Ionizing Radiation Environment Sources
The main sources of energetic particles that are of concern to spacecraft designers are:

Protons and electrons trapped in the Van Allen belts,

Heavy ions trapped in the magnetosphere,

Cosmic ray protons and heavy ions, and

Protons and heavy ions from solar flares.


Protons of “inner” Van Allen belt

In the equatorial plane, the high energy protons (E > 30 MeV) extend only to about 2.4 Earth radii.

Energy range: keV to hundreds of MeV

Intensity: 1 to 1 × 10⁵ protons/cm²/s

NASA AP8 model


Cosmic Ray Protons and Heavy Ions
Originate outside the solar system.

Ions of all elements from atomic number 1 through 92.

Highly energetic: 10s of MeV/n to 100s of GeV/n.

They are difficult to shield against.


Solar Cycle & Solar Flares
The solar cycle: 11 years (9 to 13 years)

Solar Minimum: 4 years

• Inactive phase
• Few flare events

Solar Maximum: 7 years

• Active phase
• Large number of solar flares

JPL92 model
Single Event Upset at Ground Level
Errors in RAM chips due to upsets caused by alpha particles.

The particles were released by U and Th contaminants within the chip packaging material.
Alpha-induced upsets
These occur in DRAMs when a "page miss" (a change in the row address) causes 4K bits of data to move from the DRAM cells to a small on-chip SRAM page.

Error rate is proportional to the rate of page misses (plus refreshes).
An IBM Initiative
IBM began a 15-year effort in 1979 to understand ground level upsets.

Results compiled in the IBM Journal of Research and Development (Special Edition, 1994).
A Case Study:
ACPMAPS - FERMILAB
ACPMAPS: a system of individual computers far from the very high energy Fermilab accelerators.

Contains 156 Gbits of 4 Mbit fast page-mode DRAM.

In production it consistently experiences single bit errors on an almost daily basis.
The ACPMAPS Problem
Upset rate: 2.5 upsets/day, or 7E-13 upsets/bit·hr.

5-10 times larger than non-accelerated failure test results

500 times larger than accelerated failure tests

Independent of the rate of page misses

No effect of critical charge, Qc
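
The per-bit rate quoted above follows directly from the daily upset count and the memory size. A minimal check of that arithmetic, assuming decimal gigabits (1 Gbit = 10⁹ bits); the function name is hypothetical:

```python
GBIT = 1e9  # assumption: decimal gigabits


def upsets_per_bit_hour(upsets_per_day: float, memory_bits: float) -> float:
    """Normalize a system-level daily upset count to a per-bit, per-hour rate."""
    return upsets_per_day / 24.0 / memory_bits


rate = upsets_per_bit_hour(2.5, 156 * GBIT)
print(f"{rate:.1e} upsets/bit-hr")
```

With 2.5 upsets/day spread over 156 Gbits, the result is about 6.7E-13 upsets/bit·hr, which rounds to the 7E-13 figure on the slide.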


ACPMAPS Conclusion

GROUND LEVEL NEUTRON FLUX

It has been suggested that it is the thermal neutron portion (E ~ 0.025 eV) of the atmospheric neutron spectrum, rather than the high energy portion (E > 10 MeV), which is mainly responsible for the upsets.
Thermal Neutrons Mechanism
Boron-10 (B10) content of the boron dopants in microelectronics

0.8 MeV Li ions are produced by thermal neutron interactions with B10.

They recoil together with a 1.5 MeV alpha particle, releasing energy that causes upsets.
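
Written as a reaction, the capture step described above looks as follows. The energies are the approximate values quoted on the slide, and the notation is only a sketch of the dominant products:

```latex
{}^{10}\mathrm{B} + n_{\mathrm{thermal}}
  \;\longrightarrow\;
  {}^{7}\mathrm{Li}\;(\approx 0.8\ \mathrm{MeV}) + \alpha\;(\approx 1.5\ \mathrm{MeV})
```

Both products are charged particles created inside the device itself, which is why even a very low energy neutron can lead to an upset.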
SEE Mitigation
Additional Hardware
• Cost, Power, Volume, Performance, Availability

Software
• Design Complexity

Most effective and efficient option: Combination of Both
Classification of System Level SEEs by Device Type

Memory or Data-related devices

Control-related devices
Mitigation of Memories and Data-Related Devices
Scrubbing

EDAC Method             EDAC Capability
Parity                  Single bit error detect
CRC Code                Detects if any errors occurred in a given data structure
Hamming Code            Single bit correct, double bit detect
RS Code                 Corrects consecutive and multiple bytes in error
Convolutional encoding  Corrects isolated burst noise in a communication stream
Overlying protocol      Specific to each system implementation
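
The "single bit correct, double bit detect" capability listed for the Hamming code can be illustrated with a small sketch. This is a textbook Hamming(7,4) code extended with one overall parity bit, chosen here for brevity; it is an assumption for illustration, not the code used by any particular flight system, and the function names are hypothetical.

```python
from typing import List, Optional, Tuple


def encode(nibble: int) -> List[int]:
    """Encode 4 data bits into an 8-bit SEC-DED codeword (list of 0/1)."""
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8                      # index 0: overall parity, 1..7: Hamming positions
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]   # parity over positions 1,3,5,7
    code[2] = code[3] ^ code[6] ^ code[7]   # parity over positions 2,3,6,7
    code[4] = code[5] ^ code[6] ^ code[7]   # parity over positions 4,5,6,7
    code[0] = sum(code[1:]) % 2             # overall parity for double-error detection
    return code


def decode(code: List[int]) -> Tuple[Optional[int], str]:
    """Return (nibble, status); status is 'ok', 'corrected' or 'double-error'."""
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos                 # XOR of set positions locates a single error
    overall_ok = sum(code) % 2 == 0
    if syndrome == 0 and overall_ok:
        status = "ok"
    elif not overall_ok:
        code[syndrome] ^= 1                 # syndrome 0 here means the overall parity bit flipped
        status = "corrected"
    else:
        return None, "double-error"         # even number of flips with a nonzero syndrome
    d = [code[3], code[5], code[6], code[7]]
    return sum(bit << i for i, bit in enumerate(d)), status


word = encode(0b1011)
word[6] ^= 1                                # simulate a single event upset
print(decode(word))
word[2] ^= 1                                # a second upset in the same word
print(decode(word))
```

A scrubbing routine would periodically run the decode step over memory and write corrected words back, so that single upsets are removed before a second one accumulates in the same word.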
Mitigation of Control-related Devices
Health and Safety (H&S) subroutines

Watchdog timers

Redundancy

Lockstep

Voting
Health and Safety (H&S) subroutines
Perform Memory Scrubbing

Error Detected

Place System into SAFE MODE

Activate Redundant Device
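
The sequence above (scrub, detect, safe mode, switch to the redundant device) can be sketched as a single H&S pass. The CRC-32 check, the `state` dictionary, and all names below are assumptions made for illustration; a real system would use its own checksums and mode logic.

```python
import zlib


def scrub(blocks, checksums):
    """Return indices of memory blocks whose CRC-32 no longer matches the stored value."""
    return [i for i, (blk, crc) in enumerate(zip(blocks, checksums))
            if zlib.crc32(blk) != crc]


def health_and_safety_pass(blocks, checksums, state):
    """One H&S cycle: scrub memory and, on any error, enter safe mode on the spare."""
    bad = scrub(blocks, checksums)
    if bad:
        state["mode"] = "SAFE"              # place system into safe mode
        state["active_unit"] = "redundant"  # activate redundant device
    return bad


memory = [bytearray(b"attitude table"), bytearray(b"command queue")]
stored = [zlib.crc32(blk) for blk in memory]
status = {"mode": "NORMAL", "active_unit": "prime"}

memory[1][3] ^= 0x04                        # simulate an upset in the second block
print(health_and_safety_pass(memory, stored, status), status)
```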


Watchdog timers
Active                      Passive
Sends "I'm okay"            Receives confirmations
"time out"                  "time out"
Recovery action initiated   Recovery action initiated
Multiple watchdog timers    Power considerations
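
A minimal sketch of the active case, in which the monitored device must send its "I'm okay" message before the time-out expires. The class and method names are hypothetical, and time is passed in explicitly so the behavior is easy to follow; real hardware would use a free-running counter.

```python
class Watchdog:
    """Active-style watchdog: the monitored task must call kick() before the time-out."""

    def __init__(self, timeout_s, on_timeout, now=0.0):
        self.timeout_s = timeout_s
        self.on_timeout = on_timeout    # recovery action, e.g. a reset request
        self.last_kick = now
        self.tripped = False

    def kick(self, now):
        """The "I'm okay" message from the monitored device."""
        self.last_kick = now

    def poll(self, now):
        """Call periodically; triggers the recovery action once on a time-out."""
        if not self.tripped and now - self.last_kick > self.timeout_s:
            self.tripped = True
            self.on_timeout()
        return self.tripped


log = []
wd = Watchdog(timeout_s=1.0, on_timeout=lambda: log.append("recovery action"))
wd.kick(now=0.5)
print(wd.poll(now=1.2))   # still within the time-out window
print(wd.poll(now=2.0))   # no kick since t=0.5, so the watchdog trips
print(log)
```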


Redundancy
Recovery from Hard Errors

Autonomous or ground-controlled switching from a prime system to a redundant spare

The MIL-STD-1773 fiber optic data bus is a fully redundant bus with an A side and a B side
Lockstep System
Operating two identical circuits with synchronized clocking

Error detection occurs if the processor outputs do not agree, implying that a potential SEU has occurred

Clock skew & "false" triggers
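
The comparison at the heart of a lockstep pair can be sketched in software as below. The two "units" are stand-in functions and every name is hypothetical; in hardware the comparison happens cycle by cycle, which is where clock skew can produce false triggers.

```python
class LockstepMismatch(Exception):
    """Raised when the two lockstepped units disagree."""


def lockstep(unit_a, unit_b, value):
    """Run both units on the same input and compare their outputs."""
    out_a, out_b = unit_a(value), unit_b(value)
    if out_a != out_b:
        raise LockstepMismatch(f"outputs differ: {out_a:#x} vs {out_b:#x}")
    return out_a


def healthy(x):
    """Hypothetical processing step."""
    return (x * 3 + 1) & 0xFF


def upset(x):
    """Same step with one output bit flipped, mimicking an SEU."""
    return healthy(x) ^ 0x10


print(lockstep(healthy, healthy, 7))
try:
    lockstep(healthy, upset, 7)
except LockstepMismatch as err:
    print("potential SEU:", err)
```

Note that a mismatch only detects the error; it does not say which unit is wrong, which is the gap that voting closes.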


Voting
Three identical circuits

The output is the value that at least two agree upon

Triple modular redundancy (TMR) voting scheme for FPGAs

Three voting flip-flops per logical flip-flop.
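
The voter itself reduces to a bitwise majority function over the three copies. A minimal sketch, with a hypothetical function name and register values:

```python
def majority(a: int, b: int, c: int) -> int:
    """Bitwise two-out-of-three vote: each output bit follows at least two inputs."""
    return (a & b) | (a & c) | (b & c)


stored = 0b10110010
copy_a = stored
copy_b = stored ^ 0b00001000    # one copy suffers a single bit upset
copy_c = stored

voted = majority(copy_a, copy_b, copy_c)
print(f"{voted:08b}", voted == stored)
```

Because the vote is taken per bit, upsets in different bit positions of different copies are also masked; the scheme fails only when two copies are wrong in the same bit.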
SEECA … Indispensable

Increased ventures into SEE-inducing radiation environments.

Increased functionality of space and nuclear systems.
SEECA … Indispensable

Increased scaling down = increased SEE vulnerability

Increased ground impact: the DRAM problem
Need to do a Single Event Effect Criticality Analysis
Thank you
Queries?
