
IEEE TRANSACTIONS ON RELIABILITY, VOL. 41, NO. 1, 1992 MARCH

A Survey of Reliability-Prediction Procedures For Microelectronic Devices

John B. Bowles, Senior Member IEEE
University of South Carolina, Columbia

Key Words - Reliability prediction, Microelectronic device, Reliability model

Reader Aids -
Purpose: Survey & tutorial
Special math needed for derivation: Elementary probability
Special math needed for results: None
Results useful to: Reliability engineers and analysts

Summary & Conclusions - This article reviews six current reliability prediction procedures for micro-electronic devices. The device models are described and the parameters and parameter values used to calculate device failure rates are examined. The procedures are illustrated by using them to calculate the predicted failure rate for a 64K DRAM; the resulting failure rates are compared under a variety of assumptions. The models used in the procedures are similar in form, but they give very different predicted failure rates under similar operating and environmental conditions, and they show different sensitivities to changes in conditions affecting the failure rates.

1. INTRODUCTION

Reliability predictions are used in several important activities, eg,
• feasibility evaluations
• comparing competing designs
• identifying potential reliability problems
• planning maintenance and logistic support strategies
• input to other studies such as life-cycle cost analysis or product selection.

In many instances, reliability requirements drive product design. In all of these activities, accurate reliability predictions are important since they can appreciably affect the conclusions.

This article examines the data & methodology for predicting electronic component reliability in six widely used reliability prediction procedures [1 - 6]:
• Mil-Hdbk-217E, Reliability Prediction of Electronic Equipment (Mil-Hdbk-217),
• Bellcore Reliability Prediction Procedure For Electronic Equipment (Bellcore RPP),
• Nippon Telegraph and Telephone Corporation Standard Reliability Table for Semiconductor Devices (NTT Procedure),
• British Telecom Handbook of Reliability Data for Components Used in Telecommunications Systems (British Telecom HRD4),
• French National Center for Telecommunications Studies Recueil De Donnees De Fiabilite Du CNET (CNET Procedure),
• Siemens Reliability and Quality Specification Failure Rates of Components (Siemens Procedure).

Although the procedures cover a wide range of electronic components, the focus here is on the failure rate models used for microelectronic devices. Reliability predictions for microelectronic devices are especially important since they are key elements in improving reliability by increasing the level of component integration.

Section 2 describes the failure rate models used for microelectronic devices in each procedure and illustrates their application in calculating device failure rates. Sections 3 - 8 examine the parameters used in the device failure rate models and their effects on the reliability calculations. Section 9 discusses extensions to the device models to accommodate infant failures. Section 10 discusses the material and provides perspective. A 64K Dynamic Random Access Memory (DRAM) is used to:
• illustrate the reliability calculations
• provide insight into the effects of various parameters on the predicted failure rate.

All failure rates are expressed as failures per $10^9$ hours of operation (Fit).

2. DEVICE MODELS

The constant failure-rate reliability model is used by all of the reliability prediction procedures presented here, wherein the reliability at time $t$ is:

$R(t) = \exp(-\lambda t)$.

The failure rate, $\lambda$, is given by the device model which, in turn, is a function of parameters that describe:
• its physical and operating characteristics,
• the environment in which it operates.

With the constant failure-rate model, the failure rate of any higher level assembly (assuming a series logic diagram) is the sum of the failure rates of its constituent components, but it does not include interconnects, stress interactions, solder joints, etc.

Table 1 shows the models for microelectronic integrated circuits (ICs) in each of the reliability prediction procedures. The Mil-Hdbk-217 and CNET procedures provide models for both a parts-count analysis and for a stress analysis. The parts-count model (called simplified in CNET) assumes typical operating parameters for a component and is used when most
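The constant failure-rate model of section 2 can be illustrated with a minimal sketch. It assumes only what the text states: $R(t) = \exp(-\lambda t)$, failure rates quoted in Fit (failures per $10^9$ hours), and a series assembly whose failure rate is the sum of its component rates. The function names and the eight-device assembly are hypothetical, chosen for illustration; the 216 Fit figure is the Mil-Hdbk-217 stress-model prediction for the 64K DRAM reported in table 2.

```python
import math

FIT_PER_HOUR = 1e-9  # 1 Fit = 1 failure per 10^9 device-hours


def series_failure_rate_fit(component_rates_fit):
    """Failure rate of a series assembly: the sum of its component rates (Fit)."""
    return sum(component_rates_fit)


def reliability(rate_fit, hours):
    """R(t) = exp(-lambda * t) for a constant failure rate given in Fit."""
    return math.exp(-rate_fit * FIT_PER_HOUR * hours)


# Hypothetical assembly: eight 64K DRAMs, each at 216 Fit.
board_rate_fit = series_failure_rate_fit([216.0] * 8)
one_year_hours = 8760
print(board_rate_fit, reliability(board_rate_fit, one_year_hours))
```

As the text notes, such a sum covers the listed components only; interconnects, solder joints, and stress interactions are not included.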

0018-9529/92$03.00 © 1992 IEEE

Authorized licensed use limited to: Universitaetsbibliothek der RWTH Aachen. Downloaded on November 04,2022 at 08:53:13 UTC from IEEE Xplore. Restrictions apply.
BOWLES: A SURVEY OF RELIABILITY-PREDICTION PROCEDURES FOR MICROELECTRONIC DEVICES

of its operating parameters are not known. The stress model requires a detailed analysis of all of the parameters on which the component failure-rate depends.

Table 1
Formulas for Calculating Failure Rates of Microelectronic Devices

Procedure                         Microelectronic Device Model
MIL-HDBK-217 (stress model)       $\lambda = \pi_Q (C_1 \pi_T \pi_V + C_2 \pi_E) \pi_L$
MIL-HDBK-217 (parts count)        $\lambda = \lambda_G \pi_Q \pi_L$
Bellcore RPP                      $\lambda = \lambda_G \pi_Q \pi_S \pi_T$
British Telecom HRD4              $\lambda = \lambda_b \pi_T \pi_Q \pi_E$
NTT Procedure                     $\lambda = \lambda_b \pi_Q (\pi_E + \pi_T \pi_V)$
CNET Procedure (stress model)     $\lambda = (C_1 \pi_\tau \pi_t \pi_V + C_2 \pi_B \pi_\sigma \pi_E) \pi_L \pi_Q$
CNET Procedure (simplified)       $\lambda = \pi_Q \lambda_A$
Siemens Procedure                 $\lambda = \lambda_b \pi_U \pi_T$

The nomenclature in table 1 is the same as that used in the corresponding procedures except that in CNET, $\pi_\tau$ is used instead of $\pi_T$ for the technology-function factor, and $\pi_\sigma$ is used instead of $\pi_S$ for the package-pin-count factor. $\pi_T$ & $\pi_S$ have different meanings in the other procedures. The factors for failure rate calculations are:
• $\pi_Q$ = quality factor. It depends on the quality of the device as determined by inspection & test after the product has been manufactured.
• $C_1$ & $C_2$ = failure rate constants. They depend on the device complexity. $C_1$ depends on the circuit complexity & technology. $C_2$ depends on the packaging type and package pin count. $\pi_\tau$, $\pi_B$, $\pi_\sigma$ in CNET also depend on the circuit technology and function, the package technology, and the package pin count respectively.
• $\pi_T$, $\pi_t$ = temperature acceleration factors. They depend only on the steady-state operating temperature of the device. There are no temperature cycle or temperature gradient terms in any of the models.
• $\pi_V$, $\pi_U$, $\pi_S$ = voltage-stress factors. They depend on the ratio of the applied voltage to the rated voltage of the component. In the proposed Mil-Hdbk-217F, these factors have been removed.
• $\pi_E$ = environmental factor. It depends on the environment in which the device is operating. However, the term does not directly address vibration, shock, or temperature.
• $\pi_L$ = device or process learning factor. It depends on how long the device has been in production.
• $\lambda_b$ = base failure rate. It depends on the device complexity and technology. $\lambda_G$ and $\lambda_A$ are generic or average failure rates, assuming average operating conditions.

Example Failure Rate Calculations

In order to give some insight into the manipulation of the procedures, a failure rate for a 64K DRAM was calculated.

Assumptions
• ambient temperature of 40°C
• ground-benign environment. This is generally described as an ideal environment having controlled temperature and humidity with nearly zero environmental stress.
• the devices are encapsulated in ceramic (hermetic), Dual-In-line Packages (DIPs) with 16 pins
• the devices have passed the infant mortality period
• the devices were procured to good specifications with proper qualification programs and manufacturing controls.
• power dissipation is 250 mW - typical for this size component. This is needed in order to estimate the junction temperature and temperature factor.

The resulting failure rates are summarized in table 2.

Table 2
Predicted Failure-Rates for a 64K DRAM

Procedure                         $\lambda$
MIL-HDBK-217 (stress model)       216 Fit
MIL-HDBK-217 (parts count)        219 Fit
Bellcore RPP                      140 Fit
NTT Procedure                     138 Fit
CNET Procedure (stress model)     631 Fit
CNET Procedure (simplified)       1950 Fit
British Telecom Procedure         8 Fit
Siemens Procedure                 96 Fit

The apparent agreement between the Mil-Hdbk-217 stress and part-count models in table 2 is largely coincidental. The assumptions of an ambient temperature of 40°C and power dissipation of 250 mW lead to a device junction temperature of 47.5 °C in the stress model. The part-count model assumes an ambient temperature of 30°C and a junction temperature of 45 °C for microelectronic devices. An ambient temperature of 30 °C in the stress model would have resulted in a device junction temperature of 37.5 °C and a failure rate of 131 Fit. The very low failure rate for the British Telecom procedure is due to the choice of the base year (see section 3).

3. BASE FAILURE RATE

Since the early 1970s, failure rates for micro-electronic devices have fallen approximately 50% every 3 years [7]. However, the handbook models are typically updated on the average every 6 years, and typically not in most of the device categories. Most of the expressions in table 1 consist of a base failure rate, modified by several pi-factors. The Mil-Hdbk-217 & CNET stress-models consist of two separate failure rates,


$C_1$ & $C_2$, for the part technology and its packaging respectively. Each of these is modified by appropriate pi-factors.

In Mil-Hdbk-217, tables for use with a parts-count analysis give values of the generic failure rate, $\lambda_G$, for various microelectronic devices. The values depend on the environment in which the device is used and are based on the stress-model, assuming nominal operating conditions and temperature for that environment. The CNET procedure also provides generic failure rates, $\lambda_A$, for use with the simplified device model. Failure rates for junction temperatures of 40 °C & 70 °C are given for each operating environment. The Bellcore RPP and Siemens Procedure include tables of generic failure rates for many devices of varying size, type, technology, and complexity.

Table 3
British Telecom HRD4 Formulas for Calculating the Base Failure-Rate of Microelectronic Devices

Component Category                           Base Failure Rate
Bipolar digital logic                        $\lambda_b = \frac{94}{t} G^{5/t}$
Bipolar and MOS linear                       $\lambda_b = \frac{12}{t} N^{5/t}$
MOS digital logic                            $\lambda_b = \frac{1500}{t^{1.5}} G^{5/t}$
MOS DRAMs, EPROMs, EEPROMs and CCDs          $\lambda_b = \frac{22}{t} B^{5/t}$
MOS SRAMs                                    $\lambda_b = \frac{43}{t} B^{5/t}$
Bipolar SRAMs and fusible link PROMs         $\lambda_b = \frac{94}{t} B^{5/t}$

G = number of gates; N = number of transistors; B = number of bits.

In British Telecom HRD4, the base failure-rate, $\lambda_b$, includes a time factor which reflects the steady improvement of reliability in the industry state of the art. Expressions for the base failure-rate for various microelectronic devices are shown in table 3. Device complexity is measured by the number of gates for logic devices, the number of transistors for linear devices, and the number of bits for memories. The time, $t$, is defined as the year of manufacture, taking the "year of technology maturity" as the base year. No guidelines for selecting the base year are given in the procedure documentation. However, a table of base failure-rates which uses 1965 as the base year and 1986 as the year of manufacture for a wide variety of components is given. In 1990, with a base year of 1965, the British Telecom HRD4 gives a base failure-rate for a 64K DRAM of

$\lambda_b = (22/t) B^{5/t} = (22/25)(65536)^{5/25} = 8.1$ Fit.

Although the selection of the base year is somewhat arbitrary, it appreciably affects the calculated failure rate due to the exponential time-factor. FRATE, an automated version of the British Telecom procedure, uses a base year of 1972 for MOS DRAMs [8]. This leads to a base failure rate of 26.6 Fit instead of 8.1 Fit.

Expressions for calculating the base failure-rate, $\lambda_b$, in the NTT procedure are given for a wide variety of monolithic ICs having various technologies. These are of the form $kX^{0.25}$ or $kX^{0.50}$, where $k$ is a constant that depends on the device type and technology, and $X$ is a measure of the device complexity - typically, the number of bits for digital devices, the number of active elements for digital logic, or the number of cross points for Programmable Logic Arrays. The exponent 0.25 is used for smaller devices (for example, a RAM having less than 16K bits) and 0.50 is used for larger devices. For a 64K DRAM, the base failure-rate, with $B$ as the number of bits, is:

$\lambda_b = 0.337 B^{0.50} = 0.337 (65536)^{0.50} = 86.3$ Fit.

In the Mil-Hdbk-217 stress-model, $C_1$ is the circuit complexity failure-rate factor which is based on the circuit technology, and $C_2$ is the package complexity failure-rate factor. Circuit complexity is measured by the number of bits for digital devices, transistors for linear devices, and gates for logic devices. $C_1$ is tabulated in the procedure for various device types and is generally proportional to the square root of the complexity. For small DRAMs (less than 16K bits), $C_1 = 25$; for DRAMs of 256K to 1M bits, $C_1 = 200$.

Table 4
MIL-HDBK-217 Package-Complexity Factor, $C_2$

Package Type                                                        Complexity
Hermetic DIPs with solder or weld seals; Leadless Chip Carriers     $C_2 = 0.28 (N_p)^{1.08}$
Hermetic DIPs with glass seals                                      $C_2 = 0.09 (N_p)^{1.51}$
Non-hermetic DIPs                                                   $C_2 = 0.2 (N_p)^{1.23}$
Hermetic Flatpacks                                                  $C_2 = 0.03 (N_p)^{1.82}$
Hermetic Cans                                                       $C_2 = 0.03 (N_p)^{2.01}$

$N_p$ = number of functional pins.

$C_2$ depends on the number of pins and the package type. $C_2$ is calculated according to the expressions in table 4, and values are tabulated in the procedure by package type and number of pins. For the 64K DRAM, with 16 pins, ceramic encapsulation and solder seals, $C_1 = 50$ Fit, and $C_2 = 5.6$ Fit.

The CNET technology complexity factor, $C_1$, also depends on the number of gates, bits, or transistors in the IC. For memories, $C_1$ is:

$C_1 = 5 B^{0.4}$ for memories of less than 100 bits

$C_1 = 12.5 B^{0.2}$ for ROM & EPROM of less than 100 bits
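The base failure-rate and complexity arithmetic of section 3 can be checked with a short sketch. It assumes the expressions as quoted in the text: the HRD4 form $(22/t)B^{5/t}$ for MOS DRAMs with $t$ the number of years since the base year, the NTT form $kX^{0.50}$ with $k = 0.337$, the Mil-Hdbk-217 package factor $0.28 (N_p)^{1.08}$ for hermetic DIPs with solder seals, and the CNET expression $100 (B/1000)^{0.5}$ for large memories. The function names are hypothetical and all results are in Fit.

```python
def hrd4_dram_base_rate(bits, year_of_manufacture, base_year):
    """HRD4 base failure rate for MOS DRAMs: (22/t) * B**(5/t)."""
    t = year_of_manufacture - base_year  # years since technology maturity
    return (22.0 / t) * bits ** (5.0 / t)


def ntt_base_rate(k, complexity, exponent):
    """NTT base failure rate of the form k * X**exponent."""
    return k * complexity ** exponent


def mil217_c2_hermetic_dip_solder(pins):
    """Mil-Hdbk-217 package factor for hermetic DIPs with solder seals."""
    return 0.28 * pins ** 1.08


def cnet_c1_large_memory(bits):
    """CNET technology complexity factor for the larger memories."""
    return 100.0 * (bits / 1000.0) ** 0.5


BITS_64K = 65536
print(round(hrd4_dram_base_rate(BITS_64K, 1990, 1965), 1))  # base year 1965
print(round(hrd4_dram_base_rate(BITS_64K, 1990, 1972), 1))  # base year 1972
print(round(ntt_base_rate(0.337, BITS_64K, 0.50), 1))
print(round(mil217_c2_hermetic_dip_solder(16), 1))
print(round(cnet_c1_large_memory(BITS_64K)))
```

Moving the HRD4 base year from 1965 to 1972 changes both the prefactor $22/t$ and the exponent $5/t$, which is why the result more than triples.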


$C_1 = 100 (B/1000)^{0.5}$ for memories of more than 100 bits.

$B$ is the number of bits in the device.

Similar expressions are given for logic, interface, and linear devices. $C_1$ for the 64K DRAM is $100 (65536/1000)^{0.50} = 810$ Fit. $C_1$ is adjusted by the technology-function factor, $\pi_\tau$, whose value depends on the device technology and type. As shown in table 5, $\pi_\tau$ is considerably higher for CMOS than for other technologies and it is lower for memories than for other device types.

Table 5
CNET Technology-Function Factor, $\pi_\tau$
[For selected types of microelectronic devices]

Integrated                     Technology
Circuit Type      Bipolar    PMOS    NMOS    CMOS
DRAM              1.1        0.7     0.7     3.5
ROM               0.5        0.6     0.6     1.7
EPROM             0.5        0.7     0.7     1.7
Interface         3          4       4       20
Logic             3          4       4       20
Linear            3          4       4       20

$C_2$ is the package complexity failure-rate in the CNET procedure. For memories, linear, and logic devices, it is: $C_2 = 7 X^{0.2}$, where $X$ is the number of bits, transistors, or gates respectively. For analog devices with $N$ transistors a slightly different expression, C2 = 3@.’, is used.

$C_2$ is modified by the package-technology factor, $\pi_B$, and by the package pin count factor, $\pi_\sigma$. $\pi_B$ depends upon the type of packaging, the device technology and the environment. It is 1 for hermetic packages in all cases. For Bipolar, NMOS, PMOS circuits in non-hermetic packages $\pi_B$ is 1 in a ground benign environment and 3 in other environments. For CMOS $\pi_B$ is 3 for a ground benign environment and 6 for other environments. Values of $\pi_\sigma$ depend on only the number of pins in the package and range from 1 for packages having less than 24 pins up to 4 for those with more than 63 pins.

4. QUALITY

Component quality is generally determined by the controls and the inspections & tests used in the manufacturing process. Mil-M-38510 governs IC specifications and establishes “the general requirements for ... microcircuits and the quality and reliability requirements which must be met in the acquisition of microcircuits.” Mil-Std-883C defines the methods used to perform screening and qualification testing in Mil-M-38510. Major manufacturers and purchasers of electronic devices also have developed their own inspection, test, and qualification requirements.

Component manufacturers have also begun to apply in-line statistical process control techniques to reduce process variation and defects. These techniques are capable of giving much higher levels of component quality than tests and inspections, but, since each vendor’s technology is different, it is difficult to specify a quality level on the basis of what monitors were applied to the production process. Consequently, the reliability prediction methods do not incorporate any of these factors, and some customers have begun to rely on reliability audits and approved vendor lists to determine component quality. A good discussion of these changes is given in [7].

[Figure 1. Comparison of Component Quality Levels. Columns: 217, RPP, HRD4, NTT, CNET.]

The quality factor, $\pi_Q$, expresses the relationship of component quality to the failure rate. Each prediction procedure defines several levels of quality and assigns a value of $\pi_Q$ to each level. Figure 1 shows the quality levels and the corresponding values of $\pi_Q$ for each procedure. Although relative quality levels are clearly defined within each procedure, it is difficult to ascertain which quality levels in one procedure are equivalent to the quality levels in another since they are determined by different tests, inspections, and control documents in each procedure. Hence, the relationships between the quality levels shown in figure 1 should be viewed with extreme caution. The wide range of values (0.25 to 20) for $\pi_Q$ in Mil-Hdbk-217 makes quality an especially important component in that procedure’s reliability calculations. $\pi_Q$ has a much narrower range of values in the other procedures.

The Mil-Hdbk-217 class S and S-1 quality levels are intended for devices having very high reliability requirements such as when a failure could result in the loss of a life or an expensive satellite. These devices must be procured in accordance with Mil-M-38510, Class S requirements which require 100%


inspection or testing of devices for most tests, and an entire lot is rejected if more than a stated fraction, typically 5% or 10%, of the devices fail.

Mil-M-38510 requirements for class B are less stringent than class S requirements. For example, devices in class S require 240 hours of burn-in at 125 °C whereas those in class B require only 160 hours of burn-in. Classes B-1 and B-2 allow further leeway in the test procedures.

Class D is used for hermetically sealed devices with normal reliability screening and the manufacturer’s own quality assurance practices. Class D-1 is used for non-hermetic, commercial parts. Some non-hermetic parts can be in class D if they have passed a specified burn-in test.

Bellcore RPP recognizes three quality levels and states the general requirements for each level. Level I is used for “commercial-grade parts that are procured and used without thorough device qualification or lot-to-lot controls by the equipment supplier.” Level II is assigned to devices which meet the level I requirements and for which the purchase specifications “explicitly identify important characteristics ... and acceptable quality levels for lot control.” In addition, qualification for device level II must include appropriate life and endurance tests and both the device and the vendor must be on approved parts/vendor lists. In addition to the levels I & II requirements, level III requires that devices be requalified periodically and that lot-to-lot controls include 100% screening with a specified Percent Defective Allowed (PDA). IC tests follow the procedures specified for Mil-Std-883, Class B.

The British Telecom procedure also specifies three levels of procurement quality. Level 1 components are manufactured and tested in accordance with generally accepted commercial practices; no tests, screening or inspections are specified beyond the manufacturer’s normal quality control practices. Level 2 components are procured in accordance with documentation that controls the design and manufacturing process, specifies complete electrical characteristics, and includes criteria for incoming inspection. Level 3 components are also subjected to 100% burn-in and 100% electrical screening tests that are designed to remove early life failures.

The NTT procedure classifies devices as “special high reliability devices for telecommunications use” and “high reliability devices for telecommunications use”, but the procedure documentation gives no guidelines for putting components into either category. Within each category components are further categorized as hermetically sealed or non-hermetically sealed. The quality factor for each category also varies with the technical maturity of the device (see section 8). The values shown in figure 1 are for devices with a “high degree” of technical maturity.

The CNET procedure identifies 7 quality classes and several subclasses. The classes Space, NFC 96.883, Agreement PTT, and CCQ (UTE/CECC) are used for components that are produced in accordance with the procedures, inspections, and tests specified by official agencies which also maintain lists of approved components that have been produced according to their requirements. For example, the European Space Agency and SCCG (an industry group) set requirements for the “space” category and define the subclasses B+, B, and C. The agency requirements generally include measurements of electrical characteristics, environmental tests, burn-in, inspections during manufacturing, allowable tolerances on parameter values, etc.

The classification CCQ (UTE/CECC) is based on a detailed examination of the production process using a system of tests and measurements developed by the UTE/CECC. “Homologation” is used as a temporary classification for components that are awaiting certification as CCQ and for components that are manufactured at times when all of the specified controls required for the CCQ classification cannot be applied.

The classification “qualification by client” is used for components that are manufactured according to the manufacturer’s own quality control procedures. After examining the production process, the procedure suggests that a $\pi_Q$ factor intermediate between “Homologation” and some other classification can be used in some cases. “Without qualification” is used when no information is available on what tests or controls were applied during the component’s production.

Although the Siemens procedure does not specify a quality factor, it states that the failure rates given are for component producers who apply appropriate quality assurance controls.

Table 6
Failure Rates (Fit) for a 64K DRAM
[For various quality levels]

Procedure                        B      D (H)    D (NH)    D-1 (NH)
MIL-HDBK-217 (stress)            22     216      361       722
MIL-HDBK-217 (parts count)       22     219      380       760
Bellcore RPP                     70     140      168       252
NTT Procedure                    92     138      410       410
CNET Procedure (stress)          189    1262     1670      2922
CNET Procedure (simplified)      585    3900     3900      6825
British Telecom HRD4             4      8        8         32
Siemens Procedure                96     96       149       149

The effect of the device quality on the predicted failure-rate for a 64K DRAM is shown in table 6. The Mil-Hdbk-217 nomenclature is used for the quality level and both hermetic (H) and non-hermetic (NH) parts are assumed for level D. The choice of non-hermetic components for levels D and D-1 affects the temperature factor (see section 6) as well as the quality factor.

5. ENVIRONMENT

The environmental factor, $\pi_E$, accounts for the effects of environmental stresses on the device reliability. Each of the reliability prediction procedures lists typical environments within their range of applicability and gives corresponding value for $\pi_E$. Mil-Hdbk-217 identifies 27 different environments but the proposed -217F reduces this to 14; Bellcore RPP has 4; British Telecom HRD4 has 3; the NTT Procedure has 4; and the CNET


Procedure has 11. Some of these environments are shown in table 7 along with the corresponding values of $\pi_E$. Where a component encounters different environments during different phases of its operation, the corresponding $\pi_E$ should be used to calculate the failure rate for each phase.

Table 7
Mil-Hdbk-217 Environmental Factor, $\pi_E$
[For selected environments]

Environment: 217, RPP, HRD4, NTT, CNET

Ground, Benign: .38, 1.0, 1.0, .3 (H) / .3 (NH), 1.0
Ground, Fixed: 2.5, 1.5, 1.5, .3 (H) / .6 (NH), 6.0
Ground, Mobile: 4.2, 5.0, 8.0, 1.8, 10.0
Space Flight: .9, 2.0
Portable: 3.8, 5.0, 1.8
Naval, Sheltered: 4.0, 1.8, 10.0
Naval, Unsheltered: 5.7, 14.0
Airborne, Inhabited, Cargo: 2.5, 9.0
Airborne, Uninhabited Cargo: 3.0, 10.0
Airborne, Inhabited, Attack: 4.0, 18.0
Missile, Launch: 13.0
Cannon, Launch: 220

The CNET Procedure specifies a range of parameter values that define each environment. These parameters include vibration, noise, dust, pressure, relative humidity, and shock. For example, ground-benign (favorable) has no vibration, noise of less than 70 dB, little dust, 1 atmosphere pressure, relative humidity of 20% to 70%, and 0 shock. The ground mobile environment has vibration of 2 to 300 hertz, acceleration of 0.5 to 5 g; noise of 40 to 100 dB, severe dust, pressure of 1 atmosphere; relative humidity of 20% to 100%, and shock of 15 to 50 g for 11 ms or 500 g for 0.5 ms. Mil-Hdbk-217 does not provide any detailed guideline.

The Siemens procedure does not include an environmental factor nor identify any operational environments. However, it states that components are assumed to operate in an environment having a mean ambient temperature of 40°C, adequate ventilation, appropriate levels of humidity, and little mechanical stress. If the anticipated environmental conditions are not met, the procedure says that “a multiple of the failure rate values must be expected”, but no multiplicative factors are included with the documentation.

The effect of the environment on the predicted failure-rate for a 64K DRAM is shown in table 8 for four environments. Differences between the Mil-Hdbk-217 parts-count and stress models are largely due to the fact that the parts-count model assumes higher device junction temperatures than does the stress model, in all except the ground benign environment. (Table 8 assumes a 40°C ambient temperature in all cases.) Mil-Hdbk-217 states that non-hermetic components should not be used except in ground benign and ground fixed environments. This is in conflict with the modern trend to use “plastic” components in both commercial and military avionics.

Table 8
Failure Rates (Fit) for a 64K DRAM
[For various operating environments]

Procedure                        GB      GF      GM      AIC
MIL-HDBK-217 (stress)            216     335     430     335
MIL-HDBK-217 (parts count)       219     529     767     1002
Bellcore RPP                     140     210     700     -
NTT Procedure                    138     138     333     -
CNET Procedure (stress)          631     1734    2506    2313
CNET (simplified)                1950    2250    2550    2450
British Telecom Procedure        8       12      65      -
Siemens Procedure                96      96      96      96

GB = Ground Benign; GF = Ground Fixed; GM = Ground Mobile; AIC = Airborne Inhabited Cargo.

6. TEMPERATURE

The Arrhenius model is generally used to describe the effect of steady state temperature on component failure-rates:

$k = k_0 \exp(-E_a / k_B T)$.

Notation
$T$      absolute temperature (K)
$E_a$    activation energy
$k_B$    Boltzmann’s constant ($8.617 \times 10^{-5}$ eV/K)
$k_0$    a constant

The Arrhenius equation has been used to describe the effect of steady state temperature on many of the physical and chemical processes such as ion drift and impurity diffusion that lead to component failures [9]. However, it is also used to describe mechanisms which are temperature independent at operating temperature (eg, intermetallic compound formation in Au-Al), and to describe mechanisms which are temperature cycle or gradient dependent.

Assuming that device failures occur when the concentration of the reactant corresponding to a particular failure mechanism reaches some critical value, the change in the time to failure from $t_1$ to $t_2$ due to a change in temperature from $T_1$ to $T_2$ is expressed by the temperature acceleration factor, $A_T$ [10]:

Various failure mechanisms can have different activation energies, and in some cases, an average value corresponding to several failure mechanisms might be used in the prediction process. $\pi_T$ is calculated from the reference temperature $T_1$


Table 9
Temperature-Acceleration Factors

Procedure Temperature acceleration factor

MIL-HDBK-2 17E nT = 0.1 exp[ -A( - I)&


HRD4 II, = 1, for q 5 70°C; 2.6 X IO4 exp [ -?]+
~ 1 . 8 ~ 1 0exp
'~ [ -'km]
~ for q > 70°C.

NTT Procedure IIT = exp[3480 (& - )] + exp[8120(&-;)] = 2.9x1O4exp[Y] + 8 . 0 ~ 1 e0 x~ p [ y ] .

1
Siemens Procedure + ( I - A ) exp[Eo2 . 11605 (q -
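The temperature-acceleration formulas tabulated above are all built from Arrhenius terms. A minimal sketch of the underlying relation follows, assuming the single-mechanism Arrhenius form $k = k_0 \exp(-E_a / k_B T)$ given in section 6; the acceleration between two absolute temperatures is taken here as the ratio of the two rates, $\exp[(E_a/k_B)(1/T_1 - 1/T_2)]$, which is a derived form and not one of the procedure formulas. The function names and the 0.5 eV activation energy are hypothetical, chosen only for illustration.

```python
import math

BOLTZMANN_EV_PER_K = 8.617e-5  # Boltzmann's constant in eV/K


def arrhenius_rate(k0, activation_energy_ev, temp_kelvin):
    """k = k0 * exp(-Ea / (kB * T)) for absolute temperature T."""
    return k0 * math.exp(-activation_energy_ev / (BOLTZMANN_EV_PER_K * temp_kelvin))


def acceleration_factor(activation_energy_ev, t1_kelvin, t2_kelvin):
    """Ratio k(T2) / k(T1): how much faster the process runs at T2 than at T1."""
    exponent = (activation_energy_ev / BOLTZMANN_EV_PER_K) * (
        1.0 / t1_kelvin - 1.0 / t2_kelvin
    )
    return math.exp(exponent)


# Hypothetical activation energy of 0.5 eV; junction temperatures of
# 37.5 C and 47.5 C, the two values discussed in section 2.
t1 = 37.5 + 273.15
t2 = 47.5 + 273.15
print(acceleration_factor(0.5, t1, t2))
```

Because the temperature enters through an exponential, a 10 °C change in junction temperature has a much larger effect at high activation energies than at low ones, which is one reason the procedures diverge when they assume different constants.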

and the activation energy. Reference temperatures in the range 25 °C to 70 °C are used by the various prediction procedures. Table 9 shows the temperature-acceleration factor formulas for the reliability predictions.

In table 9, $A$, $A_1$, $A_2$, $E_{a1}$, $E_{a2}$ are constants defined within the corresponding procedure and $T_j$, $T_{j1}$, $T_{j2}$ are the device pn-junction absolute temperatures. The constants depend on the device technology and on whether the component is in a hermetic or non-hermetic package. The British Telecom and NTT procedures use one temperature acceleration factor for all components; the other procedures provide different constants for each technology. Mil-Hdbk-217 identifies 25 types of IC technologies (eg, TTL, LSTL, PMOS, NMOS, CMOS, HCMOS, etc.) which are grouped into 7 categories. It gives a value of $A$ for hermetic and for non-hermetic components in each category. The CNET procedure identifies 4 types of technologies and also gives values of $A_1$, $A_2$ for hermetic and non-hermetic components in each category. The Siemens procedure uses the same values of $A$, $E_{a1}$, $E_{a2}$ for all microelectronic devices except EPROMs. Values of $\pi_T$ calculated using the equations in table 9 are given in the form of tables or graphs in the procedures. Figure 2 shows $\pi_T$ as a function of the device junction temperature for NMOS ICs.

[Figure 2. Temperature Acceleration Factors ($\pi_T$) vs Junction Temperature for NMOS ICs. Curves: MIL-HDBK-217E Non-Hermetic, CNET Non-Hermetic, Siemens, MIL-HDBK-217E Hermetic.]

The Bellcore RPP does not give a specific form for the temperature acceleration factor as a function of the device junction temperature. Instead, it provides a table giving $\pi_T$ as a function of the ambient temperature for several types of components.

All of the procedures, except the NTT, provide estimates of $\theta_{jc}$ based on limited component physical-characteristics, but not such critical material characteristics as the die-attach material and thickness, lead frame material and C2H material, thickness. Mil-Hdbk-217 has the most detailed procedure for estimating $\theta_{jc}$. Estimates of $\theta_{jc}$ are given for standard package types, based on their package letter designation. For example, $\theta_{jc}$ for a package with letter designation E which applies to a 16-pin DIP measuring 1/4 inch by 7/8 inch is 50°C/W. If the package letter designation is not known, $\theta_{jc}$ can be estimated from a generic description of the package type, die attachment, and number of pins. Typical estimates given by -217 are about 30 for packages with eutectic die attachments and 125 for those having glass or epoxy attachments; these extremes are not realistic. The CNET procedure estimate of $\theta_{jc}$ is based on the pin count, package size and package material. The British Telecom and Siemens procedures base their estimates of $\theta_{jc}$ on only the pin count and package material. Estimates of $\theta_{jc}$ for the 16-pin DIPs used for 64K


DRAMs range from 30 (Mil-Hdbk-217) to 80 (Siemens) for ceramic packages and from 30 (Mil-Hdbk-217) to 125 (Siemens) for plastic packages.

[Figure 3. Temperature Acceleration Factors (Π_T) vs Ambient Temperature for a 64K DRAM (16-pin DIP, power dissipation = 0.25 W). Panel A: logarithmic vertical axis. Panel B: linear vertical axis. Horizontal axis: ambient temperature (°C), 0 to 150. Curves: Mil-Hdbk-217E plastic, British Telecom plastic, British Telecom ceramic, Mil-Hdbk-217E ceramic.]

Figure 3 shows the temperature acceleration factor, Π_T, as a function of the ambient temperature for a 64K DRAM having 0.25 W power dissipation. Figure 3a has a logarithmic vertical axis as in figure 2. Figure 3b shows the same data plotted with a linear scale; that scale shows the threshold effect in all of the procedures whereby Π_T increases very rapidly for temperatures above a certain point. The range of values of Π_T in the prediction procedures strongly depends on the amount of power dissipated by the device. Observe in figure 3a that values of Π_T for all of the procedures lie within a range of about a factor of 10 for a given ambient temperature over most of the temperature range. For devices having a higher power dissipation, differences in estimates of θ_JC in the procedures cause much greater differences in the values of Π_T.

Table 10
Failure Rates (Fit) for a 64K DRAM
[For various operating temperatures]

(a) Hermetic packaging

Procedure                      20°C    40°C    60°C    80°C
Mil-Hdbk-217 (stress)            77     216     582    1495
Mil-Hdbk-217 (parts count)      219     219     219     219
Bellcore RPP                     45     140     378     910
NTT Procedure                    84     139     259     541
CNET Procedure (stress)         353     631    1156    2140
CNET (simplified)               790    1950       -       -
British Telecom Procedure         8       8      10      19
Siemens Procedure                48      96     208     504

(b) Non-hermetic packaging

Procedure                      20°C    40°C    60°C    80°C
Mil-Hdbk-217 (stress)            86     362    1523    5648
Mil-Hdbk-217 (parts count)      380     380     380     380
Bellcore RPP                     54     168     454    1092
NTT Procedure                   216     410     827    1844
CNET Procedure (stress)         446     835    1687    4088
CNET (simplified)               790    1950       -       -
British Telecom Procedure         8       8      16      33
Siemens Procedure                69     149     341     845

The effect of temperature on the predicted failure-rate for a 64K DRAM is shown in table 10. In these calculations, θ_JC was estimated according to the methods given in the procedures. For hermetic components a value of 50 was assumed for θ_JC in the NTT procedure and for non-hermetic components a value of 120 was assumed. Temperature is not a variable in the Mil-Hdbk-217 parts count model which assumes a specific, preselected temperature for each operating environment; hence the calculated failure-rate is independent of the actual temperature. In the British Telecom procedure the ground benign environment is restricted to temperatures of less than 55°C, but we have none-the-less extended the model to show the effect of higher temperatures on the failure rate. The same effect would be observed in other environments.

7. VOLTAGE STRESS

In the procedures, the stress factor, Π_V, is 1 for all IC technologies except CMOS. For CMOS, Mil-Hdbk-217 assigns Π_V = 1.0 for applied voltages of less than 12 V. Above 12 V, Π_V increases exponentially with both the supply voltage (V_s) and the device junction temperature (T_j):

Π_V = 0.110 exp[0.168 V_s (T_j/298)].
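The Mil-Hdbk-217 CMOS voltage stress expression can be evaluated numerically with a few lines of Python. This is only a minimal sketch: the function name and argument names are hypothetical, the junction temperature is taken in kelvin, and the treatment of exactly 12 V (the formula is applied from 12 V upward) is an assumption, since the text only distinguishes "less than 12 V" from "above 12 V".

```python
import math


def pi_v_mil217_cmos(v_supply: float, t_junction_k: float) -> float:
    """Voltage stress factor for CMOS, following the expression quoted in the text.

    v_supply      supply voltage in volts
    t_junction_k  device junction temperature in kelvin
    """
    # Below 12 V the text states that the factor is 1.0.
    if v_supply < 12.0:
        return 1.0
    # Assumption: the exponential expression is used from 12 V upward.
    return 0.110 * math.exp(0.168 * v_supply * (t_junction_k / 298.0))


if __name__ == "__main__":
    # Sweep a few supply voltages at a junction temperature of 150 °C.
    for volts in (5.0, 12.0, 15.0, 20.0):
        print(volts, round(pi_v_mil217_cmos(volts, 150.0 + 273.15), 2))
```

Because the supply voltage and the junction temperature multiply inside the exponent, the factor grows fastest when both are high at the same time.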


Values of Π_V range from 1 to 13 for junction temperatures ranging from 0°C to 150°C (T_j is in K) and supply voltages ranging from 12 V to 20 V. Mil-Hdbk-217F has proposed to eliminate this term.

In the NTT procedure, Π_V for CMOS is calculated by:

Π_V = 0.25 exp(0.21 V_DD),

V_DD = applied voltage.

A table giving a value of 1 for V_DD = 5 V, and ranging from 0.48 for V_DD = 1.5 V up to 8.2 for V_DD = 15 V is provided. These values are slightly lower than those calculated with the above expression.

The value of Π_V in CNET for CMOS also depends on the applied voltage (V_A) and the device junction temperature:

Π_V = A_3 exp[A_4 V_A (T_j/298)].

The constants A_3, A_4 depend on the voltage and range in value from 0.5 to 2.2.

8. PROCESS MATURITY

The maturity of the device manufacturing process appears explicitly in the failure rate models for the Mil-Hdbk-217 and CNET procedures through the factor Π_L and indirectly in the other procedures. Π_L is intended to reflect the fact that the first production units of any device are more likely to be unreliable than later production units. This may be unrealistic if processes are in control and there is only a minor variation in part type.

In Mil-Hdbk-217, Π_L is usually 1, but it is set to 10 under any of the following conditions:

- a new device in initial production
- major changes in design or process have occurred
- there has been an extended interruption in production or a radical change in line personnel
- for all new or unproven technologies.

The factor of 10 applies for a period of up to 4 months of continuous production after conditions and controls have stabilized. Since a designer usually has no way of knowing if any of the production conditions cited above apply, the assumption is often made that components will be in continuous production when the design is complete and Π_L is taken to be 1.0 in most reliability analyses. This is being completely modified in proposed Mil-Hdbk-217F.

In the CNET procedure, Π_L also depends on the length of time that the device has been in production. New devices, devices that have had a major change in either their design or the production process by which they are made, and devices for which production has been halted for a prolonged period of time are assigned Π_L = 10. This decreases to 1.0 over a period of 24 months.

The NTT procedure adjusts Π_Q to compensate for the maturity of the manufacturing process. The quality factors listed in figure 1 are for devices that were manufactured with a "stable producing technology". Devices which have "only recently been developed" have quality factors ranging from 0.5 to 1.5 higher.

As discussed in section 3, British Telecom HRD4 incorporates process maturity into the expressions for calculating the base failure-rate, λ_b, through the time factor, t, in table 3. Using 1965 as the base year, the failure rate for a 64K DRAM manufactured in 1986 is 14.7 Fit and in 1990 it is 8.1 Fit. The difference would account for improvements in the process technology during that period.

9. INFANT FAILURES

During the infant mortality period, the failure rate of semiconductor devices generally decreases as the operating time increases. This decrease may continue for more than 10^4 hours at an ever decreasing rate until the failure rate approaches a nearly constant value [4].

The models used in the NTT procedure assume that components have operated for more than 10^4 hours and that sufficient screening and testing has been done to prevent the occurrence of infant failures. Similarly, the British Telecom and Mil-Hdbk-217 procedures assume that devices have had sufficient burn-in to have achieved a constant failure-rate. The Bellcore, Siemens, and CNET procedures provide a methodology for developing a failure-rate adjustment factor for components that have not been completely burned-in.

Table 11
CNET Values of Π_Q for Components During and After Infant Failure Period

Quality Class                                     less than    more than
                                                  3000 hrs     3000 hrs
Space SCC             B +                           0.1          0.1
                      B                             0.15         0.15
                      C                             0.3          0.3
NFC 96883             B                             0.3          0.3
                      G                             0.6          0.5
                      D                             0.8          0.7
Agreement (PTT ...)   with CCQ                      1            0.7
                      without CCQ                   1.5          1
CCQ (UTE/CECC) ...    with lot testing              1.5          1
                      normal  NMOS, CMOS > 100 bit  2            1
                              other                 1.5          1
Homologation                                        3            1.8
Qualification by client                             5            2
No qualification                                    10           3.5

In the CNET procedure, Π_Q is adjusted for devices that have accumulated less than 3000 hours operating time as shown in table 11. The failure rate for the device is:


λ = [λ_1 · 3000 + λ_2 · (t − 3000)]/t.

Notation

t      operating time
λ_1    failure rate computed with Π_Q for less than 3000 hours operating time
λ_2    failure rate computed with Π_Q for more than 3000 hours operating time

The Siemens procedure also defines the early failure period to be the first 3000 hours of operation. During this period it suggests augmenting the failure-rate formula in table 1 with an early failure factor, Π_F, whose value depends on the device operating time, its technology, and the design maturity. For devices with less than 100 hours of operation, the procedure suggests using values of Π_F ranging from 150 to 50 for MOS devices and from 30 to 20 for bipolar devices. The low end of the range is for mature devices and the high end for new designs. As the operating time increases to 3000 hours Π_F decreases to 1 in all cases.

The Bellcore procedure uses the Arrhenius model to determine a quantity called the effective burn-in time, t_e. This time takes into account the burn-in accumulated at the device, unit, and system levels. The effective burn-in time, the temperature acceleration factor, Π_T, and the electrical stress factor, Π_S, are then used to compute an infant failure factor called the "first year multiplier", Π_FY, for the failure rate. The effective burn-in time is defined as:

A_b,d, A_b,u, A_b,s    Arrhenius temperature acceleration factors for the device, unit, system burn-in temperatures
t_b,d, t_b,u, t_b,s    device, unit, system burn-in times
A_op                   Arrhenius temperature acceleration factor corresponding to the normal operating temperature
Π_S                    electrical stress factor

Values for the Arrhenius temperature acceleration factors are tabulated in the procedure. Π_S = 1, for most micro-electronic devices.

The first year multiplier for the failure rate is computed as follows:

t_e ≥ 10^4/(Π_T Π_S): then Π_FY = 1

t_e ≤ 10^4/(Π_T Π_S) − 8760: then Π_FY = [0.46/(Π_T Π_S)^0.75] [(t_e + 8760)^0.25 − t_e^0.25]

otherwise: Π_FY = [1.14/(Π_T Π_S)] [10^−4 t_e Π_T Π_S − 4 (10^−4 t_e Π_T Π_S)^0.25 + 3] + 1

In effect, the Bellcore first year multiplier is 1 for devices that have had the equivalent of 10^4 hours (adjusted by the temperature and voltage stress factors) burn-in. As the burn-in time decreases the first year multiplier increases. Its maximum value of approximately 4.5 occurs when t_e goes to 0 (assuming Π_T = 1, Π_S = 1).

10. DISCUSSION

I have reviewed 6 reliability prediction procedures for micro-electronic devices, examined the parameters on which they are based, and illustrated their application in the case of a 64K DRAM. All the procedures are based on data collected from field experience and, in some cases, from laboratory testing and extrapolation from similar devices. The failure-rate models are similar, but they give large differences in predicted failure rates for components having the same physical and operating characteristics. There are also large differences in the way specific changes in conditions affect the predicted failure-rates. This, at the very least, detracts from the creditability of the models.

There is considerable room for interpretation in the way values are assigned to some of the model parameters. For example, if documentation describing the manufacturing process is available, it may not be difficult to assign the proper quality level to a specific component, purchased from a specific manufacturer at a specific time; however, it is very difficult to decide, in any general way, which quality levels in one procedure are equivalent to the quality levels in another procedure. Similarly, due to the exponential nature of the temperature acceleration factor dependence on the junction temperature, a small difference in estimating the junction temperature can result in a large difference in the predicted failure-rate. And estimates of the thermal resistance, from which the junction temperature is calculated, vary widely in the procedures.

Since micro-electronic device reliability has been improving rapidly every year, the higher failure-rate predictions for some procedures may simply reflect the use of older (or simply different) data in determining the model parameters. Only the British Telecom HRD4 has a built-in mechanism for matching the reduction in failure rates over time that component manufacturers have been able to achieve. This helps to keep the procedure current, but the use of a single base-year for all technologies, or even for a broad class of technologies, is inappropriate. For example, I question the use of the same year of technology maturity for 1K & 4K DRAMs which were available in the early 1970s and 1M DRAMs which were not available until the late 1980s.

REFERENCES

[1] US Mil-Hdbk-217, Reliability Prediction of Electronic Equipment, version E, 1986 October 27.
[2] TR-TSY-000332, Reliability Prediction Procedure For Electronic Equipment, Issue 2, 1988 July; Bellcore.
[3] Handbook of Reliability Data for Components Used in Telecommunications Systems, Issue 4, 1987 January; British Telecom.
[4] Standard Reliability Table for Semiconductor Devices, 1985 March; Nippon Telegraph and Telephone Corporation.
[5] Recueil De Donnees De Fiabilite Du CNET (Collection of Reliability Data from CNET), 1983; Centre National D'Etudes des Telecommunications (National Center for Telecommunication Studies).
[6] SN29500, Reliability and Quality Specification Failure Rates of Components, Siemens Standard, 1986.


[7] D. L. Crook, "Evolution of VLSI reliability engineering", Proc. 1990 Int'l Reliability Physics Symp., 1990 March, pp 2-11.
[8] RT4223, FRATE, Systems Reliability Consultancy; British Telecom Research Laboratories, Martlesham Heath, Ipswich IP5 7RE, England.
[9] F. Jensen, N.E. Petersen, Burn-in: An Engineering Approach to the Design and Analysis of Burn-in Procedures, 1982; John Wiley & Sons.
[10] D. J. Klinger, Y. Nakada, M. A. Menendez (eds), AT&T Reliability Manual, 1990; Van Nostrand Reinhold.
[11] J. L. Spencer, "The highs and lows of reliability predictions", 1986 Proc. Annual Reliability & Maintainability Symp., 1986, pp 156-162.

AUTHOR

Dr. John B. Bowles; Electrical & Computer Engineering; University of South Carolina; Columbia, South Carolina 29208 USA.

John B. Bowles (M'87, SM'89) is an Assistant Professor in the Electrical & Computer Engineering Department at the University of South Carolina. Before joining the USC faculty he was project leader of the Systems Analysis Group, Advanced Systems Development at NCR Corporation and a member of technical staff at Bell Laboratories. He holds a BS in Engineering Science from the University of Virginia, an MS in Applied Mathematics from the University of Michigan, and a PhD in Computer Science from Rutgers University. He is co-author of Fourier Analysis of Numerical Solutions of Hyperbolic Equations.

Manuscript TR90-192 received 1990 October 16; revised 1991 July 19.

IEEE Log Number 03300

VOICES FROM THE PAST

Hints and Kinks

Paul Gottfried

This message isn't new, and it is addressed to a small fraction of the Newsletter readership. The message nevertheless is worth repeating, because it is so painfully obvious that it so often has not been heeded. It is placed in the Newsletter in the hope that those to whom it is not addressed will carry it to those who need it. The message is:

Consideration of Reliability as an after-thought is a terribly expensive and ineffective procedure. The time to start considering Reliability is the time when system and design concepts are being formulated.

How often is this principle violated? Too often - especially in view of our years of preaching, and management's years of lip service in proposals, advertisements, and speeches. Think:

How many products do you know of that have failed in the market-place, or have required recall for expensive corrective action, as a result of reliability neglect?

How many space systems (not just those in the headlines) have "just growed" without effective configuration control or procedures as elementary as systematic search for weak links?

How much military equipment lies abandoned near jungle trails - not because it was unneeded, but because it wasn't designed to operate in the combat environment?

No one can hope to prevent all failures; neither materials nor men are perfect. Enough can be prevented to make the proverbial ounce of prevention worth the pound of cure, and the ounce is much cheaper.

This message is brief.
It is important.

How about passing it on to someone who might listen - and act?

AUTHOR

Paul Gottfried; 9251 Three Oaks Drive; Silver Spring, Maryland 20901 USA.

[This "Hints and Kinks" is reprinted from the 1967 October issue of the IEEE Reliability Group Newsletter. Paul was at that time a Principal Scientist with Booz-Allen Applied Research Inc. He is now an independent consultant in quality & reliability.]
