Conference Paper · July 2008


DOI: 10.1109/ITI.2008.4588504 · Source: IEEE Xplore



The Physics-of-Failure Approach in Reliability Engineering

Zoran Matić, Vlado Sruk


Faculty of Electrical Engineering and Computing, University of Zagreb
Unska 3, 10000 Zagreb, Croatia
zoran.mtch@yahoo.com; vlado.sruk@fer.hr

Proceedings of the ITI 2008 30th Int. Conf. on Information Technology Interfaces, June 23-26, 2008, Cavtat, Croatia

Abstract. In today’s world, apart from the fact that systems and products are becoming increasingly complex, electronic technology is rapidly progressing in both miniaturization and higher complexity. Consequently, these facts are accompanied by new failure modes. Standard reliability tools struggle to tackle all of the newly emerging challenges. New technologies and designs require adapted approaches to ensure that products meet the desired reliability goals cost-effectively and on time. The Physics-of-Failure (P-o-F) represents one approach to reliability assessment based on modeling and simulation that relies on understanding the physical processes contributing to the appearance of critical failures.

This paper outlines the classical approaches to reliability engineering and discusses the advantages of the Physics-of-Failure approach. It also stresses that the P-o-F approach should be probabilistic in order to include in the analysis the inevitable variations of the variables involved in processes contributing to the occurrence of failures.

Keywords. Reliability prediction, Physics-of-Failure, Probabilistic Physics-of-Failure

1. Introduction

It can be said that reliability prediction has rarely been made with notable accuracy. Prediction inaccuracy in reliability engineering is often a consequence of neglecting the significant factors affecting the heart of reliability itself.

In the last 30 years, we have witnessed increased systematic attention to reviving the reliability prediction concept conceived at the very beginning of the 1960s, which sets a relationship between environmental/operational factors and components’ life expectancy. The reliability of an item is defined as the probability that it will perform its intended function for a stated period of time in its intended use environment. This concept, therefore, focuses on a pair of key words from the reliability definition: “time” and “environment”. The concept, sometimes called Reliability Physics, is known as the Physics-of-Failure approach.

In short, the P-o-F concept explains why and when components fail. As a result, the concept brings to the surface those failure models that can support failure mode characterization. The dominant conceptual failure model in the P-o-F framework has been the Damage-Endurance Model, which embodies the paradigm that applied stress accumulates damage in components and that, when damage exceeds endurance, the corresponding critical failure occurs.

In the field of electronics, the Physics-of-Failure approach was not established early, either as a standard for predicting life or for assessing the reliability of components. Because electronic components are relatively inexpensive, reliability programs have relied on statistical analyses employed late in the development cycle. In order to make a reliability prediction early in the development cycle, statistics from the tremendous amount of data collected in the corresponding fields through governmental and industry support were used. As a result, various standards that represent the classical approach for predicting the life of components were established. Due to both diverse applications and significant variations in environments, the assumption of a constant failure rate became the foundation of the classical approach. As a consequence of assuming a constant failure rate, the classical approach does not answer newer demands and can be a heavy burden on the projects where it is applied.

In electronics, two main factors drive the application of a more accurate reliability approach: 1) rapid technological progress, and 2) market competitiveness. Electronic technology is rapidly
progressing in both miniaturization and higher complexity. Consequently, these newer facts are accompanied by new failure modes. The new failure modes result not only from higher stress-strength ratios due to miniaturization but also from new materials and new manufacturing processes. On the other hand, companies have to respond to competitive market requirements to reduce the development cycle and cost while producing more reliable products for a wider spectrum of environments. These requirements call for novel reliability methodologies for predicting or assessing the reliability of components as accurately as possible in up-front activities within development processes. These facts require reliability engineers to acquire deep knowledge of both the critical failure mechanisms that are specific to the products they are working on and the dependency of those mechanisms on the market’s targeted environment.

The existence of two approaches has split the reliability community into three groups: 1) advocates of the classical reliability approach; 2) advocates of the Physics-of-Failure approach; and 3) those unbiased who see both approaches as applicable during the development process. While the first two groups are characterized by highlighting either practicality or validity as the criterion for mutually exclusive use of these approaches, the third group is best portrayed by the saying: “always use the available knowledge you have to help predict the future” (see [9] for an example of recalibrating classical predictions). Evans argues that due to ever-present uncertainty the first two groups are “not realistic about what we know, and/or we do not know, about the real world” [3].

2. Need for an Effective Approach in Reliability Prediction

In the 1980s, some researchers and engineers questioned the accuracy of Mil-Hdbk-217, the epitome of classical prediction methodology in electronics, in predicting reliability; soon, the need for another approach was recognized. Debates at many conferences and published questionings [6, 10, 13] of the appropriateness of applying procedures based on the classical concept have contributed to maturing the case for another approach, one that overcomes the ignorance previously exposed through the usage of the classical reliability methodologies that do not acknowledge design and manufacturing improvements. A more realistic approach was sought as an adequate replacement.

It can be noted that reliability engineering authorities took sides in the 1990s. As an example, a reliability authority, Patrick D.T. O’Connor, expressed his conviction to reliability communities: “I think the only balanced view is to say that Mil-Hdbk-217 and anything like it is the biggest load of garbage ever to be foisted by engineers on other engineers and should immediately be done away with” [14].

A very short article [10], with the direct question “Is It Time for A New Approach?” as its title, questioned the appropriateness of the reliability prediction concept contained in MIL-HDBK-217. The author pointed out that MTBF is “a trap for the unwary and as a concept it is misunderstood by the majority of customers.” He dismissed the concept by pointing to the “pseudo-statistics” behind it. The author highlighted that reliability belongs to engineering and that the new approach must be a part of design integrity. Although the article neither answered the question it raised nor directly referred to the P-o-F approach, the P-o-F might be intuited as the answer to the title’s question through the article’s stress on the feasibility of a failure-free period for an item designed against a specified environment.

Pecht [12] also pointed to the P-o-F concept as an alternative to the classical approach by listing the main problems that arise from the classical prediction method together with the reasons for their existence: out-of-date data needed for the reliability prediction; the difference between removed and failed parts; the distinction between design and manufacturing failures; failures due to overstresses, damage due to incorrect handling of components, usage of improper parts, etc.; the assumption of a constant failure rate; usage of averaged values that are neither vendor specific nor device specific; inappropriate modeling; and considerable discrepancies in predictions among the procedures existing worldwide that follow the classical approach.

Reliability engineering communities in various countries had different visions of how to overcome these problems. In Europe the problem has been seen in total averaging that qualitatively penalizes those parts that are produced by utilizing good quality and reliability
practices. The European response was to work on establishing a standard based on the classical approach that distinguishes suppliers’ reliability practices. In general, this vision means that the prediction will be based on vendor-component specifics. On the other side of the world, Japan has preferred to establish grounds for the P-o-F approach. The dominant problems seen by the Japanese reliability engineering community lie not only in the constant hazard rate assumption but also in the Arrhenius relation with regard to the averaging effects of various failure mechanisms. Mil-Hdbk-217, which was previously overused, has not been updated for over a decade; it has become obsolete and does not adequately cover newer technologies. The previous upgrade of Military Acquisition Handbook 179, which directed readers to Mil-Hdbk-217 (Military Acquisition Handbook 179A [2]), suggested the use of general “positive reliability practices”, and among those practices listed the P-o-F approach.

3. Comparison between P-o-F and Classical Approaches

Several comparisons with Mil-Hdbk-217 have been published in both quantitative and qualitative domains. From those papers, the three that are completely dedicated to the comparison between the P-o-F and classical approaches are summarized here; these three papers highlighted prediction inaccuracy and its consequences as a connotation of the classical approach. The first logical target for the comparison was the assumption of a constant hazard rate, which is the basic assumption of the classical approach. Mortin et al [11] made a comparison between a constant hazard rate and the hazard rate specific to a failure mechanism. Electromigration is considered as an example of a single failure mechanism of a component. When an increasing hazard rate is compared with a constant hazard rate, a significant discrepancy can be observed. The discrepancy between an assumed constant hazard rate and the corresponding real hazard rate penalizes not only the support determined by logistic and maintenance requirements, which are based on the assumed hazard rates, but also the design itself, as a result of either over-designing or under-designing it. As a consequence, a tremendous life-cycle cost could be imposed due to the incorrect assumption.

Another comparison, between predicted constant failure rates and the real ones of many plastic encapsulated parts, was done by Weil et al [15]. By analyzing test and field data, the authors showed that when high-volume suppliers employing the best reliability and quality practices manufactured plastic encapsulated microcircuits (PEMs), the PEMs had considerably lower failure rates than the failure rates predicted by Mil-Hdbk-217. Although the PEMs were quite qualified for many military and aerospace applications, their implementation was historically rejected due to Mil-Hdbk-217's very strong influence on designers of that time.

Also, Cushing et al [1] provided a descriptive comparison between the classical and P-o-F approaches across various issues, from model development to final relative cost analysis. This paper favors the P-o-F approach by targeting a broader perspective of reliability assessment. The authors argued that whenever test and field data are inconsistent with reliability assessments, the reliability is not well understood, and therefore additional analysis is required. The authors pointed out that the classical approach ignores the dominant failure mechanism in determining reliability, while the P-o-F approach inherently contains reliability analysis of the contributing failure mechanisms.

Further, Foucher et al [4] provided a comparison between the P-o-F approach and the classical approach that is not based on subjective characteristics. When the most significant characteristics are compared, such as accuracy, ease of customization, traceability, and ability to evolve, the P-o-F approach is reported as the superior approach in reliability engineering.

These aforementioned comparisons between the P-o-F approach and the classical approach distinguish the P-o-F approach as the more effective reliability approach to be used in concurrent engineering, which involves methods for both designing-in reliability and performance improvement.

4. Physics-of-Failure Procedure

The literature review revealed dissimilarities in P-o-F procedural steps that indicate limitations in the understanding of the P-o-F process. By defining the generic P-o-F procedure, this paper identifies the procedural steps in conducting any P-o-F analysis and offers uniformity in its applicability.
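As a rough illustration of the constant versus increasing hazard-rate comparison summarized in Section 3, the following Python sketch contrasts a constant hazard rate with a wear-out hazard rate that has the same mean life. The Weibull wear-out model, the shape parameter, the mean life, and all function names are assumptions chosen for illustration only; they are not taken from Mortin et al [11] or from any handbook.

```python
import math


def constant_hazard(mean_life_hours: float) -> float:
    """Hazard rate of an exponential life model: h(t) = 1 / mean life, for all t."""
    return 1.0 / mean_life_hours


def weibull_hazard(t_hours: float, shape: float, scale_hours: float) -> float:
    """Weibull hazard rate: h(t) = (shape / scale) * (t / scale) ** (shape - 1)."""
    return (shape / scale_hours) * (t_hours / scale_hours) ** (shape - 1.0)


def weibull_scale_for_mean(mean_life_hours: float, shape: float) -> float:
    """Scale parameter that gives a Weibull distribution the requested mean life."""
    return mean_life_hours / math.gamma(1.0 + 1.0 / shape)


# Hypothetical numbers, chosen only for illustration.
MEAN_LIFE_HOURS = 100_000.0
WEAROUT_SHAPE = 3.0  # shape > 1 gives an increasing (wear-out) hazard rate

scale = weibull_scale_for_mean(MEAN_LIFE_HOURS, WEAROUT_SHAPE)
h_const = constant_hazard(MEAN_LIFE_HOURS)

for t in (10_000.0, 50_000.0, 100_000.0, 150_000.0):
    h_wear = weibull_hazard(t, WEAROUT_SHAPE, scale)
    print(
        f"t = {t:>9.0f} h  constant = {h_const:.2e}/h  "
        f"wear-out = {h_wear:.2e}/h  ratio = {h_wear / h_const:.2f}"
    )
```

Under these assumed numbers, the constant rate overstates the hazard early in life and understates it late in life, which is the kind of discrepancy that can lead to over-designing or under-designing.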
The application of the P-o-F approach contains the four main steps that are illustrated in Figure 1.

[Figure 1 (flow diagram). STEP 1: Environmental Factors and Conditions; STEP 2: Failure Contributing Mechanisms; STEP 3: Environmental/Operational Factors and Conditions; STEP 4: Reliability Assessment.]
Figure 1. The generic P-o-F procedure

The first step is to determine environmental factors by specifying or measuring them.

The next crucial step is to isolate potential failure triads (site, mode, mechanism); the sequence in which a failure triad is determined can differ depending on the available data. This step unifies the identification of failure sites and site-corresponding failure modes with the determination of the mechanisms contributing to a potential failure mode. This step could benefit from a comparative analysis between good components and failed components, whether they failed in a field environment or in a laboratory environment.

The third step, which comes after identifying failure triads and isolating the potential mechanisms, is to filter the contributing environmental and/or operational factors.

The last step is to find the proper functional dependencies of all stresses and to identify applicable models. The models selected should be those that fit best for the specified operational/environmental conditions; it is also quite important to determine the validity bounds of the corresponding model. Then, when the proper equations are known, the life of components can be determined for any given operational/environmental condition inside the model's validity bounds.

5. Advantages of the P-o-F Approach in Reliability Engineering

When the P-o-F approach is applied during development, it gives many advantages that are spread throughout the whole development process. These advantages can be expressed in different ways depending on the development perspective, and some of them are as follows:

• Comparing up-front design candidates. Potential failure modes for a design candidate that are specific to an intended environment can be analyzed before deciding whether the candidate component should be used in a new design. The ability to compare different design candidates for supporting a specific requirement eliminates the possibility of over-designing; thus, this advantage contributes to cost-effectiveness.
• Identifying up-front design improvements. While the classical prediction methodology completely ignores failure modes, the P-o-F approach does have the potential either to warn up-front of a need for improvements in design and manufacturing or to indicate the necessity for required inspections during operation. As a result, the test-analyze-fix (TAF) cycles inherent to the reliability growth process can be minimized, or even eliminated. This capability to minimize TAF cycles has a positive impact on both shortening the time-to-market and reducing the development cost. In addition, the P-o-F approach can be used concurrently in analyzing how design changes affect reliability.
• Getting realistic predictions. Due to deeper insight into failures when failure mechanisms are known, over-optimistic or over-pessimistic predictions can be minimized; the P-o-F approach realistically evaluates all elements in the reliability consideration, such as new materials, structures, or technologies. Therefore, the P-o-F approach can have cost-effective impacts on maintenance and logistic requirements.
• Estimating the reliability quickly. The P-o-F approach offers the ability to check or compare vendor-claimed reliability for a specified environment. This ability could eliminate the need for long-lasting qualification testing.
• Determining the life expectancy of components for different mission profiles. P-o-F models can predict the life expectancy of a component for various potential mission profiles. For example, an electronic component used on an automotive platform could exhibit a life distribution very different from that of the same component when it operates on some other, dissimilar platform. Also, due to market globalization, it is a competitive advantage to design a reliable system for various environments.
• Optimizing Environmental Stress Screening (ESS) / Burn-in parameters. Because P-o-F models can exercise the existence of different weak properties that are consequences of either
the presence of defects in materials or inappropriate manufacturing processes, the P-o-F approach should be used in determining the optimal number of burn-in/ESS cycles for the applied profiles and their extreme values. This optimization is especially important to avoid unnecessary product aging.
• Identifying a focus of preventive maintenance and its optimal preventive interval. Due to environmental variations, the life distribution of components might differ considerably; different environments may alter the dominant failure modes, and they may consequently result in a different maintenance focus and a correspondingly different optimal preventive maintenance interval.

This list is not exhaustive by any means. From the advantages listed it can be inferred that applying the P-o-F approach is beneficial during the development process for addressing or identifying proper reliability concerns, and as a result, it can have a significant impact on shortening the development cycle, on reducing development cost, or on optimizing maintenance.

6. Probabilistic Physics-of-Failure

Due to inevitable variations of the variables involved in processes contributing to the occurrence of failures, the P-o-F approach has to be probabilistic. New Probabilistic Physics-of-Failure (PP-o-F) methodologies should be developed for assessing the reliability of components by involving variations that mainly come from:

• Environmental factors. The variations that result from environmental factors are specific not only to the field environment but also to accelerated qualification or demonstration testing; for example, during temperature cycling in an environmental chamber, there is always a discrepancy between the air temperature in the chamber and the temperature of the components due to their thermal inertia.
• Mission profiles. The uncertainty of mission profiles includes both nominal values vs. extreme values and dynamics in the domain of environmental factors.
• Manufacturing processes. Manufacturing processes combine variations in tooling and in material properties.

A logical, straightforward way of implementing PP-o-F is to extend the existing P-o-F framework by applying methods for including the variability of either the key variables or all variables; examples are presented in [5, 8].

Besides this straightforward way, another method for constructing probabilistic P-o-F models has been developed. This method proposes building engineering parameters into prior beliefs. Consequently, predictions made by this method often require expert knowledge, which itself introduces huge uncertainty. On the other hand, by utilizing the Bayesian approach, this method also requires failure evidence for changing the prior beliefs. These requirements imply that this PP-o-F method does not fit effectively into the timing of the development process; consequently, the method brings disadvantages similar to those of the classical reliability methodology that have been explored by the reliability engineering community.

Some elements of PP-o-F principles are also present in the work of Haggag et al [7], who determined the failure-time distribution of both deep-submicron MOSFET transistors and optical interconnects through a common defect activation distribution.

It can be said that the Probabilistic Physics-of-Failure approach is in its early infancy. An effective PP-o-F methodology, therefore, represents a targeted frontier for reliability scientists: PP-o-F methodologies should bridge the last gap in reliability engineering by establishing a relationship among the three key words from the reliability definition: probability, time, and environment.

7. Conclusion

There are two approaches to prediction and assessment in reliability engineering. The classical way relies on the availability of extensive component libraries holding reliability measures for similar parts; failure rate prediction made by this approach often results in inaccurate reliability prediction and assessment. On the other hand, the physics-of-failure approach demonstrates better accuracy because it is based both on identifying the critical failure modes and on estimating the impact of the contributing environmental factors that accumulate damage in the parts and ultimately lead to failures.

It is important to bear in mind that the classical and P-o-F approaches may be complementary and both are useful in the
development of a product or a component. When utilized early in development, the classical approach to prediction has value as a complement to the P-o-F approach.

Applying the P-o-F approach during the development process is beneficial for addressing or identifying proper reliability concerns; as a result, the physics-of-failure approach can have a significant impact on shortening the development cycle, on reducing development cost, or on optimizing maintenance. Therefore, utilizing this approach has many advantages in development; however, there is an important, rarely-utilized aspect of the approach: the probabilistic aspect.

The probabilistic P-o-F is in its infancy; existing methods can and should be improved. To make them effective, a new methodology that addresses the P-o-F's rarely-utilized and missing dimensions more effectively is urgently needed. New methodologies should rely more on the remarkable computing power that is readily available nowadays for simulation in the engineering development environment. Future work on new probabilistic physics-of-failure methodologies will improve the reliability discipline and will increase the accuracy of reliability predictions and assessments throughout the whole development cycle.

8. References

[1] Cushing MJ, Mortin DE, Stadterman TJ, Malhotra A. Comparison of Electronics-Reliability Assessment Approaches. IEEE Transactions on Reliability 1993; 42(4): 542-546.
[2] Department of Defense, Military Acquisition Handbook 179A. Washington DC: US Gov. Printing Office; 1991.
[3] Evans J, Lall P, Bauernschub R. A Framework for Reliability Modeling of Electronics. Proc. Annual Reliability and Maintainability Symposium; 1995 Jan 16-19; Washington DC, USA; pp. 144-151.
[4] Foucher B, Boulli J, Meslet B, Das D. A review of reliability prediction methods for electronic devices. Microelectronics Reliability 2002; 42(8): 1155-1162.
[5] Gibson D. Statistical Reliability Prediction. Integrated Reliability Workshop Final Report; 1995 Oct 22-25; Atlanta, GA, USA; pp. 161-166.
[6] Goel A, Graves RJ. Electronic system reliability: collating prediction models. IEEE Transactions on Device and Materials Reliability 2006; 6(2): 258-265.
[7] Haggag A, McMahon W, Hess K, Cheng K, Lee J, Lyding J. A Probabilistic-Physics-of-Failure/Short-Time-Test Approach to Reliability Assurance for High-Performance Chips: Models for Deep-Submicron Transistors and Optical Interconnects. IEEE International Integrated Reliability Workshop; 2000 Oct 23-26; S. Lake Tahoe, California, USA; pp. 179-182.
[8] Hall PL, Strutt JE. Probabilistic Physics-of-Failure Models for Component Reliabilities Using Monte Carlo Simulation and Weibull Analysis: A Parametric Study. Reliability Engineering & System Safety 2003; 80(3): 233-242.
[9] Kleyner A, Bender M. Reliability Prediction Method Based on Merging Military Standards Approach with Manufacturer's Warranty Data. Annual Reliability and Maintainability Symposium; 2003 Jan 27-30; Tampa, FL, USA; pp. 202-206.
[10] Knowles I. Is It Time for a New Approach? IEEE Transactions on Reliability 1993; 42(1): 2-3.
[11] Mortin DE, Krolewski JG, Cushing MJ. Consideration of Component Failure Mechanisms in the Reliability Assessment of Electronic Equipment - Addressing the Constant Failure Rate Assumption. Proceedings Annual Reliability and Maintainability Symposium; 1995 Jan 16-19; Washington DC, USA; pp. 54-59.
[12] Pecht M. Why The Traditional Reliability Prediction Models Do Not Work - Is There An Alternative. Electronic Cooling 1996; 2(1): 10-12.
[13] Talmor M, Arueti S. Reliability Prediction: The Turn-Over Point. Proceedings Annual Reliability and Maintainability Symposium; 1997 Jan 13-16; Philadelphia, Pennsylvania, USA; pp. 254-262.
[14] Watson GF. MIL Reliability: A New Approach. IEEE Spectrum 1992; 29(8): 46-49.
[15] Weil L, Pecht M, Hakim E. Reliability Evaluation of Plastic Encapsulated Parts. IEEE Transactions on Reliability 1993; 42(4): 536-540.
