Professional Documents
Culture Documents
MTBF
MTBF
MTBF
Role in Reliability
MTBF: Understanding
Understanding Its
Its Role
Role in
in
MTBF:
Reliability
Reliability
MTBF = te t dt
0
Tutorial by
by
AATutorial
DavidC.
C.Wilson
Wilson
David
Owner/ /Principal
PrincipalConsultant
Consultant
Owner
WILSONCONSULTING
CONSULTINGSERVICES,
SERVICES,LLC
LLC
WILSON
March 2007
2007
March
Page 1 - 47
Table of Contents
Section
Page
1.0: Introduction---------------------------------------------------------2-47
2.0: Reliability Mathematics-------------------------------------------7-47
3.0: Bathtub Curve-----------------------------------------------------21-47
4.0: Failure Rate--------------------------------------------------------23-47
5.0: MTBF----------------------------------------------------------------30-47
6.0: MTTF----------------------------------------------------------------38-47
7.0: MTTR----------------------------------------------------------------4047
8.0: Relationships Summary------------------------------------------41-47
9.0: Summary------------------------------------------------------------46-47
Page 2 - 47
Introduction
1.0: Introduction
The objective is to enable an individual who is unfamiliar with the use of the MTBF
reliability parameter to understand its relationship in product predictions, failure
rates, field performance, etc. After completing this tutorial, the participant will know
and understand how MTBF relates to product performance over time.
In order for the practitioner to speak intelligently and authoritatively on the
parameter, MTBF, it important at a minimum that a cursory understanding of the
mathematical concepts involved with Reliability be mastered. Therefore,
mathematical and practical treatments relative to MTBF are included in this tutorial
using the exponential distribution model.
Additionally, to achieve a thorough understanding of statistics and reliability, many
statistical experts list four kinds of understanding as shown below.*
1. Computational/Numerical
2 Visual/Graphical
3. Verbal/Interpretive
4. Structural/deductive
*For additional information on the items above, please contact the author
of this tutorial.
Copyright 2007 Wilson Consulting Services, LLC
Page 3 - 47
Introduction contd
What is MTBF?
Measure of rate of failure within the design life.
Page 4 - 47
Introduction contd
Page 5 - 47
Introduction contd
Reliability Definitions
RELIABILITY [R(t)] - The probability that an item will
perform its intended function without failure under stated
conditions for a specified period of time.
FAILURE - The termination of the ability of an item to perform
its required function as specified.
FAILURE RATE (FR) - The ratio of the number of failures
within a sample to the cumulative operating time.
HAZARD RATE [, h(t)] - The "instantaneous" probability of
failure of an item given that it has survived up until that time.
Sometimes called the instantaneous failure rate.
Copyright 2007 Wilson Consulting Services, LLC
Page 6 - 47
Reliability Mathematics
Page 7 - 47
P.D.F. contd
A positive continuous random variable follows an exponential
distribution if the probability density function is as shown:
Thus the P.D.F.
ae ax ( For
f ( x) =
0 ( For
x0)
x<0)
Page 8 - 47
P.D.F. contd
It is a probability density function (P.D.F.)
f (t ) = e t
Lambda is a constant and is called the failure rate
Density
f(t)
Figure 1
t
Page 9 - 47
F (t ) = P (T t ) =
f ( )d
variable; Let t = ,
then dt =d
C.D.F., F(t)
1.0
0
Copyright 2007 Wilson Consulting Services, LLC
Figure 1A
t
Page 10 - 47
1 e
Page 11 - 47
f (t ) = e t
2. C.D.F. is derived by integrating p.d.f.
F ( ) = P (T ) =
F ( ) =
f ( ) d
e d
F ( ) = e
F (t ) = 1 e t
F(t) = 1 R(t)
QED
0t
is
F (t ) = 1 e
t
Page 12 - 47
f (t )
f (t ) e t
=
= t =
h (t ) =
1 F (t )
R (t )
e
2. Cumulative hazard function:
(Integrating the hazard function to obtain the cum hazard function. Use dummy variable of
integration )
t
H ( ) = h( )d =
t e
t
t
f ( )
d = d = d = | = t
0
0 e
0
R( )
H (t ) = ln R(t )
Taking the natural log of both sides : R (t ) = e t
ln R (t ) = ln e t = t
t can be exp ressed as ln( e t ) = ln R (t )
Copyright 2007 Wilson Consulting Services, LLC
QED
Page 13 - 47
f (t ) = e
0<t<
h(t)
U seful L ife
f (t ) e
= t =
R(t )
e
(t)
Figure 2
Copyright 2007 Wilson Consulting Services, LLC
time
Page 14 - 47
Illustrations: Exponential
ReliabilityCurve
Curve
Reliability
Reliability
R(t)
R(t ) = e t
0
time
Figure 3
f(t)
Density
ProbabilityDensity
DensityFunction
FunctionCurve
Curve
Probability
f (t ) = e t
0
time
Figure 4
t
Page 15 - 47
F (t ) = f ( )d = P (T t )
0
1.0
R (t)
R eliability area R (t)
t
R(t ) = 1 f ( )d
0
Figure 5
t
F (t ) = 1 e
F (t ) = e
R (t ) = 1 F (t )
R (t ) =
Page 16 - 47
Example 1:
For exponential distribution, take the integral of f(t), where
f(t) = e-t , where t =1
1
F (t ) =
0 t <1
f ( t ) dt
0
1
F (t ) = e t dt
0 t <1
Solution
t
F (t ) = e t dt = e
0
t
0
= 1 e = 1 0.3678 = 0.6322
Page 17 - 47
Example 2:
An electronic device contains discrete transistors. Each transistor has
a constant failure rate of = 1105 failure rate/hour. What is the
probability that a single transistor will survive a mission of 104 hours?
R (t ) = 1 F (t )
R (t ) =
Solution
R (t ) =
R (t ) =
10
10
= 0 . 905
F ( t ) = 1 R ( t ) = 1 0 . 906 = 0 . 095
Copyright 2007 Wilson Consulting Services, LLC
Page 18 - 47
10 10
e
R(t ) =
Solution
R(t ) = e0.01
R(t ) = 0.990
F (t ) = 1 R(t ) = 1 0.990 = 0.010
Copyright 2007 Wilson Consulting Services, LLC
Page 19 - 47
Bathtub Curve
Failures
infant mortality
wearout
Figure 6
Time
Infant mortality- often due to manufacturing defects
In electronics systems, there are models to predict MTBF for the
constant failure rate period (Bellcore Model, MIL-HDBK-217F, others)
Understanding wearout requires data on the particular device
- Semiconductor devices should not show wearout except at long times
- Electrical devices which wearout: relays, EL caps, fans, connectors, solder
Copyright 2007 Wilson Consulting Services, LLC
Page 20 - 47
Hazard Rate
- h(t)
Infant Mortality
W earout
External
Stress Failures
W earout
Failures
M anufacturing
Defects
Figure 7
Time
Reliability Prediction
& Validation
Materials characterization
Prediction tools
Wearout Mechanism
Analysis
Long-term data mining
ALT (test to failure)
System Life Modeling
Page 21 - 47
Failure Rate
- A function of time
- Instantaneous
F(t) = 1 - R(t)
Probability of Failure
100%
0.000100
Instantaneous
Hazard Rate
Average
Failure Rate
0.000095
0.000090
0
10000
20000
30000
40000
50000
Time
Figure 8
Page 22 - 47
t2
t 1 h ( t ) dt H (t 2 ) H (t 1) ln R (t 1) ln R (t 2 )
AFR =
=
=
t 2 t1
t 2 t1
t 2 t1
Pr oof :
AFR =
H (t 2 ) H (t 1)
Pr oof :
tAFR =
1
If H ( t ) = ln R ( t )
Then :
[ ln R ( t 2 ) ] [ ln R ( t 1 ) ] =
AFR =
t 2 t1
AFR =
ln R ( t 1 ) ln R ( t 2 )
t 2 t1
QED
Page 23 - 47
Exponential Distribution
R(t)
0.98
Te
xt
0.95
0
t1
t2
8760
17520
tim e in hours
AFR =
t
Figure 9
ln R (t 1) ln R (t 2 )
t1
Page 24 - 47
Example 1 continued
h(t)
3.55E-6
10000
20000
hours
Figure 10
Page 25 - 47
number of failures
=
total unit test time
The denominator is obtained by adding up all operating hours on test of
all units tested, including those that failed and those that completing the
test without failing.
Page 26 - 47
^ = Failure rate =
=
1200*24*90
5
2,592,000
failures / hour
= 1.93E-06 failures/hour
Other expressions
Failures per million hours (Fpmh)
Fpmh = *10 E + 06 = (1.93E 06)(10 E + 06) = 1.93
Page 27 - 47
=
s
i =1
Page 28 - 47
Page 29 - 47
MTBF
Design Life:
MTBF:
Item
Contactor
Pushbutton
CPU-Panel
Design Life
15,000 cycles
3 million ops
15 years
MTBF
55,000 cycles
12 million ops
37 years
Page 30 - 47
MTBF =
t e
dt =
Mathematical Proof
Integrating by parts solution:
By substituti on
let : y = t
t=
MTBF contd
udv = uv vdu
let
u = y & du = dy
dt = dy
MTBF =
y e
ye
= ye
dy
Integratin g by parts
MTBF =
v = e & dv = e dv
y
[
]
(
+
1
)
y
e
MTBF =
QED
ye
dv =
e dy
y
by factoring
[ ( y + 1) e ]
y
Recall: e- = 0
Page 31 - 47
MTBF contd
10000
Figure 11
20000
hours
MTBF = te t dt =
1
MTBF =
= 281,757hours
3.55 E 6
Copyright 2007 Wilson Consulting Services, LLC
Page 32 - 47
Also
Page 33 - 47
MTBF contd
Example 3
Using Chi-square model to find MTBF
MTBF Upper and lower bound is calculated such as 50%, 80% 90%, 95%, etc.
MTBF lower
MTBF upper
2T
,( = 2 n )
2T
2
1 ,( = 2 n )
n Number of defects
T Total operating time
Degrees 0f freedom = 2n
Significance level
Copyright 2007 Wilson Consulting Services, LLC
Page 34 - 47
MTBF contd
Solution
lower bound
MTBF lower
MTBF lower
2T
, ( = 2 n )
2 * 45,000
2
0 .10 , = 2( 6 )
Upper bound
MTBF
MTBF
lupper
lupper
90,000
90,000
= 4,865 hours
18 .5
0 .10 , 12
2T
2
1 , ( = 2 n )
2 * 45 ,000
2
1 0 .10 , = 2(6)
90 ,000
90 ,000
= 14,286 hours
6 .3
0 .90 , 12
Page 35 - 47
MTBF contd
Example 4
An electronics assembly has a goal of 0.99 reliability for one year.
What is the MTBF that the designer should work towards to meet the
goal?
Reliability equation: R(t) = e -t and MTBF = 1/
Hence : R (t ) =
Solution:
MTBF
-t
ln R(t) =
MTBF
t = M T B F * ln R (t)
M TBF =
=
ln R (t)
8760
= 8 7 2 ,0 0 0 h o u r s
-ln 0 .9 9
Page 36 - 47
MTBF contd
Reliability*
99.0%
10,000
90.5%
50,000
60.7%
100,000
36.8%
200,000
13.5%
Page 37 - 47
MTTF
6.0 MTTF
Mean Time To Failure [MTTF] - For non-repairable items, the
ratio of the cumulative operating time to the number of failures
for a group of items.
Example 1: 1200 monitoring devices are operated for 90 days.
During that time, five failures occur.
failures
5
5
=
=
= 19.29E 07
time
1200* 24*90 2,592,000
1
1
= 518,403 hours
MTTF = =
19.29E 07
=
Also,
MTTF =
Page 38 - 47
MTBF contd
An electronics assembly has a goal of 0.99 reliability for one year.
What is the MTTF that the designer should work towards to meet the goal?
Reliability equation: R(t) = e -t and MTTF = 1/
Solving for MTTF
t
MTTF =
ln R(t )
8760
ln R (0.99)
8760
MTTF =
= 871,642 hours
(0.01005)
MTTF =
Page 39 - 47
MTTR
7.0: MTTR
Mean Time To Repair (MTTR) - this is corrective maintenance, which includes
all actions to return a system from a failed to an operating or available state. It is
difficult to plan.
It entails, for example:
1. Preparation time: finding a person for the job, travel, obtaining tools and
test equipment, etc.
2. Active maintenance time, i.e., doing the job
3. Delay time ( logistic time: waiting for spare parts., once the job has been
started.
MTTR
( t
MTBF
Availability =
MTBF + MTTR + mean preventive ma int enance time
Copyright 2007 Wilson Consulting Services, LLC
Page 40 - 47
Relationships Summary
Hazard rate
(failure
rate/hour)
P.D.F.
Fails in
time
Failures per
million
hours
FIT
Fpmh
Description
Hours
Product
MTBF
R(t)
F(t)
h(t)
f(t)
AMPS-24
616860
0.9859
0.01410
1.6211E-06
1.5983E-06
0.1621
1621
1621
1.6211
CPU-3030
325776
0.9735
0.02653
3.0696E-06
2.9882E-06
0.3070
3070
3070
3.0696
CPU-640
307754
0.9719
0.02806
3.2493E-06
3.1582E-06
0.3249
3249
3249
3.2493
NCM-W
NCA
684196
476949
0.9873
0.9818
0.01272
0.01820
1.4616E-06
2.0967E-06
1.4430E-06
2.0585E-06
0.1462
0.2097
1462
2097
1462
2097
1.4616
2.0967
Page 41 - 47
Relationships contd
. Definitions:
Failure rate
The ratio of the number of failures within a sample to the cumulative operating time.
Example: PPM / K hrs.
MTBF
Mean time between failure, which means that 63.2% of the product would have failed by this time.
t
Example:
MTBF = tf (t)dt =
o
te
dt
Reliability R(t)
The probability that an item will perform its intended function without failure under stated conditions for a specified period of time.
Example:
R(t ) = 1 f (t )dt
0
F (t ) =
f ( t ) dt
o
h (t ) =
f (t )
R (t )
Page 42 - 47
Relationships contd
Hazard rate h(t)
The "instantaneous" probability of failure of an item given that it has survived up until that time. Sometimes called the
instantaneous failure rate. It is the failure rate per unit time.
Example: 2.8E-6 / hour,
h (t ) =
f (t )
R (t )
f (t ) =
0t<
. Examples
1. (1% / 1000 hrs.)
One percent per thousand hours would mean an expected rate of 1 fail for each 100 units operating 1000 hours.
Example: 0.1280%/1000 hours means that 1280 failures each 1 million units operating for 1000 hours.
2. (PPM / 1000 hrs.)
One per million per thousand hours means 1 fail is expected out of 1 million components operating for 1000 hours.
Example: 1280 parts per million per thousand hours means that 1280 failures are expected out of 1 million components operating
3. Failure in time (FIT): 1 FIT = 1 Failure in One Billion Hours
Example: 1280 FITS = 1280 failures in one billion hours
Page 43 - 47
N o rm a l
Reliability Models
P ro b a b ility d e n s ity
fu n c tio n , f(t)
P a ra m e te rs
H a z a rd fu n c tio n
(in s ta n ta n e o u s fa ilu re ra te ),
h (t) = f(t) / R (t)
M ean,
0 .4
3
S ta n d a rd
d e v ia tio n ,
R( t )
f( t )
2
1
0
0
1
2
f (t ) =
F a ilu re ra te ,
M TBF,
= -1
D escrib es co n stan t
F a ilu re rate co n d itio n s.
A p p lie s for th e u se fu l
L ife c y c le of m an y
P ro d u c ts. F req u en tly, tim e (t)
Is u sed for x .
W e ib u ll
U se d fo r
m any
reliab ility
a p p lic atio n s.
C an test fo r th e en d
in fa n t m orta lity p erio d .
C an also d esc rib e th e
n o rm a l an d
e x p o n en tial
d istrib u tio n s.
h( t )
0 .2
N u m e ro u s a p p lic a tio n s.
U se fu l w h e n it is eq u a lly
lik e ly th a t re a d in gs w ill
fa ll a b o ve o r b e lo w th e
a ve ra ge .
E x p o n e n tia l
0 .5
(t )
R (t ) =
h (t ) =
f ( t ) dt
f (t )
R (t )
R( t )
f( t )
h( t )
0 .3 6 8
R (t ) =
f( t )
=3
=3
( t)
h (t ) = =
1
=3
=1
R (t)
0 .5
0 .5
L o c a tio n (m in im u m
life ),
C u rv e s s h o w n fo r
= 0
f (t) =
S hape,
S c a le (c h a ra c te ris tic
life ),
f
h (t )
=1
= 0 .5
= 0 .5
=1
= 0 .5
00
.
(t
. exp
0
0
R( t )
exp
h( t)
.( t
Page 44 - 47
Parameters
Gam m a
Describes a situation
when partial failures
can exist. Used to describe
random variables bounded
at one end. The partial
failures can be described
as sub failures. Is an
appropriate model for
the time required for
a total of exactly ?
independent events
to take place if events
occur at a constant rate .
Lognorm al
Failure rate,
= 0.5
1
1
2
h ( t)
a = 0.5
0.5
a=1
a=2
a=2
( )
1 t
( t )
R(t) =
( ) t
t (t ) 1et)dt
f( t )
h( t )
R( t )
0.8
M ean,
a = 0.5
a=1
f (t ) =
a=2
R ( t)
Note: when a is
an integer
I () = (1)!
Hazard function
(instantaneous failure rate),
h(t) = f(t) / R(t)
Reliability function,
R(t) = 1- F(t)
Probability density
function, f(t)
0.6
The Lognormal
distribution is often
a good model for times
to failure when failures
are caused by fatigue
cracks. Let T be a
random variable with a
Lognormal distribution.
By definition the
new random variable
X = ln T will have
a normal distribution.
Standard
deviation,
f( t )
R( t)
0.4
h( t )
0.5
0.2
0
0
0
f( t )
.t .( 2 . )
. exp
0.5
ln ( t
2 .
)
2
R( t )
h( t )
f( t ) d t
f( t )
R( t )
Page 45 - 47
Summary
9.0 Summary
This paper did elaborate on the value of using the MTBF parameter. However, there have been tremendous
improvements in solid-state devices over the years. In earlier times, electronic components were fragile, used
glass tubes, filaments, etc., had inherent wear out mechanisms. By the same token, earlier solid-state devices had
mechanisms that would cause failures in time, such as chemical contamination, metallization defects, and
packaging defects, which resulted in corrosion and delaminating. Many of these defects were accelerated by
high temperature, which resulted in successful use of the burn-in process to weed out infant mortality.
Statistical prediction during that period was valid and accepted because designs at that time consists mostly of
discrete components; therefore, reliability statistical estimates of the life of a new design had for the most part a
reasonable correlation to the actual MTBF. Today hundreds of new components are introduced to the market
almost every week and hundreds are probably taken off the market every week; therefore, it is impossible to
make an accurate prediction based on a summation of parts reliability. Example: Mil-Std-217
Todays components do not have wear-out modes that are within most electronics technologically useful life.
Therefore, the vast majority of failures is due to defects in design or introduced in manufacturing.
Unplanned events in manufacturing such as ECN, change in machine operators, or change in vendors
capabilities of design, introduction of cost reduced parts, etc., any of these or combination can introduce a
decrease in design margins. Hence: this affects reliability and increase field returns. Many experts feel that it is
best to spend time, not on statistical predictions rather on discovering the real capabilities and identifying the
weak links in the design or manufacturing process, and improving them. This approach will help realize
significant improvement in reliability. The end user environment is even more uncontrolled. The end-user will
always push the limits; therefore, a robust design will have a higher survivor rate for these extremes.
Page 46 - 47
References
References
1.
2.
Page 47 - 47