Kumar 1997
ELSEVIER European Journal of Operational Research 99 (1997) 507-515
Abstract
The failure characteristics of a system may depend on the total operating time, operating time since the last repair, failure
history, operating conditions or on the values of monitored variables. A reliability based approach which takes into
consideration values of monitored variables is suggested for estimating the optimum maintenance (or replacement) time
interval for a system or threshold values of monitored variables under the age replacement policy. The maintenance cost
equation is formed on the basis of the planned and unplanned maintenance costs and the values of monitored variables. The
proportional hazards model is used to identify the importance of monitored variables. The reliability function is estimated
considering the values of monitored variables. A total time on test (TTT) plot based on this estimate of the reliability
function is used to estimate the optimum maintenance (or replacement) time interval for a system or threshold values of
monitored variables. This approach is illustrated with an example. © 1997 Elsevier Science B.V.
Keywords: Proportional hazards model; Total time on test; Age replacement policy; Threshold value; Maintenance scheduling
values of the variables at which a planned maintenance should be carried out so that the long run
average maintenance cost per unit time is minimised. The threshold value of a monitored variable is the
maximum/minimum allowable value which leads to a minimal long run average maintenance cost per
unit time of a system, if a planned maintenance is carried out at this value. In the case of a single
continuously monitored variable, the generalised total time on test plot (Barlow and Campo, 1975;
Bergman, 1977; Al-Najjar, 1990) may be used to determine the threshold value of the variable.

Under the age replacement policy, a system is assumed to be renewed by planned and unplanned
maintenance, either through replacements or repairs. A maintenance cost equation is formed from
the planned and unplanned maintenance costs and the reliability function in order to estimate optimum
maintenance (or replacement) time intervals. Our suggestion is to use, in the cost equation, the
reliability function based on both the failure times and the values of monitored variables, instead of
the usual approach based on failure times only (see Barlow and Proschan, 1965). There are several
models for estimating the reliability function based on variables, but the proportional hazards model
(Cox, 1972) appears very robust and requires few assumptions (Newby, 1993). We therefore suggest
using the estimate of the reliability function obtained from the proportional hazards model. Further,
it is suggested that a graphical method called total time on test (TTT) plotting, based on the
proportional hazards model, should be used to estimate the optimum maintenance (or replacement)
time interval or threshold value.

2. Proportional hazards model

The proportional hazards model (PHM) was first introduced by Cox (1972). Since then various
applications of the PHM in reliability analysis have been presented. A general review and available
computer programs for the PHM are given in Kumar and Klefsjö (1994).

The basic approach in proportional hazards modelling is to assume that the hazard rate of a
system consists of two multiplicative factors, the baseline hazard rate, $h_0(t)$, and generally an
exponential function including the effects of the monitored variables. For example, these monitored
variables can be lubricant pressure, temperature, particle contents in hydraulics or vibration levels.
Hence, the hazard rate of a system can be written as (Cox, 1972)

$$h(t; z) = h_0(t) \exp(z\beta), \tag{1}$$

where $z$ is a row vector consisting of covariates, explanatory variables, any monitored variable or any
state indicating variable, and $\beta$ is a column vector consisting of the corresponding regression
parameters. The unknown parameter $\beta$ defines the influence of the monitored variables on the failure
process. In the case of variables monitored continuously or periodically during a life cycle, an
approximation such as the slope of the curve followed by the variable, changes in the slope, or another
appropriate function should be used for $z$. The partial likelihood function is generally used for
estimating the regression vector $\beta$ (see Kalbfleisch and Prentice (1980) and Cox and Oakes (1984)
for details):

$$L(\beta) = \prod_{i=1}^{k} \frac{\exp(z_i \beta)}{\left[\sum_{n \in \mathcal{R}(t_i)} \exp(z_n \beta)\right]^{d_i}}, \tag{2}$$

where $d_i$ is the number of tied failures at time $t_i$, assumed small compared to the number of failure
times in the risk set $\mathcal{R}(t_i)$ at time $t_i$, and $k$ is the number of uncensored observations.
These observations may arise from failures of fewer than $n$ repairable systems with multiple failures
or from $n$ nonrepairable systems. It should be noted that Eq. (2) is based on the assumption that the
failure times are independent and identically distributed. This assumption should be tested for
repairable systems. Here the failure times of a repairable system correspond to the times between
successive failures.

Different methods have been suggested for estimating the baseline hazard rate (see Breslow,
1974; Kalbfleisch and Prentice, 1980; Westberg and Kumar, 1995). We will use the following estimate of
the cumulative baseline hazard rate suggested by Breslow (1974), as it is easier to calculate:

$$\hat{H}_0(t) = \sum_{i:\, t_i < t} \frac{d_i}{\sum_{n \in \mathcal{R}(t_i)} \exp(z_n \hat{\beta})}. \tag{3}$$
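As a rough illustration of Eq. (3), the following Python sketch computes a Breslow-type estimate of the
cumulative baseline hazard for a single covariate and converts it into a reliability estimate through the
relation R(t; z) = exp[-H_0(t) exp(z beta)] derived in Appendix A. The function names, the restriction to
one covariate, and the use of t_i <= t (so that the estimate includes the jump at t itself, whereas
Eq. (3) is printed with a strict inequality) are assumptions made here for illustration. The regression
parameter beta is taken as already estimated.

```python
import math


def breslow_cumulative_hazard(times, events, z, beta, t):
    """Breslow-type estimate of the cumulative baseline hazard H_0(t).

    times  : observed times (failure or censoring)
    events : 1 for a failure, 0 for a censored observation
    z      : covariate value of each observation (single covariate assumed)
    beta   : regression parameter, assumed to be estimated beforehand
    """
    failure_times = sorted({ti for ti, e in zip(times, events) if e and ti <= t})
    total = 0.0
    for ti in failure_times:
        # d_i: number of tied failures at time t_i
        d_i = sum(1 for tj, e in zip(times, events) if e and tj == ti)
        # Sum of exp(z * beta) over the risk set (units still under observation at t_i)
        risk = sum(math.exp(beta * zj) for tj, zj in zip(times, z) if tj >= ti)
        total += d_i / risk
    return total


def reliability(times, events, z, beta, t, z_new):
    """R(t; z_new) = exp(-H_0(t) * exp(z_new * beta)), cf. Appendix A."""
    h0 = breslow_cumulative_hazard(times, events, z, beta, t)
    return math.exp(-h0 * math.exp(beta * z_new))
```

With beta = 0 the estimate reduces to the sum of d_i divided by the number of units at risk, which gives
a simple check of the implementation.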
D. Kumar, U. Westberg/ European Journal of Operational Research 99 (1997) 507-515 509
The reliability function, $R(t; z)$, is related to the baseline reliability function, $R_0(t)$, in the
following manner (see Appendix A):

$$R(t; z) = [R_0(t)]^{\exp(z\beta)}. \tag{4}$$

Hence we can get an estimate of the total reliability function, $R(t; z)$, at any desired time point.
This estimate of $R(t; z)$ is then used in the maintenance cost equation.

3. Maintenance cost equation for age replacement policy

Let $c$ be the planned maintenance cost and $c + a$ be the unplanned maintenance cost, i.e., $a$ is the
additional cost due to failure of the system during operation. The total maintenance cost $C(t, T, z)$
over the time interval $(0, t]$ is a random variable, where $T$ is the time interval between two planned
maintenance activities. Our aim is to find either the value of $T$, say $T_0$, or the threshold value of
the variable $z$, say $z_{th}$, such that the long run average maintenance cost per unit time is
minimised, i.e., we look for either $T$ or $z$ such that

$$C(T; z) = \lim_{t \to \infty} \frac{E[C(t, T, z)]}{t} \tag{5}$$

is minimised.

Under the age replacement policy, unplanned maintenance is carried out after a failure and planned
maintenance is carried out after the system has been working for $T$ units of time since the last
planned or unplanned maintenance (see Fig. 1). A nonrepairable system is replaced by a new one after
any maintenance action. A repairable system is assumed to be renewed at both unplanned and planned
maintenance.

[Fig. 1. Axis label: Hazard.]

The average maintenance cost per unit time in the long run without considering monitored variables is
given in Barlow and Proschan (1965). This maintenance cost equation can be extended to include
monitored variables:

$$C(T; z) = \frac{c + a \cdot F(T; z)}{\int_0^T R(t; z)\,dt}, \tag{6}$$

where $F(T; z)$ is the cumulative distribution function. The optimum maintenance time interval, $T_0$,
or the threshold value, $z_{th}$, that gives the minimum average maintenance cost per unit time in the
long run can be found by minimising Eq. (6). One may use a numerical method (see Love and Guo, 1991) or
a graphical method called TTT-plotting, discussed in the following sections.

4. Total time on test plotting

The TTT-plot and its theoretical counterpart, the scaled TTT-transform, were introduced by Barlow
and Campo (1975). Since then various applications of TTT-plotting have been presented (Bergman and
Klefsjö, 1982; Klefsjö, 1986; Kumar et al., 1989; Dohi et al., 1995). The TTT-plot gives a picture of
the failure data which is independent of the scale and is situated completely within the unit square
with corners at (0, 0), (0, 1), (1, 0) and (1, 1). The TTT-plot explained in this paper differs from the
original TTT-plot in the sense that monitored variable values are considered.

Let $0 = t_{(0)} \le t_{(1)} \le t_{(2)} \le \dots \le t_{(n)}$ denote an ordered and complete sample
from a life distribution $F(t; z)$. Let $S(t_i; z_i)$ be the total time generated in ages by a failure at a
time less than or equal to $t_i$. Then the TTT value at any time $t_i$ is defined as (see Barlow and
Campo, 1975):
where $u_i = F(t_i; z_i)$. The TTT-plot is obtained by plotting $(u, \varphi(u))$ and joining the points
by line segments.

5. Determination of the optimum maintenance (or replacement) time interval

A graphical approach to estimate the optimum maintenance time interval from a TTT-plot was first
suggested by Bergman (1977). Our suggestion is to consider monitored variables in the TTT-plot. If we
draw a line from the point $(-c/a, 0)$ and tangent to the TTT-plot, its slope will be given by

$$\text{Slope} = \frac{\varphi(u)}{c/a + u}. \tag{9}$$

After some transformation, Eq. (6) can be written as (see Appendix A)

$$C(T; z) = \text{constant} \times \frac{c/a + u}{\varphi(u)}. \tag{10}$$

It can be concluded that the slope of a line drawn from the point $(-c/a, 0)$ to the TTT-plot is
inversely proportional to the average unit maintenance cost given by Eq. (6). Therefore, the largest
possible slope will correspond to the minimum average unit maintenance (or replacement) cost. If the
line with the largest possible slope is tangent to the TTT-plot at the point $(u_0, \varphi(u_0))$, the
value of the failure time, $T_0$, which has been used to calculate $(u_0, \varphi(u_0))$, will be an
estimate of the optimum maintenance (or replacement) time interval. Generally, planned maintenance
will not be economical to perform if the line is tangent at the point (1, 1).

To consider the monitored variable values, the TTT value at time $t_i$ can be estimated based on the
estimate of the reliability function given by Eq. (4) using the PHM, that is,

$$S(t_i; z_i) = n \sum_{j=1}^{i} (t_j - t_{j-1})\, R(t_{j-1}; z_{j-1}), \quad \text{for } i = 1, 2, \dots, n \text{ and } R(t_0; z_0) = 1, \tag{11}$$

where $z_i$ are the values of the monitored variables at time $t_i$.

Consideration of values of monitored variables in the TTT-plot is very useful in maintenance
scheduling under different situations. The TTT-plot should be obtained considering the situations for
which an optimum maintenance (or replacement) time interval is to be estimated. Then a line with the
largest possible slope should be drawn tangent to the TTT-plot and passing through the point
$(-c/a, 0)$ on the horizontal axis. The value of the failure time corresponding to the point at which
the line touches the TTT-plot will be an estimate of the optimum maintenance (or replacement) time
interval.

Suppose that more than one variable has been found significant in the PHM analysis. These variables
can either be condition monitored variables or any explanatory factors characterising the system
failure. The TTT-plot should be obtained considering all the significant variables to obtain an
estimate of the optimum maintenance (or replacement) time interval. In some situations, we may be
interested in estimating the optimum maintenance time interval, $T_0$, corresponding to one particular
monitored variable, say $z_1$. The reliability function, $R(t; z_1)$, corresponding to $z_1$ is estimated
using the relation in Eq. (4), i.e.,

$$R(t; z_1) = [R_0(t)]^{\exp(\beta_1 z_1)}. \tag{12}$$

Eq. (12) can be used to estimate $R(t; z_1)$ at all failure times, and a TTT-plot can be obtained to
estimate the optimum maintenance time interval $T_0$ corresponding to the variable value $z_1$. If we are
interested in estimating $T_0$ corresponding to two or more particular variables, we should estimate the
reliability function in the same way as in Eq. (12) for each individual variable. Then, the
corresponding TTT-plots should be used for estimating $T_0$.

The value of a monitored variable may remain the same during a particular interval between
maintenance actions, but it may have a different value during another life cycle. One may be interested
in estimating the optimum maintenance (or replacement) time intervals corresponding to different
values of the monitored variable. For example, let the monitored variable $z_1$ have three levels, say
$z_{10}$, $z_{11}$ and $z_{12}$, and let the monitored variable value be equal to $z_{10}$ during some
operational time until a failure occurs and equal to $z_{11}$ during another operational time until a
failure occurs. One may be interested in
Table 1
Times to failure (hours) of pressure gauges that were replaced on failure or as part of planned
maintenance (planned replacements are denoted by asterisks). These data are taken from the paper by
Love and Guo (1991)

Observation number            1   2   3    4   5   6   7    8    9   10  11  12  13  14   15
Time to failure               32  42  44*  47  51  53  60*  61*  66  70  70  77  95  101  198
Monitored variable, pressure  5   4   5    5   5   4   3    4    3   4   5   4   3   3    3
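The graphical rule based on failure times only can be sketched numerically for the data in Table 1. The
Python sketch below builds the TTT values from Eq. (11), scales them as in Eq. (16), and selects the
failure time whose point gives the largest slope in Eq. (9). The reliability estimate is an assumption:
a Kaplan-Meier estimate is used here, with the planned replacements treated as censored observations,
and this may differ from the estimator behind the reported results. The function names are
hypothetical, and the factor n of Eq. (11) is omitted because it cancels in the scaled values.

```python
# Table 1 data; planned replacements (asterisks) are treated as censored (FAILED = 0).
TIMES = [32, 42, 44, 47, 51, 53, 60, 61, 66, 70, 70, 77, 95, 101, 198]
FAILED = [1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1]


def km_reliability(times, failed):
    """Kaplan-Meier estimate of R(t) at each distinct failure time (assumed estimator)."""
    estimate, r = [], 1.0
    for t in sorted({t for t, f in zip(times, failed) if f}):
        at_risk = sum(1 for tj in times if tj >= t)
        d = sum(1 for tj, f in zip(times, failed) if f and tj == t)
        r *= 1.0 - d / at_risk
        estimate.append((t, r))
    return estimate


def ttt_points(times, failed):
    """Return tuples (t_i, u_i, S_i, phi_i) with u_i = 1 - R(t_i) and phi_i = S_i / S_n."""
    rows, s, prev_t, prev_r = [], 0.0, 0.0, 1.0
    for t, r in km_reliability(times, failed):
        s += (t - prev_t) * prev_r  # Eq. (11) without the factor n
        rows.append((t, 1.0 - r, s))
        prev_t, prev_r = t, r
    total = rows[-1][2]
    return [(t, u, s, s / total) for t, u, s in rows]


def optimum_by_slope(times, failed, cost_ratio):
    """Failure time maximising the slope phi / (c/a + u) of Eq. (9)."""
    points = ttt_points(times, failed)
    return max(points, key=lambda p: p[3] / (cost_ratio + p[1]))[0]


def optimum_by_cost(times, failed, cost_ratio):
    """Failure time minimising Eq. (6) expressed in units of a: (c/a + F(T)) / integral of R."""
    points = ttt_points(times, failed)
    return min(points, key=lambda p: (cost_ratio + p[1]) / p[2])[0]
```

Under these assumptions, `optimum_by_slope(TIMES, FAILED, 0.5)` selects 42 hours, and `optimum_by_cost`
selects the same time, since by Eq. (10) the cost is inversely proportional to the slope.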
Fig. 2. A TTT-plot to find the optimal replacement time interval. The straight line drawn with the
largest possible slope from the point (-0.50, 0) touches the TTT-plot at the point which corresponds
to the failure time of 42 hours. Hence, the optimal maintenance time interval will be 42 hours.

$$\varphi(u_i) = \frac{S(t_i)}{S(t_n)}. \tag{16}$$

To estimate the optimum maintenance or replacement time interval, a line with the largest possible
slope is drawn tangent to the TTT-plot and passing through the point $(-c/a, 0)$ on the horizontal axis.
For $c/a = 0.5$, the corresponding line touches the TTT-plot at the point $(u_0, \varphi(u_0))$, which
corresponds to a failure time of 42 hours. Hence the optimum replacement time interval of pressure
gauges will be 42 hours if $c/a = 0.5$. The estimated optimum maintenance or replacement time interval
is the same even when the cost ratio varies in the range 0.16 to 0.56, because a line drawn from any
point in the interval [-0.56, -0.16] on the horizontal axis will touch the TTT-plot at the same point,
which corresponds to 42 hours. This illustrates that it is easier to carry out a sensitivity analysis
using a TTT-plot than using numerical methods.

7.2. Optimum time interval based on failure times and the monitored variables

[...] on the hazard rate is estimated using the PHM. The effect of a variable estimated in the PHM is a
relative risk. Therefore, it is better to formulate a variable value that includes zero. The pressure
values were therefore coded as -1, 0, and 1 for the pressure values equal to 3, 4, and 5, respectively.
The result of the PHM analysis is listed in Table 2. The effect of the variable pressure was estimated
using Eq. (2). Hence, the hazard rate of the pressure gauge can be written as

$$h(t; z) = h_0(t) \exp(1.188 z), \tag{17}$$

where $z$ is equal to -1, 0, or 1.

In order to estimate the optimum replacement time intervals when the coded value of the variable
is equal to only one of the three values, -1, 0, or 1, TTT-plots corresponding to each of the three
values were obtained. Eq. (13) was used to estimate the corresponding reliability functions. The
corresponding TTT-plots are given in Fig. 3. For $c/a = 0.5$, lines with the largest slope and tangent
to the TTT-plots are drawn from the point (-0.5, 0). These lines touch the TTT-plots at the points
which correspond to failure times of 66, 47 and 42 hours. Hence the optimum replacement time intervals
will be 66, 47 and 42 hours when the variable $z$ is equal to -1, 0, and 1, respectively, i.e., when the
pressure is equal to 3, 4, and 5 units, respectively.

Fig. 3. TTT-plots to find the optimal replacement time intervals when the monitored variable z is equal
to only one of the three possible values, i.e., -1, 0 or 1. Straight lines are drawn with the largest
possible slope from the point (-0.5, 0) and tangent to the corresponding TTT-plots. The points at which
these lines are tangent to the TTT-plots correspond to failure times equal to 66, 47 and 42 hours for
z = -1, 0 and 1, respectively. Hence, the optimal replacement time interval will be 66 (for z = -1),
47 (for z = 0) and 42 (for z = 1) hours. a: TTT-plot when covariate z = -1. b: TTT-plot when covariate
z = 0. c: TTT-plot when covariate z = 1.

7.3. Threshold value of the variable pressure

Estimation of threshold values has no practical significance for the example considered here. But to
illustrate the approach, let us consider a hypothetical situation. Let us assume that the times to
failure are from a process plant and that we can decide the level of pressure at which the plant should
be operated.

To estimate the threshold value, the unit maintenance costs (Eq. (5)) are compared for T = 66, 47
and 42 hours with z = -1, 0, and 1, respectively. The estimated costs per hour are 0.042c, 0.059c and
0.061c, respectively, for c/a = 0.5. Therefore the threshold value of the monitored variable will be 3
units, which corresponds to z = -1. This corresponds to the minimum long run average unit
maintenance cost. Other examples of applications of such an approach are decisions about the maximum
allowable level of particle contents in lubrication oil or the maximum allowable vibration of a system.

8. Conclusions

A new method for maintenance scheduling based on the PHM and TTT-plotting is suggested. The
PHM should be used to find out whether the failure characteristic of a system depends on the failure
times only or on the monitored variable values also. The TTT-plot considering only failure times should
be used when information about the monitored variable is not available or when the monitored variables
are not important in explaining the failure characteristic of a system. Otherwise, the TTT-plot
considering monitored variable values should be used when estimating the optimum maintenance time
intervals or threshold values. This approach can be used, with some limitations, in the presence of any
factors influencing the reliability characteristics of a system. It will be difficult to use when there
is interaction among monitored variables or any other influencing factors.
Appendix A

A.1. Relation between R(t; z) and R_0(t)

$$\begin{aligned}
R(t; z) &= \exp\left[-\int_0^t h(x; z)\,dx\right] \\
&= \exp\left[-\int_0^t h_0(x) \exp(z\beta)\,dx\right] \\
&= \exp\left[-H_0(t) \exp(z\beta)\right] \\
&= \exp\left[\ln(R_0(t)) \exp(z\beta)\right] \\
&= [R_0(t)]^{\exp(z\beta)}.
\end{aligned}$$

A.2. Transformation of maintenance cost equation

$$C(T; z) = \frac{c + a \cdot F(T; z)}{\int_0^T R(t; z)\,dt}
= \frac{a}{\mu} \cdot \frac{c/a + F(T; z)}{\frac{1}{\mu} \int_0^{F^{-1}(F(T; z))} R(t; z)\,dt}$$

[...] $\int_0 (1 - u)\,dt$ [...] or

$$\frac{S(t_i; z_i)}{S(t_n; z_n)},$$

where

$$S(t_i; z_i) = n \int_0^{t_i} R(t; z)\,dt = n \int_0^{t_i} (1 - u)\,dt.$$

References

Al-Najjar, B. (1990), "Some problems on the selection of a condition based maintenance technique for
mechanical systems", Licentiate Thesis No. 248, Linköping University.
Barlow, R.E. and Proschan, F. (1965), Mathematical Theory of Reliability and Life Testing, Wiley,
New York.
Barlow, R.E. and Campo, R. (1975), "Total time on test process and applications to failure data
analysis", in: R.E. Barlow, J. Fussel and N.D. Singpurwalla (eds.), Reliability and Fault Tree
Analysis, SIAM, Philadelphia, PA, 451-481.
Bergman, B. (1977), "Some graphical methods for maintenance planning", Annual Reliability and
Maintainability Symposium, 467-471.
Bergman, B. and Klefsjö, B. (1982), "A graphical method applicable to age replacement problems",
IEEE Transactions on Reliability R-31, 478-481.
Breslow, N.E. (1974), "Covariance analysis of censored survival data", Biometrics 30, 89-99.
Cho, D.I. and Parlar, M. (1991), "A survey of maintenance models for multi-unit systems", European
Journal of Operational Research 51, 1-23.
Cox, D.R. (1972), "Regression models and life-tables", Journal of the Royal Statistical Society B 34,
187-220.
Cox, D.R. and Oakes, D. (1984), Analysis of Survival Data, Chapman and Hall, London.
Dohi, T., Kaio, N. and Osaki, S. (1995), "Solution procedure for a repair-limit problem using the TTT
concept", IMA Journal of Mathematics Applied in Business and Industry 6/1, 101-111.
Kalbfleisch, J.D. and Prentice, R.L. (1980), The Statistical Analysis of Failure Time Data, Wiley,
New York.
Kaplan, E.L. and Meier, P. (1958), "Non-parametric estimation from incomplete observations", Journal
of the American Statistical Association 53, 457-481.
Klefsjö, B. (1986), "TTT-transforms - A useful tool when analysing different reliability problems",
Reliability Engineering 4, 231-241.
Kumar, D. and Klefsjö, B. (1994), "Proportional hazards model: a review", Reliability Engineering and
System Safety 44, 177-188.
Kumar, U., Klefsjö, B. and Granholm, S. (1989), "Reliability investigation for a fleet of load haul
dump machines in a Swedish mine", Reliability Engineering and System Safety 26, 341-361.
Love, C.E. and Guo, R. (1991), "Using proportional hazard modelling in plant maintenance", Quality and
Reliability Engineering International 7, 7-17.
Newby, M. (1993), "A critical look at some point process models for repairable systems", IMA Journal
of Mathematics Applied in Business and Industry 4, 375-394.
Valdez-Flores, C. and Feldman, R.M. (1989), "A survey of preventive maintenance models for
stochastically deteriorating single-unit system", Naval Research Logistics 36, 419-446.
Westberg, U. and Kumar, D. (1995), "Estimation of the cumulative baseline hazard rate in the
proportional hazards model", in: Proceedings of the International Society of Science and Applied
Technology, ISSAT, Florida, 254-258.