A Monte Carlo Methodological Approach To Plant Availability Modeling
www.elsevier.com/locate/ress
Abstract
In this paper we present a Monte Carlo approach for the evaluation of plant maintenance strategies and operating procedures under
economic constraints. The proposed Monte Carlo simulation model provides a flexible tool which enables one to describe many of the
relevant aspects for plant management and operation such as aging, repair, obsolescence, renovation, which are not easily captured by
analytical models. The maintenance periods are varied with the age of the components. Aging is described by means of a modified Brown–Proschan
model of imperfect (deteriorating) repair which accounts for the increased proneness to failure of a component after it has been
repaired. A model of obsolescence is introduced to evaluate the convenience of substituting a failed component with a new, improved one.
The economic constraint is formalized in terms of an energy, or cost, function; optimization studies are then performed using the
maintenance period as the control parameter. © 1999 Elsevier Science Ltd. All rights reserved.
Keywords: Monte Carlo simulation; Periodic maintenance; Aging; Obsolescence; Availability; Energy function; Optimization
Fig. 2. (a) Cumulative distributions: Weibull (α = 1.1, β = 0.021) vs exponential (λ = 0.0206). (b) Probability density functions: Weibull (α = 1.1,
β = 0.021) vs exponential (λ = 0.0206). Horizontal axes: time (h).
maintenance period)

\[ \lambda^{*} = \beta^{\alpha}\,\tau^{\alpha-1} = \frac{1}{\alpha}\,\lambda(\tau) \qquad (5) \]

which is the average of the λ(t) function over the period τ. Obviously, for α-values close to unity the Weibull distribution is almost exponential and the two failure models are almost coincident throughout (Fig. 2(a) and (b)); otherwise, the discrepancy becomes significant (Fig. 3(a) and (b)) and coincidence of the CDFs occurs only at t = τ. In this latter case, assuming small repair times, the average number of failures within the period is the same by construction but the CDFs are quite different and the failures occur at very different times: the exponential distribution somewhat favors early failures whereas the Weibull distribution shifts the failures to later times, closer to the end of the period.

Note that the effective failure rate of a component, λ*, is strictly linked to the maintenance period τ and this will have significant effects on the optimization of maintenance with respect to this parameter, as will be seen below.

As for the repair process, we shall adopt the usual assumption of a constant repair rate, μ. Although we realize that the repair process is anything but Markovian, this assumption, which can be easily removed in a Monte
Fig. 3. (a) Cumulative distributions: Weibull (α = 2, β = 0.021) vs exponential (λ = 0.0176). (b) Probability density functions: Weibull (α = 2, β = 0.021) vs
exponential (λ = 0.0176). Horizontal axes: time (h).
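The equivalence expressed by Eq. (5) can be checked numerically. The following Python sketch assumes a Weibull CDF of the form F(t) = 1 - exp(-(βt)^α) and a maintenance period τ = 40 h. Both are assumptions, chosen because they reproduce the λ values quoted in the captions of Figs. 2 and 3; the function names are illustrative only.

```python
import math

def equivalent_rate(alpha, beta, tau):
    """Average of the Weibull hazard alpha*beta**alpha*t**(alpha-1) over [0, tau]."""
    return beta ** alpha * tau ** (alpha - 1)

def weibull_cdf(t, alpha, beta):
    # Assumed parameterization: F(t) = 1 - exp(-(beta*t)**alpha)
    return 1.0 - math.exp(-((beta * t) ** alpha))

def exponential_cdf(t, lam):
    return 1.0 - math.exp(-lam * t)

TAU = 40.0   # assumed maintenance period (h)
BETA = 0.021
for alpha in (1.1, 2.0):
    lam = equivalent_rate(alpha, BETA, TAU)
    print(f"alpha={alpha}: equivalent rate = {lam:.4f} 1/h")
    for t in (TAU / 2, TAU):
        print(f"  t={t:5.1f} h  Weibull={weibull_cdf(t, alpha, BETA):.4f}  "
              f"exponential={exponential_cdf(t, lam):.4f}")
```

Under these assumptions the two CDFs agree at t = τ, while at t = τ/2 the exponential CDF is the larger one, which is the "early failures" effect described in the text.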
64 E. Borgonovo et al. / Reliability Engineering and System Safety 67 (2000) 61–73
Fig. 4. Linear growth of failure rate within the maintenance period τ and counterbalancing effect of maintenance.

Carlo approach, will allow us to compare the analytical results with those obtained by the Monte Carlo simulation. Moreover, since the asymptotic system availability of a repairable component depends on the average repair time, the approximation of constant repair rate is significant only during the transient evolution [13].

Even with maintenance counterbalancing its effects, some aging of the components is inevitable. Given that we are going to use equivalent constant transition rates, we account for an effect of component deterioration due to extensive operation through the outcome of the repair process. More specifically, we assume that as a result of a repair action the component might not necessarily return to an “as good as new” condition since it is likely to become more fragile and prone to future failures. To account for imperfect, deteriorating repairs, we adopt a modified Brown–Proschan model of stochastic repairs which postulates that a system is repaired to an “as good as before” condition (minimal repair) only with a certain probability p and is, otherwise, returned in a “deteriorated” condition (deteriorating repair) [14,15]. Thus, these two conditions obey a Bernoulli distribution. Inclusion of this model within the Monte Carlo simulation scheme is straightforward. When a repair action is completed, we sample a uniform random number r in [0, 1): if r < p, then the repair is minimal and the failure properties of the component are returned to the conditions existing prior to failure; otherwise, repair is [...]

2.2. Maintenance

It is well recognized that maintenance is a central theme of plant management: indeed, an efficient maintenance policy may ensure safe, reliable and economic operation. On the contrary, an inefficient scheduling and choice of maintenance actions guarantees at least a waste of resources.

Various maintenance criteria have been followed to fulfill the large variety of requirements and constraints of the industrial world. More recently, the criterion of RCM has been supported as a unifying concept in maintenance practice [3]. It is essentially a qualitative approach aiming at developing a maintenance scheme which satisfies both the reliability and the economic constraints of plant management and operation. The main goal of RCM is that of identifying those maintenance activities which guarantee a certain level of system reliability: however, in the case of limited resources, as is always the case in practice, a compromise is sought between the number, frequency and type of intervention and the operation and management costs of the plant. It is then clear how relevant it is to study efficient maintenance strategies and how important it is to develop appropriate models that render the analysis quantitative.

The model here proposed is based on the following assumptions: (i) maintenance occurs only while the system is available; (ii) maintenance periodicity varies with the component’s aging due to imperfect, deteriorating repair; (iii) the maintenance action is such as to restore the conditions existing at the beginning of the previous maintenance period (Fig. 4); and (iv) the maintenance action is instantaneous.

In realistic situations, the maintenance activities become more and more frequent as the component ages. In our
Fig. 5. Adaptive maintenance period for an aging component (vertical axis: λ*(t)). After component failure and repair (with time T_rep) the component ages according to the Brown–Proschan
model and the maintenance period is shortened from τ to τ′.
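The repair rule and the adaptive period of Fig. 5 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the argument name `deterioration` (the fractional increase of the failure rate after a deteriorating repair) and the starting values are hypothetical, and the period is rescaled so that the product of maintenance period and failure rate stays constant, as the text describes.

```python
import random

def brown_proschan_repair(lam, tau, p, deterioration, rng=random):
    """Return (failure rate, maintenance period) after a completed repair.

    With probability p the repair is minimal and nothing changes; otherwise
    the failure rate is multiplied by (1 + deterioration) and the period is
    divided by the same factor, so that tau * lam stays constant.
    """
    r = rng.random()  # uniform random number in [0, 1)
    if r < p:
        return lam, tau
    factor = 1.0 + deterioration
    return lam * factor, tau / factor

rng = random.Random(0)      # fixed seed, only for reproducibility
lam, tau = 0.0176, 20.0     # illustrative starting values (1/h, h)
for k in range(5):
    lam, tau = brown_proschan_repair(lam, tau, p=0.8, deterioration=0.3, rng=rng)
    print(f"after repair {k + 1}: rate={lam:.4f} 1/h, period={tau:.2f} h")
```

Passing a seeded `random.Random` instance keeps a run reproducible; in a full simulation the same generator would also drive the failure and repair times.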
model, effectively, deterioration of a component is due to the imperfect repairs, as for the Brown–Proschan model previously introduced, and we allow for an adaptive schedule of maintenance intervention according to which the ratio between the maintenance period τ and the mean time 1/λ between successive failures is kept constant. Thus, after a minimal repair λ_new = λ_old and, then, τ_new = τ_old; vice versa, after a deteriorating repair, λ_new = (1 + π) λ_old and, then, τ_new = τ_old λ_old/λ_new = τ_old/(1 + π). Fig. 5 shows the situation for α = 2 and π = 1.

2.3. The profit function

[...] denote by ⟨n_j⟩ the average number of repairs undergone by the jth component over the whole mission time T_MISS. The energy function can then be written as follows:

\[ E(T_{MISS},\tau) = \int_0^{T_{MISS}} \Big[ B_0\,A(t,\tau) - \sum_j C_{Mj}\,\frac{1}{\tau_j}\,A(t,\tau) \Big]\,dt \;-\; \sum_j C_{Rj}\,\frac{1}{\mu_j}\,\langle n_j \rangle \qquad (7a) \]

or

\[ E(T_{MISS},\tau) = E_0 - (E_U + E_M + E_R) \qquad (7b) \]
[...] history but to a change in the external scenario of technological evolution and marketing [20]. The overcoming of a given technology due to technical, legislative and/or marketing reasons typically leads to a decrease in value of the system which is not necessarily related to its past or current performance but can certainly influence its future life. Indeed, the availability on the market of improved components offers plant managers the enviable opportunity of upgrading their system performance while rejuvenating the system itself.

In this section, we formalize the issue of obsolescence in a quantifiable manner and see how resource constraints play a fundamental role in the management of this problem. Qualitatively, we can say that as the system components age, the overall management costs increase mainly due to downtime costs; at the same time, new, improved components become available on the market and this further reduces the current plant value; the substitution of an old component with a new, improved one does increase the system performance but at a cost of purchase. The problem posed by the obsolescence issue is, then, that of deciding whether to continue operation with the current plant status or renovate it, partially or totally. To account for the various issues at stake, we postulate that as calendar time goes by new components are available on the market and they are characterized by a failure rate which decreases exponentially. If we buy a component at time t_0, components of the same kind appear in the market at later times and they have better λ’s according to the expression:

\[ \lambda_{j_{NEW}}(t) = \lambda_{j_{NEW}}(t_0)\, e^{-\sigma_j (t - t_0)}, \qquad t \ge t_0 \qquad (8) \]

where σ_j is the rate of decrease in the failure rate of the newly produced components and λ_jNEW(t_0) is the failure rate of the component purchased at t_0.

Typically the decision of replacing the nominal jth component with a new, improved one is faced at the time of failure, so that the alternatives are: repair, at an average cost C_Rj × 1/μ_j (where C_Rj is the hourly cost of repair of the jth component and the second factor represents the nominal average time for repair completion), or renovation by purchasing a new component j_NEW that has become available in the market at time t, at a cost C_NjNEW(t). This cost depends on many factors related to the patterns of evolution of both the related technology and market. Since a detailed modeling of these factors is beyond the scope of this paper, we simplified the issue by considering the purchase cost constant over time.

An additional factor in the decision is the residual value V_Rj(t − t_0) of the jth component, if repaired. In this regard, we assume that it decreases continuously from the time of purchase t_0 according to an exponential law with parameter θ_j.

Finally, the decision of replacing a component with a newly available one depends also on how the increased benefits and reduced costs associated with the operation and maintenance of the renovated system, from now on, compare with those that would be obtained with the old system, without renovation. The profit functions E(t → T_MISS, τ | j_NEW) and E(t + 1/μ_j → T_MISS, τ | j) will serve as measures of the benefits and costs in the renovated and old system configuration, respectively. Note that the renovation process is considered instantaneous whereas the average repair time 1/μ_j of the failed component is accounted for explicitly.

We now have all the ingredients to make a decision on how to proceed when the jth component fails: we will proceed to renovation, instead of repair, if:

\[ E(t \to T_{MISS}, \tau \mid j_{NEW}) + C_{N j_{NEW}}(t) \;\le\; E\Big(t + \frac{1}{\mu_j} \to T_{MISS}, \tau \,\Big|\, j\Big) + C_{Rj}\,\frac{1}{\mu_j} + V_{Rj}(t) \qquad (9) \]

The two sides of the inequality represent the total profit of running the system for the remaining portion of the mission time with the new or the old (repaired) component, respectively.

A highly complicated issue is the evaluation of the profits E(t → T_MISS, τ | j_NEW) and E(t + 1/μ_j → T_MISS, τ | j). The complication lies in the fact that, from the current system status as resulting from the past failure–repair–aging history, one should consider all possible future evolutions, thus facing the combinatorial explosion of possible scenarios. In our Monte Carlo approach this problem is drastically approximated by following system evolutions in which only one component can be renovated during T_MISS. As for E(t → T_MISS, τ | j_NEW), before starting the Monte Carlo simulation we establish a suitable sequence of times T_i, i = 1, 2, …, N_0. For each component we then pre-compute N_0 batches of 1000 Monte Carlo trials to evaluate the value of E′(T_i → T_MISS, τ | j_NEW), i = 1, 2, …, N_0, which represents the profit of operating the system with the renovated component from the time T_i to the end of the mission time. For the evaluation of E(t + 1/μ_j → T_MISS, τ | j), before starting the Monte Carlo simulation we pre-compute one batch of 1000 trials with no renovations. When during the actual simulation the jth component fails at time t, we interpolate between the adjacent T_i values to determine E(t → T_MISS, τ | j_NEW) and retrieve the pre-computed value of E(t + 1/μ_j → T_MISS, τ | j); then, we insert these quantities in the inequality (9) to decide whether or not to actually substitute the failed component.

The above approximations seem reasonable since, in general, we do not expect the important components of the system to be renovated frequently within T_MISS.

3. The reference system

For the application of the proposed methodology we considered a gas compression system, taken from literature [16], comprising an active and a standby compressor which
Fig. 7. System cost: (a) maintenance; (b) downtime; (c) repair; (d) total.
Fig. 8. (a) Instantaneous and (b) integral behavior of the costs as a function of time, in case of Brown–Proschan aging after repair (p = 0.8, π = 0.3, τ = 20 h).
Note also how, in this unrealistic case, the chosen cost values are such that maintenance expenditures become predominant at most times, when maintenance actions are made very frequent so as to overcome the running aging process.

The repair costs also show a similar, peaked behavior (crosses). Indeed, aging at the beginning worsens the behavior of the components, thus inducing also more frequent repairs. However, since the system failure (both components C1 and C2 down) constitutes an absorbing configuration, as time goes by the system is more and more in this state of unavailability where repairs are not performed. Finally, the value E′(T_MISS, τ) of the cumulated global costs at T_MISS in Fig. 8(b) can be considered as the total amount of expenses needed to operate the system up to that time. This shall be compared with the gained benefits from plant operation.

4. Maintenance optimization

The problem of choosing maintenance strategies is of foremost importance in plant management and operation. An efficient strategy should aim at guaranteeing the level of performance and availability of the system while allowing for a reduction in the resource expenditures.

The Monte Carlo scheme proposed here allows for a quantitative analysis of the maintenance strategies. The definition of an appropriate system energy model, such as the one proposed in the previous section, enables one to perform a search of the optimal strategy in terms of a maximization of the energy function E (Eqs. (7a) and (7b)), which corresponds to a minimization of E′ = E_U + E_M + E_R. In this section the search of the optimal maintenance period τ* for the reference system presented in Section 3 is performed within the Monte Carlo simulation framework.

As mentioned in Section 2.2, to describe the failure behavior of the system components instead of considering a Weibull distribution, we introduce an equivalent exponential distribution such that the probability of a failure at any time within an interval between maintenance actions is the same. By doing so, we are able to establish a connection between the failure rate of a component and the maintenance period τ (Eq. (5)) which forms the basis for the present optimization of the maintenance period. This implies that the optimization regards only the starting maintenance period referring to ‘as new’ components; this initial maintenance period is then modified during the components’ life so as to counteract the Brown–Proschan degradation of the corresponding failure rates, as explained in Section 2.2. Indeed, Eq. (5) implies that the smaller τ is, the smaller the components’ failure rates are, so that the system is more available and produces more benefits; on the other hand, there will be more maintenance actions and this will increase the associated costs. Therefore, if we look at the contributions of the cost function E′ in Eqs. (7a) and (7b), we expect that as τ decreases: the downtime cost E_U decreases, as does the contribution due to repairs, E_R, since the probability of failure decreases; the maintenance term E_M, on the contrary, increases. The optimal choice of τ will then represent a compromise between the behaviors of the various contributions. Obviously, if τ* turns out to be very small, this means that the system design and/or components’ choice were poor; moreover, the assumption of instantaneous maintenance should be revisited. In the opposite case, a very large τ* would imply a very good system design and/or components characteristics; this would bring in issues of capital costs/interests and their relation to the goodness of the components which have not been considered here, for simplicity.

While an analytical search for the optimal τ-value is exceedingly complicated, the Monte Carlo approach is rather straightforward. We, a priori, define a range for τ within which the search is to be performed. From this range, we select a number of values τ_i, i = 1, 2, …, N_M, and for each value we perform a batch Monte Carlo evaluation of the cost function E′(t, τ_i). The choice of the optimal period will then fall on that value τ* which minimizes the cumulative cost E′(t, τ*) at t = T_MISS. This approach was applied to the reference system for values of τ_i in the range [2, 38] (in hours) and with a value of β (Eq. (5)) such that the failure rate of component C1 (which is the only one to be maintained and repaired) is equal to 9 × 10^-4 h^-1 in
Fig. 9. System costs as a function of the maintenance period τ: (a) maintenance; (b) downtime; (c) repair; (d) total.
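The search over candidate maintenance periods can be illustrated with a deliberately simplified Python sketch. It is not the simulation used for Fig. 9: it assumes a single repairable component, the equivalent exponential failure rate of Eq. (5) with α = 2, maintenance accruing at a rate 1/τ while the component is available, and hypothetical values for the repair rate, mission time and the costs `c_m`, `c_u`, `c_r`. All function and parameter names are illustrative.

```python
import random

def mean_cost(tau, alpha, beta, mu, t_miss, c_m, c_u, c_r, n_trials, rng):
    """Monte Carlo estimate of the cumulative cost E' = E_U + E_M + E_R at t_miss."""
    lam = beta ** alpha * tau ** (alpha - 1)  # equivalent constant failure rate
    total = 0.0
    for _ in range(n_trials):
        t = up = down = 0.0
        while t < t_miss:
            dt = min(rng.expovariate(lam), t_miss - t)  # time to next failure
            up += dt
            t += dt
            if t >= t_miss:
                break
            dt = min(rng.expovariate(mu), t_miss - t)   # repair duration
            down += dt
            t += dt
        # Maintenance is charged at rate 1/tau while available; downtime and
        # repair are charged per hour spent down (single-component shortcut).
        total += c_m * up / tau + (c_u + c_r) * down
    return total / n_trials

def optimal_period(taus, **kw):
    """Evaluate every candidate period and return the cheapest one with all costs."""
    costs = {tau: mean_cost(tau, **kw) for tau in taus}
    best = min(costs, key=costs.get)
    return best, costs

params = dict(alpha=2.0, beta=0.021, mu=0.1, t_miss=1000.0,
              c_m=1.0, c_u=8.0, c_r=2.0, n_trials=2000,
              rng=random.Random(42))          # all values hypothetical
best, costs = optimal_period(range(2, 39, 4), **params)
for tau, c in costs.items():
    print(f"tau={tau:2d} h  E'={c:8.1f} $")
print("estimated optimal period:", best, "h")
```

The sketch reproduces the qualitative trade-off only: maintenance cost dominates for small periods, downtime cost for large ones, and the minimum lies in between. A finer grid or more trials per batch would be needed to locate the optimum more precisely.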
correspondence of the mean value τ = 20 h. It turns out that β = 0.021. The number of Monte Carlo trials per batch was chosen equal to 10 000.

We first consider the simple case with no aging and obsolescence, for which the analytical solution to the energy function can be obtained [16]. Fig. 9 shows the good agreement between the analytical and Monte Carlo results for the hourly cumulative costs due to downtime, maintenance and repair, as well as for the global hourly cost E′(T_MISS | τ_i)/t ($/h), for various values of τ_i: the optimal choice for the period τ* turns out to be 9.2 h.

As expected, Fig. 9(a)–(d) shows that for small values of τ_i, the costs due to highly frequent maintenance actions are such as to increase considerably the global costs; as the value
Fig. 10. System costs as a function of the maintenance period τ in presence of aging (p = 0.8, π = 0.3).
Fig. 11. System cumulative benefits and total costs as a function of time: (a) non-optimized τ = 20 h; (b) optimized τ* = 5.6 h.
of τ_i is increased, the maintenance costs decrease but downtime, and its associated costs, slowly increase: the conflicting trends of these contributions give rise to a minimum of the global cost function; finally, for very low maintenance frequencies the system availability deteriorates significantly and the dominating contribution of the downtime costs gives rise to very large global cost values.

Let us now see what happens when aging is considered. Fig. 10 reports the Monte Carlo simulation results for the case of Brown–Proschan aging with p = 0.8 and π = 0.3. The optimal maintenance period reduces to τ* = 5.6 h: this is due to the fact that aging worsens the availability of the system so that the contribution of the downtime costs is felt at earlier times. The importance of the choice of the maintenance period can also be seen from the point of view of the cost-benefit analysis of plant operation. Fig. 11 shows the cumulative benefits and total costs, in dollars, as a function of time for the aging system with (a) non-optimized period τ = 20 h and (b) optimized period τ* = 5.6 h. In the first case, the plant turns out to be making a net profit only for the first 2300 h, after which the downtime costs are such as to dominate (Fig. 11(a)). On the contrary, the optimized maintenance period is such as to render the operation of the plant profitable throughout the mission time (Fig. 11(b)).

Finally, we investigate the effects of obsolescence on the reference system of Section 3 with the data of Table 3, N_0 = 3 and maintenance period τ = 20 h.

Fig. 12 reports the results of the system unavailability for the cases of no aging (stars), aging (crosses) and aging plus obsolescence (circles). The effect of obsolescence is seen to improve significantly the availability performance of the system as it counteracts the aging of the components.

Fig. 13 compares the effects of obsolescence on the cumulated total costs E′. Obviously, the results strongly depend on the input data: as shown in the figure, the operating costs of the system are significantly lowered in the case of renovation of component C1 at a price C_N1 equal to $5 (a); the advantages of renovation are completely defeated when the cost of a new component C1 is raised from $5 to $100 (b).

5. Conclusions

The operation and management of a plant requires proper accounting for the constraints coming from safety and reliability requirements as well as from budget and resource considerations. The analyses that need to be performed in order to evaluate the maintenance strategies and operating procedures need to consider many practical aspects, such as aging, repair, obsolescence, renovation, which are almost
Table 3
Data for the obsolescence process
Fig. 13. Total cumulative costs for the system, with and without obsolescence. (a) C_N1 = $5; (b) C_N1 = $100.
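The renovation decision behind these results can be sketched in Python as follows. The sketch transcribes inequality (9) literally and treats the E terms simply as pre-computed numbers; it does not settle their sign convention. The function names, the initial residual value `v0`, and every numerical value are hypothetical placeholders, not the data of Table 3.

```python
import math
from bisect import bisect_right

def market_failure_rate(lam_t0, sigma, t, t0=0.0):
    """Failure rate of components offered on the market at time t >= t0 (Eq. (8))."""
    return lam_t0 * math.exp(-sigma * (t - t0))

def residual_value(v0, theta, t, t0=0.0):
    """Exponentially decaying residual value; v0 (value at purchase) is hypothetical."""
    return v0 * math.exp(-theta * (t - t0))

def interpolate(times, values, t):
    """Linear interpolation of pre-computed values between the adjacent T_i."""
    if t <= times[0]:
        return values[0]
    if t >= times[-1]:
        return values[-1]
    k = bisect_right(times, t)
    w = (t - times[k - 1]) / (times[k] - times[k - 1])
    return (1 - w) * values[k - 1] + w * values[k]

def renovate(e_new, c_new, e_old, c_repair_hourly, mu, v_res):
    """Literal transcription of inequality (9): True means renovate instead of repair."""
    return e_new + c_new <= e_old + c_repair_hourly / mu + v_res

# Hypothetical pre-computed batches at N0 = 3 times T_i (h) and a failure at t = 250 h.
T = [0.0, 500.0, 1000.0]
E_new_at_T = [300.0, 200.0, 0.0]
t_fail = 250.0
e_new = interpolate(T, E_new_at_T, t_fail)
v_res = residual_value(v0=10.0, theta=0.002, t=t_fail)
for price in (5.0, 100.0):
    choice = renovate(e_new, price, e_old=260.0, c_repair_hourly=2.0, mu=0.1, v_res=v_res)
    print(f"purchase cost ${price:.0f}: renovate = {choice}")
```

With these placeholder numbers the decision flips when the purchase cost is raised, which mirrors the qualitative contrast between panels (a) and (b) of Fig. 13 without reproducing its values.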
[19] Framatome spare parts expertise. Framatome Nuclear Newsletter, no. 51, 1997.
[20] Song JS, Zipkin PH. Managing inventory with the prospect of obsolescence. Oper Res 1993;44:215–24.
[21] Marseguerra M, Zio E. Nonlinear Monte Carlo reliability analysis with biasing towards top event. Reliab Engng System Safety 1993;40:31–42.