10 2004 Reliability Analysis of A Dynamic Phased Mission System
Marc BOUISSOU
Electricité de France R&D
1 avenue du Général de Gaulle - 92141 Clamart France
and CNRS UMR 8050, Université de Marne la Vallée, France
E-mail: marc.bouissou@edf.fr
Yves DUTUIT
Université Bordeaux 1/LAPS
351 cours de la Libération - 33405 Talence cedex France
E-mail: yves.dutuit@iut.u-bordeaux1.fr
Sidoine MAILLARD
Institut National Polytechnique de Lorraine (CRAN)
Faculté des sciences - BP 239, 54506 Vandoeuvre Cedex, France
E-mail: sidoine.maillard@laposte.net
1. Introduction
With the increasing complexity and automation of systems encountered in
many domains, such as the nuclear, aerospace, chemical and electronic
industries, safety studies are becoming more and more complex and
multifaceted. Nowadays, a phased mission analysis methodology is being recognized as
March 15, 2005 16:34 WSPC/Trim Size: 9in x 6in for Review Volume article˙MMR2004˙long˙V2
• the tasks performed within a phase may differ from phase to phase.
• performance and dependability requirements can be different from
one phase to another.
• the system may be subject to a particularly stressing environment
in a specific phase, thus experiencing increases in the failure rate
of its components.
• the structure may change over time, whether or not in response to the
performance and/or dependability requirements of the current phase.
• the successful completion of a phase may bring a different benefit
to the PMS with respect to that obtained with other phases.
Faced with this duality, phased mission techniques are required for the
proper analysis of such problems, in particular when switching procedures
are carried out or equipment is reassembled into new systems at
predetermined times. An important quantitative phased mission analysis
problem is to calculate exactly, or to obtain bounds for, mission
unreliability, defined as the probability that the system fails to function
successfully in at least one phase. Dependability modelling and evaluation
of such systems have mainly been addressed with Fault Tree models and with
models based on Markov processes. The dynamic structure of phased systems
makes the analysis more complex than for single-phase systems. Models such
as Fault Trees and Reliability Block Diagrams were widely used to analyse
the dependability of phased mission systems 14,18,19. More recently, a new
family of approaches based on Fault Trees has been proposed 23. It exploits
the reduction in computational complexity made possible by techniques based
on Binary Decision Diagrams. Reference 20 applies the Fault Tree methodology
to the dependability analysis of PMS with non-repairable and repairable
components. State-based modelling approaches based on Markov chains and
Petri nets were also applied because of their ability to represent complex
dependencies among system components 1,2,11. Combinatorial models provide
simpler formalisms that allow a very intuitive mapping between modelling
elements and system failures. Moreover, it is quite straightforward to
exploit the results of classical qualitative analyses, such as those made
available by Failure Modes and Effects Analysis (FMEA), to build
quantitative Fault Tree or Reliability Block Diagram dependability models.
On the other hand, such models show severe limitations with respect to the
representation of dependencies among different system components, imperfect
coverage of fault containment mechanisms, and repair actions for failed
units and sub-systems. State-space models exhibit a higher flexibility with
respect to representation capabilities. However, such generality comes at a
price: a higher complexity of both the modelling formalism itself and of the
modelling process. The above considerations on flexibility and
expressiveness apply generally to any modelling formalism. In the specific
case of PMS, additional complexities must be handled by dependability
modellers because of the phased behaviour of the systems to be analysed.
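The definition of mission unreliability given above can be written compactly. The following is only a sketch in notation that does not come from the original text: n is assumed to be the number of phases and F_i the event "the system fails during phase i". The two bounds are the elementary ones that hold for the probability of any union of events.

```latex
U_{\mathrm{mission}}
  = P\Bigl(\bigcup_{i=1}^{n} F_i\Bigr)
  = 1 - P\Bigl(\bigcap_{i=1}^{n} \overline{F_i}\Bigr),
\qquad
\max_{1 \le i \le n} P(F_i) \;\le\; U_{\mathrm{mission}} \;\le\; \sum_{i=1}^{n} P(F_i).
```

The upper bound is tight only when the phase failure events are mutually exclusive, which is one reason why exact methods are of interest.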
In the literature, there are examples of separate modelling studies for
both combinatorial and state-space based approaches. Recently, some
hierarchical approaches 2 have tried to combine the best aspects of the two
choices while alleviating the limitations of each. They allow for the
definition of a single high-level model of the PMS, whose only purpose is to
define the sequence of phases, and of a second, lower-level modelling layer,
which focuses on the intra-phase behaviour of the PMS. Nowadays,
combinatorial and state-space modelling formalisms still represent the two
dominant approaches to PMS dependability analysis. Each approach has its own
advantages and weaknesses, and the choice of the best one largely depends on
the specific characteristics of the system at hand and on the goals of the
analysis.
The remainder of this article illustrates the use of two reliability
analysis methods applied to a simple, but not trivial, two-phase problem. The
system proposed as a test case enables us to compare the respective benefits
and drawbacks of a Petri net based approach 2,13,16 and of the so-called BDMP
(Boolean logic Driven Markov Process) approach, recently published 5,6.
[Diagram not reproduced. Labelled elements: A, B, K1, K2, K3, K4, K5.]
Fig. 1. System structure of the studied test case
Phase 1
• T1 (the duration of phase 1) is exponentially distributed with a
mean value equal to E{T1} = 1/λ1 = 100 hours.
• Switches K1, K2, K3, and K4 are normally closed.
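As a quick illustration of the first assumption above, the sketch below samples the phase 1 duration from an exponential distribution with a mean of 100 hours (rate λ1 = 0.01 per hour), using only Python's standard library. The function names and the fixed seed are illustrative choices and are not part of the original study.

```python
import math
import random

MEAN_T1_HOURS = 100.0            # E{T1} = 1/lambda1, value taken from the text
LAMBDA_1 = 1.0 / MEAN_T1_HOURS   # rate of the end-of-phase-1 event, per hour


def sample_phase1_duration(rng: random.Random) -> float:
    """Draw one phase 1 duration (in hours) from Exp(LAMBDA_1)."""
    return rng.expovariate(LAMBDA_1)


def prob_phase1_longer_than(t_hours: float) -> float:
    """P(T1 > t) = exp(-lambda1 * t) for an exponentially distributed duration."""
    return math.exp(-LAMBDA_1 * t_hours)


rng = random.Random(0)  # fixed seed, only to make the sketch repeatable
durations = [sample_phase1_duration(rng) for _ in range(100_000)]
empirical_mean = sum(durations) / len(durations)
```

The empirical mean should be close to 100 hours, and `prob_phase1_longer_than(100.0)` equals exp(-1): under this assumption a little over a third of the missions are still in phase 1 after 100 hours.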
is "mission is in progress".
To make the Petri net of Figure 2 more understandable, each place of the
right subnet corresponds to a state of a Markov graph, which is explicitly
defined hereafter:
cannot change from state FALSE to state TRUE. Besides the failures
of A and B, the inadvertent openings are represented by such leaves (with
names beginning with IO_).
[Figure 2 (Petri net) not reproduced. Surviving labels: places P1, P4, P6, P9, P14; transition rates λ, 2λ, 3λ, 5λ; γ2; (1-γ)^3 = 0.985; guards ?M.]
[Diagram not reproduced. Surviving labels: gates AND, OR; leaves C1, C2, C3.]
When their mode changes from "not required" to "required", they can
instantaneously become TRUE with a given probability. All on-demand failures
of the system are represented in this way, with names beginning with RO_
for "refuse to open" and RC_ for "refuse to close". When a mode change
occurs at the same time for several components of that kind, it is possible
to specify constraints on the order in which their reactions must be taken
into account. This is done with grey dotted links. Two of these links specify
that the outcome of the opening demands on K1 and K4 must be determined
before the attempt to close K5. The last symbol we must explain is the
phase indicator leaf, represented with a clock. The behavior of this leaf is
as follows: if no trigger points at it (as for phase_1), it is initialized
in the TRUE state and becomes FALSE after an exponentially distributed time.
If a trigger points at it (as for phase_2), it is initialized in the FALSE
state and, when the origin of the trigger changes from the TRUE to the
FALSE state, the leaf instantaneously becomes TRUE. It goes back to the
FALSE state after an exponentially distributed time. This kind of behavior
makes it easy to link an arbitrary number of phases. It is even possible to
define a cyclic chain of phases: this is consistent with the general theory
of BDMP.
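The behavior of chained phase indicator leaves described above can be mimicked with a few lines of simulation. This is a minimal sketch under stated assumptions, not the BDMP semantics as implemented in any tool: the function names are hypothetical, the 50 hour mean used for phase 2 is invented for the example (only the 100 hour mean of phase 1 appears in the text), and the cyclic case is not covered.

```python
import random
from typing import List, Optional, Tuple


def simulate_phase_chain(mean_durations: List[float],
                         rng: random.Random) -> List[Tuple[float, float]]:
    """Return the (start, end) times during which each phase leaf is TRUE.

    The first leaf has no trigger: it is TRUE from time 0 and turns FALSE
    after an exponentially distributed time. Every later leaf is triggered
    by its predecessor: it turns TRUE at the instant the predecessor turns
    FALSE, then turns FALSE after its own exponentially distributed time.
    """
    intervals = []
    start = 0.0
    for mean in mean_durations:
        end = start + rng.expovariate(1.0 / mean)
        intervals.append((start, end))
        start = end  # TRUE -> FALSE of this leaf triggers the next one
    return intervals


def active_phase(intervals: List[Tuple[float, float]],
                 t: float) -> Optional[int]:
    """Index of the leaf that is TRUE at time t, or None once the mission is over."""
    for index, (start, end) in enumerate(intervals):
        if start <= t < end:
            return index
    return None


# 100 h is the phase 1 mean from the text; 50 h for phase 2 is a made-up value.
timeline = simulate_phase_chain([100.0, 50.0], random.Random(1))
```

Because each leaf starts exactly when its predecessor ends, exactly one leaf is TRUE at any time before the end of the mission, which is what makes it possible to link an arbitrary number of phases.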
[Figure (BDMP of the test case) not reproduced. Surviving node names: system_failure, system_failure_in_phase_1, system_failure_in_phase_2, short_circuit, A_and_B_unavailable, A_or_B_isolated, failure_on_phase_change, impossible_to_isolate_A, impossible_to_isolate_B, due_to_A, due_to_B; gates AND_2, OR_4, OR_5; leaves FailureOfA, FailureOfB, IO_K2, IO_K3, IO_K5, RC_K5, RO_K1, RO_K3.]
MOCA-RP 13, and p = 0.92402 after about 4 min with YAMS (for 10^7
trials). YAMS 24 is a Monte-Carlo simulator able to process any model
written in the FIGARO modeling language and, therefore, any model built
with KB3. Mission success corresponds to the place P14 (see Figure 2)
containing one token at the end of the trial (simulated duration of each
trial: 3000 hours; this time is large enough to ensure that the end of the
whole mission is reached in each trial). We have also solved the Petri
net with the tool FIGSEQ, which is based on sequence exploration and
quantification of the Markov graph specified by the Petri net. FIGSEQ uses
an analytical quantification of sequences leading to a specified set of
states 4,8, and is able to process any Markovian model written in the FIGARO
language. FIGSEQ instantly solved the model and gave the following result
for the probability of mission success: p = 0.92394. We do not report the
sequences output by FIGSEQ in this article, because the aggregation of the
states prevents them from being legible.
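The internals of FIGSEQ are not described here, so the following is only a sketch of one standard way to quantify a sequence in a Markov graph once an absorbing state has been reached: multiplying the branching probabilities of the embedded jump chain. The three-state graph, the state names and all rates except the 1/100 per hour end of phase 1 are invented for illustration.

```python
from typing import Dict, List

# Hypothetical Markov graph: outgoing transition rates (per hour) of each
# non-absorbing state. Only the 1/100 per hour end-of-phase-1 rate comes from
# the text; the failure rate and the phase 2 rate are made up.
RATES: Dict[str, Dict[str, float]] = {
    "phase1_ok": {"phase2_ok": 1.0 / 100.0, "failed": 1.0e-3},
    "phase2_ok": {"success": 1.0 / 50.0, "failed": 1.0e-3},
}


def sequence_probability(path: List[str]) -> float:
    """Probability that the chain, started in path[0], follows `path` exactly.

    Each step contributes rate(src -> dst) divided by the sum of all rates
    leaving src, i.e. the branching probability of the embedded jump chain.
    """
    probability = 1.0
    for src, dst in zip(path, path[1:]):
        exit_rate = sum(RATES[src].values())
        probability *= RATES[src][dst] / exit_rate
    return probability


p_success = sequence_probability(["phase1_ok", "phase2_ok", "success"])
p_fail_1 = sequence_probability(["phase1_ok", "failed"])
p_fail_2 = sequence_probability(["phase1_ok", "phase2_ok", "failed"])
```

In this toy graph the three sequences are the only ways to reach an absorbing state, so their probabilities sum to one. That is the sense in which an exhaustive exploration of sequences yields an exact result.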
We solved the second model (the BDMP) with the two evaluation tools
FIGSEQ and YAMS (MOCA-RP is dedicated to Petri nets and cannot be
used to solve a BDMP). The Monte Carlo simulation gave, in 6 min for
10^7 trials of 3000 hours, a probability of success p = 0.92380. The BDMP
was solved instantly with FIGSEQ. This solution yielded p = 0.92392 as
the probability of mission success, and two sets of sequences sorted by
decreasing probability: one for the sequences leading to loss of mission and
the other for those leading to success. The result tables in the appendix
are those created directly by FIGSEQ. They display the list of transitions
for each sequence, with their rate and class (EXP for exponential
distribution, INS for instantaneous), the probability of the sequence at the
mission time of 3000 hours (Proba MT), the average duration after the
initiator (Aver. Dur. After init) and the contribution to the probability of
the whole event. The contributions of the first 12 sequences decrease from
11.9% to 5.96% of the mission failure probability; the subsequent sequences
(45 sequences) have much lower contributions. The first 12 sequences are
listed in Table 3. We could also obtain the only three success sequences,
corresponding to the non-occurrence of the top event (UE_1) and the end of
phase 2 (see Table 2).
The cross results are summed up in Table 1. Note that the Monte-Carlo
simulation results are given with a confidence interval of 1.64 × 10^-4 and
are therefore consistent with the analytical results given by FIGSEQ. The
FIGSEQ results are exact because the sequences have been exhaustively
explored and quantified in both models.
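The quoted confidence interval can be checked with the usual normal approximation for a proportion estimated by simulation. The sketch below takes the number of trials as 10^7 and assumes a 95% level (z = 1.96); the level is not stated in the text, but this assumption reproduces the quoted half-width. The function name is illustrative.

```python
import math


def ci_half_width(p_hat: float, n_trials: int, z: float = 1.96) -> float:
    """Half-width of the normal-approximation confidence interval for a proportion."""
    return z * math.sqrt(p_hat * (1.0 - p_hat) / n_trials)


N_TRIALS = 10**7
half_width = ci_half_width(0.92402, N_TRIALS)  # about 1.64e-4

# (Monte Carlo estimate, FIGSEQ value) for each model, as reported in the text.
pairs = {
    "Petri net": (0.92402, 0.92394),
    "BDMP": (0.92380, 0.92392),
}
consistent = {name: abs(mc - exact) <= half_width
              for name, (mc, exact) in pairs.items()}
```

Both differences (8 × 10^-5 and 1.2 × 10^-4) fall inside the half-width, which is the sense in which the simulation results agree with the analytical ones.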
4. Conclusions
The results obtained with quite different methods are practically the same,
which constitutes a good cross-validation. Since both models are Markovian,
any solving method valid for Markov processes could have been used to solve
the two models. Therefore, the only significant difference between the two
approaches resides in the model construction.
Whereas the BDMP construction was straightforward and produced a
self-explanatory, easy-to-validate model, the Petri net required in this
case some further work to arrive at a concise graphical representation.
The size of the Petri net could be limited thanks to a careful exploitation
of all the symmetries of the system (structure of the installation and
features of the components). However, if we had to model a system with the
same behavior, but made of components that all have different
characteristics, the Petri net size would obviously increase, while the BDMP
would remain exactly the same. The same remark would apply if we wanted to
introduce repairs.
But the most spectacular advantage of the BDMP formalism is probably the
following: thanks to its hierarchical structure, all we would have
to do in order to replace the simple components A and B by subsystems
would be to replace the leaves FailureOfA and FailureOfB of the BDMP of
References
18. Pedar A., Sarma V.V.S., Phased-Mission Analysis for Evaluating the
Effectiveness of Aerospace Computing-Systems, IEEE Transactions on
Reliability, Vol. R-30 (5), p. 429-437, Dec. 1981.
19. Somani A. K., Simplified Phased-Mission System Analysis for Systems with
Independent Component Repairs, Proceedings of ACM SIGMETRICS, 1996.
20. Vaurio J. K., Fault tree analysis of phased mission systems with
repairable and non-repairable components, Reliability Engineering and System
Safety, Vol. 74, p. 169-180, 2001.
21. Bouissou M., Humbert S., Muffat S., Villatte N., KB3 tool: Feedback on
knowledge bases, Proceedings of Lambda Mu 13 / ESREL 2002, European
Conference, Lyon (France), p. 754-759, March 2002.
22. Xing L., Reliability and sensitivity analysis of static phased mission
systems with imperfect coverage, M.S. Thesis, Electrical Eng., Univ.
Virginia, Jan. 2000.
23. Xing L., Dugan J. B., Analysis of generalized phased mission system
reliability, performance and sensitivity, IEEE Transactions on Reliability,
Vol. 51 (2), p. 199-211, June 2002.
24. Bouissou M., Chraibi H., Muffat S., Utilisation de la Simulation de
Monte-Carlo pour la résolution d'un benchmark (MINIPLANT), 14ème congrès de
fiabilité et maintenabilité, Bourges (France), October 2004.