
March 15, 2005 16:34 WSPC/Trim Size: 9in x 6in for Review Volume article˙MMR2004˙long˙V2

RELIABILITY ANALYSIS OF A DYNAMIC PHASED MISSION SYSTEM:
COMPARISON OF TWO APPROACHES

Marc BOUISSOU
Electricité de France R&D
1 avenue du Général de Gaulle - 92141 Clamart France
and CNRS UMR 8050, Université de Marne la Vallée, France
E-mail: marc.bouissou@edf.fr

Yves DUTUIT
Université Bordeaux 1/LAPS
351 cours de la Libération - 33405 Talence cedex France
E-mail: yves.dutuit@iut.u-bordeaux1.fr

Sidoine MAILLARD
Institut National Polytechnique de Lorraine (CRAN)
Faculté des sciences - BP 239, 54506 Vandoeuvre Cedex, France
E-mail: sidoine.maillard@laposte.net

Phased mission systems are frequently encountered in industry, and many
approaches have been proposed in the literature to compute their
reliability. After a short review of the existing literature, this paper
illustrates the use of two reliability analysis methods on a simple, but
not trivial, problem. The system proposed as a test case enables us to
compare the respective benefits and drawbacks of a Petri net-based
approach and of the recently published BDMP approach.

1. Introduction
With the increasing complexity and automation of the systems encountered
in many domains, such as the nuclear, aerospace, chemical and electronic
industries, safety studies are becoming more and more complex and
multifaceted. Phased mission analysis is now recognized as the appropriate
reliability analysis method for a large number of problems. Indeed,
industrial systems are mostly operated in several functioning modes,
including degraded states. A phased mission system (PMS) is a system
subject to multiple, consecutive and non-overlapping time periods of
operation (or phases). The phases can differ from one another in a wide
variety of features:

• the tasks performed within a phase may differ from phase to phase.
• performance and dependability requirements can be different from
one phase to another.
• the system may be subject to a particularly stressing environment
in a specific phase, thus experiencing increases in the failure rate
of its components.
• the structure may change over time, depending or not on the per-
formance and/or dependability requirements of the current phase.
• the successful completion of a phase may bring a different benefit
to the PMS with respect to that obtained with other phases.

Thus, considering the characteristics of the whole system over all phases
can be very difficult. Moreover, the effects of the past history of the
system (for instance a degraded configuration) need to be taken into
account to explore its future behaviour within the successive phases, and
this results in a great increase of the modelling and analysis complexity.
Phased-mission systems have been widely investigated, and many different
approaches appear in the literature. The published studies can be roughly
classified in two main groups, based on the approach used to deal with the
changes in the structure of the system. These studies consider either the
definition and solution of a global model including all phases, as
proposed in [1,9,11,18,20], or the definition of a distinct model for each
phase of the system and a separate evaluation of each of these models. A
single model that takes into account all the possible behaviours of the
system in the different phases makes it easy to consider the dependencies
among the phases. This approach makes it possible to exploit the
similarities among phases to obtain a compact model in which all the
phases are properly embedded. But building such a single model may not be
simple or suitable in cases where the following aspects prevail over the
similarities among phases:

• The operational configuration of the system is not fixed but
rather may vary from phase to phase, in accordance with the
criticality of the specific phase,


• The failure and repair history of some components within one phase
affects system behaviour in subsequent phases. Therefore, the state
of a component at the beginning of a phase depends on the
state it had at the completion time of the previous phase,
• The criteria defining the level of fulfilment of dependability and per-
formance requirements inside a phase may differ from those valid
for a subsequent phase.

One of the main disadvantages of this single-model approach is the lack
of reusability of the model. A new model needs to be built if the
behaviour of the system in any phase is changed or if the phase order is
changed. Moreover, a substantial effort may be needed to define and solve,
using automatic tools, the overall model of the system, which is often
large.
On the other hand, a separate modeling and evaluation of each phase, as
in [3,19,22,23], allows a better management of the complexity of the
analysis and the reuse of previously built phase models. Furthermore, this
approach makes it possible to focus, inside each phase, on the behaviours
that matter most from the system dependability viewpoint, and also to
reduce the size of the model of each phase. Above all, the
characterization of the differences among phases, in terms of different
failure rates and different configuration requirements, is much easier.
Very often, a small difference between two consecutive phases rules out a
single model, even if the phases are otherwise quite similar. Each phase
can be solved separately, and its results then aggregated with those of
the other phases to obtain the overall result for the PMS, which reduces
the solution time. The major weakness of the separate modelling approach
(not shared by the single-model one) is the treatment of the dependencies
among phases, which must be taken into consideration because components
are shared among phases. This approach explicitly requires mapping the
state of each component at the end of a phase to its state at the
beginning of the next phase. This mapping is conceptually simple but can
be cumbersome, and it becomes a potential source of errors for large
models. However, it must be done because, as was shown in [9], estimating
the mission reliability by the product of the reliabilities of the phases
usually gives optimistic results, with an appreciable over-prediction of
the reliability.
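The optimism of the product rule can be seen on a deliberately simplified
sketch, which is not the test case of Section 2: two identical
non-repairable components with a constant failure rate, used in parallel
during a first phase and in series during a second one, with fixed phase
durations and no switches. The function name and all numerical values
below are illustrative assumptions.

```python
import math

def naive_and_exact_reliability(lam, t1, t2):
    """Compare the product-of-phases estimate with the exact mission reliability.

    Hypothetical setting: two identical non-repairable components with
    failure rate `lam`, in parallel during phase 1 (duration t1) and in
    series during phase 2 (duration t2).
    """
    a1 = math.exp(-lam * t1)   # one component survives phase 1
    a2 = math.exp(-lam * t2)   # one component survives phase 2, given it was working at its start
    # Phase-by-phase evaluation, each phase treated as starting "as good as new":
    r_phase1 = 1.0 - (1.0 - a1) ** 2   # parallel structure
    r_phase2 = a2 ** 2                 # series structure
    naive = r_phase1 * r_phase2
    # Exact value: the series structure of phase 2 needs both components
    # alive over the whole mission, which also guarantees success of phase 1.
    exact = (a1 * a2) ** 2
    return naive, exact

naive, exact = naive_and_exact_reliability(lam=1e-2, t1=100.0, t2=50.0)
print(f"product of phase reliabilities: {naive:.4f}")
print(f"exact mission reliability:      {exact:.4f}")
```

The product ignores the fact that a component lost during the parallel
phase is no longer available for the series phase, which is exactly the
state mapping between phases discussed above.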

Given this duality, phased mission techniques are required for a proper
analysis of problems, in particular when switching procedures are carried
out or equipment is reassembled into new systems at predetermined times.
An important quantitative phased mission analysis problem is to calculate
exactly, or to obtain bounds for, the mission unreliability, defined as
the probability that the system fails to function successfully in at least
one phase. Dependability modeling and evaluation of such systems have
mainly been addressed with Fault Tree models and with models based on
Markov processes. The dynamic structure of phased systems makes the
analysis more complex than for single-phase systems. Models such as Fault
Trees and Reliability Block Diagrams have been widely used to analyse the
dependability of phased mission systems [14,18,19]. More recently, a new
family of approaches based on Fault Trees has been proposed [23]. It
exploits the reduction in computational complexity made possible by
techniques based on Binary Decision Diagrams. Reference [20] applies Fault
Tree methodology to the dependability analysis of PMS with non-repairable
and repairable components. State-based modelling approaches based on
Markov chains and Petri nets have also been applied because of their
ability to represent complex dependencies among system components
[1,2,11]. Combinatorial models provide simpler formalisms that allow a
very intuitive mapping between modelling elements and system failures.
Moreover, it is quite straightforward to exploit the results of classical
qualitative analyses, such as Failure Modes and Effects Analysis (FMEA),
to build quantitative Fault Tree or Reliability Block Diagram models. On
the other hand, such models show severe limitations with respect to the
representation of dependencies among system components, imperfect coverage
of fault containment mechanisms, and repair actions for failed units and
sub-systems. State-space models offer more flexible representation
capabilities. However, such generality comes at a price: a higher
complexity of both the modelling formalism itself and the modelling
process. These considerations on the trade-off between expressiveness and
complexity generally apply to any modelling formalism. In the specific
case of PMS, dependability modellers must handle additional complexity
because of the phased behaviour of the systems to be analysed.
In the literature, there are examples of separate and single modelling
studies for both combinatorial and state-space based approaches. Recently,
some hierarchical approaches [2] have tried to retain the best aspects of
each of the two choices while alleviating their limitations. They allow
for the definition of a high-level single model of the PMS, whose only
purpose is to define the sequence of phases, and a second, lower-level
modelling layer, which focuses on the intra-phase behaviour of the PMS.
Combinatorial and state-space modelling formalisms still represent the two
dominant approaches to PMS dependability analysis. Each approach has its
own advantages and weaknesses, and the choice of the best one largely
depends on the specific characteristics of the system at hand and on the
goals of the analysis.
The remainder of this article illustrates the use of two reliability
analysis methods applied to a simple, but not trivial, two-phase problem.
The system proposed as a test case enables us to compare the respective
benefits and drawbacks of a Petri net based approach [2,13,16] and of the
recently published BDMP (Boolean logic Driven Markov Process) approach
[5,6].

2. Test case definition


The system to be studied is a hypothetical example of a phased mission
system, shown in Figure 1. It consists of two main non-repairable
components A and B, and five switches that are used for protection or
reconfiguration functions in different configurations over two consecutive
phases, as described hereafter:

Fig. 1. System structure of the studied test case (components A and B,
switches K1 to K5)

Phase 1
• T1 (the duration of phase 1) is exponentially distributed with a
mean value equal to E{T1} = 1/λ1 = 100 hours.
• Switches K1, K2, K3, and K4 are normally closed.
• Switch K5 is normally open.
• Components A and B work in parallel. Their (constant) failure rate
is λA = λB = λ = 10⁻⁴ h⁻¹. A failure of A or B is considered as a
short-circuit between the input and output of the component.
• Possible reconfigurations:
– In case of failure of one component, some switches (K2 and
K4 on a failure of A, K1 and K3 on a failure of B) must be
opened in order to avoid a short circuit of the system. Each
switch has a probability of failure on demand equal to
γ = 5·10⁻³.
– Inadvertent opening of switches can also occur, with a failure
rate λS = λ = 10⁻⁴ h⁻¹.
Phase 2
• T2 (the duration of phase 2) is exponentially distributed with a
mean value equal to E{T2} = 1/λ2 = 50 hours.
• At the beginning of phase 2, the positions of some switches are
changed to enable the two active components to work in series.
More precisely: in the nominal procedure, K1 and K4 are opened,
then K5 is closed (operations must be done in this order to avoid
creating a short-circuit). Deviations from this procedure may
occur because of an unwanted opening of K1 or K4 during phase 1.
If component A or B has failed during phase 1, the system cannot
be used in the second phase.
Note that the reconfiguration of the system (change of structure from
parallel to series) is scheduled by an independent process.
Comments on the test case:
• This test case may seem simple at first sight, because it involves
only 7 components. However, the number of elementary failure
modes to be taken into account is 12, including 5 failures on
demand, which can happen at various times, depending on the system
evolution.
• What makes this example really tricky is the omnipresent
dependencies between components, and between the system behaviours
in the two phases. For example, in Phase 1, if A fails, K2 and K4
are supposed to open. If at least one of the switches opens, the
system can still work until the end of Phase 1, but if both refuse
to open, the system is lost at once. If K4 opens inadvertently
during Phase 1, it cannot refuse to open during the reconfiguration
at the phase change. A real system would involve many more
components, but its reconfiguration would probably be simpler than
the one we are studying in the present test case.
• Another criticism that could be made of this test case is that
all distributions are exponential. This assumption is very common
for the times to failure of components, but it may seem strange
for the phase durations. We have deliberately chosen this
assumption in order to be able to use the FIGSEQ tool (which works
only on Markov models) and to show the advantages brought by the
original method implemented in this tool. The other solving method
we used, Monte-Carlo simulation, would have worked equally well
with any other distributions.
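The count of 12 elementary failure modes given in the first comment can
be checked by simple enumeration. The sketch below is one reading that is
consistent with that count; it borrows the leaf names that appear in the
BDMP of Figure 4, and the split into two lists is our own bookkeeping.

```python
# Failure modes that occur in operation: failures of A and B, and the
# inadvertent opening (IO) of each of the five switches.
in_operation = ["FailureOfA", "FailureOfB"] + [f"IO_K{i}" for i in range(1, 6)]

# Failure modes on demand: K1..K4 may refuse to open (RO), and K5 may
# refuse to close (RC) at the phase change.
on_demand = [f"RO_K{i}" for i in range(1, 5)] + ["RC_K5"]

failure_modes = in_operation + on_demand
print(len(failure_modes), "failure modes, of which", len(on_demand), "on demand")
```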

3. Test case resolution


3.1. Resolution with a Petri net
To model concisely the behaviour of the system during its phased mission
with the Petri net (PN) formalism, a system-based approach has been chosen
instead of the usual component-based approach. The symmetry of the system
has been exploited (symmetrical structure and identical characteristics of
the switches and components) and an aggregation procedure has been carried
out to obtain the PN shown in Figure 2. It is made up of two main subnets.
The first one (on the right side) models all the possible aggregated
states in which the system can be during its first mission phase (places 1
to 10 represent these lumped states) and the transitions between them. It
was obtained by transposing a Markov chain model of the system in this
first phase. The second subnet (on the left side) consists of two parts:
one models the behaviour of the system during its second mission phase
(place P14 for success and place P15 for system failure), and the other
manages the phase sequence (places P11 and P12 for phase 1 and phase 2,
and place P13 for the end of the whole mission). An inhibitor link
inhibits the transition it is tied to whenever its origin place contains
tokens. The rate of each transition is explicitly written on the figure.
Some transitions are marked with special signs: ?M indicates that the
transition can be fired only if the Boolean variable M is TRUE. This
variable is initially set to TRUE, and it changes to FALSE when the
transition between P12 and P13 is fired (i.e. at the end of Phase 2).
Therefore, the meaning of the variable M is "mission is in progress".
To make the Petri net of Figure 2 more understandable, we note that each
place of the right subnet corresponds to a state of a Markov graph, which
is explicitly defined hereafter:

• Place 10 corresponds to the initial state of the system, i.e. A and
B are working and all Ki except K5 are closed.
• Place 1: one of the components A and B has just failed. This state
is instantaneous, because some switches are required to open.
• Place 4: A (resp. B) has failed and at least one of its associated
switches K2 and K4 (resp. K1 and K3) has opened on demand
(probability 1 − γ²).
• Place 2: only one of the switches K1 and K4 has inadvertently
opened. This place corresponds to a degraded functioning state of
the system. Its whole mission can still be performed.
• Place 3: only one of the switches K2 and K3 has inadvertently
opened. This failure does not prevent the success of phase 1, but it
will induce the failure of both phase 2 and the whole mission.
• Place 5: A (resp. B) is still working and K1 and K3 (resp. K2
and K4) have opened inadvertently. From the initial state, the state
corresponding to place 5 can be reached via place 3 or via place 2.
The consequences with regard to the mission are the same as for
place 3.
• Place 6 is reached from place 2 because of a failure of either A or
B. The reaction of the corresponding switch K2 (resp. K3) does
not play any role for the remainder of the mission.
• Place 8 is reached from place 3 because of a failure of either A or
B. The reaction of the corresponding switch K4 (resp. K1) has no
impact on the remainder of the mission.
• Place 7 is reached from place 5 because of a failure of either A or
B. The states of all other components remain unchanged in this
transition.
• Place 9 corresponds to the system failure during Phase 1. It is an
absorbing state. It can be reached from each of the places 2, 3, 4, 5,
6, 7 and 8 because of the failure of one of the 3 components of the
remaining operating path (A-K2-K4 or K1-K3-B). This is why the
corresponding failure rate equals 3λ. Place 9 can also be reached
from the initial state (place 10), after only one timed transition,
via one of the two following sequences: (A fails and both K2 and
K4 refuse to open), or (B fails and both K1 and K3 refuse to open).
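The probabilities attached to the instantaneous branches of the net follow
from the probability of failure on demand γ. The sketch below is a short
numerical check, assuming that the demands on different switches fail
independently (an assumption the text does not state explicitly); the
variable names are ours. The rounded values 0.985 and 0.99 are those
written on Figure 2.

```python
gamma = 5e-3  # probability of failure on demand of one switch (Section 2)

# Isolation of a failed component: two switches are asked to open and at
# least one must succeed (place 4); if both refuse, the system is lost (place 9).
p_isolated = 1.0 - gamma ** 2
p_not_isolated = gamma ** 2

# Phase change from the nominal state: K1 must open, K4 must open, K5 must close.
p_reconf_nominal = (1.0 - gamma) ** 3
# Phase change when K1 or K4 has already opened inadvertently: two demands remain.
p_reconf_degraded = (1.0 - gamma) ** 2

print(f"isolation succeeds:            {p_isolated:.6f}")
print(f"isolation fails:               {p_not_isolated:.2e}")
print(f"reconfiguration, nominal:      {p_reconf_nominal:.6f}")
print(f"reconfiguration, one K open:   {p_reconf_degraded:.6f}")
```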


A qualitative analysis of the PN enables us to identify the sequences of
component failures which result in the failure of each phase. This
step-by-step procedure can be done manually with the interactive simulator
of KB3 or by using the FIGSEQ tool [7,15,17,21].

3.2. Resolution with a BDMP


The second resolution was done with the BDMP formalism. The bulk of
the BDMP depicted in Figure 4 (directly copied and pasted from the KB3
tool [15,17,21]) is self-explanatory. The advantage of this formalism is
that it looks like a fault tree. It has the same ability to progressively
break down a global event into more elementary events, in a top-down
approach. For lack of space, we cannot give the full formal definition of
BDMP: it is available in references [5] and [6].
Instead, we are going to show on a very simple example how a BDMP
with 3 leaves can specify a Markov model with 16 states representing a
system with a standby redundancy. Let us suppose we wish to model a
system with the structure given in Figure 3 (left). The second line of the
system is a standby redundancy. Therefore, when C1 works, C2 and C3
can only be in a standby or failed state, whereas when C1 is failed, C2 and
C3 can only be in a working or failed state (this explains the 16 states of
the Markov model). This behaviour is precisely specified by the BDMP of
Figure 3 (right).
To make Figure 4 more readable, let us now introduce the meaning of the
unusual symbols of this model in a few simple words. First of all, some
symbols simply represent a split link: the names of the origin and target
of the link are written below the symbols. Split links are used only to
avoid unsightly crossings of links in the drawing. Secondly, red dotted
arrows represent the "triggers" of the BDMP: their role is to transform
what seems to be a standard fault tree into a fully dynamic model. As long
as the event at the origin of a trigger is FALSE, the trigger maintains
all the elements in the sub-tree under its target in a "not required"
mode. In this mode, the leaves representing failures in operation cannot
change from state FALSE to state TRUE. Besides the failures of A and B,
the inadvertent openings are represented by such leaves (with names
beginning with IO_). The leaves representing on-demand failures react to a
mode change.


Fig. 2. Petri net modeling the system in the two phases (places P1 to
P15; timed transition rates λ, 2λ, 3λ, 5λ, λ1, λ2; instantaneous branch
probabilities 1 − γ², γ², (1 − γ)³ = 0.985, 0.015, (1 − γ)² = 0.99, 0.01;
guards ?M and assignment !M)

Fig. 3. A simple system with a standby redundancy and a BDMP modeling it
(gates AND and OR, leaves C1, C2 and C3)

When their mode changes from "not required" to "required", they can
instantaneously become TRUE with a given probability. All on-demand
failures of the system are represented in this way, with names beginning
with RO_ for "refuse to open" and RC_ for "refuse to close". When a mode
change occurs at the same time for several components of that kind, it is
possible to specify constraints on the order in which their reactions must
be taken into account. This is done with grey dotted links. Two of these
links specify that the outcome of the opening demands on K1 and K4 must be
determined before the attempt to close K5. The last symbol we must explain
is the phase indicator leaf, represented with a clock. The behaviour of
this leaf is as follows: if no trigger points at it (as for phase_1), it
is initialized in the TRUE state and becomes FALSE after an exponentially
distributed time. If a trigger points at it (as for phase_2), it is
initialized in the FALSE state and, when the origin of the trigger changes
from the TRUE to the FALSE state, the leaf instantaneously becomes TRUE.
It goes back to the FALSE state after an exponentially distributed time.
This kind of behaviour makes it easy to link an arbitrary number of
phases. It is even possible to define a cyclic chain of phases: this is
consistent with the general theory of BDMP.

3.3. Compared results


Fig. 4. A BDMP modelling the system in the two phases (top event
system_failure; intermediate events system_failure_in_phase_1 and
system_failure_in_phase_2; leaves phase_1, phase_2, FailureOfA,
FailureOfB, IO_K1 to IO_K5, RO_K1 to RO_K4 and RC_K5).

By animating the Petri net model by means of Monte-Carlo simulation, one
can obtain interesting quantitative information such as the success
probability of each phase, the mean time to the first system failure, the
mean sojourn time of the system in its different states, etc. The result
obtained for the probability of mission success was p = 0.92396 after
40 s of calculation (time needed to perform 10⁷ trials) using the software
MOCA-RP [13], and p = 0.92402 after about 4 min with YAMS (for 10⁷
trials). YAMS [24] is a Monte-Carlo simulator able to process any model
written in the FIGARO modeling language and, therefore, any model built
with KB3. Mission success corresponds to the fact that place P14 (see
Figure 2) contains one token at the end of the trial. The simulated
duration of each trial is 3000 hours, which is large enough to ensure that
the end of the whole mission is reached in each trial. We have also solved
the Petri net with the tool FIGSEQ, which is based on sequence exploration
and quantification of the Markov graph specified by the Petri net. FIGSEQ
uses an analytical quantification of the sequences leading to a specified
set of states [4,8], and is able to process any Markovian model written in
the FIGARO language. FIGSEQ instantly solved the model and gave the
following result for the probability of mission success: p = 0.92394. We
do not report the sequences output by FIGSEQ in this article, because the
aggregation of the states prevents them from being legible.
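The claim that 3000 simulated hours are enough can be checked from the
phase durations alone. The sketch below evaluates the probability that the
two exponentially distributed phases together last longer than 3000 hours,
using the closed-form survival function of the sum of two independent
exponential variables with distinct rates. The function name is ours, and
system failures, which can only end a trial earlier, are ignored.

```python
import math

lam1 = 1.0 / 100.0   # rate of the end of phase 1 (mean duration 100 h)
lam2 = 1.0 / 50.0    # rate of the end of phase 2 (mean duration 50 h)

def prob_mission_longer_than(t):
    """P(T1 + T2 > t) for independent exponential durations with distinct rates."""
    return (lam2 * math.exp(-lam1 * t) - lam1 * math.exp(-lam2 * t)) / (lam2 - lam1)

p_unfinished = prob_mission_longer_than(3000.0)
print(f"P(T1 + T2 > 3000 h) = {p_unfinished:.2e}")
print(f"expected unfinished trials out of 1e7: {1e7 * p_unfinished:.1e}")
```

With these rates the probability is many orders of magnitude below the
inverse of the number of trials, so truncating each trial at 3000 hours
should have no visible effect on the estimates.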
We solved the second model (the BDMP) with the two evaluation tools
FIGSEQ and YAMS (MOCA-RP is dedicated to Petri nets and cannot be
used to solve a BDMP). The Monte-Carlo simulation gave, in 6 min for
10⁷ trials of 3000 hours, a probability of success p = 0.92380. The BDMP
was solved instantly with FIGSEQ. This solution yielded p = 0.92392 as
the probability of mission success, and two sets of sequences sorted by
decreasing probability: one for the sequences leading to loss of mission
and the other for those leading to success. The results tables in the
appendix are directly those created by FIGSEQ. They display the list of
transitions of each sequence, with their own rate and class (EXP for
exponential distribution, INS for instantaneous), the probability of the
sequence at mission time 3000 hours (Proba MT), the average duration after
initiator (Aver. Dur. After init.) and the contribution to the probability
of the whole event. The contributions of the first 12 sequences decrease
from 11.9% to 5.96% of the mission failure probability; subsequent
sequences have much lower contributions (45 sequences). The first 12
sequences are listed in Table 3. We could also obtain the only three
success sequences, corresponding to the non-occurrence of the top event
(UE_1) and the end of phase 2 (see Table 2).
The cross results are summed up in Table 1. Note that the Monte-Carlo
simulation results are given with a confidence interval of 1.64·10⁻⁴ and
are therefore consistent with the analytical results given by FIGSEQ. The
FIGSEQ results are exact because the sequences have been exhaustively
explored and quantified in both models.
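The quoted interval can be recovered from the number of trials. The sketch
below assumes that 1.64·10⁻⁴ is the half-width of a 95% confidence interval
under the usual normal approximation for a proportion (the confidence
level is not stated in the text, so this is an assumption), and checks
that each Monte-Carlo estimate lies within that distance of the FIGSEQ
value for the same model. The function and dictionary names are ours.

```python
import math

def half_width_95(p, n_trials):
    """Half-width of the 95% normal-approximation confidence interval of a proportion."""
    return 1.96 * math.sqrt(p * (1.0 - p) / n_trials)

n = 10**7
figseq = {"Petri net": 0.92394, "BDMP": 0.92392}        # analytical results
monte_carlo = {("Petri net", "MOCA-RP"): 0.92396,
               ("Petri net", "YAMS"): 0.92402,
               ("BDMP", "YAMS"): 0.92380}

for (model, tool), p in monte_carlo.items():
    hw = half_width_95(p, n)
    gap = abs(p - figseq[model])
    print(f"{model:9s} {tool:8s} half-width={hw:.2e} gap={gap:.1e} consistent={gap <= hw}")
```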

The difference of 2·10⁻⁵ between the results obtained with FIGSEQ from
the Petri net and from the BDMP is due to subtle differences between the
behaviours depicted by the two models. Here is the main one: the BDMP
allows switches K1 and K4 to open inadvertently in phase 1 and then to
refuse to open at the phase change, whereas these failure modes are in
fact mutually exclusive. The BDMP formalism has now been extended with the
notion of "inverted trigger" in order to cope with this kind of modeling
problem, but explaining this extension would take too long here; it will
be done in a forthcoming publication.

Table 1. Cross results of the different models.

MODEL       Processing tool   CPU time                   Pr(success)
Petri net   MOCA-RP (MC)      40 s, 10⁷ trials           0.92396
            YAMS (MC)         3 min 54 s, 10⁷ trials     0.92402
            FIGSEQ            < 1 s                      0.92394
BDMP        FIGSEQ            < 1 s                      0.92392
            YAMS (MC)         6 min, 10⁷ trials          0.92380

4. Conclusions
The results obtained with quite different methods are practically the same,
which constitutes a good cross validation. Since both models are Markovian,
any solving method valid for Markov processes could have been used to solve
the two models. Therefore, the only significant difference between the two
approaches resides in the model construction.
Whereas the BDMP construction was straightforward and produced a
self-explanatory, easy-to-validate model, the Petri net required some
further work in this case to obtain a concise graphical representation.
The size of the Petri net could be limited thanks to a careful
exploitation of all the symmetries of the system (structure of the
installation and features of the components). However, if we had to model
a system with the same behaviour, but made of components that all have
different characteristics, the Petri net size would obviously increase,
while the BDMP would remain exactly the same. The same remark would apply
if we wanted to introduce repairs.
But the most spectacular advantage of the BDMP formalism is probably the
following: thanks to its hierarchical structure, all we would have to do
in order to replace the simple components A and B by subsystems would be
to replace the leaves FailureOfA and FailureOfB of the BDMP of Figure 4 by
sub-BDMPs. For example, if A and B were subsystems with the
characteristics of the system of Figure 3 (left), replacing the leaves
FailureOfA and FailureOfB by two BDMPs like the one of Figure 3 (right)
would solve the problem. By contrast, we would of course have to rebuild
the Petri net from scratch, which would be hard work.
Another interesting result of this study is that it illustrates the value
of the sequence exploration and quantification method used by FIGSEQ,
which allows a quick and precise quantification of a large Markov model
and gives interesting qualitative results: the most probable sequences
leading to mission failure.

Table 2. Sequences leading to mission success.

Transition name                Proba MT     Aver. Dur.    Contribution
                                            After init.

[end OF phase_1]
[start OF phase_2]
[OK OF RO_K1, OK OF RO_K4]
[OK OF RC_K5]
[end OF phase_2]               9.0665e-01   4.8780e+01    9.8131e-01

[Fail OF IO_K4]
[end OF phase_1]
[start OF phase_2]
[OK OF RO_K1, OK OF RO_K4]
[OK OF RC_K5]
[end OF phase_2]               8.6348e-03   1.4402e+02    9.3458e-03

[Fail OF IO_K1]
[end OF phase_1]
[start OF phase_2]
[OK OF RO_K1, OK OF RO_K4]
[OK OF RC_K5]
[end OF phase_2]               8.6348e-03   1.4402e+02    9.3458e-03
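The probabilities of the three success sequences can be recomputed by hand
as products of competing-exponential ratios and on-demand success
probabilities. The sketch below is such a recomputation, not the FIGSEQ
algorithm. The numbers of competing events (6 in the nominal phase 1 state,
5 after K1 or K4 has opened, 5 in phase 2) are assumptions inferred from
the tabulated values and from the model description; all names are ours.

```python
lam = 1e-4        # failure and inadvertent-opening rate (per hour)
lam1 = 1 / 100    # rate of the end of phase 1
lam2 = 1 / 50     # rate of the end of phase 2
gamma = 5e-3      # probability of failure on demand

# Assumed numbers of lambda-rate events competing with the end of a phase:
N_PHASE1_NOMINAL = 6    # A, B, inadvertent opening of K1..K4
N_PHASE1_DEGRADED = 5   # after K1 or K4 has opened inadvertently
N_PHASE2 = 5            # A, B, inadvertent opening of K2, K3, K5

def ends_before_failure(phase_rate, n_events):
    """Probability that the phase ends before any of n competing failures."""
    return phase_rate / (phase_rate + n_events * lam)

reconf_ok = (1 - gamma) ** 3   # RO_K1, RO_K4 and RC_K5 all succeed (BDMP model)
phase2_ok = ends_before_failure(lam2, N_PHASE2)

# Sequence 1: nothing happens in phase 1, reconfiguration and phase 2 succeed.
seq_nominal = ends_before_failure(lam1, N_PHASE1_NOMINAL) * reconf_ok * phase2_ok

# Sequences 2 and 3: IO_K4 (or IO_K1) is the first event of phase 1.
first_is_io = lam / (lam1 + N_PHASE1_NOMINAL * lam)
seq_degraded = (first_is_io * ends_before_failure(lam1, N_PHASE1_DEGRADED)
                * reconf_ok * phase2_ok)

p_success = seq_nominal + 2 * seq_degraded
print(f"{seq_nominal:.4e} {seq_degraded:.4e} {p_success:.5f}")
```

Under these assumptions the three terms should agree with the Proba MT
column of Table 2, and their sum with the value p = 0.92392 obtained with
FIGSEQ on the BDMP.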

Table 3. Sequences leading to loss of mission.

Transition Name               Proba        MT Aver. Dur.   Contribution
After init.
------------------------------------------------------------------------
[Fail OF IO_K3]
[end OF phase_1]
[start OF phase_2]            9.0711e-03   9.6154e+01      1.1927e-01
------------------------------------------------------------------------
[Fail OF IO_K2]
[end OF phase_1]
[start OF phase_2]            9.0711e-03   9.6154e+01      1.1927e-01
------------------------------------------------------------------------
[Fail OF FailureOfA]
[OK OF RO_K2, OK OF RO_K4]
[end OF phase_1]
[start OF phase_2]            9.0678e-03   9.7087e+01      1.1923e-01
------------------------------------------------------------------------
[Fail OF FailureOfB]
[OK OF RO_K1, OK OF RO_K3]
[end OF phase_1]
[start OF phase_2]            9.0678e-03   9.7087e+01      1.1923e-01
------------------------------------------------------------------------
[end OF phase_1]
[start OF phase_2]
[Fail OF RO_K1, OK OF RO_K4]  4.6934e-03   0.0000e+00      6.1712e-02
------------------------------------------------------------------------
[end OF phase_1]
[start OF phase_2]
[OK OF RO_K1, Fail OF RO_K4]  4.6934e-03   0.0000e+00      6.1712e-02
------------------------------------------------------------------------
[end OF phase_1]
[start OF phase_2]
[OK OF RO_K1, OK OF RO_K4]
[Fail OF RC_K5]               4.6699e-03   0.0000e+00      6.1403e-02
------------------------------------------------------------------------
[end OF phase_1]
[start OF phase_2]
[OK OF RO_K1, OK OF RO_K4]
[OK OF RC_K5]
[Fail OF FailureOfB]          4.5332e-03   4.8780e+01      5.9606e-02
------------------------------------------------------------------------
[end OF phase_1]
[start OF phase_2]
[OK OF RO_K1, OK OF RO_K4]
[OK OF RC_K5]
[Fail OF IO_K2]               4.5332e-03   4.8780e+01      5.9606e-02
------------------------------------------------------------------------
[end OF phase_1]
[start OF phase_2]
[OK OF RO_K1, OK OF RO_K4]
[OK OF RC_K5]
[Fail OF IO_K5]               4.5332e-03   4.8780e+01      5.9606e-02
------------------------------------------------------------------------
[end OF phase_1]
[start OF phase_2]
[OK OF RO_K1, OK OF RO_K4]
[OK OF RC_K5]
[Fail OF FailureOfA]          4.5332e-03   4.8780e+01      5.9606e-02
------------------------------------------------------------------------
[end OF phase_1]
[start OF phase_2]
[OK OF RO_K1, OK OF RO_K4]
[OK OF RC_K5]
[Fail OF IO_K3]               4.5332e-03   4.8780e+01      5.9606e-02
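
The figures in the two sequence tables can be cross-checked with a few lines
of arithmetic. The sketch below assumes that the Contribution column is the
sequence probability divided by the total probability of the corresponding
outcome (mission success or loss of mission). This reading is inferred from
the numbers and is not stated in the tables. Under that assumption every row
gives an estimate of the total, and the two totals should sum to about 1.
The function and variable names are illustrative only.

```python
# (probability, contribution) pairs copied from the two sequence tables.
success_rows = [
    (9.0665e-01, 9.8131e-01),
    (8.6348e-03, 9.3458e-03),
    (8.6348e-03, 9.3458e-03),
]
failure_rows = [
    (9.0711e-03, 1.1927e-01), (9.0711e-03, 1.1927e-01),
    (9.0678e-03, 1.1923e-01), (9.0678e-03, 1.1923e-01),
    (4.6934e-03, 6.1712e-02), (4.6934e-03, 6.1712e-02),
    (4.6699e-03, 6.1403e-02),
] + [(4.5332e-03, 5.9606e-02)] * 5


def summarize(rows):
    """Return (implied total, spread of the estimates, summed contribution).

    ASSUMPTION: contribution = probability / total, so total = p / c.
    """
    totals = [p / c for p, c in rows]
    implied_total = sum(totals) / len(totals)   # average of per-row estimates
    spread = max(totals) - min(totals)          # small if the assumption holds
    coverage = sum(c for _, c in rows)          # share of the outcome explained
    return implied_total, spread, coverage


p_success, spread_s, coverage_s = summarize(success_rows)
p_failure, spread_f, coverage_f = summarize(failure_rows)

print(f"implied P(success)         = {p_success:.4f} (spread {spread_s:.1e})")
print(f"implied P(loss of mission) = {p_failure:.4f} (spread {spread_f:.1e})")
print(f"sum of both outcomes       = {p_success + p_failure:.4f}")
print(f"coverage of listed failure sequences = {coverage_f:.1%}")
```

With the tabulated values, the implied probabilities come out near 0.924 for
mission success and 0.076 for loss of mission, and the twelve listed failure
sequences account for roughly 96% of the loss probability.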

References

1. Alam M., Al-Saggaf U. M., Quantitative Reliability Evaluation of Repairable
Phased-Mission Systems Using Markov Approach, IEEE Transactions on
Reliability, Vol. R-35 (5), p. 498-503, Dec. 1986.
2. Bondavalli A., Mura I., Markov Regenerative Stochastic Petri Nets to Model
and Evaluate Phased Mission Systems Dependability, IEEE Transactions on
Computers, Vol. 50, Issue 12, December 2001.
3. Bondavalli A., Mura I., Nelli M., Analytical Modeling and Evaluation of
Phased-Mission Systems for Space Applications, proceedings of IEEE High
Assurance System Engineering Workshop (HASE), 1997.
4. Bon J.L., Bouissou M., Fiabilité des grands systèmes séquentiels: résultats
théoriques et applications dans le cadre du logiciel GSI, Revue de Statistique
appliquée XXXX (2), p.45-54, 1992.
5. Bon J.L., Bouissou M., A new formalism that combines advantages of fault
trees and Markov models: Boolean logic Driven Markov Processes, Reliability
Engineering and System Safety, Vol. 82, Issue 2, p. 149-163, November 2003.
6. Bouissou M., Boolean logic Driven Markov Processes: a powerful new for-
malism for specifying and solving very large Markov models, PSAM6, Puerto
Rico, June 2002.
7. Bouissou M., Bouhadana H., Bannelier M., Villatte N., Knowledge modeling
and reliability processing: presentation of the FIGARO language and associ-
ated tools, proceedings of SAFECOMP’91, Trondheim (Norway), November
1991.
8. Bouissou M., Lefebvre Y., A path-based algorithm to evaluate asymptotic un-
availability for large Markov models, annual Reliability and Maintainability
Symposium Proceedings, Seattle 2002.
9. Burdick G.R., Fussell J. B., Rasmusson D. M., Wilson J. R., Phased Mis-
sion Analysis, a Review of New Developments and an Application, IEEE
Transactions on Reliability, Vol. 26 (1), p. 43-49, 1977.
10. Chiaradonna S., Di Giandomenico F., Mura I., Bondavalli A., Dependability
Modeling and Evaluation of Multiple-Phased Systems using DEEM, accepted
for publication in IEEE Transactions on Reliability 2004.
11. Dugan J.B., Automated Analysis of Phased Mission Reliability, IEEE Trans-
actions on Reliability, Vol. 40, p.45-52, 1991.
12. Dugan J.B., Tang Z., Minimal cut set/Sequence generation for dynamic fault
trees, annual Reliability and Maintainability Symposium 2004 Proceedings,
Los Angeles, USA, January 2004.
13. Dutuit Y., Chatelet E., Thomas P., Signoret J.P., Dependability Modelling
and Evaluation by Using Stochastic Petri Nets : Application to Two Test-
Cases, Reliability Engineering and System Safety, 55: 2, p.117 - 124, 1997.
14. Esary J. D., Ziehms H., Reliability analysis of phased missions, in Reliability
and Fault Tree Analysis, p. 213-236, SIAM Philadelphia, 1975.
15. Gallois M., Pillière M., Benefits expected from automatic studies with KB3
in PSAs at EDF, proceedings of the PSA99 conference, Washington, August
1999.
16. Ionescu D.C., Zio E., Contantinescu A., Availability Analysis of a Safety
System of a Nuclear Reactor, proceedings of KONBIN’03 Conference, Vol.2,
p. 225 - 233, 2003.
17. The KB3 and FIGSEQ tools: detailed information, software download at
http://rdsoft.edf.fr
18. Pedar A., Sarma V.V.S., Phased-Mission Analysis for Evaluating the Effec-
tiveness of Aerospace Computing-Systems, IEEE Transactions on Reliability,
Vol. R-30 (5), p. 429-437, Dec. 1981.
19. Somani A.K., Simplified Phased-Mission System Analysis for Systems with
Independent Component Repairs, proceedings of ACM SIGMETRICS, 1996.
20. Vaurio J.K., Fault tree analysis of phased mission systems with repairable
and non-repairable components, Reliability Engineering and System Safety
Vol. 74, p.169 - 180, 2001.
21. Bouissou M., Humbert S., Muffat S., Villatte N., KB3 tool : Feedback on
knowledge bases, proceedings of Lambda Mu 13 / ESREL 2002, European
Conference, Lyon (France), p.754-759, March 2002.
22. Xing L., Reliability and sensitivity analysis of static phased mission systems
with imperfect coverage, M.S. Thesis, Electrical Eng., Univ. Virginia, Jan
2000.
23. Xing L., Dugan J. B., Analysis of generalized phased mission system reli-
ability, performance and sensitivity, IEEE Transactions on Reliability, Vol.
51 (2), p.199-211, June 2002.
24. Bouissou M., Chraibi H., Muffat S., Utilisation de la Simulation de Monte-
Carlo pour la résolution d’un benchmark (MINIPLANT), 14ème congrès de
fiabilité et maintenabilité, Bourges, (France), October 2004.
