Professional Documents
Culture Documents
Formal Analysis of Fault Tree Using Probabilistic Model Checking: A Solar Array Case Study
Formal Analysis of Fault Tree Using Probabilistic Model Checking: A Solar Array Case Study
Abstract—Fault Tree Analysis (FTA) is a widespread technique concurrent systems. This technique has several advantages
used to assess the reliability of safety-critical systems. The over simulation. Notably, probabilistic model checking is an
traditional way of conducting FTA is either through paper and exhaustive, accurate, efficient and completely automated veri-
pencil proof or through computer simulation techniques, which
are inefficient and prone to inaccuracy. In this paper, we propose fication technique [5], providing a comprehensive and reliable
the use of probabilistic model checking to automatically analyze solution for fault tree analysis. In this work, we propose a
fault trees of safety-critical systems. Our methodology consists in PMC-based methodology for FTA modeling, using PRISM
the probabilistic formalization of the gates used in a fault tree to language [6]. Added to the inherent advantages of PMC listed
a Discrete-Time Markov Chain (DTMC) and a Markov Decision above, PRISM’s modularity allows for easily expandable,
Process (MDP), and the subsequent probabilistic verification
using PRISM tool to quantitatively analyze the system. To state-efficient models. The modeling methodology is applied to
illustrate the proposed approach we perform the fault tree a case study of a solar array mechanical system [7], first using
analysis of a solar array system, used as power source for the Discrete-Time Markov Chain (DTMC) [8] to model known
DFH-3 satellite. The results show that harsh thermal environment environment scenarios where the probabilistic distribution of
is the main cause of system failures. the system’s behavior is known, then using Markov Decision
Process (MDP) to model the non-deterministic behavior of the
I. I NTRODUCTION
system when subjected to unknown environments.
Fault Tree Diagram (FTD) is a top-down graphical model Our goal is to provide a way for developers to evaluate the
of a system, which represents all paths and events that may weaknesses of their systems. Through PMC properties, our
lead to failure within that system [1]. Events in FTDs are approach can be used to ascertain not only correctness, but
nodes, connected through logic gates in such a way that an also quantitative measures such as performance and reliability,
error in one of the bottom nodes can propagate to the higher without the time and intensive processing required by simula-
level nodes and reach the top-level event, compromising the tion techniques. This work focuses on evaluating the dynamics
functionality of the entire system. Fault Tree Analysis (FTA) of fault propagation in FTDs, and we observe that the lowest
is the study of such diagrams in order to discover and assess levels in the tree are the most positively affected by fault
the effect of undesirable events or faults [1]. FTA allows safety masking, events connected to multiple gates are especially
and reliability engineers to better understand how the system unreliable in non-deterministic environments (MDP), and that
can fail, identifying the best possible ways to make it safer, events connected to AND gates are more affected by non-
as well as the system’s event rates. FTA is commonly used determinism than events connected to OR gates.
in the aerospace industry for both hardware and software [1] The following section presents some important background
as means for investigating a system’s modes, potential faults information about PMC, the PRISM tool and FTA. Section
occurrences with their causes, and to quantify their contribu- III considers related works. In Section IV we describe our
tion to system unreliability in the course of product design. modeling approach. In Section V we demonstrate the applica-
Traditionally, FTA is based on simulation techniques [2], [1], tion of our proposed methodology, performing the analysis on
[3], with the main techniques being: Monte-Carlo simulation, a solar array FTD and Section VI concludes the paper with
Quasi-Monte-Carlo method, time-sequential simulation, and some future research directions.
discrete event simulation [4]. Assessing the causes and the
probability of a punctual failure occurrence in the system using II. P RELIMINARIES
simulation-based techniques is very costly, since each failure
condition must be evaluated separately, one at a time, creating A. Probabilistic Model Checking with PRISM
a very large state-space and requiring tremendous effort to Probabilistic model checking is a formal verification tech-
analyze the whole scope of the system. nique derived from regular model checking and applied on
An alternative to avoid the aforementioned problem is systems that present a random or probabilistic behavior, like
the use of Probabilistic Model Checking(PMC). PMC is a real life applications, where the resulting models usually
formal verification method that designates a collection of contain a very large number of states. This technique can deal
techniques for the automatic analysis of reactive, finite state with a wide range of quantitative measures; the results show
an exact figure of the property being verified, usually in parts- context of FTA, the general event is known as top event, from
per-hundred; it can be fully automated, provides an exhaustive which the fault tree branches out vertically. The top event is
analysis of the model, and it is very efficient. PMC works defined as the failing point of a system in operating conditions,
with several model types and temporal logic specification whether those conditions are considered normal or abnormal.
languages. In this paper we use DTMC and MDP as the model A single fault tree can be used to analyse one single top event,
types and Probabilistic Computation-Tree Logic (PCTL) as which can then be fed into another fault tree as a bottom event.
the property specification language. In [9] DTMC is defined Bottom events are the ones at the very bottom of the fault tree,
as a tuple D=(S, s,P,L) where S is a countable set of states, independent from any other events, and, assuming the FTA
s ∈ S is an initial state, PP: S × S → [0, 1] is a transition is performed following a quantitative evaluation, these events
probability matrix such that s0 ∈ s P (s, s0 ) = 1 for all s ∈ S, receive fault probabilities that will dynamically spread through
and L : S → 2AP is a labeling function mapping each state the rest of the tree during the analysis. The elements of a
to a set of atomic propositions taken from a set AP. An MDP Fault Tree and their graphical representations are summarized
is defined in [9] as a tuple M = (S, s, αM , δM , L) where S is in Table I.
a finite set of states, s ∈ S is an initial state, αM is a finite TABLE I: Elements of a Fault Tree
alphabet, δM : S × αM → Dist (S) is a (partial) probabilistic Token Element Description
transition function and L : S → 2AP is a labelling function Top or Intermediary Event System or component failure
mapping each state to a set of atomic propositions taken from
Bottom Event A basic initiating fault event
a set AP. As such, the MDP is a stochastic system where all
Specific condition or restriction
the decisions are made in a non-deterministic manner. Conditioning Event
that can apply to any gate
To complete the model checking process, we specify prop- External Event
Event that is normally expected to
occur
erties using a probabilistic extension of CTL temporal logic Event that’s not further developed
called PCTL. PCTL can be used with both DTMC and MDP Undeveloped Event due to lack of importance or
knowledge
models, working at discrete time domain. The main difference AND Gate
The output is true if all inputs are
true
is that using PCTL over MDP model requires to extend The output is true if at least one
OR Gate
the P[](probability query) operator with the min and max input is true