
Information Sciences 256 (2014) 57–73

Contents lists available at SciVerse ScienceDirect

Information Sciences
journal homepage: www.elsevier.com/locate/ins

A security risk analysis model for information systems: Causal relationships of risk factors and vulnerability propagation analysis
Nan Feng a, Harry Jiannan Wang b, Minqiang Li a

a College of Management and Economics, Tianjin University, 92 Weijin Road, Nankai District, Tianjin 300072, PR China
b Department of Accounting and MIS, University of Delaware, Newark, DE, United States

Article info

Article history:
Available online 4 March 2013
Keywords:
Information systems
Security risk
Bayesian networks
Ant colony optimization
Vulnerability propagation

Abstract
With the increasing organizational dependence on information systems, information systems security has become a very critical issue in enterprise risk management. In information systems, security risks are caused by various interrelated internal and external factors. A security vulnerability could also propagate and escalate through the causal chains of risk factors via multiple paths, leading to different system security risks. In order to identify the causal relationships among risk factors and analyze the complexity and uncertainty of vulnerability propagation, a security risk analysis model (SRAM) is proposed in this paper. In SRAM, a Bayesian network (BN) is developed to simultaneously define the risk factors and their causal relationships based on the knowledge from observed cases and domain experts. Then, the security vulnerability propagation analysis is performed to determine the propagation paths with the highest probability and the largest estimated risk value. SRAM enables organizations to establish proactive security risk management plans for information systems, which is validated via a case study.
© 2013 Elsevier Inc. All rights reserved.

1. Introduction
As information systems have become more prevalent in business, the consequences of information system security violations have become more and more costly [43]. For example, the 2010 Computer Crime and Security Survey on 738 organizations by the Computer Security Institute reported a total estimated annual loss of $190 million caused by information systems security incidents [20]. Recent literature [3,6,7,34] has also documented significant costs related to information systems security breaches.
As an important part of enterprise risk management (ERM), security risk analysis mainly focuses on analyzing vulnerabilities and threats to the information resources and deciding what countermeasures to take for reducing risk to an acceptable level. However, security risk analysis for information systems is a very challenging task due to the complex and dynamic
environment. For example, there often exist complex interactions among the components of information systems. Therefore,
any single vulnerability may have multiple propagation paths, leading to different security risks in information systems.
In recent years, security risk analysis for information systems has attracted much attention from researchers in the field [8,35,28]. The existing approaches for risk analysis can be grouped into three major categories: the quantitative approaches, the qualitative approaches, and the combination of quantitative and qualitative approaches.

Corresponding author. Tel./fax: +86 22 27404796.


E-mail address: tjufengnan@gmail.com (M. Li).
0020-0255/$ - see front matter © 2013 Elsevier Inc. All rights reserved.
http://dx.doi.org/10.1016/j.ins.2013.02.036


The quantitative approaches utilize mathematical and statistical models to represent risk [28]. Security risk exposure is represented as a function of the probability of the threats and the expected loss due to the vulnerability to those threats [6]. Gordon and Loeb [19] presented a mathematical model to determine the optimal security investment level for information systems. Their work and the subsequent literature on security risk analysis focused on a single system or a single type of protection technology. Yue et al. [47] extended those studies by formulating and solving the problem according to the risk management paradigm, and therefore provided managers with additional insights into making optimal decisions. Wu et al. [46] analyzed various risks and challenges in product development in the concurrent engineering environment and proposed a quantitative approach to systematically identifying the most important risks for accomplishing concurrent engineering projects. Grunske and Joyce [21] proposed a risk-based approach that created modular attack trees for each component in information systems. These modular attack trees were specified as parametric constraints, which allowed quantifying the probability of security breaches that occurred due to internal component vulnerabilities as well as vulnerabilities in the component's deployment environment.
There are also many qualitative security risk analysis methods and techniques. The Operationally Critical Threat, Asset, and Vulnerability Evaluation (OCTAVE) approach [1] defines a set of impact evaluation criteria to establish a common basis for determining the impact values due to threats to the critical assets. Peltier [35] presented a qualitative risk analysis process using techniques such as Practical Application of Risk Analysis (PARA) and Facilitated Risk Analysis Process (FRAP) to evaluate tangible and intangible risks. This process allowed for the systematic evaluation of risks, threats, hazards, and concerns, and provided cost-effective measures for lowering risk to an acceptable level. Other popular qualitative methods are the CCTA Risk Analysis and Management Method (CRAMM), developed by the UK Government's Central Computer and Telecommunications Agency (CCTA), and the INFOSEC Assessment Methodology (IAM) [15].
Some comprehensive approaches combining both quantitative and qualitative methods have also been proposed [2,38]. Chen et al. [9] applied the similarity measures of generalized fuzzy numbers to deal with fuzzy risk analysis problems. Although this approach is good at processing ambiguous information by simulating the way humans make judgments, it is unable to provide the graphical relationships among various security risk factors using flow charts or diagrams. For representing the relationships among risk factors, Fan and Yu [16] developed a Bayesian network (BN) based procedure to provide risk analysis support. In their approach, the BN is structured solely based on domain experts' experience. Sun et al. [43] proposed an evidential reasoning approach under the Dempster–Shafer theory for the risk analysis of information systems security, which provides a rigorous, structured means to incorporate relevant security risk factors, related countermeasures, and their interrelationships when estimating security risk in information systems. In addition, sensitivity analyses were performed to evaluate the impact of important parameters on the model's results in this approach. Models that are implemented incorrectly or developed based on questionable assumptions are vulnerable to model risks [45]. Wu and Olson [45] summarized a series of model risks in the financial services industry and demonstrated an effective means to mitigate such risks through predictive scorecards.
The aforementioned approaches have contributed a great deal to the development of security risk analysis. However, two issues need to be further investigated in the field of information systems security risk management. First, in the process of security risk analysis for information systems, models are built in order to analyze and better understand the security risk factors and their causal relationships in real-world information systems. Establishing an appropriate model suitable for the target security risk problem is a crucial task that ultimately influences the effectiveness of risk analysis results. Existing literature [9,16] assumes either that the structure of the model is provided by domain experts or that it is chosen from some general well-known class of model structures. Therefore, how to leverage both the database of observed cases and domain experts' experience to construct a representative model for observed information systems is a critical issue in security risk analysis.
Second, one security vulnerability could propagate and escalate through the causal chains of security risk factors via multiple paths, leading to different security risks in information systems. Existing approaches largely focus on risk probability and severity without considering vulnerability propagation. Therefore, there is an imperative need for advanced vulnerability propagation analysis tools that can help establish proactive security risk management plans for information systems.
To address the aforementioned challenges, we propose a security risk analysis model (SRAM) in this paper based on Bayesian networks and ant colony optimization. SRAM extends existing work by constructing the BN using both the database of observed cases and domain experts' experience. Then, security risk assessment is performed to deduce the occurrence probabilities and the consequence severities of security risks. After that, the vulnerability propagation paths are calculated using ant colony optimization to provide guidance for developing security risk treatment plans.
The rest of this paper is organized as follows: Section 2 reviews the theoretical background of this study. Section 3 discusses the process for developing the SRAM in detail. The model is further demonstrated and validated in Section 4 via a case study. We compare our model with other related approaches and discuss its limitations in Section 5. Finally, we summarize our contributions and point out directions for further research.

2. Theoretical backgrounds
The SRAM is based on Bayesian networks and ant colony optimization, which are introduced in this section.


2.1. Bayesian networks


Bayesian networks (BNs) [25,26], also known as probabilistic belief networks or causal networks, are knowledge representation tools capable of representing dependence and independence relationships among random variables. A BN, N = (X, G, P), over a set of variables X consists of a directed acyclic graph G = (V, E) and a set of conditional probability distributions P. Each node v in G corresponds one-to-one with a discrete random variable $X_v \in X$ with a finite set of mutually exclusive states.
A BN encodes a joint probability distribution over a set of random variables, X, of a problem domain. The set of conditional probability distributions, P, specifies a multiplicative factorization of the joint probability distribution over $X = \{X_1, X_2, \ldots, X_n\}$:

\[ P(X_1, X_2, \ldots, X_n) = \prod_{v \in V} P\left(X_v \mid X_{Pa(v)}\right) \]

where $Pa(v)$ denotes the parents of variable $X_v$ in G.
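The factorization can be illustrated with a minimal Python sketch. The two-node network, its node names, and all probabilities below are invented for illustration and do not come from the case study; the point is only that the joint probability is the product of one conditional probability per node.

```python
from itertools import product

# Hypothetical two-node network: Vulnerability -> Breach (illustrative names).
parents = {"Vulnerability": [], "Breach": ["Vulnerability"]}

# CPTs keyed by (tuple of parent states, own state); the numbers are made up.
cpt = {
    "Vulnerability": {((), True): 0.2, ((), False): 0.8},
    "Breach": {
        ((True,), True): 0.6, ((True,), False): 0.4,
        ((False,), True): 0.05, ((False,), False): 0.95,
    },
}

def joint_probability(assignment):
    """Return P(X_1, ..., X_n) as the product of P(X_v | X_Pa(v)) over all nodes."""
    prob = 1.0
    for node, pa in parents.items():
        pa_states = tuple(assignment[p] for p in pa)
        prob *= cpt[node][(pa_states, assignment[node])]
    return prob

# Summing the joint over every full assignment should give 1.
total = sum(
    joint_probability({"Vulnerability": v, "Breach": b})
    for v, b in product([True, False], repeat=2)
)
```

Storing one table per node keeps the number of parameters proportional to the sizes of the local families rather than to the size of the full joint table.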


BNs have been widely applied in the fields of medical diagnostics, classification systems, software agents for personal assistants, multi-sensor fusion, and legal analysis of trials [29]. In SRAM, a BN is used to simultaneously integrate the factors related to assessing the security risk and their causal mechanisms because of two advantages. First, a BN can be used to learn causal relationships, and hence can be used to gain understanding about a problem domain and to predict the consequences of intervention. Second, the BN is an ideal representation for combining prior knowledge (which often comes in causal form) and data because it has both causal and probabilistic semantics.
2.1.1. Learning Bayesian networks
Recently, learning BNs from data has become an increasingly active area of research. Although experts can sometimes create good BNs from their own experience, this can be a very hard task for domains with large knowledge bases. Therefore, many methods have been developed to automate the creation of BNs using cases collected from past experience [31].
BN learning can be classified according to whether the structure of the network is known or unknown and whether the variables (data) are observable (complete) or hidden (incomplete). Consequently, there are four classes of learning BNs from data: known structure and observable variables, unknown structure and observable variables, known structure and unobservable variables, and unknown structure and unobservable variables. In this paper, the problem falls into the second class (unknown structure and observable variables), in which we try to learn the structure of the BN by using the complete data for security risk analysis in information systems. Learning BNs consists of structure learning and parameter learning. Structure learning is the estimation of the topology (links) of the network, and parameter learning is the estimation of the conditional probabilities in the network.
In the case of unknown structure and complete data, BN structure learning is a much harder problem than parameter learning, since the number of candidate networks grows exponentially as the number of variables increases [31]. By contrast, parameter learning, i.e., calculating the conditional probability distributions of each node in the network from the complete data, is an easier task and has been studied extensively. Therefore, we focus on structure learning in the BN development in this paper.
2.1.2. Bayesian network structure learning
There are two main approaches to structure learning: constraint-based and score-based. In the constraint-based approach, tests of conditional independence on the data are performed, and a search is conducted to find the network that is consistent with the observed dependencies. The score-based method operates on the following principle: a scoring function that represents how well a network structure fits the data is defined for each candidate structure, and the goal is to find the highest-scoring structure. Since the score-based method is less sensitive to errors in individual tests, we utilize a score-based method for BN structure learning in this paper.
A BN structure learning algorithm requires two components: a scoring function for candidate network structures and a search algorithm that performs the optimization.
The three main scoring functions commonly used to learn BNs are the log-likelihood [24], the minimal description length (MDL) score [32], and the Bayesian score [24]. In this paper, we use the Bayesian score as the scoring function in BN structure learning. A desirable and important property of a scoring metric is its decomposability in the presence of complete data, i.e., the Bayesian score function can be decomposed in the following way:

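\[ f(G : D) = \sum_{i=1}^{n} f\left(x_i, Pa(x_i) : N_{x_i, Pa(x_i)}\right) \]

\[ f\left(x_i, Pa(x_i) : N_{x_i, Pa(x_i)}\right) = \sum_{j=1}^{q_i} \left( \log\frac{(r_i - 1)!}{(N_{ij} + r_i - 1)!} + \sum_{k=1}^{r_i} \log\left(N_{ijk}!\right) \right) \]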

where $N_{x_i, Pa(x_i)}$ are the statistics of the variable $x_i$ and $Pa(x_i)$ in D, $r_i$ is the number of possible values of the variable $x_i$, $q_i$ is the number of possible configurations (instantiations) for the variables in $Pa(x_i)$, $N_{ijk}$ is the number of cases in D in which variable $x_i$ has its kth value and $Pa(x_i)$ is instantiated to its jth value, and $N_{ij} = \sum_{k=1}^{r_i} N_{ijk}$.
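As a rough illustration of the decomposable score, the following Python sketch computes the per-node term from counts in a complete dataset. The function names, the dictionary-per-case data layout, and the toy data are assumptions made for this example; log-factorials are evaluated with `math.lgamma` to avoid overflow.

```python
from collections import Counter
from math import lgamma

def log_factorial(n):
    """log(n!) computed via the log-gamma function."""
    return lgamma(n + 1)

def family_score(data, child, parent_list, arity):
    """Score of one node given its parents: the per-node term of f(G : D)."""
    r = arity[child]
    n_ijk = Counter()  # counts per (parent configuration j, child value k)
    n_ij = Counter()   # counts per parent configuration j
    for row in data:
        j = tuple(row[p] for p in parent_list)
        n_ijk[(j, row[child])] += 1
        n_ij[j] += 1
    score = 0.0
    # Unobserved parent configurations have N_ij = 0 and contribute log(1) = 0,
    # so it is enough to loop over the configurations that occur in the data.
    for count in n_ij.values():
        score += log_factorial(r - 1) - log_factorial(count + r - 1)
    for count in n_ijk.values():
        score += log_factorial(count)
    return score

def network_score(data, structure, arity):
    """f(G : D): sum of the family scores; structure maps node -> parent list."""
    return sum(family_score(data, node, pa, arity) for node, pa in structure.items())

# Toy complete dataset in which Y always copies X (invented for illustration).
data = [{"X": 0, "Y": 0}, {"X": 0, "Y": 0}, {"X": 1, "Y": 1}, {"X": 1, "Y": 1}]
arity = {"X": 2, "Y": 2}
with_edge = network_score(data, {"X": [], "Y": ["X"]}, arity)
without_edge = network_score(data, {"X": [], "Y": []}, arity)
```

Because the score is a sum over nodes, a search step that changes the parents of one node only requires re-evaluating that node's term.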


Having discussed several scoring functions, we now turn to finding the network that has the highest score. In other words, a dataset, the scoring function, and a set of possible structures are the input to the search algorithm, while the desired output is a network that maximizes the score. The simplest search-and-score algorithm is exhaustive search, which explores all possible structures for a dataset. For datasets with many variables, however, the number of possible structures is so large that an exhaustive search is infeasible. Therefore, heuristic search methods [37] have been proposed.
There are two typical heuristic search algorithms for the structure learning problem: the K2 algorithm [10] and the hill-climbing algorithm [42]. The idea of the K2 algorithm is to incrementally add nodes to a parent set and find the best parent set that maximizes the joint probability of the structure and the database. In the hill-climbing algorithm, a search space is first defined, and this space is then traversed in search of high-scoring structures. Although these two search algorithms are more efficient than exhaustive search, they are prone to getting trapped in local optima [42].
To solve this problem, a genetic algorithm (GA) was introduced by Larrañaga et al. [33] for learning BNs. In their GA implementation, a directed acyclic graph (DAG) is represented by a connectivity matrix that is stored as a string. Lam et al. [32] proposed a hybrid evolutionary programming (HEP) algorithm that combined the use of independence tests with a quality-based search. The HEP algorithm evolves a population of DAGs to find a solution that minimizes the MDL score. The common drawback of the algorithms proposed by Larrañaga and Lam is that the crossover and mutation operators they use are complex and expensive in both memory and runtime.
Ant colony optimization (ACO) was initially used to solve specific problems: the ant system, for example, was successfully applied to the traveling salesman problem (TSP) [12,13]. Applications to shortest path problems in graphs were developed in order to study the behavior of these algorithms on simple problems, and ACO later became a metaheuristic optimizer that can be applied to combinatorial optimization problems that can be represented in the form of a graph [11]. Since ACO has an advantage in solving combinatorial optimization problems, and the space of BNs is a combinatorial space, it is utilized for BN structure learning in this paper.
Compared with the aforementioned algorithms for BN structure learning, ACO's advantages lie not only in its strong search capability in combinatorial optimization problems but also in its easy implementation and good efficiency in discovering optimal solutions. In the next section, we briefly describe the concepts related to ACO, which is used in SRAM.
2.2. Ant colony optimization
Ant colony optimization (ACO) algorithms [11,13,14] are multi-agent systems in which the behavior of each agent (ant) is inspired by the foraging behavior of real ants. In particular, ACO algorithms model the process followed by real ants when finding the shortest path from a food source to their nest. While walking, real ants deposit a chemical substance called pheromone on the ground when they have successfully found food and are returning to the nest. Ants can smell pheromone and, when choosing their way, they tend to choose, in a probabilistic way, paths marked by strong pheromone concentrations. In the absence of pheromone, ants choose randomly, but after a transitory period shorter paths will be visited more frequently and pheromone will accumulate faster on those paths, which in turn causes more ants to use these paths. This positive feedback effect means that all ants will eventually use the shortest path. So, although a single ant is capable of building a solution (i.e., a path), the optimal solution comes about solely as a result of the cooperative behavior of the ant colony (which is based on a simple form of indirect communication through the pheromone, called stigmergy). Besides the TSP, ACO has been used to optimize a wide range of problems, such as the satisfiability problem [36], supply-chain logistics [39,40], and sorting problems [22].
In this paper, ACO is applied to find the maximal scoring BN structure, which is a good approximation of the process of security risk analysis. In addition, it is also used to determine security vulnerability propagation paths and their occurrence probabilities.
3. Proposed security risk analysis model
The procedure of the proposed security risk analysis model (SRAM) is defined through three phases (Fig. 1): Bayesian network (BN) development, security risk assessment, and vulnerability propagation analysis. In Fig. 1, Database1 (DB1) contains the basic information about the BN nodes, Database2 (DB2) stores the case data of the BN nodes, and Database3 (DB3) holds the current observation data.
3.1. Bayesian network development
In this phase, a BN is developed to represent the factors related to assessing the security risk and their causal relationships
based on the DB1 and DB2 (see Fig. 1). In addition, the BN will become the basis for security risk assessment and vulnerability propagation analysis.
For the representation of the causal relationships among security risk factors, an ACO-based algorithm (Algorithm 1) is developed to learn the BN structure that best fits DB2. Starting from a candidate network, which may be empty or may be a starting point provided by the experts, the ants iteratively search for good single-step changes (adding, removing, or reversing edges) to build a BN. To this end, each ant randomly picks two variables and chooses whether an edge (with its direction) should exist between them. The best action proposed by the ants is applied to the network structure.

Fig. 1. SRAM procedure.
Algorithm 1 is presented in Appendix A. Based on a candidate network, the ants collaboratively build a network structure in each iteration. Within one iteration, every ant randomly picks an edge and decides the state of that edge based on the pheromones and heuristics. More specifically, each ant performs the following two steps.
(1) Random selection of the next edge to be evaluated from the set of candidate edges. The set of candidate edges are all
edges of the graph.
(2) Assignment of an edge state. This assignment is made probabilistically, in balance between the pheromone information and the locally computed heuristic information.
The ant that found the assignment with the highest score improvement changes the network if the change does not lead to
any cycle in the network structure. If no higher scoring network can be found, the current network G and the best network
found so far, G, are used to update the pheromone information in order to guide the ants in the next iterations to higher
quality networks.
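The acyclicity condition on an accepted change can be checked with a simple reachability test. The sketch below is one possible check, not the procedure from Appendix A; the function name and the edge-set representation are assumptions made for illustration.

```python
def creates_cycle(edges, new_edge):
    """Return True if adding new_edge = (u, v) to the DAG `edges` closes a directed cycle."""
    u, v = new_edge
    # Adding u -> v creates a cycle exactly when u is already reachable from v.
    stack, seen = [v], set()
    while stack:
        node = stack.pop()
        if node == u:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(child for parent, child in edges if parent == node)
    return False

# Hypothetical three-node chain A -> B -> C.
edges = {("A", "B"), ("B", "C")}
closing = creates_cycle(edges, ("C", "A"))   # would close A -> B -> C -> A
shortcut = creates_cycle(edges, ("A", "C"))  # keeps the graph acyclic
```

An edge reversal can be tested the same way by first removing the original edge and then checking the reversed edge as an addition.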
When Niter = Nmax, i.e., the current number of iterations is equal to the maximum number of iterations, the iteration process ends. Nmax should be set to a value high enough to allow the pheromone matrix to saturate.
As for the BN parameters (i.e., the conditional probability tables), they can be determined by learning the parameters from historical data and experts' knowledge. In this paper, we use maximum likelihood estimation (MLE) [23] to calculate the conditional probability tables of each node in the BN from the complete data.
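With complete data, the maximum likelihood estimate of each conditional probability is a ratio of counts. A minimal sketch follows, assuming cases are stored as dictionaries; the function name, variable names, and records are hypothetical.

```python
from collections import Counter

def mle_cpt(data, child, parent_list):
    """Estimate P(child = k | parents = j) as N_ijk / N_ij from complete data."""
    joint_counts = Counter()
    parent_counts = Counter()
    for row in data:
        j = tuple(row[p] for p in parent_list)
        joint_counts[(j, row[child])] += 1
        parent_counts[j] += 1
    return {key: count / parent_counts[key[0]] for key, count in joint_counts.items()}

# Invented case records for a hypothetical Patched -> Breach relationship.
records = [
    {"Patched": "no", "Breach": "yes"},
    {"Patched": "no", "Breach": "yes"},
    {"Patched": "no", "Breach": "no"},
    {"Patched": "yes", "Breach": "no"},
]
cpt = mle_cpt(records, "Breach", ["Patched"])
```

Combinations that never occur in the data are absent from the returned table; handling them (for example with smoothing) is outside the scope of this sketch.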
3.2. Security risk assessment
Once the BN of the information systems is constructed, it serves as a tool for risk assessment based on a real-time database (i.e., DB3 in Fig. 1), which provides updated information about each observable node in the BN as inference evidence. This phase of SRAM finally yields the occurrence probability and the consequence severity of security risk in the BN. The result of the risk assessment is used in the decision-making procedure: if the future estimated situation of the information systems is a state considered secure or successful, no action needs to be planned. Otherwise, if the probability of a risk node in the BN exceeds the threshold set in advance, the vulnerability propagation should be further analyzed.
Whenever new evidence becomes available in the process of security risk assessment, it should be entered into the BN to update previous estimates by probabilistic inference. In BNs, probabilistic inference can be defined as the task of computing all posterior marginals of non-evidence variables given the evidence [5]. In general, probabilistic inference is an NP-hard task [27]. Therefore, the most critical task related to the risk assessment can be defined as identifying the posterior probability of each risk based on the evidence obtained from the real-time database. In this phase, we develop an inference engine based on the junction tree (also known as a join tree or a Markov tree) [27] to compute the posterior marginal P(X|e) of a variable X approximately, given the evidence e.
A junction tree representation T of a Bayesian network N = (X, G, P) is a pair $T = (\mathcal{C}, \mathcal{S})$, where $\mathcal{C}$ is the set of cliques and $\mathcal{S}$ is the set of separators. The cliques in $\mathcal{C}$ are the nodes of T, whereas the separators in $\mathcal{S}$ annotate the links of the tree. Each clique $C \in \mathcal{C}$ represents a maximal complete subset of pairwise connected variables of X, i.e., $C \subseteq X$. Once the junction tree has been constructed, a probability potential is associated with each clique $C \in \mathcal{C}$ and with each separator $S \in \mathcal{S}$ between two adjacent cliques $C_i$ and $C_j$, where $S = C_i \cap C_j$.
Inference is performed using a message passing algorithm [30] on the junction tree. The process involves the following steps:
(1) Each item of evidence must be incorporated into the junction tree potentials. For each item of evidence, some potential containing the observed variable is modified to reflect the evidence.
(2) A clique of the junction tree is selected. This clique is referred to as the root of the inference.
(3) Then messages are passed towards the selected root. The messages are passed through the separators of the junction
tree (i.e., along the links of the tree). These messages cause the potentials of the receiving cliques and separators to be
updated.
(4) The messages are passed in the other direction (i.e., from the root towards the leaves of the junction tree).
(5) At this point, the junction tree is said to be in equilibrium: The probability P (X|e) can be computed from any clique or
separator containing X. The result will be independent of the chosen clique or separator.
Prior to the initial round of message passing, for each variable $X_v \in X$ we assign the conditional probability distribution $P(X_v \mid X_{pa(v)})$ to a clique C such that $X_{pa(v)} \subseteq C$. Once all conditional probability distributions have been assigned to cliques, the distributions assigned to each clique are combined to form the initial clique potential.
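For a very small network, the posterior marginal P(X|e) can also be obtained by brute-force enumeration of the joint distribution, which gives a reference value when testing an inference engine. The sketch below is not the junction tree algorithm described above; the three-node chain, its names, and its probabilities are invented for illustration.

```python
from itertools import product

# Hypothetical chain Vulnerability -> Exploit -> Breach; all numbers are invented.
parents = {"Vulnerability": [], "Exploit": ["Vulnerability"], "Breach": ["Exploit"]}
# p_true[node][parent states] = P(node = True | parent states)
p_true = {
    "Vulnerability": {(): 0.2},
    "Exploit": {(True,): 0.5, (False,): 0.1},
    "Breach": {(True,): 0.8, (False,): 0.1},
}

def joint(assignment):
    """Joint probability of a full assignment via the BN factorization."""
    prob = 1.0
    for node, pa in parents.items():
        p = p_true[node][tuple(assignment[x] for x in pa)]
        prob *= p if assignment[node] else 1.0 - p
    return prob

def posterior(query, evidence):
    """Return {state: P(query = state | evidence)} by summing out hidden nodes."""
    hidden = [n for n in parents if n != query and n not in evidence]
    weights = {}
    for state in (True, False):
        total = 0.0
        for values in product((True, False), repeat=len(hidden)):
            assignment = dict(evidence)
            assignment[query] = state
            assignment.update(zip(hidden, values))
            total += joint(assignment)
        weights[state] = total
    norm = sum(weights.values())
    return {state: w / norm for state, w in weights.items()}

belief = posterior("Vulnerability", {"Breach": True})
```

Enumeration scales exponentially with the number of hidden variables, which is why clique-based message passing is used for networks of realistic size.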
Most risk analysis methods rely on a qualitative judgment of consequence severity, regardless of the analysis rigor applied to the security risk assessment. As the risk analysis is dependent on the estimated occurrence probability and consequence severity of security risks, the error associated with the consequence severity evaluation directly impacts the
estimated security risk and ultimately the risk reduction requirements. To address the qualitative biases for consequence
severity estimation, a semi-quantitative approach [41] is adopted to support the assessment of consequence severity of
security risk.
3.3. Vulnerability propagation analysis
In this phase, we propose an algorithm based on ant colony optimization (Fig. 2) for vulnerability propagation analysis, which helps determine the propagation path(s) with the highest probability and the largest estimated path risk exposure.
In Fig. 2, the algorithm for vulnerability propagation (Algorithm 2) describes the process of finding one propagation path from a source node to a destination node in the BN. The transition probability $p_{ij}^{k}(t)$ of the kth ant moving from node i to node j is given as

\[ p_{ij}^{k}(t) = \frac{[\tau_{ij}(t)]^{\alpha}\,[\eta_{ij}(t)]^{\beta}}{\sum_{s=1}^{n} [\tau_{is}(t)]^{\alpha}\,[\eta_{is}(t)]^{\beta}} \]

where $\alpha$ and $\beta$ denote the weighting parameters controlling the relative importance of the pheromone amount and the desirability, respectively, $\tau_{ij}(t)$ is the quantity of pheromone laid on the path (i, j) at time t, s indexes the nodes that the kth ant can select in the next search step, and $\eta_{ij}(t)$ is the heuristic information.
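A minimal Python sketch of the transition rule follows. The function name, the edge-keyed dictionaries, and the default values of alpha and beta are assumptions made for this example rather than settings taken from the paper.

```python
def transition_probabilities(tau, eta, current, candidates, alpha=1.0, beta=1.0):
    """Probability of moving from `current` to each candidate node.

    tau and eta map an edge (i, j) to its pheromone level and heuristic value.
    """
    weights = {
        j: (tau[(current, j)] ** alpha) * (eta[(current, j)] ** beta)
        for j in candidates
    }
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}

# Invented values: equal pheromone, node 1 twice as desirable as node 2.
tau = {(0, 1): 1.0, (0, 2): 1.0}
eta = {(0, 1): 0.5, (0, 2): 0.25}
probs = transition_probabilities(tau, eta, current=0, candidates=[1, 2])
```

An ant would then sample its next node from `probs`, for example with `random.choices`, so that stronger pheromone and higher desirability make an edge more likely without making the choice deterministic.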
Before the end of each iteration, the residual pheromone on each connective path is recalculated. The update method [44] for the pheromone is described below, using the following definitions:
(1) $\tau_{ij}(t)$: the pheromone left on a connective path (i, j) at time t.
(2) $\Delta\tau_{ij}$: the amount of pheromone added by all ants on a connective path (i, j) between time t and time t + N.
(3) $\Delta\tau_{ij}^{k}$: the strength of pheromone left by the kth ant on a connective path (i, j) between time t and time t + N.
From the above definitions, the extra strength of the pheromone on a particular path is the sum of the pheromone left by each of the m ants:

\[ \Delta\tau_{ij} = \sum_{k=1}^{m} \Delta\tau_{ij}^{k} \]

The pheromone left on a connective path (i, j) by the kth ant is inversely proportional to the total length of the path traveled by that ant:

\[ \Delta\tau_{ij}^{k} = Q / L_k, \]

where Q is a constant and $L_k$ is the total path cost traveled by the kth ant.
Pheromone strength on a particular path increases with the number of ants that have traveled on it. However, the pheromone diminishes and disappears as time elapses. The variable $\rho$ ($0 \le \rho \le 1$) determines the residual strength of pheromone after decay between two iterations, and $1 - \rho$ stands for the portion of pheromone that diminishes with time. Therefore, at t + N, the pheromone on a path is the residual amount left after decay plus the pheromone deposited by all the ants during the last iteration:

\[ \tau_{ij}(t + N) = \rho\,\tau_{ij}(t) + \Delta\tau_{ij}. \]

Fig. 2. The algorithm for vulnerability propagation analysis.

When Niter = Nmax, the iteration process ends. The output of Algorithm 2 is one vulnerability propagation path and its probability.
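The pheromone update at the end of an iteration can be sketched as follows. The function name, the edge-keyed dictionary, and the default values of rho and q are illustrative assumptions; each ant deposits Q / L_k on every edge of the path it traveled, and the previous pheromone is scaled by the residual factor.

```python
def update_pheromone(tau, ant_paths, ant_costs, rho=0.5, q=1.0):
    """Return the pheromone after one iteration: rho * tau plus the ants' deposits.

    Every edge used by an ant is assumed to be a key of `tau`.
    """
    delta = {edge: 0.0 for edge in tau}
    for path, cost in zip(ant_paths, ant_costs):
        for edge in zip(path, path[1:]):
            delta[edge] += q / cost  # deposit of the kth ant: Q / L_k
    return {edge: rho * tau[edge] + delta[edge] for edge in tau}

# Two hypothetical ants on a three-node graph; path costs are invented.
tau = {("a", "b"): 1.0, ("b", "c"): 1.0, ("a", "c"): 1.0}
new_tau = update_pheromone(
    tau,
    ant_paths=[["a", "b", "c"], ["a", "c"]],
    ant_costs=[2.0, 4.0],
)
```

In this toy run the cheaper path a, b, c keeps more pheromone on its edges than the direct but more costly edge (a, c), which is the positive feedback the algorithm relies on.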
4. SRAM validation via a case study
4.1. Soundness and completeness of SRAM
In this section, we first validate the soundness of SRAM by explaining in detail the methodology of applying SRAM as depicted in Fig. 3. In SRAM, since the space of BNs is a combinatorial space, ant colony optimization (ACO) is utilized for BN structure learning because of its advantage in solving combinatorial optimization problems. By integrating the database of observed cases with expert experience, and based on ACO, a BN is developed to represent the security risk factors and their causal relationships in the information systems. Whenever new evidence becomes available in the process of security risk assessment, it should be entered into the BN to update previous estimates by BN probabilistic inference, which aims at computing all posterior marginals of non-evidence variables given the new evidence. Based on the occurrence probability and the consequence severity of security risk in the BN, the vulnerability propagation analysis is performed to calculate the propagation paths and their probabilities. Finally, a security risk treatment plan can be developed according to the results of the vulnerability propagation analysis. Given that the key components of SRAM, namely Bayesian networks and ant colony optimization, have been widely applied in various research domains, and that the methodology shown in Fig. 3 is developed systematically and rigorously, we consider our proposed model to be sound. In addition, we further validate the soundness of our model via a case study, as discussed in the next section.

Fig. 3. The methodology of SRAM.
Completeness is defined as the degree to which the model represents the elements of interest in the problem domain. In terms of SRAM, completeness is related to the set of security risk variables defined in the model. To ensure the completeness of the variable set, we derive those variables by two means: first, we refer to an authoritative report from the National Institute of Standards and Technology (NIST) to select the base set of security risk variables; then, we interview six domain experts to review, modify, and enhance the initial set of variables based on their knowledge and experience. We believe that the set of security risk variables selected using this two-step procedure represents the most common security risk factors in real information systems and is thus complete for the purpose of analyzing information systems security risk in our research context.
4.2. A case study
In this section, the proposed SRAM is applied to the information system of a real company (referred to as AB Company in this paper), which has been in service for six years, to assess its security risk status and analyze its vulnerability propagation paths and related probabilities. The details of the case study are discussed next.
4.2.1. Step 1: data preprocessing
AB Company is a financial services firm providing a wide range of services in securities trading and sales, corporate finance and investment banking, and asset management. Because of its relatively long business history and several years of e-business history, there is a large amount of raw data related to information systems security.
The security risk related variables (i.e., the nodes in the BN) are categorized into six groups: physical and environmental security, network security, host/server computer security, application security, data security and back-up, and communication and operation security. The basic information about the nodes is stored in DB1 (see Fig. 1).
The raw data related to information systems security is stored in MATLAB files, and the data corresponding to the nodes in the BN are extracted. The data adjustment is then followed by an equal-frequency data binning process, which groups the data into bins while ensuring that each bin contains a roughly equal number of elements. After the equal-frequency binning, the original data are represented by the bin numbers associated with them. As a result, the maximum number of states in each node in the BN is reduced. Moreover, the conditional probability tables (CPTs) of all variables will have manageable sizes because each node will have a small number of states.
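The equal-frequency binning step can be illustrated with a minimal Python sketch. It is not the authors' MATLAB preprocessing: the function name `equal_frequency_bins`, the rank-based assignment, and the sample values are assumptions made for illustration only.

```python
from typing import List, Sequence


def equal_frequency_bins(values: Sequence[float], n_bins: int) -> List[int]:
    """Return a 1-based bin number for each value, with near-equal bin sizes."""
    if n_bins < 1:
        raise ValueError("n_bins must be at least 1")
    n = len(values)
    # Indices of the observations ordered by value (ties keep input order).
    order = sorted(range(n), key=lambda i: values[i])
    bins = [0] * n
    for rank, idx in enumerate(order):
        # Ranks 0..n-1 are split into n_bins contiguous, near-equal groups.
        bins[idx] = rank * n_bins // n + 1
    return bins


# Hypothetical raw measurements for one node (not data from the case study).
raw = [0.91, 0.15, 0.48, 0.77, 0.05, 0.62, 0.33, 0.88, 0.24]
states = equal_frequency_bins(raw, n_bins=3)
print(states)  # each of the three bins receives three observations
```

With three bins, a node that originally took nine distinct values is reduced to three states, which is what keeps the CPT sizes manageable. One caveat of this rank-based variant is that tied values can fall into different bins.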
4.2.2. Step 2: BN development
In SRAM, we use preliminary experiments to determine the appropriate values for various parameters. For Algorithm 1, different parameter levels are examined based on the research presented in [29,18,4,14]. There are six different ant colony sizes $m \in \{5, 10, 20, 30, 40, 50\}$, four different evaporation rate levels $\rho \in \{0, 0.25, 0.5, 0.75\}$, three different pheromone weighting parameters $\alpha \in \{0, 1, 5\}$, and three different desirability parameters $\beta \in \{0, 1, 5\}$. The arbitrary positive constant $Q$ is fixed at 50 and the initial pheromone intensity on all arcs, $\tau_0$, is set to 1. Meanwhile, we tested different numbers of iterations and found that the performance of Algorithm 1 stopped improving significantly after 400 iterations. Thus, the maximum number of iterations was set to $N_{max} = 400$. In sum, our experiments show that $m = 40$, $\alpha = 1$, $\beta = 1$, $\rho = 0.25$ are the best choices of parameter values for Algorithm 1. Similarly, for Algorithm 2, the results of our experiments show that the values $m = 40$, $\alpha = 1$, $\beta = 1$, $\rho = 0.5$, and $Q = 50$ yield the best performance.
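To make the size of this preliminary experiment concrete, the parameter levels listed above can be enumerated as a grid. The sketch below is only an illustration: the dictionary keys (`m`, `rho`, `alpha`, `beta`, `Q`, `tau0`) are hypothetical names for the paper's symbols, and the evaluation of each configuration is left out because it depends on Algorithm 1 itself.

```python
from itertools import product

colony_sizes = [5, 10, 20, 30, 40, 50]      # m
evaporation_rates = [0, 0.25, 0.5, 0.75]    # rho
pheromone_weights = [0, 1, 5]               # alpha
desirability_weights = [0, 1, 5]            # beta

# One configuration per combination; Q and tau0 are fixed as in the text.
grid = [
    {"m": m, "rho": rho, "alpha": alpha, "beta": beta, "Q": 50, "tau0": 1}
    for m, rho, alpha, beta in product(
        colony_sizes, evaporation_rates, pheromone_weights, desirability_weights
    )
]

# Best setting reported for Algorithm 1 in the text.
best_alg1 = {"m": 40, "rho": 0.25, "alpha": 1, "beta": 1}
in_grid = any(all(cfg[k] == v for k, v in best_alg1.items()) for cfg in grid)

print(len(grid), in_grid)
```

The grid contains 6 x 4 x 3 x 3 = 216 configurations, each of which would be run for up to 400 iterations under the stopping rule described above.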
Table 1
Information of risk factor nodes in BN.

| Category | ID | Risk factor nodes | State space | Parent nodes | Children nodes |
|---|---|---|---|---|---|
| 1. Physical and environment security | RF1_1 | 1. Protecting against external and environmental threats | {Very effective; Effective; Average; Ineffective} | / | {RF1_4} |
| | RF1_2 | 2. Physical entry controls | {High level; Average; Need to be improved} | / | {RF1_4} |
| | RF1_3 | 3. Physical security perimeter | {Secure; Average; Insecure} | / | {RF1_4} |
| | RF1_4 | 4. Secure areas | {Secure; Average; Insecure} | {RF1_1, RF1_2, RF1_3} | {R1, RF5_3} |
| | RF1_5 | 5. Supporting utilities | {In good condition; Need to be improved} | / | {RF1_8} |
| | RF1_6 | 6. Cabling security level | {High; Medium; Low} | / | {RF1_8} |
| | RF1_7 | 7. Equipment maintenance | {Regular; Irregular} | / | {RF1_8} |
| | RF1_8 | 8. Equipment security level | {High; Medium; Low} | {RF1_5, RF1_6, RF1_7} | {R1, R3} |
| 2. Network security | RF2_1 | 1. Network connection control | {Effective; Average; Ineffective} | / | {RF2_3} |
| | RF2_2 | 2. Network routing control | {Effective; Average; Ineffective} | / | {RF2_3} |
| | RF2_3 | 3. Network access control | {Effective; Average; Ineffective} | {RF2_1, RF2_2} | {RF2_5, R2, R3} |
| | RF2_4 | 4. User authentication for external connections | {Secure; Average; Insecure} | / | {RF2_5} |
| | RF2_5 | 5. Network intrusion protection | {Effective; Average; Ineffective} | {RF2_3, RF2_4} | {R2, RF3_4} |
| | RF2_6 | 6. Network security audit | {Comprehensive; Incomprehensive} | / | {R2, RF3_3, RF6_8} |
| 3. Host/Server computer security | RF3_1 | 1. User identification and authentication | {Secure; Average; Insecure} | / | {RF3_2} |
| | RF3_2 | 2. Host/Server access control | {Effective; Average; Ineffective} | {RF3_1} | {R3, RF4_1} |
| | RF3_3 | 3. Host/Server security audit | {Comprehensive; Incomprehensive} | {RF2_6} | {R3, RF4_2, RF6_8} |
| | RF3_4 | 4. Host/Server intrusion protection | {Effective; Average; Ineffective} | {RF2_5} | {R3} |
| 4. Application security | RF4_1 | 1. Application access control | {Effective; Average; Ineffective} | {RF3_2} | {R4} |
| | RF4_2 | 2. Application security audit | {Comprehensive; Incomprehensive} | {RF3_3} | {R4, RF6_8} |
| | RF4_3 | 3. Capability of fault tolerance | {High; Medium; Low} | / | {R4} |
| 5. Data security and backup | RF5_1 | 1. Data secrecy | {High; Medium; Low} | / | {RF6_7, RF6_9, R5} |
| | RF5_2 | 2. Data integrity | {High; Medium; Low} | / | {R5} |
| | RF5_3 | 3. Data back-up policy | {Adequate; Inadequate} | {RF1_4} | {R5} |
| 6. Communication and operation security | RF6_1 | 1. Documented operating procedures | {Good; Not good; Bad} | / | {RF6_4} |
| | RF6_2 | 2. Change management | {Effective; Average; Ineffective} | / | {RF6_4} |
| | RF6_3 | 3. Segregation of duties | {Clear; Unclear} | / | {RF6_4} |
| | RF6_4 | 4. Operational procedures and responsibilities | {Very standard; Standard; Non-standard} | {RF6_1, RF6_2, RF6_3} | {R6} |
| | RF6_5 | 5. Communication secrecy | {High; Medium; Low} | / | {RF6_7} |
| | RF6_6 | 6. Communication integrity | {High; Medium; Low} | / | {RF6_7} |
| | RF6_7 | 7. Exchange of information | {Secure; Average; Insecure} | {RF6_5, RF6_6, RF5_1, RF6_10} | {R4, R6} |
| | RF6_8 | 8. Audit logging | {Secure; Average; Insecure} | {RF2_6, RF3_3, RF4_2} | {RF6_10} |
| | RF6_9 | 9. Protection of log information | {Effective; Average; Ineffective} | {RF5_1} | {RF6_10} |
| | RF6_10 | 10. Monitoring | {Effective; Average; Ineffective} | {RF6_8, RF6_9} | {RF6_7, R6} |


Table 2
Information of risk nodes in BN.

| ID | Risk nodes | State space | Parent nodes | Children nodes |
|---|---|---|---|---|
| R1 | 1. Physical and environment security risk | {High; Medium; Low} | {RF1_4, RF1_8} | / |
| R2 | 2. Network security risk | {High; Medium; Low} | {RF2_3, RF2_5, RF2_6} | / |
| R3 | 3. Host/Server computer security risk | {High; Medium; Low} | {RF1_8, RF2_3, RF3_2, RF3_3, RF3_4} | / |
| R4 | 4. Application security risk | {High; Medium; Low} | {RF4_1, RF4_2, RF4_3, RF6_7} | / |
| R5 | 5. Data security and back-up risk | {High; Medium; Low} | {RF5_1, RF5_2, RF5_3} | / |
| R6 | 6. Communication and operation security risk | {High; Medium; Low} | {RF6_4, RF6_7, RF6_10} | / |

Fig. 4. BN structure of network security risk.

Table 3
CPT of P(RF2_3 | RF2_1, RF2_2).

| RF2_1 | RF2_2 | RF2_3 = Effective | RF2_3 = Average | RF2_3 = Ineffective |
|---|---|---|---|---|
| Effective | Effective | 0.9862 | 0.0138 | 0 |
| Average | Effective | 0.4653 | 0.4026 | 0.1321 |
| Ineffective | Effective | 0.1096 | 0.5592 | 0.3312 |
| Effective | Average | 0.5024 | 0.3943 | 0.1033 |
| Average | Average | 0.0511 | 0.6785 | 0.2704 |
| Ineffective | Average | 0.0095 | 0.2318 | 0.7587 |
| Effective | Ineffective | 0.0802 | 0.6121 | 0.3077 |
| Average | Ineffective | 0 | 0.2865 | 0.7135 |
| Ineffective | Ineffective | 0 | 0.0089 | 0.9911 |

Table 4
CPT of P(RF2_5 | RF2_3, RF2_4).

| RF2_3 | RF2_4 | RF2_5 = Effective | RF2_5 = Average | RF2_5 = Ineffective |
|---|---|---|---|---|
| Effective | Secure | 0.9134 | 0.0866 | 0 |
| Average | Secure | 0.3911 | 0.5101 | 0.0988 |
| Ineffective | Secure | 0.0782 | 0.4231 | 0.4987 |
| Effective | Average | 0.3541 | 0.4886 | 0.1573 |
| Average | Average | 0.0911 | 0.7091 | 0.1998 |
| Ineffective | Average | 0 | 0.2806 | 0.7194 |
| Effective | Insecure | 0.0678 | 0.5996 | 0.3326 |
| Average | Insecure | 0.0033 | 0.2007 | 0.7960 |
| Ineffective | Insecure | 0 | 0 | 1.0000 |

Based on the procedure of the ACO-based algorithm presented in Section 3.1, the BN structure was developed. Table 1 lists the security risk factor nodes, i.e., the causes that lead to risks, and Table 2 lists the security risk nodes, which security risk managers ultimately hope to predict. Taking the network security risk as an example, the BN structure of network security risk is shown in Fig. 4, and the CPTs of the nodes are shown in Tables 3–5. In the above tables and figure, the IDs of BN nodes are explained in Tables 1 and 2.



Table 5
CPT of P(R2 | RF2_3, RF2_5, RF2_6).

| RF2_3 | RF2_5 | RF2_6 | R2 = High | R2 = Medium | R2 = Low |
|---|---|---|---|---|---|
| Effective | Effective | Comprehensive | 0.0102 | 0.1703 | 0.8195 |
| Average | Effective | Comprehensive | 0.1908 | 0.3171 | 0.4921 |
| Ineffective | Effective | Comprehensive | 0.5001 | 0.2988 | 0.2011 |
| Effective | Average | Comprehensive | 0.2103 | 0.3502 | 0.4395 |
| Average | Average | Comprehensive | 0.2687 | 0.4333 | 0.2980 |
| Ineffective | Average | Comprehensive | 0.5915 | 0.2959 | 0.1126 |
| Effective | Ineffective | Comprehensive | 0.5456 | 0.2845 | 0.1699 |
| Average | Ineffective | Comprehensive | 0.6007 | 0.2504 | 0.1489 |
| Ineffective | Ineffective | Comprehensive | 0.7301 | 0.1787 | 0.0912 |
| Effective | Effective | Incomprehensive | 0.4980 | 0.3501 | 0.1519 |
| Average | Effective | Incomprehensive | 0.5517 | 0.3448 | 0.1035 |
| Ineffective | Effective | Incomprehensive | 0.6375 | 0.2508 | 0.1117 |
| Effective | Average | Incomprehensive | 0.5709 | 0.3303 | 0.0988 |
| Average | Average | Incomprehensive | 0.7011 | 0.2256 | 0.0733 |
| Ineffective | Average | Incomprehensive | 0.7978 | 0.1525 | 0.0497 |
| Effective | Ineffective | Incomprehensive | 0.7013 | 0.2848 | 0.0139 |
| Average | Ineffective | Incomprehensive | 0.8392 | 0.1608 | 0 |
| Ineffective | Ineffective | Incomprehensive | 0.9806 | 0.0194 | 0 |

Table 6
State specification of risk nodes in BN.
Risk nodes

State specification
Low

R1. Physical and environment security


risk
R2. Network security risk
R3. Host/Server computer security risk
R4. Application security risk
R5. Data security and back-up risk
R6. Communication and operation
security risk

(1) (2)
(12)
(1) (2)
(10)
(1) (3)
(1) (2)
(1) (4)
(1) (2)
(13)

(5) (9)
(6) (8)
(5) (7)
(5)
(10) (12)

Medium

High

(1) (2) (3) (5) (7) (9) (10) (12)


(13)
(1) (2) (3) (6) (7) (8) (9) (10)

(1) (2) (3) (4) (5) (6) (7) (8) (9) (10) (11) (12) (13)

(1) (2) (3)


(1) (2) (4)
(1) (2) (4)
(1) (2) (4)
(13) (15)

(1) (2) (3) (4) (5) (6) (7) (8)


(1) (2) (3) (4) (5) (6) (7)
(1) (2) (3) (4) (5) (6)
(1) (2) (3) (4) (5) (6) (7) (8) (9) (10) (11) (12) (13)
(14) (15)

(5) (7) (8)


(5) (6)
(5)
(6) (8) (9) (10) (12)

(1) (2) (3) (4) (5) (6) (7) (8) (9) (10) (11)

Table 7
The evidence obtained from real time database.

| Notation | Evidence | Related nodes |
|---|---|---|
| E1 | There is no regular equipment maintenance | RF1_7 Equipment maintenance |
| E2 | Several internal computer connections breach the access control policy | RF2_2 Network routing control; RF2_3 Network connection control |
| E3 | Several manual external tests are performed and do not result in unauthorized access | RF2_4 User authentication for external connections |
| E4 | During the past two months several suspicious incidents related to network have not been logged | RF2_6 Network security audit |
| E5 | There are a few other processes aside from normal browser caching, which store, alter or copy information | RF3_2 Host/Server access control |
| E6 | No restrictions on connection times to provide additional security for high-risk applications | RF4_1 Application access control |
| E7 | There is no security policy document that details the procedure of changes to systems | RF6_2 Change management |
| E8 | Several requests for data are not channeled through a DBA who then requests from operation staff | RF6_3 Segregation of duties |

According to the security controls related to each security risk in Appendix B, domain experts and security managers of AB
Company were interviewed to specify the states of the risk nodes in BN in Table 6.

4.2.3. Step 3: security risk assessment


From August 2009 to October 2009, new evidence was obtained from the real-time database, as shown in Table 7, which gives updated information about each observable node in the BN as inference evidence. Based on the principle presented in Section 3.2, we compute the posterior probabilities of the nodes in the BN given this evidence.
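As a small illustration of this kind of probability updating, the sketch below works on the fragment RF2_3, RF2_4 -> RF2_5 using the CPT rows of Table 4. It is not the inference engine used in the paper: the prior over RF2_3 is a hypothetical assumption (the paper does not report it), and fixing RF2_4 = Secure is an illustrative choice of evidence. Since RF2_3 and RF2_4 share no parents in Table 1, they are treated as independent before RF2_5 is observed.

```python
STATES = ("Effective", "Average", "Ineffective")

# Rows of Table 4 with RF2_4 = Secure: P(RF2_5 | RF2_3, RF2_4 = Secure).
cpt_rf2_5_secure = {
    "Effective": (0.9134, 0.0866, 0.0),
    "Average": (0.3911, 0.5101, 0.0988),
    "Ineffective": (0.0782, 0.4231, 0.4987),
}

# Assumed prior over RF2_3 (hypothetical, for illustration only).
prior_rf2_3 = {"Effective": 0.5, "Average": 0.3, "Ineffective": 0.2}

# Predictive marginal of RF2_5 given the evidence RF2_4 = Secure.
marginal_rf2_5 = {
    state: sum(prior_rf2_3[s] * cpt_rf2_5_secure[s][k] for s in STATES)
    for k, state in enumerate(STATES)
}


def posterior_rf2_3(observed_rf2_5: str) -> dict:
    """Bayes' rule: P(RF2_3 | RF2_5 = observed, RF2_4 = Secure)."""
    k = STATES.index(observed_rf2_5)
    joint = {s: prior_rf2_3[s] * cpt_rf2_5_secure[s][k] for s in STATES}
    total = sum(joint.values())
    return {s: joint[s] / total for s in STATES}


print(marginal_rf2_5)
print(posterior_rf2_3("Ineffective"))
```

The first result propagates evidence forward to a child node; the second updates a parent after its child is observed. A full BN engine performs the same two operations over all nodes at once.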
For the risk nodes in the BN, the probabilities of risk occurrence and the severities of risk consequence estimated by security risk assessment are shown in Table 8, from which the probabilities of R2 (Network security risk) and R6 (Communication and operation security risk) are both higher than the threshold of 0.5 set by experts in advance. Thus, for the risk nodes R2 and R6 in the BN, the vulnerability propagation should be further analyzed.

Table 8
The probabilities of risk occurrence and the severities of risk consequence.

| Risk nodes | State | Probability | Severity |
|---|---|---|---|
| R1. Physical and environment security risk | High | 0.2379 | 0.9 |
| | Medium | 0.5233 | |
| | Low | 0.2388 | |
| R2. Network security risk | High | 0.6732 | 0.7 |
| | Medium | 0.2217 | |
| | Low | 0.1051 | |
| R3. Host/Server computer security risk | High | 0.2043 | 0.7 |
| | Medium | 0.5945 | |
| | Low | 0.2012 | |
| R4. Application security risk | High | 0.3512 | 0.6 |
| | Medium | 0.4920 | |
| | Low | 0.1568 | |
| R5. Data security and back-up risk | High | 0.1878 | 0.7 |
| | Medium | 0.4029 | |
| | Low | 0.4093 | |
| R6. Communication and operation security risk | High | 0.5930 | 0.7 |
| | Medium | 0.2856 | |
| | Low | 0.1214 | |

Table 9
The result of vulnerability propagation analysis.

| No | Propagation paths | Occurrence probability | Consequence severity | Path risk exposure |
|---|---|---|---|---|
| 1 | ⟨E2, RF2_3, R2⟩ | 0.1603 | 0.7 | 0.1122 |
| 2 | ⟨E3, RF2_4, RF2_5, R2⟩ | 0.4101 | 0.7 | 0.2871 |
| 3 | ⟨E4, RF2_6, R2⟩ | 0.0947 | 0.7 | 0.0663 |
| 4 | ⟨E4, RF2_6, RF6_8, RF6_10, R6⟩ | 0.0353 | 0.7 | 0.0247 |
| 5 | ⟨E7, RF6_2, RF6_4, R6⟩ | 0.3474 | 0.7 | 0.2432 |
| 6 | ⟨E8, RF6_3, RF6_4, R6⟩ | 0.1442 | 0.7 | 0.1009 |

Table 10
Comparison between SRAM and other approaches.

| | SRAM | Evidential reasoning | BBNRM | Fuzzy risk analysis |
|---|---|---|---|---|
| Propagation analysis | Yes | No | No | No |
| Learning capability | Yes | No | No | No |
| Probability updating | Yes | No | Yes | No |
| Structure support | Yes | Yes | Yes | No |
| Uncertainty processing | No | Yes | No | Yes |
| Tool support | Hugin expert | NA | BNT | NA |
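The threshold test described above can be reproduced directly from the High-state probabilities in Table 8. The following is a minimal sketch; the variable names are illustrative, and only the probabilities and the 0.5 threshold come from the text.

```python
THRESHOLD = 0.5  # set by experts in advance, as stated in the text

# P(risk node = High) from Table 8.
p_high = {
    "R1": 0.2379,
    "R2": 0.6732,
    "R3": 0.2043,
    "R4": 0.3512,
    "R5": 0.1878,
    "R6": 0.5930,
}

# Risk nodes whose High-state probability exceeds the threshold.
flagged = [node for node, p in p_high.items() if p > THRESHOLD]
print(flagged)
```

Only R2 and R6 exceed the threshold, which is why the propagation analysis in the next step is restricted to paths ending in these two nodes.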

4.2.4. Step 4: vulnerability propagation analysis


According to the parameters mentioned in Section 4.2.2, Algorithm 2 is performed by combining the occurrence probability and consequence severity of all nodes along the vulnerability propagation path. Vulnerability propagation paths with their occurrence probabilities and path risk exposures are shown in Table 9, where path risk exposure = (occurrence probability) × (consequence severity) and the notation is explained in Tables 1, 2 and 7. As shown in Table 9, both the occurrence probability and the path risk exposure of paths No. 2 and No. 5 are significantly larger than those of the other paths.
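The path risk exposure column of Table 9 follows from the product rule stated above. The sketch below recomputes it from the reported occurrence probabilities and ranks the paths; it does not implement Algorithm 2 itself, and the data structure is an illustrative assumption.

```python
SEVERITY = 0.7  # consequence severity reported for R2 and R6

# (path number, nodes on the path, occurrence probability) from Table 9.
paths = [
    (1, ["E2", "RF2_3", "R2"], 0.1603),
    (2, ["E3", "RF2_4", "RF2_5", "R2"], 0.4101),
    (3, ["E4", "RF2_6", "R2"], 0.0947),
    (4, ["E4", "RF2_6", "RF6_8", "RF6_10", "R6"], 0.0353),
    (5, ["E7", "RF6_2", "RF6_4", "R6"], 0.3474),
    (6, ["E8", "RF6_3", "RF6_4", "R6"], 0.1442),
]

# Path risk exposure = occurrence probability x consequence severity.
exposure = {no: round(p * SEVERITY, 4) for no, _, p in paths}

# Rank paths from the largest to the smallest exposure.
ranking = sorted(exposure, key=exposure.get, reverse=True)
top_two = ranking[:2]

for no in ranking:
    print(no, exposure[no])
```

Paths No. 2 and No. 5 come out on top, matching the observation in the text that they dominate both in occurrence probability and in risk exposure.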
Based on the result of vulnerability propagation analysis, the corresponding security risk treatment plans are developed
as follows:
(1) Analyze the unauthorized access-related information based on network logs.
(2) Improve the mechanism of authentication.
(3) Create the procedure for change management.


According to the above risk treatment plans, the security engineer checked the information systems and found that they were under an IP address attack due to the authentication vulnerability. To defend against the vulnerability, the engineer adopted challenge-response authentication, in which one party presents a question (challenge) and another party must provide a valid answer (response) to be authenticated, to authenticate external connections. Furthermore, the procedure for change management was created to ensure that changes to information processing facilities and systems would be properly controlled. After these activities, we performed the security risk assessment again and found that no risk alerts were generated by SRAM. In other words, all the occurrence probabilities of the risk nodes in the BN were lower than the threshold set by experts in advance. This supports the validity of the results given by the proposed SRAM on real data.
5. Discussion
In this section, we compare our security risk analysis model (SRAM) with some related approaches and discuss the limitations of our research. Table 10 shows the comparison between SRAM and three other approaches, namely Evidential reasoning [17], BBNRM [16], and Fuzzy risk analysis [9], where NA means that the information was not found in the related references.
The first issue is the capability of vulnerability propagation analysis. In information systems, a single system vulnerability could propagate and escalate through the causal chains of security risk factors via multiple paths, leading to different security consequences, which may further cause security problems in adjacent systems and eventually lead to catastrophic accidents. Thus, in order to reduce the loss caused by security risk consequences, vulnerability propagation is the main issue in security risk analysis. One of the key innovations of SRAM is its ability to analyze vulnerability propagation, which can help establish proactive security risk management plans for information systems. The other three approaches have not been applied to vulnerability propagation analysis. In SRAM, based on the ant colony algorithm, we propose an algorithm for vulnerability propagation analysis that integrates the risk occurrence probability and the severity of each node in the BN.
The second issue is the learning capability, which refers to the ability to induce a representative model for the observed information systems from the database of observed cases. In Table 10, all approaches except SRAM assume that the structure of the model is provided based on domain expert experience and knowledge; thus the results of security risk analysis are relatively subjective. In SRAM, integrating the database of observed cases with expert experience, an ACO-based algorithm is developed to learn the model structure that fits the observed information systems.
The third issue concerns probability updating and structure support. In risk analysis, probability updating can be defined as the task of computing all posterior marginals of non-evidence variables given the evidence [5]. For SRAM and BBNRM, whenever new evidence becomes available in the process of security risk assessment, it is entered into the model to update previous probabilistic estimates. Neither Fuzzy risk analysis nor Evidential reasoning is able to perform probability updating. As to structure support, apart from Fuzzy risk analysis, the other three approaches shown in Table 10 provide a rigorous, structured manner to incorporate relevant security risk factors, related countermeasures, and their interrelationships when estimating security risk in information systems.
The fourth issue concerns uncertainty processing and tool support. Both Fuzzy risk analysis and Evidential reasoning are good at processing ambiguous information by simulating how humans make judgments. However, SRAM could be extended to deal with uncertain evidence by introducing fuzzy sets into our model. Supporting tools for Evidential reasoning and Fuzzy risk analysis have not been found. In contrast, Hugin expert for SRAM and BNT for BBNRM are powerful tools widely used in risk management for information systems.
In practice, security risk analysis is quite complex and full of uncertainty [2]. The uncertainty that exists in the process of risk analysis is an important factor that influences the effectiveness of risk analysis. Therefore, the handling of uncertainty is an important future research topic. More specifically, a process of testing evidential consistency will need to be defined to reduce the uncertainty derived from conflicts of evidence. For instance, if an item of evidence is supported by other items of evidence, then it has a higher credibility and we assign it a higher weight in evidence combination; in contrast, if an item of evidence is in conflict with other items of evidence, then its credibility and weight should be decreased. In addition, future research effort will also focus on applying the proposed SRAM to other practical situations and incorporating more sophisticated constraints to enhance the handling of more complex security risk analysis problems.
6. Conclusions
The security risk analysis for information systems is a very critical challenge. In order to identify the causal relationships
among risk factors and address the complexity and uncertainty of vulnerability propagation, a security risk analysis model
(SRAM) is proposed in this paper using Bayesian networks (BNs) and ant colony optimization (ACO).
Integrating the database of observed cases with expert experience and knowledge, a BN is developed to represent the factors relevant to security risk assessment and their causal mechanisms. Furthermore, in SRAM, security risk assessment is performed to calculate the occurrence probabilities and the consequence severities of security risks. Then, based on the BN and the results of risk assessment, the vulnerability propagation paths are calculated by an ACO-based algorithm in SRAM, which provides guidance for developing security risk treatment plans. Finally, the effectiveness of SRAM is demonstrated through a case study, which indicates that SRAM is able to improve the accuracy and efficiency of security risk management for information systems.


Acknowledgements
The research was supported by the National Natural Science Foundation of China (Nos. 70901054, 71271149, and
71110107042) and the National Science Fund for Distinguished Young Scholars of China (No. 70925005). It was also
supported by the Program for Changjiang Scholars and Innovative Research Teams in Universities of China (PCSIRT)
and the China Postdoctoral Science Foundation funded project (No. 2012M520025). The authors are very grateful to
all anonymous reviewers whose invaluable comments and suggestions substantially helped improve the quality of
the paper.

Appendix A
To represent the causal relationships among security risk factors, an ACO-based algorithm shown in Fig. A1 is developed
to learn the BN structure based on reference [11].

Fig. A1. The ACO-based algorithm for learning the BN structure.



Table B1
Security controls for the risk nodes in BN.
Node
ID

Security controls

R1

(1) The organization develops, disseminates, and updates documented procedures to facilitate the implementation of the physical and
environmental protection policy and associated physical and environmental protection controls
(2) The organization monitors physical access to the information system to detect and respond to physical security incidents
(3) The organization monitors real-time physical intrusion alarms and surveillance equipment
(4) The organization employs automated mechanisms to recognize potential intrusions and initiate designated response actions
(5) The organization maintains visitor access records to the facility where the information system resides
(6) The organization employs automated mechanisms to facilitate the maintenance and review of access records
(7) The organization protects power equipment and power cabling for the information system from damage and destruction
(8) The organization employs redundant and parallel power cabling paths
(9) The organization employs and maintains fire suppression and detection devices/systems for the information system that are supported
by an independent energy source
(10) The organization employs fire detection devices/systems for the information system that activate automatically and notify the
organization and emergency responders in the event of a fire
(11) The organization employs an automatic fire suppression capability for the information system when the facility is not staffed on a
continuous basis
(12) The organization protects the information system from damage resulting from water leakage by providing master shutoff valves that
are accessible, working properly, and known to key personnel
(13) The organization positions information system components within the facility to minimize potential damage from physical and
environmental hazards and to minimize the opportunity for unauthorized access

R2

(1) The organization develops, disseminates, and updates documented procedures to facilitate the implementation of the network access
control policy and associated access controls
(2) The organization monitors for unauthorized remote access to the information system
(3) The organization employs automated mechanisms to facilitate the monitoring and control of remote access methods
(4) The information system routes all remote accesses through a limited number of managed access control points
(5) The organization ensures that remote sessions are audited
(6) The organization establishes usage restrictions and implementation guidance for wireless access
(7) The information system protects wireless access to the system using authentication and encryption
(8) The organization establishes usage restrictions and implementation guidance for organization-controlled mobile devices
(9) The organization restricts the use of writable, removable media in the information systems
(10) The organization reviews and analyzes network audit records for indications of inappropriate or unusual activity
(11) The organization analyzes and correlates audit records across different repositories to gain organization-wide situational awareness

R3

(1) The organization identifies specific user actions that can be performed on the host/server computer without identification or
authentication
(2) The organization permits actions to be performed without identification and authentication only to the extent necessary to accomplish
mission/business objectives
(3) The host/server uses multifactor authentication for network access to privileged accounts
(4) The organization allows the use of group authenticators only when used in conjunction with an individual/unique authenticator
(5) The organization reviews and updates the list of auditable events
(6) The organization includes execution of privileged functions in the list of events to be audited by the host/server computer
(7) The organization implements an incident handling capability for security incidents that includes preparation, detection and analysis,
containment, and recovery
(8) The organization employs automated mechanisms to support the incident handling process

R4

(1) The organization develops, disseminates, and updates documented procedures to facilitate the implementation of the application access
control policy and associated access controls
(2) The organization manages the application using a development life cycle methodology that includes application security considerations
(3) The organization obtains, protects as required, and makes available to authorized personnel, vendor or manufacturer documentation
that describes the security-relevant external interfaces to the application
(4) The organization enforces explicit rules governing the installation of software by users
(5) The organization requires that providers of external information system services comply with organizational information security
requirements
(6) The organization monitors security control compliance by external service providers
(7) The organization conducts an organizational assessment of risk prior to the acquisition or outsourcing of dedicated information security
services

R5

(1) The organization develops, disseminates, and updates documented data secrecy and integrity policy that addresses purpose, scope, roles,
responsibilities, management commitment, and compliance
(2) Data stored on the network attached storage appliance will be regularly backed up
(3) Backups will be verified periodically
(4) In the event of a system failure or user error, on-site backed up data will be made available to users within five working days
(5) In the event of a system failure or user error, on-site backed up data will be made available to users within three working days
(6) In the event of a system failure or user error, on-site backed up data will be made available to users within one working day

R6

(1) The organization develops, disseminates, and updates a documented communications and operation security policy that addresses
purpose, scope, roles, responsibilities, management commitment, and compliance
(2) The organization approves configuration-controlled changes to the system with explicit consideration for security impact analyses
(3) The organization tests, validates, and documents changes to the information system before implementing the changes on the
operational system
(4) The organization defines, documents, approves, and enforces physical and logical access restrictions associated with changes to the
information system
(5) The organization employs automated mechanisms to enforce access restrictions and support auditing of the enforcement actions
(6) The information system protects the integrity of transmitted information
(7) The organization employs cryptographic mechanisms to recognize changes to information during transmission unless otherwise
protected by alternative physical measures
(8) The information system protects the confidentiality of transmitted information
(9) The organization issues public key certificates under a certificate policy or obtains public key certificates under an appropriate certificate
policy
(10) The information system protects log information from unauthorized access, modification, and deletion
(11) The information system uses cryptographic mechanisms to protect the integrity of log information
(12) The information system produces audit records that contain sufficient information to, at a minimum, establish what type of event
occurred, when the event occurred, and the source of the event
(13) The organization establishes a continuous monitoring strategy and implements a continuous monitoring program
(14) The organization employs an independent assessor or assessment team to monitor the security controls in the information system on
an ongoing basis
(15) The information system monitors inbound and outbound communications for unusual or unauthorized activities or conditions

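The equations appearing in the algorithm are as follows:

(1) Heuristic information:

$$\eta_{ij} = f(x_i, Pa(x_i) \cup \{x_j\}) - f(x_i, Pa(x_i)).$$

(2) Pheromone updating rule:

$$\tau_{ij} \leftarrow (1 - \rho)\,\tau_{ij} + \rho\,\Delta\tau_{ij}, \qquad (9)$$

where

$$\Delta\tau_{ij} = \begin{cases} 1/|f(G : D)| & \text{if } x_j \to x_i \in G \\ \tau_{ij} & \text{if } x_j \to x_i \notin G \end{cases}$$

$\tau_{ij}$ is the level of pheromone in the arc $x_j \to x_i$, $\rho$ $(0 < \rho \le 1)$ is a parameter that controls the pheromone evaporation, and $G$ is the best graph found so far.

(3) Probabilistic transition rule:

Select $x_r \to x_l$ such that

$$(r, l) = \begin{cases} \arg\max_{(i,j) \in F_G} \{\tau_{ij}^{\alpha} \cdot \eta_{ij}^{\beta}\} & \text{if } q \le q_0 \\ (I, J) & \text{if } q > q_0 \end{cases}$$

where $I$, $J$ are two nodes randomly selected according to the following probabilities:

$$p_k(i, j) = \begin{cases} \dfrac{\tau_{ij}^{\alpha}\,\eta_{ij}^{\beta}}{\sum_{(u,v) \in F_G} \tau_{uv}^{\alpha}\,\eta_{uv}^{\beta}} & \text{if } (i, j) \in F_G \\ 0 & \text{otherwise} \end{cases} \qquad (10)$$

The set $F_G$ contains all the arcs which are still candidates for insertion in $G$ (i.e., they do not belong to $G$, their inclusion in $G$ does not create a directed cycle, and $\eta_{ij} > 0$).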
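The pheromone updating rule and the probabilistic transition rule above can be sketched in a few lines of Python. This is an illustrative reading of the two rules, not the algorithm of Fig. A1: the function names, the dictionary representation of $\tau$ and $\eta$, and the toy values are assumptions, and the scoring function $f$ and the acyclicity check that define $F_G$ are taken as given.

```python
import random
from typing import Dict, List, Set, Tuple

Arc = Tuple[int, int]  # (i, j) denotes the arc x_j -> x_i


def select_arc(tau: Dict[Arc, float], eta: Dict[Arc, float],
               candidates: List[Arc], alpha: float, beta: float,
               q0: float, rng: random.Random) -> Arc:
    """Transition rule: exploit the best arc if q <= q0, else sample by p_k."""
    weights = [(tau[a] ** alpha) * (eta[a] ** beta) for a in candidates]
    if rng.random() <= q0:
        best = max(range(len(candidates)), key=lambda k: weights[k])
        return candidates[best]
    # Sampling with these weights is equivalent to using the normalized p_k.
    return rng.choices(candidates, weights=weights, k=1)[0]


def update_pheromone(tau: Dict[Arc, float], best_arcs: Set[Arc],
                     best_score: float, rho: float) -> None:
    """Update rule: arcs of the best graph move toward 1/|f(G : D)|."""
    deposit = 1.0 / abs(best_score)
    for arc, level in tau.items():
        delta = deposit if arc in best_arcs else level
        tau[arc] = (1.0 - rho) * level + rho * delta


# Toy values (hypothetical): three candidate arcs, tau_0 = 1 on all of them.
tau = {(0, 1): 1.0, (1, 0): 1.0, (2, 0): 1.0}
eta = {(0, 1): 2.0, (1, 0): 0.5, (2, 0): 1.0}
update_pheromone(tau, best_arcs={(0, 1)}, best_score=-20.0, rho=0.25)
chosen = select_arc(tau, eta, list(tau), alpha=1, beta=1, q0=1.0,
                    rng=random.Random(0))
print(tau, chosen)
```

Note that, under the stated definition of $\Delta\tau_{ij}$, arcs outside the best graph keep their pheromone level unchanged, since $(1-\rho)\tau_{ij} + \rho\tau_{ij} = \tau_{ij}$.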
Appendix B
For each security risk, the security controls are listed in Table B1 according to NIST special publication 800-53, which
provides a recommendation for security controls for the information systems.
References
[1] C. Alberts, A. Dorofee, Managing Information Security Risks: The OCTAVE Approach, Pearson Education, Inc., Upper Saddle River, New Jersey, 2002.
[2] S. Alter, S. Sherer, A general, but readily adaptable model of information system risk, Communications of the AIS 14 (1) (2004) 1–28.
[3] M. An, Y. Chen, C.J. Baker, A fuzzy reasoning and fuzzy-analytical hierarchy process based approach to the process of railway risk information: a railway risk management system, Information Sciences 181 (18) (2011) 3946–3966.
[4] C. Blum, A. Roli, Metaheuristics in combinatorial optimization: overview and conceptual comparison, ACM Computing Surveys 35 (3) (2003) 268–308.
[5] C. Butz, S. Hua, J. Chen, H. Yao, A simple graphical approach for understanding probabilistic inference in Bayesian networks, Information Sciences 179 (6) (2009) 699–716.
[6] G. Büyüközkan, D. Ruan, Choquet integral based aggregation approach to software development risk assessment, Information Sciences 180 (3) (2010) 441–451.
[7] K. Campbell, L.A. Gordon, M.P. Loeb, L. Zhou, The economic cost of publicly announced information security breaches: empirical evidence from the stock market, Journal of Computer Security 17 (3) (2009) 431–448.
[8] H. Cavusoglu, B. Mishra, S. Raghunathan, The effect of Internet security breach announcements on market value: capital market reactions for breached firms and Internet security developers, International Journal of Electronic Commerce 14 (3) (2009) 69–104.
[9] S.J. Chen, S.M. Chen, Fuzzy risk analysis based on similarity measures of generalized fuzzy numbers, IEEE Transactions on Fuzzy Systems 11 (1) (2003) 45–56.
[10] G.F. Cooper, E.A. Herskovits, A Bayesian method for the induction of probabilistic networks from data, Machine Learning 9 (1992) 309–347.
[11] L.M. de Campos, J.M. Fernández-Luna, J.A. Gámez, J.M. Puerta, Ant colony optimization for learning Bayesian networks, International Journal of Approximate Reasoning 31 (3) (2002) 291–311.
[12] M. Dorigo, G.D. Caro, The ant colony optimization meta-heuristic, in: D. Corne, M. Dorigo, F. Glover (Eds.), New Ideas in Optimization, McGraw-Hill, 1999, pp. 11–33.
[13] M. Dorigo, V. Maniezzo, A. Colorni, Ant system: optimization by a colony of cooperating agents, IEEE Transactions on Systems, Man, and Cybernetics, Part B 26 (1) (1996) 29–41.
[14] M. Dorigo, T. Stützle, Ant Colony Optimization, MIT Press, Cambridge, MA, 2004.
[15] J.L. Douglas, The Security Risk Assessment Handbook: A Complete Guide for Performing Security Risk Assessments, Auerbach Publications, 2006.
[16] C. Fan, Y. Yu, BBN-based software project risk management, Journal of Systems and Software 73 (2) (2004) 193203.
[17] N. Feng, M. Li, An information systems security risk assessment model under uncertain environment, Applied Soft Computing 11 (7) (2011) 4332
4340.
[18] J.A. Gmez, J.M. Puerta, Searching the best elimination sequence in Bayesian networks by using ant-colony optimization, Pattern Recognition Letters 23
(13) (2002) 261277.
[19] L.A. Gordon, M.P. Loeb, The economics of information security investment, ACM Transactions on Information and System Security 5 (4) (2002) 438457.
[20] L.A. Gordon, M.P. Loeb, W. Lucyshyn, CSI/FBI Computer Crime and Security Survey, Computer Security Institute, San Francisco, 2010.
[21] L. Grunske, D. Joyce, Quantitative risk-based security prediction for component-based systems with explicitly modeled attack proles, Journal of
Systems and Software 81 (8) (2008) 13271345.
[22] S.A. Hartmann, T.A. Runkler, Online optimization of a color sorting assembly buffer using ant colony optimization, in: Proceedings of Operations
Research, 2007, pp. 415420.
[23] D. Heckerman, A Tutorial on Learning Bayesian Networks, Technical Report, Microsoft Research, Redmond Washington, 1996.
[24] D. Heckerman, D. Geiger, D.M. Chickering, Learning BNs: the combination of knowledge and statistical data, Machine Learning 20 (3) (1995) 197243.
[25] D. Heckerman, M.P. Wellman, Bayesian networks, Communications of ACM 38 (3) (1995) 2330.
[26] F.V. Jensen, An Introduction to Bayesian Networks, Springer-Verlag, New York, 1996.
[27] M.I. Jordan, Learning in Graphical Models, MIT Press, Cambridge, MA, 1999.
[28] B. Karabacak, I. Sogukpinar, ISRAM: information security risk analysis method, Computers & Security 24 (2) (2005) 147159.
[29] J. Kennedy, R. Eberhart, Swarm Intelligence, Morgan Kaufmann, San Mateo, CA, 2001.
[30] U.B. Kjaerulff, A.L. Madsen, Bayesian Networks and Inuence Diagrams: A Guide to Construction and Analysis, Springer Science, New York, 2008.
[31] K. Kojima, E. Perrier, S. Imoto, Optimal search on clustered structural constraint for learning Bayesian network structure, Journal of Machine Learning
Research 11 (2010) 285310.
[32] W. Lam, F. Bacchus, Learning Bayesian belief networks: an approach based on the MDL principle, Computational Intelligence 20 (2004) 269293.
[33] P. Larraaga, M. Poza, Y. Yurramendi, R.H. Murga, C. Kuijpers, Structure learning of Bayesian networks by genetic algorithms: a performance analysis of
control parameters, IEEE Transactions on Pattern Analysis and Machine Intelligence 18 (2002) 912926.
[34] J. Li, M. Li, D. Wu, H. Song, An integrated risk measurement and optimization model for trustworthy software process management, Information
Sciences 191 (15) (2012) 4760.
[35] T. Peltier, Information Security Risk Analysis, second ed., Auerbach Publications, Boca Raton, FL, 2007.
[36] P. Pinto, T.A. Runkler, J.M.C. Sousa, Ant colony optimization and its application to regular and dynamic MAX-SAT problems, in: Advances in Biologically
Inspired Information Systems: Models, Methods, and Tools, Springer-Verlag, Berlin, Germany, 2007, pp. 283302.
[37] C.R. Reeves, Modern Heuristic Techniques for Combinatorial Problems, Blackwell Scientic Publications, Oxford, 1995.
[38] H. Salmela, Analysing business losses caused by information systems risk: a business process analysis approach, Journal of Information Technology 23
(3) (2008) 185202.
[39] C.A. Silva, J.M.C. Sousa, T.A. Runkler, Distributed optimization of logistic systems and its suppliers using ant colony optimization, International Journal
of Systems Science 37 (8) (2006) 503512.
[40] C.A. Silva, J.M.C. Sousa, T.A. Runkler, Distributed supply chain management using ant colony optimization, European Journal of Operational Research
199 (8) (2009) 349358.
[41] A. Summers, W. Vogtmann, S. Smolen, Consistent consequence severity estimation, Process Safety Progress 31 (1) (2012) 916.
[42] I. Tsamardinos, L.E. Brown, C.F. Aliferis, The maxmin hill-climbing Bayesian network structure learning algorithm, Machine Learning 65 (1) (2006)
3178.
[43] L. Sun, R.P. Srivastava, T.J. Mock, An information systems security risk assessment model under the DempsterShafer theory of belief functions, Journal
of Management Information Systems 22 (4) (2006) 109142.
[44] S. Weng, Y. Liu, Mining time series data for segmentation by using ant colony optimization, European Journal of Operational Research 173 (3) (2006)
921937.
[45] D. Wu, D.L. Olson, Enterprise risk management: coping with model risk in a large bank, Journal of the Operational Research Society 61 (2) (2010) 179
190.
[46] D.D. Wu, K. Xie, G. Chen, P. Gui, A risk analysis model in concurrent engineering product development, Risk Analysis 30 (9) (2010) 14401453.
[47] W.T. Yue, M. akanyildirim, Y.U. Ryu, D. Liu, Network externalities, layered protection and IT security risk management, Decision Support Systems 44
(1) (2007) 116.