
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TPWRS.2019.2916025, IEEE Transactions on Power Systems

Toward Modeling Alarm Handling in SCADA System: A Colored Petri Nets Approach
Payam Mahmoudi-Nasr

Abstract-- Alarm handling in a supervisory control and data acquisition (SCADA) system is a critical issue in securing critical infrastructures (CIs). Any error or delay in clearing an alarm may jeopardize the reliability and security of the CI at the national level. This paper analyzes and models SCADA alarm communication management using the unified modeling language and colored timed Petri nets. The proposed alarm-handling model does not refer to a specific CI application; it is based on a general approach in which alarm transactions are integrated with the dispatcher's commands and the substation operator's maintenance. To demonstrate the potential of the proposed alarm-handling model, a real case study in a power system is simulated, and scenarios with different numbers of substations and error rates are analyzed.

Index Terms-- Alarm handling, Colored Petri nets, Insider threat, SCADA.

I. INTRODUCTION

A supervisory control and data acquisition (SCADA) system is a complex and distributed system for real-time monitoring and control of the industrial process in a critical infrastructure (CI). SCADA systems are used in many CI applications such as oil, gas, water, and power systems. In particular, the domain of the SCADA system addresses the integration of field devices with computer systems in the control room and intends to improve the flow of information between the remote terminal units (RTUs) in substations and the human machine interface (HMI) in the control room. The SCADA system plays a vital role in maintaining the reliability of the CI and keeping the industrial processes within normal operating ranges. In other words, the SCADA system improves CI efficiency by using technology for communicating events and alarms from remote substations to the control room and, conversely, the dispatcher's commands from the control room to the remote substation's operators (OPs) and devices.

Alarm handling (AH) plays a key role in the successful operation of a SCADA system [1]. AH is a national-critical issue, since it prevents the CI application from being exposed to serious disturbances and critical statuses. Since finding the root cause of an alarm and deciding how to clear it are the main duties of the dispatcher in a SCADA system, the role of the dispatcher is central and the reliability of the CI is highly dependent on his/her decisions. Furthermore, clearing many alarms requires effective OP cooperation. On one hand, decision making for clearing alarms is stressful even for professional dispatchers and OPs [2]; on the other hand, there is always the possibility of error for SCADA operators. Each intentional or unintentional inappropriate action by a dispatcher/OP is an insider threat to system reliability. Reports confirm that poor AH by an unskilled dispatcher causes accidents in SCADA systems (e.g., the Enbridge Incorporated pipeline rupture on July 25, 2010 [3]). Studies indicate that 30% of all incidents on SCADA systems are caused by the organization's own staff [4]. Unfortunately, real data for analyzing dispatcher/OP threats in SCADA systems are not publicly available for privacy reasons. The objective of this paper is a better understanding of the alarm system and of the dispatcher/OP threat in a SCADA profile, for more effective AH.

The aim of the proposed AH model is to analyze, model and simulate the SCADA system as it relates to the dispatchers in the control room and the maintenance operators in remote substations. First, the unified modeling language (UML) is employed to represent (i) the sequence of events and alarms between SCADA objects, and (ii) the interaction of the dispatcher and OP with the system. Second, the behavior of the SCADA system is modelled by a colored timed Petri net (CTPN) according to the raising, sending, and reception of events/alarms, and the actions of the involved dispatchers and OPs.

The proposed AH model presents a unified CTPN model comprising faulty equipment, RTUs, and the maintenance operator in remote substations; the communication network; and the information systems and dispatcher in the control room. The mathematical functions and programming language facilities in CTPN allow the traffic mean rate of alarms, the dispatcher threat, and the OP threat to be simulated using programming variables. In order to show the effectiveness of the proposed AH model, a real case study is analyzed considering a power system SCADA profile. Moreover, to analyze the response of the proposed CTPN model when an alarm appears and the dispatcher/OP threatens the system, the no-response threat and the delayed threat, which are the most important insider threats in a SCADA system, have been simulated. In the no-response threat, the dispatcher and/or OP do not issue a correct response for clearing alarms; in the delayed threat, the dispatcher and/or OP do not clear alarms in a timely manner. The results demonstrate how the proposed AH model and the performance analysis help in the validation and verification of such a profile.

P. Mahmoudi-Nasr is with the University of Mazandaran, Babolsar 47416-13534, and also with Tarbiat Modares University CERT (APA), Iran (e-mail:
P.Mahmoudi@umz.ac.ir).

0885-8950 (c) 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

This paper is organized as follows. Section 2 presents the related works and contributions. Section 3 describes the proposed AH model. Section 4 provides validation and simulation results. Section 5 gives the conclusions.

II. RELATED WORKS AND CONTRIBUTIONS

In a SCADA system, RTUs collect real-time information from the field devices and transfer it to the control room through a communication network [5]. In the control room, the HMI allows dispatchers to monitor the current and historical state of the processes. When an abnormal system status occurs, the dispatcher is notified by an alarm. For the industrial processes to continue correctly, the dispatcher should clear the alarm accurately and promptly by sending supervisory control commands to (i) the field devices through the RTU, and/or (ii) the OPs to repair faulty equipment. Alarms can also be analyzed more effectively if they are categorized into several types. Different metrics can be used for classifying alarms, such as importance of danger, responsible element, and response time [6]. A historian server (HIS) records all events and alarms, e.g., system status, dispatcher commands, and processed information.

There is a significant body of research on AH in CI applications. Goel et al. provided details on the existing standards and regulations related to industrial alarm systems [1]. Silva et al. proposed a context-aware ontological approach to represent a conceptual model for an alarm system [7]. They analyzed the behavior of an alarm system to make rules for recognizing system situations, in order to provide operational support inside an industrial plant. Zeng et al. presented a Markov-chain method to compute expected delay timers for an alarm [8]. Wang et al. proposed a method to reduce and detect nuisance alarms and developed a delay timer for them [9]. Tan et al. developed a method to calculate the false and lost alarm rates and the expected detection delay for an alarm [10]. Hu et al. proposed a method based on logged events and alarms to detect association rules between them [11]. Yu et al. presented a method to find abnormal data among logged data in a multivariate alarm system [12]. Although these studies are extremely valuable for improving alarm systems, more research is still necessary to analyze the impact of operator performance on the persistence of alarms and on system reliability.

Moreover, the Petri net (PN) and its extensions have been widely employed in many specific areas of CI applications, such as railway monitoring systems [13], health care systems [14], cyber security vulnerability assessment [15], failure identification and monitoring [16], power system restoration [17], reliability and security evaluation in power systems [18], electricity theft detection [19] and distributed network protocol secure authentication [20]. Jian-wei et al. [21] proposed an alarm information processing and diagnostic method based on PNs with timing constraints. They classified the alarms of a power system and presented a method based on the time-constraint PN for recognizing alarm error messages such as misreports, missing messages and timing inconsistencies. Augusto et al. [22] presented a new methodology for modeling organization problems of health care systems. They provided PN-based software for scheduling and activity planning of health care services. They used a meta-model with three different views: a process view, a resource view and an organization view.

Considering the above context, the key contributions of this paper are as follows:
- It proposes a CTPN model for AH in a SCADA system by dispatcher and OP for both trusted and malicious operations.
- It proposes an approach to predict insider threat in CI based on CTPN. It is a pure application of the CTPN in a new area.
- It proposes a CTPN model to generate simulated data in a SCADA system even under insider threat conditions.
- The validation of the proposed model is performed using data from a real dispatching center of a power system.
- The proposed CTPN model can facilitate the analysis of the alarm management workflow to evaluate reliability in CI applications.

III. SCADA ALARM HANDLING: CONCEPT AND MODELING

Normal and abnormal system statuses are monitored on the HMI screen by using events and alarms, respectively. All event messages are in neutral status. When an alarm is raised, it is initially in pending status. The alarm status is changed to acknowledged when the dispatcher confirms that he/she has received the alarm. Subsequently, when the dispatcher performs the required actions and, as a result, the alarm is cleared, the alarm status is changed to cleared [23]. Some alarms may be cleared remotely when the dispatcher sends commands to the RTU, while other alarms are usually cleared by cooperation between the dispatcher and the maintenance operator in the remote substation. Fig. 1 illustrates the use-case diagram of the dispatcher's daily duties. Moreover, Fig. 2 shows the sequence of the alarms by a UML sequence diagram that describes in detail the order in which the interactions among the involved dispatchers and OPs take place.

Fig. 1. Use-case diagram of dispatcher's daily duties. Use cases of the Dispatcher actor, linked by <<include>> relations: system status monitoring; event or alarm acknowledgment; control setting modifications by using RTU; requesting repair or maintenance service; reaction to events/alarms.

In a CI, any wrong or delayed decision for clearing alarms may jeopardize the system reliability. There are many unintentional causes of dispatcher/OP error, such as alarm flooding [24]. Moreover, a malicious dispatcher/OP may launch an insider attack and abuse his/her privileges in order to disrupt an operation in a remote substation. When a malicious dispatcher analyzes an alarm, an insider attack could occur through a wrong decision and a wrong command (e.g., no response, delayed correct response, incorrect/incomplete response). Likewise, a malicious OP may


cause an insider attack when he/she commits an error of commission. Therefore, the probability of clearing an alarm, $P_{clearing}^{alarm} \in [0,1]$, is calculated using the following formula:

$$P_{clearing}^{alarm} = (1 - P_{error}^{dp}) \times (1 - P_{error}^{op}) \tag{1}$$

where $P_{error}^{dp} \in [0,1]$ and $P_{error}^{op} \in [0,1]$ are the probabilities of threat for the dispatcher and the OP, respectively. Fig. 3 represents the interaction of the dispatcher's behaviors and the OP's behaviors.

Fig. 2. Sequence diagram of the messages in SCADA system. Panels: (a) System status monitoring; (b) Message acknowledgement; (c) Remote alarm clearing by dispatcher; (d) Alarm clearing by OP cooperation. Lifelines: Dispatcher, HMI, HIS, RTU and, in panel (d), the substation's operator.

Fig. 3. Interaction of dispatcher's behaviors and OP's behaviors. For any new alarm, the dispatcher issues a wrong command with probability $P_{error}^{dp}$ (dispatcher threat, alarm not cleared) or a right command with probability $1 - P_{error}^{dp}$. After a right command, the OP performs a wrong operation with probability $P_{error}^{op}$ (OP threat, alarm not cleared) or a right operation with probability $1 - P_{error}^{op}$ (alarm cleared).

A. The CTPN model of the SCADA alarm handling

The CTPN is an extended version of PN that has colored tokens with time stamps. The color is equivalent to a type, and time is used for evaluating performance indices. In addition, the CTPN integrates the abilities of PN for process interaction with the abilities of a high-level programming language for the definition of data types and the manipulation of data values. Hence, the CTPN is suitable for modeling complex systems [25].

The proposed CTPN model of AH is presented in Fig. 4. This CTPN model represents the data communication processes in a SCADA system in order to identify and prevent conditions that may cause risks to a CI application. Table I shows the meaning of places. The set of places P is partitioned as $P = S \cup C \cup NET$. The set S collects the substation objects, $S = \{AA_j, RTU_1, RTU_2, OP, CLR, NoCLR\}$, and the set C collects the control room objects, $C = \{HMI_1, HMI_2, HIS\}$. A token in a place $p \in S \cup C$ is a message to be sent or received by the corresponding object. Note that a token in AAj represents a fault agent in a field device for alarm type j. Tokens in RTU1 are alarm messages that should be sent from the remote substation to the control room, and a token in RTU2 is a dispatcher's command set message that should be sent to field devices for updating setting values and clearing the corresponding alarm. Moreover, a token in the place OP represents a dispatcher's command set message that should be executed carefully by the OPs devoted to handling the alarm. A token in place (No)CLR represents a (not) cleared alarm. In addition, for modeling purposes, the place HMI is described by two places: tokens in HMI1 are alarm messages that should be acknowledged, and tokens in HMI2 are alarm messages that should be cleared by a dispatcher. Each token in HIS represents a recorded message. The place set NET = {NET1, NET2} represents the communication network. Each token in NET1 is an alarm message that has been sent through RTU1, and a token in NET2 is a command set message that has been sent through HMI2.

Due to the variety of the types, statuses, activation times, and clearing times of a message, the color of each token in a place $p_a \in \{AA_j, RTU_1, NET_1, CLR, NoCLR\} \cup C$, denoted by MESSAGE, is defined by a 5-tuple comprising the message type (mt), message status (ms), activation time (at), time spent (ts), and the associated substation (sub). Consequently, the color domain of the place $p_a$ is:


$$Co(p_a) = MESSAGE = \{\langle mt_j, ms_n, at, ts, sub_i \rangle\}, \quad ms_n \in \{pending, acknowledged, cleared, neutral\} \tag{2}$$

where i = 1, 2, …, I is the index of substations, j = 1, 2, …, J is the index of message types, and n = 1, 2, 3, 4 is the index of message statuses.

Fig. 4. The proposed CTPN model of AH.

TABLE I
Places of the proposed CTPN model.

| Place | Meaning | Color set | Initial value |
|---|---|---|---|
| AAj | Alarm agent of type j in a field device | MESSAGE | exp(I/Kj) |
| RTU1 | RTU output port | MESSAGE | -- |
| RTU2 | RTU input port | COMMAND | -- |
| NET1 | Communication network | MESSAGE | -- |
| NET2 | Communication network | MESSAGE*COMMAND | -- |
| HMI1 | Messages in HMI screen | MESSAGE | -- |
| HMI2 | Alarms in HMI screen | MESSAGE | -- |
| HIS | Historian server | MESSAGE | -- |
| OP | Substation's operator | MESSAGE*COMMAND | -- |
| CLR | Cleared alarms | MESSAGE | -- |
| NOCLR | Not cleared alarms | MESSAGE | -- |

The color of a token in the place RTU2 is defined by the COMMAND color set, which represents a dispatcher's command set. Besides, the color of each token in a place $p_c \in \{NET_2, OP\}$ is defined by the MESSAGE*COMMAND color set. Therefore, the color domains of the places $p_c$ and RTU2 are:

$$Co(p_c) = MESSAGE \ast COMMAND = \{\langle mt_j, ms_n, at, ts, sub_i, cmd_m \rangle\} \tag{3}$$

$$Co(RTU_2) = COMMAND = \{\langle cmd_m \rangle\}, \quad cmd_m \in \{right\_command, wrong\_command\} \tag{4}$$

where m = 1, 2 is the index of the dispatcher command.

In addition, the transitions $F_j$, $T_q$, where $j \in [1, J]$ and $q \in [1, 7]$, model the message transactions, dispatcher operations, and OP maintenance. Table II shows the meaning of each transition with respect to each token color. The set of the colors of each transition is defined as follows:

$$Co(F_j, T_1, T_2, T_3, T_7) = MESSAGE \tag{5}$$

$$Co(T_4, T_5, T_6) = MESSAGE \ast COMMAND \tag{6}$$

TABLE II
Transitions of the proposed CTPN model.

| Transition | Meaning |
|---|---|
| Fj | Send a new alarm message of type j to RTU1 and generate a new fault agent in a field device FDj. |
| T1 | Transmit an alarm message to network. |
| T2 | Dispatch an alarm message to HMI1, HMI2 and HIS. |
| T3 | 1- Acknowledge an alarm message. 2- Update alarm status. 3- Record updated alarm to HIS. |
| T4 | 1- Send dispatcher's command set to network. 2- Record dispatcher's command set to HIS. |
| T5 | Dispatch dispatcher's command set to RTU2 and OP. |
| T6 | Execute the command set of dispatcher or OP. |
| T7 | Return the not cleared alarm message to the HMI2. |

Moreover, Table III presents the defined arc inscription functions and their meaning. Note that, in general, E(p,t) is an arc inscription function related to an arc from t to p.

TABLE III
Arc inscriptions of the proposed CTPN model.

| Arc inscription | Meaning |
|---|---|
| E(FDj,Fj) | Generate a new fault agent of type j in a substation resource with timestamp @exponential(I/kj). |
| E(RTU1,Fj) | Set alarm properties. |
| E(NET1,T1) | Alarm message. |
| E(HIS,T2) | Event message for received message. |
| E(HMI1,T2) | Alarm message. |
| E(HMI2,T2) | Alarm message. |
| E(HIS,T3) | Event message for acknowledged alarm. |
| E(HIS,T4) | Event message for dispatcher command. |
| E(NET2,T4) | Dispatcher command set associated to the alarm. |
| E(RTU2,T5) | Dispatcher command set. |
| E(OP,T5) | Dispatcher command set to OP. |
| E(CLR,T6) | Cleared alarm. |
| E(NoCLR,T6) | Not cleared alarm. |
| E(HMI2,T7) | Alarm message for not cleared alarm. |

The token color for each E(p,t) that transforms a token color is as follows:

$$E(RTU_1, F_j) = \langle mt_j, pending, at, 0.0, sub_i \rangle \tag{7}$$

$$E(HMI_2, T_2) = \langle mt_j, pending, at, 0.0, sub_i \rangle \tag{8}$$

$$E(HIS, T_3) = \langle mt_j, acknowledged, at, ts, sub_i \rangle \tag{9}$$

$$E(HIS, T_4) = \langle mt_j, neutral, at, 0.0, sub_i \rangle \tag{10}$$

$$E(NET_2, T_4) = E(OP, T_5) = \langle mt_j, ms_n, at, ts, sub_i, cmd_m \rangle \tag{11}$$

$$E(RTU_2, T_5) = \langle cmd_m \rangle \tag{12}$$

$$E(CLR, T_6) = \langle mt_j, cleared, at, ts, sub_i \rangle \tag{13}$$

$$E(NoCLR, T_6) = E(HMI_2, T_7) = \langle mt_j, pending, at, ts, sub_i \rangle \tag{14}$$

where at = current time and ts = updated ts.


The behavior of the CTPN model is described in detail as follows. The places AAj contain faults or disturbances which are generated in remote substation resources for alarm type j. To stay close to real-life behavior, the Poisson distribution should be used for modeling the number of faults or disturbances in AAj. In a SCADA system, since the probability of failure in each substation resource is small and the number of substation resources is very high (because the CI's grid is often large, complex and wide), the Poisson model estimates its alarm traffic reasonably well. Therefore, many studies, e.g., [26], have used the Poisson distribution function for modeling SCADA alarm traffic. Hence, in the proposed CTPN model the places AAj are marked by a Poisson distribution with parameter $k_j/I$ for generating disturbances in substation resources, where kj is the average number of alarm type j in a substation per month. In other words, the interval time between two alarms of type j has an exponential distribution with parameter $I/k_j$. Therefore, the time between two alarms of type j has a mean of $k_j/I$ minutes. In this way, the initial marking of the proposed CTPN is as follows:

$$M_0(AA_j) = (\langle 1 \rangle)@exponential(I/k_j) \tag{15}$$

$$M_0(p) = (\langle 0 \rangle) \tag{16}$$

where $p \in \{RTU_1, RTU_2, OP, CLR, NoCLR\} \cup C \cup NET$, $(\langle 1 \rangle)@$ represents a time-stamped token, and $(\langle 0 \rangle)$ indicates that "no token" is in place p at marking M0.

After a protection device detects a fault or disturbance of type j in a substation resource, a colored token is fired by Fj to RTU1. The function associated with $E(RTU_1, F_j)$ sets the color of the token. Transition T1 fires the colored token, representing an alarm, to NET1. The alarm is (i) logged in HIS, and (ii) sent to HMI1 and HMI2 by transition T2. In the control center, the dispatcher should acknowledge the received alarm in HMI1 as soon as possible. This function is carried out by F(T3) when the transition T3 is fired. The acknowledgment delay, which is represented by the time inscription of T3 (@AckDelay), is added to the ts color of the alarm. When the alarm is acknowledged, an associated message is recorded in HIS by T3. The place HMI2 contains all alarms that should be cleared by the dispatcher. Therefore, the dispatcher could issue a set of right/wrong commands to clear an alarm. Transition T4 stores the issued commands in the HIS and sends them to NET2. The dispatcher's commands are forwarded to RTU2 and the OPs by T5, and are handled by transition T6. If the dispatcher's command and the action of the OP are correct, the alarm is cleared by T6. The total time required for the issuance and execution of commands, which is represented by the time inscription of T6 (@HandleDelay), is added to the ts color of the alarm. The (not) cleared alarm is forwarded to the (No)CLR place. As long as an alarm is not cleared, it should be displayed on the HMI screen. Therefore, after the time spent in T6, the alarm is returned to HMI2 by transition T7. To simulate the dispatcher and OP threats, the parameters $P_{error}^{dp}$ and $P_{error}^{op}$ are used in the functions of transitions T4 and T6, respectively. The parameter $P_{error}^{dp}$ in transition T4 is the probability of the dispatcher issuing wrong commands, and the parameter $P_{error}^{op}$ in transition T6 is the probability of an improper maintenance procedure by the OP.

The simplicity of the proposed model in Fig. 4, which describes the AH in a SCADA system, highlights the main advantages of using CTPNs: (i) the colors represent the different alarm types with their properties in a concise form; (ii) the time stamps associated with tokens allow alarms to be handled in various ways; (iii) CTPNs provide an integrated framework for the simultaneous representation of the SCADA alarm handling profile, dispatcher operation and OP maintenance.

IV. VALIDATION AND SIMULATION RESULTS

The proposed CTPN model is employed to model the AH in a dispatching center of Iran's power transmission network named the Area Operating Center (AOC). This dispatching center is equipped with a SCADA system for monitoring and controlling 31 transmission substations. In the SCADA control rooms of Iran's power system, alarms are categorized into two main types, namely major and minor [6]. The major alarms are those which lead to an emergency without a proper reaction from the corresponding dispatcher and OP, e.g., circuit breaker failure in a power system. The minor alarms are those which should be prevented, e.g., over-current/voltage relay trip. When a minor alarm occurs, the dispatcher has to start a remedial reaction, such as reconfiguration, restoration, re-dispatch, and load shedding, as well as notifying the OPs to clear it.

By analyzing the real data of 10 substations over 3 months, the firing delay ranges for (i) alarm acknowledgement and (ii) alarm clearing, and their distribution functions, are evaluated and reported in Table IV. The statistical analysis of the real data shows that the acknowledgement/clearing delay of alarms is well described by the lognormal distribution. The obtained P-values > 0.05 indicate that the null hypothesis, that the observed data are well fit by the lognormal distribution, is not rejected. Moreover, by analyzing the number of real major and minor alarms, the exponential distribution parameters at AA1 and AA2 are determined as follows: k1 = 14400 minutes, k2 = 86400 minutes, which means the average numbers of major and minor alarms are approximately 9 and 2 for each substation every three months. Furthermore, in the normal state with no insider threat, the parameters $P_{error}^{dp}$ and $P_{error}^{op}$ are considered zero.

TABLE IV
Distribution timing of timed transitions.

| Transition | Message type | Firing delay range (min) | Distribution function (Lognormal(µ,σ)) | P-value |
|---|---|---|---|---|
| T3 | all | [2,36] | lognormal(2.05,0.81) | 0.07 |
| T6 | major | [8,6000] | lognormal(4.59,1.67) | 0.08 |
| T6 | minor | [21,28000] | lognormal(5.12,1.93) | 0.09 |

The proposed CTPN model has been simulated and analyzed using CPN Tools, which provide (i) standard state space reports, covering properties such as reachability, fairness and liveness, for analysis and verification, and (ii) a programming language, CPN ML, for net inscriptions and declarations.

The following performance indices are selected in order to


evaluate the system behavior:
- The average delay time for clearing major alarms (ADTJ);
- The average delay time for clearing minor alarms (ADTI);
- The average maximum delay time for clearing major alarms (MDTJ);
- The average maximum delay time for clearing minor alarms (MDTI);
- The average number of alarms (ANA) that are waiting in the place HMI1.

TABLE V
The real performance indices for major alarms.

| I | ADTJ (min) | MDTJ (min) | ANA |
|---|---|---|---|
| 6 | 227.52 | 2292.42 | 53.76 |
| 8 | 225.89 | 2524.36 | 71.23 |
| 10 | 228.43 | 2815.23 | 83.35 |

TABLE VI
The real performance indices for minor alarms.

| I | ADTI (min) | MDTI (min) | ANA |
|---|---|---|---|
| 6 | 446.68 | 2089.26 | 9.66 |
| 8 | 446.33 | 2437.52 | 12.63 |
| 10 | 438.57 | 2529.19 | 14.81 |

TABLE VII
Single mean test results for ADTJ at 3 months.

| I | t | Sig. | Mean difference | Std. error mean | Real mean | Simulated mean |
|---|---|---|---|---|---|---|
| 6 | 0.364 | 0.717 | 17.63 | 48.43 | 245.15 | 227.52 |
| 8 | 0.714 | 0.478 | 34.63 | 48.51 | 260.52 | 225.89 |
| 10 | 0.384 | 0.702 | 19.02 | 49.60 | 247.45 | 228.43 |

TABLE VIII
Single mean test results for ADTI at 3 months.

| I | t | Sig. | Mean difference | Std. error mean | Real mean | Simulated mean |
|---|---|---|---|---|---|---|
| 6 | 0.85 | 0.41 | 1558.05 | 1827.20 | 2004.73 | 446.68 |
| 8 | -0.51 | 0.62 | -93.26 | 183.17 | 353.07 | 446.33 |
| 10 | -0.47 | 0.65 | -85.50 | 183.17 | 353.07 | 438.57 |

Fig. 5. The average delay/maximum delay for clearing major alarms.

Fig. 6. The average delay/maximum delay for clearing minor alarms.
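The delay indices listed above can be illustrated with a small Monte Carlo sketch in Python. It is not the CPN Tools model of the paper and is not expected to reproduce the tabulated values: it assumes only that alarms of one type arrive with exponential inter-arrival times of mean k/I minutes and that each alarm's clearing delay is an independent lognormal(µ, σ) draw, and it ignores the acknowledgement delay, the firing delay ranges, queueing at the dispatcher and repeated handling of not-cleared alarms. The function names `simulate_clearing_delays` and `estimate_indices` are hypothetical.

```python
import random
from statistics import mean


def simulate_clearing_delays(k_minutes, n_substations, mu, sigma,
                             horizon=129_600, rng=None):
    """Return one clearing delay (minutes) per alarm raised within the horizon.

    Assumption: alarms arrive as a Poisson process with rate I/k per minute,
    i.e. exponential inter-arrival times with mean k/I minutes.
    """
    rng = rng or random.Random()
    rate = n_substations / k_minutes
    now, delays = 0.0, []
    while True:
        now += rng.expovariate(rate)          # time of the next alarm
        if now > horizon:
            break
        delays.append(rng.lognormvariate(mu, sigma))  # handling delay
    return delays


def estimate_indices(k_minutes, n_substations, mu, sigma, runs=200, seed=1):
    """Average, over independent runs, the mean and the maximum clearing delay."""
    rng = random.Random(seed)
    averages, maxima = [], []
    for _ in range(runs):
        delays = simulate_clearing_delays(k_minutes, n_substations,
                                          mu, sigma, rng=rng)
        if delays:                            # skip runs without any alarm
            averages.append(mean(delays))
            maxima.append(max(delays))
    if not averages:
        raise ValueError("no alarms were generated in any run")
    return mean(averages), mean(maxima)


if __name__ == "__main__":
    # Major alarms: k1 = 14400 min, lognormal(4.59, 1.67), I = 10 substations.
    adt, mdt = estimate_indices(14_400, 10, 4.59, 1.67)
    print(f"average delay ~ {adt:.1f} min, average maximum delay ~ {mdt:.1f} min")
```

Averaging the per-run mean and the per-run maximum separately matches the wording of the indices, which are an average delay and an average maximum delay over the simulation runs.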
The defined performance indices are analyzed considering different numbers of substations (i.e., I = 6, 8, 10) and a simulation run of 129600 time units (minutes), which is equivalent to 3 months. The performance indices are obtained from 1000 independent iterations with a 95% confidence interval. Besides, the half width of the confidence interval is about 1.3% for ADTJ and 2.2% for ADTI in the worst case, which confirms sufficient precision of the performance indices estimation. Tables V and VI show the results for each type of alarm with the corresponding half width confidence intervals.

In addition, a simulation validation has been performed thanks to the cooperation of the AOC dispatchers. In order to validate the proposed CTPN model and determine how closely the simulation model reflects the real system, a standard statistical procedure, known as the single mean test, is employed. The single mean test results on the ADTJ and ADTI at 3 months indicate that, with a 95% confidence interval, there is no significant difference between the simulation and the real data samples. Tables VII and VIII present the test results for major and minor alarms, respectively.

Moreover, Figs. 5 and 6 compare the average and maximum delays for each type of alarm. The figures show that when the number of substations increases from I = 6 to 10, the average maximum delay increases. In contrast, the average delay is almost unchanged by the number of substations. Such results are a consequence of the fact that the skilled dispatchers, in cooperation with the OPs, try to keep the average delay time unchanged in order to maintain network reliability, even though their workload has increased. However, Fig. 7 shows, as expected, that the average number of major and minor alarms tends to increase with the number of substations. This result illustrates that the study and analysis of the AH and the associated workload of the dispatchers and OPs are among the most important issues for ensuring CI reliability.

Fig. 7. The average number of alarms for each type of alarm.

A. Insider attack Scenarios

In this section, two insider attack scenarios (no-response attack and delayed attack) are considered to evaluate the response of the proposed CTPN model when an alarm appears and the dispatcher/OP does not intend to clear it properly. In each scenario, different intensities of attack can be studied by assigning different values to the parameters.

In no-response attack scenarios, the dispatcher and/or OP do not send a correct response for clearing alarms. The value of the


TABLE IX
The values of the parameters for no-response attack scenarios.
$(\mu, \sigma)_{major} = (4.59, 1.67)$, $(\mu, \sigma)_{minor} = (5.12, 1.93)$

| $P_{error}^{dp}$ | $P_{error}^{op}$ | K1 | K2 | I |
| 0 to 0.6 | 0 and 0.5 | 14400 (min) | 86400 (min) | 10 |
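
As an illustration only, the parameter sweep of Table IX could be enumerated as a list of scenario records before being fed to a simulator. This is a minimal sketch, not the paper's CTPN implementation: the class name `NoResponseScenario`, the function `no_response_grid`, and the 0.1 step between dispatcher error probabilities are assumptions made for the example; only the ranges and constants come from Table IX.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class NoResponseScenario:
    """One row of a hypothetical no-response sweep (constants from Table IX)."""
    p_error_dp: float                      # dispatcher no-response probability
    p_error_op: float                      # OP no-response probability
    k1_min: int = 14400                    # K1 in minutes
    k2_min: int = 86400                    # K2 in minutes
    substations: int = 10                  # I
    mu_sigma_major: tuple = (4.59, 1.67)   # (mu, sigma) for major alarms
    mu_sigma_minor: tuple = (5.12, 1.93)   # (mu, sigma) for minor alarms


def no_response_grid(step=0.1):
    """Enumerate dispatcher probabilities 0..0.6 (assumed step) times OP in {0, 0.5}."""
    n_steps = round(0.6 / step)
    dp_values = [round(i * step, 10) for i in range(n_steps + 1)]
    op_values = [0.0, 0.5]
    return [NoResponseScenario(dp, op) for dp, op in product(dp_values, op_values)]


scenarios = no_response_grid()
for s in scenarios:
    print(s.p_error_dp, s.p_error_op)
```

Keeping the fixed quantities as defaults on the record makes each scenario self-describing, so a run can be reproduced from a single object.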

TABLE X
The values of the parameters for delayed attack scenarios.
K1 = 14400 (min), K2 = 86400 (min), I = 10

| $P_{error}^{dp}$ | $P_{error}^{op}$ | $\mu_{major}$ | $\sigma_{major}$ | $\mu_{minor}$ | $\sigma_{minor}$ |
| 0 | 0 | 1 to 10 | 1.67 | 6 to 15 | 1.93 |
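
The delayed attack scenarios rely on lognormally distributed clearing delays whose mean is swept while the standard deviation stays at its normal-state value. The sketch below shows one way such delays could be sampled. It assumes that the tabulated values are the mean and standard deviation of the delay itself (rather than of its logarithm), which the text does not state explicitly; the helper names `lognormal_params` and `sample_delays` are hypothetical and the code is not the paper's CTPN model.

```python
import math
import random
import statistics


def lognormal_params(mean, std):
    """Convert the mean/std of a lognormal delay into the (mu, sigma) of ln(delay)."""
    sigma_sq = math.log(1.0 + (std / mean) ** 2)
    mu = math.log(mean) - sigma_sq / 2.0
    return mu, math.sqrt(sigma_sq)


def sample_delays(mean, std, n, rng):
    """Draw n lognormal delays whose distribution has the given mean and std."""
    mu, sigma = lognormal_params(mean, std)
    return [rng.lognormvariate(mu, sigma) for _ in range(n)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
# Major-alarm delay means: the two Table X endpoints and the normal-state value.
for mean_major in (1, 4.59, 10):
    delays = sample_delays(mean_major, 1.67, 50_000, rng)
    print(mean_major, round(statistics.fmean(delays), 2), round(max(delays), 2))
```

If the tabulated values are instead the parameters of the underlying normal distribution, the conversion step should be skipped and the values passed directly to `lognormvariate`.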

Fig. 8. The average number of alarms in different intensity of attacks.

Fig. 9. ADTJ in different intensity of attacks.

Fig. 10. ADTI in different intensity of attacks.

Fig. 11. The increasing rate of ADTJ, ADTI, and ANA in different intensity of attacks.

parameters for controlling the dispatcher no-response threat probability ($P_{error}^{dp}$), the OP no-response threat probability ($P_{error}^{op}$) and the number of substations (I) is shown in Table IX. It is assumed that the traffic mean rate of alarms (k1, k2), the dispatcher/OP delay mean for major and minor alarms, and their standard deviations are as in the normal state. Attack intensity is varied by changing the $P_{error}^{dp}$ and $P_{error}^{op}$ parameters. Fig. 8 shows that the average number of alarms waiting to be cleared in place HMI2 tends to increase with the intensity of the no-response attack. Moreover, Figs. 9 and 10 show that the values of ADTJ and ADTI increase with the intensity of no-response attacks. However, Fig. 11 shows that when the intensity of the dispatcher attack increases from $P_{error}^{dp}$ = 0.1 to 0.6, the rate of ADTJ and ADTI increases significantly, whereas the rate of ANA increases almost linearly. Since the increased delay for clearing alarms could lead to cascade alarms, this result demonstrates that the security and reliability of CI are highly dependent on the skills of the dispatchers and OPs in clearing alarms.

In delayed attack scenarios, the dispatcher and/or OP do not clear alarms on time. Table X shows the values of the parameters for delayed attack scenarios. Following the lognormal distribution for the delay of clearing alarms, the value of the dispatcher/OP delay mean increases from $\mu_{major}$ = 1 to 10 for major alarms and from $\mu_{minor}$ = 6 to 15 for minor alarms. The values of the standard deviation for clearing major and minor alarms are kept as in the normal state. Figs. 12 and 13 show that when the value of the dispatcher/OP delay mean increases, the values of ADTJ and ADTI increase exponentially. This result emphasizes again that the reliability and security of CI are highly dependent on the timely operation of the dispatchers and OPs.

V. CONCLUSION

This paper employs the UML use-case and sequence diagrams to depict the AH transactions. Moreover, the AH, which is integrated with the dispatchers' actions and the OPs' maintenance, is modelled and simulated in a CTPN framework. The proposed CTPN model is able to address and predict the no-response threat and the delayed threat by the dispatcher/OP, which are the most important insider threats in SCADA systems. A real case study analyzing a dispatcher center in a power system shows that CTPNs are a valuable tool to evaluate and predict CI security-threatening data delays. The simplicity of the proposed AH model shows that it can be used for modeling large and complex SCADA systems. The proposed CTPN model not only represents the SCADA alarm handling system, but can also be used for studying CI reliability and insider threats based on the skills of dispatchers and OPs in clearing alarms. Furthermore, it is useful for generating simulated data in SCADA systems, even under insider threat conditions.


Fig. 12. The increasing rate of ADTJ in different intensity of delay attacks.

Fig. 13. The increasing rate of ADTI in different intensity of delay attacks.

VI. REFERENCES
[1] P. Goel, A. Datta, and M. S. Mannan, "Industrial alarm systems: Challenges and opportunities," Journal of Loss Prevention in the Process Industries, vol. 50, pp. 23-36, 2017.
[2] Y. Guan and M. Kezunovic, "Contingency-Based Nodal Market Operation Using Intelligent Economic Alarm Processor," IEEE Trans. Smart Grid, vol. 4, pp. 540-548, 2013.
[3] National Transportation Safety Board, "Pipeline Accident Report," NTSB/PAR-12/01 PB2012-916501, 2010.
[4] T. Cherifi and L. Hamami, "A practical implementation of unconditional security for the IEC 60780-5-101 SCADA protocol," International Journal of Critical Infrastructure Protection, vol. 20, pp. 68-84, 2018.
[5] K. Stouffer, J. Falco, and K. Scarfone, "Guide to Industrial Control Systems (ICS) Security," NIST Special Publication, vol. 800, p. 82, 2014.
[6] NRI, "Substation Automation Systems standard (Transmission and Subtransmission S/S)," Ministry of Energy of Iran, 2008.
[7] M. J. da Silva, C. E. Pereira, and M. Götz, "A dynamic approach for industrial alarm systems," in Computer, Information and Telecommunication Systems (CITS), 2016 International Conference on, 2016, pp. 1-5.
[8] Z. Zeng, W. Tan, and R. Zhou, "An alternative method to compute the expected detection delay for deadbands and delay-timers," in Control and Automation (ICCA), 2016 12th IEEE International Conference on, 2016, pp. 149-154.
[9] J. Wang, Z. Yang, K. Chen, and D. Zhou, "Practices of detecting and removing nuisance alarms for alarm overloading in thermal power plants," Control Engineering Practice, vol. 67, pp. 21-30, 2017.
[10] W. Tan, Y. Sun, I. I. Azad, and T. Chen, "Design of univariate alarm systems via rank order filters," Control Engineering Practice, vol. 59, pp. 55-63, 2017.
[11] W. Hu, T. Chen, and S. L. Shah, "Discovering association rules of mode-dependent alarms from alarm and event logs," IEEE Trans. Control Systems Technology, vol. 26, pp. 971-983, 2018.
[12] Y. Yu, D. Zhu, J. Wang, and Y. Zhao, "Abnormal data detection for multivariate alarm systems based on correlation directions," Journal of Loss Prevention in the Process Industries, vol. 45, pp. 43-55, 2017.
[13] G. Daian, M. Santa, and T. Letia, "Evolutionary method for railway monitoring systems," in System Theory, Control and Computing (ICSTCC), 2014 18th International Conference, 2014, pp. 627-632.
[14] M. P. Fanti, S. Mininel, W. Ukovich, and F. Vatta, "Modelling alarm management workflow in healthcare according to IHE framework by coloured Petri Nets," Engineering Applications of Artificial Intelligence, vol. 25, pp. 728-733, 2012.
[15] C.-W. Ten, C.-C. Liu, and G. Manimaran, "Vulnerability assessment of cybersecurity for SCADA systems," IEEE Transactions on Power Systems, vol. 23, pp. 1836-1846, 2008.
[16] D. Lefebvre, "Fault diagnosis and prognosis with partially observed Petri nets," IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 44, pp. 1413-1424, 2014.
[17] Y.-z. CHENG and X.-y. FANG, "Application of PN algorithm to power system restoration," Electric Power Automation Equipment, vol. 5, 2003.
[18] G. Ramos, J. L. Sanchez, A. Torres, and M. Rios, "Power systems security evaluation using petri nets," IEEE Trans. Power Delivery, vol. 25, pp. 316-322, 2010.
[19] M. Tariq and H. V. Poor, "Electricity theft detection and localization in grid-tied microgrids," IEEE Trans. Smart Grid, vol. 9, pp. 1920-1929, 2018.
[20] R. Amoah, S. Camtepe, and E. Foo, "Securing DNP3 broadcast communications in SCADA systems," IEEE Transactions on Industrial Informatics, vol. 12, pp. 1474-1485, 2016.
[21] J.-w. Yang and Z.-y. He, "Power systems alarm processing technology and fault diagnosis based on Petri nets with timing constraints," Power System Protection and Control, vol. 40, pp. 77-84, 2012.
[22] V. Augusto and X. Xie, "A modeling and simulation framework for health care systems," IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 44, pp. 30-46, 2014.
[23] P. Mahmoudi-Nasr and A. Yazdian-Varjani, "Toward Operator Access Management in SCADA System: Deontological Threat Mitigation," IEEE Trans. Industrial Informatics, vol. 14, pp. 3314-3324, 2018.
[24] Z. Lin, F. Wen, C. Chung, and K. Wong, "A survey on the applications of Petri net theory in power systems," in Power Engineering Society General Meeting, 2006. IEEE, 2006, p. 7 pp.
[25] K. Jensen, Coloured Petri nets: basic concepts, analysis methods and practical use vol. 1: Springer Science & Business Media, 2013.
[26] J. Zhao, Y. Xu, F. Luo, Z. Dong, and Y. Peng, "Power system fault diagnosis based on history driven differential evolution and stochastic time domain simulation," Information Sciences, vol. 275, pp. 13-29, 2014.

VII. BIOGRAPHIES

Payam Mahmoudi-Nasr received his BSc. and M.Eng in Computer Engineering from the Amirkabir University of Technology in 1994 and 1996, and PhD. in Electrical Engineering from the Tarbiat Modares University, Iran, in 2016, respectively. Since 2008, he has been with the Computer Engineering Department at University of Mazandaran, Iran, and has been actively involved in CERT of Tarbiat Modares and Mazandaran universities. His research interests include Industrial Network Security, Information Security and Smart Grid related Topics.