
SCADA and Modeling in Water Management

I. Stoian, D. Capatina, S. Ignat, O. Ghiran


SC IPA SA Cluj
Cluj-Napoca, Romania
ipacluj@automation.ro

Abstract - Water management operates with hydrologic and hydraulic models and with the river basin GIS component, which take into account the real-time measured parameters and the climate prognosis. Through its acquired historical data, the SCADA system associated with a river basin supports model calibration and validation: the models are trained on representative data sets in order to establish their associated parameters. However, not all data acquired from sensors are qualified for interaction with the models. They must first pass through filtering, compensation and plausibility processes. The data stored in databases comply with these requirements, being validated when loaded into the historical databases or by post-processing.

For managing dynamic conditions, alerts or certain extreme conditions (floods or prolonged drought), the models have to be switched while running, depending on the instantaneous values of the process parameters (run-time models). Selecting the most suitable model on the basis of erroneous on-line data may have disastrous results. Therefore, on-line data require rapid checking, involving specialized processors or demanding software applications running on parallel threads.

This paper analyzes SCADA data with the purpose of making them compatible with predictive models, in order to ensure decision support mechanisms that sustain proactive measures, and proposes a set of solutions for their efficient usage.

Keywords - SCADA data plausibility, models and run-time models, virtual valid database

I. INTRODUCTION
The structure of water resources is very complex and includes natural and man-made river basins and reservoirs, together with the associated infrastructure. Dedicated automation systems have evolved and expanded in order to manage water resources for energy production, water supply, distribution, treatment, pollution monitoring, irrigation, navigation, etc.

The SCADA (Supervisory Control and Data Acquisition) system manages and controls in real time various infrastructures (industrial manufacturing, public facilities and utilities such as water distribution, weather forecasting, hydrological and meteorological units deployed over very large geographic areas such as river basins, hydropower plants, etc.). The main SCADA system components are: sensors/transducers, which measure specific system parameters and report the status of the actuators used to control the process; RTUs (Remote Terminal Units) and/or PLCs (Programmable Logic Controllers), which convert sensor signals to digital data and send the digital data to a central control/dispatching unit; the central control/dispatching station, where system operators have access through the HMI (human-machine interface) to all the information gathered by the distributed RTUs and to the information provided by other related information stations (hydrological and meteorological information, geographical infrastructure surveillance); and the communication infrastructure (data transmission under specific secure protocols) [1].

The SCADA system associated with a river basin offers information support for the management of water resources, providing software modules for optimal dispatching of water on rivers, drains, sewers and reservoirs, remote control of the dam spillways, and on-line visualization of the measured parameters. This SCADA system enables remote operation, namely the viewing of real-time measurements (the level of water in rivers, polders and reservoirs) and the operation of spillways and bottom gates. Some SCADA options allow the dispatching unit to receive sound alarms when a fault within a water supply system is identified. Other facilities concern the storage, in the historical database, of the evolution of the temporal variables in the system, such as water levels, flows, precipitation, humidity, temperatures and status information.

The environmental influence of lakes and reservoirs whose water is used in different domains could be improved by skillful optimization of water reservoir management, based also on the modeling process. The model facilitates risk management during the flood period and recommends options for discharge and water level control in the geographical area of the river basin [2].

The usage of SCADA infrastructures in the water domain requires new services and models, allowing on-demand provisioning of complex applications and group-oriented infrastructure services across multiple suppliers.

II. SCADA DATA ANALYSIS
SCADA data belong to one of the following categories: analog data, digital data, pulse data, status bits or flags. These data are stored as records in the historical databases, each paired with its date and time. When these data are used for hydrologic and hydraulic modeling, they require verification and careful selection and processing in order to maintain their usefulness. Using them without this filtering process is neither acceptable nor recommended.
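As an illustration of the record structure and of the verification step mentioned for SCADA data in Section II, the following is a minimal sketch and not the authors' implementation. The record fields, the tag name `reservoir_level_m` and the limits in `LIMITS` are hypothetical and chosen only for the example; the check shown is a simple range test applied to analog values.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum


class Kind(Enum):
    """The SCADA data categories named in the text."""
    ANALOG = "analog"
    DIGITAL = "digital"
    PULSE = "pulse"
    STATUS = "status"


@dataclass(frozen=True)
class ScadaRecord:
    tag: str             # hypothetical point name, e.g. "reservoir_level_m"
    kind: Kind
    timestamp: datetime  # date and time stored with every value
    value: float


# Hypothetical plausibility limits per tag (assumed physical sensor range).
LIMITS = {"reservoir_level_m": (0.0, 25.0)}


def is_plausible(rec: ScadaRecord) -> bool:
    """Range check for analog values; other kinds pass through unchanged."""
    if rec.kind is not Kind.ANALOG:
        return True
    low, high = LIMITS.get(rec.tag, (float("-inf"), float("inf")))
    return low <= rec.value <= high


def filter_plausible(records):
    """Keep only the records that pass the range check, in original order."""
    return [r for r in records if is_plausible(r)]
```

A real system would load the limits from the sensor configuration and would log the rejected records rather than silently dropping them.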

978-1-4799-3732-5/14/$31.00 ©2014 IEEE


The communication channels of SCADA systems deployed over large geographical areas are relatively slow, because they rely on radio, telephone lines or wired links with a low communication rate. SCADA data from the different field devices are not collected continuously, because the data are requested by polling and/or incidental interruptions can occur. Some sensors can also cause problems (supply problems, range exceeded), so that their measured value is not plausible. In these conditions it is common for a SCADA system to display average values of analog data instead of instantaneous ones.

The SCADA data loaded into relational databases are obtained in three major ways:
• The central host computer asks the field devices for data in a polling process;
• The field devices send data to the central host at a periodic rate;
• The operator enters data manually into the databases.

All three ways share a drawback: the receiver of a request may be busy with another operation and reject the request, causing a loss of data in the acquisition process.

SCADA systems exchange data with external applications by exporting them in various forms: ASCII text, spreadsheet files using ODBC (Open Database Connectivity), or proprietary format packages.

When SCADA data are used in the modeling process, it is necessary to:
- perform an overview of their records,
- analyze data details in order to identify and solve timing problems and missing data,
- analyze the plausibility of the data,
- organize the information into the needed groups,
- keep in separate strings the measured data with their time stamp and the processed data (e.g. averaged values, interpolated values, computed values),
- generate from the SCADA measurement databases an ‘integral data set’ for model calibration/validation.

SCADA data are recorded in large distributed relational databases. The structure of distributed databases cannot be used directly for the modeling process. For model calibration and validation, measurement sets that are sensitive for the modeled phenomena must be selected from the historical databases (context reasoning data spreadsheet).

By the CAP theorem, Brewer asserts that for distributed databases it is impossible to provide simultaneously all three of the following guarantees, consistency, availability and partition tolerance, and also get an acceptable latency [3].

Consistency means that only valid data should be stored in the database. All the clients who access the database are able to see the same data, even if there are concurrent updates.

Availability means that read and write operations always succeed. Database availability minimizes downtime by detecting application failures in ‘real time’. All clients of the databases can access one version of the data.

Partition tolerance concerns the splitting of the database over multiple servers. The guaranteed properties are maintained even when network failures prevent some machines from communicating with others.

“Atomicity, Consistency, Isolation, Durability (ACID) is a set of properties that guarantee that database transactions are processed reliably” [4]. A database that fails to meet any of these four ACID goals cannot be considered reliable.

When multiple users access the database at the same time, database transactions are required to protect the data and to keep them consistent.

Once the data are filtered and plausible, they have to be protected against power outages and other threats. For this purpose it is reasonable to save the data related to a transaction in different places.

These concepts imply that direct exploitation of the databases for modeling is unsuitable, although SCADA data integrity is achieved.

Another important feature of the data is their integrity, that is, the assurance and maintenance of their consistency and accuracy over their life cycle in the storage process.

Data integrity has two components:
• Physical integrity refers to the correct fetching and storing of data.
• Logical integrity includes data plausibility.

Data integrity includes the checking and correction of invalid data, based on a set of rules defining the relations between data.

The causes of errors in data obtained from SCADA are:
- system failure, i.e. malfunctions of: (i) measurement devices; (ii) the RTU or PLC; (iii) communication devices and channels; (iv) the central host computer; (v) database servers;
- overlapping of operational schedules;
- data alteration due to: (i) the compression/decompression process; (ii) incomplete conversion of the measured value; (iii) transducer over-range;
- incorrect time-stamping of the measured data.

A. Data Compression and Timing Problems
We have to consider that the users of the data do not always understand the particular mechanisms by which the data have been collected and do not take this into account when analyzing the data [5].

The transmission and the storage of the data use a compression mechanism. A field variable transmitted with a compression process can lose its trend evolution and sometimes no longer reproduces the behavior of the actual variable.

Another possible source of errors in SCADA trends is timing problems. For this reason, a SCADA system may introduce the following mechanisms:
• A ‘back-filling’ mechanism, for the case of an unsolicited data transfer. Automatic data historian database software products have to be compatible with this mechanism;
• Optimized methods for the use of communication bandwidth and field-based data memory, so that the displayed trend is accurate;
• Timing synchronization, addressing the temporal error that arises when events' time stamps do not correspond to the times at which the events occurred in the field.

B. Missing Data Problems
Missing data in SCADA information can be caused by many factors, such as:
• Failures of sensors or of the sensor network,
• Measurement instrumentation failures,
• Communication failures between the RTU and the dispatching computer,
• Temporary SCADA software glitches.

Problems of sensors or of the sensor network consist of:
- Unknown sensor elevation: if the elevation of the sensor is unknown (or only roughly known), the measurements of the SCADA system sensors could be inaccurate.
- Critical to the scalability of sensor networks are the following metrics: (i) effective bandwidth; (ii) loss rate; (iii) average network bandwidth.
Scalability enables a network with high spatial density, which makes it possible to identify nodes with greater accuracy. As the sizes and densities of emerging sensor networks grow, appropriate algorithms are required for collecting information in these networks. In order to characterize and identify the most critical questions raised by the wide-scale deployment of sensor networks, the scalability of these algorithms has to be analyzed. This is achieved by examining the scalability behavior of routing and compression algorithms as the number of nodes in a sensor network grows.
Scalability analysis focuses on the following aspects of a sensor network: (i) its routing algorithm, which decides the communication patterns between nodes; (ii) its method of data aggregation and compression [6].
- Scaling sensor networks with respect to the number of nodes and to energy efficiency: (i) how to scale up the number of nodes so as to get appropriate coverage of a zone of interest; (ii) how to scale down the energy spent per node, in order to diminish energy consumption.
- Aggregation permits multiple packets to be joined into a single packet; compression allows the data in a packet to be condensed in size. Per-node aggregation and compression techniques decrease the transmission cost of data by diminishing the amount of data, at the cost of additional computation.
- Standard multihop wireless networking paradigms basically do not scale for data gathering from such sensor nets.
- While the location of significant events is a critical element of the information we wish to gather, nodes in a huge random deployment may not be able to identify their own positions, given that positioning techniques such as GPS may be unavailable (due to cost or environment).

Instrumentation-related problems include:
- Inaccurate data readings, which can result from uncalibrated equipment, from signal interference in the inputs to the local-level processor units or from misinterpretation by the SCADA software [5];
- Data spikes in readings, meaning fluctuations or exceeded measurement limits (resulting from hydraulic transient regimes that appear when a pump or valve changes its operational mode);
- Inadequate instrument range (if the measurement duration of the field instruments is not adequate, the measured value cannot be stable and the recorded values exceed the range of the device);
- Field instrumentation failure, usually caused by a drift in the calibration or other improper maintenance of the field device;
- Insufficient resolution of the field device used in data acquisition. RTUs avoid this problem by using the IEEE floating-point format.

Communication-related problems include:
- Failures of the communication systems (between the dispatching level and the local level, or between different RTUs interconnected at the local level), which are identified by physical gaps in the acquired and recorded data;
- Noise in the communication system. This can be identified by comparison with long-term data collected directly from the field in previous periods, or by statistical filtering techniques.

Temporary SCADA software glitches:
- These could be caused by the failures mentioned above. They may also appear due to malware, viruses or software packages that have not been updated.

Figure 1 presents user interfaces of SCADA systems that give the operator visibility of historical acquired data containing different errors. The SCADA applications were developed by the authors of the paper on various software packages: LabWindows/CVI, Vijeo Citect and LabVIEW.

III. MODELS IN WATER MANAGEMENT DOMAIN
Models generally make assumptions about real-world processes in order to predict the behavior of systems under certain conditions, but with inherent inaccuracies for any real system [7].
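Before models are fed with SCADA records, the two error types discussed in Section II, physical gaps in the recorded data and data spikes, can be screened automatically. The following is a minimal sketch under stated assumptions, not the method used in the paper: gaps are detected by comparing consecutive time stamps with an assumed polling period, and spikes by a simple statistical filter (deviation from the local median, scaled by the median absolute deviation). The function names and the parameters `tolerance`, `window`, `threshold` and `min_dev` are hypothetical.

```python
from datetime import datetime, timedelta
from statistics import median


def find_gaps(timestamps, period, tolerance=1.5):
    """Return (start, end) pairs where consecutive samples are further apart
    than tolerance * period, i.e. where polls were probably lost."""
    limit = period * tolerance
    return [(a, b) for a, b in zip(timestamps, timestamps[1:]) if b - a > limit]


def find_spikes(values, window=5, threshold=3.0, min_dev=0.0):
    """Return indices of values that deviate from the local median by more
    than threshold times the local median absolute deviation (MAD).

    min_dev is an absolute floor, in measurement units, below which a
    deviation is never treated as a spike (useful for nearly flat signals).
    """
    spikes = []
    half = window // 2
    for i, v in enumerate(values):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        neighbours = values[lo:hi]
        med = median(neighbours)
        mad = median(abs(x - med) for x in neighbours)
        if abs(v - med) > max(threshold * mad, min_dev):
            spikes.append(i)
    return spikes
```

For example, with a 10-minute polling period, `find_gaps(ts, timedelta(minutes=10))` reports every interval longer than 15 minutes. The window size and thresholds would have to be tuned per sensor, since hydraulic transients can legitimately produce fast changes.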
Fig. 1. Errors detected in data acquired in SCADA system (annotations on the screenshots: data spike, missing data)

In hydrologic and hydraulic modeling of water resources, issues of scaling, calibration and verification must be examined, together with the simplification of parameters to guard against over-parameterization. A management issue faced by water management agencies is how to improve the availability and usefulness of information. Efficient management and planning necessitate an integrated decision support system (DSS) based on advances in modeling and geographic information system (GIS) technologies [7].

The DSS allows the generation of solutions for flood and drought cases that have a deterministic nature, and provides solutions for problems that are based on unreliable or incomplete data. This computer software is designed to be comparable with the decision-making ability of a human expert. It is composed of three parts: the inference engine, the knowledge base, and a dialog interface for communicating with users. The DSS calculates the optimal allocation of the water resources in both cases, elaborating a generally applicable solution for extreme river flow conditions.

The technology used is a Multiple Input - Multiple Output Distributed Control System that acts predictively, based on the information from the SCADA system and on the optimal targets imposed by the DSS, simultaneously on all dams, gates and spillways of the polders and irrigation channels, through appropriate actuating devices. It comprises an advanced control algorithm based on distributed model predictive control, together with the sensors and actuators needed to maintain the controlled variables at their set points. The technology contains a number of local model predictive controllers, which have to communicate with each other in real time for the prediction and prevention of possible flood and drought damage.

Different types of mathematical models, either considered standard in hydrologic modeling or specially developed by our project team (a mathematical model based on an artificial neural network for small basins), are used for the evaluation of the hydrologic inputs. The hydrologic inputs will be run through the system of reservoirs and river stretches in order to establish operating rules for the reservoirs and to test the system functionality for different meteorological and hydrologic inputs. The operating rules of the spillways are obtained by optimization, both locally at the level of each reservoir and at the level of the whole system, taking into account the following constraints: the minimum and maximum volume in the reservoirs, the maximum water depth of the spillways, the transfer capacity of the river bed downstream of each reservoir, and the requirement to avoid discharge pulses in downstream reservoirs. Genetic algorithms will be used in the optimization process for deriving the optimal operational strategy.

During drought periods the main problem is the allocation of water between reservoirs and water users. The DSS will contain all the data describing the climatic characteristics of the water catchment, the soil characteristics, the physiological characteristics of the crops, the continuous estimation of the water demands of agriculture and other water users, the availability of the water resources, etc. The Water Management System will be modeled by an arcs-nodes network. The goal is to achieve the optimal water distribution within the system, both on the long term
and on the short term, that minimizes the economic losses of not meeting water demands. Long-term optimization is achieved by establishing reservoir operation rules of dispatcher type, which imply the zoning of the reservoir volume. Short-term optimization is attained during each time interval and consists of a convenient distribution of water between reservoirs and water users that minimizes the violation of the constraints.

The mathematical model will cover the evaluation of the hydrologic boundary conditions, followed by hydraulic simulations combined with optimal and predictive control methods, for the identification of the best operation strategies for the cascade of reservoirs and polders. The models will be applied to improve flood and drought protection in a river basin where a cascade of reservoirs already exists, and to demonstrate the efficiency of automatic intervention on the river/channel flows.

The system provides a new set of functions to be implemented on the SCADA system for flood and drought control. In order to implement the decision support mechanism, an intelligent distributed decision support will be created, with a management structure adapted to the monitoring and control RTU entities distributed over the wide river basin area and with multi-objective decision optimization capabilities. These mechanisms will be implemented as an integrated decision support framework based on individual intelligent cooperative decision supports, which ensures one-to-one interaction as well as interaction with a moderator, in order to offer a balanced solution in certain practical situations.

IV. SCADA DATA USAGE IN MODELING PROCESS
According to [7], “SCADA is a great system for data automation and collection, but it cannot currently extend beyond these applications to complex modeling and mapping”. This statement has been contradicted by the subsequent evolution of SCADA systems, at least considering their degree of generalization. However, as shown in Section II, the data resulting from SCADA systems can sometimes have problems that make them unsuitable for use in modeling processes. This paper proposes a series of concrete solutions for using the data resulting from SCADA system acquisitions as support for the development of different predictive hydrologic/hydraulic models and for their calibration, followed by their integration in a decision support system aimed at ensuring proactive behavior in emergency situations (floods or persistent drought).

Through its acquired historical data, the SCADA system associated with a river basin supports model calibration and validation: the models are trained on representative ‘integral data sets’ in order to establish the model parameters, which are stored in the specific Models database. However, not all data acquired from sensors are qualified for interaction with the models. They must pass through: ►filtering, ►compensation, ►plausibility processes [8]. The data stored in relational databases comply with these requirements, their integrity being validated in the process of storage into the databases.

For usage in the modeling process, the data loaded from the relational databases of the SCADA system will be processed by: ►time interval selection, ►appropriate data set selection, ►data conditioning, in order to obtain an ‘integral data set’ specific to the modeled phenomena. As depicted in Fig. 2, the database postprocessor module extracts event-specific data structures appropriate for the calibration and validation of each model, before their inclusion in the model database.

Fig. 2. SCADA database processing for modeling calibration/validation

The DB postprocessor (modeling data extractor) is service-type software that acts when the modeler calls it. It has communication interfaces (HMI) that allow the modeler to specify the time periods, the specific data and the model that is to be calibrated and validated. The service verifies the consistency and the availability of the needed data, according to the extended form of the CAP theorem for distributed systems. The requested data are displayed to the modeler in graphic or tabular form and, if there is an ‘integral data set’, the model is fed with this input data set in order to be calibrated and validated. If the needed input parameters have missing data, the service tries to solve this by interpolation or by manual data input from the modeler.

Fig. 3. On-line data processing for adaptive model for DSS
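The gap-handling step of the DB postprocessor described in Section IV (interpolation where possible, manual input from the modeler otherwise) can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the service described in the paper: the function names, the regular sampling grid and the `max_gap` limit beyond which values are left for manual input are all hypothetical, and linear interpolation is only one possible choice.

```python
from datetime import datetime, timedelta


def fill_gaps(samples, period, max_gap):
    """Resample (timestamp, value) pairs on a regular grid.

    Missing grid points inside a gap no longer than max_gap are linearly
    interpolated; longer gaps are left as None so that the modeler can
    supply the values manually.
    """
    samples = sorted(samples)
    out = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        out.append((t0, v0, "measured"))
        span = t1 - t0
        t = t0 + period
        while t < t1:
            if span <= max_gap:
                frac = (t - t0) / span  # timedelta / timedelta gives a float
                out.append((t, v0 + frac * (v1 - v0), "interpolated"))
            else:
                out.append((t, None, "missing"))
            t += period
    if samples:
        t_last, v_last = samples[-1]
        out.append((t_last, v_last, "measured"))
    return out


def is_integral(filled):
    """Here an 'integral data set' means that no grid point is left missing."""
    return all(value is not None for _, value, _ in filled)
```

Keeping the origin of each value ("measured", "interpolated", "missing") alongside the data follows the recommendation in Section II to keep measured and processed values apart.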
For managing dynamic conditions, alerts or certain extreme conditions (floods or prolonged drought), the models have to be switched while running, depending on the instantaneous values of the process parameters (adaptive run-time models). Selecting the most suitable model on the basis of erroneous on-line data may have disastrous results. Therefore, on-line data require rapid checking, involving specialized processors or demanding software applications running on parallel threads. The real-time data flow from the sensor level to the decision support system includes a real-time data processor that extracts the data sequences needed as input for the context reasoning model (Fig. 3). The real-time data processor is service-type software that runs automatically. This component triggers the on-line sensor measurements and, by means of different pre-defined thresholds, selects the model that must be run for the decision-making process. With respect to the input data for the running model, the service functionalities are the same (testing of consistency and availability), but they are applied to the on-line data and to their trends stored in memory locations. In this manner the input data for the model are obtained and stored in the real-time DB. The control-driven module obtains the model output data and, after simulation, decides whether the selected model is the appropriate one or must be changed. Based on the model output data and on the rules and knowledge of the DSS, the dispatching operator receives a prognosis and a set of suggested actions for better mitigation of critical events.

The real-time data processor verifies the data validity:
a) It detects sensor data errors by continuous automatic analysis, involving software agent network analysis techniques;
b) The data acquired by the SCADA system are useful for indicating when sensors are not operating correctly and need calibration. The comparison between the data from a sensor and an expected value indicates the necessity of the calibration process;
c) When a sensor fails, the service indicates a sensor with similar functionality (a redundant one), if one exists in the system. In water management there are many SCADA system owners that can collaborate in a federative environment and ‘change’ sensors with similar functions when one of them is unavailable [8].

Regarding the sensor network, there are some considerations concerning its scalability. For moderate-scale sensor nets (such as those of reservoir dams) with a conventional multihop architecture, if a source may randomly select its destination in a huge network, then the throughput per node scales down with the network size. Because sensors often have to collaborate in order to perform a given sensing task related to the environment, it is important to determine the utility of a remote sensor's data, so as to conserve the power required by communication. The question is centered on how to query sensors dynamically and route data in a network so as to maximize information gain, whilst minimizing latency and bandwidth consumption [9]. Problems regarding “sensing optimization” and “power optimization” must be treated in correlation with the importance of the data set and its validity for modeling. So, in this phase, data robustness is more important than minimizing consumption and optimizing the communication networks. This situation is different from the current exploitation of sensor networks.

When the real-time data processor detects missing data in the memory buffers, it interpolates between SCADA values in order to estimate the parameters at the acquisition time rate (by averaging or interpolating data).

V. CONCLUSION AND FUTURE WORK
This paper proposes a series of concrete solutions for using the data resulting from SCADA system acquisitions as support for the development of different predictive hydrologic/hydraulic models, followed by their integration in a decision support system. For usage in the modeling process, the data loaded from the relational databases of the SCADA system are processed by time interval selection, appropriate data set selection and data conditioning, in order to obtain an ‘integral data set’ specific to the modeled phenomena. The real-time data flow from the sensor level to the decision support system includes a real-time data processor that extracts the data sequences needed as input for the context reasoning model. Future work needs to analyze in practice the results of the applied models and the usefulness of the data delivered by SCADA, and to apply this method in other areas in which SCADA offers a large set of data.

ACKNOWLEDGMENT
This work was supported by a grant of the Romanian National Authority for Scientific Research, CNDI-UEFISCDI, project number PN-II-PT-PCCA-2011-3.2-0344 - “Pro-active operation of cascade reservoirs in extreme conditions (floods and droughts) using a Comprehensive Decision Support Systems (CDSS). Case study: Jijia catchment”.

REFERENCES
[1] ***, Funding Application for Joint Applied Research Projects - PN-II-PT-PCCA-2011-3, “Pro-active operation of cascade reservoirs in extreme conditions (floods and droughts) using a Comprehensive Decision Support Systems (CDSS). Case study: Jijia catchment”, 2011.
[2] Sz. Balogh, E. Stancel, B. Gyurka, C. Vigu, S. Ignat, “Integrated Cascaded Hydropower Plants SCADA Operative System with Decision Support Components”, 2010 IEEE International Conference on Automation, Quality and Testing, Robotics (AQTR 2010), THETA 17th edition, 28th-30th May, Cluj-Napoca, Romania, IEEE Catalog Number: CFP10AQT-PRT, ISBN: 978-1-4244-6722-8.
[3] S. Gilbert, N. A. Lynch, “Perspectives on the CAP theorem”, Computer, vol. 45, no. 2, pp. 30-36, 2012.
[4] *** http://en.wikipedia.org/wiki/ACID
[5] T.M. Walski, D.V. Chase, D.A. Savic, W. Grayman, S. Beckwith, E. Koelle, “Advanced Water Distribution Modeling and Management”, 1st ed., Exton, PA: Bentley Institute Press, 2007.
[6] A. Krause, C. Guestrin, “Optimizing Sensing: From Water to the Web”, Computer, vol. 42, no. 8, pp. 38-45, 2009.
[7] L. Thiem, M. Vitale, C. McGreavy, G. Tsiatas, “Water Data Management Systems Integrations with Models”, May 2001, http://www.wrc.uri.edu/publications.html
[8] I. Stoian, E. Stancel, S. Ignat, Sz. Balogh, “Federative SCADA - Solution for Evolving Critical Systems”, Journal of Control Engineering and Applied Informatics, vol. 12, no. 3, 2010.
[9] M. Chu, H. Haussecker, F. Zhao, “Scalable Information-Driven Sensor Querying and Routing for ad hoc Heterogeneous Sensor Networks”, Palo Alto Research Center Technical Report, 2001.
