A Development and Execution Environment for
Early Warning Systems for Natural Disasters
Bartosz Balis∗†, Tomasz Bartynski†, Marian Bubak∗†, Grzegorz Dyk†, Tomasz Gubala† and Marek Kasztelnik†
∗ AGH University of Science and Technology, Department of Computer Science, Krakow, Poland
† AGH University of Science and Technology, ACC CYFRONET AGH, Krakow, Poland

Abstract—Early Warning Systems (EWS) may become a powerful tool for mitigating the negative impact of natural disasters, especially when combined with advanced IT solutions – such as on-demand scenario simulations, semi-automatic impact assessment, or real-time analysis of measurements from in-situ sensors. However, such complex systems require a proper computing environment supporting their development and operation. We propose the Common Information Space (CIS), a software framework facilitating design, deployment and execution of early warning systems based on real-time monitoring of natural phenomena and computationally intensive, time-critical computations. CIS provides a service-oriented technology stack which helps design 'blueprints' for early warning application scenarios and deploy these blueprints as services – system factories enabling users to rapidly deploy new EWSs in new settings. CIS also provides advanced runtime services and resource orchestration capabilities in order to address the specific requirements of EWSs: continuous operation, highly variable resource demands and mission-critical computations. The CIS concept is validated through the Flood Early Warning System whose goal is to monitor embankments in urban areas and assist in rapid decision making whenever a dike failure results in a flooding threat.
I. INTRODUCTION

Early warning systems can become a crucial factor in mitigating the impact of natural disasters on urban environments. The maturation of in-situ monitoring technologies and the ever-increasing supply of cheap computing power bring this vision ever closer to reality. Early warning scenarios involve real-time analysis of large volumes of data concerning natural phenomena, and predictions based on CPU-intensive simulations. Fidelity and timeliness are two crucial quality indicators in effective early warning systems. Consequently, such systems require proper IT support at every stage of their lifetime: (i) tools for planning early warning scenarios at design time; (ii) frameworks facilitating implementation of new early warning system blueprints; (iii) technologies for rapid deployment of EWS blueprints at new locations; (iv) runtime services supporting mission-critical execution of EWSes.

In this paper, we present a concept of a software framework facilitating design, deployment and execution of early warning systems based on real-time monitoring of natural phenomena and computationally-intensive, time-critical computations — the Common Information Space. In our previous paper [1] we described the concept of CIS at its early development stage. This paper presents CIS as a mature concept and a fully implemented system, validated in the context of flood early warning systems [2]. We focus on deployment and runtime capabilities of CIS not described in previous publications and highlight its mission-critical EWS execution features.

This paper is organized as follows. Section II overviews the state of the art. Section III presents the early warning system software model and runtime components of the Common Information Space. Section IV presents details of CIS solutions regarding mission-critical aspects of EWSs. Section V describes the CIS-powered Flood Early Warning System. Section VI concludes the paper.

II. STATE OF THE ART

Early warning systems leveraging real-time sensor data and time-critical computing are becoming increasingly popular. One notable example is the Indian Tsunami Early Warning system [3] which combines sensors for earthquake detection, ocean-based sensors to detect tsunamis, satellites to correlate weather information, and area maps to correctly predict where the tsunami will strike. The US Geological Survey [4] offers access to a global network of seismic sensors and performs data processing in order to generate warnings for concerned stakeholders. The Fire Warning System [5] developed at the University of Freiburg has a similar goal – to deliver a map which plots all forest fires in the shortest time possible. However, while the aforementioned systems aggregate existing resources in order to deliver a high-quality early warning system for a specific domain, they do not attempt to propose a computing environment facilitating the very development and operation of early warning systems in general.

Mission-critical applications, including early warning systems, operate continuously and monitor selected phenomena in order to invoke resource-demanding and computationally-intensive simulations whenever necessary. These computations must meet strict deadlines – otherwise their results will present no practical value. Therefore they must be hosted in special environments that feature urgent computing facilities [6]. Over the last decade a number of attempts have been made to construct such systems on the basis of grid infrastructures. While grids can easily provide extensive resources, they also have important drawbacks. Since they rely on queuing systems, it is impossible to predict when execution will commence and when all the required resources will become available. The latter problem is partially addressed by advance reservation mechanisms, enabling users to specify all required resources that have to be available simultaneously for an application to execute. One example of such a solution is HARC [7]: an open-source system enabling users to reserve different types of resources (e.g. CPU, GPU, network connections) in a single step. Another example is the QosCosGrid middleware [8], which uses, among others, various advance reservation techniques to integrate multiple computing resources into a single powerful virtual supercomputer.
However, in the above-mentioned solutions resources may only be reserved following a significant delay, which calls for a reservation prioritization mechanism. Moreover, an urgent computation must not wait for a long time in a queue – rather, it should be invoked immediately. This makes the listed systems poorly suited to urgent computing tasks. In [9] the authors present several ways to address these issues (implemented in the SPRUCE Urgent Computing Environment). In order to reduce the resource procurement time there is a pool of sites with ready-to-use preinstalled applications from which the least used are chosen for any emerging task. Afterwards, the urgent job is submitted to the corresponding queue. The queuing system may apply various policies with regard to working jobs: wait for completion, move to another site, force a checkpoint or even kill the given job to ensure that the urgent request is served immediately. Still, the presented approach does not take into account external storage resources which may be extensively used by scientific computations involved in simulations. This deficiency is addressed by the Urgent Data Management Framework [10] which focuses on providing various services, including storage, for urgent computing.
Yet another disadvantage of grid systems is the fact that every application runs on resources with one specific software configuration (operating system and a set of libraries and programs). Therefore it is difficult or even impossible to satisfy specific requirements of custom applications. Even if the administrator were to create a pool of customized workers, the number of such nodes in traditional grids cannot be managed dynamically. These problems can be solved by combining grid infrastructures with virtualization technologies. Each application can run in a virtual machine which may be customized in terms of operating system, installed libraries and tools. The number of such virtual worker nodes can be increased or decreased to best meet current needs. An example of such a system is the Virtual Spaces Job Manager [11]. Job Description Language extensions allow users to specify the required virtual machines and level of urgency. If a task requires resources that are not available at a given moment, a request to start a virtual machine is submitted to a queue.

In spite of the described enhancements of grid infrastructures, they still lack some desired characteristics that are provided by cloud infrastructures. The former require users to go through a certification process before access is granted and do not give administrative access to resources. On the other hand, clouds can be used by anyone on a pay-per-use basis. They generate an illusion of infinite resources, making acquisition of fully configurable resources easier and faster. The availability of cluster instances in Amazon's EC2 makes high performance computing on the cloud possible [12]. As a consequence, cloud infrastructures are garnering interest not only from commercial users but also from scientific communities [13].

Although cloud infrastructures are becoming ever more mature, operating early warning systems on clouds remains a great challenge. Problems that need to be addressed include specifying the system as a set of cooperating components running on virtual machines, automatic deployment, monitoring and scaling of resources.

III. THE COMMON INFORMATION SPACE

The Common Information Space is a software framework facilitating development, deployment and execution of complex mission-critical systems which rely on scientific computing, in particular early warning systems protecting against natural disasters. This section explains the system model adopted by CIS and describes core CIS runtime services.
A. Early warning system model

In complex systems, such as EWS, there is a need to compose and orchestrate diverse and distributed resources in order to enact complex application scenarios. For example, a relatively simple scenario – prediction of inundation of an area threatened by a dike breach – involves at least the following elements: (i) scientific applications implementing computational models of inundation; (ii) computing infrastructure (HPC clusters, clouds) capable of running CPU-intensive jobs; (iii) current dike sensor readings (e.g. water level); (iv) terrain elevation data for the area.

Fig. 1. Complexity of early warning application scenarios, flood simulation example. Composition and orchestration of diverse distributed resources in the Common Information Space based on the service orientation approach.

In CIS, we adopt a service orientation approach in which resources are exposed as services which are, in turn, composed and orchestrated in order to enact application scenarios (Fig. 1).

Fig. 2 provides a more detailed view of the CIS architecture and its system model. The architecture proposes a SOA solution stack for mission-critical compute-intensive systems, such as EWSs. In creating this design, we wanted to take advantage of SOA architectural patterns for enterprise applications while at the same time modifying them to suit the requirements of early warning systems.

Fig. 2. Common Information Space architecture and early warning system model. Domain resources (lower layer) are exposed as basic services. These can be composed into system parts – blueprints for early warning application scenarios and building blocks for early warning systems. An early warning system is a collection of loosely coupled parts configured for a particular setting (e.g. a dike section).

The bottom layer of the stack contains domain resources: scientific applications, sensor data streams, data sets, etc. They are exposed as so-called basic services which allow users to access and manage the resources over the network (albeit usually via proprietary interfaces). Data resources are typically external and only accessed, but not managed, by CIS. Scientific applications, on the other hand, are wrapped as virtual machine images (appliances) and can be deployed on demand in the cloud.
Basic services can be composed to provide higher-level features as composite services, also called system parts. System parts are blueprints for application scenarios (e.g. simulation of inundation) and, consequently, building blocks for early warning systems. It is important to remark that parts expose standardized interfaces for invocation, management and monitoring.

Finally, a CIS-powered system is a collection of loosely-coupled service instances (simply referred to as 'services'). Services can work on a request-driven or event-driven basis. A request-driven service exposes an HTTP-based URI which can be used to send requests. Responses to such requests typically contain a new URI from which results can be downloaded (a widely used design pattern for long-running requests). As early warning systems involve spatial data and processing, invocation interfaces are compliant with the OGC Web Processing Service (WPS) standard [14].
An event-driven service is connected to a message bus and listens to one or more events which trigger processing. Results of this processing can also be sent to the message bus, then picked up by another service, and so on, effectively enacting a complex scenario. Service types, interfaces and interactions are depicted in Fig. 3.

Fig. 3. Services: (i) Types: request-driven (Service A) and event-driven (Services B, C); (ii) Interfaces: management, monitoring (IsAlive), invocation (WPS); (iii) Loosely-coupled interactions through message bus.
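To make the two interaction styles concrete, the following Python sketch (ours, not CIS code) models them side by side: the class and topic names are hypothetical, the bus is an in-process stand-in for a real message broker, and the request-driven service returns a result URI in the spirit of the long-running request pattern described above.

    import threading
    import uuid

    class MessageBus:
        """Minimal in-process stand-in for the CIS message bus (e.g. JMS)."""
        def __init__(self):
            self.subscribers = {}                      # topic -> callbacks
        def subscribe(self, topic, callback):
            self.subscribers.setdefault(topic, []).append(callback)
        def publish(self, topic, message):
            for callback in self.subscribers.get(topic, []):
                callback(message)

    class RequestDrivenService:
        """Accepts a request and immediately returns a URI under which
        the result of the long-running computation will appear."""
        def __init__(self):
            self.results = {}
        def execute(self, request):
            job_id = uuid.uuid4().hex
            threading.Thread(target=self._run, args=(job_id, request)).start()
            return f"/results/{job_id}"                # client polls this URI
        def _run(self, job_id, request):
            self.results[job_id] = f"simulated({request})"

    class EventDrivenService:
        """Listens on a topic; incoming events trigger processing whose
        results are published back to the bus for other services."""
        def __init__(self, bus):
            self.bus = bus
            bus.subscribe("sensor.readings", self.on_event)
        def on_event(self, message):
            self.bus.publish("anomaly.likelihood", {"value": 0.1, "src": message})

    bus = MessageBus()
    EventDrivenService(bus)
    bus.publish("sensor.readings", {"dike": "section-42"})
    print(RequestDrivenService().execute({"scenario": "inundation"}))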
The Common Information Space allows developers to create blueprints for application scenarios featured in complex systems such as EWSes. These blueprints can be deployed as software services which serve as system factories allowing the user to rapidly deploy a new instance of a system (e.g. a flood EWS) in a new setting (e.g. a dike-protected area). Such a deployment primarily involves configuration of the blueprint with system-specific parameters (for a flood EWS: dike properties, the URI of a service serving dike sensor data, various metadata items, etc.). Fig. 4 shows that creating a new instance of a system affects all layers of the stack: invoking a system blueprint creates a number of associated services (parts), while service instances require deployment of associated appliances in the cloud.

Fig. 4. CIS as a system factory: using the system blueprint to create a new instance.
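As a rough illustration of the factory idea, a blueprint can be thought of as a parameterized template which, combined with site-specific configuration, yields a new system instance. The sketch below is an assumption for illustration only; the field names do not reflect the actual CIS blueprint schema.

    # Hypothetical blueprint for a flood EWS (illustrative field names).
    flood_ews_blueprint = {
        "parts": ["dike-stability-monitoring", "flood-simulation"],
        "appliances": ["ai-anomaly-detection", "hrw-reliable"],
    }

    def instantiate(blueprint, site_config):
        """System-factory sketch: a new EWS instance is the blueprint
        plus site-specific parameters (dike properties, sensor URI...)."""
        return {**blueprint, **site_config, "status": "deploying"}

    instance = instantiate(flood_ews_blueprint, {
        "dike_properties": {"length_m": 700, "crest_height_m": 4.2},
        "sensor_data_uri": "http://example.org/anysense/dike-42",
    })
    print(instance["status"])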
B. CIS components overview

Fig. 2 depicts the current internal architecture of CIS. The main components of the CIS technology stack are as follows:

• Integration platform service (PlatIn): CIS core technologies responsible for integrating loosely-coupled application elements into an operational early warning system which is deployed in a dynamic cloud environment. Additionally, it delivers an archetype implementation for system parts and a container where such services can be installed. The system parts can use Enterprise Integration Patterns [15] or BPEL to communicate with basic services.
• Metadata registry (UFoReg): a central piece of the CIS architecture, responsible for keeping its internal state. It uses a persistent, fault-tolerant database solution to record all actions of CIS components and maintain cloud monitoring records in order to deliver timely and complete information for other elements of the system. It stores and publishes metadata on appliances, configurations, early warning system blueprints and running instances. It also records live measurements of the current state of the cloud infrastructure.
• Dynamic resource allocation service (DyReAlla): a service for dynamic allocation of resources to running early warning systems. It accepts a list of services that are required by an EWS and ensures those demands are met and resource allocation is optimal. It is responsible for interfacing cloud infrastructures that host service instances. Furthermore, it collects data describing the status of the cloud and updates metadata in UFoReg.
• Self-monitoring service (ErlMon): a system leveraging an actor-based architecture for tracking the availability and health of all EWS components running within CIS deployments, as well as core CIS components. For each service that should be monitored a separate process (called an actor) is delegated to perform inspection. Monitoring data may originate from various sources, including JMS, JMX and REST endpoints. ErlMon builds and maintains a topology of all monitored resources, interpreting them as nodes in a directed acyclic graph connected by edges which denote depends-on relationships. Such an approach enables drill-down inspection of infrastructure status (see the sketch after this list). Being a crucial part of the system, ErlMon is based on the Erlang programming language and runtime platform, known for its use in mission-critical systems [16].
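The drill-down over the depends-on topology can be sketched as follows. The graph, component names and health states below are hypothetical, and ErlMon's actual data structures are Erlang terms rather than Python dictionaries.

    # Illustrative depends-on DAG: an EWS depends on its parts, which in
    # turn depend on the virtual machines hosting their appliances.
    depends_on = {
        "flood-ews":             ["dike-monitoring-part", "flood-simulation-part"],
        "dike-monitoring-part":  ["ai-appliance-vm"],
        "flood-simulation-part": ["hydrograph-vm", "rfsm-vm"],
    }
    health = {"ai-appliance-vm": "down", "hydrograph-vm": "up", "rfsm-vm": "up"}

    def drill_down(node):
        """Walk the DAG to locate failed leaf resources below a node."""
        children = depends_on.get(node, [])
        if not children:                       # a leaf resource
            return [node] if health.get(node) == "down" else []
        return [bad for child in children for bad in drill_down(child)]

    print(drill_down("flood-ews"))             # -> ['ai-appliance-vm']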
C. Early warning system lifecycle

The Common Information Space supports services for continuous monitoring of objects and phenomena under observation as well as resource-intensive scientific computations which are executed on demand as urgent tasks. These two types of services differ in terms of resource consumption, but their software lifecycle is handled in the same way in CIS.

The process of starting a new service from a blueprint is presented in Fig. 5. A start request may originate from either an EWS administrator or another system part. Once it is received by the CIS, appropriate components (appliances and system parts) are selected and deployed in the cloud infrastructure. Importantly, the start request contains an initial configuration for the application to be booted up. It is forwarded to the Integration Platform Service (PlatIn) which resolves component dependencies (required appliances and services) and connections between them. Subsequently, the application structure and dependent components are registered in the Monitoring System Service (ErlMon) which initiates monitoring of the newly registered components. Once this is done, PlatIn requests DyReAlla to provide the required appliances. In response to this request, the Dynamic Resource Allocation Service initializes the resource orchestration process. As a result, the infrastructure is updated to support the new service. Simultaneously, PlatIn initializes system parts, configuring connections with any dependent appliances. During this process appropriate events are reported and logged in the provenance system, which tracks the entire application lifecycle.

Fig. 5. Starting a new early warning system from its blueprint.
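The startup sequence can be condensed into a linear sketch in which every CIS component is reduced to a stub; all function names below are our own shorthand rather than actual CIS interfaces.

    # Sketch of the startup sequence of Fig. 5 (hypothetical signatures).
    provenance_log = []

    def log(event):
        provenance_log.append(event)           # provenance system stub

    def resolve_dependencies(blueprint):       # PlatIn
        return blueprint["parts"], blueprint["appliances"]

    def register_for_monitoring(components):   # ErlMon
        pass

    def allocate(appliances):                  # DyReAlla
        pass

    def initialize(part, config):              # PlatIn
        pass

    def start_service(blueprint, initial_config):
        parts, appliances = resolve_dependencies(blueprint)
        log("dependencies-resolved")
        register_for_monitoring(parts + appliances)
        log("monitoring-started")
        allocate(appliances)
        log("infrastructure-updated")
        for part in parts:
            initialize(part, initial_config)
        log("parts-initialized")

    start_service({"parts": ["flood-sim-part"], "appliances": ["rfsm-vm"]},
                  {"dike": "section-42"})
    print(provenance_log)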
Each running EWS instance has an associated importance level. When the alert level in the EWS rises, the associated importance level can also be increased. This results in reconfiguration of the computational infrastructure in order to provide additional resources. Similarly, when the crisis abates the infrastructure is reconfigured once again to return to its normal state. Details about this process are presented in Section IV.

IV. SUPPORTING MISSION-CRITICAL SCIENTIFIC COMPUTING

Early warning systems are characterized by strict requirements regarding their operation:

• Highly variable resource demands. Early warning systems operate in different modes depending on the "alert level". Resource consumption may vary from minimal in a default 'monitoring' mode to extreme in an 'emergency' mode.
• Urgent computations. Required resources must be available immediately in order to meet strict deadlines.
• Mission-critical function. The system must not fail or else a potential threat may not be detected. Consequently, the system must have advanced reliability and self-healing capabilities.

In order to provide these features it is crucial to support appropriate resource orchestration and continuously monitor the status of the infrastructure. The Common Information Space provides the necessary mechanisms to meet these goals.
A. Resource orchestration

The demand for computing power in early warning systems is highly variable and unpredictable. Therefore, using cloud infrastructures for hosting such systems appears to be a reasonable solution. We leverage the computing infrastructure of the Polish National Grid infrastructure (PL-Grid), managed by a cloud computing middleware. Exercising effective control over the deployment and operation of many EWSes across multiple cloud sites would be very difficult, if not impossible, for a human user. Moreover, resources should be utilized in a cost-effective manner.

In the CIS environment resource orchestration is performed by the DyReAlla service. The central component of DyReAlla is a manager which accepts requests from PlatIn and supervises the process of provisioning resources. When an EWS is started, the manager receives a request which includes an abstract specification of the required appliances as a list of application types. The manager then queries UFoReg to find all resources which can be used to meet the requirements of the EWS being started. In order to utilize resources in an efficient way, the manager asks the optimizer to prepare a set of allocation actions that will result in satisfying the demands of all running early warning systems in terms of required services and their performance, while keeping the amount of allocated resources as low as possible (to minimize costs). The optimizer goes through a list of all early warning systems, analyzes the available service instances and resources that can be started, and responds with actions that may include reusing service instances that can accept higher load, or requests to start additional virtual machines with specific services. The manager then uses cloud clients dedicated for specific cloud infrastructures to execute allocation actions. It registers new virtual machines in ErlMon to indicate that they host service instances of a given type and that they are a part of a given early warning system. Once started, virtual machines expose an HTTP-based service which can be registered in a reverse proxy. There are two reasons for this action: first, to enable load balancing of stateless service requests [17] and second, to expose service instances hosted on virtual machines with private IP addresses. Finally, the manager updates metadata in UFoReg.
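A greatly simplified, greedy version of the optimizer's allocation step might look as follows; this is an illustrative sketch under our own assumptions, as the paper does not specify the actual optimization algorithm.

    # demands: {service_type: instances needed}
    # running: {service_type: [(vm_id, spare_capacity), ...]}
    def plan_allocation(demands, running):
        actions = []
        for service, needed in demands.items():
            for vm_id, spare in running.get(service, []):
                if needed == 0:
                    break
                take = min(spare, needed)      # reuse spare capacity first
                if take:
                    actions.append(("reuse", vm_id, take))
                    needed -= take
            for _ in range(needed):            # then start new VMs
                actions.append(("start-vm", service, 1))
        return actions

    print(plan_allocation({"rfsm": 3}, {"rfsm": [("vm-7", 1)]}))
    # -> [('reuse', 'vm-7', 1), ('start-vm', 'rfsm', 1), ('start-vm', 'rfsm', 1)]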
Handling a change in the importance level is depicted in Fig. 6. A similar algorithm is applied here. When the manager receives a request, it triggers optimization and allocation actions produced by the optimizer. It registers new virtual machines or unregisters those which have been stopped, and updates metadata in the internal registry.

Resource orchestration is also important for the fault tolerance feature. If ErlMon reports a service instance as malfunctioning or a virtual machine as being down, DyReAlla restarts the virtual machine to render it operational again. Additionally, if a virtual machine is registered in the HTTP proxy, DyReAlla unregisters it and registers it again when it comes back online.

Fig. 6. Changing process importance level.

B. Self-monitoring

In order to enable reliability, availability and self-healing capabilities, information about the overall state of the system as well as its individual components is essential. This information is provided by ErlMon – the CIS component described previously. ErlMon cooperates closely with DyReAlla to provide self-healing capabilities. It constantly monitors the state of virtual machines started by the resource orchestration service to detect problems in their operation. Should a problem occur, e.g. a virtual machine shuts down unexpectedly, ErlMon immediately informs DyReAlla about this event via a dedicated interface – a crucial action in the context of ensuring reliability.

The self-monitoring service is also needed when updating and reorchestrating the infrastructure. When told by DyReAlla that a new appliance will be started, ErlMon spawns a new probe which listens to its status. Once the instance is detected as running, the update operation is considered complete. Last but not least, ErlMon keeps track of the status of core CIS components including the Integration Platform, DyReAlla and the message passing services used by system parts to communicate with basic services. This is crucial since, should any of these components fail, the early warning systems using them would be rendered inoperable.

Information about the cloud infrastructure comes from two different sources. First, DyReAlla informs other components about any new or obsolete appliances. The other source of information is the cloud stack itself – the same one DyReAlla uses to orchestrate resources. This gives ErlMon up-to-date information concerning the state of the cloud and allows it to expose this information to DyReAlla or CIS administrators.

Due to its role in the CIS environment, ErlMon has to be exceptionally reliable and robust. In order to ensure this, ErlMon was developed as a completely standalone service that does not depend on any other CIS component. Its internal structure is also resilient: there is a separate process for each monitored service – if one breaks down, the others will continue to function. Finally, ErlMon can be distributed over multiple hosts to minimize the effects of hardware or network failures.
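The one-monitor-per-service isolation principle can be sketched in a few lines. We use Python threads purely for illustration, whereas ErlMon relies on lightweight Erlang processes; all names below are hypothetical.

    import threading
    import time

    def monitor(service, check, on_failure):
        """Each service gets its own monitoring loop; an exception in
        one probe does not disturb the monitors of other services."""
        def loop():
            for _ in range(3):                 # bounded so the demo ends
                try:
                    if not check(service):
                        on_failure(service)
                except Exception as exc:       # isolate a broken probe
                    print(f"probe for {service} failed: {exc}")
                time.sleep(0.1)
        threading.Thread(target=loop).start()

    for svc in ["platin", "dyrealla", "message-bus"]:
        monitor(svc,
                check=lambda s: s != "dyrealla",   # pretend one is down
                on_failure=lambda s: print(f"{s} is down, notifying DyReAlla"))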
C. Provenance and tuning

Apart from the above-mentioned features, CIS provides one additional benefit: it tracks and stores the provenance of early warning system execution. This data contains, among others, startup and stopping times of invoked EWSes as well as changes in their importance level. While this information is not used by resource orchestration mechanisms directly, it can help analyze the actual usage of early warning systems and fine-tune optimization policies. For example, provenance may show that a given EWS is started and stopped very often. As a result, the CIS administrator may decide that this particular instance should be kept running at all times, with some limited resources assigned to it on a permanent basis.
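A provenance trail of this kind can be as simple as a timestamped event list; the record layout below is a hypothetical illustration, not the actual CIS provenance schema.

    # Hypothetical provenance records for one EWS instance.
    events = [
        {"ews": "flood-ews-42", "event": "started", "at": "2013-05-01T08:00"},
        {"ews": "flood-ews-42", "event": "stopped", "at": "2013-05-01T09:00"},
        {"ews": "flood-ews-42", "event": "started", "at": "2013-05-01T11:00"},
        {"ews": "flood-ews-42", "event": "stopped", "at": "2013-05-01T12:00"},
    ]

    # The usage analysis hinted at above: an EWS that is restarted
    # frequently may be worth keeping permanently resident.
    starts = sum(1 for e in events if e["event"] == "started")
    if starts >= 2:
        print("flood-ews-42 restarts often; consider keeping it running")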

D. Overheads and scalability


In order to support urgent computing, resource orchestration
mechanisms must introduce as little overhead as possible. The
performance of DyReAlla and UFoReg components is crucial Fig. 7. High-level workflow in the Flood Early Warning System. Different
application scenarios detect anomalies in embankments, compute the risk
for the whole process. Tests indicate that it takes about a of dike failure, and predict its impact. More in-depth analysis is enabled
minute to start an early warning system requiring two services. gradually, either automatically or based on a human decision.
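A toy version of such a scaling decision is a simple threshold policy; the thresholds and the policy itself are illustrative assumptions, not DyReAlla's published behavior.

    # Hypothetical load-based scale-up/scale-down rule.
    def scaling_action(avg_load, instances, high=0.8, low=0.2):
        if avg_load > high:
            return "start-vm"
        if avg_load < low and instances > 1:   # keep at least one instance
            return "stop-vm"
        return "no-op"

    print(scaling_action(0.9, instances=2))    # -> start-vm
    print(scaling_action(0.1, instances=2))    # -> stop-vm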
V. CASE STUDY: FLOOD EARLY WARNING SYSTEM

The goal of the CIS-powered Flood Early Warning System (Flood EWS) is to leverage in-situ sensors for monitoring of embankments in urban areas in order to support decision making when the possibility of a dike failure results in a flooding threat. The Flood EWS contains several application scenarios which serve three goals – anomaly detection, risk assessment and impact prediction – and are enabled gradually as the "alert level" rises. The overall workflow of the Flood EWS is shown in Fig. 7. Measurements from sensors deployed in dike sections are exposed through the AnySense service. In the default mode ("Alert level 0") sensor data is continuously analyzed by the AI module implementing machine learning-based algorithms for detection of anomalies in sensor signals (AI Anomaly Detection) [18]. When the likelihood of an anomaly becomes high, another level of analysis is automatically enabled ("Alert level 1") which computes the risk of dike failure based on current sensor measurements (Reliability Analysis). Again, when the result of this analysis indicates high risk, two simulations can be performed: the first one covers the flooding of the threatened area in case the dike fails; the second one predicts evacuation patterns and loss of life in the event of a flood.

Fig. 7. High-level workflow in the Flood Early Warning System. Different application scenarios detect anomalies in embankments, compute the risk of dike failure, and predict its impact. More in-depth analysis is enabled gradually, either automatically or based on a human decision.

A separate tool is the Virtual Dike simulator which implements the computational model of a dike and can be used for advanced experiments, e.g. evaluation of different breach scenarios [1].

Table I summarizes application scenarios in the Flood EWS along with their implementation details (services, their associated appliances, input data and generated results).
TABLE I
FLOOD EARLY WARNING SYSTEM APPLICATION SCENARIOS AND THEIR IMPLEMENTATION

Application scenario: Dike anomaly analysis – Artificial Intelligence-based detection of anomalies in sensor signals.
Associated service (part): None.
Associated appliances (provided by): AI (machine learning) algorithms for detection of anomalous sensor signals (Siemens).

Application scenario: Dike stability monitoring – computation of the risk of dike failure based on current sensor readings.
Associated service (part): Event-driven: monitors the likelihood of an anomaly as reported by the AI appliance; when that likelihood exceeds a threshold, raises the alert level and invokes the HRW Reliable appliance in order to compute the probability of dike breach; when this probability is also high, raises the alert level again.
Associated appliances (provided by): HRW Reliable: computation of reliability of flood defenses based on fragility curves (HR Wallingford).

Application scenario: Flood simulation – prediction of flooding of an area due to dike breach.
Associated service (part): Request-driven: consumes simulation requests and invokes two appliances in order to simulate inundation caused by dike breach.
Associated appliances (provided by): (i) HRW Hydrograph: computation of water flow through a breach in the dike (HR Wallingford); (ii) HRW Dynamic RFSM: fast flooding spread model (HR Wallingford).

Application scenario: Life safety simulation – prediction of evacuation patterns and life loss in case of flooding.
Associated service (part): Request-driven: consumes simulation requests and invokes the Life Safety appliance in order to compute evacuation behavior and life loss due to inundation.
Associated appliances (provided by): Life Safety Model: implements simulation of evacuation patterns and life loss prediction (University of Amsterdam).

Application scenario: Virtual dike simulation – computational model of a dike.
Associated service (part): Request-driven: invokes virtual dike simulation; retrieves and translates archive sensor data requested by the simulation and sends it to the VD appliance.
Associated appliances (provided by): Virtual Dike simulation appliance (University of Amsterdam).

Fig. 8 shows the high-level architecture of the Flood EWS as implemented in CIS. The default anomaly analysis is realized directly by the AI Monitoring appliance, trained for a particular dike, and deployed in the cloud. It consumes sensor data and periodically publishes messages containing the likelihood of an anomaly. All remaining application scenarios are implemented as services (system parts). The Dike Stability Monitoring Part is event-driven. It continuously consumes anomaly probabilities produced by the AI appliance and, when the probability exceeds a designated threshold (whose value is part of the Part's configuration), it publishes an 'Alert level change' message for the dike section where the anomaly has been detected (which results in changing the color of the section in the system control center GUI). Subsequently, the Part consumes sensor readings from the affected section of the dike and invokes the Dike Reliability Part in order to compute the probability of dike failure. When this probability again exceeds a given threshold, the alert level is raised again. Reliability analysis is disabled when the AI appliance no longer reports a high likelihood of an anomaly for the given dike section. The remaining three simulation parts are request-driven: they are invoked directly from the system control center.

Fig. 8. High-level architecture of the Flood Early Warning System powered by the Common Information Space.
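The behavior of the Dike Stability Monitoring Part condenses into a small event handler; the thresholds, message fields and callback names below are illustrative assumptions rather than the Part's actual configuration.

    # Sketch of the event-driven Dike Stability Monitoring logic.
    ALERT_THRESHOLD = 0.7          # part of the Part's configuration
    FAILURE_THRESHOLD = 0.5

    def on_anomaly_message(msg, publish, invoke_reliability_part):
        if msg["likelihood"] > ALERT_THRESHOLD:
            publish({"type": "alert-level-change", "section": msg["section"]})
            p_failure = invoke_reliability_part(msg["section"])
            if p_failure > FAILURE_THRESHOLD:
                publish({"type": "alert-level-change",
                         "section": msg["section"],
                         "reason": "high probability of dike failure"})

    on_anomaly_message(
        {"section": "dike-42", "likelihood": 0.9},
        publish=print,                                 # stand-in for the bus
        invoke_reliability_part=lambda section: 0.6,   # stand-in for the Part
    )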
In order to highlight the most important aspects of design and implementation of CIS-based EWSs, let us examine more thoroughly the implementation of the Flood Simulation scenario blueprint. Fig. 9 gives a detailed overview of resources involved in the Flood Simulation application scenario. The entry point for the scenario is the Flood Simulation Part (a WPS-compliant service). Actual computations are handled by two appliances encapsulating computational models provided by HR Wallingford: Hydrograph (computation of water volumes flowing through a breach) and Dynamic RFSM (rapid flood spreading algorithm) [1]. The Flood Simulation service is designed so that some input data can be provided as links to external data services, implemented in compliance with OGC standards: Web Coverage Service (WCS) and Web Feature Service (WFS). This applies to terrain elevation data and data describing dike parameters, geometry, etc. In addition, the output inundation data is archived and exposed through a WCS service.

Fig. 9. Implementation of the Flood Simulation application scenario.
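Conceptually, a request to the Flood Simulation Part passes some inputs by value and others by reference to external OGC services. The sketch below shows the shape of such a request; the URLs and field names are fictitious, and the dictionary is a simplification rather than the literal WPS XML encoding.

    # Schematic flood-simulation request with by-reference inputs.
    flood_simulation_request = {
        "process": "FloodSimulation",
        "inputs": {
            "breach_location": {"value": {"lat": 52.37, "lon": 4.89}},
            "terrain_elevation": {"reference":            # served by a WCS
                "http://example.org/wcs?SERVICE=WCS&REQUEST=GetCoverage"},
            "dike_geometry": {"reference":                # served by a WFS
                "http://example.org/wfs?SERVICE=WFS&REQUEST=GetFeature"},
        },
        "outputs": {"inundation": {"store": True}},       # archived via WCS
    }

    def execute(request):
        """Stand-in for the WPS front end: returns a URI to poll, cf. the
        long-running request pattern of Section III."""
        return "http://example.org/results/flood-sim-001"

    print(execute(flood_simulation_request))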
Such compliance with standards increases the interoperability and reusability of the Flood Simulation service blueprint. While currently we provide our own in-house implementation of both WCS services (the WFS service is not currently implemented), in the future many spatial data sets owned by public organizations in Europe should be exposed as services, as dictated by the INSPIRE directive [19]. Our blueprint provides the opportunity to instantly leverage such services in order to perform flooding simulations at a selected location.

Another important service is the Data archiver whose task is to publish the simulation output to the inundation data repository. To this end, the inundation data sets are published as a series of messages to the message bus. The data archiver subscribes to these messages, receives them and updates the inundation repository. A system control center retrieves the inundation data from the WCS service and visualizes it on the map of the affected area. It is interesting to note that loose coupling through the message bus eliminates direct dependencies between services. We might change the configuration of the system control center so that it would consume the inundation data directly from the message bus and disable the Data archiver without breaking the system. Similarly, a new service could be connected to the message bus at any time, consume messages, and perform additional tasks without affecting other services.
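This loose coupling is easy to picture as a publish-subscribe interaction. In the minimal sketch below (ours, with a toy in-process bus and invented message fields) the Data archiver is just one of several independent subscribers.

    # The archiver and the control center are independent subscribers.
    class Bus:
        def __init__(self):
            self.subs = []
        def subscribe(self, callback):
            self.subs.append(callback)
        def publish(self, msg):
            for callback in self.subs:
                callback(msg)

    repository = []
    def data_archiver(msg):                    # updates inundation repository
        repository.append(msg)

    bus = Bus()
    bus.subscribe(data_archiver)
    bus.subscribe(lambda msg: print("control center renders", msg["cell"]))
    bus.publish({"cell": (10, 12), "depth_m": 1.4})
    print(len(repository), "inundation data set(s) archived")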
VI. CONCLUSION

We presented the Common Information Space, a service-oriented software framework for early warning systems guarding against natural disasters. CIS supports both EWS development and execution. It provides a framework for creating blueprints for complete early warning systems and individual EWS application scenarios. Furthermore, these blueprints can be deployed as services and used as factories for new instances of early warning systems or individual services performing specific tasks. CIS also provides a set of runtime services for execution of EWSes as mission-critical applications. Runtime support includes self-monitoring of EWSes, detection of and recovery from failures, and optimization of resource allocation, leveraging cloud infrastructures.

Early warning systems feature characteristics of enterprise applications (e.g. continuous monitoring, mission-critical function, ad-hoc scenarios) and scientific applications (resource-intensive computations). Consequently, in the CIS computing environment we combined architectural patterns, industry standards and technologies commonly found in high-quality enterprise applications with computing infrastructure and resource orchestration algorithms typical for e-Science.

The capabilities of the CIS framework have been validated for the Flood Early Warning System. However, the scope of CIS applicability extends beyond early warning systems for natural disasters. In fact, CIS is suitable for any complex mission-critical system leveraging resource-intensive computations.

ACKNOWLEDGMENT

The results described in this paper have been made possible thanks to access to the computing resources of the Polish National Grid infrastructure (PL-Grid, http://www.plgrid.pl), which were used by the authors to run software components of the Common Information Space and the Flood Early Warning System, as well as to perform related experiments. The software described in this paper was developed within the EU FP7 UrbanFlood project (http://dice.cyfronet.pl/products/cis). This work is partially supported by the European Union Regional Development Fund, POIG.02.03.00-00-096/10, as part of the PLGrid Plus Project. AGH Grant 11.11.230.015 is also acknowledged.

REFERENCES

[1] B. Balis, M. Kasztelnik, M. Bubak, T. Bartynski, T. Gubała, P. Nowakowski, and J. Broekhuijsen, "The UrbanFlood Common Information Space for Early Warning Systems," Procedia Computer Science, vol. 4, pp. 96–105, 2011, Proceedings of the International Conference on Computational Science, ICCS 2011.
[2] V. Krzhizhanovskaya, G. Shirshov, N. Melnikova, R. Belleman, F. Rusadi, B. Broekhuijsen, B. Gouldby, J. Lhomme, B. Balis, M. Bubak, A. Pyayt, I. Mokhov, A. Ozhigin, B. Lang, and R. Meijer, "Flood early warning system: design, implementation and computational modules," Procedia Computer Science, vol. 4, pp. 106–115, 2011, Proceedings of the International Conference on Computational Science, ICCS 2011.
[3] T. Srinivasa Kumar, "Implementation of the Indian National Tsunami Early Warning System," in Fostering e-Governance: Selected Compendium of Indian Initiatives, Pune, India, 2009, pp. 380–391.
[4] ANSS Technical Integration Committee, Technical Guidelines for the Implementation of the Advanced National Seismic System. U.S. Department of the Interior, U.S. Geological Survey, 2002.
[5] W. de Groot, J. Goldammer, T. Keenan, M. Brady, T. Lynham, C. Justice, I. Csiszar, and K. O'Loughlin, "Developing a global early warning system for wildland fire," Forest Ecology and Management, vol. 234, suppl. 1, p. S10, 2006.
[6] N. Trebon, "Enabling urgent computing within the existing distributed computing infrastructure," Ph.D. dissertation, The University of Chicago, 2011.
[7] J. MacLaren and M. Mc Keown, "HARC: A highly-available robust co-scheduler," submitted to the 5th UK e-Science All Hands Meeting, 2006.
[8] W. Dubitzky, K. Kurowski, and B. Schott, Eds., Large-Scale Computing Techniques for Complex System Simulations, 1st ed., 2011.
[9] S. Marru, D. Gannon, S. Nadella, P. Beckman, D. B. Weber, K. A. Brewster, and K. K. Droegemeier, "LEAD cyberinfrastructure to track real-time storms using SPRUCE urgent computing," CTWatch Quarterly, vol. 4, no. 1, 2008.
[10] J. Cope and H. Tufo, "Adapting grid services for urgent computing environments," in Proceedings of the 2008 International Conference on Software and Data Technologies (ICSOFT 2008), 2008, pp. 135–142.
[11] A. Cencerrado, M. A. Senar, and A. Cortés, "Support for urgent computing based on resource virtualization," in Proceedings of the 9th International Conference on Computational Science: Part I, ser. ICCS '09. Berlin, Heidelberg: Springer-Verlag, 2009, pp. 227–236.
[12] "High performance computing on AWS." [Online]. Available: http://aws.amazon.com/hpc-applications/
[13] C. Vecchiola, S. Pandey, and R. Buyya, "High-performance cloud computing: A view of scientific applications," in Pervasive Systems, Algorithms, and Networks (ISPAN), 2009 10th International Symposium on. IEEE, 2009, pp. 4–16.
[14] P. Schut, "Open Geospatial Consortium Inc. OpenGIS Web Processing Service," Open Geospatial Consortium, pp. 1–87, 2007.
[15] G. Hohpe, Enterprise Integration Patterns: Designing, Building, and Deploying Messaging Solutions. Boston: Addison-Wesley, 2004.
[16] M. Logan, E. Merritt, and R. Carlsson, Erlang and OTP in Action. Manning Publications Co., 2010.
[17] J. Dabrowski, S. Feduniak, B. Balis, T. Bartynski, and W. Funika, "Automatic Proxy Generation and Load-Balancing-based Dynamic Choice of Services," Computer Science, vol. 13, no. 3, pp. 45–59, 2012.
[18] A. Pyayt, I. Mokhov, B. Lang, V. Krzhizhanovskaya, and R. Meijer, "Machine Learning Methods for Environmental Monitoring and Flood Protection," World Academy of Science, Engineering and Technology, vol. 78, pp. 118–124, 2011.
[19] "INSPIRE Directive," http://inspire.jrc.ec.europa.eu/.
