
Special Issue on Ubiquitous Computing Security Systems

APPROACH TO DETERMINING AN EXTERNAL PROBLEM FOR SELF-HEALING

Jeongmin Park, Joonhoon Lee, Hyunsang Youn, and Eunseok Lee


School of Information and Communication Engineering
Sungkyunkwan University Suwon 440-746, South Korea
jmpark@ece.skku.ac.kr, trsprs@ece.skku.ac.kr, wizehack@ece.skku.ac.kr, eslee@ece.skku.ac.kr

ABSTRACT
Self-healing is a methodology for constructing systems that detect faults and
return from an abnormal state to a normal state by themselves. Much attention
has recently been focused on the self-healing ability to recognize problems
arising in a target system. However, providing self-healing functionality
imposes considerable extra work, such as analyzing the target system and
analyzing the system environment for external problems. This paper therefore
proposes using the UML deployment diagram to determine problems arising in the
external environment. The deployment diagram is widely used to specify the
resources of a system and is generally produced in the system design phase. The
approach consists of 1) analyzing the associations between software and
hardware; 2) generating a monitor from the constraints in the deployment
diagram; and 3) adding the monitor to the component after adapting it to the
specific software architecture. As a proof of concept, we automatically
generated a resource monitor and applied it to a video conference system. We
use this example to illustrate how the method detects anomalies.

Keywords: External problem, Problem Determination, External state

1 INTRODUCTION

The complexity of the software execution environment poses new challenges for
software developers. When computer systems operate abnormally, detecting and
resolving the problem requires much time and effort. Software should therefore
adapt without human intervention, that is, it should have a self-healing
ability. Self-healing is concerned with the ability of a system to recover
automatically from faults [1,2].

Self-healing components have been the subject of several studies. To construct
systems that facilitate self-healing, Shin et al. [3,4,5] propose a
self-healing component architecture. From the system's point of view, faults
can be divided into two types: faults that occur in the software and faults
that arise from resources such as CPU usage, RAM usage, and bandwidth. However,
that architecture does not address faults arising from resources. The monitor
for self-healing in the healing layer must be implemented by the developer,
which requires additional effort in the software development process.

In this paper, we describe an approach that generates the resource monitor
automatically from a UML deployment diagram. The approach consists of the
following steps:

- Analyzing the associations between software and hardware in the UML
  deployment diagram of a component.
- Generating the resource monitor from the constraints specified by the
  designer in the diagram.
- Adding the monitor to the component after adapting it to the component's
  structure.

With our approach, a resource monitor can be generated automatically from the
deployment diagram of a target system. This is useful when implementing the
resource monitor for a component because it reduces the additional work of
monitoring the resources. Component developers only need to modify parts of the
automatically generated monitor to adapt it, and they can easily add healing
strategies to it. To illustrate the approach, we applied our method to a video
conference system. The monitor generated by our method worked correctly when a
resource problem occurred.

The next section of the paper describes related work. Section 3 presents the
approach in more detail. Section 4 presents the evaluation of the approach. The
paper ends with a summary in Section 5.

UbiCC Journal Volume 4 670



2 RELATED WORK

In this section, we present a self-healing component architecture [3,4,5] and
the autonomic failure-detection algorithm [6], which is one of the failure
detection methods.

2.1. Layered software architecture for a self-healing component

Each self-healing component consists of a healing layer and a service layer
[3,4,5]. The service layer performs tasks requested by another task or
component in the system. It contains active objects, connectors, and passive
objects, which are accessed by active objects. An active object can invoke
another active object or a passive object. In contrast, a passive object is
only ever called by an active object; it cannot run independently. The
connectors transfer messages to or from tasks and synchronize them.

When the healing layer decides that an object in the service layer of the
component has become sick, it launches the healing process via the connectors.
The healing layer is composed of the following six objects.

Component Monitor: This module observes the behavior of each object through
messages from the connectors in the service layer.

Component Reconfiguration Plan Generator: This module produces reconfiguration
plans for use when a fault occurs in the service layer. It also holds
information about the objects in the service layer.

Component Repair Plan Generator: This module constructs self-healing strategies
for faulty objects. It has recovery plans for each object in the service layer.

Component Reconfiguration Executor, Component Repair Executor: These modules
execute the plans generated by the plan generators.

Component Self-healing Controller: This module controls the five modules above.

This architecture has the following features.
- The architecture can identify an object with faults.
- Healing strategies for each object are pre-made.
- The architecture does not allow detailed mistakes.
- Only faults that occurred in the component can be detected.

2.2. Autonomic Failure-Detection Algorithm

Mills et al. [6] proposed an algorithm that detects failures automatically. In
this approach, the objects and devices to be observed periodically send a
signal to the monitor, much like a human heartbeat. The monitor can manage many
components. It determines whether an object or device has a problem by checking
the signal over time. Let $H_p$ represent the period of the signal. The maximum
time for detecting a fault is then also $H_p$. Since faults can occur at any
time during the signal period, the average time for detecting a fault is
$H_p / 2$. The algorithm can therefore identify whether an object has a problem
in a very short time. However, it incurs overhead because it requires frequent
communication to exchange the signal between the monitor and the objects.

3 PROPOSED APPROACH

In this paper, we present an improved self-healing component architecture that
can recover from resource problems. We do not focus on internal problems,
because these are covered by Shin et al. [3,4,5]. The resource in this case may
be independent of the software. The monitor measures the state of the resources
periodically and decides whether self-healing policies should be applied. For
this, we use a modified heartbeat algorithm in which the monitor sends the
signal to the resources. Through this mechanism, the resource monitor can
measure values and detect anomalies.

3.1. Architecture for generating resource monitor

The architecture can be divided into an analyzing phase and a generation phase.
Figure 1 illustrates the flow through the structure.

UML Deployment Diagram: This is the input of the architecture. The diagram is
transformed into XMI (XML Metadata Interchange) [7,8].


XMI Parser: The XMI parser analyzes the resource constraints and the
associations with each resource. In the analyzing phase, the outputs are the
monitoring targets and their constraints. These outputs are written in XML
format.

Monitor Template Generator: The monitor template generator uses the output of
the XMI parser. It generates a monitor template, which detects problems in the
devices or resources selected for monitoring. The template is implemented in a
specific programming language.

Configuring: The monitor template code needs to be modified for adaptation. The
software developer configures it for the structure of the software.

Resource Monitor: The resource monitor generated by the approach can be added
to the software directly.

Figure 1: Architecture for generating resource monitor

3.2. Process of approach

We present the process, which is composed of four steps, in this section
(Fig. 2).

Step1 - Specifying the system using a UML deployment diagram: Initially, the
software developer creates a deployment diagram (Fig. 3). The deployment
diagram represents a static aspect of the system in the UML design model and
illustrates the associations among components. The constraints proposed within
the method are shown in Table 1. Method means the linking technique of the
network or physical devices, which is used for detecting abnormal terminations.
Duration means the time allowed until a fault is detected; it can also be
regarded as the waiting time in the method. Its initial value is 1 second. This
value was used as the setting for the experiments and can be changed for any
system environment.

Table 1: Constraints List
Contents    Input                               Unit
CPU usage   0.0 ~ 1.0                           Percent
Mem usage   0.0 ~ 1.0                           Percent
Heartbeat   0.1 ~ 1.0                           Second
Bandwidth   User defined minimum bandwidth      KB/s
Method      User defined connection type
Duration    Duration time for detecting fault   Second

Step2 - Analyzing diagram: First, the nodes (for example, client and server) in
the system are identified. Next, the constraints on resources, such as those on
CPU, memory, bandwidth, and heartbeat rate, are identified. The Parsing Engine
parses the XMI information (Fig. 4) and generates XML describing these two
types of information.

Figure 2: Process of approach (4 steps)

Step3 - Generating monitor template: In this step, the template for an
executable resource monitor is generated from the information analyzed in the
previous step. The Template Generator (TG) generates the monitor by analyzing
the XML produced by the Parsing Engine. It also generates fault processing and
anomaly detection routines for each constraint (Fig. 2).

Figure 3: Deployment diagram example

Figure 4: XMI information and constraints model derived from a deployment
diagram

Step4 - Composing monitor: In this step, a developer modifies the resource
monitor according to the software environment. The fault processing handlers or
guidelines are already produced by the approach at the monitor template
generation level. The developer also customizes the parts that are needed and
the parts that must be modified. Afterwards, the resource monitor is added to
the self-healing layer or component.

3.3. Problem detection algorithm

In this section, we describe how the autonomic failure-detection algorithm was
adapted to our approach (Fig. 5). The resource monitor in the self-healing
layer judges the state of the system as abnormal if it sends a signal to a
device or resource and no reply returns within the period. The state is also
regarded as abnormal if the value of a resource violates a constraint. In that
case, the self-healing layer should construct a reconfiguration plan and
perform it. Unlike related work, the maximum and average detection latencies,
Lmax and Lavg, are 1.5 times longer than before because the monitor sends the
signal first. If a fault occurs just after the resource has replied to the
monitor, the monitor determines that the resource is still in the normal state.
It then sends the signal for a new cycle to the resource. However, the resource
is actually faulty, so that cycle is wasted. Therefore, our approach takes more
time to detect faults than related work.

Figure 5: Error detection algorithm

3.4. Self-healing components including resource monitor

Resource monitoring is illustrated in Fig. 6. The device and the self-healing
component architecture feature resource monitoring. The modified architecture
adds devices and resource monitoring to the self-healing component architecture
designed by Shin [2, 3].
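The detection rule of Section 3.3, which the resource monitor in the
self-healing layer applies, can be sketched in a few lines. This is a minimal
Python illustration under stated assumptions, not the generated C# monitor used
in the paper: the names Constraint, Probe, and check_resource are hypothetical,
the probe callback stands in for a real sensor or network call, and the
thresholds are the ones reported in Table 2.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical probe type: asks a resource for its current value and
# returns None when no reply arrives within the given period (seconds).
Probe = Callable[[float], Optional[float]]


@dataclass(frozen=True)
class Constraint:
    """A resource limit in the spirit of Table 1 (field names are assumed)."""
    resource: str
    minimum: Optional[float] = None
    maximum: Optional[float] = None

    def violated_by(self, value: float) -> bool:
        below = self.minimum is not None and value < self.minimum
        above = self.maximum is not None and value > self.maximum
        return below or above


def check_resource(constraint: Constraint, probe: Probe, period_s: float) -> str:
    """Run one monitoring cycle: the monitor signals first, then judges."""
    value = probe(period_s)
    if value is None:
        # No reply came back within the period.
        return "abnormal: no reply within period"
    if constraint.violated_by(value):
        # A reply came back, but the measured value breaks the constraint.
        return "abnormal: constraint violated"
    return "normal"


# Thresholds as in Table 2; the lambdas below are stand-ins for real probes.
cpu = Constraint("cpu_usage", maximum=0.80)
bandwidth = Constraint("bandwidth_kb_s", minimum=50.0)

print(check_resource(cpu, lambda period: 0.35, 1.0))
print(check_resource(cpu, lambda period: 0.93, 1.0))
print(check_resource(bandwidth, lambda period: None, 1.0))
```

The three calls report a normal state, a constraint violation, and a missing
reply, in that order. Keeping the timeout check and the constraint check in one
function mirrors the two abnormal conditions named in Section 3.3; a real
monitor would repeat this cycle once per heartbeat period.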


Figure 6: Proposed self-healing component architecture

The architecture contains the six healing objects of the referenced component
architecture and adds three objects for monitoring resources and reorganizing:
the External Resource Monitor, the External Resource Reconfiguration Plan
Generator, and the External Resource Reconfiguration Executor.

The External Resource Monitor checks the status of external devices and
resources. The External Resource Reconfiguration Plan Generator makes
reorganization plans for the service layer in accordance with the external
situation. The External Resource Reconfiguration Executor executes the plans.

The purpose of the External Resource Reconfiguration Plan Generator is to make
plans that, by isolating the objects that are easily influenced by resources,
prevent other well-operating objects from being affected, similar to the
component reconfiguration plans.

The Self-healing Controller, which controls the objects in the self-healing
layer, directs the resource reconfiguration executor to reconfigure the service
layer. For external errors it works in the same way, and it handles anomalies
in the service layer by minimizing resource use.

4 IMPLEMENTATION AND EVALUATION

To evaluate the algorithm, we laid out the basic design of a video conference
system. The purpose of this system is to conduct a video conference
successfully: during a meeting, the client should not be interrupted by
problems external to the software. The purpose of the evaluation was to check
whether the client detected errors arising from such external problems after a
resource monitor had been generated automatically by the approach and applied
to the client of the video conference system.

Figure 7: Parsing Engine prototype

Figure 8: Template Generator prototype

4.1. Environments

To evaluate the approach, we implemented the clients of a video conference
system on .NET Framework 2.0, using C# on MS Windows XP. We used Borland
Together for UML modeling. The server was implemented with Java 2 SDK 1.4. The
client additionally used DirectShow.NET for the video device. The deployment
analyzer and the resource monitor template were also implemented in C#. Fig. 3
illustrates the deployment diagram that we used. Fig. 7 and Fig. 8 illustrate
the Parsing Engine and Template Generator prototypes.

4.2. Normal case

The resource monitor continues to monitor a resource as long as the resource
performs its work without any anomalies.

4.3. Abnormal case

The monitor detects an abnormal state when a measured value is outside the
normal range or the connection with another resource is terminated
unexpectedly. Figure 9 illustrates the case in which the CPU usage exceeded
80%. In this paper, we did not focus on self-healing strategies; the strategies
for healing the faulty state were therefore provided by the administrator.

Figure 9: Detection of CPU anomalies by the monitor

4.4. Objective of evaluation and the results

The purpose of the evaluation is to determine whether the approach recognizes
error situations within a designated time in the systems to which it is
applied, and to compare systems with and without the approach when errors occur
in the resources. To create extreme situations in the system, we used a
benchmarking program and forced termination of the server. Additionally, we
added a routine to the client that immediately reports the time at which an
error occurs, and a routine to the resource monitor that prints the time at
which the error is detected, so that the failure-detection latency could be
evaluated accurately. The detection results for the various constraints are
listed in Table 2. The error detection time for the CPU, measured 10 times, is
shown in Fig. 10.

Table 2: Experimental results of the monitoring
Check list           Constraints                      Success of detection
CPU usage            Max 80%                          Success
Memory usage         Max 70%                          Success
Bandwidth usage      Min 50KB/s                       Success
Network connection   Abnormal network determination   Success

Figure 10: Error detection time of resource monitor

As a result of the evaluation, the resource monitor detected violations for all
four items for which constraints were set. Even though the average fault
detection time varied, we were able to verify that the resource monitor
detected each fault within the maximum fault detection time.

5 CONCLUSION

This paper proposed an approach that reduces the effort required of a
self-healing developer and offered a software architecture that monitors the
available resources. The production of resource monitors can be automated by
using the deployment diagram. The advantages are listed below.


- The production of the resource monitor is automated.
- A strategy is in place for the case of faults in resources.

Until now, developers have had to spend extra effort implementing the monitor
that checks the resources used by the software. In this study, however, we
confirmed that resource monitors, which can be included in a self-healing
component, can be generated automatically from a deployment diagram. To
evaluate this, we built a prototype component and confirmed that the detection
monitor operated correctly when an abnormal situation occurred.

However, we could not overcome the high overhead, since signals must be
exchanged frequently if errors are to be detected. To solve this problem, a
study of self-regulating cycles for exchanging signals between monitors is
needed. The automation of self-healing strategies for recovering from faulty
states also remains future work.

6 ACKNOWLEDGEMENT
This work was supported by the Korea Science and Engineering Foundation (KOSEF)
grant funded by the Korea government (MEST) (No. 2009-0077453) and by the
Faculty Research Fund (2008) of Sungkyunkwan University. Corresponding author:
Eunseok Lee.

7 REFERENCES
[1] B. Topol, D. Ogle, D. Pierson, J. Thoensen, J. Sweitzer, M. Chow, M. A.
    Hoffmann, P. Durham, R. Telford, S. Sheth, T. Studwell, "Automating problem
    determination: A first step toward self-healing computing system", IBM
    white paper (2003).
[2] D. Ghosh, R. Sharman, H. R. Rao, S. Upadhyaya, "Self-healing - survey and
    synthesis", Decision Support Systems in Emerging Economies, Vol. 42,
    Issue 4, pp. 2164-2185 (2007).
[3] Michael E. Shin, "Self-healing component in robust software architecture
    for concurrent and distributed systems", Science of Computer Programming,
    Vol. 57, No. 1, pp. 27-44 (2005).
[4] Michael E. Shin and Jung Hoon An, "Self-Reconfiguration in Self-Healing
    Systems", Proceedings of the Third IEEE International Workshop on EASE'06,
    pp. 89-98 (2006).
[5] Michael E. Shin, Jung Hoon An, "Self-reconfiguration in self-healing
    systems", Proceedings of the 3rd IEEE International Workshop on EASE'06,
    pp. 106-116 (2006).
[6] K. Mills, S. Rose, S. Quirolgico, M. Britton, C. Tan, "An autonomic
    failure-detection algorithm", ACM SIGSOFT Software Engineering Notes,
    Vol. 29, Issue 1, pp. 79-83 (2004).
[7] G. Booch, J. Rumbaugh, I. Jacobson, "The Unified Modeling Language User
    Guide", Addison Wesley, pp. 100-150 (1999).
[8] XMI Online Document, http://www.omg.org/xml
