
Dependability and Complexity:

Exploring ideas for studying open systems


Full report

15th December 2000


Title: Dependability and Complexity: Exploring ideas for studying open systems. Full report.
Abstract: This report presents the results of an exploratory
study on dependability of large-scale information
infrastructures. Particular attention is placed on
complexity emanating from interdependencies
between information infrastructures and critical
applications. It formulates the problem and presents a
conceptual framework for better characterising
dependability and vulnerabilities. It also draws a
number of conclusions on open issues that need to
be further addressed.

Date: 15th December 2000

Authors: Nicholas Kyriakopoulos¹, Marc Wilikens

Distribution: Unlimited

The role of the Joint Research Centre (JRC) of the EC is to provide scientific
support to the EU policy-making process by acting as a reference centre of
science and technology for the EU. This report has been prepared by the Joint
Research Centre in the frame of its institutional support programme to the EC
DG Information Society in the area of Dependability of IT systems. The
opinions and views expressed in this report do not represent the official
opinions and policies of the European Commission.

We invite readers of this report to send comments or suggestions to:


For Joint Research Centre: Marc Wilikens (Marc.Wilikens@jrc.it)
For EC DG Information Society: Andrea Servida (Andrea.Servida@cec.eu.int)

A copy of the report can be downloaded from http://deppy.jrc.it

¹ Department of Electrical and Computer Engineering, The George Washington University,
Washington, DC – USA (Kyriak@seas.gwu.edu). The work was performed while the author
was a Visiting Scientist at the Institute for Systems, Informatics and Safety of the JRC.
TABLE OF CONTENTS

1 INTRODUCTION
1.1 Study objectives
1.2 Study approach
2 NATURE OF THE PROBLEM
2.1 Scenarios
2.2 Emerging issues
3 THE CHALLENGE
3.1 Definition of concepts
3.2 The challenge addressed in this study
4 DEPENDABILITY CHARACTERISATION
4.1 Infrastructure model
4.2 Data Transport Model
4.3 Dependability and Quality of Service
4.4 The “cloud” is not a cloud
4.5 Issues with respect to the dependability of applications
4.6 Issues with respect to the dependability of the communications infrastructure
4.7 Observations regarding dependability characterisation
5 CHARACTERISATION OF THREATS AND VULNERABILITIES
5.1 Threats, vulnerabilities and dependability
5.2 Security and Complex Open Systems
5.3 Threats to the application
5.4 Threats to the communications infrastructure
5.5 Observations related to threats and vulnerabilities
6 CONCLUSIONS
6.1 Findings on method
6.2 Open issues
6.3 Follow on
7 ACKNOWLEDGMENTS
8 REFERENCES
1 Introduction
1.1 Study objectives
The Internet has created a new environment for conducting human activities and
has given rise to the term Information Society. In this new environment, existing
economic and social activities are re-defined and new ones are born. At the same
time, concerns about the impact of the Internet on these activities cover a wide
spectrum of topics ranging from ethical to technical and scientific. Because activities
such as commerce, finance, energy, health care and personal information exchange
increasingly depend on the Internet, one area of interest is the ability of this
new medium to deliver services in a manner that inspires confidence in the users of
these services.
infrastructure that consists of interconnected communications networks and the data
services provided by them. The global span of this infrastructure and the activities
which use or depend on it form a complex system that needs to be better
understood for confidence to be established. The need for confidence is pervasive
not only in a business context but also in all social interactions.
The Joint Research Centre (JRC) has undertaken this exploratory study upon initial
request of the EC DG Infso (Information Society). It aimed at fostering a better
understanding of the issues related to dependability and vulnerabilities of
information infrastructures, in particular complexity issues emanating from
interdependencies between infrastructures. The aims of the study include: a) to
establish a methodology for addressing issues arising from the complexity of the
problem, b) to identify technical issues, or problems that affect the establishment of
confidence and c) to provide a basis for new ideas for studying those systems and
evaluating their performance.
This study addresses some important dependability challenges with high societal
impact within the realm of European Commission policy making². It is anticipated
that a better characterisation of the challenges will stimulate and help shape
appropriate follow-up activities within the relevant fora. These activities may take the
form of knowledge exchange on international initiatives, risks and concepts, or
R&D projects that could be supported in the frame of EU research programmes.
1.2 Study approach
According to the initial plan [2], the study would be performed by the JRC in
collaboration with a number of additional organisations which would provide case
studies. Participating organisations would contribute to the study by providing sector
case studies covering innovative architectures combined with real or prospective
scenarios of large-scale unbounded systems. On the basis of preliminary contacts,
the following case studies were identified:
• Investigation of vulnerabilities in the electric power system
• The provision of public IP services
• A generic global monitoring system (e.g. early warning for emergency
management, radiation monitoring, non-proliferation of nuclear material) based
on a combination of private channels and open communications infrastructures
• Health care virtual organisation relying on distributed systems and public
networks

² See also the e-Europe initiative and action plan 2002, approved in June 2000, with a specific
action on dependability of information infrastructures.
The approach to the collaborative project [3] would be to apply a common
methodology to each application and generate case-specific reports. The findings
from the case studies would be integrated into a report that would identify the cross-
sector aspects of the case-study findings. An initial meeting was held at the JRC in
Ispra, on 14 December 1999 to discuss the study plan and follow-up actions.
During the second meeting on 8 March 2000, held at the JRC in Ispra, some of the
collaborators proposed to take advantage of an existing opening for submitting
proposals to the European Commission under the Fifth Framework program of
European research, V.1.4 CPA4: Large Scale Systems Survivability. In order to
meet the deadline of 10 May 2000, action on the case studies and the collaborative
effort was deferred. Instead, the JRC undertook, first, to develop a conceptual
framework and a methodology for studying complex open systems and,
subsequently, to investigate the application of the methodology to specific cases.
This paper is the result of the study conducted at the JRC.
A paper [4] derived from the study was presented at the third Information
Survivability Workshop, held in Boston, Massachusetts, USA, on 24-26 October 2000.

2 Nature of the problem


2.1 Scenarios
The convergence of computer and communications technologies has created a
global infrastructure for the transmission and processing of information. This new
infrastructure is based on the ability of diverse network systems (e.g. mobile, fixed)
to be interconnected and provide global coverage for the transmission of data. The
Internet is a key component of this infrastructure. The defining characteristic of this
global infrastructure is the absence of central monitoring and control in contrast to
conventional computer communications systems. By its nature the Internet renders
national borders almost irrelevant with respect to the transmission of information. In
some cases, even the constituent sub-networks may span more than one State.
Another characteristic of this new information infrastructure is that it affects directly
a large number of people by providing high connectivity on a global scale. The
ability of individuals or small groups of people to exchange information directly
among themselves minimises the influence of certain social structures such as
political establishments, while it introduces influence by other establishments,
primarily the businesses that develop, operate and maintain this new infrastructure.
An additional significant characteristic of the new medium is the capability to
transport large amounts of data at extremely high speeds relative to the human time
scale. The combination of volume and speed overwhelms the human capacity to
comprehend easily the processes occurring in the infrastructure and to control
events or activities.
The combination of global coverage, high connectivity, and data volume and speed
has introduced major changes in the conduct of human activities through the
Internet. Some of these changes may be characterised as either evolutionary or
revolutionary. The overall result is that large sectors of the economy and of public
activity become interconnected through the Internet and dependent on its correct
functioning.

Electrical power system


An evolutionary change is the expansion of the scope and capabilities of existing
applications. Consider, for example, the generation and distribution of electricity,
whose management has been made possible by the establishment of monitoring and
control systems relying on complex communications systems. What started as a
single power plant feeding a local distribution network has evolved into transnational
and transcontinental grids consisting of interconnected diverse power networks. The
European electrical power system is now composed of a large number of
interconnected national grids. Each grid connects several tens or hundreds of
production sites with many millions of end-users. Each grid is controlled by a
complex, distributed automation system, usually operated by a number of regional
control centres, each one in charge of a broad region (e.g., Bavaria, Northern Italy,
Ile-de-France). Regional control centres are usually co-ordinated by a National Grid
Management Centre in charge of regulating energy trading and dispatching.
Electrical grids are vulnerable to many threats, both internal (e.g., subsystem faults)
and external (extreme weather conditions, earthquakes, sabotage and intentional
disruption). In addition to physical threats to the grid, there are threats concerning
the grid automation system itself, like software bugs, hardware malfunctions, cyber
attacks, etc. A local malfunction may impact quickly on the whole network, due to
the high level of interconnection. Through a cascading effect, a malfunction may
result in a sequence of events leading to faults of other systems and isolation of
generating plants due to the activation of protection systems, which may lead to
black-out in large areas.
A number of political, business and technological factors influence new types of
vulnerabilities of the electrical power system:
• Interconnections among different electrical power grids are being developed at
a fast pace, especially towards East European countries. Increased size and
heterogeneity of the grid increases its vulnerability.
• The European electrical power system is being restructured to a market-based
environment. Centralised planning and co-ordination, as practised by utilities in
a regulated environment, will be superseded. The increased number of
operators (generating and distribution companies, grid management operators)
implies increased uncertainty (load, plant availability, etc.). Also, the energy
market will be operated via electronic trading. This, together with the drive
towards the use of open telecom networks such as the Internet, makes the system
vulnerable to malicious attacks. For instance, an incident was reported on December 13,
2000 to the US National Infrastructure Protection Center (NIPC). According to
the report [12], “a regional entity in the electric power industry has recently
experienced computer intrusions through the Anonymous FTP (File Transfer
Protocol) Login exploitation. The intruders used the hacked FTP site to store
and play interactive games that consumed 95 percent of the organization's
Internet bandwidth. The compromised bandwidth threatened the regional
entity's ability to conduct bulk power transactions …”.

Health care services


A similar evolution has taken place in the field of health care. In the past, the basic
health care service was provided on an individual basis between provider and
recipient at a single location. It has now evolved into a complex system consisting
of a number of actors and services no longer concentrated in a single location.
Because health care provision is essentially a matter of sharing information and
knowledge, both in the clinical and organisational/logistics domain, large-scale
information and communication infrastructures are a crucial enabling factor. The
term “virtual healthcare organisation” not only covers a number of telemedicine
activities, from the remote monitoring of specific conditions of patients to the remote
provision of medical services relying on distributed databases and using high speed
networks, but also the closer integration of the different business processes involved in
healthcare provision, such as pharmaceutical suppliers/distributors, insurance and
hospitals. In this scenario, citizens' health care records and, more generally, clinical
or clinical-related information are distributed (i.e., generated and stored) among
these different health care enterprises. The increased mobility of citizens is further
driving requirements for accessing citizen-related safety-critical information.
A simple example will illustrate the complexity of the information dependence and
the problem of delineating the responsibilities for the provision of dependable
services. A patient visits the office of a general practitioner (GP) for medical
services. The GP has a fully automated medical information system, which consists
of a local information management system with access to the databases of the
medical facilities in the region. A segment of the patient’s record is in the database
of a distant facility. The dependencies are obvious. The patient expects a certain
quality of service from the GP including the availability of the complete medical
history during the time of the visit. The GP, in turn, expects the provider of the
medical information system to make available to the GP all the required information
services. Because the medical information system has a distributed database, the
medical information system provider relies on an Internet service provider for access
to the remote database. Of course, the remote database is part of another medical
information system which is controlled and operated by someone else. Thus, the
quality of service offered by the GP depends on the performance of at least three
levels of applications.
Furthermore, given that health care information is sensitive information, further
vulnerabilities related to interdependencies will include the privacy dimension. With
the move to electronic patient records accessible from the Internet, there is now the
potential for health information to be exploited on a large scale. For example,
quoting from [13], it is reported that a large pharmaceutical company gained access
to a prescription database covering over 0.5 million prescription users. They are
said to be mining the database in search of patients whose prescription
requirements fit depression-related illnesses, with a view to promoting the use of
one of their drugs by contacting the patients' GPs. It is also reported that many health
insurance companies pass on medical information to third parties, such as financial
institutions or employers, without the permission of patients. Employers routinely
use this information for recruiting and other personnel-related decisions.

Public IP communications services


A revolutionary change is the creation of new applications which would not have
been possible without the presence of the Internet. Electronic commerce, in its
various manifestations, requires no personal contact, either between
buyers and sellers or between the physical custodian of the product and the seller
or the buyer. It has also redefined the traditional meaning of money by representing
it as sequences of electronic pulses. Another example of revolutionary change is the
ongoing effort to offer voice communication services over the Internet and
the evolution towards real-time video transmission.
Electronic business and commerce applications are being developed at an ever
increasing pace, and the vast majority of them will be carried over large-scale Internet
Protocol (IP) communications networks. Indeed within 5 years it is likely that at least
25% of all business will involve transactions carried out over IP networking. Consider,
for instance, Internet banking: the strategies of most large banks foresee Internet and
Web technologies as key to future retail service delivery, in order to provide global
access to online banking services. This service will critically depend on the
availability and security of global IP based networks. The global public Internet
Protocol (IP) network or Internet is composed of a large number of interconnected
public IP networks. In order to support seamless end-to-end connectivity across the
Internet, each individual IP network is a complex system which contains several tens
or hundreds of routers serving many millions of end-users through hundreds of
thousands of km of optical fibre, multiplexed circuits and copper pairs. IP network operation is
controlled by a complex, distributed naming and routing system, usually provided by
large telecommunications companies and Internet Service Providers (ISP).
Management of Internet traffic is usually co-ordinated between the major ISPs and
telecommunications operators so as to balance traffic routing in order to match
available network capacity.
Large scale IP networks are vulnerable to a wide range of physical threats to the
transmission and routing media which include power failures, accidental damage,
sabotage and other forms of intentional disruption. In addition to threats on the
physical network infrastructure, there are threats concerning the network routing
system itself, including software bugs, hardware malfunctions, deliberate hacking,
etc. A local routing malfunction or deliberate attack (e.g. the denial-of-service
attacks on ISPs experienced during 2000) can often rapidly impact the whole
network, due to the high level of interconnection. Through cascading effects, such a
malfunction may result in a sequence of events leading to failures in other systems
and isolation of parts of the network due to the interaction of routing systems, which
could even result in a brown-out of large areas of the network and resulting
unavailability of the online banking services.
2.2 Emerging issues
What kind of findings can be drawn from the above scenarios? A number of IT
applications have assumed dominant positions and have become essential for the
conduct of human affairs. Also, as the example of the electric power grid illustrates,
some systems have become so large that they have a significant impact on large
segments of society. They have been characterised as “critical infrastructures”,
because they have been deemed indispensable for human welfare. The term
was first introduced in the United States of America to identify specific national
infrastructures that are deemed critical for the national wellbeing [5]. These are
banking and finance, energy, information and communications, transportation, and
vital human services. In view of the preceding discussion, for most of these
infrastructures the term global rather than national would be more appropriate. The
term “critical” is assigned on the basis of a value judgement and has no direct
impact on the problem analysed in this report. The public will accept and use
the new applications only if it becomes convinced that it can rely upon them.
Similarly, the political authorities will be able to assure the public that they are
protecting the public interest only by ensuring the soundness of the critical
infrastructures. In both cases, the objective is to ensure the integrity of the
infrastructures, the applications or services in order to maintain confidence in these
systems.
For critical application domains such as the ones illustrated before, confidence may
be eroded either by the faulty delivery of desired and promised services, or by the
manifestation of unexpected and/or undesirable side effects. Applications relying on
the communications infrastructure for the transport of data may deliver faulty
services either due to faults within the application or due to faults in the data
transport service of the communications infrastructure. Faults are to be interpreted
widely, including faults affecting reliability (e.g. continuity of service) as well as
security (e.g. confidentiality). On the other hand, the causes of unexpected and/or
undesirable side effects may be more complex. They may lie in an application, in
the communications infrastructure or in some interdependencies between
applications and the communications infrastructure. For example, heavier than
expected use of services available through the Internet could affect the quality of
service of other unrelated applications sharing the same resources. Because of
these interdependencies, one needs to understand not only the behaviour of the
communications medium and of each application relying on it, but also the effects of
the different interactions between applications and the communications medium.
Another factor that has motivated the undertaking of this study is the widespread
publicity generated by various forms of hacker attacks and cyber-crime. While some
of these attacks have serious repercussions on specific targets, applications or
segments of society, most of the public comments are based on generalisations,
exaggerations, ignorance or ulterior motives. It is essential that a rational basis be
formed for evaluating the impact and significance of such attacks or similar events
on the performance of the communications medium and the applications. There are
three main reasons for developing a rational approach for addressing these and
similar issues. In order to assess its confidence in a particular application or service,
the public needs to have reasonably authoritative information about the severity and
significance of a given problem. The service providers need to have a good
understanding of the technical manifestations of the problem in order to solve it. The
word technical includes design, hardware, software, operations and management.
Equally important, a rational basis is necessary for the political authorities a) to
allocate resources properly for research and development activities and b) to avoid
actions that might be counter-productive because of ignorance.
To address these issues, tools and methodologies are needed for understanding
the behaviour and for developing means of analysing and predicting the
performance of applications based on large-scale information infrastructures such
as the Internet, in order to be able to define and specify end-to-end quality of service.
There is also a need for gaining a better understanding of the technical issues
arising from the interdependencies between applications and the underlying
communications medium and in particular the vulnerabilities introduced by these
interdependencies.

3 The challenge
3.1 Definition of concepts
In discussions involving complex issues or qualitative characteristics, one of the
problems that need to be addressed is that of understanding concepts. In the effort
to develop a methodology for addressing issues ranging from performance
specifications for physical communications networks to managing applications
based on the Internet, a consistent terminology becomes essential. For the
purposes of this report, the following concepts and terminology will be used.
User or customer: An entity, either a system or one or more human beings acting
as individuals or in some formal capacity, that requires and receives specified
services.
Voice communications and electronic mail are two examples of services received by
or provided to individuals. Voice communications, traditionally offered through the
Public Switched Telephone Network (PSTN), have now begun to be offered also
through the Internet.
Layered service model: Driven by open standards and deregulation,
communications services are increasingly moving towards a layered model, Figure
1, consisting of interconnected communications networks with data services and
applications operating on top. New entrants and established industries compete for
the provision of data services and added value services (e.g. e-payments). The
layering of services implies that lower-layer details are hidden from higher layers
and that the dependability (availability, security) of the upper-layer applications
relies in large part on the performance of lower-layer services and networks [1].

Figure 1. The layered model for services: applications at the top, data services in the middle, communications networks at the bottom.

Application: A process or system providing specified services to users.


In the example of the medical information system, the services required and
specified by the general practitioner may range from accounting, to access to
pharmaceutical and patient databases, to diagnostic services provided by expert
systems.
Communications infrastructure: The collection of hardware equipment and
procedures (software, management) for transporting data needed by an application
to deliver specified services to the users. Synonymous with information
infrastructure.
The earliest versions of the PSTN with electromechanical switching are good
examples of hardware-based communications infrastructures for voice
communications. However, the communications infrastructure for voice
communications over the Internet consists of the physical layer of fixed wire lines
and/or radio links, electronic switching centers, and transmission equipment, of
protocols for the transport of bits, octets, characters and files, and interconnected
diverse networks. For the purposes of this report, the communications infrastructure
will consist of interconnected communications networks or systems and the data
services provided by them. Public Switched Networks and IP-based networks
(Private and Public Internet) provide the backbone for the communications
infrastructure. The IP-based networks in turn typically rely on a combination of
leased lines and switched digital services publicly and privately owned. The
functions of a communications network correspond roughly to those of the first four
layers (physical, link, network and transport) of the ISO Reference Model for Open
Systems Interconnection (OSI) [6].
Complex system: Collection of a large number of functional entities (equipment,
procedures and humans) with a large number of interconnections among them.
Discussions on systems complexity can alternate between two perspectives, macro
and micro. The macro perspective follows the path of decomposition of a complex
system into simpler and more manageable sub-systems and uses terms such as
dependability, survivability, fault-tolerance, and security. These terms are generally
qualitative. Conversely, the micro perspective views a complex system as the result
of synthesis based on interconnections of elementary components. The synthesis
approach imposes constraints on the interfaces and interactions among the
components. The term quality of service specifies quantitative performance
requirements. In this report, the terminology will be a combination of dependability
and quality of service.
Closed system: A system consisting of a known number of components or nodes,
their characteristics both physical and as data sources or sinks, their location and
their interconnections.
Open system: A system in which the number of nodes, or their characteristics both
physical and as data sources or sinks, is unknown or only partially known.
Connectivity is generally unknown or partially known.
In the discipline of computer communications, the term open system currently refers
to a set of interconnected computers from different vendors that communicate
among themselves using a standard communications protocol stack based on the
TCP/IP protocols. In this paper, the broader meaning of the term is used, unless
there is a need to refer specifically to the Open Systems Interconnections (OSI)
standard.
Dependability: Property of a system that indicates its ability to deliver specified
services to the user [7].
Dependability can be specified in terms of attributes which can be measurable or
qualitative. Attributes may be generic as well as application-specific. Traditionally,
the set of dependability attributes includes reliability, availability, safety and security.
Security, in turn, is usually subdivided into confidentiality, integrity, authentication,
non-repudiation. The set could also be amended with attributes that better
characterise performance requirements such as timeliness and quantity of data
transported per time unit.
Vulnerability: For the purposes of this paper, vulnerability of a system to a threat
can be understood as a weakness or flaw in the system that eliminates or reduces
its ability to deliver the specified services. A new type of vulnerability to be studied in
the context of critical infrastructures is related to interdependencies between
systems due to massive interconnections in systems-of-systems.
3.2 The challenge addressed in this study
From the perspective of the user, the desirable outcome is to receive services that
meet a set of expected performance requirements. From the perspective of a
service provider, or, equivalently, the designer of an application, the desirable
outcome is to offer services that meet specified performance requirements. From
either perspective, the application should be capable of offering a service in
accordance with desirable, expected, perceived, or specified performance
requirements. These services are to be delivered in a dependable manner by a
complex system that is beyond the capabilities of the user to comprehend and to
control. By formulating the problem in such general terms, one can ask the following
questions:
1. Are the expected services within the capabilities of an existing system?
2. If an existing system cannot deliver the expected services, how could a system
be designed to deliver these services in a cost-effective manner?
To answer these questions we will adopt the well-known procedures used in the
analysis and design of conventional systems to analyse the performance of complex
open systems. These are a) definition of the expected services, b) specification of
the service performance requirements, c) translation of the service performance
requirements into system performance requirements, d) determination of the
capabilities of the complex system, e) determination of the threats and
vulnerabilities, and f) development of system design specifications on the basis of
the specified performance requirements.
In broad terms, the challenge can be categorised within two types of problems: a)
the problem of characterising dependability of the information infrastructure in terms
of attributes and requirements that is addressed in chapter 4, and b) the problem of
characterising vulnerabilities, the effects of disruptions on components of the
information infrastructure and how these affect dependability requirements that is
addressed in chapter 5.

4 Dependability characterisation
It was stated earlier that the main characteristics of this complex system that
consists of applications relying on a global communications infrastructure are its
global coverage, high connectivity, and data volume and speed. As a result, the
“Internet” has become a high level abstraction that provides little, if any, help in
analysing the performance of existing systems or designing new ones. Therefore,
we will start our analysis by defining a framework for decomposing the problem into
simpler problems, some of which have either been solved or are solvable by known
methods. For some problems new approaches or paradigms might be necessary.
4.1 Infrastructure model
In the introductory remarks, we used the examples of automation systems for
monitoring and control of interconnected electric power networks, distributed
information systems for the provision of remote health care services, monitoring
systems for verifying multilateral treaties, and public IP services. These are a few
examples of diverse applications that rely on the communications infrastructure for
the provision of specialised services. The list can be expanded to cover the
multitude of applications that are not confined by geography. The diagram in Figure
2 shows an abstract framework of the dependence of the applications on the
communications infrastructure. Each application imposes its own performance
requirements on the communications infrastructure. The collection of these
requirements becomes the input for determining design parameters for the
infrastructure.
Figure 2. A conceptual framework for translating performance requirements of diverse applications (monitoring systems, the electric power network, financial services, voice over IP, health care, etc.) into requirements specifications for the communications infrastructure.

Regardless of the nature of each application, the demands upon the
communications infrastructure may be viewed in terms of file transfers and
distributed computations. When these two operations involve the parameter time,
the terms real time and continuous data stream are also added to the set of
parameters. Furthermore, if relatively small files (messages) are exchanged
between two or more users in real time, the operation is characterised as interactive.
Additional terminology is in widespread use when one refers to computer-to-
computer or user-to-computer communications. However, these
terms are not very useful as attributes for specifying the performance requirements
an application imposes on the communications infrastructure.
A service offered to the user by an application relying on internetworking requires
data transport services based on IP. Although each application has its own specific
requirements, there are some attributes that are common to all applications. These
are availability of the data transport service, integrity of the data during transport,
quantity of data to be transported per unit time and timeliness of transfer from
source to sink. It should be noted that, for the data transport service, the term
integrity means that, whatever data (in the clear, or encrypted) are sent by the
transmitter, they will be received by the intended recipient without any modification,
regardless of cause. To the extent that the performance requirements of a given
service depend on the data transport service of the Internet, these four attributes
are both necessary and sufficient for specifying the availability, integrity and
timeliness of the service. Thus, the dependability attributes of the application,
namely availability, integrity and timeliness, can easily be mapped into the
dependability attributes of the Internet. These attributes can be used as variables
either for analysing existing systems or for designing new ones, because the
attributes are quantifiable and can be translated into the quality of service attributes
of communications systems. From the perspective of the designer of an application,
the dependability requirements for data transport are an output derived from the
dependability requirements of the application. In turn, this output becomes input for
specifying the dependability requirements of the communications infrastructure. The
open question for the Internet is whether the dependability requirements imposed
on it by a given application are feasible and at what cost.
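To make this translation concrete, the four common attributes can be captured in a small requirements record that the application designer derives and hands down to the data transport level. The sketch below (Python) is illustrative only; the field names and the example figures are assumptions made for this sketch, not values taken from this study.

```python
# A minimal sketch of the four common data-transport attributes as a
# requirements record. Field names and example figures are illustrative.
from dataclasses import dataclass

@dataclass
class TransportRequirements:
    availability_pct: float   # % of time end-to-end connectivity must be available
    integrity_pct: float      # % of frames that must arrive unmodified
    quantity_bps: float       # data to be transported per unit time (bit/s)
    timeliness_s: float       # maximum source-to-sink transfer delay (seconds)

# Hypothetical example: a remote-monitoring application restates its own
# dependability requirements as requirements on the data transport service.
monitoring = TransportRequirements(
    availability_pct=99.9,
    integrity_pct=99.999,
    quantity_bps=64_000,
    timeliness_s=2.0,
)
print(monitoring)
```

Such a record is the output of the application design and, at the same time, the input to the specification of the communications infrastructure, mirroring the flow described above.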
In addition to the dependence upon the communications infrastructure, each
application depends upon additional systems in order to provide the required
services. Practically all applications rely on electric power systems for their energy
needs and on transportation systems for the delivery of products, to mention two
other important infrastructures. Of course, the communications infrastructure itself
relies on the electric power infrastructure. Furthermore, the transportation
infrastructure relies on other infrastructures such as communications, power, etc.
This interdependence among infrastructures raises the question whether the
hierarchical decomposition is sufficient to describe the relationship between one or
more applications and the communications infrastructure, or some form of feedback
is necessary as shown in Figure 3.

Figure 3. Relationships among dependability requirements: a) independent and b) coupled. (The panels relate the dependability requirements of an application, the communications infrastructure, the electric power infrastructure and an automation system, without and with feedback between the infrastructures.)

This model can be extended to cover interdependencies among three or more
infrastructures, giving rise to multiple feedback loops. If the feedback is relatively
weak, the hierarchical decomposition would be sufficient for translating the
dependability requirements of an application into those of the communications
infrastructure. However, if a strong feedback mechanism is found to exist, additional
research effort would be needed to relate the dependability requirements.
4.2 Data Transport Model


The designer of an application or the provider of a service needs to address four
major issues:
• How to determine the dependability requirements of the communications
infrastructure in order to satisfy the performance requirements of the service
• How to assess the feasibility of the communications requirements taking into
account available communications technologies, costs and regulations
• How to evaluate the cost-effectiveness of new research and development
projects in cases where the available systems could not satisfy the dependability
requirements of the application
• How to measure the performance of the service.
In order to address the first problem, we first decompose the communications
infrastructure into two major subsystems, communications networks and
internetworking. The term communications network corresponds approximately to
the first four layers of the OSI model, namely physical, link, network and transport.
Internetworking refers to the interconnection of heterogeneous networks using the
IP. The rationale for such a partition is based on the observation that the Internet is
built through the interconnection of distinct, heterogeneous networks, these
networks are interconnected through finite numbers of gateways or routers, and
each of these networks is designed, operated and/or managed by some supervisory
authority. The details of how each of the constituent networks is built and operated
are not as important as the fact that the characteristics of each link and node and the
data transmission protocols are known, and that there is a clear delineation of
responsibilities for the performance of each constituent network.
Two simple examples are sufficient for clarification. An entity owns and operates a
communications network consisting of long haul circuits, exchanges, switches and
local loops. In addition, it offers Internet services to users by connecting to the
Internet through peer Internet Service Providers (ISPs). Clearly, the entity has
complete control and responsibility for design, operation and performance of the
network up to the point where the routers are connected with the routers of the peer
ISPs. For the second example, an Internet Service Provider leases bandwidth from
an operator who in turn leases channel capacity from the owner or operator of
physical network links. In this example, the responsibility for the performance of the
network up to the interfaces with other peer networks is established through Service
Level Agreements between the ISP and each subordinate entity down to the
physical network level. In both examples, there is a theoretically unambiguous
mechanism for translating transport level performance requirements into
performance requirements and design parameters for the physical network.
At the internetworking level and higher the lines of responsibility for establishing and
satisfying performance requirements become more complicated. For example, there
is no simple mechanism for implementing performance requirements for timeliness
among peer ISP routers, because IP is inherently a best-effort protocol. In addition,
the number of paths for the end-to-end transport of data on behalf of a user cannot
be determined by the provider, because there is no centralized monitoring at the IP
level. A user, on the other hand, could specify and demand a certain level of
performance from the ISP for data transport services that are derived from the
dependability requirements of the application. In the previous paragraph, we have
shown that within the domain of each ISP there is a defined mechanism for
translating performance requirements into network design parameters. The main
remaining issue is to solve the problem of relating performance requirements at the
peer ISP level. The end-to-end performance requirements can be decomposed into
the following three levels:
• Level 1 – User to application
• Level 2 – Application to Internet Service Provider
• Level 3 – Among peer ISPs
If the Level 3 performance requirements could be specified, the problem of deriving
the design parameters of the physical network would be tractable. The preceding
discussion is illustrated in Figure 4.
For the model shown in Figure 4, the four dependability attributes of an application
can easily be translated into dependability attributes of the end-to-end data
transport service. In turn, these can be translated into dependability attributes of the
communications infrastructure.

Figure 4. A model for translating dependability requirements of an application into dependability requirements of the communications infrastructure: Level 1 requirements link users to the application, Level 2 requirements link the application to internetworking, and Level 3 requirements link internetworking to the communications networks, yielding the network design specifications for data transport.
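The level-by-level translation can be illustrated with elementary series-composition rules: when the end-to-end path is decomposed into segments (the networks of the ISPs involved plus the internetworking hops), availabilities multiply and delays add, assuming the segments fail independently. A minimal sketch in Python, with invented segment figures:

```python
# Minimal sketch: composing per-segment figures into end-to-end values,
# assuming statistically independent segments in series. Figures invented.
from math import prod

def end_to_end_availability(segment_availabilities):
    """Series composition: the path is up only if every segment is up."""
    return prod(segment_availabilities)

def end_to_end_delay(segment_delays):
    """Delays along the path simply add."""
    return sum(segment_delays)

avail = [0.999, 0.9995, 0.999]   # e.g. ISP A, internetworking, ISP B
delay = [0.020, 0.050, 0.020]    # seconds per segment

print(end_to_end_availability(avail))  # ≈ 0.9975
print(end_to_end_delay(delay))         # ≈ 0.09 s
```

The same arithmetic, run in reverse, turns an end-to-end requirement into a budget allocated across the Level 2 and Level 3 interfaces.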

4.3 Dependability and Quality of Service


In the field of communications engineering, the term quality of service (QoS) is used
to measure the performance of data networks with respect to the transport of data.
For example, at the OSI Transport layer there are five classes of quality of service,
ranging from the simplest, Class 0, which provides for connection establishment and
flow control, to Class 4, which incorporates error control and flow control procedures.
Outside the OSI model, other quality of service parameters include latency, jitter,
packet loss and minimum/maximum/average bit rates. Within a single network in the
sense that was discussed in the previous section, these attributes can be and are
translated into network design parameters. For such networks, the concept of
dependability is identical to that of quality of service. Thus, there are no major
technical problems in satisfying the performance requirements of availability,
integrity, quantity and timeliness for data transport within each distinct network.
The major problems arise at the IP layer interface, because the IP is inherently a
best-effort service. As a result, the requirement of timeliness cannot be satisfied, but
availability, integrity and quantity could. The best-effort service provides for some
virtual end-to-end link unless all peer ISP routers in all possible such links are out of
service. In such a case, the problem is not inherent to the Internet, but to the
connectivity among peer ISPs. Similarly, specified quantities of data can be
transported end-to-end, however slowly. The integrity of the data can also be
evaluated at the application level regardless of how long it takes for a frame to be
transported across a virtual link. Thus, the major issue for the IP interface is how to
satisfy the requirement of timeliness. Recently, new signalling and routing protocols
have been developed in order to deliver quality of service at the IP layer. Among
them are RSVP signalling, differentiated services per-hop queuing behaviour and
the Real Time Protocol (RTP) [8], [9], which provides timing and buffering for real-time IP
services such as voice and video. Although the intent of these protocols is to deliver
quality of service to the IP layer, they also dilute the simplicity of the original IP
protocol that led to the rapid expansion of the Internet. It would be useful to assess
their impact on and their implications for the future development of the Internet.
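As an illustration of one such quality-of-service parameter, the sketch below computes the running interarrival-jitter estimate maintained by RTP receivers: a smoothed average of the variation in packet transit times, updated with a gain of 1/16 as in the RTP specification. The timestamps are invented for the example.

```python
# Minimal sketch of the RTP interarrival-jitter estimator: a running
# average of transit-time variation, smoothed with gain 1/16.
def update_jitter(jitter, prev_transit, transit):
    d = abs(transit - prev_transit)     # change in one-way transit time
    return jitter + (d - jitter) / 16.0

send_times = [0.00, 0.02, 0.04, 0.06]   # invented sender timestamps (s)
recv_times = [0.05, 0.08, 0.09, 0.12]   # invented receiver timestamps (s)

jitter, prev = 0.0, recv_times[0] - send_times[0]
for s, r in zip(send_times[1:], recv_times[1:]):
    transit = r - s
    jitter = update_jitter(jitter, prev, transit)
    prev = transit
print(f"jitter estimate: {jitter * 1000:.2f} ms")
```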
4.4 The “cloud” is not a cloud
It has become convenient to refer to the environment created by internetworking as
the “cloud”. Although this designation provides an easy visualisation of what
happens to the data of a user after they enter the server of the ISP, it is not very
helpful for evaluating the dependability of the data transport service. Each
ISP has control over and is responsible for the operation of a communications
network with known characteristics including topology and bandwidth. At the IP
level, these networks are connected by internetwork routers, or gateways, under the
control of peer ISPs.

Figure 5. The topology of the cloud: User ⇔ ISP ⇔ “Cloud” ⇔ ISP ⇔ User, where the “cloud” itself is a known mesh of peer ISPs interconnected through routers.

Thus the topology of the internetwork is known: the gateways are the nodes, and
the links are the interconnections between peer ISPs. The concept of the “cloud” has arisen from the
fact that it is impossible to specify or determine the path of a packet through the
Internet, because the IP is a best-effort service, although the physical connectivity of
the Internet is deterministic, as illustrated in Figure 5. The deterministic nature of the
physical topology provides a starting point for translating the dependability
requirements of an application into those of the major subsystems that form the
data transport path.
The interconnection between any two peer ISPs is through a relatively small number
of gateways. For any two peer ISPs, the performance requirements of the
corresponding interconnected gateways are specified in service level agreements
(SLA). Thus, a mechanism exists for specifying quality of service requirements
among peer ISPs. In view of the preceding discussion about the nature of the IP,
the outstanding question is whether dependability requirements for data transport
can be specified and met at the interconnections among peer ISPs.
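The deterministic topology can be made concrete by modelling the cloud as an ordinary graph: ISP domains as nodes and peer interconnections as edges, to which standard reachability and path analyses apply. A minimal sketch, with a hypothetical peering arrangement:

```python
# Minimal sketch of the "cloud" as a deterministic graph: ISP domains are
# nodes, peer interconnections are edges. The peering below is hypothetical.
from collections import deque

peering = {
    "ISP-A": {"ISP-B", "ISP-C"},
    "ISP-B": {"ISP-A", "ISP-D"},
    "ISP-C": {"ISP-A", "ISP-D"},
    "ISP-D": {"ISP-B", "ISP-C"},
}

def reachable(graph, src, dst):
    """Breadth-first search: does any chain of peerings connect src to dst?"""
    seen, queue = {src}, deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for peer in graph[node] - seen:
            seen.add(peer)
            queue.append(peer)
    return False

print(reachable(peering, "ISP-A", "ISP-D"))  # True: via ISP-B or via ISP-C
```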
4.5 Issues with respect to the dependability of applications


Depending on the characteristics of each application, the data transport
requirements for availability, integrity, quantity and timeliness could be specified as
follows (a measurement sketch follows the list):
• Availability – end-to-end connectivity X% of the time, averaged over Y time units.
• Integrity – X% of application data frames transmitted over Y time units satisfying a user-to-user validity check.
• Quantity – X amount of data transferred between users.
• Timeliness – X% of application data frames arriving at the destination user within Y time units from the instant each frame is transmitted by the source user.
Some applications, such as the automation system for the electric power networks,
are closed systems, because the number and location of the nodes and their
interconnections are known. Such systems are essentially virtual private networks
utilizing the IP for the transport of information. For closed systems, the dependability
attributes can be specified, quantified and measured leaving open the questions
whether or not the performance requirements are achievable with existing
technologies and at what costs. The performance characteristics with respect to
availability, integrity, quantity and timeliness could be specified as data service
requirements in a service level agreement between the application service provider
and the ISP. In the cases where the application service provider establishes a
private network, in effect becoming an ISP, the dependability requirements for data
transport are not specified in a service level agreement, but as internal design
requirements for the application. For purposes of analysis they can be treated as
externalities. For the case of closed systems, the questions are:
• To what extent can the Internet satisfy the data transport requirements of each application?
• To what extent can the application service provider monitor the performance of the ISP with respect to the service level agreement?
Other applications may fall into the category of open systems. Some examples are
electronic mail, voice over IP, reservation systems and public access to databases.
Open systems may be further classified into two sub-categories. One contains
systems in which both the number of nodes and their connectivity are unknown. Voice over IP
and electronic mail may be put into that category. One possible model for handling
such systems would be a fully connected set of nodes whose number is treated as a
random variable. The second sub-category contains systems with an
unknown number of nodes but known connectivity. Reservation systems are
included in this sub-category. A possible model for this category could be a star
network whose number of nodes is likewise treated as a random variable.
For open systems some relevant questions are:
• How can data service characteristics be specified between the application service provider and the ISP?
• Is it possible, and to what extent, for the application service provider to monitor the performance of the ISP with respect to the service level agreement?
4.6 Issues with respect to the dependability of the communications infrastructure
Within the domain of each ISP the performance of the network is either known or
could be specified and monitored. In the cases where an ISP leases
communications services from other network operators, the quality of service
requirements for these services could be specified in the service level agreements
governing the corresponding relationships. Provisions for monitoring the delivery of
the specified services within the domain of each ISP should also be included in such
agreements. In cases where the ISP owns and operates all communications
systems down to the physical layer, the performance of the network is under the
complete control of the ISP and the quality of service requirements are the direct
responsibility of the ISP. For each such network, there are no major technical issues
affecting the performance of the network with respect to the four dependability
attributes. They can be satisfied through proper selection of network topology and
channel bandwidth.
The problems arise at the IP interfaces where the data are passed from the domain
of one ISP to that of another. Since the IP is a best-effort service with no overall
monitoring and, much less, control, the problem is how to reconcile a best-effort
service with end-to-end performance requirements. Where dissimilar networks are
interconnected at the IP interface, the attribute of timeliness would be difficult to
satisfy. Nevertheless, mechanisms could be found for improving the performance of
the end-to-end data transport service. Assuming that the characteristics of the
networks in the domain of each ISP are given and presumably known to the
corresponding ISP, availability and timeliness are also a function of the
interconnections among peer ISPs. Increasing the number of gateways between
any two ISPs affects directly the availability and timeliness of the end-to-end data
transport link. Although timeliness can be improved, there is no easy mechanism for
quantifying it and specifying it as a performance requirement or design parameter. It
has been mentioned previously that some protocols have been developed for
providing quality of service to the IP interface. Another approach could be the
development of statistical traffic models for the gateways among peer ISPs. These
models could then be used to derive optimal topologies for internetworking. The
quality of service offered by each ISP would determine both the number of and the
interconnections among peer ISPs.
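The effect of additional gateways on availability admits a simple first-order model: if the gateways between two peer ISPs fail independently, the interconnection is unavailable only when all of them are down simultaneously. A minimal sketch with invented per-gateway availabilities:

```python
# Minimal sketch: availability of an ISP-to-ISP interconnection made of
# independent parallel gateways. Per-gateway figures are invented.
def parallel_availability(gateway_availabilities):
    unavailability = 1.0
    for a in gateway_availabilities:
        unavailability *= (1.0 - a)   # all gateways must fail together
    return 1.0 - unavailability

print(parallel_availability([0.99]))        # 0.99
print(parallel_availability([0.99] * 2))    # ≈ 0.9999
print(parallel_availability([0.99] * 3))    # ≈ 0.999999
```

Timeliness does not compose as simply, which is why the text points to statistical traffic models at the gateways rather than a closed-form rule.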
The performance requirements at the IP interface could also be specified in service
level agreements between any two peer ISPs. For the new protocols offering quality
of service to the IP interface, the service level agreement would govern the
performance characteristics at each gateway. In the alternative approach using
topology as a mechanism to improve upon or satisfy the dependability requirements
for end-to-end data transport, a different model of a service level agreement would
have to be constructed. Each of the dependability attributes for end-to-end data
transport would need to be specified across all interconnections between any two
ISPs instead of specifying them individually for each interconnection. In addition to
the need for developing a better understanding of network characteristics and traffic
models at the gateways, the broad issues pertaining to the dependability of the
communications infrastructure are:
• What are the feasible and cost-effective solutions for translating the end-to-end data transport performance requirements into criteria for interconnectivity among peer ISPs?
• What are the technical issues affecting the monitoring of performance between peer ISPs with respect to the end-to-end data transport dependability requirements?
The preceding analysis leads to the conclusion that the dependability requirements
of an application can be translated into dependability requirements for data
transport through the Internet. However, the attribute of timeliness could not be
satisfied with the basic IP. The new protocols attempt to solve this problem, but they
raise the question of a trade-off between the overall performance of the original
Internet, based on the simplicity of the IP interface, and that of a new Internet
imposing complexity at that interface.
4.7 Observations regarding dependability characterisation
The preceding discussion leads to a number of observations on dependability
attributes and metrics.
• The decomposition of the information infrastructure into the components of
applications and communications infrastructure has led to the identification of
four quantifiable attributes for specifying dependability for each of the
components. These attributes are: availability, integrity, quantity and
timeliness.
• Through further decomposition of the communications infrastructure into the
component networks and internetworking, the dependability of the
communications infrastructure may be expressed in terms of that of the
communications networks and that of internetworking. Internetworking is
performed by interconnecting networks through, e.g., the Internet Protocol.
• The four quantifiable attributes may be used as measurable performance
requirements for analysing and/or designing either an application or the
communications infrastructure.
• For each communications network, the four dependability attributes are
equivalent to quality of service parameters and can be used as network design
parameters.
• At the internetworking level the four dependability attributes may become
performance requirements specified in service level agreements among peer
ISPs.
Within the context of the above analysis, a number of challenges can be
summarised regarding dependability:
a) Of communications networks:
• Design of a network to meet the performance requirements for link availability,
data integrity, quantity of transported data and timeliness of transport.
• Evaluation of design options for fault prevention, fault tolerance and fault
removal.
• Cost-benefit analysis for evaluating design options.
• It should be noted that the existence of reliable communications systems
suggests that cost is the only factor limiting the dependability of a
communications network: link availability is increased by using multiple paths
and data rates are increased by using more bandwidth, all at extra cost, as the
short calculation below illustrates.
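A back-of-the-envelope sketch of that cost argument, assuming independent path failures and a flat (hypothetical) cost per path:

```python
# Hypothetical cost-benefit sketch: each added parallel path raises link
# availability as 1 - (1 - a)^k but costs roughly the same as the last one.

path_availability = 0.99   # assumed availability of a single path
cost_per_path = 100_000    # assumed cost of one path (arbitrary units)

for k in range(1, 5):
    link_availability = 1.0 - (1.0 - path_availability) ** k
    print(f"{k} path(s): availability {link_availability:.8f}, "
          f"cost {k * cost_per_path:,}")
```

Each extra path multiplies the unavailability by 0.01 while adding the same cost, which is precisely the diminishing-returns trade-off a cost-benefit analysis would weigh.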
b) At the internetworking level:
• Development of architectures and protocols capable of satisfying the
performance requirements for link availability, data integrity, quantity of
transported data and timeliness of transport across diverse networks.
• Given that IP inherently provides a best-effort service, development of protocols
for delivering quality of service to the IP layer to satisfy timeliness
requirements.
• Development of models for characterising traffic at the internetworking level.
• Evaluation of design options for fault prevention, fault tolerance and fault
removal among interconnected networks.
• Cost-benefit analysis for evaluating design options.
c) For applications, the main challenges with respect to dependability include:
• Specification of end-to-end performance requirements needed by the
application.
• Translation of the dependability requirements of an application into data
transport requirements.
• Feasibility of satisfying data transport requirements imposed by an application
using existing technologies.
• Allocation of the performance requirements of the application to the
communications network, the IP layer and the application itself.
5 Characterisation of threats and vulnerabilities
5.1 Threats, vulnerabilities and dependability
The dependability of application services can be compromised by threats to the
application, the communications infrastructure, or both. The term threat is used to
designate those conditions which can cause failures resulting in some loss or
damage. In the civil domain, damage includes safety-related casualties (e.g.
damage to life) as well as the loss or damage to business and personal information
assets. In this context it is equivalent to the term fault and will be used
interchangeably. Another term used in conjunction with the term threat is
vulnerability. For the purposes of this paper, vulnerability of a system to a threat can
be understood as a weakness or flaw in the system that eliminates or reduces its
ability to deliver the specified services. The question then becomes, how to relate
dependability to vulnerability. One possibility is to define dependability as a
complement of vulnerability (e.g. Dependability = f [1–Vulnerability]).
Vulnerability and, consequently, dependability can be quantified for use in modelling
both the applications and the data transport services, because all four dependability
attributes are quantifiable. One can then measure the deviation of each attribute
from some specified performance criterion. A measure of vulnerability of an attribute
is given in terms of the probability of deviating from the specified values. It would not
be difficult to devise a scheme for combining the vulnerabilities of a set of attributes
to produce a measure of the vulnerability of the entire system or service.
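One such scheme, kept deliberately minimal: take the per-attribute vulnerability to be the probability of deviating from the specified value, assume the deviations are independent, and define the service vulnerability as the probability that at least one attribute deviates. The probabilities below are invented for illustration.

```python
# Sketch: combining per-attribute vulnerabilities into a service-level
# figure. Deviation probabilities are hypothetical; independence is assumed.

vulnerability = {  # P(attribute deviates from its specified value)
    "availability": 0.001,
    "integrity": 0.0005,
    "quantity": 0.002,
    "timeliness": 0.01,
}

all_within_spec = 1.0
for p in vulnerability.values():
    all_within_spec *= 1.0 - p

service_vulnerability = 1.0 - all_within_spec
dependability = 1.0 - service_vulnerability  # the complement suggested above

print(f"service vulnerability: {service_vulnerability:.4f}")
print(f"dependability:         {dependability:.4f}")
```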
A study [10] performed by RAND has defined a generic set of 20 sources of
vulnerabilities grouped in seven categories that are applicable to critical information
infrastructures. The study focused on the critical defence infrastructure domain but
is generic in nature, and other domains can use it as a checklist for finding
important vulnerabilities. In the study, the issue of dependencies between systems
was identified as a vulnerability source, in the sense that disruption of one
system can cause failure of another, interconnected one. This aspect has, however,
not been further developed and applied so far.
Deviations from specified values of attributes are caused by various faults which
affect the performance of the system. Some faults can have significant impact on
specific attributes, while others may have minimal or no impact. Thus, in order to
understand and measure the vulnerability of the application and of the data
transport service, it becomes necessary to identify the threats and vulnerabilities of
each attribute. These are unique to each application service, but common for the
data transport services through the information infrastructure regardless of
application. Another dimension of the problem is the question of synergies among
threats. The simplest model assumes the effects of combined threats on each
attribute are separable. There is, however, a possibility that synergistic effects of
more than one threat may increase the vulnerability of an attribute beyond that
given by the sum of the individual vulnerabilities.
The threats to the application and to the data transport service may be grouped into
different categories according to the intended uses. For the purpose of meeting
desired performance requirements, the threats could be grouped into internal and
external to the system. Internal threats are in theory caused by identifiable,
measurable and in principle preventable conditions. Examples are: use of
components that may not be appropriate or adequate to satisfy the performance
requirements of the system; operation of the components beyond the range of
specifications; inadequate design for the desired system performance. The designer
or operator of a system has the responsibility of ensuring that internal faults are
absent from the system, or their effect on the dependability of the system remains
within predetermined bounds. For this category of faults, the problems which need
addressing are:
• How to identify the relevant faults for a given application
• How to eliminate them or to limit the effects of the faults on the dependability
requirements.
Mitigation of the effects of internal threats on the dependability of a system affects
only the cost of the system (including e.g. training and education of staff), because
internal threats, by definition, are identifiable. That they are also preventable for
large complex systems remains an open question.
External threats, being beyond the control of the system designer or operator, lead
to the development of techniques for ensuring that the system performs its functions
in the presence of faults. In the classical approach for designing systems, these
faults are referred to as disturbances. These, being independent variables, cannot
be affected by the designer of the system. Instead, their effects on the performance
of the system are taken into account by identifying them and by developing models
that can describe their behaviour. See, for example, the noise models for various
types of communications channels. To the extent that, during the operation of a
system, the disturbances behave as predicted by the models, the system performs
according to its specifications. The performance may be degraded if the models
are incomplete and do not account for some characteristics of the disturbances. For
example, the bit error rate in a communications channel may increase considerably
from the nominal design value, when the channel noise power increases beyond
that predicted by the noise model used for designing the channel. In computer
systems, the concept of fault-tolerance is used to describe techniques for ensuring
that the effect of the faults on the system performance will remain within specified
tolerances, leading to the design of fault-tolerant systems.
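The channel-noise example can be made concrete with the standard result for coherent BPSK over an additive white Gaussian noise channel, where the bit error rate is Q(sqrt(2 Eb/N0)); the design point below is hypothetical.

```python
# Sketch: how the bit error rate of a BPSK channel degrades when the noise
# power rises 3 dB above the level assumed by the design-time noise model.
import math

def bpsk_ber(ebn0_db: float) -> float:
    """BER of coherent BPSK over AWGN: Q(sqrt(2*Eb/N0)) = erfc(sqrt(Eb/N0))/2."""
    ebn0_linear = 10.0 ** (ebn0_db / 10.0)
    return 0.5 * math.erfc(math.sqrt(ebn0_linear))

design_point_db = 9.6  # assumed design Eb/N0, giving a BER of about 1e-5
print(f"BER at design point:  {bpsk_ber(design_point_db):.2e}")
print(f"BER with +3 dB noise: {bpsk_ber(design_point_db - 3.0):.2e}")
```

A 3 dB excursion beyond the modelled noise costs roughly two orders of magnitude in error rate, which is exactly the kind of degradation described above.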
In designing systems to operate in the presence of disturbances, the sole concern
regarding the disturbances is how to describe their behaviour so as to eliminate their
impact on the operation of the system. Generally, the approach is
phenomenological. Models are developed from measurements. The models are
probabilistic, because the observations are statistical in nature. An underlying and
unspoken assumption is that, as long as a disturbance is present and measurable, it
can be modelled. The validity of the models depends on the completeness and
sufficiency of the measurements. A different perspective arises when man-made
disturbances are generated intentionally to affect the operation of the system. In
some instances, such intentional disturbances have been labelled as attacks. To
conform with the terminology adopted for this paper, the term threat is
used to describe both unintentional and intentional disturbances. Intentional threats
are posed by humans, but they may affect the system through another intermediary
system agent. In other words, the threat may appear to originate in a system agent
instead of one or more human beings. Sometimes, the term malicious threat is used
to indicate intentional disturbances, the implication being that the intent is to do
harm. This designation is based on an unquantifiable value judgement regarding the
meaning of the word harm. On the other hand, the term intentional designates
disturbances introduced knowingly by humans regardless of motives. For this
reason, it is a more appropriate term for classifying man-made disturbances
generated intentionally.
5.2 Security and Complex Open Systems
The designation of intentional disturbances as threats has given rise to the rapidly
expanding field of security. Historically, the purpose of security has been to
eliminate the intentional threat to the application by either isolating the application
from the external fault or, if possible, eliminating the fault. Traditionally, the objects
of security have been distinct, spatially bounded entities, and security has been
provided by constructing barriers that separate the object of security from the
intentional threat. Moats have isolated castles, armour has protected humans, walls have
surrounded cities, and fences have enclosed territories. In the cases where the
object of security is not distinct and spatially bounded, e.g., transportation networks,
the meaning and effectiveness of providing security through barriers is not very
clear. It is even less so for complex, spatially distributed systems such as the information
infrastructure. Therefore, a new frame of reference is needed for addressing the
issue of “security” in complex open systems.
An important reason for undertaking this study has been to investigate issues
related to electronic commerce. An underlying assumption is that the electronic
market place is open and accessible to those who desire to enter. Thus, the idea of
creating barriers to isolate open systems from their environment is contrary to the
concept of openness and accessibility. As a result, it becomes meaningless to talk
about providing security, in the classical meaning of the word, for the entire
information infrastructure by creating a barrier around it. The absence of barriers
implies that threats, unintentional and intentional, could affect any component
of the infrastructure, and that the infrastructure should be able to deliver the
desired services in a dependable manner in the presence of those threats.
If we consider the intentional threats as a subset of the external threats, we could
apply the traditional techniques of treating disturbances in the analysis and design
of complex systems. The problem of complexity is handled through system
decomposition and the impact of disturbances through modelling of the sources of
the disturbances. In this paper, we have decomposed the information infrastructure
into two subsystems: communications and applications. One could then talk about
the effects of disturbances on communications and on the applications. For the
intentional disturbances, the question of security has two parts, security of the
communications systems and security of the applications. Security for the
communications systems implies that dependability requirements for data transport
are satisfied in the presence of intentional external threats. Similarly, security of the
applications implies that the dependability requirements for the applications are
satisfied in the presence of external threats.
One category of intentional threats that would pose difficulties in modelling and
quantifying is that posed by persons who have legitimate access to a system,
either communications or applications. Intuitively, such threats may be considered
internal, because the human element is also a system component. One could then
treat such threats in terms of the probability of failure and proceed with the standard
techniques of systems analysis. It would be an understatement to say that modelling
human behaviour could be a very controversial subject. It has been mentioned in this
paper only for the purpose of identifying it as another factor affecting the dependability
of complex systems.
External disturbances may affect, independently, the communications system and
the applications. In addition, external disturbances may affect the applications
through the data transport services provided by the communications system. When
the topic of security for the Internet is discussed, it is, generally, implied that the
concern is about the security of the applications from threats propagated through
the data transport service. For the class of intentional external threats the security of
the application becomes a function of the security of the communications system.
The question then is, how to allocate the security effort between the
communications systems and the applications. At one extreme, the dependability
requirements of the applications could be based on the assumption that there is a
very high probability that threats would be transmitted to the application via the data
transport service. For such a scenario, the application would not rely on any
protection offered by the communications system and would have to take into
account the existence of these threats. At the other extreme, the dependability
requirements of the application could be based on the assumption that the
probability of external intentional disturbances being transmitted to the applications
through the communications system would be negligible. In that scenario, there
would not be any external intentional threats to the application and the dependability
requirements would produce a different design. The latter scenario would,
obviously, impose more severe security requirements on the communications
systems in order to satisfy the dependability requirements for data transport. In
either case, the probability of a threat to the application propagated through the
communications system would be one of the attributes of dependability of the data
transport service and a parameter of the quality of service requirements imposed on
the communications system. An optimal solution would require a trade-off analysis
based on the statistical nature of these threats.
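A toy version of such a trade-off analysis is sketched below; the threat rate, damage figure, prices and the diminishing-returns assumption are all invented, purely to show the shape of the optimisation.

```python
# Hypothetical trade-off sketch: splitting protection spending between the
# communications system and the application to minimise expected annual cost.

threat_rate = 100.0   # assumed intentional threats per year
damage = 50_000.0     # assumed damage per successful threat

def expected_cost(comms_spend: float, app_spend: float) -> float:
    """Spending plus expected residual damage. Diminishing returns assumed:
    every 100k (comms) or 120k (application) halves the pass-through."""
    p_past_comms = 0.5 ** (comms_spend / 100_000)
    p_past_app = 0.5 ** (app_spend / 120_000)
    residual = threat_rate * p_past_comms * p_past_app * damage
    return comms_spend + app_spend + residual

grid = [i * 50_000 for i in range(13)]  # 0 .. 600k per layer
best = min(((c, a) for c in grid for a in grid), key=lambda s: expected_cost(*s))
print(f"best allocation (comms, app): {best}, "
      f"expected annual cost: {expected_cost(*best):,.0f}")
```

Under these assumed figures the optimisation concentrates the effort on the layer where protection is cheaper per blocked threat; real allocations would depend on measured threat statistics.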
To provide security and/or implement a fault-tolerant design requires some
knowledge of the nature of the external threats. One may develop a catalogue of all
possible external threats to a system, but such a listing would not be very useful
unless it contained specific information about the likelihood of such threats
materializing. This observation also holds for the internal faults. To provide some
quantitative measures of security and fault-tolerance for a given application and
data transport service, the primary issues that need to be addressed are:
• How to identify the external threats for a given application
• How to develop models describing the statistical properties of these threats.
In order to understand the nature of the vulnerabilities caused by various types of
threats, and to assure the dependability of the applications, we will follow the
established methodology and decompose the problem into a series of simpler
problems. For any application, two categories of threats can be identified:
• Threats to the application
• Threats to the communications infrastructure
5.3 Threats to the application
The threats to the application could be either related to the transport of data through
the communications infrastructure, or caused by other factors. For example, a
program could generate erroneous results either because certain algorithms were
coded incorrectly, or because incorrect values were received via the data transport
service and used in the calculations. In the first case, the threat to the application
can be classified as poor programming. In the second case, the erroneous data
could have been generated either intentionally or accidentally. In either case the
threat to the application has materialized through the data transport service. For the
purposes of this paper we will limit the discussion to threats manifesting themselves
through the interaction between application and data transport service.
Incorrect data or commands, whether generated intentionally or unintentionally, do
not satisfy the attribute of integrity. The problem then for the designer of the
application is to devise integrity checks for all data packets transmitted through the
data transport service for use by the application. The integrity checks could be
different for instructions or commands from those for data. They can involve
encryption, authentication, and/or validation and come under the heading of input
data security. For this model, the designer of the application assumes full
responsibility and control for the integrity of the data and commands transmitted
through the communications infrastructure. Obviously, the type and level of security
depends on the dependability requirements of the services provided by each
application. This formulation makes the quantification and specification of integrity
straightforward. It can be stated as X% of frames delivered by the data transport
service satisfying the integrity criteria, specified by the quality of service agreement
between the application and the data transport service.
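As an illustration of one such integrity check, the sketch below tags each frame with a keyed hash (HMAC) and measures the fraction of delivered frames that verify, i.e. the X% figure of the preceding paragraph; the key handling and frame format are simplified placeholders.

```python
# Sketch: per-frame integrity verification with a keyed hash, and the
# measured integrity figure to compare against the agreed X%.
import hmac
import hashlib

SECRET_KEY = b"shared-key-agreed-out-of-band"  # placeholder key management

def tag(payload: bytes) -> bytes:
    return hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()

def measured_integrity(frames):
    """Fraction of (payload, tag) frames whose tag verifies."""
    ok = sum(1 for payload, t in frames
             if hmac.compare_digest(t, tag(payload)))
    return ok / len(frames)

good_frame = (b"set_point=42", tag(b"set_point=42"))
tampered_frame = (b"set_point=99", tag(b"set_point=42"))  # altered in transit

print(f"integrity: {measured_integrity([good_frame, tampered_frame]) * 100:.0f}% "
      f"of frames satisfied the check")
```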
Another threat to an application is the violation of the timeliness requirement
governing the delivery of data and commands by the data transport service to the
application.
Monitoring the frames that satisfy the specification for timeliness requires knowledge
of the time the frame was transmitted by the source user. This is a solvable
problem. Although the threat posed by the violation of the timeliness requirement
can be identified, the application designer or user has no formal mechanism for
alleviating the effects of the threat upon the dependability of the application. Thus,
the dependency of the application on the data transport service is nearly absolute.
As in the case of data integrity, the timeliness requirement for data transport can be
specified by the quality of service agreement.
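A minimal sketch of such monitoring, assuming frames carry the source transmission time and that source and receiver clocks are synchronised:

```python
# Sketch: checking delivered frames against a timeliness bound taken from
# the quality of service agreement. The bound and frames are hypothetical.
import time

DELAY_BOUND_S = 0.5  # assumed timeliness requirement, seconds

def fraction_timely(frames):
    """frames: (sent_at, received_at) pairs; returns the timely fraction."""
    timely = sum(1 for sent, received in frames
                 if received - sent <= DELAY_BOUND_S)
    return timely / len(frames)

now = time.time()
frames = [(now - 0.2, now), (now - 0.4, now), (now - 0.9, now)]  # last is late
print(f"timeliness: {fraction_timely(frames) * 100:.0f}% of frames on time")
```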
Unavailability of the data transport service is another threat to the application that
can have benign or malicious origins. Of course, it is easy to monitor for the
absence of data channels connected to the application. As in the case of timeliness,
the dependency on the data transport service is nearly absolute, unless provisions
have been made for alternate services. Of course, use of redundant data
transmission paths is a standard technique for reducing the probability of loss of the
path.
The quantity of transported data could pose a threat to the application by falling
outside specified upper and lower bounds. The causes could be either malicious or
benign. Regardless of the cause, an excess of data can be countered by isolating
the application from the data that violate the attribute of quantity, whereas a
shortfall, as in the cases of timeliness and availability, can be detected but not
countered.
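Detection of quantity violations can likewise be sketched as a sliding-window check of the delivered volume against assumed upper and lower bounds:

```python
# Sketch: monitoring the quantity attribute over a sliding one-second
# window. Bounds, window length and traffic are hypothetical.
from collections import deque

WINDOW_S = 1.0
LOWER_BYTES, UPPER_BYTES = 1_000, 1_000_000  # assumed bounds per window

class QuantityMonitor:
    def __init__(self):
        self.events = deque()  # (timestamp, bytes delivered)

    def record(self, t: float, size: int) -> str:
        self.events.append((t, size))
        while self.events and self.events[0][0] < t - WINDOW_S:
            self.events.popleft()  # drop events outside the window
        total = sum(size for _, size in self.events)
        if total < LOWER_BYTES:
            return "below lower bound"
        if total > UPPER_BYTES:
            return "above upper bound"
        return "within bounds"

monitor = QuantityMonitor()
print(monitor.record(0.0, 500))        # below lower bound
print(monitor.record(0.5, 2_000))      # within bounds
print(monitor.record(0.9, 2_000_000))  # above upper bound
```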
The major issues concerning the threats to the dependability of the applications are
centred on technologies and procedures for monitoring the application for the
presence of threats originating in the data transport service. From a security
perspective, complete isolation of the application from the data transport service
implies the building of a barrier, such as end-to-end encryption, between the two.
5.4 Threats to the communications infrastructure
Following the reasoning of the preceding section, the threats to the communications
infrastructure can be subdivided into threats at the communications network level
and threats at the internetworking level using the model illustrated in Figure 3.
Although there are various ways of identifying and classifying threats to the
information infrastructure, the approach chosen for this paper is to examine the
threats in terms of their impact on the data transport service. As it was done in the
preceding discussion, the threats will be assessed in terms of their effects on the
dependability attributes of availability, integrity, quantity and timeliness.
The end-to-end virtual link could become unavailable due to faults in the
components, faults in the design of the system, and/or operational faults, which may
manifest themselves within one or more communications networks or at the
internetworking level. In a communications network, component faults can appear in
either hardware or software. Loss of a communications channel, loss of power in a
satellite, the cutting of a trunk line, a malfunctioning computer program and the
crash of an operating system are all examples of component faults. At the internetworking level,
faults can appear in only three components, the corresponding peer ISP routers and
the communications channel between them. The routers may fail and the
communications channel may be cut. Any of these faults would result in the
unavailability of the transport service.
Unavailability could also be caused by operational faults. In communications
networks, erroneous signals can cause the satellite antennas to point in the wrong
direction. A communications channel will become unavailable, if the offered traffic
exceeds the capacity of the channel. At the internetworking level, the operational
faults can appear in the channel connecting the peer routers or in the routers. The
internetwork channel will become unavailable, whenever the traffic between the two
routers exceeds the channel capacity. Similarly, the end-to-end data transport
service can be unavailable, if one router from any pair of the peer routers becomes
unavailable. In its operational state, a router can become unavailable when, either
the processing requirements of the traffic offered to the router exceed the
processing capabilities of the router, or erroneous commands cause the router to
malfunction.
The integrity of the data during their transport through the end-to-end path may be
compromised by faults in a communications network or at the internetworking level.
It may be compromised by physical or operational faults. It should be noted that
the attribute of integrity is conditioned upon the availability of the end-to-end path,
but is independent of the attributes of quantity and timeliness. Integrity can be
measured even if only one unit of data can be transported per unit time. In a
communications network and at the internetworking level, integrity can be
compromised by physical faults, such as noisy channels, or by operational faults
such as dropping of packets when they are processed for transmission in the
routers. Operational faults that can compromise integrity may also take the form of
alteration of the contents of data frames as they pass through routers of the
communications networks or at the internetworking level.
The quantity of transported data is directly dependent on the available bandwidth of
the end-to-end virtual link. As with the other attributes, quantity may be affected by
physical and/or operational faults within the communications systems and at the
internetworking level. Insufficient design capacity for one or more communications
channels is a physical fault that would affect the ability of the link to transport the
quantity of data required by an application. Limitation in channel capacity could
occur within a communications network or in the channel connecting peer routers.
The quantity of the transported data can also be limited by insufficient
processing power of one or more routers anywhere in the virtual link, or by faults
similar to those affecting the availability of the link. An increase in channel noise
anywhere in the end-to-end virtual link
is an operational fault that will degrade the performance of the channel and reduce
the ability of the link to transport the specified quantity of data. A similar operational
fault can manifest itself at the peer routers as increased internetwork traffic that can
reduce the ability of a router to handle the throughput required by the application.
Timeliness is a performance requirement present in all applications. For real-time
control of physical processes that have very short time constants, the timeliness
requirements may be specified in fractions of a second. For other applications, the
timeliness requirement may be given in terms of hours, days or even weeks. For a
number of applications, it may not even be stated explicitly. In the context of the
global information infrastructure, one typically speaks of time scales ranging from
fractions of a second to, perhaps, hours. Beyond that range, the timeliness
requirements are determined by processes other than computations and
communications. The primary reason for including hours in the time scale is that the
traffic patterns in the global infrastructure are affected by the 24-hour cycle of
activities anywhere on the globe.
As in the case of data integrity, the attribute of timeliness is conditioned upon the
availability of the end-to-end virtual link. In other words, given that the link is
available, what is the performance of the link with respect to timeliness? The same
types of faults that have affected the integrity and quantity of the data can also have
an impact on timeliness. Insufficient channel bandwidth and router processing
capacity are examples of design faults that can limit the capability of the virtual link
to transport data within specified time constraints. Such faults have already been
classified as physical faults. On the other hand, an abnormal increase in traffic within a
network or at the internetworking level would be classified as an operational fault,
because it could not be predicted from known models. Similarly, degradation of
performance of equipment beyond that anticipated by the design specifications
would be an operational fault that could generate congestion and pose a threat to
the requirement of timeliness.
In this section, we have given some examples of the various types of threats that
could affect the four attributes of dependability as they apply to the transport of data
through the information infrastructure. These threats arise either at the
communications network level or at the internetworking level and can be addressed
within each network management structure or through agreements among peer
ISPs.
5.5 Observations related to threats and vulnerabilities
The preceding discussion leads to a number of observations regarding
vulnerabilities, threats and their relation to the dependability of complex open
systems.
• Threats to the information infrastructure can be divided into threats to the
communications infrastructure and threats to the application manifested through
the communications infrastructure.
• The identification of threats for each element of the decomposition of the
information infrastructure and the knowledge of the relationship between threats
and the dependability attributes form the basis for specifying dependability
requirements of the information infrastructure, in terms of the dependability
requirements of the applications and the communications networks.
• The objective of security has, so far, been primarily elimination of the effects of
intentional threats on the operation of a system by creating barriers between the
system and the threats. For complex open systems the concept of barriers may
be undesirable and impractical.
• The objective of security may be changed from providing a barrier between the
system and the intentional threats to that of allowing the system to operate in
the presence of intentional threats.
• The philosophy of system design can take into consideration both unintentional
and intentional threats using existing design principles and practices.
Maintaining a design philosophy that aims for dependable systems in the
presence of all categories of threats changes the objective of security from
barrier construction and threat elimination to that of fault-tolerance.
• Changing the objective of security to provide fault-tolerance for intentional
threats eliminates the need for treating security as a concept distinct from
dependability.
In the context of the above analysis, a number of challenges can be summarised:
• Identification and modelling of all the disturbances (unintentional and intentional
threats) to the network and at the internetworking level.
• Identification and modelling of threats to the application through the
communications infrastructure.
• Identification and modelling of interdependencies between applications and the
communications infrastructure.
6 Conclusions
6.1 Findings on method
In this paper we have presented a method for studying complex systems and
applied it to the characterisation of dependability for large-scale information
infrastructures employing open systems such as the Internet as the communications
medium. The methodology is based on the decomposition of infrastructures into two
major components, applications and communications infrastructure. The study
focused on two main issues at the interfaces between these components: a)
Characterisation of dependability in terms of performance attributes and
requirements and b) characterisation of threats and vulnerabilities.
The decomposition approach allows for the establishment of measurable attributes
of dependability for each of the two major components and for translating the
dependability attributes of the applications into those of the communications
infrastructure. The attributes of availability, integrity, quantity and timeliness have
been used in this study for demonstrating the approach. These attributes are
relevant for specifying the performance requirements of the data transport services
offered by the internetwork and for translating those requirements into quality of
service attributes for the communications systems forming the Internet.
The major conclusions and challenges derived from this approach are:
a) Quantifiable attributes and metrics can be specified for end-to-end dependability
requirements of applications relying on the Internet for data transport services.
b) For the component networks, the end-to-end dependability requirements can be
mapped into network quality of service attributes that can be used to design the
network.
c) At the internetworking level, the end-to-end dependability attributes cannot be
satisfied, because IP is inherently a best-effort protocol. There is an open
question whether the current efforts to offer quality of service to the IP would
diminish the primary attraction of IP as an open system that offers easy access
to new entrants.
d) Given that it would be desirable to maintain IP as an open system, the end-to-
end dependability could be improved through new approaches to
internetworking architectures and protocols.
e) There is also a need to develop statistical models for better characterising data
traffic at the internetworking level.
The vulnerabilities of the applications to the threats propagated through the data
transport service offered by internetworking have been examined by partitioning the
threats into intentional and unintentional. For unintentional threats, the conventional
design philosophy has been to model the threats as external (independent)
variables and design a system to satisfy stated performance requirements in the
presence of such threats. This philosophy has been extended to the class of
intentional threats, so that, from a design perspective, all threats are treated as
independent variables.
The major conclusions and challenges drawn from treating all threats as a single
category of independent variables are:
a) There is a need to develop models for identifying and quantifying intentional
threats. Within a wider risk management frame for addressing the problem, this
should also include models of vulnerabilities of the different layers of the
information infrastructure.
b) There is a need for investigating the interdependencies between the data
transport service of the Internet and the applications relying on it, because
categories of threats to the application propagate through the layers of the
information infrastructure and could spread to other applications that rely on
similar technologies.
c) Another challenge posed by the interdependency is the allocation of the effort
(and the distribution of responsibilities) between the data transport service, trust-
related and other intermediate services, and the applications in order to optimise
the end-to-end performance of the applications.
d) In view of the probabilistic nature of all threats, the conventional objective of
security should be changed from protection to fault tolerance and be subsumed
by the broader concept of dependability.
6.2 Open issues
a) The methodology explained in this report is predominantly based on “vertical”
static dependencies between applications and the different layers in the
information infrastructure, i.e. how applications relying on the communications
infrastructure for the transport of data are vulnerable to faults in the data
transport service of the communications infrastructure. There is a need for better
categorising new types of vulnerabilities emanating from complex dynamic
interdependencies between different infrastructures for which the information
infrastructure acts as an interconnection medium and by which faults can
propagate. The figure below exemplifies in a schematic way two different
applications that both rely on the Internet as communications medium and
different types of disruptions that could propagate. Both applications could be
functionally unrelated or could be linked through information dependencies, for
instance in a virtual enterprise setting:
1. Disruption of application 1 with a side effect on a functionally unrelated
application 2. For example, heavier than expected use of the data services of
application 1 available through the Internet could affect the quality of
service of another, unrelated application 2 sharing the same data
service resources.
2. Disruption of a shared resource (communications infrastructure) with
effects on functionally unrelated applications, e.g. a denial-of-service attack
on an ISP.
3. Disruption of a shared resource (communications infrastructure)
affecting application 1 (e.g. temporary unavailability), in turn affecting a
functionally related application 2 due to tight information dependence in a
virtual enterprise setting (e.g. design or financial information in a global
enterprise, or logistics information in supply chain integration).
[Figure: two applications (application 1 and application 2, linked in a virtual
enterprise) relying on a shared communications infrastructure, with disruptions 1, 2
and 3 indicating the propagation paths described above.]
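One way to begin categorising such interdependencies is to represent them as a small dependency graph and trace which components a disruption can reach; the sketch below mirrors the three scenarios, with all names illustrative.

```python
# Sketch: the three disruption scenarios as fault propagation over a tiny
# dependency graph. An edge points from a component to those depending on it.
from collections import deque

dependents = {
    "communications_infrastructure": ["application_1", "application_2"],
    "application_1": ["application_2"],  # information dependence (virtual enterprise)
    "application_2": [],
}

def affected_by(disrupted: str) -> set:
    """All components reachable from the disrupted one (breadth-first)."""
    seen, queue = set(), deque([disrupted])
    while queue:
        for dep in dependents[queue.popleft()]:
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen

print("disrupt application 1:", affected_by("application_1"))
print("disrupt shared infrastructure:", affected_by("communications_infrastructure"))
```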

b) In this study, we have taken a top-down approach in addressing dependability
issues. Other paradigms of system modelling can be explored, for instance
those centred on self-organisation and emergent behaviour. Complex system
science centred on the concept of self-organisation tries to explain how
systems, in the absence of centralised control, can display a high level of
organised behaviour, i.e. properties that emerge spontaneously from the
interaction of a myriad of elementary objects. The applicability of that approach
to characterise the robustness of the Internet to disruption needs clarification.
For instance, an analysis of the Internet based on complex system theory
recently published in Nature [11] suggested that the Internet, although robust to
random faults, would be more vulnerable to malicious attacks (a toy simulation of
this contrast is sketched at the end of this item). Focusing on
integration implies a systematic approach, namely knowledge of the effects of
interconnecting components in a given manner. Focusing on interconnection
instead may carry the connotation of incomplete knowledge of the impact of the
interconnected components on the behaviour of the resulting system. The
manifestation of organised behaviour seems to depend on a set of conditions:
a) a proper structure of the system that fosters or "helps" a certain organisation
pattern; and b) a continuous exploration of the possible configurations of the
system by means of fluctuations, i.e. by means of small random modifications.
In other words, what happens when components do not behave as expected?
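For the robustness contrast reported in [11], a toy simulation along the following lines can reproduce the qualitative effect; it assumes the networkx library, and the graph model, sizes and removal fraction are arbitrary choices.

```python
# Toy simulation sketch (assumes networkx): size of the largest connected
# component of a scale-free graph after random failures versus an "attack"
# that removes the highest-degree nodes first.
import random
import networkx as nx

random.seed(0)
G = nx.barabasi_albert_graph(1000, 2, seed=0)  # scale-free, Internet-like
k = int(0.05 * G.number_of_nodes())            # remove 5% of the nodes

def giant_component_after(graph, victims):
    g = graph.copy()
    g.remove_nodes_from(victims)
    return len(max(nx.connected_components(g), key=len))

random_victims = random.sample(list(G.nodes), k)
hubs = [n for n, _ in sorted(G.degree, key=lambda d: d[1], reverse=True)[:k]]

print("after random faults:  ", giant_component_after(G, random_victims))
print("after targeted attack:", giant_component_after(G, hubs))
```

Under these assumptions the targeted removal fragments the graph far more than random removal, consistent with the qualitative result in [11].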
c) Along the same line, but a different topic, we need to consider evolutionary
aspects of large-scale information infrastructures and their impact on threats and
vulnerabilities. This includes dynamics and ongoing changes due to upgrades,
introduction of new nodes (e.g. embedded systems, portable systems) and
technology modifications, as well as mobile applications (m-commerce) and the
mobility of users and even of components in the network.
d) Also, the information infrastructure model could be enriched by introducing new
intermediaries in the functioning of the information infrastructure (e.g. e-
payments, trusted parties, etc.) and by considering their impact on alleviating some
of the inherent vulnerabilities of the Internet or, on the other hand, on introducing
new types of vulnerabilities and threats.
e) The user perspective has been emphasised in this report. Large-scale
infrastructures are of a critical nature for the functioning of society and therefore
of larger public concern. Socio-economic factors related to acceptability of risks
and cost-effectiveness for countering vulnerabilities of these infrastructures
need to be taken into account.
6.3 Follow-on
a) Presentation of the study at a workshop in February 2001 on interdependencies
between the energy and telecoms sectors, to validate the results of the study with
these industry sectors and subsequently to refine some of the identified issues.
b) Further elaborate within other industry forums and project consortia on open
issues related to infrastructure interdependencies. In particular, further elaborate
on the scope and nature of the problem by developing application scenarios that
exemplify the vulnerabilities related to these interdependencies.
c) Stimulate pan-EU and global research and discussions on dependability of
critical infrastructures. Apart from technology issues, this could also include
socio-economic issues related to risk acceptance.
d) Establish an electronic forum for ongoing discussions on the topic and
information exchange at international level. The dependability forum
(http://deppy.jrc.it) will be used for that purpose.
7 Acknowledgments
We would like to thank representatives from organisations that contributed to this
study. In particular:
Alberto Stefanini: CESI
Angelo Invernizzi: CESI
Oliver Botti: CESI
Marco Invernizzi: University of Genova, Department of Electrical Engineering
John Regnault: BT Applied Research and Technology
Alberto Sanna: Hospital San Raffaele - HSR
Yves Deswarte: LAAS-CNRS
Henri Lagneau: European Rail Research Institute
Marcelo Masera, Ioannis Drossinos, Denis Sarigiannis: JRC-ISIS
8 References
[1] EC IST programme consultation meeting on “Infrastructure Adaptability and
Survivability for Dependable and Reliable Services”. Report of the workshop held in
Brussels on 23 May 2000. Report available also on
http://www.cordis.lu/ist/fpd/wpconsult.htm
[2] “Dependability and Complexity: Exploring Ideas for studying large unbounded
systems”, Terms of Reference for an exploratory study, 20 December 1999.
[3] “Dependability and Complexity: Exploring ideas for studying large unbounded
systems”, Study Method, 18 January 2000.
[4] Nicholas Kyriakopoulos, Marc Wilikens, Dependability of complex open systems:
A unifying concept for understanding Internet-related issues. Third Information
Survivability Workshop, Boston, Massachusetts, USA, 24-26 October 2000. Workshop
sponsored by the IEEE Computer Society and the US State Department and
organized by the CERT Coordination Center, Software Engineering Institute.
http://www.cert.org/research/isw.html
[5] Critical Infrastructure Protection PDD-63, Presidential Directive signed on May
22, 1998, on protecting the Nation's critical infrastructures from both physical and
cyber attack.
[6] Fred Halsall, Data Communications, Computer Networks and Open Systems,
Third Edition, Addison-Wesley, 1993
[7] Jean-Claude Laprie (ed.), Dependability: Basic Concepts and Terminology,
Springer-Verlag, Vienna, 1993
[8] M. Laubach, J. Halpern, “Classical IP and ARP over ATM”, Network Working
Group, Request for Comments: 2225, April 1998
[9] D. Grossman, J. Heinanen, "Multiprotocol Encapsulation over ATM Adaptation
Layer 5", Network Working Group, Request for Comments: 2684, September 1999
[10] Robert H. Anderson et al., "Securing the US Defense Information
Infrastructure: A Proposed Approach", RAND, 1999. ISBN 0-8330-2713-1.
[11] Y. Tu, "How robust is the Internet?", Nature 406, 353-354, 2000.
[12] National Infrastructure Protection Center – NIPC (http://www.nipc.gov),
ASSESSMENT 00-062, FTP Anonymous Login Exploit, December 13, 2000.
[13] Ross Anderson, Patient Confidentiality - At Risk from NHS Wide Networking,
Proceedings of Health Care 1996 (http://www.cl.cam.ac.uk/users/rja14/hcs96.ps.Z).