D3.2.1 - Scenarios Analysis and External Languages Specification - v1.0 - Final
in Service InFrastructures
MASSIF
FP7-257475
Activity: A3
Workpackage: WP3.2
Due Date: December 2010
Submission Date: 2011-02-04
Main Author(s):
Version: v1.0 (Rev: 92)
Status: Final
Dissemination
CO
Nature
Level
Keywords
Reviewers
MASSIF - FP7-257475
D3.2.1 - Scenarios analysis and external languages specification
Version history
Rev     Date         Author        Comments
V0.1    2011-01-14   Hervé Debar
V1.0    2011-02-03   Hervé Debar
V1.0    2011-02-04
Glossary of Acronyms
Abbr    Abbreviation
BSCW
CEF     Common Event Format
CLF     Common Log Format
CSS
DoW     Description of Work
EC      European Commission
EU      European Union
FP7
FTP
IETF    Internet Engineering Task Force
LEA
MSSP
OASIS
ODBC
PU      Public Usage
R&D
RSS
SCP     Secure Copy
SFTP
SIEM
SNMP
SSH     Secure Shell
WMI
W3C
Executive Summary
Deliverable D3.2.1 is one of the first technical productions of the MASSIF project. The description of
work specifies that this document is an analysis of input and output formats from use case scenarios,
and a specification of common message formats for these data streams. This document therefore has
two objectives. The first is to enumerate the data formats and models that the project partners have used
in SIEM-related projects. The second is to provide a first look at the use cases, from a data point of view,
which will spread knowledge and understanding of these use cases among partners and provide a first
evaluation of the importance of the aforementioned data formats. The document consists of two parts:
Alert and Event Languages, describing security alerts and events, and use-case specific data streams,
describing log formats specific to the proposed use cases. The document concludes with an analysis
highlighting several characteristics shared by these languages and event formats, among which are the
simplicity of the information representation, which must be easily readable, timestamping, and the
modularity of the format structure.
Contents
1 Introduction
2.2.1 Reference
2.2.2 Objectives
2.2.3 Structure
Structure overview
Advantages
Issues
Uses
2.3.1 Reference
2.3.2 Objectives
2.3.3 Structure
Structure overview
Advantages
Issues
Uses
2.4.1 Reference
2.4.2 Objectives
2.4.3 Structure
Structure overview
Advantages
Issues
Uses
2.5.1 Reference
2.5.2 Objectives
2.5.3 Structure
Structure overview
Advantages
Issues
Uses
2.6.1 Reference
2.6.2 Objectives
2.6.3 Structure
Structure overview
Advantages
Issues
Uses
2.7.1 Reference
2.7.2 Objectives
2.7.3 Structure
Structure overview
Advantages
Issues
Uses
2.8.2 Objectives
2.8.3 Structure
Structure overview
Advantages
Issues
Uses
2.9.1 Reference
2.9.2 Objectives
2.9.3 Structure
Structure overview
Advantages
Issues
Uses
2.10.1 Objectives
2.10.2 Structure
Architecture
Function
Delivery mode
Filters
Links with other data formats
Description
Advantages
Examples
Description
Drawbacks and issues
Modbus Advantages
Issues (Modbus)
Modbus Example
WSN Example
Advantages (WSN)
Issues (WSN)
List of Figures
1.1 MASSIF Blueprint Architecture (proposed)
List of Tables
2.1 RSA Envision collectors summary
Chapter 1
Introduction
to manage the security status of the monitored system. In this area, we will focus on languages that
are considered to have standard status, either through their publication mechanism or because of
their widespread use.
Use-case specific data streams Chapter 3 describes the data stream formats of the use cases. We are particularly interested in describing the specificities of the content of the data streams, such as the
way they build syslog message contents, as most of the syntax should be covered in the previous
chapters.
undisturbed as possible, or at least the capabilities required by the MASSIF SIEM system in terms of
monitoring and countermeasures should be fixed and acceptable to the business system owners.
Within the monitored system, we have separated three functions: intrusion detection sensors, business process components, and access control. Business process components have as their primary function
to service users; however, they also have auditing capabilities in the form of log files, and minimal policy
enforcement capabilities such as startup and shutdown. Sensors have as their primary function to detect and
report sensitive events, either attacks or anomalies. Access control and identity management are security policy components, whose interaction with the MASSIF SIEM system will be the primary means of
security response. In the current security literature, intrusion prevention systems should be considered
as belonging to the last two categories.
Within the SIEM platform, we separate the operational decision support subsystem, handling the
alerts in real time, and the model management subsystem, which evaluates and updates the decision
support system according to its past performance, to the evolution of the monitored system, and to the
evolution of the global knowledge (vulnerabilities, etc.).
The most relevant part of this architecture for the present deliverable is the set of exchanges between the
two planes, which we model as follows:
Events (push) This stream describes events being pushed by the monitored business system to the
MASSIF SIEM platform. These events are typically alerts or logs driven by the interactions that the
monitored business system has with the outside world (users, updates, etc.). The formats used in
this data stream are described in section 4.1, alert and event languages.
Events (pull) This stream describes events being requested by the MASSIF SIEM platform from the
monitored business system. This allows the business system to store data and only make it available to the MASSIF SIEM if necessary. It is a way for the SIEM platform to ask questions or verify
information that it has on the monitored system. The formats used in this data stream should be
similar to the ones described in section 4.1, alert and event languages.
Configurations (Commands) This stream describes modifications of the behaviour of the business
system that are driven by the MASSIF SIEM system, mainly for update or response purposes.
This stream is important for alert correlation, but is outside the scope of this document.
Audits This stream represents the interaction of the model management subsystem with the monitored
business system. While it is analytically a different data stream, it might be treated as the
combination of both event streams (push and pull), and might be implemented in this way, to simplify
the plane interface management. This stream is particularly important for model acquisition and
maintenance, but is outside the scope of this document.
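The difference between the push and pull event streams can be pictured with a minimal sketch. All class, method and field names below are hypothetical and chosen for illustration only; they do not describe an interface specified by MASSIF.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Event:
    source: str
    message: str


@dataclass
class MonitoredSystem:
    """Hypothetical stand-in for a monitored business system."""
    on_event: Callable[[Event], None]            # push channel towards the SIEM
    _store: List[Event] = field(default_factory=list)

    def record(self, event: Event, push: bool = True) -> None:
        # Every event is kept locally; only some are pushed immediately.
        self._store.append(event)
        if push:
            self.on_event(event)

    def query(self, source: str) -> List[Event]:
        # Pull channel: the SIEM requests stored events when it needs them.
        return [e for e in self._store if e.source == source]


received: List[Event] = []                       # what the SIEM got by push
system = MonitoredSystem(on_event=received.append)
system.record(Event("ids", "port scan detected"))                      # pushed
system.record(Event("webserver", "GET /index.html 200"), push=False)   # stored only
pulled = system.query("webserver")               # retrieved on demand
```

In this sketch the alert reaches the SIEM at once, while the web server log entry stays on the monitored system until the SIEM asks for it.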
The small, finer blue arrows specify the data stream names in the case of sensors and should be
treated as examples only for the purpose of this deliverable.
This blueprint architecture will further evolve as the specifications of the MASSIF SIEM prototype are
developed.
Chapter 2
Syslog               95    51%
                     25    13%
ODBC                 25    13%
SNMP                 20    11%
File Reader                 2%
Agentless Windows           2%
Other connectors     13     7%
Total               186   100%
Table 2.1: RSA Envision collectors summary
2.2.1 Reference
The Common Event Format (CEF)² is specified and provided without charge by ArcSight Inc.³, a SIEM
vendor, as part of its strategy to foster interoperability between its SIEM product and sensor vendors.
2.2.2 Objectives
The Common Event Format (CEF) is an open log management standard that improves the interoperability of security-related information from different security and network devices and applications. CEF
has been designed to enable technology companies and customers to use a common event log format
so that data can easily be collected and aggregated for analysis by an enterprise management system.
2 http://www.arcsight.com/collateral/CEFstandards.pdf
3 http://www.arcsight.com/
2.2.3 Structure
Structure overview
CEF is an extensible, text-based, high-performance format designed to support multiple device types,
from both security and non-security devices and applications, in the simplest manner possible. This contrasts with
other standards that target a single component of the security infrastructure, are tied to a specific transport protocol, or are designed specifically for applications and cannot support today's high-performance,
real-time security requirements.
To simplify integration, the syslog message format is used as a transport mechanism. However, if an
event producer is unable to write syslog messages, it is still possible to write the events to a file.
The basic grammar of the format includes the self-explanatory fields:
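As an illustration, the header layout published in the ArcSight CEF specification is a pipe-delimited prefix (version, device vendor, device product, device version, signature ID, name, severity) followed by key=value extensions. The sketch below splits such a line into its fields. The function name, the dictionary keys and the sample event are assumptions made for this example, and escaping is handled only in a simplified way.

```python
import re

# Header field names in the order in which they appear after the "CEF:" prefix.
CEF_HEADER_FIELDS = [
    "version", "device_vendor", "device_product", "device_version",
    "signature_id", "name", "severity",
]


def parse_cef(line):
    """Split one CEF line into (header dict, extension dict)."""
    start = line.find("CEF:")
    if start < 0:
        raise ValueError("not a CEF message")
    body = line[start + len("CEF:"):]
    # Split on pipes that are not escaped with a backslash; an 8th part,
    # if present, is the extension.
    parts = re.split(r"(?<!\\)\|", body, maxsplit=7)
    if len(parts) < 7:
        raise ValueError("incomplete CEF header")
    header = {
        key: value.replace("\\|", "|").replace("\\\\", "\\")
        for key, value in zip(CEF_HEADER_FIELDS, parts)
    }
    extension = {}
    if len(parts) == 8:
        # A value runs until the next " key=" token or the end of the line.
        for match in re.finditer(r"(\w+)=(.*?)(?=\s+\w+=|$)", parts[7]):
            extension[match.group(1)] = match.group(2).strip()
    return header, extension


# Illustrative sample: a syslog prefix followed by the CEF payload.
sample = ("Sep 19 08:26:10 host CEF:0|Security|threatmanager|1.0|100|"
          "worm successfully stopped|10|src=10.0.0.1 dst=2.1.2.2 spt=1232")
header, extension = parse_cef(sample)
```

Anything before the "CEF:" marker is ignored, so the same function applies whether the event arrives through syslog or is read from a file.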
Advantages
The Common Event Format promotes interoperability between various event (or log) generating devices.
Although each vendor has its own format for reporting event information, these event formats often lack
the key information necessary to integrate the events from their devices.
The ArcSight standard attempts to improve the interoperability of infrastructure devices by aligning
the logging output from various technology vendors.
The Extension Dictionary from the CEF provides a broad set of predefined extension keys which
cover most event log requirements.
Issues
Custom extension keys are recommended for use only when no reasonable mapping of the information
can be established for a predefined CEF key. While the custom extension key mechanism can be used
to safely send information to CEF consumers for persistence, there are certain limitations as to when
and how to access the data mapped into them.
Data submitted to ArcSight Logger using custom key extensions is retained in the system; however,
it is not available for use in the Logger reporting infrastructure.
Uses
Use of the CEF format is limited to ArcSight deployments, despite the vendor's lobbying efforts.
2.3.1 Reference
The Common Log Format (CLF) and its sibling the Extended Common Log Format (ECLF) are specified
by the W3C community⁵ and by the Apache developer community⁶. This format falls into the category
of de facto standards; while it is widely adopted by web servers, there is no normative reference.
5 http://www.w3.org/Daemon/User/Config/Logging.html#common-logfile-format
6 http://httpd.apache.org/docs/2.2/logs.html#common
2.3.2 Objectives
The Common Log Format is used by web servers, in particular the Apache web server, to trace all
requests processed by the server. It is generally shared by all log files (access.log, error.log, and others).
While the Apache web server offers the possibility to customize the log format, the users tend to keep
the default configuration, using either the simple CLF format, or its extension the ECLF format, which
shares the same initial description.
2.3.3 Structure
Structure overview
The CLF format stores the following information:
IP address of the origin of the request as presented to the server. If the requesting browser is behind
a proxy, the address of the proxy will show up in the logs.
identd identity of the client as specified in RFC 1413[8].
userid of the requester as determined by HTTP authentication.
Timestamp of the request.
Request line presented by the client, including the method, the URI and the protocol.
Status code that was returned to the client, indicating how the server was able to fulfil the request.
Size of the object returned to the client.
The ECLF format appends, as quoted strings after the information provided by CLF, additional information provided by the client, such as the referring URL and the user agent identifying the client's browser.
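The field list above maps directly onto a regular expression. The following sketch assumes the layout produced by the default Apache configuration, with the referrer and user agent of the extended format appended as two quoted strings; the pattern, function name and sample lines are illustrative, and requests containing escaped quotes are not handled.

```python
import re
from datetime import datetime

CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<identd>\S+) (?P<userid>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
    # Optional extended fields: referrer and user agent.
    r'(?: "(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)")?$'
)


def parse_clf(line):
    """Return a dict of CLF fields, or None if the line does not match."""
    match = CLF_PATTERN.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["timestamp"] = datetime.strptime(
        record["timestamp"], "%d/%b/%Y:%H:%M:%S %z")
    record["status"] = int(record["status"])
    # A "-" size means that no content was returned to the client.
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    return record


plain = ('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
         '"GET /apache_pb.gif HTTP/1.0" 200 2326')
extended = plain + ' "http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"'
basic_record = parse_clf(plain)
extended_record = parse_clf(extended)
```

For a plain CLF line the two optional fields are simply returned as None, so one parser can serve both variants.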
Advantages
The CLF format is very easy to use and very informative. Even though it limits itself to HTTP header
information, it summarizes the aspects of the web server's activity that matter from the point of view
of security: who asked for what, when, and how the server reacted. It is extremely compact and thus
efficient in terms of processing. Being widely adopted by web server developers and proxy developers,
it provides a solid basis for analysis and detection of malicious activity aiming to subvert the web server
through the use of the HTTP protocol.
Issues
The CLF format suffers from several issues that affect the detection and diagnosis of
attacks:
Multiplicity of lines Since the HTTP server may serve multiple requests for a single page view, a complete diagnosis may require the analysis of multiple lines which do not necessarily share an
identifying token.
Lack of payload information The log file does not contain HTTP payload information. This means that
for methods such as POST, the complete information is not available for diagnosis. This may be a
serious limitation for diagnosing attacks such as XSS or SQL injection, for example if content is
pushed into comments in dynamic web sites.
Lack of server-side information The log file does not contain information identifying the web server
(such as the virtual server accessed). This is a serious limitation in identifying the exact target of
the attacker.
Uses
The CLF format is very widely used by web servers.
2.4.1 Reference
The Intrusion Detection Message Exchange Format (IDMEF) is published by the Internet Engineering
Task Force (IETF) as RFC 4765[5].
2.4.2 Objectives
The Intrusion Detection Message Exchange Format (IDMEF)[13] is intended to be a standard data format that automated intrusion detection systems can use to report alerts about events that they deem
suspicious. The development of this standard format aims at enabling interoperability among commercial, open source, and research systems, allowing users to mix-and-match the deployment of these
systems according to their strong and weak points to obtain an optimal implementation. It standardizes
messages between a sensor providing security analysis and detecting threats, and a manager which
receives and processes these messages. In the MASSIF context, the manager should be either the SIEM
platform itself or a gateway to it.
2.4.3 Structure
Structure overview
IDMEF is built as a UML class diagram of components. The standard defines two types of messages,
Alert (for security information) and Heartbeat (for management information). A message is an aggregation of components, modeling various entities that are part of an intrusion-detection sensor. At the
top level, a message requires a timestamp (CreateTime in IDMEF), a meaning (Classification in IDMEF)
and a generating sensor (Analyzer in IDMEF). The two other major components are the target and the
source of the attack. Each of these blocks has a complex structure that attempts to capture the various
facets that characterize a component of an information system. One example of the elementary components that compose these larger blocks is the notion of Node, which models a machine and is found
in Analyzer, Source and Target alike.
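The top-level structure described above can be sketched with Python's standard xml.etree.ElementTree module. The element and attribute names follow RFC 4765; the function name, the message identifier and the addresses are assumptions made for this example, and the XML namespace declaration is omitted for brevity.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

NTP_EPOCH_OFFSET = 2208988800  # seconds between 1900-01-01 and 1970-01-01


def build_alert(analyzer_id, classification, source_ip, target_ip, when):
    """Build a minimal IDMEF Alert; `when` must be a timezone-aware datetime."""
    when = when.astimezone(timezone.utc)
    message = ET.Element("IDMEF-Message", {"version": "1.0"})
    alert = ET.SubElement(message, "Alert", {"messageid": "example-1"})
    # Generating sensor.
    ET.SubElement(alert, "Analyzer", {"analyzerid": analyzer_id})
    # Timestamp, with the NTP representation carried in the ntpstamp attribute.
    ntpstamp = "0x%08x.0x%08x" % (int(when.timestamp()) + NTP_EPOCH_OFFSET, 0)
    create_time = ET.SubElement(alert, "CreateTime", {"ntpstamp": ntpstamp})
    create_time.text = when.strftime("%Y-%m-%dT%H:%M:%SZ")
    # Source and Target reuse the same elementary Node component.
    for tag, ip in (("Source", source_ip), ("Target", target_ip)):
        node = ET.SubElement(ET.SubElement(alert, tag), "Node")
        address = ET.SubElement(node, "Address", {"category": "ipv4-addr"})
        ET.SubElement(address, "address").text = ip
    # Meaning of the alert.
    ET.SubElement(alert, "Classification", {"text": classification})
    return message


idmef = build_alert(
    "sensor-01", "Port scan", "192.0.2.50", "192.0.2.10",
    datetime(2011, 2, 4, 12, 0, 0, tzinfo=timezone.utc),
)
xml_text = ET.tostring(idmef, encoding="unicode")
```

The loop over Source and Target shows the modularity of the model: the same Node and Address components are reused wherever a machine has to be described.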
Advantages
Semantic IDMEF is extremely conscious of the semantics of the information it manipulates, and does
much more than provide a syntax. Furthermore, it provides rationales and explanations to limit
interpretation by developers and thus reduce ambiguity. IDMEF also includes many constants that
strongly type objects. While the manner in which these constants are defined may not be the best,
the idea of strongly typing objects is very important in contributing to strong and clear semantics.
Modularity IDMEF is built of a set of components and thus is extremely modular. It also provides
facilities for referencing components instead of including them in the message, which contributes
to the efficiency in transferring and sharing identical information.
Extensibility IDMEF provides facilities for including structured information in a message, in the
form of the AdditionalData blob. This facility enables including original messages within IDMEF, or
information that becomes available at a later stage.
Issues
Dissemination Even though IDMEF is an RFC, it is only an experimental one and it has not been
widely picked up by security product developers, as sensor developers prefer simpler and less
constrained solutions, and as SIEM developers have preferred to own their base formats.
XML IDMEF is an XML format, thus it is quite verbose. While for transport purposes it compresses
quite well, it should not be used for storing information, nor for developing DB schemas. Also, the
normative reference is the XML DTD and not the XML schema, thus type checking is less precise.
Extensibility IDMEF is extensible through the use of XML blobs. The idea is sound and useful, but there
is currently no mechanism for creating and sharing standard or useful patterns out of these blobs.
Uses
The IDMEF format is used mostly in the research community as a standard back-end for intrusion detection and alert correlation research projects and communities. It is also used by the Prelude SIEM
environment⁷ as its back-end data format (although the companion transport protocol IDXP is not used
by Prelude).
2.5.1 Reference
trustedcomputing.org
http://www.trustedcomputinggroup.org/developers/trusted_network_connect/
Specification document of IF-MAP 2.0 [11]
Specification document of IF-MAP Metadata for Network Security [12]
2.5.2 Objectives
IF-MAP is an interface specification between a Metadata Access Point (MAP) Server and entities that
either publish metadata or subscribe to metadata from the MAP. The entities are called IF-MAP
clients, while the Server is referred to as the Metadata Access Point (MAP) or as the IF-MAP Server. The latter
provides functionalities to publish metadata and to search through the stored metadata, and enables clients
to subscribe to specific data and be notified in the event of data changes.
As IF-MAP aims to enable the structured collection and provision of data, it is not only a language to
describe (security) events. Nevertheless, a specification of a metadata language for network security is
part of IF-MAP [12]. As IF-MAP has been created by the TNC working group of the Trusted Computing
Group, its foremost purpose is the gathering of information that can be used in order to apply access
decisions in a networking environment. Thus the metadata comprises elements like registered address
7 http://www.prelude-ids.org/
2.5.3 Structure
Structure overview
The IF-MAP specification currently comprises two documents. One is the general description
and SOAP binding TNC_IFMAP_v2_0r36.pdf, also referred to as IF-MAP 2.0 [11]. The other is the
specification of IF-MAP Metadata for Network Security, which is v1.0 revision 25 at the time of writing this document [12]. Additionally, for a quick overview, we propose reading the IF-MAP FAQ under
www.trustedcomputing.org.
The session-based communication between a MAP client and server is always initiated by the client
and is based on SOAP. The commands comprise different kinds of publish (update, delete etc.), subscribe (e.g. notification poll) and search.
The data model of IF-MAP comprises two types of data: identifiers (e.g. identities of several
types, mac-address, ip-address) and metadata. Identifiers can be related to each other by links, and
metadata is attached to an identifier or to a link. Figure 2.5.3 visualises the data model used in IF-MAP where identifiers are represented by ovals, metadata
is represented by rectangles, and links are represented by lines connecting identifiers.
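The data model and the three kinds of commands can be pictured with a small in-memory sketch. It leaves out SOAP, sessions and the XML metadata schema entirely; the class and method names and the metadata strings are hypothetical and only meant to show how identifiers, links and metadata relate to each other.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Identifier:
    kind: str    # e.g. "ip-address", "mac-address", "identity"
    value: str


class MapServer:
    """Toy stand-in for a MAP server; not an implementation of IF-MAP."""

    def __init__(self):
        # Key: a set of one identifier (metadata on an identifier)
        # or of two identifiers (metadata on the link between them).
        self._metadata = defaultdict(list)
        self._subscriptions = defaultdict(set)
        self._pending = defaultdict(list)

    def publish(self, metadata, ident, other=None):
        key = frozenset([ident] if other is None else [ident, other])
        self._metadata[key].append(metadata)
        # Queue a notification for every client watching an affected identifier.
        for client, watched in self._subscriptions.items():
            if key & watched:
                self._pending[client].append((key, metadata))

    def subscribe(self, client, ident):
        self._subscriptions[client].add(ident)

    def poll(self, client):
        # Client-initiated: returns and clears the queued notifications.
        return self._pending.pop(client, [])

    def search(self, ident):
        # Metadata on the identifier itself and on the links that touch it.
        return [m for key, items in self._metadata.items()
                if ident in key for m in items]


server = MapServer()
ip = Identifier("ip-address", "192.0.2.10")
mac = Identifier("mac-address", "00:11:22:33:44:55")
server.subscribe("siem", ip)
server.publish("ip-mac", ip, mac)            # metadata on a link
server.publish("policy violation", ip)       # metadata on an identifier
notifications = server.poll("siem")
```

Notifications are collected by polling rather than delivered by callback, which mirrors the rule that every exchange is initiated by the client.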
Advantages
Provision of an interface for various kinds of security information
A central database for information based on one protocol
A simple publish/subscribe data collector
The standard enables integration of application and system input and output from different vendors
Opportunity to create a vocabulary explicitly for the needs of MASSIF and thereby have an influence on the standardisation process
IF-MAP is intrinsically defined to be extensible
Close contact of SIT with FHH (open source IF-MAP server irond) and Infoblox (IF-MAP server IBOS and IF-MAP starter kit), and opportunities for cooperation (user group) and dissemination (through Infoblox, who are actively advertising every adoption of IF-MAP)
Issues
As the specification of the metadata is not yet complete and currently covers only NAC information, there is no fully-fledged vocabulary. Nevertheless, one could add additional metadata types
through the use of other tags.
The standardisation of IF-MAP is not finished, so the specification might evolve during the course
of MASSIF. Standardisation with the IETF is planned for summer 2011, but such a process usually takes several
years.
Uses
As the metadata definition does not yet extend beyond network security information, the typical applications
according to the TCG are:
Federation between remote access and network access control (NAC).
Integration of NAC with endpoint monitoring and e.g. data leak detection.
Integration of physical access control with NAC.
Federation of authentication information, single sign on/off.
Real time information gathering and processing.
There are many potential applications of specific interest to the goals of MASSIF. The TCG mentions applications in the fields of smart grid and cloud security, for reasons that also enable IF-MAP to facilitate
SIEM integration, such as the aggregation, correlation and distribution of data from various applications and
systems.
2.6.1 Reference
The Incident Object Description Exchange Format (IODEF) is standardized by the Internet Engineering Task Force (IETF) as RFC 5070[4].
2.6.2 Objectives
The Incident Object Description Exchange Format (IODEF) is a format for representing computer security information commonly exchanged between Computer Security Incident Response Teams (CSIRTs).
It provides an XML representation for conveying incident information across administrative domains
between parties that have an operational responsibility of remediation or a watch-and-warning over a
defined constituency. The data model encodes information about hosts, networks, and the services running on these systems; attack methodology and associated forensic evidence; impact of the activity; and
limited approaches for documenting workflow. The structured format provided by the IODEF allows for
increased automation in processing of incident data; decreased effort in normalizing similar data from
different sources; and a common format on which to build interoperable tools for incident handling and
subsequent analysis, specifically when data comes from multiple constituencies.
2.6.3 Structure
Structure overview
The IODEF implementation is specified as an Extensible Markup Language (XML) document type definition. The data model is composed of nineteen classes that describe the data related to the incident
(e.g. incident ID, related activity, time, assessment, history, etc.). The data model serves as a transport
format; it does not attempt to dictate a definition for an incident, but rather assumes a broad understanding
of an incident that is flexible enough to encompass most operators. Since describing an incident for all
definitions requires an extremely complex data model, the IODEF intends to be a framework to convey
commonly exchanged incident information, ensuring that there are ample mechanisms for extensibility
to support organization-specific information and techniques to reference the information kept outside the
model.
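A minimal incident report gives an idea of how these classes nest. The sketch below uses Python's standard xml.etree.ElementTree module. The element and attribute names are taken from RFC 5070, but the function name and all values are illustrative assumptions, and the XML namespace is omitted for brevity.

```python
import xml.etree.ElementTree as ET


def build_incident(incident_id, csirt_name, report_time, impact_type, contact_email):
    """Build a minimal IODEF document containing a single incident."""
    document = ET.Element("IODEF-Document", {"version": "1.00"})
    incident = ET.SubElement(document, "Incident", {"purpose": "reporting"})
    # Identifier of the incident, scoped by the name of the issuing CSIRT.
    ET.SubElement(incident, "IncidentID", {"name": csirt_name}).text = incident_id
    ET.SubElement(incident, "ReportTime").text = report_time
    # Assessment of the impact of the activity.
    assessment = ET.SubElement(incident, "Assessment")
    ET.SubElement(assessment, "Impact", {"type": impact_type})
    # Contact information for the party that created the report.
    contact = ET.SubElement(
        incident, "Contact", {"role": "creator", "type": "organization"})
    ET.SubElement(contact, "Email").text = contact_email
    return document


iodef = build_incident(
    "189493", "csirt.example.com", "2011-02-04T12:00:00+00:00",
    "recon", "contact@csirt.example.com",
)
iodef_text = ET.tostring(iodef, encoding="unicode")
```

Organization-specific details would be added through the extension mechanisms rather than by changing these core elements.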
Advantages
The overriding purpose of the IODEF is to enhance the operational capabilities of CSIRTs. Community
adoption of the IODEF provides an improved ability to resolve incidents and convey situational awareness by simplifying collaboration and data sharing.
Implementing the IODEF in XML provides numerous advantages. Its extensibility makes it ideal for
specifying a data encoding framework that supports various character encodings, such as UTF-8 and
UTF-16. Likewise, the abundance of related technologies (e.g., XSL, XPath, XML-Signature) makes for
simplified manipulation.
The data model supports multiple translations of free-form text. The intent is to allow the identical
text to be encoded in different instances of the same class, but each being in a different language. This
approach allows an IODEF document author to send recipients speaking different languages an identical
document.
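The multi-language mechanism described above can be illustrated with a minimal sketch. It assumes the IODEF version 1 XML namespace and the standard xml:lang attribute; the incident identifier, CSIRT name and texts are hypothetical, and the fragment deliberately omits elements that a complete IODEF document requires (such as Assessment and Contact), so it is not a schema-valid document.

```python
import xml.etree.ElementTree as ET

IODEF_NS = "urn:ietf:params:xml:ns:iodef-1.0"            # IODEF v1 namespace (assumed)
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"  # serialized as xml:lang

ET.register_namespace("iodef", IODEF_NS)


def q(tag):
    """Return the namespace-qualified name of an IODEF element."""
    return f"{{{IODEF_NS}}}{tag}"


def build_incident(incident_id, csirt, report_time, descriptions):
    """Build a partial IODEF document with one Description per language.

    `descriptions` maps a language code to the free-form text in that language.
    """
    doc = ET.Element(q("IODEF-Document"), {"version": "1.00"})
    incident = ET.SubElement(doc, q("Incident"), {"purpose": "reporting"})
    ident = ET.SubElement(incident, q("IncidentID"), {"name": csirt})
    ident.text = incident_id
    ET.SubElement(incident, q("ReportTime")).text = report_time
    # The same class (Description) is instantiated once per language.
    for lang, text in descriptions.items():
        description = ET.SubElement(incident, q("Description"))
        description.set(XML_LANG, lang)
        description.text = text
    return ET.tostring(doc, encoding="unicode")
```

A recipient can then select the Description instance whose xml:lang value matches its own language and ignore the others.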
Issues
XML is fundamentally a text representation, which makes it inherently inefficient when binary data must
be embedded or large volumes of data must be exchanged.
In order to support the changing activity of CSIRTs, the IODEF data model will need to evolve along
with them. Internationalization and localization are of specific concern to the IODEF, since it is only through
collaboration, often across language barriers, that certain incidents can be resolved. The IODEF supports
this goal by depending on XML constructs, and through explicit design choices in the data model.
The domain of security analysis is not fully standardized and must rely on free-form textual descriptions. The IODEF attempts to strike a balance between supporting this free-form content, while still
allowing automated processing of incident information.
As the data encoded by the IODEF might be considered privacy sensitive by the parties exchanging
the information or by those described by it, care needs to be taken in ensuring the appropriate disclosure
during both document exchange and subsequent processing. Similarly, care must be taken by the parser
to properly authenticate the recipient of the document and ascribe an appropriate confidence to the data
prior to action.
Uses
We do not have specific information about the actual use of the IODEF by FIRST or CERT organizations.
2.7.1 Reference
The Internet Protocol Flow Information Export (IPFIX) requirements are published by the Internet Engineering Task Force (IETF) as RFC 3917[10]. The protocol itself is specified in RFC 5101[2].
2.7.2 Objectives
The Internet Protocol Flow Information Export (IPFIX) protocol was created to meet the need for a standard for
exporting Internet Protocol flow information collected from routers, probes and other devices used by
mediation systems, accounting/billing systems and network management systems. The IPFIX standard
defines how IP flow information has to be formatted and transferred from an exporter to a collector. Previously, many network operators relied on the proprietary Cisco Systems NetFlow format
for traffic flow information export. The IPFIX Working Group chose NetFlow version 9 as the basis for the
standardization. The working group submitted the IPFIX Protocol Specification to the IESG for approval
in 2006.
2.7.3 Structure
Structure overview
IPFIX defines a flow as any number of packets observed in a specific timeslot and sharing a number of
properties, such as "same source, same destination, same protocol". The IPFIX protocol defines a precise
architecture for exporting flow information. This architecture includes an observation point for
collecting IP packets belonging to a specific observation domain. A metering process filters data packets
and aggregates information about these packets; this information defines the Flow Records. A Flow
Record contains metrics related to packet header data, timestamping, sampling and classification. Flow
Records are sent by the IPFIX exporter to an IPFIX collector, which is in charge of receiving and cataloguing
IPFIX packets; exporters and collectors are in a many-to-many relation and follow a push-based paradigm.
The layout of the IPFIX data is transmitted to the collector by means of Template Records, which
can be standard or user-defined. A Template Record is an n-tuple of (type, length) pairs that fully defines
the structure and the semantics of a specific set of metrics sent to the collector. The collector
distinguishes between different Data Records by means of their Template ID. Data Records are composed of a certain
number of Information Elements, which represent the described attributes.
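The role of a Template Record can be sketched as follows. This is a simplified illustration, not a complete IPFIX decoder: it assumes the template has already been received, ignores message and set headers, variable-length elements and enterprise-specific elements, and uses a hypothetical template with ID 256. The numeric identifiers are IANA-registered IPFIX Information Elements; the function and table names are illustrative.

```python
import ipaddress

# Names of the IANA-registered Information Elements used in this sketch.
ELEMENT_NAMES = {
    1: "octetDeltaCount",
    4: "protocolIdentifier",
    8: "sourceIPv4Address",
    12: "destinationIPv4Address",
}

# Hypothetical template 256: ordered (information element id, length in octets) pairs.
TEMPLATES = {256: [(8, 4), (12, 4), (4, 1), (1, 8)]}


def decode_data_record(template_id, payload):
    """Decode one fixed-length Data Record using the template it refers to."""
    template = TEMPLATES[template_id]
    expected = sum(length for _, length in template)
    if len(payload) != expected:
        raise ValueError(f"expected {expected} octets, got {len(payload)}")
    record, offset = {}, 0
    for element_id, length in template:
        raw = payload[offset:offset + length]
        offset += length
        name = ELEMENT_NAMES.get(element_id, f"element{element_id}")
        if element_id in (8, 12):
            record[name] = str(ipaddress.IPv4Address(raw))
        else:
            # Integer elements are encoded in network byte order.
            record[name] = int.from_bytes(raw, "big")
    return record
```

Without the template the payload is an opaque byte string; the Template ID is what allows the collector to recover both field boundaries and field meaning.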
Advantages
Modularity: The IPFIX architecture and its many-to-many paradigm are modular in operation and fit the needs of MASSIF for a distributed data metering system and for collecting data from
remote sites.
Flexibility: The IPFIX standard, by means of Template Records, makes it possible to extend the data
message format with user-defined fields, for example to introduce non-standard Information
Elements. Moreover, it allows the definition of the message structure. The standard works over
different transport protocols such as TCP, UDP or SCTP.
Interoperability: The IPFIX protocol is a standard and can rely on a large number of compliant
devices from several vendors, reducing the number of ad-hoc solutions.
Extensibility: IPFIX information is not limited to flows; it can also cover network behavior, performance behavior, application behavior, host behavior and security analysis.
Issues
Encryption: Analysis of encrypted packets is a relevant issue for proper data inspection. In encrypted
scenarios, IP packet fields are encrypted and unobservable at several layers, so some metrics,
related for example to protocol headers, cannot be evaluated.
Hardware requirements: Probes must be deployed on every link to be monitored. Moreover, a simple router device cannot sustain deep inspection on high-bandwidth networks.
Collector flooding: Since the protocol is push-based, the collector could suffer from excessive load coming
from the probes. The export configuration must be chosen carefully.
Uses
The IPFIX format is widely implemented and adopted by generic network devices, such as routers, and
by network analysis devices provided by several vendors. IPFIX-compliant devices are used to support
effective network measurement, providing vital information on the health of the managed networks.
The collected network information can be used for several purposes; in particular, the standard provides a strong
back-end for security functions such as intrusion detection.
2.8.1 Reference
The Syslog Protocol is specified by the Internet Engineering Task Force (IETF) as RFC 5424[6].
2.8.2 Objectives
The need for a new layered specification has arisen because standardization efforts for reliable and
secure syslog extensions suffer from the lack of a Standards-Track and transport-independent RFC.
Without this, each other standard needs to define its own syslog packet format and transport mechanism,
which over time will introduce subtle compatibility issues. The goal of this architecture is to separate
message content from message transport while enabling easy extensibility for each layer.
2.8.3 Structure
Structure overview
This protocol utilizes a layered architecture, which allows the use of any number of transport protocols
for transmission of syslog messages. It also provides a message format that allows vendor-specific
extensions to be provided in a structured way. The syslog protocol does not provide acknowledgment
of message delivery. Though some transports may provide status information, conceptually, syslog is a
pure simplex communication protocol.
The syslog message has the following ABNF[3] definition:
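SYSLOG-MSG      = HEADER SP STRUCTURED-DATA [SP MSG]

HEADER          = PRI VERSION SP TIMESTAMP SP HOSTNAME
                  SP APP-NAME SP PROCID SP MSGID
PRI             = "<" PRIVAL ">"
PRIVAL          = 1*3DIGIT ; range 0 .. 191
VERSION         = NONZERO-DIGIT 0*2DIGIT
HOSTNAME        = NILVALUE / 1*255PRINTUSASCII

APP-NAME        = NILVALUE / 1*48PRINTUSASCII
PROCID          = NILVALUE / 1*128PRINTUSASCII
MSGID           = NILVALUE / 1*32PRINTUSASCII

TIMESTAMP       = NILVALUE / FULL-DATE "T" FULL-TIME
FULL-DATE       = DATE-FULLYEAR "-" DATE-MONTH "-" DATE-MDAY
DATE-FULLYEAR   = 4DIGIT
DATE-MONTH      = 2DIGIT  ; 01-12
DATE-MDAY       = 2DIGIT  ; 01-28, 01-29, 01-30, 01-31 based on
                          ; month/year
FULL-TIME       = PARTIAL-TIME TIME-OFFSET
PARTIAL-TIME    = TIME-HOUR ":" TIME-MINUTE ":" TIME-SECOND
                  [TIME-SECFRAC]
TIME-HOUR       = 2DIGIT  ; 00-23
TIME-MINUTE     = 2DIGIT  ; 00-59
TIME-SECOND     = 2DIGIT  ; 00-59
TIME-SECFRAC    = "." 1*6DIGIT
TIME-OFFSET     = "Z" / TIME-NUMOFFSET
TIME-NUMOFFSET  = ("+" / "-") TIME-HOUR ":" TIME-MINUTE

STRUCTURED-DATA = NILVALUE / 1*SD-ELEMENT
SD-ELEMENT      = "[" SD-ID *(SP SD-PARAM) "]"
SD-PARAM        = PARAM-NAME "=" %d34 PARAM-VALUE %d34
SD-ID           = SD-NAME
PARAM-NAME      = SD-NAME
PARAM-VALUE     = UTF-8-STRING ; characters '"', '\' and ']'
                               ; MUST be escaped.
SD-NAME         = 1*32PRINTUSASCII
                  ; except '=', SP, ']', %d34 (")

MSG             = MSG-ANY / MSG-UTF8
MSG-ANY         = *OCTET ; not starting with BOM
MSG-UTF8        = BOM UTF-8-STRING
BOM             = %xEF.BB.BF

UTF-8-STRING    = *OCTET ; UTF-8 string as specified
                         ; in RFC 3629

OCTET           = %d00-255
SP              = %d32
PRINTUSASCII    = %d33-126
NONZERO-DIGIT   = %d49-57
DIGIT           = %d48 / NONZERO-DIGIT
NILVALUE        = "-"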
Syslog message size limits are dictated by the syslog transport mapping in use. There is no upper
limit per se. Each transport mapping defines the smallest maximum message length that must be supported,
and this value must be at least 480 octets.
The TIMESTAMP field is a formalized timestamp derived from [RFC3339].
The HOSTNAME field identifies the machine that originally sent the syslog message.
The APP-NAME field should identify the device or application that originated the message. It is a
string without further semantics. It is intended for filtering messages on a relay or collector.
The PROCID field is a value that is included in the message, having no interoperable meaning,
except that a change in the value indicates there has been a discontinuity in syslog reporting. The
field does not have any specific syntax or semantics; the value is implementation-dependent and/or
operator-assigned.
The MSGID should identify the type of message. For example, a firewall might use the MSGID
TCPIN for incoming TCP traffic and the MSGID TCPOUT for outgoing TCP traffic. Messages with the
same MSGID should reflect events of the same semantics. The MSGID itself is a string without further
semantics. It is intended for filtering messages on a relay or collector.
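The header fields described above can be extracted with a short sketch such as the following. It is a simplified illustration rather than a conformant RFC 5424 parser: it does not enforce the per-field length limits, does not validate the timestamp, and leaves STRUCTURED-DATA and MSG unparsed in a field named rest. The function and field names are illustrative choices.

```python
import re

# HEADER = PRI VERSION SP TIMESTAMP SP HOSTNAME SP APP-NAME SP PROCID SP MSGID
HEADER_RE = re.compile(
    r"<(?P<prival>\d{1,3})>(?P<version>[1-9]\d{0,2}) "
    r"(?P<timestamp>\S+) (?P<hostname>\S+) (?P<app_name>\S+) "
    r"(?P<procid>\S+) (?P<msgid>\S+) (?P<rest>.*)",
    re.DOTALL,
)


def parse_header(message):
    """Split a syslog message into its header fields and the remainder."""
    match = HEADER_RE.match(message)
    if match is None:
        raise ValueError("not a well-formed syslog header")
    fields = match.groupdict()
    prival = int(fields.pop("prival"))
    if prival > 191:
        raise ValueError("PRIVAL out of range 0..191")
    # PRIVAL encodes facility * 8 + severity.
    fields["facility"], fields["severity"] = divmod(prival, 8)
    fields["version"] = int(fields["version"])
    # The NILVALUE "-" marks a field whose value is not available.
    for key in ("timestamp", "hostname", "app_name", "procid", "msgid"):
        if fields[key] == "-":
            fields[key] = None
    return fields
```

Because APP-NAME and MSGID are plain strings without further semantics, a relay or collector would typically filter on the returned values by simple equality.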
Advantages
The syslog format tries to provide a solid basis that allows code to be written once for each syslog feature
rather than once for each transport. Without this format, each other standard would need to define its
own syslog packet format and transport mechanism, which over time will introduce subtle compatibility
issues.
Issues
Syslog messages may contain control characters, including the NUL character. In addition, invalid UTF-8 sequences may
be used by an attacker to inject ASCII control characters. Similarly, message truncation can be misused
by an attacker to hide vital log information.
There is no mechanism in the syslog protocol to detect message replay. An attacker may record a
set of messages that indicate normal activity of a machine. At a later time, that attacker may remove
that machine from the network and replay the syslog messages to the relay or collector.
Some messages may be lost because there is no mechanism to ensure delivery, and the underlying
transport may be unreliable (e.g., UDP).
Syslog can generate unlimited amounts of data. The transfer of this data over UDP is generally
problematic, since UDP lacks congestion control mechanisms.
The syslog protocol does not have mechanisms to provide confidentiality for the messages in transit.
Network administrators must take the time to estimate the appropriate capacity of the syslog collector.
An attacker may perform a Denial of Service attack by filling the disk of the collector with false messages.
Uses
Syslog is in widespread use, both on UNIX operating system hosts and on networking equipment.
2.9.1 Reference
Windows Management Instrumentation (WMI) is the Microsoft implementation (see footnote 8) of Web-based Enterprise Management (WBEM), which is an industry initiative to develop a standard technology for accessing management information in an enterprise environment.
WMI uses the Common Information Model (CIM) industry standard (see footnote 9) to represent systems, applications, networks, devices, and other managed components. CIM is developed and maintained by the
Distributed Management Task Force (DMTF). The Managed Object Format (MOF) language (see footnote 10) is used
to create new CIM classes.
2.9.2 Objectives
The main goal of WMI is to provide a standard way to share management information between Windows-based management
applications across the network. The aim of this set of specifications is to establish a
uniform model that works in different environments and interacts with other existing management
standards to access information from any source, such as DMI (Desktop Management Interface) or
SNMP.
8 http://msdn.microsoft.com/en-us/library/aa384642(v=VS.85).aspx
9 http://www.dmtf.org/standards/cim
10 http://msdn.microsoft.com/en-us/library/aa823192%28v=vs.85%29.aspx
2.9.3 Structure
Structure overview
The Microsoft WMI implements the three-tiered model of the WBEM architecture for working with management data that in this case includes the following components: a standard mechanism for storing
object definition (a CIM-compliant object repository), a standard protocol for collecting and distributing
management data (such as COM/DCOM), and one or more Win32 dynamic-link libraries (DLLs) that
function as WMI data providers.
[Figure: data flow in the WMI architecture]
It is important to highlight that WMI is an object model and not a language. Several scripting languages, such as VBScript or Windows PowerShell, can be used with WMI to manage Windows-based servers locally and remotely.
Windows Management Instrumentation defines the objects, methods and properties which are
needed to access management information from the different parts of the operating system.
The model that WMI uses to store this information is the standard Common Information Model (CIM).
According to the CIM Specification 2.3, there are three different levels of classes in the CIM model
for storing information: the core, common and extension classes.
The core model defines an information model that applies to all areas of management.
The common model applies to information that is common to particular management areas (such
as systems, applications, networks and devices) but which is independent of a particular implementation or technology.
The extension schemas are extensions to the common model for a specific technology, for example
for different operating systems such as Microsoft Windows or Unix.
On the other hand, according to the CIM definition provided by the DMTF, CIM is composed of a
specification and a schema. The specification defines the details for integration with other management
models, while the schema provides the actual model descriptions.
The specification can be described in Unified Modeling Language (UML), Managed Object Format
(MOF), or Extensible Markup Language (XML). To create and describe classes in the Common
Information Model (CIM), however, the Managed Object Format (MOF) is the most widely used language.
Advantages
WMI is widely present in Windows-based applications, so it is a common way to access and share management information from local and remote computers. In addition, a variety of scripting languages
(such as VBScript or Perl) can be used in enterprise applications and administrative scripts to obtain
WMI data or take actions through WMI.
CIM provides both a common model that applies to all areas and specific extensions
to define different management information for systems, networks, applications, devices and services.
This feature allows building semantically rich management information that can be exchanged across
the network.
Issues
The WMI log files are being replaced by Event Tracing for Windows (ETW).
Vulnerabilities can be found in applications that use Windows Management Instrumentation.
For example, in some applications, insufficient security protections on WMI providers may allow a local
attacker to gain elevated privileges on the local system and use them to take control of it.
Uses
WMI scripts and applications are used to obtain and exchange management information on Windows-based systems. These scripts make it possible to perform administrative tasks on parts of the operating system
as well as to share management data with different products, such as Microsoft
System Center Operations Manager or Windows Remote Management (WinRM).
2.10.1 Objectives
WS-Eventing[1] and WS-Notification[7] are two competing specifications to standardize message formats and Web services interfaces for subscription management and notification delivery in WS-based event notification systems. A WS-based event notification system utilizes Web services technologies to deliver event notifications and manage subscriptions. In such a system, a SOAP-formatted
subscription is sent to an event producer Web service, requesting that a certain kind of event notification be delivered to
one or more event consumer Web services. As events occur, the event consumer Web services
receive SOAP-formatted notification messages. The notification messages can be transported through
intermediaries and use different transport mechanisms.
2.10.2 Structure
Architecture
The architectures presented in WS-Eventing and WS-Notification are remarkably similar irrespective of
their incompatibility. In fact, subsequent versions of each specification have converged towards each
other, borrowing concepts from the other to mitigate their own deficiencies.
WS-Eventing and WS-Notification both have an identical WS-based architecture and follow the Publisher/Subscriber design. Both define subscriber and subscription manager entities. The event sink
defined in WS-Eventing is comparable to the notification consumer defined in WS-Notification. The
subscribers are separated from notification consumers such that notification consumers are required to
handle only the received notification messages. They are not required to know the message broker location or to manage subscriptions. WS-Eventing does not separate the publisher from the event source.
The event source in WS-Eventing combines the functions of the notification producer and the publisher defined
in WS-Notification.
Function
WS-Eventing defines five operations, namely Subscribe, Renew, GetStatus, Unsubscribe and SubscriptionEnd. The Subscribe operation is used to create a subscription for an event sink. The Renew, GetStatus and Unsubscribe operations are provided by subscription managers so that subscribers can manage their existing
subscriptions. If an event source terminates unexpectedly, a SubscriptionEnd message is generated
and sent to the address specified in the subscription request. If that address is not present in the
subscription request, the SubscriptionEnd message is not generated.
WS-Notification has operations comparable to the above five. Even though it does not
define GetStatus and SubscriptionEnd operations, they can be implemented using the (optional) WS-ResourceFramework, since WS-Notification can treat subscriptions as WS-Resources as defined in the WS-ResourceFramework
specification.
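The interplay of the five WS-Eventing operations can be modelled with a small in-memory sketch. It is only an illustration of the semantics described above: no SOAP messages are exchanged, event sinks are represented by plain callables, expiry times are plain numbers, and the class and method names are hypothetical rather than taken from the specification.

```python
import itertools


class EventSource:
    """In-memory stand-in for an event source and its subscription manager."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._subscriptions = {}

    def subscribe(self, notify_to, expires, end_to=None):
        """Subscribe: register an event sink; end_to is the optional SubscriptionEnd address."""
        sub_id = f"sub-{next(self._ids)}"
        self._subscriptions[sub_id] = {
            "notify_to": notify_to,
            "expires": expires,
            "end_to": end_to,
        }
        return sub_id

    def renew(self, sub_id, expires):
        """Renew: move the expiry time of an existing subscription."""
        self._subscriptions[sub_id]["expires"] = expires

    def get_status(self, sub_id):
        """GetStatus: return the current expiry time of a subscription."""
        return self._subscriptions[sub_id]["expires"]

    def unsubscribe(self, sub_id):
        """Unsubscribe: delete a subscription."""
        del self._subscriptions[sub_id]

    def publish(self, event, now):
        """Deliver an event to every subscription that has not expired at time `now`."""
        for subscription in self._subscriptions.values():
            if now < subscription["expires"]:
                subscription["notify_to"](event)

    def fail(self, reason):
        """SubscriptionEnd: sent only to subscriptions that supplied an end_to address."""
        for subscription in self._subscriptions.values():
            if subscription["end_to"] is not None:
                subscription["end_to"](reason)
        self._subscriptions.clear()
```

The fail method reflects the rule stated above: a subscription created without an end address receives no SubscriptionEnd message when the source terminates.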
Delivery mode
Both WS-Eventing and WS-Notification can use push, pull and wrapped modes to deliver notification
messages. The wrapped delivery mode can encapsulate several notification messages into one for
efficient delivery. The pull mode enables the event sink or notification manager to check an event source
periodically for relevant events. In push mode, the event source waits for an acknowledgement of each
notification message it sends.
Filters
WS-Notification defines three types of message filters, namely TopicExpression, ProducerProperties and
MessageContent. A subscriber can use any or all of these filters. WS-Eventing allows at most one filter
in a subscription request. The default filter is a content-based filter using an XPath expression in a specified
dialect that evaluates to a Boolean value as the filtering criterion. WS-Eventing does not specify a way to
filter messages using the ProducerProperties of publishers.
Data source | Characteristics | Rationale summary
 | Y(1) |
CLF | Y(all) | CLF is a major log format for web servers, supported by Apache out of the box. It can be directly integrated into many SIEMs, e.g. Prelude and RSA.
IDMEF | Y(1) |
IF-MAP | |
IODEF | |
IPFIX | | IPFIX is becoming increasingly important in the networking world, where it may provide an alternative or a complement for syslog.
Syslog | Y(all) |
WMI | |
WS-Eventing | |
Data source | Characteristics | Rationale summary
SNMP | | While SNMP is cited as a collection mechanism by several SIEMs, its use seems to be limited to transporting data. The management information bases used by SIEMs would have been in scope, but SIEM products do not publicly document them, and the transport protocol alone is out of the scope of this deliverable.
Chapter 3
Description
Syslog (see section 2.8) is a standard for logging program messages. It allows separation of the
software that generates messages from the system that stores them and the software that reports and
analyzes them. It also provides devices, which would otherwise be unable to communicate, a means to
notify administrators of problems or performance.
There are three main topics when defining the Olympic Games related events and languages:
1. How to collect the data: transmission mechanism (syslog, WMI, SNMP, etc.)
2. How to parse the data: format (spaces, commas, etc.)
3. How to make sense of the collected data: meaning/logic of the fields imposed by the monitored
application/system
Mapping these three topics into Novell Sentinel 6.1 we get the following Novell components:
Sources are systems that are being monitored.
Connectors define connectivity protocols. Only two different protocols were used in the last Olympic
Games: Syslog and LEA.
Collectors define parsing rules and the mapping of the internal data representation into the Sentinel taxonomy.
Examples of collectors used in the Olympic Games were Windows (through Snare agents), Sourcefire, Nortel switches/routers and Sophos Antivirus.
Advantages
Syslog provides flexibility when dealing with different SIEM products and is a widely used
log format.
Syslog is the preferred (de facto) format in the Olympic Games scenario.
When monitoring Windows systems we could have used WMI to retrieve the logs, but we still enforced
the standard format and moved to syslog by deploying Snare agents on each Windows system
to translate Event Log entries into syslog.
Examples
The following are examples of valid syslog messages. A description of each example can be found below
it. The examples are based on similar examples from RFC 3164[9] and may be familiar to readers. The
otherwise-unprintable Unicode BOM is represented as "BOM" in the examples.
Description
Check Point (see footnote 3) has two APIs, LEA (Log Export API) and ELA (Event Logging API), that allow third parties
to access log data. This ability to access a granular level of connection detail enables robust reporting
capabilities by specialized security products, network reporting products, help desk and event management systems, security audits, accounting and billing, and network management systems. This
integration is accomplished through two client-server APIs which enable events to be passed between
the Check Point Management Console and other products through secure channels.
The Log Export API enables applications to read the VPN-1/FireWall-1 log database. The LEA client,
written by an OPSEC (Open Platform for Security) partner, can retrieve both real-time and historical log
data from the Management Console with the LEA server. A reporting application can use the LEA
client in an on-line mode or off-line mode to process the logged events that are generated by the VPN-1/FireWall-1 security policy. OPSEC partners rely on LEA as a mission-critical source for granular traffic
connection information driven by the VPN-1/FireWall-1 kernel engine. The SSL-enabled version of LEA
3 http://www.checkpoint.com/
provides additional security to applications, ensuring that all data traversing the network between the
LEA application and the firewall management system is encrypted.
Recharge: This service enables customers to buy extra time for their telecom prepaid account
with mMoney.
Cash in / Cash out: This transaction allows customers to deposit or withdraw money from
their mobile wallet (mWallet) through a retailer.
National / International Money Transfer: This enables customers to transfer mMoney from
their mWallet to another person within the country or outside the country. The receiver may or may not
be registered to the same money transfer service and may also be a user of a different
operator.
Bill Payment: It enables customers to receive and pay bills using their mWallet account.
Salary Payment: It allows customers to have their salary paid on their mWallet account.
Social Security Payment: It allows users to have their social security benefits paid on their
mWallet account.
Merchant Payment: It enables users to buy goods and services with mMoney from their
mWallet accounts.
Third-Party Payments: It enables users to pay through a third party like Paypal.
Financial Operations: It allows users to perform financial operations such as credit and savings.
The retailers, billers and merchants can also interact with the operator to exchange mMoney into
cash or vice versa.
Reporting to the Central Bank: Periodically, a report is generated for the Central Bank by the Partner
Bank. The Central Bank also has the right to access the information of any transaction it wishes to
investigate.
All of these operations must be included in the audit trail provided by the applications log files.
Table 3.1 summarizes the description of Money Transfer Service actors.
In the money transfer service, the following information is required for each transaction:
MSISDN: The phone number of the customer (sender/receiver)
User ID: The identifier of the actor (sender/receiver)
Transaction ID: The transaction identifier
Transaction Type: The transaction type (money transfer, withdrawal, ...)
Transaction Status: The transaction status (success, fail, waiting, ...)
Request Type: The type of request (Request, Reply, Signaling, ...)
Transfer ID, Date and Time of Transfer
Actor Category: Sender/Receiver category (Customer, Merchant, Biller, ...)
Balance: Sender/Receiver mMoney balance (Customer, Merchant, Biller, ...)
Figure 3.1 shows an example of a log message.
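The per-transaction information listed above can be captured in a simple record type. The sketch below is an assumption made for illustration only: the field names, the sample values and the semicolon-separated layout are hypothetical and do not reproduce the actual log format of the money transfer application.

```python
from dataclasses import dataclass
from datetime import datetime
from decimal import Decimal


@dataclass(frozen=True)
class TransactionRecord:
    """One audit-trail entry of the money transfer service (hypothetical layout)."""

    msisdn: str               # phone number of the customer (sender/receiver)
    user_id: str              # identifier of the actor
    transaction_id: str
    transaction_type: str     # e.g. money transfer, withdrawal
    transaction_status: str   # e.g. success, fail, waiting
    request_type: str         # e.g. Request, Reply, Signaling
    transfer_time: datetime   # date and time of the transfer
    actor_category: str       # e.g. Customer, Merchant, Biller
    balance: Decimal          # mMoney balance of the actor

    def to_log_line(self):
        """Serialize the record as one semicolon-separated line (assumed format)."""
        values = [
            self.msisdn,
            self.user_id,
            self.transaction_id,
            self.transaction_type,
            self.transaction_status,
            self.request_type,
            self.transfer_time.isoformat(),
            self.actor_category,
            str(self.balance),
        ]
        return ";".join(values)
```

Keeping the record immutable mirrors the audit-trail requirement that a logged transaction is not modified after the fact.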
Actor | Description
Customer |
mMoney |
mWallet |
User Money Account | The mMoney account opened by the Operator in the user's name, for the purpose of holding and managing the mMoney held by the relevant Participant.
Participant |
Retailer |
Wholesaler |
Retailer provider |
Merchant |
Biller |
Event Aggregation Module (EAM): Data from various network devices and applications are gathered
by the EAM, via conduits such as Syslog or SNMP. The EAM normalizes, filters, batches and
transmits incoming data streams to the Central Management System (CMS) for further processing.
Central Management System (CMS): Data streams received from EAM servers are correlated and
events categorized and stored within a connected database. Using a deterministic threat analysis
technique, the CMS determines the level of threat an event poses, applying pre-configured rules
from the stateful rules engine to respond to threatening events and attack signatures.
TSOM allows actions to be performed in response to events, such as transmission of SNMP traps or
Syslog messages.
OID Name | Value
sysUpTime |
snmpTrapOID | 1.3.6.1.4.1.13978.2.0.2
EamTime | 1294301091056
SensorTime | 1294301855000
SensorName | OMRSNA016
SensorType | Windows EventLog
... | ...
25 Information | EventLog = Security
 | RecordNumber = 6049430
 | TimeGenerated = 2011-01-06 10:17:35
 | TimeWritten = 2011-01-06 10:17:35
 | EventID = 529
 | EventType = 16
 | EventTypeName = Failure Audit event
Table 3.2: TIVOLI TSOM SNMP Trap content example
Tasks include identifying the most significant OIDs within SNMP traps, and pre-processing this data
into CSV files. An anonymisation tool is responsible for providing anonymous sample event data for
testing with MASSIF.
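The pre-processing step can be sketched as follows, assuming that the Information value of a trap is available as text with one "key = value" pair per line, as in Table 3.2. The function names and the choice of output columns are illustrative; the sketch does not cover SNMP trap reception or anonymisation.

```python
import csv
import io


def parse_information(text):
    """Turn the 'key = value' lines of the Information field into a dictionary."""
    fields = {}
    for line in text.splitlines():
        if "=" not in line:
            continue  # skip lines that do not carry a key/value pair
        key, value = line.split("=", 1)
        fields[key.strip()] = value.strip()
    return fields


def to_csv(records, columns):
    """Write the selected columns of each record as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=columns, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()
```

Selecting the columns explicitly corresponds to the task of identifying the most significant OIDs: fields that are not listed are dropped from the CSV output.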
operating system, a BSD-licensed operating system designed for low-power wireless devices. More
formats of interest for the scenario are also described in this deliverable (e.g. syslog, CLF).
With regard to the CTP and DIP protocols, they are used in the Wireless Sensor Networks deployed
in the dam scenario. CTP is a tree-based collection protocol in which some nodes of a wireless network act as
tree roots. In our context the protocol is used to collect data and information from the wireless sensor nodes
constituting a WSN. DIP is a dissemination protocol used in the WSN to send commands through the
tree nodes.
The function code field of a Modbus data unit is coded in one byte. Valid codes are in the range 1 to
255 decimal (the range 128 to 255 is reserved and used for exception responses). When a message
is sent from a client to a server device, the function code field tells the server which kind of action must
be performed. Function code "0" is not valid. Sub-function codes are added to some function codes to
define multiple actions. The data field, contained in the messages sent from a client to a server device,
contains additional information that the server uses to take the action defined by the function code. This
can include items like discrete and register addresses, the quantity of items to be handled, and the count
of actual data bytes in the field. The data field may be non-existent (of zero length) in certain kinds of
requests; in this case the server does not require any additional information and the function code alone
specifies the action. If no error related to the requested Modbus function occurs (in a properly received
Modbus ADU), the data field of a response from a server to a client contains the data requested. If
an error related to the requested Modbus function occurs, the field contains an exception code that the
server application can use to determine the next action to be taken. For example, a client can read the
ON / OFF states of a group of discrete outputs or inputs, or it can read or write the data contents of a group
of registers. When the server responds to the client, it uses the function code field to indicate either
a normal (error-free) response or that some kind of error occurred (called an exception response). For a
normal response, the server simply echoes the original function code of the request.
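The function code rules above can be sketched in a few lines of Python. This is an illustrative sketch only: the names classify_function_code and is_exception_response are hypothetical, and the rule that an exception response carries the request code with its most significant bit set comes from the Modbus application protocol specification, not from the text above.

```python
def classify_function_code(code: int) -> str:
    """Classify a one-byte Modbus function code using the ranges given in the text."""
    if not 0 <= code <= 0xFF:
        raise ValueError("a function code must fit in one byte")
    if code == 0:
        return "invalid"      # function code 0 is not valid
    if code >= 0x80:
        return "exception"    # 128 to 255: reserved for exception responses
    return "normal"           # 1 to 127: ordinary function codes


def is_exception_response(request_code: int, response_code: int) -> bool:
    """Assumption (Modbus specification, not this document): an exception
    response repeats the request function code with the high bit set."""
    return response_code == (request_code | 0x80)
```

For instance, a server that cannot serve a request with function code 03 would answer with function code 0x83 under this convention, while a normal response echoes 03.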
Modbus Advantages
Simplicity Modbus TCP/IP simply takes the Modbus instruction set and wraps TCP/IP around it. Development costs are exceptionally low. Minimum hardware is required, and development is easy
under any operating system.
Open The Modbus specification is available free of charge for download, and there are no subsequent
licensing fees required for using Modbus or Modbus TCP/IP protocols. Additional sample code,
implementation examples, and diagnostics are available on the Modbus TCP toolkit, a free benefit
to Modbus Organization members and available for purchase by nonmembers.
Availability of many devices Interoperability among different vendors' devices and compatibility with a
large installed base of Modbus-compatible devices.
Issues (Modbus)
Non-encrypted Modbus is not encrypted. There is no protection from message eavesdropping and
spoofing.
No data description The Modbus protocol does not natively support data object description.
Modbus Example
Modbus is the standard protocol used for communication between an RTU and a supervisory server,
like a SCADA system. Several versions of the standard are available, and the differences include the data
format used. The sample below relates to the so-called Modbus ASCII and RTU versions and contains
measurement data:
FUNCT:      2 chars   Function codes
DATA:       n chars   Data
LRC Check:  2 chars   Error checks
End:        2 chars   carriage return line feed (CR-LF) pair

11 03 00 6B 00 03 76 87
|ADDR |FUNCT|-------DATA---------| CRC |
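The error-check fields of the sample can be reproduced with a short Python sketch. The CRC parameters (initial value 0xFFFF, reflected polynomial 0xA001, low byte transmitted first) and the LRC rule (two's complement of the byte sum) are taken from the Modbus serial line specification rather than from this document, and all function names are hypothetical.

```python
def crc16_modbus(data: bytes) -> int:
    """CRC-16 as used by Modbus RTU (init 0xFFFF, reflected polynomial 0xA001)."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0xA001
            else:
                crc >>= 1
    return crc


def lrc(data: bytes) -> int:
    """LRC as used by Modbus ASCII: two's complement of the 8-bit byte sum."""
    return (-sum(data)) & 0xFF


def rtu_frame_is_valid(frame: bytes) -> bool:
    """Check the trailing two CRC bytes (low byte first) of an RTU frame."""
    if len(frame) < 4:
        return False
    body, received = frame[:-2], frame[-2:]
    return crc16_modbus(body).to_bytes(2, "little") == received


def ascii_frame(body: bytes) -> str:
    """Encode ADDR + FUNCT + DATA as a Modbus ASCII frame: ':' ... LRC CR-LF."""
    payload = body + bytes([lrc(body)])
    return ":" + payload.hex().upper() + "\r\n"


# The RTU sample shown above: ADDR 11, FUNCT 03, DATA 00 6B 00 03, CRC 76 87.
sample = bytes.fromhex("11 03 00 6B 00 03 76 87")
```

With these parameters the CRC computed over 11 03 00 6B 00 03 is 0x8776, which is transmitted low byte first as 76 87 and therefore matches the sample. The same request in the ASCII variant would carry a two-character LRC followed by the CR-LF pair instead of the CRC.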
These are other typical Modbus operative commands:
45E6
| CRC |
The Collection Tree Protocol (CTP) is a tree-based collection protocol. CTP is used to collect
data from the sensors in a Wireless Sensor Network by means of data messages and to send routing
information to other nodes by means of routing messages. Data in this context are measurements.
Regarding the routing mechanism, CTP assumes that every node is a root for other nodes. The
first root node is the real root node and is called the Base Station (BS). All the nodes that are part of a WSN send
data to the Base Station. The CTP data message carries, among other fields, the origin ID of the node
originating the message and routing flags for congestion control. Moreover, CTP is address-free, in that
a node does not send a packet to a particular root; instead, it implicitly chooses a root by choosing a
next hop. Nodes generate routes to roots using a routing gradient. For the next hop choice, CTP
uses a shortest path first algorithm, which gives priority to the route to the base station having the lowest
cost. The cost function can be based either on the hop count to the base station or on an estimate of
the link bandwidth.
CTP therefore estimates the link quality with a certain number of neighbors; the protocol used to
exchange information with other nodes about the transmission cost is called LEEP (Link Estimation
Exchange Protocol)9. The quality values are used to select the parent node, that is, the neighbor node
with the best path metric. The nodes periodically send route update messages with routing information
to their neighbors. The routing message contains the measured Expected Transmission cost (ETX) to
the base station and a measure of the link quality for every neighbor node. Moreover, it contains the
generating node's current parent ID and the node's current routing metric value.
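The parent selection described above can be illustrated with a minimal Python sketch. It assumes that the cost of a route through a neighbor is the ETX of the one-hop link to that neighbor plus the ETX to the base station advertised by the neighbor; the Neighbor class and the function names are hypothetical and only serve the illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Neighbor:
    node_id: int
    link_etx: float        # estimated cost of the one-hop link to this neighbor
    advertised_etx: float  # cost to the base station advertised by the neighbor


def path_cost(neighbor: Neighbor) -> float:
    """Assumed additive metric: link cost plus the neighbor's own route cost."""
    return neighbor.link_etx + neighbor.advertised_etx


def choose_parent(neighbors: Iterable[Neighbor]) -> Optional[Neighbor]:
    """Return the neighbor offering the cheapest route, or None if there is none."""
    return min(neighbors, key=path_cost, default=None)
```

With this metric, a neighbor that is close to the base station but reachable only over a poor link can lose against a neighbor that is farther away but reachable over a reliable link.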
The Dissemination Protocol (DIP) has different aims in the context of the WSN: common
uses include network reconfiguration and reprogramming. The mechanism for realizing these operations
involves the use of some variables shared among the nodes. Maintaining the consistency of these shared variables
is the service offered by DIP. The dissemination service tells nodes when the value changes,
and exchanges packets so that the value will reach eventual consistency across the network. At any
given time, two nodes may disagree, but over time the number of disagreements will shrink and the
network will converge on a single value.
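The convergence behaviour described above can be illustrated with a toy Python simulation. It is not the DIP algorithm itself: it assumes that every shared variable carries a version number, that a higher version always wins, and that neighbors compare their copies pairwise in rounds. The names exchange and disseminate are hypothetical.

```python
from typing import Dict, List, Tuple

State = Tuple[int, str]  # (version, value); assumption: one value per version


def exchange(states: Dict[int, State], a: int, b: int) -> bool:
    """Two neighbors compare copies; the older copy adopts the newer one."""
    newest = max(states[a], states[b], key=lambda state: state[0])
    changed = states[a] != newest or states[b] != newest
    states[a] = states[b] = newest
    return changed


def disseminate(states: Dict[int, State],
                links: List[Tuple[int, int]],
                max_rounds: int = 100) -> int:
    """Repeat pairwise exchanges until a full round changes nothing.

    Returns the number of rounds in which at least one node was updated.
    """
    for round_no in range(1, max_rounds + 1):
        changed = False
        for a, b in links:
            changed = exchange(states, a, b) or changed
        if not changed:
            return round_no - 1
    raise RuntimeError("no convergence within max_rounds")
```

In a chain of four nodes where only the first holds the new version, the nodes disagree during the first rounds, and the number of disagreements shrinks until every node holds the same value.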
WSN Example
In a Wireless Sensor Network, the messages from the wireless sensors to the Base Station contain
routing data or measurement data and are transported by means of the CTP and DIP protocols. Some
sample packets (header and data) follow:
PCR = Routing Pull flag (P), Congestion Notification flag (C), Reserved Bits (R)
THL = Time Has Lived
ETX = Expected Transmission cost
ORIGIN = Origin Node
SQ = Origin Sequence Number
CI = Collect_id
Data = Data payloads (i.e. measurements)
9 http://www.tinyos.net/tinyos-2.x/doc/html/tep124.html
05 00 05 00 01 00  00 00 00  00 00 00 00 02 00 03
05 00 05 00 01 00  00 00 00  00 00 00 00 02 00 03
05 00 05 00 01 00  00 00 00  00 00 00 00 02 00 03
09 00 0B 00 06 00  02 00 00  00 01 00 01 02 00 00
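The fields listed in the legend can be decoded with a small Python parser. This sketch assumes the CTP data frame layout of TinyOS TEP 123 (one byte of P/C/reserved bits, one byte THL, two bytes ETX, two bytes origin, one byte sequence number, one byte collect_id, then the payload, with multi-byte fields in network byte order). The test frame used in the example is invented for illustration and is not one of the sample packets above; all names are hypothetical.

```python
import struct
from dataclasses import dataclass

# Assumed layout: options, THL, ETX, origin, seqno, collect_id (8 bytes, big-endian).
_HEADER = struct.Struct(">BBHHBB")


@dataclass(frozen=True)
class CtpDataFrame:
    pull: bool        # P: routing pull flag
    congestion: bool  # C: congestion notification flag
    thl: int          # Time Has Lived
    etx: int          # Expected Transmission cost
    origin: int       # originating node
    seqno: int        # origin sequence number
    collect_id: int   # collection identifier
    data: bytes       # payload (measurements)


def parse_ctp_data_frame(frame: bytes) -> CtpDataFrame:
    """Split a raw CTP data frame into header fields and payload."""
    if len(frame) < _HEADER.size:
        raise ValueError("frame shorter than the 8-byte CTP data header")
    options, thl, etx, origin, seqno, collect_id = _HEADER.unpack_from(frame)
    return CtpDataFrame(
        pull=bool(options & 0x80),        # assumed: P is the most significant bit
        congestion=bool(options & 0x40),  # assumed: C is the next bit
        thl=thl,
        etx=etx,
        origin=origin,
        seqno=seqno,
        collect_id=collect_id,
        data=frame[_HEADER.size:],
    )


# Invented frame: P set, THL 2, ETX 30, origin 5, seqno 7, collect_id 0xEE.
example = parse_ctp_data_frame(bytes.fromhex("80 02 00 1E 00 05 07 EE 01 2C"))
```

Keeping the header layout in a single struct format string makes it easy to adapt the parser if the deployed WSN stack uses different field widths or a different byte order.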
Advantages (WSN)
The CTP and DIP protocols are particularly suited for mobile devices, which have strict requirements in terms
of energy saving.
Issues (WSN)
The CTP and DIP protocols are weak in terms of data transmission security, due to poor or null
authentication, cryptography and integrity support. WSN nodes could be victims of several cyber attacks. For
example, the Sink Hole and Sleep Deprivation attacks try to exploit the routing mechanisms of the protocols.
The main weakness is related to energy consumption, even in the case of topology and functionality
restoration.
Chapter 4
The ability to accurately manage time will be a primary operational requirement for the MASSIF
platform.
Modularity Many of these data formats have some sort of hierarchical structure. The older formats may
have only one or two levels of indirection (e.g. the CLF format has two levels), and more recent
formats such as IDMEF and IODEF use a fairly complex class structure. We consider this trend to
be a corroborating example that simplicity is not enough.
Furthermore, several of the components defined in these formats are fairly similar. The notions of
address, machine, sensor and timestamp are quite similar both in syntax and in semantics
across formats. It will thus be important, when working on deliverable 3.2.2, to precisely and
extensively define these components, in order to reach consensus on both the syntax and the semantics.
The existence of these components justifies the choice of defining an ontology in deliverable 3.2.2,
and in addition to the format we might also need to define the major constants that are important
in a SIEM environment, such as localhost (127.0.0.1), or the IANA port assignments.
Also, the actual instances of these components are likely to be shared by many components of the
SIEM system. Thus, instead of including in an event all the information that qualifies it, it might be
more useful to simply provide common references to shared objects, this sharing happening either
in real-time or during separate information synchronization sessions.
XML XML does not appear to be widely adopted in event languages. Thus, there will be a need in
the MASSIF project to reach consensus on the use (or not) of XML, and more specifically XML
schemas, to define event streams. Two of the major advantages of XML are the built-in syntactic
verification of messages (including typing with carefully specified schemas) and the ability to
project a base language into others using XSLT transformations.
Bibliography
[1] Don Box, Luis Felipe Cabrera, Craig Critchley, Francisco Curbera, Donald Ferguson, Steve
Graham, David Hull, Gopal Kakivaya, Amelia Lewis, Brad Lovering, Peter Niblett, David Orchard, Shivajee Samdarshi, Jeffrey Schlimmer, Igor Sedukhin, John Shewchuk, Sanjiva Weerawarana, and David Wortendyke. Web services eventing. W3C Member Submission, March 2006.
http://www.w3.org/Submission/WS-Eventing/.
[2] B. Claise. Specification of the ip flow information export (ipfix) protocol for the exchange of ip traffic
flow information. RFC 5101, January 2008. http://www.ietf.org/rfc/rfc5101.txt.
[3] D. Crocker and P. Overell. Augmented bnf for syntax specifications: Abnf. RFC 5234, January
2008. http://www.ietf.org/rfc/rfc5234.txt.
[4] R. Danyliw, J. Meijer, and Y. Demchenko. The incident object description exchange format. RFC
5070, December 2007. http://www.ietf.org/rfc/rfc5070.txt.
[5] H. Debar, D. Curry, and B. Feinstein. Intrusion detection message exchange format. RFC 4765,
March 2007. http://www.ietf.org/rfc/rfc4765.txt.
[6] R. Gerhards. The syslog protocol. RFC 5424, March 2009. http://www.ietf.org/rfc/rfc5424.txt.
[7] Steve Graham, David Hull, and Bryan Murray. Web services base notification 1.3. OASIS Standard,
October 2006. http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=wsn.
[8] M. St. Johns. Identification protocol. RFC 1413, February 1993. http://www.ietf.org/rfc/rfc1413.txt.
[9] C. Lonvick. The bsd syslog protocol. RFC 3164, August 2001. http://www.ietf.org/rfc/rfc3164.txt.
[10] J. Quittek, T. Zseby, B. Claise, and S. Zander. Requirements for ip flow information export (ipfix).
RFC 3917, October 2004. http://www.ietf.org/rfc/rfc3917.txt.
[11] TCG Trusted Network Connect. TNC IF-MAP Binding for SOAP. Technical report, Trusted Computing Group, 2010.
[12] TCG Trusted Network Connect. Tnc if-map metadata for network security. Technical report, Trusted
Computing Group, 2010.
[13] M. Wood and M. Erlinger. Intrusion detection mesage exchange requirements. RFC 4766, March
2007. http://www.ietf.org/rfc/rfc4766.txt.
61