
B.Sc. LL.B. (Hons.) [Cyber Security] Program

SEMESTER-IV

STUDY MATERIAL

ON

Security Operation Centre, Threat Modelling and Law

BY

Dr. Satya Prakash

NATIONAL LAW INSTITUTE UNIVERSITY, BHOPAL


(For Private Circulation and Academic Purpose Only)
INDEX
S. No. Content

Syllabus

1. Introduction to Operational Security

2. Security Operations Centre Framework

3. Security Information and Event Management (SIEM)

4. Computer Security Log Management

5. Log Analysis using ArcSight

6. Log Management using Splunk

7. Incident Response and Handling

8. Data Leakage Prevention

9. Threat Intelligence and Threat Modelling

10. Legal and Regulatory Framework


SYLLABUS
Security Operation Centre, Threat Modelling and Law [Four Credits]
B.Sc. LL.B. (Hons.) [Cyber Security] Program
IV SEMESTER
ACADEMIC YEAR 2023–24
COURSE TEACHER: SATYA PRAKASH
*****
INTRODUCTION
In an era defined by unprecedented connectivity, the protection of digital assets has become
paramount. As organizations increasingly depend on interconnected systems, the need for a
robust cybersecurity framework has never been more critical. This course serves as an
indispensable gateway to comprehending the intricacies of three fundamental pillars that
underpin a resilient cybersecurity posture: the Security Operation Centre (SOC), Threat
Modelling and the legal considerations that govern them.

This course specifically highlights the functionality of the Security Operation Centre (SOC).
From real-time monitoring to incident response, it explores the functionalities that empower
organizations to detect, analyze and respond to potential security breaches. The course also
addresses Threat Modelling, often considered the first step into the shoes of a cybersecurity
analyst. It equips students to anticipate and assess potential threats through systematic
methodologies, building the ability to construct robust threat models and fostering a proactive
approach to cybersecurity.

Additionally, the course thoroughly examines the legal aspects within the digital realm,
investigating both global and regional legal frameworks that govern cybersecurity. Its objective
is to assist students in comprehending compliance requirements and grasping the potential
repercussions of non-compliance.

This course is designed to equip students with the knowledge and skills needed to understand,
implement and manage security measures within the context of a Security Operation Centre.
It also delves into the crucial aspects of threat modelling and the legal frameworks governing
cybersecurity.

COURSE OBJECTIVES
The objectives of this course are to equip the students with the ability to:
❖ Understand the basics of Security Operations Centre and Threat Modeling.
❖ Learn the importance of central point monitoring and acting on threats.
❖ Deploy policies for scheduling, deadlocks, memory management, synchronization,
system calls and file systems.
❖ Monitor system operations and reaction to events in response to triggers and/or
observation of trends or unusual activity.
❖ Interpret the Security Operation Centre (SOC) audit and the compliance policy.

LEARNING OUTCOME

On the successful completion of this course, students will be able to:

❖ Identify, monitor and analyse potential intrusions in real time and through historical
trending on security-relevant data sources.
❖ Analyse the behaviour of incidents and threats and their potential impact on the
business.
❖ Provide situational awareness and reporting on cybersecurity status, incidents and
trends in adversary behaviour to appropriate personnel.
❖ Develop an incident response shield for protection from threats.
❖ Perform incident response, investigate the root cause of incidents, and perform the
Security Operation Centre (SOC) audit.

COURSE OUTLINE

UNIT-I: Introduction [3hrs]


Overview of Security Operation Centre (SOC) – Importance of the SOC – Security Operations
and Events Monitoring– SOC functions–SOC Basics–People, Process and Technology–Roles
and Responsibilities–Know what you are protecting and why: Situational Awareness–SOC
Operating Context–Understand the Organization’s Mission –Legal, Regulatory and
Compliance –Understand the Threat.

UNIT-II: Development of Security Operations Center (SOC) [6hrs]


Organizational Needs–Drivers for Choosing the Most Appropriate SOC Structure–SOC
Organizational Models–Centralized SOC Organizational Structures–SOC Functions and
Services–On-Premise and Off-Premise SOC–SOC 24x7–SOC Physical Location and
Maintaining Connections Among Distributed Staff.
UNIT-III: Security Information and Event Management (SIEM) [5hrs]

Overview of SIEM–SIEM Architecture–SIEM Features and Capabilities: Log Aggregation and
Normalization–Event Collection and Correlation–Alerting–Dashboard–Compliance
Reporting– Log Retention.

UNIT-IV: Computer Security Log Management [4hrs]


Meaning of Log–Log Management: Infrastructure-Planning-Significance–Challenges in Log
Management.

UNIT-V: Log Analysis using ArcSight [8hrs]


Configure Log Sources – Analyse Log Data – Practical Design and Implementation of a Log
Management Solution – ArcSight ESM / Express – ArcSight Logger – ESM Risk Insight –
User Behaviour Analytics – ArcSight Threat Detector.

UNIT-VI: Log Management using Splunk [8hrs]


Log Management Tool Splunk – Architecture and Features – Monitoring and Alerting – Splunk
Setup and Lookup Tables – Splunk DB Connect in a Distributed Environment – User Access
Permissions – Security and Access Controls.

UNIT-VII: Incident Response and Handling [8hrs]


Overview of Incident Response–Incident Handling–Incident Response Phase1: Preparation–
Incident Response Phase 2 and 3: Detection and Analysis–Incident Response Phases 4 to 6 –
Containment-Eradication and Recovery–Incident Response Phase 7: Post Incident Activities.

UNIT-VIII: Data Leakage Prevention [8hrs]


Meaning of Data Leakage – Types of DLP Solutions: Network-based DLP-Endpoint-based
DLP -Cloud-based DLP. Components of DLP Solution: Data Classification-Content Analysis
Techniques-Conceptual Analysis-Role Based/ Regular Expression-Database Fingerprinting-
DLP Dashboard.

UNIT-IX: Threat Intelligence and Threat Modelling [8hrs]


Overview of Threat Intelligence–Threat Actors–Types of Threat Intelligence: Operational
Intelligence-Strategic Intelligence-Tactical Intelligence–Overview of OSINT. Threat
Modelling–Threat Modelling Concepts: STRIDE Framework-DREAD Model–Risk
Assessment and Mitigation Strategies.
UNIT-X: Legal and Regulatory Framework [6hrs]

Overview of the NICE Cybersecurity Workforce Framework–NIST Cybersecurity
Framework–Management of Information Security Incidents: ISO/IEC 27035 and ISO/IEC
27035-1:2023.

READING LIST

ESSENTIAL READINGS

1. Joseph Muniz, The Modern Security Operations Center (Pearson Education 2021).
2. John W. Rittinghouse and William M. Hancock, Cybersecurity Operations Handbook (1st ed,
Digital Press 2004).

RECOMMENDED READINGS

1. Mike O’Leary, Cyber Operations: Building, Defending, and Attacking Modern Computer
Networks (Apress 2019).
2. Messaoud Benantar, Access Control Systems: Security, Identity Management and Trust
Models (Springer 2006).

Introduction to Operational Security

There are two types of companies in the world: those that know they’ve been hacked, and
those that don’t. –Misha Glenny
1.1. Meaning of Operational Security
Security operations pertain to everything that takes place to keep networks, computer systems,
applications, and environments up and running in a secure and protected manner. It consists of
ensuring that people, applications, and servers have the proper access privileges to only the
resources to which they are entitled and that oversight is implemented via monitoring, auditing,
and reporting controls. Operations take place after the network is developed and implemented.
This includes the continual maintenance of an environment and the activities that should take
place on a day-to-day or week-to-week basis. These activities are routine in nature and enable
the network and individual computer systems to continue running correctly and securely.
Networks and computing environments are evolving entities; just because they are secure one
week does not mean they are still secure three weeks later. Many companies pay security
consultants to come in and advise them on how to improve their infrastructure, policies, and
procedures. A company can then spend thousands or even hundreds of thousands of dollars to
implement the consultant’s suggestions and install properly configured firewalls, intrusion
detection systems (IDSs), antivirus software, and patch management systems. However, if the
IDS and antivirus software do not continually have updated signatures, if the systems are not
continually patched and monitored, if firewalls and devices are not tested for vulnerabilities,
or if new software is added to the network and not added to the operations plan, then the
company can easily slip back into an insecure and dangerous place. This can happen if the
company does not keep its operational security tasks up-to-date.
Even if you take great care to ensure you are watching your perimeters (both virtual and
physical) and ensuring that you provision new services and retire unneeded ones in a secure
manner, odds are that some threat source will be able to compromise your information systems.
What then? Security operations also involves the detection, containment, eradication, and
recovery that is required to ensure the continuity of business operations. It may also require
addressing liability and compliance issues. In short, security operations encompass all the
activities required to ensure the security of information systems. It is the culmination of most
of what we’ve discussed in the book thus far.
Most of the necessary operational security issues have been addressed in earlier chapters. They
were integrated with related topics and not necessarily pointed out as actual operational security
issues. So instead of repeating what has already been stated, this chapter reviews and points
out the operational security topics that are important for organizations and CISSP candidates.

In military doctrine, OPSEC is a capability that identifies and controls critical information and
indicators of friendly force actions attendant to military operations, and incorporates
countermeasures to reduce the risk of an adversary exploiting vulnerabilities. When effectively
employed, it denies or mitigates an adversary’s ability to compromise or interrupt a mission,
operation, or activity. Without a coordinated effort to maintain the essential secrecy of plans
and operations, our enemies can forecast, frustrate, or defeat major military operations. Well-
executed OPSEC helps to blind our enemies, forcing them to make decisions with insufficient
information.

OPSEC is an information-related capability (IRC) that, when properly employed, can be used
to gain advantages in the information environment, just as other military techniques are used
in the operational environment. OPSEC fits into a web of many other mutually supporting
IRCs, such as military deception, public affairs, and cyberspace operations. The effective
application, coordination, and synchronization of other IRCs is a critical component of the
execution of OPSEC and achievement of a common information operations (IO) objective.
When used in concert with one another, OPSEC and other IRCs can be effectively integrated
into operations to create operationally exploitable conditions necessary for achieving a
commander’s objectives.

OPSEC …

1. is an analytic process

2. focuses on adversary collection capability and intent

3. emphasizes the value of sensitive and critical information.

OPSEC provides a means for screening information prior to its release in order to prevent
aggregation with other information, ultimately revealing intentions or capabilities. Aggregation
of information—with its potentially negative impact on operations, missions, activities, and
personnel safety—is a basic OPSEC concept.

An effective OPSEC program has its chain of command’s full support. Command emphasis
includes an OPSEC program manager or coordinator being appointed in writing by the
commanding officer, and an OPSEC working group charged with the responsibility of ensuring
the command and family members maintain an acute OPSEC awareness.


What is Operations Security?

Operations security is used to identify the controls over software, hardware, media, and the
operators and administrators who possess elevated access privileges to any of these resources.
The primary focus is on data centre operations processes, people, and technology. Several
specific types of controls are needed:

• Preventative controls reduce the frequency and impact of errors and prevent unauthorized
intruders
• Detective controls discover errors after they occur
• Corrective or recovery controls help mitigate the impact of a loss
• Deterrent controls encourage compliance with external controls
• Application-level controls minimize and detect software operational irregularities
• Transaction-level controls provide control over various stages of a transaction

How Did OPSEC Come into the Picture?

OPSEC first came about through a U.S. military team called Purple Dragon in the Vietnam
War. The counterintelligence team realized that its adversaries could anticipate the U.S.’s
strategies and tactics without managing to decrypt their communications or having intelligence
assets to steal their data. They concluded that the U.S. military forces were actually revealing
information to their enemy. Purple Dragon coined the first OPSEC definition, which was: “The
ability to keep knowledge of our strengths and weaknesses away from hostile forces.”

This OPSEC process has since been adopted by other government agencies, such as the
Department of Defense, in their efforts to protect national security and trade secrets. It is also
used by organizations that want to protect customer data and is instrumental in helping them
address corporate espionage, information security, and risk management.

The 5 Steps of Operational Security

There are five steps to OPSEC that allow organizations to secure their data processes.

1. Identify Sensitive Data

Understanding what data organizations have and the sensitive data they store on their systems
is a crucial first step to OPSEC security. This includes identifying information such as customer
details, credit card data, employee details, financial statements, intellectual property, and
product research. It is vital for organizations to focus their resources on protecting this critical
data.
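As a minimal illustration of how such identification can be automated, the following sketch scans free text for candidate credit card numbers using a regular expression plus the Luhn checksum. The pattern, sample text, and function names are illustrative assumptions, not part of any standard discovery tool:

```python
import re

def luhn_valid(number: str) -> bool:
    """Validate a candidate card number with the Luhn checksum."""
    digits = [int(d) for d in number]
    # Double every second digit from the right; subtract 9 if the result > 9.
    for i in range(len(digits) - 2, -1, -2):
        digits[i] *= 2
        if digits[i] > 9:
            digits[i] -= 9
    return sum(digits) % 10 == 0

# 13-16 digits, optionally separated by spaces or hyphens.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){13,16}\b")

def find_card_numbers(text: str):
    """Return Luhn-valid card-like numbers found in free text."""
    hits = []
    for match in CARD_PATTERN.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if 13 <= len(digits) <= 16 and luhn_valid(digits):
            hits.append(digits)
    return hits

sample = "Order note: card 4111 1111 1111 1111, ref 12345."
print(find_card_numbers(sample))  # the well-known Visa test number passes Luhn
```

Real discovery tools combine many such detectors (national ID numbers, keywords, document fingerprints) and scan file shares, databases, and mail stores rather than single strings, but the detect-then-validate pattern is the same.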

2. Identify Possible Threats

With sensitive information identified, organizations then need to determine the potential threats
presented to this data. This includes third parties that may want to steal the data, competitors
that could gain an advantage by stealing information, and insider threats or malicious insiders
like disgruntled workers or negligent employees.

3. Analyze the Vulnerabilities

Organizations then need to analyze the potential vulnerabilities in their security defenses that
could provide an opportunity for the threats to materialize. This involves assessing the
processes and technology solutions that safeguard their data and identifying loopholes or
weaknesses that attackers could potentially exploit.

4. Appraise the Threat Level

Each identified vulnerability then has to have a level of threat attributed to it. The
vulnerabilities should be ranked based on the likelihood of attackers targeting them, the level
of damage caused if they are exploited, and the amount of time and work required to mitigate
and repair the damage. The more damage that could be inflicted and the higher the chances of
an attack occurring, the more priority and resources organizations should place on mitigating
the risk.

5. Devise a Plan to Mitigate the Threats

This information provides organizations with everything they need to devise a plan to mitigate
the threats identified. The final step in OPSEC is putting countermeasures in place to eliminate
threats and mitigate cyber risks. These typically include updating hardware, creating policies
around safeguarding sensitive data, and providing employee training on security best practices
and corporate data policies.

Operations Security Principles

Principle of least privilege (need-to-know) – Defines a minimum set of access rights or
privileges needed to perform a specific job description – e.g. in a military or law enforcement
agency.

Separation of duties – A type of control that shows up in most security processes to make
certain that no single person has excessive privileges – E.g. double custody.

Two fundamental reasons behind Separation of duties

• People are an integral part of every operations process. They authorize, generate, and
approve all work that’s needed.
• People have shortcomings. When individuals perform complementary checks on each
other, there is an opportunity for someone to catch an error before a process is fully
executed – e.g. two signatures are required on certain legal documents.
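The dual-custody idea above can be made concrete in code: an action refuses to run until two distinct people have approved it. The class name and the wire-transfer scenario below are hypothetical illustrations, not a real banking API:

```python
class DualControlAction:
    """Require approvals from two different people before executing."""

    def __init__(self, description: str):
        self.description = description
        self.approvers = set()  # distinct people who have signed off

    def approve(self, person: str) -> None:
        self.approvers.add(person)

    def execute(self) -> str:
        if len(self.approvers) < 2:
            raise PermissionError("two distinct approvers required")
        return f"EXECUTED: {self.description}"

wire = DualControlAction("wire transfer #4711")
wire.approve("alice")
wire.approve("alice")   # same person again: still only one approver
try:
    wire.execute()
except PermissionError as e:
    print("blocked:", e)

wire.approve("bob")     # a second, distinct person
print(wire.execute())
```

Because approvers are kept in a set, one person approving twice cannot satisfy the control; only a second, distinct individual unlocks execution, which is the essence of separation of duties.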

Operations Security Process Controls

Controls necessary to secure data centre operations include:

• Trusted recovery controls – Ensure that security is not breached when a computer system
crashes. E.g. BCP and DRP processes
• Configuration and Change Management controls – Used for tracking and approving changes
to a system. E.g. patch management
• Personnel security – Involves pre-employment screening and mandatory vacation time. E.g.
background checks, security clearances, references, etc.
• Record retention processes – Refers to how long transactions and other types of
computerized or process records should be retained. E.g. event logs, transaction logs,
backup rotation, etc.
• Resource protection – Protects company resources and assets. E.g. physical and logical
security control.

• Privileged entity controls – Special access to computing resources given to operators and
system administrators. E.g. backup operators vs. domain administrators.
• Media viability controls – Needed for the proper marking and handling of assets. E.g.
backup media, onsite storage vs offsite storage

Operations Security Controls in Action

The principles needed for secure operation of data centre assets

• Software support
• Configuration and change management
• Backups
• Media controls
• Documentation
• Maintenance
• Interdependencies

Software support

• It’s essential that software functions correctly and is protected from corruption
• Several elements of control are needed
o Limiting what software is used on a given system
o Inspecting or testing software before it is loaded
o Ensuring software is properly licensed
o Ensuring software is not modified without proper authorization

Configuration and Change management

• To ensure that users don’t cause unintentional changes to the system that could diminish
security
• To ensure that changes to the system are reflected in up-to-date documentation, such as
the contingency or continuity plan

Backups

• This function is critical to contingency planning


• Backups should be stored securely and preferably at a different site, in case the building
where the computing equipment is located is inaccessible

Media controls

• Include a variety of measures to provide physical and environmental protection and
accountability for tapes, optical media, USB (Flash) drives, printouts, and other media
• Common media controls
• Marking
• Logging
• Integrity verification
• Physical access protection
• Environmental protection
• Transmittal
• Disposition

Documentation

• Documentation of all aspects of computer support and operations is important to ensure


continuity and consistency
• A security documentation and procedures manual should be written to inform system users
how to do their jobs securely

Maintenance

• System maintenance requires either physical or logical access to the system


• Maintenance may be performed on site, or it may be necessary to move equipment to a
repair site
• Maintenance may also be performed remotely via communications connections
• It may be necessary to take additional precautions such as conducting background
investigations of service personnel.

Interdependencies

• Support and operations components coexist in most computer security controls


• These components are
• Personnel
• Incident handling
• Contingency planning
• Security awareness, training, and education
• Physical and environmental
• Technical controls
• Assurance

UNIT-I and II: Security Operations Centre Framework

“It’s better to light a candle than curse the darkness.”

What is a Security Operations Centre (SOC)?


While a SOC traditionally refers to a physical facility within an organization, it more regularly
refers to in-house or outsourced information security professionals that analyze and monitor
the organization’s security systems. The SOC’s mission is to protect the company from security
breaches by identifying, analyzing, and reacting to cybersecurity threats. SOC teams are
composed of management, security analysts, and sometimes, security engineers. The SOC
works across teams, with the company’s development and IT operations teams.
SOCs are a proven way to improve threat detection, decrease the likelihood of security
breaches, and ensure an appropriate organizational response when incidents do occur. SOC
teams identify unusual activity on servers, databases, networks, endpoints, applications, etc.,
investigate security threats, and respond to security incidents as they occur.
Once upon a time, it was believed that a SOC was only suitable for large enterprises. Today,
many smaller organizations are setting up lightweight SOCs, such as a hybrid SOC, which
combines part-time, in-house staff with outsourced experts, or a virtual SOC, which has no
physical facility at all, and is a team of in-house staff who also serve other functions.
SOC BASICS: Whether you’re protecting a bank or the local grocery store, certain common
sense security rules apply. At the very least, you need locks on entrances and exits, cash
registers and vaults as well as cameras pointed at these places and others throughout the facility.
The same goes for your network. Controlling access with tools like passwords, ACLs, firewall
rules and the like isn’t quite good enough. You still have to constantly monitor that these
security controls continue to work across all of your devices, so that you can spot strange
activity that may indicate a possible exposure.
The tools you use to do security monitoring and analysis may be a bit more varied than just a
CCTV monitor, but the concept is the same.
Unfortunately, unlike with CCTV cameras, you can’t just look into a monitor and immediately
see an active threat unfold, or use a video recording to prosecute a criminal after catching them
in the act on tape. The “bread crumbs” of cyber security incidents and exposures are far more
varied, distributed and hidden than what can be captured in a single camera feed, and that’s
why it takes more than just a single tool to effectively monitor your environment.
People: Just like people, every security organization is different. In some companies, the
executive team has realized the significance of cyber security to the business bottom line. In
these cases, the SOC team is in a great position: enough budget for good tools and enough staff
to manage them, and the “human” capital of executive visibility and support. But that’s not the
reality in most cases, unfortunately. SOC teams are fighting fires with never enough staff, never
enough time, and never enough visibility or certainty about what’s going on. That’s why it’s
essential to focus on consolidating your toolset and effectively organizing your team: a SOC
team that has the right skills, using the least number of resources, all while gaining visibility
into active and emerging threats.

Processes: As well as having the latest intrusion detection and prevention technologies and
highly skilled people, SOCs must have processes in place to make sure all steps are followed
to effectively prevent, detect, and remediate breaches. In addition to conducting regular duties
such as filtering emails, network traffic, and endpoints, and working with clients to review
lessons learned after an incident, the SOC also should have playbooks to detect and respond to
threats without disrupting the business. A standardized repeatable workflow provides guidance
for handling any type of situation, including steps that must be taken to meet compliance
requirements for SOX, FERPA, FISMA, PCI DSS, GDPR, and HIPAA. A SOC should be able
to provide guidance in meeting each compliance standard’s requirements, and should be able
to provide each customer with a personalized audit-ready document, an Attestation of
Compliance, to show what it has done to meet the requirements so companies don’t have to
waste hours customizing reports.
Technology: SOCs need the latest tools, which now incorporate machine learning and artificial
intelligence, to prevent and remediate threats. Tons of data flow from mobile devices,
workstations, routers, servers, and numerous other security technologies, but analysts can only
process so much information. Machine learning can handle tasks in seconds that would take
humans hours. It can also quickly detect anomalous activities. For example, it can identify
events that are out of the ordinary, such as an employee based in the U.S. who appears to be
logging into the network from a computer with an IP address based in China. Machine learning
can also flag emails from a domain that is similar to one it is familiar with to help detect fraud.
For example, it could block an email that comes from amazzon.com rather than amazon.com.
Artificial Intelligence tools use machine learning to detect threats and categorize them based
on their level of severity. SOC tools need to be able to spot attacks on premises and in the
cloud. Virtually all organizations have data in the cloud, even those that don’t know it.
Employees may be using cloud apps such as SalesForce, Dropbox, or Google Docs. Or they
may be using their personal emails for business and exfiltrating company data.
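The amazzon.com example above can be caught with a simple edit-distance comparison against a list of trusted domains. The sketch below is illustrative (the trusted list and thresholds are assumptions); production tools layer homoglyph checks and reputation feeds on top of this basic idea:

```python
def edit_distance(a: str, b: str) -> int:
    """Classic Levenshtein distance via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

TRUSTED = {"amazon.com", "google.com", "salesforce.com"}

def lookalike_of(domain):
    """Return the trusted domain this one imitates (1-2 edits away), else None."""
    for trusted in TRUSTED:
        d = edit_distance(domain, trusted)
        if 0 < d <= 2:   # 0 means it IS the trusted domain, so not a fake
            return trusted
    return None

print(lookalike_of("amazzon.com"))   # one extra letter away from amazon.com
print(lookalike_of("example.org"))   # unrelated domain: not flagged
```

An email gateway applying this check would flag mail from amazzon.com as a likely impersonation of amazon.com while leaving genuine amazon.com mail and unrelated domains alone.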
Building a SOC: SOC teams are responsible for monitoring, detecting, containing and
remediating IT threats across applications, devices, systems, networks, and locations. Using a
variety of technologies and processes, SOC teams rely on the latest threat intelligence (e.g.
indicators, artifacts, and other evidence) to determine whether an active threat is occurring, the
scope of the impact, as well as the appropriate remediation. Security operations centre roles &
responsibilities have continued to evolve as the frequency and severity of incidents continue to
increase.
Building a SOC with Limited Resources in a Race against Time: For many organizations
(unless you work for a large bank), building a SOC may seem like an impossible task. With
limited resources (time, staff, and budget), setting up an operations center supported by multiple
monitoring technologies and real-time threat updates doesn’t seem all that DIY. In fact, you
may doubt that you’ll have enough full-time and skilled team members to implement and
manage these different tools on an ongoing basis. That’s why it’s essential to look for ways to
simplify and unify security monitoring to optimize your SOC processes and team. Thankfully,
AlienVault™ provides the foundation you need to build a SOC - without requiring costly
implementation services or large teams to manage it. With AlienVault USM™, AlienVault
Labs Threat Intelligence, and AlienVault OTX™, you’ll achieve a well-orchestrated
combination of people, processes, tools and threat intelligence – all the key ingredients for
building a SOC. The sections that follow discuss each of these essentials in detail.
How do Security Operations Centres Work?
SOCs operate by collecting, analyzing, and correlating data from various sources, such as
network traffic, log files, and threat intelligence feeds. This data is then used to detect
potential security incidents and respond to them in a timely manner. Following are the key
components of a modern SOC:
Continuous Monitoring
One of the main functions of a SOC is to continuously monitor an organization’s IT
infrastructure for any signs of suspicious activity or potential threats.
This involves the use of various detection tools and technologies, such as intrusion detection
systems (IDS), email security, cloud security, and endpoint detection and response (EDR)
solutions, whose outputs are collected into a security information and event management
(SIEM) system. These tools help the SOC team identify unusual or malicious activities that
may indicate a security breach or an attempted attack.
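A toy version of the correlation such tools perform: count failed logins per source address over a window of events and raise an alert once a threshold is crossed. The log format, addresses, and threshold below are illustrative assumptions, not any particular SIEM’s syntax:

```python
from collections import Counter

# Simplified auth log entries: (timestamp, event type, source IP)
events = [
    ("09:00:01", "LOGIN_FAIL", "203.0.113.9"),
    ("09:00:03", "LOGIN_FAIL", "203.0.113.9"),
    ("09:00:05", "LOGIN_OK",   "198.51.100.4"),
    ("09:00:06", "LOGIN_FAIL", "203.0.113.9"),
    ("09:00:09", "LOGIN_FAIL", "203.0.113.9"),
    ("09:00:11", "LOGIN_FAIL", "203.0.113.9"),
]

THRESHOLD = 5  # failures from one source before we alert

def brute_force_alerts(events):
    """Return source IPs whose failed-login count meets the threshold."""
    fails = Counter(ip for _, kind, ip in events if kind == "LOGIN_FAIL")
    return [ip for ip, n in fails.items() if n >= THRESHOLD]

for ip in brute_force_alerts(events):
    print(f"ALERT: possible brute force from {ip}")
```

A real SIEM applies this same pattern at scale: sliding time windows, normalization across hundreds of log formats, and large libraries of correlation rules rather than one hard-coded counter.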
Threat Intelligence
By gathering and analyzing information about current and emerging threats, security teams
can better understand the tactics, techniques, and procedures (TTPs) used by malicious
actors. This knowledge enables them to proactively defend against potential attacks and
respond more effectively to incidents when they occur.
Threat intelligence can be sourced from various channels, including open-source intelligence
(OSINT), commercial threat intelligence feeds, and information sharing groups or platforms.
Incident Response
When a potential security incident is detected, the SOC team must quickly assess the situation
and determine the appropriate course of action. This involves containing the threat,
mitigating its impact, coordinating with other teams within the organization to ensure a swift
and effective response, and ensuring recovery of operational systems.
Incident response plans and playbooks are critical components of a SOC’s operations, as they
provide a structured, and often automated approach to dealing with different types of security
incidents.
Main focus areas of a SOC
A well-designed security operations center should focus on several key areas to ensure the
organization’s digital assets are adequately protected. These focus areas include:

Network Security
Network security involves protecting the organization’s network infrastructure from
unauthorized access, misuse, or disruption. This includes monitoring network traffic for signs
of intrusion, analyzing log data for anomalies, and implementing network segmentation and
access controls to restrict access to resources, often through a zero trust security model. This
ultimately limits the potential impact of a security breach.
Endpoint Security
Endpoints, such as desktops, laptops, and mobile devices, are often targeted by
cybercriminals as they can serve as entry points into an organization’s network. A SOC must
focus on securing these devices by implementing strong authentication and access controls,
monitoring for signs of compromise, and ensuring that security patches and updates are
applied in a timely manner.
Cloud Security
As more organizations take advantage of the agility and scale of the cloud, a SOC must include
not only cloud access but also cloud infrastructure and data security in its monitoring and
detection plans to avoid accidental and malicious leaks of sensitive information.
Application Security
Applications are another critical area that SOCs must focus on, as they can be exploited by
attackers to gain unauthorized access to sensitive data or carry out other malicious activities.
Application security involves identifying and addressing vulnerabilities in the software
development lifecycle, monitoring for signs of application-based attacks, and securing
application programming interfaces (APIs) and other components of the application
infrastructure.
7 SOC deployment models and their pros and cons
Let’s review the primary models organizations use to deploy a SOC, some of which are
innovative models that have emerged over the past few years.
1. Dedicated SOC
A dedicated security operations center is a SOC model that is focused solely on providing
security services to a single organization. This type of SOC often has a physical location
within the organization’s premises and is staffed by in-house security experts responsible for
monitoring, detecting, and responding to security incidents and threats.
Pros: Having a dedicated SOC provides a highly focused and customized approach to
security, as the security experts are dedicated solely to the organization’s networks and
systems. This results in a more in-depth understanding of the organization’s unique security
needs and allows for more effective threat management.
Cons: A dedicated SOC may not be suitable for all organizations. For one, the cost of setting
up and maintaining a dedicated SOC can be quite high, as it requires significant investment
in infrastructure, technology, and highly skilled security professionals. Moreover, smaller
organizations may find it difficult to attract and retain top security talent, as they may not be
able to offer competitive salaries and benefits.
2. Distributed SOC
A distributed security operations center is a SOC model that consists of multiple,
geographically dispersed SOCs working together to provide security services. These SOCs
can be located in different regions or countries and are connected through a centralized
management system that allows for seamless communication and coordination between them.
Pros: A distributed SOC offers improved threat visibility and detection. By having multiple
SOCs monitoring different parts of an organization’s network, it is more likely that threats
will be detected and addressed quickly. Additionally, a distributed SOC can help an
organization achieve a more comprehensive understanding of global threat trends, as each
SOC will have access to information about threats and incidents occurring in its specific
region.
Cons: One potential drawback of a distributed SOC is the increased complexity of managing
multiple SOCs, as it can be challenging to coordinate and align the efforts of geographically
dispersed teams. Additionally, a distributed SOC may require a significant investment in
communication and collaboration tools to ensure seamless communication between the
different SOCs.
3. Multifunctional SOC/NOC
A multifunctional SOC/NOC is a hybrid model that combines the functions of a security
operations center (SOC) and a Network Operation Center (NOC) into a single, unified unit.
This model allows for the integration of security and network management tasks, resulting
in a more streamlined and efficient approach to securing an organization’s networks and
systems.
Pros: A multifunctional SOC/NOC consolidates the security and network management
functions, providing greater operational efficiency, as resources can be shared and allocated
more effectively. Additionally, a multifunctional SOC/NOC can lead to improved
communication and collaboration between security and network teams, which can result in
faster and more effective incident response.
Cons: One potential drawback of a multifunctional SOC/NOC is that it may be difficult to
find professionals with the skills and expertise needed to manage both security and network
operations. Additionally, combining the functions of an SOC and NOC may result in an
increased workload for the team, which could lead to burnout and decreased effectiveness.
4. Fusion SOC
A fusion security operations center is an advanced SOC model that integrates various security
functions, such as threat intelligence, incident response, and security analytics, into a single,
unified platform. This model leverages advanced technologies, such as artificial intelligence
and machine learning, to provide a more proactive and sophisticated approach to security.
Pros: A Fusion SOC offers improved threat detection and response capabilities. By
leveraging advanced technologies and integrating various security functions, a Fusion SOC
can quickly identify and respond to threats, reducing the likelihood of a security breach.
Cons: One potential drawback of a fusion SOC is the cost of implementing and maintaining
it. This type of SOC requires significant investment in advanced technologies and skilled
security professionals. Moreover, some organizations may not have the necessary resources
or expertise to manage a fusion SOC effectively.
5. Command SOC/Global SOC
A command security operations center, also known as a global SOC, is a high-level SOC
model that oversees and coordinates the activities of multiple SOCs within an organization.
This model is typically used by large, multinational organizations with multiple SOCs located
in different regions or countries.
Pros: A command SOC/global SOC provides a comprehensive, global view of an
organization’s security posture. By overseeing the activities of multiple SOCs, a Command
SOC/Global SOC can identify trends and patterns in security incidents and threats that may
not be apparent when looking at the data from a single SOC.
Cons: The cost of implementing and maintaining a Command SOC/Global SOC can be high,
as it requires significant investment in technology, infrastructure, and skilled security
professionals. Furthermore, managing and coordinating the activities of multiple SOCs can
be complex and challenging, particularly for organizations with limited experience in this
area.
6. Virtual SOC
A virtual security operations center is a SOC model that leverages cloud-based technologies
and remote security professionals to provide security services. Unlike traditional SOCs, a
virtual SOC does not require a physical location or dedicated infrastructure, making it a more
flexible and cost-effective option for organizations.
Pros: A virtual SOC offers several advantages, particularly for smaller organizations or those
with limited resources. By leveraging cloud-based technologies and remote security
professionals, a Virtual SOC can provide many of the same benefits as a traditional SOC,
such as continuous monitoring and incident response, at a fraction of the cost.
Cons: One potential drawback of a virtual SOC is the reliance on cloud-based technologies
and remote security professionals, which may raise concerns about data privacy and security.
Additionally, some organizations may prefer the greater control and visibility offered by a
traditional, on-premises SOC.
7. Managed SOC/MSSP/MDR
A managed security operations center (Managed SOC), also known as a Managed Security
Services Provider (MSSP) or Managed Detection and Response (MDR) service, is a SOC
model that involves outsourcing security operations to a third-party provider. This provider
is responsible for monitoring, detecting, and responding to security incidents and threats on
behalf of the organization.
Pros: A managed SOC/MSSP/MDR provider can be a more cost-effective option for
organizations, as it eliminates the need for significant investment in infrastructure,
technology, and skilled security professionals. Additionally, a Managed SOC/MSSP/MDR
provider can offer access to a wider range of security expertise and resources than an
organization may be able to acquire in-house.
Cons: One potential drawback of managed SOC/MSSP/MDR providers is the loss of control
over security operations, as the organization will be relying on a third-party provider to
manage its security. Additionally, there may be concerns about data privacy and security,
particularly when sensitive information is being shared with an external provider.
Security Operations Centre Roles and Responsibilities
• Security analyst – The first to respond to incidents. Their response typically occurs
in three stages: threat detection, threat investigation, and timely response. Security
analysts should also ensure that the correct training is in place and that staff can
implement policies and procedures. Security analysts work together with internal IT
staff and business administrators to communicate information about security
limitations and develop documentation.
• Security engineer/architect – Maintains and suggests monitoring and analysis tools.
They create a security architecture and work with developers to ensure that this
architecture is part of the development cycle. A security engineer may be a software
or hardware specialist who pays particular attention to security aspects when
designing information systems. They develop tools and solutions that allow
organizations to prevent and respond effectively to attacks. They document
procedures, requirements, and protocols.
• SOC manager – Manages the security operations team and reports to the CISO. They
supervise the security team, provide technical guidance, and manage financial
activities. The SOC manager oversees the activity of the SOC team, including hiring,
training, and assessing staff. Additional responsibilities include creating processes,
assessing incident reports, and developing and implementing crisis communication
plans. They write compliance reports, support the audit process, measure SOC
performance metrics, and report on security operations to business leaders.
• CISO – Defines the security operations of the organization. They communicate with
management about security issues and oversee compliance tasks. The CISO has the
final say on policies, strategies, and procedures relating to the organization’s
cybersecurity. They also have a central role in compliance and risk management, and
implement policies to meet specific security demands.
SOC analysts are organized into four tiers. First, SIEM alerts flow to Tier 1 analysts, who
monitor, prioritize and investigate them. Real threats are passed to a Tier 2 analyst with
deeper security experience, who conducts further analysis and decides on a strategy for
containment.
Critical breaches are moved up to a Tier 3 senior analyst, who manages the incident and is
responsible for actively hunting for threats continuously. The Tier 4 analyst is the SOC
manager, responsible for recruitment, strategy, priorities, and the direct management of SOC
staff when major security incidents occur.
The list below explains each SOC role in more detail.

• Tier 1 Analyst (Alert Investigator)
Qualifications: System administration skills; web programming languages such as Python,
Ruby, and PHP; scripting languages; security certifications such as CISSP or SANS SEC401.
Duties: Monitors SIEM alerts, manages and configures security monitoring tools. Prioritizes
and triages alerts or issues to determine whether a real security incident is taking place.

• Tier 2 Analyst (Incident Responder)
Qualifications: Similar to a Tier 1 analyst, but with more experience, including incident
response; advanced forensics, malware assessment, and threat intelligence. Ethical hacker
certification or training is a major advantage.
Duties: Receives incidents and performs deep analysis; correlates with threat intelligence to
identify the threat actor, the nature of the attack, and the systems or data affected. Defines
and executes the strategy for containment, remediation, and recovery.

• Tier 3 Analyst (Subject Matter Expert/Threat Hunter)
Qualifications: Similar to a Tier 2 analyst but with even more experience, including
high-level incidents; experience with penetration testing tools and cross-organization data
visualization; malware reverse engineering; experience identifying and developing responses
to new threats and attack patterns.
Duties: Day-to-day, conducts vulnerability assessments and penetration tests, and reviews
alerts, industry news, threat intelligence, and security data. Actively hunts for threats that
have made their way into the network, as well as unknown vulnerabilities and security gaps.
When a major incident occurs, teams with the Tier 2 analyst in responding to and containing
it.

• Tier 4 SOC Manager (Commander)
Qualifications: Similar to a Tier 3 analyst, including project management skills, incident
response management training, and strong communication skills.
Duties: Like the commander of a military unit, responsible for hiring and training SOC staff
and in charge of defensive and offensive strategy. Manages resources, priorities, and
projects, and manages the team directly when responding to business-critical security
incidents. The organization’s point of contact for security incidents, compliance, and other
security-related issues.

• Security Engineer (Support and Infrastructure)
Qualifications: Degree in computer science, computer engineering, or information assurance,
typically combined with certifications like CISSP.
Duties: A software or hardware specialist who focuses on security aspects in the design of
information systems. Creates solutions and tools that help organizations deal robustly with
disruption of operations or malicious attacks. Sometimes employed within the SOC, and
sometimes supports the SOC as part of development or operations teams.

Benefits of security operations centres


• Incident response – SOCs operate around the clock to detect and respond to
incidents.
• Threat intelligence and rapid analysis – SOCs use threat intelligence feeds and
security tools to quickly identify threats and fully understand incidents, in order to
enable appropriate response.
• Reduce cybersecurity costs – Although a SOC represents a major expense, in the
long run, it prevents the costs of ad hoc security measures and the damage caused by
security breaches.
• Reduce the complexity of investigations – SOC teams can streamline their
investigative efforts. The SOC can coordinate data and information from sources,
such as network activity, security events, endpoint activity, threat intelligence, and
authorization. SOC teams have visibility into the network environment, so the SOC
can simplify the tasks of drilling into logs and forensic information, for example.
SOC challenges and how technology can help
• Increased volumes of security alerts – The growing number of security alerts
requires a significant amount of an analyst’s time. Analysts must attend to tasks ranging from
the mundane to the urgent when determining the accuracy of alerts. They could miss
alerts as a result, which highlights the need for alert prioritization. Exabeam
Advanced Analytics uses UEBA technology to provide security alert prioritization,
which relies on the dynamic analysis of anomalous events. This ensures that analysts
can find the alerts requiring the most immediate attention.
• Management of many security tools – As various security suites are being used by
SOCs and CSIRTs, it is hard to efficiently monitor all the data generated from
multiple data points and sources. A SOC may use 20 or more technologies, which can
be hard to keep track of and control individually. This makes it important to have a
central source and a single platform; a SIEM serves this function in most SOCs, and
next-generation SIEM solutions add advanced analytics and security automation.
• Skills shortage – Short staffing or lack of qualified individuals is an issue. A key
strategy for dealing with the cybersecurity skills shortage is automating SOC
processes, to save time for analysts. In addition, an organization may decide to
outsource. Some organizations are now outsourcing to MSSPs to help them with their
SOC services. Managed SOCs can be outsourced entirely or in partnership with on-
premises security staff.
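The alert-prioritization idea mentioned above can be loosely illustrated in code: score an alert higher when it describes behaviour that is rare for that user, so the unusual events surface first. This is a deliberately simplified sketch of the concept behind UEBA-style prioritization, not how any specific product (Exabeam or otherwise) actually works.

```python
from collections import Counter

def prioritize(alerts: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Order (user, event_type) alerts so the rarest combinations come first.

    Rarity within the alert stream stands in for 'anomalous behaviour' here;
    real UEBA builds per-user baselines over time.
    """
    freq = Counter(alerts)
    return sorted(alerts, key=lambda a: freq[a])

# Twenty routine VPN logins and one rare database export for the same user.
history = [("alice", "vpn_login")] * 20 + [("alice", "db_export")]
ranked = prioritize(history)
print(ranked[0])  # ('alice', 'db_export') -- the rare event surfaces first
```

Even this crude frequency-based ranking shows why anomaly scoring reduces alert fatigue: analysts start with the events least explained by normal activity.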
Getting started with a SOC
Questions to ask before setting up a SOC
1. Availability and hours – Will you staff your SOC 8×5 or 24×7?
2. Format – Will you have a standalone SOC or an integrated SOC and NOC?
3. Organization – Do you plan to control everything in house, or will you use an MSSP?
4. Priorities and capabilities – Is security the core concern, or is compliance a key
issue? Is monitoring the main priority, or will you need capabilities such as ethical
hacking or penetration testing? Will you make extensive use of the cloud?
5. Environment – Are you using a single on-premises environment or a hybrid
environment?
5 steps to setting up your SOC
1. Ensure everyone understands what the SOC does – A SOC observes and checks
endpoints and the organization’s network, and isolates and addresses possible security
issues. Create a clear separation between the SOC and the IT help desk. The help desk
is for employee IT concerns, whereas the SOC is for security issues related to the
entire organization.
2. Provide infrastructure for your SOC – Without the appropriate tools, a SOC team
will not be able to deal with a security threat. Evaluate and invest in tools and
technologies that will support the effectiveness of the SOC and are appropriate for the
level of expertise of your in-house security team. See the next section for a list of
tools commonly used in the modern SOC.
3. Find the right people – Build a security team using the roles listed above: security
analysts, security engineers, and a SOC manager. These specialists should receive
ongoing training in areas such as reverse engineering, intrusion detection, and
malware anatomy. The SOC manager needs to have strong security expertise,
management skills, and battle-tested crisis management experience.
4. Have an incident response plan ready – An incident response team should create a
specific and detailed action plan. The team can also create a repeatable plan that can
be used over time and adapt to different threat scenarios. Business, PR, and legal
teams may also be involved if needed. The team should adhere to predefined response
protocols so they can build on their experience.
5. Defend – A key responsibility of the SOC is to protect the perimeter with a dedicated
team focused on detecting threats. The SOC’s goal is to collect as much data and
context as possible, prioritize incidents, and ensure the important ones are dealt with
quickly and comprehensively.
The security maturity spectrum — Are you ready for a SOC?
A SOC is an advanced stage in the maturity of an organization’s security. The following are
drivers that typically push companies to take this step:
• Requirements of standards such as the Payment Card Industry Data Security Standard
(PCI DSS), government regulations, or client requirements
• The need for the business to secure very sensitive data
• Past security breaches and/or public scrutiny
• Type of organization — For example, a government agency or Fortune 500 company
will almost always have the scale and threat profile that justifies a SOC, or even
multiple SOCs.
Different organizations find themselves at different stages of developing their security stance.
We define five stages of security maturity. In stages 4 and 5, an investment in a security
operations centre becomes relevant and worthwhile.
The future of the SOC
The security operations center is undergoing an exciting transformation. It is integrating with
ops and development departments, and is empowered by powerful new technologies, while
retaining its traditional command structure and roles to identify and respond to critical
security incidents.
We showed how SIEM is a foundational technology of the SOC, and how next-generation
SIEMs, which include new capabilities like behavioral analytics, machine learning, and SOC
automation, open up new possibilities for security analysts.
The impact of a next-gen SIEM on the SOC can be significant. It can:
• Reduce alert fatigue via user and entity behavior analytics (UEBA) that goes beyond
correlation rules, helps reduce false positives, and discover hidden threats.
• Improve MTTD by helping analysts discover incidents faster and gather all relevant
data.
• Improve MTTR by integrating with security systems and leveraging Security
Orchestration, Automation and Response (SOAR) technology.
• Enable threat hunting by giving analysts fast and easy access and powerful
exploration of unlimited volumes of security data.
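MTTD (mean time to detect) and MTTR (mean time to respond/resolve) are simple averages over incident timelines. The sketch below computes both from a small set of incident records; the field names and sample timestamps are assumptions made for the example.

```python
from datetime import datetime

def mean_hours(incidents: list[dict], start_key: str, end_key: str) -> float:
    """Average interval, in hours, between two timestamps across incidents."""
    deltas = [
        (inc[end_key] - inc[start_key]).total_seconds() / 3600
        for inc in incidents
    ]
    return sum(deltas) / len(deltas)

# Hypothetical incident records with occurrence, detection, and resolution times.
incidents = [
    {
        "occurred": datetime(2024, 3, 1, 8, 0),
        "detected": datetime(2024, 3, 1, 10, 0),
        "resolved": datetime(2024, 3, 1, 18, 0),
    },
    {
        "occurred": datetime(2024, 3, 5, 9, 0),
        "detected": datetime(2024, 3, 5, 13, 0),
        "resolved": datetime(2024, 3, 6, 9, 0),
    },
]

mttd = mean_hours(incidents, "occurred", "detected")  # 3.0 hours
mttr = mean_hours(incidents, "detected", "resolved")  # 14.0 hours
```

Tracking these two numbers over time is the simplest way to tell whether SIEM and SOAR investments are actually shortening detection and response.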
The 11-strategies framework below describes approaches that can be applied to SOCs of all
sizes, from two people to large, multinational centres with hundreds of people. It is intended
for all cybersecurity operations centre personnel, from new professionals just starting in a
SOC to managers considering capability expansion of the SOC.

This framework guides cyber professionals through applying mission context to 11 strategies of a
world-class SOC:

Strategy 1: Know What You Are Protecting and Why

Develop situational awareness through understanding the mission; legal and regulatory
environment; technical and data environment; user, user behaviors and service interactions;
and the threat. Prioritize gaining insights into critical systems and data and iterate
understanding over time.

Strategy 2: Give the SOC the Authority to Do Its Job

Empower the SOC to carry out the desired functions, scope, partnerships, and responsibilities
through an approved charter and the SOC’s alignment within the organization.

Strategy 3: Build a SOC Structure to Match Your Organizational Needs

Structure SOCs by considering the constituency, SOC functions and responsibilities, service
availability, and any operational efficiencies gained by selecting one construct over another.

Strategy 4: Hire AND Grow Quality Staff

Create an environment to attract the right people and encourage them to stay through career
progression opportunities and great culture and operating environment. Plan for turnover and
build a pipeline to hire. Consider how many personnel are needed for the different SOC
functions.

Strategy 5: Prioritize Incident Response

Prepare for handling incidents by defining incident categories, response steps, and escalation
paths, and codifying those into SOPs and playbooks. Determine the priorities of incidents for
the organization and allocate the resources to respond. Execute response with precision and
care toward constituency mission and business.

Strategy 6: Illuminate Adversaries with Cyber Threat Intelligence

Tailor the collection and use of cyber threat intelligence by analyzing the intersection of
adversary information, organization relevancy, and technical environment to prioritize
defenses, monitoring, and other actions.

Strategy 7: Select and Collect the Right Data

Choose data by considering relative value of different data types such as sensor and log data
collected by network and host systems, cloud resources, applications, and sensors. Consider
the trade-offs of too little data and therefore not having the relevant information available and
too much data such that tools and analysts become overwhelmed.

Strategy 8: Leverage Tools to Support Analyst Workflow

Consolidate and harmonize views into tools and data and integrate them to maximize SOC
workflow. Consider how the many SOC tools, including SIEM, UEBA, SOAR, and others fit
in with the organization’s technical landscape, to include cloud and OT environments.

Strategy 9: Communicate Clearly, Collaborate Often, Share Generously

Engage within the SOC, with stakeholders and constituents, and with the broader cyber
community to evolve capabilities and contribute to the overall security of the broader
community.

Strategy 10: Measure Performance to Improve Performance

Determine qualitative and quantitative measures to know what is working well, and where to
improve. A SOC metrics program includes business objectives, data sources and collection,
data synthesis, reporting, and decision-making and action.

Strategy 11: Turn up the Volume by Expanding SOC Functionality

Enhance SOC activities to include threat hunting, red teaming, deception, malware analysis,
forensics, and/or tabletop exercises, once incident response is mature. Any of these can
improve the SOC’s operating ability and increase the likelihood of finding more sophisticated
adversaries.

UNIT-III Security Information and Event Management (SIEM)
A common topic covered in the security industry is centralizing data from multiple tools.
Sometimes, this is labelled as having “a single pane of glass.” The goal is to reduce the time
to investigate an incident by having one place to perform the investigation and having the
ability to correlate data from different tools to gain a better understanding of threats impacting
your organization. SIEM technology attempts to solve this challenge by acting as that central
point for event data, as I described earlier in this book. Market leaders for SIEM solutions
include Splunk, QRadar, and LogRhythm according to sources such as Gartner. The goals of
using a SIEM solution are to improve attack detection, speed up incident handling, centralize
reporting, and provide a resource to measure compliance. With that in mind, where does threat
intelligence fit in?
It is common for a SIEM solution to digest and correlate findings with internal telemetry, such
as what you get from firewall and DNS logs. This allows you to match potential attacks to the
external data that was collected. The value of a SIEM
solution depends on the data it receives. If you send it limited data, you will get limited results.
If the data sent is not good, the output will also not be good.
Another way to say this is “garbage in, garbage out.” If you don’t configure it to present the
information in your desired format, you will end up with dashboards alerting about hundreds
of events you will never be able to address, known in the industry as the “bug splat.” A SIEM
dashboard throwing out thousands of alarms at your analysts will just overwhelm them, causing
key alerts to be missed and the SIEM to provide little value. This brings us to the pros and cons
of threat intelligence for a SIEM. I will start with the cons.
Adding threat data to a SIEM tool might seem like a simple process, but it is not. The purpose
of the SIEM tool is to piece everything together (hence the single pane of glass, right?). If you
mix in the wrong external threat data, you will cause a ton of misunderstandings within the
SIEM tool’s built-in logic because the SIEM tool won’t be able to determine what is internal
and external, forcing the SOC to either re-engineer everything or look at methods to isolate
external data, leading to limitations in data correlation. I have found that some SOCs will light
up threat intelligence, get trampled with tons of new alerts they can’t take action on, and soon
after disable the threat data. This occurs due to an increase in false positives, contradictions in
correlation data, and just more things to look at.
One other con I have seen is the complete opposite situation, where a SIEM solution is not
generating any new alerts after the threat intelligence feed is added. This can occur if the threat
data is not applied correctly and either forgotten about or the SOC didn’t find any major
changes and felt the feed didn’t provide value.
When this occurs, I tend to find that a scope or objective wasn’t set, meaning there wasn’t a set
success criterion to measure against. The SOC simply lit up the feed and hoped for what would
feel like “better” results. This is a common mistake when testing threat intelligence. The SOC
should not just add a threat intelligence feed and generally look for some major incident they
didn’t know about to pop up. In the real world, that doesn’t happen by automatically adding a
threat intelligence
feed to a SIEM.
Adding Threat Intelligence to a SIEM
Given that just dumping threat data into a SIEM tool will cause more problems than value,
does that mean threat intelligence is not ideal to use in a SIEM tool?
The answer is no—if you apply data with a specific goal and expected outcome. I covered how
to evaluate threat data, with a focus on ensuring that the data is reliable, timely, and accurate
and is relevant to your line of business. There are a handful of additional checkpoints that can
help with threat data integrations with a SIEM tool. Consider the following items as you
evaluate threat data for your SIEM solution:
Validate the threat data is in a format accepted by the SIEM solution of choice. If it isn’t, how
much effort is required to manipulate the feed so that it would be accepted?
Will this new data source increase the monthly/annual SIEM bill? Many SIEM providers
charge on a usage billing cycle, meaning the more data used, the higher the cost for using the
SIEM solution. You could reduce the impact by filtering only what is relevant to your
organization and using other tuning tricks. If you are billed by your SIEM provider, consider
the impact of adding more data.
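Pre-filtering a feed before it reaches the SIEM is one way to control usage-based cost: keep only well-formed indicators of the types your SOC actually acts on. The feed format and relevance criteria below are hypothetical, meant only to sketch the idea.

```python
import ipaddress

def filter_feed(indicators: list[dict], relevant_types: set[str]) -> list[dict]:
    """Keep only well-formed indicators of types the SOC cares about.

    Assumed (hypothetical) feed format: {"type": ..., "value": ...} records.
    """
    kept = []
    for ind in indicators:
        if ind.get("type") not in relevant_types:
            continue  # drop indicator types the SOC will not act on
        if ind["type"] == "ip":
            try:
                ipaddress.ip_address(ind["value"])  # drop malformed IPs
            except ValueError:
                continue
        kept.append(ind)
    return kept

feed = [
    {"type": "ip", "value": "203.0.113.7"},
    {"type": "ip", "value": "not-an-ip"},
    {"type": "hash", "value": "d41d8cd98f00b204e9800998ecf8427e"},
]
print(len(filter_feed(feed, {"ip"})))  # 1
```

In practice this filtering step would run in the ingestion pipeline, cutting both the SIEM bill and the noise analysts see.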
What are the top threats you plan to address? Answering this allows you to focus the results of
the threat data to specific goals that matter to your SOC. For example, are you a target for a
nation state? Is phishing a top concern? You can use the SIEM tool to develop reports and live
displays that digest threat data and answer these questions. In particular, you will find that some
threats are better addressed by using external threat data while others are easier to detect using
data from internal security and network tools. The two previous examples of phishing and
nation state concerns are both better suited for using external threat data.
Does the SIEM tool offer a way to capture additional context about events?
This is where support for tactical and operational data will be extremely handy, allowing you
to correlate attack data with additional details about the who, what, when, where, and how of
the attack. Having context will make decisions regarding what action to take much simpler.
This is especially true with SOARs and SOAR/SIEM integrations.
What filters are available in the SIEM tool, and can they be applied against the threat data? The
more you can focus on what is relevant to your specific needs, the more useful the threat data
will be.
Would the threat intelligence improve the confidence of existing detection capabilities? This is
a huge question to answer since the SIEM tool is pulling in data from various internal security
and network tools. As pointed out earlier, if adding external data weakens the SIEM tool’s
decision process, this will reduce the SIEM’s confidence in alerts being generated, causing a
breakdown of the value it provides. One common method to overcome this is to be selective
about which checkpoints/widgets within the SIEM tool are using the external threat data. The
bad news is that it will require some re-engineering of certain widgets and reports if filters were
not put in place regarding adding or removing external event data prior to when the reports and
widgets were originally created.
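The correlation-with-context idea can be sketched briefly: match destination IPs from internal firewall logs against an external indicator set, attaching the who/what/when details to each hit so analysts do not have to look them up separately. The log and intel record formats are assumptions made for this illustration.

```python
# Hypothetical external threat-intel records, keyed by indicator.
threat_intel = {
    "203.0.113.50": {"actor": "example-campaign", "last_seen": "2024-04-02"},
}

# Hypothetical internal firewall log entries.
firewall_logs = [
    {"src": "10.0.0.12", "dst": "203.0.113.50", "time": "2024-04-03T11:02"},
    {"src": "10.0.0.15", "dst": "93.184.216.34", "time": "2024-04-03T11:03"},
]

def enrich(logs: list[dict], intel: dict):
    """Yield log entries whose destination matches a known indicator,
    merged with the indicator's context."""
    for entry in logs:
        context = intel.get(entry["dst"])
        if context:
            yield {**entry, **context}

hits = list(enrich(firewall_logs, threat_intel))
print(hits[0]["actor"])  # example-campaign
```

This is the enrichment a SIEM performs at scale; the added context is what turns a bare IP match into an actionable alert.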
Threat data can be converted to threat intelligence using a SIEM solution if it is added in a
planned and meaningful manner. It is critical that you follow a solid rollout plan to ensure you
maximize your value received while also avoiding any losses from adding the feed. If you just
add threat data without any set goals or considerations for how the SIEM solution is currently
being used prior to the threat data, you will run into problems with your deployment.
In summary, your rollout plan should include the following steps. Many of these steps follow
the best practices I have covered in this chapter.
Step 1. Set objectives for using the threat data. What is your measurement for success?
Step 2. Configure the SIEM solution to accept the feed.
Step 3. Monitor that the data is being collected correctly.
Step 4. Identify if existing reports and live widgets are negatively impacted. If so, add filters
to remove the external data from existing reports and widgets.
Step 5. Attempt to identify objectives using live feeds against filters that include the new threat
data.
Step 6. Tune how the data is digested and troubleshoot any collection issues until you can
identify your goals.
Step 7. Operationalize your findings in widgets and reports.
Step 8. Build the new use cases into your SOC practice and SOAR solution.
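As a rough sketch, the rollout steps above can be modeled as an ordered checklist that refuses to skip ahead, which mirrors the phased approach argued for here. The `RolloutPlan` class and its step names are illustrative only, not part of any SIEM product:

```python
from dataclasses import dataclass, field

@dataclass
class RolloutPlan:
    """Tracks progress through a threat-data rollout (steps are illustrative)."""
    steps: list = field(default_factory=lambda: [
        "Set objectives and success metrics",
        "Configure the SIEM to accept the feed",
        "Monitor that data is collected correctly",
        "Filter external data out of impacted reports/widgets",
        "Validate objectives against live, filtered feeds",
        "Tune ingestion and troubleshoot collection",
        "Operationalize findings in widgets and reports",
        "Build use cases into SOC practice and SOAR",
    ])
    completed: set = field(default_factory=set)

    def complete(self, step_number: int) -> None:
        # Enforce the phased approach: earlier steps must finish first.
        if any(n not in self.completed for n in range(1, step_number)):
            raise ValueError(f"Step {step_number} attempted before earlier steps finished")
        self.completed.add(step_number)

    def next_step(self) -> str:
        for n, name in enumerate(self.steps, start=1):
            if n not in self.completed:
                return f"Step {n}: {name}"
        return "Rollout complete"

plan = RolloutPlan()
plan.complete(1)
plan.complete(2)
print(plan.next_step())  # Step 3: Monitor that data is collected correctly
```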
Security Orchestration, Automation, and Response
One major drawback of a SIEM solution is its limitations in what actions it can take against an
event. This is where a security orchestration, automation, and response (SOAR) solution steps
in. A SOAR solution can provide case management, standardization, workflow, and analytics,
all of which enable the SOC to be much more productive. Without a SOAR tool, a SOC would
be left with a slew of alarms, leaving the SOC with the responsibility to manually investigate
and track how events are being handled.
The benefits of using threat data for a SOAR tool are slightly different than those for a SIEM
tool. One benefit is the impact on how playbooks are used. Having external data can be
extremely useful for this purpose. Playbooks can include additional triggers that are impacted
by threat data, allowing a SOC to take more proactive measures when attack campaigns are
being seen in the wild. Many SOAR providers have playbook templates that leverage both
internal and external threat data; not including a threat intelligence feed would limit usage of
such playbooks. External threat data can also add confidence in when a playbook is triggered
by adding additional context or checkpoints that must occur before the playbook is launched.
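A minimal sketch of such a checkpoint follows, assuming a hypothetical feed of known-bad IP addresses and a simplified alert format; neither is taken from any vendor's SOAR API:

```python
# Hypothetical sketch: a playbook trigger that requires corroboration from an
# external threat feed before launching, adding a confidence checkpoint.
# Names (EXTERNAL_FEED, trigger_playbook) are illustrative, not a vendor API.

EXTERNAL_FEED = {"203.0.113.7", "198.51.100.23"}  # known-bad IPs from a feed

def trigger_playbook(alert: dict, feed: set = EXTERNAL_FEED) -> bool:
    """Launch only when the internal alert is corroborated by external data."""
    internally_suspicious = alert.get("severity", 0) >= 7
    externally_known = alert.get("source_ip") in feed
    return internally_suspicious and externally_known

alert = {"source_ip": "203.0.113.7", "severity": 8}
print(trigger_playbook(alert))  # True: both checkpoints passed
print(trigger_playbook({"source_ip": "192.0.2.1", "severity": 8}))  # False
```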
Other benefits from using external threat data with a SOAR tool are similar to those of using
external threat data with a SIEM tool. Those benefits include improvements to dashboards and
reporting and improvements to incident management and response. The assumption is,
however, that the same considerations are made regarding choosing which data and how it is
used as I covered in the SIEM section. As with a SIEM tool, you must follow a phased-in
approach to adding threat intelligence to a SOAR tool or your capabilities will break, data
within the SOAR tool will become tainted with false positives, and the overall value of the
SOAR tool will be negatively impacted.
I highly suggest following the same rollout plan for adding threat intelligence to a SOAR tool
as you would for a SIEM tool. The only difference is identifying any default playbooks or
other SOAR capabilities designed to leverage threat intelligence and adding those to your
evaluation plan to simplify your end results. Many SOAR tools have default playbooks that
are similar to your goals, giving you a starting point rather than developing each playbook
from scratch. I recommend first testing the default playbooks built for leveraging external
threat data before creating your own. Once those default playbooks can digest the data,
attempt to modify the templates or build your own, knowing that the threat data is captured
correctly and available within the SOAR tool.
I recommend picking a handful of testing criteria as you add threat data to a SOAR solution.
Those testing criteria should fall under three categories: playbooks, dashboards, and reporting.
I also recommend first testing the impact of data within the SOAR dashboard before attempting
projects associated with playbooks, reporting, or other automated tasks. The specifics of how
to carry out testing will depend on the SOAR solution, other security tools, and your business
objectives.
What is Security Information and Event Management (SIEM)?
Security information and event management (SIEM) is a security solution that helps improve
security awareness and identify security threats and risks. It collects information from various
security devices, monitors and analyzes this information, and then presents the results in a
manner that is relevant to the enterprise using it.
SIEM Architecture
One of the main objectives of SIEM architecture is to maintain and manage system
configuration changes, directory services, review and log auditing, both service and user
privileges with the inclusion of incident response. In addition, the applications related
to Identity and Access Management (IAM) must be updated on a regular basis to bolster
system security and eliminate external threats. Moreover, the SIEM architecture must provide
the capabilities to present, analyze, and collect information from network and security devices.
The SIEM anomaly and visibility detection features are also worth mentioning. Detecting
polymorphic code and zero-days, automatic parsing, and log normalization can establish
patterns that are collected by SIEM visualization by utilizing the security events.
Core Components of SIEM Architecture
[Figure 1: Core components of SIEM architecture]
The architectural aspect of SIEM basically is concerned with the process of building SIEM
systems and its core components. In a nutshell, SIEM architecture encapsulates the following
components:
• Management of Logs: This is concerned with data collection, management of data and
retention of previous data. The SIEM collects both event data and contextual data as
stipulated in the above Figure 1. Basically, SIEM architecture collects event data from
organized systems such as installed devices, network protocol, storage protocols
(Syslog) and streaming protocols.
Data management mostly deals with data storage and retention policies. Modern SIEMs rely
on technologies that provide virtually unlimited storage capabilities, such as Hadoop or Amazon
S3. Retention policies allow data to be kept for a specified period, often up to seven years, so
that it remains available for forensic or audit purposes.
• Normalization of Logs: It is evident from Figure 1 that SIEM receives event and
contextual data as input; however, this data must be normalized. Normalization is
concerned with how event data is transformed into relevant security insights. Basically,
this process entails eliminating irrelevant data from the generated data through a
filtering process, so that only relevant data is retained for future analysis.
• Sources of Logs: The logs are collected from networking applications, security
systems, and cloud systems. Basically, this process is concerned with how logs are
being fed into the SIEM by organizations.
• Hosting Choices for SIEM: There are different models available for hosting the SIEM.
These include Self-Host, Cloud-Host, or Hybrid-Host.
• Reporting of SIEM: Based on the available logs, the SIEM identifies and reports
suspicious activities.
• Real-Time Monitoring: SIEM provides real-time monitoring of the organization’s
infrastructure through threat detection and rapid responses to potential data breaches.
It is pertinent to point out that traditional SIEM architecture used to be monolithic
and expensive. The next-generation SIEM, however, is more affordable and offers better
technological advantages through sophisticated software and cloud-based technology for
effective security event management.
Understanding the Architecture of Security Information and Event Management
(SIEM)
SIEM mainly refers to threat detection, prevention, and management. The goal of a SIEM
platform is to provide real-time situational awareness. It enables an organization to detect and
respond to attacks in a timely manner.
Its architecture plays a crucial role in keeping the SIEM up and running smoothly. Before a
SIEM is put into place, enough attention should be paid to its setup and technological aspects.
Let’s understand the core components of the SIEM architecture and gain valuable insights into
the functionalities of this system.
Operational Process of Security Information and Event Management (SIEM)
One of the integral parts of the security information and event management architecture is the
operational process that goes behind making an effective cyber threat mitigation strategy.
However, all this information can be overwhelming and there should be a way to streamline it.
Top 10 Best Practices of Security Information and Event Management (SIEM) in 2022
SIEM software helps organizations keep an eye out for potential threats to their systems while
carrying out automatic monitoring tasks to ensure compliance with industry regulations. With
security breaches being a serious issue, it is important that businesses utilize accessible
cybersecurity software, as it helps make the situation more manageable.
Listed below are the best practices used by leading security experts while incorporating SIEM
solutions into their enterprise framework.
1. Start by determining the scope
Before incorporating SIEM software into your enterprise’s security architecture, it is essential
to understand the specific objectives for its implementation. First, determine which systems,
users, networks, and applications are in scope for monitoring and find out the parts of data that
are highly sensitive. This can be done by building policy-based rules in the software and then
comparing them to external compliance requirements to determine the type of dashboard and
reporting your organization would need.
This practice will help you decide whether to choose an on-premise solution, a cloud-based
implementation, or one hosted through virtualization technology. Moreover, having a proper
scope also ensures that all critical aspects are being monitored without collecting large amounts
of unnecessary data.
2. Do a pilot run
Whether you’re ready for a big move or not, it’s wise to start with a small step, and the same
applies to SIEM. It is crucial to subject your SIEM software to a pilot run so you don’t end up
relying on faulty tools that disrupt your systems and slow down processes in the long run. To
avoid a hit-and-miss outcome, implement the SIEM solution in one area of the business first
and expand its usage only after assessing whether it produces the desired results.
You can measure effectiveness by identifying the impactful key metrics for the outcome of any
implementation. This data will not only help you determine whether the company’s investment
into the technology was worth it but also highlight areas where there could be room for
improvement in terms of returning more value.
3. Establish correlation rules
SIEM software ships with a wide range of pre-configured correlation rules. Security teams can
customize the software to match the specific needs of an organization and its customers. One
approach is to enable everything by default and monitor the software’s behavior to identify
areas for improvement, increasing detection efficacy while reducing false positives.
It’s worth treating the SIEM as a partner that helps trace suspicious events and protect the
organization against upcoming threats. The best way to approach correlation is to start with
the preconfigured rules and then tweak them based on what needs to be correlated and what
doesn’t.
For instance, if you want to be alerted about any incidents that could compromise the security
of the company’s website, it’s a good idea to set up correlation rules for common SIEM alerts
relating to security breaches such as SQL injection or cross-site scripting.
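A hedged sketch of such correlation rules follows, using deliberately simplified SQL injection and cross-site scripting signatures; these patterns are illustrations, not production-grade detection logic:

```python
import re

# Illustrative correlation sketch: flag web-server log entries whose request
# strings match common SQL injection or cross-site scripting signatures.
# The patterns are simplified examples for teaching purposes.

RULES = {
    "sql_injection": re.compile(r"('|%27).*(or|union|--)", re.IGNORECASE),
    "xss": re.compile(r"<script|%3Cscript", re.IGNORECASE),
}

def correlate(request_uri: str) -> list[str]:
    """Return the names of every rule the request matches."""
    return [name for name, pattern in RULES.items() if pattern.search(request_uri)]

print(correlate("/login?user=admin'%20OR%201=1--"))     # ['sql_injection']
print(correlate("/search?q=<script>alert(1)</script>"))  # ['xss']
print(correlate("/index.html"))                          # []
```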
4. Determine compliance requirements
Security logging from a SIEM system can equip your business with the information needed to
demonstrate compliance with security standards. However, unless you know what those
regulations are ahead of time, you may inadvertently waste money on a SIEM system that
doesn’t even meet the minimum security requirements.
This is why it would be a good practice to create a separate document containing a list of all
the IT regulations (HIPAA, GDPR, HITECH) that you need to comply with and then match
these requirements to the potential SIEM solution that you are considering.
In this case, it’s best to tie up with vendors who offer inbuilt features that support precise
compliance mandates. This will help you shortlist vendors and become aware of your auditing
requirements, including the amount of log data needed and the time for log data retention to
remain compliant.
5. Ensure continuous monitoring
SIEM tools are an important part of network security. Having them in place can help quickly
detect attacks or intrusions that might otherwise go unnoticed by monitoring software.
However, a robust SIEM solution will require logging as much data as possible from the
company’s applications and services so that it has enough information to successfully flag
unusual activity whenever it occurs.
SIEM looks out for any anomalies such as unusual user behavior on systems, remote login
attempts, and system failures. This approach offers an opportunity to take preventative
measures ahead of time against potential vulnerabilities.
6. Draw up a comprehensive incident response plan
Benefits such as real-time monitoring, alerts for IT threat detection, and quick responses to
security incidents are the golden eggs that come along with the implementation of SIEM
software. However, to properly respond to these incidents, an organization implementing
SIEM must adopt the right plans.
Be sure to have a strong game plan with the relevant people named, escalation processes
detailed, and troubleshooting approaches laid out. These ensure that breaches are reduced and
any possible issues/problems are minimized or at least resolved as fast as possible.
7. Ensure proper deployment
Software deployments can be difficult; one is never quite sure how it’ll go. Will everyone
respond well? Will it break unexpectedly? You might think you’ve made the right choice in a
SIEM tool, but if it isn’t successfully deployed, you may find yourself with more problems.
An organization must ensure that all the necessary infrastructure and equipment are in place
and ready to go. In addition, it’s essential to look for warning signs such as alert fatigue or false
alerts to understand whether or not there are any potential barriers to deployment.
8. Protect network boundaries
Several unsecured areas exist in and around a network that can lead to cyberattacks. Vulnerable
areas exist at the edge of networks, and these need to be thoroughly monitored by the SIEM
software.
These vulnerable areas include firewalls, routers, ports, and wireless access points. It’s best to
routinely log network changes and other information so that issues are promptly accounted for
as soon as they happen, and preventative measures can be taken.
9. Conduct test runs
In an ideal world, all security alerts that SIEM generates would be fundamentally correct and
detect attacks and events as they occur in real time. However, in reality, this is not always the
case.
False positives are not unheard of. For example, a SIEM might identify a
vulnerability scanner as an aggressive attacker, flag various threat notifications, and send out a
stream of alerts. Hence, it is advisable to conduct a test run of the SIEM integration process.
10. Carry out routine reviews & monitoring
Last but not least, evaluate all the steps outlined above regularly to ensure that everything is
properly maintained and configured. This includes checking the functionality of SIEM,
determining whether or not the infrastructure can accommodate both present and future needs,
and optimizing anything you find necessary based on performance testing results.
In addition, it’s very important to keep all other security tools—as well as SIEM itself—up to
date as new vulnerabilities arise every now and then. Given that attackers are getting smarter
than ever, it’s important to stay alert and take measures toward mitigating any new threats that
may come your way.
UNIT-IV: Computer security Log Management
A log is a record of the events occurring within an organization’s systems and networks. Logs
are composed of log entries; each entry contains information related to a specific event that has
occurred within a system or network. Originally, logs were used primarily for troubleshooting
problems, but logs now serve many functions within most organizations, such as optimizing
system and network performance, recording the actions of users, and providing data useful for
investigating malicious activity. Logs have evolved to contain information related to many
different types of events occurring within networks and systems. Within an organization, many
logs contain records related to computer security; common examples of these computer
security logs are audit logs that track user authentication attempts and security device logs that
record possible attacks. This guide addresses only those logs that typically contain computer
security-related information. Because of the widespread deployment of networked servers,
workstations, and other computing devices, and the ever-increasing number of threats against
networks and systems, the number, volume, and variety of computer security logs have increased
greatly. This has created the need for computer security log management, which is the process
for generating, transmitting, storing, analyzing, and disposing of computer security log data.
This section of the document discusses the needs and challenges in computer security log
management. Section 2.1 explains the basics of computer security logs. Section 2.2 discusses
the laws, regulations, and operational needs involved with log management. Section 2.3
explains the most common log management challenges, and Section 2.4 offers high-level
recommendations for meeting them.
The Basics of Computer Security Logs: Logs can contain a wide variety of information on
the events occurring within systems and networks. This section describes the following
categories of logs of particular interest:
• Security software logs primarily contain computer security-related information.
• Operating system logs and application logs typically contain a variety of information,
including computer security-related data.
Under different sets of circumstances, many logs created within an organization could have
some relevance to computer security. For example, logs from network devices such as switches
and wireless access points, and from programs such as network monitoring software, might
record data that could be of use in computer security or other information technology (IT)
initiatives, such as operations and audits, as well as in demonstrating compliance with
regulations. However, for computer security these logs are generally used on an as-needed
basis as supplementary sources of information. This document focuses on the types of logs that
are most often deemed to be important by organizations in terms of computer security.
Organizations should consider the value of each potential source of computer security log data
when designing and implementing a log management infrastructure. Most of the sources of the
log entries run continuously, so they generate entries on an ongoing basis. However, some
sources run periodically, so they generate entries in batches, often at regular intervals.
Security Software: Most organizations use several types of network-based and host-based
security software to detect malicious activity, protect systems and data, and support incident
response efforts. Accordingly, security software is a major source of computer security log
data. Common types of network-based and host-based security software include the following:
• Antimalware Software. The most common form of antimalware software is antivirus
software, which typically records all instances of detected malware, file and system
disinfection attempts, and file quarantines. Additionally, antivirus software might also
record when malware scans were performed and when antivirus signature or software
updates occurred. Antispyware software and other types of antimalware software (e.g.,
rootkit detectors) are also common sources of security information.
• Intrusion Detection and Intrusion Prevention Systems. Intrusion detection and
intrusion prevention systems record detailed information on suspicious behavior and
detected attacks, as well as any actions intrusion prevention systems performed to stop
malicious activity in progress. Some intrusion detection systems, such as file integrity
checking software, run periodically instead of continuously, so they generate log entries
in batches instead of on an ongoing basis.
• Remote Access Software. Remote access is often granted and secured through virtual
private networking (VPN). VPN systems typically log successful and failed login
attempts, as well as the dates and times each user connected and disconnected, and the
amount of data sent and received in each user session. VPN systems that support
granular access control, such as many Secure Sockets Layer (SSL) VPNs, may log
detailed information about the use of resources.
• Web Proxies. Web proxies are intermediate hosts through which Web sites are
accessed. Web proxies make Web page requests on behalf of users, and they cache
copies of retrieved Web pages to make additional accesses to those pages more
efficient. Web proxies can also be used to restrict Web access and to add a layer of
protection between Web clients and Web servers. Web proxies often keep a record of
all URLs accessed through them.
• Vulnerability Management Software. Vulnerability management software, which
includes patch management software and vulnerability assessment software, typically
logs the patch installation history and vulnerability status of each host, which includes
known vulnerabilities and missing software updates. Vulnerability management
software may also record additional information about hosts’ configurations.
Vulnerability management software typically runs occasionally, not continuously, and
is likely to generate large batches of log entries.
• Authentication Servers. Authentication servers, including directory servers and
single sign-on servers, typically log each authentication attempt, including its origin,
username, success or failure, and date and time.
• Routers. Routers may be configured to permit or block certain types of network traffic
based on a policy. Routers that block traffic are usually configured to log only the most
basic characteristics of blocked activity.
• Firewalls. Like routers, firewalls permit or block activity based on a policy; however,
firewalls use much more sophisticated methods to examine network traffic. Firewalls
can also track the state of network traffic and perform content inspection. Firewalls tend
to have more complex policies and generate more detailed logs of activity than routers.
• Network Quarantine Servers. Some organizations check each remote host’s security
posture before allowing it to join the network. This is often done through a network
quarantine server and agents placed on each host. Hosts that do not respond to the
server’s checks or that fail the checks are quarantined on a separate virtual local area
network (VLAN) segment. Network quarantine servers log information about the status
of checks, including which hosts were quarantined and for what reasons.
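Several of these log sources feed directly into detection logic. For example, the failed login attempts recorded by an authentication server can be rolled up into a brute-force indicator; the threshold and event field names below are illustrative assumptions, not values from any standard:

```python
from collections import Counter

# Sketch: roll up failed authentication attempts per username and flag
# accounts exceeding a threshold. The threshold and event fields are
# illustrative assumptions.

FAILED_THRESHOLD = 3

def brute_force_candidates(events: list[dict]) -> list[str]:
    """Return usernames whose failed-login count meets the threshold."""
    failures = Counter(
        e["username"] for e in events
        if e.get("result") == "failure"
    )
    return sorted(u for u, n in failures.items() if n >= FAILED_THRESHOLD)

events = [
    {"username": "alice", "result": "failure"},
    {"username": "alice", "result": "failure"},
    {"username": "alice", "result": "failure"},
    {"username": "bob", "result": "success"},
]
print(brute_force_candidates(events))  # ['alice']
```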
Operating Systems Operating systems (OS) for servers, workstations, and networking devices
(e.g., routers, switches) usually log a variety of information related to security. The most
common types of security-related OS data are as follows:
• System Events. System events are operational actions performed by OS components,
such as shutting down the system or starting a service. Typically, failed events and the
most significant successful events are logged, but many OSs permit administrators to
specify which types of events will be logged. The details logged for each event also
vary widely; each event is usually timestamped, and other supporting information could
include event, status, and error codes; service name; and user or system account
associated with an event.
• Audit Records. Audit records contain security event information such as successful
and failed authentication attempts, file accesses, security policy changes, account
changes (e.g., account creation and deletion, account privilege assignment), and use of
privileges. OSs typically permit system administrators to specify which types of events
should be audited and whether successful and/or failed attempts to perform certain
actions should be logged.
Applications: Operating systems and security software provide the foundation and protection
for applications, which are used to store, access, and manipulate the data used for the
organization’s business processes. Most organizations rely on a variety of commercial off-the-
shelf (COTS) applications, such as e-mail servers and clients, Web servers and browsers, file
servers and file sharing clients, and database servers and clients. Many organizations also use
various COTS or government off-the-shelf (GOTS) business applications such as supply chain
management, financial management, procurement systems, enterprise resource planning, and
customer relationship management. In addition to COTS and GOTS software, most
organizations also use custom-developed applications tailored to their specific requirements.
Some applications generate their own log files, while others use the logging capabilities of the
OS on which they are installed. Applications vary significantly in the types of information that
they log. The following lists some of the most commonly logged types of information and the
potential benefits of each:
• Client requests and server responses, which can be very helpful in reconstructing
sequences of events and determining their apparent outcome. If the application logs
successful user authentications, it is usually possible to determine which user made
each request. Some applications can perform highly detailed logging, such as e-mail
servers recording the sender, recipients, subject name, and attachment names for each
e-mail; Web servers recording each URL requested and the type of response provided
by the server; and business applications recording which financial records were
accessed by each user. This information can be used to identify or investigate incidents
and to monitor application usage for compliance and auditing purposes.
• Account information such as successful and failed authentication attempts, account
changes (e.g., account creation and deletion, account privilege assignment), and use of
privileges. In addition to identifying security events such as brute force password
guessing and escalation of privileges, it can be used to identify who has used the
application and when each person has used it.
• Usage information such as the number of transactions occurring in a certain period
(e.g., minute, hour) and the size of transactions (e.g., e-mail message size, file transfer
size). This can be useful for certain types of security monitoring (e.g., a ten-fold
increase in e-mail activity might indicate a new e-mail–borne malware threat; an
unusually large outbound e-mail message might indicate inappropriate release of
information).
• Significant operational actions such as application startup and shutdown, application
failures, and major application configuration changes. This can be used to identify
security compromises and operational failures.
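The usage-based monitoring idea above can be sketched as a simple baseline comparison; the ten-fold factor mirrors the e-mail example in the text, and the function is an illustration rather than a standard formula:

```python
# Sketch of the usage-based check described above: compare the current
# period's transaction count against a historical baseline and flag large
# multiples. The ten-fold factor mirrors the e-mail example in the text.

def usage_anomaly(current_count: int, baseline_avg: float, factor: float = 10.0) -> bool:
    """Return True when current activity exceeds baseline by the given factor."""
    if baseline_avg <= 0:
        return current_count > 0  # any activity on a silent baseline is notable
    return current_count >= factor * baseline_avg

print(usage_anomaly(current_count=5200, baseline_avg=500))  # True: >= ten-fold
print(usage_anomaly(current_count=900, baseline_avg=500))   # False
```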
The Need for Log Management:
Log management can benefit an organization in many ways. It helps to ensure that computer
security records are stored in sufficient detail for an appropriate period of time. Routine log
reviews and analysis are beneficial for identifying security incidents, policy violations,
fraudulent activity, and operational problems shortly after they have occurred, and for
providing information useful for resolving such problems. Logs can also be useful for
performing auditing and forensic analysis, supporting the organization’s internal
investigations, establishing baselines, and identifying operational trends and long-term
problems. Besides the inherent benefits of log management, a number of laws and regulations
further compel organizations to store and review certain logs. The following is a listing of key
regulations, standards, and guidelines that help define organizations’ needs for log
management:
• Federal Information Security Management Act of 2002 (FISMA). FISMA
emphasizes the need for each Federal agency to develop, document, and implement an
organization-wide program to provide information security for the information systems
that support its operations and assets. NIST SP 800-53, Recommended Security
Controls for Federal Information Systems, was developed in support of FISMA.
NIST SP 800-53 is the primary source of recommended security controls for Federal
agencies. It describes several controls related to log management, including the
generation, review, protection, and retention of audit records, as well as the actions to
be taken because of audit failure.
• Gramm-Leach-Bliley Act (GLBA). GLBA requires financial institutions to protect
their customers’ information against security threats. Log management can be helpful
in identifying possible security violations and resolving them effectively.
• Health Insurance Portability and Accountability Act of 1996 (HIPAA). HIPAA
includes security standards for certain health information. NIST SP 800-66, An
Introductory Resource Guide for Implementing the Health Insurance Portability and
Accountability Act (HIPAA) Security Rule, lists HIPAA-related log management
needs. For example, Section 4.1 of NIST SP 800-66 describes the need to perform
regular reviews of audit logs and access reports. Also, Section 4.22 specifies that
documentation of actions and activities need to be retained for at least six years.
• Sarbanes-Oxley Act (SOX) of 2002. Although SOX applies primarily to financial and
accounting practices, it also encompasses the information technology (IT) functions
that support these practices. SOX can be supported by reviewing logs regularly to look
for signs of security violations, including exploitation, as well as retaining logs and
records of log reviews for future review by auditors.
• Payment Card Industry Data Security Standard (PCI DSS). PCI DSS applies to
organizations that “store, process or transmit cardholder data” for credit cards. One of
the requirements of PCI DSS is to “track…all access to network resources and
cardholder data”.
Log Management Infrastructure
A log management infrastructure consists of the hardware, software, networks, and media used
to generate, transmit, store, analyze, and dispose of log data. Most organizations have one or
more log management infrastructures. This section describes the typical architecture of a log
management infrastructure and how its components interact with each other. It then describes
the basic functions performed within a log management infrastructure. Next, it examines the
two major categories of log management software: syslog-based centralized logging software
and security information and event management software. The section also describes additional
types of software that may be useful within a log management infrastructure.
Architecture: A log management infrastructure typically comprises the following three tiers:
• Log Generation. The first tier contains the hosts that generate the log data. Some hosts
run logging client applications or services that make their log data available through
networks to log servers in the second tier. Other hosts make their logs available through
other means, such as allowing the servers to authenticate to them and retrieve copies
of the log files.
• Log Analysis and Storage. The second tier is composed of one or more log servers
that receive log data or copies of log data from the hosts in the first tier. The data is
transferred to the servers either in a real-time or near-real-time manner, or in occasional
batches based on a schedule or the amount of log data waiting to be transferred. Servers
that receive log data from multiple log generators are sometimes called collectors or
aggregators. Log data may be stored on the log servers themselves or on separate
database servers.
• Log Monitoring. The third tier contains consoles that may be used to monitor and
review log data and the results of automated analysis. Log monitoring consoles can
also be used to generate reports. In some log management infrastructures, consoles can
also be used to provide management for the log servers and clients. Also, console user
privileges sometimes can be limited to only the necessary functions and data sources
for each user.
The second tier—log analysis and storage—can vary greatly in complexity and structure. The
simplest arrangement is a single log server that handles all log analysis and storage functions.
Examples of more complex second tier arrangements are as follows:

• Multiple log servers that each perform a specialized function, such as one server
performing log collection, analysis, and short-term log storage, and another server
performing long-term storage.

• Multiple log servers that each perform analysis and/or storage for certain log generators.
This can also provide some redundancy. A log generator can switch to a backup log
server if its primary log server becomes unavailable. Also, log servers can be configured
to share log data with each other, which also supports redundancy.
• Two levels of log servers, with the first level of distributed log servers receiving logs
from the log generators and forwarding some or all of the log data they receive to a

second level of more centralized log servers. (Additional tiers can be added to this
architecture to make it even more flexible, scalable, and redundant.) In some cases, the
first level servers act as log caching servers—simply receiving logs from log generators
and forwarding them to other log servers. This can be done to protect the second level
of log servers from direct attacks, and it is also useful when there are network reliability
concerns between the log generators and the second level of log servers, such as those
servers being accessible only over the Internet. In that case, having log caching servers
on a reliable local network allows the log generators to transfer their logs to those
servers, which can then transfer the logs to the second level of log servers when network
connectivity permits.
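The log-caching behavior described above can be sketched as a small buffer-and-forward loop. This is a minimal illustration, not a real product design; the class name, the `upstream_send` callable, and the simulated link are all hypothetical:

```python
class CachingServer:
    """First-level log caching server: accepts entries from local log
    generators and forwards them to a second-level server only when the
    (possibly unreliable) upstream link is available."""

    def __init__(self, upstream_send):
        self.buffer = []                     # entries held during an outage
        self.upstream_send = upstream_send   # callable; raises on link failure

    def receive(self, entry):
        self.buffer.append(entry)
        self.flush()

    def flush(self):
        while self.buffer:
            try:
                self.upstream_send(self.buffer[0])
            except ConnectionError:
                return                       # keep buffering until link returns
            self.buffer.pop(0)               # delivered; drop from the cache


# Simulated second-level server and flaky network link:
delivered = []
link_up = False

def send(entry):
    if not link_up:
        raise ConnectionError("upstream unreachable")
    delivered.append(entry)

cache = CachingServer(send)
cache.receive("entry 1")          # link down: held locally
link_up = True
cache.receive("entry 2")          # link restored: both entries forwarded
print(delivered)                  # -> ['entry 1', 'entry 2']
```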
Functions: Log management infrastructures typically perform several functions that assist in
the storage, analysis, and disposal of log data. These functions are normally performed in such
a way that they do not alter the original logs. The following items describe common log
management infrastructure functions:
General
– Log parsing is extracting data from a log so that the parsed values can be used as input for
another logging process. A simple example of parsing is reading a text-based log file that
contains 10 comma-separated values per line and extracting the 10 values from each line.
Parsing is performed as part of many other logging functions, such as log conversion and
log viewing.
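The parsing example in the text, reading ten comma-separated values per line, might look like this in Python; the field names are invented for illustration:

```python
import csv
import io

# Hypothetical field names for a 10-column comma-separated access log.
FIELDS = ["date", "time", "src_ip", "src_port", "dst_ip", "dst_port",
          "protocol", "username", "action", "status"]

def parse_log(text):
    """Parse each comma-separated line into a dict keyed by field name,
    so the values can feed other logging processes."""
    reader = csv.reader(io.StringIO(text))
    return [dict(zip(FIELDS, row)) for row in reader if row]

line = "2024-01-15,14:34:56,10.0.0.5,51234,192.0.2.7,443,TCP,alice,allow,200"
entries = parse_log(line)
print(entries[0]["src_ip"])   # -> 10.0.0.5
```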
– Event filtering is the suppression of log entries from analysis, reporting, or long-term
storage because their characteristics indicate that they are unlikely to contain information of
interest. For example, duplicate entries and standard informational entries might be filtered
because they do not provide useful information to log analysts. Typically, filtering does not
affect the generation or short-term storage of events because it does not alter the original
log files.
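A minimal filtering pass along these lines, assuming each entry carries hypothetical `event`, `src_ip`, and `severity` fields, could look like:

```python
def filter_events(entries):
    """Suppress duplicates and routine informational entries from the
    analysis stream. The original log entries are left untouched."""
    seen = set()
    kept = []
    for e in entries:
        if e["severity"] == "info":        # standard informational noise
            continue
        key = (e["event"], e["src_ip"])
        if key in seen:                    # duplicate of an entry already kept
            continue
        seen.add(key)
        kept.append(e)
    return kept

events = [
    {"event": "login_failed", "src_ip": "10.0.0.5", "severity": "warn"},
    {"event": "login_failed", "src_ip": "10.0.0.5", "severity": "warn"},
    {"event": "heartbeat",    "src_ip": "10.0.0.9", "severity": "info"},
]
print(len(filter_events(events)))   # -> 1
```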
– In event aggregation, similar entries are consolidated into a single entry containing a count
of the number of occurrences of the event. For example, a thousand entries that each record
part of a scan could be aggregated into a single entry that indicates how many hosts were
scanned. Aggregation is often performed as logs are originally generated (the generator
counts similar related events and periodically writes a log entry containing the count), and
it can also be performed as part of log reduction or event correlation processes, which are
described below.
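The scan example above, a thousand entries collapsed into one counted entry, can be sketched with a simple counter (the field names are hypothetical):

```python
from collections import Counter

def aggregate(entries):
    """Consolidate similar entries into one record per (event, src_ip)
    pair, with a count of how many occurrences were seen."""
    counts = Counter((e["event"], e["src_ip"]) for e in entries)
    return [{"event": ev, "src_ip": ip, "count": n}
            for (ev, ip), n in counts.items()]

# A thousand entries that each record part of one scan:
scan = [{"event": "port_scan", "src_ip": "198.51.100.4"}] * 1000
summary = aggregate(scan)
print(summary[0]["count"])   # -> 1000
```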
Storage

– Log rotation is closing a log file and opening a new log file when the first file is considered
to be complete. Log rotation is typically performed according to a schedule (e.g., hourly,
daily, weekly) or when a log file reaches a certain size. The primary benefits of log rotation
are preserving log entries and keeping the size of log files manageable. When a log file is
rotated, the preserved log file can be compressed to save space. Also, during log rotation,
scripts are often run that act on the archived log. For example, a script might analyze the old
log to identify malicious activity, or might perform filtering that causes only log entries
meeting certain characteristics to be preserved. Many log generators offer log rotation
capabilities; many log files can also be rotated through simple scripts or third-party utilities,
which in some cases offer features not provided by the log generators.
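A size-based rotation sketch, compressing the rotated file during rotation as the text describes; the threshold, file names, and single-archive scheme are demo choices, not a production design:

```python
import gzip
import os
import shutil
import tempfile

MAX_BYTES = 1024  # rotate when the active log reaches this size (demo value)

def rotate_if_needed(path):
    """Close out a full log file: compress its contents into an archive,
    then truncate the active log so new entries start fresh."""
    if not os.path.exists(path) or os.path.getsize(path) < MAX_BYTES:
        return None
    archived = path + ".1.gz"
    with open(path, "rb") as src, gzip.open(archived, "wb") as dst:
        shutil.copyfileobj(src, dst)       # compress during rotation
    open(path, "w").close()                # truncate the active log
    return archived

workdir = tempfile.mkdtemp()
log_path = os.path.join(workdir, "app.log")
with open(log_path, "w") as f:
    f.write("x" * 2048)                    # exceed the rotation threshold
print(rotate_if_needed(log_path) is not None)   # -> True
print(os.path.getsize(log_path))                # -> 0
```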
– Log archival is retaining logs for an extended period of time, typically on removable
media, a storage area network (SAN), or a specialized log archival appliance or
server. Logs often need to be preserved to meet legal or regulatory requirements.
–Log compression is storing a log file in a way that reduces the amount of storage
space needed for the file without altering the meaning of its contents. Log
compression is often performed when logs are rotated or archived.
– Log reduction is removing unneeded entries from a log to create a new log that is
smaller. A similar process is event reduction, which removes unneeded data fields
from all log entries. Log and event reduction are often performed in conjunction with
log archival so that only the log entries and data fields of interest are placed into
long-term storage.
– Log conversion is parsing a log in one format and storing its entries in a second
format. For example, conversion could take data from a log stored in a database and
save it in an XML format in a text file. Many log generators can convert their own
logs to another format; third-party conversion utilities are also available. Log
conversion sometimes includes actions such as filtering, aggregation, and
normalization.
– In log normalization, each log data field is converted to a particular data
representation and categorized consistently. One of the most common uses of
normalization is storing dates and times in a single format. For example, one log
generator might store the event time in a twelve-hour format (2:34:56 P.M. EDT)
categorized as Timestamp, while another log generator might store it in a twenty-four-
hour (14:34:56) format categorized as Event Time, with the time zone stored in a different
notation (-0400) in a different field categorized as Time Zone. Normalizing the

data makes analysis and reporting much easier when multiple log formats are in use.
However, normalization can be very resource-intensive, especially for complex log
entries (e.g., typical intrusion detection logs).
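The twelve-hour versus twenty-four-hour example can be normalized to one canonical representation, here ISO 8601 in UTC, using only Python's standard library:

```python
from datetime import datetime, timedelta, timezone

def normalize(ts, fmt, utc_offset_hours):
    """Convert a source-specific local timestamp plus its UTC offset into
    a single canonical representation: ISO 8601 in UTC."""
    local = datetime.strptime(ts, fmt)
    tz = timezone(timedelta(hours=utc_offset_hours))
    return local.replace(tzinfo=tz).astimezone(timezone.utc).isoformat()

# Two generators, two formats, the same instant (14:34:56 at UTC-4):
a = normalize("2:34:56 PM", "%I:%M:%S %p", -4)   # twelve-hour source
b = normalize("14:34:56",   "%H:%M:%S",   -4)    # twenty-four-hour source
print(a == b)   # -> True
```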
– Log file integrity checking involves calculating a message digest for each file and
storing the message digest securely to ensure that changes to archived logs are
detected. A message digest is a digital signature that uniquely identifies data and has
the property that changing a single bit in the data causes a completely different
message digest to be generated. The most commonly used message digest algorithms
are MD5 and Secure Hash Algorithm 1 (SHA-1). If the log file is modified and its
message digest is recalculated, it will not match the original message digest,
indicating that the file has been altered. The original message digests should be
protected from alteration through FIPS-approved encryption algorithms, storage on
read-only media, or other suitable means.
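A file integrity check along these lines can be sketched with Python's `hashlib`. The text names MD5 and SHA-1; SHA-256 is used in this sketch because the older algorithms are no longer considered collision-resistant:

```python
import hashlib
import os
import tempfile

def file_digest(path, algorithm="sha256"):
    """Compute a message digest of a log file, reading in fixed-size
    chunks so large archives do not need to fit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

path = os.path.join(tempfile.mkdtemp(), "audit.log")
with open(path, "w") as f:
    f.write("user=alice action=login result=ok\n")
original = file_digest(path)       # store this securely at archival time

with open(path, "a") as f:         # simulate tampering: one appended line
    f.write("forged entry\n")
print(file_digest(path) == original)   # -> False (alteration detected)
```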
Analysis
– Event correlation is finding relationships between two or more log entries. The most
common form of event correlation is rule-based correlation, which matches multiple log
entries from a single source or multiple sources based on logged values, such as timestamps,
IP addresses, and event types
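A minimal rule-based correlation, flagging repeated failed logins from one source IP within a time window; the rule, the threshold, and the field names are all hypothetical:

```python
from collections import defaultdict

WINDOW = 60      # seconds
THRESHOLD = 3    # failures from one IP inside the window trigger an alert

def correlate(entries):
    """Rule-based correlation: relate multiple log entries by logged
    values (event type, source IP, timestamp) and flag IPs with
    THRESHOLD or more failed logins within WINDOW seconds."""
    by_ip = defaultdict(list)
    for e in entries:
        if e["event"] == "login_failed":
            by_ip[e["src_ip"]].append(e["ts"])
    alerts = []
    for ip, times in by_ip.items():
        times.sort()
        for i in range(len(times) - THRESHOLD + 1):
            if times[i + THRESHOLD - 1] - times[i] <= WINDOW:
                alerts.append(ip)
                break
    return alerts

entries = [{"event": "login_failed", "src_ip": "203.0.113.9", "ts": t}
           for t in (10, 25, 40)]
print(correlate(entries))   # -> ['203.0.113.9']
```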
– Log viewing is displaying log entries in a human-readable format. Most log generators
provide some sort of log viewing capability; third-party log viewing utilities are also
available. Some log viewers provide filtering and aggregation capabilities.
– Log reporting is displaying the results of log analysis. Log reporting is often performed to
summarize significant activity over a particular period of time.
Disposal
– Log clearing is removing all entries from a log that precede a certain date and time. Log
clearing is often performed to remove old log data that is no longer needed on a system
because it is not of importance or it has been archived.
Syslog Security
Syslog was developed at a time when the security of logs was not a major consideration.
Accordingly, it did not support the use of basic security controls that would preserve the
confidentiality, integrity, and availability of logs. For example, most syslog
implementations use the connectionless, unreliable User Datagram Protocol (UDP) to
transfer logs between hosts. UDP provides no assurance that log entries are received
successfully or in the correct sequence.

Implementations based on RFC 3195 can support log confidentiality, integrity, and
availability through several features, including the following:
• Reliable Log Delivery. Several syslog implementations support the use of
Transmission Control Protocol (TCP) in addition to UDP. TCP is a connection-
oriented protocol that attempts to ensure the reliable delivery of information across
networks. Using TCP helps to ensure that log entries reach their destination. Having
this reliability requires the use of more network bandwidth; also, it typically takes
more time for log entries to reach their destination. Some syslog implementations
use log caching servers.
• Transmission Confidentiality Protection. RFC 3195 recommends the use of the
Transport Layer Security (TLS) protocol to protect the confidentiality of transmitted
syslog messages. TLS can protect the messages during their entire transit between
hosts. TLS can only protect the payloads of packets, not their IP headers, which
means that an observer on the network can identify the source and destination of
transmitted syslog messages, possibly revealing the IP addresses of the syslog
servers and log sources. Some syslog implementations use other means to encrypt
network traffic, such as passing syslog messages through secure shell (SSH) tunnels.
• Protecting syslog transmissions can require additional network bandwidth and
increase the time needed for log entries to reach their destination.
• Transmission Integrity Protection and Authentication. RFC 3195 recommends that
a message digest algorithm be used if integrity protection and authentication are
desired. RFC 3195 recommends the use of MD5; proposed revisions to RFC 3195
mention the use of SHA-1.
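The syslog priority value and the unreliability of UDP delivery can be illustrated with a short sketch. The message below is simplified (the RFC 3164 timestamp is omitted) and an unprivileged demo port is used instead of the standard 514:

```python
import socket

def syslog_message(facility, severity, hostname, tag, msg):
    """Build a simplified BSD-syslog (RFC 3164-style) message.
    PRI is facility * 8 + severity, wrapped in angle brackets;
    the timestamp field is omitted here for brevity."""
    pri = facility * 8 + severity
    return f"<{pri}>{hostname} {tag}: {msg}".encode()

# facility 4 (auth), severity 3 (error) -> PRI 35
packet = syslog_message(4, 3, "web01", "sshd", "Failed password for root")
print(packet[:4])   # -> b'<35>'

# Over UDP there is no assurance of delivery: the send below "succeeds"
# even if nothing is listening on the destination port.
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.sendto(packet, ("127.0.0.1", 5514))   # unprivileged demo port
sock.close()
```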
Security Information and Event Management Software: Security information and event
management (SIEM) software is a relatively new type of centralized logging software
compared to syslog. SIEM products have one or more log servers that perform log analysis,
and one or more database servers that store the logs. Most SIEM products support two ways
of collecting logs from log generators:
• Agentless. The SIEM server receives data from the individual log generating hosts
without needing to have any special software installed on those hosts. Some servers
pull logs from the hosts, which is usually done by having the server authenticate to
each host and retrieve its logs regularly. In other cases, the hosts push their logs to the
server, which usually involves each host authenticating to the server and transferring

its logs regularly. Regardless of whether the logs are pushed or pulled, the server then
performs event filtering and aggregation and log normalization and analysis on the
collected logs.
• Agent-Based. An agent program is installed on the log generating host to perform
event filtering and aggregation and log normalization for a particular type of log, then
transmit the normalized log data to an SIEM server, usually on a real-time or near-real-
time basis for analysis and storage. If a host has multiple types of logs of interest, then
it might be necessary to install multiple agents. Some SIEM products also offer agents
for generic formats such as syslog and SNMP. A generic agent is used primarily to get
log data from a source for which a format-specific agent and an agentless method are
not available. Some products also allow administrators to create custom agents to
handle unsupported log sources.
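The agent-based pipeline described above — parse the host's native log, filter to events of interest, normalize, then ship — can be sketched as follows. The log format, field names, and the fixed timestamp are invented for illustration; a real agent would transmit over the network rather than return records:

```python
import json

def agent_collect(raw_lines, parse, interesting):
    """Sketch of an agent: parse a host's native log, filter to events
    of interest, normalize, and emit records ready to be shipped to a
    SIEM server (here simply returned as JSON lines)."""
    out = []
    for line in raw_lines:
        event = parse(line)
        if event is None or not interesting(event):
            continue
        event["collected_at"] = "2024-01-15T14:34:56+00:00"  # fixed for demo
        out.append(json.dumps(event, sort_keys=True))
    return out

# Hypothetical auth-log format: "<result> user=<name> ip=<addr>"
def parse(line):
    parts = line.split()
    if len(parts) != 3:
        return None                        # malformed line: skip
    return {"result": parts[0],
            "user": parts[1].split("=", 1)[1],
            "src_ip": parts[2].split("=", 1)[1]}

records = agent_collect(
    ["FAIL user=root ip=203.0.113.9", "OK user=alice ip=10.0.0.5", "garbage"],
    parse,
    interesting=lambda e: e["result"] == "FAIL",
)
print(len(records))   # -> 1
```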
Log Management Planning
To establish and maintain successful log management infrastructures, an organization should
perform significant planning and other preparatory actions for performing log management.
This is important for creating consistent, reliable, and efficient log management practices that
meet the organization’s needs and requirements and also provide additional value for the
organization. This section describes the definition of log management roles and
responsibilities, the creation of feasible logging policies, and the design of log management
infrastructures.
Define Roles and Responsibilities
As part of the log management planning process, an organization should define the roles and
responsibilities of individuals and teams who are expected to be involved in log management.
Teams and individual roles often involved in log management include the following:
• System and network administrators, who are usually responsible for configuring
logging on individual systems and network devices, analyzing those logs periodically,
reporting on the results of log management activities, and performing regular
maintenance of the logs and logging software.
• Security administrators, who are usually responsible for managing and monitoring the
log management infrastructures, configuring logging on security devices (e.g.,
firewalls, network-based intrusion detection systems, antivirus servers), reporting on the
results of log management activities, and assisting others with configuring logging and
performing log analysis

• Computer security incident response teams, who use log data when handling some
incidents
• Application developers, who may need to design or customize applications so that they
perform logging in accordance with the logging requirements and recommendations
• Information security officers, who may oversee the log management infrastructures
• Chief information officers (CIO), who oversee the IT resources that generate, transmit,
and store the logs
• Auditors, who may use log data when performing audits
• Individuals involved in the procurement of software that should or can generate
computer security log data.
Establish Logging Policies
An organization should define its requirements and goals for performing logging and
monitoring logs. The requirements should include all applicable laws, regulations, and existing
organizational policies. Organizations should develop policies that clearly define mandatory
requirements and
suggested recommendations for several aspects of log management, including the following:
Log generation
– Which types of hosts must or should perform logging
– Which host components must or should perform logging (e.g., OS, service, application)
– Which types of events each component must or should log (e.g., security events, network
connections, authentication attempts)
– Which data characteristics must or should be logged for each type of event (e.g., username
and source IP address for authentication attempts)
– How frequently each type of event must or should be logged (e.g., every occurrence, once
for all instances in x minutes, once for every x instance, every instance after x instances)
Log transmission
– Which types of hosts must or should transfer logs to a log management infrastructure
– Which types of entries and data characteristics must or should be transferred from individual
hosts to a log management infrastructure
– How log data must or should be transferred (e.g., which protocols are permissible), including
out-of-band methods where appropriate (e.g., for standalone systems)
– How frequently log data should be transferred from individual hosts to a log management
infrastructure (e.g., real-time, every 5 minutes, every hour)
– How the confidentiality, integrity, and availability of each type of log data must or should be
protected while in transit, including whether a separate logging network should be used

Log storage and disposal
– How often logs should be rotated
– How the confidentiality, integrity, and availability of each type of log data must or should be
protected while in storage (at both the system level and the infrastructure level).

Log analysis – How often each type of log data must or should be analyzed (at both the system
level and the infrastructure level)
– Who must or should be able to access the log data (at both the system level and the
infrastructure level), and how such accesses should be logged
– What must or should be done when suspicious activity or an anomaly is identified
– How the confidentiality, integrity, and availability of the results of log analysis (e.g., alerts,
reports) must or should be protected while in storage (at both the system level and the
infrastructure level) and in transit
– How inadvertent disclosures of sensitive information recorded in logs, such as passwords or
the contents of e-mails, should be handled.
Design Log Management Infrastructures After establishing an initial policy and identifying
roles and responsibilities, an organization should next design one or more log management
infrastructures that effectively support the policy and roles. If the organization already has a
log management infrastructure, then the organization should first determine if it can be
modified to meet the organization’s needs. If the existing infrastructure is unsuitable, or no
such infrastructure exists, then the organization should either identify its infrastructure
requirements, evaluate possible solutions, and implement the chosen solution (hardware,
software, and possibly network enhancements), or reevaluate its needs and modify its policy.
Organizations may wish to create a draft policy, attempt to design a corresponding log
management infrastructure, and determine what aspects of the policy make that infeasible. The
organization can then revise its policies so that the infrastructure implementation will be less
resource-intensive, while ensuring that all legal and regulatory requirements are still met.
Because of the complexities of log management, it may take a few cycles of policy
modification, infrastructure design, and design assessment to finalize the policy and design.
When designing a log management infrastructure, organizations should consider several
factors related to the current and future needs of both the infrastructure and the individual log
sources throughout the organization. Major factors include the following:

• The typical and peak volume of log data to be processed per hour and day. The typical
volume of log data tends to increase over time for most log sources. The peak volume

should include handling extreme situations, such as widespread malware incidents,
vulnerability scanning, and penetration tests that may cause unusually large numbers of
log entries to be generated in a short period of time. If the volume of log data is too
high, a logging denial of service may result. Many logging products rate their capacity
for processing log data by the volume of events they can process in a given time, most
often in events per second (EPS).

• The typical and peak usage of network bandwidth.

• The typical and peak usage of online and offline (e.g., archival) data storage. This
should include an analysis of the time and resources needed to perform backups and
archival of log data, as well as disposing of the data once it is no longer needed.
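A back-of-the-envelope sizing calculation covering the factors above, with entirely hypothetical numbers, shows how typical EPS, peak EPS, and retention storage interact:

```python
# All figures below are invented for illustration only.
sources = 500                  # log-generating hosts
typical_eps_per_source = 2     # events per second each, on average
peak_multiplier = 10           # e.g., a widespread malware incident
avg_event_bytes = 300          # average stored size of one log entry
retention_days = 90            # required retention period

typical_eps = sources * typical_eps_per_source
peak_eps = typical_eps * peak_multiplier          # the infrastructure must
                                                  # absorb this without a
                                                  # logging denial of service
daily_bytes = typical_eps * avg_event_bytes * 86_400
retained_gib = daily_bytes * retention_days / 2**30

print(typical_eps)             # -> 1000
print(peak_eps)                # -> 10000
print(round(retained_gib))     # -> 2173 (GiB of online/archival storage)
```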
The Challenges in Log Management: Most organizations face similar log management-
related challenges, which have the same underlying problem: effectively balancing a limited
amount of log management resources with an ever-increasing supply of log data. This section
discusses the most common types of challenges, divided into three groups. First, there are
several potential problems with the initial generation of logs because of their variety and
prevalence. Second, the confidentiality, integrity, and availability of generated logs could be
breached inadvertently or intentionally. Finally, the people responsible for performing log
analysis are often inadequately prepared and supported.

• Log Generation and Storage: In a typical organization, many hosts’ OSs, security
software, and other applications generate and store logs. This complicates log
management in the following ways:
• Many Log Sources. Logs are located on many hosts throughout the organization,
necessitating log management to be performed throughout the organization. Also, a
single log source can generate multiple logs; for example, an application might store
authentication attempts in one log and network activity in another log.
• Inconsistent Log Content. Each log source records certain pieces of information in its
log entries, such as host IP addresses and usernames. For efficiency, log sources often
record only the pieces of information that they consider most important. This can make
it difficult to link events recorded by different log sources because they may not have
any common values recorded (e.g., source 1 records the source IP address but not the
username, and source 2 records the username but not the source IP address). Each type
of log source may also represent values differently; these differences may be slight,
such as one date being in MMDDYYYY format and another being in MM-DD-YYYY

format, or they may be much more complex, such as use of the File Transfer Protocol
(FTP) being identified by name in one log (“FTP”) and by port number in another log
(21). This further complicates the process of linking events recorded by different log
sources.

• Inconsistent Timestamps. Each host that generates logs typically references its
internal clock when setting a timestamp for each log entry. If a host’s clock is
inaccurate, the timestamps in its logs will also be inaccurate. This can make analysis of
logs more difficult, particularly when logs from multiple hosts are being analyzed. For
example, timestamps might indicate that event A happened 45 seconds before event B,
when event A actually happened two minutes after event B.

• Inconsistent Log Formats. Many of the log source types use different formats for their
logs, such as comma-separated or tab-separated text files, databases, syslog, Simple
Network Management Protocol (SNMP), Extensible Markup Language (XML), and
binary files. Some logs are designed for humans to read, while others are not; some
logs use standard formats, while others use proprietary formats. Some logs are created
not for local storage in a file, but for transmission to another system for processing; a
common example of this is SNMP traps. For some output formats, particularly text
files, there are many possibilities for the sequence of the values in each log entry and
the delimiters between the values (e.g., comma-separated values, tab-delimited values,
XML).
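When per-host clock skew is known (for example, measured against an NTP reference), timestamps can be corrected before analysis. The numbers below recreate the text's example, where raw timestamps show event A 45 seconds before event B although A actually happened two minutes after B; the host names and offsets are hypothetical:

```python
from datetime import datetime, timedelta

# Hypothetical per-host clock offsets measured against a reference clock:
# negative means the host's clock runs slow by that amount.
OFFSETS = {"web01": timedelta(seconds=-165),   # web01's clock is 165 s slow
           "db01":  timedelta(seconds=0)}      # db01's clock is accurate

def corrected(host, ts):
    """Subtract a host's known skew so events from different hosts
    line up on a common timeline."""
    return ts - OFFSETS[host]

# As logged: A (on web01) appears 45 seconds BEFORE B (on db01).
a = corrected("web01", datetime(2024, 1, 15, 14, 34, 45))  # event A
b = corrected("db01",  datetime(2024, 1, 15, 14, 35, 30))  # event B

# After correction, A actually happened two minutes AFTER B:
print((a - b).total_seconds())   # -> 120.0
```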

Log Protection: Because logs contain records of system and network activity, organizations
must protect their confidentiality and integrity. For example, logs might intentionally or
inadvertently capture sensitive information such as users’ passwords and the content of e-mails.
This raises security and privacy concerns involving both the individuals that review the logs
and others that might be able to access the logs through authorized or unauthorized means.
Logs that are secured improperly in storage or in transit might also be susceptible to intentional
and unintentional alteration and destruction. This could cause a variety of impacts, including
allowing malicious activities to go unnoticed and manipulating evidence to conceal the identity
of a malicious party. For example, many rootkits are specifically designed to alter logs to
remove any evidence of the rootkits’ installation or execution. Organizations also need to
protect the availability of their logs. Many logs have a maximum size, such as storing the
10,000 most recent events, or keeping 100 megabytes of log data. When the size limit is
reached, the log might overwrite old data with new data or stop logging altogether, both of
which would cause a loss of log data availability. To meet data retention requirements,

organizations might need to keep copies of log files for a longer period of time than the original
log sources can support, which necessitates establishing log archival processes. Because of the
volume of logs, it might be appropriate in some cases to reduce the logs by filtering out log
entries that do not need to be archived. The confidentiality and integrity of the archived logs
also need to be protected.
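The maximum-size behavior described above — oldest data silently overwritten once the limit is reached — can be modeled with a bounded deque, which is why archiving before the limit is reached matters:

```python
from collections import deque

class BoundedLog:
    """A log with a maximum size: once full, the oldest entries are
    silently overwritten, causing the loss of availability the text
    warns about. Archival before the limit is reached preserves them."""

    def __init__(self, max_entries):
        self.entries = deque(maxlen=max_entries)

    def write(self, entry):
        self.entries.append(entry)   # evicts the oldest entry when full

log = BoundedLog(max_entries=10_000)
for i in range(12_000):
    log.write(f"event {i}")

print(len(log.entries))   # -> 10000
print(log.entries[0])     # -> event 2000  (events 0..1999 were lost)
```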

Log Analysis: Within most organizations, network and system administrators have
traditionally been responsible for performing log analysis—studying log entries to identify
events of interest. It has often been treated as a low-priority task by administrators and
management because other duties of administrators, such as handling operational problems and
resolving security vulnerabilities, necessitate rapid responses. Administrators who are
responsible for performing log analysis often receive no training on doing it efficiently and
effectively, particularly on prioritization.

Also, administrators often do not receive tools that are effective at automating much of the
analysis process, such as scripts and security software tools (e.g., host-based intrusion detection
products, security information and event management software). Many of these tools are
particularly helpful in finding patterns that humans cannot easily see, such as correlating entries
from multiple logs that relate to the same event. Another problem is that many administrators
consider log analysis to be boring and to provide little benefit for the amount of time required.
Log analysis is often treated as reactive—something to be done after a problem has been
identified through other means—rather than proactive, to identify ongoing activity and look
for signs of impending problems. Traditionally, most logs have not been analyzed in a real-
time or near-real-time manner. Without sound processes for analyzing logs, the value of the
logs is significantly reduced.

Meeting the Challenges: Despite the many challenges an organization faces in log
management, there are a few key practices an organization can follow to avoid, and even solve,
many of the obstacles it confronts. The following four measures give a brief explanation of
these solutions:

• Prioritize log management appropriately throughout the organization. An
organization should define its requirements and goals for performing logging and
monitoring logs to include applicable laws, regulations, and existing organizational
policies. The organization can then prioritize its goals based on balancing the
organization’s reduction of risk with the time and resources needed to perform log
management functions.

• Establish policies and procedures for log management. Policies and procedures are
beneficial because they ensure a consistent approach throughout the organization as
well as ensuring that laws and regulatory requirements are being met. Periodic audits
are one way to confirm that logging standards and guidelines are being followed
throughout the organization. Testing and validation can further ensure that the policies
and procedures in the log management process are being performed properly.

• Create and maintain a secure log management infrastructure. It is very helpful for
an organization to create components of a log management infrastructure and determine
how these components interact. This aids in preserving the integrity of log data from
accidental or intentional modification or deletion, and also in maintaining the
confidentiality of log data. It is also critical to create an infrastructure robust enough to
handle not only expected volumes of log data, but also peak volumes during extreme
situations.
• Provide adequate support for all staff with log management responsibilities. While
defining the log management scheme, organizations should ensure that they provide the
necessary training to relevant staff regarding their log management responsibilities as
well as skill instruction for the needed resources to support log management. Support
also includes providing log management tools and tool documentation, providing
technical guidance on log management activities, and disseminating information to log
management staff.

UNIT-V: Log Analysis using Arc Sight
ArcSight
ArcSight is an ESM (Enterprise Security Manager) platform. It is a tool built and applied to
manage its security policy. It can detect, analyze, and resolve cyber security threats quickly.
The ESM platform has products for event collection, real-time event management, log
management, automatic response, and compliance management.
Is ArcSight a SIEM tool?
Yes. ArcSight Enterprise Security Manager (ESM) is a robust, adaptive SIEM that brings real-
time threat detection and native SOAR technology to the SOC, empowering the security
operations team.
ArcSight ESM Architecture
ESM uses SmartConnectors to collect event data from your network. SmartConnectors convert
device event data into a standardized schema that serves as the basis for correlation. In the
CORR-Engine, the Manager processes and stores event data. Users can use the ArcSight
Console or the ArcSight Command Center to monitor events, run reports, generate resources,
conduct investigations, and manage the system. Additional ArcSight solutions that drive event
flow, ease event analysis and provide security alerts and incident response are built on ESM's
fundamental architecture.

Components of ArcSight
The ArcSight platform comprises several components, each providing features and
functionality for security monitoring. By gathering and preserving event data for long-term
use cases, ArcSight addresses a wide variety of monitoring and compliance requirements.
• Arcsight SIEM Platform
The security and visibility operations that use the monitoring platform architecture are part of
the Arcsight SIEM Platform environment. The platform collects, normalizes, and categorizes
all network and security device events and logs.

• ArcSight ESM
The ArcSight ESM can collect a wide range of log data and combine it with a robust correlation
engine to detect threats across various products and notify customers to take action on
vulnerabilities.
• ArcSight Logger
The ArcSight Logger enables automated compliance reporting and log management and
storage. It has a storage capacity of up to 42 TB of log data and can search multiple events
per second across structured and unstructured data. It enables automated reporting for SOX,
PCI DSS, NERC, and other regulations.
• ArcSight Express
ESM and logger's real-time correlation and log management capabilities are included in the
ArcSight Express. The Express contains various built-in correlation rules, dashboards, and
reports and is described as a "security expert in a box." It delivers infrastructure setup and
monitoring solutions at a minimal cost.
• ArcSight SmartConnectors
The ArcSight SmartConnectors take event data from network devices and standardize it into a
schema. Data can be filtered via connectors, saving network bandwidth and storage space.
SmartConnectors increase efficiency by grouping events and reducing the number of events
of the same type. The events may be organized into a legible format, making it easier to use
them to create filters, rules, and reports.
ArcSight Latest Version
ArcSight ESM version 7.0, ArcSight Express version 5.0, ArcSight Investigate version 2.20,
and ArcSight Data Platform version 2.31 (containing ArcSight's Logger, ArcMC, and Event
Broker technology) were all launched in January 2019.
Network model in ArcSight ESM
The correlation criteria are built using the ArcSight ESM network model, a blend of the network and asset models.
• The network model represents the nodes and characteristics of the network.
• The asset model represents the attributes of those nodes.
The following resources make up the network model's elements.
• Assets represent the network's nodes, such as servers, routers, and devices.
• Asset Ranges - a collection of network nodes covered by a single block of IP addresses.
• Zones - a segment of the network identified by a block of addresses.
• Networks - used to distinguish between overlapping (for example, private) address spaces.

• Customers - the business units that are connected to the networks.
Assets
• The asset resources identify any network endpoint by IP address, MAC
address, hostname, or external ID.
• An asset resource is a network identification specification that includes the following.
o Asset name.
o Network IP address.
o MAC address.
o Hostname.
o External ID.
Asset Ranges
• An asset range is a set of assets attached to a network that occupy a contiguous block
of IP addresses.
• When the SmartConnector processes an event, it identifies each endpoint of the event
as either a single asset or an asset belonging to a particular asset range. The event
schema is then populated with the corresponding asset or asset-range identifier.
Zones
• A zone is a functional group within a network, such as a subnet, LAN, VPN, or
DMZ, identified by a block of IP addresses.
• Every asset or address range is assigned to a zone. ESM comes pre-configured with
global zones covering the entire IP address space, so many problems can be resolved
without defining extra zones.
• Zones in the same network cannot have overlapping address ranges.
• When the SmartConnector processes an event, it looks up the zone associated with each IP
address in an ordered list of networks. If a matching zone is identified, the search ends;
if not, it moves to the following network in the order given during SmartConnector
configuration.
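The ordered zone lookup described above can be illustrated with a short sketch. The following Python snippet is a toy model, not ArcSight's implementation; the zone names and address blocks are invented for the example.

```python
import ipaddress

# Hypothetical ordered list of (zone name, address block) pairs, searched
# in the order given during SmartConnector configuration.
ZONES = [
    ("DMZ",      ipaddress.ip_network("203.0.113.0/24")),
    ("Corp LAN", ipaddress.ip_network("10.0.0.0/16")),
    ("VPN",      ipaddress.ip_network("10.1.0.0/16")),
]

def lookup_zone(ip_str):
    """Return the first zone whose address block contains the IP, else None."""
    ip = ipaddress.ip_address(ip_str)
    for name, block in ZONES:
        if ip in block:
            return name   # a matching zone is identified: the search ends
    return None           # no configured zone matched

print(lookup_zone("10.0.5.9"))    # an address inside the Corp LAN block
print(lookup_zone("192.0.2.1"))   # an address outside every block
```

Because the list is searched in order and the search stops at the first match, non-overlapping blocks within a network behave deterministically, mirroring the rule that zones in the same network cannot overlap.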

Networks
• When IP ranges overlap, ArcSight resources called networks are employed to
distinguish between the zones.
• ESM ships with two standard networks: local and global.
• Using network designations, the SmartConnector tags events with the relevant zone,
allowing the Manager to find the correct asset model for the events.
Customers

• Customer tagging is a tool created to support Managed Security Service Provider
(MSSP) environments.
• Instead of being considered a source or target of an event, a customer is deemed
its "owner."
• A customer attribute can be set using a fixed string or a Velocity template variable.
ArcSight ESM Event Life Cycle
In ArcSight ESM, there are seven event life cycles.
• Data collection and event processing
The information is obtained from a variety of sources and then processed.
• Network model lookup and priority evaluation
Events are mapped against the logical network model, with its naming and structure, to
understand the environment and location of each endpoint, and the event's priority is then evaluated.
• Correlation evaluation
The correlations will be analyzed in this step, followed by monitoring and investigation.
• Monitoring and investigation
Analysts must thoroughly understand the scenarios they are monitoring and then investigate
events of interest before moving on to the workflow.
• Workflow
The workflow process model is implemented in this phase.
• Incident analysis and Reporting
The data that has been gathered is analyzed and reported.
• Event archival
Finally, the events will be archived in an off-site location. The information can be kept for a
long time. All seven stages of an event must be completed before an event can be considered
complete.
Correlation and Aggregation in ArcSight
• Aggregation
At the SmartConnector level, aggregation limits the number of events consumed by the
destination device (ESM/Logger). For example, if a SmartConnector is receiving events from
a firewall device, it can aggregate (i.e., summarize) similar events over a defined period and
deliver a single event to the destination. This can yield substantial savings in bandwidth,
storage, and processing.
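The idea can be sketched in a few lines of Python: identical events falling in the same time window are collapsed into one summarized event carrying a count. The event fields and window length here are illustrative assumptions, not ArcSight's actual event schema.

```python
from collections import defaultdict

def aggregate(events, window=60):
    """Summarize identical (name, src, dst) events within each time window.

    events: list of dicts with 'time' (seconds), 'name', 'src', 'dst'.
    Returns one aggregated event per (window, name, src, dst) with a count.
    """
    buckets = defaultdict(int)
    for ev in events:
        key = (ev["time"] // window, ev["name"], ev["src"], ev["dst"])
        buckets[key] += 1
    return [
        {"window": k[0], "name": k[1], "src": k[2], "dst": k[3], "count": n}
        for k, n in sorted(buckets.items())
    ]

# Ten identical firewall "Deny" events in the same minute collapse into one
# aggregated event with a count of 10.
raw = [{"time": t, "name": "Deny", "src": "10.0.0.5", "dst": "10.0.1.9"}
       for t in range(10)]
print(aggregate(raw))
```

Only the summary event crosses the network to ESM or Logger, which is where the bandwidth and storage savings come from.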
• Correlation

Correlation is a technique for determining the correlations between events. ESM's correlation
engine, for example, employs the rules you create (or those provided by ESM) to correlate base
and aggregated events coming in from SmartConnectors to identify if something of interest has
occurred. For example, a failed login event on an endpoint may not be of interest in and of
itself, but if the same failed login event occurs several times in a short period, it could indicate
a brute-force login attempt. A rule can monitor for this pattern and generate a correlation
event that can be acted upon.
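A hedged sketch of such a rule, written as plain Python rather than an actual ESM correlation rule: count failed logins per source over a sliding window and emit a correlation event when a threshold is crossed. The threshold, window, and event format are invented for illustration.

```python
from collections import deque, defaultdict

def detect_brute_force(events, threshold=5, window=60):
    """Emit a correlation event when a source exceeds `threshold`
    failed logins within `window` seconds.

    events: iterable of (timestamp_seconds, source_ip) failed-login
    events, assumed sorted by time.
    """
    recent = defaultdict(deque)   # source -> timestamps inside the window
    alerts = []
    for ts, src in events:
        q = recent[src]
        q.append(ts)
        while q and ts - q[0] > window:   # drop timestamps outside the window
            q.popleft()
        if len(q) >= threshold:
            alerts.append({"rule": "Possible brute-force login",
                           "src": src, "count": len(q), "time": ts})
            q.clear()                     # reset so one burst fires one alert
    return alerts

failed = [(t, "198.51.100.7") for t in range(0, 50, 10)]  # 5 failures in 50 s
print(detect_brute_force(failed))
```

A single failed login never fires; only the aggregate pattern does, which is exactly the distinction the correlation engine draws between base events and correlation events.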

Advantages and Disadvantages of ArcSight


Below are a few advantages of ArcSight
• Integration with intelligent logger and ESM for easy rule creation and management.
• Simple integration with all endpoint security management tools (IPS/IDS, firewall,
anti-virus), with their output consolidated in a single location to effectively triage
true and false positives.
• ArcSight is a powerful tool that can handle millions of events per second (EPS).
• Clustering is possible using ArcSight.
• Integration with IT infrastructures such as ticketing systems, web applications, and
threat feeds, among other things.
• Real-time correlation is powerful.
• The use of dashboards and visualizations is excellent.
Below listed are the few disadvantages of ArcSight
• There is a storage issue that needs to be addressed to improve management.
• The search function needs to be improved.
• ArcSight is a complicated tool, and it's not easy to set up and maintain.

• In a vast environment, troubleshooting difficulties on ArcSight can be complex.
• The user console is quite heavy and takes a long time to load.
• The integration is solid but not yet complete, as ArcSight cannot directly link
with several newer popular applications.
• The user interface could be improved.

UNIT -VI: Log Management using Splunk

Introduction
Splunk is a powerful platform for analyzing machine data, data that machines emit in great
volumes but which is seldom used effectively. The fastest way to understand the power and
versatility of Splunk is to consider two scenarios: one in the data center and one in the
marketing department. Splunk produces software for searching, monitoring, and analyzing
machine-generated big data, via a web-style interface.
What is Splunk?
Splunk is software that indexes IT data from any application, server or network device that
makes up your IT infrastructure. It's a powerful and versatile search and analysis engine that
lets you investigate, troubleshoot, monitor, alert, and report on everything that's happening in
your entire IT infrastructure from one location in real time.
Splunk is used for extracting value out of machine-generated data. It can be thought of as a
data mining tool for big data applications. Splunk can effectively handle big data with no
decrease in performance. The best part of Splunk is that it does not need any database to store
its data as it extensively makes use of its indexes to store the data.
Splunk is an absolutely fast engine and provides lightning-fast results. You can troubleshoot
any issue by resolving it with instant results and doing an effective root cause analysis. Splunk
can be used as a monitoring, reporting, analyzing, security information, and event management
tool among other things. Splunk takes valuable machine-generated data and converts it into
powerful operational intelligence by delivering insights through reports, charts, and alerts.
Who uses Splunk?
Splunk is versatile and thus has many uses and many different types of users. System
administrators, network engineers, security analysts, developers, service desk, and support staff
-- even Managers, VPs, and CIOs -- use Splunk to do their jobs better and faster. Application
support staff use Splunk for end-to-end investigation and remediation across the application
environment and to create alerts and dashboards that proactively monitor performance,
availability, and business metrics across an entire service. They use roles to segregate data
access along lines of duties and give application developers and Tier One support access to the
information they need from production logs without compromising security.
• System administrators and IT staff use Splunk to investigate server problems,
understand their configurations, and monitor user activity. Then, they turn the searches
into proactive alerts for performance thresholds, critical system errors, and load.

• Senior network engineers use Splunk to troubleshoot escalated problems, identify
events and patterns that are indicators of routine problems, such as misconfigured
routers and neighbor changes, and turn searches for these events into proactive alerts.
• Security analysts and incident response teams use Splunk to investigate activity for
flagged users and access to sensitive data, automatically monitor for known bad events,
and use sophisticated correlation via search to find known risk patterns such as brute
force attacks, data leakage, and even application-level fraud.
• Managers in all solution areas use Splunk to build reports and dashboards to monitor
and summarize the health, performance, activity, and capacity of their IT infrastructure
and businesses.
Splunk Architecture
The Splunk Architecture comprises three main components. These components are as
follows:
• Splunk Forwarder
• Splunk Indexer
• Search Head
Now let us understand the meaning of all these components so as to better understand the entire
Splunk Architecture.

Splunk Forwarder
The Splunk Forwarder is used to collect real-time data so as to enable real-time analysis
by users. It collects all of the log data and sends it to the indexer. In carrying out these
activities, the Splunk Forwarder consumes less processing power than traditional monitoring
tools. There are two types of Splunk Forwarders. These are:
• Splunk Universal Forwarder
• Splunk Heavy Forwarder

Splunk Indexer
The Splunk Indexer is used for indexing and storing the data that is received from the Splunk
Forwarder. It basically transforms data into events, stores and adds them to an index, which in
turn enhances searchability. The data received from the Splunk Forwarder is first parsed so as
to remove any unwanted data and then the indexing is done. In this entire process, the Splunk
Indexer creates the following files and later bifurcates them into various directories called
buckets:
• Compressed raw data
• Indexes pointing to raw data (TSIDX files)
• Metadata files
Splunk Search Head
It is basically a graphical user interface where the user can perform various operations as per
his/her requirements. In this stage, the users can easily interact with Splunk and perform search
and query operations on Splunk data. The users can feed in the search keywords and get the
result as per their requirements.
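To make the division of labour concrete, here is a toy in-memory pipeline in Python: a "forwarder" collects raw lines, an "indexer" parses them into timestamped events, and a "search head" filters events by keyword. The log format is invented and the sketch bears no relation to Splunk's internals; it only illustrates the roles.

```python
def forward(raw_text):
    """Forwarder: collect raw log lines and pass them on."""
    return [line for line in raw_text.splitlines() if line.strip()]

def index(lines):
    """Indexer: parse each line into an event with a timestamp and raw text.

    Assumes an invented format: '<epoch-seconds> <message>'.
    """
    events = []
    for line in lines:
        ts, _, msg = line.partition(" ")
        events.append({"_time": int(ts), "_raw": line, "message": msg})
    # Newest first, as in Splunk's default event listing.
    return sorted(events, key=lambda e: e["_time"], reverse=True)

def search(events, term):
    """Search head: return events whose raw text contains the term."""
    return [e for e in events if term in e["_raw"]]

log = ("1700000000 sshd login failed\n"
       "1700000005 sshd login ok\n"
       "1700000002 httpd GET /")
events = index(forward(log))
print(search(events, "sshd"))   # the two sshd events, newest first
```

In a real deployment these three roles usually run on separate machines, which is what lets each tier scale independently.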

The following diagram shows how the above components work together in the Splunk
Architecture:

What is Splunk used for?


Splunk is a software platform used for performing monitoring, searching, analyzing, and
visualizing real-time machine-generated data. Its usage in indexing, correlating, and capturing
real-time data is very important and highly recognized. Also, Splunk is used in producing and

creating graphs, dashboards, alerts, and interactive visualizations. Using Splunk, organizations
can easily access data and arrive at solutions to complex business problems too.
Features of Splunk
Here in this section of the Splunk tutorial, we will discuss some of the top features of
Splunk.
• One of the biggest strengths of Splunk is real-time data processing
• The input data for Splunk could be in any format like CSV, JSON, and others
• You can easily search and investigate a particular result with Splunk
• It lets you troubleshoot any condition of failure for improved performance
• You can monitor any business metrics and make an informed decision
• It is possible to visualize and analyze the results through powerful dashboards
• You can analyze the performance of any IT system with the Splunk tool
• Splunk even lets you incorporate Artificial Intelligence into your data strategy.

Applications of Splunk
We will discuss some of the applications of Splunk to give you a brief idea about the vast
possibilities of Splunk.
• You can deploy Splunk for web analytics to understand KPIs and improve
performance
• It is used in IT operations to detect intrusion, breaches, and network abusers
• Tracking, analyzing, and fine-tuning digital marketing initiatives with Splunk
• Working in conjunction with the Internet of Things is a big part of Splunk’s future
• It is used in industrial automation systems to see everything is working as expected
• Advising cybersecurity personnel on the best course of action for securing IT
systems.
Splunk Dashboard
Splunk Dashboards contain data visualization panels such as tables, charts, lists, and maps.
Each of these panels presents results powered by an underlying search. You can build and edit
dashboards using the Splunk Web dashboard editor, which is the user interface in Splunk Light.
The created dashboards can also be edited using Simple XML source code.
The following steps can be used to build the dashboard:
• Firstly, you need to add content. This can be done by creating searches that
power up the dashboard, saving searches as reports, or creating panels for reuse.
• The next step will be to create or design the user interface. For designing,
perform dashboard modifications using panels, visualizations, and forms.

• The next step is adding interactivity. Though this is an optional step, users may
give it a try. This step basically involves adding interactivity to the dashboard
using forms.
• The next step would be to customize the dashboard. Users can add custom
features to enhance the customization.
• Finally, use Splunk Web Dashboard Editor to build and edit your dashboard.
Splunk gathers all of the relevant information into a central index that you can rapidly
search, providing a detailed window into what is happening in your machine data.
Splunk can also reveal historical trends, correlate multiple sources of information, and help in
thousands of other ways.
Splunk does something that no other product can: efficiently capture and analyze massive
amounts of unstructured, time-series textual machine data.
Splunk works in three phases:
• First, identify the data that can answer your question.
• Second, transform the data into the results that can answer your question.
• Third, display the answer in a report, interactive chart, or graph to make it
intelligible to a wide range of audiences.

How Splunk Mastered Machine Data in the Datacentre


• Splunk begins with indexing, which means gathering all the data from diverse
locations and combining it into centralized indexes. Before Splunk, system
administrators would have had to log in to many different machines to gain access
to all the data using far less powerful tools.

• Using the indexes, Splunk can quickly search the logs from all servers and home in
on when the problem occurred. With its speed, scale, and usability, Splunk makes
determining when a problem occurred much faster.
• Splunk can then drill down into the period when the problem first occurred to
determine its root cause. Alerts can then be created to head the issue off in the
future.

Operational Intelligence
Operational intelligence is not an outgrowth of business intelligence (BI), but a new
approach based on sources of information not typically within the purview of BI solutions.
Operational data is not only incredibly valuable for improving IT operations, but also for
yielding insights into other parts of the business. Operational intelligence enables organizations
to:
• Use machine data to gain a deeper understanding of their customers.
• Reveal important patterns and analytics derived from correlating events from many
sources.
• Reduce the time between an important event and its detection.
• Leverage live feeds and historical data to make sense of what is happening now,
to find trends and anomalies, and to make more informed decisions based on that
information.
• Deploy a solution quickly and deliver the flexibility needed by organizations today
and in the future—that is, the ability to provide ad hoc reports, answer questions,
and add new data sources.
Environment Setup to Install Splunk
It covers installing Splunk, importing your data, and a bit about how the data is organized to
facilitate searching.
Machine Data Basics
Splunk’s mission is to make machine data useful for people. Splunk divides raw machine data
into discrete pieces of information known as events. When you do a simple search, Splunk
retrieves the events that match your search terms. Each event consists of discrete pieces of data
known as fields. In clock data, the fields might include second, minute, hour, day, month, and
year.
Types of Data Splunk Can Read
One of the common characteristics of machine data is that it almost always contains some
indication of when the data was created or when an event described by the data occurred.

Given this characteristic, Splunk’s indexes are optimized to retrieve events in time-series order.
If the raw data does not have an explicit timestamp, Splunk assigns the time at which the event
was indexed by Splunk to the events in the data or uses other approximations, such as the time
the file was last modified or the timestamp of previous events.
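That fallback behaviour can be sketched as a small resolver. The precedence order and the single timestamp format below are assumptions made for illustration; Splunk actually recognizes a large number of timestamp formats.

```python
import re
import time

# One illustrative timestamp format; real indexers recognize many more.
TS_PATTERN = re.compile(r"\b(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\b")

def resolve_timestamp(raw, file_mtime=None, previous=None, now=None):
    """Pick an event time: explicit timestamp in the raw text if present,
    else the file's last-modified time, else the previous event's time,
    else the time of indexing."""
    m = TS_PATTERN.search(raw)
    if m:
        return time.mktime(time.strptime(m.group(1), "%Y-%m-%d %H:%M:%S"))
    if file_mtime is not None:
        return file_mtime
    if previous is not None:
        return previous
    return now if now is not None else time.time()

print(resolve_timestamp("2024-01-02 03:04:05 ERROR disk full"))
print(resolve_timestamp("no timestamp here", file_mtime=1700000000.0))
```

Whatever path is taken, every event ends up with a time value, which is what keeps the index retrievable in time-series order.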

The only other requirement is that the machine data be textual, not binary, data. Image and
sound files are common examples of binary data files. Some types of binary files, like the core
dump produced when a program crashes, can be converted to textual information, such as a
stack trace. Splunk can call your scripts to do that conversion before indexing the data.
Ultimately, though, Splunk data must have a textual representation to be indexed and searched.
Splunk Data Sources
During indexing, Splunk can read machine data from any number of sources. The most
common input sources are:
• Files: Splunk can monitor specific files or directories. If data is added to a file or
a new file is added to a monitored directory, Splunk reads that data.
• The Network: Splunk can listen on TCP or UDP ports, reading any data sent.
• Scripted Inputs: Splunk can read the machine data output by programs or
scripts, such as a Unix® command or a custom script that monitors sensors.
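As a minimal illustration of a scripted input, the script below prints one timestamped, textual sensor reading per invocation; Splunk would simply index its standard output. The field names and line format are invented for the example and are not required by Splunk.

```python
import random
import time

def emit_reading():
    """Print one timestamped, textual sensor reading for indexing.

    Output format (invented for illustration):
    <ISO-8601 timestamp> sensor=<name> temp_c=<value>
    """
    stamp = time.strftime("%Y-%m-%dT%H:%M:%S")
    temp = round(random.uniform(18.0, 28.0), 1)
    line = f"{stamp} sensor=server_room temp_c={temp}"
    print(line)
    return line

emit_reading()
```

Because the output is plain text with an explicit timestamp, it satisfies both requirements discussed above: textual data, and a recoverable event time.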
Downloading and Installing Splunk
A fully functional copy of Splunk can be downloaded for free, for learning or to support small
to moderate use. After downloading and installing Splunk, start it as follows.
• Starting the Splunk
To start Splunk on Windows, launch the application from the Start menu. To start Splunk
on Mac OS X or Unix, open a terminal window. Go to the directory where you installed
Splunk, go to the bin subdirectory, and, at the command prompt, type:

./splunk start

The very last line of the information you see when Splunk starts is:

The Splunk web interface is at http://your-machinename:8000

Follow that link to the login screen. If you don’t have a username and password, the default
credentials are admin and changeme. After you log in, the Welcome screen appears. The
Welcome screen shows what you can do with your pristine instance of Splunk: add data or
launch the search app.
Bringing Data in for Indexing

The next step in learning and exploring Splunk is to add some data to the index so you can
explore it.
There are two steps to the indexing process:
• Downloading the sample file from the Splunk website
• Telling Splunk to index that file
To add the file to Splunk:
• From the Welcome screen, click Add Data.
• Click From files and directories on the bottom half of the screen.
• Select Skip preview.
• Click the radio button next to Upload and index a file.
• Select the file you downloaded to your desktop.
• Click Save.
Understanding How Splunk Indexes Data
Splunk’s core value to most organizations is its unique ability to index machine data so that it
can be quickly searched for analysis, reporting, and alerts. The data that you start with is called
raw data. Splunk indexes raw data by creating a time-based map of the words in the data
without modifying the data itself.
Before Splunk can search massive amounts of data, it must index the data. The Splunk index
is similar to indexes in the back of textbooks, which point to pages with specific keywords. In
Splunk, the “pages” are called events.

Splunk divides a stream of machine data into individual events. Remember, an event in
machine data can be as simple as one line in a log file or as complicated as a stack trace
containing several hundred lines.
Every event in Splunk has at least four default fields. Default fields are indexed along
with the raw data. The timestamp (_time) field is special because Splunk indexers use it to
order events, enabling Splunk to efficiently retrieve events within a time range.
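The textbook-index analogy can be sketched as a toy inverted index: each word maps to a time-ordered list of the events containing it, so a keyword search restricted to a time range is cheap. This is a model for intuition only, not Splunk's on-disk index format.

```python
from collections import defaultdict

def build_index(events):
    """Map each lowercase word to a sorted list of (time, event-id) pairs."""
    index = defaultdict(list)
    for i, (ts, raw) in enumerate(events):
        for word in set(raw.lower().split()):
            index[word].append((ts, i))
    for postings in index.values():
        postings.sort()          # keep postings in time order
    return index

def search(index, events, word, start, end):
    """Return raw events containing `word` with start <= time <= end."""
    return [events[i][1] for ts, i in index.get(word.lower(), [])
            if start <= ts <= end]

events = [(100, "ERROR disk full"), (200, "INFO backup done"),
          (300, "ERROR disk failing")]
idx = build_index(events)
print(search(idx, events, "error", 0, 250))   # only the first ERROR event
```

Note that the raw data itself is never modified; the index is an extra structure pointing back into it, which matches the description above.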

Goal of Search with Splunk
The goal of search is to help you find exactly what you need. It can mean filtering,
summarizing, and visualizing a large amount of data, to answer your questions about the data.
Splunk Installation is the first step to the goal of searching with Splunk. The Summary
dashboard gives you a quick overview of the data visible to you. Click the Launch search
app on the Splunk Welcome tab. If you’re on the Splunk Home tab, click Search under Your
Apps.
A few points about this dashboard:
• The search bar at the top is empty, ready for you to type in a search.
• The time range picker to the right of the search bar permits time range
adjustment. You can see events from the last 15 minutes, for example, or any
desired time interval. For real-time streaming data, you can select an interval to
view, ranging from 30 seconds to an hour.
• The All indexed data panel displays a running total of the indexed data.
The next three panels show the most recent or common values that have been indexed in each
category:
• The Sources panel shows which files (or other sources) your data came from.
• The Source types panel shows the types of sources in your data.
• The Hosts panel shows which host your data came from.
Search navigation menus near the top of the page include:
• Summary is where we are.
• Search leads to the main search interface, the Search dashboard.
• Status lists dashboards on the status of your Splunk instance.
• Dashboards & Views list your dashboards and views.
• Searches & Reports lists your saved searches and reports.
The Search Dashboard
If you click the Search option or enter a search in the search bar, the page switches to
the Search dashboard (sometimes called the timeline or flashtimeline view). When a search
is kicked off, the results almost immediately start displaying. For example, entering an asterisk
(*) in the search bar retrieves all the data in your default indexes.
The contents of this dashboard:-
• Timeline: A graphic representation of the number of events matching your search
over time.

• Fields sidebar: Relevant fields along with event counts. This menu also allows
you to add a field to the results.
• Field discovery switch: Turns automatic field discovery on or off. When Splunk
executes a search and field discovery is on, Splunk attempts to identify fields
automatically for the current search.
• Results area: This shows the events from your search. Events are ordered
by Timestamp, which appears to the left of each event. Beneath the Raw text of
each event are any fields selected from the Fields sidebar for which the event has
a value.

When you start typing in the search bar, context-sensitive information appears below, with
matching searches on the left and help on the right.

The search job controls are only active when a search is running. If you haven’t run a search,
or if your search has finished, they are inactive and greyed out. But if you’re running a search
that takes a long time to complete, you can use these icons to control the search progress:
• Sending a search to the background lets it keep running to completion on the server
while you run other searches or even close the window and log out. When you
click Send to background, the search bar clears and you can continue with other
tasks. When the job is done, a notification appears on your screen if you’re still
logged in; otherwise, Splunk emails you (if you’ve specified an email address). If
you want to check on the job in the meantime, or at a later time, click the Jobs link
at the top of the page.
• Pausing a search temporarily stops it and lets you explore the results to that point.
While the search is paused, the icon changes to a play button. Clicking that button
resumes the search from the point where you paused it.
• Finalizing a search stops it before it completes, but retains the results to that point
so that you can view and explore them in the search view.

• In contrast, cancelling a search stops it from running, discards the results, and
clears them from the screen.
The Job inspector icon takes you to the Job inspector page, which shows details about your
search, such as the execution costs of your search, debug messages, and search job properties.
Use the Save menu to save the search, save the results, or save and share the results. If you
save the search, you can find it on the Searches & Reports menu. If you save the results, you
can review them by clicking on Jobs in the upper right corner of the screen.
Use the Create menu to create dashboards, alerts, reports, event types, and scheduled searches.
Moving down to the upper left corner of the Results area, you see the following row of icons.

By default, Splunk shows events as a list, from most recent events to least, but you can click
on the Table icon to view your results as a table, or you can click the Chart icon to view them
as a chart. The Export button exports your search results in various formats: CSV, raw events,
XML, or JSON.

UNIT -VII: Incident Response and Handling
1. Introduction

1.1 Authority
The National Institute of Standards and Technology (NIST) developed this document in
furtherance of its statutory responsibilities under the Federal Information Security
Management Act (FISMA) of 2002, Public Law 107-347.
NIST is responsible for developing standards and guidelines, including minimum
requirements, for providing adequate information security for all agency operations and
assets, but such standards and guidelines shall not apply to national security systems. This
guideline is consistent with the requirements of the Office of Management and Budget
(OMB) Circular A-130, Section 8b(3), “Securing Agency Information Systems,” as
analyzed in A-130, Appendix IV: Analysis of Key Sections. Supplemental information is
provided in A-130, Appendix III.
This guideline has been prepared for use by Federal agencies. It may be used by
nongovernmental organizations on a voluntary basis and is not subject to copyright, though
attribution is desired.
Nothing in this document should be taken to contradict standards and guidelines made
mandatory and binding on Federal agencies by the Secretary of Commerce under statutory
authority, nor should these guidelines be interpreted as altering or superseding the existing
authorities of the Secretary of Commerce, Director of the OMB, or any other Federal official.
1.2 Purpose and Scope
This publication seeks to assist organizations in mitigating the risks from computer
security incidents by providing practical guidelines on responding to incidents effectively
and efficiently. It includes guidelines on establishing an effective incident response
program, but the primary focus of the document is detecting, analyzing, prioritizing, and
handling incidents. Organizations are encouraged to tailor the recommended guidelines
and solutions to meet their specific security and mission requirements.
1.3 Audience
This document has been created for computer security incident response teams (CSIRTs),
system and network administrators, security staff, technical support staff, chief
information security officers (CISOs), chief information officers (CIOs), computer
security program managers, and others who are responsible for preparing for, or
responding to, security incidents.

1.4 Document Structure
The remainder of this document is organized into the following sections and appendices:
 Section 2 discusses the need for incident response, outlines possible incident
response team structures, and highlights other groups within an organization
that may participate in incident handling.
 Section 3 reviews the basic incident handling steps and provides advice for
performing incident handling more effectively, particularly incident detection
and analysis.
 Section 4 examines the need for incident response coordination and information sharing.
2. Organizing a Computer Security Incident Response Capability

Organizing an effective computer security incident response capability (CSIRC) involves
several major decisions and actions. One of the first considerations should be to create an
organization-specific definition of the term “incident” so that the scope of the term is clear.
The organization should decide what services the incident response team should provide,
consider which team structures and models can provide those services, and select and
implement one or more incident response teams. Incident response plan, policy, and
procedure creation is an important part of establishing a team, so that incident response is
performed effectively, efficiently, and consistently, and so that the team is empowered to
do what needs to be done. The plan, policies, and procedures should reflect the team’s
interactions with other teams within the organization as well as with outside parties, such
as law enforcement, the media, and other incident response organizations. This section
provides not only guidelines that should be helpful to organizations that are establishing
incident response capabilities, but also advice on maintaining and enhancing existing
capabilities.
2.1 Events and Incidents
An event is any observable occurrence in a system or network. Events include a user
connecting to a file share, a server receiving a request for a web page, a user sending
email, and a firewall blocking a connection attempt. Adverse events are events with a
negative consequence, such as system crashes, packet floods, unauthorized use of system
privileges, unauthorized access to sensitive data, and execution of malware that destroys
data. This guide addresses only adverse events that are computer security-related, not
those caused by natural disasters, power failures, etc.

A computer security incident is a violation or imminent threat of violation of computer
security policies, acceptable use policies, or standard security practices. Examples of
incidents are:
 An attacker commands a botnet to send high volumes of connection requests to a web
server, causing it to crash.
 Users are tricked into opening a “quarterly report” sent via email that is actually
malware; running the tool has infected their computers and established connections
with an external host.
 An attacker obtains sensitive data and threatens that the details will be released publicly
if the organization does not pay a designated sum of money.
 A user provides or exposes sensitive information to others through peer-to-peer file
sharing services.
2.2 Need for Incident Response
Attacks frequently compromise personal and business data, and it is critical to respond
quickly and effectively when security breaches occur. The concept of computer security
incident response has become widely accepted and implemented. One of the benefits of
having an incident response capability is that it supports responding to incidents
systematically (i.e., following a consistent incident handling methodology) so that the
appropriate actions are taken. Incident response helps personnel to minimize loss or theft of
information and disruption of services caused by incidents. Another benefit of incident
response is the ability to use information gained during incident handling to better prepare
for handling future incidents and to provide stronger protection for systems and data. An
incident response capability also helps with dealing properly with legal issues that may arise
during incidents.
Besides the business reasons to establish an incident response capability, Federal
departments and agencies must comply with law, regulations, and policy directing a
coordinated, effective defense against information security threats. Chief among these are
the following:
 OMB’s Circular No. A-130, Appendix III, released in 2000, which directs Federal
agencies to “ensure that there is a capability to provide help to users when a security
incident occurs in the system and to share information concerning common
vulnerabilities and threats. This capability shall share information with other
organizations … and should assist the agency in pursuing appropriate legal action,
consistent with Department of Justice guidance.”
 FISMA (from 2002), which requires agencies to have “procedures for detecting,
reporting, and responding to security incidents” and establishes a centralized Federal
information security incident center, in part to:
– “Provide timely technical assistance to operators of agency information
systems … including guidance on detecting and handling information security
incidents …
– Compile and analyze information about incidents that threaten information security

– Inform operators of agency information systems about current and potential
information security threats, and vulnerabilities … .”
 Federal Information Processing Standards (FIPS) 200, Minimum Security
Requirements for Federal Information and Information Systems, March 2006, which
specifies minimum security requirements for Federal information and information
systems, including incident response. The specific requirements are defined in NIST
Special Publication (SP) 800-53, Recommended Security Controls for Federal
Information Systems and Organizations.
 OMB Memorandum M-07-16, Safeguarding Against and Responding to the
Breach of Personally Identifiable Information, May 2007, which provides
guidance on reporting security incidents that involve PII.
2.3 Incident Response Policy, Plan, and Procedure Creation
This section discusses policies, plans, and procedures related to incident response, with an
emphasis on interactions with outside parties.
2.3.1 Policy Elements
Policy governing incident response is highly individualized to the organization. However,
most policies include the same key elements:
 Statement of management commitment
 Purpose and objectives of the policy
 Scope of the policy (to whom and what it applies and under what circumstances)
 Definition of computer security incidents and related terms
 Organizational structure and definition of roles, responsibilities, and levels of
authority; should include the authority of the incident response team to confiscate or
disconnect equipment and to monitor suspicious activity, the requirements for
reporting certain types of incidents, the requirements and guidelines for external
communications and information sharing (e.g., what can be shared with whom, when,
and over what channels), and the handoff and escalation points in the incident
management process
 Prioritization or severity ratings of incidents
 Performance measures
 Reporting and contact forms.
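The "prioritization or severity ratings" element above is often implemented as a simple scoring scheme. The sketch below is purely illustrative: the two-factor scheme (functional impact plus information impact), the rating names, and the thresholds are assumptions for teaching purposes, not a scheme mandated by any standard.

```python
# Illustrative incident severity scoring (hypothetical scheme for discussion only).
# Combines a functional-impact rating and an information-impact rating
# into a coarse priority label for triage.

FUNCTIONAL_IMPACT = {"none": 0, "low": 1, "medium": 2, "high": 3}
INFORMATION_IMPACT = {"none": 0, "privacy_breach": 2, "proprietary_breach": 2, "integrity_loss": 3}

def incident_priority(functional: str, information: str) -> str:
    """Map the two impact ratings to a priority label."""
    score = FUNCTIONAL_IMPACT[functional] + INFORMATION_IMPACT[information]
    if score >= 5:
        return "critical"
    if score >= 3:
        return "high"
    if score >= 1:
        return "medium"
    return "low"

print(incident_priority("high", "privacy_breach"))  # critical
print(incident_priority("low", "none"))             # medium
```

A real policy would define the rating factors and thresholds to match the organization's mission and reporting obligations; the point here is only that severity ratings should be explicit and repeatable rather than ad hoc.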
2.3.2 Plan Elements
Organizations should have a formal, focused, and coordinated approach to responding to
incidents, including an incident response plan that provides the roadmap for implementing
the incident response capability. Each organization needs a plan that meets its unique
requirements, relating to the organization’s mission, size, structure, and functions.
The plan should lay out the necessary resources and management support. The incident
response plan should include the following elements:
 Mission
 Strategies and goals
 Senior management approval
 Organizational approach to incident response
 How the incident response team will communicate with the rest of the organization
and with other organizations
 Metrics for measuring the incident response capability and its effectiveness
 Roadmap for maturing the incident response capability
 How the program fits into the overall organization.
The organization’s mission, strategies, and goals for incident response should help in
determining the structure of its incident response capability. The incident response program
structure should also be discussed within the plan. Section 2.4.1 discusses the types of
structures.
Once an organization develops a plan and gains management approval, the organization
should implement the plan and review it at least annually to ensure the organization is
following the roadmap for maturing the capability and fulfilling its goals for incident
response.
2.3.3 Procedure Elements
Procedures should be based on the incident response policy and plan. Standard operating
procedures (SOPs) are a delineation of the specific technical processes, techniques,
checklists, and forms used by the incident response team. SOPs should be reasonably
comprehensive and detailed to ensure that the priorities of the organization are reflected
in response operations. In addition, following standardized responses should minimize
errors, particularly those that might be caused by stressful incident handling situations.
SOPs should be tested to validate their accuracy and usefulness, then distributed to all
team members. Training should be provided for SOP users; the SOP documents can be
used as an instructional tool. Suggested SOP elements are presented throughout Section
3.
2.3.4 Sharing Information with Outside Parties
Organizations often need to communicate with outside parties regarding an incident, and
they should do so whenever appropriate, such as contacting law enforcement, fielding
media inquiries, and seeking external expertise. Another example is discussing incidents
with other involved parties, such as Internet service providers (ISPs), the vendor of
vulnerable software, or other incident response teams.
Organizations may also proactively share relevant incident indicator information with
peers to improve detection and analysis of incidents. The incident response team should
discuss information sharing with the organization’s public affairs office, legal department,
and management before an incident occurs to establish policies and procedures regarding
information sharing. Otherwise, sensitive information regarding incidents may be
provided to unauthorized parties, potentially leading to additional disruption and financial
loss. The team should document all contacts and communications with outside parties for
liability and evidentiary purposes.
The following sections provide guidelines on communicating with several types of outside
parties, as depicted in Figure 2-1. The double-headed arrows indicate that either party may
initiate communications. See Section 4 for additional information on communicating with
outside parties, and see Section 2.4 for a discussion of communications involving incident
response outsourcers.
 Legal Department. Legal experts should review incident response plans, policies, and
procedures to ensure their compliance with law and Federal guidance, including the
right to privacy. In addition, the guidance of the general counsel or legal department
should be sought if there is reason to believe that an incident may have legal
ramifications, including evidence collection, prosecution of a suspect, or a lawsuit, or
if there may be a need for a memorandum of understanding (MOU) or other binding
agreements involving liability limitations for information sharing.
 Public Affairs and Media Relations. Depending on the nature and impact of an
incident, a need may exist to inform the media and, by extension, the public.
 Human Resources. If an employee is suspected of causing an incident, the
human resources department may be involved—for example, in assisting with
disciplinary proceedings.
 Business Continuity Planning. Organizations should ensure that incident response
policies and procedures and business continuity processes are in sync. Computer
security incidents undermine the business resilience of an organization. Business
continuity planning professionals should be made aware of incidents and their
impacts so they can fine-tune business impact assessments, risk assessments, and
continuity of operations plans. Further, because business continuity planners have
extensive expertise in minimizing operational disruption during severe
circumstances, they may be valuable in planning responses to certain situations, such
as denial of service (DoS) conditions.
 Physical Security and Facilities Management. Some computer security incidents
occur through breaches of physical security or involve coordinated logical and
physical attacks. The incident response team also may need access to facilities during
incident handling—for example, to acquire a compromised workstation from a
locked office.
2.5 Incident Response Team Services
The main focus of an incident response team is performing incident response, but it is
fairly rare for a team to perform incident response only. The following are examples of
other services a team might offer:
 Intrusion Detection. The first tier of an incident response team often assumes
responsibility for intrusion detection. The team generally benefits because it
should be poised to analyze incidents more quickly and accurately, based on the
knowledge it gains of intrusion detection technologies.
 Advisory Distribution. A team may issue advisories within the organization
regarding new vulnerabilities and threats. Automated methods should be used
whenever appropriate to disseminate information; for example, the National
Vulnerability Database (NVD) provides information via XML and RSS feeds when
new vulnerabilities are added to it. Advisories are often most necessary when new
threats are emerging, such as a high-profile social or political event (e.g., celebrity
wedding) that attackers are likely to leverage in their social engineering. Only one
group within the organization should distribute computer security advisories to avoid
duplicated effort and conflicting information.
 Education and Awareness. Education and awareness are resource multipliers—the
more the users and technical staff know about detecting, reporting, and responding
to incidents, the less drain there should be on the incident response team. This
information can be communicated through many means: workshops, websites,
newsletters, posters, and even stickers on monitors and laptops.
 Information Sharing. Incident response teams often participate in information
sharing groups, such as ISACs or regional partnerships. Accordingly, incident
response teams often manage the organization’s incident information sharing
efforts, such as aggregating information related to incidents and effectively sharing
that information with other organizations, as well as ensuring that pertinent
information is shared within the enterprise.
2.6 Recommendations
The key recommendations presented in this section for organizing a computer security
incident handling capability are summarized below.
 Establish a formal incident response capability. Organizations should be
prepared to respond quickly and effectively when computer security defenses are
breached. FISMA requires Federal agencies to establish incident response
capabilities.
 Create an incident response policy. The incident response policy is the foundation
of the incident response program. It defines which events are considered incidents,
establishes the organizational structure for incident response, defines roles and
responsibilities, and lists the requirements for reporting incidents, among other
items.
 Develop an incident response plan based on the incident response policy. The
incident response plan provides a roadmap for implementing an incident response
program based on the organization’s policy. The plan indicates both short- and long-
term goals for the program, including metrics for measuring the program. The
incident response plan should also indicate how often incident handlers should be
trained and the requirements for incident handlers.
 Develop incident response procedures. The incident response procedures provide
detailed steps for responding to an incident. The procedures should cover all the
phases of the incident response process. The procedures should be based on the
incident response policy and plan.
 Establish policies and procedures regarding incident-related information
sharing. The organization should communicate appropriate incident details with
outside parties, such as the media, law enforcement agencies, and incident reporting
organizations. The incident response team should discuss this with the organization’s
public affairs office, legal department, and management to establish policies and
procedures regarding information sharing. The team should comply with existing
organization policy on interacting with the media and other outside parties.
 Provide pertinent information on incidents to the appropriate organization.
Federal civilian agencies are required to report incidents to US-CERT; other
organizations can contact US-CERT and/or their ISAC. Reporting is beneficial
because US-CERT and the ISACs use the reported data to provide information to the
reporting parties regarding new threats and incident trends.
 Consider the relevant factors when selecting an incident response team model.
Organizations should carefully weigh the advantages and disadvantages of each
possible team structure model and staffing model in the context of the organization’s
needs and available resources.
 Select people with appropriate skills for the incident response team. The
credibility and proficiency of the team depend to a large extent on the technical skills
and critical thinking abilities of its members. Critical technical skills include system
administration, network administration, programming, technical support, and intrusion
detection. Teamwork and communications skills are also needed for effective incident
handling. Necessary training should be provided to all team members.
 Identify other groups within the organization that may need to participate in
incident handling. Every incident response team relies on the expertise, judgment,
and abilities of other teams, including management, information assurance, IT
support, legal, public affairs, and facilities management.
 Determine which services the team should offer. Although the main focus of the
team is incident response, most teams perform additional functions. Examples
include monitoring intrusion detection sensors, distributing security advisories, and
educating users on security.
3. Handling an Incident
The incident response process has several phases. The initial phase involves establishing
and training an incident response team, and acquiring the necessary tools and resources.
During preparation, the organization also attempts to limit the number of incidents that
will occur by selecting and implementing a set of controls based on the results of risk
assessments. However, residual risk will inevitably persist after controls are implemented.
Detection of security breaches is thus necessary to alert the organization whenever
incidents occur. In keeping with the severity of the incident, the organization can mitigate
the impact of the incident by containing it and ultimately recovering from it. During this
phase, activity often cycles back to detection and analysis—for example, to see if
additional hosts are infected by malware while eradicating a malware incident. After the
incident is adequately handled, the organization issues a report that details the cause and
cost of the incident and the steps the organization should take to prevent future incidents.
This section describes the major phases of the incident response process—preparation,
detection and analysis, containment, eradication and recovery, and post-incident activity
in detail. Figure 3-1 illustrates the incident response life cycle.
Figure 3-1. Incident Response Life Cycle

3.1 Preparation
Incident response methodologies typically emphasize preparation—not only establishing
an incident response capability so that the organization is ready to respond to incidents,
but also preventing incidents by ensuring that systems, networks, and applications are
sufficiently secure. Although the incident response team is not typically responsible for
incident prevention, such prevention is fundamental to the success of incident response programs. This
section provides basic advice on preparing to handle incidents and on preventing
incidents.
3.2 Preparing to Handle Incidents
The lists below provide examples of tools and resources available that may be of value
during incident handling. These lists are intended to be a starting point for discussions
about which tools and resources an organization’s incident handlers need. For example,
smartphones are one way to have resilient emergency communication and coordination
mechanisms. An organization should have multiple (separate and different)
communication and coordination mechanisms in case of failure of one mechanism.
Incident Handler Communications and Facilities:
 Contact information for team members and others within and outside the
organization (primary and backup contacts), such as law enforcement and other
incident response teams; information may include phone numbers, email addresses,
public encryption keys (in accordance with the encryption software described below),
and instructions for verifying the contact’s identity
 On-call information for other teams within the organization, including escalation
information
 Incident reporting mechanisms, such as phone numbers, email addresses, online
forms, and secure instant messaging systems that users can use to report suspected
incidents; at least one mechanism should permit people to report incidents
anonymously
 Issue tracking system for tracking incident information, status, etc.
 Smartphones to be carried by team members for off-hour support and onsite
communications
 Encryption software to be used for communications among team members, within
the organization and with external parties; for Federal agencies, software must use a
FIPS-validated encryption algorithm
 War room for central communication and coordination; if a permanent war room is
not necessary or practical, the team should create a procedure for procuring a
temporary war room when needed
 Secure storage facility for securing evidence and other sensitive materials
Incident Analysis Hardware and Software:
 Digital forensic workstations and/or backup devices to create disk images,
preserve log files, and save other relevant incident data
 Laptops for activities such as analyzing data, sniffing packets, and writing reports
 Spare workstations, servers, and networking equipment, or the virtualized
equivalents, which may be used for many purposes, such as restoring backups and
trying out malware
 Blank removable media
 Portable printer to print copies of log files and other evidence from non-networked
systems
 Packet sniffers and protocol analyzers to capture and analyze network traffic
 Digital forensic software to analyze disk images
 Removable media with trusted versions of programs to be used to gather evidence from
systems
 Evidence gathering accessories, including hard-bound notebooks, digital cameras,
audio recorders, chain of custody forms, evidence storage bags and tags, and
evidence tape, to preserve evidence for possible legal actions
Incident Analysis Resources:
 Port lists, including commonly used ports and Trojan horse ports
 Documentation for OSs, applications, protocols, and intrusion detection and antivirus
products
 Network diagrams and lists of critical assets, such as database servers
 Current baselines of expected network, system, and application activity
 Cryptographic hashes of critical files to speed incident analysis, verification, and eradication
Incident Mitigation Software:
 Access to images of clean OS and application installations for restoration and recovery
purposes
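The "cryptographic hashes of critical files" resource listed above can be sketched as a small baseline-and-verify routine. This is a minimal illustration only (the baseline file name and the idea of a JSON store are assumptions), not a substitute for dedicated file integrity checking software:

```python
# Minimal sketch: record SHA-256 hashes of critical files, then detect changes later.
# The baseline file name and monitored paths are hypothetical examples.
import hashlib
import json
import os

def sha256_of(path: str) -> str:
    """Hash a file in chunks so large files need not fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def build_baseline(paths, baseline_file="hash_baseline.json"):
    """Store current hashes of the given files as the trusted baseline."""
    baseline = {p: sha256_of(p) for p in paths if os.path.isfile(p)}
    with open(baseline_file, "w") as f:
        json.dump(baseline, f, indent=2)
    return baseline

def verify(baseline_file="hash_baseline.json"):
    """Return files whose current hash no longer matches the baseline."""
    with open(baseline_file) as f:
        baseline = json.load(f)
    return [p for p, digest in baseline.items()
            if not os.path.isfile(p) or sha256_of(p) != digest]
```

During incident analysis, a non-empty result from verify() points handlers at files that may have been altered, which speeds both verification and eradication.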
Many incident response teams create a jump kit, which is a portable case that contains
materials that may be needed during an investigation. The jump kit should be ready to go
at all times. Jump kits contain many of the same items listed in the bulleted lists above.
For example, each jump kit typically includes a laptop, loaded with appropriate software
(e.g., packet sniffers, digital forensics). Other important materials include backup devices,
blank media, and basic networking equipment and cables. Because the purpose of having
a jump kit is to facilitate faster responses, the team should avoid borrowing items from
the jump kit.
Each incident handler should have access to at least two computing devices (e.g., laptops).
One, such as the one from the jump kit, should be used to perform packet sniffing, malware
analysis, and all other actions that risk contaminating the laptop that performs them. This
laptop should be scrubbed and all software reinstalled before it is used for another incident.
Note that because this laptop is special purpose, it is likely to use software other than the
standard enterprise tools and configurations, and whenever possible the incident handlers
should be allowed to specify basic technical requirements for these special- purpose
investigative laptops. In addition to an investigative laptop, each incident handler should
also have a standard laptop, smart phone, or other computing device for writing reports,
reading email, and performing other duties unrelated to the hands-on incident analysis.
Exercises involving simulated incidents can also be very useful for preparing staff for
incident handling; see NIST SP 800-84 for more information on exercises and Appendix
A for sample exercise scenarios.
3.3 Preventing Incidents
Keeping the number of incidents reasonably low is very important to protect the business
processes of the organization. If security controls are insufficient, higher volumes of
incidents may occur, overwhelming the incident response team. This can lead to slow and
incomplete responses, which translate to a larger negative business impact (e.g., more
extensive damage, longer periods of service and data unavailability).
It is outside the scope of this document to provide specific advice on securing networks,
systems, and applications. Although incident response teams are generally not responsible
for securing resources, they can be advocates of sound security practices. An incident
response team may be able to identify problems that the organization is otherwise not
aware of; the team can play a key role in risk assessment and training by identifying gaps.
Other documents already provide advice on general security concepts and operating
system and application-specific guidelines. The following text, however, provides a
brief overview of some of the main recommended practices for securing networks,
systems, and applications:
 Risk Assessments. Periodic risk assessments of systems and applications should
determine what risks are posed by combinations of threats and vulnerabilities. This
should include understanding the applicable threats, including organization-specific
threats. Each risk should be prioritized, and the risks can be mitigated, transferred, or
accepted until a reasonable overall level of risk is reached. Another benefit of
conducting risk assessments regularly is that critical resources are identified, allowing
staff to emphasize monitoring and response activities for those resources.
 Host Security. All hosts should be hardened appropriately using standard
configurations. In addition to keeping each host properly patched, hosts should be
configured to follow the principle of least privilege—granting users only the
privileges necessary for performing their authorized tasks. Hosts should have auditing
enabled and should log significant security-related events. The security of hosts and
their configurations should be continuously monitored. Many organizations use
Security Content Automation Protocol (SCAP)-expressed operating system and
application configuration checklists to assist in securing hosts consistently and
effectively.
 Network Security. The network perimeter should be configured to deny all
activity that is not expressly permitted. This includes securing all connection
points, such as virtual private networks (VPNs) and dedicated connections to other
organizations.
 Malware Prevention. Software to detect and stop malware should be deployed
throughout the organization. Malware protection should be deployed at the host level
(e.g., server and workstation operating systems), the application server level (e.g.,
email server, web proxies), and the application client level (e.g., email clients, instant
messaging clients).
 User Awareness and Training. Users should be made aware of policies and
procedures regarding appropriate use of networks, systems, and applications.
Applicable lessons learned from previous incidents should also be shared with users
so they can see how their actions could affect the organization. Improving user
awareness regarding incidents should reduce the frequency of incidents. IT staff
should be trained so that they can maintain their networks, systems, and applications
in accordance with the organization’s security standards.
3.4 Detection and Analysis
Figure 3-2. Incident Response Life Cycle (Detection and Analysis)
3.4.1 Attack Vectors
Incidents can occur in countless ways, so it is infeasible to develop step-by-step
instructions for handling every incident. Organizations should be generally prepared to
handle any incident but should focus on being prepared to handle incidents that use
common attack vectors. Different types of incidents merit different response strategies.
The attack vectors listed below are not intended to provide definitive classification for
incidents; rather, they simply list common methods of attack, which can be used as a basis
for defining more specific handling procedures.
 External/Removable Media: An attack executed from removable media or a
peripheral device—for example, malicious code spreading onto a system from an
infected USB flash drive.
 Attrition: An attack that employs brute force methods to compromise, degrade, or
destroy systems, networks, or services (e.g., a DDoS intended to impair or deny
access to a service or application; a brute force attack against an authentication
mechanism, such as passwords, CAPTCHAs, or digital signatures).
 Web: An attack executed from a website or web-based application—for example, a
cross-site scripting attack used to steal credentials or a redirect to a site that exploits a
browser vulnerability and installs malware.
 Email: An attack executed via an email message or attachment—for example, exploit
code disguised as an attached document or a link to a malicious website in the body
of an email message.
 Impersonation: An attack involving replacement of something benign with
something malicious— for example, spoofing, man in the middle attacks, rogue
wireless access points, and SQL injection attacks all involve impersonation.
 Improper Usage: Any incident resulting from violation of an organization’s
acceptable usage policies by an authorized user, excluding the above categories; for
example, a user installs file sharing software, leading to the loss of sensitive data; or a
user performs illegal activities on a system.
 Loss or Theft of Equipment: The loss or theft of a computing device or
media used by the organization, such as a laptop, smartphone, or
authentication token.
 Other: An attack that does not fit into any of the other categories.
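To make the categories above concrete, an incident tracking tool might apply a first-pass tag to incoming reports using simple keyword rules, with analysts confirming the final classification. The sketch below is purely illustrative; the keyword lists are assumptions and are not part of the taxonomy itself.

```python
# Illustrative keyword-based tagging of incident reports with an attack vector.
# The keyword lists are hypothetical; real classification is confirmed by analysts.
VECTOR_KEYWORDS = {
    "External/Removable Media": ["usb", "flash drive", "removable"],
    "Attrition": ["ddos", "brute force", "flood"],
    "Web": ["cross-site", "xss", "drive-by", "website"],
    "Email": ["phishing", "attachment", "email"],
    "Impersonation": ["spoof", "man in the middle", "rogue access point"],
    "Improper Usage": ["acceptable use", "file sharing", "policy violation"],
    "Loss or Theft of Equipment": ["stolen laptop", "lost phone", "missing token"],
}

def tag_vector(report: str) -> str:
    """Return the first matching vector, or "Other" if nothing matches."""
    text = report.lower()
    for vector, keywords in VECTOR_KEYWORDS.items():
        if any(k in text for k in keywords):
            return vector
    return "Other"

print(tag_vector("Users received a phishing message with a malicious attachment"))  # Email
```

Because the categories are a basis for choosing handling procedures rather than a rigid classification, a tagger like this only needs to route a report toward the right procedure, not decide the case.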
This section focuses on recommended practices for handling any type of incident. It is
outside the scope of this publication to give specific advice based on the attack vectors;
such guidelines would be provided in separate publications addressing other incident
handling topics, such as NIST SP 800-83 on malware incident prevention and handling.
3.4.2 Signs of an Incident
For many organizations, the most challenging part of the incident response process is
accurately detecting and assessing possible incidents—determining whether an incident
has occurred and, if so, the type, extent, and magnitude of the problem. What makes this
so challenging is a combination of three factors:
 Incidents may be detected through many different means, with varying levels of
detail and fidelity. Automated detection capabilities include network-based and host-
based IDPSs, antivirus software, and log analyzers. Incidents may also be detected
through manual means, such as problems reported by users. Some incidents have
overt signs that can be easily detected, whereas others are almost impossible to
detect.
 The volume of potential signs of incidents is typically high—for example, it is not
uncommon for an organization to receive thousands or even millions of intrusion
detection sensor alerts per day. (See Section 3.2.4 for information on analyzing such
alerts.)
 Deep, specialized technical knowledge and extensive experience are necessary
for proper and efficient analysis of incident-related data.
Signs of an incident fall into one of two categories: precursors and indicators. A
precursor is a sign that an incident may occur in the future. An indicator is a sign that
an incident may have occurred or may be occurring now.
Most attacks do not have any identifiable or detectable precursors from the target’s
perspective. If precursors are detected, the organization may have an opportunity to
prevent the incident by altering its security posture to save a target from attack. At a
minimum, the organization could monitor activity involving the target more closely.
Examples of precursors are:
 Web server log entries that show the usage of a vulnerability scanner
 An announcement of a new exploit that targets a vulnerability of the organization’s mail
server
 A threat from a group stating that the group will attack the organization.
While precursors are relatively rare, indicators are all too common. Too many types of
indicators exist to exhaustively list them, but some examples are listed below:
 A network intrusion detection sensor alerts when a buffer overflow attempt occurs
against a database server.
 Antivirus software alerts when it detects that a host is infected with malware.
 A system administrator sees a filename with unusual characters.
 A host records an auditing configuration change in its log.
 An application logs multiple failed login attempts from an unfamiliar remote system.
 An email administrator sees a large number of bounced emails with suspicious content.
 A network administrator notices an unusual deviation from typical network traffic flows.
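Several of the indicators above, such as repeated failed logins from an unfamiliar remote system, can be surfaced by straightforward log analysis. The sketch below assumes a hypothetical log line format and threshold; in practice a SIEM or log analyzer performs this kind of correlation at scale.

```python
# Illustrative sketch: flag source IPs with repeated failed logins in application logs.
# The log line format and the threshold of 5 are hypothetical assumptions.
import re
from collections import Counter

FAILED_LOGIN = re.compile(r"FAILED LOGIN .* from (\d+\.\d+\.\d+\.\d+)")

def suspicious_sources(log_lines, threshold=5):
    """Count failed logins per source IP and return those at or above threshold."""
    counts = Counter()
    for line in log_lines:
        m = FAILED_LOGIN.search(line)
        if m:
            counts[m.group(1)] += 1
    return {ip: n for ip, n in counts.items() if n >= threshold}

logs = ["2024-01-10 09:0%d FAILED LOGIN user=admin from 203.0.113.7" % i for i in range(6)]
logs += ["2024-01-10 09:10 FAILED LOGIN user=bob from 198.51.100.2"]
print(suspicious_sources(logs))  # {'203.0.113.7': 6}
```

A single failed login is routine; it is the aggregation (many failures, one source, short window) that turns raw events into an indicator worth investigating.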
3.4.3 Sources of Precursors and Indicators
Precursors and indicators are identified using many different sources, with the most
common being computer security software alerts, logs, publicly available information, and
people. Table 3-1 lists common sources of precursors and indicators for each category.
Table 3-1. Common Sources of Precursors and Indicators

Alerts

 IDPSs: IDPS products identify suspicious events and record pertinent data regarding them, including the date and time the attack was detected, the type of attack, the source and destination IP addresses, and the username (if applicable and known). Most IDPS products use attack signatures to identify malicious activity; the signatures must be kept up to date so that the newest attacks can be detected. IDPS software often produces false positives—alerts that indicate malicious activity is occurring, when in fact there has been none. Analysts should manually validate IDPS alerts either by closely reviewing the recorded supporting data or by getting related data from other sources.31

 SIEMs: Security Information and Event Management (SIEM) products are similar to IDPS products, but they generate alerts based on analysis of log data (see below).

 Antivirus and antispam software: Antivirus software detects various forms of malware, generates alerts, and prevents the malware from infecting hosts. Current antivirus products are effective at stopping many instances of malware if their signatures are kept up to date. Antispam software is used to detect spam and prevent it from reaching users’ mailboxes. Spam may contain malware, phishing attacks, and other malicious content, so alerts from antispam software may indicate attack attempts.

 File integrity checking software: File integrity checking software can detect changes made to important files during incidents. It uses a hashing algorithm to obtain a cryptographic checksum for each designated file. If the file is altered and the checksum is recalculated, an extremely high probability exists that the new checksum will not match the old checksum. By regularly recalculating checksums and comparing them with previous values, changes to files can be detected.

 Third-party monitoring services: Third parties offer a variety of subscription-based and free monitoring services. An example is fraud detection services that will notify an organization if its IP addresses, domain names, etc. are associated with current incident activity involving other organizations. There are also free real-time blacklists with similar information. Another example of a third-party monitoring service is a CSIRC notification list; these lists are often available only to other incident response teams.

Logs

 Operating system, service, and application logs: Logs from operating systems, services, and applications (particularly audit-related data) are frequently of great value when an incident occurs, such as recording which accounts were accessed and what actions were performed. Organizations should require a baseline level of logging on all systems and a higher baseline level on critical systems. Logs can be used for analysis by correlating event information. Depending on the event information, an alert can be generated to indicate an incident. Section 3.2.4 discusses the value of centralized logging.

 Network device logs: Logs from network devices such as firewalls and routers are not typically a primary source of precursors or indicators. Although these devices are usually configured to log blocked connection attempts, they provide little information about the nature of the activity. Still, they can be valuable in identifying network trends and in correlating events detected by other devices.

 Network flows: A network flow is a particular communication session occurring between hosts. Routers and other networking devices can provide network flow information, which can be used to find anomalous network activity caused by malware, data exfiltration, and other malicious acts. There are many standards for flow data formats, including NetFlow, sFlow, and IPFIX.

Publicly Available Information

 Information on new vulnerabilities and exploits: Keeping up with new vulnerabilities and exploits can prevent some incidents from occurring and assist in detecting and analyzing new attacks. The National Vulnerability Database (NVD) contains information on vulnerabilities.32 Organizations such as US-CERT33 and CERT®/CC periodically provide threat update information through briefings, web postings, and mailing lists.

People

 People from within the organization: Users, system administrators, network administrators, security staff, and others from within the organization may report signs of incidents. It is important to validate all such reports. One approach is to ask people who provide such information how confident they are of the accuracy of the information. Recording this estimate along with the information provided can help considerably during incident analysis, particularly when conflicting data is discovered.

 People from other organizations: Reports of incidents that originate externally should be taken seriously. For example, the organization might be contacted by a party claiming a system at the organization is attacking its systems. External users may also report other indicators, such as a defaced web page or an unavailable service. Other incident response teams also may report incidents. It is important to have mechanisms in place for external parties to report indicators and for trained staff to monitor those mechanisms carefully; this may be as simple as setting up a phone number and email address, configured to forward messages to the help desk.
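The file integrity checking approach described in the table amounts to recording a cryptographic checksum for each designated file and later recomparing. A minimal sketch in Python (the file names and baseline layout are illustrative, not prescribed by the source):

```python
import hashlib
from pathlib import Path

def sha256_of(path: Path) -> str:
    """Compute the SHA-256 checksum of a file, reading in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def detect_changes(baseline: dict[str, str]) -> list[str]:
    """Return paths whose current checksum differs from the recorded baseline."""
    changed = []
    for path, expected in baseline.items():
        p = Path(path)
        # A missing file is also a change worth flagging.
        current = sha256_of(p) if p.exists() else "<missing>"
        if current != expected:
            changed.append(path)
    return changed
```

In practice the baseline would be computed on a known-good system and stored out of reach of an attacker, since a checksum database writable from the monitored host can itself be tampered with.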
3.4.4 Incident Analysis
Incident detection and analysis would be easy if every precursor or indicator were
guaranteed to be accurate; unfortunately, this is not the case. For example, user-provided
indicators such as a complaint of a server being unavailable are often incorrect. Intrusion
detection systems may produce false positives—incorrect indicators. These examples
demonstrate what makes incident detection and analysis so difficult: each indicator ideally
should be evaluated to determine if it is legitimate. Making matters worse, the total
number of indicators may be thousands or millions a day. Finding the real security
incidents that occurred out of all the indicators can be a daunting task.
Even if an indicator is accurate, it does not necessarily mean that an incident has occurred.
Some indicators, such as a server crash or modification of critical files, could happen for
several reasons other than a security incident, including human error. Given the occurrence
of indicators, however, it is reasonable to suspect that an incident might be occurring and
to act accordingly. Determining whether a particular event is actually an incident is
sometimes a matter of judgment. It may be necessary to collaborate with other technical
and information security personnel to make a decision. In many instances, a situation
should be handled the same way regardless of whether it is security related. For example,
if an organization is losing Internet connectivity every 12 hours and no one knows the
cause, the staff would want to resolve the problem just as quickly and would use the same
resources to diagnose the problem, regardless of its cause.
Some incidents are easy to detect, such as an obviously defaced web page. However, many
incidents are not associated with such clear symptoms. Small signs such as one change in
one system configuration file may be the only indicators that an incident has occurred. In
incident handling, detection may be the most difficult task. Incident handlers are
responsible for analyzing ambiguous, contradictory, and incomplete symptoms to
determine what has happened. Although technical solutions exist that can make detection
easier, the best remedy is to build a team of highly experienced and proficient staff
members who can analyze the precursors and indicators effectively and efficiently and
take appropriate actions. Without a well-trained and capable staff, incident detection and
analysis will be conducted inefficiently, and costly mistakes will be made.
The incident response team should work quickly to analyze and validate each incident,
following a pre-defined process and documenting each step taken. When the team
believes that an incident has occurred, the team should rapidly perform an initial analysis
to determine the incident’s scope, such as which networks, systems, or applications are
affected; who or what originated the incident; and how the incident is occurring (e.g., what
tools or attack methods are being used, what vulnerabilities are being exploited). The
initial analysis should provide enough information for the team to prioritize subsequent
activities, such as containment of the incident and deeper analysis of the effects of the
incident.
Performing the initial analysis and validation is challenging. The following are
recommendations for making incident analysis easier and more effective:
 Profile Networks and Systems. Profiling is measuring the characteristics of expected
activity so that changes to it can be more easily identified. Examples of profiling are
running file integrity checking software on hosts to derive checksums for critical files
and monitoring network bandwidth usage to determine what the average and peak
usage levels are on various days and times. In practice, it is difficult to detect incidents
accurately using most profiling techniques; organizations should use profiling as one
of several detection and analysis techniques.
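The bandwidth-profiling idea above can be sketched with simple statistics: record expected usage levels, then flag observations far outside them. The 3-standard-deviation threshold below is an illustrative policy choice, not part of the source text:

```python
import statistics

def build_profile(samples: list[float]) -> tuple[float, float]:
    """Baseline of expected activity: mean and standard deviation of past measurements."""
    return statistics.mean(samples), statistics.stdev(samples)

def is_anomalous(value: float, mean: float, stdev: float, k: float = 3.0) -> bool:
    """Flag observations more than k standard deviations from the baseline."""
    return abs(value - mean) > k * stdev
```

As the text warns, such profiling is one input among several: real traffic baselines shift by day and hour, so a production profile would be built per time window rather than from a single global mean.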
 Understand Normal Behaviors. Incident response team members should study
networks, systems, and applications to understand what their normal behavior is so
that abnormal behavior can be recognized more easily. No incident handler will have
a comprehensive knowledge of all behavior throughout the environment, but handlers
should know which experts could fill in the gaps. One way to gain this knowledge is
through reviewing log entries and security alerts. This may be tedious if filtering is
not used to condense the logs to a reasonable size. As handlers become more familiar
with the logs and alerts, they should be able to focus on unexplained entries, which
are usually more important to investigate. Conducting frequent log reviews should
keep the knowledge fresh, and the analyst should be able to notice trends and changes
over time. The reviews also give the analyst an indication of the reliability of each
source.
 Create a Log Retention Policy. Information regarding an incident may be recorded
in several places, such as firewall, IDPS, and application logs. Creating and
implementing a log retention policy that specifies how long log data should be
maintained may be extremely helpful in analysis because older log entries may show
reconnaissance activity or previous instances of similar attacks. Another reason for
retaining logs is that incidents may not be discovered until days, weeks, or even months
later. The length of time to maintain log data is dependent on several factors, including
the organization’s data retention policies and the volume of data. See NIST SP 800-
92, Guide to Computer Security Log Management for additional recommendations
related to logging.34
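Enforcing a retention policy eventually means identifying logs older than the retention window. A minimal sketch, assuming logs sit in one directory with a `.log` suffix and that modification time approximates log age (both assumptions, not requirements from the source):

```python
import time
from pathlib import Path

def expired_logs(log_dir: Path, retention_days: int) -> list[Path]:
    """List log files older than the retention window, judged by modification time."""
    cutoff = time.time() - retention_days * 86400
    return sorted(p for p in log_dir.glob("*.log") if p.stat().st_mtime < cutoff)
```

A real deployment would archive rather than delete where legal or investigative requirements demand longer retention, since incidents may surface weeks or months after the relevant entries were written.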
 Perform Event Correlation. Evidence of an incident may be captured in several logs
that each contain different types of data—a firewall log may have the source IP
address that was used, whereas an application log may contain a username. A network

97
IDPS may detect that an attack was launched against a particular host, but it may not
know if the attack was successful. The analyst may need to examine the host’s logs to
determine that information. Correlating events among multiple indicator sources can
be invaluable in validating whether a particular incident occurred.
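Event correlation of the kind just described often starts with a simple join on a shared field such as source IP. A sketch, assuming logs have already been parsed into dictionaries (the field names like "src_ip" are illustrative):

```python
def correlate_by_ip(firewall_events: list[dict], app_events: list[dict]) -> dict:
    """Group firewall and application log entries by shared source IP address."""
    by_ip: dict[str, dict[str, list]] = {}
    for ev in firewall_events:
        by_ip.setdefault(ev["src_ip"], {"firewall": [], "app": []})["firewall"].append(ev)
    for ev in app_events:
        if ev["src_ip"] in by_ip:
            by_ip[ev["src_ip"]]["app"].append(ev)
    # Keep only addresses seen in both sources: the strongest correlation candidates.
    return {ip: evs for ip, evs in by_ip.items() if evs["app"]}
```

This mirrors the firewall/application example in the text: the firewall entry supplies the source address, the application entry supplies the username, and their intersection helps validate that an incident actually occurred.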
 Keep All Host Clocks Synchronized. Protocols such as the Network Time Protocol
(NTP) synchronize clocks among hosts.35 Event correlation will be more complicated
if the devices reporting events have inconsistent clock settings. From an evidentiary
standpoint, it is preferable to have consistent timestamps in logs—for example, to have
three logs that show an attack occurred at 12:07:01 a.m., rather than logs that list the
attack as occurring at 12:07:01, 12:10:35, and 11:07:06.
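When clocks cannot be trusted to agree, analysts sometimes correct timestamps using each device's measured skew. The sketch below reproduces the example in the text: three logs that appear to disagree collapse to the same instant once each source's (hypothetical) skew is subtracted:

```python
from datetime import datetime, timedelta

# Hypothetical clock skews in seconds for each log source, measured against a reference (e.g. NTP).
SKEW = {"firewall": 0, "ids": 214, "webapp": -3595}

def normalize(source: str, ts: str) -> datetime:
    """Correct a log timestamp for the source device's known clock skew."""
    return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S") - timedelta(seconds=SKEW[source])
```

Skew correction is a fallback; from an evidentiary standpoint it is far preferable to keep all host clocks synchronized with NTP so the raw logs agree in the first place.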
 Maintain and Use a Knowledge Base of Information. The knowledge base should
include information that handlers need for referencing quickly during incident
analysis. Although it is possible to build a knowledge base with a complex structure, a
simple approach can be effective. Text documents, spreadsheets, and relatively simple
databases provide effective, flexible, and searchable mechanisms for sharing data
among team members. The knowledge base should also contain a variety of
information, including explanations of the significance and validity of precursors and
indicators, such as IDPS alerts, operating system log entries, and application error
codes.
 Use Internet Search Engines for Research. Internet search engines can help
analysts find information on unusual activity. For example, an analyst may see some
unusual connection attempts targeting TCP port 22912. Performing a search on the
terms “TCP,” “port,” and “22912” may return some hits that contain logs of similar
activity or even an explanation of the significance of the port number. Note that
separate workstations should be used for research to minimize the risk to the
organization from conducting these searches.
 Run Packet Sniffers to Collect Additional Data. Sometimes the indicators do not
record enough detail to permit the handler to understand what is occurring. If an
incident is occurring over a network, the fastest way to collect the necessary data may
be to have a packet sniffer capture network traffic. Configuring the sniffer to record
traffic that matches specified criteria should keep the volume of data manageable and
minimize the inadvertent capture of other information. Because of privacy concerns,
some organizations may require incident handlers to request and receive permission
before using packet sniffers.
 Filter the Data. There is simply not enough time to review and analyze all the
indicators; at minimum the most suspicious activity should be investigated. One
effective strategy is to filter out categories of indicators that tend to be insignificant.
Another filtering strategy is to show only the categories of indicators that are of the
highest significance; however, this approach carries substantial risk because new
malicious activity may not fall into one of the chosen indicator categories.
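The first filtering strategy above, dropping categories known to be insignificant, can be expressed in a few lines. The category names here are hypothetical; a real SOC would map them to its own alert taxonomy:

```python
# Hypothetical indicator categories that experience has shown to be insignificant.
INSIGNIFICANT = {"routine-scan", "policy-audit"}

def filter_indicators(indicators: list[dict]) -> list[dict]:
    """Drop indicator categories deemed insignificant; keep everything else for review."""
    return [i for i in indicators if i["category"] not in INSIGNIFICANT]
```

Note that this is the safer of the two strategies named in the text: excluding known noise still lets novel activity through, whereas an allow-list of "significant" categories would silently discard anything new.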
 Seek Assistance from Others. Occasionally, the team will be unable to determine
the full cause and nature of an incident. If the team lacks sufficient information to
contain and eradicate the incident, then it should consult with internal resources (e.g.,
information security staff) and external resources (e.g., US-CERT, other CSIRTs,
contractors with incident response expertise). It is important to accurately determine
the cause of each incident so that it can be fully contained and the exploited
vulnerabilities can be mitigated to prevent similar incidents from occurring.
3.4.5 Incident Documentation
An incident response team that suspects that an incident has occurred should immediately
start recording all facts regarding the incident.36 A logbook is an effective and simple
medium for this,37 but laptops, audio recorders, and digital cameras can also serve this
purpose.38 Documenting system events, conversations, and observed changes in files can
lead to a more efficient, more systematic, and less error-prone handling of the problem.
Every step taken from the time the incident was detected to its final resolution should be
documented and timestamped. Every document regarding the incident should be dated
and signed by the incident handler. Information of this nature can also be used as evidence
in a court of law if legal prosecution is pursued. Whenever possible, handlers should work
in teams of at least two: one person can record and log events while the other person
performs the technical tasks. Section 3.3.2 presents more information about evidence.39
The incident response team should maintain records about the status of incidents, along
with other pertinent information.40 Using an application or a database, such as an issue
tracking system, helps ensure that incidents are handled and resolved in a timely manner.
The issue tracking system should contain information on the following:
 The current status of the incident (new, in progress, forwarded for investigation, resolved,
etc.)
 A summary of the incident
 Indicators related to the incident
 Other incidents related to this incident
 Actions taken by all incident handlers on this incident
 Chain of custody, if applicable
 Impact assessments related to the incident
 Contact information for other involved parties (e.g., system owners, system
administrators)
 A list of evidence gathered during the incident investigation
 Comments from incident handlers
 Next steps to be taken (e.g., rebuild the host, upgrade an application).41
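The fields listed above map naturally onto a structured record in an issue tracking system. A sketch as a Python dataclass (field names are illustrative; any tracking tool with equivalent fields would do):

```python
from dataclasses import dataclass, field

@dataclass
class IncidentRecord:
    """One incident in the issue tracking system, following the fields listed in the text."""
    summary: str
    status: str = "new"  # new, in progress, forwarded for investigation, resolved, ...
    indicators: list = field(default_factory=list)
    related_incidents: list = field(default_factory=list)
    actions_taken: list = field(default_factory=list)       # per-handler actions
    chain_of_custody: list = field(default_factory=list)    # if applicable
    impact_assessment: str = ""
    contacts: list = field(default_factory=list)            # system owners, administrators
    evidence: list = field(default_factory=list)
    handler_comments: list = field(default_factory=list)
    next_steps: list = field(default_factory=list)          # e.g. rebuild host, upgrade app
```

Whatever the implementation, access to these records should be restricted as the following paragraph describes, since they concentrate sensitive detail about exploited vulnerabilities and recent breaches.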
The incident response team should safeguard incident data and restrict access to it because
it often contains sensitive information—for example, data on exploited vulnerabilities,
recent security breaches, and users that may have performed inappropriate actions. For
example, only authorized personnel should have access to the incident database. Incident
communications (e.g., emails) and documents should be encrypted or otherwise protected
so that only authorized personnel can read them.
3.4.6 Incident Prioritization
Prioritizing the handling of the incident is perhaps the most critical decision point in the
incident handling process. Incidents should not be handled on a first-come, first-served
basis as a result of resource limitations. Instead, handling should be prioritized based on
the relevant factors, such as the following:
 Functional Impact of the Incident. Incidents targeting IT systems typically impact
the business functionality that those systems provide, resulting in some type of
negative impact to the users of those systems. Incident handlers should consider how
the incident will impact the existing functionality of the affected systems. Incident
handlers should consider not only the current functional impact of the incident, but
also the likely future functional impact of the incident if it is not immediately
contained.
 Information Impact of the Incident. Incidents may affect the confidentiality,
integrity, and availability of the organization’s information. For example, a malicious
agent may exfiltrate sensitive information. Incident handlers should consider how this
information exfiltration will impact the organization’s overall mission. An incident
that results in the exfiltration of sensitive information may also affect other
organizations if any of the data pertained to a partner organization.
 Recoverability from the Incident. The size of the incident and the type of resources
it affects will determine the amount of time and resources that must be spent on
recovering from that incident. In some instances it is not possible to recover from an
incident (e.g., if the confidentiality of sensitive information has been compromised)
and it would not make sense to spend limited resources on an elongated incident
handling cycle, unless that effort was directed at ensuring that a similar incident did
not occur in the future. In other cases, an incident may require far more resources to
handle than what an organization has available. Incident handlers should consider the
effort necessary to actually recover from an incident and carefully weigh that against
the value the recovery effort will create and any requirements related to incident
handling.
Combining the functional impact to the organization’s systems and the impact to the
organization’s information determines the business impact of the incident—for example,
a distributed denial of service attack against a public web server may temporarily reduce
the functionality for users attempting to access the server, whereas unauthorized root-level
access to a public web server may result in the exfiltration of personally identifiable
information (PII), which could have a long-lasting impact on the organization’s reputation.
The recoverability from the incident determines the possible responses that the team may
take when handling the incident. An incident with a high functional impact and low effort
to recover from is an ideal candidate for immediate action from the team. However, some
incidents may not have smooth recovery paths and may need to be queued for a more
strategic-level response—for example, an incident that results in an attacker exfiltrating
and publicly posting gigabytes of sensitive data has no easy recovery path since the data
is already exposed; in this case the team may transfer part of the responsibility for handling
the data exfiltration incident to a more strategic-level team that develops strategy for
preventing future breaches and creates an outreach plan for alerting those individuals or
organizations whose data was exfiltrated. The team should prioritize the response to each
incident based on its estimate of the business impact caused by the incident and the
estimated efforts required to recover from the incident.
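One way to operationalize this prioritization is to combine the three factors into a single score used to order the incident queue. The weights below are illustrative policy choices, not values from the source document:

```python
FUNCTIONAL = {"none": 0, "low": 1, "medium": 2, "high": 3}
RECOVERY = {"regular": 0, "supplemented": 1, "extended": 2, "not recoverable": 3}

def priority_score(functional_impact: str, information_breach: bool,
                   recoverability: str) -> int:
    """Combine functional impact, information impact, and recoverability into one
    number for queue ordering. Weights are hypothetical and should reflect policy."""
    score = 2 * FUNCTIONAL[functional_impact]   # business impact weighted highest
    if information_breach:
        score += 3                              # any confidentiality/integrity loss
    score += RECOVERY[recoverability]
    return score
```

A high-functional-impact, easily recovered incident scores high and gets immediate action, matching the "ideal candidate" case in the text; the scoring is a triage aid, not a substitute for the judgment calls described above.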
An organization can best quantify the effect of its own incidents because of its situational
awareness. Table 3-2 provides examples of functional impact categories that an
organization might use for rating its own incidents. Rating incidents can be helpful in
prioritizing limited resources.
Table 3-2. Functional Impact Categories

 None: No effect to the organization’s ability to provide all services to all users.
 Low: Minimal effect; the organization can still provide all critical services to all users but has lost efficiency.
 Medium: The organization has lost the ability to provide a critical service to a subset of system users.
 High: The organization is no longer able to provide some critical services to any users.
Table 3-3 provides examples of possible information impact categories that describe the
extent of information compromise that occurred during the incident. In this table, with the
exception of the ‘None’ value, the categories are not mutually exclusive and the organization
could choose more than one.
Table 3-3. Information Impact Categories

 None: No information was exfiltrated, changed, deleted, or otherwise compromised.
 Privacy Breach: Sensitive personally identifiable information (PII) of taxpayers, employees, beneficiaries, etc. was accessed or exfiltrated.
 Proprietary Breach: Unclassified proprietary information, such as protected critical infrastructure information (PCII), was accessed or exfiltrated.
 Integrity Loss: Sensitive or proprietary information was changed or deleted.
Table 3-4 shows examples of recoverability effort categories that reflect the level of and type
of resources required to recover from the incident.
Table 3-4. Recoverability Effort Categories

 Regular: Time to recovery is predictable with existing resources.
 Supplemented: Time to recovery is predictable with additional resources.
 Extended: Time to recovery is unpredictable; additional resources and outside help are needed.
 Not Recoverable: Recovery from the incident is not possible (e.g., sensitive data exfiltrated and posted publicly); launch investigation.
Organizations should also establish an escalation process for those instances when the
team does not respond to an incident within the designated time. This can happen for
many reasons: for example, cell phones may fail or people may have personal
emergencies. The escalation process should state how long a person should wait for a
response and what to do if no response occurs. Generally, the first step is to duplicate the
initial contact. After waiting for a brief time—perhaps 15 minutes—the caller should
escalate the incident to a higher level, such as the incident response team manager. If that
person does not respond within a certain time, then the incident should be escalated again
to a higher level of management. This process should be repeated until someone responds.
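The escalation ladder just described can be sketched as a simple function from elapsed time to the contact who should be tried next. The 15-minute wait mirrors the example in the text; both the wait and the contact list are policy choices:

```python
def next_contact(contacts: list[str], minutes_elapsed: float,
                 wait_minutes: float = 15.0) -> str:
    """Return who should be contacted, given how long the incident has gone unanswered.
    Escalates one level per elapsed wait interval, stopping at the top of the chain."""
    level = min(int(minutes_elapsed // wait_minutes), len(contacts) - 1)
    return contacts[level]
```

A production escalation system would also log each unanswered attempt and duplicate the initial contact before moving up, as the text recommends.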
3.4.7 Incident Notification
When an incident is analyzed and prioritized, the incident response team needs to notify the
appropriate individuals so that all who need to be involved will play their roles. Incident
response policies should include provisions concerning incident reporting—at a minimum,
what must be reported to whom and at what times (e.g., initial notification, regular status
updates). The exact reporting requirements vary among organizations, but parties that are
typically notified include:
 CIO
 Head of information security
 Local information security officer
 Other incident response teams within the organization
 External incident response teams (if appropriate)
 System owner
 Human resources (for cases involving employees, such as harassment through email)
 Public affairs (for incidents that may generate publicity)
 Legal department (for incidents with potential legal ramifications)
 US-CERT (required for Federal agencies and systems operated on behalf of the
Federal government; see Section 2.3.4.3)
 Law enforcement (if appropriate)
During incident handling, the team may need to provide status updates to certain parties,
in some cases even the entire organization. The team should plan and prepare several
communication methods, including out-of-band methods (e.g., in person, paper), and
select the methods that are appropriate for a particular incident. Possible communication
methods include:
 Email
 Website (internal, external, or portal)
 Telephone calls
 In person (e.g., daily briefings)
 Voice mailbox greeting (e.g., set up a separate voice mailbox for incident updates,
and update the greeting message to reflect the current incident status; use the help
desk’s voice mail greeting)
 Paper (e.g., post notices on bulletin boards and doors, hand out notices at all entrance
points).
3.5 Containment, Eradication, and Recovery

Figure 3-3. Incident Response Life Cycle (Containment, Eradication, and Recovery)

3.5.1 Choosing a Containment Strategy
Containment is important before an incident overwhelms resources or increases damage.
Most incidents require containment, so that is an important consideration early in the
course of handling each incident. Containment provides time for developing a tailored
remediation strategy. An essential part of containment is decision-making (e.g., shut down
a system, disconnect it from a network, disable certain functions). Such decisions are
much easier to make if there are predetermined strategies and procedures for containing
the incident. Organizations should define acceptable risks in dealing with incidents and
develop strategies accordingly.
Containment strategies vary based on the type of incident. For example, the strategy for
containing an email-borne malware infection is quite different from that of a network-
based DDoS attack. Organizations should create separate containment strategies for each
major incident type, with criteria documented clearly to facilitate decision-making.
Criteria for determining the appropriate strategy include:
 Potential damage to and theft of resources
 Need for evidence preservation
 Service availability (e.g., network connectivity, services provided to external parties)
 Time and resources needed to implement the strategy
 Effectiveness of the strategy (e.g., partial containment, full containment)
 Duration of the solution (e.g., emergency workaround to be removed in four
hours, temporary workaround to be removed in two weeks, permanent
solution).
In certain cases, some organizations redirect the attacker to a sandbox (a form of
containment) so that they can monitor the attacker’s activity, usually to gather additional
evidence. The incident response team should discuss this strategy with its legal department
to determine if it is feasible. Ways of monitoring an attacker’s activity other than
sandboxing should not be used; if an organization knows that a system has been
compromised and allows the compromise to continue, it may be liable if the attacker uses
the compromised system to attack other systems. The delayed containment strategy is
dangerous because an attacker could escalate unauthorized access or compromise other
systems.
Another potential issue regarding containment is that some attacks may cause additional
damage when they are contained. For example, a compromised host may run a malicious
process that pings another host periodically. When the incident handler attempts to contain
the incident by disconnecting the compromised host from the network, the subsequent
pings will fail. As a result of the failure, the malicious process may overwrite or encrypt
all the data on the host’s hard drive. Handlers should not assume that just because a host
has been disconnected from the network, further damage to the host has been prevented.
3.5.2 Evidence Gathering and Handling
Although the primary reason for gathering evidence during an incident is to resolve the
incident, it may also be needed for legal proceedings.42 In such cases, it is important to
clearly document how all evidence, including compromised systems, has been
preserved.43 Evidence should be collected according to procedures that meet all applicable
laws and regulations that have been developed from previous discussions with legal staff
and appropriate law enforcement agencies so that any evidence can be admissible in
court.44 In addition, evidence should be accounted for at all times; whenever evidence is
transferred from person to person, chain of custody forms should detail the transfer and
include each party’s signature. A detailed log should be kept for all evidence, including
the following:
 Identifying information (e.g., the location, serial number, model number,
hostname, media access control (MAC) addresses, and IP addresses of a
computer)
 Name, title, and phone number of each individual who collected or handled the
evidence during the investigation
 Time and date (including time zone) of each occurrence of evidence handling
 Locations where the evidence was stored.
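The evidence-log fields above translate directly into a structured chain-of-custody record. A sketch as an immutable dataclass (field names are illustrative; paper forms with the same fields serve equally well):

```python
from dataclasses import dataclass

@dataclass(frozen=True)  # frozen: entries should never be altered once recorded
class EvidenceLogEntry:
    """One entry in the detailed evidence log, following the fields listed in the text."""
    item_id: str            # e.g. serial number or hostname of the seized computer
    identifying_info: str   # location, model number, MAC and IP addresses
    handler_name: str
    handler_title: str
    handler_phone: str
    handled_at: str         # time and date of handling, including time zone
    storage_location: str
```

Making the record frozen reflects the evidentiary requirement: each transfer appends a new signed entry rather than editing an old one, so the chain of custody remains auditable.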
Collecting evidence from computing resources presents some challenges. It is generally
desirable to acquire evidence from a system of interest as soon as one suspects that an
incident may have occurred. Many incidents cause a dynamic chain of events to occur; an
initial system snapshot may do better in identifying the problem and its source than most
other actions that can be taken at this stage. From an evidentiary standpoint, it is much
better to get a snapshot of the system as-is rather than doing so after incident handlers,
system administrators, and others have inadvertently altered the state of the machine
during the investigation. Users and system administrators should be made aware of the
steps that they should take to preserve evidence. See NIST SP 800-86, Guide to Integrating
Forensic Techniques into Incident Response, for additional information on preserving
evidence.
3.5.3 Identifying the Attacking Hosts
During incident handling, system owners and others sometimes want to or need to identify
the attacking host or hosts. Although this information can be important, incident handlers
should generally stay focused on containment, eradication, and recovery. Identifying an
attacking host can be a time-consuming and futile process that can prevent a team from
achieving its primary goal—minimizing the business impact. The following items
describe the most commonly performed activities for attacking host identification:
• Validating the Attacking Host’s IP Address. New incident handlers often focus on
the attacking host’s IP address. The handler may attempt to validate that the address
was not spoofed by verifying connectivity to it; however, this simply indicates that a
host at that address does or does not respond to the requests. A failure to respond
does not mean the address is not real—for example, a host may be configured to
ignore pings and traceroutes. Also, the attacker may have received a dynamic address
that has already been reassigned to someone else.
• Researching the Attacking Host through Search Engines. Performing an Internet
search using the apparent source IP address of an attack may lead to more information
on the attack—for example, a mailing list message regarding a similar attack.
• Using Incident Databases. Several groups collect and consolidate incident data from
various organizations into incident databases. This information sharing may take
place in many forms, such as trackers and real-time blacklists. The organization can
also check its own knowledge base or issue tracking system for related activity.
• Monitoring Possible Attacker Communication Channels. Incident handlers can
monitor communication channels that may be used by an attacking host. For
example, many bots use IRC as their primary means of communication. Also,
attackers may congregate on certain IRC channels to brag about their compromises
and share information. However, incident handlers should treat any such information
that they acquire only as a potential lead, not as fact.
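Before any active probing, a handler can at least classify the apparent source address offline. A minimal triage sketch (the function name and categories are invented for illustration): a private, loopback, or otherwise non-routable source in an Internet-facing log usually indicates spoofing or misconfiguration rather than a reachable attacker.

```python
import ipaddress

def triage_source_ip(addr: str) -> str:
    """Classify an apparent attacker IP before any active probing."""
    try:
        ip = ipaddress.ip_address(addr)
    except ValueError:
        return "invalid"        # malformed address in the log
    if ip.is_loopback or ip.is_reserved or ip.is_multicast:
        return "non-routable"   # cannot be a real Internet source
    if ip.is_private:
        return "private"        # internal host, or spoofed on external logs
    return "routable"           # worth further (careful) research
```

Even for a "routable" result, remember the caveat above: the host may ignore probes, or the address may have been dynamically reassigned since the attack.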
3.5.4 Eradication and Recovery
After an incident has been contained, eradication may be necessary to eliminate
components of the incident, such as deleting malware and disabling breached user
accounts, as well as identifying and mitigating all vulnerabilities that were exploited.
During eradication, it is important to identify all affected hosts within the organization so
that they can be remediated. For some incidents, eradication is either not necessary or is
performed during recovery.
In recovery, administrators restore systems to normal operation, confirm that the systems are
functioning normally, and (if applicable) remediate vulnerabilities to prevent similar
incidents. Recovery may involve such actions as restoring systems from clean backups,
rebuilding systems from scratch, replacing compromised files with clean versions, installing
patches, changing passwords, and tightening network perimeter security (e.g., firewall
rulesets, boundary router access control lists). Higher levels of system logging or network
monitoring are often part of the recovery process. Once a resource is successfully attacked,
it is often attacked again, or other resources within the organization are attacked in a similar
manner.
Eradication and recovery should be done in a phased approach so that remediation steps
are prioritized. For large-scale incidents, recovery may take months; the intent of the
early phases should be to increase the overall security with relatively quick (days to
weeks) high value changes to prevent future incidents. The later phases should focus on
longer-term changes (e.g., infrastructure changes) and ongoing work to keep the
enterprise as secure as possible.
Because eradication and recovery actions are typically OS or application-specific, detailed
recommendations and advice regarding them are outside the scope of this document.
3.5.5 Post-Incident Activity
Figure 3-4. Incident Response Life Cycle (Post-Incident Activity)
3.5.6 Lessons Learned
One of the most important parts of incident response is also the most often omitted:
learning and improving. Each incident response team should evolve to reflect new threats,
improved technology, and lessons learned. Holding a “lessons learned” meeting with all
involved parties after a major incident, and optionally periodically after lesser incidents
as resources permit, can be extremely helpful in improving security measures and the
incident handling process itself. Multiple incidents can be covered in a single lessons
learned meeting. This meeting provides a chance to achieve closure with respect to an
incident by reviewing what occurred, what was done to intervene, and how well
intervention worked. The meeting should be held within several days of the end of the
incident. Questions to be answered in the meeting include:
• Exactly what happened, and at what times?
• How well did staff and management perform in dealing with the incident? Were
the documented procedures followed? Were they adequate?
• What information was needed sooner?
• Were any steps or actions taken that might have inhibited the recovery?
• What would the staff and management do differently the next time a similar incident
occurs?
• How could information sharing with other organizations have been improved?
• What corrective actions can prevent similar incidents in the future?
• What precursors or indicators should be watched for in the future to detect similar
incidents?
• What additional tools or resources are needed to detect, analyze, and mitigate future
incidents?
Small incidents need limited post-incident analysis, with the exception of incidents
performed through new attack methods that are of widespread concern and interest. After
serious attacks have occurred, it is usually worthwhile to hold post-mortem meetings that
cross team and organizational boundaries to provide a mechanism for information sharing.
The primary consideration in holding such meetings is ensuring that the right people are
involved. Not only is it important to invite people who have been involved in the incident
that is being analyzed, but also it is wise to consider who should be invited for the purpose
of facilitating future cooperation.
The success of such meetings also depends on the agenda. Collecting input about
expectations and needs (including suggested topics to cover) from participants before the
meeting increases the likelihood that the participants’ needs will be met. In addition,
establishing rules of order before or during the start of a meeting can minimize confusion
and discord. Having one or more moderators who are skilled in group facilitation can yield
a high payoff. Finally, it is also important to document the major points of agreement and
action items and to communicate them to parties who could not attend the meeting.
Lessons learned meetings provide other benefits. Reports from these meetings are good
material for training new team members by showing them how more experienced team
members respond to incidents. Updating incident response policies and procedures is
another important part of the lessons learned process. Post-mortem analysis of the way an
incident was handled will often reveal a missing step or an inaccuracy in a procedure,
providing impetus for change. Because of the changing nature of information technology
and changes in personnel, the incident response team should review all related
documentation and procedures for handling incidents at designated intervals.
Another important post-incident activity is creating a follow-up report for each incident,
which can be quite valuable for future use. The report provides a reference that can be
used to assist in handling similar incidents. Creating a formal chronology of events
(including timestamped information such as log data from systems) is important for legal
reasons, as is creating a monetary estimate of the amount of damage the incident caused.
This estimate may become the basis for subsequent prosecution activity by entities such
as the U.S. Attorney General’s office. Follow-up reports should be kept for a period of
time as specified in record retention policies.
3.5.7 Using Collected Incident Data
Lessons learned activities should produce a set of objective and subjective data regarding
each incident. Over time, the collected incident data should be useful in several capacities.
The data, particularly the total hours of involvement and the cost, may be used to justify
additional funding of the incident response team. A study of incident characteristics may
indicate systemic security weaknesses and threats, as well as changes in incident trends.
This data can be put back into the risk assessment process, ultimately leading to the
selection and implementation of additional controls. Another good use of the data is
measuring the success of the incident response team. If incident data is collected and
stored properly, it should provide several measures of the success (or at least the activities)
of the incident response team.
Incident data can also be collected to determine if a change to incident response capabilities
causes a corresponding change in the team’s performance (e.g., improvements in efficiency,
reductions in costs). Furthermore, organizations that are required to report incident
information will need to collect the necessary data to meet their requirements. See Section 4
for additional information on sharing incident data with other organizations.
Organizations should focus on collecting data that is actionable, rather than collecting
data simply because it is available. For example, counting the number of precursor port
scans that occur each week and producing a chart at the end of the year that shows port
scans increased by eight percent is not very helpful and may be quite time-consuming.
Absolute numbers are not informative—understanding how they represent threats to the
business processes of the organization is what matters. Organizations should decide what
incident data to collect based on reporting requirements and on the expected return on
investment from the data (e.g., identifying a new threat and mitigating the related
vulnerabilities before they can be exploited.) Possible metrics for incident-related data
include:
• Number of Incidents Handled. Handling more incidents is not necessarily better—
for example, the number of incidents handled may decrease because of better network
and host security controls, not because of negligence by the incident response team.
The number of incidents handled is best taken as a measure of the relative amount of
work that the incident response team had to perform, not as a measure of the quality
of the team, unless it is considered in the context of other measures that collectively
give an indication of work quality. It is more effective to produce separate incident
counts for each incident category. Subcategories also can be used to provide more
information. For example, a growing number of incidents performed by insiders could
prompt stronger policy provisions concerning background investigations for
personnel and misuse of computing resources and stronger security controls on
internal networks (e.g., deploying intrusion detection software to more internal
networks and hosts).
• Time Per Incident. For each incident, time can be measured in several ways:
– Total amount of labor spent working on the incident
– Elapsed time from the beginning of the incident to incident discovery, to the
initial impact assessment, and to each stage of the incident handling process
(e.g., containment, recovery)
– How long it took the incident response team to respond to the initial report of the
incident
– How long it took to report the incident to management and, if necessary,
appropriate external entities (e.g., US-CERT).
• Objective Assessment of Each Incident. The response to an incident that has been
resolved can be analyzed to determine how effective it was. The following are
examples of performing an objective assessment of an incident:
– Reviewing logs, forms, reports, and other incident documentation for adherence
to established incident response policies and procedures
– Identifying which precursors and indicators of the incident were recorded to
determine how effectively the incident was logged and identified
– Determining if the incident caused damage before it was detected
– Determining if the actual cause of the incident was identified, and identifying the
vector of attack, the vulnerabilities exploited, and the characteristics of the targeted
or victimized systems, networks, and applications
– Determining if the incident is a recurrence of a previous incident
– Calculating the estimated monetary damage from the incident (e.g.,
information and critical business processes negatively affected by the
incident)
– Measuring the difference between the initial impact assessment and the final
impact assessment (see Section 3.2.6)
– Identifying which measures, if any, could have prevented the incident.
• Subjective Assessment of Each Incident. Incident response team members may be
asked to assess their own performance, as well as that of other team members and of
the entire team. Another valuable source of input is the owner of a resource that was
attacked, in order to determine if the owner thinks the incident was handled
efficiently and if the outcome was satisfactory.
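The time-per-incident metrics above lend themselves to straightforward automation. A minimal sketch using hypothetical incident records (the timestamps and field names are invented; real data would come from a ticketing or SIEM system):

```python
from datetime import datetime, timedelta

# Hypothetical incident records with key lifecycle timestamps.
incidents = [
    {"detected": datetime(2024, 3, 1, 9, 0),
     "responded": datetime(2024, 3, 1, 9, 30),
     "recovered": datetime(2024, 3, 2, 17, 0)},
    {"detected": datetime(2024, 3, 10, 14, 0),
     "responded": datetime(2024, 3, 10, 14, 10),
     "recovered": datetime(2024, 3, 11, 9, 0)},
]

def mean_delta(pairs):
    """Average the elapsed time over (start, end) timestamp pairs."""
    pairs = list(pairs)
    total = sum(((end - start) for start, end in pairs), timedelta())
    return total / len(pairs)

# Mean time from detection to initial response across all incidents.
mean_response = mean_delta((i["detected"], i["responded"]) for i in incidents)
```

The same helper can compute elapsed time to containment, recovery, or external reporting, giving the separate per-stage measures the text recommends.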
Besides using these metrics to measure the team’s success, organizations may also find it
useful to periodically audit their incident response programs. Audits will identify
problems and deficiencies that can then be corrected. At a minimum, an incident response
audit should evaluate the following items against applicable regulations, policies, and
generally accepted practices:
• Incident response policies, plans, and procedures
• Tools and resources
• Team model and structure
• Incident handler training and education
• Incident documentation and reports
• The measures of success discussed earlier in this section.
3.5.8 Evidence Retention
Organizations should establish policy for how long evidence from an incident should be
retained. Most organizations choose to retain all evidence for months or years after the
incident ends. The following factors should be considered during the policy creation:
• Prosecution. If it is possible that the attacker will be prosecuted, evidence may need
to be retained until all legal actions have been completed. In some cases, this may
take several years. Furthermore, evidence that seems insignificant now may become
more important in the future. For example, if an attacker is able to use knowledge
gathered in one attack to perform a more severe attack later, evidence from the first
attack may be key to explaining how the second attack was accomplished.
• Data Retention. Most organizations have data retention policies that state how long
certain types of data may be kept. For example, an organization may state that email
messages should be retained for only 180 days. If a disk image contains thousands of
emails, the organization may not want the image to be kept for more than 180 days
unless it is absolutely necessary. As discussed in Section 3.4.2,

112
General Records Schedule (GRS) 24 specifies that incident handling records should be
kept for three years.
• Cost. Original hardware (e.g., hard drives, compromised systems) that is stored as
evidence, as well as hard drives and removable media that are used to hold disk
images, are generally individually inexpensive. However, if an organization stores
many such components for years, the cost can be substantial. The organization also
must retain functional computers that can use the stored hardware and media.
3.6 Incident Handling Checklist
The checklist in Table 3-5 provides the major steps to be performed in the handling of an
incident. Note that the actual steps performed may vary based on the type of incident and
the nature of individual incidents. For example, if the handler knows exactly what has
happened based on analysis of indicators (Step 1.1), there may be no need to perform Steps
1.2 or 1.3 to further research the activity. The checklist provides guidelines to handlers on
the major steps that should be performed; it does not dictate the exact sequence of steps
that should always be followed.
Table 3-5. Incident Handling Checklist
Detection and Analysis
1. Determine whether an incident has occurred
1.1 Analyze the precursors and indicators
1.2 Look for correlating information
1.3 Perform research (e.g., search engines, knowledge base)
1.4 As soon as the handler believes an incident has occurred, begin documenting the
investigation and gathering evidence
2. Prioritize handling the incident based on the relevant factors (functional impact,
information impact, recoverability effort, etc.)
3. Report the incident to the appropriate internal personnel and external organizations
Containment, Eradication, and Recovery
4. Acquire, preserve, secure, and document evidence
5. Contain the incident
6. Eradicate the incident
6.1 Identify and mitigate all vulnerabilities that were exploited
6.2 Remove malware, inappropriate materials, and other components
6.3 If more affected hosts are discovered (e.g., new malware infections), repeat the
Detection and Analysis steps (1.1, 1.2) to identify all other affected hosts, then contain (5)
and eradicate (6) the incident for them
7. Recover from the incident
7.1 Return affected systems to an operationally ready state
7.2 Confirm that the affected systems are functioning normally
7.3 If necessary, implement additional monitoring to look for future related activity
Post-Incident Activity
8. Create a follow-up report
9. Hold a lessons learned meeting (mandatory for major incidents, optional otherwise)
3.7 Recommendations
The key recommendations presented in this section for handling incidents are summarized
below.
• Acquire tools and resources that may be of value during incident handling.
The team will be more efficient at handling incidents if various tools and resources
are already available to them. Examples include contact lists, encryption software,
network diagrams, backup devices, digital forensic software, and port lists.
• Prevent incidents from occurring by ensuring that networks, systems, and
applications are sufficiently secure. Preventing incidents is beneficial to the
organization and also reduces the workload of the incident response team. Performing
periodic risk assessments and reducing the identified risks to an acceptable level are
effective in reducing the number of incidents. Awareness of security policies and
procedures by users, IT staff, and management is also very important.
• Identify precursors and indicators through alerts generated by several types of
security software. Intrusion detection and prevention systems, antivirus software,
and file integrity checking software are valuable for detecting signs of incidents. Each
type of software may detect incidents that the other types of software cannot, so the
use of several types of computer security software is highly recommended. Third-
party monitoring services can also be helpful.
• Establish mechanisms for outside parties to report incidents. Outside parties may
want to report incidents to the organization—for example, they may believe that one
of the organization’s users is attacking them. Organizations should publish a phone
number and email address that outside parties can use to report such incidents.
• Require a baseline level of logging and auditing on all systems, and a higher
baseline level on all critical systems. Logs from operating systems, services, and
applications frequently provide value during incident analysis, particularly if auditing

114
was enabled. The logs can provide information such as which accounts were accessed
and what actions were performed.
• Profile networks and systems. Profiling measures the characteristics of expected
activity levels so that changes in patterns can be more easily identified. If the profiling
process is automated, deviations from expected activity levels can be detected and
reported to administrators quickly, leading to faster detection of incidents and
operational issues.
• Understand the normal behaviors of networks, systems, and applications. Team
members who understand normal behavior should be able to recognize abnormal
behavior more easily. This knowledge can best be gained by reviewing log entries and
security alerts; the handlers should become familiar with the typical data and can
investigate the unusual entries to gain more knowledge.
• Create a log retention policy. Information regarding an incident may be recorded in
several places. Creating and implementing a log retention policy that specifies how
long log data should be maintained may be extremely helpful in analysis because
older log entries may show reconnaissance activity or previous instances of similar
attacks.
• Perform event correlation. Evidence of an incident may be captured in several
logs. Correlating events among multiple sources can be invaluable in collecting all
the available information for an incident and validating whether the incident
occurred.
• Keep all host clocks synchronized. If the devices reporting events have inconsistent
clock settings, event correlation will be more complicated. Clock discrepancies may
also cause issues from an evidentiary standpoint.
• Maintain and use a knowledge base of information. Handlers need to reference
information quickly during incident analysis; a centralized knowledge base
provides a consistent, maintainable source of information. The knowledge base
should include general information, such as data on precursors and indicators of
previous incidents.
• Start recording all information as soon as the team suspects that an incident
has occurred. Every step taken, from the time the incident was detected to its final
resolution, should be documented and timestamped. Information of this nature can
serve as evidence in a court of law if legal prosecution is pursued. Recording the
steps performed can also lead to a more efficient, systematic, and less error-prone

115
handling of the problem.
• Safeguard incident data. It often contains sensitive information regarding such
things as vulnerabilities, security breaches, and users that may have performed
inappropriate actions. The team should ensure that access to incident data is restricted
properly, both logically and physically.
• Prioritize handling of the incidents based on the relevant factors. Because of
resource limitations, incidents should not be handled on a first-come, first-served
basis. Instead, organizations should establish written guidelines that outline how
quickly the team must respond to the incident and what actions should be performed,
based on relevant factors such as the functional and information impact of the incident,
and the likely recoverability from the incident. This saves time for the incident
handlers and provides a justification to management and system owners for their
actions. Organizations should also establish an escalation process for those instances
when the team does not respond to an incident within the designated time.
• Include provisions regarding incident reporting in the organization’s incident
response policy. Organizations should specify which incidents must be reported,
when they must be reported, and to whom. The parties most commonly notified are
the CIO, head of information security, local information security officer, other
incident response teams within the organization, and system owners.
• Establish strategies and procedures for containing incidents. It is important to
contain incidents quickly and effectively to limit their business impact. Organizations
should define acceptable risks in containing incidents and develop strategies and
procedures accordingly. Containment strategies should vary based on the type of
incident.
• Follow established procedures for evidence gathering and handling. The team
should clearly document how all evidence has been preserved. Evidence should be
accounted for at all times. The team should meet with legal staff and law
enforcement agencies to discuss evidence handling, then develop procedures based
on those discussions.
• Capture volatile data from systems as evidence. This includes lists of network
connections, processes, login sessions, open files, network interface configurations,
and the contents of memory. Running carefully chosen commands from trusted
media can collect the necessary information without damaging the system’s
evidence.

116
• Obtain system snapshots through full forensic disk images, not file system
backups. Disk images should be made to sanitized write-protectable or write-once
media. This process is superior to a file system backup for investigatory and
evidentiary purposes. Imaging is also valuable in that it is much safer to analyze an
image than it is to perform analysis on the original system because the analysis may
inadvertently alter the original.
• Hold lessons learned meetings after major incidents. Lessons learned meetings are
extremely helpful in improving security measures and the incident handling process
itself.
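The event-correlation and clock-synchronization recommendations can be illustrated with a toy correlator. This sketch (the data, field names, and five-minute window are invented for illustration) groups events from different log sources that share a source IP and occur close together in time; it works only because the timestamps come from synchronized clocks:

```python
from datetime import datetime, timedelta

def correlate(events, window=timedelta(minutes=5)):
    """Group events that share a source IP and occur within `window` of the
    previous event in the group. Assumes host clocks are synchronized."""
    clusters = []
    for ev in sorted(events, key=lambda e: e["time"]):
        for cluster in clusters:
            if (ev["src"] == cluster[-1]["src"]
                    and ev["time"] - cluster[-1]["time"] <= window):
                cluster.append(ev)
                break
        else:
            clusters.append([ev])  # no match: start a new cluster
    return clusters

# Hypothetical events from two log sources.
events = [
    {"log": "firewall", "src": "203.0.113.7", "time": datetime(2024, 1, 1, 10, 0)},
    {"log": "firewall", "src": "198.51.100.9", "time": datetime(2024, 1, 1, 10, 1)},
    {"log": "ids",      "src": "203.0.113.7", "time": datetime(2024, 1, 1, 10, 2)},
]
clusters = correlate(events)
```

Here the firewall and IDS events from 203.0.113.7 end up in one cluster, giving the handler a combined view of that host's activity; if the IDS clock were skewed by more than the window, the link would be missed.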
UNIT VIII: Data Leakage Prevention

A Confusing Market1

Data Loss Prevention is one of the most hyped, and least understood, tools in the security
arsenal. With at least a half dozen different names and even more technology approaches, it
can be difficult to understand the ultimate value of the tools and which products best suit which
environments. This report will provide the necessary background in DLP to help you
understand the technology, know what to look for in a product, and find the best match for your
organization.

DLP is an adolescent technology that provides significant value for those organizations that
need it, despite products that may not be as mature as in other areas of IT. The market is
currently dominated by startups, but large vendors have started stepping in, typically through
acquisition.

The first problem in understanding DLP is figuring out what we're actually talking about. The
following names are all being used to describe the same market:

• Data Loss Prevention/Protection

• Data Leak Prevention/Protection

• Information Loss Prevention/Protection

• Information Leak Prevention/Protection

• Extrusion Prevention

• Content Monitoring and Filtering

• Content Monitoring and Protection

DLP seems the most common term, and while its life is probably limited, we will use it in this
report for simplicity.

Types of DLP technologies

DLP for data in use

1 Understanding and Selecting a Data Loss Prevention Solution, http://creativecommons.org/licenses/by-nc-nd/3.0/us/

One class of DLP technologies secures data in use, defined as data that is being actively
processed by an application or an endpoint. These safeguards usually involve authenticating
users and controlling their access to resources.

DLP for data in motion

When confidential data is in transit across a network, DLP technologies are needed to make
sure it is not routed outside the organization or to insecure storage areas. Encryption plays a
large role in this step. Email security is also critical since so much business communication
goes through this channel.

DLP for data at rest

Even data that is not moving or in use needs safeguards. DLP technologies protect data residing
in a variety of storage mediums, including the cloud. DLP can place controls to make sure that
only authorized users are accessing the data and to track their access in case it is leaked or
stolen.

Defining DLP

There is a lack of consensus on what actually comprises a DLP solution. Some people
consider encryption or USB port control DLP, while others limit themselves to complete
product suites. Securosis defines DLP as: Products that, based on central policies, identify,
monitor, and protect data at rest, in motion, and in use, through deep content analysis. Thus,
the key defining characteristics are:

• Deep content analysis


• Central policy management
• Broad content coverage across multiple platforms and locations

DLP solutions both protect sensitive data and provide insight into the use of content within the
enterprise. Few enterprises classify data beyond "public" and "everything else." DLP
helps organizations better understand their data and improves their ability to classify and
manage content. Point products may provide some DLP functionality, but tend to be more
limited in either their coverage (network only or endpoint only) or content analysis capabilities.
This report will focus on comprehensive DLP suites, but some organizations may find that a
point solution is able to meet their needs.

DLP Features vs. DLP Solutions

The DLP market is also split between DLP as a feature, and DLP as a solution. A number of
products, particularly email security solutions, provide basic DLP functions, but aren't
complete DLP solutions. The difference is:

• A DLP Product includes centralized management, policy creation, and enforcement
workflow, dedicated to the monitoring and protection of content and data. The user interface
and functionality are dedicated to solving the business and technical problems of protecting
content through content awareness.

• DLP Features include some of the detection and enforcement capabilities of DLP products,
but are not dedicated to the task of protecting content and data.

This distinction is important because DLP products solve a specific business problem that may
or may not be managed by the same business unit or administrator responsible for other security
functions. We often see non-technical users such as legal or compliance officers responsible
for the protection of content. Even human resources are often involved with the disposition of
DLP alerts. Some organizations find that the DLP policies themselves are highly sensitive or
need to be managed by business unit leaders outside of security, which also may argue for a
dedicated solution. Because DLP is dedicated to a clear business problem (protect my content)
that is differentiated from other security problems (protect my PC or protect my network) most
of you should look for dedicated DLP solutions.

This doesn't mean that DLP as a feature won't be the right solution for you, especially in smaller
organizations. It also doesn't mean that you won't buy a suite that includes DLP, as long as the
DLP management is separate and dedicated to DLP. We'll be seeing more and more suites as
large vendors enter the space, and it often makes sense to run DLP analysis or enforcement
within another product, but the central policy creation, management, and workflow should be
dedicated to the DLP problem and isolated from other security functions.

The last thing to remember about DLP is that it is highly effective against bad business
processes (FTP exchange of unencrypted medical records with your insurance company, for
example) and mistakes. While DLP offers some protection against malicious activity, we're at
least a few years away from these tools protecting against knowledgeable attackers.

Content Awareness

Content vs. Context

We need to distinguish content from context. One of the defining characteristics of DLP
solutions is their content awareness. This is the ability of products to analyze deep content
using a variety of techniques, and is very different from analyzing context. It's easiest to think
of content as a letter, and context as the envelope and environment around it.

Context includes things like source, destination, size, recipients, sender, header information,
metadata, time, format, and anything else short of the content of the letter itself. Context is
highly useful and any DLP solution should include contextual analysis as part of an overall
solution. A more advanced version of contextual analysis is business context analysis, which
involves deeper analysis of the content, its environment at the time of analysis, and the use of
the content at that time.

Content awareness involves peering inside containers and analyzing the content itself. The
advantage of content awareness is that while we use context, we're not restricted by it. If I want
to protect a piece of sensitive data, I want to protect it everywhere — not just in obviously
sensitive containers. I'm protecting the data, not the envelope, so it makes a lot more sense to
open the letter, read it, and decide how to treat it. This is more difficult and time consuming
than basic contextual analysis and is the defining characteristic of DLP solutions.
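To make the distinction concrete, the following illustrative Python sketch (all names and rules are hypothetical, not taken from any product) contrasts a contextual rule, which looks only at the envelope metadata, with a content-aware rule, which opens the body itself:

```python
import re

# Contextual analysis: decide using only "envelope" metadata.
def context_violation(msg):
    # Example rule: block anything sent outside an approved partner domain.
    return not msg["to"].endswith("@example.com")

# Content analysis: open the letter and inspect the text itself.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def content_violation(msg):
    return bool(SSN_PATTERN.search(msg["body"]))

# An approved recipient carrying sensitive content: context passes, content fails.
msg = {"to": "partner@example.com", "body": "Patient SSN: 123-45-6789"}
```

Here the contextual rule sees nothing wrong with the envelope, while the content-aware rule still catches the sensitive data inside it.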

Content Analysis

The first step in content analysis is capturing the envelope and opening it. The engine then
needs to parse the context (we'll need that for the analysis) and dig into it. For a plain text email
this is easy, but when you want to look inside binary files it gets a little more complicated. All
DLP solutions solve this using file cracking. File cracking is the technology used to read and
understand the file, even if the content is buried multiple levels down. For example, it's not
unusual for the cracker to read an Excel spreadsheet embedded in a Word file that's zipped.
The product needs to unzip the file, read the Word doc, analyze it, find the Excel data, read
that, and analyze it. Other situations get far more complex, like a .pdf embedded in a CAD file.
Many of the products on the market today support around 300 file types, embedded content,
multiple languages, double byte character sets for Asian languages, and pulling plain text from
unidentified file types.
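A minimal sketch of the file-cracking idea, limited to ZIP containers for brevity (real crackers handle hundreds of formats), might look like the following; the sensitive marker and recursion limit are illustrative choices:

```python
import io
import zipfile

SENSITIVE = b"ACCT-4111-1111-1111-1111"

def crack(data, depth=0, max_depth=5):
    """Recursively open ZIP containers and scan every payload found."""
    if depth > max_depth:          # guard against zip bombs / endless nesting
        return False
    if zipfile.is_zipfile(io.BytesIO(data)):
        with zipfile.ZipFile(io.BytesIO(data)) as zf:
            return any(crack(zf.read(name), depth + 1) for name in zf.namelist())
    return SENSITIVE in data       # leaf payload: run the content analysis

# Build a doubly nested archive: secret.txt inside inner.zip inside outer.zip.
inner = io.BytesIO()
with zipfile.ZipFile(inner, "w") as zf:
    zf.writestr("secret.txt", SENSITIVE)
outer = io.BytesIO()
with zipfile.ZipFile(outer, "w") as zf:
    zf.writestr("inner.zip", inner.getvalue())
```

The same recursive pattern is what lets a real product find the Excel data inside a Word document inside a ZIP file.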

Quite a few use the Autonomy or Verity content engines to help with file cracking, but all the
serious tools have quite a bit of proprietary capability, in addition to the embedded content
engine. Some tools support analysis of encrypted data if enterprise encryption is used with

recovery keys, and most tools can identify standard encryption and use that as a contextual rule
to block/quarantine content.

Content Analysis Techniques

Once the content is accessed, there are seven major analysis techniques used to find policy
violations, each with its own strengths and weaknesses.

1. Rule-Based/Regular Expressions: This is the most common analysis technique available in both DLP products and other tools with DLP features. It analyzes the content for specific
rules — such as 16-digit numbers that meet credit card checksum requirements, medical billing
codes, or other textual analyses. Most DLP solutions enhance basic regular expressions with
their own additional analysis rules (e.g., a name in proximity to an address near a credit card
number). What it's best for: As a first-pass filter, or for detecting easily identified pieces of
structured data like credit card numbers, social security numbers, and healthcare codes/records.

Strengths: Rules process quickly and can be easily configured. Most products ship with initial
rule sets. The technology is well understood and easy to incorporate into a variety of products.

Weaknesses: Prone to high false positive rates. Offers very little protection for unstructured
content like sensitive intellectual property.
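As an illustration of this technique, the sketch below combines a regular expression for 16-digit candidates with the standard Luhn checksum to cut down false positives. This is a generic example, not any vendor's implementation:

```python
import re

# Matches 16 digits, optionally separated by spaces or hyphens.
CANDIDATE = re.compile(r"\b(?:\d[ -]?){15}\d\b")

def luhn_ok(number: str) -> bool:
    """Standard Luhn checksum used to validate card-number candidates."""
    digits = [int(d) for d in number][::-1]
    total = sum(digits[0::2])
    total += sum(sum(divmod(2 * d, 10)) for d in digits[1::2])
    return total % 10 == 0

def find_cards(text: str):
    hits = []
    for match in CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_ok(digits):       # checksum filters out most random 16-digit runs
            hits.append(digits)
    return hits
```

Note that an order number which happens to be 16 digits long will usually fail the checksum, which is exactly the false-positive reduction the rule enhancement provides.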

2. Database Fingerprinting: Sometimes called Exact Data Matching. This technique takes
either a database dump or live data (via ODBC connection) from a database and only looks for
exact matches. For example, you could generate a policy to look only for credit card numbers
in your customer base, thus ignoring your own employees buying online. More advanced tools
look for combinations of information, such as the magic combination of first name or initial,
with last name, with credit card or social security number, that triggers a California SB 1386
disclosure. Make sure you understand the performance and security implications of nightly
extracts vs. live database connections.

What it's best for: Structured data from databases.

Strengths: Very low false positives (close to 0). Allows you to protect customer/sensitive data
while ignoring other, similar, data used by employees (like their personal credit cards for online
orders).

Weaknesses: Nightly dumps won't contain transaction data since the last extract. Live
connections can affect database performance. Large databases affect product performance.
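The exact-matching idea can be sketched as follows: a simulated nightly extract of a customer table is stored as salted hashes, and only numbers present in that customer set are flagged. The salt, table contents, and function names are invented for the example:

```python
import hashlib
import re

def fingerprint(value: str) -> str:
    # Store only salted hashes of customer data, never the raw values.
    return hashlib.sha256(b"dlp-salt:" + value.encode()).hexdigest()

# Simulated nightly extract of the customer table (card-number column).
customer_cards = {"4111111111111111", "5500005555555559"}
card_index = {fingerprint(c) for c in customer_cards}

def exact_matches(text: str):
    """Flag only numbers that exist in OUR customer database."""
    found = re.findall(r"\b\d{16}\b", text)
    return [n for n in found if fingerprint(n) in card_index]
```

This is why the technique can ignore an employee's personal card in an online order while still catching a customer record leaving the network.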

3. Exact File Matching: With this technique you take a hash of a file and monitor for any files
that match that exact fingerprint. Some consider this to be a contextual analysis technique since
the file contents themselves are not analyzed.

What it's best for: Media files and other binaries where textual analysis isn't necessarily possible.

Strengths: Works on any file type, low false positives with a large enough hash value (effectively none).

Weaknesses: Trivial to evade. Worthless for content that's edited, such as standard office
documents and edited media files.
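A short sketch of exact file matching, showing both its strength (it works on any byte stream) and its key weakness (a single changed byte evades it):

```python
import hashlib

def file_fingerprint(data: bytes) -> str:
    """Hash the raw bytes; the content itself is never parsed."""
    return hashlib.sha256(data).hexdigest()

# Registry of fingerprints for protected binaries (contents are illustrative).
protected = {file_fingerprint(b"binary media payload v1")}

def is_protected(data: bytes) -> bool:
    return file_fingerprint(data) in protected
```

Appending even one byte produces a completely different hash, which is why the technique is trivial to evade for any content that gets edited.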

4. Partial Document Matching: This technique looks for a complete or partial match on
protected content. Thus, you could build a policy to protect a sensitive document, and the DLP
solution will look for either the complete text of the document, or even excerpts as small as a
few sentences. For example, you could load up a business plan for a new product and the DLP
solution would alert if an employee pasted a single paragraph into an Instant Message. Most
solutions are based on a technique known as cyclical hashing, where you take a hash of a
portion of the content, offset a predetermined number of characters, then take another hash,
and keep going until the document is completely loaded as a series of overlapping hash values.
Outbound content is run through the same hash technique, and the hash values compared for
matches. Many products use cyclical hashing as a base, then add more advanced linguistic
analysis.

What it's best for: Protecting sensitive documents, or similar content with text such as CAD
files (with text labels) and source code. Unstructured content that's known to be sensitive.

Strengths: Ability to protect unstructured data. Generally low false positives (some vendors
will say zero false positives, but any common sentence/text in a protected document can trigger
alerts). Doesn't rely on complete matching of large documents; can find policy violations on
even a partial match.

Weaknesses: Performance limitations on the total volume of content that can be protected.
Common phrases/verbiage in a protected document may trigger false positives. Must know
exactly which documents you want to protect. Trivial to avoid (ROT 1 encryption is sufficient
for evasion).
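The cyclical hashing idea can be sketched as overlapping word-window hashes. This toy version uses 8-word shingles; the window size, hash choice, and the sample business plan are all illustrative:

```python
import hashlib

def shingle_hashes(text: str, window: int = 8):
    """Cyclical hashing: hash overlapping word windows across the document."""
    words = text.lower().split()
    return {
        hashlib.sha256(" ".join(words[i:i + window]).encode()).hexdigest()
        for i in range(max(1, len(words) - window + 1))
    }

plan = ("our new product launches in the third quarter with a target "
        "price of ninety nine dollars and an initial run of ten thousand units")
protected_index = shingle_hashes(plan)

def partial_match(outbound: str) -> bool:
    # Any shared window means an excerpt of the protected document leaked.
    return bool(shingle_hashes(outbound) & protected_index)
```

Pasting even a single sentence of the plan into an instant message produces overlapping windows that match the index, while unrelated text does not.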

5. Statistical Analysis: Use of machine learning, Bayesian analysis, and other statistical
techniques to analyze a corpus of content and find policy violations in content that resembles
the protected content. This category includes a wide range of statistical techniques which vary

greatly in implementation and effectiveness. Some techniques are very similar to those used to
block spam.

What it's best for: Unstructured content where a deterministic technique, like partial document matching, would be ineffective. For example, a repository of engineering plans that's impractical to load for partial document matching due to high volatility or massive volume.

Strengths: Can work with more nebulous content where you may not be able to isolate exact
documents for matching. Can enforce policies such as "alert on anything outbound that
resembles the documents in this directory".

Weaknesses: Prone to false positives and false negatives. Requires a large corpus of source
content — the bigger the better.

6. Conceptual/Lexicon: This technique uses a combination of dictionaries, rules, and other analyses to protect nebulous content that resembles an "idea". It's easier to give an example —
a policy that alerts on traffic that resembles insider trading, which uses key phrases, word
counts, and positions to find violations. Other examples are sexual harassment, running a
private business from a work account, and job hunting.

What it's best for: Completely unstructured ideas that defy simple categorization based on
matching known documents, databases, or other registered sources.

Strengths: Not all corporate policies or content can be described using specific examples;
Conceptual analysis can find loosely defined policy violations other techniques can't even think
of monitoring for.

Weaknesses: In most cases these are not user-definable and the rule sets must be built by the
DLP vendor with significant effort (costing more). Because of the loose nature of the rules, this
technique is very prone to false positives and false negatives.
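A toy version of lexicon-based conceptual analysis for a "job hunting from a work account" policy might weight dictionary phrases and alert above a threshold. The lexicon, weights, and threshold below are invented for illustration; real rule sets are built by vendors with far more effort:

```python
# Illustrative lexicon for a "job hunting from a work account" policy.
JOB_HUNT_LEXICON = {
    "resume": 2, "cover letter": 3, "salary expectations": 3,
    "interview": 1, "recruiter": 2, "notice period": 3,
}

def concept_score(text: str) -> int:
    """Sum the weights of lexicon phrases present in the text."""
    lower = text.lower()
    return sum(w for phrase, w in JOB_HUNT_LEXICON.items() if phrase in lower)

def flags_job_hunting(text: str, threshold: int = 5) -> bool:
    return concept_score(text) >= threshold
```

The looseness is visible even here: a legitimate email that happens to mention an "interview" scores some points, which is exactly why this technique is prone to false positives.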

7. Categories: Pre-built categories with rules and dictionaries for common types of sensitive
data, such as credit card numbers/PCI protection, HIPAA, etc.

What it's best for: Anything that neatly fits a provided category. Typically easy-to-describe content related to privacy, regulations, or industry-specific guidelines.

Strengths: Extremely simple to configure. Saves significant policy generation time. Category policies can form the basis for more advanced, enterprise-specific policies. For many organizations, categories can meet a large percentage of their data protection needs.

Weaknesses: One size fits all might not work. Only good for easily categorized rules and
content.

These seven techniques form the basis for most of the DLP products on the market. Not all products
include all techniques, and there can be significant differences between implementations. Most
products can also chain techniques — building complex policies from combinations of content
and contextual analysis techniques.

Technical Architecture

Protecting Data in Motion, At Rest, and In Use

The goal of DLP is to protect content throughout its lifecycle. In terms of DLP, this includes
three major aspects:

• Data At Rest includes scanning of storage and other content repositories to identify where
sensitive content is located. We call this content discovery. For example, you can use a DLP
product to scan your servers and identify documents with credit card numbers. If the server
isn't authorized for that kind of data, the file can be encrypted or removed, or a warning sent to
the file owner.

• Data In Motion is sniffing of traffic on the network (passively or inline via proxy) to identify
content being sent across specific communications channels. For example, this includes
sniffing emails, instant messages, and web traffic for snippets of sensitive source code. In
motion tools can often block based on central policies, depending on the type of traffic.

• Data In Use is typically addressed by endpoint solutions that monitor data as the user interacts
with it. For example, they can identify when you attempt to transfer a sensitive document to a
USB drive and block it (as opposed to blocking use of the USB drive entirely). Data in use
tools can also detect things like copy and paste, or use of sensitive data in an unapproved
application (such as someone attempting to encrypt data to sneak it past the sensors).
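The three aspects above can be made concrete with a data-at-rest example: a toy content-discovery scan that walks a directory tree and flags files containing SSN-like patterns. The pattern, file names, and layout are illustrative only:

```python
import os
import re
import tempfile

SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def discover(root: str):
    """Walk a file share and report files holding SSN-like content."""
    findings = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "r", errors="ignore") as fh:
                    if SSN.search(fh.read()):
                        findings.append(path)
            except OSError:
                pass  # unreadable file: a real scanner would log and move on
    return findings

# Demo: build a tiny "server share" with one offending file.
root = tempfile.mkdtemp()
with open(os.path.join(root, "notes.txt"), "w") as fh:
    fh.write("employee SSN 123-45-6789 on file")
with open(os.path.join(root, "readme.txt"), "w") as fh:
    fh.write("nothing sensitive here")
```

In a real deployment the findings would feed a workflow that encrypts or removes the file, or notifies its owner, as described above.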

Data in Motion

Many organizations first enter the world of DLP with network-based products that provide
broad protection for managed and unmanaged systems. It’s typically easier to start a

deployment with network products to gain broad coverage quickly. Early products limited
themselves to basic monitoring and alerting, but all current products include advanced
capabilities to integrate with existing network infrastructure and provide protective, not just
detective, controls.

Network Monitor

At the heart of most DLP solutions lies a passive network monitor. The network monitoring
component is typically deployed at or near the gateway on a SPAN port (or a similar tap). It
performs full packet capture, session reconstruction, and content analysis in real time.
Performance is more complex and subtle than vendors normally discuss. First, on the client
expectation side, most clients claim they need full gigabit ethernet performance, but that level
of performance is unnecessary except in very unusual circumstances since few organizations
are really running that high a level of communications traffic.

DLP is a tool to monitor employee communications, not web application traffic. Realistically
we find that small enterprises normally run under 50 MB/s of relevant traffic, medium
enterprises run closer to 50-200 MB/s, and large enterprises around 300 MB/s (maybe as high
as 500 in a few cases). Because of the content analysis overhead, not every product runs full
packet capture. You might have to choose between pre-filtering (and thus missing non-standard
traffic) or buying more boxes and load balancing. Also, some products lock monitoring into
pre-defined port and protocol combinations, rather than using service/channel identification
based on packet content. Even if full application channel identification is included, you want
to make sure it's enabled. Otherwise, you might miss non-standard communications such as
connecting over an unusual port. Most of the network monitors are dedicated general-purpose
server hardware with DLP software installed. A few vendors deploy true specialized

appliances. While some products have their management, workflow, and reporting built into
the network monitor, this is often offloaded to a separate server or appliance.

Email Integration

The next major component is email integration. Since email is store and forward you can gain
a lot of capabilities, including quarantine, encryption integration, and filtering, without the
same hurdles to avoid blocking synchronous traffic. Most products embed an MTA (Mail
Transport Agent) into the product, allowing you to just add it as another hop in the email chain.
Quite a few also integrate with some of the major existing MTAs/email security solutions
directly for better performance. One weakness of this approach is it doesn't give you access to
internal email. If you're on an Exchange server, internal messages never make it through the
external MTA since there's no reason to send that traffic out. To monitor internal mail, you'll
need direct Exchange/Lotus integration, which is surprisingly rare in the market. Full
integration is different from just scanning logs/libraries after the fact, which is what some
companies call internal mail support. Good email integration is absolutely critical if you ever
want to do any filtering, as opposed to just monitoring.

Filtering/Blocking and Proxy Integration

Nearly anyone deploying a DLP solution will eventually want to start blocking traffic. There's
only so long you can take watching all your juicy sensitive data running to the nether regions
of the Internet before you start taking some action. But blocking isn't the easiest thing in the
world, especially since we're trying to allow good traffic, only block bad traffic, and make the
decision using real-time content analysis. Email, as we just mentioned, is fairly straightforward

to filter. It's not quite real-time and is proxied by its very nature. Adding one more analysis hop
is a manageable problem in even the most complex environments. Outside of email most of
our communications traffic is synchronous — everything runs in real time. Thus, if we want to
filter it, we either need to bridge the traffic, proxy it, or poison it from the outside.

Bridge

With a bridge we just have a system with two network cards which performs content analysis
in the middle. If we see something bad, the bridge breaks the connection for that session.
Bridging isn't the best approach for DLP since it might not stop all the bad traffic before it leaks
out. It's like sitting in a doorway watching everything go past with a magnifying glass; by the
time you get enough traffic to make an intelligent decision, you may have missed the really
good stuff. Very few products take this approach, although it does have the advantage of being
protocol agnostic.

Proxy

In simplified terms, a proxy is protocol/application specific and queues up traffic before passing it on, allowing for deeper analysis. We see gateway proxies mostly for HTTP, FTP,
and IM protocols. Few DLP solutions include their own proxies; they tend to integrate with
existing gateway/proxy vendors since most customers prefer integration with these existing
tools. Integration for web gateways is typically through the iCAP protocol, allowing the proxy
to grab the traffic, send it to the DLP product for analysis, and cut communications if there's a
violation. This means you don't have to add another piece of hardware in front of your network
traffic and the DLP vendors can avoid the difficulties of building dedicated network hardware
for inline analysis. If the gateway includes a reverse SSL proxy you can also sniff SSL
connections. You will need to make changes on your endpoints to deal with all the certificate
alerts, but you can now peer into encrypted traffic. For Instant Messaging you'll need an IM
proxy and a DLP product that specifically supports whatever IM protocol you're using.

TCP Poisoning

The last method of filtering is TCP poisoning. You monitor the traffic and when you see
something bad, you inject a TCP reset packet to kill the connection. This works on every TCP
protocol but isn't very efficient. For one thing, some protocols will keep trying to get the traffic
through. If you TCP poison a single email message, the server will keep trying to send it for 3
days, as often as every 15 minutes. The other problem is the same as bridging — since you
don't queue the traffic at all, by the time you notice something bad it might be too late. It's a
good stop-gap to cover nonstandard protocols, but you'll want to proxy as much as possible.
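Actual TCP poisoning requires passive capture plus raw-socket injection with a sequence number inside the victim's receive window, which is beyond a short example. The sketch below only shows what the injected reset segment looks like at the header level, built with Python's struct module:

```python
import struct

def tcp_rst_header(src_port: int, dst_port: int, seq: int) -> bytes:
    """Build a bare 20-byte TCP header with only the RST flag set."""
    data_offset = 5 << 4   # 5 x 32-bit words, no options
    flags = 0x04           # the RST bit
    return struct.pack(
        "!HHIIBBHHH",
        src_port, dst_port,
        seq,               # must fall inside the victim's receive window
        0,                 # acknowledgment number (unused without the ACK flag)
        data_offset, flags,
        0,                 # window size
        0,                 # checksum (raw-socket code would compute this)
        0,                 # urgent pointer
    )

segment = tcp_rst_header(443, 52100, 123456789)
```

When such a segment is injected into a monitored session, a conforming TCP stack tears the connection down, which is the entire mechanism behind this filtering approach.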

Internal Networks

Although technically capable of monitoring internal networks, DLP is rarely used on internal
traffic other than email.

Gateways provide convenient choke points; internal monitoring is a daunting prospect from
cost, performance, and policy management/false positive standpoints. A few DLP vendors have
partnerships for internal monitoring but this is a lower priority feature for most organizations.

Distributed and Hierarchical Deployments

All medium to large enterprises, and many smaller organizations, have multiple locations and
web gateways. A DLP solution should support multiple monitoring points, including a mix of
passive network monitoring, proxy points, email servers, and remote locations. While
processing/analysis can be offloaded to remote enforcement points, they should send all events
back to a central management server for workflow, reporting, investigations, and archiving.
Remote offices are usually easy to support since you can just push policies down and reporting
back, but not every product has this capability.

The more advanced products support hierarchical deployments for organizations that want to
manage DLP differently in multiple geographic locations, or by business unit. International companies often need this to meet legal monitoring requirements which vary by country.
Hierarchical management supports coordinated local policies and enforcement in different
regions, running on their own management servers, communicating back to a central
management server. Early products only supported one management server but now we have
options to deal with these distributed situations, with a mix of corporate/regional/business unit
policies, reporting, and workflow.

UNIT-IX Threat Intelligence and Threat Modelling
Threat Intelligence
An intelligent hell would be better than a stupid paradise.
—Victor Hugo
Knowledge is one of the most powerful currencies. Some might even say it’s the ultimate
currency. How you react to a situation depends on how much knowledge you have of the
situation. Imagine the difference between having knowledge that the stock market is going to
crash before it happens versus having no knowledge until the event is well under way. If you’re
heavily invested in stocks, that piece of information could make the difference between
securing your wealth and ending up in bankruptcy. A similar comparison can be applied to the
world of cybersecurity. Having knowledge of a threat allows you to prepare a response instead
of reacting to the impact of the threat, which at that point is too late. This is
why threat intelligence has become and will continue to be a critical component of a successful
security operations centre.
This chapter dives into the world of threat intelligence. First, you will learn about general threat
data and how it can be converted into threat intelligence. You will then learn about the different
types of threat intelligence and how each type can be used within your SOC. This chapter
organizes the different types of threat intelligence into four categories:
• Strategic threat intelligence
• Tactical threat intelligence
• Operational threat intelligence
• Technical threat intelligence
Intelligence is more than the data you collect. Intelligence is what you do with that data or what
awareness it provides. That is why the term actionable intelligence is used often when
describing cybersecurity threat intelligence data. Actionable intelligence provides guidance for
choosing the best actions. You will learn how to leverage external resources for intelligence,
including how to evaluate the return on investment. You will find that all threat intelligence
data isn’t going to benefit your SOC and, in some cases, adding the wrong data can overwhelm
your SOC analysts and negatively impact your security tools.
Threat Intelligence Overview
Imagine having the knowledge that allows you to stop a cyberattack before it occurs. Some
people might equate this to a pitch for some sort of fortune-telling scam, but actually it’s a very
realistic scenario if you have the right data. What exactly is the “right data”? In many cases, it
is information about the mannerisms of the attacker, the network traffic and patterns created
by the attacker, what capabilities the attacker has, and which common vulnerabilities the

attacker exploits on a system. These types of information about the attack, the attacker, and the
motives of the attacker provide security context. A simple definition of what threat intelligence
is, is context. Context could include hash values to match files against, behavior patterns to
look for, threat actors to be aware of, and many other items that will help you make informed
decisions about your response to an attack.
Gartner has a slightly different definition for threat intelligence, which is the following:
• Threat intelligence is evidence-based knowledge, including context, mechanisms,
indicators, implications and actionable advice, about an existing or emerging menace
or hazard to assets that can be used to inform decisions regarding the subject’s response
to that menace or hazard.
• Gartner summarizes threat intelligence as evidence-based knowledge, which means you
can rely on it to make informed decisions about how to respond to a threat.
• This means if I give you a bunch of IP addresses with no context such as a warning list
posted on a website, you won’t understand what they mean—those IP addresses are just
data. If I tell you that these IP addresses are bad but I do not explain why, you have
only one specific use of the IP addresses, which is to block them. That would represent
threat data, which can be a form of threat intelligence depending on how it is used, but
by itself is not threat intelligence. Many people have a misconception about threat data,
so let’s address that up front before we move into the topic of threat intelligence.
Threat Data
Not all threat-related data is threat intelligence. Many security tools are driven by a specific
type of threat data, but that doesn’t mean the data provided to the tool gives you, the user, any
value. You can have the best packet parser and analyzer on the planet, but if you can’t tell that
tool what to look for, then that tool has very little use to your SOC. Many security vendors
refer to data delivered to their tools as “threat intelligence,” which may be true if you
understand what that data is and how it applies to your goals. If you have no idea what is fed
to a vendor’s tool, then essentially the vendor’s tool is receiving threat data, which has nothing
to do with threat intelligence.
Threat Data Example
Here is a nontechnical example of comparing threat data and threat intelligence. I have two
kids. The younger is a one-year-old boy and the other is an older girl. The boy will smash
anything he can get into contact with. If my daughter wants to build a LEGO castle, there is
the threat that her brother will smash it. If I tell my daughter her castle can be smashed, that
information doesn’t inform her about the specific threat (her brother). It just points out there is

a risk that her castle can be smashed. With this knowledge, she can’t make any new decisions
outside of maybe increasing the strength of her castle. But I didn’t tell her about the threat,
which is that her brother will destroy any castle she builds, regardless of how much effort she
puts into reinforcing it. If I tell my daughter, on the other hand, that she needs to build her
castle outside of her brother’s reach or he will see it and come smash it, she not only
understands the threat, she understands how to reduce the risk of her castle being destroyed.
She understands what the threat is (her brother). She understands how the threat can identify
her castle (he can see it). She understands that reinforcing the castle is not the proper risk
reduction strategy. Instead, she understands her best next step is to move the castle out of the
view of her brother, which is a response to the risk that her brother will see her castle and
destroy it. This is an example of threat intelligence because she can understand the threat and
take action based on what she learned from me informing her about the threat (her brother).
Threat Data Value
Threat data can be extremely useful for enhancing security tools. In the past, the native
capabilities of security tools enabled them to thwart most tactics used by malicious parties;
today, however, the threat landscape is rapidly evolving and expanding beyond any tool’s local
defense capabilities. Effective security tools now must include a way to adapt to change, which
is based on continuous learning in the form of threat data. This is how tools such as IPSs,
antivirus, and sandboxes stay relevant. The security vendors continuously update their tools
with new detection capabilities in the form of signatures and behavior analysis to adjust to the
results of threat research. This is why threat data is extremely relevant in the security vendor
space.
To be clear regarding threat data, getting updates to existing security products is not necessarily
what the industry refers to as threat intelligence. Security tool updates only apply to the tool’s
capability of being updated, omitting many other details about the threat vector, which a
technical threat intelligence feed would contain. This specific difference of what is omitted
from the consumer's view separates a vendor-provided threat data feed sent to a security tool from an external threat intelligence feed that you supply yourself. When you provide the threat intelligence feed yourself, you understand what it contains and can control how the
intelligence impacts the function of the tool. For example, a basic antivirus checksum update
would have hash matches for only the latest threats used to pattern match against files. Any
update to the antivirus product would not have any other context about the threat, such as who
is a target, where the threat is originating from, and other details that could be used by your
SOC to better understand, detect, and respond to the threat. Remember, a SOC has many roles,
including nontechnical responsibilities that don’t focus on details such as hash values, and

those roles need the context associated with threat data in order to use it to make a decision
outside of blocking what is matched. Details can be pulled from threat intelligence, such as the risk associated with a threat, that have nothing to do with the technical threat data feeds used by security products to block high-risk resources.
This missing data from many threat data feeds sent to security tools is a very useful type of
threat intelligence that can help SOCs make decisions about how to respond to a situation.
Threat intelligence is about using all of the context associated with data.
Threat Data Limitations
An important limiting factor that influences the quality of the intelligence provided by a
vendor of a network or security product is that the vendor sells its product to multiple
customers. Each customer’s network is different, so how could a vendor provide a list that
encompasses threats targeting small businesses as well as threats targeting enterprise
organizations? Keep in mind that security tools can’t just check for everything—that would
mean billions of possible threats. A security vendor must consider all of its customers and
develop a threat data feed that is generic enough to be useful to everyone, yet still able to
prevent many of the common threats. Granted, many vendors have specialties and attempt to
work with different industries and business segments, but they still must take the
“accommodate all types of businesses” approach. This can lead to a threat data feed that not
only doesn’t cover all the threats that will impact your organization but also fills your tools
with additional data that has nothing to do with your line of business. For example, a vendor’s
IPS default security feed will likely provide a list of vulnerabilities that could be exploited;
however, your organization might not own the products associated with the vulnerability.
Therefore, the IPS is wasting resources looking for an attack that could never happen on your
systems. The same concept applies to the IPS not receiving signatures regarding tools specific
to your organization that are not part of the default signature update. As a result, some
resources would be exposed to attack until the IPS is manually adjusted to protect those
devices. In both cases, the vendor’s threat data updates leave gaps in your security capabilities
and can also lead to a false sense of security.
The point of understanding the limitations of vendor threat data updates is to recognize the gap
in threat data that needs to be filled. This is why it is critical to understand what threat
intelligence is and how it can fill this gap found within many security tools. Filling this gap
allows organizations and their tools to make faster and more informed decisions about security.
To be crystal clear, enabling vendor updates doesn’t mean you are obtaining threat intelligence.
You are just receiving the vendor’s generic threat data that all other customers are receiving.

Also, it is important to point out that threat intelligence isn’t always technical details about a
threat such as a bad domain, IP address, or hash of a malicious artifact. Malicious domains, file
hashes, and IP addresses are normally what the industry refers to as indicators of compromise
(IOCs). IOCs are considered threat intelligence only when combined with context. I find these
are some of the most common misunderstandings about threat intelligence in the industry.
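The difference can be shown as data structures: a bare IOC is just a value, while the same indicator becomes intelligence once context is attached. The field names below are invented for illustration; in practice, standards such as STIX model this kind of enriched indicator formally:

```python
from dataclasses import dataclass, field

# Bare threat data: an indicator with no context.
bare_ioc = "203.0.113.45"   # RFC 5737 documentation address, used as a stand-in

# The same indicator promoted to threat intelligence by attaching context.
@dataclass
class EnrichedIndicator:
    value: str
    ioc_type: str
    threat_actor: str
    campaign: str
    confidence: int            # 0-100
    recommended_action: str
    targets: list = field(default_factory=list)

intel = EnrichedIndicator(
    value=bare_ioc,
    ioc_type="ipv4",
    threat_actor="example-actor (hypothetical)",
    campaign="credential phishing against finance staff",
    confidence=80,
    recommended_action="block at perimeter and hunt for prior connections",
    targets=["finance", "payroll"],
)
```

The bare value only supports one action (block it); the enriched record supports decisions about hunting, prioritization, and risk, which is what makes it actionable intelligence.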
Threat Intelligence Categories
Now that you understand what threat data is and how it can be
confused with threat intelligence, it’s time to move to the topic of threat intelligence. Referring
again to the Gartner definition of threat intelligence, it is evidence-based knowledge, which
means you are able to take action based on the conclusion you make with the evidence that is
provided from the data. There are many forms of data that can provide threat intelligence to your
SOC. Maybe it’s social media. Maybe it’s a vendor’s threat data feed. Maybe it’s another
organization warning you about a threat.

Threat intelligence is commonly divided into four categories: strategic, tactical, operational,
and technical. Each category contains a different type of intelligence that can help a SOC better understand a
current situation based on the context the threat intelligence feed provides. Threat intelligence
can help leadership make decisions based on potential threats, but the data’s context must be
in a format they understand and can benefit from, or it will end up being useless data. For
example, if a CEO needs to decide if the organization should open a new office in another
country, telling the CEO about specific technical attack techniques seen by attackers from that
country will not help inform or influence the CEO’s decision. What the CEO needs is strategic
threat intelligence regarding the risks of opening the new office, the likelihood of the risks
occurring, and the impact if an event should occur. This is why it is important to understand
the type of threat intelligence you have access to and ensure that it is delivered to the
appropriate audience so the receiver is able to benefit from its context.

Strategic Threat Intelligence
Strategic threat intelligence views threat data from a high level rather than including technical
details such as which threat actor is involved or specific hashes of malware. The purpose of
strategic threat intelligence is to help executives make strategic decisions by giving them a
broad understanding of threats that could impact their organization. The simplest way to think
of this category of intelligence is to imagine a boardroom full of C-level leaders having a
conversation about security. The conversation would be about risk to the organization and
impact to operations, and would be outcome oriented. The specific details such as which tools
are used, how the threat could be executed, and other “in the weeds” type of information would
be handled at the operations level.
It is common for strategic threat intelligence to be provided in a report or briefing format (often
summarized in the executive summary). The details can come from policy documents created
by nation-states, various forms of media, recent publications, specialist activity, white papers,
industry guidelines, and research reports. Organizations such as Gartner and Forrester can be
contracted to develop on-demand strategic reports or more generic reports can be acquired.
One challenge with strategic threat intelligence is obtaining value. Many sources are saturated
with raw data and sometimes have hidden objectives or biased data. Filtering through strategic
threat intelligence can be a manual, time-consuming process if the resources are not leveraged
properly. Best practice is to view a strategic threat intelligence request as a project that contains
a continuous feedback loop between the requestor and analyst to ensure that more accurate
results are obtained.
Tactical Threat Intelligence
Tactical threat intelligence provides details about tactics, techniques, and procedures (TTPs)
used by threat actors. The purpose of this category of intelligence is to better understand how
threats will execute their attacks so that defenders can be better prepared to respond. Responses
could include improving security tools, identifying gaps in capabilities, and modifying the
people and process responsible for responding to attacks.
Tactical threat intelligence is intended to be used by technical roles responsible for an
organization’s defense. Job roles could include system architects, administrators, and other
security staff. Executive roles tend to rely on their technical staff to leverage tactical threat
intelligence, while the executives look for a nontechnical summary in the form of a strategic
threat intelligence report.
Sources of tactical threat intelligence include open-source tools, honeypots, data collectors on
dark networks, scanning technology, malware analysts, closed-source networks, and technical

experts. Expected details in tactical threat intelligence include potential targets of attack, attack
vectors such as phishing or a malware type, and which tools or technical infrastructure are used
by the attacker. An example could be data on how ransomware uses different types of
vulnerabilities to infect hosts. Knowing these details will allow a SOC to evaluate its capability
to defend against ransomware and be better prepared for a future attack. Tactical
threat intelligence is not focused on a specific ransomware campaign, which could include a
combination of tactics and customization unique to a specific threat actor. Tactical threat
intelligence is broader in scope.
Operational Threat Intelligence
While tactical threat intelligence covers details on attack behavior, it isn’t focused on a specific
attack or campaign. Attackers tend to use a combination of exploits, known as chained
exploitation, which, viewed as a campaign, can be fingerprinted, monitored, and sometimes
linked to a threat actor. Knowing about a specific campaign allows defenders to track impact
and risk associated with that campaign as well as validate if it changes its tactics or is likely to
target their organization. As an example of the difference between tactical threat intelligence
and operational threat intelligence, tactical threat intelligence of ransomware would look at a
general ransomware category, while operational threat intelligence would focus on a current
ransomware campaign targeting a specific Apache Struts vulnerability and possibly link it to a
threat actor located in a specific part of the world.
Technical Threat Intelligence
Technical threat intelligence refers specifically to the threat actor’s tools and infrastructure.
This form of intelligence is more specific and detailed than tactical threat intelligence and
focuses on indicators of compromise, or IOCs. The purpose of using focused technical threat
intelligence, commonly fed into security tools, is to provide rapid distribution and response to
threats. A simple way to think of technical threat intelligence is that it is any artifact or behavior
that indicates a compromise, requiring immediate attention.
Technical threat intelligence can be part of a vendor’s updates, the associated limitations of
which I previously covered. Technical threat intelligence can also be general feeds that include
lists of malware hashes, registry keys, or other file artifacts associated with malware;
email characteristics associated with phishing campaigns; malicious URLs or domains; and IP
addresses associated with malicious behavior such as known command and control (C&C)
infrastructures or exploit kits. If these datapoints provide context that can help your SOC make
better decisions, they would be considered threat intelligence; however, if these datapoints are

just an unknown feed, such as data classified as bad without any context pumped into a tool,
then they would represent threat data.
Technical threat intelligence typically has a shorter lifespan than other types of threat
intelligence and is consumed in high volume by various types of security tools to help such
tools be aware of recent attack indicators. It is common for a security tool to receive threat data
from its vendor; however, the tool’s capability is enhanced with a third-party threat intelligence
feed more specific to the organization’s need. For example, a school might purchase a threat
intelligence feed that includes many specific technical details about IOCs targeting schools,
which allows the school’s SOC to know when the school is being targeted by a certain threat
actor as well as helps tune their security tool beyond the generic threat data that the tool receives
from its vendor. This data will need to be continuously updated, as IOCs are constantly
changing.
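To make the distinction between threat data and technical threat intelligence concrete, the sketch below contrasts a bare IOC with the same IOC enriched with context. All field names and values are hypothetical, chosen only for illustration:

```python
# Hypothetical illustration: the same IOC as raw threat data vs. threat
# intelligence. All values are invented for demonstration purposes.

raw_threat_data = "198.51.100.23"  # a bare IP address with no context

threat_intelligence = {
    "indicator": "198.51.100.23",
    "type": "ipv4-addr",
    "context": {
        "campaign": "phishing wave targeting schools",  # why it matters
        "role": "command and control (C&C) server",     # what it does
        "first_seen": "2023-09-01",                     # age of the data
        "prevalence": "low",                            # isolated vs. widespread
        "confidence": 0.8,                              # reliability of the source
    },
}

def is_actionable(entry):
    """Threat data becomes intelligence only when context allows a decision."""
    return isinstance(entry, dict) and bool(entry.get("context"))

print(is_actionable(raw_threat_data))      # bare datapoint: no decision possible
print(is_actionable(threat_intelligence))  # context present: actionable
```

The point of the sketch is only that the same datapoint is useful or useless depending on the context delivered with it.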
Threat Intelligence Context
Understanding the context associated with threat data is the key value that makes it threat
intelligence. Threat intelligence can be obtained both internally and externally. Combining the
intelligence provided by vendor updates, events you see locally within your network, and
threats seen through external threat intelligence feeds provides a balanced system of data
allowing your organization to be best prepared for future events. The key to success is how to
balance external and internal intelligence, which will vary based on the expected outcome for
the use case being addressed. Figure 7-1 represents the concept of properly leveraging internal
and external threat intelligence.

FIGURE 7-1 Balance Between Internal and External Threat Intelligence


External threat intelligence feeds come in many flavors. A threat intelligence feed that is free
likely gathers data using only open sources. Commercial threat intelligence feeds requiring a
fee, in contrast, generally provide more unique data that can be specific to a market segment
and gathered from closed sources such as marketplaces in the criminal underground or from
security/data collection tools planted within various types of networks. It is critical to be aware

that just because a threat intelligence feed is commercial doesn’t mean it’s better than a free
source.
In fact, some commercial sources are actually just aggregations of open-source feeds. For
example, I have seen paid services that are composed completely of compiled open-source
tools. I asked one such service, “What am I paying for if you only use open-source feeds?” I
was told I’m paying for the effort to run a collector of open-source tools, giving me one spot
to collect from multiple open-source feeds, as well as for the consolidated and “cleaned up”
data. This example shows how important it is for you to question what you are paying for. For
some organizations, having cleaned-up open-source data is worth paying for because the
time and support to build such feeds have a cost, while many other organizations expect a paid
service to include unique data and value that could not be obtained from free open-source
services. Additionally, there are organizations that provide highly specialized threat
intelligence feeds, such as feeds that identify threats from darknets and cybercriminal markets,
feeds that specialize in information about zero-day vulnerabilities, and feeds that provide
industry-specific context regarding threats against critical infrastructure, the healthcare
industry, or some other sector.
Threat Context
There are specific factors about threat context that can be extremely useful. One important
concept regarding leveraging the context associated with threat intelligence is prevalence.
Knowing how prevalent a threat is can help determine if a threat is widespread or an isolated
incident. Imagine a situation where you identify malware within your organization. If research
shows that particular malware has very low prevalence, you should be concerned that you are
dealing with a targeted attack or something new. If threat intelligence points out that the threat
is very widespread, the context associated with the threat can help you learn
from how the industry has responded to the threat. Many vendors’ threat intelligence feeds
provide metrics regarding how widespread an identified threat is to help customers understand
the context pulled from threat prevalence. Figure 7-2 is an example of how Sophos uses the
concept of threat prevalence in its products.

FIGURE 7-2 Sophos Threat Prevalence Usage Example
Another context datapoint is the age of the data. For example, a URL that was considered
malicious a few months ago might be perfectly legitimate now. Maybe a trusted website was
compromised and for a period of time was used to deliver malware. Once the website owners
remediate the compromise, the URL is no longer a threat. The age of threat intelligence can
help you adjust your response based on whether the threat is considered very new, possibly no
longer a threat, or possibly a false positive. This is why I highly suggest you validate the quality
of data associated with free threat intelligence feeds. I find that many contain very old
data, which likely doesn’t provide much value because the threats identified have changed.
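As a rough sketch of how the age of threat data can be applied in practice, the following filters out indicators older than a chosen cutoff. The 90-day threshold and the field names are assumptions for illustration, not a standard:

```python
from datetime import datetime, timedelta

def filter_stale_iocs(iocs, max_age_days=90, today=None):
    """Drop indicators whose last-seen date is older than max_age_days.

    Stale IOCs are likely no longer threats (or are false positives) and
    mostly add noise to security tools.
    """
    today = today or datetime.utcnow()
    cutoff = today - timedelta(days=max_age_days)
    return [i for i in iocs
            if datetime.strptime(i["last_seen"], "%Y-%m-%d") >= cutoff]

feed = [
    {"indicator": "bad.example.com", "last_seen": "2023-11-01"},
    {"indicator": "old.example.net", "last_seen": "2021-02-15"},  # long since remediated
]
fresh = filter_stale_iocs(feed, max_age_days=90, today=datetime(2023, 11, 20))
print([i["indicator"] for i in fresh])  # only the recent indicator survives
```

The right cutoff depends on the type of indicator; URLs and IP addresses tend to go stale faster than malware hashes.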
Other important context items you want to gather from threat intelligence are potential impact,
potential victims, attack trends, and anything else that can help the SOC better understand the

threat that needs to be addressed. This brings us to the next topic, which is how to identify
which context is ideal for your SOC using the proper threat intelligence evaluation process.
Evaluating Threat Intelligence
Regardless of the type of external threat intelligence feed you choose, it will provide a
nonprioritized list of data that does not have any context regarding impact to your specific
organization. It is up to your organization to pinpoint what you need so that you avoid being
saturated with useless data that provides more harm than value. To get the best value, you must
select feeds that can be used properly by the intended party looking to benefit from such data.
You must also evaluate whether the source is reliable, meaning the threat intelligence provides
data that is accurate, relevant, and timely. If any of these three key factors is not present, you
should consider that source to be unreliable and you should not use it.
Before you consider evaluating threat intelligence feeds, you first need to understand your
requirements for threat intelligence. You must evaluate the purpose and impact you hope to
achieve from using external threat data. Consider the following series of questions as you
evaluate an external threat intelligence feed:
• Who is the audience of the threat intelligence?
• What risks are unique to your organization’s industry?
• What does your organization’s network infrastructure and security capabilities look like,
and how could they benefit from additional intelligence?
• What security capabilities or processes could benefit from threat intelligence?
• Is the threat intelligence being considered supported by your SOC’s existing
technologies? This includes the available format, how the data is delivered, how it is
secured, and so on.
• Does the provider have a strong history of providing accurate and timely data?
• How often is the data updated? Data that gets updated only every 30 days will likely be
far less useful than data that is frequently updated.
• Is there another option already available versus adding new or more threat intelligence?
• What budget and resources are available that could be used to process and apply threat
intelligence to your practice?
There are some key elements you are looking to capture with these questions. First, you want
to consider the beneficiary of the threat intelligence. Next, you want to collect data that is
relevant to your industry. For example, if you work for a bank, you want to know about threats
that impact the banking industry. Yes, it wouldn’t hurt to know about a threat that has targeted
other industries; however, it would be better if the focus of the threat intelligence feed was

more specific to your line of business. Not using relevant data could also in some cases be more
harmful than helpful.
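One way to keep this planning organized is to score each candidate feed against your answers to the questions above. The criteria and weights below are purely illustrative; adjust them to your organization’s priorities:

```python
# Illustrative sketch: scoring candidate threat intelligence feeds against
# requirements gathered from the planning questions. Criteria and weights
# are hypothetical.

WEIGHTS = {
    "industry_relevance": 3,    # data specific to your line of business
    "format_supported": 2,      # consumable by existing SOC technologies
    "update_frequency": 2,      # frequently updated data
    "provider_track_record": 2, # history of accurate and timely data
    "within_budget": 1,
}

def score_feed(feed):
    """Return a weighted score; each criterion is True/False in this sketch."""
    return sum(w for crit, w in WEIGHTS.items() if feed.get(crit))

candidates = [
    {"name": "Generic free feed", "format_supported": True, "within_budget": True},
    {"name": "Banking-sector feed", "industry_relevance": True,
     "format_supported": True, "update_frequency": True,
     "provider_track_record": True},
]
best = max(candidates, key=score_feed)
print(best["name"], score_feed(best))
```

A simple scorecard like this also gives a procurement team nontechnical criteria to compare offerings against.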
Threat Intelligence Checklist
Once you have answered questions regarding planning for threat intelligence, you can develop
a pre-evaluation checklist to help you shop for the best threat intelligence resource for your
organization’s goals. Table 7-2 is an example checklist of what the results of this initial
assessment could look like. By having this data, you will be able to quickly narrow down which
type of threat intelligence feeds would be most ideal for you to evaluate.

Content Quality
Another important part of evaluating threat intelligence is understanding the quality of the
external data. As you shop for a threat intelligence service, you will find vast differences in
what is provided and costs ranging from free to thousands of dollars. By developing a threat
intelligence checklist, you will have narrowed down the possible offerings that could work for
your objectives, saving you time during the research and trial process.
Key Content Quality Factors
Your SOC must assess the quality of any threat intelligence resource against the same key elements.
You must judge whether the data is accurate, relevant, and timely to ensure it will provide
value. Accurate means the data represents real threats rather than containing many false alerts
that will cause more distraction than benefit.
Relevant relates to how likely the threat is to impact your organization. Acquiring threat data
aligned with your market segment will make that data much more relevant than a generic
source. Timely means the data is recent enough that responding to the potential threat would
allow you to prepare for a real potential attack. Preparing for dated attacks provides little value
and also gives you a false sense of security because you won’t have data on current threats that

are more likely to be seen attacking your resources. Use these three factors to measure the
quality of the threat intelligence before starting your collection.
Content Quality Checklist
The best way to identify how accurate, relevant, and timely a threat intelligence resource can
be for a SOC is to ask the service provider the right questions. The following are some
fundamental questions you should include in your threat intelligence feed evaluation process
once you are ready to consider potential candidates:
• What are the data sources for the threat intelligence?
• What is the percentage of unique data?
• How long is the data relevant?
• How reliable is the threat intelligence?
• Is there a portal or other resource to gain more information about an event found within
the feed?
• How accurate are the results from the threat intelligence?
• What is the return on investment?
• What is the total cost to use this service?
• Is this a subscription, and is there a trial period as well as a minimal contract required
to obtain the threat intelligence?
• Are there ethical, legal, or compliance violations to consider?
The first question to ask a threat data provider is where it gets its data. This will quickly uncover
whether the provider uses open-source tools, private data sources, or a combination of the two.
Next, asking about the percent of unique data helps you determine how much overlap you will
receive from a threat intelligence feed that draws upon data from multiple sources. For
example, if a feed includes data from three banking sources providing similar data, the feed
will seem like it is drawing on a single data source, not three unique data sources. Overlap has
value insofar as it indicates a threat is common across multiple collection sources, but if you
are regularly receiving a lot of overlapping data, you may be paying for what is advertised as
a larger service than what you are actually receiving once you remove duplicate data.
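The overlap question can be checked empirically during a trial: pull the indicators from each source and measure how much of the combined feed is actually unique. A minimal sketch, using made-up indicators:

```python
def percent_unique(sources):
    """Percentage of unique indicators across all sources combined.

    Heavy overlap suggests you may be paying for what is effectively one
    data source advertised as several.
    """
    total = sum(len(s) for s in sources)
    unique = len(set().union(*sources))
    return 100.0 * unique / total if total else 0.0

# Three hypothetical banking-sector sources with mostly overlapping data
source_a = {"evil1.example", "evil2.example", "evil3.example"}
source_b = {"evil1.example", "evil2.example", "evil4.example"}
source_c = {"evil2.example", "evil3.example", "evil4.example"}

print(round(percent_unique([source_a, source_b, source_c]), 1))
```

Some overlap is informative (it signals a threat seen by multiple collectors), so a low percentage is a prompt for questions to the provider, not automatically a disqualifier.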
I gave an example earlier in this chapter about how a technical threat intelligence feed can be
used to update an IPS detection database. Using old or generic open-source technical threat
intelligence resources will limit the tool by having it look for threats that are no longer current.
This can cause more harm than value considering the wasted resources. The same concern
relates to the question of how reliable the threat intelligence resource is. This is extremely
important for operational threat intelligence because the expectation is that details about threat

actors are included. Unreliable data can quickly lead to an overload of false positives, causing
more trouble than value in the SOC as well as invoking the wrong actions. An example of this
could be threat intelligence highlighting a specific domain as malicious, although later research
might show that the domain was either spoofed or used as a proxy for the attack. Once again,
it is important for an external threat data resource to be accurate, relevant, and timely.
In reality, many SOCs often do not evaluate the effectiveness of their threat intelligence after
they have implemented it. Sometimes this effectiveness of the threat intelligence is not
examined until it is time for contract renewals. I recommend a continuous evaluation of how
effective and actionable threat intelligence is from a specific vendor or feed. If it is not effective
or causes false positives or false negatives, it will waste the SOC’s time and potentially put the
SOC at a greater risk of missing an attack. The preceding list of questions also includes one
about whether a service provides a portal to validate a finding, since it is common to ask why
something has been flagged as bad. Identifying a threat is fine, but it’s more important to know
why what is being identified is a threat!
Testing Threat Intelligence
Beyond understanding details about the threat intelligence itself, there are the questions I
suggested you ask regarding operationalizing a threat intelligence service. Ideally, an external threat
intelligence service will allow you to use its feed for a trial period. You also want to consider
if any additional data modification (such as custom filters) will be needed before the data can
be used and how much effort is expected to extract the value from the noise. Data formats for
threat intelligence are covered later in this chapter in the section “Collecting and Processing
Intelligence.”
Using this approach to organize your requirements for a threat intelligence service will not only
help keep your evaluation of security feeds organized, but also help keep your testing criteria
clear, allowing anyone, regardless of technical level, to understand how well each offering
satisfies your organization’s goals. One of the biggest challenges in large organizations is
acquiring abstract services such as threat intelligence feeds using an external procurement
organization. The requestor within the SOC knows what they want; however, the group
responsible for selecting the service might not and could purchase the wrong service.
Data Expectations for Technical Threat Intelligence
Technical threat data is specific to the threat actor’s tools and infrastructure, while leaving out
any additional context. The purpose of having this focus is to gather intelligence to improve
detection and prevention capabilities. There are many technical threat intelligence artifacts to
consider when evaluating technical threat data. At a bare minimum, I recommend looking for

the following items when evaluating a technical threat data feed before deciding to include it
as part of your project. These are the most common use cases I find within any SOC’s
requirements for technical threat data.
• Suspicious or known malicious domains/registered URLs
• IP addresses associated with malicious activity
• Latest malware hashes
• File artifacts from malware samples
• Subject lines or email content associated with phishing campaigns
Collecting a list of malicious domains is an easy method to develop a blacklist that can be used
to block access to current remote threats. A similar approach is to use IP addresses associated
with malicious activity. Host and gateway security tools can quickly digest and adjust their
blacklists, making the collection of data to operationalize a defense process simple. One
challenge you might face is hitting IP address limits that prevent storing all of the IP
addresses or domains. Another challenge is false positives within the provided data. False
positives can result due to many factors, including a malicious source cloning a safe website,
the use of proxies by the attacker, Tor networks, or other deception and stealth tactics. IP
addresses are even harder to validate because they can easily be cloned and manipulated.
Testing and filtering might need to be applied to reduce the size of data that will be processed
and to remove data that isn’t reliable.
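The testing and filtering step described above can be as simple as normalizing the feed, removing duplicates, and subtracting known-good entries before the blocklist is pushed to a tool. A minimal sketch with invented data:

```python
def build_blocklist(raw_entries, allowlist):
    """Normalize feed entries, drop duplicates, and remove known-good
    domains to reduce false positives before loading into a security tool."""
    cleaned = set()
    for entry in raw_entries:
        domain = entry.strip().lower().rstrip(".")  # normalize case and trailing dot
        if domain and domain not in allowlist:
            cleaned.add(domain)
    return sorted(cleaned)

feed = ["Evil.EXAMPLE.com", "evil.example.com.",
        "cdn.example.org", "  evil.example.com "]
known_good = {"cdn.example.org"}  # e.g., a cloned-but-legitimate site

print(build_blocklist(feed, known_good))  # ['evil.example.com']
```

Real feeds need more validation than this (the proxy, Tor, and cloned-site cases discussed above), but even this much normalization prevents duplicate entries from consuming limited blocklist capacity.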
Capturing malware hashes and artifacts is another relatively easy approach to collect and
operationalize threat data. Many pattern-matching tools such as antivirus and sandboxes will
compare a suspicious artifact against a list of known malicious hash values. The vendor of a
security tool will provide its own data feeds; however, complementing what is provided by the
vendor’s updates with industry-specific threat data can help make your security tools more
effective at detecting threats that matter to your organization. I find many security tool vendors
are adding this capability by supporting data feeds formatted in STIX and TAXII, which is a
topic I will cover shortly in the section “Technical Processing.”
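Hash matching itself can be sketched in a few lines: compute the artifact’s hash and look it up in the known-bad set. The hashes here are computed from toy byte strings, not real malware:

```python
import hashlib

def sha256_of(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def is_known_malicious(artifact: bytes, bad_hashes: set) -> bool:
    """Pattern-matching check as performed by antivirus and sandbox tools:
    compare the artifact's hash against known malicious hash values."""
    return sha256_of(artifact) in bad_hashes

# Toy "malware sample" and a feed of known-bad SHA-256 hashes
sample = b"this stands in for a malicious binary"
bad_hashes = {sha256_of(sample)}  # as delivered by a threat data feed

print(is_known_malicious(sample, bad_hashes))          # True
print(is_known_malicious(b"benign file", bad_hashes))  # False
```

The limitation is the same as for any signature: changing a single byte of the artifact changes the hash, which is why hash feeds must be updated continuously.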
Monitoring emails for content associated with phishing is yet another simple and effective
method to improve detection against identified phishing campaigns. The value of this approach
will depend on how often the phishing language and content are changed as well as whether
the attackers use email security avoidance techniques. Using a combination of operational and
technical data feeds will help your SOC adjust to the latest phishing trends.
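A simple form of this monitoring is matching email subjects against content reported for an active phishing campaign. The subjects and patterns below are invented for illustration:

```python
# Sketch: flag emails whose subject matches content reported in a feed
# about an active phishing campaign. Patterns are hypothetical.
import re

campaign_patterns = [
    re.compile(r"invoice\s+overdue", re.IGNORECASE),
    re.compile(r"password\s+will\s+expire", re.IGNORECASE),
]

def is_suspected_phish(subject: str) -> bool:
    return any(p.search(subject) for p in campaign_patterns)

print(is_suspected_phish("URGENT: Invoice overdue - action required"))  # True
print(is_suspected_phish("Team lunch on Friday"))                       # False
```

As noted above, the value of such matching decays quickly when attackers rotate their phishing language, so these patterns should be refreshed from the feed rather than maintained by hand.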
Technical threat intelligence might seem like something you can just consume and benefit
from, but once again I must remind you about the previous examples of filling your tools with

threat and vulnerability details that are not related to your organization. Doing so will reduce
the effectiveness of those tools rather than providing any form of benefit. To avoid this pitfall,
you will need to ensure the technical data is related to your field of business and could be
processed by your technology to ensure the data is relevant to your specific needs. Pushing a
ton of random open-source technical threat intelligence feeds will cause more harm than value.
Most organizations that are negatively impacting their tools with additional, yet not valuable,
threat intelligence feeds don’t know they are doing so. When asked about the value, the typical
response is “More is better, right?” This means they have not considered what is being added
and what purpose the feed is supposed to provide. In the upcoming section, I will explain how
to avoid this mistake.
Security Tools and Threat Intelligence
It is common for a SOC to leverage technical threat data to improve the detection and
prevention capabilities of security tools. To better understand these use cases, I will review
general categories of security tools and explain how threat intelligence can enhance their
functionality. Most of these tools rely on technical threat intelligence, but some may also
leverage elements of tactical and operational threat intelligence, which will also be highlighted.
In general, threat data will help security tools answer questions that can’t be answered without
an outside viewpoint of the threat landscape. Example questions that you hope to answer by
using threat data include the following (most of these questions can’t be answered using
internally generated event data):
• Who is attacking the organization?
• What are the attacker’s motives?
• What is their target?
• What tactics, techniques, and procedures (TTPs) are being used?
• What indicators of compromise should the SOC’s tools look for?
• What actions can the SOC take to reduce the risk of exploitation?
As the SOC feeds threat data into security tools, the SOC will phase in how it is used regardless
of the data that is contained within the threat intelligence feed. The first step is to monitor for
mentions of the IOC or artifact associated with attackers. This will allow for testing to validate
that the data is processed correctly and that the results meet the goals for adding the threat data.
Earlier in this chapter, I provided an example of Splunk consuming data; however, that data
was not properly processed, which means actions can’t be taken. You will want to validate that
the security tool is properly reading and using the data before you test any actions.
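The monitor-only first phase can be sketched as counting how often feed indicators appear in internal telemetry before any blocking action is enabled. The DNS log format below is hypothetical:

```python
# Sketch of the "monitor first" phase: count matches between feed IOCs and
# internal DNS logs to validate that the threat data is being read correctly
# before enabling any blocking actions. Log format is invented.
from collections import Counter

feed_domains = {"evil.example.com", "c2.example.net"}

dns_log = [
    "2023-11-20T10:01:02 client=10.0.0.5 query=evil.example.com",
    "2023-11-20T10:01:09 client=10.0.0.7 query=intranet.example.org",
    "2023-11-20T10:02:30 client=10.0.0.5 query=evil.example.com",
]

def monitor_matches(log_lines, domains):
    hits = Counter()
    for line in log_lines:
        queried = line.rsplit("query=", 1)[-1]
        if queried in domains:
            hits[queried] += 1
    return hits

print(monitor_matches(dns_log, feed_domains))
```

If the counts look sensible (and false positives are manageable), the SOC can move from monitoring to alerting or blocking with some confidence that the feed is being processed correctly.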

After the SOC validates that the security tool is properly consuming the threat data, the next
step is to attempt to review results from before and after you added the threat data. Your SOC’s
goal is to confirm that the data is being used and is positively impacting how the system
functions. Some security tools such as SIEMs have live widgets and reports that can be broken
or tainted with the wrong data elements if external threat data is not properly utilized and
filtered upon collection.
The SOC will want to adjust how the data is collected and make other adjustments according
to the system’s configuration options to modify results until the SOC has the best results for
the SOC’s and organization’s business goals. Looking back at the Splunk example that didn’t
process the data from the security tool correctly, I had to adjust how the system formatted the
data from that system until Splunk was able to properly categorize the events. At that point, I
found Splunk would offer many new variables I could access based on various IOCs obtained
from the threat data. Some items such as the timestamps were a bit off and required adjustments
to how the collector within Splunk was configured, but eventually I got the data
processed correctly so that I could quickly identify what I needed to find.
Once the integration of the threat data is configured correctly and tested, the final step is to add
the results to the SOC’s operation through analysis. Tools such as SIEMs will allow the SOC
to create widgets that will continuously collect and display results pulled from threat
intelligence feeds, such as current threats or top hashes of malware seen in the wild. The
analysis stage is essentially when the SOC converts the collected threat data into threat
intelligence, since actions are now being developed based on the results of the new data input.
The SOC can also develop and test playbooks, which is a topic related to SOAR and will be
addressed shortly. Figure 7-14 shows a diagram of steps to operationalize threat
data into threat intelligence.

FIGURE 7-14 Steps to Operationalize Threat Data

Security Information and Event Management

A common topic covered in the security industry is centralizing data from multiple tools.
Sometimes, this is labeled as having “a single pane of glass.” The goal is to reduce the time
to investigate an incident by having one place to perform the investigation and having the
ability to correlate data from different tools to gain a better understanding of threats impacting
your organization. SIEM technology attempts to solve this challenge by acting as that central
point for event data, as I described earlier in this book. Market leaders for SIEM solutions
include Splunk, QRadar, and LogRhythm according to sources such as Gartner. The goals of
using a SIEM solution are to improve attack detection, speed up incident handling, centralize
reporting, and provide a resource to measure compliance. With that in mind, where does threat
intelligence fit in?
It is common for a SIEM solution to digest and correlate findings with internal telemetry, such
as what you get from firewall and DNS logs. This allows you to match potential attacks to the
external data that was collected. The value of a SIEM solution depends on the data it receives.
If you send it limited data, you will get limited results. If the data sent is not good, the output
will also not be good.
Another way to say this is “garbage in, garbage out.” If you don’t configure it to present the
information in your desired format, you will end up with dashboards alerting about hundreds
of events you will never be able to address, known in the industry as the “bug splat.” A SIEM
dashboard throwing out thousands of alarms at your analysts will just overwhelm them, causing
key alerts to be missed and the SIEM to provide little value. This brings us to the pros and cons
of threat intelligence for a SIEM. I will start with the cons.
Adding threat data to a SIEM tool might seem like a simple process, but it is not. The purpose
of the SIEM tool is to piece everything together (hence the single pane of glass, right?). If you
mix in the wrong external threat data, you will cause a ton of misunderstandings within the
SIEM tool’s built-in logic because the SIEM tool won’t be able to determine what is internal
and external, forcing the SOC to either re-engineer everything or look at methods to isolate
external data, leading to limitations in data correlation. I have found that some SOCs will light
up threat intelligence, get trampled with tons of new alerts they can’t take action on, and soon
after disable the threat data. This occurs due to an increase in false positives, contradictions in
correlation data, and just more things to look at.
One other con I have seen is the complete opposite situation, where a SIEM solution is not
generating any new alerts after the threat intelligence feed is added. This can occur if the threat
data is not applied correctly and either forgotten about or the SOC didn’t find any major
changes and felt the feed didn’t provide value. When this occurs, I tend to find that a scope or
objective wasn’t set, meaning there wasn’t a set success criterion to measure against. The SOC
simply lit up the feed and hoped for what would feel like “better” results. This is a common
mistake when testing threat intelligence. The SOC should not just add a threat intelligence feed
and generally look for some major incident they didn’t know about to pop up. In the real world,
that doesn’t happen by automatically adding a threat intelligence
feed to a SIEM.
Adding Threat Intelligence to a SIEM
Given that just dumping threat data into a SIEM tool will cause more problems than value,
does that mean threat intelligence is not ideal to use in a SIEM tool?
The answer is no—if you apply data with a specific goal and expected outcome. I covered how
to evaluate threat data, with a focus on ensuring that the data is reliable, timely, and accurate
and is relevant to your line of business. There are a handful of additional checkpoints that can
help with threat data integrations with a SIEM tool. Consider the following items as you
evaluate threat data for your SIEM solution:
• Validate the threat data is in a format accepted by the SIEM solution of choice. If it
isn’t, how much effort is required to manipulate the feed so that it would be accepted?
• Will this new data source increase the monthly/annual SIEM bill? Many SIEM
providers charge on a usage billing cycle, meaning the more data used, the higher the
cost for using the SIEM solution. You could reduce the impact by filtering only what
is relevant to your organization and using other tuning tricks. If you are billed by your
SIEM provider, consider the impact of adding more data.
• What are the top threats you plan to address? Answering this allows you to focus the
results of the threat data to specific goals that matter to your SOC. For example, are you
a target for a nation state? Is phishing a top concern? You can use the SIEM tool to
develop reports and live displays that digest threat data and answer these questions. In
particular, you will find that some threats are better addressed by using external threat
data while others are easier to detect using data from internal security and network tools.
The two previous examples of phishing and nation state concerns are both better suited
for using external threat data.
• Does the SIEM tool offer a way to capture additional context about events? This is
where support for tactical and operational data will be extremely handy,
allowing you to correlate attack data with additional details about the who, what, when,
where, and how of the attack. Having context will make decisions regarding what action
to take much simpler. This is especially true with SOARs and SOAR/SIEM
integrations.
• What filters are available in the SIEM tool, and can they be applied against the threat
data? The more you can focus on what is relevant to your specific needs, the more
useful the threat data will be.
• Would the threat intelligence improve the confidence of existing detection capabilities?
This is a huge question to answer since the SIEM tool is pulling in data from various
internal security and network tools. As pointed out earlier, if adding external data
weakens the SIEM tool’s decision process, this will reduce the SIEM’s confidence in
alerts being generated, causing a breakdown of the value it provides. One common
method to overcome this is to be selective about which checkpoints/widgets within the
SIEM tool are using the external threat data. The bad news is that it will require some
re-engineering of certain widgets and reports if filters were not put in place regarding
adding or removing external event data prior to when the reports and widgets were
originally created.
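As a toy illustration of the filtering and billing checkpoints above, the sketch below trims a threat feed down to only the indicator types and confidence levels that matter to the organization before anything is forwarded to the SIEM. The field names and thresholds are assumptions, not a real feed schema.

```python
# Illustrative pre-ingestion filter: keep only relevant indicators before
# forwarding a feed to the SIEM, reducing volume (and usage-based billing).
# "type" and "confidence" are assumed field names, not a real feed format.

raw_feed = [
    {"indicator": "1.2.3.4", "type": "ip", "confidence": 90},
    {"indicator": "malware.example", "type": "domain", "confidence": 40},
    {"indicator": "5.6.7.8", "type": "ip", "confidence": 75},
]

def filter_feed(feed, wanted_types, min_confidence):
    """Drop entries that are not of a wanted type or are below the confidence bar."""
    return [e for e in feed
            if e["type"] in wanted_types and e["confidence"] >= min_confidence]

relevant = filter_feed(raw_feed, wanted_types={"ip"}, min_confidence=70)
print(len(relevant))  # 2 of the 3 entries survive the filter
```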
Threat data can be converted to threat intelligence using a SIEM solution if it is added in a
planned and meaningful manner. It is critical that you follow a solid rollout plan to ensure you
maximize your value received while also avoiding any losses from adding the feed. If you just
add threat data without any set goals or considerations for how the SIEM solution is currently
being used prior to the threat data, you will run into problems with your deployment.
In summary, your rollout plan should include the following steps. Many of these steps follow
the best practices I have covered in this chapter.
Step 1. Set objectives for using the threat data. What is your measurement for success?
Step 2. Configure the SIEM solution to accept the feed.
Step 3. Monitor that the data is being collected correctly.
Step 4. Identify if existing reports and live widgets are negatively impacted. If so, add filters
to remove the external data from existing reports and widgets.
Step 5. Attempt to identify objectives using live feeds against filters that include the new threat
data.
Step 6. Tune how the data is digested and troubleshoot any collection issues until you can
identify your goals.
Step 7. Operationalize your findings in widgets and reports.
Step 8. Build the new use cases into your SOC practice and SOAR solution.
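Step 3 of this rollout plan can be approximated with a simple collection health check, sketched below. The 95 percent threshold is an arbitrary assumption, used only to illustrate comparing what a feed published against what the SIEM actually ingested.

```python
# Minimal sketch of Step 3 (monitoring collection): compare the number of
# records the feed published against the number the SIEM ingested.
# The min_ratio threshold is an assumption for illustration only.

def collection_healthy(published_count, ingested_count, min_ratio=0.95):
    """Flag the feed as unhealthy if too few records reached the SIEM."""
    if published_count == 0:
        return False  # nothing published at all is itself a collection problem
    return ingested_count / published_count >= min_ratio

print(collection_healthy(1000, 990))  # True: ~99% of records arrived
print(collection_healthy(1000, 600))  # False: likely a parsing or transport issue
```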
Security Orchestration, Automation, and Response
One major drawback of a SIEM solution is its limitations in what actions it can take against an
event. This is where a security orchestration, automation, and response (SOAR) solution steps
in. A SOAR solution can provide case management, standardization, workflow, and analytics,
all of which enable the SOC to be much more productive. Without a SOAR tool, a SOC would
be left with a slew of alarms, leaving the SOC with the responsibility to manually investigate
and track how events are being handled.
The benefits of using threat data for a SOAR tool are slightly different than those for a SIEM
tool. One benefit is the impact on how playbooks are used. Having external data can be
extremely useful for this purpose. Playbooks can include additional triggers that are impacted
by threat data, allowing a SOC to take more proactive measures when attack campaigns are
being seen in the wild. Many SOAR providers have playbook templates that leverage both
internal and external threat data; not including a threat intelligence feed would limit usage of
such playbooks. External threat data can also add confidence in when a playbook is triggered
by adding additional context or checkpoints that must occur before the playbook is launched.
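The idea of an external checkpoint gating a playbook can be sketched as follows. This is not the API of any real SOAR product; the severity threshold and field names are invented for illustration.

```python
# Hedged sketch of a SOAR-style trigger: a playbook launches only when an
# internal alert is corroborated by an external threat-data checkpoint.
# Thresholds and field names are assumptions, not any vendor's schema.

def should_launch_playbook(alert, active_campaign_iocs):
    internal_match = alert["severity"] >= 7                 # internal condition
    external_match = alert["ioc"] in active_campaign_iocs   # external checkpoint
    return internal_match and external_match

campaigns = {"c2.badhost.test", "198.51.100.7"}  # IOCs seen in the wild (invented)
alert = {"severity": 8, "ioc": "c2.badhost.test"}
print(should_launch_playbook(alert, campaigns))  # True: both checks pass
```

Requiring both conditions is what adds confidence: the external data confirms the internal signal before any automated response fires.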
Other benefits from using external threat data with a SOAR tool are similar to those of using
external threat data with a SIEM tool. Those benefits include improvements to dashboards and
reporting and improvements to incident management and response. The assumption is,
however, that the same considerations are made regarding choosing which data and how it is
used as I covered in the SIEM section. As with a SIEM tool, you must follow a phased-in
approach to adding threat intelligence to a SOAR tool or your capabilities will break, data
within the SOAR tool will become tainted with false positives, and the overall value of the
SOAR tool will be negatively impacted.
I highly suggest following the same rollout plan for adding threat intelligence to a SOAR tool
as you would for a SIEM tool. The only difference will be identifying any default playbooks
or other SOAR capabilities that are designed for leveraging threat intelligence and adding those
to your evaluation plan to simplify your end results. Many SOARs likely have default
playbooks that are similar to your goals, allowing you to have a starting point rather than
developing each playbook from scratch. I recommend first testing the default playbooks, which are
built for leveraging external threat data, before creating your own. Once those default
playbooks are capable of digesting the data, then attempt to modify the templates or build your
own, knowing that the threat data is captured correctly and available within the
SOAR tool.
I recommend picking a handful of testing criteria as you add threat data to a SOAR solution.
Those testing criteria should fall under three categories: playbooks, dashboards, and reporting.
I also recommend first testing the impact of data within the SOAR dashboard before attempting
projects associated with playbooks, reporting, or other automated tasks. The specifics of how
to carry out testing will depend on the SOAR solution, other security tools, and your business
objectives.
What data is open source?
As highlighted above, “open source” refers to data that is readily available for public
consumption. Rather than coming from a single location, open source data (OSD) can be taken from a range of
sources for use in OSINT investigations. Let’s take a look at some of the places open source
data can emerge from in detail.
News media content
Content produced, published or broadcast — including online — for general public
consumption in multiple media formats such as journals, newspapers, radio and television. This
also includes media aggregators that do not necessarily publish original content.
Grey literature
This refers to materials and information from nonmedia organisations and institutions. This
includes —
• Academic institutions, think tanks and research institutions — for example, academic
papers.
• Government agencies — includes information that can be accessed on request, such as
census data.
• Businesses and corporations — this would include annual reports and company filings.
• Intergovernmental organisations — reports from organisations like the United Nations
and World Health Organization (WHO).
• Charities and Non-governmental organisations (NGOs).
Social media
Where publicly available, this includes information in both long-form, e.g. blogs and sources
such as Reddit, and short-form, e.g. posts on Facebook, Twitter, and LinkedIn.
Dark web
The dark web is a treasure trove of data often linked to criminal activity. It can contain data
such as usernames, email addresses and phone numbers of individuals connected to crimes.
More sources and more possibilities
In the past, open source material was mostly limited to printed media, such as books, articles
and public records, that could only be viewed at specific places and times. Online data provides
on-demand access to published material, as well as self-published blogs and social media posts.
There is also now a whole range of visual and auditory media that did not exist prior to the
development of smartphones and mobile technology. While providing professional researchers
and investigators with larger volumes of potentially useful information, the rapidly expanding
nature of OSD also threatens to overwhelm. Therefore, applying rigour to the way in which
OSD is collated, analysed and used is now more important than ever. In many ways, the
Information Age is also proving to be the Age of OSINT.
Benefits of OSINT
OSINT has been widely utilised across government and military applications since as early as
the Second World War, supporting investigations into global issues, including counter-
terrorism and counter-proliferation. As many former government investigators enter the private
sector, OSINT’s popularity in industries such as financial services has steadily increased. Many
organisations now consider OSINT a key part of investigative best practice, thanks to the range
of benefits it offers.
Benefit 1: Expanded insight
An investigator’s ability to extract meaningful insights is dependent on the information at their
disposal. The inclusion of open source data alongside internal and other data sources gives
investigators the context they need to make comprehensive decisions.
Financial services use case
Effective regulatory compliance processes such as anti-money laundering (AML) and know
your customer (KYC) rely on an in-depth understanding of clients and counterparties, risk
actors and threats. By using OSINT, financial institutions can extend their investigations
outside of siloed commercial databases and internal systems for a more expansive and proactive
understanding of illicit behaviours, connections, and risks. For example, correspondent
banking transactions present a specific challenge to due diligence processes, given the lack of
information available to the correspondent bank. Good governance and due diligence on behalf
of respondent banks are obviously critical. Moreover, OSINT provides one of only a few direct
ways for correspondent banks to undertake anti-financial crime (AFC) and AML checks in this context.
Corporate use case
Proactive investigations into potential avenues of risk are important for businesses to protect
themselves against a range of complex threats that have the potential to inflict serious financial
and reputational damage. Examples where OSINT plays a particularly important role include
• Due diligence: Globally active corporations face an array of risks against which they need
to screen clients, suppliers, franchise partners, acquisition targets and other “third parties”.
OSINT can fill the gaps left by traditional solutions that may not provide a full picture
regarding sanctions, corruption and bribery, human rights, and ESG (Environmental,
Social and Governance) risks.
• Fraud, brand protection and illicit trade: Large enterprises, especially in the retail and
FMCG (Fast Moving Consumer Goods) space, face significant losses due to a wide variety
of increasingly complex fraud schemes, as well as from organised crime groups that are
flooding markets with counterfeit goods. The chances of preventing, detecting and
disrupting these activities can be greatly strengthened by proactively investigating
individuals and groups, and then providing high-quality intelligence to relevant legal
partners and law enforcement agencies.
Benefit 2: Improved accuracy
The huge amount of data available in open sources has the potential to provide numerous
additional insights, but these are hard to access without processes that turn data into
intelligence. OSINT helps investigators to improve accuracy by structuring the management of
large data sets using data categorisation, filtering and advanced analysis. Processes like these
ensure that all possible connections are made, and risks are identified. This enables streamlined,
effective investigations, and a more complete understanding of the information available.
Benefit 3: Greater access to intelligence
OSINT is derived from publicly available open source data. On its own, this brings three key
benefits —
• More ethical: Public scepticism surrounding data usage continues to grow, and
regulations such as GDPR have required organisations to enforce more stringent rules
regarding data collection, storage and analysis. There are reputational, compliance, and
moral challenges to address when handling data. OSINT helps create an ethical
approach to data analysis because it is publicly available.
• Easier to acquire: Some forms of intelligence are difficult to gather. For example,
HUMINT (intelligence derived from human sources) often requires highly trained
investigators to work anonymously in dangerous circumstances to acquire information.
Access to private data, such as call data, generally requires specialist technology and
privileges available only to law enforcement. Open source data, on the other hand, is
publicly available, meaning that anyone can access it and use it as intelligence if the
right processes and people are in place.
• Less expensive: OSINT is often freely accessible through search engines, data
aggregators and more. Compared to other types of intelligence, that makes OSINT
relatively inexpensive to access. With that said, upfront investment in OSINT
technology can be beneficial. OSINT solutions minimise the need to collect data
indiscriminately by using technology to focus investigators on relevant information.
Ultimately, effective OSINT technology can improve the efficiency and effectiveness
of your OSINT investigations.
Public sector use case
Prioritising OSINT, which focuses on publicly available data, can significantly reduce
investigative practices that encroach on individuals’ right to privacy. By centralising targeted
data collation and filtering out irrelevant information, governments, intelligence agencies, and
law enforcement stand to enhance investigations into networks, illicit connections, and more
without violating public trust or privacy.
Applying the Intelligence Cycle to OSINT

Step 1: Direction

Role of technology

OSINT technology can enhance the efficiency and effectiveness of the Direction phase of the
Intelligence Cycle (IC) by providing —
• Team capabilities and messaging features that allow any intelligence requirements
(IRs), task orders or background briefings to be communicated to investigators and
analysts.
• An ability for analysts to translate IRs into an investigative plan, create a
reporting template, and track the progress of an investigation.
Step 2: Collection

Role of technology

As well as having a Collection Plan, the efficiency and effectiveness of this IC phase can be
improved by the application of technology. For example, OSINT tools exist that can help
analysts —

• Strike a balance between comprehensive data collection and the targeted identification
of relevant information.
• Structure the way that data is collected, and keep a record of what happened when.
• Remain secure and anonymous when undertaking online research, a capability that’s
particularly important to investigations and intelligence professionals working in sensitive
roles where operational security is paramount. This is essential in a financial service
context where tipping off a suspect of an AML investigation can lead to criminal
prosecution.
• Search multiple search terms simultaneously across different sources, including search
engines, corporate records databases and social media platforms. When done manually,
this process can be extremely time consuming and error-prone.
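The multi-term, multi-source search described in the last bullet can be sketched in a few lines of Python. The "sources" here are plain in-memory lists standing in for search engines, corporate records databases, and social media platforms; real tools would call out to those services instead.

```python
# Illustrative multi-source search loop: run the same terms against several
# sources in one pass, as an OSINT tool would. The sources and documents
# below are invented stand-ins for real data providers.

sources = {
    "search_engine": ["acme ltd fraud report", "weather today"],
    "corporate_records": ["acme ltd annual filing 2023"],
    "social_media": ["post about acme ltd", "unrelated post"],
}

def multi_search(terms, sources):
    """Return, per source, the documents matching any of the search terms."""
    results = {}
    for name, documents in sources.items():
        results[name] = [d for d in documents if any(t in d for t in terms)]
    return results

hits = multi_search(["acme ltd"], sources)
print(sum(len(v) for v in hits.values()))  # 3 matching documents across sources
```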

Step 3: Processing

Role of technology

Technology has the potential to completely transform the Processing stage of the Intelligence
Cycle within an OSINT investigation. The automatic categorisation, referencing, deduplication
and translation of collected information can significantly reduce the time taken to complete
manual and repetitive tasks. With the right tools, the analyst should have to spend little time
on Processing, leaving them to concentrate on the next stage: Analysis.
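A minimal sketch of this automated Processing step follows: collected records are deduplicated and tagged with a simple category so the analyst can move straight to Analysis. The keyword-to-category mapping is a toy assumption; real tools use far richer classification.

```python
# Sketch of automated Processing: deduplicate collected records and tag each
# with a category. The records and keyword mapping are invented examples.

collected = [
    "Company X fined for AML breach",
    "Company X fined for AML breach",      # duplicate from a second source
    "Director of Company X joins board of Y",
]

CATEGORIES = {"AML": "financial-crime", "board": "corporate"}

def process(records):
    unique = list(dict.fromkeys(records))  # order-preserving deduplication
    return [(r, next((c for k, c in CATEGORIES.items() if k in r), "uncategorised"))
            for r in unique]

for record, category in process(collected):
    print(category, "|", record)
```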

Step 4: Analysis
Role of technology
In this phase, it’s the Analyst who takes centre stage. They are the subject matter experts who
need to consider the processed data, assess its reliability and relevance, and integrate their
findings into a finished brief or report. However, as mentioned above, using technology to
process open source data has the benefit of giving the investigator far more time to analyse it,
as well as ensuring that the processing is accurate and thorough.
Additionally, OSINT tools frequently offer visualisation capabilities such as charts, maps and
grids, which allow the investigator to see and understand connections and networks far more
quickly, and potentially gain greater insights from the data. These kinds of tools can radically
improve the speed and effectiveness of the analytical process, allowing for more data to be
collected and analysed in a shorter time frame. Finally, tools can support the analysis and final
report creation through the automated sourcing of any collected information, inbuilt logging
for auditability and cross-referencing, and standardised formatting options.

Step 5: Dissemination
Role of technology
Technology can simplify an analyst’s ability to develop and disseminate a final product that
has a compelling narrative and addresses the IRs. An effective OSINT tool does this in three
main ways —

• Visualisation: Many tools offer charts, graphs and other visualisation capabilities that help
investigators to make sense of large volumes of data, allowing them to explain decisions
to senior stakeholders clearly and quickly.
• Consistency: Technology provides a level of consistency around processes such as
reporting, enabling organisations with multiple investigators to provide standardised, high-
quality intelligence to internal and external customers.
• Information-sharing: OSINT tools often include capabilities that allow for secure and fast
teamwork, enabling investigators to share intelligence with those who need it without any
security concerns.

Threat modelling is first and foremost a practical discipline, and this chapter is structured to
reflect that practicality. Even though this book will provide you with many valuable definitions,
theories, philosophies, effective approaches, and well-tested techniques, you’ll want those to
be grounded in experience.
Therefore, this chapter avoids focusing on theory and ignores variations for now and instead
gives you a chance to learn by experience. To use an analogy, when you start playing an
instrument, you need to develop muscles and awareness by playing the instrument. It won’t
sound great at the start, and it will be frustrating at times, but as you do it, you’ll find it gets
easier.
Threat Modelling: Process and Methodologies

With the number of hacking incidents on the rise, cybersecurity remains a top concern in
today's IT world. So many aspects of our lives have migrated online that the commercial and
private worlds alike have much to lose from security breaches. In response, cybersecurity
professionals are deploying an arsenal of defenses and countermeasures to keep transactional
data and sensitive information safe. Considering the sheer number and variety of attacks
available today, it's a huge undertaking. That's why threat modeling is making significant
inroads into the world of cybersecurity. We are about to take a close look at the threat modeling
process in cybersecurity, what it is, why it's needed, and the available methodologies.

What is Threat Modelling?


Threat modeling is a method of optimizing network security by locating vulnerabilities,
identifying objectives, and developing countermeasures to either prevent or mitigate the effects
of cyber-attacks against the system. While security teams can conduct threat modeling at any
point during development, doing it at the start of the project is best practice. This way, threats
can be identified sooner and dealt with before they become an issue.
It's also important to ask the following questions:
• What kind of threat model needs building? The answer requires studying data flow
transitions, architecture diagrams, and data classifications, so you get a virtual model
of the network you're trying to protect.
• What are the pitfalls? Here is where you research the main threats to your network
and applications.
• What actions should be taken to recover from a potential cyberattack? You've
identified the problems now; it's time to figure out some actionable solutions.
• Did it work? This step is a follow-up where you conduct a retrospective to monitor
the quality, feasibility, planning, and progress.
The Threat Modelling Process
Threat modelling consists of defining an enterprise's assets, identifying what function each
application serves in the grand scheme, and assembling a security profile for each application.
The process continues with identifying and prioritizing potential threats, then documenting
both the harmful events and what actions to take to resolve them.
Or, to put this in lay terms, threat modelling is the act of taking a step back, assessing your
organization's digital and network assets, identifying weak spots, determining what threats
exist, and coming up with plans to protect or recover.
It may sound like a no-brainer, but you'd be surprised how little attention security gets in some
sectors. We're talking about a world where some folks use the term PASSWORD as their
password or leave their mobile devices unattended. In that light, it's hardly surprising that many
organizations and businesses haven't even considered the idea of threat modelling.
Why Do We Need Security Threat Modelling?
Just how bad is the cybersecurity situation that we need to create things like threat modelling
to help combat it?
Cybercrime has exacted a heavy toll on the online community in recent years, as detailed in this
piece by Security Boulevard, which draws its conclusions from several industry sources.
Among other things, the report says that data breaches exposed 4.1 billion records in 2019 and
that social media-enabled cybercrimes steal $3.25 billion in annual global revenue.
According to KnowBe4's 2019 Security Threats and Trends report, 75 percent of businesses
consider insider threats to be a significant concern, 85 percent of organizations surveyed
reported being targeted by phishing and social engineering attacks, and percent of responders
cite email phishing scams as the largest security risk.
As a result of these troubling statistics, spending on cybersecurity products and services is
expected to surpass $1 trillion by 2021.
Cybercrime is happening all the time, and no business, organization, or consumer is safe.
Security breaches have increased by 11 percent since 2018, and by a whopping 67 percent since
2014. Smart organizations and individuals will take advantage of any reliable resources to fight
this growing epidemic, and sound threat modelling for security purposes is essential to
accomplishing this.
Why is threat modelling important?
Risk mitigation
Threat modelling plays a crucial role in risk mitigation. By identifying potential threats
before they can be exploited, organizations can take proactive measures to eliminate or
reduce these risks. This is far more cost-effective than reacting to a breach or attack after it
has happened. Moreover, by understanding the potential attack vectors, organizations can
develop more secure systems and applications right from the start.
Enhanced security awareness
Threat modelling is not just a technical process. It also involves a significant amount of
human analysis and decision-making. By engaging in threat modelling, teams can enhance
their security awareness and foster a security-focused culture within the organization. Via
threat modelling, they can educate their teams about the potential threats they may encounter
and the actions they can take to mitigate these threats.
Easier compliance
Many industries and jurisdictions have strict compliance requirements when it comes to
cybersecurity. These requirements often include a comprehensive threat analysis and ongoing
risk assessment and process improvement. Threat modelling can help organizations meet
these compliance requirements by providing a systematic and documented approach to threat
analysis.
Furthermore, a well-documented threat model can serve as evidence of due diligence in the
event of a security incident. It can show that organizations have taken reasonable steps to
identify and mitigate potential threats to their systems.
Cost-effective security
Threat modelling is a cost-effective approach to security. By identifying threats early on,
organizations can avoid the high costs associated with security breaches and data loss.
Furthermore, by focusing resources on the most significant risks, organizations can ensure
that they get the most security for their investment.
Advantages of threat modelling
Detect problems early in the SDLC
By identifying potential threats and vulnerabilities at the design stage, organizations can
avoid costly and time-consuming fixes later in the development process. This proactive
approach allows organizations to build security into their systems from the ground up, rather
than trying to patch it on as an afterthought.
Identifying potential issues early on also gives developers the opportunity to address them in
their code. This can lead to more secure software and can help avoid the need for expensive
and disruptive patches or updates later on. In essence, threat modelling can help turn security
from a reactive process into a proactive one.
Evaluate new forms of attack
As the cybersecurity landscape evolves, new threats and attack vectors continue to emerge.
Threat modelling allows organizations to stay a step ahead of these evolving threats by
providing a structured approach to identifying and assessing them.
By regularly updating their threat models, organizations can stay abreast of the latest security
threats and vulnerabilities. This can help them adapt their security strategies and defences in
response to the changing threat landscape, ensuring that they are always prepared for the
latest attacks.
Identify security requirements
By understanding the potential threats to a system and the impact failures in maintaining a
good security posture could have, organizations can determine what security controls they
need to put in place to protect their assets.
These security requirements should be incorporated into the system design and development
process, ensuring that the system is built with security in mind from the start. This can result
in more secure systems and can help organizations avoid costly and disruptive security
breaches. However, as security is often an afterthought, creating lines of communication
between IT, Security, and Development is a key requirement in business continuity and
disaster recovery scenario building.
Map assets, threat agents, and controls
By creating a detailed model of the core elements of the most critical business systems across
the organization, security leadership and risk teams can gain a better understanding of what
they need to protect, who might want to attack it, and how they can defend it. For instance,
if the business is a transactional website, putting controls around the web interface, APIs, and
back-end databases must be prioritized. For a business in manufacturing, keeping service
accounts and ICS up and running while restricting external access of any kind may be
paramount.
This understanding can help organizations prioritize their security efforts and resources,
focusing on the most critical assets and threats. It can also provide a clear roadmap for
implementing security controls, helping organizations ensure that they are effectively
protecting their systems and data.
5 steps of the threat modelling process
When performing threat modelling, several processes and aspects should be included. Failing
to include one of these components can lead to incomplete models and can prevent threats
from being properly addressed.
1. Apply threat intelligence
This area includes information about types of threats, affected systems, detection
mechanisms, tools and processes used to exploit vulnerabilities, and motivations of attackers.
Gathering this intelligence is an ongoing process, and wherever possible should be automated
by security tools.
Threat intelligence information is often collected by security researchers and made accessible
through public databases, proprietary solutions, or security communications outlets. It is used
to enrich the understanding of possible threats and to inform responses.
2. Identify assets
Teams need a real-time inventory of components, credentials, and data in use, where those
assets are located, and what security measures are in place to maintain a secure posture. This
inventory helps security teams track assets with known vulnerabilities and monitor the end
state of passwords and permissions.
A real-time inventory enables security teams to gain visibility into asset changes, for
example, by raising alerts when assets are added with or without authorized permission, which
can potentially signal a threat. A stale user or service account that has suddenly become
active may also be an indication of a threat.
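The stale-account check described above can be sketched in a few lines of Python. This is a minimal illustration only: the account records, field names, and the 90-day threshold are invented for the example, not taken from any particular inventory product.

```python
from datetime import datetime, timedelta

# Hypothetical inventory check: flag accounts that were dormant for a long
# period and have suddenly become active again, which may signal a
# compromised credential.
STALE_AFTER = timedelta(days=90)   # illustrative threshold

def suddenly_active(accounts, now):
    """Return names of accounts whose previous activity is older than
    STALE_AFTER but which have logged in within the last 24 hours."""
    alerts = []
    for acct in accounts:
        gap = acct["last_login"] - acct["previous_login"]
        recent = now - acct["last_login"] <= timedelta(hours=24)
        if gap > STALE_AFTER and recent:
            alerts.append(acct["name"])
    return alerts

now = datetime(2024, 3, 1, 12, 0)
inventory = [
    {"name": "svc_backup", "previous_login": datetime(2023, 10, 1),
     "last_login": datetime(2024, 3, 1, 9, 0)},   # dormant, now active
    {"name": "alice", "previous_login": datetime(2024, 2, 28),
     "last_login": datetime(2024, 3, 1, 8, 0)},   # normal daily use
]
print(suddenly_active(inventory, now))  # ['svc_backup']
```

In practice such a rule would run continuously against the live inventory rather than a static list, but the decision logic is the same.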
3. Identify mitigation capabilities
Mitigation capabilities generally refer to technology to protect, detect, and respond to a
certain type of threat, but can also refer to an organization’s security expertise and abilities,
and their processes. Assessing their existing capabilities will help determine whether they
need to add additional resources to mitigate a threat.
For example, if an organization has enterprise-grade antivirus or endpoint protection, they
have an initial level of protection against traditional malware threats. They can then
determine if they should invest further, for example, to correlate existing AV signals with
other detection capabilities.
4. Assess risks
Risk assessments correlate threat intelligence with asset inventories and current vulnerability
profiles. These tools are necessary for teams to understand the current status of their systems
and to develop a plan for addressing vulnerabilities.
Risk assessments can also involve active testing of systems and solutions, for example,
penetration testing to verify that security measures and patching levels are effective in
hardening systems, as well as application security testing and software composition analysis
to make sure that the applications running on those systems are as secure as current
information will allow.
5. Perform threat mapping
Threat mapping is a process that follows the potential path of threats through an
organization’s systems. It is used to model how attackers might move from resource to
resource and helps teams anticipate where defenses can be more effectively layered or
applied. Wherever specific security mitigations and logs can be mapped against the most
applicable use cases that threaten the business, there is a more effective model for
demonstrating the organization’s current security posture in reports for compliance.
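Following the potential path of threats, as described above, can be modelled as a walk over a directed graph of resources. The sketch below is an invented example (all resource names are hypothetical) showing how a team might enumerate everything an attacker could reach from an internet-facing entry point.

```python
from collections import deque

# Hypothetical threat map: edges point from a compromised resource to the
# resources an attacker could move to next (lateral movement).
edges = {
    "web_server": ["app_server"],
    "app_server": ["db_server", "file_share"],
    "file_share": ["backup_server"],
    "db_server": [],
    "backup_server": [],
}

def reachable(start):
    """Breadth-first walk: every resource an attacker could reach from start."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen - {start}

print(sorted(reachable("web_server")))
# ['app_server', 'backup_server', 'db_server', 'file_share']
```

A map like this makes it visible where a single extra control (for example, segmenting the file share from the backup server) would cut off the largest number of attack paths.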
Threat Modelling frameworks
Teams choosing to participate in Threat Modelling at CMS will have the option to work with
the Threat Modelling Team during a series of sessions. To successfully complete these
sessions, the Threat Modelling Team will use a number of proven frameworks including:
• Adam Shostack’s Four-Question Frame for Threat Modelling
• STRIDE Threat Model
These methods were chosen by the Threat Modelling Team because they are expedient,
reliable models that use industry-standard language and provide immediate value to CMS
teams. Read on to learn about the specifics of these frameworks.
Four-Question Frame for Threat Modelling
As your team embarks on its Threat Modelling journey, it’s important that these four questions
remain top-of-mind:
1. What are we working on?
2. What can go wrong?
3. What are we going to do about it?
4. Did we do a good enough job?
These questions form the base of the work that your team and the Threat Modelling Team will
complete together. The questions are actionable, and designed to quickly identify problems and
solutions, which is the core purpose of Threat Modelling.
Types of Threat Modelling
Threat modelling can take various forms, each providing a framework to address specific
cybersecurity concerns. Two popular types of threat modelling are STRIDE and DREAD.
The STRIDE Model
The STRIDE Threat Modelling framework is a systematic approach used to identify and
analyze potential security threats and vulnerabilities in software systems. It provides a
structured methodology for understanding and addressing security risks during the design and
development stages of a system.
The acronym STRIDE stands for the six types of threats that the framework helps to identify:
Each threat type, the property it violates, and its definition:
• Spoofing (violates Authentication): Pretending to be something or someone other than yourself.
• Tampering (violates Integrity): Modifying something on disk, network, memory, or elsewhere.
• Repudiation (violates Non-Repudiation): Claiming that you didn't do something or were not responsible; can be honest or false.
• Information Disclosure (violates Confidentiality): Providing information to someone not authorized to access it.
• Denial of Service (violates Availability): Exhausting resources needed to provide service.
• Elevation of Privilege (violates Authorization): Allowing someone to do something they are not authorized to do.
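The six STRIDE categories map naturally onto a small data structure. The sketch below tags hypothetical components of a transactional website (as in the earlier example in this unit) with the threat types most relevant to them; all component names are invented for illustration.

```python
# STRIDE category -> the security property it violates.
STRIDE = {
    "Spoofing": "Authentication",
    "Tampering": "Integrity",
    "Repudiation": "Non-Repudiation",
    "Information Disclosure": "Confidentiality",
    "Denial of Service": "Availability",
    "Elevation of Privilege": "Authorization",
}

# A toy model of a transactional website: each component is tagged with the
# STRIDE threat types judged most relevant to it.
components = {
    "web_login_form": ["Spoofing", "Information Disclosure"],
    "payment_api": ["Tampering", "Denial of Service", "Elevation of Privilege"],
    "orders_db": ["Tampering", "Information Disclosure"],
}

def properties_at_risk(component):
    """List the security properties threatened for one component."""
    return sorted(STRIDE[t] for t in components[component])

for name in components:
    print(name, "->", properties_at_risk(name))
```

Walking the inventory this way turns the STRIDE table into a per-component checklist: for each asset, which properties are threatened and which controls address them.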
The DREAD Model
The DREAD model was originally proposed at Microsoft for threat modelling, but Microsoft
dropped it in 2008 due to inconsistent ratings. OpenStack and many other organizations still
use DREAD. It is essentially a way to rank and assess security risks across five categories:
• Damage Potential: Ranks the extent of damage resulting from an exploited weakness.
• Reproducibility: Ranks the ease of reproducing an attack.
• Exploitability: Assigns a numerical rating to the effort needed to launch the attack.
• Affected Users: A value representing how many users are impacted if an exploit becomes
widely available.
• Discoverability: Measures how easy it is to discover the threat.
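A common (though not mandated) convention is to rate each DREAD category from 1 to 10 and take the average as the overall risk score. The helper below is a minimal sketch of that convention; the High/Medium/Low bands are illustrative thresholds, not part of the model itself.

```python
def dread_score(damage, reproducibility, exploitability,
                affected_users, discoverability):
    """Return the average of the five DREAD ratings (each 1-10)."""
    ratings = [damage, reproducibility, exploitability,
               affected_users, discoverability]
    if not all(1 <= r <= 10 for r in ratings):
        raise ValueError("each DREAD rating must be between 1 and 10")
    return round(sum(ratings) / len(ratings), 1)

def risk_band(score):
    """Map a DREAD score onto an illustrative High/Medium/Low band."""
    if score >= 7:
        return "High"
    if score >= 4:
        return "Medium"
    return "Low"

# Example: an easily reproduced attack with moderate damage potential.
score = dread_score(damage=6, reproducibility=9, exploitability=7,
                    affected_users=8, discoverability=8)
print(score, risk_band(score))  # 7.6 High
```

The value of the exercise is less the number itself than the forced discussion of each category; the inconsistent-ratings criticism mentioned above arises because different raters assign the 1-10 values differently.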
Use cases of Threat Modelling
Threat modelling is highly valuable in the early stages of the software development life cycle
(SDLC) such as requirements gathering and design, where it helps identify potential risks and
vulnerabilities in the system architecture.
Additionally, it is beneficial in infrastructure management, particularly in assessing and
mitigating risks to critical assets such as networks, servers, and databases. By incorporating
threat modelling at these key points, organizations can proactively address security
concerns and implement appropriate security measures for robust protection.
Threat modelling offers several advantages to industries and businesses, including:
▪ Proactive Risk Mitigation: By identifying and assessing potential threats in advance,
organizations can take proactive measures to mitigate risks and vulnerabilities.
▪ Optimal Resource Allocation: It helps businesses prioritize security investments and
allocate resources effectively to address the most critical risks.
▪ Compliance with Standards: It assists in complying with industry standards and
regulations by identifying potential risks to sensitive data and implementing necessary
safeguards.
▪ Improved Collaboration: It encourages collaboration among different teams and
stakeholders within an organization, ensuring a unified approach to cybersecurity.
UNIT-X Legal and Regulatory Framework
Introduction to NICE
The National Initiative for Cybersecurity Education (NICE) is a nationally coordinated effort
focused on cybersecurity awareness, education, training, and professional development. The
mission of NICE is to enhance the overall cybersecurity posture of the United States by
accelerating the availability of educational and training resources designed to improve the
cyber behavior, skills, and knowledge of every segment of the population, enabling a safer
cyberspace for all.
NICE Component 4 – Workforce Training and Professional Development
This component is responsible for:
• defining the cybersecurity workforce; and
• identifying the training and professional development required for the nation's
cybersecurity workforce.
It is led by the DoD, ODNI and DHS, in coordination with academia, industry, and state, local
and tribal governments.
Understanding the Cybersecurity Workforce
We need the answers to questions such as:
Who is a cybersecurity professional?
• Do we know in our Federal Government employee population, who works in
cybersecurity and what their capabilities are?
• How many cybersecurity professionals receive annual performance awards in
comparison to professionals in other occupations?
• What is the average starting salary of a System Architect within various Federal
Government organizations? How does this compare to private industry?
• What are the average promotion rates of different cybersecurity specialties compared to
one another and to other occupations?
• What are the attrition rates?
NICE Workforce Plan Overview
Need for Standardization
Today, there is very little consistency throughout the Federal Government and the Nation in
terms of how cybersecurity work is defined, described, and how the workforce is trained.
Establishing and implementing standards for cybersecurity workforce and training is a
foundational component for every workforce plan.
Component 4 Work Plan – Task Overview
Federal Department and Agency Support
Over 20 Federal Departments and Agencies participated in developing the framework,
including:
• Department of State
• Department of Education
• Department of Labor
• Office of Management and Budget
• Office of Personnel Management
• Department of Defense
• Department of Justice
• Department of Homeland Security (including NPPD, TSA, USSS, Coast Guard, ICE, CBP,
CIS, DHS OI&A)
• Central Intelligence Agency
• Defense Intelligence Agency
• Director of National Intelligence
• Federal Bureau of Investigation
• National Security Agency
• National Science Foundation
• Department of Defense Cyber Crime Center (DC3)
• National Counterintelligence Executive
• Federal CIO Council
Cybersecurity Workforce Framework
Framework Development Process
Framework Categories
The Framework organizes cybersecurity into seven high-level categories, each comprising
several specialty areas.
What is the NIST Cybersecurity Framework?
The NIST Cybersecurity Framework (CSF) provides guidance on how to manage and
reduce IT infrastructure security risk. The CSF is made up of standards, guidelines and
practices that can be used to prevent, detect and respond to cyberattacks.
NIST created the CSF to help private sector organizations in the United States develop a
roadmap for critical infrastructure cybersecurity. It has been translated into multiple languages
and is used by the governments of Japan, Israel and others.
The NIST CSF is most beneficial for small or less-regulated entities -- specifically those trying
to increase security awareness. The framework might be less informative for larger
organizations that already have a focused IT security program.
The framework was created as a voluntary measure through a collaboration between private
industry and government. NIST designed the framework to be flexible and cost-efficient, with
elements that can be prioritized. The CSF is available as a spreadsheet or PDF and as a
reference tool.
NIST CSF objectives
The NIST CSF aims to ensure critical IT infrastructure is secure. It is intended to provide
guidance but is not compliance focused. Its goal is to encourage organizations to prioritize
cybersecurity risks -- similar to financial, industrial/personnel safety and operational risks.
Another objective of the framework is to help include cybersecurity risk considerations in day-
to-day discussions at organizations.
NIST's Cybersecurity Framework uses
The CSF is designed to help organizations protect infrastructure they deem critical. It can help
increase security in the following ways:
• To determine current levels of implemented cybersecurity measures by creating a
profile.
• To identify new potential cybersecurity standards and policies.
• To communicate new requirements.
• To create a new cybersecurity program and requirements.
NIST compliance
The framework is both voluntary and performance based, meaning organizations are not
required to follow it. Originally, the CSF was developed as a guideline under an executive
order issued by former President Barack Obama. The standards continued to be implemented by
government offices under former President Donald Trump and are still implemented by
government offices under President Joe Biden.
While designed primarily for government and private sector organizations, public companies
can also use the NIST Cybersecurity Framework. The U.S. government and NIST provide
several tools to help organizations get started with cybersecurity programs and assessments.
Version 1.1 of the framework added a section titled "Self-Assessing Cybersecurity Risk with
the Framework" for organizations to follow.
NIST does not use the term comply when it comes to the CSF. If an organization chooses to
follow the framework, NIST uses the term leverage -- as in an organization will leverage the
NIST Cybersecurity Framework.
History of NIST CSF
In February 2013, President Obama issued Executive Order 13636: Improving Critical
Infrastructure Cybersecurity, which called for the development of a voluntary cybersecurity
framework that would provide a prioritized, flexible and performance-based approach to aid
organizations in managing cybersecurity risks for critical infrastructure services. While
multiple federal agencies were tasked with developing elements related to this executive order,
NIST was assigned to develop a cybersecurity framework with input from private industry.
The final version of NIST's document was released in 2014.
The document was later translated into different languages, including Spanish, Japanese,
Portuguese and Arabic, to be used by a variety of governments.
In 2017, draft version 1.1 of the document was circulated and later made publicly available in
April 2018.
3 parts of NIST's framework
The CSF is broken down into three parts: the core, implementation tiers and profiles.
• The framework core, as described by NIST, is the set of cybersecurity activities and
desired outcomes common across any critical infrastructure sector.
• The framework implementation tiers provide context around an organization's
cybersecurity risks and processes to put in place to manage risks. The tiers describe
the level at which an organization's cybersecurity risk management practices follow
the characteristics defined in the CSF. A tier 1 organization, for example, is one
that's ranked as partial, described as having limited awareness. Tier 2 is risk-
informed, tier 3 is repeatable and tier 4 is adaptive, meaning the organization can
best react to cybersecurity threats.
• The framework profiles describe the current state of an organization's security
program, as well as compare the current state to the desired state. This process helps
reveal any gaps, which can later be addressed. The goal of a profile is to aid
organizations in establishing a roadmap for reducing cybersecurity risk.
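A profile comparison of this kind can be sketched very simply: rate each core function at its current and target implementation tier (1 = Partial up to 4 = Adaptive) and list the gaps. The tier values below are invented for the example, not drawn from any real organization.

```python
TIERS = {1: "Partial", 2: "Risk-Informed", 3: "Repeatable", 4: "Adaptive"}

# Hypothetical current and target profiles across the five core functions.
current_profile = {"Identify": 2, "Protect": 3, "Detect": 1,
                   "Respond": 2, "Recover": 1}
target_profile = {"Identify": 3, "Protect": 3, "Detect": 3,
                  "Respond": 3, "Recover": 2}

def profile_gaps(current, target):
    """Return {function: tiers short of target} for functions with a gap."""
    return {fn: target[fn] - current[fn]
            for fn in current if target[fn] > current[fn]}

gaps = profile_gaps(current_profile, target_profile)
roadmap = sorted(gaps, key=gaps.get, reverse=True)  # largest gaps first
print(gaps)     # {'Identify': 1, 'Detect': 2, 'Respond': 1, 'Recover': 1}
print(roadmap)  # ['Detect', 'Identify', 'Respond', 'Recover']
```

Ordering the functions by gap size gives a crude but useful first cut at the roadmap the profile is meant to produce.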
The CSF is made up of the following five core functions:
1. Identify refers to developing an understanding of how to manage cybersecurity
risks to systems, assets, data or other sources.
2. Protect refers to the safeguards put in place to ensure the delivery of critical
infrastructure services.
3. Detect defines how a cybersecurity event is identified.
4. Respond defines what actions are taken when a cybersecurity event is detected.
5. Recover identifies which services should focus on resilience and outlines how to
restore capabilities of impaired services.
The goal of these functions is to provide a strategic view of the cybersecurity risks in an
organization.
For the NIST Cybersecurity Framework 2.0, see the National Institute of Standards and
Technology publication at: https://nvlpubs.nist.gov/nistpubs/CSWP/NIST.CSWP.29.ipd.pdf
ISO 27035 Information technology — Information security incident management (IT-ISIM)
• 27035-1 Part1: Principles and process / 2023-02 (2nd ed., 33p.)
• 27035-2 Part2: Guidelines to plan and prepare for incident response / 2023-02 (2nd ed.,
53p.)
• 27035-3 Part3: Guidelines for ICT incident response operations / 2020-09 (1st ed., 31p.)
• 27035-4 Part4: Coordination / Not released yet (CD stage?, 22p.)
INTERNATIONAL STANDARD ISO/IEC 27035-1:2023
The ISO/IEC 27035 series provides additional guidance to the controls on incident
management in ISO/IEC 27002. These controls should be implemented based upon the
information security risks that the organization is facing.
Information security policies or controls alone do not guarantee total protection of information,
information systems, services or networks. After controls have been implemented, residual
vulnerabilities are likely to remain that can reduce the effectiveness of information security and
facilitate the occurrence of information security incidents. This can potentially have direct and
indirect adverse consequences on an organization's business operations. Furthermore, it is
inevitable that new instances of previously unidentified threats cause incidents to occur.
Insufficient preparation by an organization to deal with such incidents makes any response less
effective, and increases the degree of potential adverse business consequence. Therefore, it is
essential for any organization desiring a strong information security programme to have a
structured and planned approach to:
• plan and prepare information security incident management, including policy,
organization, plan, technical support, awareness and skills training, etc.;
• detect, report and assess information security incidents and vulnerabilities involved with
the incident;
• respond to information security incidents, including the activation of appropriate controls
to prevent, reduce, and recover from impact;
• deal with reported information security vulnerabilities involved with the incident
appropriately;
• learn from information security incidents and vulnerabilities involved with the incident,
implement and verify preventive controls, and make improvements to the overall approach
to information security incident management.
The ISO/IEC 27035 series is intended to complement other standards and documents that give
guidance on the investigation of, and preparation to investigate, information security incidents.
The ISO/IEC 27035 series is not a comprehensive guide, but a reference for certain
fundamental principles and a defined process that are intended to ensure that tools, techniques
and methods can be selected appropriately and shown to be fit for purpose should the need
arise.
While the ISO/IEC 27035 series encompasses the management of information security
incidents, it also covers some aspects of information security vulnerabilities. Guidance on
vulnerability disclosure and vulnerability handling by vendors is also provided in ISO/IEC
29147 and ISO/IEC 30111, respectively.
The ISO/IEC 27035 series also intends to inform decision-makers when determining the
reliability of digital evidence presented to them. It is applicable to organizations needing to
protect, analyse and present potential digital evidence. It is relevant to policy-making bodies
that create and evaluate procedures relating to digital evidence, often as part of a larger body
of evidence.
Information technology — Information security incident management —
Part 1:
Principles and process
1 Scope
This document is the foundation of the ISO/IEC 27035 series. It presents basic concepts,
principles and process with key activities of information security incident management, which
provide a structured approach to preparing for, detecting, reporting, assessing, and responding
to incidents, and applying lessons learned.
The guidance on the information security incident management process and its key activities
given in this document are generic and intended to be applicable to all organizations, regardless
of type, size or nature. Organizations can adjust the guidance according to their type, size and
nature of business in relation to the information security risk situation. This document is also
applicable to external organizations providing information security incident management
services.
2 Normative references
The following documents are referred to in the text in such a way that some or all of their
content constitutes requirements of this document. For dated references, only the edition cited
applies. For undated references, the latest edition of the referenced document (including any
amendments) applies.
ISO/IEC 27000, Information technology — Security techniques — Information security
management systems — Overview and vocabulary
3 Terms, definitions and abbreviated terms
3.1 Terms and definitions
For the purposes of this document, the terms and definitions given in ISO/IEC 27000 and the
following apply.
ISO and IEC maintain terminology databases for use in standardization at the following
addresses:
— ISO Online browsing platform: available at https://www.iso.org/obp
— IEC Electropedia: available at https://www.electropedia.org/
3.1.1
Incident Management Team
IMT
team consisting of appropriately skilled and trusted members of an organization responsible
for leading all information security incident management activities, in coordination with other
parties both internal and external, throughout the incident lifecycle
Note 1 to entry: The head of this team can be called the incident manager, who has been
appointed by top management to adequately respond to all types of incidents.
3.1.2
Incident Response Team
IRT
team of appropriately skilled and trusted members of an organization that responds to and
resolves incidents in a coordinated way
Note 1 to entry: There can be several IRTs, one for each aspect of the incident.
Note 2 to entry: Computer Emergency Response Team (CERT) and Computer Security
Incident Response Team (CSIRT) are specific examples of IRTs in organizations and sectorial,
regional, and national entities wanting to coordinate their response to large scale ICT and
cybersecurity incidents.
3.1.3
incident coordinator
person responsible for leading all incident response (3.1.9) activities and coordinating the
incident response team (3.1.2)
Note 1 to entry: An organization can decide to use another term for the incident coordinator.
3.1.4
information security event
occurrence indicating a possible breach of information security or failure of controls
3.1.5
information security incident
related and identified information security event(s) (3.1.4) that can harm an organization's
assets or compromise its operations
3.1.6
information security incident management
collaborative activities to handle information security incidents (3.1.5) in a consistent and
effective way
3.1.7
information security investigation
application of examinations, analysis and interpretation to aid understanding of an information
security incident (3.1.5)
[SOURCE: ISO/IEC 27042:2015, 3.10, modified —“information security” was added to the
term and the phrase “an incident” was replaced by “an information security incident” in the
definition.]
3.1.8
incident handling
actions of detecting, reporting, assessing, responding to, dealing with, and learning from
information security incidents (3.1.5)
3.1.9
incident response
actions taken to mitigate or resolve an information security incident (3.1.5), including those
taken to protect and restore the normal operational conditions of an information system and the
information stored in it
3.1.10
point of contact
PoC
defined organizational function or role serving as the coordinator or focal point of
information concerning incident management activities
Note 1 to entry: The most obvious PoC is the role to whom the information security event is
raised.
3.2 Abbreviated terms
BCP business continuity planning
CERT computer emergency response team
CSIRT computer security incident response team
DRP disaster recovery planning
ICT information and communications technology
IMT incident management team
IRT incident response team
ISMS information security management system
PoC point of contact
RPO recovery point objective
RTO recovery time objective
4 Overview
4.1 Basic concepts
Information security events and incidents may happen for several reasons:
• technical/technological, organizational or physical vulnerabilities, partly due to
incomplete implementations of the decided controls, are likely to be exploited, as
complete elimination of exposure or risk is unlikely;
• humans can make errors;
• technology can fail;
• risk assessment is incomplete and risks have been omitted;
• risk treatment does not sufficiently cover the risks;
• changes in the context (internal and/or external) so that new risks exist or treated risks
are no longer sufficiently covered.
The occurrence of an information security event does not necessarily mean that an attack has
been successful or that there are any implications on confidentiality, integrity or availability,
i.e. not all information security events are classified as information security incidents.
Information security incidents can be deliberate (e.g. caused by malware or breach of
discipline), accidental (e.g. caused by inadvertent human error) or environmental (e.g. caused
by fire or flood) and can be caused by technical (e.g. computer viruses) or non-technical (e.g.
loss or theft of hardcopy documents) means. Incidents can include the unauthorized disclosure,
modification, destruction, or unavailability of information, or the damage or theft of
organizational assets that contain information.
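The event/incident distinction above can be expressed as a simple triage rule: escalate an event to an incident only when it has a confirmed impact on confidentiality, integrity or availability. The sketch below is illustrative only; the field names and example records are invented.

```python
def classify(event):
    """Return 'incident' if any CIA property is impacted, else 'event'."""
    impacted = (event.get("confidentiality_impact")
                or event.get("integrity_impact")
                or event.get("availability_impact"))
    return "incident" if impacted else "event"

# An external port scan: something happened, but nothing was compromised.
port_scan = {"description": "external port scan detected",
             "confidentiality_impact": False,
             "integrity_impact": False,
             "availability_impact": False}

# Ransomware on a file server: integrity and availability are harmed.
ransomware = {"description": "files encrypted on file server",
              "confidentiality_impact": False,
              "integrity_impact": True,
              "availability_impact": True}

print(classify(port_scan))   # event
print(classify(ransomware))  # incident
```

Real triage, of course, weighs many more factors (asset criticality, scope, attacker intent), but the core decision, "does this harm assets or compromise operations?", is the one the ISO definitions draw.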
Annex B provides descriptions of selected examples of information security incidents and their
causes for informative purposes only. It is important to note that these examples are by no
means exhaustive. A threat exploits vulnerabilities (weaknesses) in information systems,
services, or networks, causing the occurrence of information security events and thus
potentially causing incidents to information assets exposed by the vulnerabilities. Figure 1
shows the relationship of objects in an information security incident.
Figure 1 — Relationship of objects in an information security incident
Coordination is an important aspect in information security incident management. Many
incidents cross organizational boundaries and cannot be easily resolved by a single
organization or, a part of an organization where the incident has been detected. Organizations
should commit to the overall incident management objectives. Incident management
coordination is required across the incident management process for multiple organizations to
work together to handle information security incidents. This is for example the role of CERTs
and CSIRTs. Information sharing is necessary for incident management coordination, where
different organizations share threat, attack, and vulnerability information with each other so
that each organization’s knowledge benefits the other. Organizations should protect sensitive
information during information sharing and communication. See ISO/IEC 27010 for further
details.
It is important to note that an information security incident should be resolved within a
defined time frame to avoid unacceptable damage or a resulting catastrophe. This time pressure
is less critical in the case of an event, vulnerability or non-conformity.
4.2 Objectives of incident management
As a key part of an organization's overall information security strategy, the organization should
put controls including procedures in place to enable a structured well-planned approach to the
management of information security incidents. From an organization’s perspective, the prime
objective is to avoid or contain the impacts of information security incidents in order to
minimize the direct and indirect damage to its operations caused by the incidents. Since damage
to information assets can have a negative consequence on operations, business and operational
perspectives should have a major influence in determining more specific objectives for
information security incident management.
More specific objectives of a structured well-planned approach to incident management should
include the following:
a) information security events are detected and efficiently dealt with, in particular deciding
whether they should be classified as information security incidents;
b) identified information security incidents are assessed and responded to in the most
appropriate and efficient manner and within the predetermined time frame;
c) the adverse impact(s) of information security incidents on the organization and involved
parties and their operations are minimized by appropriate controls as part of incident response;
d) a link with relevant elements from crisis management and business continuity management
through an escalation process is established. There is a need for a swift transfer of responsibility
and action from incident management to crisis management when the situation requires it, with
this order reversed once the crisis is resolved to allow for a complete resolution of the incident;
e) information security vulnerabilities involved with or discovered during the incident are
assessed and dealt with appropriately to prevent or reduce incidents. This assessment can be
done either by the incident response team (IRT) or other teams within the organization and
involved parties, depending on duty distribution;
f) lessons are learnt quickly from information security incidents, related vulnerabilities and
their management. This feedback mechanism is intended to increase the chances of preventing
future information security incidents from occurring, improve the implementation and use of
information security controls, and improve the overall information security incident
management plan.
To help achieve these objectives, organizations should ensure that information security
incidents are documented in a consistent manner, using appropriate standards or procedures for
incident categorization, classification, prioritization and sharing, so that metrics can be derived
from aggregated data over a period of time. This provides valuable information to aid the
strategic decision-making process when investing in information security controls. The
information security incident management system should be able to share information with
relevant internal and external parties.
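Deriving metrics from consistently categorized incident records, as recommended above, can be as simple as counting by category and averaging resolution times. The records and categories below are invented for illustration; a real organization would use its own categorization scheme.

```python
from collections import Counter

# Hypothetical incident records, each documented with a consistent
# category and time-to-resolve, as the text recommends.
incidents = [
    {"category": "malware", "hours_to_resolve": 6},
    {"category": "malware", "hours_to_resolve": 10},
    {"category": "malware", "hours_to_resolve": 8},
    {"category": "phishing", "hours_to_resolve": 2},
    {"category": "data_leak", "hours_to_resolve": 30},
]

# Which category occurs most often? Where should controls investment go?
by_category = Counter(i["category"] for i in incidents)

# How quickly are incidents resolved on average?
mean_resolution = round(
    sum(i["hours_to_resolve"] for i in incidents) / len(incidents), 1)

print(by_category.most_common(1))  # [('malware', 3)]
print(mean_resolution)             # 11.2
```

Aggregates like these, tracked over time, are exactly the metrics that support the strategic decision-making on security controls mentioned above.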
Another objective associated with this document is to provide guidance to organizations that
aim to meet the information security management system (ISMS) requirements specified in
ISO/IEC 27001 which are supported by guidance from ISO/IEC 27002. ISO/IEC 27001
includes requirements related to information security incident management. Table C.1 provides
cross-references on information security incident management clauses from ISO/IEC 27001
and clauses in this document. ISMS relationships are also explained in Figure 2. This document
can also support the requirements of information security management systems that do not
follow ISO/IEC 27001.
Figure 2 — Information security incident management in relation to ISMS and applied controls
4.3 Benefits of a structured approach
Using a structured approach to information security incident management can yield significant
benefits, which can be grouped under the following topics.
a) Improving overall information security
To ensure adequate identification of and response to information security events and incidents,
it is a prerequisite that there be a structured process for planning and preparation, detection,
reporting and assessment, and relevant decision-making. This improves overall security by
helping to quickly identify and implement a consistent solution, and thus provides a means of
preventing similar information security incidents in the future. Furthermore, benefits are gained
by metrics, sharing and aggregation.
The credibility of the organization can be improved by the demonstration of its implementation
of best practices with respect to information security incident management.
b) Reducing adverse business consequences
A structured approach to information security incident management can assist in reducing the
level of potential adverse business consequences associated with information security
incidents. These consequences can include immediate financial loss and longer-term loss
arising from damaged reputation and credibility. For further guidance on consequence
assessment, see ISO/IEC 27005.
For guidance on information and communication technology readiness for business continuity,
see ISO/IEC 27031.
c) Strengthening the focus on information security incident prevention
Using a structured approach to information security incident management helps to create a
better focus on incident prevention within an organization, including the development of
methods to identify new threats and vulnerabilities. Analysis of incident-related data enables
the identification of patterns and trends, thereby facilitating a more accurate focus on incident
prevention and identification of appropriate actions and controls to prevent further occurrence.
d) Improving prioritization
A structured approach to information security incident management provides a solid basis for
prioritization when conducting information security incident investigations, including the use
of effective categorization and classification scales. If there are no clear procedures, there is a
risk that investigation activities may be conducted in an overly reactive mode, responding to
incidents as they occur and overlooking what activities should be handled with a higher
priority.
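ISO/IEC 27035 leaves the actual categorization and classification scales to each organization. As a rough illustration of how such a scale supports prioritization (the matrix values, field names and incident records below are invented for the example, not taken from the standard), a priority rank can be derived from assessed impact and urgency and used to order the investigation queue:

```python
# Illustrative priority matrix: rank 1 = handle first. The levels and
# rankings here are example values an organization would define itself.
PRIORITY_MATRIX = {
    ("high", "high"): 1,
    ("high", "medium"): 2, ("medium", "high"): 2,
    ("high", "low"): 3, ("medium", "medium"): 3, ("low", "high"): 3,
    ("medium", "low"): 4, ("low", "medium"): 4,
    ("low", "low"): 5,
}

def prioritise(impact: str, urgency: str) -> int:
    """Map an incident's assessed impact and urgency to a priority rank."""
    return PRIORITY_MATRIX[(impact, urgency)]

# Hypothetical open incidents
incidents = [
    {"id": "INC-001", "impact": "high", "urgency": "low"},
    {"id": "INC-002", "impact": "high", "urgency": "high"},
    {"id": "INC-003", "impact": "low", "urgency": "medium"},
]

# Investigate in priority order instead of arrival order
for inc in sorted(incidents, key=lambda i: prioritise(i["impact"], i["urgency"])):
    print(inc["id"], "priority", prioritise(inc["impact"], inc["urgency"]))
```

With a fixed matrix like this, two analysts assessing the same incident arrive at the same rank, which is exactly the consistency the structured approach aims for.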
e) Supporting evidence collection and investigation
If and when needed, clear incident investigation procedures help to ensure that data collection
and handling are evidentially sound and legally admissible. These are important considerations
if legal prosecution or disciplinary action follows. For more information on digital evidence
and investigation, see the investigative standards in Annex A.
f) Contributing to budget and resource justifications
A well-defined and structured approach to information security incident management helps to
justify and simplify the allocation of budgets and resources for involved organizational units.

Furthermore, benefit accrues for the information security incident management plan itself, with
the ability to better plan for the allocation of staff and resources.
One example of a way to control and optimize budget and resources is to add time tracking to
information security incident management tasks to facilitate quantitative assessment of the
organization's handling of information security incidents. It can provide information on how
long it takes to resolve information security incidents of different priorities and on different
platforms. If there are bottlenecks in the information security incident management process,
these should also be identifiable.
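The time-tracking idea above can be sketched in a few lines. The record layout, timestamps and priorities below are hypothetical; in practice they would come from the incident register or ticketing system:

```python
from datetime import datetime
from statistics import mean

# Hypothetical incident records with opened/resolved timestamps and priority
incidents = [
    {"id": "INC-101", "priority": 1,
     "opened": datetime(2024, 1, 5, 9, 0), "resolved": datetime(2024, 1, 5, 13, 0)},
    {"id": "INC-102", "priority": 1,
     "opened": datetime(2024, 1, 8, 10, 0), "resolved": datetime(2024, 1, 8, 16, 0)},
    {"id": "INC-103", "priority": 3,
     "opened": datetime(2024, 1, 9, 8, 0), "resolved": datetime(2024, 1, 11, 8, 0)},
]

def mean_time_to_resolve(records, priority):
    """Average resolution time in hours for incidents of a given priority."""
    durations = [
        (r["resolved"] - r["opened"]).total_seconds() / 3600
        for r in records if r["priority"] == priority
    ]
    return mean(durations) if durations else None

print(mean_time_to_resolve(incidents, 1))  # → 5.0 (hours, priority-1 incidents)
print(mean_time_to_resolve(incidents, 3))  # → 48.0
```

A large gap between the averages for different priorities or platforms is exactly the kind of bottleneck signal the paragraph above describes.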
g) Improving updates to information security risk assessment and treatment results
The use of a structured approach to information security incident management facilitates:
• better collection of data for assisting in the identification and determination of the
characteristics of the various threat types and associated vulnerabilities, and
• provision of data about frequencies of occurrence of the identified threat types, to assist
with analysis of control efficacy (i.e. identify controls that failed and resulted in a breach,
with uplift of such controls to reduce reoccurrence).
The data collected about adverse impacts on business operations from information security
incidents is useful in business impact analysis. The data collected to identify the frequency of
various threat types can improve the quality of a threat assessment. Similarly, the data collected
on vulnerabilities can improve the quality of future vulnerability assessments. For guidance on
information security risk assessment and treatment, see ISO/IEC 27005.
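As a small illustration of the data-collection point above, tallying the recorded threat type of each closed incident yields the frequency-of-occurrence data that feeds the next risk assessment. The register entries below are invented for the example:

```python
from collections import Counter

# Hypothetical closed-incident register with a threat type per incident
closed_incidents = [
    {"id": "INC-201", "threat": "phishing"},
    {"id": "INC-202", "threat": "malware"},
    {"id": "INC-203", "threat": "phishing"},
    {"id": "INC-204", "threat": "misconfiguration"},
    {"id": "INC-205", "threat": "phishing"},
]

# Frequency of occurrence per threat type, most frequent first
threat_frequency = Counter(r["threat"] for r in closed_incidents)
for threat, count in threat_frequency.most_common():
    print(f"{threat}: {count}")
```

Even this trivial tally shows which threat types dominate, which is the kind of evidence that improves the quality of a subsequent threat assessment.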
h) Providing enhanced information security awareness and training programme material
A structured approach to information security incident management enables an organization to
collect experience and knowledge of how the organization and involved parties handle
incidents, which is valuable material for an information security awareness programme. An
awareness programme that includes lessons learned from real experience helps to reduce
mistakes or confusion in future information security incident handling and improve potential
response times and general awareness of reporting obligations.
i) Providing input to the information security policy and related documentation reviews
Data provided by the practice of a structured approach to information security incident
management can offer valuable input to reviews of the effectiveness and subsequent
improvement of incident management policies (and other related information security
documents). This applies to topic-specific policies and other documents applicable both for
organization-wide and for individual systems, services and networks.
4.4 Adaptability

The guidance provided by the ISO/IEC 27035 series is extensive and, if adopted in full, can
require significant resources to operate and manage. It is therefore important that an
organization applying this guidance should retain a sense of perspective and ensure that the
resources applied to information security incident management and the complexity of the
mechanisms implemented are proportional to the following:
a) size, structure and business nature of an organization including key critical assets, processes,
and data that should be protected;
b) scope of any information security management system for incident handling;
c) potential risk due to incidents;
d) the goals of the business.
An organization using this document should therefore adopt its guidance in a manner that is
relevant to the scale and characteristics of its business.
4.5 Capability
4.5.1 General
Information security incidents can jeopardize achievement of business objectives and generate
crises.
Following the risk assessment, it is possible to delineate between situations whose likelihood is medium to high and whose consequences are low to medium, and those whose likelihood is (very) rare but whose consequences are very high. The latter represent crises, which cannot always be completely prevented.
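The delineation described above can be sketched as a simple classification rule. The scales and thresholds below are illustrative assumptions, not values from the standard:

```python
# Map qualitative likelihood/consequence levels to ordinal values
SCALE = {"very low": 1, "low": 2, "medium": 3, "high": 4, "very high": 5}

def classify(likelihood: str, consequence: str) -> str:
    """Distinguish routine incidents from rare, very-high-consequence crises."""
    l, c = SCALE[likelihood], SCALE[consequence]
    if c >= 5 and l <= 2:
        # Rare but severe: cannot always be prevented, so plan for
        # crisis management and business continuity instead
        return "potential crisis"
    return "routine incident"  # handled by the normal incident response process

print(classify("medium", "low"))          # → routine incident
print(classify("very low", "very high"))  # → potential crisis
```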
ISO 27035 Model
➢ Infosec incident management – handling infosec incidents in a consistent way
➢ Incident handling – detecting, reporting, assessing, responding to, dealing with and learning from infosec incidents
➢ Infosec investigation – examinations, analysis and interpretation to understand an infosec incident
➢ Incident response – mitigation and resolution of infosec incidents, including actions to protect and restore normal operations
➢ Incident management team (IMT), led by an Incident Manager, responsible for all infosec incident management activities throughout the incident handling lifecycle (the Incident Manager should be close to the CxOs and may also oversee the SOC area)
➢ Incident response team (IRT), led by an Incident Coordinator, for responding to and resolving incidents in a coordinated way; a large organization can have several IRTs
Causes for Infosec Events and Incidents

1. Humans make errors
2. Technology fails
3. Vulnerabilities exist due to the imperfection of controls
4. Risk management falls short: the risk assessment is incomplete, the risk treatment does not cover all risks, or the context changes
Objectives of infosec incident management

1. Infosec events are detected and dealt with, including being classified as infosec incidents

2. Infosec incidents are assessed and responded to effectively (in process and in time)

3. The adverse impact of infosec incidents is minimized by appropriate controls as part of incident response

4. A link with crisis management and business continuity planning (BCP) is established through an escalation process.

5. Infosec vulnerabilities exposed during the incident are assessed and dealt with, to prevent or reduce future incidents.

6. Lessons are learnt quickly from infosec incidents, related vulnerabilities and their
management.
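Objective 4 above, the escalation link to crisis management and BCP, can be sketched as a routing rule. The severity scale, field names and thresholds below are invented for the example, not prescribed by the standard:

```python
# Sketch of an escalation rule linking incident response to crisis
# management / business continuity. Severity 1 = highest.
def escalation_target(severity: int, business_services_down: int) -> str:
    """Decide which capability takes ownership of an incident."""
    if severity == 1 and business_services_down > 0:
        # Very severe and already disrupting the business: escalate
        # beyond incident response into crisis management and BCP
        return "crisis management / BCP activation"
    if severity <= 2:
        return "incident management team (IMT)"
    return "incident response team (IRT)"

print(escalation_target(1, 3))  # → crisis management / BCP activation
print(escalation_target(2, 0))  # → incident management team (IMT)
print(escalation_target(4, 0))  # → incident response team (IRT)
```

The point of codifying the rule, however crude, is that escalation then happens by agreed criteria rather than by ad-hoc judgement under pressure.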

Benefits of structuring Infosec Incident Management

1. Improving overall information security

2. Reducing adverse business consequences

3. Strengthening the focus on information security incident prevention

4. Improving prioritization

5. Supporting evidence collection and investigation

6. Contributing to budget and resource justifications

7. Improving updates to information security risk assessment and treatment results

8. Providing enhanced information security awareness and training programme material

9. Providing input to the information security policy and related documentation reviews

Summary of ISO 27035:

• ISO 27035 provides top-down value for ISO 27001-aligned organizations

• …and most CSIRTs/SOCs should implement an ISMS for their own infosec management

• The standard provides structure, but is a bit short on substance, i.e. it is not prescriptive

• Confusion is introduced by implicit gaps:

1. Lifecycle of incident (handling)

2. PoC or ticketing system?

3. Incident management log vs incident register?

4. Incident (Analysis? Investigation?) report vs Event report (“6.4 Is the response to this event closed?”)

5. Not clear how vulnerabilities (and threats) are reported (not as Event Report)

6. Detection: as soon as possible; Reporting: without unnecessary delay; Response: as soon as possible

7. Detect and alert on anomalous, suspicious, or malicious activities (why not call
“infosec events”?)

ISO 27035 related standards

• 27037 Digital evidence capture


• 27038 Digital redaction
• 27040 Data storage security
• 27041 Incident Investigation method
• 27042 Analysis of digital evidence
• 27043 Incident investigation principles and processes
• 27050 Electronic discovery
• 30121 Digital forensics risk
