AI Uses in Blue Team Security WHPUABT WHP 1221

AI Uses in Blue Team Security
Emerging
Technology © 2021 ISACA. All Rights Reserved.
CONTENTS
Introduction
What Are Machine Learning, Deep Learning and Artificial Intelligence?
  How Deep Learning Differs From Machine Learning
Areas In Cybersecurity Where Machine Learning Helps
  Network Intrusion Detection/Security Information and Event Management (SIEM) Solutions
  Phishing Attack Prevention
  Offensive Cybersecurity Application
  Reconnaissance
  Scanning
  Fuzzing/Exploit Development
Areas in Cybersecurity Where Machine Learning Is Overused
Malicious Use of ML and DL: Social Engineering and Phishing
Conclusion
Acknowledgments

ABSTRACT
It is difficult to keep up with the pace of technology innovation in today’s world. Things
have been changing rapidly, especially in cybersecurity. Many cybersecurity experts feel
they are in the fight of their lives, and the criminal element has recently dealt some heavy
blows. This white paper explores the use of artificial intelligence (AI), machine learning
(ML) and deep learning (DL) applications in cybersecurity to identify what is working, what
is not working, what looks encouraging for the future, and what may be more hype than
substance.
This paper utilizes interviews with some of the engineers behind these technologies,
along with firsthand examination and use of some of the related products. It also includes
the observations of chief information security officers (CISOs) and chief information
officers (CIOs) who weighed in with their take on the effectiveness of certain ML/AI-based
products or enhancements. One conclusion drawn from the interviews and experiments
undertaken for this paper is that marketing tactics often obscure reality when it comes to
new security technology. Unfortunately, some truly great innovations in the area of ML
have been overshadowed by heavily marketed claims of AI magic from some major
industry brands.
The research conducted for this paper supports the basic conclusion that even though
the ML/AI movement does not offer a panacea for solving the greatest cybersecurity
problems, it is one of the bright areas in cybersecurity. Rapid adoption of ML and AI
principles will better equip the professionals who are the most engaged in cybersecurity
defense.

Introduction
Most major cybersecurity product vendors or service providers advertise artificial intelligence (AI) as a key part of their new offerings. This paper explores some of the AI-principled machine learning (ML) and deep learning (DL) technologies that currently exist in the field of cybersecurity and whether they are making an impact. It also considers which cybersecurity problems are most responsive to AI learning technologies and which require mostly manual human involvement. This analysis is important because the market is flooded with AI products that make grand claims of solving all the cybersecurity problems of the world. The marketing noise makes it nearly impossible for business leaders to decide which products, services and solutions deliver real added value. According to MIT Sloan, only about one in 20 companies has extensively incorporated AI into its solutions.[1]

This white paper is vendor-agnostic and the author has no ties with any manufacturers or sellers of these products. Its focus is not to disprove any claims made by vendors but rather to examine whether advertised claims can be confirmed. This paper begins with descriptions of ML and AI. It explains where ML and AI principles work in cybersecurity and where they do not. Finally, it indicates where malicious threat actors are making serious progress in their efforts to harness the power of ML and, eventually, AI.

What Are Machine Learning, Deep Learning and Artificial Intelligence?
It is important to distinguish between ML and AI. Data scientists, computer scientists and research professionals typically view ML as a type of programming that enables automation, which eventually may lead to the realization of some of the more ambitious goals of AI. Following are basic definitions of ML, DL and AI:
• Artificial intelligence (AI): A wide-ranging branch of computer science dedicated to building smart machines capable of performing tasks that typically require human intelligence. AI is fundamentally an effort to make computers think like humans. The term describes machines that mimic cognitive functions such as learning and problem solving.[2] AI is the term generally used to describe ML and DL efforts. As ML and DL algorithms and implementations advance, AI will improve overall.
• Machine learning (ML): A sub-branch of AI that enables computers to learn, adapt and perform desired functions on their own. ML algorithms learn patterns from previous input and results and adjust tasks accordingly.[3] Generally, if it is necessary to tweak a computer program, someone has to recode it. With ML, it is possible for a program to recode or update itself. This ability may be useful for firewalls, intrusion detection systems, and other security appliances and tools, allowing them to adjust code on the fly to adapt to new or emerging threats.
• Deep learning (DL): A subset of ML that processes data and creates patterns for use in decision making. DL techniques enable machines to complete tasks without human intelligence input.[4] DL uses neural networks to emulate the human brain. Neural networks are essentially layers of ML algorithms designed and implemented to emulate the way neurons function in the human brain. The goal is to enable computer code to adapt to problems in a way that closely mimics human processes.

For many years, humans have mastered the art of consuming data, creating data, using data and storing data. The recent addition of ML enables programmatic decision making about data, and the resulting decisions can even predict future things about the data.

ML can predict whether an individual might want to watch a certain video based on what videos the person has watched in the past, and it can even consider how much of each video the person watched. ML can predict whether a specific company’s stock is likely to go up or down. These abilities may seem insignificant because they are in common use today. However, they are groundbreaking when compared to what humans were able to do just 20 years ago. Yet, by some metrics used in the data science field, these technological achievements are not considered true AI. ML is regarded as a set of tools and techniques that could eventually lead to AI, but it is only one set of tools and techniques among many.

An analysis of poison ivy and poison oak provides an example. The goal of the exercise is to distinguish between these two plants. In the ML world, this problem is best solved using a type of ML called classification. To get training data, a botanist could be enlisted to collect samples of poison ivy and poison oak, labeling each based on observed features. Those features would be characteristics that differentiate the plants. For example, poison ivy is shinier than poison oak, and it is a darker shade of green. Poison oak is duller than poison ivy, and it is a lighter shade of green. Figure 1 shows how the plants could easily be sorted based on the features used to classify their leaves.

FIGURE 1: Features Used to Classify Leaves (Poison ivy, Poison oak)

Overlap in the middle is expected, because some individual poison ivy samples might be a little lighter green than most, and some poison oak samples might be a little darker green than most. A confusion matrix[5] is useful to illustrate this overlap. Ideally, the goal is to trim or eliminate the overlap. Further refinement might be possible by adding more differentiating features. For example, poison ivy leaves are typically hairy while poison oak leaves are usually smooth. The addition of a third feature to the algorithm should improve the accuracy of the result, as illustrated in the three-dimensional chart shown in figure 2.

FIGURE 2: Three Dimensions of Leaf Features (Poison ivy, Poison oak)
Source: Oksana Raievska; Getty Images

[1] Ransbotham, S.; D. Kiron; P. Gerbert; M. Reeves; “Reshaping Business With Artificial Intelligence,” MIT Sloan Management Review, 6 September 2017, https://sloanreview.mit.edu/projects/reshaping-business-with-artificial-intelligence/
[2] Beal, V.; “Artificial Intelligence (AI),” Webopedia, 24 May 2021, www.webopedia.com/definitions/ai/
[3] Roy, S.; “Machine Learning,” Webopedia, 1 September 2021, www.webopedia.com/definitions/machine-learning/
[4] Beal, V.; “Deep Learning,” Webopedia, 24 May 2021, www.webopedia.com/definitions/deep-learning/
[5] A confusion matrix is a table that is used to describe the performance of a classification model on a set of test data for which the true values are known and allows visualization of the performance of an algorithm.
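
The leaf exercise can be sketched in a few lines of code. The following is a minimal illustration, not the method behind the figures: it assumes a nearest-centroid classifier, and every feature value (shininess, green darkness and hairiness on a 0 to 1 scale) is invented for the example. It builds a confusion matrix first with two features and then with three, to show how the third feature can remove the overlap.

```python
from collections import Counter

# Hypothetical samples: (shininess, green darkness, hairiness) on a 0-1 scale.
# The values are invented for illustration and are not botanical measurements.
TRAIN = [
    ((0.9, 0.8, 0.7), "ivy"), ((0.8, 0.9, 0.8), "ivy"), ((0.7, 0.7, 0.9), "ivy"),
    ((0.3, 0.3, 0.1), "oak"), ((0.2, 0.4, 0.2), "oak"), ((0.4, 0.2, 0.1), "oak"),
]
TEST = [
    ((0.85, 0.75, 0.8), "ivy"),
    ((0.50, 0.45, 0.9), "ivy"),  # dull, light ivy leaf: overlaps oak on two features
    ((0.25, 0.35, 0.1), "oak"),
    ((0.55, 0.60, 0.2), "oak"),  # shiny, dark oak leaf: overlaps ivy on two features
]


def train(samples, dims):
    """Nearest-centroid training: average the chosen features for each label."""
    sums, counts = {}, Counter()
    for features, label in samples:
        totals = sums.setdefault(label, [0.0] * len(dims))
        for i, d in enumerate(dims):
            totals[i] += features[d]
        counts[label] += 1
    return {label: [t / counts[label] for t in totals] for label, totals in sums.items()}


def predict(model, features, dims):
    """Return the label whose centroid is closest to the sample."""
    def squared_distance(label):
        return sum((features[d] - c) ** 2 for d, c in zip(dims, model[label]))
    return min(model, key=squared_distance)


def confusion_matrix(train_set, test_set, dims):
    """Count (actual label, predicted label) pairs over the test set."""
    model = train(train_set, dims)
    return Counter((actual, predict(model, f, dims)) for f, actual in test_set)


two_features = confusion_matrix(TRAIN, TEST, dims=(0, 1))
three_features = confusion_matrix(TRAIN, TEST, dims=(0, 1, 2))
for name, matrix in (("two features", two_features), ("three features", three_features)):
    print(name, dict(matrix))
```

With these made-up numbers, the two-feature matrix contains one misclassified leaf of each kind, and the three-feature matrix contains none. Real data would rarely separate this cleanly.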

This is an example of the types of jobs ML is ideally suited to perform. However, the more features added, the more complex it becomes to illustrate the results. A comparison of 10,000 features would be very difficult to visualize, but it would make for some very powerful ML. Of course, it would also require more computing power. Now that cloud computing resources are available on demand, it is possible to see some visualizations that were economically not feasible before.

What is happening in the industry is that a lot of software vendors claim AI and ML capabilities when their products in fact use advanced versions of IF, AND and OR logical functions or other basic types of programmatic problem-solving. Of the 13 engineers who commented for this paper, all of whom were working on well-known products that make ML and AI claims, none felt that the marketing associated with the products they were working on was 100 percent accurate with respect to advertised capabilities. Yet, in the same breath, the engineers were optimistic about the direction they were going and the technologies they were going to be creating as they relate to ML and DL. The engineers provided excellent insight into the capabilities that are reportedly on the way. However, some of the better solutions under consideration still lack the ability to scale at a reasonable cost.

The gradual adoption of ML has yielded some good results outside the world of cybersecurity. For example, Mt. Sinai Hospital in New York City developed ML algorithms for use in its Deep Patient platform. The hospital loaded Deep Patient with the medical records of every person who has used the facility in the last 20 years. The vast amount of data allowed the platform to predict the likelihood of certain diseases, such as Alzheimer’s, a full two to three years faster than other systems currently in use in the medical community. Though the field of cybersecurity has yet to experience a similar amazing breakthrough, there are some promising bright spots.

How Deep Learning Differs From Machine Learning
From a practical standpoint, DL is a subset of ML, often viewed as a progressive evolution of ML. In other words, ML is a general practice and DL is a specialty within that practice.

Basic ML models often become progressively better at whatever it is they are designed and engineered to do, but they need significant guidance from developers and engineers. If an ML algorithm should return an inaccurate prediction, then a developer would have to intervene and make the proper changes to the code and algorithms to get the results within the desired tolerance. With a DL model, the algorithms can determine whether a resultant prediction is sufficient or not. To do so, they typically use neural networks, which can be thought of as layers of ML algorithms.

For example, upon returning home, a person who wanted to hear some mood music could tell a smart speaker to “play something jazzy.” Based on that instruction, the smart speaker could use ML to review the user’s previous jazz music selections. It might factor in things like time of day to decide which song to play. This is an example of traditional ML.

However, a more robust ML tool could go a lot further to make a good music choice. For instance, it could use an integrated gaming console camera to see that the person is wearing dressy clothes (suggestive of a night on the town), holding a bottle of wine and accompanied by a date. The smart speaker might use those deep layers of information, plus the fact that it is 9 pm, to determine that a Barry White song would suit the moment. This scenario is an example of taking advantage of neural networks and using layers of algorithms instead of a flat plane of algorithms, a step further toward AI.

DL models are built to analyze data in a way that resembles human thought processes. To accomplish this, DL applications use layers of algorithms called artificial neural networks. These artificial neural networks are modeled after the often-studied biological neural networks in human brains. The idea is to develop models with intelligence that exceeds that of traditional non-DL ML.

It is very difficult to ensure that DL models draw correct conclusions and do not make mistakes, but when successfully implemented, DL can be truly groundbreaking. It is currently regarded as the main pathway to eventual true AI.

There are some innovative products in the market that will take ML and DL to new levels, and these approaches need further development. For example, some vendors are using ML and DL to discover patterns in packets that previously were impossible to detect, due to the sheer volume of packets and limitations with packet retention and data retention in general.

Unfortunately, cybercriminals are using ML and DL as well, and it appears they might be outpacing the cyberdefenders when it comes to developing and employing new technologies.
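
The contrast between layers of algorithms and a flat plane of algorithms, drawn earlier in this section, can be made concrete with a toy case. The sketch below is an illustration under stated assumptions, not a DL system: the weights are hand-picked rather than learned, and the task (exclusive OR) is chosen only because it is a standard example that a single threshold unit cannot reproduce, while two stacked layers of the same units can.

```python
from itertools import product


def step(x):
    """Threshold activation: the unit fires (1) only for positive input."""
    return 1 if x > 0 else 0


def unit(inputs, weights, bias):
    """One simple unit: weighted sum of the inputs followed by a threshold."""
    return step(sum(i * w for i, w in zip(inputs, weights)) + bias)


def layered_xor(a, b):
    """Two layers of units. The weights are hand-picked for illustration."""
    h_or = unit((a, b), (1, 1), -0.5)    # hidden unit 1: fires if a OR b
    h_and = unit((a, b), (1, 1), -1.5)   # hidden unit 2: fires if a AND b
    return unit((h_or, h_and), (1, -1), -0.5)  # output: OR but not AND


CASES = [(0, 0), (0, 1), (1, 0), (1, 1)]
XOR = [0, 1, 1, 0]


def single_unit_matches_xor():
    """Try a grid of weights and biases for one flat unit; none reproduces XOR."""
    grid = [x / 2 for x in range(-6, 7)]
    for w1, w2, bias in product(grid, repeat=3):
        if [unit(c, (w1, w2), bias) for c in CASES] == XOR:
            return True
    return False


print([layered_xor(a, b) for a, b in CASES])
print(single_unit_matches_xor())
```

In a real DL model the weights would be learned from data and there would be many more units and layers. The point here is only that stacking simple units lets the model represent patterns a single unit cannot.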

Areas In Cybersecurity Where Machine Learning Helps

Network Intrusion Detection/Security Information and Event Management (SIEM) Solutions
Network intrusion detection is one of the main tools of the security operations center (SOC). One of the key features of an intrusion detection system (IDS) is the ability to alert staff to anomalies, that is, things that should not be entering the environment.

Keeping an IDS up to date in terms of what it looks for and how it looks for it is still a somewhat manual and time-consuming process. Subject matter experts (SMEs) may create new rules that eventually are pushed down to in-place IDS devices.

There is some obvious ML applicability to this process. IDSs primarily use one of the two following methods:
• Signature-based intrusion detection: The system examines each incoming packet using a set of rules based on known attack patterns, or signatures. Detecting new attacks is difficult when the process is based only on static and manually created signatures.
• Anomaly-based intrusion detection: The system uses statistics to form a baseline of what normal network use looks like at different time intervals. This technique was introduced in response to zero-day attacks, that is, previously unknown attacks or exploits. Generally, the system uses ML to create models that represent normal activity. It then identifies anomalies, that is, behaviors that do not match the previously created models. Nearly every modern anomaly-based IDS either uses or is working toward using this technique.

Both of these methods are being enhanced and reimagined with ML capabilities in several solutions in the market today. Manual processes are being replaced by automated processes with the help of ML. Rules can be rewritten on the fly by ML-driven services, devices and applications. There are essentially two types of tasks carried out in ML settings: supervised and unsupervised. The following definitions summarize the main characteristics of each:
• Supervised ML algorithms apply what has been learned in the past to predict future events using labeled examples. The algorithm analyzes a known set of examples, called the training dataset, and infers a function to make predictions about output values. After sufficient training, the system can provide targets for new inputs. The machine is then equipped with a new set of examples so the supervised learning algorithm can analyze the training data and produce a correct outcome from labeled data.[6] The classification of poison ivy and poison oak is an exercise that exemplifies supervised ML (figure 1 and figure 2).
• Unsupervised machine learning algorithms study how systems can infer a function to describe a hidden structure from unlabeled data. These algorithms are used when training information is neither marked nor classified. The task of the machine is to group unsorted information according to patterns, similarities and differences without any prior training data.[7]

Algorithms are often grouped into classes according to the similarity of their functions. Figure 3 shows some common algorithm groupings, but is by no means an exhaustive list of all ML algorithms; it is just a sample of some commonly used types. IDSs that make use of these techniques provide some of the best examples of how ML is being successfully applied to solve real problems. From a cyberdefense perspective, some of these algorithms work best when used together.

SIEM solutions employ these techniques with great success. Some SIEM providers even offer ML kits that allow customers to pick, test and choose their combinations of models, algorithms and techniques based on what works for their enterprise, and considering the types of data they want to extract and the types of visualizations they seek. This is a step in the direction of ML truly evolving into a tool that everyone can use instead of remaining a specialized tool that only a few elite data scientists and enterprises employing data scientists can use.
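
The two IDS methods described above can be contrasted in a short sketch. It is a simplified illustration, not how any particular IDS works: the signature strings, the hourly connection counts and the three-standard-deviation threshold are all assumptions made for the example, and a plain statistical baseline stands in for the ML models a production system would use.

```python
from statistics import mean, stdev

# Signature-based detection: static, manually written patterns (hypothetical examples).
SIGNATURES = ("/etc/passwd", "' OR '1'='1")


def matches_signature(payload):
    """Flag a payload only if it contains a known attack pattern."""
    return any(signature in payload for signature in SIGNATURES)


def build_baseline(history):
    """Anomaly-based detection, step 1: learn what is normal for each hour of day.

    history maps an hour (0-23) to connection counts observed at that hour.
    """
    return {hour: (mean(counts), stdev(counts)) for hour, counts in history.items()}


def is_anomalous(baseline, hour, observed, threshold=3.0):
    """Step 2: flag observations far from the baseline for that time interval."""
    mu, sigma = baseline[hour]
    if sigma == 0:
        return observed != mu
    return abs(observed - mu) / sigma > threshold


# Hypothetical history: quiet at 03:00, busy at 14:00.
history = {3: [40, 42, 38, 41, 39], 14: [900, 950, 870, 920, 910]}
baseline = build_baseline(history)

print(matches_signature("GET /../../etc/passwd"))  # known pattern
print(matches_signature("GET /index.html"))        # nothing known
print(is_anomalous(baseline, 14, 905))             # normal for mid-afternoon
print(is_anomalous(baseline, 3, 905))              # same volume at 03:00
```

The same traffic volume is treated as normal at one hour and anomalous at another, which is why the baseline is kept per time interval. A previously unseen attack would slip past the signature check but could still stand out against the baseline.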

FIGURE 3: Common Machine Learning Algorithm Classes

ML Algorithm Class | Purpose | Cybersecurity Application
Classification/supervised | Used to train on datasets of previous observations, and tries to apply what is learned to new data. This requires classifying data with different labels. It works with many kinds of input, including text, images and video. | Used to determine if an executable or other type of file, such as PDF, is malicious or safe, following use of training data obtained from an SME’s previous identification of malicious and safe files
Regression/unsupervised | Often used to identify correlations across different datasets and to understand their relationships, which might otherwise be hidden or unnoticed | Used in some security information and event management (SIEM) solutions to establish data relationships across different log sources. It can be used to compare predicted application programming interface (API) calls from a process to previous legitimate calls to identify anomalies.
Clustering/unsupervised | Works directly on new data without considering previous data, examples or training. Clustering is primarily used to identify commonalities between different artifacts and group them based on their common features. | Clustering can be used to analyze traffic and look for common patterns. For example, malware operating internally on multiple computers might exfiltrate data in a specific way, e.g., with common packet size or frequency. Even encrypted traffic might be identified as exfiltration traffic based on common features.
Stacking/semi-unsupervised | Usually used after clustering has been performed to further segment resulting clusters | Stacking might identify and group traffic according to features such as destination, e.g., a specific Internet Protocol (IP) address or domain. Traffic that may look extremely similar to exfiltration traffic may be grouped as benign due to its going to a specific location.
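
The clustering row in figure 3 can be illustrated with a small sketch. It assumes a single-pass, threshold-based method (often called leader clustering) rather than whatever algorithm a given product uses, and the hosts, packet sizes, send intervals and 10 percent tolerance are all hypothetical. Only traffic metadata is used, so the approach would apply equally to encrypted payloads.

```python
# Hypothetical per-host traffic summaries: (host, average packet size in bytes,
# average seconds between outbound sends). No payload contents are inspected.
FLOWS = [
    ("ws-01", 1400, 0.2),
    ("ws-02", 512, 60.0),
    ("ws-03", 300, 5.0),
    ("ws-07", 512, 61.0),
    ("ws-09", 520, 59.0),
    ("ws-04", 1350, 0.9),
]


def relative_gap(a, b):
    """Difference between two positive values as a fraction of the larger one."""
    return abs(a - b) / max(a, b)


def leader_clusters(flows, tolerance=0.10):
    """Single-pass clustering: join the first cluster whose leader is similar
    on every feature, otherwise start a new cluster with this flow as leader."""
    clusters = []
    for host, size, interval in flows:
        for cluster in clusters:
            _, lead_size, lead_interval = cluster[0]
            if (relative_gap(size, lead_size) <= tolerance
                    and relative_gap(interval, lead_interval) <= tolerance):
                cluster.append((host, size, interval))
                break
        else:
            clusters.append([(host, size, interval)])
    return clusters


clusters = leader_clusters(FLOWS)
# Several unrelated hosts sending same-sized data on the same schedule is a
# pattern worth a closer look by an analyst.
shared_behavior = [[host for host, _, _ in c] for c in clusters if len(c) >= 3]
print(shared_behavior)
```

A stacking step, as described in the last row of the table, could then split each resulting group further, for example by destination address, to separate benign scheduled traffic from possible exfiltration.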

[6] Cuelogic, “Evaluation of Machine Learning Algorithms for Intrusion Detection System,” 10 May 2019, www.cuelogic.com/blog/machine-learning-algorithms-for-intrusion-detection-systems
[7] Ibid.


Phishing Attack Prevention
Natural language processing (NLP) is a type of ML that aims primarily to give computers the ability to understand, analyze and potentially generate human language in the form of text or audio. As NLP has advanced, it has become a prominent solution in cybersecurity operations, e.g., to determine if communications are from a human or a machine. When products on the market are advertised as having ML antiphishing capabilities, that typically means they use NLP.

Thanks to a large amount of historical training data, NLP has been adopted faster than some other subsets of ML. There are bots that pretend to be human, automated call center systems that pretend to be human, and countless Completely Automated Public Turing tests to tell Computers and Humans Apart (CAPTCHAs) that demand users prove they are human. NLP techniques are also used to decipher the tone of emails and text messages, and they can extract detailed information to evaluate whether they are phishing attempts or legitimate correspondence.

A host of NLP algorithms, including the bag-of-words model,[8] term frequency-inverse document frequency (tf-idf)[9] and stemming,[10] make it possible to combine many different human languages and interactions in effective NLP models. For example, the bag-of-words model is a method of feature extraction that pulls out what is useful in text. It starts with a baseline of known words and determines a measure of presence of those known words. It then converts the text to numbers or, typically, vectors of numbers.

[8] Brownlee, J.; “A Gentle Introduction to the Bag-of-Words Model,” Machine Learning Mastery, 9 October 2017, https://machinelearningmastery.com/gentle-introduction-bag-words-model/
[9] Luvsandorj, Z.; “Introduction to NLP - Part 3: TF-IDF explained,” Towards Data Science, 6 June 2020, https://towardsdatascience.com/introduction-to-nlp-part-3-tf-idf-explained-cedb1fc1f7dc
[10] Srinidhi, S.; “Stemming of words in Natural Language Processing, what is it?”, Towards Data Science, 19 February 2020, https://towardsdatascience.com/stemming-of-words-in-natural-language-processing-what-is-it-41a33e8996e2

Offensive Cybersecurity Application
Penetration testing is another area of cybersecurity where ML is being applied to good effect. Penetration testing is generally performed in five phases, as follows:
• Reconnaissance: Can be passive or active. An example of passive reconnaissance is using the vast pool of information that is Google® to find out about a target enterprise or an individual within the enterprise. An example of active reconnaissance is interacting with the target’s website to see what is on it.
• Scanning: Actively running network discovery scans, or probes, to see if devices are reachable; port scans to see what services are reachable on those devices; service identification scans to identify what versions of services are there; and vulnerability scans to find out what the services are vulnerable to
• Gaining access/exploitation: Launching exploits to gain access
• Privilege escalation and horizontal movement: Elevating privileges to a higher level
• Covering tracks: Manipulating logs and taking other steps to remove traces of the attacker’s presence and activity in the environment

Reconnaissance
Many of the tools that perform passive reconnaissance currently limit the information coming back from Google and other sources simply because the amount is considered too great to process. Filtering all the results from Google, which often number in the millions, typically requires too much processing for effective targeting. ML excels at solving this type of problem. In some cases, tools must be opened up to prevent them from limiting the number of results returned from Google. Sometimes a paid API is useful. There are several commercial and open source tools designed to ease some of the work of passive reconnaissance, including Maltego™, DNSrecon, Recon-ng and the Google Hacking Database (GHDB). However, they are all mostly limited by buffers.

Google, Facebook® and many others already have algorithms designed to learn from all these data, primarily for marketing purposes. It is possible to get access to the data through a paid subscription to aid passive reconnaissance. Tools designed to take massive amounts of data from search results and quickly pull out what is useful are greatly enhanced when they incorporate ML.

One of the biggest challenges in penetration testing, in general, is being able to make use of all the data pulled out with various tools. The auto-limits some tools set on the amount of information they provide are meant to make the job more manageable. With ML, those limits can be lifted, leading to better results and more fidelity in passive reconnaissance efforts.

Scanning
One of the most important parts of offensive operations is scanning to identify services and vulnerabilities. A lot of ML effort has gone into weaponizing tools for this purpose. Some of the more interesting ML-driven tools presented at recent industry conferences serve to automate things. One, for example, scans a massive number of web pages, uses classification algorithms to look at screenshots of the pages, then applies cluster analysis and eventually stacks the pages based on which are more likely to contain exploitable vulnerabilities. This technology has not yet been perfected, but the presentation and this author’s experimentation with the tool suggest it is close to readiness.

Fuzzing/Exploit Development
The application of machine learning to fuzzing and exploit development is straightforward. Fuzzing is the process of bombarding an application with massive amounts of various combinations of data to cause it to crash or behave in some other unexpected way. The goal is to make it respond in a way the creator of the application did not intend. For example, an intruder who does not have access to an application’s source code might employ the following process to find a buffer overflow vulnerability:
1. Find out how the application takes input.
2. Develop a fuzzer based on the limitations discovered.
3. Fine-tune the fuzzer to send massive combinations of input to the application until it does something unexpected, such as fault or crash.
4. Examine the cause of the fault or crash, then duplicate it with a proof of concept.
5. Work the proof of concept to advance from getting random data into the overflowed memory space to getting actual code into it.

Essentially, once code is inserted into memory and executed where part of the vulnerable program is supposed to execute, it constitutes an exploit. Steps 3 through 5 are very time-consuming and tedious because there are nearly unlimited combinations of letters, numbers and special characters up to any given length. The process of generating and sending random strings, analyzing the results, and then deciding what to send next based on those results is an ideal task for unsupervised ML. From the initial fuzzing process, it might not be evident what anomalies in the responses would look like. However, the right ML algorithm could figure out what human eyes would never be able to see.

It would be possible to combine the fuzzing efforts of an entire community in which everyone uses unsupervised ML to collect anomalies. Then those anomalies could go into an unsupervised clustering algorithm, and next into a stacking algorithm set. Finally, a host of SMEs could study the resulting clusters, decide what they mean, and then feed them into a supervised classification ML model. The resulting training data could be pushed down for individual use. Essentially, each end user’s node would be made smarter through the collective via unsupervised ML.

The most interesting research often involves mixing and matching the order of placement of each type of ML algorithm. The usually clear lines between learning and predicting become a little more blurred, which in some ways is closer to true AI.
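
The generate, send, analyze and decide loop in the fuzzing steps above can be sketched against a harmless in-process target. Everything in this example is hypothetical: `toy_parser` is an invented function with a deliberate 64-character limit, and a simple hand-written feedback rule stands in for the ML component that the text envisions. It is meant only to show the shape of the loop, not a real fuzzer.

```python
import random
import string


class SimulatedCrash(Exception):
    """Stands in for a fault or crash in the target application."""


def toy_parser(data):
    """Hypothetical target with a deliberate flaw: a fixed 64-character buffer."""
    if len(data) > 64:
        raise SimulatedCrash("input overran the 64-character buffer")
    return "long" if len(data) > 16 else "short"


def fuzz(target, max_rounds=50, seed=0):
    """Send inputs, watch the responses, and let each result shape the next input."""
    rng = random.Random(seed)
    alphabet = string.ascii_letters + string.digits + string.punctuation
    candidate, growth, last_response = "A", 1, None
    for round_no in range(1, max_rounds + 1):
        try:
            response = target(candidate)               # send
        except SimulatedCrash as crash:                # unexpected behavior found
            return {"round": round_no, "input": candidate,
                    "length": len(candidate), "error": str(crash)}
        # Analyze the result and decide what to send next.
        if response == last_response:
            growth *= 2   # nothing new observed: push harder
        else:
            growth = 1    # behavior changed: probe this region more finely
        last_response = response
        candidate += "".join(rng.choice(alphabet) for _ in range(growth))
    return None


result = fuzz(toy_parser)
print(result["round"], result["length"], result["error"])

# Step 4 of the process: confirm that the crashing input reproduces the fault.
try:
    toy_parser(result["input"])
    reproduced = False
except SimulatedCrash:
    reproduced = True
print(reproduced)
```

In the scenario the text describes, the hand-written rule would be replaced by an unsupervised model that learns which responses count as anomalous, since that is rarely obvious in advance.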


Areas in Cybersecurity Where Machine Learning Is Overused
In nearly every product group observed for the research underlying this paper, instances of overusing ML terminology and techniques were found. Some developers are throwing ML at problems that do not require it.

One demonstration claimed the product looked at more than 4,000 data points in a network packet to determine whether it was malicious. Running Snort®,[11] an open source IDS, this author wrote two rules to look for two of those data points. Surprisingly, the Snort IDS rules were as effective as the 4,000-point AI/ML/DL super IDS that was the subject of the demo. This was a classic example of overuse. Even though only two of the 4,000 data points were examined, the packet could be determined malicious with fair certainty. It was not necessary to examine the other 3,998 points to determine that the packet was malicious. The product was using ML to solve a problem that already had been solved. That would be acceptable if ML could solve the problem in a significantly better way. However, in this case, the touted product was an exponentially more expensive solution that provided no measurable advantage over the much cheaper IF, AND logic of the Snort solution.

When dealing with certain other matters, such as the challenges presented by attackers using encrypted exfiltration channels, ML solutions have been ineffective. Network-based IDSs and data loss prevention (DLP) systems inspect the contents of packets to see what is inside, then decide whether the data inside the packet should be going to wherever they are going. If all the exfiltrated packet payloads are encrypted using symmetric encryption, then the process of decrypting the traffic on the fly is very impractical, and nearly impossible in many cases.

The use of ML and AI in cybersecurity often rests on the ability of either a network-based IDS or a SIEM solution to at least be able to see the traffic. Most SIEMs are based on logs and alerts. For network IDSs and other security appliances to alert on malicious packets, they must be able to see what is in the packets, and encrypted packets must first be decrypted before their contents can be seen. This is a problem that is as old as intrusion detection itself and it is not close to being solved by ML currently.
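
The IF, AND logic referred to in the Snort comparison above amounts to very little code. The sketch below is not Snort rule syntax and does not reproduce the author's rules, which the paper does not list. The two conditions (a destination port and a payload string) and the `Packet` fields are hypothetical stand-ins chosen to show how small a two-data-point check is.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    """Hypothetical, heavily simplified packet. A real packet exposes far more
    attributes; the demonstrated product reportedly examined more than 4,000."""
    dst_port: int
    payload: bytes


def is_malicious(packet):
    """IF condition one AND condition two, then flag the packet.

    Both conditions are invented for illustration.
    """
    return packet.dst_port == 4444 and b"cmd.exe" in packet.payload


print(is_malicious(Packet(4444, b"run cmd.exe /c whoami")))  # both conditions met
print(is_malicious(Packet(443, b"run cmd.exe /c whoami")))   # only one condition met
```

When two well-chosen conditions already separate malicious from benign traffic, adding thousands more inputs raises cost without improving the outcome, which is the point of the comparison.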

Malicious Use of ML and DL: Social Engineering and Phishing
Early social engineering and phishing attacks were very rudimentary and easy to detect. As soon as the criminal world saw how successful even basic attacks were, social engineering and phishing took their places among the most successful ways to compromise an enterprise, and they remain on top today.

NLP is one of the most effective techniques available to fight phishing emails. However, threat actors are also aware of ML and AI capabilities.

[11] Snort, www.snort.org/

By combining NLP with other antiphishing algorithms used by security professionals, cybercriminals have been able to engineer and deploy devastating phishing campaigns. In these attacks, the model learns what works and what does not on a per-second basis as it blasts out large numbers of emails. The combination of NLP algorithms with others, such as clustering, allows these phishing campaigns to automatically get smarter with every email that a security product catches and every link that a user does not click on. The threat actors using NLP to build phishing campaigns are almost guaranteed success. For that reason, using machine learning to defend against phishing attacks might seem like overkill, but given the types of innovations the threat actors are coming up with on the opposite side, it is clear that implementing NLP in antiphishing defenses is necessary just to be in the fight.
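To make the idea of NLP in antiphishing defenses concrete, the following is a minimal sketch of a bag-of-words naive Bayes text classifier written with only the Python standard library. It is an assumption-laden illustration, not a description of any product discussed in this paper: the class name, the labels and the six training messages are all made up, and a real defense would train on large labeled corpora and use far richer features.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesTextClassifier:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of token counts
        self.doc_counts = Counter()  # label -> number of training messages
        self.vocab = set()

    def fit(self, samples):
        for text, label in samples:
            tokens = tokenize(text)
            self.doc_counts[label] += 1
            self.word_counts.setdefault(label, Counter()).update(tokens)
            self.vocab.update(tokens)

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.word_counts.items():
            # Log prior plus summed log likelihoods of each token.
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(counts.values()) + len(self.vocab)
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy training data, invented for this sketch.
TRAINING = [
    ("urgent verify your account password now", "phishing"),
    ("your account is suspended click this link to verify", "phishing"),
    ("confirm your password immediately to avoid suspension", "phishing"),
    ("meeting notes attached for the quarterly review", "legitimate"),
    ("lunch on friday with the project team", "legitimate"),
    ("please review the attached project schedule", "legitimate"),
]

model = NaiveBayesTextClassifier()
model.fit(TRAINING)
print(model.predict("please verify your password now"))
```

The same feedback loop that the attackers exploit applies on the defensive side: each newly labeled message can be added to the training set so that the word statistics keep pace with the campaigns.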
Conclusion
One of the major challenges concerning the use of ML, DL and AI in the cybersecurity industry is overuse of the terminology. With so many vendors and service providers embracing the terms, progress has been hindered in some ways. The true innovation underlying some of the more useful implementations has been lost amid the hype.

The meaning of AI has become so blurred that 10 different sources might offer 10 different definitions. However, the consensus among those educated in the field appears to be that no product or vendor has incorporated true AI just yet. Some interesting solutions have emerged or been reimagined with the use of ML and DL algorithms, but it is critical to pay close attention to marketing claims to determine if a product offers true innovation or fails to match its hype.

One vendor claims its product analyzes billions of decision points to decide if a file is malicious. Is it necessary to examine billions of decision points for that? Probably not. This is another case of an enterprise using ML algorithms to solve problems that are solved just as easily with much simpler basic logic and elementary coding.

There are certainly some interesting potential applications for using ML, DL and AI in offensive security activities such as penetration testing. Protocol and application fuzzing, scanning and reconnaissance are among the techniques and operations conducive to ML- and AI-principled solutions. There is real promise in the areas of endpoint detection and response (EDR), extended detection and response across multiple security controls (XDR), and SIEM solutions.

One of the biggest problems with IDSs and alerts comes down to storage and how much can be stored for how long. ML enjoyed a resurgence in popularity a few years ago mainly because of the novel ways it was used to analyze the massive amounts of data humans create, use and store daily on the Internet. The downside is that the black hats have been making tremendous headway in adapting ML and AI principles for their activities. Several known advanced persistent threat (APT) groups have used ML to carry out devastating phishing attacks or surgically effective ransomware attacks.

In the history of human inventions, the visionaries who first became inspired to learn how to fly made machines that had flapping wings that imitated birds. The concepts of lift and drag and other principles of science and engineering were not understood as well then as they are now.

It took many years for inventors to advance from flapping-wing machines to the types of jet planes that passengers routinely fly on these days. It may be that ML and AI are currently in the flapping-wing phase of development and true understanding and harnessing of these technologies is a long way off. That said, with the massive computing resources available in the cloud and recent advances in quantum computing, these technologies could come together in just a few decades. It is exciting to see that even if this game is essentially at day zero, there are already signs of considerable progress. What is to come certainly will be much greater than what has been imagined so far, and it will be exhilarating to go along for the ride.

Acknowledgments
ISACA would like to acknowledge:
Lead Developer

Keatron Evans
Security Researcher
USA

Expert Reviewers

Blake Curtis
CISA, CISM, GGEIT, CRISC, CDPSE, CISSP
USA

Adham Etoom
CRISC, CISM, FAIR, GCIH, PMP
Jordan

Joshua Scarpino
CISM, CISSP
USA

Wickey (Jiewen) Wang
CISA
IT Audit Sr. Manager, Visa, Inc.
USA

Larry G. Wlosinski
CISA, CISM, CRISC, CDPSE, CAP, CBCP, CCSP, CDP, CIPM, CISSP, ITIL v3, PMP
Coalfire-Federal
USA

Board of Directors

Gregory Touhill, Chair
CISM, CISSP
Director, CERT Division of Carnegie Mellon University’s Software Engineering Institute, USA

Pamela Nigro, Vice-Chair
CISA, CGEIT, CRISC, CDPSE, CRMA
Vice President–Information Technology, Security Officer, Home Access Health, USA

John De Santis
Former Chairman and Chief Executive Officer, HyTrust, Inc., USA

Niel Harper
CISA, CRISC, CDPSE
Chief Information Security Officer, UNOPS, Denmark

Gabriela Hernandez-Cardoso
Independent Board Member, Mexico

Maureen O’Connell
Board Chair, Acacia Research (NASDAQ), Former Chief Financial Officer and Chief Administration Officer, Scholastic, Inc., USA

Veronica Rose
CISA, CDPSE
Founder, Encrypt Africa, Kenya

David Samuelson
Chief Executive Officer, ISACA, USA

Gerrard Schmid
President and Chief Executive Officer, Diebold Nixdorf, USA

Asaf Weisberg
CISA, CISM, CGEIT, CRISC
Chief Executive Officer, introSight Ltd., Israel

Tracey Dedrick
ISACA Board Chair, 2020-2021
Former Chief Risk Officer, Hudson City Bancorp, USA

Brennan P. Baybeck
CISA, CISM, CRISC, CISSP
ISACA Board Chair, 2019-2020
Vice President and Chief Information Security Officer for Customer Services, Oracle Corporation, USA

Rob Clyde
CISM
ISACA Board Chair, 2018-2019
Independent Director, Titus, and Executive Chair, White Cloud Security, USA

About ISACA
For more than 50 years, ISACA® (www.isaca.org) has advanced the best talent, expertise and learning in technology. ISACA equips individuals with knowledge, credentials, education and community to progress their careers and transform their organizations, and enables enterprises to train and build quality teams that effectively drive IT audit, risk management and security priorities forward. ISACA is a global professional association and learning organization that leverages the expertise of more than 150,000 members who work in information security, governance, assurance, risk and privacy to drive innovation through technology. It has a presence in 188 countries, including more than 220 chapters worldwide. In 2020, ISACA launched One In Tech, a philanthropic foundation that supports IT education and career pathways for under-resourced, under-represented populations.

1700 E. Golf Road, Suite 400
Schaumburg, IL 60173, USA
Phone: +1.847.660.5505
Fax: +1.847.253.1755
Support: support.isaca.org
Website: www.isaca.org
Provide Feedback: www.isaca.org/ai-blue-team-security
Participate in the ISACA Online Forums: https://engage.isaca.org/onlineforums
Twitter: www.twitter.com/ISACANews
LinkedIn: www.linkedin.com/company/isaca
Facebook: www.facebook.com/ISACAGlobal
Instagram: www.instagram.com/isacanews/

DISCLAIMER
ISACA has designed and created AI Uses in Blue Team Security (the “Work”) primarily as an educational resource for professionals. ISACA makes no claim that use of any of the Work will assure a successful outcome. The Work should not be considered inclusive of all proper information, procedures and tests or exclusive of other information, procedures and tests that are reasonably directed to obtaining the same results. In determining the propriety of any specific information, procedure or test, professionals should apply their own professional judgment to the specific circumstances presented by the particular systems or information technology environment.

RESERVATION OF RIGHTS
© 2021 ISACA. All rights reserved.