
WHITE PAPER

How AI and Machine Learning Improve Enterprise Cybersecurity
Connecting all of the Dots in a Complex Threat Landscape
A recent study [1] by Information Risk Management points out that 86 percent of enterprises will be impacted by AI (Artificial Intelligence) in the next five years. One of those impacts will be in cybersecurity defenses. As the volume of cyberattacks grows, security analysts have become overwhelmed. With more security-related data collected and more tools in the market, more alerts are generated every day, causing what is known as the “alert fatigue” problem.

On the other hand, the industry lacks enough trained security analysts to process such alerts. AI and its component, Machine Learning (ML), can help automate laborious tasks like threat hunting, threat evaluation, and threat response, thereby relieving the burden on analysts who are awash in alerts.

AI uses algorithms to make decisions about what to do in given situations. The algorithms are a plan for what the computer should do given certain specific inputs, and the inputs come from data collected in the environment in which the AI system works. AI tries to mimic human cognitive functions: a computer is given the environment and the problem setting, and instructed to find a solution to the problem. ML uses data and prior algorithm responses to learn what to do in similar situations in the future. In autonomous cars, for example, the algorithms are trained on information about the car’s surroundings and on corrections to its maneuvering decisions, and then the algorithms know how to handle future situations that mirror those dealt with previously.

Cybersecurity is a fertile field for applying AI and ML because there is a lot of data about the security environment, and many attacks are similar. With ML, the algorithms can learn how to respond to similar attacks, achieving automation with AI. ML is thus a fundamental concept that drives AI.

Researchers have tried to implement AI and ML in cybersecurity solutions since the late 1980s, but progress has been slow. Today, AI and ML are showing increasing promise with the advent of Big Data, because the quality of information that informs AI, or from which ML can learn, is improving. However, there is much more to be done.

Anomaly Detection – The Early Days

When we talk about security, we want a system that can separate good from bad, normal from abnormal. Therefore, it is quite natural to apply anomaly detection to security. We can trace the beginning of anomaly detection back to 1987 [2], when researchers started building intrusion detection systems (IDS). Around 1998-1999, DARPA (the government agency that created the Internet) created benchmark datasets and called for research on ML methods in security [3]. Unfortunately, few of the results were practical enough, and even fewer products got to the operational stage.

Anomaly detection is based on unsupervised learning, a type of self-organized learning that finds previously unknown patterns in a data set without the use of pre-existing labels. In essence, a system based on unsupervised learning knows what is normal and identifies anything abnormal as an anomaly. For example, an IDS might know what ‘normal’ traffic looks like, and it will alert on any traffic that doesn’t match that learned baseline, such as traffic from a vulnerability scanner. Usually, the ML scientist needs to design ways to model the normal situation using ML models, and then train the model with a large amount of normal data. After that, the model is applied to label new cases; if a case has a very small chance of fitting the normal model, it is reported as an anomaly. In short, anomaly detection systems based on unsupervised learning make a binary decision (normal/abnormal). Some refer to unsupervised learning applications as ‘one-class problems.’
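To illustrate the one-class idea, here is a minimal sketch of an anomaly detector trained only on “normal” traffic, using scikit-learn’s IsolationForest. The flow features (bytes sent, packet count, destination-port entropy) and the synthetic data are illustrative assumptions, not a description of any particular product.

```python
# Minimal one-class anomaly detection sketch (assumed features, synthetic data).
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# Hypothetical per-flow features: [bytes_sent, packet_count, dest_port_entropy]
normal_traffic = np.column_stack([
    rng.normal(5_000, 1_000, 10_000),   # typical bytes per flow
    rng.normal(40, 10, 10_000),         # typical packets per flow
    rng.normal(1.0, 0.2, 10_000),       # ordinary clients touch few ports
])

# Train only on normal data: the model learns what "normal" looks like.
detector = IsolationForest(contamination=0.01, random_state=0)
detector.fit(normal_traffic)

# Score new flows the model has never seen; no attack labels are involved.
new_flows = np.array([
    [5_200, 38, 1.1],       # looks like ordinary traffic
    [90_000, 5_000, 6.5],   # scanner-like: huge volume, many ports
])
print(detector.predict(new_flows))  # 1 = normal, -1 = anomaly
```

Note that the scanner-like flow is flagged even though it may be perfectly benign, which is exactly the false-positive problem discussed next.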
As you might imagine, systems based on unsupervised learning can generate a lot of false positives, because a situation deemed abnormal can be perfectly innocuous (think of the vulnerability scanner again).

Anomaly detection is good at detecting things that are different, but not necessarily malicious. This is a problem that security analysts still struggle with today.

The Rise of Big Data

After 2000, developers and researchers began creating anti-spam, anti-phishing, and URL filtering systems based on supervised learning. In supervised learning, decisions are learned from a set of labelled data; the labels specify the normal cases and the perceived threats. One such example is a URL classifier. First, large datasets of benign URLs and malicious URLs are used to train the classifier. Then, the classifier is deployed and matches the URLs extracted from incoming e-mails, producing for each input URL a label that says whether it is benign or malicious. A supervised learning algorithm analyzes the data and produces an inferred function (i.e., this traffic behavior matches this input data, therefore it is bad), which can be used for mapping new examples.

Early filtering systems using supervised learning were based on relatively small datasets, but datasets have grown in size and sophistication with the advent of Big Data. For example, Gmail employs an Internet-scale dataset to train its classifier and improve accuracy. Such a large scale, using Big Data and huge computation resources, enables Google to train more sophisticated models.

Big models (in terms of the number of parameters) based on Big Data, such as deep learning models, have gradually become more popular. For example, supervised ML has been used successfully in anti-virus signature generation for years, and in 2012, Cylance began offering next-generation anti-virus systems based on datasets other than signatures, such as anomalous traffic behavior. The advantage of ML-based methods over simple signature matching is that an ML-based method might find similar but not exactly identical cases, while signature matching looks for exactly the same pattern.
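As a concrete illustration of the supervised approach, the sketch below trains a toy URL classifier with scikit-learn. The handful of hard-coded URLs and the character n-gram features are illustrative assumptions, not the datasets or features any production system actually uses.

```python
# Toy supervised URL classifier sketch (illustrative data and features).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Labelled training data: 0 = benign, 1 = malicious (tiny illustrative sample;
# a real classifier would be trained on millions of labelled URLs).
urls = [
    "https://www.example.com/login",
    "https://docs.example.org/manual.pdf",
    "http://paypa1-secure-verify.xyz/account/update",
    "http://free-gift-cards.top/claim?id=9913",
]
labels = [0, 0, 1, 1]

# Character n-grams pick up tell-tale substrings (digit-for-letter swaps,
# odd top-level domains) without hand-written rules.
classifier = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(3, 5)),
    LogisticRegression(max_iter=1000),
)
classifier.fit(urls, labels)

# The deployed classifier labels URLs extracted from incoming e-mail.
print(classifier.predict(["http://secure-paypa1-login.top/verify"]))
# likely [1] (malicious) on this toy data
```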
Combining Supervised and Unsupervised Learning

Supervised learning has shown more success in security applications, but it requires easy access to large sets of labelled data, which are very difficult to generate for cyberattacks such as APTs (advanced persistent threats) and zero-day attacks targeted at enterprises. Therefore, we cannot easily apply supervised ML to all cyberattacks.

This is where unsupervised learning comes back into the picture. We need to develop more advanced AI/ML that can be unsupervised or semi-supervised (e.g., through adaptive learning) to solve these additional cybersecurity challenges. Adaptive learning (human-guided analysis) coupled with unsupervised learning improves your ability to detect those APTs and zero-day exploits. Unlike supervised learning, which needs a large number of labelled attack cases, adaptive learning leverages limited guidance from humans to steer the direction or preference of the anomaly detection and produce more accurate results.
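One way to picture adaptive learning is as a feedback loop around an unsupervised detector: the analyst confirms or dismisses a few anomalies, and those few judgments reshape future scoring. The sketch below is a deliberately simplified assumption about how such feedback could be wired up, not a description of how any particular product implements it.

```python
# Sketch: analyst feedback steering an unsupervised detector (assumed workflow).
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(1)
normal = rng.normal(0, 1, (5_000, 3))             # baseline "normal" events
detector = IsolationForest(random_state=1).fit(normal)

# New events: mostly ordinary ones plus a small unusual cluster.
events = np.vstack([rng.normal(0, 1, (95, 3)), rng.normal(6, 1, (5, 3))])
scores = detector.score_samples(events)           # lower = more anomalous

# Adaptive step: an analyst reviews the two most anomalous events and
# dismisses them as benign (say, a known vulnerability scanner).
dismissed = events[np.argsort(scores)[:2]]

# Events that resemble the dismissed ones get their scores boosted, so a
# handful of human judgments steers what the system alerts on next.
nn = NearestNeighbors(n_neighbors=1).fit(dismissed)
dist, _ = nn.kneighbors(events)
adjusted = scores + 0.5 * (dist.ravel() < 3.0)

print("top anomalies before feedback:", np.argsort(scores)[:5])
print("top anomalies after feedback: ", np.argsort(adjusted)[:5])
```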
A New Direction: Connecting the Dots

One of the big problems with simple anomaly detection is the volume of false positives, and even when the results are accurate, they lack the context a security analyst needs to quickly evaluate the impact on the security posture and direct resources toward it. One way to address this issue is to correlate multiple events (dots) and then evaluate whether or not the correlation indicates a strong signal for a cyberattack. For example, one ‘dot’ might be an executive logging into the network at 2 AM; on its own this would likely be a false positive, and it wouldn’t be enough to trigger an alert. However, if the executive is seen logging in at 2 AM from an IP address in Russia or China, and is running a PowerShell command the executive should not know, it would trigger an alert, because enough dots linked together, providing context, indicate that this is more likely an account takeover: some hacker is pretending to be the executive, logging in at an unusual time and from an unusual place to do unusual things.
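A minimal way to express this “connect the dots” logic in code is to score co-occurring weak signals for the same user; the signal names, weights, and threshold below are invented for illustration and are not a real detection rule.

```python
# Sketch: correlating weak signals ("dots") into one stronger detection.
from dataclasses import dataclass

@dataclass
class Signal:
    user: str
    kind: str   # e.g. "odd_hour_login", "foreign_ip", "unusual_powershell"

# Illustrative weights: no single weak signal can cross the threshold alone.
WEIGHTS = {"odd_hour_login": 1, "foreign_ip": 2, "unusual_powershell": 3}
ALERT_THRESHOLD = 4

def correlate(signals: list[Signal]) -> dict[str, int]:
    """Sum the weights of the signals observed for each user."""
    scores: dict[str, int] = {}
    for s in signals:
        scores[s.user] = scores.get(s.user, 0) + WEIGHTS.get(s.kind, 0)
    return scores

observed = [
    Signal("exec_jane", "odd_hour_login"),
    Signal("exec_jane", "foreign_ip"),
    Signal("exec_jane", "unusual_powershell"),
    Signal("intern_bob", "odd_hour_login"),   # a single dot: no alert
]

for user, score in correlate(observed).items():
    if score >= ALERT_THRESHOLD:
        print(f"ALERT: possible account takeover for {user} (score {score})")
```

In practice the dots would also be constrained to a time window and enriched with asset and identity context, but the principle is the same: individually weak signals become a strong one when they line up.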

One fundamental new trend in the ML/AI community is to let ML algorithms deal with graphs. Graphs, with their nodes and edges, are perfect for tracking relationships and expressing context. Traditionally, however, graphs are treated as discrete data structures, while gradient-descent optimization for ML usually requires data that is continuous and differentiable. In recent years there have been quite a few breakthroughs in applying ML to graphs. Such techniques are called graph representation learning, or Graph ML for short.
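To make the idea of turning a discrete graph into something continuous and differentiable concrete, here is a minimal sketch of one round of neighborhood aggregation, the basic operation behind many graph representation learning methods. The tiny alert graph, the random features, and the random weights are purely illustrative; real Graph ML models stack many such rounds and learn the weights by gradient descent.

```python
# Sketch: one round of neighborhood aggregation over a toy alert graph.
import numpy as np

# Nodes are alerts or entities; an edge means "these two are related"
# (same user, same host, close in time, ...).
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
num_nodes, dim = 4, 8

rng = np.random.default_rng(42)
features = rng.normal(size=(num_nodes, dim))    # continuous node features

# Row-normalized adjacency matrix with self-loops.
adj = np.eye(num_nodes)
for a, b in edges:
    adj[a, b] = adj[b, a] = 1.0
adj /= adj.sum(axis=1, keepdims=True)

# Message passing: each node's embedding becomes a mix of its neighbors',
# pushed through a (here random, normally learned) linear map and tanh.
weights = rng.normal(size=(dim, dim))
embeddings = np.tanh(adj @ features @ weights)

print(embeddings.shape)   # (4, 8): a continuous, differentiable representation
```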
Applying such techniques in its products, Stellar Cyber’s Open XDR platform correlates multiple events and evaluates whether, when looked at together, they constitute a threat. This approach significantly reduces false positives and helps analysts identify APTs or zero-day attacks more quickly.

With Graph ML, we can correlate and group alerts into incidents, which significantly reduces the number of cases that need to be investigated individually. Furthermore, with this correlation and grouping, incidents carry more detail, and Graph ML can also provide a ranking or score that lets the SOC team focus on significant incidents more quickly. The resulting incident graphs are also a natural way to present the attack context along a timeline. With these approaches, SOC efficiency can be fundamentally improved and the alert fatigue problem addressed.
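As a closing illustration of the grouping-and-ranking idea, the sketch below treats related alerts as edges in a graph, takes connected components as candidate incidents, and ranks them by a simple severity score. The alerts, the “relatedness” rule, and the scoring are made up for the example; this is a sketch of the general technique, not Stellar Cyber’s implementation.

```python
# Sketch: grouping related alerts into incidents and ranking them.
import networkx as nx

alerts = {
    "a1": {"entity": "exec_jane", "severity": 3},   # 2 AM login
    "a2": {"entity": "exec_jane", "severity": 5},   # foreign IP address
    "a3": {"entity": "exec_jane", "severity": 8},   # unusual PowerShell
    "a4": {"entity": "intern_bob", "severity": 2},  # unrelated noise
}

g = nx.Graph()
g.add_nodes_from(alerts)
# Connect alerts that share an entity (a stand-in for richer correlation
# such as shared hosts, sessions, or time proximity).
ids = list(alerts)
for i, x in enumerate(ids):
    for y in ids[i + 1:]:
        if alerts[x]["entity"] == alerts[y]["entity"]:
            g.add_edge(x, y)

# Each connected component is a candidate incident; score it by the total
# severity of its alerts so the SOC team can triage the worst first.
incidents = sorted(
    ((sum(alerts[a]["severity"] for a in comp), sorted(comp))
     for comp in nx.connected_components(g)),
    reverse=True,
)
for score, members in incidents:
    print(f"incident {members} -> score {score}")
```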

[1] Information Risk Management, “Risky Business,” 2019.


[2] D. E. Denning, “An Intrusion-Detection Model,” IEEE Transactions on Software Engineering,
vol. 13, no. 2, pp. 222–232, 1987.
[3] R. Lippmann, R. K. Cunningham, D. J. Fried, I. Graf, K. R. Kendall, S. E. Webster, and M. A. Zissman, “Results
of the 1998 DARPA Offline Intrusion Detection Evaluation,” in Proc. Recent Advances in Intrusion Detection, 1999.

Stellar Cyber’s Open XDR platform delivers Everything Detection and Response by ingesting data from all tools,
correlating incidents across the entire attack surface, delivering high-fidelity detections, and responding to threats
automatically through AI and machine learning. Our intelligent, next-gen security operations platform greatly reduces
enterprise risk through early and precise identification and remediation of all attack activities while slashing costs,
retaining investments in existing tools and accelerating analyst productivity. Typically, our platform delivers a 20X
improvement in MTTD and an 8X improvement in MTTR.

www.stellarcyber.ai | sales@stellarcyber.ai
