Applying LLMs To Threat Intelligence - by Thomas Roccia - Nov, 2023 - SecurityBreak

This member-only story is on us. Upgrade to access all of Medium.
Member-only story
Applying LLMs to Threat

Intelligence
A Practical Guide with Code Examples
Thomas Roccia · Follow

Published in SecurityBreak · 15 min read · 4 hours ago
75
LLMs, or Large Language Models, are an exciting technology designed to
leverage natural languages with various technologies. Specifically in
Cybersecurity, and more so in Threat Intelligence, there are challenges
that can be partially addressed with LLMs and generative AI.
While much of the focus is on prompt engineering skills, there’s more to

consider than just choosing the right word to interact with a model.
In this blog, I will discuss the potential of LLMs for threat intelligence
applications. I will first introduce some common challenges, then define
what prompt engineering is and how it can be applied to practical use
cases. Next, I will discuss some techniques such as few-shot learning,
RAG, and agents. Everything will be illustrated with code examples. Stay
with me, as we’re about to dive deep and acquire real skills, rather than
just skimming the surface.
Search Write
Threat Intelligence Challenges

In Threat Intel, there are several challenges to deal with. First, the sheer
volume of information produced today can be overwhelming, and no one
has the time to read it all. Second, investigating a threat can be time-
consuming, and junior analysts might lack the necessary background to
conduct the investigation effectively. Additionally, the dynamic nature of
threats means that analysts often have to keep up with rapidly changing
tactics, techniques, and procedures, which can be daunting even for
seasoned professionals.
With these challenges in mind, let’s explore how LLMs can be utilized to
enhance analysts’ capabilities.
What is Prompt Engineering?
We cannot discuss LLMs without defining what is Prompt Engineering.
Prompt Engineering is the discipline and science of crafting effective

prompts to guide AI models, particularly LLMs, toward desired
outputs. Much like a potter, wood carver, or a “tailleur de pierre”
(stone cutter), prompt engineering is the essential tool.
Prompt Engineering Analogy
To craft the ideal prompt, there are several basics to follow:
Clarity: Define the task you want the model to perform clearly.
Specificity: Provide as much detail as necessary to eliminate

ambiguity.
Iteration: Continuously refine prompts based on feedback from the

AI.
However, there are also common pitfalls to be wary of:
Over-complexity: Refrain from making prompts excessively detailed.
Ambiguity: Avoid vague prompts as they can lead to generic answers.
Blind Trust in the Model: Relying too much on the model’s

capabilities without adequate verification.
No Examples: Omitting example inputs and outputs.
Misplaced Belief in Model’s Understanding: Assuming the model

grasps your intent without clarity.
Ignoring Obsolescence: Neglecting to refresh prompts in tandem

with model updates or changes in relevant data.
The following example demonstrates an ideally crafted prompt:
Anatomy of an Ideal Prompt (Extract from my BsidesMelbourne conference)
But while many individuals are focusing on crafting the perfect prompt,
they are essentially overlooking the true potential of LLMs and their vast
capabilities.
Now, let’s talk about the genuine strength of LLMs and explore how we
can pragmatically create our own applications with it.
Practical Application of LLMs
There are multiple techniques that can be used in conjunction with a
model. In this section, I will explore some of them to provide you with
the keys to delve into this technology independently and achieve a better
understanding of it.
Few-Shot Prompting
Few-shot prompting is an interesting technique that can be employed to
instruct an LLM using a very limited amount of data.
The idea is to supply your model with some examples of what you expect
so it can replicate them directly. For instance, in the code below, I ‘teach’
the model a desired output — in this case, a mermaid mindmap — so that
it can produce similar mindmaps in the future.
# Function to generate a mindmap (few shot technique).

# NB: the more shot you add the better the result will be
def run_models(input_text):
response = openai.ChatCompletion.create(
model="gpt-4",
messages= [
{
"role": "system",
"content":"You are tasked with creating an in-depth mindmap designed
},
{
"role": "user",
"content": "Title: \ud83e\udda0 Lazarus Group's Infrastructure Reuse
},
{
"role": "assistant",
"content": "mindmap\nroot(Lazarus Group Threat Analysis)\n (Infra
},
{"role": "user", "content": input_text},
],
)
return response.choices[0].message['content']
In the code above, I provide some examples to clarify what I’m expecting
for the output. The information breaks down as follows:
System: I assign the role of “system” to my tool and detail what I

anticipate from this system. For this example, I’m aiming for a
mindmap.
User: The second line designates the role of “user.” This line presents
examples of user inputs.
Assistant: With the “assistant” role (representing the model), I

provide an illustration of the expected output — in this instance, the
mermaid mindmap code.
Finally, I capture the user input, allowing the assistant to generate the
subsequent mindmap based on that input.
An example of the resulting mindmap can be seen below:
The Intel Brief Mindmap Example

I wrote another blogpost about this technique and how I implemented
this concept for my newsletter. Click here to learn more about it!
Retrieval Augmented Generation (RAG)

The models we use are trained on a specific set of data up to a particular
date. This implies that more recent data might not be recognized by the
model, and most importantly, your personal/private data isn’t known to it
either.
RAG presents an interesting approach that enables you to supplement the

model with your own data, thereby expanding its capabilities. RAG is a
technique that melds retrieval-based and generative models.
Two Phases: Retrieval & Generation
Retrieval: This phase searches the database of your data you

provided.
Generation: This phase produces a context-relevant response based

on the retrieved information from your database.
The primary objective here is to enhance a model using your data. But
how does it work under the hood?
RAG operates in multiple stages. The subsequent diagram offers a

streamlined visualization of the process.
RAG Overview
My friend, Roberto Rodriguez, conducted in-depth research on this topic

using the Mitre ATT&CK Groups as data source.
For the sake of this blog, I’ve adapted his code to be compatible with
Jupyter Notebook and create an interface using pywidget. I’ll walk you
through each step to construct your own RAG. In this example, we used
LangChain, which is an open-source library designed for interacting with
an LLM.
Prepare Your Data (No, Really!)

You might have heard that when working with machine learning, deep
learning, or AI models, it’s essential to clean your dataset. This step is
crucial for obtaining the most accurate results.
Ensuring that your entire dataset is well-formatted and consists of clean

data is of utmost importance. Once your data is prepped, you can begin
crafting your RAG.
In this example, we used data exported from the Mitre ATT&CK groups.
After downloading the data to your local system, you can begin loading it
using Langchain.
Note: For this example, the data is stored in Markdown format, but you
can use any type of data.
from langchain.document_loaders import UnstructuredMarkdownLoader

# Using glob to find all Markdown files in the knowledge_directory
# The "*.md" means it will look for all files ending with .md (Markdown files)
group_files = glob.glob(os.path.join(knowledge_directory, "*.md"))
# Initializing an empty list to store the content of Markdown files

md_docs = []
# Loop through each Markdown file path in group_files

for group in group_files:
# Create an instance of UnstructuredMarkdownLoader to load the content of the

loader = UnstructuredMarkdownLoader(group)
# Load the content and extend the md_docs list with it

md_docs.extend(loader.Load())
Here we are using the group knowledge to load into our RAG.
Tokenisation
Tokenization is the process of converting a sequence of text into
individual units, known as “tokens.” These tokens can range from being
as small as characters to as long as words, depending on the specific
needs of the task and the language in question. Tokenization is an
essential pre-processing step in Natural Language Processing (NLP) and
text analytics models. Tokenisation can be done using the library
Tiktoken.
In our context, tokenization isn’t strictly required. However, it proves
beneficial if you aim to manage the amount of data sent and for
optimization and cost-control purposes.
Splitting into Smaller Chunks

Dividing your imported data into smaller chunks is a strategy designed to
make it easier for the model to access the imported data.
In this instance, we’re using the `RecursiveCharacterTextSplitter` from

LangChain. This method attempts to divide the text based on a set
sequence of characters until the resulting chunks reach a desired size. By
default, the characters used for splitting are [“\n\n”, “\n”, “ “, “”]. The
method strives to maintain the integrity of paragraphs, sentences, and
words as they’re typically semantically connected. The size of each chunk
is determined by its character count.
The following code demonstrates how to employ this method with our
MITRE ATT&CK Groups data.
# Import the RecursiveCharacterTextSplitter class from the langchain library

from langchain.text_splitter import RecursiveCharacterTextSplitter
# Create an instance of RecursiveCharacterTextSplitter with specified parameters

text_splitter = RecursiveCharacterTextSplitter(
chunk_size=500, # Maximum number of tokens in each chunk
chunk_overlap=50, # Number of tokens that will overlap between adjacent chunk
length_function=tiktoken_len, # Function to calculate the number of tokens in
separators=['\n\n', '\n', ' ', ''] # List of separators used to split the tex
)
Embeddings
Embeddings provide a means to convert words or phrases into numerical
representations, or vectors, so they can be easily processed by
computers. Why is this useful? By transforming text into numerical form,
it becomes simpler to gauge the similarity between words or sentences,
facilitating tasks such as search and classification.
A vector, in essence, is a list of numbers. In embeddings, each number in

this list captures some aspect or feature of the text. Such vectors allow
computers to grasp and compare concepts. For instance, the vector for
“apple” might bear more similarity to the one for “fruit” than to that of
“car.” This helps a computer discern that apples are more akin to fruits
than to vehicles.
In simpler terms, embeddings convert text into vectors. As you might

glean, these vectors provide a convenient means to store data for our
RAG and model.
In the example below, we use FAISS. Developed by Facebook, FAISS aids

in swiftly identifying items that resemble a particular item based on their
numerical (vector) representation. To illustrate, imagine a vast library of
books, and you wish to pinpoint the ones most similar to a specific title.
FAISS expedites this task, even with an extensive collection.
from langchain.embeddings.openai import OpenAIEmbeddings

from langchain.vectorstores import FAISS
embeddings = OpenAIEmbeddings()
# Send text chunks to OpenAI Embeddings API

db = FAISS.from_documents(chunks, embeddings)
retriever = db.as_retriever(search_kwargs={"k":5})
query = "What are some phishing techniques used by threat actors?"
print("[+] Getting relevant documents for query..")
relevant_docs = retriever.get_relevant_documents(query)
Alright, our retriever is now up and running. The next step is to integrate
this retriever with our LLM.
Retriever and LLM

Once we can interact with our data, we can then employ our LLM to
formulate the expected answer. The below screenshot shows you the
Jupyter notebook with the code discussed.
Jupyter Notebook with RAG and ATT&CK
We now have our RAG operational. But one thing that’s bothersome is
that our model doesn’t remember what we’ve discussed previously…
RAG + Memory
Being able to interact with your own data is quite powerful; you can
essentially feed any type of data and let your LLM work with your
personalized or internal data.
However, as seen in our previous example, the model doesn’t retain the
memory of prior interactions, which can be somewhat frustrating when
trying to gather multiple pieces of information about the same threat
actor.
By configuring memory in your RAG tools, you can maintain a record of

previous interactions, ensuring a continuous flow of information without
needing to pose the same questions repeatedly.
This can be seamlessly achieved using Langchain.
from langchain.chains import ConversationalRetrievalChain

from langchain.memory import ConversationBufferMemory
from langchain.llms import OpenAI
from langchain.prompts.prompt import PromptTemplate
import json
# Initialize your Langchain model

model = ChatOpenAI(model_name="gpt-4", temperature=0.3)
# Initialize your retriever (assuming you have a retriever named 'db')

retriever = db.as_retriever(search_kwargs={"k": 8})
# Define your custom template

custom_template = """You are an AI assistant specialized in MITRE ATT&CK and you i
Chat History:
{chat_history}
Follow Up Input: {question}
Answer: """
CUSTOM_QUESTION_PROMPT = PromptTemplate.from_template(custom_template)
# Initialize memory for chat history

memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
# Initialize the ConversationalRetrievalChain
qa_chain = ConversationalRetrievalChain.from_llm(model, retriever, condense_questi
Now, we can initiate a dialogue with our RAG:
RAG with Memory
To make things easier, I’ve put together a comprehensive Jupyter

notebook available on my website for you to tailor the code to your
specific needs.
ReAct and Agents

The next concept I’d like to discuss is the ReAct framework and the agent
features offered by LangChain.
ReAct is a logical framework designed for crafting intelligent agents. Its

chief purpose is to endow agents with the ability to carry out complex
tasks through a series of actions. Central to ReAct are two core
components: ‘Reason’ and ‘Act’. The ‘Reason’ facet reflects the agent’s
cognitive process, where it ponders and decides the subsequent action.
In contrast, ‘Act’ symbolizes the tangible action the agent executes based
on its prior reasoning.
You can think of ReAct’s operational flow as an “Action → Observation →

Thought Cycle”. Initially, the agent performs an action. It then observes
and evaluates the results of that action. After observing, the agent
ponders or reasons about its next step. This iterative process ensures the
agent continually adapts and responds to the dynamic conditions of its
surroundings.
Source: https://peterroelants.github.io/posts/react-repl-agent/
This notion is incredibly powerful and can be seamlessly integrated with

various tools. Remember, in LangChain, an agent can represent
anything, allowing you to essentially craft your own applications atop this
foundation.
In the example that follows, I’ve employed the agent functionality of

LangChain in synergy with MSTICpy, constructing an agent that
leverages MSTICpy’s features.
NB: MSTICpy is the Python library dedicated to threat intelligence

investigation.
from msticpy.sectools.tilookup import TILookup

from langchain.chat_models import ChatOpenAI
from langchain.agents import Tool
from langchain.agents import initialize_agent
from langchain.agents import AgentType
llm = ChatOpenAI(model_name="gpt-4", temperature=0.3)
class TIVTLookup:
def __init__(self):
self.ti_lookup = TILookup()
def ip_info(self, ip_address: str) -> str:

result = self.ti_lookup.lookup_ioc(observable=ip_address, ioc_type="ipv4"
details = result.at[0, 'RawResult']
sliced_details = str(details)[:3500]
return sliced_details
def communicating_samples(self, ip_address: str) -> str:

domain_relation = vt_lookup.lookup_ioc_relationships(observable = ip_addre
return domain_relation
def samples_identification(self, hash: str) -> str:

hash_details = vt_lookup.get_object(hash, "file")
return hash_details
ti_tool = TIVTLookup()
tools = [
Tool(
name="Retrieve_IP_Info",
func=ti_tool.ip_info,
description="Useful when you need to look up threat intelligence informati
),
Tool(
name="Retrieve_Communicating_Samples",
func=ti_tool.communicating_samples,
description="Useful when you need to get communicating samples from an ip
),
Tool(
name="Retrieve_Sample_information",
func=ti_tool.samples_identification,
description="Useful when you need to obtain more details about a sample."
),
]
This example demonstrates how to craft agents. In this scenario, my

agents utilize three functions from MSTICpy:
Retrieve_IP_Info: This function queries VirusTotal for a specific IP

address and relays the obtained information back to the model.
Retrieve_Communicating_Samples: This function fetches from

VirusTotal the samples that communicate with a particular IP, as
provided by the user.
Retrieve_Sample_Information: Here, we obtain details about a

specific sample.
It’s worth noting that numerous other functions can be integrated into
our code. However, for the purpose of this demonstration, we’ll maintain
simplicity.
agent = initialize_agent(
tools, llm=llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=False, ag
)
agent.run("Can you give me more details about this ip: 77.246.107.91? How many sam
In the provided example, I seek information regarding a specific IP.

What’s remarkable about pairing agents with LLMs is the innate ability of
the model to determine which agent to invoke based solely on the given
description. the description you provide is crucial, as it’s interpreted as a
prompt and serves as directives for the model.
Upon executing this example, my code activates the MSTICpy agents to

get the details. These details are then fed to the model to generate the
final response, as illustrated below.
Result from the Agents
In detail, the code will run MSTICpy automatically as shown below.

Under the Hood Agents
Conclusion
In this blog, I explored some interesting LLM features that allow you to
build your own application. I created some proof-of-concept
implementations that can be easily adapted for your own use case.
I started with a deep dive into prompt engineering concepts and few-shot
learning, and then looked at how to build a RAG with your own data.
Lastly, I discussed Agents and how they can be used in conjunction with
your existing tools.
I hope you enjoyed the journey. If you want to explore more about these
concepts, check out the resources below.
That’s it! If you like this blog, you can share it and like it. You can follow
me on Twitter @fr0gger_ for more stuff such as this one. ❤
You can also subscribe to my newsletter ‘The Intel Brief’
Consider becoming a Medium member if you appreciate my content and want

to help me as a writer. It cost $5 per month and gives you unlimited access to
Medium content. I’ll get a little commission if you sign up via my link and that
will help supporting my community projects. Thanks!
Ressources
OTRF/GenAI-Security-Adventures (github.com)
Retrieval Augmented Generation (RAG) | Prompt Engineering Guide

(promptingguide.ai)
Agents | Langchain
microsoft/msticpy: Microsoft Threat Intelligence Security Tools

(github.com)
https://peterroelants.github.io/posts/react-repl-agent/
TheIntelBrief (securitybreak.io)
Threatintel Python Cybersecurity Llm AI

Written by Thomas Roccia Follow
1.7K Followers · Editor for SecurityBreak
Security Researcher
More from Thomas Roccia and SecurityBreak

Thomas Roccia in SecurityBreak Thomas Roccia in SecurityBreak
The Intel Brief by SecurityBreak Security infographics

An LLM Experiment I often do infographics to share security
concepts or best practices. This page will…
list the different files. I’ll update it
· 3 min read · Sep 27 · 3 min read · May 29
periodically…
139 2 444 2
Thomas Roccia in SecurityBreak Thomas Roccia in SecurityBreak
6 Useful Infographics for Threat [Reverse Engineering Tip]—

Intelligence Analyzing a DLL in x64DBG
Visualizing Cybersecurity concepts can be This blog is a quick tip about how to load a
a terrific way to learn more about specific… dll in x64dbg in order to debug it and…
tools, methodologies, and techniques! Here analysed it. In this example we will use a
is a ·post…
3 min read · Dec 18, 2022 2 · Jan 10, 2020
min readdll…
random
553 8 41
See all from Thomas Roccia See all from SecurityBreak

Recommended from Medium
Antonio Formato in Microsoft Azure Akshay Kokane
Chat with your Cyber Threat Semantic Memory—A new open

Intelligence data with Azure… source project from Microsoft
OpenAI
Chatbot to engage in conversations with In the ever-evolving landscape of artificial
your threat intel data sourced from… intelligence and natural language…
Microsoft Defender Threat Intelligence, processing, the ability to efficiently index
· 14Azure
using min read · Oct 15
OpenAI. 4 read · Oct 13
min datasets…
vast
55 106 5
Lists
Coding & Development Predictive Modeling w/

11 stories · 250 saves Python
20 stories · 551 saves
Generative AI The New Chatbots:

Recommended Reading ChatGPT, Bard, and Beyond
52 stories · 364 saves 12 stories · 178 saves
Jon Baker in MITRE-Engenuity Cassie Kozyrkov
Our TRAM Large Language Why I quit my job as Google’s

Model Automates TTP… Chief Decision Scientist
Identification
Written in CTI
by James Ross Reports
& jackie lasky. What’s it like to go from being Chief
Decision Scientist at Google to being, well…
just me?
8 min read · Aug 29 · 7 min read · 5 days ago
78 5.4K 72
Kiran Neelakanda Panicker in LlamaIndex Blog The PyCoach in Artificial Corner
Mastering PDFs: Extracting ChatGPT Has Changed My

Sections, Headings, Paragraphs… Approach to Learning New…
and Tables
Despite recentwith Cutting-Edge
motivation to utilize NLP for Things
The days of learning only with textbooks or
Parser
wider range of real world applications, mo… Google are over.
NLP papers, tasks and pipelines assume
5 read · Oct 18
minclean…
raw, · 6 min read · Oct 26
552 9 1.5K 26
See more recommendations
Help Status About Careers Blog Privacy Terms Text to speech Teams

Applying LLMs To Threat Intelligence - by Thomas Roccia - Nov, 2023 - SecurityBreak

Uploaded by

Document Information

Original Description:

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Applying LLMs To Threat Intelligence - by Thomas Roccia - Nov, 2023 - SecurityBreak

Uploaded by

Copyright:

Available Formats

This member-only story is on us. Upgrade to access all of Medium.

Applying LLMs to Threat

Thomas Roccia · Follow

While much of the focus is on prompt engineering skills, there’s more to

Threat Intelligence Challenges

Prompt Engineering is the discipline and science of crafting effective

Prompt Engineering Analogy

To craft the ideal prompt, there are several basics to follow:

Specificity: Provide as much detail as necessary to eliminate

Iteration: Continuously refine prompts based on feedback from the

However, there are also common pitfalls to be wary of:

Over-complexity: Refrain from making prompts excessively detailed.

Ambiguity: Avoid vague prompts as they can lead to generic answers.

Blind Trust in the Model: Relying too much on the model’s

No Examples: Omitting example inputs and outputs.

Misplaced Belief in Model’s Understanding: Assuming the model

Ignoring Obsolescence: Neglecting to refresh prompts in tandem

The following example demonstrates an ideally crafted prompt:

Anatomy of an Ideal Prompt (Extract from my BsidesMelbourne conference)

# Function to generate a mindmap (few shot technique).

System: I assign the role of “system” to my tool and detail what I

Assistant: With the “assistant” role (representing the model), I

An example of the resulting mindmap can be seen below:

The Intel Brief Mindmap Example

Retrieval Augmented Generation (RAG)

RAG presents an interesting approach that enables you to supplement the

Two Phases: Retrieval & Generation

Retrieval: This phase searches the database of your data you

Generation: This phase produces a context-relevant response based

RAG operates in multiple stages. The subsequent diagram offers a

My friend, Roberto Rodriguez, conducted in-depth research on this topic

Prepare Your Data (No, Really!)

Ensuring that your entire dataset is well-formatted and consists of clean

from langchain.document_loaders import UnstructuredMarkdownLoader

# Initializing an empty list to store the content of Markdown files

# Loop through each Markdown file path in group_files

# Create an instance of UnstructuredMarkdownLoader to load the content of the

# Load the content and extend the md_docs list with it

Splitting into Smaller Chunks

In this instance, we’re using the `RecursiveCharacterTextSplitter` from

# Import the RecursiveCharacterTextSplitter class from the langchain library

# Create an instance of RecursiveCharacterTextSplitter with specified parameters

A vector, in essence, is a list of numbers. In embeddings, each number in

In simpler terms, embeddings convert text into vectors. As you might

In the example below, we use FAISS. Developed by Facebook, FAISS aids

from langchain.embeddings.openai import OpenAIEmbeddings

# Send text chunks to OpenAI Embeddings API

Retriever and LLM

Jupyter Notebook with RAG and ATT&CK

By configuring memory in your RAG tools, you can maintain a record of

This can be seamlessly achieved using Langchain.

from langchain.chains import ConversationalRetrievalChain

# Initialize your Langchain model

# Initialize your retriever (assuming you have a retriever named 'db')

# Define your custom template

# Initialize memory for chat history

Now, we can initiate a dialogue with our RAG:

RAG with Memory

To make things easier, I’ve put together a comprehensive Jupyter

ReAct and Agents

ReAct is a logical framework designed for crafting intelligent agents. Its

You can think of ReAct’s operational flow as an “Action → Observation →