
Module Code & Module Title FC6P01NI Final Year Project Report Assessment
Weightage & Type 50% Final Report Year and Semester 2018-19 Autumn Threat
Detection and Alert System
Thesis · June 2019
1 author:
Milann Shrestha
Islington College
All content following this page was uploaded by Milann Shrestha on 13 January 2020.

Module Code & Module Title
FC6P01NI Final Year Project Report
Assessment Weightage & Type
50% Final Report
Year and Semester
2018-19 Autumn
Threat Detection and Alert System
Student Name: Milan Shrestha

London Met ID: 16033230
College ID: sity1n216033
Assignment Due Date: 8th May, 2019
Assignment Submission Date: 8th May, 2019
Word Count (Where required): 11,000 approx.
First Supervisor Name: Mr. Akchayat Bikram Joshi

Second Supervisor Name: Ms. Suman Gupta
I confirm that I understand my coursework needs to be submitted online via Google Classroom under the relevant
module page before the deadline in order for my assignment to be accepted and marked. I am fully aware that late
submissions will be treated as non-submission and a mark of zero will be awarded.
Acknowledgment
Firstly, I would like to thank my supervisors, Mr. Akchayat Bikram Joshi and Ms. Suman
Gupta, for their support and guidance throughout the project. Their questions and feedback
really helped me drive this project forward in the right direction.
I would also like to thank my brother, Mr. Nishan Maharjan, for his impactful, real-world
teaching drawn from his experience in cybersecurity and for helping me decide the topic for the
Final Year Project. I would also like to thank Islington College for providing me with an opportunity to
work on a project that enhanced my skills before facing the real world of Information Technology.
Abstract
Organizations of all sizes are fighting the same security battle while attackers keep
changing the threat landscape by developing new tools and targeting victim machines. However,
their processes and motives have not changed. Traditionally, security measures were all
driven by incident response. The idea of threat hunting challenges that concept by introducing
a more dynamic approach to cybersecurity. Nowadays, a company typically deploys a
SIEM to manage events from its log sources and to monitor them continuously. Analyzing
various log sources to discover attack patterns, together with supporting evidence, is the
general definition of threat detection with text logs. The main purpose of the Threat Detection
and Alert System is therefore to automate this process and to simplify threat hunting by visualizing
the discoveries and alerting on them. This project proposes and evaluates the Threat Detection System
together with the Elasticsearch Stack, an enterprise-grade logging repository and search engine, as a
solution for active threat detection and as a visualization platform in a Linux environment.
This report describes the vision of the project, its development process and the future plan
for the project. The first chapter gives a brief introduction to threat hunting and the current
scenario with a case study, explains how the project can serve as a solution, and states the aim
and objectives for achieving the proposed system. The next chapter covers the background, the
analysis of the survey, a brief review of related projects, the considered methodology and the
pre-development stages. Unit testing, which verifies the proposed functions of the system, is then
demonstrated following the selected methodology, DSDM. Finally, the project is critically
analyzed, concluding with its legal and ethical implications, its limitations and a description
of further work.
Table of Contents
Chapter 1: Introduction
1.1 Introduction to topic
1.2 Current Scenario
1.3 Problem Statement
1.4 Project as a solution
1.5 Aim and Objectives
1.5.1 Aim
1.5.2 Objectives
Chapter 2: Background
2.1 Requirement Analysis through Survey
2.2 About the end user
2.3 Understanding the solution
2.3.1 Planning
2.3.2 Gathering Logs
2.3.3 Analysis and Monitor
2.3.4 Respond
2.3.5 System Architecture
2.4 Similar System Review
2.4.1 Nimbus
2.4.2 Vectra
2.4.3 Cybersponse
2.4.4 Phantom
2.5 Analysis with a similar system
2.6 Similar Projects Review
2.6.1 Secure automated threat detection and prevention (SATPD)
2.6.2 StoQ
2.7 Technical Aspects
2.7.1 Operating System
2.7.2 Programming Language
2.7.3 SIEM framework
2.7.4 Text Editor
2.7.5 Libraries
2.7.6 Logs source
Chapter 3: Development
3.1 Methodology Consideration
3.1.1 Waterfall Model
3.1.2 Dynamic System Development Methodology (DSDM)
3.1.3 Big Bang Model
3.2 Selected Methodology
Phase 1: The Pre-project
Phase 2: The Project life-cycle
Phase 3: Post Project
3.3 Development Stages
3.3.1 Timebox 1: Threat Detection
3.3.2 Timebox 2: Visualization and Threat Category
3.3.3 Timebox 3: Bot Alert and Report Generation
Chapter 4: Testing
4.1 Elasticsearch status API test
4.2 Slack Client API test
4.3 IP API test
4.4 Build phishanalyzer.py execution test
4.5 Build phishanalyzer.py [dashboard] execution test
4.6 Build dnsanalyzer.py execution test
4.7 Build dnsanalyzer.py [dashboard] execution test
4.8 Automated threat report generation
4.9 Test API limitations handling
Chapter 5: Critical Analysis
5.1 Legal and Ethical Implication
5.2 Limitations
Chapter 6: Conclusion and Review
6.1 Future Plan
Chapter 7: References
Chapter 8: Appendix
8.1 MoSCoW Prioritization
8.2 Gantt chart
8.3 Team Members
8.4 Work Breakdown Structure
8.5 User Manual Guide
8.6 Installation of ELK Environment
8.7 Code
8.8 Survey
8.8.1 Survey Results

Table of Figures
Figure 1 Security Incidents registered by Cisco Cognitive Intelligence group 2018
Figure 2 Nepali domain used for malicious purpose
Figure 3 % of respondent threats (left) and vulnerabilities (right) on 2013-2017 (Kessel, 2018)
Figure 4 Survey result on threat risk in Nepal
Figure 5 Total number of survey candidates and their position
Figure 6 Preferred threat detection technique by Security Analysts in Nepal
Figure 7 Growth in values after the implementation of the threat detection system
Figure 8 Threat Hunting Loop
Figure 9 Project System Architecture
Figure 10 Dashboard of Nimbus (TeamCymru)
Figure 11 Dashboard of Vectra
Figure 12 Cybersponse Dashboard provided by demo version
Figure 13 Dashboard of Splunk's Phantom
Figure 14 Waterfall Model for SDLC (Sim, 2009)
Figure 15 Diagrammatical representation of DSDM work-flow (Siddharth, 2018)
Figure 16 Diagrammatical representation of the Big bang model
Figure 17 Flowchart demonstrating overall project
Figure 18 Importing Brothon library for dns.log analysis
Figure 19 Detailed information about malicious dns query
Figure 20 Flow diagram for Visualization and Threat Categorization
Figure 21 Build of phishanalyzer.py posting to elasticsearch
Figure 22 Build of dnsanalyzer.py posting to elasticsearch
Figure 23 A table for threat categorizations
Figure 24 Dashboard for visualization
Figure 25 Flow diagram for Alert and Report generation
Figure 26 Build of main.py for report generation
Figure 27 Threat Report generation and Slack Bot Alert
Figure 28 Status check of Elasticsearch at localhost:9200
Figure 29 Elasticsearch API test with index name
Figure 30 Test case for Slack bot for notification
Figure 31 Notification pop-up sample
Figure 32 IP API test with queried IP address
Figure 33 Successful execution of phishanalyzer.py
Figure 34 Posted object from phishanalyzer.py to elasticsearch
Figure 35 Dashboard displaying malicious http queries count
Figure 36 Successful execution of dnsanalyzer.py
Figure 37 Posted object from dnsanalyzer.py to elasticsearch
Figure 38 Dashboard displaying malicious dns queries and notifications
Figure 39 Automatically generated pdf document example
Figure 40 GetIPIntel API handling
Figure 41 Gantt Chart
Table of Tables
Table 1 Side-by-side comparison with similar systems
Table 2 Test Case Summary
Table 3 Test Case 1
Table 4 Test Case 4
Table 5 Test Case 3
Table 6 Test Case 4
Table 7 Test Case 5
Table 8 Test Case 6
Table 9 Test Case 7
Table 10 Test Case 8
Table 11 Test Case 9
Table 12 MoSCoW Prioritization
Table 13 Team Members
Table 14 Work Breakdown Structure
Acronym
IDS Intrusion Detection System
IPS Intrusion Prevention System
HIDS Host Intrusion Detection System
NIDS Network Intrusion Detection System
SOC Security Operations Center
ELK Elasticsearch Logstash Kibana
IP Internet Protocol
SIEM Security Information and Event Management
NSM Network Security Monitor
HTTP Hypertext Transfer Protocol
DNS Domain Name System
OS Operating System
API Application Programming Interface
SDLC Software Development Life Cycle
DSDM Dynamic Systems Development Method
IR Incident Response
CnC Command and Control
RAT Remote Access Trojan
IOC Indicator of Compromise

Chapter 1: Introduction
1.1 Introduction to topic
According to the NIST Cybersecurity Framework, “Detection” is the third step in
securing a system. Threat detection on the network is a dedicated procedure for securing any
system. It is carried out by modern organizations that hold sensitive assets which
represent the value of the company. Network traffic is constantly monitored and analyzed.
Logs are pulled from different sources and then examined for possible threats.
Security experts analyze the logs and correlate them with information obtained from
multiple threat intelligence sources, deducing the patterns of a possible attack. This project aims
to help network administrators in their threat detection process by automating the
repetitive tasks and filtering out the avoidable information.
1.2 Current Scenario

‘Threats from the network’ is a problem that is constantly growing on a large scale,
resulting in serious damage to organizations and businesses. Network traffic from bad IPs
and visits to phishing sites have been recognized as the most common methods used by
threat actors. Organizations that operate with highly sensitive data cannot tolerate
systems compromised with such malicious intent.
The banking Trojan Emotet is a successful member of a threat family that is
considered to be a modular platform capable of delivering a variety of different attacks.
Researchers found that the malware was spread throughout financial institutions
as a banking Trojan by common spam campaigns such as payment-themed emails
(MalwarebytesLabs, 2019). The email carried a macro-virus-enabled document as an attachment,
or included a malicious link or phishing domain. The threat actors initially focused
on stealing banking information: email addresses, passwords and other financial details of
the institutions' customers. Since Emotet is regarded as an entry point for other threats
such as Botnets & RATs, cryptomining and phishing, the actors behind Emotet appear
to be acting as a distribution channel for other threat groups. As stated by US-CERT,
its infections have already cost up to $1 million (Bontchev, 2017).
1 | Milan Shrestha
Investigators from the Cisco Cognitive Intelligence group prepared an analytical report
that lists the top categories of security incidents faced by organizations from July
2018.
[Chart: "Top 5 Security Incidents 2018"; categories: Botnet & RATs, Cryptomining, Phishing, Banking Trojan, Trojan, Other]
Figure 1 Security Incidents registered by Cisco Cognitive Intelligence group 2018
Botnet & RATs dominated with 58% of the security incidents, which included
hazardous botnet threats and CnC such as Andromeda and Xtrat. In second place,
cryptomining, among the most popular DNS threats and dating back to the early 2000s, is
still on the chart with 30% of security incidents. Although Phishing and Banking Trojans
hold small shares, 9% and 2% respectively, both threats are without doubt equally capable
of causing disaster on any victim's network (efficientIP, 2018).
The year 2017 was recorded as a year of cyber-attacks, since attack vectors like
WannaCry and NotPetya raised awareness globally. Business groups are
dedicated to and more focused on understanding and preventing the popular cyber threats used by
threat actors or cybercriminals; statistically, many of these are based on DNS queries.
Since DNS is global, varied and dynamic, and was already in use in traditional security
systems, DNS-based attacks are the most effective and popular among
cybercriminals. The Cisco 2016 Security Report concluded that 91% of malware
uses DNS for DoS amplification or communication with CnC servers. Malicious DNS queries
have had a serious business impact (Cisco, 2019). In 2017, a loss of around $1
trillion was registered in cyber-attacks involving DNS-based malware. Research from 2018
shows that the average cost of the damage caused by DNS-based cyber-attacks
increased by 57% compared with the results from 2017.
[Chart: "Malicious Nepal Domain"; categories: Social Engineering, Malware Distribution; vertical axis from 0 to 250]
Figure 2 Nepali domain used for malicious purpose
Social engineering, technically the exploitation of humans, has also been a major technique
for provoking victims into handing out their information unknowingly. When the threat actor uses web
services for the exploitation, it is considered a phishing attack. It is a popular method
among cybercriminals targeting popular and branded financial institutions, e-commerce
sites, social media and government. OneDrive, Microsoft Office 365, Facebook and
Amazon are the most phished sites, since they are most
commonly used by general users and offer scope for fraud (Zscaler ThreatLabZ, 2019). According to
the report prepared by PhishLabs in 2018, over 1.3 million sites were malicious phishing sites in
2017, spread across nearly 300,000 unique domains. According to the annual threat report 2018
published by ThreatNix, cyberspace in Nepal is prone to human exploitation (social
engineering), which dominates the global threat of malware distribution. Most data breach
cases are rooted in spear phishing via email (Threatnix, 2018).
Figure 3 % of respondent threats (left) and vulnerabilities (right) on 2013-2017 (Kessel, 2018)
The chart above shows the state of cybersecurity awareness among employees
according to the 20th Global Information Security Survey 2017-18. Low awareness increases the risk,
and both attackers and unaware employees appear among the greatest immediate threats
(Touche, 2017). At this threat level, organizations are willing to acquire the key
elements of cybersecurity resilience with tools such as antivirus and intrusion detection and
prevention systems.
With such growing numbers of events in cyberspace, the Security Operations Center has
become the heart of an organization's cyber threat detection and response, providing a
centralized and structured hub for all cybersecurity events. SOCs are therefore the next
common thing for organizations to adopt, but only 52% of survey respondents have one
in their firm (Kessel, 2018).
According to the analysis of the survey conducted on 19th December 2018, IT
organizations around Kathmandu are likely to use NIDS or HIDS for alarm
generation but hesitate to analyze the logs further for threat detection. 80% use manual
methods for threat detection, and 90% of IT people are aware of and interested in an
automated threat monitoring system.
NOTE: Survey analytics are included in the Appendix section.
1.3 Problem Statement

In section 1.2, we discussed, based on a case study from the 20th Global
Information Security Survey, that the cyber threat activities occurring in many organizations
are categorized as ‘common attacks’. According to Greg Young, Research Vice President
at Gartner, 99% of exploited vulnerabilities will continue to be common, already known ones through
the year 2020. The dominant threats in the current scenario, malicious DNS queries and
the phishing attack (one aspect of social engineering), are likely to persist longer
than the time it takes to mitigate them (Paul van Kessel, 2018). Threat identification with
correlation methods that make analysts aware of such network activities is therefore crucial.
Several services are currently available that claim to find threats in a
system or network automatically or without user interaction, but they have been failing over the
years. An IDS attempts to automate the process by generating alarms on suspicious
activities, which results in a huge number of false positives. Feeding IPs and
domains into threat intelligence services while analyzing logs manually is infeasible for any
analyst because of the sheer volume of activity on a network every second.
Security information and event management (SIEM) has helped with managing the logs
but has been unable to filter out the avoidable fields. This can become a serious challenge
for an analyst at times.
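The two difficulties described above, avoidable fields and sheer volume, can be illustrated with a minimal sketch. It is not the project's code: the field names are assumptions loosely modelled on Bro logs, and `slim` and `unique_indicators` are hypothetical helpers. The idea is to keep only the fields an analyst needs and to look up each IP or domain once, however often it appears.

```python
# Assumed field names, loosely modelled on Bro logs; not taken from the project.
KEEP_FIELDS = ("ts", "id.orig_h", "id.resp_h", "query")


def slim(record, keep=KEEP_FIELDS):
    """Return a copy of the record holding only the fields worth analysing."""
    return {name: record[name] for name in keep if name in record}


def unique_indicators(records):
    """Collect each destination IP and queried domain exactly once."""
    ips, domains = set(), set()
    for record in records:
        if record.get("id.resp_h"):
            ips.add(record["id.resp_h"])
        if record.get("query"):
            domains.add(record["query"].lower())
    return ips, domains


# Two made-up records that differ only in client and letter case of the query.
raw_records = [
    {"ts": "1546300800.0", "uid": "C1", "id.orig_h": "10.0.0.5",
     "id.resp_h": "8.8.8.8", "proto": "udp", "query": "Example.com"},
    {"ts": "1546300801.0", "uid": "C2", "id.orig_h": "10.0.0.7",
     "id.resp_h": "8.8.8.8", "proto": "udp", "query": "example.com"},
]

slimmed = [slim(r) for r in raw_records]
ips, domains = unique_indicators(slimmed)
print(len(raw_records), "records,", len(ips), "IP and", len(domains), "domain to look up")
```

Deduplicating before any lookup also helps when a threat intelligence service limits the number of requests.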
In a short interview during the survey session, security analysts at Vairav
Technology (formerly known as Security Department of Rigo Technology Pvt. Ltd.) described
the problems they face during threat hunts and shared their need for automation over
manual check-ups. Although network analysts suggest deploying a threat monitoring
system in their respective organizations, management's poor understanding of the risk factors
makes the organizations hesitate to invest in a system that is beyond their budget.
1.4 Project as a solution

Since the project will deliver a system that detects threats that are ‘common’ to
various globally recognized threat intelligence sources, it will address the problem of defending
against common attacks over the network.
Secondly, the daunting task for an analyst of hunting for malicious IPs and domains in the
network logs needs to be addressed. In this work, we present systematic solutions to the
aforementioned problems. This project aims to develop a system that automatically analyzes the
logs from the Bro network security monitoring tool, which includes the features of a
signature-based intrusion detection system (IDS). This development removes the hectic task of
filtering and parsing logs continuously. Instead, the person responsible can use the time to
understand the output of the logs. Visualizing suspicious activities on the
network in real time and correlating the log output with reputed cyber threat intelligence is another
important feature of this project.
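As an illustration of the automated analysis described above, the following is a minimal sketch that parses a tab-separated Bro dns.log and flags queries found in a blocklist. It uses only the Python standard library rather than the Brothon library named in the list of figures. The sample log is trimmed to a few columns, and the blocklist, the domain names and the helpers `parse_bro_log` and `flag_queries` are assumptions made for this example. A real deployment would obtain indicators from threat intelligence services instead.

```python
# A trimmed, made-up dns.log in Bro's tab-separated layout (real logs have more columns).
SAMPLE_DNS_LOG = (
    "#separator \\x09\n"
    "#fields\tts\tuid\tid.orig_h\tid.resp_h\tquery\n"
    "#types\ttime\tstring\taddr\taddr\tstring\n"
    "1546300800.0\tCabc1\t10.0.0.5\t8.8.8.8\texample.com\n"
    "1546300801.0\tCabc2\t10.0.0.7\t8.8.8.8\tbad-domain.test\n"
    "#close\t2019-01-01-00-00-02\n"
)

# Hypothetical indicator list standing in for a threat intelligence feed.
BLOCKLIST = {"bad-domain.test"}


def parse_bro_log(text):
    """Yield one dict per data row, using the #fields header for the keys."""
    fields = None
    for line in text.splitlines():
        if line.startswith("#fields"):
            fields = line.split("\t")[1:]
        elif line and not line.startswith("#") and fields:
            yield dict(zip(fields, line.split("\t")))


def flag_queries(records, blocklist):
    """Return the records whose queried domain appears in the blocklist."""
    return [
        r for r in records
        if r.get("query", "").lower().rstrip(".") in blocklist
    ]


hits = flag_queries(parse_bro_log(SAMPLE_DNS_LOG), BLOCKLIST)
for hit in hits:
    print(hit["id.orig_h"], "queried", hit["query"])
```

Reading the column names from the `#fields` header instead of hard-coding positions keeps the sketch usable when a log carries a different set of columns.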
[Chart: "Threat Risks in Nepal"; categories: High, Medium, Low]
Figure 4 Survey result on threat risk in Nepal
With reference to the survey conducted, most of the candidates are aware of
cyber threats and of the level of threat risk in Nepal, but their organizations have failed to
establish a system for threat detection and monitoring. This system is an
open source project and can be optimized with further development in upcoming
versions. Being feasible and a contribution to the open source community, the
system can be a solution for creating a threat hunting scenario as a beginning
in the case of Nepal.
1.5 Aim and Objectives

1.5.1 Aim
This project aims to provide a web portal that visualizes threat activities and filters
out the avoidable information for threat detection as much as possible, while also correlating the
log output with data from threat intelligence services to detect common attack
patterns.
1.5.2 Objectives
The objectives followed during the development of this project are listed below, in
accordance with the adopted software development methodology, the Dynamic
Systems Development Method (DSDM).
• An extensive study of the threat reports issued by security researchers from renowned
organizations such as Cisco, PhishLabs and Rapid for the global status, and Vairav
Technology (formerly known as Rigo Technology) and ThreatNix for Nepal’s
cybersecurity status.
• A comprehensive study of existing tools and techniques used for defending against possible
threats from the network, referenced against the survey conducted during the requirement
analysis phase of development.
• Understanding the problem and coming up with the solution of automating the threat
detection procedure, supported by various frameworks and application programming
interfaces (APIs).
• Building up knowledge of Cyber Threat Intelligence (CTI) services and their
providers, such as VirusTotal, Cisco Talos, Malwarebytes and PhishTank.
• Analyzing the logs from the Bro Network Security Monitor tool, parsing HTTP headers
and DNS queries for threat intelligence feeds.
• Filtering the parsed logs into a document-based database, i.e. Elasticsearch (ES).
• Since the project aims to automate threat detection and notify the
concerned end user, the script is written in the Python programming language, importing
the required libraries.
• Kibana, one of the elements of the ELK stack, is considered as the SIEM framework
for the project and provides the dashboard for monitoring events.
• This project serves as an alert system, so end users are notified by a bot alert
system. The Slack Client API provided by Slack, a team collaboration service, is used.
8 | Milan Shrestha
Chapter 2: Background
2.1 Requirement Analysis through Survey
To understand the features required for an efficient threat monitoring and alert system, a
general survey with a self-made questionnaire was conducted. Since the project considers
network administrators and security analysts of organizations as its end users, the survey
was conducted among people with good knowledge of threat detection and an interest in
deploying such a system, including Security Analysts, IT Officers, IT students, and
Security Engineers from various organizations around Kathmandu.
Figure 5 Total number of survey candidates and their position (bar chart; x-axis: Position [Security Analyst, IT Officer, IT Student, Security Engineer]; y-axis: No. of candidates)
From this survey, it was concluded that a threat monitoring system was, in fact, a
necessity. Differing opinions were obtained on whether to implement the system with an
existing IDS or to set up a new environment. After compiling the survey statistics, the
following requirements were listed:
• A dashboard portal for threat monitoring
• A weekly threat report
• Correlation of parsed logs with different sources
• An alert system
2.2 About the end user

According to a global survey conducted by Alert Logic on the users of threat detection
systems, the IT security team (70%) is, as would be expected, the primary consumer, with
the Incident Response and SOC teams (43% and 38% respectively) being significant users of
threat detection based on threat intelligence (Alert Logic, 2017).
Figure 6 Preferred threat detection technique by Security Analysts in Nepal (chart title: Preferred Threat Detection Technique in Nepal; categories: Automated detection, Manual Check-up, No Idea)
In the case of Nepal, this project targets the network administrators of organizations
that hold high-value assets and cannot tolerate any kind of cyber threat event. Since this
project provides alerting and a platform for real-time monitoring, administrators or
incident responders can act on detected threats. Besides that, this project can be handy
for security analysts in the SOC department who continuously hunt for threats, since
automating the threat hunting process saves them time. According to the survey findings,
most analysts and respondents used the traditional technique for threat detection, namely
manual check-up, and were positive about automating the procedure.
2.3 Understanding the solution

RSA conducted a global survey to gain insight into the technologies, efforts, and
satisfaction with current toolsets across five industry sectors: Financial Service, High
Tech, Education, Service, and Government. Only 20% of organizations were satisfied with
their current ability to detect threats, while the response was more positive among those
that collect specific types of data for investigation (RSA, 2016). This data suggests that
threat detection, or threat hunting, is an effective security practice. So, what exactly
is threat detection/threat hunting?
Figure 7 Growth in values after the implementation of the threat detection system
Threat hunting is a ‘sub-domain’ of threat detection, as it focuses on identifying the
threat at the earliest possible phase. Generally, the threat hunting procedure is manual
and tedious for analysts. Thus, the question raised is, ‘Can threat hunting be fully
automated?’, whereas detecting threats and automatically notifying the incident responder
is a relatively new concept for strengthening security. Detection plays a vital role and
eases the work of the response team, as discovered threats are analyzed further so that
action can be taken in response to the particular event (Robert M. Lee, 2016).
Before discussing the project system, it is necessary to understand how threat hunting is
described. Hunt Evil: Your Practical Guide to Threat Hunting describes the phases of the
hunting process and clarifies that hunting cannot be fully automated, since it requires
human analysts. As stated in the previous chapter, manual check-up or threat hunting can
become a tiresome job for analysts, and automating these tasks is a necessity. The aspect
that can be automated is the detection of threats. Once threats are detected, analysts can
investigate their root cause. The SANS white paper by Eric Cole, Ph.D., titled Automating
the Hunt for Hidden Threats, points out the failure of traditional threat detection tools
such as IDS and antivirus. The author implies that security defenses must be able to adapt
and upgrade to keep pace with modern security events (Eric Cole, 2015). The paper suggests
that such tools and organizations adopt more proactive techniques for defending against
security events, along with the activities carried out during a threat hunt.
This final year project is partially based on the activities that Eric Cole, Ph.D.,
describes as essential. Those activities are:
• Understanding the threats
• Collecting, analyzing, and mapping the data
• Leveraging threat intelligence feeds
• Tracing indicators of compromise (IOCs)
• Responding
To complete the activities listed above, researchers have defined a set of steps known as
the Threat Hunting Cycle. A brief explanation of the steps follows:
Figure 8 Threat Hunting Loop
2.3.1 Planning
The analyst team creates a hypothesis to initiate the hunting process. They follow the
rule of starting small by prioritizing the data of most interest to threat actors. The
scope of analysis, the methodologies to be considered, and the criticality of assets are
defined in this phase. The decision after planning is generally based on hunting
techniques or the latest security trends. It may also be based on a risk assessment
already conducted by the organization.
2.3.2 Gathering Logs

Keeping logs of any system is mandatory, especially for security tools and systems. Logs
from various sources are gathered in this phase. The sources from which logs are gathered
depend on the planning phase. Some of the log sources are:
• Application Level: Antivirus, Firewalls, Syslog Servers, Mail Server.
• Network Traffic: Routers, Switches, NIDS.
• Endpoint Logs: Operating System Logs, HIDS.
2.3.3 Analysis and Monitor

This phase covers analyzing the collected logs and actively monitoring users, networks,
and servers. By monitoring malicious activity and hunting for its IOCs, organizations
become aware of attacks earlier, learn lessons from past events, and reuse them to detect
and prevent future ones. The analysis may also refer to the actual hunting process. Most
organizations today rely on alerts from existing tools for analysis and monitoring, yet
analysts are overloaded with information and find it very difficult to deduce patterns.
2.3.4 Respond
After the hunting phase, any discovery should be reported to the respective team, usually
the organization's IR team. Actively responding to events adds value to the organization,
as it prepares it for possible future malicious events.
(Sqrrl, 2018)
2.3.5 System Architecture

The four phases above briefly explain the steps of traditional threat hunting methods. Of
these phases, analysis and response are the most critical, yet have proven challenging
for analyst teams. Although the logs are generated and gathered at one point, analysts
have had to spend more time filtering logs than actually hunting. The system developed in
this project assists in filtering by parsing the logs generated by Bro NSM. The main logs
analyzed in this project for finding malicious events are HTTP and DNS.
Figure 9 Project System Architecture (diagram blocks: Bro NSM, Bro Logs, System, Alerts)
An open source SIEM framework consisting of Elasticsearch, Logstash, and Kibana,
collectively known as the ELK stack, is installed to manage the filtered logs of detected
threat information and to display them for real-time monitoring. IP addresses and domain
names are extracted from the logs and queried against threat intelligence platforms, i.e.
GetIPIntel and OpenPhish. The scripts for detecting threats (malicious DNS queries and
phishing domains) also pull information about the particular threat agent, which is
ultimately visualized in the Kibana web portal. An alerting feature is also introduced,
along with an automatically generated summary threat report. For alerts, Slack is used as
the primary application. Providing these features in one system eases the work of
security analysts and improves the productivity of threat detection.
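The data flow described in this section can be outlined in a minimal sketch. The function
names, the sample rows, and the in-memory stand-ins for the threat intelligence lookup and
the alert sink are assumptions made for illustration; they are not taken from the
project's scripts.

```python
from typing import Callable, Dict, Iterable, List


def detect_threats(
    rows: Iterable[Dict[str, str]],
    field: str,
    is_malicious: Callable[[str], bool],
) -> List[Dict[str, str]]:
    """Return the log rows whose `field` value is flagged by the lookup."""
    findings = []
    for row in rows:
        indicator = row.get(field, "")
        # Bro writes "-" for an unset field; skip those and empty values.
        if indicator in ("", "-"):
            continue
        if is_malicious(indicator):
            findings.append({**row, "indicator": indicator})
    return findings


def run_pipeline(rows, field, is_malicious, sinks):
    """Push every finding to each sink (e.g. indexer, alert bot, report)."""
    findings = detect_threats(rows, field, is_malicious)
    for finding in findings:
        for sink in sinks:
            sink(finding)
    return findings


# Hypothetical sample data standing in for parsed dns.log rows.
sample_rows = [
    {"id.orig_h": "10.0.0.5", "query": "example.org"},
    {"id.orig_h": "10.0.0.7", "query": "bad.example.net"},
    {"id.orig_h": "10.0.0.9", "query": "-"},
]
blocklist = {"bad.example.net"}  # stand-in for a threat intelligence lookup
alerts = []  # stand-in for the Slack bot or Elasticsearch

results = run_pipeline(
    sample_rows, "query", lambda value: value in blocklist, [alerts.append]
)
```

Keeping the lookup and the sinks as plain callables means the same loop could serve both
the DNS and the HTTP analysis, with only the field name and the lookup changing.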
2.4 Similar System Review

Although the idea of threat detection and of providing a platform for proper threat
visualization has become popular among various organizations, only a few systems focus
solely on the automation process. The following security and compliance solution services
served as inspiration for this project.
2.4.1 Nimbus
Figure 10 Dashboard of Nimbus (TeamCymru)
Team Cymru is an organization that provides security platforms and services to various
organizations. Nimbus is one of its community services. Nimbus is a real-time threat
monitoring system built with Kibana. It works with cloud-based NetFlow collection
together with an analysis and reporting platform. The IPs from the NetFlow logs are
pushed to partner and world-class threat intelligence feeds for analysis of IP reputation
and command-and-control (CnC) servers.
2.4.2 Vectra
Figure 11 Dashboard of Vectra
Vectra is another automation-based threat detection service, which claims to be powered
by AI for attack detection and response. The service comes with IR features, including
the identification of hidden tunnels in HTTP and DNS traffic that may evade security
enforcement sensors. Vectra offers Cognito Detect for cyberattack detection and threat
hunting, which automatically detects high-risk threats, triggers alerts, and correlates
them to hosts so that the security team can respond quickly and avoid data loss (Vectra,
2018).
2.4.3 Cybersponse
Figure 12 Cybersponse Dashboard provided by demo version.
Cybersponse is a renowned threat detection platform that provides automated incident
response solutions to cybersecurity threat management teams. It is considered the only
patented automated IR platform that fills the gap between automation-only and
human-dependent security organizations (CyberSponse, 2018). It relies on alerts from
various sources but does not include its own technology for threat identification. The
company aims to improve its security products through collaboration with Raytheon.
However, as of this writing, that project has not been finished or made publicly
available (Raytheon, 2018).
2.4.4 Phantom
Figure 13 Dashboard of Splunk’s Phantom
Phantom is a service that follows Security Orchestration, Automation and Response
(SOAR), with incident response and threat intelligence enablement. It supports multiple
APIs to connect and coordinate with various platforms. The platform was acquired by
Splunk to integrate with its security operations center and accelerate incident response
(Sawers, 2018). Phantom also pulls data of any type from any source to trigger its IR
technology. The service is also known as an analyst-driven workflow system, since it hits
the automated SIEM and queries threat intelligence for contextual information to aid
decision making (Splunk, 2018).
2.5 Analysis with a similar system

The following table gives a side-by-side comparison of similar systems with my project:
Sn.  Features                Nimbus  CyberSponse  Phantom  Vectra  My Project
1.   Threat Detection        ✔       ✔            ✔        ✔       ✔
2.   Full Automation         ✔       ✔            ✔        ✔       ✔
3.   Geo-location preview    ✔       ✘            ✔        ✘       ✔
4.   Correlate Alerts        ✔       ✔            ✘        ✔       ✔
5.   Security Orchestration  ✔       ✘            ✔        ✔       ✔
6.   Incident Response       ✔       ✔            ✔        ✔       ✘
7.   SIEM integration        ✔       ✔            ✘        ✔       ✔
8.   Phishing Detection      ✘       ✔            ✘        ✘       ✔
9.   Multiple log source     ✔       ✔            ✔        ✔       ✘
10.  Open Source             ✘       ✘            ✘        ✘       ✔
11.  Data Collection         ✔       ✔            ✔        ✔       ✘
12.  Reporting               ✔       ✔            ✔        ✔       ✔
Table 1 Side-by-side comparison with similar systems
Comparing the functions of the similar systems with the system to be developed in this
project shows that most of the features are already included, in line with end-user
requirements, and the remainder could be added in later versions of the system. Since the
project is based entirely on open source components and custom scripting, the system has
the flexibility to accommodate additional components.
2.6 Similar Projects Review

2.6.1 Secure automated threat detection and prevention (SATDP)
Reducing the workload of security analysts by automating threat detection and threat
hunting is the common goal of both projects. The SATDP concept was developed in 2018 by a
team of four and published in the International Journal of Engineering & Technology. It
discusses the use of Artificial Intelligence and Machine Learning algorithms, with
anomaly detection as a focus of the project (CH. Ramaiah, 2018).
The project is based on automating threat detection and introducing machine learning
techniques into an IDS, and it also considers prevention of the detected threats.
However, its detection is anomaly-centric, so a possible limitation is the inability to
correlate findings with threat intelligence.
2.6.2 StoQ
Like this project, this analysis framework depends entirely on open source systems such
as Bro, Suricata, and Elasticsearch, and on threat intelligence sources and tools such as
FireEye, VirusTotal, TotalHash, and YARA for correlation. The project, developed in 2011,
also aims to automate and simplify the repetitive tasks performed by analysts. Tasks such
as parsing SMTP sessions, extracting attachments, and scanning and analyzing them are
automated with a collection of scripts (StoQ, 2017).
Although the architecture of StoQ and that of the system developed in this project are
similar, this project adds features that overcome limitations of StoQ, such as sending
notifications, visualization, and report generation.
2.7 Technical Aspects

This project centres on parsing the logs generated by Bro (IDS). Automating this process
is made possible with the Python programming language. The tools used in this project are
reviewed below:
2.7.1 Operating System

The system can be developed on any UNIX-based operating system, with preference given to
Debian-based Linux and Macintosh. For prototyping, it was initially proposed to develop
the system on Debian 9 hosted in a virtual environment; further study confirmed that the
system can be developed directly on the host machine.
• Ubuntu 16.04 LTS: Ubuntu is a Debian-based distribution, so the stability of its
repositories can be trusted. All the scripting and the management of the ELK SIEM
framework services run here. Considerable research on ELK had also already been done on
Ubuntu systems, so this OS was chosen to avoid hurdles during development.
2.7.2 Programming Language

• Python: Python, version 3.7, is used as the main programming language to achieve the
objectives of this project. Python is relatively easy to learn and understand, and it
provides all the libraries that support this project, such as the Brothon library.
2.7.3 SIEM framework

• ELK stack: The ELK framework is used as the central point for logs and their
correlation. Being open source, it is also used as the platform for visualizing the
information targeted by this project.
2.7.4 Text Editor

Any text editor or IDE, such as Sublime Text 3, PyCharm, Vim, or Gedit, can be used for
scripting, but the Nano text editor is chosen as the primary editor for the project.
• Nano: Since the development is based on a Linux system, Nano is the main text editor.
Despite lacking a graphical user interface, it is convenient for making modifications and
easy to use.
2.7.5 Libraries
• Brothon: This Python library for Bro IDS is used in the development of the system. It
allows the scripts to parse the Bro logs. Brothon is a package that supports the
ingestion, processing, and analysis of Bro IDS data with Python. Alternatives to this
library exist, but Brothon is comparatively easy to use and meets the system requirements.
• Elasticsearch API: This API is used to post the information to the Elasticsearch SIEM.
It serves as the medium of communication between Bro and Elasticsearch.
• GetIPIntel.net API: This IP intelligence service is used to determine the likelihood
that an IP is a proxy, a VPN, or malicious.
• OpenPhish API: With this API, domains requested from inside the network are checked
against globally recognized phishing site datasets. An alternative to this API is
PhishTank, which also lists registered phishing sites. In comparison, the response from
the OpenPhish API was found to be somewhat faster than that of the PhishTank API.
• Slack Client: Slack, a team collaboration tool, provides a service for adding an
application as a bot using the Slack client in Python. This service is used to notify the
company's network administrators about detected threats and phishing URLs. Discord is
another platform with similar facilities, but the purposes of the two applications differ
in an institutional setting.
• PDFDocument: To document the detections of the Threat Detection System in Portable
Document Format (PDF), this library is used to automatically generate a report at a set
interval of 24 hours.
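As an illustration of how an IP intelligence lookup of this kind might be wired in, the
sketch below builds a GetIPIntel request URL and interprets the returned score. The
endpoint and the ip and contact parameters follow the service's public documentation as
understood at the time of writing and should be verified before use; the helper names,
the contact address, and the 0.95 threshold are assumptions, not values from the project.

```python
from urllib.parse import urlencode

# Assumed endpoint of the GetIPIntel HTTP API; verify against the service docs.
GETIPINTEL_ENDPOINT = "http://check.getipintel.net/check.php"


def build_lookup_url(ip: str, contact_email: str) -> str:
    """Build the query URL for one IP address (hypothetical helper)."""
    query = urlencode({"ip": ip, "contact": contact_email})
    return GETIPINTEL_ENDPOINT + "?" + query


def classify_score(raw_response: str, threshold: float = 0.95) -> str:
    """Map the plain-text score returned by the service to a label.

    The service is understood to return a probability between 0 and 1,
    with negative values signalling an error. The threshold is an
    illustrative assumption.
    """
    score = float(raw_response.strip())
    if score < 0:
        return "error"
    return "suspicious" if score >= threshold else "clean"


lookup_url = build_lookup_url("8.8.8.8", "admin@example.com")
```

The response body would be fetched separately (for example with urllib.request.urlopen)
and passed to classify_score, which keeps the scoring rule testable without network
access.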
2.7.6 Logs source

• Bro Logs: A Bro log sample is used to identify malicious HTTP and DNS queries. Bro logs
are used in this project because they work flexibly with the Python programming language
and because malicious log samples are easily available on the internet.
Chapter 3: Development
3.1 Methodology Consideration
To start a project with smooth development phases, proper planning is required before
anything else. Proper planning helps to break down demanding tasks into smaller,
manageable chunks. Since this final year project involves substantial development, from
script writing to demonstration, several software development life cycle methodologies
were considered.
3.1.1 Waterfall Model
Figure 14 Waterfall Model for SDLC (Sim, 2009)
In the software development life cycle, the waterfall model is a linear sequence of
development phases. Planning and requirement collection are completed before the
development phase starts. Once a phase is completed, the model does not expect developers
to revisit it. Thus, the initial phases, planning and requirement gathering, are the most
important in this software development model (Rouse, 2007).
3.1.2 Dynamic System Development Methodology (DSDM)
Figure 15 Diagrammatical representation of DSDM work-flow (Siddharth, 2018)
DSDM is a popular subset of agile methodology. This software development framework has
clearly defined phases, sub-phases, roles, and principles. It bears many similarities to
other agile methodologies such as Extreme Programming and Scrum (Slegten, 2016). The
methodology aims to deliver an application of the desired quality without exceeding time
and budget. To achieve this, DSDM focuses on interaction with the customer and end user,
frequent delivery of prototypes, extensive testing throughout the process, and
prioritization of the requirements given by the customer (Daniel Dinis Teixeira, 2005).
3.1.3 Big Bang Model
Figure 16 Diagrammatical representation of the Big bang model
Unlike other SDLC models, the Big Bang model involves minimal planning and does not
follow any specific process. Time, effort, and resources are considered the input and the
developed software the output, which may or may not meet the client’s requirements
(Tutorials Point, 2019).
3.2 Selected Methodology

Developing a project for threat detection and visualization involves various steps.
Constant reviews of the system are required, so active user involvement is important.
Following traditional software development methods, which focus on the final product,
would be troublesome for the developer if any errors were found (Benjamin J. J. Voigt,
2004). Since DSDM reduces the cost of errors and gives priority to user perception, this
model is suitable for this project. The timebox feature of the framework sets a deadline
for each stage, which makes prototype development faster and more productive. With this
methodology, modifications can easily be made between iterations, so adding features to
and removing features from the system is supported. The DSDM framework comprises three
sequential phases, which are described below.
Phase 1: The Pre-project

This phase covers initialization: project suggestions, selection of the proposed project,
and commitment to it. The pre-project phase anticipates issues at an early stage to
minimize problems at later stages.
Phase 2: The Project life-cycle

This phase covers the five core stages of the project: Feasibility Study, Business Study,
Functional Model Iteration, Design & Build Iteration, and Implementation. The first two
stages, the studies, are sequential and carried out within the same time span. Here, the
problem is addressed, and the likely cost and feasibility of delivering a system that
solves the business problem are assessed. Client understanding and requirement
information are gathered during the studies.
After these studies have been conducted, the system is developed iteratively and
incrementally in the Functional Model Iteration, Design & Build Iteration, and
Implementation stages. During this period, project development is divided into several
timeboxes (Selected Business Solution, 2019).
Phase 3: Post Project

The post-project phase ensures that the final product operates effectively and
efficiently. Maintenance of the system after its implementation is carried out according
to the principles of the DSDM model (Marc Clifton, 2003).
3.3 Development Stages

After the software development methodology, the Dynamic Systems Development Method
(DSDM), had been finalized and the survey analyzed, the tools for development were
collected. The development in each timebox, according to the selected methodology, is
described below:
3.3.1 Timebox 1: Threat Detection

I. Planning
According to the proposed software development methodology, the ‘must have’ feature of
this system is Threat Detection, which accounts for 80% of the development in this phase.
The design and build in this phase also lay the groundwork for the upcoming development
phases, i.e. Visualization, Threat Categorization, the bot-based Alert System, and
automatic Threat Report Generation.
For threat detection, the essential inputs are the IP addresses (from DNS queries) and
the domain names requested over the internet. To find threats in the network, the logs
generated by Bro are used as the log source. Bro is a popular open source network
monitoring system, and a module for parsing its logs in the Python programming language
already exists in the form of the Brothon library. For prototyping this project, a
malicious log sample is used as the main log for scanning. The fields required for threat
detection are parsed to obtain the source and destination information of detected
threats. The script for this build runs continuously until it receives a ‘keyboard
interrupt’, reflecting the fact that logs have to be monitored constantly.
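To make the parsing step concrete, the following is a minimal standard-library sketch of
reading a Bro tab-separated log into dictionaries. It is a simplified stand-in for what
the Brothon reader provides, not the project's code: it assumes the default tab
separator, keeps every value as a string, and the sample log content is invented for
illustration.

```python
from typing import Dict, Iterable, Iterator


def read_bro_rows(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per data line of a Bro ASCII log.

    Field names are taken from the '#fields' header line; all other
    lines starting with '#' are metadata and are skipped.
    """
    fields = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#fields"):
            fields = line.split("\t")[1:]
        elif line and not line.startswith("#"):
            yield dict(zip(fields, line.split("\t")))


# Invented sample in the layout of a Bro dns.log (fields abbreviated).
SAMPLE_DNS_LOG = (
    "#separator \\x09\n"
    "#path\tdns\n"
    "#fields\tts\tuid\tid.orig_h\tid.resp_h\tquery\n"
    "#types\ttime\tstring\taddr\taddr\tstring\n"
    "1557300000.000000\tCabc123\t10.0.0.5\t8.8.8.8\texample.org\n"
    "#close\t2019-05-08-00-00-00\n"
)

rows = list(read_bro_rows(SAMPLE_DNS_LOG.splitlines()))
```

With Brothon installed, the equivalent loop would, as far as the package's README shows,
iterate over bro_log_reader.BroLogReader(path).readrows(), which additionally converts
field types.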
For malicious DNS queries, GetIPIntel and IPAPI are used as the platforms for threat
queries, while the OpenPhish API is used for phishing URL detection. If any query from
the program is found to be malicious, an alert is raised and the finding is visualized in
the Kibana web portal. Elasticsearch and Kibana from the ELK stack are used as the
platform for normalizing alert logs and creating a dashboard for the system.
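The phishing check can be sketched as a set lookup against a list of known phishing URLs.
The sketch assumes the feed is plain text with one URL per line (OpenPhish is understood
to publish its community feed in this form, which should be verified); the helper names
and the sample feed are hypothetical.

```python
from urllib.parse import urlparse


def phishing_hosts(feed_text: str) -> set:
    """Extract the lower-cased host names from a one-URL-per-line feed."""
    hosts = set()
    for line in feed_text.splitlines():
        host = urlparse(line.strip()).hostname
        if host:  # blank or malformed lines yield no host and are skipped
            hosts.add(host.lower())
    return hosts


def is_phishing_host(host: str, hosts: set) -> bool:
    """Return True if the requested host appears in the feed."""
    return host.strip().lower() in hosts


# Invented feed content for illustration only.
SAMPLE_FEED = (
    "http://login.bad.example.net/verify\n"
    "\n"
    "https://Phish.Example.com/a.php\n"
)
known_hosts = phishing_hosts(SAMPLE_FEED)
```

Matching on the host name rather than the full URL fits the HTTP log, where the Host
header is available for every request; the trade-off is that a legitimate site hosting a
single phishing page would be flagged as a whole.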
II. Design
Figure 17 Flowchart demonstrating overall project
III. Build
The tool is built by combining three scripts: dnsanalyzer.py, phishanalyzer.py, and
main.py. Since the DNS and HTTP logs from Bro NSM are analyzed, dnsanalyzer.py and
phishanalyzer.py analyze the DNS queries and HTTP headers respectively, using the Python
library bro_log_reader, which is imported from the Brothon package. Other Python
libraries are imported as required by the project.
Both dnsanalyzer.py and phishanalyzer.py include a function that performs the respective
task. The HTTP() function in phishanalyzer.py checks whether a requested domain is
malicious, whereas the DNS() function in dnsanalyzer.py passes the DNS queries to the
threat intelligence APIs to obtain their malicious status. The scripts use Python loops
and conditions to analyze the supplied Bro logs (DNS and HTTP) continuously. The findings
are placed in a Python dictionary, which is converted to JSON format so that the
Elasticsearch object can post the JSON values to Elasticsearch.
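The dictionary-to-JSON step can be illustrated with a short sketch that builds, but does
not send, an indexing request. The index name ctms_proj and the localhost address appear
elsewhere in this report; the document field names, the helper functions, and the _doc
endpoint (which depends on the Elasticsearch version) are assumptions made for
illustration.

```python
import json
from datetime import datetime, timezone
from urllib.request import Request

ES_URL = "http://127.0.0.1:9200"
INDEX = "ctms_proj"


def build_finding(row: dict, category: str, detail: dict) -> dict:
    """Collect one detection into a flat dictionary (hypothetical schema)."""
    return {
        "@timestamp": datetime.now(timezone.utc).isoformat(),
        "category": category,  # tag used later for threat categorization
        "source_ip": row.get("id.orig_h"),
        "destination_ip": row.get("id.resp_h"),
        "indicator": row.get("query") or row.get("host"),
        "detail": detail,
    }


def build_index_request(finding: dict) -> Request:
    """Serialize the finding to JSON and wrap it in an HTTP POST request."""
    body = json.dumps(finding).encode("utf-8")
    return Request(
        url=f"{ES_URL}/{INDEX}/_doc",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


example_row = {"id.orig_h": "10.0.0.5", "id.resp_h": "8.8.8.8", "query": "bad.example.net"}
example_request = build_index_request(
    build_finding(example_row, "malicious_dns", {"score": 0.99})
)
```

Passing the request to urllib.request.urlopen would send it to a running Elasticsearch
instance. The official elasticsearch Python client offers an index() method for the same
purpose and is presumably what the scripts use.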
Figure 18 Importing Brothon library for dns.log analysis
The log-analyzing scripts for DNS and HTTP have similar functions for detection, posting
to Elasticsearch, bot alerts, and report generation. Both include a function that feeds
IP addresses to an API to obtain detailed information such as geolocation and origin.
These findings are used for notifications in later builds of the project.
IV. Test
Figure 19 Detailed information about malicious dns query
3.3.2 Timebox 2: Visualization and Threat Category

I. Planning
The malicious queries found in the previous timebox are passed on to this phase, in which
the concept of visualizing threats together with their category is defined.
Categorization of threats is made possible by the script written in the previous build,
where each threat is tagged with a unique value. Tagging each threat log helps to
normalize it and to maintain the dashboard for visualization.
In terms of the above diagram, this phase falls under ‘alert and post to elasticsearch’,
so by now the dictionary containing the keys and values that Kibana needs for
visualization is available. Kibana is the element of the ELK stack that serves as a web
portal for an easy and customizable dashboard.
II. Design
Figure 20 Flow diagram for Visualization and Threat Categorization
III. Build
Following the build of Timebox 1, the findings are pushed as data for visualization.
This build extends the scripts developed in Timebox 1, phishanalyzer.py and
dnsanalyzer.py. To post the findings from the Threat Detection build, Elasticsearch and
the Kibana web portal are used as the SIEM framework. Elasticsearch provides a Python
library, which is imported in the scripts. Elasticsearch is hosted on localhost
(http://127.0.0.1:9200), where 9200 is the default Elasticsearch port. The findings of
the previous build are appended to the dictionary and then converted to JSON format,
since Elasticsearch takes JSON data for proper indexing. After the JSON-formatted
objects are pushed to Elasticsearch, the Discover section of Kibana confirms the
connection between the scripts and the ELK server. In this section, all keys are
tabulated with their corresponding values, which makes one of the ‘must have’ features
(Threat Categorization) easier to achieve. With Kibana hosted on localhost
(http://127.0.0.1:5601, where 5601 is the default port of the Kibana web portal), the
visualization purpose of the system is achieved efficiently. Once the indexes and
objects from the build are available, the dashboard is designed in the Visualize section
of Kibana.
Figure 21 Build of phishanalyzer.py posting to elasticsearch
Figure 22 Build of dnsanalyzer.py posting to elasticsearch
IV. Test
Figure 23 A table for threat categorizations
Figure 24 Dashboard for visualization
3.3.3 Timebox 3: Bot Alert and Report Generation

I. Planning
As in the planning of Timebox 2, the results of the Threat Detection cycle are passed in
here, in parallel with Timebox 2. The objective of this timebox is to send alerts and
automatically generate a documented report for future reference. This is one way for
analysts to stay updated on the threat status of a given network. The script includes a
function that collects the findings and produces the output to be sent as alerts and
written to the report. Slack, a team communication application, is used as the medium
for the alert bot, while the Python library pdfDocument is imported for report
formatting.
II. Design
Figure 25 Flow diagram for Alert and Report generation
III. Build
According to the proposed timebox, the threat alert system and report generation goals
are achieved with additional code in the scripts, which are again updated with libraries
and API services. For the bot, a SlackClient object is created, with the API key stored
in a variable. The findings from the previous build (Timebox 1) are pushed as alerts to
the Slack client. The message is formatted within the script, since a common message
format is used for both alerting and reporting. The threat bot is called from
dnsanalyzer.py and phishanalyzer.py so that alerting executes right after detection,
which supports the real-time behaviour of the system.
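A common message format shared by alerting and reporting might look like the sketch
below. The field names and helper functions are hypothetical, and the Slack call itself
is abstracted behind a callable so that the formatting can be exercised without a token.

```python
from typing import Callable, Dict


def format_alert(finding: Dict[str, str]) -> str:
    """Render one finding as the plain-text message used for alerts and reports."""
    lines = [
        "Threat detected: " + finding.get("category", "unknown"),
        "Indicator: " + finding.get("indicator", "n/a"),
        "Source IP: " + finding.get("source_ip", "n/a"),
    ]
    return "\n".join(lines)


def send_alert(finding: Dict[str, str], post_message: Callable[[str], object]) -> str:
    """Format the finding and hand it to the supplied posting function."""
    text = format_alert(finding)
    post_message(text)
    return text


# Stand-in for the Slack bot: messages are collected in a list.
sent_messages = []
alert_text = send_alert(
    {"category": "phishing", "indicator": "bad.example.net", "source_ip": "10.0.0.7"},
    sent_messages.append,
)
```

With the slackclient 1.x package, post_message could be a small wrapper around
SlackClient(token).api_call("chat.postMessage", channel=..., text=...), where the token
and channel are deployment-specific values not given in this report.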
The pdfdocuments.PDFDocument() object is called from the Python module imported in all
the scripts. The report generation function takes a parameter ‘z’, which carries the
message to be printed in the PDF document. The function is defined in dnsanalyzer.py and
phishanalyzer.py and called in main.py. The script is designed so that a report is
generated after a configurable period of time; otherwise, a PDF document is produced as
soon as the program is interrupted.
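The timing behaviour described here, a report after a set interval or on interruption,
can be sketched as follows. The class and function names are hypothetical, and the
report writer is passed in as a callable, so the sketch does not depend on the PDF
library; in the project it would wrap the PDF generation function.

```python
import time


class ReportScheduler:
    """Buffer messages and write a report once the interval has elapsed."""

    def __init__(self, interval_seconds, write_report, clock=time.monotonic):
        self.interval = interval_seconds
        self.write_report = write_report  # e.g. a function producing the PDF
        self.clock = clock
        self.pending = []
        self.last_flush = clock()

    def add(self, message):
        """Queue a message; flush if the interval since the last report has passed."""
        self.pending.append(message)
        if self.clock() - self.last_flush >= self.interval:
            self.flush()

    def flush(self):
        """Write the buffered messages (if any) and restart the interval."""
        if self.pending:
            self.write_report(list(self.pending))
            self.pending.clear()
        self.last_flush = self.clock()


def monitor(events, scheduler):
    """Feed events to the scheduler; always write a final report on exit."""
    try:
        for message in events:
            scheduler.add(message)
    except KeyboardInterrupt:
        pass  # Ctrl+C ends monitoring; the report is still written below
    finally:
        scheduler.flush()
```

A 24-hour report would correspond to ReportScheduler(24 * 60 * 60, write_report). The
injectable clock exists only to make the interval logic testable without waiting.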
Figure 26 Build of main.py for report generation
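The idea of one common message format feeding both the alert and the report can be sketched as follows. This is a minimal illustration, not the project's code: the function names format_finding and publish and the field names of the finding dictionary are hypothetical, and the two lists stand in for the Slack client and the PDF document.

```python
from datetime import datetime, timezone


def format_finding(finding):
    """Render one finding as the shared text block used for alerts and reports."""
    lines = [
        "Source IP: {}".format(finding["src_ip"]),
        "Destination IP: {}".format(finding["dest_ip"]),
        "Timestamp: {}".format(finding["timestamp"]),
    ]
    return "\n".join(lines)


def publish(finding, sinks):
    """Format the finding once, then hand the same text to every sink."""
    text = format_finding(finding)
    for sink in sinks:
        sink(text)
    return text


if __name__ == "__main__":
    alerts = []             # stand-in for messages pushed to the Slack bot
    report_paragraphs = []  # stand-in for paragraphs written to the PDF report
    finding = {
        "src_ip": "10.0.0.5",
        "dest_ip": "203.0.113.7",
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    publish(finding, [alerts.append, report_paragraphs.append])
    print(alerts[0])
```

In the real scripts the sinks would be a call that posts to Slack and the z.p() method of the report object; formatting in one place keeps the two outputs identical.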
IV. Test
Figure 27 Threat Report generation and Slack Bot Alert
Chapter 4: Testing
Testing is one of the important phases in developing a complete system, and the
chosen methodology, DSDM, includes testing right before deployment of the
system.
This project consists of three main scripts. Each module uses a couple of APIs and
corresponding functions. Each module was tested to examine whether the unit component
of the system achieved the expected result.
Sn. | Test Case Objectives | Type | Result
1 | Elasticsearch status API test | Unit | Successful
2 | Slack Client API test | Unit | Successful
3 | IP API test | Unit | Successful
4 | Build phishanalyzer.py execution test | Unit | Successful
5 | Build phishanalyzer.py [dashboard] execution test | Unit | Successful
6 | Build dnsanalyzer.py execution test | Unit | Successful
7 | Build dnsanalyzer.py [dashboard] execution test | Unit | Successful
8 | Automated threat report generation test | Unit | Successful
9 | Test API limitations handling | Unit | Successful
Table 2 Test Case Summary
4.1 Elasticsearch status API test

Test Case: 01
Objective: To have Elasticsearch running and retrieve the index from a script through the API
Expected Result: Active status and index name pushed to Elasticsearch
Actual Result: Message from Elasticsearch and index named ctms_proj displayed
Analysis: Successful
Proof: Screenshots
Figure 28 Status check of Elasticsearch at localhost:9200
Figure 29 Elasticsearch API test with the index name
Table 3 Test Case 1
4.2 Slack Client API test

Test Case: 02
Objective: Active status of the Slack Client API
Expected Result: To get an alert from the Slack bot
Actual Result: Notification from the Slack bot
Analysis: Successful
Proof: Screenshot
Figure 30 Test case for Slack bot for notification
Figure 31 Notification pop-up sample
Table 4 Test Case 2
4.3 IP API test

Test Case: 03
Objective: To retrieve data from the IP API
Expected Result: To get data about the queried IP in a dictionary
Actual Result: Data about the queried IP in a dictionary
Analysis: Successful
Proof: Screenshots
Figure 32 IP API test with queried IP address
Table 5 Test Case 3
4.4 Build phishanalyzer.py execution test

Test Case: 04
Objective: To get phishing URLs from the script phishanalyzer.py
Expected Result: Output printed when a phishing URL is detected
Actual Result: Phishing URL matched with API feeds
Analysis: Successful
Proof: Screenshot
Figure 33 Successful execution of phishanalyzer.py
Table 6 Test Case 4
4.5 Build phishanalyzer.py [dashboard] execution test

Test Case: 05
Objective: To push data to Elasticsearch and visualize it on the Kibana dashboard
Expected Result: Objects created for http() in Elasticsearch
Actual Result: Objects present in Elasticsearch and visualized on the dashboard
Analysis: Successful
Proof: Screenshot
Figure 34 Posted object from phishanalyzer.py to elasticsearch
Figure 35 Dashboard displaying malicious HTTP queries count
Table 7 Test Case 5
4.6 Build dnsanalyzer.py execution test

Test Case: 06
Objective: To obtain suspicious IPs from the threat intel feed
Expected Result: Output printed when malicious DNS queries are detected
Actual Result: Matched with malicious DNS queries and printed the result
Analysis: Successful
Proof: Screenshot
Figure 36 Successful execution of dnsanalyzer.py
Table 8 Test Case 6
4.7 Build dnsanalyzer.py [dashboard] execution test

Test Case: 07
Objective: To post DNS() objects to Elasticsearch and visualize them on Kibana
Expected Result: Objects created for DNS() in Elasticsearch
Actual Result: Objects present in Elasticsearch and visualized on the dashboard
Analysis: Successful
Proof: Screenshot
Figure 37 Posted object from dnsanalyzer.py to elasticsearch
Figure 38 Dashboard displaying malicious DNS queries and notifications
Table 9 Test Case 7
4.8 Automated threat report generation

Test Case: 08
Objective: To have the program automatically generate a documented report
Expected Result: A PDF document in the set path
Actual Result: A PDF document with details of the threats detected
Analysis: Successful
Proof: Screenshot
Figure 39 Automatically generated pdf document example
Table 10 Test Case 8
4.9 Test API limitations handling

Test Case: 09
Objective: To handle the GetIPIntel limit of 15 queries per minute
Expected Result: To push IP addresses continuously and hold for 5 seconds
Actual Result: The system pauses for the minimum time
Analysis: Successful (according to the documentation of GetIPIntel)
Proof: Screenshot
Figure 40 GetIPIntel API handling
Table 11 Test Case 9
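The pause tested above can be expressed as a small throttle. The sketch below is an illustration under stated assumptions, not the project's implementation: the class name Throttle and its parameters are hypothetical, and the clock and sleep functions are injectable only so the behaviour can be checked without waiting. A limit of 15 queries per minute works out to one query every 4 seconds, so the 5-second hold used in the test stays within it.

```python
import time


class Throttle:
    """Space out calls so that at most max_calls happen per period seconds."""

    def __init__(self, max_calls, period, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = period / max_calls  # seconds between two calls
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time at which the previous call was allowed

    def wait(self):
        """Block until the next call is allowed; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self._last + self.min_interval - now)
            if delay:
                self._sleep(delay)
        self._last = now + delay
        return delay


if __name__ == "__main__":
    throttle = Throttle(max_calls=15, period=60)
    print("Minimum gap between queries:", throttle.min_interval, "seconds")
```

Calling throttle.wait() immediately before each request to the intelligence API would keep the script within the limit regardless of how fast the log is read.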
Chapter 5: Critical Analysis

With research, planning, references, and scripting, the development of the proposed system
was successfully completed. The project aimed to build a system that reads Bro NMS logs,
parses the HTTP and DNS logs specifically, analyzes them with Python scripts, detects any
malicious IP addresses or domains in the logs, and pushes an alert if one is found. Overall,
this process is similar to the threat hunting done by security analysts in a SOC department.
The tool aims to reduce tedious work such as manual threat hunting by automating it. The
developed system primarily includes the threat detection procedure, visualization in a
well-designed dashboard, the alert system, and documentation as initially promised, which are
briefly explained in the development section of the report.
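The log-reading step described above can be illustrated without any third-party library. The report's scripts use the brothon BroLogReader for this; the sketch below is only a stand-in that assumes the default Bro ASCII log layout (tab-separated values, a "#fields" header line, and "-" for unset fields). The function names read_bro_rows and unique_queries are hypothetical.

```python
def read_bro_rows(lines):
    """Yield one dict per data row of a tab-separated Bro ASCII log."""
    fields = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#fields"):
            # The header names every column, e.g. ts, id.orig_h, query.
            fields = line.split("\t")[1:]
        elif line and not line.startswith("#"):
            yield dict(zip(fields, line.split("\t")))


def unique_queries(rows):
    """Return DNS query names in first-seen order, skipping unset values."""
    seen = set()
    ordered = []
    for row in rows:
        query = row.get("query")
        if query and query != "-" and query not in seen:
            seen.add(query)
            ordered.append(query)
    return ordered


if __name__ == "__main__":
    sample = [
        "#fields\tts\tid.orig_h\tquery\n",
        "1.0\t10.0.0.5\texample.com\n",
        "2.0\t10.0.0.5\texample.com\n",
        "3.0\t10.0.0.6\ttest.example\n",
    ]
    print(unique_queries(read_bro_rows(sample)))
```

Removing repeated queries before the lookup step matters because every unique name later costs one request against a rate-limited intelligence API.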
Before the development phase of this project began, several journals, whitepapers,
reports, and documentation were reviewed for reference. The current scenario of the
proposed system and the problem it aimed to mitigate were observed. The findings from the
research were then addressed with this project as a solution. The Dynamic Systems
Development Method (DSDM), an Agile methodology, was chosen as the software development
life cycle. As it suggests breaking the project down into smaller builds, the requirements
for the system and the goals of each phase were identified.
One of the significant aspects of this project is that threat detection is based on data
feeds from threat intelligence platforms. It uses GetIPIntel and OpenPhish as the platforms
from which it retrieves information. GetIPIntel uses machine learning for its malicious IP
address analysis. Building the system on such an algorithm therefore lends trust to the
results and is expected to keep false positives low. The service suggests querying the API
from a static network to avoid being blacklisted. OpenPhish, in turn, processes phishing
URLs registered worldwide and is updated daily. These features of both intelligence
platforms can be considered strengths of the developed system. The test cases for each
intelligence feature are included in the Testing chapter of the report.
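Matching logged URLs against a phishing feed can be sketched as below. This is a simplified illustration rather than the project's script: the function names normalise and find_phishing are hypothetical, the regular expression is anchored to the start of the string (a small change from the pattern in the appendix), and a set is used for membership tests instead of comparing every feed entry with every log entry.

```python
import re

# Strips the scheme and an optional leading "www." from a URL.
_SCHEME = re.compile(r"^https?://(www\.)?")


def normalise(url):
    """Reduce a URL to host plus path, without scheme or trailing slash."""
    return _SCHEME.sub("", url.strip()).strip("/")


def find_phishing(feed_urls, log_urls):
    """Return the logged URLs whose normalised form appears in the feed."""
    feed = {normalise(u) for u in feed_urls}
    return [u for u in log_urls if normalise(u) in feed]


if __name__ == "__main__":
    feed = ["http://www.bad.example/login/", "https://evil.example/a"]
    logged = ["bad.example/login", "good.example/", "evil.example/a/"]
    print(find_phishing(feed, logged))
```

Normalising both sides in the same way avoids missing a match only because one side carries a trailing slash or a "www." prefix.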
The system offers two ways of displaying the results of the program. According to the
results of the requirement gathering phase, security analysts and network engineers
preferred graphical visualization over text-based logs. So, to meet this requirement, the
project delivers a graphical dashboard and an Elasticsearch tabulation of detected threats.
The end users (mainly security analysts) can filter the results with queries in either view.
Although the two ways of displaying results are considered separate, each update on the
Kibana dashboard is in fact synchronized with Elasticsearch. The test case for threat
visualization is included in Chapter 4. Another effective form of output is notification and
documentation of the system's findings. A Slack bot was successfully developed for alerting
the responsible person. This alerting feature removes the need to monitor the network
continuously: the system detects the threat automatically and pushes notifications. At the
end of the set time, the system automatically generates a document with verbose threat
information for future analysis.
After the program has run, the only task left to the end users is incident response.
Incident response is a vast and unpredictable subject in itself that requires human
decision making rather than automation, so the IR feature is out of scope for this project.
The developed system covers all the features promised in the initial phase of the project
except incident response, which comes under threat mitigation and was listed as a
"Won't Have" feature of the project.
5.1 Legal and Ethical Implication

With cybercrime spreading across the internet, cyber law has become as important to
implement as other national laws. Since the government is the top authority of any nation,
the responsible individuals must be concerned about cyber threats and adopt methods and
techniques for their identification and mitigation. In response to this need, the
Government of Nepal has introduced comprehensive information technology legislation, the
Electronic Transactions Act, 2063, in which Chapter 9 focuses primarily on 'Offence
Relating to Computer'.
The following acts are listed in Chapter 9 as illegal and punishable by a fine,
imprisonment, or both:
• To Pirate, Destroy or Alter computer source code,
• Unauthorized Access in Computer Materials,
• Damage to any Computer and Information System,
• Publication of illegal materials in electronic form,
• Confidentiality to Divulge,
• To inform False statement,
• Submission or Display of False License or Certificates.
(Government of Nepal, 2008)
Such criminal events will continue to occur if proper detection techniques are not
applied. The points above are prone to violation through malicious domain queries once
the computers on the target network are compromised. So, the system developed in this
project is one of the tools that support the legislation prepared by the Government of
Nepal. Any organization can adopt the Threat Detection and Alert System to identify
malicious findings and protect itself from possible future threats.
The system monitors only non-content logs (without credentials) within a network, as
configured at the beginning, so concerns about violating privacy by monitoring logs are
avoided by design. Under these conditions, the ethical values and rights of the users are
not affected.
5.2 Limitations
Despite promising results, the Threat Detection and Alert System still has drawbacks. The
threat detection methodology meets its standard, but the scripts that feed data to the
threat intelligence services slow down depending on various factors, such as a drop in
network bandwidth. This may be treated as a deployment measure to address, but it
eventually affects a system with a huge volume of logs to analyze, which might not be able
to handle such a load. However, the system can still run as developed, though it may lag
slightly.
Furthermore, the system handles both logs but fails to analyze them simultaneously. The
script is written so that it analyzes the DNS logs first and then the HTTP logs. This is a
major limitation, since it prevents real-time threat detection. Various methods, such as
threading, can be adopted to fix this, and doing so will be a primary task in further work.
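One possible shape of the threading fix is sketched below, using the standard library's thread pool. It is an assumption about how the future work could look, not part of the delivered system: analyze_dns and analyze_http are hypothetical stand-ins for the two analyzers and only collect values here instead of calling any API.

```python
from concurrent.futures import ThreadPoolExecutor


def analyze_dns(rows):
    """Stand-in for the DNS analysis: collect the queried names."""
    return [row["query"] for row in rows]


def analyze_http(rows):
    """Stand-in for the HTTP analysis: join host and URI of each request."""
    return [row["host"] + row["uri"] for row in rows]


def run_both(dns_rows, http_rows):
    """Run the two analyzers side by side and return both results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        dns_future = pool.submit(analyze_dns, dns_rows)
        http_future = pool.submit(analyze_http, http_rows)
        return dns_future.result(), http_future.result()


if __name__ == "__main__":
    dns_rows = [{"query": "a.example"}]
    http_rows = [{"host": "b.example", "uri": "/x"}]
    print(run_both(dns_rows, http_rows))
```

Threads are a reasonable fit because the analyzers spend most of their time waiting on network requests, so one analyzer can make progress while the other waits.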
Chapter 6: Conclusion and Review

In conclusion, the system developed in this project is a security tool that can help
security analysts find threats more efficiently, without repeating tiresome tasks. Many
tools ease such work through automation, and this project is one of them. The system
gathers logs from their sources, parses them as required, and uses threat intelligence
platforms to identify threats. It also notifies end users through an alert system built on
a Slack bot. The main strength of the developed system is that it can serve as an automated
alternative to manual threat detection; in addition, the ELK framework gives it a
centralized dashboard for viewing the threat status of the two main logs (malicious DNS
queries and HTTP domains). End users no longer have to check the logs one after another,
because the system visualizes them on the same dashboard. For example, with the
phishanalyzer.py script from the project, analysts no longer need to review the entire HTTP
traffic of a given network and query it against threat intelligence platforms. Instead,
they can simply run the script and investigate the results further.
6.1 Future Plan

Although the proposed features of the project are available in the current system, some
modifications and upgrades may be required if conditions change, since it was developed
within certain limits of time and resources. Parameters such as the host of the ELK
server need to be changed from localhost if the system is deployed in a different network.
There is room for improvement in the scripts so that they execute continuously and avoid
human interaction. The current scripts detect only the specific types of threat they were
programmed for. Cyberspace has become unpredictable and new types of threat can be
introduced at any time, so the system can be modified or upgraded with more scripts to
detect such threats. The sources for log gathering can be extended as required; options
for shipping logs to ELK include Suricata and Windows event logs. Furthermore, correlation
with more threat intelligence service providers can be introduced for more reliable results.
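Making the ELK host configurable, as suggested above, could be as simple as reading it from the environment. The sketch below is an assumption about one way to do this: the function name elasticsearch_host and the variable names CTMS_ES_HOST and CTMS_ES_PORT are hypothetical and do not exist in the current scripts, which use 'localhost:9200' directly.

```python
import os


def elasticsearch_host(env=None):
    """Build the 'host:port' string for Elasticsearch from the environment."""
    env = os.environ if env is None else env
    host = env.get("CTMS_ES_HOST", "localhost")   # hypothetical variable name
    port = int(env.get("CTMS_ES_PORT", "9200"))   # hypothetical variable name
    return "{}:{}".format(host, port)


if __name__ == "__main__":
    # With no variables set this falls back to the current default.
    print(elasticsearch_host({}))
```

The returned string could then be passed to the Elasticsearch client in place of the fixed address, so a deployment on another network needs no change to the source code.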
Chapter 7: References
1. ABDULGHANI ALI AHMED, N. A. A., 2016. Real Time Detection of Phishing Websites,
Singapore: University Malaysia Pahang.
2. Alert Logic, 2017. Threat Monitoring, Detection & Response, U.K.: Information Security.
3. Benjamin J. J. Voigt, Z. S., 2004. Dynamic System Development Method, s.l.: s.n.
4. Bontchev, V., 2017. Macro virus identification problem. FRISK Software International,
17(1), pp. 69-85.
5. CH. Ramaiah, D. A. C. R. S. A. P. P. K., 2018. Secure automated threat detection and
prevention (SATPD). International Journal of Engineering & Technology, 7(2), pp. 86-89.
6. Chun-jing LU, H. Z. J.-y. L. R. Z., 2017. Network Security Log Analysis System Based on
ELK. Information Security Center, Beijing University of Posts and Telecommunications,
5(13), p. 554.
7. Cisco, 2019. Defending against today's critical threats, San Francisco: Cisco.
8. CyberSponse, 2018. About Cybersponse. [Online]
Available at: https://cybersponse.com/about/
[Accessed Dec 2018].
9. Daniel Dinis Teixeira, F. J. A. P. J. P. G. d. S., 2005. DSDM - Dynamic Systems
Development Methodology. , s.l.: s.n.
10. Delgado, P., 2018. Developing an Adaptive Threat Hunting Solution The Elasticsearch
Stack, Houston: College of Information and Logistics Technology.
11. efficientIP, 2018. A New Era of Network Attacks, New York: efficientIP.
12. Eric Cole, P., 2015. Automating the Hunt for Hidden Threats, s.l.: SANS Institute.
13. Government of Nepal, 2008. The Electronic Transactions Act, 2063 (2008). Kathmandu,
Nepal.
14. GUODONG ZHAO, K. X., 2015. Detecting APT Malware Infections Based on Malicious
DNS and Traffic Analysis, s.l.: IEEE Access.
15. JIAN MAO, W. T., 2017. Phishing-Alarm: Robust and Efficient Phishing. 5(17), pp.
17020-17026.
16. Jin Cao, L. D. a. R. H., 2017. Statistical Network Behavior Based Threat Detection. IEEE
Conference on Computer Communications Workshops, 3(12), pp. 420-432.
17. Kessel, P. v., 2018. Cybersecurity regained: preparing to face cyber attacks, s.l.: EYGM.
18. MalwarebytesLabs, 2019. 2019 State of Malware, s.l.: MalwarebytesLabs.
19. Marc Clifton, J. D., 2003. What is DSDM?. [Online]
Available at: https://www.codeproject.com/Articles/5097/What-Is-DSDM#Post-Project%20-%20Maintenance%2011
20. Paul van Kessel, A. G. B., 2018. EY Global Information Security Survey, Chicago: EY.
21. Quentin Le Sceller, E. B. K., 2017. SONAR: Automatic Detection of Cyber Security Events
over the Twitter Stream, Canada: Security Research Centre.
22. Raytheon, 2018. Raytheon, CyberSponse partner for faster, smarter cyber defense.
[Online]
Available at: https://www.raytheon.com/cyber/news/feature/cyber-combination-rapid-response
23. Robert M. Lee, R. L., 2016. The Who, What, Where, When, Why and How of Effective
Threat Hunting, UK: SANS.
24. Rouse, M., 2007. What is waterfall model. [Online]
Available at: https://searchsoftwarequality.techtarget.com/definition/waterfall-model
25. RSA, 2016. Threat Detection Effectiveness Global Benchmarks 2016, United States: EMC
Corporation.
26. Sawers, P., 2018. Splunk to acquire security automation and orchestration platform
Phantom for $350 million. [Online]
Available at: https://venturebeat.com/2018/02/27/splunk-to-acquire-security-automation-and-orchestration-platform-phantom-for-350-million/
27. Selected Business Solution, 2019. What is DSDM?. [Online]
Available at: http://www.selectbs.com/process-maturity/what-is-dsdm
[Accessed Jan 2019].
28. Siddharth, 2018. Snyxius. [Online]
Available at: https://www.snyxius.com/implement-agile-development-process-easy-steps/dsdm/
29. Sim, S., 2009. Research Gate. [Online]
Available at: https://www.researchgate.net/figure/Diagram-of-the-waterfall-software-development-process-model_fig1_220169042
30. Slegten, K., 2016. Dynamic Systems Development Method (DSDM), s.l.: s.n.
31. Splunk, 2018. Product Features. [Online]
Available at: https://www.splunk.com/en_us/software/splunk-security-orchestration-and-automation/features.html#how-it-works
32. Sqrrl, 2018. Hunt Evil: Your Practical Guide to Threat Hunting, s.l.: Sqrrl.
33. StoQ, 2017. stoQ: automation. simplified.. [Online]
Available at: https://stoq-framework.readthedocs.io/en/latest/
[Accessed 13 December 2018].
34. Threatnix, 2018. Threat Report 2018, Nepal, Kathmandu: ThreatNix.
35. Touche, W., 2017. Cyber risk reporting in the UK, s.l.: Governance in focus.
36. Tutorials Point, 2019. SDLC - Big Bang Model. [Online]
Available at: https://www.tutorialspoint.com/sdlc/sdlc_bigbang_model.htm
37. Vectra, 2018. Cognito Detect is the most powerful way to find and stop cyberattackers in
real time. [Online]
Available at: https://vectra.ai/assets/cognito-detect-overview.pdf
[Accessed 2018].
38. Zscaler ThreatLabZ, 2019. An analysis of SSL/TLS-based threat, San Jose: Zscaler.
Chapter 8: Appendix
8.1 MoSCoW Prioritization
Must Have | Should Have | Could Have | Won't Have
M01: Threat Detection | S01: Bot Alert | C01: Report Generation | W01: Threat Mitigation
- | S02: Visualization | C02: Threat Category | -
Table 12 MoSCoW Prioritization
8.2 Gantt chart
Figure 41 Gantt chart
8.3 Team Members
Table 13 Team Members
8.4 Work Breakdown Structure

No. | Activity | Deliverable | Start Date | End Date | Duration | Resource
1 | Pre Project Stage | Project Topic Selection | 19/11/2018 | 06/12/2018 | 18 days | PRM, TLD, SUP
1.a | Exploring on interested scopes | Research on an appropriate topic | 19/11/2018 | 01/12/2018 | 13 days | PRM
1.b | Preparing for the project proposal | Topic Selection | 02/12/2018 | 06/12/2018 | 5 days | PRM, TLD, SUP
2 | Project Stage | Multiple | 06/12/2018 | 06/01/2019 | 34 days | BSP, TLD, PRM, BUA
2.a | Feasibility Study | Finding current scenario of chosen domain | 06/12/2018 | 10/12/2018 | 4 days | BRM, TLD
2.a.a | Practicality Research | | 11/12/2018 | 15/12/2018 | 5 days | BUA
2.a.b | Methodology Selection | DSDM model | 16/12/2018 | 18/12/2018 | 3 days | TLD
2.b | Business Study | Requirements Prioritization | 19/12/2018 | 23/12/2018 | 5 days | BUA
2.b.c | Searching for client | Find client | 24/12/2018 | 02/01/2019 | 10 days | TLD
2.b.d | Financial Analysis | Financial Report | 03/01/2019 | 06/01/2019 | 3 days | BUA
3 | Documentation | Interim Report | 07/01/2019 | 16/01/2019 | 9 days | TLD, PRM
 | MoSCoW Prioritization | | 07/01/2019 | 08/01/2019 | 1 day | TLD
2.d | Research on similar projects | Clear understanding the chosen topic | 09/01/2019 | 11/01/2019 | 3 days | PRM
2.e | Further research and report writing | Interim Report | 12/01/2019 | 16/01/2019 | 5 days | PRM
3 | Planning the system | Application frameworks and user feedbacks | 17/01/2019 | 25/01/2019 | 9 days | PRM, SOD, WOF
3.a | Managing system components | Tools and Framework | 17/01/2019 | 21/01/2019 | 5 days | WOF, SOD
3.b | Designing work flow diagram | System flowchart | 22/01/2019 | 25/01/2019 | 4 days | PRM
4 | Time Box 1: (M01) | Threat Detection | 26/01/2019 | 21/02/2019 | 25 days | SOD, SOT, SAN, PRO
4.a | M01: Log analysis | | 26/01/2019 | 30/01/2019 | 4 days | SAN, SOD
4.b | Choosing API | | 31/01/2019 | 4/02/2019 | 5 days | SAN, SOD
4.c | Coding | | 05/02/2019 | 14/02/2019 | 10 days | PRO
4.d | Testing | | 15/02/2019 | 19/02/2019 | 1 day | SOT
5 | Time Box 2: (S02, C02) | Visualization and Threat category | 19/02/2019 | 16/03/2019 | 28 days | SOD, SOT, SAN, PRO
5.a | S02: Elasticsearch Documentation | | 17/02/2019 | 19/02/2019 | 3 days | SAN, SOD
5.b | S02: Configuration | | 20/02/2019 | 22/02/2019 | 3 days | SAN, SOD
5.c | S02: Push Logs | | 23/02/2019 | 25/02/2019 | 3 days | SAN
5.d | C02: Categorize | | 26/02/2019 | 02/03/2019 | 5 days | SAN
5.e | Coding | | 03/03/2019 | 11/03/2019 | 10 days | PRO
5.f | Testing | | 12/03/2019 | 16/03/2019 | 3 days | SOT
6 | Time Box 3: (S01, C01) | Bot Alert and Report Generation | 16/03/2019 | 10/04/2019 | 27 days | SOD, SOT, SAN, PRO
6.a | S01: API documentation | | 16/03/2019 | 20/03/2019 | 3 days | SOD, SAN
6.b | S01: Correlation | | 20/03/2019 | 23/03/2019 | 3 days | SOD, SAN
6.c | C01: PDF generation | | 23/03/2019 | 26/03/2019 | 3 days | PRO
6.d | Coding | | 26/03/2019 | 08/04/2019 | 13 days | PRO
6.e | Testing | | 09/04/2019 | 13/04/2019 | 5 days | SOT
7 | Termination | | 14/04/2019 | 08/05/2019 | 25 days | TLD, SUP, BSP
7.a | Documentation | Final Report | 14/04/2019 | 07/05/2019 | 14 days | TLD
7.b | Project Handover | - | 07/05/2019 | 08/05/2019 | 1 day | SUP
Table 14 Work Breakdown Structure
8.5 User Manual Guide

The following guide is intended to help end users get started with the Threat Detection and
Alert System.
• First, the system runs only in a Linux (Debian) environment, so Debian or a Debian-based
distribution should be installed on the machine.
• The source code of the system should be kept in a single directory and run with root
privileges so that it can access the required services.
• Python (version 3.7) must be installed on the machine. The scripts must be saved in
.py format.
• Bro IDS should be installed and configured so that logs are generated from the traffic on
the configured interfaces for analysis.
• The path of the Bro logs should be set.
• Install the deb packages of the ELK framework (in particular, Elasticsearch and Kibana).
• The Elasticsearch and Kibana services should be started.
• Access the portal at localhost:5601 (the default) for visualization.
• The Slack bot for notifications can be used from the web application or the desktop
application.
• The credentials for the Slack client should be modified in the source code.
After following the steps above to set up the system, the Threat Detection and Alert System
can be run successfully.
Note: It is recommended to run the system on a dedicated server with proper network
configuration.
8.6 Installation of ELK Environment

OS prerequisite for ELK: Ubuntu 16.04 (Debian-based)
Once the operating system is installed, update the packages:
sudo apt-get update
sudo apt-get upgrade
Elasticsearch Prerequisites
Step 1: Install Java 1.8.
sudo apt-get install default-jre
java -version
OpenJDK version "1.8.0_151"
OpenJDK Runtime Environment (build 1.8.0_151-8u151-b12-0ubuntu0.16.04.2-b12)
OpenJDK 64-Bit Server VM (build 25.151-b12, mixed mode)
Step 2: Import Elasticsearch PGP key

wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -
sudo apt-get install apt-transport-https
echo "deb https://artifacts.elastic.co/packages/6.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-6.x.list
sudo apt-get update
Step 3: Install Elasticsearch

sudo apt-get install elasticsearch
Step 4: Configure Elasticsearch

sudo vim /etc/elasticsearch/elasticsearch.yml
network.host: "localhost"
http.port: 9200
sudo service elasticsearch start
Checking localhost:9200 should then return a response such as:
{
  "name" : "my-ctms-proj",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "mTkBe_AlSZGbX-vDIe_vZQ",
  "version" : {
    "number" : "6.1.2",
    "build_hash" : "5b1fea5",
    "build_date" : "2018-01-10T02:35:59.208Z",
    "build_snapshot" : false,
    "lucene_version" : "7.1.0",
    "minimum_wire_compatibility_version" : "5.6.0",
    "minimum_index_compatibility_version" : "5.0.0"
  },
  "tagline" : "You Know, for Search"
}
Step 6: Install Kibana

sudo apt-get install kibana
Step 7: Configure Kibana

sudo vim /etc/kibana/kibana.yml
server.port: 5601
elasticsearch.url: http://localhost:9200
sudo service kibana start
8.7 Code
dnsanalyzer.py
#!/usr/bin/python3
from datetime import datetime
import json
import socket

import requests
from brothon import bro_log_reader
from elasticsearch import Elasticsearch
from pdfdocument.document import PDFDocument
from slackclient import SlackClient

__author__ = 'Milan Shrestha'
__copyright__ = 'Copyright 2019, Threat Detect and Alert, FYP'

slack_key = SlackClient('token')  # credentials for the Slack bot API
# path of the Bro DNS log
reader = bro_log_reader.BroLogReader('/home/pastalins/Desktop/FYP/dns.log')

already = []   # DNS queries already seen in the Bro log (no repeats)
set_list = []  # one {resolved IP: [source IP, destination IP]} per unique query
third = {}     # resolved IP -> [destination IP, destination port, source port]

for row in reader.readrows():
    try:
        dns_query = row['query']  # only the query column is needed
        if dns_query not in already:
            already.append(dns_query)  # remember the query so it is not repeated
            ns_lookup = socket.gethostbyname(dns_query)
            # print(dns_query, ' = ', ns_lookup)
            set_list.append({ns_lookup: [row['id.orig_h'], row['id.resp_h']]})
            third[ns_lookup] = [row['id.resp_h'], row['id.resp_p'], row['id.orig_p']]
    except Exception:
        continue  # skip rows without a query or names that do not resolve

# print(set_list)
# print(third)
alread2 = []  # IP addresses already sent to the threat intelligence API


def dns(z):
    z.h2('BAD IP address Found')  # heading for the report
    messege = ''
    unique_dns = 0  # counter of malicious IPs found
    for ips in set_list:
        for ip in ips:
            if ip not in alread2:
                messege = ''
                alread2.append(ip)
                # API request for the resolved IP address
                intel = requests.get('http://check.getipintel.net/check.php?ip={}&contact=rocks_slash@yahoo.com'.format(ip))
                decode = str(intel.text)
                print('IP Address:{}\nCriticality Score:{}\n-----'.format(ip, decode))
                if decode == '1':
                    print('\nCriticality Score matched with Malicious Limit\n')
                    try:
                        es = Elasticsearch('localhost:9200')  # Elasticsearch hosted at localhost
                        # dictionary that carries the message to Elasticsearch and the Slack bot
                        my_dictionary = {}
                        my_dictionary['src_ip'] = ip
                        my_dictionary['dest_ip'] = third[ip][0]
                        my_dictionary['src_port'] = third[ip][2]
                        my_dictionary['dest_port'] = third[ip][1]
                        my_dictionary['@timestamp'] = datetime.utcnow().isoformat()
                        my_dictionary['threat-intel'] = 'http://getipintel.net'
                        my_dictionary['query-type'] = 'DNS-Query'
                        my_dictionary['unique-dns'] = unique_dns
                        # IP information lookup API
                        find = requests.get('https://ipapi.co/{}/json/'.format(str(ip)))
                        in_json = json.loads(find.text)
                        # print(in_json)
                        my_dictionary.update(in_json)
                        es.index(index='ctms-test-1', doc_type='ip-info', body=my_dictionary)
                        messege = messege + ("Source IP: {}\nSource Port: {}\nDestination IP: {}\n"
                                             "Destination Port: {}\nOrigin: {}\nLatitude: {}\n"
                                             "Longitude: {}\nTimestamp: {}\n").format(
                            my_dictionary['src_ip'],
                            my_dictionary['src_port'],
                            my_dictionary['dest_ip'],
                            my_dictionary['dest_port'],
                            my_dictionary['org'],
                            my_dictionary['latitude'],
                            my_dictionary['longitude'],
                            datetime.now().isoformat())
                        print(messege)
                        unique_dns = unique_dns + 1
                        # pushing the message to the Slack client
                        slack_key.api_call('chat.postMessage', channel='threat-alert',
                                           text='*_Detected:_ _Malicious_ _query(s)_*: ```{}```'.format(messege))
                    except Exception:
                        pass
                    z.p(messege)  # writing the message to the report object passed in from main.py
phishanalyzer.py
#!/usr/bin/python3
# importing required libraries
from datetime import datetime
import re
import socket

import requests
from brothon import bro_log_reader
from elasticsearch import Elasticsearch
from slackclient import SlackClient  # required by SlackClient() below

__author__ = 'Milan Shrestha'
__copyright__ = 'Copyright 2019, Threat Detect and Alert, FYP'

slack_key = SlackClient('Token')  # credentials for the Slack bot
# feed of known phishing URLs used for matching
intel = requests.get('https://openphish.com/feed.txt')
put_list = str(intel.text).split()

url = re.compile(r"https?://(www\.)?")  # regex that strips the scheme and "www."
phish_list = []  # phishing URLs from the feed, without scheme and trailing slash
for i in put_list:
    strip_http = url.sub("", i).strip().strip('/')
    phish_list.append(strip_http)

# path of the Bro HTTP log
reader = bro_log_reader.BroLogReader('/home/pastalins/Desktop/FYP/http.log')
bro_domain_list = []  # host + URI of each request in the Bro log
bro_domain_src = []   # source IPs from the Bro log

for row in reader.readrows():
    try:
        header = row['host']  # host column of the log row
        uri = row['uri']      # uri column of the log row
        joint = header + uri  # concatenation
        bro_domain_list.append(joint)
    except Exception:
        pass
# print(bro_domain_list)


def http(z):
    z.h2('Phishing Domain')  # heading for the report
    messege = ''
    unique_http = 0  # counter of phishing URLs found
    for p in phish_list:
        for q in bro_domain_list:
            if q == p:
                print('Phishing Domain Found :', p)
                # extracting the domain from the URL
                only_domain = p.split("//")[-1].split("/")[0].split('?')[0]
                print(only_domain)
                domain_lookup = socket.gethostbyname(only_domain)  # resolving the domain
                es = Elasticsearch('localhost:9200')
                my_dictionary = {}
                my_dictionary['ip'] = domain_lookup
                my_dictionary['unique-http'] = unique_http
                my_dictionary['phish-url'] = q
                my_dictionary['threat-intel'] = 'openphish'
                my_dictionary['type'] = 'HTTP-query'
                my_dictionary['@timestamp'] = datetime.utcnow().isoformat()
                es.index(index='ctms-test-1', doc_type='ip-info', body=my_dictionary)
                messege = messege + 'Domain name: {}\nIP: {}\nTimestamp: {}\n'.format(
                    my_dictionary['phish-url'],
                    my_dictionary['ip'],
                    my_dictionary['@timestamp'])
                unique_http = unique_http + 1
                # pushing the message to the Slack client
                slack_key.api_call('chat.postMessage', channel='threat-alert',
                                   text='*_Detected:_ _Phishing_ _Domain(s)_*:```\n{}```'.format(messege))
    # print(messege)
    z.p(messege)  # writing the message to the report object passed in from main.py
# http()
main.py
#!/usr/bin/python3
import requests
from pdfdocument.document import PDFDocument  # required by PDFDocument() below

from dnsanalyzer import dns
from phishanalyzer import http

z = PDFDocument('Threat-Report.pdf')
z.init_report()
z.h1("Threat Report")
dns(z)  # calling function dns(z) from dnsanalyzer.py
z.p('Above report is based upon check.GetIPIntel.net and IPAPI feeds 2019.')
http(z)  # calling function http(z) from phishanalyzer.py
z.p('Above report is based upon OpenPhish feeds 2019.')
z.h3("© Threat Detect and Alert System, Author Milan Shrestha\n ")
z.generate()

# the following lines upload the generated PDF to the Slack client
with open('/home/pastalins/Desktop/FYP/Threat-Report.pdf', 'rb') as report:
    my_file = {
        'file': ('/home/pastalins/Desktop/FYP/Threat-Report.pdf', report, 'pdf')
    }
    payload = {
        "filename": "Threat-Report.pdf",
        # "timestamp": datetime.now().isoformat(),
        "token": '=api key==',
        "channels": ['#threat-alert'],
    }
    r = requests.post("https://slack.com/api/files.upload", params=payload,
                      files=my_file)
8.8 Survey
8.8.1 Survey Results