Paper045 2nd IEEE IMITEC 2020 Conference FINAL-2


See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/348977997

An Analysis of the Use of DNS for Malicious Payload Distribution

Conference Paper · November 2020


DOI: 10.1109/IMITEC50163.2020.9334104


2 authors:

Ishmael Dube George Wells


University of Johannesburg Rhodes University

All content following this page was uploaded by Ishmael Dube on 12 October 2021.



An Analysis of the Use of DNS for Malicious
Payload Distribution
1st Ishmael Dube
Department of Computer Science
Rhodes University
Grahamstown, South Africa
g17d8835@campus.ru.ac.za, ORCID: 0000-0002-5439-7600

2nd Professor George Wells
Department of Computer Science
Rhodes University
Grahamstown, South Africa
g.wells@ru.ac.za, ORCID: 0000-0001-9088-3449

Abstract—The Domain Name System (DNS) protocol is a fundamental part of Internet activities that can be abused by cybercriminals to conduct malicious activities. Previous research has shown that cybercriminals use different methods, including the DNS protocol, to distribute malicious content, remain hidden and avoid detection from various technologies that are put in place to detect anomalies. This allows botnets and certain malware families to establish covert communication channels that can be used to send or receive data and also distribute malicious payloads using the DNS queries and responses. Cybercriminals use the DNS to breach highly protected networks, distribute malicious content, and exfiltrate sensitive information without being detected by security controls put in place by embedding certain strings in DNS packets. This research undertaking analysed the use of the DNS in detecting domains and channels that are used for distributing malicious payloads. Passive DNS data which replicate DNS queries on name servers to detect anomalies in DNS queries was evaluated and analysed in order to detect malicious payloads. The research characterised the malicious payload distribution channels by analysing passive DNS traffic and modelled the DNS query and response patterns used during malicious payload distribution. The research found that it is possible to detect malicious payload distribution channels through the analysis of DNS TXT resource records.

Index Terms—DNS, Malicious Payload Distribution, Passive DNS, Covert DNS Tunnelling

I. INTRODUCTION

Previous research has shown that botnet masters use different methods to remain hidden and avoid detection from various technologies that are put in place to detect anomalies [37]. A botnet is defined as a coordinated group of malicious instances that are usually distributed but are controlled by a command and control instance to deliver malware or launch cyberattacks [72]. The Domain Name System (DNS) has become a target for botnet masters in conducting their malicious activities [67]. Cybercriminals use the DNS to breach highly protected networks, steal and exfiltrate sensitive information without being detected by security controls put in place by organisations [46]. According to [42], DNS traffic is considered harmless, is often not monitored or less effort is put in monitoring this traffic and is allowed to bypass monitoring measures being implemented. This allows botnets and certain malware families to establish covert communication channels that can be used to send or receive data and also distribute malicious payloads using the DNS queries. The payload is the portion of the malware which performs a malicious action on the target [78].

In its simplest form, the DNS is a powerful yet simple database management system that contains Internet Protocol (IP) addresses of every domain name [34]. As a fundamental part of Internet activities, the DNS is attracting cybercriminals who can exploit its vulnerabilities and also use it as part of their malicious network/infrastructure. The DNS can be used as a covert communication channel between the command and control servers and the bots. DNS traffic bypasses traditional security monitoring giving an opportunity to botnets to send and receive data in highly protected networks by embedding certain strings in the DNS packets [64]. The simple architecture of the DNS protocol allows and facilitates the transfer of data in query and response packets, a feature cybercriminals also use to establish covert communication channels that allow them to send and distribute malicious payloads. It is, therefore, essential to analyse DNS activity to understand, detect and set up appropriate defence strategies against such emerging threats. This can be achieved through analysing Passive DNS datasets which are databases of actual historic DNS traffic [44].

According to [9, 79], research in the analysis of the DNS as a payload distribution channel is limited and often concentrates on specific malware families. This research aims to broaden this research field and fill in the existing research gap by extending the analysis of DNS being used as a payload distribution channel to detection of multi-purpose domains that are used to distribute different malicious payloads. Multi-purpose domains are used for different malicious activities such as sending spam messages, phishing campaigns, and distribution of malware among others [30, 10].

II. RESEARCH OBJECTIVES

The objective of this research is to analyse and understand how DNS resource records are used in the distribution of malicious payloads. The goal is to also characterise DNS messages associated with malicious networks and investigate various ways that malicious networks are using DNS to distribute attack payloads.
III. RESEARCH QUESTION

While the subject of malicious payload distribution channels using the DNS has been studied before, there is still a research gap when it comes to multi-purpose domains used for various malicious activities. Below is the research question being investigated:

• Can domains used for distributing multiple malicious payloads using covert channels be characterised and modelled based on the data transferred using the DNS?

In addition to the research question above, the following sub-questions were also investigated:

• How can the abuse of the DNS be quantified and all the involved parties and infrastructure be identified?
• Can encrypted payload distribution channels that are used for various purposes be detected and monitored through analysis of DNS traffic?

IV. RELATED WORK

A. Domain Name System

According to [19], the DNS protocol is a translation service used by the Internet protocol. To access a web page on the Internet, a request to access a domain name begins with a DNS query to obtain an IP address which corresponds to the requested domain name. The DNS protocol utilises the Fully Qualified Domain Name (FQDN) to specify the location of a computer in the hierarchy of the DNS [51, 3]. According to [38] authoritative name servers store original information about a domain in a text file called a zone file. A zone file is comprised of entries or mappings between IP addresses, domain names and other resources that are organised using a textual representation called a Resource Record [5, 19]. Each Resource Record consists of five fields, namely, Name, Time to Live (TTL), Record Class, Record Type and Record Data. For the purpose of this research the focus will be on the Record Type field which specifies the type of information that will be carried by the DNS message. Some of the record types that were considered in this research undertaking are A/AAAA, NS, MX, TXT and CNAME.

B. The Abuse of the DNS

According to [67, 64], the DNS has been and is a target for criminals for different purposes due to the simplicity and the underlying architecture of the DNS protocol. Based on the form of abuses in the past, [30] categorised these DNS abuses into system-level and protocol-level abuses.

1) Protocol-Level Abuse of the DNS: Protocol-level DNS abuses target the DNS protocol and exploit the weaknesses in the architecture and the way the protocol is designed [37]. From the literature reviewed, DNS Tunnelling and Fast-Flux networks were identified as protocol-level abuses of the DNS.

   1) DNS Tunnelling: DNS tunnelling can be used to bypass network security devices and communicate with rogue DNS resolvers [47]. The FeederBot botnet used the DNS as a channel for communicating with the command and control servers [18]. [9] note that the biggest challenge with malicious payload distribution channels is that they usually do not show characteristics of DNS channels because of the relatively low volume of upstream data. Studies have been conducted on various DNS tunnelling tools and ways of detecting such tunnels [75, 65, 2, 20, 52, 4, 55, 62, 58], ways of improving detection mechanisms [2] and techniques that can be used to protect and mitigate threats posed by DNS tunnels [25, 19]. Other studies have focused on DNS tunnelling performance issues due to limitations of DNS packet sizes and traffic overhead [1, 65, 52]. Finally, [58] investigated new ways that criminals are using to avoid detection when using DNS tunnels.

   2) Fast-Flux Networks: According to [13], fast-flux networks are a result of a different set of IP addresses being returned for each DNS query. The basic concept of a fast flux network is having multiple IP addresses associated with a domain name, and then constantly changing them in quick succession. With the passage of time, fast-flux networks have become complex making it even more difficult to detect and trace them [67].

2) Abuse of the DNS at System-Level: According to [43], the DNS has a number of yet to be discovered vulnerabilities that exist at every layer in the DNS. These unknown vulnerabilities present an opportunity for criminals to abuse the DNS system whenever they can [45]. Some studies have focused on the domains that are based on the Domain Generation Algorithm (DGA). These domains are randomly generated domains [35] and botnets send queries to them hoping that they are already registered. This method is also called domain fluxing [68, 6]. Another system-level DNS abuse is the DNS amplification attack, a reflection-based distributed denial of service (DDoS) attack [5].

Studies have been conducted on cache manipulation attacks that rely on caching DNS information in the hierarchy of the DNS [49, 74, 33]. These studies have ranged from methods used to poison DNS caches [27] to ways of detecting cache poisoning attacks by scanning DNS resolvers [73]. DNS Cache Poisoning (DNS Spoofing) is a cyber-attack that exploits vulnerabilities in the domain name system (DNS) by diverting Internet traffic away from legitimate servers and towards fake ones. [32] studied a vulnerability which allowed domains that had been deleted to be kept alive. In addition to the above abuses, [70] studied the impact of rogue resolvers in the DNS hierarchy. These resolve domain names to malicious IP addresses that can be under the control of the attackers. [40] observed that domains used for malicious purposes are also found in authoritative name servers. There have been proposals around algorithms that can be used to detect malicious activities based on DNS queries [22].

C. Security Measures Put in Place to Protect the DNS from Abuse

There are various mechanisms and measures that have been put in place to protect the DNS against security threats [26], such as traffic shaping, flow filtering and prioritisation. [28] proposes a number of measures to protect the DNS, such as hardening the environment hosting the DNS. [19] also recommend that recursion be disabled on master and primary DNS, and restriction of recursive queries on slave servers to mitigate spoofing attacks. The DNS with Security Extensions (DNSSEC), a suite of specifications used for securing information supplied by the DNS, provides a set of security extensions [14]. According to [56], these extensions also address most DNS vulnerabilities that are known, but its usage is not as widespread as expected.
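
The randomly generated DGA domains mentioned under system-level abuse tend to have labels whose characters are spread more evenly than those of ordinary names. A minimal sketch of one common heuristic, Shannon entropy over the characters of a label, is given below. The function name and the example labels are illustrative assumptions, and the cited studies may rely on different features.

```python
import math
from collections import Counter


def label_entropy(label: str) -> float:
    """Shannon entropy (bits per character) of a DNS label."""
    if not label:
        return 0.0
    counts = Counter(label.lower())
    total = len(label)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Illustrative labels only: a dictionary-like name and a random-looking one.
for name in ("google", "xk2q9vz7ahd3"):
    print(name, round(label_entropy(name), 3))
```

A higher value only suggests randomness; short labels and legitimate hashed hostnames can score high as well, so any threshold would need tuning against real data.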

D. Payload Distribution Using the DNS


According to [30], there are few legitimate uses of the DNS
for payload distribution. These include antivirus updates
[63, 52] and mobile device authentication system for public
hotspots [23]. According to [66] the method of communication
is the most important element of an efficient and resilient
malicious network. [38] argue that the DNS is a perfect choice
because DNS traffic is usually allowed to bypass network
inspection. The DNS is also chosen for malicious distribution
due to flexible fields in the DNS protocol which can be abused
for malicious use. Malicious payload data is usually stored in
resource records and cached in resolvers so that the payload
can even be accessed when the command and control servers
are down. Recent and current work on payload distribution
using the DNS includes work by [53, 37, 57, 69, 33, 48].
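
Since payload data is stored in resource records, it can help to view the five fields listed in Section IV-A as a data structure. The sketch below assumes a simplified, whitespace-separated presentation line (name, TTL, class, type, data). Real zone files allow omitted fields, comments and multi-line records, which this does not handle, and the record shown is made up.

```python
from dataclasses import dataclass


@dataclass
class ResourceRecord:
    name: str
    ttl: int
    record_class: str
    record_type: str
    data: str


def parse_rr(line: str) -> ResourceRecord:
    """Parse 'name ttl class type data...' (simplified; not full zone-file syntax)."""
    # maxsplit=4 keeps any spaces inside the record data intact.
    name, ttl, rclass, rtype, data = line.split(None, 4)
    return ResourceRecord(name, int(ttl), rclass.upper(), rtype.upper(), data.strip())


# Hypothetical record used purely for illustration.
rr = parse_rr('example.test. 300 IN TXT "v=spf1 -all"')
print(rr.record_type, rr.ttl, rr.data)
```

Keeping the Record Type as its own field makes it straightforward to select only TXT records for later analysis.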

V. OVERVIEW OF THE METHODOLOGY AND APPROACH

The system used in this research analysed DNS queries and responses in the passive DNS dataset, with the aim of detecting malicious multi-purpose payload distribution channels. The system, at a high-level, is composed of two main modules: a module for query and response analysis, and another for detecting the malicious payload distribution channels. The system has a DNS query and response message analysis zone where the pattern of each channel is analysed, hereinafter referred to as the pattern analysis module. After the DNS query and response message analysis, the DNS query and response messages are also sent to a zone responsible for analysis of the presence of a potential payload distribution channel, hereinafter referred to as the payload analysis module. This zone extracted all the relevant DNS Resource Record (RR) activities for particular domains from the passive DNS data. The zone or module is responsible for determining the intensity of the payload distribution using the activities of DNS resource records. The intensity of a payload distribution channel measures the level of activity in a particular channel. Figure 1 shows the system that was developed, including the modules involved and the flow of data from one module to the other.

Fig. 1. System Overview (Adapted from [30])

The approach is an adaptation of previous work by [8] who proposed a system named EXPOSURE. This research also expands on the work by [30] which also identified this research undertaking as a gap that needed to be examined further. EXPOSURE utilises passive DNS analysis and leveraging of machine learning to build detection rules which have the capability to differentiate the DNS behaviour of malicious domains from non-malicious domains [7, 29]. The design uses four sets of DNS features that are obtained from DNS records. These features are time-based features, domain name-based features, TTL-based features, and DNS answer-based features [37]. In addition to the detection techniques used by EXPOSURE, this research undertaking also used the following features in the analysis: payload analysis, size of request and response, entropy of hostnames, history of the domain and WHOIS-based features.

A. Datasets

Passive DNS data obtained from FarSight Security¹ covering a period of six (6) months, from 01 January 2017 to 30 June 2017, was used as a primary data set and is generally considered to be the most comprehensive passive DNS data available [54]. The rationale behind choosing this time period is that the period is long enough to include multiple malware families that exhibit the features that are relevant to this study and gives enough coverage to detect instances where DNS TXT records have been used to distribute payload. According to [31], a passive DNS system stores DNS resolution data for a given location, record, and time period. DNS data can be replicated and stored in historical databases that can be used for further investigations. Most passive DNS implementations have adapted Weimer’s passive DNS implementation [41, 4] shown in Figure 2.

¹ According to [21], the DNSDB database currently has over 100 billion unique DNS records with over 200,000 new raw observations/second, totalling over 2TB of DNS data, collected daily.

Fig. 2. Passive DNS.

Malware samples obtained from freely available malware research repositories were analysed in a controlled and sandboxed environment. In addition to the above, malicious domain lists were collected from various reputable sources. To provide deeper insights about malicious activities and to enhance the accuracy and coverage, the detection approach presented in this research undertaking utilised external sources of data such as Geolocation, Autonomous System Number (ASN), Registration Records, and IP/domain blacklists/whitelists to enrich DNS information.

B. Pre-Analysis Filtering

A total of 1542 top level domains were considered for this analysis. The list of the Top-Level Domains (TLDs) was obtained from the Internet Assigned Numbers Authority (IANA). Internationalised Domain Names (IDNs) [39] were excluded although these are commonly used for phishing purposes as these domains can easily deceive users to visit a seemingly non-malicious website. As a result, only DNS TXT records for 1391 top level domains were considered, most of which did not show any activity during the period under consideration. Due to limitations when extracting bulk records for the .com top-level domain and the number of domains under it, .com records were not considered for this analysis. The .arpa domain was also not considered as it is practically no longer used for any purposes that may be applicable to this research. In this data analysis, all TXT records that contained any legitimate form of information such as Sender Policy Framework (SPF), Domain Keys (DK), and Domain Keys Identified E-mail (DKIM) [60, 59] were not considered, although it should be noted that malicious information may be obfuscated in some text. Well-known legitimate domains derived from the Alexa Top 1000 Global Sites list were excluded as they are highly unlikely to be involved in malicious activities. This is based on the assumption that these sites usually have many visitors and/or users and these domains are well monitored and maintained. Domains that are older than a specific age were also excluded because malicious domains are usually discovered after a certain period, after which they are blacklisted.

VI. ANALYSIS

This experimentation involved using previously analysed malware that are used to distribute malicious payload using the TXT records of the DNS protocol. In some instances, the payload was obfuscated and could not be easily detected except through the use of decryption.

A. DNS Query and Response Pattern Analysis

During this phase, the probability of each query and response to be involved in the distribution of payload was analysed through comparison of the average distinct count of TXT messages. Using previous work by [30], the query and response pattern analysis allowed the researcher to classify the pattern as either Many-to-Many, Single-to-Single or Single-to-Many: a ratio of the number of command and control servers to the number of target machines. Many-to-Many patterns have the potential of distributing huge volumes of data as these patterns have many targets. Many-to-Many patterns are, however, more likely to be easily detected due to volumes of data being exchanged and the rate at which it is exchanged. This DNS query and response pattern is the most likely to raise alerts and be detected by security measures in place. The other class is the Single-to-Single pattern which is targeted to specific systems while maintaining low footprint and visibility. This enables these patterns to remain undetected because such DNS queries can blend with normal DNS traffic. According to [52], these are the most resilient payload distribution channels that are difficult to detect.

From the individual top level domain dataset, it was observed that the dataset for the .de domain (the country code top-level domain (ccTLD) for the Federal Republic of Germany) was the biggest in size and therefore became an interesting case that needed to be analysed. Figure 3 shows the distribution of top access counts for domains in the .de ccTLD. Figure 3 shows only domains whose access count for TXT queries is more than 10. Nine domains accounted for the bulk of queries.

Analysis of this sample showed that some domains had a very high count of queries: the average number of counts per domain in the .de ccTLD was relatively low (an average count of approximately 2 per domain). However, the 1yf.de domains showed an excessive access count of more than 2000 for each domain. A further analysis on the domain 1yf.de reveals that this could be DNS tunnelling traffic. Further analysis revealed that s01.1yf.de relies upon the name servers. Checking the DNSDB to find out other domains that could
be using the same name servers reveals that there are several other domains that use the same name servers. The domains that use the same name servers are shown in the output in Figure 4. The records in Figure 4 all appear to be related to the www.your-freedom.net domain. The www.your-freedom.net domain offers VPN tunneling, firewall and proxy bypassing, anonymisation and anti-censorship solutions [77]. In this case, the record types NULL and TXT appear to be obfuscated by encryption.

Fig. 3. Access Count of TXT Records for the .de ccTLD

Fig. 4. Domains that use dns.resolution.de as a Name Server

B. Detecting the Payload Distribution Channels

Once the pattern of DNS queries and responses had been observed, they were then inspected in the module that analyses DNS zones. This was achieved through the use of access counts of TXT resource records for each domain. From the passive DNS dataset, it was observed that there were over 10 million domains with TXT resource record activities. The Access Count ranged from 1 to 272 737 (for the domain s06.1yf.de: possibly used for DNS tunneling).

1) Comparison With Alexa Top 1000 Domains: To ensure that the model can be validated, the difference in resource record activity of regular Alexa Top 1000 domains² and the payload distribution channels was investigated to account for malicious behaviour. This process involved retrieving the access counts of Resource Records for both known malicious domains and the regular Alexa Top 1000 domains as these are, in the opinion of the researcher, a reasonable measure that can be used to understand the activity of Resource Records of any domain under consideration.

As expected, the domains for the Alexa Top 1000 received queries for various resource records and this could be attributed to the fact that these domains are utilised for different services (Figure 5). The malware domains also received queries for various resource records but in some instances there were significantly higher counts for certain resource records as shown in Figure 6.

Fig. 5. Alexa Top 1000 Resource Record Distribution

Fig. 6. Malware Domains Resource Record Distribution

² Of the Alexa Top 1000 domains extracted as at 12/08/2018, only 708 had DNS traffic during the period under investigation. The remaining 292 domains did not have any activity and the extracts were empty: some domains were not even existent during the period under consideration. Unfortunately the researcher could not obtain a historical Alexa Top 1000 domains list for the period under investigation.

2) Comparison with Top 20 “Shady” TLDs: According to [36], “shadiness” looks at the ratio of malicious second-level domains to the total number of domains registered. Other alternative terms used are “evil index”, “badness”, among others.
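
The “shadiness” measure attributed to [36] is simple arithmetic: malicious second-level domains divided by all registered domains in the TLD. A small sketch with invented counts follows; the TLD names and numbers are placeholders, not figures from [36] or from this study.

```python
def shadiness(malicious_domains: int, registered_domains: int) -> float:
    """Ratio of malicious second-level domains to all registered domains in a TLD."""
    if registered_domains <= 0:
        raise ValueError("registered_domains must be positive")
    return malicious_domains / registered_domains


# Hypothetical counts per TLD: (malicious, registered).
tld_counts = {"tld-a": (950, 1000), "tld-b": (120, 4000), "tld-c": (3, 90000)}

# Rank TLDs from most to least "shady".
ranked = sorted(tld_counts, key=lambda t: shadiness(*tld_counts[t]), reverse=True)
print(ranked)
```

Because the measure is a ratio, a small TLD with few registrations can rank above a large TLD that hosts more malicious domains in absolute terms.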
This analysis was not conclusive, possibly because these malicious domains focus mainly on phishing and spam and are not used for intensive payload distribution. The resource record count for the Top 20 “Shady” TLDs is shown in Figure 7. There were other lists that were used to compare resource activities but these all generally point in the same direction: known malware top-level domains generally do not have a higher TXT resource record access count.

Fig. 7. Resource Record Count in the Top 20 Shady TLD domains

C. Malware Families that use DNS in their Operations

The following list is based on previous research and includes the well-known malware families that use DNS resource records to conduct their activities: Morto discovered in 2011 [50], FeederBot discovered in 2011 [18], PlugX discovered in 2013 [71], FrameworkPOS discovered in 2014 [15], Wekby discovered in 2015 [24], BernhardPOS discovered in 2015 [15], JAKU discovered in 2015 [61], MULTIGRAIN discovered in 2016 [15] and DNSMessenger discovered in 2017 [11]. After a careful consideration of the malware families that use DNS for payload distribution, only two families were considered for this research purpose, namely, Wekby [24] and DNSMessenger [11] because unlike other malware families, they used not just the DNS to distribute payload, but they specifically used TXT records for their malicious activities. The rest of the families either exhibit the “Many-to-Many” pattern and would be easy to detect and be addressed, or there have not been any known/discovered variants that would warrant further investigation.

Wekby uses DNS tunneling which takes advantage of the TXT transport layer within the DNS protocol used by top and second level domain name system servers [24]. However, searching for “commands” used by Wekby did not yield any meaningful results, and therefore using the manual method of searching for phrases/commands resulted in a detection rate of zero. As a result, this research focused on the DNSMessenger malware family. The analysis of the two malware families revealed that they used certain “phrases” during the querying and responding to DNS requests. Due to the recurring pattern of having hard-coded command and control servers in all the malware families that were considered, the researcher decided to perform a manual analysis and manual detection process. Automating the detection process would have increased the scope and complexity of the research considerably.

1) Analysis of the DNSMessenger Malware: An earlier variant of the malware was obtained and a sandboxed malware analysis was performed using the steps and methods used by the organisations and researchers who discovered the malware [11, 12, 76]. A comprehensive analysis was not performed as this was out of scope, and the analysis was performed only to ensure that the DNS details about the malware were correct. According to an investigation conducted by [11], DNSMessenger used the contents of the TXT record in the response to these queries to determine what action to take next. For instance, the first subdomain is www and a query response with a TXT record containing www will instruct the script to proceed. Other actions that could be taken were idle and stop.

A search for www in TXT records revealed a total of six domains as shown in Figure 8. An investigation of the first four domains in Figure 8 did not point to anything interesting/amiss.

Fig. 8. Search for the “www” in TXT records

The other two domains (www.cihr.site and www.ckwl.pw) were part of the domains that were hard-coded in the source code of the malware. An analysis of the passive DNS data did not reveal any interesting activity associated with these two domains. The hard-coded domains appeared for less than 72 hours in the global-time series. This is a relatively short time compared to domains used for legitimate purposes, which may be in existence for many years.

The other commands mail and stop revealed what could be payload distribution but further investigation was not conclusive as shown in Figures 9 and 10.

Fig. 9. Search for “stop” in TXT Records
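
The manual search described above amounts to scanning TXT record data for known command strings. A minimal sketch follows. The command set (www, idle, stop, mail) comes from the text, while the record tuples, the domain names, the exact-match rule and the SPF/DKIM prefix filter are illustrative assumptions rather than the tooling actually used in the study.

```python
COMMANDS = {"www", "idle", "stop", "mail"}
# Prefixes of common legitimate TXT content that a pre-analysis filter would skip.
LEGIT_PREFIXES = ("v=spf1", "v=dkim1")


def find_command_records(records):
    """Yield (domain, command) for TXT records whose whole value is a known command."""
    for domain, txt in records:
        # Normalise: drop surrounding whitespace and quotes, ignore case.
        value = txt.strip().strip('"').lower()
        if value.startswith(LEGIT_PREFIXES):
            continue
        if value in COMMANDS:
            yield domain, value


# Made-up passive DNS rows: (domain, TXT data).
sample = [
    ("a.example.test", '"idle"'),
    ("b.example.test", '"v=spf1 include:example.test -all"'),
    ("c.example.test", '"hello world"'),
    ("d.example.test", '"STOP"'),
]
hits = list(find_command_records(sample))
print(hits)
```

Exact matching keeps false positives low but would miss commands embedded in longer strings; a substring search trades the other way and would need the manual review the study describes.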
Fig. 10. Search for “mail” in TXT records

A manual search for the idle command yielded interesting results, shown in Figure 11, which are discussed in the following section. Three domains contained the idle command/text: microx.club, sharepoint-microsoft.co, and jreupdate.javaupdate.com. A DNS analysis of the microx.club domain did not reveal anything useful. No further investigation was carried out on the microx.club domain. The other two domains exhibited behaviour that warranted further investigation. The analysis and investigation of the other two domains is described in the sections below.

Fig. 11. Search for “idle” in TXT Records

Based on the analysis, the activity on the hard-coded domains was short-lived (roughly two days) meaning that by then there might not have been enough time for the malware to spread, the malware appears to have been targeted, which is typical in Advanced Persistent Threats (APTs), and there is possible use of open resolvers as “may” have been shown by low activity on passive DNS data. There is also possible use of rogue DNS resolvers. Later variants of DNSMessenger changed the way the malware communicates using the TXT records [12]. The later variants do not have hardcoded domain names, and are dynamic in nature.

VII. BOTNET

The analysis of DNSMessenger and TXT records which have the text idle in them revealed a possible botnet associated with the domain sharepoint-microsoft.co. Further analysis was conducted as the behaviour of the domains showed that it was a possible command and control for a botnet. This was because the third-level and upper level domains had incremental names that appeared to be generated by algorithms. Further investigation of the sharepoint-microsoft.co domain revealed that it is used for malware delivery, command and control, among others. Noteworthy observations about the domains sharepoint-microsoft.co and jreupdate.javaupdate.com are highlighted below:

1) The two domains impersonate major Internet, software companies and services: Microsoft (sharepoint-microsoft.co), Oracle (jreupdate.javaupdate.com), among others.
2) The cybercriminals used WHOISGUARD³, one of the ways that are used to avoid being identified. The above domains had long sub-domains like those used by Content Delivery Networks as highlighted in Figure 12.

Fig. 12. Content Delivery Network

The sharepoint-microsoft.co domain also had multiple sub-domains associated with it. It appears that the botnet points malicious domains to IP addresses not in their control. For example, the sharepoint-microsoft.co domain pointed to a non-malicious IP owned by the Microsoft Corporation (sharepoint-microsoft.co resolved to the IP address 104.43.195.251 with Autonomous System Number 8075).

Some of the domains associated with the IP address 104.43.195.251: windows.com, dynamics.com, dynamics.com, www.msdn.com, www.windows.com, xbox.com, microsoft.ca and www.microsoft.ch.

A DNS analysis of the domain javaupdate.co revealed a pattern that was similar to that of sharepoint-microsoft.co. The domain javaupdate.co resolved to a non-malicious IP address owned by Oracle Corporation. This pattern was instrumental in directing the research focus, and detecting that the sharepoint-microsoft.co and javaupdate.co domains were malicious domains. This information pointed to other possible malicious domains, but the leads were not followed due to the fact that this malware family had already been analysed in detail before in a study conducted by ClearSky and Trend Micro, which published a report about an Advanced Persistent Threat (APT) group called CopyKittens [16].

³ http://www.whoisguard.com/: WHOISGUARD substitutes a domain owner’s private information with its own information so that this information cannot be readily accessed by spammers.
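
The pattern noted above, a look-alike name resolving into a well-known organisation's address space, can be expressed as a simple check. This is a hedged sketch, not the procedure used in the study: the allow-list for ASN 8075 is a hand-made assumption built from the names quoted above, and `registered_domain` naively takes the last two labels where a real implementation would consult the Public Suffix List.

```python
# Hypothetical allow-list: domains assumed to legitimately sit in a given ASN.
KNOWN_DOMAINS_BY_ASN = {
    8075: {"windows.com", "dynamics.com", "msdn.com", "xbox.com",
           "microsoft.ca", "microsoft.ch"},
}


def registered_domain(fqdn: str) -> str:
    """Naive registered domain: the last two labels of the name."""
    labels = fqdn.rstrip(".").lower().split(".")
    return ".".join(labels[-2:])


def resolves_into_foreign_space(fqdn: str, asn: int) -> bool:
    """True if the name resolves into a listed ASN but is not on that ASN's allow-list."""
    known = KNOWN_DOMAINS_BY_ASN.get(asn)
    if known is None:
        return False  # ASN not tracked, so nothing can be said
    return registered_domain(fqdn) not in known


print(resolves_into_foreign_space("sharepoint-microsoft.co", 8075))
print(resolves_into_foreign_space("www.windows.com", 8075))
```

An ASN can host many unrelated customers, so a positive result is only a lead for manual review of the kind described in this section, not proof of malicious use.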
A. Advanced Persistent Threat Group

According to [16], CopyKittens is a cyber espionage actor which uses Cobalt Strike, a publicly available commercial software for “Adversary Simulations and Red Team Operations” [17]. A notable characteristic of CopyKittens is the use of DNS for command and control communication, and for data exfiltration. Most of the infrastructure used by the group was identified to have been in the United States of America and Great Britain (based on IP addresses and analysis of information obtained from other sources, such as geolocation and WHOIS, among others).

B. Analysis of the Other Domains

Other domain records with 255 characters (excluding those with DKIM, SPF and other filters used before) were analysed for any suspicious behaviour. Decoding the content of TXT records did not yield anything using readily available encoding/decoding tools. Most of the DNS records with 255 characters in the TXT field were associated with either:

1) the CopyKittens APT mentioned above, or
2) the 1yf.de domain associated with https://your-freedom.net/, a VPN Tunneling, Firewall and Proxy Bypassing, Anonymization and Anti-Censorship Solution, discussed in previous sections.

The second group of domains that have 255 characters in the TXT field were:

• dns.njcate.org - shares the nameserver with the sharepoint-microsoft.co domain. There were, however, multiple changes in name servers since the dns.njcate.org domain was registered.
• ksx.la - the nameserver associated with ksx.la resolves to multiple domains, most of which are associated with suspicious malware content.
• bn.tl - has been used to serve malware in the past.
• The rest of the second level domains follow the same pattern as sharepoint-microsoft.co and jreupdate.javaupdate.com. The analysis matched a number of domains and indicators, such that these domains can be comfortably tagged as likely being DNS tunnel-related too.

VIII. SIGNIFICANT FINDINGS AND DISCUSSION

To answer the research question and the sub-research questions: From the analysis of the passive DNS data and information associated with the malicious domains, it can be established that the domains that are used for distributing payload can be detected and identified using various features that are synonymous with malicious activities. TXT resource records were used in this research to characterise each distribution channel and establish whether a channel was distributing payload based on the DNS activities related to TXT resource records. Using the malware samples that were identified to have been active during the period under consideration, namely DNSMessenger, it was found that these malware families still use the basic method of sending simple commands using the TXT resource records. From the analysis, there were many indicators of anomalies, ranging from simple indicators such as many DNS queries and responses during a specific time, followed by a period of relative inactivity, to other indicators such as FQDNs that resemble the pattern used by DGAs. This information, coupled with other publicly available information such as WHOIS data and domain registration data, among others, helped identify the infrastructure and possibly some parties that are involved. Care should, however, be taken when attributing the source of these abuses, as the origin of a payload distribution could also be compromised and could be used as a mask in order to avoid being identified by law enforcement agencies. The abuse of the DNS can be quantified through the query and response patterns and this gives an indication of the parties involved. Using supplementary information from secondary sources, the infrastructure used by malicious actors can be identified.

Using manual procedures and searching for specific commands in the TXT records, it was discovered that payload distribution channels can be detected and characterised. While investigating one of the commands, the researcher stumbled upon what turned out to be a botnet that uses TXT records, and further investigation indicated that this was indeed an advanced botnet used by one of the APT groups, CopyKittens. From the analysis (excluding the .com TLD, the .arpa TLD, those with SPF, DKIM among others), it can be concluded that DNS records that have 255 characters in the TXT field are either involved in malware payload distribution or are being used for DNS Tunnelling services (which can be abused to distribute malicious payload). The expectation was that certain channels would be used for various purposes/different payloads, but disappointingly, the analysis showed that these channels are normally used for targeted attacks and as such, are used for a single purpose to avoid detection. There was no instance where a malware domain was used for phishing and then also for payload distribution. Neither was there an instance where a malware domain was associated with various malware families. Using a channel for various purposes would increase the chances of detection.

IX. LIMITATIONS OF THIS RESEARCH

There are limitations with the system. One of the limitations is that the system was unable to identify malware and distribution channels that mimic the DNS activities related to TXT resource records that belong to legitimate payload channels. Another limitation is that the system relies on malware datasets, and dynamic analysis in a sandboxed environment may not be able to detect all the behaviour of some malware families that have the ability to detect that they are in a sandboxed/virtual environment. Another limitation is that the system cannot offer real-time solutions as the system is an offline one that relies on data being analysed only at the end of a given time period. The last limitation has to do with the shortcomings of the passive DNS systems themselves. The passive DNS systems collect DNS traffic through multiple sensors located at various locations. Some payload distribution channels and malware families may not use caching resolvers in their respective
networks and instead, send their DNS queries to open resolvers directly. This would mean that the DNS traffic would bypass the sensors and therefore cannot be captured for analysis.

X. FUTURE WORK

The analysis identified several significant challenges that need to be explored further in future work. As such, machine learning, automated and dynamic analysis should be explored in future so that the detection rate and the dynamic nature of these malware families can be taken into consideration. The research undertaking uses mainly the TXT records for identifying payload distribution channels and in some cases, domains may use different DNS resource record types at different times to perform malicious activities. Future work can expand the scope and also consider the .com TLD and also include other resource records. The usage of open DNS resolvers and rogue resolvers may need to be investigated as they may shed light on some activities that are not detected and sensed by the Passive DNS sensors.

XI. CONCLUSION

The objective of this research undertaking was to analyse and understand how DNS resource records are used in the distribution of malicious payloads. Through the analysis of passive DNS data, the researcher was able to characterise DNS TXT resource records associated with malicious networks and investigate how malicious networks use the DNS to distribute malicious payloads. This research also managed to highlight that through monitoring and analysis of DNS TXT resource records, botnets and their underlying infrastructure could be identified. Finally, the analysis suggests that usage of DNS TXT records other than for commonly known purposes, such as domain authentication, is a strong indicator of malicious activity.

REFERENCES

[1] Maurizio Aiello, Alessio Merlo, and Gianluca Papaleo. “Performance assessment and analysis of DNS tunneling tools”. In: Logic Journal of the IGPL 21.4 (2013), pp. 592–602.
[2] Maurizio Aiello, Maurizio Mongelli, and Gianluca Papaleo. “DNS tunneling detection through statistical fingerprints of protocol messages and machine learning”. In: International Journal of Communication Systems 28.14 (2015), pp. 1987–2002.
[3] Abdelraman Alenazi et al. “Holistic Model for HTTP Botnet Detection Based on DNS Traffic Analysis”. In: International Conference on Intelligent, Secure, and Dependable Systems in Distributed and Cloud Environments. Springer. 2017, pp. 1–18.
[4] Kamal Alieyan et al. “A survey of botnet detection based on DNS”. In: Neural Computing and Applications 28.7 (2017), pp. 1541–1558.
[5] Marios Anagnostopoulos et al. “DNS amplification attack revisited”. In: Computers & Security 39 (2013), pp. 475–485.
[6] Elisa Bertino and Nayeem Islam. “Botnets and internet of things security”. In: Computer 50.2 (2017), pp. 76–79.
[7] Leyla Bilge et al. “EXPOSURE: a passive DNS analysis service to detect and report malicious domains”. In: ACM Transactions on Information and System Security (TISSEC) 16.4 (2014), p. 14.
[8] Leyla Bilge et al. “EXPOSURE: Finding Malicious Domains Using Passive DNS Analysis.” In: Network and Distributed System Security (2011). Available online at http://www.cs.ucsb.edu/~chris/research/doc/ndss11_exposure.pdf. Retrieved 10 May 2018.
[9] Hamad Binsalleeh et al. “Characterization of covert channels in DNS”. In: 2014 6th International Conference on New Technologies, Mobility and Security (NTMS). IEEE. 2014, pp. 1–5.
[10] Edmund Brumaghin. “Want Tofsee My Pictures? A Botnet Gets Aggressive”. In: (2016). Available online at https://blog.talosintelligence.com/2016/09/tofsee-spam.html. Retrieved 4 December 2018.
[11] Edmund Brumaghin and Colin Grady. “Covert Channels and Poor Decisions: The Tale of DNSMessenger”. In: (2017). Available online at https://blog.talosintelligence.com/2017/03/dnsmessenger.html. Retrieved 20 January 2018.
[12] Edmund Brumaghin, Colin Grady, and Dave Maynor. Spoofed SEC Emails Distribute Evolved DNSMessenger. Available online at https://blog.talosintelligence.com/2017/10/dnsmessenger-sec-campaign.html. Retrieved 20 August 2018. 2017.
[13] Prabhjot Singh Chahal and Surinder Singh Khurana. “TempR: Application of stricture dependent intelligent classifier for fast flux domain detection”. In: International Journal of Computer Network and Information Security 8.10 (2016), p. 37.
[14] Taejoong Chung et al. “Understanding the role of registrars in DNSSEC deployment”. In: Proceedings of the 2017 Internet Measurement Conference. ACM. 2017, pp. 369–383.
[15] Lynch Cian, Andonov Dimiter, and Teodorescu Claudiu. “MULTIGRAIN – Point of Sale Attackers Make an Unhealthy Addition to the Pantry”. In: (2016). Available online at https://www.fireeye.com/blog/threat-research/2016/04/multigrain_pointo.html. Retrieved 12 June 2018.
[16] ClearSky. “Operation Wilted Tulip”. In: (2017). Available online at https://www.clearskysec.com/wp-content/uploads/2017/07/Operation_Wilted_Tulip.pdf. Retrieved 20 August 2018.
[17] CobaltStrike. “Adversary Simulations and Red Team Operations”. In: (2018). Available online at https://www.cobaltstrike.com/. Retrieved 20 August 2018.
[18] Christian J Dietrich et al. “On Botnets that use DNS for Command and Control”. In: 2011 Seventh European Conference on Computer Network Defense (EC2ND). IEEE. 2011, pp. 9–16.
[19] Michael Dooley and Timothy Rooney. DNS Security Management. John Wiley & Sons, 2017.
[20] Paal Engelstad, Boning Feng, Thanh van Do, et al. “Detection of DNS tunneling in mobile networks using machine learning”. In: International Conference on Information Science and Applications. Springer. 2017, pp. 221–230.
[21] Farsight. “Frequently Asked Questions”. In: Farsight Security (2018). Available online at https://www.farsightsecurity.com/faq/. Retrieved 25 June 2018.
[22] Kensuke Fukuda, John Heidemann, and Abdul Qadeer. “Detecting Malicious Activity With DNS Backscatter Over Time”. In: IEEE/ACM Transactions on Networking 25.5 (2017), pp. 3203–3218.
[23] J. Gordon. “Systems and methods for identifying a network”. In: (Jan. 2013). US Patent 8,353,007. Available online at https://www.google.com/patents/US8353007.
[24] Josh Grunzweig, Mike Scott, and Brian Lee. “New Wekby Attacks Use DNS Requests As Command and Control Mechanism”. In: (2015). Available online at https://researchcenter.paloaltonetworks.com/2016/05/unit42-new-wekby-attacks-use-dns-requests-as-command-and-control-mechanism. Retrieved 20 June 2018.
[25] Nicole M Hands, Baijian Yang, and Raymond A Hansen. “A study on botnets utilizing DNS”. In: Proceedings of the 4th Annual ACM Conference on Research in Information Technology. ACM. 2015, pp. 23–28.
[26] Filip Hock and Peter Kortiš. “Design, implementation and monitoring of the firewall system for a DNS server protection”. In: 2016 International Conference on Emerging eLearning Technologies and Applications (ICETA). IEEE. 2016, pp. 91–96.
[27] Mohammed Abdulridha Hussain et al. “DNS Protection against Spoofing and Poisoning Attacks”. In: 2016 3rd International Conference on Information Science and Control Engineering (ICISCE). IEEE. 2016, pp. 1308–1312.
[28] MH Jalalzai, WB Shahid, and MMW Iqbal. “DNS security challenges and best practices to deploy secure DNS with digital signatures”. In: 2015 12th International Bhurban Conference on Applied Sciences and Technology (IBCAST). IEEE. 2015, pp. 280–285.
[29] A Mert Kara et al. “Detection of Malicious Payload Distribution Channels in DNS”. MA thesis. Concordia University, 2012.
[30] A Mert Kara et al. “Detection of malicious payload distribution channels in DNS”. In: 2014 IEEE International Conference on Communications (ICC). IEEE. 2014, pp. 853–858.
[31] Issa Khalil, Ting Yu, and Bei Guan. “Discovering malicious domains through passive DNS data graph analysis”. In: Proceedings of the 11th ACM on Asia Conference on Computer and Communications Security. ACM. 2016, pp. 663–674.
[32] Hiroaki Kikuchi and Tomohiro Arimizu. “On the Vulnerability of Ghost Domain Names”. In: 2014 Eighth International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing (IMIS). IEEE. 2014, pp. 584–587.
[33] Amit Klein, Haya Shulman, and Michael Waidner. “Internet-wide study of DNS cache injections”. In: INFOCOM 2017-IEEE Conference on Computer Communications, IEEE. IEEE. 2017, pp. 1–9.
[34] Yakup Koc, Almerima Jamakovic, and Bart Gijsen. “A global reference model of the Domain Name System”. In: International Journal of Critical Infrastructure Protection 5.3 (2012), pp. 108–117.
[35] Vimal Kumar, Satish Kumar, and Avadhesh Kumar Gupta. “Real-time Detection of Botnet Behavior in Cloud Using Domain Generation Algorithm”. In: Proceedings of the International Conference on Advances in Information Communication Technology & Computing. ACM. 2016, p. 69.
[36] Chris Larsen. “The “Top 20”: Shady Top-Level Domains”. In: (2018). Available online at https://www.symantec.com/blogs/feature-stories/top-20-shady-top-level-domains. Retrieved 20 August 2018.
[37] Xingguo Li, Junfeng Wang, and Xiaosong Zhang. “Botnet Detection Technology Based on DNS”. In: Future Internet 9.4 (2017), p. 55.
[38] Allan Liska and Geoffrey Stowe. DNS Security: Defending the Domain Name System. Syngress, 2016.
[39] Baojun Liu et al. “A Reexamination of Internationalized Domain Names: the Good, the Bad and the Ugly”. In: 2018 48th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN). IEEE. 2018, pp. 654–665.
[40] Cricket Liu. “Actively boosting network security with passive DNS”. In: Network Security 2016.5 (2016), pp. 18–20.
[41] Daiping Liu, Shuai Hao, and Haining Wang. “All your DNS records point to us: Understanding the security threats of dangling DNS records”. In: Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security. ACM. 2016, pp. 1414–1425.
[42] Jingkun Liu et al. “Detecting DNS Tunnel through Binary-Classification Based on Behavior Features”. In: 2017 IEEE Trustcom/BigDataSE/ICESS. IEEE. 2017, pp. 339–346.
[43] Keyu Lu et al. “DNS recursive server health evaluation model”. In: 2016 18th Asia-Pacific Network Operations and Management Symposium (APNOMS). IEEE. 2016, pp. 1–4.
[44] Samuel Marchal et al. “Semantic exploration of DNS”. In: International Conference on Research in Networking. Springer. 2012, pp. 370–384.
[45] Chris Marrison. “Understanding the threats to DNS and how to secure it”. In: Network Security 2015.10 (2015), pp. 8–10.
[46] Sara Marie Mc Carthy et al. “Data Exfiltration Detection and Prevention: Virtually Distributed POMDPs for Practically Safer Networks”. In: International Conference on Decision and Game Theory for Security. Springer. 2016, pp. 39–61.
[47] Alessio Merlo et al. “A comparative performance evaluation of DNS tunneling tools”. In: Computational Intelligence in Security for Information Systems. Springer, 2011, pp. 84–91.
[48] Khulood Al Messabi et al. “Malware detection using dns records and domain name features”. In: Proceedings of the 2nd International Conference on Future Networks and Distributed Systems. 2018, pp. 1–7.
[49] Jayashree Mohan, Shruthi Puranik, and K Chandrasekaran. “Reducing DNS cache poisoning attacks”. In: 2015 International Conference on Advanced Computing and Communication Systems. IEEE. 2015, pp. 1–6.
[50] Cathall Mullaney. “Morto worm sets a (DNS) record”. In: (2011). Available online at https://www.symantec.com/connect/blogs/morto-worm-sets-dns-record. Retrieved 20 January 2018.
[51] Bold Munkhbaatar, Mamoru Mimura, and Hidema Tanaka. “Dark Domain Name Attack: A New Threat to Domain Name System”. In: International Conference on Information Systems Security. Springer. 2017, pp. 405–414.
[52] Asaf Nadler, Avi Aminov, and Asaf Shabtai. “Detection of Malicious and Low Throughput Data Exfiltration over the DNS Protocol”. In: Cryptography and Security (2017). Available online at https://arxiv.org/abs/1709.08395. Retrieved 29 September 2017.
[53] Asaf Nadler, Avi Aminov, and Asaf Shabtai. “Detection of malicious and low throughput data exfiltration over the DNS protocol”. In: Computers & Security 80 (2019), pp. 36–53.
[54] Arman Noroozian et al. “Who gets the boot? Analyzing victimization by DDOS-as-a-service”. In: International Symposium on Research in Attacks, Intrusions, and Defenses. Springer. 2016, pp. 368–389.
[55] Viivi Nuojua, Gil David, and Timo Hämäläinen. “DNS Tunneling Detection Techniques–Classification, and Theoretical Comparison in Case of a Real APT Campaign”. In: Internet of Things, Smart Spaces, and Next Generation Networks and Systems. Springer, 2017, pp. 280–291.
[56] Roland van Rijswijk-Deij. “Improving DNS security: A Measurement-based Approach”. PhD thesis. University of Twente, 2017.
[57] Salvatore Saeli et al. “DNS Covert Channel Detection via Behavioral Analysis: a Machine Learning Approach”. In: arXiv preprint arXiv:2010.01582 (2020).
[58] Mahmoud Sammour, Burairah Hussin, and Fairuz Iskandar Othman. “Comparative Analysis for Detecting DNS Tunneling Using Machine Learning Techniques”. In: International Journal of Applied Engineering Research 12.22 (2017), pp. 12762–12766.
[59] Sarah Scheffler et al. “The Unintended Consequences of Email Spam Prevention”. In: International Conference on Passive and Active Network Measurement. Springer. 2018, pp. 158–169.
[60] Kitterman Scott. Sender Policy Framework (SPF) for Authorizing Use of Domains in Email, Version 1. RFC 7208. Available online at http://www.rfc-editor.org/rfc/rfc7208.txt. RFC Editor, 2014.
[61] Andy Settle et al. “An analysis of Botnet Campaign:Jaku”. In: (2015). Available online at https://www.forcepoint.com/sites/default/files/resources/files/report_jaku_analysis_of_botnet_campaign_en_0.pdf. Retrieved 20 June 2018.
[62] Saeed Shafieian, Daniel Smith, and Mohammad Zulkernine. “Detecting DNS Tunneling Using Ensemble Learning”. In: International Conference on Network and System Security. Springer. 2017, pp. 112–127.
[63] Bisma Shah. “Cisco Umbrella: A Cloud-Based Secure Internet Gateway (SIG) On and Off Network”. In: International Journal 8.2 (2017).
[64] Pooja Sharma, Sanjeev Kumar, and Neeraj Sharma. “BotMAD: Botnet malicious activity detector based on DNS traffic analysis”. In: 2016 2nd International Conference on Next Generation Computing Technologies (NGCT). IEEE. 2016, pp. 824–830.
[65] Stephen Sheridan and Anthony Keane. “Detection of DNS Based Covert Channels”. In: European Conference on Cyber Warfare and Security. Academic Conferences International Limited. 2015, p. 267.
[66] Stephen Sheridan and Anthony Keane. “Improving the Stealthiness of DNS-Based Covert Communication”. In: ECCWS 2017 16th European Conference on Cyber Warfare and Security. Academic Conferences and Publishing Limited. 2017, p. 433.
[67] Somayeh Soltani et al. “A survey on real world botnets and detection mechanisms”. In: International Journal of Information and Network Security 3.2 (2014), p. 116.
[68] Aditya K Sood and Sherali Zeadally. “A taxonomy of domain-generation algorithms”. In: IEEE Security & Privacy 14.4 (2016), pp. 46–53.
[69] Jacob Steadman and Sandra Scott-Hayward. “DNSxD: Detecting Data Exfiltration Over DNS”. In: 2018 IEEE Conference on Network Function Virtualization and Software Defined Networks (NFV-SDN). IEEE. 2018, pp. 1–6.
[70] Martino Trevisan et al. “Automatic detection of DNS manipulations”. In: 2017 IEEE International Conference on Big Data (Big Data). IEEE. 2017, pp. 4010–4015.
[71] Roman Valisenko. “An Analysis of PlugX Malware”. In: (2013). Available online at https://www.lastline.com/labsblog/an-analysis-of-plugx-malware/. Retrieved 20 June 2018.
[72] Gernot Vormayr, Tanja Zseby, and Joachim Fabini. “Botnet communication patterns”. In: IEEE Communications Surveys & Tutorials 19.4 (2017), pp. 2768–2796.
[73] Zheng Wang. “POSTER: on the capability of DNS cache poisoning attacks”. In: Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security. ACM. 2014, pp. 1523–1525.
[74] Hao Wu et al. “Kalman filter based DNS cache poisoning attack detection”. In: 2015 IEEE International Conference on Automation Science and Engineering (CASE). IEEE. 2015, pp. 1594–1600.
[75] Kui Xu et al. “DNS for massive-scale command and control”. In: IEEE Transactions on Dependable and Secure Computing 10.3 (2013), pp. 143–153.
[76] Anthony Yates. “More info on “Evolved DNSMessenger””. In: (2017). Available online at https://wraithhacker.com/2017/10/11/more-info-on-evolved-dnsmessenger/. Retrieved 20 August 2018.
[77] Your-Freedom. “VPN tunneling, anonymisation and anti-censorship”. In: (2017). Available online at https://your-freedom.net/. Retrieved 20 August 2018.
[78] Z Zhang, Roy George, and Khalil Shujaee. “An approach to malicious payload detection”. In: 2018 World Automation Congress (WAC). IEEE. 2018, pp. 1–5.
[79] Yury Zhauniarovich et al. “A Survey on Malicious Domains Detection through DNS Data Analysis”. In: ACM Computing Survey (2018). Available online at https://arxiv.org/pdf/1805.08426.pdf. Retrieved 10 May 2018.
