Paper045 2nd IEEE IMITEC 2020 Conference FINAL-2
net/publication/348977997
All content following this page was uploaded by Ishmael Dube on 12 October 2021.
Abstract—The Domain Name System (DNS) protocol is a fundamental part of Internet activities that can be abused by cybercriminals to conduct malicious activities. Previous research has shown that cybercriminals use different methods, including the DNS protocol, to distribute malicious content, remain hidden and avoid detection from various technologies that are put in place to detect anomalies. This allows botnets and certain malware families to establish covert communication channels that can be used to send or receive data and also distribute malicious payloads using the DNS queries and responses. Cybercriminals use the DNS to breach highly protected networks, distribute malicious content, and exfiltrate sensitive information without being detected by security controls put in place, by embedding certain strings in DNS packets. This research undertaking analysed the use of the DNS in detecting domains and channels that are used for distributing malicious payloads. Passive DNS data, which replicate DNS queries on name servers to detect anomalies in DNS queries, were evaluated and analysed in order to detect malicious payloads. The research characterised the malicious payload distribution channels by analysing passive DNS traffic and modelled the DNS query and response patterns used during malicious payload distribution. The research found that it is possible to detect malicious payload distribution channels through the analysis of DNS TXT resource records.

Index Terms—DNS, Malicious Payload Distribution, Passive DNS, Covert DNS Tunnelling

I. INTRODUCTION

Previous research has shown that botnet masters use different methods to remain hidden and avoid detection from various technologies that are put in place to detect anomalies [37]. A botnet is defined as a coordinated group of malicious instances that are usually distributed but are controlled by a command and control instance to deliver malware or launch cyberattacks [72]. The Domain Name System (DNS) has become a target for botnet masters in conducting their malicious activities [67]. Cybercriminals use the DNS to breach highly protected networks, steal and exfiltrate sensitive information without being detected by security controls put in place by organisations [46]. According to [42], DNS traffic is considered harmless, is often monitored little or not at all, and is allowed to bypass the monitoring measures that are in place. This allows botnets and certain malware families to establish covert communication channels that can be used to send or receive data and also distribute malicious payloads using the DNS queries. The payload is the portion of the malware which performs a malicious action on the target [78].

In its simplest form, the DNS is a powerful yet simple database management system that contains Internet Protocol (IP) addresses of every domain name [34]. As a fundamental part of Internet activities, the DNS is attracting cybercriminals who can exploit its vulnerabilities and also use it as part of their malicious network/infrastructure. The DNS can be used as a covert communication channel between the command and control servers and the bots. DNS traffic bypasses traditional security monitoring, giving botnets an opportunity to send and receive data in highly protected networks by embedding certain strings in the DNS packets [64]. The simple architecture of the DNS protocol allows and facilitates the transfer of data in query and response packets, a feature that cybercriminals also use to establish covert communication channels that allow them to send and distribute malicious payloads. It is, therefore, essential to analyse DNS activity to understand, detect and set up appropriate defence strategies against such emerging threats. This can be achieved through analysing Passive DNS datasets, which are databases of actual historic DNS traffic [44].

According to [9, 79], research in the analysis of the DNS as a payload distribution channel is limited and often concentrates on specific malware families. This research aims to broaden this research field and fill in the existing research gap by extending the analysis of DNS being used as a payload distribution channel to the detection of multi-purpose domains that are used to distribute different malicious payloads. Multi-purpose domains are used for different malicious activities such as sending spam messages, phishing campaigns, and distribution of malware, among others [30, 10].

II. RESEARCH OBJECTIVES

The objective of this research is to analyse and understand how DNS resource records are used in the distribution of malicious payloads. The goal is also to characterise DNS messages associated with malicious networks and to investigate the various ways in which malicious networks use the DNS to distribute attack payloads.
III. RESEARCH QUESTION

While the subject of malicious payload distribution channels using the DNS has been studied before, there is still a research gap when it comes to multi-purpose domains used for various malicious activities. Below is the research question being investigated:

• Can domains used for distributing multiple malicious payloads using covert channels be characterised and modelled based on the data transferred using the DNS?

In addition to the research question above, the following sub-questions were also investigated:

• How can the abuse of the DNS be quantified and all the involved parties and infrastructure be identified?
• Can encrypted payload distribution channels that are used for various purposes be detected and monitored through analysis of DNS traffic?

IV. RELATED WORK

A. Domain Name System

According to [19], the DNS protocol is a translation service used by the Internet protocol. To access a web page on the Internet, a request to access a domain name begins with a DNS query to obtain an IP address which corresponds to the requested domain name. The DNS protocol utilises the Fully Qualified Domain Name (FQDN) to specify the location of a computer in the hierarchy of the DNS [51, 3]. According to [38], authoritative name servers store original information about a domain in a text file called a zone file. A zone file is composed of entries or mappings between IP addresses, domain names and other resources that are organised using a textual representation called a Resource Record [5, 19]. Each Resource Record consists of five fields, namely, Name, Time to Live (TTL), Record Class, Record Type and Record Data. For the purpose of this research the focus will be on the Record Type field, which specifies the type of information that will be carried by the DNS message. Some of the record types that were considered in this research undertaking are A/AAAA, NS, MX, TXT and CNAME.

B. The Abuse of the DNS

According to [67, 64], the DNS has been and is a target for criminals for different purposes due to the simplicity and the underlying architecture of the DNS protocol. Based on the form of abuses in the past, [30] categorised these DNS abuses into system-level and protocol-level abuses.

1) Protocol-Level Abuse of the DNS: Protocol-level DNS abuses target the DNS protocol and exploit the weaknesses in the architecture and the way the protocol is designed [37]. From the literature reviewed, DNS Tunnelling and Fast-Flux networks were identified as protocol-level abuses of the DNS.

   1) DNS Tunnelling: DNS tunnelling can be used to bypass network security devices and communicate with rogue DNS resolvers [47]. The FeederBot botnet used the DNS as a channel for communicating with the command and control servers [18]. [9] note that the biggest challenge with malicious payload distribution channels is that they usually do not show characteristics of DNS channels because of the relatively low volume of upstream data. Studies have been conducted on various DNS tunnelling tools and ways of detecting such tunnels [75, 65, 2, 20, 52, 4, 55, 62, 58], ways of improving detection mechanisms [2] and techniques that can be used to protect against and mitigate threats posed by DNS tunnels [25, 19]. Other studies have focused on DNS tunnelling performance issues due to limitations of DNS packet sizes and traffic overhead [1, 65, 52]. Finally, [58] investigated new ways that criminals are using to avoid detection when using DNS tunnels.
   2) Fast-Flux Networks: According to [13], fast-flux networks are a result of a different set of IP addresses being returned for each DNS query. The basic concept of a fast-flux network is having multiple IP addresses associated with a domain name, and then constantly changing them in quick succession. With the passage of time, fast-flux networks have become complex, making it even more difficult to detect and trace them [67].

2) Abuse of the DNS at System-Level: According to [43], the DNS has a number of yet to be discovered vulnerabilities that exist at every layer in the DNS. These unknown vulnerabilities present an opportunity for criminals to abuse the DNS system whenever they can [45]. Some studies have focused on the domains that are based on the Domain Generation Algorithm (DGA). These domains are randomly generated [35] and botnets send queries to them hoping that they are already registered. This method is also called domain fluxing [68, 6]. Another system-level DNS abuse is the DNS amplification attack, a reflection-based distributed denial of service (DDoS) attack [5].

Studies have been conducted on cache manipulation attacks that rely on caching DNS information in the hierarchy of the DNS [49, 74, 33]. These studies have ranged from methods used to poison DNS caches [27] to ways of detecting cache poisoning attacks by scanning DNS resolvers [73]. DNS Cache Poisoning (DNS Spoofing) is a cyber-attack that exploits vulnerabilities in the DNS by diverting Internet traffic away from legitimate servers and towards fake ones. [32] studied a vulnerability which allowed domains that had been deleted to be kept alive. In addition to the above abuses, [70] studied the impact of rogue resolvers in the DNS hierarchy. These resolve domain names to malicious IP addresses that can be under the control of the attackers. [40] observed that domains used for malicious purposes are also found in authoritative name servers. There have been proposals around algorithms that can be used to detect malicious activities based on DNS queries [22].

C. Security Measures Put in Place to Protect the DNS from Abuse

There are various mechanisms and measures that have been put in place to protect the DNS against security threats [26], such as traffic shaping, flow filtering and prioritisation. [28] proposes a number of measures to protect the DNS, such as hardening the environment hosting the DNS. [19] also recommend that recursion be disabled on master and primary DNS servers, and that recursive queries be restricted on slave servers to mitigate spoofing attacks. The DNS with Security Extensions (DNSSEC), a suite of specifications used for securing information supplied by the DNS, provides a set of security extensions [14]. According to [56], these extensions also address most DNS vulnerabilities that are known, but their usage is not as widespread as expected.
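To make the five-field Resource Record structure from Section IV-A concrete, the following minimal Python sketch parses simplified zone-file style lines and filters them by Record Type. It is an illustration only: the `ResourceRecord` class, the `parse_record` and `records_of_type` helpers, and the sample lines are hypothetical, and the parser assumes that all five fields are present on every line, which real zone files do not guarantee.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceRecord:
    """The five fields named in Section IV-A (illustrative container)."""
    name: str
    ttl: int
    record_class: str
    record_type: str
    record_data: str


def parse_record(line: str) -> ResourceRecord:
    # Assumption: every line carries all five fields in this order.
    # Split at most four times so record data containing spaces stays intact.
    name, ttl, record_class, record_type, record_data = line.split(None, 4)
    return ResourceRecord(name, int(ttl), record_class,
                          record_type.upper(), record_data)


def records_of_type(records, wanted_types):
    # Keep only the records whose Record Type is in the wanted set.
    wanted = {t.upper() for t in wanted_types}
    return [r for r in records if r.record_type in wanted]


# Hypothetical sample data using reserved documentation names and addresses.
sample_lines = [
    'example.com. 3600 IN A 192.0.2.10',
    'example.com. 3600 IN TXT "v=spf1 -all"',
    'example.com. 3600 IN MX 10 mail.example.com.',
]
records = [parse_record(line) for line in sample_lines]
txt_records = records_of_type(records, ["TXT"])
```

Filtering on the Record Type field in this way mirrors the focus on that field described above; a production parser would also need to handle omitted fields, comments and multi-line records.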
be using the same name servers reveals that there are several other domains that use the same name servers. The domains that use the same name servers are shown in the output in Figure 4. The records in Figure 4 all appear to be related to the www.your-freedom.net domain. The www.your-freedom.net domain offers VPN tunneling, firewall and proxy bypassing, anonymisation and anti-censorship solutions [77]. In this case, the record types NULL and TXT appear to be obfuscated by encryption.

B. Detecting the Payload Distribution Channels

Once the pattern of DNS queries and responses had been observed, they were then inspected in the module that analyses DNS zones. This was achieved through the use of access counts of TXT resource records for each domain. From the passive DNS dataset, it was observed that there were over 10 million domains with TXT resource record activities. The Access Count ranged from 1 to 272 737 (for the domain s06.1yf.de: possibly used for DNS tunneling).

1) Comparison With Alexa Top 1000 Domains: To ensure that the model can be validated, the difference in resource

Fig. 5. Alexa Top 1000 Resource Record Distribution

2 Of the Alexa Top 1000 domains extracted as at 12/08/2018, only 708 had DNS traffic during the period under investigation. The remaining 292 domains did not have any activity and the extracts were empty: some domains were not even existent during the period under consideration. Unfortunately the researcher could not obtain a historical Alexa Top 1000 domains list for the period under investigation.

2) Comparison with Top 20 “Shady” TLDs: According to [36], “shadiness” looks at the ratio of malicious second-level domains to the total number of domains registered. Other alternative terms used are “evil index” and “badness”, among others.
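The “shadiness” ratio described in the comparison with the Top 20 “Shady” TLDs can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the computation used in [36]: the `tld_of` and `shadiness` helpers are hypothetical, the input lists are invented sample data under reserved TLDs, and the malicious list is assumed to be a subset of the registered list.

```python
from collections import Counter


def tld_of(domain: str) -> str:
    # Take the label after the last dot, ignoring any trailing root dot.
    return domain.rstrip(".").rsplit(".", 1)[-1].lower()


def shadiness(registered_domains, malicious_domains):
    # Ratio of malicious domains to all registered domains, per TLD.
    # Assumption: malicious_domains is a subset of registered_domains.
    total = Counter(tld_of(d) for d in set(registered_domains))
    bad = Counter(tld_of(d) for d in set(malicious_domains))
    return {tld: bad[tld] / count for tld, count in total.items()}


# Hypothetical sample data.
registered = ["a.example", "b.example", "c.example", "d.example",
              "x.test", "y.test"]
malicious = ["a.example", "x.test", "y.test"]
scores = shadiness(registered, malicious)
```

A TLD with a score close to 1 would be one where almost every registered second-level domain is known to be malicious; ranking TLDs by this score is one way a "Top 20" list could be produced.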
Fig. 7. Resource Record Count in the Top 20 Shady TLD domains

This analysis was not conclusive, possibly because these malicious domains focus mainly on phishing and spam and are not used for intensive payload distribution. The resource record count for the Top 20 “Shady” TLDs is shown in Figure 7. There were other lists that were used to compare resource activities but these all generally point in the same direction: known malware top-level domains generally do not have a higher TXT resource record access count.

C. Malware Families that use DNS in their Operations

The following list is based on previous research and includes the well-known malware families that use DNS resource records to conduct their activities: Morto discovered in 2011 [50], FeederBot discovered in 2011 [18], PlugX discovered in 2013 [71], FrameworkPOS discovered in 2014 [15], Wekby discovered in 2015 [24], BernhardPOS discovered in 2015 [15], JAKU discovered in 2015 [61], MULTIGRAIN discovered in 2016 [15] and DNSMessenger discovered in 2017 [11]. After careful consideration of the malware families that use DNS for payload distribution, only two families were considered for this research, namely, Wekby [24] and DNSMessenger [11], because unlike other malware families, they not only used the DNS to distribute payloads but specifically used TXT records for their malicious activities. The rest of the families either exhibit the “Many-to-Many” pattern and would be easy to detect and address, or there have not been any known/discovered variants that would warrant further investigation.

Wekby uses DNS tunneling which takes advantage of the TXT transport layer within the DNS protocol used by top and second level domain name system servers [24]. However, searching for “commands” used by Wekby did not yield any meaningful results, and therefore using the manual method of searching for phrases/commands resulted in a detection rate of zero. As a result, this research focused on the DNSMessenger malware family. The analysis of the two malware families revealed that they used certain “phrases” during the querying and responding to DNS requests. Due to the recurring pattern of having hard-coded command and control servers in all the malware families that were considered, the researcher decided to perform a manual analysis and manual detection process. Automating the detection process would have increased the scope and complexity of the research considerably.

1) Analysis of the DNSMessenger Malware: An earlier variant of the malware was obtained and a sandboxed malware analysis was performed using the steps and methods used by the organisations and researchers who discovered the malware [11, 12, 76]. A comprehensive analysis was not performed as this was out of scope, and the analysis was performed only to ensure that the DNS details about the malware were correct. According to an investigation conducted by [11], DNSMessenger used the contents of the TXT record in the response to these queries to determine what action to take next. For instance, the first subdomain is www and a query response with a TXT record containing www will instruct the script to proceed. Other actions that could be taken were idle and stop.

A search for www in TXT records revealed a total of six domains as shown in Figure 8. An investigation of the first four domains in Figure 8 did not point to anything interesting/amiss.

Fig. 8. Search for the “www” in TXT records

The other two domains (www.cihr.site and www.ckwl.pw) were part of the domains that were hard-coded in the source code of the malware. An analysis of the passive DNS data did not reveal any interesting activity associated with these two domains. The hard-coded domains appeared for less than 72 hours in the global-time series. This is a relatively short time compared to domains used for legitimate purposes, which may be in existence for many years.

The other commands mail and stop revealed what could be payload distribution but further investigation was not conclusive as shown in Figures 9 and 10.

Fig. 9. Search for “stop” in TXT Records
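The manual search for command strings in TXT records described above can be approximated with a short Python sketch. It is only an illustration, not the procedure or tooling used in this research: the `find_command_domains` helper, the `(domain, txt_data)` record layout and the sample rows are hypothetical, and the sketch assumes an exact match on the whole TXT value (after stripping quotes and case), which is stricter than a substring search.

```python
from collections import defaultdict

# Command strings mentioned in the text for DNSMessenger.
COMMANDS = ("www", "idle", "stop", "mail")


def find_command_domains(txt_records, commands=COMMANDS):
    """Group domains by the command found as their whole TXT value.

    txt_records: iterable of (domain, txt_data) pairs (assumed layout).
    """
    wanted = {c.lower() for c in commands}
    hits = defaultdict(set)
    for domain, txt in txt_records:
        # Normalise: drop surrounding whitespace and quotes, ignore case.
        value = txt.strip().strip('"').lower()
        if value in wanted:
            hits[value].add(domain.lower())
    return {cmd: sorted(domains) for cmd, domains in hits.items()}


# Hypothetical sample rows using reserved example domains.
sample = [
    ("c2.example", '"idle"'),
    ("mail.example", '"v=spf1 -all"'),
    ("c2.example", '"www"'),
    ("other.example", "IDLE"),
]
result = find_command_domains(sample)
```

Exact matching keeps ordinary TXT content such as SPF policies out of the results; a substring search would return more candidates for manual review at the cost of more false positives.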
Fig. 10. Search for “mail” in TXT records

A manual search for the idle command yielded interesting results, shown in Figure 11, which are discussed in the following section. Three domains contained the idle command/text: microx.club, sharepoint-microsoft.co, and jreupdate.javaupdate.com. A DNS analysis of the microx.club domain did not reveal anything useful. No further investigation was carried out on the microx.club domain. The other two domains exhibited behaviour that warranted further investigation. The analysis and investigation of the other two domains is described in the sections below.

domains had incremental names that appeared to be generated by algorithms. Further investigation of the sharepoint-microsoft.co domain revealed that it is used for malware delivery, command and control, among others. Noteworthy observations about the domains sharepoint-microsoft.co and jreupdate.javaupdate.com are highlighted below:

1) The two domains impersonate major Internet and software companies and services, such as Microsoft (sharepoint-microsoft.co) and Oracle (jreupdate.javaupdate.com), among others.
2) The cybercriminals used WHOISGUARD3, one of the ways that are used to avoid being identified. The above domains had long sub-domains like those used by Content Delivery Networks as highlighted in Figure 12.