Professional Documents
Culture Documents
A Data Analytics Approach To The Cybercrime Underground Economy
A Data Analytics Approach To The Cybercrime Underground Economy
A Data Analytics Approach To The Cybercrime Underground Economy
ABSTRACT Despite the rapid escalation of cyber threats, there has still been little research into the
foundations of the subject or methodologies that could serve to guide information systems researchers
and practitioners who deal with cybersecurity. In addition, little is known about crime-as-a-service (CaaS),
a criminal business model that underpins the cybercrime underground. This research gap and the practical
cybercrime problems we face have motivated us to investigate the cybercrime underground economy by
taking a data analytics approach from a design science perspective. To achieve this goal, we: (1) propose
a data analysis framework for analyzing the cybercrime underground; (2) propose CaaS and crimeware
definitions; (3) propose an associated classification model, and (4) develop an example application to
demonstrate how the proposed framework and classification model could be implemented in practice.
We then use this application to investigate the cybercrime underground economy by analyzing a large data
set obtained from the online hacking community. By taking a design science research approach, this paper
contributes to the design artifacts, foundations, and methodologies in this area. Moreover, it provides useful
practical insights to practitioners by suggesting guidelines as to how governments and organizations in all
industries can prepare for attacks by the cybercrime underground.
2169-3536 2018 IEEE. Translations and content mining are permitted for academic research only.
26636 Personal use is also permitted, but republication/redistribution requires IEEE permission. VOLUME 6, 2018
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
J. An, H.-W. Kim: Data Analytics Approach to the Cybercrime Underground Economy
cyber threats, few have attempted to put this new interest on a Accordingly, this study uses dynamic analysis to conduct
solid foundation or develop suitable methodologies. Previous an ex-ante evaluation of the classification model. It also
studies have not analyzed the underground economy behind conducts an ex-post evaluation of a front-end application
cybercrime in depth. Furthermore, little is known about CaaS, using observational methods (case examples). From a prac-
one of the primary business models behind the cybercrime tical perspective, this study also provides practitioners with
underground. There is an overall lack of understanding, both useful insights by making suggestions to guide governments
in research and practice, of the nature of this underground and and organizations in all industries in solving the problems
the mechanisms underlying it. they face when preparing for attacks from the cybercrime
This research gap, and the practical problems faced by underground.
cybercriminals, motivates our study. We take a data ana-
lytics approach and investigate the cybercrime economy II. CONCEPTUAL BACKGROUND
from a design science perspective. To achieve this goal, we A. CYBERCRIME UNDERGROUND BUSINESS MODEL
(1) propose a data analysis framework for analyzing the Cybercrime has undergone a revolutionary change, going
cybercrime underground to guide researchers and practition- from being product-oriented to service-oriented because the
ers; (2) define CaaS and crimeware to better reflect their fact it operates in the virtual world, with different spatial
features from both academic research and business practice and temporal constraints, differentiates it from other crime
perspectives; (3) use this to build a classification model for taking place in the physical world [11]. As part of this change,
CaaS and crimeware; and (4) build an application to demon- the cybercrime underground has emerged as a secret cyber-
strate how the proposed framework and classification model crime marketplace because emerging technological changes
could be implemented in practice. We then evaluate this appli- have provided organized cybercriminal groups with unprece-
cation by applying it in a case study, namely investigating the dented opportunities for exploitation [12].
cybercrime economy by analyzing a large dataset from the The cybercrime underground has a highly professional
online hacking community. business model that supports its own underground
This study takes a design science research (DSR) economy [5]. This business model, known as CaaS, is
approach. Design science ‘‘creates and evaluates informa- ‘‘a business model used in the underground market where ille-
tion technology artifacts intended to solve identified prob- gal services are provided to help underground buyers conduct
lems’’ [6]. DSR involves developing a range of IT artifacts, cybercrimes, such as attacks, infections, and money launder-
such as decision support systems, models, frameworks, ing in an automated manner,’’ [3]. Thus, CaaS is referred
tools, methods, and applications [7]. Where behavioral sci- to as a do-it-for-me service, unlike crimeware which is a
ence research seeks to develop and justify theories that do-it-yourself product.
explain or predict human or organizational phenomena, Because CaaS is designed for novices, its customers do
DSR seeks to extend the boundaries of human and orga- not need to run a hacking server or have high-level hacking
nizational capabilities by creating new and innovative arti- skills. Consequently, the CaaS business model can involve
facts [6]–[8]. DSR’s contribution is to add value to the the following roles: writing a hacking program, performing
literature and practice in terms of ‘‘design artifacts, design an attack, commissioning an attack, providing an attack
construction knowledge (e.g., foundations), and/or design server (infrastructure), and laundering the proceeds. Sood
evaluation knowledge (e.g., methodologies),’’ [7]. and Enbody [3] have suggested that crimeware marketplaces
This study follows these DSR guidelines and contributes have three key elements, namely actors (e.g., coders, oper-
design artifacts, foundations, and methodologies [7]. In par- ators, or buyers), value chains, and modes of operation
ticular, DSR must demonstrate that design artifacts are (e.g., CaaS, pay-per-install, crimeware toolkits, brokerage,
‘‘implementable’’ in the business environment to solve an or supplying data). Periodic monitoring and analysis of the
important problem [7], so we provide an implementable content of cybercrime marketplaces could help predict future
framework rather than a conceptual one. We also create a cyber threats [3].
front-end application as a case example to demonstrate how
the proposed framework and classification model could be B. ROUTINE ACTIVITY THEORY
implemented in practice. In addition, this study contributes In criminology, routine activity theory (RAT) is used to
to design theory [9], [10]. explain the causes of crime, both general criminal activ-
As for foundations, DSR should have a creative develop- ity and cybercrime [13], [14]. According to this theory,
ment of constructs, models, methods, or instantiations that three elements are necessary for crimes to be committed:
extend the design science knowledge base [7]. This study (1) a likely offender, (2) a suitable target, and (3) the absence
therefore adds to the knowledge base by providing founda- of capable guardians against crime. In a cybercrime context,
tional elements such as constructs (definitions, frameworks, the ‘‘likely offenders’’ are motivated sellers and potential
and applications), a model (classification model), a method buyers in the underground market, and the ‘‘suitable targets’’
(analysis), and instantiations (applications). are the targeted vulnerable organizations. The ‘‘absence of
As for methodologies, the creative development and capable guardians against crime’’ is due to organizations
use of evaluation methods provide DSR contributions [7]. failing to take preventive measures against cybercrime.
Two types of product or service are available in the cyber- III. CLASSIFICATION AND DEFINITION OF CRIMEWARE
crime underground. The first can be either CaaS or crimeware PRODUCTS AND SERVICES
that are related to attack strategy, for example, phishing, Although both academics and practitioners have recently
brute force, or DDoS attacks, or can be used for spam- started to devote more attention to CaaS, its fast-growing
ming or creating botnets, exploits, ransomware, rootkits, or nature has prevented them from reaching consensus on how
Trojans. Attack strategies often exploit system vulnerabilities to define different types of CaaS and crimeware. As a result,
such as application loopholes. In addition, social engineering most of the academic research has borrowed the definitions
attacks exploit human vulnerabilities [15]. The most well- used by the business practice literature, leading to widely
known example of such an attack is the use of a ‘‘secret varying interpretations in different disciplines. Given this
question’’ for password recovery: attackers check into the ambiguity, we approach categorizing CaaS and crimeware
user’s background to guess the secret question and hence steal from an RAT perspective (considering vulnerabilities as suit-
the account. However, because social engineering is one of able targets and preventive measures as capable guardians
the oldest account hacking techniques, most account holders against crime) in a cybercrime underground context.
are now aware of it. In addition, social engineering-related In addition, we redefine CaaS and crimeware based on the
products and services are rarely traded underground, although definitions used in existing research and practice.
a few sellers have been known to sell tutorials. As a result,
we have not included ‘‘social engineering services’’ as a A. CLASSIFICATION OF CRIMEWARE SERVICES
CaaS type. AND PRODUCTS
The second type of product or service available neutral- Table 1 lists the definitions of CaaS and crimeware used in
izes organizations’ preventive measures, such as anti-virus the academic and business practices literature, which form
programs. These are based on programs designed to evade a basis for our classification model, suitable for the IS field.
anti-virus software to either cause mischief or be left behind We reclassify CaaS and crimeware in terms of the suitable tar-
for later activation. Examples include encryption and virtual gets (attack strategy/mode) and absence of capable guardians
private network (VPN) services, crypters, and proxies. (preventive measures) in a cybercrime underground context.
From the perspective of RAT, the likely offenders are The different attack strategies/modes in Table 1 are asso-
attackers motivated to attack organizations or products that ciated with RAT’s suitable targets because vulnerable orga-
constitute a suitable target. If such targets are attacked, how- nizations, products, and services may suffer from attacks
ever, both the targets and those who supply their cybersecurity using a variety of strategies. In contrast, preventive mea-
products become aware of the vulnerabilities that made the sures are associated with RAT’s absence of capable guardians
attack possible, leading them to apply security updates to their because encryption and VPN services, crypters, and proxies
software. These updates can be seen as capable guardians are intended to neutralize preventive measures by bypassing
against crime, and the preventive measures taken can be anti-virus and log monitoring software.
identified by looking through each program’s version history.
However, this is not the end of the matter, because the B. DEFINITION OF CRIMEWARE SERVICES
attackers will then develop and sell new versions of their AND PRODUCTS
hacking tools to combat the guardians, thus re-establishing We now need to review the definitions used in both the
the third RAT condition, the absence of capable guardians research and business practice literature. This study extends
against crime. Such events can also be identified by the ver- the IS literature by facilitating a conceptual understanding
sion numbers of the hacking tools sold on the black market: of the CaaS business models used by the cybercrime under-
since it is an online marketplace, attackers must give detailed ground. Drawing upon prior research and business practice
explanations to retain their customers’ confidence. This cycle literature, we propose definitions of CaaS and crimeware that
will continue as long as attackers can find vulnerabilities in better reflect the features of CaaS in both of these areas.
organizations or products.
From this perspective, the cybercrime underground black 1) CRIMEWARE-AS-A-SERVICE
market is essentially a market economy, ruled by supply and • Account Hacking Services: Previous academic research
demand, with the preventive measures taken by organiza- has defined account hacking as ‘‘a crime which origi-
tions being the key drivers of demand. Ironically, attackers nated as a type of theft specific to digital environments
can only sell new tools because of their target organiza- where users create personal digital profiles and store
tions’ ongoing preventive measures, which serve to make valuable personal information such as passwords, bank
the black market more viable. Unlike criminals in gen- account numbers, and ID numbers,’’ [16]. In digital envi-
eral, attackers regard capable guardians against crime as a ronments, such as cloud computing platforms, account
necessary evil, because cybercrime tends to adhere faith- hacking is one of the main cybersecurity threats. The
fully to market economy principles. Therefore, to get at the most common account hacking methods are phishing
fundamental cybercrime issues, we need to understand the and brute force attacks. With an emphasis on selling
mechanisms underlying the cybercrime underground from an this as a service, we define an account hacking service
RAT perspective. as a service that offers to gain unauthorized access
TABLE 1. Classification of crimeware products and services. Phishing and brute force attack services are subsets of account hacking service.
to a target’s account by obtaining account information trying random passwords. Such attacks often target less
(e.g., username and password) or extra security infor- specific targets than phishing or social engineering. For
mation (e.g., security questions and answers). example, an attacker may try to log in using one of the
Phishing Services: Phishing has been defined in the system’s default usernames (e.g., ‘‘root’’ or ‘‘admin’’)
business practice literature in the last few years because by systematically trying all possible passwords. We thus
it has become increasingly sophisticated and is one of define a brute force attack service as a service that hacks
the most common techniques used by cybercriminals. accounts by trying all possible passwords.
Phishing is defined as ‘‘masquerading as a trustworthy • DDoS Attack Services: In the research literature,
source in an attempt to bait a user to surrender sensitive a DDoS attack is defined as ‘‘an attack which makes
information such as a username, password, and credit resources unavailable to its legitimate users,’’ [25]. In the
card number,’’ [22]. Volonino et al. [18] defined phish- business practice literature, it is defined as ‘‘an attack
ing as ‘‘sending an e-mail to a user falsely claiming to be involving an enormous number of spurious requests
a legitimate enterprise in an attempt to scam the user.’’ from a large number of computers worldwide that
The term ‘‘phishing’’ is a portmanteau of ‘‘password’’ flood a target server,’’ [16]. DDoS botnet attacks can
and ‘‘fishing,’’ where the latter refers to catching fish cause serious damage: for example, the Gameover Zeus
using bait or a lure. We thus define a phishing service attack stole online banking credentials, resulting in
as a service that hacks accounts by pretending to be a a $100 million loss [26]. However, the above definitions
reliable source, such as a bank or card service. are not precise and do not encompass all the definitions
Brute Force Attack Services: A brute force attack is an used in research and practice. We thus define a DDoS
attempt to log in to an account and steal it by repeatedly attack service as a service that makes one target service
unavailable by flooding it with traffic from multiple some way [38]. We thus define ransomware asmalicious
compromised sources. software that encrypts a victim’s data to extort money
• Spamming Services: Over the last decade, spamming from them.
has been defined in a variety of ways in the litera- • Rootkit: The business practice literature defines a
ture. The academic literature defines spam as ‘‘unso- rootkit as ‘‘a program that allows someone to obtain
licited and unwanted e-mail from a stranger that is root-level access to the computer,’’ [44]. We thus
sent in bulk to large mailing lists, usually with some define a rootkit as a piece of malicious software that
commercial objective,’’ [27]. Likewise, Gyongyi and enables administrator-level access to an operating sys-
Garcia-Molina [28] defined spamming as ‘‘any deliber- tem or computer network.
ate human action that is meant to trigger an unjustifiably • Trojan: Trojans are defined by Colarik and
favorable relevance or importance of some web page Janczewski [46] as malicious programs that perform a
considering the page’s true value.’’ Based on these char- legitimate function but also engage in unknown and/or
acteristics, we define a spamming service as a service unwanted activity. We thus define a Trojan as a piece of
that sends out unsolicited emails to a large number of malware that provides unauthorized remote access to a
people (e.g., mailing lists) using automated software. victim’s computer.
• Crypting Services: Crypter encrypt programs or source • Drive-by download: All these crimeware products are
code to avoid detection and tracking and thus bypass used in drive-by download attacks, which have become
anti-virus software [30]. Like other hacking services, one of the primary types of cyberattack worldwide. Such
encryption is sold as a service because crypters require attacks target victims through their Internet browsers,
a certain level of skill to use. The goal of such a service installing malware their computers as soon as they visit
is to neutralize the preventive measures put in place by an infected website [33]. We thus define a drive-by
organizations and anti-virus software, preventing hack- download attack as an attack that installs malware when
ing programs from being caught or allowing them to be the victim visits a malicious webpage.
left behind to collect information. We define an crypting • Crypter: Crypters can encrypt programs or source code
service as a service that encrypts malicious code by to avoid detection and tracking by bypassing anti-virus
using a crypter to bypass anti-virus software. software [30], and can also be offered as a service.
• VPN Services: Networks connect different entities, and We thus define a crypter as a piece of encryption soft-
private networks only allow access by closed commu- ware that helps an intruder to bypass security programs.
nities of authorized users [31]. The most secure way • Proxy: Proxies are used for a variety of purposes, such as
to access the Internet is using a VPN, because it hides accelerating data transmission and filtering traffic [20].
all user information (e.g., identity and IP address). We thus define a proxy as a server that enables anony-
Because attackers use VPN services to avoid tracking or mous Web browsing.
IP blocks, they are categorized as CaaS-related preven-
tive measures. We thus define a VPN service as a service IV. ANALYTICAL FRAMEWORK AND METHODS
that provides a secure connection to the Internet via a The constructs used in DSR are entity representations [10]
virtual private network. that provide the vocabulary and symbols needed to define
problems and solutions [7]. Accordingly, the design elements
2) CRIMEWARE PRODUCTS used in this study are the cybercrime underground, criminal
Crimeware itself is not considered to be CaaS, and comes in items (CaaS and crimeware), classifications, and front-end
several different forms, as follows. system applications, and the artifacts are based on these
• Bootnet: Botnets are networks of compromised constructs. These artifacts are evaluated in two stages [49]:
(or ‘‘zombie’’) computers controlled by ‘‘bot masters,’’ ex-ante (classification evaluation) and ex-post (case exam-
and have become the most common cyberattack vector ple). Because DSR should be tentative, this ex-post evaluation
over the past few years [34], [35]. We define a botnet is essential to the search process used by iterative DSR, which
as a network of infected devices, typically used for comprises search, design, ex-ante evaluation, construction,
DDoS attacks. artifact, ex-post evaluation, and research [49]. Based on this,
• Exploit: In the business practice field, an exploit is we propose the data analysis framework shown in Fig. 1.
defined as ‘‘a program created specifically to exploit Because cybercrime differs from general crime in many
a vulnerability, in other words—simply trying to take ways, we need to conduct a variety of analyses using a large
advantage of an error in the design or programming of a dataset. A previous study [50] proposed a data mining frame-
system or application,’’ [37] and is used to obtain admin- work for crime, dividing crimes harmful to the general public
istrator privileges on a system. We thus define an exploit into eight categories: traffic violations, sex crime, theft, fraud,
as a program or script that exploits vulnerabilities in arson, gang/drug offenses, violent crime, and cybercrime.
applications, servers, or clients. Although the previous study explained how data mining
• Ransomware: Ransomware is a type of malicious soft- techniques could be applied to crime analysis, it did not
ware that disables the functionality of a computer in consider the specific features of cybercrime. Furthermore, it
only explained the data mining techniques briefly, rather than B. STEP 2: IDENTIFYING SOURCES
presenting a broad overview of the framework [50]. In con- The second step is to identify the data sources, based on
trast, the goal of our data analysis framework is to conduct a the goals defined by Step 1. This step should consider what
big-picture investigation of the cybercrime underground by data is needed and where it can be obtained. Since the goal
covering all phases of data analysis from the beginning to of this study is to investigate the cybercrime underground,
the end (see Fig. 1). This framework comprises four steps: we consider data on the cybercrime underground community.
(1) defining goals; (2) identifying sources; (3) selecting ana- We therefore collected such data from the community itself
lytical methods; and (4) implementing an application. and obtained a malware database from a leading global cyber-
Because this study emphasizes the importance of RAT security research firm.
for analyzing the cybercrime underground, the proposed Because cybercimimals often change their IP addresses
RAT-based definitions are critical to this framework: and use anti-crawling scripts to conceal their communica-
Steps 1–4 all contain the RAT elements, as Fig. 1 shows. tions, we used a self-developed crawler that can resolve
captchas and anti-crawling scripts to gather the neces-
A. STEP 1: DEFINING GOALS sary data. We collected a total of 2,672,091 posts sell-
The first step is to identify the conceptual scope of the ing CaaS or crimeware, made between August 2008 and
analysis. Specifically, this step identifies the analysis con- October 2017, from a large hacking community site
text, namely the objectives and goals. To gain an in-depth (www.hackforums.net) with over 578,000 members and more
understanding of the current CaaS research, we investigated than 40 million posts. We also collected 16,172 user profiles
the cybercrime underground, which operates as a closed of sellers and potential buyers, based on their communication
community. Thus, the goal of the proposed framework is to histories, as well as prices and questions and answers about
‘‘investigate the cybercrime underground economy.’’ the transactions.
The black market uses traditional forum threads To be classified as a dangerous Threat, for example, a mes-
(e.g., bulletin boards) instead of typical e-commerce plat- sage must also contain Market-related keywords. Messages
forms (e.g., eBay, and Amazon). For example, sellers create containing both Threat- and Market-related keywords are
threads in marketplace forums to sell items, and potential considered more dangerous (e.g., ‘‘Selling silent Microsoft
buyers comment on these threads. One of the most significant Office exploit’’) than messages with only Threat-related
challenges was therefore converting this unstructured data keywords (e.g., ‘‘Can I hide a file inside a word doc?’’).
into structured data. Since the product features, prices, and Likewise, messages related to the Product/Service, Market,
descriptions were explained within longer texts, we used a and File Extension categories are not identified as dangerous
variety of text mining techniques to extract the important if they only contain keywords related to one category.
features: for example, we used named entity recognition to In addition, messages containing Exclusion-related keywords
extract company names (see Section IV-C(2)). Since these (e.g., ‘‘tutorials’’ or ‘‘tips’’) are not identified as a dangerous
texts included many typographic errors and jargon terms, (see Fig. 2).
we had to create a dictionary for use during a preprocessing To classify messages correctly, we also use keywords
step. related to CaaS and crimeware. This classification step is
In addition, we obtained a malware database from a applied after the messages have been filtered as above,
cybersecurity firm containing over 53,815 entries covering so many keywords are not needed and the criteria are simpler.
cybercrimes between May 11, 2010 and January 13, 2014. However, when a message fits into multiple categories, this
This unique dataset strengthened our study by providing real- overlap is recorded so as to derive additional insights from the
world evidence from a different viewpoint. later analysis and applications. The types of keyword used for
the proposed classification model are as follows.
C. STEP 3: SELECTING ANALYTICAL METHODS • Threat: keywords directly related to threats or cyberat-
1) CAAS AND CRIMEWARE CLASSIFICATION MODEL tacks (e.g., ‘‘exploit’’ or ‘‘botnet’’).
A diverse range of items are sold in the cybercrime under- • Product/Service: keywords related to products or ser-
ground, with different degrees of associated risk. For this vices (e.g., ‘‘Facebook’’ or ‘‘Skype’’).
study, we focused mainly on items critical to hacking. We first • File Extension: keywords related to software or add-ons
filtered the messages to select only those that carried signif- (e.g., ‘‘doc’’ or ‘‘ppt’’).
icant risks, and then divided them into the categories shown • Market: keywords related to markets or transactions
in Table 1. (e.g., ‘‘selling’’ or ‘‘$’’).
To determine if a given message is dangerous, our classi- • Exclusion: keywords that are not related to malware
fication model checks whether it falls into one of the follow- (e.g., ‘‘tutorial’’ or ‘‘tips’’).
ing five categories: Threat, Product/Service, File Extension, To improve the quality of the training data, we referred
Market, and Exclusion. Fig. 2 shows a simplified example to the malware database obtained from the cybersecu-
to clarify this rule-based approach. We used a dictionary rity research firm. Since this database contained labeled
consisting of 1,191 keywords spread across five categories, black market communications by cybersecurity profession-
built using data obtained from the cybersecurity research als, it provided an appropriate guide for building the training
firm, anti-virus vendors, Wikipedia, and forums. dataset.
attack strategy/mode or to preventive measures (see Table 1), show the frequencies with which particular CaaS and crime-
this can be interpreted in terms of RAT’s suitable targets ware items are mentioned. These both serve to highlight the
(attack strategy/mode) and capable guardians against crime changes in cybercrime market trends over time. Although this
(preventive measures). application is based on the proposed classifications, it also
As Fig. 6 illustrates, the application allows users to search allows new CaaS and crimeware items to be added that
for CaaS and crimeware trends in the cybercrime under- have not yet been classified. This scalability is an important
ground data (see Section IV-B). The data used here were part of DSR’s search process and its emphasis on tentative
collected from the ‘‘Premium Sellers Section.’’ This appli- study [49].
cation can show the CaaS and crimeware trends since 2008.
Analyzing the hacking tool trends may allow organizations to B. DETECTING POTENTIAL TARGETS (COMPANIES,
discover which ones they should focus on protecting them- PRODUCTS, AND SERVICES)
selves against. This section describes an application that relies on extracting
These results can be intuitively understood, enhancing our company names (see Section IV-C(2)) and potential hacking
understanding of how CaaS and crimeware change over time. targets (see Section V-C). The goal of this example applica-
First, bar graphs show which of the selected keywords were tion is to identify potential target companies, products, and
most used within the given period. Second, daily trend graphs services. The analysis in Fig. 7 is based on using the named
TABLE 2. Company names mentioned in cybercrime underground. Names are in alphabetical order.
entity recognition algorithm to extract company names By analyzing the attackers’ conversations, the
from both ‘‘Hacks, Exploits, and Various Discussions’’ and application can extract the names of the companies,
‘‘Premium Sellers Section’’ in the cybercrime community products, and services that they mention and therefore
forum. The companies’ SIC codes are used to categorize them their likely targets (see Fig. 7). This analysis of RAT’s
by industry. suitable targets [13], [14] allows security managers to
FIGURE 7. Hacking vulnerability disclosed and the earlier signal from the underground: (a) Monitoring system. (b) Relevant result (July 6, 2011).
(c) BBC news (Feb. 24, 2016).
monitor the potential threats and hence prevent the proposed the activity of the underground community can enable vulner-
attacks. abilities to be discovered before companies formally disclose
Figs. 7(b) and (c) illustrate a real-world example. them.
On February 24, 2016, BBC News reported that a ‘‘Nissan
Leaf electric cars hack vulnerability has been disclosed’’ C. REAL-TIME SOCIAL MEDIA MONITORING
and explained that the vehicle’s app could be spied on Cyberattacks are unpredictable and damaging, but those
(see Fig. 7 (c)). Interestingly, this vulnerability had already who have not taken precautions against such attacks suffer
been discussed in the underground community, on the most. The most effective way to reduce the damage is
July 5, 2011 (4.5 years earlier). This shows that monitoring to respond in real time. This section therefore focuses on
a real-time monitoring application that aims to monitor in the underground market. This application aims to iden-
cybercrime-related discussions on social networks. Unlike tify the potential buyers and sellers of CaaS and crimeware,
Sections VI-A and VI-B, this application may reflect different using data collected from the forums at www.hackforums.net.
RAT views, depending on who is tweeting, such as an attacker In this case, we visualize the data using a network whose
(motivated offender) or anti-virus vendor (guardian against nodes represent potential buyers and sellers and whose edges
crime), and on what topic (e.g., suitable targets or preventive represent forum threads and replies. This allows us to assess
measures). their relationships in terms of the degrees of connectivity
Fig. 8 shows that the ‘‘hacking’’ keyword was mentioned and centrality, based on the numbers of edges connected to
in a range of different places. The application presents real- particular nodes (see Fig. 9). This enables the application to
time global search results visually, allowing users to identify identify the most influential users as well as any patterns in
the new trends and meaningful discussions contained in Twit- the network.
ter messages. It can locate the authors of tweets containing This feature is also a potentially useful tool for moni-
specific keywords immediately. The application thus yields toring behavior associated with money laundering. Because
insights into the original languages, locations, and hashtags money laundering involves more than one transaction,
associated with given keywords. In most cybercrime cases, it is of vital importance to monitor and detect pat-
it is critically important that organizations take immediate terns of interaction among community members. It also
action, so this monitoring helps organizations to react imme- enables end users to keep an eye on the most influen-
diately to the use of specific keywords. tial players in the market. By defining particular attributes
based on activity-related information, additional analy-
D. CYBERCRIMINAL NETWORK MONITORING ses, such as impact, clustering, and homophily analyses,
Now, we apply the methods discussed in Section V-B to can be used to monitor noteworthy attackers and profile
analyze the relationships between potential buyers and sellers criminals.
definitions and classification model will serve as a basis for underground markets where hacking tools are sold. More
further research. importantly, these tools could be based on vulnerabilities
Third, this study adds to the body of knowledge by demon- in their organizations, products, and services. Governments
strating new approaches to the problems cybercrime and and organizations therefore need to increase their technical
social media researchers face [7], [73]. Despite the increasing capabilities when it comes to analyzing large-scale datasets
importance of data analysis, researchers have been slow to of different types [66], [67]. Although the proposed frame-
recognize the advantages of new and more powerful data- work and classification model are of particular use to compa-
driven analysis methods. We have applied several modern nies mentioned specifically by the cybercrime underground,
techniques, such as machine learning, key phrase extrac- the framework can also be used to analyze more general
tion, and natural language processing, in this area, thereby types of issues commonly encountered in practice [68]. In this
encouraging future research to be more systematic and empir- regard, legal and technical training is needed to reduce the
ical. In addition, our results suggest that combining natural impact of cyberattacks [64].
language processing and machine learning approaches is a Third, this study calls for researchers, companies, anti-
suitable way to study closed communities whose members virus vendors, and governments to collaborate in the fight
frequently use jargon or obscure expert language. against cybercrime using civil resources. Rather than act-
Finally, this study adds to RAT [13], [14] by applying it to ing alone, these groups should unite to maximize their
the cybercrime underground. The same three factors can be efficiency and effectiveness. Successful collaboration may
applied to cybercrime and general crimes, so we have clas- enable stronger and better-coordinated responses to imme-
sified CaaS and crimeware in the context of the cybercrime diate cyber threats in risky environments [69]. For example,
underground and analyzed them accordingly. by sharing information, technology, and support, stronger
defense systems can be built for everyone. Our study
D. IMPLICATIONS FOR PRACTICE enables this by providing a framework, definitions, classi-
From a RAT perspective, the practical implications of this fication model, and applications that can be implemented
study mainly affect the capable guardians against crime, by researchers, governments, organizations, and anti-virus
because our results indicate how underground attackers per- vendors.
ceive preventive measures. A previous review of the current Finally, this study also has important implications for
status of legal, organizational, and technological efforts to society. Over the last few years, the world has been facing
combat cybercrime in different countries relied on a case cyberterrorism and cyberwar threats from nation-sponsored
study of the work being done in Taiwan [64]. It made attackers [70]. Pollitt [71] defined cyberterrorism as ‘‘the
four recommendations for governments, lawmakers, interna- premeditated, politically motivated attack against informa-
tional organizations, intelligence and law enforcement agen- tion, computer systems, computer programs and data which
cies, and researchers: (1) regularly update existing laws; results in violence against non-combatant targets by sub-
(2) enhance specialized task forces; (3) use civil resources; national groups or clandestine agents.’’ Unlike most cyber-
and (4) promote cybercrime research. The practical impli- crime, which is primarily motivated by monetary gain [72],
cations of our study are based on those of the previous cyberterrorists are politically motivated. As a result, govern-
study [64]. We have already discussed the fourth recom- ments should, for example, strengthen their ability to protect
mendation (‘‘promote cybercrime research’’) in the previous their citizens in online virtual environments by enhancing
section, so we will now focus on the other three areas. their immediate responses to threats such as cyberespionage
First, our study has implications for governments and and cyberterrorism. This issue therefore has profound impli-
lawmakers in that it recommends existing laws be regularly cations in terms of the need for a global cyber defense to
updated. The proposed CaaS and crimeware definitions and maintain a cyber-safe environment.
classification model may improve national defense and secu-
rity by suggesting potential government roles and the adop- REFERENCES
tion of particular regulatory policies. A previous study [65] [1] J. C. Wong and O. Solon. (May 12, 2017). Massive Ransomware
suggested that governments and lawmakers should encourage Cyber-Attack Hits Nearly 100 Countries Around the World. [Online].
Available: https://www.theguardian.com/technology/2017/may/12/global-
security providers, such as anti-software vendors, to collab- cyber-attack-ransomware-nsa-uk-nhs
orate and share security-related information. For example, [2] FACT SHEET: Cybersecurity National Action Plan, The White House,
governments and companies could develop joint plans to stop Washington, DC, USA, 2016.
[3] A. K. Sood and R. J. Enbody, ‘‘Crimeware-as-a-service—A survey of com-
the spread of cybercrime by tracking cyber threats [64]. Our moditized crimeware in the underground market,’’ Int. J. Crit. Infrastruct.
study therefore suggests governments should actively encour- Protect., vol. 6, no. 1, pp. 28–38, 2013.
age companies to invest in their cybersecurity infrastructures. [4] S. W. Brenner, ‘‘Organized cybercrime-how cyberspace may affect the
structure of criminal relationships,’’ North Carolina J. Law, Technol.,
Second, the proposed data analysis framework can be used vol. 4, no. 1, pp. 1–50, 2002.
to enhance specialized task forces. This study suggests that [5] K. Hughes, ‘‘Entering the World-Wide Web,’’ ACM SIGWEB Newslett.,
organizations in all industries should attempt to gain a deeper vol. 3, no. 1, pp. 4–8, 1994.
[6] S. Gregor and A. R. Hevner, ‘‘Positioning and presenting design science
understanding of the nature of the cybercrime underground. research for maximum impact,’’ MIS Quart., vol. 37, no. 2, pp. 337–356,
For example, they should be aware that there are cybercrime 2013.
[7] A. R. Hevner, S. T. March, J. Park, and S. Ram, ‘‘Design science in [34] P. Wang, S. Sparks, and C. C. Zou, ‘‘An advanced hybrid peer-to-peer
information systems research,’’ MIS Quart., vol. 28, no. 4, pp. 75–105, botnet,’’ IEEE Trans. Depend. Sec. Comput., vol. 7, no. 2, pp. 113–127,
2004. Apr. 2010.
[8] K. Peffers, T. Tuunanen, M. A. Rothenberger, and S. Chatterjee, [35] H. R. Zeidanloo and A. B. A. Manaf, ‘‘Botnet detection by monitoring
‘‘A design science research methodology for information systems similar communication patterns,’’ Int. J. Comput. Sci. Inf. Secur., vol. 7,
research,’’ J. Manag. Inf. Syst., vol. 24, no. 3, pp. 45–77, 2007. no. 3, pp. 36–45, 2010.
[9] S. Gregor, ‘‘Design theory in information systems,’’ Austral. J. Inf. Syst., [36] H. R. Shahriari and R. Jalili, ‘‘Vulnerability take grant (VTG): An efficient
vol. 10, no. 1, pp. 14–22, 2002. approach to analyze network vulnerabilities,’’ Comput. Secur., vol. 26,
[10] S. Gregor and D. Jones, ‘‘The anatomy of a design theory,’’ J. Assoc. Inf. no. 5, pp. 349–360, 2007.
Syst., vol. 8, no. 5, pp. 313–335, 2007. [37] C. G. Amaya. (Oct. 2014). Myths About Malware: An Exploit is
[11] M. Yar, ‘‘The novelty of ‘Cybercrime’: An assessment in light of routine the Same as Malware. [Online]. Available: http://www.welivesecurity.
activity theory,’’ Eur. J. Criminol., vol. 2, no. 4, pp. 407–427, 2005. com/2014/10/21/myths-about-malware-exploit-is-the-same-as-malware/
[12] K.-K. R. Choo, ‘‘Organised crime groups in cyberspace: A typology,’’ [38] A. Gazet, ‘‘Comparative analysis of various ransomware virii,’’ J. Comput.
Trends Organized Crime, vol. 11, no. 3, pp. 270–295, 2008. Virol., vol. 6, no. 1, pp. 77–90, 2010.
[13] L. E. Cohen and M. Felson, ‘‘Social change and crime rate trends: A routine [39] G. O’Gorman and G. McDonald, ‘‘Ransomware: A growing
activity approach,’’ Amer. Sociol. Rev., vol. 44, no. 4, pp. 588–608, 1979. menace,’’ Symantec Corp., Mountain View, CA, USA,
[14] M. Felson, ‘‘Routine activities and crime prevention in the developing Tech. Rep., 2012. [Online]. Available: http://www.symantec.com/content/
metropolis,’’ Criminology, vol. 25, no. 4, pp. 911–932, 1987. en/us/enterprise/media/security_response/whitepapers/ransomware-a-
[15] F. Mouton, M. M. Malan, K. K. Kimppa, and H. S. Venter, ‘‘Neces- growing-menace.pdf
sity for ethics in social engineering research,’’ Comput. Secur., vol. 55, [40] A. Khanse. (Dec. 18, 2013). How to Protect Against and Prevent
pp. 114–127, Nov. 2015. Ransomware Attacks & Infections. [Online]. Available: http://www.
[16] A. S. Rakitianskaia, M. S. Olivier, and A. K. Cooper, ‘‘Nature and thewindowsclub.com/prevent-ransomware-windows
forensic investigation of crime in second life,’’ in Proc. 10th Annu. [41] D. Turkel. (Dec. 16, 2015). There Are Now Programs That Anyone
Inf. Secur. South Africa Conf., 2011. [Online]. Available: http://dblp.uni- Can Use to Extort Money From You. [Online]. Available: http://www.
trier.de/rec/bibtex/conf/issa/RakitianskaiaOC11 businessinsider.com/ransomware-as-a-service-is-the-next-big-cyber-
[17] A. van der Merwe, M. Loock, and M. Dabrowski, ‘‘Characteristics and crime-2015-12/
responsibilities involved in a phishing attack,’’ in Proc. 4th Int. Symp. Inf. [42] Y. Zhu, S. L. Liu, H. Lu, and W. Tang, ‘‘Research on the detection technique
Commun. Technol., 2005, pp. 249–254. of bootkit,’’ in Proc. Int. Conf. Graph. Image Process., 2013, p. 876860.
[18] L. Volonino, R. Anzaldua, and J. Godwin, Computer Forensics: Principles [43] J. Luo, M. Li, A. Khashnobish, J. McDermott, and J. Froscher, A Taxon-
and Practices. Englewood Cliffs, NJ, USA: Prentice-Hall, 2006. omy of Software Deceptive Interpretation in the Linux Operating System,
[19] G. Álvarez, F. Montoya, M. Romera, and G. Pastor, ‘‘Cryptanalyzing document NRL/MR/5540–04-8841, DTIC, 2004.
a discrete-time chaos synchronization secure communication system,’’ [44] M. Kassner. (Sep. 17, 2008). 10+ Things You Should Know About Rootkits.
Chaos, Solitons Fractals, vol. 21, no. 3, pp. 689–694, 2004. [Online]. Available: http://www.techrepublic.com/blog/10-things/10-plus-
[20] M. Goncharov. (2014). Russian Underground Revisited. [Online]. things-you-should-know-about-rootkits/
Available: https://www.trendmicro.de/cloud-content/us/pdfs/security- [45] M. Tehranipoor and C. Wang, Introduction to Hardware Security and Trust.
intelligence/white-papers/wp-russian-underground-revisited.pdf New York, NY, USA: Springer, 2011.
[21] V. Bezmalyi. (Oct. 1, 2014). Why Phishing Works and How to [46] L. J. Janczewski and A. M. Colarik, Cyber Warfare and Cyber Terrorism.
Avoid It. [Online]. Available: https://blog.kaspersky.com/how-to-avoid- Hershey, PA, USA: IGI Global, 2007.
phishing/6145/ [47] J. Ortiz. (Nov. 30, 2015). The CryptoLocker Legacy—Another Reason
[22] C. Ng. (May 21, 2014). What’s the Difference Between Hacking and for Strong Data Protection. [Online]. Available: https://storageswiss.
Phishing? [Online]. Available: https://blog.varonis.com/whats-difference- com/2015/11/30/the-cryptolocker-legacy/
hacking-phishing/ [48] J. Waldo, ‘‘The Jini architecture for network-centric computing,’’ Com-
[23] P. Shankdhar. (May 29, 2017). Popular Tools for Brute-force Attacks. mun. ACM, vol. 42, no. 7, pp. 76–82, 1999.
[Online]. Available: http://resources.infosecinstitute.com/popular-tools- [49] R. Baskerville, A. Ga, J. Pries-heje, and J. Venable, ‘‘Soft design science
for-brute-force-attacks methodology,’’ in Proc. 4th Int. Conf. Design Sci. Res. Inf. Syst. Technol.,
[24] J. Mirkovic, G. Prier, and P. Reiher, ‘‘Source-end DDoS Defense,’’ in Proc. 2009, p. 9.
2nd IEEE Int. Symp. Netw. Comput. Appl. (NCA), Apr. 2003, pp. 171–178. [50] H. Chen, W. Chung, J. J. Xu, G. Wang, Y. Qin, and M. Chau, ‘‘Crime data
[25] A. Singh and D. Juneja, ‘‘Agent based preventive measure for UDP mining: A general framework and some examples,’’ Computer, vol. 37,
flood attack in DDoS attacks,’’ Int. J. Eng. Sci. Technol., vol. 2, no. 8, no. 4, pp. 50–56, Apr. 2004.
pp. 3405–3411, 2010. [51] J. R. Landis and G. G. Koch, ‘‘The measurement of observer agreement
[26] D. McMillen. (Mar. 24, 2016). Why Botnets Remain the for categorical data,’’ Biometrics, vol. 33, no. 1, pp. 159–174, 1977.
Go-To Weapon for Cybercriminals. [Online]. Available: [52] P. Langley and S. Sage, ‘‘Induction of selective Bayesian classifiers,’’ in
https://securityintelligence.com/why-botnets-remain-the-go-to-weapon- Proc. 10th Int. Conf. Uncertain. Artif. Intell., 1994, pp. 399–406.
for-cybercriminals/ [53] P. Langley, W. Iba, and K. Thompson, ‘‘An analysis of Bayesian classi-
[27] P. Cunningham, N. Nowlan, S. J. Delany, and M. Haahr, ‘‘A case-based fiers,’’ Aaai, vol. 90, pp. 223–228, Jul. 1992.
approach to spam filtering that can track concept drift,’’ in Proc. Workshop [54] J. Chen, H. Huang, S. Tian, and Y. Qu, ‘‘Feature selection for text classifi-
Long-Lived CBR Syst. (ICCBR), 2003, pp. 2003-1–2003-3. cation with Naïve Bayes,’’ Expert Syst. Appl., vol. 36, no. 3, pp. 5432–5435,
[28] Z. Gyöngyi and H. Garcia-Molina, ‘‘Web spam taxonomy,’’ in Proc. 1st 2009.
Int. Workshop Adversarial Inf. Retr. Web (AIRWeb), 2005, pp. 1–9. [55] T. Fawcett, ‘‘An introduction to ROC analysis,’’ Pattern Recognit. Lett.,
[29] A. Zaharia. (Nov. 14, 2015). Analysis: How Malware Creators vol. 27, no. 8, pp. 861–874, Jun. 2006.
Use Spam to Maximize Their Impact. [Online]. Available: [56] Y. Ben-Itzhak, ‘‘Organised cybercrime and payment cards,’’ Card Technol.
https://heimdalsecurity.com/blog/analysis-how-malware-creators-use- Today, vol. 21, no. 2, pp. 10–11, 2009.
spam-to-maximize-their-impact/ [57] L. Tabansky, ‘‘Cybercrime: A national security issue?’’ Military Strategic
[30] V. G. Tasiopoulos and S. K. Katsikas, ‘‘Bypassing antivirus detection with Affairs, vol. 4, no. 3, pp. 117–136, 2012.
encryption,’’ in Proc. 18th Panhellenic Conf. Informat. (PCI), New York, [58] W. Kim, O.-R. Jeong, C. Kim, and J. So, ‘‘The dark side of the Internet:
NY, USA, 2014, pp. 1–2. Attacks, costs and responses,’’ Inf. Syst., vol. 36, no. 3, pp. 675–705, 2011.
[31] R. Venkateswaran, ‘‘Virtual private networks,’’ IEEE Potentials, vol. 20, [59] S. Gordon and R. Ford, ‘‘On the definition and classification of cyber-
no. 1, pp. 11–15, Feb. 2001. crime,’’ J. Comput. Virol., vol. 2, no. 1, pp. 13–20, 2006.
[32] A. K. Sood, S. Zeadally, and R. Bansal, ‘‘Cybercrime at a scale: A prac- [60] S. Philippsohn, ‘‘Trends in cybercrime—An overview of current financial
tical study of deployments of HTTP-based botnet command and control crimes on the Internet,’’ Comput. Secur., vol. 20, no. 1, pp. 53–69, 2001.
panels,’’ IEEE Commun. Mag., vol. 55, no. 7, pp. 22–28, Jul. 2017. [61] M. S. Gerber, ‘‘Predicting crime using Twitter and kernel density estima-
[33] J. Glassberg, (Apr. 7, 2016). What You Need to Know About, tion,’’ Decision Support Syst., vol. 61, pp. 115–125, May 2014.
‘Drive-By’ Cyber Attacks. [Online]. Available: http://www.foxbusiness. [62] S. B. Hoar, ‘‘Trends in cybercrime: The dark side of the Internet,’’ Criminal
com/features/what-you-need-to-know-about-drive-by-cyber-attacks Justice, vol. 20, pp. 4–13, 2005.
[63] D. Martens and F. Provost, ‘‘Explaining data-driven document classifica- JUNGKOOK AN is currently pursuing the Ph.D.
tions,’’ MIS Quart., vol. 38, no. 1, pp. 73–99, 2014. degree with the Graduate School of Informa-
[64] W. Chung, H. Chen, W. Chang, and S. Chou, ‘‘Fighting cybercrime: tion, Yonsei University, Seoul, South Korea. His
A review and the Taiwan experience,’’ Decision Support Syst., vol. 41, research interests include big data analytics, cyber-
no. 3, pp. 669–682, 2006. security, natural language processing, business
[65] S. H. Kim and B. C. Kim, ‘‘Differential effects of prior experience on intelligence, social media marketing, and brand
the malware resolution process,’’ MIS Quart., vol. 38, no. 3, pp. 655–678, engagement.
2014.
[66] K. S. Wilson and M. A. Kiy, ‘‘Some fundamental cybersecurity concepts,’’
IEEE Access, vol. 2, pp. 116–124, 2014.
[67] A. Sajid, H. Abbas, and K. Saleem, ‘‘Cloud-assisted IoT-based SCADA
Syst. Security: A review of the state of the art and future challenges,’’ IEEE
Access, vol. 4, pp. 1375–1384, 2016.
[68] Z. Shi, G. M. Lee, and A. B. Whinston, ‘‘Toward a better measure of
business proximity: Topic modeling for industry intelligence,’’ MIS Quart.,
vol. 40, no. 4, pp. 1035–1056, 2016.
[69] A. Majchrzak and S. L. Jarvenpaa, ‘‘Safe contexts for interorganizational
collaborations among homeland security professionals,’’ J. Manag. Inf.
Syst., vol. 27, no. 2, pp. 55–86, 2010.
HEE-WOONG KIM was a Faculty Member with
[70] G. Giacomello, ‘‘Close to the edge: Cyberterrorism today,’’ in Understand-
ing Terrorism. Bingley, U.K: Emerald Group, 2014, pp. 217–236.
the Department of Information Systems and Ana-
[71] M. M. Pollitt, ‘‘Cyberterrorism—Fact or Fancy?’’ Comput. Fraud lytics, National University of Singapore. He is
Secur., vol. 1998, no. 2, pp. 8–10, 1998. [Online]. Available: currently a Professor with the Graduate School
https://doi.org/10.1016/S1361-3723(00)87009-8 of Information, Yonsei University, Seoul, South
[72] I. P. L. Png, C.-Y. Wang, and Q.-H. Wang, ‘‘The deterrent and displace- Korea. He has served on the Editorial Boards for
ment effects of information security enforcement: International evidence,’’ the Journal of the Association for Information Sys-
J. Manag. Inf. Syst., vol. 25, no. 2, pp. 125–144, 2008. tems and the IEEE TRANSACTIONS ON ENGINEERING
[73] H. W. Kim, H. C. Chan, and S. Gupta, ‘‘Social media for business and MANAGEMENT.
society,’’ Asia–Pacific J. Inf. Syst., vol. 25, no. 2, pp. 211–233, 2015.