Vol. 49. No. 01. January 2022 第 49 卷第 01 期 2022 年 1 月 湖南大学学 报(自然科学版) Journal of Hunan University(Natural Sciences)

You might also like

Download as pdf or txt
Download as pdf or txt
You are on page 1of 21

第 49 卷第 01 期 湖南大学学报(自然科学版) Vol. 49. No. 01.

2022 年 1 月 Journal of Hunan University(Natural Sciences) January 2022

Open Access Article


MALWARE RECOGNITION SCHEME FOR ANDROID PLATFORM USING AN
OPTIMIZED CNN-BASED APPROACH

Karrar Ahmed Kareem, Ethar Sabah Mohammad Ali


Ministry of Oil, Thiqar Oil Company, Iraq
College Of Engineering, Wasit University, Ministry Of Higher Education and Scientific Research, Iraq
kalknany526@gmail.com

Abstract
Malware is a malicious code which is developed to harm a computer or network. The number of
malwares is growing so fast and this amount of growth makes the computer security researchers invent
new methods to protect computers and networks. It is a very serious problem and many efforts are
devoted to malware detection in today’s cybersecurity world. In this research, a new method will be
proposed to malware detection scheme for Android platform using an Optimized Convolutional Neural
Networks based approach, which integrates both risky permission combinations and vulnerable API
calls and use them as features in the CNN algorithm.
Keywords: Malware Detection, Deep Learning, Machine Learning, Security, Classification,
Convolutional Neural Networks, Particle Swarm Optimization (PSO).
抽象的
恶意软件是一种恶意代码,旨在损害计算机或网络。 恶意软件的数量增长如此之快,这种增
长量使计算机安全研究人员发明了保护计算机和网络的新方法。 这是一个非常严重的问题,
在当今的网络安全世界中,许多努力都致力于恶意软件检测。 在这项研究中,将提出一种基
于优化卷积神经网络的Android平台恶意软件检测方案的新方法,该方法集成了风险权限组合
和易受攻击的API调用,并将它们用作CNN算法中的特征。
关键词:恶意软件检测、深度学习、机器学习、安全、分类、卷积神经网络、粒子群优化
(PSO)。
the above types as well [1]. Annual reports from
Introduction antivirus companies show that thousands of new
Malware, which is slipped into a victim’s malware are created every single day. This new
computer by hackers (attackers) through security malware become more sophisticated that they
vulnerabilities of the operating system or could no longer be detected by the traditional
application software, can influence normal detection techniques such as signature-based
operation, collect sensitive information and steal Signature-based detection searches for specified
superuser privileges in order to perform bytes sequences into an object so that it can
malicious actions. Generally, mainstream identify exception- ally a particular type of a
malware includes malicious scripts, vulnerability malware. Its drawback is that it cannot detect
exploits, back doors, worms, trojans, spywares, zero-day or new malware since these mal ware
rootkits, etc., and combinations or variations of
Received: October 18, 2021 / Revised: November 09, 2021 / Accepted: December 30, 2021 / Published: January 31, 2022

About the authors : Karrar Ahmed Kareem


Corresponding author- Email: kalknany526@gmail.com
208

signatures are not supposed to be listed into the Optimized Convolutional Neural Networks
signature database [2]. based approach, which integrates both risky
Identifying malware is essentially a software permission combinations and vulnerable API
classification problem. The process of analyzing calls and use them as features in the CNN
and classifying software involves two main algorithm.
steps. First, through program analysis, various
descriptions of a program are generated and Main Challenges
features are extracted from the program  The most widely used method for
descriptions. Then classifiers can be constructed malware detection is the signature-based
to perform classification based on these features. method. The signature is sensitive to
Due to the sheer number of apps that need to be slight changes in malicious code. They
processed, automated software analysis and require prior knowledge of malware
classification is necessary. It is desired that little samples. Signature-based approaches
or no human effort be required in either of the cannot detect modified or previously
steps [3]. To identify malware, most antivirus unseen malware. Therefore, one primary
scanners use a combination of signature problem faced by the antivirus
matching and heuristic-based detection [4]. The community is how to detect previously
obvious problem with this is that only known and unseen malicious code.
partially known samples based on the artifacts  In the field of malware detection,
extracted by malware analysis will be collecting data samples is a relatively
recognized. Because of the rapid growth in easy task; however, determining whether
malware samples over the last years, this is no an executable is malicious or is not labor
longer sufficient. intensive. Thus, it is difficult for us to
Deep Learning is called deep learning as it is a obtain a large amount of labeled data.
kind of Machine Learning that imitates the  How to build a malware detection system
human brain. Deep Learning, which means based on PSO-CNN.
teaching a computer how to think, has moved  Whether the unlabeled data can be used
away from research topics due to the problem of to improve the accuracy of malware
taking too much time to learn data. However, detection.
since AlexNet, who used CNN, a type of Deep  Whether the deep representation
Learning, won the 2012 image recognition generated using PSO-CNN is helpful for
algorithm competition, with 16.4 percent error feature extraction and dimension
rate, Deep Learning has drawn keen attention reduction.
again [5]. Deep learning mechanisms itself
Purposes
facilitated to extract features by taking raw data
 The goal of using PSO-CNN as a
set as input. They are mainly categorized into
classifier is to reduce the dimensions of
two types (1) convolution neural network (CNN)
the feature vector and malware detection.
(2) recurrent neural network (RNN) [6].
In this proposal, we proposed a malware
detection scheme for Android platform using an
209

1-4 Summarization send text messages to insurnce or bank numbers


It can be said that the main purpose of this without user approval, access to personal data or
research is proposing an optimizal approach for even install code that can download and execute
proposing a method name optimized CNN with additional malware on the victim's device. The
PSO (PSO-CNN) for malware detection in malware can also be used to create botnets. Over
Android platform. The rest of this research is the past few years, the number of instances of
organized as below: malware attacking Android has increased
 In chapter 2, will be consider some significantly. According to a recent McAfee Inc.
articles for liteture review. report, more than 2.5 million new Android
 In chapter 3, the proposed method will be malware programs were discovered in 2017,
describe. increasing the number of malware instances in
 In chapter 4, a simulation will be done various regions to approximately 25 million in
and analysis methods in comparison to 2017 [7].
recent models. Google in February 2012 introduced an
 In chapter 5, a conclusion will be identification mechanism to its app markets
proposed and some methods which can called Bouncer to reduce the prevalence of
used for futures to optimizing some malware, including malware. The Bouncer tests
remained challenges of this study will be presented their programs in a sandbox for 5
consider. minutes to identify any harmful behaviors.
However, it has been shown that Bouncer can be
Google's Android operating system is expected prevented by simple detection methods. At the
to increase dramatically with the 1,500 Android same time as Bouncer, google introduced Google
devices released by 2021. The company is Play Protect and introduced it at the Google 2017
currently leading the mobile operating system event. Google Play Protect is a regular service
market with over 80% market share compared to that automatically scans applications even after
iOS, Windows, Blackberry and Symbian. The installation to ensure that installed applications
availability of diverse Android markets such as are safe. It is reported that more than 50 billion
Google Play, the official store, and third-party programs are scanned and verified every day, no
markets make Android a popular target not only matter where they are downloaded. However,
for legitimate developers, but also for malware according to McAfee Inc. reporst, Google Play
developers. More than one billion devices have Protect also failed in testing against malware
been sold and over 65 billion downloads from discovered in the last 90 days in 2017. In
Google Play. Android apps can be viewed in addition, most third-party stores are not capable
various areas, such as educational apps, game of scanning and detecting harmful programs.
apps, social media apps, entertainment apps, Obviously there is still a need for additional
banking apps, and more [7]. research into effective ways to identify Android
As an open source technology and widely used malware in the world in order to overcome the
by users and developers, Android faces many aforementioned challenges [7].
challenges, especially in the case of malicious Various methods have been proposed in
software. Malware-infected applications can previous work aimed at detecting Android
210

malware. These approaches are classified into process to execute them. Like a
static analysis, dynamic analysis, or hybrid biological virus, they can spread quickly
analysis (where static and dynamic are used and widely, causing damage to the core
together). Static analysis methods use reverse functionality of systems, corrupting files
engineering to analyze malicious code. This and locking users out of their computers.
chapter provides an overview of the issues in the They are usually contained within an
field of malware and their use in the real world, executable file.
along with the challenges involved in detecting  Worms: Worms get their name from the
them. It also discusses a series of previous way they infect systems. Starting from
research that has been done to detect malware on one infected machine, they weave their
Android or other operating systems. way through the network, connecting to
consecutive machines in order to
continue the spread of infection. This
2-1 Malwares type of malware can infect entire
Or malware is the common name for a number networks of devices very quickly.
of types of malware, including viruses,  Spyware: Spyware, as its name suggests,
ransomware, and spyware. Malware usually is designed to spy on what a user is doing.
includes code generated by cyberattacks Hiding in the background on a computer,
designed to cause extensive damage to data and this type of malware will collect
systems or unauthorized access to a network. information without the user knowing,
Malware is typically delivered as a link or file via such as credit card details, passwords and
email, requiring the user to click on the link or other sensitive information.
open the file to run the malware [100].  Trojans: Just like Greek soldiers hid in a
Malware has been under threat from giant horse to deliver their attack, this
individuals and organizations since the early type of malware hides within or disguises
1970s when the Creeper virus first appeared. itself as legitimate software. Acting
Since then, the world has been attacked by discretely, it will breach security by
hundreds of thousands of different types of creating backdoors that give other
malware, all aiming to cause as much disruption malware variants easy access.
and damage as possible. Malware delivers its  Ransomware: Also known as scareware,
load in several different ways. From ransacking ransomware comes with a heavy price.
to stealing sensitive personal information, Able to lockdown networks and lock out
cybercriminals are becoming more sophisticated users until a ransom is paid, ransomware
in their methods. Below is a list of some of the has targeted some of the biggest
common types and definitions of malware [8]. organizations in the world today — with
expensive results.
2-1-1 Types of Malware:
 Virus: Possibly the most common type of 2-1-2 How Does Malware Spread?
malware, viruses attach their malicious Each type of malware has its own unique way
code to clean code and wait for an of causing destruction and relies heavily on the
unsuspecting user or an automated
211

type of user. Some strains are delivered via email similar way. However, its capabilities are
or executable link. Others are delivered via different. While the basic models of machine
instant message or social media. Even mobile learning are getting better with respect to their
phones are under attack. Organizations need to performance, they still need some guidance. If an
be aware of all vulnerabilities in order to artificial intelligence algorithm returns a false
determine an effective defensive line [8]. prediction, then an engineer must step in and
make adjustments. With the deep learning
algorithm, an algorithm alone can determine
2-2 Deep Learning Review whether the prediction is correct through its
Recently, there has been a lot of talk and neural network [15].
discussion about all the possibilities of learning A deep learning model is designed to
machines doing what people are currently doing continuously analyze data with a logical structure
in factories, warehouses, offices and homes. similar to how a human concludes. To achieve
While this technology is evolving rapidly with this, deep learning programs use a layered
fear and excitement, some terms such as artificial structure of algorithms called artificial neural
intelligence, machine learning and deep learning networks. The design of an artificial neural
can make people anxious. network draws inspiration from the biological
The field of artificial intelligence is basically neural network of the human brain and results in
when machines can perform tasks that usually a learning process that is far more powerful than
require human intelligence. This includes standard machine learning models. It is certain
machine learning in which machines can learn that a deep learning model does not produce the
with experience and acquire their skills without wrong result as with other examples of artificial
human involvement. Deep learning is a subset of intelligence, it requires a lot of training to correct
machine learning in which artificial neural learning processes. But when it works as
networks, algorithms inspired by the human intended, deep functional learning is often
brain, learn from large amounts of data. Similarly perceived as a scientific surprise, with many
to how we learn from experience, the deep finding it the backbone of real artificial
learning algorithm does the same thing intelligence [14].
repeatedly, each time improving it slightly to An excellent example of deep learning is
optimize the result. Deep learning models benefit Google's AlphaGo. Google has developed a
from more sophisticated artificial neural computer program with its own neural network
networks, more sophisticated system features that has learned to play the abstract board game
and architecture, and can perform better than called Go that requires sharp intellect and
traditional machine learning models [14]. intuition. Playing against professional Go
Deep learning allows machines to solve players, AlphaGo's deep learning paradigm
complex problems even when using a highly learned how to play at a level never before seen
diverse, unstructured, interconnected dataset. in artificial intelligence, without telling when to
The deeper the learning algorithms, the better make a specific move as a learning model. It
they perform. Practically, deep learning is just a saddened AlphaGo to defeat several "celebrity"
subset of machine learning. In fact, deep learning masters of the game - not only could a machine
is technically a machine learning and works in a
212

understand complex techniques and abstract neural networks. Better depth can be added
aspects of the game, but also become one of its through better algorithms and greater computing
greatest players. Several advances are now in power. While the current market focus is on deep
deep learning [15]: learning techniques in cognitive computing
 Algorithmic advances have enhanced the applications, there is also great potential in more
performance of deep learning methods. traditional applications of analysis, for example,
 New approaches to machine learning time series analysis. Another possibility is to
have improved the accuracy of models. simply be more efficient and streamlined in
 New classes of neural networks have analytical operations [15].
been developed that are well suited for From an outside perspective, deep learning
applications such as text translation and may be at the research stage, as computer
image classification. scientists and data scientists are still testing its
 Much more data is needed to build neural capabilities. However, deep learning has many
networks with deep layers, including IoT practical applications that businesses are using
data, social media text data, physician today and others that are being used as research
notes, and research transcripts. continues. A traditional approach is to analyze
 The computational advances of the use of data needed for engineering to extract
distributed cloud computing and graphics new variables, then select an analytical model
processing units have provided incredible and finally estimate the parameters (or
computing power. This level of unknowns) of that model. These techniques can
computational power is needed to train have predictive systems that are not well
deep algorithms. generalized because their completeness and
At the same time, human-machine interfaces accuracy depend on the quality of the model and
have also evolved a lot. The mouse and keyboard its features. For example, if a fraud model is
are replaced by natural gesture, touch, and developed with attribute engineering, it starts
language, with a particular interest in artificial with a set of variables and is most likely a model
intelligence and deep learning. To solve deep of those variables using data conversion. It may
learning problems due to the duplicate nature of end up with the 30,000 variables that the model
deep learning algorithms, their complexity depends on, then the model needs to be shaped
requires increasing computational power with and understood which variables are meaningful,
the increasing number of layers and large which are not, and so on. Adding more
amounts of data needed to train networks. The information will force people to redo. The new
dynamic nature of deep learning methods - their approach is deep learning, replacing formulation
ability to continually improve and adapt to and model specifications with hierarchical
changes in contextual information patterns - is a properties (or layers) that learn to distinguish the
great opportunity to introduce more dynamic underlying features of the data from the
behavior in analytics. Further customizing perspectives within the layers. Changing the
customer analytics is a possibility. Another great paradigm by deep learning is a move from
opportunity is to improve the accuracy and feature engineering to feature representation. The
performance of applications that have long used promise of deep learning is that it can lead to
213

predictive systems that are well-generalized, widely accepted measures, such as TF-IDF and
well-adapted, continually improving with new cosine similarity. Finally, we use Deep learning
data entry, and are more dynamic than predicted algorithm to build a classification model and
systems. Hard business rules have been made evaluate them on the Android app dataset by
[15]. classifying them into malware or benign apps.
Among several DL variations, Convolutional
Neural Networks (CNNs) currently represent the
2-3 Proposed Method Modeling state-of-the-art for complex classification
In this section, we first introduce the overall problems, especially related to image analysis
architecture of proposed malware detection [1]. For the complex task at hand, we propose an
scheme, and then describe each of the functions PSO-optimized CNN classifier. The Traditional
in details to demonstrate how the proposed CNN features 3 layers, with convolution and max
scheme works for malware detection. Figure (3- pooling. In specific case, the input tensor is a
1) depicts the overall structure of the malware single vector with the features of selected the
detection scheme. dataset. Each convolution layer i can be
described by 3 hyper-parameters wi, wdi, hi,
respectively; output channels, convolution
window, and max pooling window. In addition to
the convolution layers, there's a fully connected
rectifier layer with a hyper-parameter w4 for the
output channels, with a final softmax layer with
dropout. Given the hyper-parameters, our
objective becomes finding their optimal (or near-
optimal) values, for our target classification
problem. In order to reach this goal, we resort to
PSO-based optimization. In fact, PSO algorithm
will be used as feature extraction in hidden layers
of CNN for tuning this neural network.
The first step is to enter the data into the CNN
Figure (3-1) The schematic diagram of structure for training and testing with feature
optimized CNN-based malware detection extraction operations, which uses PSO
There are three major components in the algorithms to construct a PSO-CNN structure to
malware detection scheme, namely decoder detect malware on Android operating systems. In
(decompilation), feature extraction, and fact, here's a modeling. A training set is
classifier. In the decompilation component, each constructed in 𝑁 × 𝑀 dimensions where 𝑀 is
Android app is unpacked and decoded into a the number of data samples containing malware-
readable smali file. Some key features, such as free and malware-free data. The average data set
risky permissions, suspicious API calls, and is calculated by the equation (3-1).
URLs are then extracted in feature extraction 1
components according to several important and 𝜓= Γ
𝑀
214

(1-3) components are then constructed by subtracting


According to Equation (3-1), 𝜓 is the mean data, data from the number of special vector classes.
𝑀 is the number of data and Γ is the data vector. The number of special vector classes means all
The components corresponding to the highest the malware or malware in the data. The selected
eigenvalue are retained. These components set of special vectors is multiplied by matrix 𝐴 to
define the malware space. Specific space is reduce the feature space. Special vectors from
created by plotting data into the malware space, smaller eigenvalues correspond to smaller
which creates components. So the weight vectors changes in the covariance matrix. Other features
are calculated. Data dimensions are set to view of the data are maintained. The number of special
specifications and data is refined in the pre- vectors depends on the accuracy of the data in the
processing step of identification. Then the weight database. A group of specially selected vectors
data vector and the malware weight vector are are called components. Once the components are
compared in the database. The average malware obtained, the data is collected in a database
is calculated from the data and then subtracted containing component space and the weights of
from any data in the training set. Matrix 𝐴 is the components in the same projected space.
constructed using the results of subtraction Special coefficients are compared to determine a
operations. The difference between each data and data item with a special coefficient of database
its mean is calculated as an equation (2–3). data. The component is then built into the data.
The Euclidean distance between the data
𝜙 =Γ −Ψ, 𝑖 = 1,2, … , 𝑀 component and the components collected in the
(2-3) previous step is calculated. Malware is identified
According to equation (3-2), 𝜙 is the as an object whose Euclidean distance is less than
difference between the original data and the the threshold value in the component database. If
mean data. The matrix obtained by subtraction all Euclidean gaps are greater than the threshold,
operation, which is matrix 𝐴, is multiplied by its then malware will not be detected in the data and
derivation and eventually the covariance matrix the data will be ignored. For the final application
𝐶 is formed which is the relation of the equation of the grid, it is necessary to identify the twists
(3-3). that there are three general methods for doing
𝐶=𝐴 𝐴 this, including thresholding of wavelet
(3-3) coefficients, adaptive filters, and thresholding of
According to equation (3-3), this 𝐴 is formed the range of action potentials. The approach of
by the difference between the vectors, for this research to detect malware is to use the
example 𝐴 = [𝜙 , 𝜙 , 𝜙 , … 𝜙 ]. The dimension threshold of action potentials. The value of this
of matrix 𝐶 is 𝑁 × 𝑁. The number of data threshold is determined by the equation (3-4).
samples or 𝑀 is used to form the 𝐶 matrix. In |𝑥|
𝜎 = 𝑚𝑒𝑑𝑖𝑎𝑛{
practice we can say that the matrix 𝐶 is 𝑁 × 𝑀. 0.6745
𝑇ℎ𝑟𝑒𝑠ℎ𝑜𝑙𝑑 = 3.5 𝜎
On the other hand, when the order 𝐴 is equal to
(4-3)
𝑀, only 𝑀 of 𝑁 is a special vector number of
In relation (4–4), the 𝑥 signal or malware data
non-zero values. Then the eigenvalues of the
recorded by the microelectrode (raw signal) and
covariance matrix are calculated. The
𝜎 is an estimate of the standard deviation of the
215

noise. It is important to note that if we use the operation is carried out, which includes the three
standard deviation of the signal, a larger inner layers of twisting, pooling and fully
threshold value will be obtained and as a result connected, as well as the final test operation on
some twists will be mistakenly eliminated. After the face output layer. The issue of detecting the
selecting the threshold value, the torsions are presence or absence of malware creates a
aligned according to their maximum values. The challenge and a search space. In an optimization
precise alignment of the twists is a crucial factor problem such as malware detection, the presence
in identifying malware with twists. Like other or absence of malware from the data with the
neural networks, this neural network also next 𝑁 would be an 𝑥 array that
requires training, with the goal of finding a represents the current position for the coiled
mapping such as 𝑓: 𝑅 → 𝑅 in equation to (3-5). layer in the convolutional neural network. This
𝑓(𝑣) = ∑ 𝑤 𝜑( |𝑣 − 𝐶 | ) array is defined as equation (3-8).
(5-3) 𝐶𝑜𝑛𝑣𝑜𝑙𝑣𝑒 = [𝑥 , 𝑥 , … , 𝑥 ]
According to Equation (3-5), 𝑣 ∈ 𝑅 is a 32- (8-3)
point vector for the input and the Gaussian basic The degree of appropriateness (or amount of
function φ(0) is determined by Equation (3-6). benefit) in the current Convolve layer is obtained
−𝑣 by evaluating the malware function 𝑓 in
φ(𝑣) = 𝑒𝑥𝑝( )
2𝜎 Convolve. So there are equations (3-9) and (3-
(6-3) 10).
Then, it is necessary to calculate the error 𝑝𝑟𝑜𝑓𝑖𝑡 = 𝑓 (𝐶𝑜𝑛𝑣𝑜𝑙𝑣𝑒)
corresponding to each sample from the gradient = 𝑓 (𝑥 , 𝑥 , … , 𝑥 )
decsent and for the random initial values for the (9-3)
weights, for each training sample, in relation to () ( ) ( )
𝑗 𝑥 ,…,𝑥 ,𝜃 ,…,𝜃 ( )
(3-7). 1
= ( 𝜃( ) 𝑥( )
2
𝑒 =𝑡 −𝑦 =𝑡 𝑤 𝜑( 𝑣 − 𝐶 ) ( , ): ( , )

𝜆 ()
(7-3) − 𝑦( , )) + (𝑥 )
2
Therefore, the total error of the network for all
training input vectors or 𝑃 of the data is 𝐸 = 𝜆 ( )
+ (𝜃 )
∑ |𝑒 | . If the errorE is less than the threshold 2
error, the training ends. This value is set (10-3)
manually at the beginning of the job. Otherwise In fact, the overall purpose of the malware
weights will be updated using the gradient detection function is whether or not it is related
gradient. Upon completion of training with the (3-10). In general, we should minimize the above
neural network, the ability of each twist to belong function as much as possible to detect malware.
to the class to which it belongs is achieved. It In fact, a modal deletion of additional
needs to be characterized by its layering components is considered for precise detection of
structure, which includes the input layer malware that minimizes the equation (11–3).
(neurons), the secret layer where the training
216

𝑚𝑖𝑛 𝑗 𝑥( ), … , 𝑥( ), 𝜃( ), … , 𝜃( ) convolve segment in the deep CNN only λ % of


( ) ,…, ( )
( ) ,…, ( ) all detected areas toward the current ideal target
(11-3) and also has a deviation of φ radians. These two
As can be seen, the deep neural network parameters help each convolve part of the deep
structure used is a convolution algorithm that CNN convergence to search for more
maximizes the malware detection function or environment. λ is a random number between 0
not. To use a deep CNN to solve minimization and 1 and φ is a number between π/6 and -π/6.
problems such as malware detection, it is When all of convolve in the deep CNN migrated
sufficient to multiply a negative sign in the cost to the target point and new habitats were
function. To start the convolution optimization identified, each convolve part in the deep CNN
algorithm, a 𝐶𝑜𝑛𝑣𝑜𝑙𝑣𝑒 matrix is generated with had some layers. Depending on the number of
𝑁 ∗ 𝑁 . Then, for each of these Convolves, layers in each convolve segment in the deep
CNN, a 𝑀𝑎𝑥 is specified for it,
a random number of Layers is assigned. Pooling
layers are generally between 2 and 5 items. These and then communication with the layers begins.
numbers are used as the upper and lower limits It is now necessary to optimize this deep CNN
of the Poolling assignment to each convolv in using the PSO algorithm for the extraction of
depth training in different iterations. Another correct and optimal features. It is assumed that
custom of any deep structure based CNN is that the dataset is 𝑀 which represents the number of
they have layers in a given domain. Hence, the training images, 𝐹 average images, and 𝐿 per
maximum amplitude of layers connected in the 𝑇 vector image. Initially 𝑀 is the number of data
CNN is called 𝑀𝑎𝑥 . In an that each data contains 𝑀 × 𝑁 dimension. Any
optimization problem, the upper limit of the data can be represented in 𝑁 space by 𝐴 = 𝑁 ∗
variables 𝑣𝑎𝑟 and the lower limit 𝑣𝑎𝑟 𝑁 ∗ 𝑀. Therefore, it is necessary to provide a
hybrid layer for the PSO algorithm segment with
each layer will have a 𝑀𝑎𝑥 that is
deep CNN, which will be in equation (13-3).
proportional to the total number of layers, the ( ) ( ) ( )
number of layers of current training data, and 𝑥 = (𝑤 ) (𝑦 )( ) + (𝑤 ) 𝑦
( )
also The upper and lower limit are the variables +𝑏
of the problem. Therefore 𝑀𝑎𝑥 is (13-3)
defined as equation (3-12). In equation (3-13), 𝑦 is the output of the deep
𝑀𝑎𝑥 CNN and 𝑦 is the output of the PSO algorithm
=𝛼 that has two parts 𝑤 and 𝑤 for weight. In fact,
𝑁𝑢𝑚𝑏𝑒𝑟 𝑜𝑓 𝑐𝑢𝑟𝑟𝑒𝑛𝑡 𝑙𝑎𝑦𝑒𝑟𝑠 𝑤 determines the weight of the deep CNN and
×
𝑇𝑜𝑡𝑎𝑙 𝑛𝑢𝑚𝑏𝑒𝑟 𝑜𝑓 𝑙𝑎𝑦𝑒𝑟𝑠 𝑤 is the weight of the PSO algorithm. This layer
× 𝑉𝑎𝑟 − 𝑉𝑎𝑟 is used as 𝑦 to combine these two layers, namely
(12-3) the deep CNN and the PSO algorithm section,
In equation (3-12), 𝛼 is the variable with which which has an activation function that is
the maximum value of 𝑀𝑎𝑥 is set. considered as a hyperbolic tangent. In fact, the
In equation (3-12) there are also 𝜃 layers and in classification and detection section was
relation (3-11), λ is the estimator value. Each developed using an innovative method called the
217

deep CNN based on the PSO algorithm or PSO- guarantee the proposed approach to other
CNN. methods. It should be noted that the mean error
What separates the proposed approach from the squares also have a training stage that is not
neural-fuzzy network (ANFIS) structure is that in considered in this study and its training is used
neural-fuzzy network structures, the definition of when using neural network structures and then
membership functions and fuzzy numbers is the mean error squares as a Use the evaluation
based on inputs and some properties especially in method. To identify the parameters, we need
images whose features are And are connected to training data expressed as an equation (3-14).
an inference system that is trained on a weighted {(𝑢 ; 𝑦 ), 𝑖 = 1, … , 𝑚}
and bias-based training layer with a transfer or (14-3)
stimulus function. But in the proposed approach Replacing the input and output pairs in the main
structure that performs the detection operation, equation, we arrive at the linear equation 𝑚 such
the structure of the PSO algorithm is located on as euation(3-15).
the training layers or hidden layers of the deep {𝑓 (𝑢 )𝜃 + 𝑓 (𝑢 )𝜃 + ⋯ + 𝑓 (𝑢 )𝜃 = 𝑦

CNN section after the twisting layer, where the ⎪ {𝑓
(𝑢 )𝜃 + 𝑓 (𝑢 )𝜃 + ⋯ + 𝑓 (𝑢 )𝜃 = 𝑦
twisting layer, the PSO algorithm layer. The .
optimum feature extraction is simultaneous with ⎨ .
the instruction, the pooling layer and the fully ⎪ .
⎩{𝑓 (𝑢 )𝜃 + 𝑓 (𝑢 )𝜃 + ⋯ + 𝑓 (𝑢 )𝜃 = 𝑦
connected layer, with the filter applied in a 3 × 3
(15-3)
window type. In fact, in this training operation,
Which is in the form of a relation matrix such as
the coefficients between the PSO algorithm are
equation (3-16).
applied with the layers of the convoluted
𝐴𝜃 = 𝑌
segment, ie the convolve layer, and the output
(16-3)
layer displays the classification based on the data
And matrix A is like equation (3-17).
obtained as a vector matrix and is processed
definitely to prove the optimization problem is  f1 (u1 )  f n (u1 ) 
done on the input data itself. A     
 f1 (u m )  f n (u m ) 
2-4 Evaluation
(17-3)
Several evaluation criteria have been used in
And θ is a vector 𝑛 × 𝜃, which is the same as
this study, including Mean Square Error (MSE),
equation (3-18).
Peak Signal-to-Noise Ratio (PSNR), Signal-to-
𝜃
Noise Ratio (SNR), and precision criteria. At the ⎡𝜃 ⎤
beginning, each survey is completed and finally ⎢ ⎥
.
the results of the research performed with each 𝜃=⎢ ⎥
⎢ . ⎥
evaluation criterion are mentioned. ⎢ . ⎥
⎣𝜃 ⎦
3- 1 Mean Square Error (MSE) (18-3)
In this project, the MSE are used to evaluate the And 𝑦 is the output vector 𝑚 × 1 and such as
system. The purpose of the evaluation here is to equation (3-19).
218

𝑦 sensitivity, and feature rates. Because the mean


⎡𝑦 ⎤
⎢. ⎥ squared error can be either a variance or an
𝑌=⎢ . ⎥ integer below 5, the best rate is between 0.1 and
⎢ ⎥ 0.9. Based on the change in the mean squared
⎢. ⎥
⎣𝑦 ⎦ error as the estimator, the results of other
(19-3) estimation methods may change and have a high
In this case, the vector 𝜃 can be calculated as impact on them. In fact, this same mean of error
equation (3-20). squares affects other estimation methods based
𝜃=𝐴 𝑌 on this result rate, a percentage for accuracy,
(20-3) sensitivity and specificity estimation methods,
However, the number of input-output data pairs and a number in dB for the peak signal ratio. The
is usually greater than the number of equations. noise and signal-to-noise ratio is obtained.
In general, the data may be accompanied by
noise that is too high or the model may not be 3- 2 Peak Signal-to-Noise Ratio (PSNR)
able to accurately determine the output. In this
case the equation is in the form of equation (3- Another criterion used to evaluate is the Peak
21). Signal-to-Noise Ratio (PSNR), which is the ratio
𝐴𝜃 + 𝑒 = 𝑌 between the maximum possible signal power and
(21-3) the noise power calculated from the equation
In this case we will seek 𝜃 = 𝜃 to minimize (24-23) in dB.
the sum of the squares of the error and be such as
𝑀𝐴𝑋
equation (3-22). 𝑃𝑆𝑁𝑅 = 10. 𝑙𝑜𝑔 ( )
𝑀𝑆𝐸
𝐸(𝜃) = ∑ (𝑦 − 𝑎 𝜃) = 𝑒 𝑒 = (𝑦 −
𝐴𝜃) (𝑦 − 𝐴𝜃) (24-3)
(22-3) Where 𝑀𝐴𝑋 is the maximum possible signal.
If 𝐴𝐴 is reversible then we have the equation (3-
23).
3- 3 Signal-to-Noise Ratio (SNR)
𝜃 = (𝐴 𝐴) 𝐴 𝑦
There is a need for a benchmark to represent
(23-3)
the useful signal versus noise signal in the built
In each iteration, the feedback method is
project. A value less than 1 indicates a serious
forwarded until the matrix A and the outputs are
noise problem in the images. A value greater than
obtained by the least squares error method. It
1 is a satisfactory condition and a value above 2
should be noted that all training data must be
is appropriate. This claim can be substantiated by
applied and the basic parameters kept constant.
citing scientific articles as well as explanations
Then the empty parameters are kept constant and
available on most websites. In fact, the higher
the initial parameters are set by the descending
this indicator is, the better the signal is in the
gradient. When the minimum value indicates less
image. The Signal-to-Noise Ratio (SNR) is
erosive variance, the value is usually expressed
defined as the Signal-to-Noise Ratio of the power
as a percentage. Percentage value refers to the
to noise ratio according to equation (3-25).
fraction of data we want to measure accuracy,
219

𝑆𝑁𝑅 = In general, the ROC curve is a graphical


representation of the sensitivity or prediction
(25-3)
accuracy versus the false prediction in a binary
Where P is the average signal power value.
classification system in which the threshold of
Because most signals have a dynamic range, they
separation varies. The ROC curve is also shown
are defined in decibel logarithms, which will be
by plotting the predicted positives against the
the equation (3-27) for the power signal and the
predicted false positives. The area under the
equation (3-27) for the noise signal.
ROC curve is a number that measures one aspect
𝑃 = 10𝑙𝑜𝑔 (𝑃 )
, of performance. This area below the curve is
(26-3) called the AUC, which is between zero and one.
𝑃 , = 10𝑙𝑜𝑔 (𝑃 The value is 0.5 times the random prediction, and
(27-3) values above 0.7 to 1 indicate excellent
prediction, classification or clustering.
3- 4 Precission (Accuracy)
The accuracy rate is a criterion expressed as a
percentage, which is the most important overall Simulation and Results
result of the evaluation criteria section, which is
4-2 Simulation and Results
the accuracy of the relationship and its eqution is
MATLAB software will be used to implement
(3-28).
𝑇𝑃 + 𝑇𝑁 the approach as well as using a set of evaluation
𝐴𝑐𝑐𝑢𝑟𝑎𝑐𝑦 = 100 × criteria as the simulation platform method. In
𝑇𝑁 + 𝑇𝑃 + 𝐹𝑁 + 𝐹𝑃
(28-3) fact, there is a simulation and evaluation of the
In equation (3-28), 𝑇𝑃 is false positive, 𝑇𝑁 method, along with a Java-based structure for
positive negative, false positive 𝐹𝑃 and false Android that can be used. The main purpose is to
negative 𝐹𝑁. Equation (3-29) shows the evaluate the proposed approach, called PSO-
sensitivity expressed in percentage. CNN, which can be compared with other
𝑇𝑃 methods in this field to determine the accuracy of
𝑆𝑒𝑛𝑠𝑖𝑡𝑖𝑣𝑖𝑡𝑦 = malware detection.
𝑇𝑃 + 𝐹𝑁
(29-3) Feature selection and extraction is one of the
Equation (3-30) shows the data properties issues in machine learning as well as statistical
expressed in percentage. pattern recognition. This is important in many
𝑇𝑁 applications because there are many features that
𝑆𝑝𝑒𝑐𝑖𝑓𝑖𝑐𝑖𝑡𝑦 = many are either useless or have little information
𝑇𝑁 + 𝐹𝑃
(30-3) about. Not removing these features does not
cause any information problems, but it does
increase the computational burden for the
3- 5 ROC and AUC intended application and in addition it saves a lot
This curve is known as one of the most of useless information along with useful data.
important evaluation criteria that expresses the For the feature selection problem, there are
criteria for measuring the efficiency of a many solutions and algorithms, some of which
classification or clustering operation in a system. are thirty or forty years old. Different feature
220

selection methods try to find the best one among Gatak (G) 1011
the 2 candidate subfolders. In all of these Benign Sample (B) 1200
methods, based on the application and the type of
definition, a subset is selected as the answer that The general configurations of the
can optimize the value of an evaluation function. Convolutional Neural Network (CNN) are
Although every method tries to pick the best one, discussed in detail in Chapter 3, but in the
but given the breadth of possible solutions and simulation, additional parameters are required
how these solution sets increase with N, finding for the Convolutional Neural Network (CNN) as
the optimal solution is difficult. And in medium shown in Figure (4-1). The network uses two
to large N, it is very expensive. In this research, layers for training, both using the trainlm or
a method based on the PSO-CNN hybrid method Levenberg Marquardt or trainlm activation
is presented where the PSO algorithm is placed function with 10 neurons. The number of neural
as a layer in the hidden layers phase of CNN. In network cycles is assumed to be 1000 rpm. The
fact, CNN's data training and feature extraction mutation rate is also 0.001. The weight of each
operations are to be done simultaneously in its layer is 1. The layers are named sequentially in
PSO-based training layer. In this simulation, a set Chapter 3, which consists of a hidden layer in
of measurement criteria is taken into account, two sections with convolve layers, an PSO
and the conditions in the two reference papers algorithm layer for feature extraction, a pooling
[25] and [26] are also modeled and simulated in layer and a fully connected layer.
the same research method to compare them
completely. Be. It is noteworthy that the
proposed approach and two reference articles use
a database called MS Big Malware Database,
taken from the Figure (4-1) PSO-CNN in MATLAB
https://www.kaggle.com/c/malware- configuration
classification website. In general, the MS BIG It is then necessary to introduce the PSO
dataset with malware class distribution and algorithm operators as presented in Table (4-2)
benign instances is shown in Table (4-1). by default.
Table (4-1) MS BIG dataset with malware class Table (4-2) PSO algorithm operator’s rate
distribution and benign instances
Initial Population (Particles) 100
Number of
Malware Type Iteration Numbers 100
Samples
Velocity 2
Ramnit (R) 1534
𝐹 rate for operating mobility 0.0001
Lollipop (L) 2470
𝐶𝑅 rate for operating mobility of 0.1
Kelihos ver3 (K3) 2942
two features (malware – benign)
Vundo (V) 451
Number of all operators [2 12]
Simda (S) 41
The number of strong agents per 2
Tracur (T) 685
iteration
Kelihos ver1 (K1) 386
The number of eliminating factors 4
Obfuscator.ACY (O) 1158 per iteration
221

Count the best possible solution 6 Figure (4-3) Bit rate data on dimensionality
and analyze its best change in malicious areas

Outputs are displayed after all the initial values As shown in Figure (4-3), bit rate data on
are set. Initially, Figure (4-2) is shown as dimensionality change is shown in malware
follows. areas. In fact, it is clear that as the dimensions get
larger, the bit rate data is more likely to be sent,
and this is a common occurrence. But since the
growth of data volume in bits may include
malware, it is shown in scales. It can be seen that
the proposed approach with cyan color diagrams
has a functional advantage over the other two
methods, namely [25] with blue and reference
[26] with pink. Prior to testing the data by PSO-
Figure (4-2) Bit Rate Control in Malware CNN, the accuracy was approximately 70%,
Detection after Proposed Method which can be seen in Figure (4-4), whose y-axis
is 100% unit.
As can be seen in Figure (4-2), bit rate data
controllability in malware detection after
applying the proposed method is initially
increased due to the use of appropriate number of
agents in the environment. It's gone downhill and
it's actually been optimized. The controllability
reflects the fact that the proposed approach is
capable of controlling malware. Figure (4-3)
shows the bitrate data for dimensionality change
in malicious areas compared to the methods
presented in [25] and [26].

Figure (4-4) approximate accuracy before data


test operation for detection and classification by
PSO-CNN method
But after thorough analysis of the proposed
approach, the accuracy of malware detection by
the PSO-CNN method reaches 98.54%. In
general, the results of the evaluation of the
proposed approach are shown in Table (4-3) and
the result of the ROC chart and the AUC rate in
Figure (4-5).
222

Table (4-3) results of the evaluation of the Table (4-4) Evaluation results in comparison to
proposed approach recent methods based on accuracy
AUC Specifici Sensitivi Accura MS References Approaches Accuracy
ty (%) ty (%) cy (%) E (%)
0.859 85 % 90 % 98.54 % 0.45 Zhongru Ren, et al., CRNN- 95.80 %
2 2020 [25] LSTM
Xinbo Liu, et al., Visual-AT 97.56 %
2020 [26]
Zhongru Ren, et al., CNN 93.60 %
2020 [25] – other
methods
Zhongru Ren, et al., CRNN- 93.70 %
2020 [25] – other GRU
methods
Xinbo Liu, et al., Random 95.13 %
2020 [26] - other Forest
methods
Figure (4-5) ROC and AUC rate after applying Xinbo Liu, et al., LZJD- 96.78 %
proposed method 2020 [26] - other KNN
methods
The results of the proposed approach are found Xinbo Liu, et al., M-CNN 98.05 %
to be relatively appropriate. Also in Figure (4-5) 2020 [26] - other
the data that is out of line of the ROC and is methods
disconnected from it are actually errors or Xinbo Liu, et al., Lanzcos- 95.61 %
outliers. The red line is the ROC line, where it 2020 [26] - other CNN
rises and is calculated based on the positive methods
integer or TP on the false integer or FP, the AUC Proposed Method PSO-CNN 98.54 %
rate or the area under its curve. The middle black
line is the regression line. A comparison of the
accuracy criteria is also made with the two
methods presented in [25] and [26], the results of
which can be seen in Table (4-4) and Figure (4-
6). Also in these two references, there are a
number of other methods that have been
compared with their predecessor methods, which
is also done in this study. It is noteworthy that the
validation for accuracy was based on the Kfold
method, with a K rate of 10 which would be
10Kfold.
223

Figure (4-6) Evaluation results in comparison to proposed approach, which should be numerically
recent methods based on accuracy (%) lower than a standard, is better than previous
methods.
It is found that the accuracy of the proposed
approach in detecting malware is superior to
previous methods when using the same MS Big Conclusion and Suggestions
data set. Following is a comparison of the area
Conclusion
under the curve or the AUC with the method
Nowadays, mobile devices such as
presented in [25], the results of which are
smartphones and tablets have become very
presented in Table (4-5) and Figure (4-7).
popular, due to a reduction in their cost and an
Table (4-5) Evaluation results in comparison to
increase in their functionalities and services
recent methods based on AUC
availability. Moreover, the growing trend of
References Approaches AUC
implementing bring your own device (BYOD)
Zhongru Ren, et al., CRNN- 0.982
policies in organizations has also contributed to
2020 [25] LSTM
the adoption of these technologies, not only for
Zhongru Ren, et al., CNN 0.980
everyday communication activities but to
2020 [25] – other
support enterprise systems, industrial
methods
applications, and commercial transactions,
Zhongru Ren, et al., RNN-LSTM 0.975
which raise new security issues. In this scenario,
2020 [25] – other
operating systems have also played an important
methods
role in the adoption and proliferation of mobile
Zhongru Ren, et al., RNN-GRU 0.983 devices and applications, giving also space for
2020 [25] – other the appearance of malicious software (malware).
methods This is the case for the Android OS, which, due
Proposed Approach PSO-CNN 0.8592 to its openness and free availability, has become
not only a major stakeholder in the market of
mobile devices but has also become an attractive
target for cybercriminals.

Figure (4-7) Evaluation results in comparison to


recent methods based on AUC
Figure (5-1) the main Android platform building
blocks
It can be seen from Table (4-5) and Figure (4-
7) that the area under the curve, or AUC, of the
224

First of all, the Android device hardware block new application to an official or alternative
refers to the wide range of hardware market. Users could be vulnerable by being
configurations where Android can be run, enticed to download and install these infected
including smartphones, tablets, watches, applications. In the case of the update attack,
automobiles, smart TVs, OTT gaming boxes, and instead of enclosing the payload as a whole only
set top boxes. Android is processor-agnostic, but an update component is included, which will
it does take advantage of some hardware-specific fetch or download the malicious payloads at
security capabilities such as ARM eXecute- runtime. Because the malicious payload is in the
Never. Secondly, the Android OS building block “updated” application, not the original
refers to the Android OS itself, which is built on application itself, it is stealthier than the malware
top of the Linux kernel, thus all device resources installation technique that directly includes the
are accessed through the operating system. entire malicious payload in the first place. The
Thirdly, the Android application runtime block third technique applies the traditional drive-by
refers to the managed runtime used by download attack to mobile space. Though they
applications and some system services on are not directly exploiting mobile browser
Android. In this case, it must be taken into vulnerabilities, they are essentially enticing users
account that applications are written in the Java to download “interesting” or “feature-rich”
language and run in the Android runtime (ART). applications. This is only a set of common
However, many applications, including core techniques, other threats include combinations of
Android services and applications, are native the previous techniques, as well as approaches
applications or included native libraries. Both such as “spyware,” which intend to be installed
ART and native applications run with the same to victim’s phones on purpose; fake apps that
security environment, contained with the masquerade as the legitimate applications but
applications sandbox. Thus, applications get a stealthily perform malicious actions, such as
dedicated part of the file system in which they stealing users’ credential; applications that
can write private data, including database and provide the functionality they claimed, they are
raw files. not fake ones, but that intentionally include
Android malware can be characterized in malicious functionality, which is unknown to
different ways: a systematic characterization is users.
proposed ranging from their installation, Malware programs currently represent the
activation, to the carried malicious payloads. most serious threat to computer information
Thus, malware installation can be generalized systems. Despite the performed efforts of
into three main social engineering-based researchers in this field, detection tools still have
techniques: repackaging, update attack, and limitations for one main reason. Actually,
drive-by download. Repackaging is one of the malware developers usually use obfuscation
most common techniques that malware authors techniques consisting in a set of transformations
use to piggyback malicious payloads into that make the code and/or its execution difficult
applications. In essence, malware authors get an to analyze by hindering both manual and
application file, disassemble them, enclose automated inspections. These techniques allow
malicious payloads, reassemble, and submit the
225

the malware to escape the detection tools, and 3. Nix, Robin, and Jian Zhang.
hence to be seen as a benign program. "Classification of Android apps and
The approach presented by this study is to use malware using deep neural networks." In
a hybrid method of Convolutional Neural 2017 International joint conference on
Network (CNN) and Particle Swarm neural networks (IJCNN), pp. 1871-
Optimization (PSO) algorithm. In fact, the 1878. IEEE, 2017.
Convolutional Neural Network (CNN) is 4. Bragen, Simen Rune. "Malware detection
intended to train and test data to make a through opcode sequence analysis using
diagnosis, but it is important to identify malware machine learning." Master's thesis, 2015.
features from a dataset, and the Convolutional 5. Lee, Yoon-seon, Jae-ung Lee, and Woo-
Neural Network (CNN) cannot be precisely young Soh. "Trend of Malware Detection
identifiy feature extraction, so it uses Particle Using Deep Learning." In Proceedings of
Swarm Optimization (PSO) algorithm to the 2nd International Conference on
improve it in the training and test layers. In fact, Education and Multimedia Technology,
all the data is in the input layer and in the hidden pp. 102-106. ACM, 2018.
layer, first is the convolve layer, then the Particle 6. Vinayakumar, R., K. P. Soman, and
Swarm Optimization (PSO) algorithm layer, and Prabaharan Poornachandran. "Detecting
then the pooling layers and finally the fully malicious domain names using deep
connected layer. The output layer is also intended learning approaches at scale." Journal of
for displaying malware work and detection. The Intelligent & Fuzzy Systems 34, no. 3
results show that the accuracy of the proposed (2018): 1355-1367.
approach has a relative advantage over the 7. Mohammed K. Alzaylaee, Suleiman Y.
previous methods and in terms of the AUC based Yerima, and Sakir Sezer. DL-Droid:
on the ROC curve, optimization with the Deep learning based android malware
previous methods is also evident. detection using real devices. Computers
& Security, Volume 89, 2020, 11 Pages.
References 8. S. Sibi Chakkaravarthy, D. Sangeetha,
1. Ni, Sang, Quan Qian, and Rui Zhang. and V. Vaidehi. A Survey on malware
"Malware identification using analysis and mitigation techniques.
visualization images and deep learning." Computer Science ReviewVolume
Computers & Security 77 (2018): 871- 32May 2019Pages 1-23.
885. 9. Sacha Gobeyn, Ans M. Mouton, Anna F.
2. Kumar, Rajesh, Zhang Xiaosong, Riaz Cord, Andrea Kaim, and Peter L. M.
Ullah Khan, Ijaz Ahad, and Jay Kumar. Goethals. Evolutionary algorithms for
"Malicious code detection based on species distribution modelling: A review
image processing using deep learning." in the context of machine learning.
In Proceedings of the 2018 International Ecological Modelling, Volume 39224
Conference on Computing and Artificial January 2019, Pages 179-195.
Intelligence, pp. 81-85. ACM, 2018.
226

10. Maurice Clerc. Particle Swarm Method for Android Malware Detection
Optimization. First published:1 January Based on Control Flow Graphs and
2006, 243 Pages. Machine Learning Algorithms." IEEE
11. Sara Kaviani, and Insoo Sohn. Influence Access 7 (2019): 21235-21245.
of random topology in artificial neural 19. Cui, Zhihua, Fei Xue, Xingjuan Cai,
networks: A survey. ICT Express, in Yang Cao, Gai-ge Wang, and Jinjun
press, corrected proof, Available online Chen. "Detection of malicious code
14 February 2020. variants based on deep learning." IEEE
12. Long Jin, Shuai Li, Bin Hu, and Mei Liu. Transactions on Industrial Informatics
A survey on projection neural networks 14, no. 7 (2018): 3187-3196.
and their applications. Applied Soft 20. Li, Jin, Lichao Sun, Qiben Yan, Zhiqiang
Computing, Volume 76, March 2019, pp. Li, Witawas Srisa-an, and Heng Ye.
533-544. "Significant permission identification for
13. Martin T. Hagan. Neural network design. machine-learning-based android
1996-2017. malware detection." IEEE Transactions
14. Geert Litjens, Francesco Ciompi, Jelmer on Industrial Informatics 14, no. 7
M. Wolterink, Bob D. de Vos, and Ivana (2018): 3216-3225.
Išgum. State-of-the-Art Deep Learning in 21. Azmoodeh, Amin, Ali Dehghantanha,
Cardiovascular Image Analysis. JACC: and Kim-Kwang Raymond Choo.
Cardiovascular Imaging, Volume 12, "Robust malware detection for internet of
Issue 8, Part 1, August 2019, pp. 1549- (battlefield) things devices using deep
1565. eigenspace learning." IEEE Transactions
15. Nikhil Buduma. Fundamentals of Deep on Sustainable Computing 4, no. 1
Learning, Designing Next-Generation (2018): 88-95.
Machine Intelligence Algorithms. 298 22. Kim, TaeGuen, BooJoong Kang, Mina
Pages, 2017. Rho, Sakir Sezer, and Eul Gyu Im. "A
16. Philip G. Brodrick, Andrew B. Davies, multimodal deep learning method for
and Gregory P. Asner. Uncovering Android malware detection using various
Ecological Patterns with Convolutional features." IEEE Transactions on
Neural Networks. Trends in Ecology & Information Forensics and Security 14,
Evolution, Volume 34, Issue 8August no. 3 (2018): 773-788.
2019, Pages 734-745. 23. Daniel Gibert, Carles Mateu, and Jordi
17. Arshad, Saba, Munam A. Shah, Abdul Planes. The rise of machine learning for
Wahid, Amjad Mehmood, Houbing detection and classification of malware:
Song, and Hongnian Yu. "Samadroid: a Research developments, trends and
novel 3-level hybrid malware detection challenges. Journal of Network and
model for android operating system." Computer Applications, Volume 1531
IEEE Access 6 (2018): 4321-4339. March 2020.
18. Ma, Zhuo, Haoran Ge, Yang Liu, Meng 24. Ali Feizollah, Nor Badrul Anuar, Rosli
Zhao, and Jianfeng Ma. "A Combination Salleh, and Ainuddin Wahid Abdul
227

Wahab. A review on feature selection in


mobile malware detection. Digital
Investigation, Volume 13, June 2015,
Pages 22-37.
25. Zhongru Ren, Haomin Wu, Qian Ning,
Iftikhar Hussain, and Bingcai Chen. End-
to-end malware detection for android IoT
devices using deep learning. Ad Hoc
Networks, Volume 101, 15 April 2020.
26. Xinbo Liu, Yaping Lin, He Li, and Jiliang
Zhang. A novel method for malware
detection on ML-based visualization
technique. Computers & Security,
Volume 8, 9 February 2020.
27. Manel Jerbi, Zaineb Chelly Dagdia, Slim
Bechikh, and Lamjed Ben Said. On the
use of artificial malicious patterns for
android malware detection. Computers &
Security, Volume 9, 2 May 2020.
28. Rahim Taheri, Meysam Ghahramani,
Reza Javidan, Mohammad Shojafar,
Mauro Conti. Similarity-based Android
malware detection using Hamming
distance of static binary features. Future
Generation Computer Systems, Volume
10, 5 April 2020, Pages 230-247.
29. ElMouatez Billah Karbab, Mourad
Debbabi, Abdelouahid Derhab, and
Djedjiga Mouheb. MalDozer: Automatic
framework for android malware
detection using deep learning. Digital
Investigation, Volume 24, Supplement,
March 2018, Pages s48-s59.

You might also like