Obaid Jamil (220311)

Master of Science in Cybersecurity


Dr. Muhammad Imran

Assistant Professor




Table of Contents:

1. Title
2. Objective
3. Brief Description
4. Literature Review
   4.1. Table
5. Justification for selecting Topic
6. Problem Statement
7. Scope of Project
8. References



Title:

Image-Based Malware Detection Using Long Short-Term Memory (LSTM)


Objective:

The main goal of this research study is to provide an effective and efficient approach for
identifying malware using Long Short-Term Memory (LSTM) neural networks and image-based
analysis. The specific objectives are:

• To protect individuals' systems from the latest malware.
• To introduce a new way to detect hidden malware.
• To analyze the metadata of shared images to uncover malicious code.

Brief Description:

This research proposes a novel approach to malware detection that converts binary files into
images and analyzes them using LSTM neural networks. By leveraging advanced visualization
techniques, the proposed method aims to enhance the detection and classification of malware,
ultimately contributing to more effective cybersecurity measures. The research will examine the
shortcomings of current malware image generation techniques, develop a novel method using
state-of-the-art methodologies, assess its effectiveness in identifying and categorizing malware,
and optimize it for efficiency and scalability.
By addressing the shortcomings of existing techniques, this research has the potential to advance
the state of the art in dynamic image-based malware identification and to give cybersecurity
professionals more reliable and effective tools. The findings will advance academic knowledge of
malware detection and will also have implications for building stronger cybersecurity defenses
against increasingly complex malware attacks.
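
The conversion of a binary file into an image can be sketched as follows. This is a minimal
illustration under stated assumptions, not the method this proposal will develop: the function
name `bytes_to_grayscale`, the fixed row width of 16, and the zero padding of the final row are
choices made only for the example, and the sample bytes stand in for the contents of a real file.

```python
import math
from typing import List


def bytes_to_grayscale(data: bytes, width: int = 16) -> List[List[int]]:
    """Map each byte (0-255) to one pixel intensity and wrap rows at `width`."""
    if width <= 0:
        raise ValueError("width must be positive")
    height = math.ceil(len(data) / width)
    # Zero-pad so the last row has the same length as the others (an assumption).
    padded = data + bytes(height * width - len(data))
    return [list(padded[r * width:(r + 1) * width]) for r in range(height)]


# Stand-in for the raw bytes of an executable; a real file would be read with open(path, "rb").
sample = bytes(range(40))
image = bytes_to_grayscale(sample, width=16)
print(len(image), "rows of", len(image[0]), "pixels")
```

Each row of the resulting grid can later be treated as one step of a sequence, which is the form
of input that a recurrent model such as an LSTM consumes.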


Literature Review:
Previous research in dynamic malware image detection has focused on converting malware
binaries into grayscale images, using techniques such as pixel intensity mapping and entropy-
based visualization.
Nataraj, L., et al. (2011) [1] proposed a method for converting malware binaries into grayscale
images using pixel intensity mapping and entropy-based visualization, with the aim of making the
structure and behavior of malware easier to identify and categorize. Their comparative
assessment against other methods showed promise in detecting and classifying malware. However,
the study also identified shortcomings in terms of precision and effectiveness, highlighting the
need for additional investigation and advancement.
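
To make the idea of entropy-based visualization concrete, the sketch below computes the Shannon
entropy of consecutive byte blocks, which could then be rendered as pixel intensities. It is a
generic illustration and is not taken from the cited paper; the names `shannon_entropy` and
`entropy_profile` and the block size of 256 bytes are assumptions made for the example.

```python
import math
from collections import Counter
from typing import List


def shannon_entropy(block: bytes) -> float:
    """Shannon entropy in bits per byte: 0.0 for constant data, 8.0 at most."""
    if not block:
        return 0.0
    n = len(block)
    counts = Counter(block)
    # sum of p * log2(1/p), written with counts to avoid a negative zero
    return sum((c / n) * math.log2(n / c) for c in counts.values())


def entropy_profile(data: bytes, block_size: int = 256) -> List[float]:
    """Entropy of each consecutive block; high values often indicate packed or encrypted regions."""
    return [shannon_entropy(data[i:i + block_size])
            for i in range(0, len(data), block_size)]


# Toy input: one block of zeros followed by one block containing every byte value once.
toy = bytes(256) + bytes(range(256))
profile = entropy_profile(toy)
print(profile)
```

Scaling each entropy value from the range 0 to 8 into the range 0 to 255 would give one possible
grayscale rendering of the profile.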
Vinod, P., et al. (2009) [2] conducted a survey of malware detection methods, including static
and dynamic analysis techniques. The authors outlined the advantages and disadvantages of the
various strategies and emphasized the need for more effective detection and classification
techniques. The survey shed light on the state of malware detection research at the time and on
the difficulties researchers face in creating more precise and effective methods.
Raff, E., et al. (2017) [3] introduced a method for malware detection that analyzes the entire
executable file. Using deep learning, the authors were able to classify malware based on the
structure of the binary file. Their method showed high detection and classification accuracy,
but it also made clear the need for further developments in deep learning methodologies to
advance malware detection capabilities.
Pascanu, R., et al. (2015) [4] explored malware classification using recurrent neural networks
(RNNs). The authors demonstrated how machine learning techniques, RNNs in particular, can be
used to identify and categorize malware. Their study reported high classification accuracy, but
it also highlighted the importance of further research into machine learning algorithms for
better malware detection.
Vasan, Danish, et al. (2020) [5] proposed a new image-based malware classification technique
that uses an ensemble of Convolutional Neural Network (CNN) architectures. By training several
CNN models with different architectures, they were able to extract visual features from malware
images. On a dataset containing several malware families, the authors attained a high
classification accuracy of 98.9% by combining the predictions of these models. The results
showed that the ensemble CNN method was effective at correctly identifying and categorizing
malware based on images.
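
One common way to combine the predictions of several models is soft voting, that is, averaging
the class probabilities that each model outputs. The sketch below illustrates that idea only;
whether the cited work fuses its models in this way is not stated here, and the function name
`soft_vote` and the probability values are made up for the example.

```python
from typing import List, Tuple


def soft_vote(model_probs: List[List[float]]) -> Tuple[int, List[float]]:
    """Average per-class probabilities across models and return (best class, averages)."""
    if not model_probs:
        raise ValueError("at least one model output is required")
    n_models = len(model_probs)
    n_classes = len(model_probs[0])
    averaged = [sum(probs[k] for probs in model_probs) / n_models
                for k in range(n_classes)]
    best = max(range(n_classes), key=averaged.__getitem__)
    return best, averaged


# Hypothetical outputs of three classifiers for one sample over three malware families.
outputs = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.7, 0.2],
]
label, averaged = soft_vote(outputs)
print("predicted family index:", label)
```

In this toy case the first model prefers family 0, but the averaged probabilities favor family 1,
which shows how an ensemble can overrule a single model.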
Kim, Hae-Jung (2018) [6] proposed an image-based malware classification approach based on
CNN. The proposed method used transfer learning with the VGG16 architecture. Transfer learning
reuses a model that has already been trained on one task and fine-tunes it for a new one. In this
instance, the pre-trained VGG16 model was adapted to categorize malware based on images. The
method showed encouraging results, classifying malware with high accuracy through image
analysis, and the study demonstrated the utility of CNNs and transfer learning for malware
categorization.
Huang, Xiang, et al. (2021) [7] proposed a new deep learning-based method for Windows
malware detection. The proposed method used a stacked autoencoder with a softmax classifier.
A stacked autoencoder is an unsupervised neural network that learns to reconstruct its input
data, while the softmax classifier is responsible for assigning the data to a category. The
proposed method detected Windows malware with considerable success, and the research
emphasized the effectiveness of applying deep learning techniques to malware detection on the
Windows operating system.
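
The role of the softmax classifier can be illustrated with a short sketch: it turns the raw
scores produced by the last layer into probabilities that sum to one, and the class with the
highest probability is taken as the prediction. The scores below are invented for the example
and do not come from the cited work.

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax: subtract the maximum before exponentiating."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw scores for three classes.
class_scores = [2.0, 0.5, -1.0]
probabilities = softmax(class_scores)
predicted_class = max(range(len(probabilities)), key=probabilities.__getitem__)
print(predicted_class, [round(p, 3) for p in probabilities])
```

Subtracting the maximum score does not change the result but prevents overflow when the scores
are large.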
Kancherla, Kesav, and Srinivas Mukkamala (2013) [8] proposed a technique for image
visualization-based malware detection. The proposed approach used dimensionality reduction
techniques to visualize the binary files. They sought to simplify the analysis and detection
process by converting the binary information into visual representations, such as images.
Features were then extracted from these visual representations to identify the traits most
strongly associated with the presence of malware. The method aimed to improve malware
detection by making use of binary file visualization.
Awan et al. (2021) [9] proposed an image-based malware classification technique using the
VGG19 network and spatial convolutional attention. They used a dataset with samples from 10
different malware families to train their algorithm. They sought to precisely categorize malware
based on visual patterns by utilizing deep learning techniques and concentrating on spatial
aspects. This image-based classification algorithm provided a fresh method for analyzing and
identifying malware.
Nguyen, Huy, et al. (2023) [10] examined the use of generative adversarial networks (GANs) for
image-based malware classification. A GAN is made up of two neural networks, a generator and a
discriminator, that are trained against each other so that the generator learns to produce
realistic artificial images of malware. The paper describes the GAN model design and training
procedure in depth and assesses the performance of the model using a dataset of malware images.
The results highlight the potential of GANs for enhancing malware categorization. The study
emphasizes the significance of advanced machine learning methods in cybersecurity applications
and makes recommendations for future research areas.
Li, C., Zhu, Y., Wu, Z., and Zhang, L. (2021) [11] presented a technique for malware detection
using convolutional neural networks (CNNs) with image-based analysis. By working on visual
representations of malware samples, their method aims to improve the precision and
effectiveness of malware detection. The method showed improved effectiveness in recognizing
and classifying malware by analyzing visual features, and it contributed to cybersecurity
research by combining image analysis with machine learning.
Liu et al. (2019) [12] developed a malware detection system that combines deep learning with
traditional machine learning techniques. Their strategy aimed to increase the precision and
effectiveness of malware detection by combining the advantages of both families of methods.
This fusion of approaches offered a comprehensive way to identify and mitigate malware threats,
advancing the field of malware detection research.
Guo, W., Guo, M., and Liu, Y. (2019) [13] introduced a malware detection system that combines
image analysis with machine learning techniques. Like the approach in [11], it relies on visual
representations of malware samples to improve the precision and effectiveness of malware
detection. The method showed improved effectiveness in recognizing and classifying malware by
analyzing visual features with machine learning algorithms.
Zheng et al. (2021) [14] proposed an image malware detection method based on the YOLOv3
object detection algorithm. Their method achieved a detection accuracy of 91.3%. The research
demonstrated the potential of applying object detection models to detect and categorize
malware from images, and its encouraging results add to the evidence that machine learning is
effective for malware analysis and classification.


Overview of Previous Work:

| No. | Paper | Problem Addressed | Methodology | Datasets Used | Limitations |
|-----|-------|-------------------|-------------|---------------|-------------|
| 1 | Nataraj et al. (2011) | Malware classification using binary texture analysis and dynamic analysis. | Binary texture analysis, dynamic analysis, and machine learning. | Malware samples from VX Heavens. | Focus on binary texture analysis. |
| 2 | Vinod et al. (2009) | Survey on malware detection methods. | Literature review of malware detection. | N/A | No experimental results, only a survey. |
| 3 | Raff et al. (2017) | Malware detection by analyzing entire executable files. | Deep learning model (1D CNN). | Microsoft Malware Classification Challenge. | Limited to Windows PE files. |
| 4 | Pascanu et al. (2015) | Malware classification using recurrent networks. | Recurrent neural networks (RNNs). | Microsoft Malware Classification Challenge. | Limited to Windows PE files. |
| 5 | Vasan et al. (2020) | Image-based malware classification using an ensemble of CNN architectures. | Ensemble of CNN architectures. | Malware samples from VirusShare and VirusTotal. | Focus on image-based analysis. |
| 6 | Kim (2018) | Image-based malware classification using convolutional neural network. | Convolutional neural network (CNN). | Malware samples from VirusTotal. | Limited dataset size, focus on image-based analysis. |
| 7 | Huang et al. (2021) | Windows malware detection based on deep learning. | Deep learning model (1D CNN). | Malware samples from VirusTotal. | Limited to Windows malware, focus on deep learning. |
| 8 | Kancherla & Mukkamala (2013) | Image visualization-based malware detection. | Image visualization and machine learning algorithms. | Malware samples from VX Heavens. | Limited dataset size, focus on image visualization. |
| 9 | Awan et al. (2021) | Image-based malware classification using VGG19 network and spatial attention. | VGG19 network with spatial convolutional attention. | Malware samples from VirusTotal. | Limited dataset size, focus on image-based analysis. |
| 10 | Nguyen et al. (2023) | Generative adversarial networks and image-based malware classification. | Generative adversarial networks (GANs) and machine learning. | Malware samples from VirusTotal. | Limited dataset size, focus on image-based analysis and GANs. |
| 11 | Li et al. (2021) | Image-based malware detection using convolutional neural networks. | Convolutional neural network (CNN). | Malware samples from VirusTotal. | Limited dataset size, focus on image-based analysis. |
| 12 | Liu et al. (2019) | Deep learning and traditional machine learning-based malware detection system. | Deep learning model (1D CNN) and traditional machine learning algorithms. | Malware samples from VirusTotal. | Limited dataset size, focus on deep learning and traditional machine learning algorithms. |
| 13 | Guo et al. (2019) | Malware detection using image analysis and machine learning techniques. | Image analysis and machine learning algorithms. | Malware samples from VirusTotal. | Limited dataset size, focus on image analysis and machine learning. |
| 14 | Zheng et al. (2021) | Image malware detection based on YOLOv3. | YOLOv3 object detection algorithm. | Malware samples from VirusTotal. | Limited dataset size, focus on YOLOv3 object detection algorithm. |


Justification for selecting a topic:

With the increasing prevalence of sophisticated malware attacks, there is a pressing need for
more effective detection and classification methods. Cyber-attacks continue to increase, yet
there is no established way to protect systems automatically, without relying on action from
the user. Image-based analysis and LSTM neural networks offer a promising approach to address
these challenges and improve malware detection capabilities. This research has the potential to
significantly improve the state of the art in dynamic image malware detection, ultimately
contributing to more robust cybersecurity defenses.

Problem Statement:

The rapid evolution of malware is challenging traditional detection methods, making it
crucial to develop innovative approaches for identifying and classifying malicious software.
This research intends to develop an image-based malware detection system that translates the
binary representation of malware into visual patterns and uses LSTM networks, with their
capacity to recognize temporal correlations and learn complex patterns, to analyze and
categorize it accurately.
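
How an LSTM could read a malware image is sketched below: each image row is treated as one time
step, and the final hidden state summarizes the whole image. This is a minimal forward pass with
small random, untrained weights, written in plain Python for illustration. The row-as-time-step
layout, the hidden size of 4, and all function names are assumptions made for the example; a real
system would use a deep learning library, train the weights, and add a classification layer.

```python
import math
import random
from typing import Dict, List, Tuple

Vector = List[float]
Matrix = List[List[float]]
Gate = Tuple[Matrix, Matrix, Vector]  # input weights W, recurrent weights U, bias b


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def init_params(input_size: int, hidden_size: int, seed: int = 0) -> Dict[str, Gate]:
    """Small random (untrained) weights for the gates i, f, o and the candidate g."""
    rng = random.Random(seed)

    def matrix(rows: int, cols: int) -> Matrix:
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {name: (matrix(hidden_size, input_size),
                   matrix(hidden_size, hidden_size),
                   [0.0] * hidden_size)
            for name in "ifog"}


def pre_activation(gate: Gate, x: Vector, h: Vector) -> Vector:
    """Compute W x + U h + b for one gate."""
    W, U, b = gate
    return [sum(w * xi for w, xi in zip(W[j], x))
            + sum(u * hi for u, hi in zip(U[j], h)) + b[j]
            for j in range(len(b))]


def lstm_step(x: Vector, h: Vector, c: Vector,
              params: Dict[str, Gate]) -> Tuple[Vector, Vector]:
    """One LSTM time step: returns the new hidden state and cell state."""
    i = [sigmoid(z) for z in pre_activation(params["i"], x, h)]    # input gate
    f = [sigmoid(z) for z in pre_activation(params["f"], x, h)]    # forget gate
    o = [sigmoid(z) for z in pre_activation(params["o"], x, h)]    # output gate
    g = [math.tanh(z) for z in pre_activation(params["g"], x, h)]  # candidate values
    c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
    h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
    return h_new, c_new


def encode_image(rows: List[List[int]], hidden_size: int = 4, seed: int = 0) -> Vector:
    """Feed image rows to the LSTM one at a time and return the final hidden state."""
    params = init_params(len(rows[0]), hidden_size, seed)
    h = [0.0] * hidden_size
    c = [0.0] * hidden_size
    for row in rows:
        x = [pixel / 255.0 for pixel in row]  # scale intensities to [0, 1]
        h, c = lstm_step(x, h, c, params)
    return h


# Toy 8x16 "image"; a real input would come from the bytes of a binary file.
toy_image = [[(r * 16 + col) % 256 for col in range(16)] for r in range(8)]
summary = encode_image(toy_image)
print(summary)
```

In a trained system, the final hidden state would be passed to a classifier that outputs the
malware family or a benign/malicious decision.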

Scope of Project:

This project will focus on the development of a new method for generating malware images, as
well as the evaluation of its effectiveness in detecting and classifying malware. The research will
involve the following steps:
1. A review of the drawbacks of the malware image generation techniques currently in use.
2. Development of a novel method for generating accurate malware images using advanced
visualization techniques.
3. An evaluation of the proposed method's performance in identifying and categorizing
malware compared with other approaches.
4. Optimization of the proposed approach for increased efficiency and scalability.
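
The evaluation in step 3 could rely on standard classification metrics. The sketch below
computes accuracy, precision, recall, and F1 score for a binary malware/benign decision. The
choice of these metrics, the function name `binary_metrics`, and the labels are assumptions made
for illustration; they are not results of this research.

```python
from typing import Dict, List


def binary_metrics(y_true: List[int], y_pred: List[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1, with 1 = malware and 0 = benign."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("label lists must be non-empty and of equal length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Made-up ground truth and predictions for eight samples.
truth = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = binary_metrics(truth, predicted)
print(metrics)
```

Reporting precision and recall alongside accuracy matters for malware detection because the
costs of a missed sample and of a false alarm are usually different.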



References:

[1]. Nataraj, Lakshmanan, et al. "A comparative assessment of malware classification using
binary texture analysis and dynamic analysis." Proceedings of the 4th ACM Workshop on
Security and Artificial Intelligence. 2011.
[2]. Vinod, P., et al. "Survey on malware detection methods." Proceedings of the 3rd Hackers'
Workshop on Computer and Internet Security (IITKHACK'09). 2009.
[3]. Raff, Edward, et al. "Malware detection by eating a whole exe." arXiv preprint
arXiv:1710.09435 (2017).
[4]. Pascanu, Razvan, et al. "Malware classification with recurrent networks." 2015 IEEE
International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2015.
[5]. Vasan, Danish, et al. "Image-Based malware classification using an ensemble of CNN
architectures (IMCEC)." Computers & Security 92 (2020).
[6]. Kim, Hae-Jung. "Image-based malware classification using convolutional neural
network." Advances in Computer Science and Ubiquitous Computing: CSA-CUTE 17. Springer
Singapore, 2018.
[7]. Huang, Xiang, et al. "A method for Windows malware detection based on deep
learning." Journal of Signal Processing Systems 93 (2021).
[8]. Kancherla, Kesav, and Srinivas Mukkamala. "Image visualization based malware
detection." 2013 IEEE Symposium on Computational Intelligence in Cyber Security (CICS).
IEEE, 2013.
[9]. Awan, Mazhar Javed, et al. "Image-based malware classification using VGG19 network and
spatial convolutional attention." Electronics 10.19 (2021).
[10]. Nguyen, Huy, et al. "Generative adversarial networks and image-based malware
classification." Journal of Computer Virology and Hacking Techniques (2023).
[11]. Li, C., Zhu, Y., Wu, Z., and Zhang, L. "Image-based malware detection using
convolutional neural networks." International Journal of Machine Learning and Cybernetics
12.1 (2021): 207-218.
[12]. Liu, Y., Liu, J., Li, H., and Liu, J. "A deep learning and traditional machine learning
based malware detection system." Journal of Ambient Intelligence and Humanized Computing
(2019).
[13]. Guo, W., Guo, M., and Liu, Y. "Malware detection using image analysis and machine
learning techniques." International Journal of Machine Learning and Cybernetics (2019).
[14]. Zheng, Y., Liao, X., Tang, Y., and Zhu, H. "Image malware detection based on
YOLOv3." Proceedings of the 2021 IEEE International Conference on Artificial Intelligence
and Computer Applications. 2021.



