
In the paper titled "Android Source Code Vulnerability Detection: A Systematic Literature
Review," authored by Janaka Senanayake (Robert Gordon University, UK and University of
Kelaniya, Sri Lanka), Harsha Kalutarage (Robert Gordon University, UK), Mhd Omar Al-Kadri
(Birmingham City University, UK), Andrei Petrovski (Robert Gordon University, UK), and Luca
Piras (Middlesex University, UK), the authors address the escalating use of mobile devices in
today's technological landscape. They highlight the prevalent issue of inadequate security
measures in many Android applications, attributing this shortcoming to the absence of
automated mechanisms for identifying and rectifying source code vulnerabilities during the early
stages of design and development. The systematic literature review (SLR) critically evaluates
118 selected technical studies published between 2016 and 2022, shedding light on both
conventional and Machine Learning (ML)-based methods for vulnerability detection. With a
particular emphasis on ML-based approaches, the article aims to equip researchers with
comprehensive insights into secure mobile application development, serving as a foundation for
identifying potential future research and development directions in the field.

The authors explore the increasing reliance on Android smartphones, projecting 4.3 billion users
by 2023. They highlight the lack of proper security mechanisms in Android app development,
exacerbated by rapid life cycles and limited validation on platforms like Google Play.

Acknowledging previous studies proposing vulnerability detection methods, including machine
learning (ML) and deep learning, the authors underscore the need for developers to grasp these
techniques for enhanced software security. They identify limitations in prior literature reviews,
such as narrow scopes and insufficient analysis of ML-based methods. This systematic
literature review addresses these gaps by critically evaluating 118 selected studies, offering
insights into source code vulnerability detection methods and guiding future research in Android
source code security.

The paper then states its objective, which is to address three key
research questions pertaining to source code vulnerability detection methods in Android
applications. The first question, "RQ1: What are the existing methods for source code and
application analysis?" delves into the realm of source code and application analysis methods,
encompassing various approaches such as application reverse-engineering and byte-code-
based analysis. Static analysis techniques take precedence, supplemented by dynamic and
hybrid analysis methods, with detailed discussions provided in Section 4 of the review. Moving
on to the second research question, "RQ2: What are existing Android source code vulnerability
detection methods, and how to use them to prevent vulnerabilities?" the SLR explores existing
Android source code vulnerability detection methods and their applications in preventing
vulnerabilities. The study notes the prevalence of both machine learning (ML) and conventional
methods in detecting vulnerabilities, highlighting the growing popularity of ML approaches in
recent years. The importance of integrating these detection techniques into software
development environments for preventing security issues is underscored, with Section 5 offering
an in-depth discussion on these aspects. The third research question, "RQ3: Which tools and
repositories can be used to detect vulnerabilities in Android apps?" focuses on tools and
repositories for detecting vulnerabilities in Android apps. The SLR identifies and explores
various tools, repositories, and datasets crucial for source code analysis and vulnerability
detection. Discussions in Section 6 delve into the characteristics and usage of these resources,
providing valuable insights to facilitate new research studies.

The organizational structure of the SLR, depicted in Figure 1, outlines three main sections
aligning with the research questions. Sections 2 and 3 cover background, related literature, and
the review methodology. The review then unfolds with experimental studies categorized into
application analysis (Section 4), code vulnerability detection (Section 5), and supportive tools
and repositories (Section 6). The review addresses threats to validity in Section 7, and Section 8
concludes the article.

The background and literature review section offers a thorough examination of security concerns in
Android applications. It addresses the Android layered architecture, security implications,
common vulnerabilities, and user and developer mistakes leading to security issues. The
section also introduces the machine learning (ML) process, crucial for understanding ML-based
vulnerability detection mechanisms. It emphasizes the significance of the layered architecture,
Android platform security rules, and proper security measures. The discussion on Android
application vulnerabilities covers SSL/TLS protocol issues, permissions, web views, and various
vulnerability types. The section highlights user mistakes in permission granting and developer
errors, underscoring the need for rigorous testing and validation in the app development
lifecycle. The related literature reviews (References [1, 2, 38, 50, 72, 80, 86, 127, 132]) are
summarized, noting their focus areas, methodologies, and limitations, with a call for a more
comprehensive review of recent studies specifically addressing Android source code
vulnerability detection and prevention mechanisms.

The methodology employed the Preferred Reporting Items for Systematic Reviews and Meta-
Analysis (PRISMA) model to conduct a systematic literature review (SLR) on Android source
code vulnerability detection. The study defined a search strategy, inclusion/exclusion criteria,
and database usage based on formulated research questions. The search string encompassed
terms related to vulnerability detection, source code vulnerabilities, and Android, with a focus on
machine learning (ML) methods. The review covered technical studies from 2016 to June 2022,
utilizing reputable repositories such as ACM Digital Library, IEEEXplore, Science Direct, Web of
Science, and Springer Link. Additionally, Google Scholar was employed to identify studies not
present in primary repositories. The search results were distributed across various primary
sources, providing a comprehensive dataset for analysis. The process aimed to address the
research gap in recent studies related to Android source code vulnerability detection and
prevention mechanisms.

In the study selection process, the researchers initially identified 2,526 research papers from top
research repositories and an additional 1,400 from Google Scholar. After excluding 3,112
papers due to duplications and 127 because of unavailability, 687 studies remained. To refine
the selection, a manual screening was conducted by thoroughly examining the title and abstract
of each paper to ensure alignment with the review's focus, resulting in 119 eligible studies.
Three articles were further excluded, leaving a total of 116 studies. The researchers employed
the snowballing process, examining references in retrieved papers and those citing them,
leading to the inclusion of 2 more relevant papers. The final step involved cross-validating the
results through a peer-verification process by all the authors, resulting in a total of 118 articles
being reviewed. The methodology followed the PRISMA model, as illustrated in Figure 2. The
study selection process demonstrates a rigorous and systematic approach to identify relevant
literature for the systematic review. The inclusion of a large initial pool of papers from reputable
repositories and Google Scholar, followed by careful manual screening, reflects the researchers'
commitment to ensuring the quality and relevance of the selected studies. The use of the
snowballing process further strengthens the review by capturing additional relevant papers
through references and citations. The cross-validation step involving all authors adds another
layer of credibility to the paper selection process. Overall, the methodology employed aligns
with best practices for systematic reviews, enhancing the robustness and reliability of the
study's findings.
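
The selection counts reported above can be checked with simple arithmetic. The short sketch below only restates the figures given in this summary; the variable names are illustrative and are not taken from the paper.

```python
# Counts as reported in the study selection description above.
identified_repositories = 2526    # papers from the primary repositories
identified_google_scholar = 1400  # additional papers from Google Scholar
excluded_duplicates = 3112
excluded_unavailable = 127
eligible_after_screening = 119    # after title and abstract screening
excluded_after_eligibility = 3
added_by_snowballing = 2

identified = identified_repositories + identified_google_scholar
after_exclusion = identified - excluded_duplicates - excluded_unavailable
final_total = (eligible_after_screening
               - excluded_after_eligibility
               + added_by_snowballing)

print(identified, after_exclusion, final_total)  # prints: 3926 687 118
```

The intermediate value of 687 and the final value of 118 match the numbers stated in the text.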

In the application analysis section, two main approaches for vulnerability detection are
discussed: reverse-engineering source code of Android Application Packages (APKs) and
simultaneous source code analysis during development. Feature extraction is crucial, achieved
through static, dynamic, or hybrid analysis techniques. Static analysis, applicable to both Java
and Kotlin, examines XML files and source code without execution. Manifest analysis extracts
package details, permissions, and other features from AndroidManifest.xml. Code analysis
delves into API calls, information flow, taint tracking, and other aspects. Dynamic analysis
involves executing the application in a sandbox environment to identify vulnerabilities and
malware. Feature extraction methods include network traffic analysis, code instrumentation,
system call analysis, system resources analysis, and user interaction analysis. Hybrid analysis
combines static and dynamic features, providing a comprehensive understanding of
applications. Various studies utilize these methods, including one that identifies Android security
vulnerabilities by analyzing metadata, data flow, API hooks, and executable scripts using hybrid
analysis techniques. This approach achieved a high level of accuracy, processing within 93
seconds on average. Additionally, SSL/TLS issues are addressed through hybrid analysis in
frameworks like DCDroid, identifying security issues related to SSL certificates/TLS in
applications.
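
As a rough illustration of the manifest analysis described above, the following sketch pulls a few features out of a textual AndroidManifest.xml with Python's standard library. It assumes the manifest has already been decoded to plain XML (manifests inside an APK are stored in a binary format), and the sample manifest, the function name, and the chosen feature set are hypothetical rather than taken from any surveyed study.

```python
import xml.etree.ElementTree as ET

# Namespace used for android:* attributes in manifest files.
ANDROID_NS = "http://schemas.android.com/apk/res/android"

# Hypothetical decoded manifest used only for this example.
SAMPLE_MANIFEST = """<manifest
    xmlns:android="http://schemas.android.com/apk/res/android"
    package="com.example.demo">
    <uses-permission android:name="android.permission.INTERNET" />
    <uses-permission android:name="android.permission.READ_CONTACTS" />
    <application android:debuggable="true">
        <activity android:name=".MainActivity" android:exported="true" />
        <service android:name=".SyncService" />
    </application>
</manifest>
"""


def android_attr(name: str) -> str:
    """Return the fully qualified name of an android:* attribute."""
    return f"{{{ANDROID_NS}}}{name}"


def extract_manifest_features(manifest_xml: str) -> dict:
    """Collect the package name, permissions, and exported components."""
    root = ET.fromstring(manifest_xml)
    permissions = sorted(
        element.get(android_attr("name"))
        for element in root.iter("uses-permission")
        if element.get(android_attr("name"))
    )
    exported = []
    debuggable = False
    application = root.find("application")
    if application is not None:
        debuggable = application.get(android_attr("debuggable")) == "true"
        for tag in ("activity", "service", "receiver", "provider"):
            for component in application.iter(tag):
                # Only components that explicitly set exported="true" count.
                if component.get(android_attr("exported")) == "true":
                    exported.append(component.get(android_attr("name")))
    return {
        "package": root.get("package"),
        "permissions": permissions,
        "exported_components": exported,
        "debuggable": debuggable,
    }


features = extract_manifest_features(SAMPLE_MANIFEST)
print(features)
```

Features like these are static: they are read from the file without running the application, which is what distinguishes them from the sandbox-based dynamic features mentioned above.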

In the section on code vulnerability detection, the paper delves into methods for identifying
vulnerabilities in Android applications stemming from source code issues. The exploration
encompasses machine learning (ML), deep learning (DL), heuristic-based methods, and formal
methods, employing static, dynamic, and hybrid analysis techniques. In the realm of machine
learning methods, the application of static analysis involves techniques such as utilizing ML
algorithms—NB, LR, DT, RF, GB, LSTM, RNN, and MLP—with features identified through
mechanisms like Abstract Syntax Tree (AST). Noteworthy studies, including WaffleDetector and
Vulvet, leverage static analysis to discern malicious code and vulnerabilities, achieving
commendable precision. Furthermore, data flow analysis is employed in one study to extract
topic-specific data flow signatures, providing insights into characterizing malicious Android apps.
Several models apply ML algorithms such as RF, GB, DT, and CNN to classify code in source
files, attaining high accuracy in the process. Dynamic analysis techniques in tandem with ML
models like NB, K-Star, RF, DT, and Simple Logistic are instrumental in detecting vulnerabilities
during execution. Another facet involves applying deep learning methods like CNN, LSTM, and
CNN-LSTM in dynamic analysis, demonstrating notable accuracy in vulnerability detection.
Notable studies propose ML-based vulnerability detection methods utilizing dynamic analysis
and address challenges associated with shared permissions between malicious and benign
apps. Hybrid analysis, combining static and dynamic features, is prevalent for enhancing
vulnerability detection. Studies, exemplified by one referenced as [169], propose ML-based
vulnerability detection leveraging hybrid analysis techniques, achieving an average accuracy of
77%. In the domain of deep learning methods, studies explore techniques such as CNN, LSTM,
and DNN to predict vulnerable source code and classify vulnerable classes with high precision
and accuracy. The application of dynamic analysis using DL, exemplified by Genetic Algorithm
LSTM, yields promising results, detecting anomalies in system calls with an F-Score of over
85%. In a different study, the application of CNN-LSTM achieves a detection accuracy of 83.6%.
Models employing hybrid analysis with ML models, such as one referenced as [48], propose a
parallel-classifier scheme for Android vulnerability detection, achieving high accuracy by
combining static and dynamic features. Another study utilizing hybrid analysis with ML models
achieves 80% accuracy in static analysis and 60% in dynamic analysis. The integration of both
methods enhances the overall accuracy of vulnerability detection. Despite the multitude of
proposed ML/DL-based methods, some fall short in their ability to detect code vulnerabilities
during app development. A comprehensive overview, presented in Table 3, summarizes various
models, outlining their methodology, analysis technique, employed ML/DL methods, tools,
datasets, and overall accuracy.
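
To make the idea of static features for an ML model more concrete, the sketch below turns a Java snippet into a fixed-length vector of API call counts. It is a deliberate simplification: the surveyed studies typically derive features from an AST, whereas this example uses a regular expression as a stand-in, and the vocabulary, the sample code, and the function name are assumptions chosen for illustration.

```python
import re
from collections import Counter

# Hypothetical vocabulary of Android API method names to count.
API_VOCABULARY = [
    "setJavaScriptEnabled",
    "addJavascriptInterface",
    "getDeviceId",
    "sendTextMessage",
    "openFileOutput",
]

# Matches method names invoked through a receiver, e.g. "obj.method(".
CALL_PATTERN = re.compile(r"\.\s*([A-Za-z_]\w*)\s*\(")


def api_call_vector(java_source: str, vocabulary=API_VOCABULARY) -> list:
    """Count how often each vocabulary method is called in the source."""
    counts = Counter(CALL_PATTERN.findall(java_source))
    return [counts[name] for name in vocabulary]


# Hypothetical source fragment used only for this example.
SAMPLE_SOURCE = """
WebView view = new WebView(context);
view.getSettings().setJavaScriptEnabled(true);
view.addJavascriptInterface(new Bridge(), "bridge");
String id = telephonyManager.getDeviceId();
"""

vector = api_call_vector(SAMPLE_SOURCE)
print(vector)  # prints: [1, 1, 1, 0, 0]
```

A vector of this shape, paired with a vulnerable or non-vulnerable label, is the kind of input that classifiers such as RF or DT would be trained on; the classifier itself is left out to keep the sketch dependency-free.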

The paper explores conventional methods for detecting code vulnerabilities in Android
applications, covering static, dynamic, and hybrid analysis techniques. In static analysis, formal
models, like Alloy-based approaches, and tools such as vulnerability parsers are used to identify
vulnerabilities in Android apps, addressing issues in permission protocols, third-party libraries,
and web view objects. The DroidRA model enhances static analysis by resolving targets of
reflective calls through constraint solving. Dynamic analysis, exemplified by VulArcher and
VScanner, focuses on detecting vulnerabilities in Android apps efficiently. VulArcher employs a
heuristic vulnerability search algorithm, while VScanner uses a scalable Lua script engine for
dynamic detection. Hybrid analysis, combining static and dynamic techniques, is also explored.
Studies use tools like AndroBugs, SandDroid, and Qark to detect common vulnerabilities in
Android apps. Another hybrid model employs fuzzy dynamic testing technology to enhance
mining accuracy. The summary in Table 4 provides an overview of these studies, highlighting
considered vulnerabilities, findings, limitations, datasets, tools, and methods used in
conventional models for vulnerability detection.
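
A conventional, rule-based static check can be sketched in a few lines. The example below is not the algorithm of VulArcher, DroidRA, or any other tool named above; it is a minimal pattern scanner under assumed rules, and the rule names and sample source are hypothetical.

```python
import re
from typing import NamedTuple


class Finding(NamedTuple):
    line: int  # 1-based line number in the scanned source
    rule: str  # identifier of the rule that matched


# Hypothetical rule set covering web view, TLS, and file permission issues.
RULES = {
    "webview-javascript-enabled": re.compile(
        r"setJavaScriptEnabled\s*\(\s*true\s*\)"
    ),
    "hostname-verification-disabled": re.compile(
        r"ALLOW_ALL_HOSTNAME_VERIFIER"
    ),
    "world-readable-file": re.compile(r"MODE_WORLD_READABLE"),
}


def scan_source(source: str) -> list:
    """Return one Finding for each rule match on each source line."""
    findings = []
    for number, text in enumerate(source.splitlines(), start=1):
        for rule, pattern in RULES.items():
            if pattern.search(text):
                findings.append(Finding(number, rule))
    return findings


# Hypothetical Java fragment used only for this example.
SAMPLE_SOURCE = (
    "settings.setJavaScriptEnabled(true);\n"
    'FileOutputStream out = openFileOutput("notes.txt", MODE_WORLD_READABLE);\n'
    "settings.setAllowFileAccess(false);\n"
)

findings = scan_source(SAMPLE_SOURCE)
for finding in findings:
    print(finding.line, finding.rule)
```

Pattern rules like these are cheap and easy to explain, but they miss anything that depends on data flow or reflection, which is the gap that models such as DroidRA and the hybrid approaches above aim to close.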

The prevention techniques for Android app code vulnerabilities underscore early intervention
through frameworks, tools, and plugins integrated into development environments.
Experimentation emphasizes automated detection support, with stitch-in-time mechanisms
identifying security issues during development. Android Lint, along with other linters, prioritizes
warnings through sentiment analysis, aiding developers in addressing vulnerabilities effectively.
Various tools, such as MagpieBridge, DevKnox, FixDroid, SOURCERER, and VuRLE, offer
solutions for vulnerability detection and resolution during development, with varying approaches
and success rates. In the discussion on vulnerability detection methods, the prevalence of static
analysis (51%), followed by hybrid analysis (35%) and dynamic analysis (14%), is observed.
The increasing adoption of machine learning-based methods surpasses conventional
approaches, reflecting their accuracy, problem-solving capabilities, and scalability. Feature
extraction methods, including code analysis, manifest analysis, and system call analysis,
highlight API calls as the most widely extracted feature. Few studies integrate prevention
mechanisms with detection tools, emphasizing the need to build effective prevention
mechanisms for Android source code vulnerability mitigation. This section provides a succinct
overview of prevention techniques, tools, and methods in Android app development,
emphasizing the importance of early intervention and prevention in addressing code
vulnerabilities.

The next section discusses supportive tools and repositories for Android app development,
emphasizing the significance of analysis tools, vulnerability detection, and datasets for ML
model training. A variety of tools and frameworks facilitate application and source code analysis
for vulnerability detection in Android. Several studies, such as References [157] and [118], have
compared tools based on characteristics like their nature (tool vs. framework), cost,
maintenance, type of analysis (static vs. dynamic), and local vs. remote execution. Examples of
tools include FlowDroid, DIALDroid, JAADAS, DevKnox, AndroBugs, and MobSF. Table 5
provides a comparison of these tools, highlighting their capabilities, limitations, analysis
techniques, and usage. Repositories and datasets play a crucial role in supporting various ML
and conventional vulnerability detection methods. Notable datasets include Drebin, Google
Play, AndroZoo, AppChina, and GitHub, offering diverse sources for experiments. Ghera,
introduced in Reference [97], serves as an open-source benchmark repository capturing 25
known vulnerabilities in Android apps. However, the need for comprehensive datasets specific
to Android source code vulnerability detection is recognized. Datasets like CVE details,
AndroVul, LVDAndro, and others contribute to the development of tools for identifying security-
related commits and training ML models.

The systematic review acknowledges potential threats to its validity, categorizing them into
construct, internal, external, and conclusion validity, and outlines measures to mitigate these
risks. Construct validity, associated with search term-based queries on repositories, is
addressed by supplementing primary sources with Google Scholar to capture potentially missed
studies. Cross-checking is employed to minimize errors in the inclusion/exclusion criteria.
Internal validity concerns data extraction and analysis, which involved a heavy workload. The
authors enhance soundness by cross-checking and reaching consensus among themselves, and they
consider involving the original authors of the studies to reduce errors. External validity is discussed in
terms of the summary of results, emphasizing the review's focus on research publications from
2016 to June 2022. However, the evolving landscape of ML techniques poses challenges in
generalizing across different time periods. Conclusion validity strives to minimize bias during
paper selection, considering only English-language papers and using a peer-verified systematic
review process with cross-checking mechanisms. Despite a potential bias toward positive
results, the study aims for a comprehensive examination, and the constant involvement of all
authors ensures rigorous reviewing.

In conclusion, the review underscores the importance of addressing security concerns in
Android application development. It covers the latest research on Android source code
vulnerability detection from 2016 to June 2022, emphasizing three key steps: analysis,
detection, and prevention of vulnerabilities. The review explores various vulnerability detection
methods, including static analysis, dynamic analysis, and hybrid analysis, along with tools and
techniques, both conventional and ML/DL-based. Notable findings include the prevalence of
static analysis, with API calls, permissions, and system calls as widely extracted features.
ML/DL-based techniques are identified as common for vulnerability detection. The study
recognizes the need for a labeled dataset specific to Android source code vulnerabilities and
proposes future research directions, including the introduction of comprehensive code analysis
mechanisms and the integration of detection methods into development environments. The lack
of an automated mechanism for identifying reasons for vulnerabilities is highlighted, suggesting
future research on integrating explainable AI techniques. Given the evolving nature of Android
vulnerabilities and detection techniques, the study recommends similar future reviews to cover
emerging threats and novel methodologies, including ML, DL, and reinforcement learning
methods.
