Download as pdf or txt
Download as pdf or txt
You are on page 1of 13

sensors

Article
Smart Partitioned Blockchain
Basem Assiri * and Hani Alnami

Computer Science Department, Jazan University, Jazan 82917, Saudi Arabia; hmalnami@jazanu.edu.sa
* Correspondence: babumussmar@jazanu.edu.sa or bas0911@hotmail.com

Abstract: Blockchain is a developing technology that promises advancements when it is applied to


other fields. Applying blockchain to other systems requires a customized blockchain model to satisfy
the requirements of different application fields. One important area is to integrate blockchain with
smart spaces and the Internet of Things to process, manage, and store data. Actually, smart spaces
and Internet of Things systems include various types of transactions in terms of sensitivity. The
sensitivity can be considered as correctness sensitivity, time sensitivity, and specialization sensitivity.
Correctness sensitivity means that the systems should accept precise or approximated data in some
cases, whereas time sensitivity means that there are time bounds for each type of transaction, and
specialization sensitivity applies when some transactions are processed only by specialized people.
Therefore, this work introduces the smart partitioned blockchain model, where we use machine
learning and deep learning models to classify transactions into different pools according to their
sensitivity levels. Then, each pool is mapped to a specific part of the smart partitioned blockchain
model. The parts can be permissioned or permissionless. The permissioned parts can have different
sub-parts if needed. Consequently, the smart partitioned blockchain can be customized to meet
application-field requirements. In the experimental results, we use bank and medical datasets with
a predefined sensitivity threshold for classification accuracy in each system. The bank transactions
are critical, whereas the classification of the medical dataset is speculative and less critical. The
Random Forest model is used for bank-dataset classification, and its accuracy reaches 100%, whereas
Sequential Deep Learning is used for the medical dataset, which reaches 91%. This means that all bank
transactions are correctly mapped to the corresponding parts of the blockchain, whereas accuracy
is lower for the medical dataset. However, acceptability is determined based on the predefined
sensitivity threshold.

Keywords: blockchain; partitioned blockchain; machine learning; deep learning; data sensitivity;
Citation: Assiri, B.; Alnami, H. Smart security; smart spaces; Internet of Things
Partitioned Blockchain. Sensors 2024,
24, 4033. https://doi.org/10.3390/
s24134033

Academic Editor: Maurizio Talamo 1. Introduction


Blockchain is a well-designed technology that was initially developed for digital
Received: 27 May 2024
Revised: 10 June 2024
cryptocurrencies such as Bitcoin [1]. Blockchain technology has the ability to apply many
Accepted: 19 June 2024
concepts and values such as transparency, sharing, decentralization, and security. It uses
Published: 21 June 2024 distribution and decentralization approaches to replace trust, as trust is usually imple-
mented through a centralized third party. Therefore, it has been proposed for use in many
other areas, such as education, healthcare, and national security [2–5]. One important
application is to integrate blockchain with smart spaces and the Internet of Things (IoT)
Copyright: © 2024 by the authors. to process, manage, and store data. Actually, smart spaces and IoT systems include var-
Licensee MDPI, Basel, Switzerland. ious types of sensitivity transactions. However, the adoption of blockchain requires the
This article is an open access article modification and development of its architecture, algorithm, and protocols [6,7].
distributed under the terms and Indeed, blockchain consists of a peer-to-peer network that connects many nodes,
conditions of the Creative Commons where each node has a copy of a ledger. The ledger contains a chain of all approved blocks,
Attribution (CC BY) license (https://
where each block contains a group of valid transactions. The blockchain algorithm starts
creativecommons.org/licenses/by/
with transaction validation, where nodes validate transactions and group them into blocks.
4.0/).

Sensors 2024, 24, 4033. https://doi.org/10.3390/s24134033 https://www.mdpi.com/journal/sensors


Sensors 2024, 24, 4033 2 of 13

The nodes compete to push their blocks to the network for another level of validation.
When a node succeeds in pushing a new block to the network, the other nodes validate the
block to approve or disapprove it. Only the approved blocks are added to the ledger [8].
In fact, there are different types of blockchain, which are public, private, consortium,
and hybrid. Firstly, the public blockchain is a permissionless system where anyone can join
the network to process transactions, blocks, and ledgers. It is used in famous cryptocurren-
cies such as Bitcoin and Ethereum [9,10]. Secondly, a private blockchain is a permissioned
system that restricts participation to specific members. Only those authenticated members
can join the network and have the encryption keys to process transactions, blocks, and
ledgers. This can be used internally by a company or an organization such as Corda and
Quorum [10]. Thirdly, a consortium blockchain is a permissioned private blockchain that
works across multiple organizations. It enables different companies and organizations to
join the network. It can be used for interleaved businesses such as supply chains [9,10].
Fourthly, a hybrid blockchain is permissioned and permissionless, where the data and
ledger are permissioned so that the network is private from one side. However, only some
transactions are permissionless. Those transactions become available to the public for
validation as needed [11]. Table 1 illustrates the main criteria of each type of blockchain.

Table 1. Types of Blockchain.

Criteria Public Private Consortium Hybrid


Permissioned network but
Network Permissionless Permissioned Permissioned some transactions are
permissionless

Consensus All members Organization members Authorized members Authorized members


across organizations
Decentralization Very High Low Low High

High, because of Very High, because of Very High, because of


permissioned access, permissioned access, Very High, because it acts as
Security decentralization and ledger
authentication, and authentication and a private blockchain
distribution encryption properties encryption properties
The openness can introduce Centralization can lead Cannot guarantee trust Transparency issues can
the risk of 51% attacks, denial to a specific group among organizations affect the output, as public
Threat
of service attacks, and sybil controlling the that have different transactions have hidden
attacks blockchain decisions interests data
Scalability Very High Low High High

- Centralization - Centralization - Partial Centralization


- Lack of authentication - Lack of scalability - Lack of scalability - Partial scalability
Limitations - Lack of customization
- Lack of customization - Lack of - Lack of
customization customization - Lack of transparency
as data is shielded

In addition, smart spaces and IoT provide different kinds of transactions in which
transaction is the main unit of blockchain data. Having different kinds of transactions
requires different procedures. Therefore, at the beginning, the system’s transactions are
classified according to their sensitivity into sensitive and non-sensitive transactions. The
sensitive transactions need more privacy, authorization, or specialization. For example, in
healthcare systems, some transactions need a specialized doctor to process them, so the
public cannot process them. Another example is that some transactions need more privacy
or security, such as those related to politics, business competition, and national security.
Those transactions also cannot be revealed to the public, even with hidden identities. Now,
this work facilitates the application of blockchain to different areas. However, the four
mentioned types of blockchain each have their own vulnerabilities.
Therefore, this work presents a new type of blockchain, namely the smart partitioned
blockchain. The partitioned blockchain has two parts: the first part is permissionless,
and the other part is permissioned. The permissionless part is a part of the network that
enables accessing, processing, and storage of non-sensitive transactions. In contrast, the
Sensors 2024, 24, 4033 3 of 13

permissioned part of the network enables accessing, processing, and storage of sensitive
transactions. Moreover, the transaction classification can be predefined or smart. This work
proposes a smart partitioned blockchain wherein another frontal layer is added to classify
the transactions before they are introduced to the blockchain parts. The smart classification
will be conducted using machine learning (ML) and deep learning (DL) models. The
transactions are classified into many categories, such as sensitive, normal, or non-sensitive
and then assigned to the proper blockchain parts.
The experimental results employ the Random Forest Classifier and a Sequential Deep
Learning algorithm for practical implementation. The Random Forest model is applied
to identify the status of bank transactions, and the classification accuracy reaches 100%,
which means that each transaction is assigned to the appropriate part of the blockchain.
The Sequential Deep Learning algorithm is utilized to predict medical-transaction statuses,
and its classification accuracy reaches 91%. Both results are considered significant in terms
of system sensitivity requirements, which will be explained in detail in Sections 4–6.
The paper’s remainder is organized as follows: Section 2 shows the related work;
Section 3 demonstrates the methodology, and Section 4 shows the dataset and prepro-
cessing details. Section 5 discusses modeling and evaluation, whereas Section 6 discusses
experimental results. Finally, Section 7 concludes the paper.

2. Related Work
The blockchain types and algorithms are discussed in many works. Yang et al. study
private and public blockchain models [9], whereas the hybrid blockchain is investigated in
another work [11]. Krishnan et al. introduces a comprehensive study of all types [10].
Recently, many works have modified blockchain algorithms to overcome the limita-
tions and address the needs of applications in different areas. Some researchers modify the
blockchain general models or algorithms and involve the concept of leader election [12,13].
Liyanaarachchi et al. focus on socio-technical decision-making in the blockchain, so they
imply selective immutability and federal decentralization concepts [14]. Some others relax
the blockchain models to reduce the cost or to speed up the processing time. This helps to
apply blockchain technology in many other fields that cannot handle processing costs and
delays. For example, a blockchain model has been relaxed to enhance cross-border data
sharing in [15].
In addition, many consensus protocols have been introduced to give the mining
process more flexibility and to avoid limitations. The typical consensus protocol is Proof-
of-Work (PoW), where the mining process includes solving a complicated mathematical
puzzle [16]. PoW provides more security and decentralization, but it suffers from being
time and energy consuming. After that, Proof-of-Stake (PoS) and Delegated Proof-of-Stake
(DPoS) are introduced to reduce the processing cost and energy consumption of PoW [16,17].
In PoS and DPoS, the miner is selected based on the amount of cryptocurrency it has.
However, in PoS and DPoS, the rich nodes can control the system processes. Additionally,
Proof-of-Authority is another consensus protocol that relies on authorized validators who
meet the transaction requirements, but it also affects blockchain decentralization [18]. One
of the famous consensus protocols is the Practical Byzantine Fault Tolerance (PBFT), which
tolerates partial errors and approves majority decisions [17,18]. However, such tolerance is
not acceptable for critical systems. This variety among consensus protocols enables support
provision for different systems.
Moreover, the main consideration of the four types of blockchain is security. The public
blockchain uses decentralization in the processing and distribution of storage (ledger).
Meanwhile, the private blockchain provides higher security as it relies on permissioned
access (access restriction). The consortium blockchain also relies on permissioned access to
support security, but it is expanded to work across organizations rather than being private
and restricted to one organization. The hybrid blockchain combines permissioned and
permissionless techniques as it generally works as a private blockchain but only introduces
part of transactions to the public for validation [19]. The hybrid blockchain is not open,
(ledger). Meanwhile, the private blockchain provides higher security as it relies on per-
missioned access (access restriction). The consortium blockchain also relies on permis-
sioned access to support security, but it is expanded to work across organizations rather
than being private and restricted to one organization. The hybrid blockchain combines
permissioned and permissionless techniques as it generally works as a private blockchain
Sensors 2024, 24, 4033 4 of 13
but only introduces part of transactions to the public for validation [19]. The hybrid block-
chain is not open, which highlights the need for another type, namely that is the smart
partitioned blockchain that has a specific partition that enables the public to not only val-
which highlights thebut
idate transactions, need foralso
can another
sharetype,
data,namely thatblocks,
validate is the smart
vote, partitioned
and access blockchain
the ledger.
that has a specific partition that enables the public to not only validate
Indeed, the smart partitioned blockchain comprises different parts, with some being transactions, butfully
can
also share data, validate blocks, vote, and access the ledger. Indeed, the
permissioned and others fully permissionless. Further details are provided in Section 3.smart partitioned
blockchain comprises
Furthermore, different
systems parts,the
address with some
time being
costs fully permissioned
of blockchain algorithms. and others
Some fully
systems
permissionless. Further details are provided in Section 3.
enable scalability solutions by validating transactions off-chain or in sidechains [20–22],
whichFurthermore,
reduces the systems address
contention of the
thetime costs of blockchain
blockchain network, as algorithms.
is done in Some systems
Bitcoin and
enable scalability solutions by validating transactions off-chain or in
Ethereum [23,24]. On the other hand, supporting blockchain with other innovative tech- sidechains [20–22],
which reduces
nologies, such asthe contention of proofs
zero-knowledge the blockchain
and efficient network,
encryptionas istechniques,
done in Bitcoin
is a vitalandc
step
Ethereum [23,24]. On the other hand, supporting blockchain with
[25]. However, this work uses ML and DL models for more efficiency and automation. other innovative tech-
nologies,
Actually, such
ML andas zero-knowledge
DL models enhance proofs and efficient
efficiency encryption
by improving techniques,
productivity andisscalabil-
a vital
step [25]. However, this work uses ML and DL models for more efficiency
ity. They also enhance automation by reducing human involvement and biases. Addition- and automation.
Actually, ML and DL models enhance efficiency by improving productivity and scalability.
ally, the ML and DL models help to find sensitive cases so that they are treated carefully
They also enhance automation by reducing human involvement and biases. Addition-
by directing them to the suitable part of the blockchain to maintain the required safety
ally, the ML and DL models help to find sensitive cases so that they are treated carefully
and security.
by directing them to the suitable part of the blockchain to maintain the required safety
and security.
3. Methodology and Architecture
At the beginning,
3. Methodology transactions are generated by users and are put in the public trans-
and Architecture
actions pool. Figure 1 shows the general
At the beginning, transactions architecture
are generated of theand
by users smart
are partitioned blockchain,
put in the public trans-
which has
actions pool.three major
Figure phases.
1 shows the Upon
generalreceiving transactions
architecture from
of the smart IoT sources,
partitioned the first
blockchain,
phase has
which is for transaction
three classification,
major phases. the second
Upon receiving phase isfrom
transactions to map the transaction
IoT sources, the firstclasses
phase
to the partitioned blockchain, whereas the third phase focuses on ledger storage
is for transaction classification, the second phase is to map the transaction classes manage-
to the
ment.
partitioned blockchain, whereas the third phase focuses on ledger storage management.

Figure 1.
Figure 1. Shows
Showsaasmart
smartpartitioned
partitionedblockchain
blockchainwith two
with categories
two of transactions,
categories which
of transactions, are sen-
which are
sitive and
sensitive non-sensitive.
and Consequently,
non-sensitive. Consequently,thethepartitioned
partitionedblockchain
blockchainlayer
layer has
has aa permissioned
permissioned part
part
andaapermissionless
and permissionlesspart.
part.The
Thelast
lastlayer
layershows
showsthe
theledger
ledgerstorage.
storage.

Firstly,transactions
Firstly, transactionsgenerated
generated from
from IoTIoT sources
sources are are grouped
grouped in a in a transaction
transaction pool.pool.
The
Theand
ML ML and DL models
DL models are are
usedused to classify
to classify transactions.
transactions. Theclasses
The classesare
aredetermined
determinedbased
based
on the sensitivity of transactions. In fact, some systems can use ML models, and others
use DL models according to the accuracy of the transaction-classification process. For
each system, many models are tested, and the model with the highest accuracy is selected.
This flexibility supports the ability to classify different kinds and different structures
of transactions. The transactions’ types and classes can be predefined. However, the
application of ML and DL in the classification process enhances automation and reduces
biased human-factor involvement. In addition, after training the ML and DL models, the
rules and classification patterns can be transformed into smart contracts. This helps to
speed up the classification process.
Secondly, as the blockchain is partitioned, each class of transactions is mapped to
the suitable part of the blockchain. Indeed, the blockchain is partitioned according to
the system’s needs. For example, if a system requires different security levels, then the
of ML and DL in the classification process enhances automation and reduces biased hu-
man-factor involvement. In addition, after training the ML and DL models, the rules and
classification patterns can be transformed into smart contracts. This helps to speed up the
classification process.
Sensors 2024, 24, 4033 Secondly, as the blockchain is partitioned, each class of transactions is mapped to 5 ofthe
13
suitable part of the blockchain. Indeed, the blockchain is partitioned according to the sys-
tem’s needs. For example, if a system requires different security levels, then the sensitive
transactions
sensitive are processed
transactions using the
are processed permissioned
using part ofpart
the permissioned the ofblockchain, and the
the blockchain, andnon-
the
sensitive transactions are processed using the permissionless part of the
non-sensitive transactions are processed using the permissionless part of the blockchain. blockchain. An-
other example
Another exampleis that if a system
is that requires
if a system different
requires levels levels
different of specialization, then thethen
of specialization, normal
the
transactions
normal are mapped
transactions to the permissionless
are mapped part ofpart
to the permissionless the blockchain, whereas
of the blockchain, the sensi-
whereas the
tive transactions
sensitive that that
transactions require specialization
require are are
specialization processed using
processed usingthethe
permissioned
permissionedpart of
part
thethe
of blockchain.
blockchain. If there are are
If there different kinds
different of specializations,
kinds thenthen
of specializations, the permissioned
the permissionedpart
of the
part of blockchain
the blockchain is divided intointo
is divided many sub-parts,
many andand
sub-parts, the the
transactions are are
transactions classified ac-
classified
cordingly, as shown in Figure
accordingly, as shown in Figure 2. 2.

Figure
Figure 2. Shows aa smart
2. Shows smart partitioned
partitioned blockchain
blockchain with
with three
three categories
categories of transactions. The
of transactions. The partitioned
partitioned
blockchain layer has two sub-parts
sub-parts of
of permissioned
permissioned blockchain
blockchain and one
one permissionless
permissionless blockchain
blockchain
part. The last layer shows the ledger.
part. ledger.

Thirdly,
Thirdly, the
the ledger-management
ledger-management process manages data availability availability and
and accessibility.
accessibility.
As
As there are different levels of data sensitivity, data-access control is
different levels of data sensitivity, data-access control is required. Thus, required. Thus,
for
for
ledger-access-control management, we suggest three different mechanisms that suit suit
ledger-access-control management, we suggest three different mechanisms that dif-
different kinds
ferent kinds of of systems,
systems, which
which areare explained
explained as as follows:
follows:
•• Separate
Separate ledgers:
ledgers:IfIfthe
thetransaction
transactionclasses
classes areare
fully independent,
fully independent, then each
then partpart
each of the
of
blockchain stores the approved blocks in its own ledger. This means
the blockchain stores the approved blocks in its own ledger. This means that each that each part has
apart
separate ledger. ledger.
has a separate
•• The
The general ledger:
general ledger: All
All approved blocks are
approved blocks are stored
stored inin the
the same
same ledger
ledger regardless
regardless ofof
which part generates
which part generates them.them.
•• Hierarchical
Hierarchical ledger:
ledger: In
In this
this case,
case, the
the system
system hashas two
two ledgers—local
ledgers—local and and general.
general. The
The
permissionless
permissionless part has its own ledger (the local ledger), so any block approved by
part has its own ledger (the local ledger), so any block approved by
the
the permissionless
permissionless partpart is
is stored
stored inin the
the local
local ledger,
ledger, and
and another
another copycopy of
of each
each block
block
is
is stored
stored inin the
the general
generalledger
ledgerasaswell.
well.On Onthethe other
other hand,
hand, all all approved
approved blocks
blocks of
of the
the permissioned part are stored in the general ledger. Therefore, the general ledger
permissioned part are stored in the general ledger. Therefore, the general ledger
includes all approved blocks within the system (from all parts), whereas the local
ledger only shows the blocks of the permissionless part. This helps to separate public
data from private data.

4. Dataset Description and Preprocessing


4.1. Dataset Description
This study analyzed two distinct datasets to investigate different kinds of smart spaces,
which are bank and medical data. The bank-transaction dataset comprises a detailed record
of transactions, ATM locations, transaction types, cardholder information, and temporal
data from Wisabi Bank [26]. Key features within the bank dataset are outlined in Table 2,
with the primary aim being to identify transaction status, which is categorized into three
types: sensitive, non-sensitive, and soft-sensitive transactions (soft-sensitive is a category
that lies between sensitive and non-sensitive). The target variable is derived by associating
Sensors 2024, 24, 4033 6 of 13

values from the TransactionType feature, encompassing withdrawals, deposits, balance


inquiries, and transfer transactions. Withdrawals and transfers are classified as sensitive
due to their direct influences on the available balance, whereas deposits are considered
soft-sensitive and balance inquiries are non-sensitive. The dataset contains a total of
2,143,838 transactions.

Table 2. Wasabi Bank Variable Descriptions.

Variables Description
Transaction ID Represent the ID of each transaction
Represent the start and end time of the transaction in the
Transaction Start and End Time
format (Year/Month/Day), Hour: Minutes
Cardholder ID The ID of the cardholder
Location ID Location ID of where the transaction is made
Transaction Amount Amount of transaction
The types of the transactions consist of withdrawal, deposit,
Transaction Type ID
balance-inquiry, and transfer transactions
Is derived by associating values from the transaction type
where withdrawal and transfer are classified as sensitive,
Transaction Status (Target)
whereas deposit and balance inquiries are classified as
soft-sensitive and non-sensitive, respectively

The medical-transaction dataset is sourced from the Centers for Disease Control and
Prevention (CDC) and encompasses various data on heart disease and its associated fac-
tors [27]. The CDC accumulated over 100,000 medical records in 2020, containing several
crucial and sensitive variables such as BMI, smoking habits, physical activity levels, and
general health indicators, as detailed in Table 3. According to CDC findings, half of the
American population exhibits at least one of three major risk factors for heart disease,
namely smoking, high cholesterol, and high blood pressure. This study endeavors to differ-
entiate between sensitive and non-sensitive medical records using a binary classification
deep learning algorithm in order to reduce the prevalence of heart disease by facilitating
early medical interventions.

Table 3. Heart Disease Dataset Variable Descriptions.

Variables Description
Heart Disease Did the patient have heart disease? (Yes or No)
BMI Body Mass Index measurement
Smoking Does the patient have smoking habits? (Yes or No)
Alcohol Drinking Does the patient drink alcohol? (Yes or No)
Stroke Did the patient have a stroke? (Yes or No)
How many days a month did the patient feel poor physical
Physical Health
health?
Mental Health How many days a month did the patient feel poor mental health?
Difficulty Walking Difficulty climbing stairs
Diabetic Does the patient have diabetes? (Yes or No)
Physical Activity A patient who reported their physical activity in the past 30 days
Gen Health General health (Fair, Good, Very Good, or Excellent)
Sleep Time Number of hours of sleep
Asthma Does the patient have asthma disease? (Yes or No)
Kidney Disease Does the patient have kidney disease? (Yes or No)
Skin Cancer Does the patient have skin cancer disease? (Yes or No)

4.2. Data Preprocessing


The dataset concerning bank transactions was employed to develop a detection model
to identify transactions as sensitive, non-sensitive, or soft-sensitive. Consequently, transac-
tion types closely link the target variable, encompassing withdrawals, deposits, transfers,
Kidney Disease Does the patient have kidney disease? (Yes or No)
Skin Cancer Does the patient have skin cancer disease? (Yes or No)

4.2. Data Preprocessing


Sensors 2024, 24, 4033 The dataset concerning bank transactions was employed to develop a detection 7 of 13
model to identify transactions as sensitive, non-sensitive, or soft-sensitive. Consequently,
transaction types closely link the target variable, encompassing withdrawals, deposits,
transfers, and balance inquiries. We categorized transaction types as follows: withdrawals
and balance inquiries. We categorized transaction types as follows: withdrawals and
and transfers were classified as sensitive transactions, deposits were classified as soft-sen-
transfers were classified as sensitive transactions, deposits were classified as soft-sensitive,
sitive, and balance inquiries were classified as non-sensitive.
and balance inquiries were classified as non-sensitive.
In the medical dataset, our approach began with converting categorical values into
In the medical dataset, our approach began with converting categorical values into
numeric equivalents using the pandas factorize function. We subsequently applied corre-
numeric equivalents using the pandas factorize function. We subsequently applied corre-
lation feature selection to identify variables that were strongly associated with our target
lation feature selection to identify variables that were strongly associated with our target
variable, as depicted in Figure 3. According to the correlation heat map, variables such as
variable, as depicted in Figure 3. According to the correlation heat map, variables such as
physical health
physical health and
and activity, difficulty climbing
activity, difficulty climbing stairs,
stairs, and
and kidney
kidney disease
disease exhibited
exhibited robust
robust
and positive correlations
and positive correlations with
with the
the target
target variable
variable representing
representing transaction
transaction status.
status.

Figure 3.
Figure Correlationmatrix
3. Correlation matrixofofthe
the health
health data;
data; thethe target
target variable
variable is 0 is
or01,orwhere
1, where 0 denotes
0 denotes a non-a
non-sensitive medical transaction, and 1 indicates a sensitive medical transaction. We classified
sensitive medical transaction, and 1 indicates a sensitive medical transaction. We classified transac-
tions of patients
transactions experiencing
of patients a heart aattack
experiencing heartas sensitive
attack transactions.
as sensitive transactions.

5. Modeling and Evaluation


In this study, we employed a Random Forest Classifier and a Sequential Deep Learning
algorithm for practical implementation. The Random Forest model was applied to identify
the status of bank transactions, whereas the Sequential Deep Learning model was utilized
for predicting medical-transaction statuses. Both models are recognized for their effective
performance: Random Forest operates as an ensemble learning algorithm by amalgamating
numerous decision tree models [28]. DL demonstrates proficiency in discerning intricate
data patterns and relationships [29].
A Random Forest algorithm was constructed using 100 estimators and trained on 75%
of the dataset to detect bank-transaction statuses. We split the bank dataset, containing over
2 million records, into training and testing sets using the train_test_split function available
in the sklearn model_selection module [30]. Then, the transaction type ID was selected
as the input feature, while the transaction status feature was the target. The algorithm
identified withdrawal and transfer transactions as sensitive, classified deposit transactions
as soft-sensitive, and categorized all balance-inquiry transactions as non-sensitive. To
assess the efficacy of the Random Forest algorithms, we employed widely recognized
evaluation metrics, including the confusion matrix, precision, recall, F1-score, and accuracy.
We identified sensitivity thresholds by measuring how accurately the model identified
actual positives for the following three classes: sensitive, soft-sensitive, and non-sensitive
transactions. We used metrics such as recall and Receiver Operating Characteristic (ROC)
curves to find the best threshold value for the Random Forest model.
Sensors 2024, 24, 4033 8 of 13

In the context of the medical dataset, we employed a neural network architecture


utilizing Keras, an interface facilitating the development of high-level neural networks
atop TensorFlow. Specifically, we crafted a sequential model comprising an input layer,
two hidden layers, and an output layer. The input layer was configured to receive input
vectors (Difficulty climbing stairs, physical activity, kidney disease, and physical health) of
four dimensions, while the output layer employs softmax activation to generate probability
distributions for two classes (sensitive: has heart disease and non-sensitive: does not have
heart disease). The structure of the hidden layers was delineated as follows: the initial
hidden layer was composed of 64 neurons, employing a rectified linear activation function,
and the subsequent layer consisted of 32 units, also employing the rectified linear activation
function. Our model was configured with the Adam optimizer algorithm, which is a
widely adopted method for optimizing neural network training that dynamically adjusts
the learning rate during the training process. Accuracy, precision, recall, and area-under-
the-curve metrics were employed throughout training to assess the model’s predictive
performance relative to the ground truth labels and to determine the optimal sensitive-
threshold value for the model. Training was conducted utilizing 80% of the available data
for 16 epochs.
On the other hand, to apply smart partitioned blockchain on different domains, we
should investigate the required sensitivity level of each system. Accordingly, we determined
a threshold (A) to represent the minimum classification accuracy that can be accepted. For
each system, A is predefined by the system admins according to the sensitivity level.
For some systems, A is determined by international and national organizations through
their standards, rules and regulations. For example, the World Health Organization,
along with national food and drug administrations, define some numerical thresholds for
critical products and processes. The World Trade Organization defines some numerical
thresholds and standards for more products and processes. In additions, some systems
should investigate their products and processes to find the sensitivity levels. For example,
some systems rely on stakeholders’ feedback, revenue, and other statistics to measure their
performance and adjust A over the time.
Indeed, each system has its sensitivity requirements. For example, some health-
care transactions, such as reading or updating patients’ health records, are very critical.
Therefore, 100% accuracy is required (A = 100). Some other systems, such as speculation,
prediction, decision support, and inventory systems, are less sensitive, and so the A can
be between 85 and 95. Some systems, such as recommendation systems for restaurants or
movies, are not sensitive [31–33].
This shows how critical it is to classify transactions in the right pools, and to be
processed by the suitable part of the smart partitioned blockchain. Thus, ML and DL
classification accuracy cannot be less than system A.
In our experiment, we used two datasets, namely the bank and medical datasets. The
bank dataset is very sensitive, with A = 100. Within the bank dataset, there are different
kinds of transactions, which are classified according to their sensitivity and then mapped to
the right parts of the blockchain. Meanwhile, the medical dataset has some health records,
including personal health records. The learning model should classify those records into
sensitive and non-sensitive records in relation to heart disease. This is a speculative study
to find those people who are more likely to be exposed to heart disease. In addition, due to
speculation and the large number of risk factors for developing heart disease, the sensitivity
of classification accuracy can be less than 100%, let us say A = 90, for example.

6. Results and Discussion


6.1. ML and DL Results
In our experiment, we used the Random Forest model as ML model, and Sequential
Deep Learning as DL model. Those models are selected as examples to show how the
transactions’ classification is applied as a layer in the smart partitioned blockchain. If
6. Results and Discussion
6.1. ML and DL Results
In our experiment, we used the Random Forest model as ML model, and Sequential
Sensors 2024, 24, 4033 9 of 13
Deep Learning as DL model. Those models are selected as examples to show how the
transactions’ classification is applied as a layer in the smart partitioned blockchain. If the
classification accuracy is less than the system threshold, then other models can be used
the classification accuracy is less than the system threshold, then other models can be
instead.
used The
instead.
Random Forest model exhibited a remarkable ability to detect bank-transaction
The as
statuses, Random Forest
evidenced bymodel
its 100% exhibited
accuracy a in
remarkable ability
all evaluation to detect
metrics bank-transaction
(Precision, recall, F1-
statuses, as evidenced by its 100% accuracy in all evaluation metrics
score, accuracy, macro average, and weighted average) for all three classes. Additionally, (Precision, recall,
F1-score, accuracy, macro average, and weighted average) for all three
the ROC curve indicates that the true positive rate of the ML model achieved optimal classes. Additionally,
the ROCwith
results, curve indicates
a value that
of 1 or the true
(100%) positive
for all rate classes.
predicted of the MLThismodel
resultachieved optimal
was achieved be-
results,
cause thewith
inputa value of (Transaction
variable 1 or (100%) for all creates
types) predicted the classes. This result
target variable. waswords,
In other achievedthe
because the input
model links variable
the input to the(Transaction types) creates
output to provide the final thedetection
target variable. In other
of transaction words,
statuses,
the model links the input to the output to provide the final detection
making it straightforward for the model to determine the transaction statuses based onof transaction statuses,
making it straightforward
the transaction type input for the model
feature. Figureto4 determine
shows the the transaction
confusion matrixstatuses
of the based
Random on
the transaction
Forest model. type input feature. Figure 4 shows the confusion matrix of the Random
Forest model.

Figure 4. Confusion matrix of bank-transaction-status detection.

The Sequential Deep Learning algorithm demonstrated exceptional performance


across all evaluation metrics. Table 4 summarizes the model’s performance: 91% accuracy,
precision, and recall, with an AUC of 92%. The accuracy of medical transactions classifica-
tion is slightly lower than that of the bank transactions. This is because the target variable
has no direct link with the input features, unlike in the case of the bank transactions. The
deep learning model aims to predict whether a person has heart disease based on difficulty
climbing stairs, physical activity, kidney disease, and physical health. Using only these
features, the deep learning model successfully predicts the medical condition with over
90% accuracy across all evaluation metrics.

Table 4. Evaluation Metrics Results of The Sequential Deep Learning Model.

Algorithm Accuracy Precision Recall AUC


Sequential 91% 91% 91% 92%

6.2. Smart Partitioned Blockchain


As mentioned earlier, we should determine the sensitivity level of the smart space
and IoT systems in general. Then, we determine the type of transactions, the quantity of
each type, and the sensitivity level of each type. Consequently, we decide how many parts
should be in the smart partitioned blockchain and the scalability of each part. For example,
Sensors 2024, 24, 4033 10 of 13

if the system has three types of transactions, and one of them is very sensitive, whereas the
other two types are not sensitive, then the smart partitioned blockchain should have two
parts, with one part permissioned for the very sensitive transactions, and a permissionless
part to which the other two types are mapped.
As shown in Figure 5a, our experiment uses the bank dataset, which has four kinds
of transactions. Every kind has a different structure, which makes it easy for them to be
recognized. The kinds of transactions are withdrawals, transfers, deposits, and balance-
inquiry transactions. The ML classifies withdrawals and transfers transactions as sensitive,
and they are mapped to part I of the smart partition blockchain (permissioned). Deposit
transactions are classified as soft-sensitive, and they are mapped to part II of the11smart
Sensors 2024, 24, x FOR PEER REVIEW of 14
partition blockchain (permissioned). The balance-inquiry transactions are classified as non-
sensitive, so they are mapped to part III of the smart partition blockchain (permissionless).

Figure 5.
Figure 5. Part
Part (a)
(a) shows
shows the
the application
application of
of smart
smart partitioned
partitioned blockchain
blockchain on
on the
the bank
bank dataset
dataset with
with
three categories of transactions that are mapped to three parts of the blockchain, whereas
three categories of transactions that are mapped to three parts of the blockchain, whereas part part (b)
shows the application of a smart partitioned blockchain to the medical dataset, with two categories
(b) shows the application of a smart partitioned blockchain to the medical dataset, with two categories
that are mapped to two parts of the blockchain.
that are mapped to two parts of the blockchain.

Finally,
The as mentioned
system sensitivity earlier,
can be the bankasdataset
defined is verysensitivity,
correctness sensitive where A = 100, and
time sensitivity, or
the accuracy of
specialization the classification
sensitivity. The bank process of the
system’s ML model
sensitivity is 100%. On
is considered the other
to include hand, the
correctness
medical system
sensitivity applies
and time a speculative
sensitivity. study,
Therefore, all and thus the have
transactions sensitivity of classification
a bounded accu-
time to execute,
racythe
and cansensitive
be less than 100% andmust
transaction the DL model reaches
be executed 91% accuracy.
immediately and correctly. The time is

6.3. Discussion
The smart partitioned blockchain has many potential advantages. Firstly, the smart
partitioned blockchain out performs the other four types of blockchain. It out performs
Sensors 2024, 24, 4033 11 of 13

relaxed a little for the soft-sensitive transactions, but it must be correct. For non-sensitive
ones, the correctness is relaxed. Indeed, time relaxation results in correctness relaxation. For
example, if an account x contains USD1000 and someone deposits USD100, and concurrently
someone issues a balance-inquiry transaction, then due to the relaxation of the deposit
transaction as it is soft-sensitive, the balance inquiry will return USD1000, and the new
deposited value will not appear.
As shown in Figure 5b, our experiment also uses the medical dataset, which con-
tains personal health records. The records are classified as sensitive or non-sensitive. The
sensitive transactions are mapped to part I of the smart partition blockchain (permis-
sioned), whereas the non-sensitive transactions are mapped to part II of the smart partition
blockchain (permissionless). The personal health records that are classified as sensitive
are mapped to the permissioned part that has specialized doctors. The non-sensitive
transactions are mapped to the permissionless part that has general doctors.
Finally, as mentioned earlier, the bank dataset is very sensitive where A = 100, and
the accuracy of the classification process of the ML model is 100%. On the other hand,
the medical system applies a speculative study, and thus the sensitivity of classification
accuracy can be less than 100% and the DL model reaches 91% accuracy.

6.3. Discussion
The smart partitioned blockchain has many potential advantages. Firstly, the smart
partitioned blockchain out performs the other four types of blockchain. It out performs
permissioned and permissionless blockchain types as it combines both of them to obtain
the advantages of both. It also out performs the consortium type as it is an expanded
permissioned blockchain that runs across organizations, whereas the smart partitioned
blockchain includes a permissionless part that provides more scalability, productivity,
and decentralization. It out performs the hybrid blockchain type that works as a private
blockchain but only introduces part of transactions to the public for validation. The smart
partitioned blockchain has a specific part that enables the public to not only validate
transactions, but to also share data, validate blocks, vote, and access the ledger. The
smart partitioned blockchain has the advantage of allowing for the processing and storage
of both sensitive and non-sensitive transactions in a single network. This can provide
better privacy and security for sensitive transactions while still allowing for transparency
and decentralization. Secondly, the smart partitioned blockchain supports blockchain
adaption in different areas through the ability of create customization. The customization
is built on the nature of data, processing requirements, network capacity, and storage
requirements. Thirdly, the smart partitioned blockchain integrates blockchain with ML and
DL, for more efficiency and automation. Actually, ML and DL models enhance efficiency by
improving productivity and scalability. They also enhance automation by reducing human
involvement and biases. Additionally, the ML and DL models help to find sensitive cases
so that they are treated carefully by directing them to the suitable part of the blockchain to
maintain safety and security.
On the other hand, there are many challenges that may result in some disadvantages
when applying the smart partitioned blockchain in different areas. Firstly, blockchain
processes data in a transactional form; however, transforming the data into a transactional
form is critical process. For example, in IoT, part of the data comes from sensors in the
form of signals, and transforming them into a transactional form is critical. Secondly,
determining the block structure, size, content, and hashing techniques for each domain is
difficult. Thirdly, for each system, it is hard to define the suitable network size, communi-
cation protocol, transmission security, and encryption standards. Fourthly, selecting the
suitable ML and DL models for each system requires deep theoretical and experimental
investigations. Fifthly, finding the proper accuracy thresholds is another critical point that
was already discussed in Section 5. Sixthly, selecting the proper ledger type (among those
presented) is another issue. In fact, further research is required to overcome such challenges,
Sensors 2024, 24, 4033 12 of 13

although researchers can rely on smart partitioned blockchain as a customizable model


that facilitates application to other domains.

7. Conclusions
To integrate blockchain with other smart spaces and IoT, this work introduces a
smart partitioned blockchain model that uses ML and DL models to classify transactions
into different pools according to their sensitivity levels. Then, each pool is mapped to a
specific part of the smart partitioned blockchain model. The parts can be permissioned
or permissionless. The permissioned parts can have different sub-parts if needed. The
experiments show that the Random Forest model is used for bank-dataset classification and
that its accuracy reaches 100%, whereas Sequential Deep Learning is used for the medical
dataset and its accuracy reaches 91%. The acceptability of the degree of accuracy is decided
according to the predefined sensitivity threshold for each system. In the future, we suggest
using different ML and DL models to enhance the classification accuracy. In addition, we
are working on applying a smart partitioned blockchain to a specific domain to maintain
all requirements using smart contracts and consensus protocols.

Supplementary Materials: The following supporting information can be downloaded at: https:
//www.mdpi.com/article/10.3390/s24134033/s1, HealthCare Transaction Dataset-deep learning
model_code.ipynb, experiment1; Wisabi Bank transaction Dataset-code.ipynb, experiment2.
Author Contributions: Conceptualization, B.A.; methodology, B.A.; validation, B.A. and H.A.;
formal analysis B.A. and H.A.; investigation, B.A. and H.A.; resources, H.A.; data curation, H.A.;
writing—original draft preparation, B.A. and H.A.; writing—review and editing, B.A. and H.A.;
visualization, H.A.; supervision, B.A.; project administration, B.A.; funding acquisition, B.A. All
authors have read and agreed to the published version of the manuscript.
Funding: The authors gratefully acknowledge the funding of the Deanship of Graduate Study and
Scientific Research, Jazan University, Saudi Arabia, through Project number GSSRD-24, which funds
the APC.
Data Availability Statement: The used datasets are cited. and the classification codes are openly
available and are submitted as Supplementary Materials.
Conflicts of Interest: The authors declare no conflicts of interest.

References
1. Larrier, J.H. A brief history of blockchain. In Transforming Scholarly Publishing with Blockchain Technologies and AI; IGI Global:
Hershey, PA, USA, 2021; pp. 85–100.
2. Steiu, M.F. Blockchain in education: Opportunities, applications, and challenges. First Monday 2020, 25. [CrossRef]
3. Guustaaf, E.; Rahardja, U.; Aini, Q.; Maharani, H.W.; Santoso, N.A. Blockchain-based Education Project. Aptisi Trans. Manag.
2021, 5, 46–61. [CrossRef]
4. Saleh, A.J.; Alazzam, F.A.F.; Khalaf, K.; Aldrou, A.R.; Zavalna, Z. Legal aspects of the management of cryptocurrency assets in the
national security system. J. Secur. Sustain. Issues 2020, 10, 235–247. [CrossRef]
5. Conklin, M.; Elzweig, B.; Trautman, L.J. Legal Recourse for Victims of Blockchain and Cyber Breach Attacks. UC Davis Bus. LJ
2023, 23, 135. [CrossRef]
6. Assiri, B. Using leader election and blockchain in e-health. Adv. Sci. Technol. Eng. Syst. J. 2020, 5, 46–54. [CrossRef]
7. Assiri, B. A Modified and Effective Blockchain Model for E-Healthcare Systems. Appl. Sci. 2023, 13, 12630. [CrossRef]
8. Assiri, B.; Khan, W.Z. Fair and trustworthy: Lock-free enhanced tendermint blockchain algorithm. TELKOMNIKA Telecommun.
Comput. Electron. Control. 2020, 18, 2224. [CrossRef]
9. Yang, R.; Wakefield, R.; Lyu, S.; Jayasuriya, S.; Han, F.; Yi, X.; Yang, X.; Amarasinghe, G.; Chen, S. Public and private blockchain in
construction business process and information integration. Autom. Constr. 2020, 118, 103276. [CrossRef]
10. Krishnan, S.; Balas, V.E.; Julie, E.G.; Yesudhas, H.R.; Balaji, S.; Kumar, R. (Eds.) Handbook of Research on Blockchain Technology;
Academic Press: Cambridge, MA, USA, 2020.
11. Marar, H.W.; Marar, R.W. Hybrid blockchain. Jordanian J. Comput. Inf. Technol. JJCIT 2020, 6, 317–325. [CrossRef]
12. Assiri, B. Leader election and blockchain algorithm in cloud environment for e-health. In Proceedings of the 2019 2nd International
Conference on New Trends in Computing Sciences (ICTCS), Amman, Jordan, 9–11 October 2019; pp. 1–6.
Sensors 2024, 24, 4033 13 of 13

13. Numan, M.; Subhan, F.; Khan, W.Z.; Assiri, B.; Armi, N. Well-organized bully leader election algorithm for distributed system.
In Proceedings of the 2018 International Conference on Radar, Antenna, Microwave, Electronics, and Telecommunications
(ICRAMET), Serpong, Indonesia, 1–2 November 2018; pp. 5–10.
14. Liyanaarachchi, G.; Viglia, G.; Kurtaliqi, F. Addressing challenges of digital transformation with modified blockchain. Technol.
Forecast. Soc. Chang. 2024, 201, 123254. [CrossRef]
15. Rahman, M.S.; Al Omar, A.; Alam Bhuiyan, Z.; Basu, A.; Kiyomoto, S.; Wang, G. Accountable cross-border data sharing using
blockchain under relaxed trust assumption. IEEE Trans. Eng. Manag. 2020, 67, 1476–1486. [CrossRef]
16. Chauhan, A.; Rishabh Shankar, L.N.; Mittal, P. A Deep Dive into Blockchain Consensus Protocols. In Smart Trends in Computing
and Communications: Proceedings of SmartCom; Springer: Singapore, 2021; Volume 2022, pp. 571–581.
17. Jain, A.; Jat, D.S. A.; Jat, D.S. A review on consensus protocol of blockchain technology. In Intelligent Sustainable Systems: Selected
Papers of WorldS4 2021; Springer: New York, NY, USA, 2022; Volume 2, pp. 813–829.
18. De Angelis, S.; Aniello, L.; Baldoni, R.; Lombardi, F.; Margheri, A.; Sassone, V. PBFT vs proof-of-authority: Applying the CAP
theorem to permissioned blockchain. In Proceedings of the CEUR Workshop Proceedings, CEUR-WS, Aachen, Germany, 6–9
February 2018; p. 2058.
19. Paul, P.; Aithal, P.S.; Saavedra, R.; Ghosh, S. Blockchain technology and its types—A short review. Int. J. Appl. Sci. Eng. IJASE
2021, 9, 189–200. [CrossRef]
20. Fynn, E.; Pedone, F. Challenges and pitfalls of partitioning blockchains. In Proceedings of the 2018 48th Annual IEEE/IFIP
International Conference on Dependable Systems and Networks Workshops (DSN-W), Luxembourg, 25–28 June 2018;
pp. 128–133.
21. Xu, C.; Zhang, C.; Xu, J.; Pei, J. Slimchain: Scaling blockchain transactions through off-chain storage and parallel processing. Proc.
Vldb Endow. 2021, 14, 2314–2326. [CrossRef]
22. Vairagade, R.S.; SH, B. Enabling machine learning-based side-chaining for improving QoS in blockchain-powered IoT networks.
Trans. Emerg. Telecommun. Technol. 2022, 33, e4433. [CrossRef]
23. Reed, C.; Sathyanarayan, U.M.; Ruan, S.; Collins, J. Beyond BitCoin—Legal impurities and off-chain assets. Int. J. Law Inf. Technol.
2018, 26, 160–182. [CrossRef]
24. Liu, X.; Chen, R.; Chen, Y.W.; Yuan, S.M. Off-chain data fetching architecture for ethereum smart contract. In Proceedings of the
2018 International Conference on Cloud Computing, Big Data and Blockchain (ICCBB), Fuzhou, China, 15–17 November 2018;
pp. 1–4.
25. Sun, X.; Yu, F.R.; Zhang, P.; Sun, Z.; Xie, W.; Peng, X. A survey on zero-knowledge proof in blockchain. IEEE Netw. 2021, 35,
198–205. [CrossRef]
26. Iheanachor, O. Wisabi Bank Dataset. Kaggle. Available online: https://www.kaggle.com/datasets/obinnaiheanachor/wisabi-
bank-dataset (accessed on 13 June 2023).
27. Pytlak, K. Indicators of Heart Disease (2022 Update). Kaggle. Available online: https://www.kaggle.com/datasets/kamilpytlak/
personal-key-indicators-of-heart-disease/data (accessed on 23 October 2023).
28. Rodriguez-Galiano, V.F.; Ghimire, B.; Rogan, J.; Chica-Olmo, M.; Rigol-Sanchez, J.P. An assessment of the effectiveness of a
random forest classifier for land-cover classification. ISPRS J. Photogramm. Remote Sens. 2012, 67, 93–104. [CrossRef]
29. Fang, H.; Zhang, D.; Shu, Y.; Guo, G. Deep learning for sequential recommendation: Algorithms, influential factors, and
evaluations. ACM Trans. Inf. Syst. TOIS 2020, 39, 1–42. [CrossRef]
30. Train_test_split. Scikit. n.d. Available online: https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.
train_test_split.html (accessed on 20 December 2023).
31. Mohideen, B.I.; Assiri, B. Internet of Things (IoT): Classification, secured architecture based on data sensitivity, security issues
and their countermeasures. J. Inf. Knowl. Manag. 2021, 20, 2140001. [CrossRef]
32. Razavi, S.; Jakeman, A.; Saltelli, A.; Prieur, C.; Iooss, B.; Borgonovo, E.; Plischke, E.; Piano, S.L.; Iwanaga, T.; Becker, W.; et al.
The future of sensitivity analysis: An essential discipline for systems modeling and policy support. Environ. Model. Softw. 2020,
137, 104954. [CrossRef]
33. Xu, C.; Wu, H.; Liu, H.; Gu, W.; Li, Y.; Cao, D. Blockchain-oriented privacy protection of sensitive data in the internet of vehicles.
IEEE Trans. Intell. Veh. 2022, 8, 1057–1067. [CrossRef]

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.

You might also like