Detection and Classification of Gastrointestinal Disease Using Convolutional Neural Network and SVM
Melaku Bitew Haile, Ayodeji Olalekan Salau, Belay Enyew & Abebech Jenber Belay

To cite this article: Melaku Bitew Haile, Ayodeji Olalekan Salau, Belay Enyew & Abebech Jenber Belay (2022) Detection and classification of gastrointestinal disease using convolutional neural network and SVM, Cogent Engineering, 9:1, 2084878, DOI: 10.1080/23311916.2022.2084878
© 2022 The Author(s). This open access article is distributed under a Creative Commons
Attribution (CC-BY) 4.0 license.
Haile et al., Cogent Engineering (2022), 9: 2084878
https://doi.org/10.1080/23311916.2022.2084878
the best performance compared to others, using the available standard dataset.
The proposed model achieves a classification accuracy of 98% and a Matthews
correlation coefficient of 97.8%, a significant improvement over previous
techniques and other neural network architectures.
1. Introduction
The human gastrointestinal tract is prone to several abnormalities, ranging from minor discomfort to
life-threatening diseases (Naz et al., 2021). Gastrointestinal tract diseases such as ulcers, bleeding,
polyps, Crohn’s disease, colorectal cancer, tumor cancer, and other related diseases are well known
today worldwide (Jha et al., 2021). According to the International Agency for Research on Cancer, the
number of new cases of gastrointestinal cancer and related deaths worldwide in 2018 is estimated
at 4.8 million, responsible for 26% of global cancer cases and 35% of all cancer
deaths (Naz et al., 2021). Early diagnosis of various gastrointestinal diseases can result in effective
treatment and reduce the risk of mortality. Unfortunately, various gastrointestinal diseases are
undetectable or create confusion during screening by medical experts due to noise in the images
which suppress important details (Borgli et al., 2020; Khan, Majid et al., 2021; Ramzan et al., 2021).
Visual assessment of endoscopy images is subjective, often time-consuming, and poorly repeatable,
which can lead to an inaccurate diagnosis (Borgli et al., 2020). The use of artificial intelligence
(AI) in a variety of gastrointestinal endoscopic applications has the potential to improve clinical
practice and increase the efficiency and precision of current diagnostic methods (Chanie et al.,
2020; Woreta et al., 2015).
Several studies have been conducted to classify gastrointestinal diseases with the goal of making a
significant contribution to the effective diagnosis and treatment of gastrointestinal diseases.
However, several studies used a small amount of data, and the majority of research studies aimed
to classify a small number of gastrointestinal disorders in a specific part of the human gastrointestinal
system. In contrast, in Borgli et al. (2020), a series of experiments was conducted based on deep
convolutional neural networks to classify 23 different classes of images. The authors achieved a
Matthews Correlation Coefficient of 0.902. More importantly, the study showed that there is a need
for improvement as some gastrointestinal diseases are more difficult to identify than others.
Deep learning techniques are another form of machine learning techniques which have been
used in several areas of gastrointestinal endoscopy including colorectal polyp detection and
classification, analysis of endoscopic images for the detection of Helicobacter pylori infection
and depth assessment of early gastric cancer, and detection of various abnormalities in wireless
capsule endoscopy images (Khan, Nasir et al., 2021; Khan, Sarfraz et al., 2020; Majid et al., 2020).
Öztürk & Özkaya (2020) stated that due to the deficiencies of radiologists and other human factors
which could lead to a false diagnosis, a computer-aided automated system would be useful for
accurately diagnosing gastrointestinal polyps in the early stages of cancer. In this paper, we
propose a gastrointestinal disease diagnosis model that uses an SVM classifier to determine
which of 23 gastrointestinal disease classes an endoscopic image belongs to. The main
contributions of this paper are as follows:
● We propose a novel concatenated feature extraction model based on the VGGNet and InceptionNet deep
learning models for endoscopic image classification. The method convolves images using
concatenated VGGNet and InceptionNet architectures for feature modeling and then classifies
these features using an RBF kernel-based multi-class support vector machine. The proposed approach
avoids handcrafted feature extraction and selection. Furthermore, the proposed model overcomes
the problem of overfitting and requires few parameters to learn discriminant features.
● We consider two popular pre-trained CNN models with transfer learning principles for robustness
testing of the proposed model.
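The classification stage of the proposed pipeline can be sketched with scikit-learn. The feature dimensions, sample counts, and random vectors below are placeholders standing in for the real VGGNet/InceptionNet features, not the paper's actual values:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Stand-ins for deep features extracted by the two CNN branches
# (dimensions are illustrative, not the paper's layer sizes).
vgg_feats = rng.normal(size=(200, 256))
inception_feats = rng.normal(size=(200, 256))
labels = rng.integers(0, 23, size=200)  # 23 disease classes

# Concatenate the two feature sets into a single descriptor per image
features = np.concatenate([vgg_feats, inception_feats], axis=1)

# RBF-kernel SVM; scikit-learn handles multi-class via one-vs-one
clf = SVC(kernel="rbf", C=1.0, gamma="scale")
clf.fit(features, labels)
preds = clf.predict(features)
print(features.shape, preds.shape)  # (200, 512) (200,)
```

In practice the features would come from the trained CNN branches and the SVM hyperparameters would be tuned on the validation split.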
The remainder of this paper is organized as follows. Section 2 presents an overview of related
works. Section 3 explains the materials and methods, while Section 4 articulates the experimental
results and comparative analysis. Finally, Section 5 deals with the concluding remarks.
2. Related works
Over the past few years, the strength of deep learning algorithms has been explored in the field
of endoscopy, including capsule endoscopy, esophago-gastro-duodenoscopy, and colonoscopy, to
help doctors successfully diagnose various gastrointestinal lesions (Borgli et al., 2020; Chanie et al.,
2020; Khan, Nasir et al., 2021; Khan, Khan et al., 2020; Gamage et al., 2019). Nowadays, deep
learning approaches, especially CNNs, have become a powerful machine learning technique in image
processing tasks such as the classification of gastrointestinal diseases.
Borgli et al. (2020) carried out a series of experiments based on deep convolutional neural
networks, applying common architectures with slight modifications to classify 23 different
classes of images: pre-trained ResNet-50, pre-trained ResNet-152, pre-trained
DenseNet-161, averaged ResNet-152 + DenseNet-161, and ResNet-152 + DenseNet-161 + MLP,
which achieved Matthews correlation coefficients (MCC) of 0.826, 0.898, 0.899, 0.899, and 0.902,
respectively. However, the results also show that there is room for improvement, as some classes
are more difficult to identify than others (difficulty in distinguishing ulcerative colitis from esophagitis,
dyed lifted polyps from dyed resection margins, and Barrett's esophagus from esophagitis or
the z-line).
The descriptive study of Woreta et al. (2015), conducted at Gondar University Hospital based on
patient data, indicates that esophago-gastroduodenal pathology was detected in 1093 (83.4%)
patients, of which duodenal ulcer was the most common endoscopic finding, obtained from 333
(25.4%) cases.
The study performed by Chanie et al. (2020) at Saint Paul's Hospital Millennium Medical College
on 128 patients indicates that varices were the most common cause of upper gastrointestinal
bleeding, observed in 46.1% (59) of cases, followed by peptic ulcer disease, 24.2% (31), esophagitis,
3.9% (5), gastritis, 6.3% (8), duodenitis, 3.1% (4), and malignancy, 4.7% (6), while 10 patients
(7.8%) had both varices and ulcer. The proportion of deaths in the study was 17.2%.
Khan, Nasir et al. (2021) presented an ulcer segmentation approach based on deep learning with
modified recurrent CNN masks for the detection of ulcers and the classification of gastrointestinal
diseases (ulcer, polyp, bleeding). During the classification phase, the ResNet101 pre-trained CNN
model was further developed through transfer learning to derive deep features. Grasshopper
optimization and a minimal-distance fitness function were used to improve the acquired deep
features. For the final classification, the best-selected features were fed into a multi-class support
vector machine classifier using the cubic kernel function. Validation was done on a private dataset,
achieving 88.08% for ulcer segmentation and a classification accuracy of 99.92%. The
ulcer regions were not always correctly segmented because of the small amount of training data,
which led to the failure of the approach to segment polyp and bleeding regions.
Gamage et al. (2019) used pre-trained DenseNet-201, ResNet-18, and VGG-16 CNN models as
feature extractors with a global average pooling (GAP) layer to produce an ensemble of deep
features as a single feature vector with a promising accuracy of 97.38%. Only eight classes of
digestive tract anomalies were predicted by the suggested method.
Takiyama et al. (2018) proposed an alternative CNN-based classification model for categorizing
the anatomical location of the human digestive tract. This technique could classify EGD images
into four large anatomical locations (larynx, esophagus, stomach, and duodenum) and three
subcategories for stomach images (upper, middle, and lower regions). A predefined CNN
architecture called GoogLeNet was used for this classification problem, achieving high
performance.
Owais et al. (2019) proposed a CNN and TML classification framework for the classification of
several gastrointestinal diseases using endoscopic videos, with a dataset containing a total of 77
video files with 52,471 images. They considered a total of 37 different classes (both diseased and
normal cases) related to the human GI tract and achieved an average accuracy of 92.57%.
However, more data was required to improve the system's performance, and due to the limited
size of the dataset it was not possible to use different videos for training and testing, because
most of the classes (about 21 out of 37) consisted of single-patient data (one video per
class).
Sharif et al. (2019) proposed a new technique based on merging deep CNN and geometric
features. Initially, the disease regions are extracted from the given WCE images using a new
approach called improved contrast color features. The geometric features are derived from the
segmented disease area. Thereafter, a unique fusion of the deep CNN features of VGG16 and VGG19
is carried out based on the Euclidean Fisher Vector. The fused features are combined with the
geometric features, from which the best features are chosen using a conditional-entropy
approach. K-Nearest Neighbor (kNN) was used to classify the selected features. For the evaluation
of the suggested method, a privately gathered database of 5500 WCE images was used, which
achieved a classification accuracy of 99.42% and a precision rate of 99.51%. However, the authors
considered just three classification classes, namely: ulcers, bleeding, and healthy.
Shichijo et al. (2017) proposed a deep learning CAD tool for the diagnosis of Helicobacter pylori
(H. pylori) infection. The proposed framework employed two-step CNN models. In the first step,
a 22-layer deep CNN was refined for classification (i.e., positive or negative) of H. pylori infection.
Then, in step 2, another CNN was used to further classify the data set (EGD images) according to
eight different anatomical locations to achieve an accuracy of 83.1%.
Segu et al. (2016) introduced a deep CNN system for characterizing small intestine motility. This
CNN-based method exploited the general representation of six different intestinal motility events
by extracting deep features, which resulted in superior classification performance when compared
to the other handcrafted feature-based methods. Although they achieved a high classification
performance of 96% accuracy, they only considered a limited number of classes.
Li et al. (2012) used machine learning methods to analyze lymph node metastasis in gastric
cancer using gemstone spectral imaging. They used feature selection and metric learning methods
to reduce data dimension and feature space, and then used the kNN classifier to distinguish lymph
node metastasis from non-lymph node metastasis using 38 lymph node samples in gastric cancer
to achieve an overall accuracy of 96.33%. Wang et al. (2015) proposed a system for detecting
polyps in colonoscopy. It can generate an alert with real-time feedback during colonoscopy. The
authors used visual elements and a rule-based classifier to detect polyp borders. The system
achieved an accuracy of 97.7% for polyp detection.
A multiscale approach for detecting ulcers was introduced by Souaidi et al. (2019). The authors
extracted texture patterns such as complete LBP and Laplacian pyramids, and then classified
them using SVM. They tested the system on two WCE datasets, achieving
accuracies of 95.11% and 93.88%. Li & Meng (2012) proposed a framework for wireless CE images
that integrated LBP and the discrete wavelet transform to extract scale- and rotation-invariant
texture characteristics. Finally, the selected characteristics were classified using the SVM
classifier.
By combining the bag-of-features (BoF) approach with a saliency map, Yuan & Meng (2014)
proposed an integrated polyp identification system. In the first stage, the bag-of-features
approach uses Scale-Invariant Feature Transform (SIFT) feature vectors with k-means clustering
to characterize local features. The saliency map was then used to generate a histogram, from
which saliency features were extracted. Finally, the BoF and saliency features were fed into the
SVM classifier, which was used to classify the data. For the detection of polyps and ulcers in the
small intestine in CE, Karargyris & Bourbakis (2011) proposed a geometric and texture-based
technique. The images were pre-processed with Log Gabor filters and the SUSAN edge detector,
and geometric features were retrieved to determine the polyp and ulcer area. To identify normal
and pathological tissues, Li & Meng (2012) employed discrete wavelet processing and uniform
local binary patterns (LBP) with SVM. In this feature extraction approach, the wavelet transform
combines multi-resolution capability with uniform LBP analysis to give robustness to lighting
fluctuations, resulting in superior performance.
Most of the presented approaches have been designed to classify a limited number of gastrointestinal
diseases in a specific portion of the human gastrointestinal tract. Moreover, feature
extraction is a challenging task in medical imaging for the recognition of infections because of
redundant and irrelevant features, or noise that degrades the overall recognition accuracy. Also,
some gastrointestinal diseases are difficult to distinguish from other disease types. Therefore, to
deal with these challenges, in this paper we employ different image processing stages. Finally, we
use deep feature concatenation into a single feature vector by combining the VGGNet and
InceptionNet models as feature extractors, followed by SVM classification.
quality of mucosal views which include bbps 0–1, bbps 2–3, and impacted stool and therapeutic
interventions; dyed-lifted-polyps, dyed-resection-margins. In total, the dataset contains 10,662
labeled images stored in JPEG format. Finally, the data is split 70/20/10, that is, 70%
for training, 20% for testing, and 10% for validation. Image samples of the various
labeled classes for gastrointestinal disease are shown in Figure 1.
3.2.1. Preprocessing
Each endoscopy image is stored, and its quality might be degraded by different factors such as
non-uniform intensity, variations, motion, shifts, and noise. Thus, preprocessing techniques are
required to reduce the disease identification complexity. In this study, we used image preprocessing
techniques such as image resizing and noise filtering.
3.2.1.1. Image resizing. The dataset has images with a uniform size of 100 × 100 resolution,
but due to our computer's performance, the resolution of the original images needed to be
reduced before feeding them into the model. Therefore, the images were resized to a
resolution of 32 × 32.
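The resize step can be sketched as a simple nearest-neighbour down-sampling in NumPy. In practice a library resampler (for example, OpenCV or Pillow) would typically be used; this is an illustration of the idea only:

```python
import numpy as np

def resize_nearest(img: np.ndarray, out_h: int, out_w: int) -> np.ndarray:
    """Nearest-neighbour resize of an H x W x C image."""
    h, w = img.shape[:2]
    rows = np.arange(out_h) * h // out_h   # source row for each output row
    cols = np.arange(out_w) * w // out_w   # source column for each output column
    return img[rows][:, cols]

img = np.zeros((100, 100, 3), dtype=np.uint8)   # a 100 x 100 RGB image
small = resize_nearest(img, 32, 32)
print(small.shape)  # (32, 32, 3)
```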
3.2.1.2. Denoising. Medical images play a vital role in disseminating information about polyps,
cancers, hemorrhoids, and other diseases. A major challenge in the process of medical imaging is
to obtain an image without the loss of any significant information. The images obtained were
affected by noise, and this noise affects the classification accuracy of the proposed model. Basic
digital image filters like Gaussian filter (GF), Median filter (MF), and Adaptive Median filter (AMF)
(Guan et al., 2014) are common filters used to remove noise in medical images. In this study, we
compared GF, MF, and AMF to determine the most suitable filtering technique that improves the
classification performance of the proposed model.
i. Adaptive median filter
The adaptive median filter (AMF) is widely applied as an advanced denoising technique
compared with the traditional median filter because the AMF performs spatial processing to
determine which pixels in the image have been affected by noise. It classifies pixels by comparing
each pixel in the image to its surrounding neighbor pixels. It preserves image details such as
edges and smooths non-impulsive noise, which the standard median filter does not (Ali, 2018).
An illustration of the resultant image after using the noise removal technique is shown in Figure 2.
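The behavior described above can be sketched as a simplified adaptive median filter in NumPy. This is an illustration of the idea (grow the window until the median is impulse-free, then replace only impulse pixels), not the exact filter implementation the authors used:

```python
import numpy as np

def adaptive_median_filter(img: np.ndarray, max_window: int = 7) -> np.ndarray:
    """Simplified adaptive median filter for a 2-D grayscale image.

    For each pixel, grow the window until the window median is not an
    impulse (strictly between window min and max); keep the pixel if it
    is not an impulse itself, otherwise replace it with the median.
    """
    out = img.copy().astype(np.float64)
    h, w = img.shape
    pad = max_window // 2
    padded = np.pad(img.astype(np.float64), pad, mode="edge")
    for y in range(h):
        for x in range(w):
            for k in range(1, pad + 1):
                win = padded[y + pad - k:y + pad + k + 1,
                             x + pad - k:x + pad + k + 1]
                zmin, zmed, zmax = win.min(), np.median(win), win.max()
                if zmin < zmed < zmax:            # median is not an impulse
                    if not (zmin < padded[y + pad, x + pad] < zmax):
                        out[y, x] = zmed          # pixel is an impulse
                    break
            else:
                out[y, x] = zmed                  # fall back to the median
    return out

noisy = np.full((9, 9), 100.0)
noisy[4, 4] = 255.0                               # a single salt impulse
clean = adaptive_median_filter(noisy)
print(clean[4, 4])  # 100.0
```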
3.2.1.3. Data augmentation. Deep learning has been used to classify several types of medical images
over the past few years and has achieved desirable performance, provided that an extensive amount
of data is available to attain the desired model performance. The limitations of
the lack of training data are addressed through augmentation and transfer learning, which is very
essential because of the inadequate medical datasets. Data augmentation enhances classification
performance and prevents the overfitting issue since it offers a large amount of data (Shorten &
Khoshgoftaar, 2019). In this paper, we applied a rotation of 45°, width_shift_range = 0.2,
height_shift_range = 0.2, and the horizontal flip augmentation technique. After augmentation,
the dataset increased to 47,398 images.
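The augmentation settings listed above map directly onto Keras' `ImageDataGenerator`. A minimal sketch, assuming a TensorFlow/Keras setup (the paper does not state its framework explicitly); the dummy batch here merely demonstrates the configuration:

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Augmentation settings matching the paper: 45-degree rotation,
# 0.2 width/height shift range, and horizontal flipping.
datagen = ImageDataGenerator(
    rotation_range=45,
    width_shift_range=0.2,
    height_shift_range=0.2,
    horizontal_flip=True,
)

# A dummy batch of 8 RGB images at the paper's 32 x 32 input size
images = np.random.rand(8, 32, 32, 3).astype("float32")
batch = next(datagen.flow(images, batch_size=8, shuffle=False))
print(batch.shape)  # (8, 32, 32, 3)
```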
a single model (Lonseko et al., 2021). The concatenation of both models' features is taken as the
input to the next classifier.
3.2.2.1. Feature extraction using the proposed VGGNet model. In this study, high-level features are
extracted using the proposed VGGNet model. Feature extraction with CNN models
consists of several similar steps because each step is made up of cascading layers: convolution,
pooling, dropout, batch normalization, and dense layers, each of which represents features at
different levels of abstraction. However, CNN models differ from each other in how these
layers are configured and how they form the network. The architectural design of
the proposed VGGNet model used in this study is presented in Figure 3, while the parameters used
for the proposed VGGNet architecture of the CNN model are presented in Table 1. The proposed
VGGNet model comprises six convolution, two batch normalization, three max-pooling, three
dropout, and one flatten layer. A dropout layer is inserted below each max-pooling and dense
layer to counter overfitting. Finally, the extracted features are classified into 23 gastrointestinal
disease types by Softmax, KNN, RF, and SVM classifiers. When we trained the model, we observed
overfitting of the training data: the model performed well on the training set but not on the
validation or test set. To solve this, we used data augmentation to increase the dataset and the
dropout operation for regularization.
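The layer inventory described above can be sketched in Keras. The filter counts and dropout rates below are illustrative assumptions, since the exact values are given in the paper's Table 1; the sketch only reproduces the stated structure (six convolutions, two batch normalizations, three max-poolings, three dropouts, one flatten):

```python
from tensorflow.keras import layers, models

def build_vgg_branch(input_shape=(32, 32, 3)):
    """VGG-style feature extractor loosely following the paper's description."""
    return models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(32, 3, padding="same", activation="relu"),
        layers.Conv2D(32, 3, padding="same", activation="relu"),
        layers.BatchNormalization(),
        layers.MaxPooling2D(),
        layers.Dropout(0.25),
        layers.Conv2D(64, 3, padding="same", activation="relu"),
        layers.Conv2D(64, 3, padding="same", activation="relu"),
        layers.MaxPooling2D(),
        layers.Dropout(0.25),
        layers.Conv2D(128, 3, padding="same", activation="relu"),
        layers.Conv2D(128, 3, padding="same", activation="relu"),
        layers.BatchNormalization(),
        layers.MaxPooling2D(),
        layers.Dropout(0.25),
        layers.Flatten(),   # output feeds the Softmax/KNN/RF/SVM classifiers
    ])

model = build_vgg_branch()
print(model.output_shape)  # (None, 2048): 4 x 4 x 128 after three poolings
```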
3.2.2.2. Feature extraction using the proposed InceptionNet model. The second model (the
architectural design of the proposed InceptionNet model) used in this study is described in Figure 4,
while the parameters used for the proposed InceptionNet architecture of the CNN model are
presented in Table 2. The model comprises six convolution, one max-pooling, one global
average pooling, one batch normalization, one dropout, and one flatten layer. A dropout layer
is inserted below the global average pooling and dense layer to counter overfitting. Finally, the
extracted features are classified into 23 gastrointestinal disease types using Softmax, KNN, RF, and
Table 2. Description of the parameters of the proposed InceptionNet architecture of the CNN model

Operation               Kernel/stride   Filter size   Output shape    Parameters
2D convolution 1        1 x 1           64            32 x 32 x 96    384
2D convolution 2        1 x 1           96            32 x 32 x 16    64
2D convolution 3        3 x 3           128           32 x 32 x 64    256
2D convolution 4        1 x 1           16            32 x 32 x 128   110,720
2D convolution 5        5 x 5           32            32 x 32 x 32    12,832
2D max pooling 1        3 x 3           -             32 x 32 x 3     0
2D convolution 6        1 x 1           32            32 x 32 x 32    128
Concatenate             -               -             32 x 32 x 256   0
Batch normalization     -               -             32 x 32 x 256   1024
Global average pooling  -               -             256             0
Dropout                 -               -             256             0
Flatten                 -               -             256             0
Dense                   -               -             524             134,668
Dropout                 -               -             524             0
Dense                   -               -             23              12,075
SVM classifiers. When we trained the model, we observed overfitting of the training data in the
same way as the other model. This was solved by adding a dropout layer.
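An Inception-style module of the kind underlying this branch can be sketched with the Keras functional API. The branch filter sizes below are illustrative (the paper's exact values are in its Table 2); the key idea is parallel 1 × 1, 3 × 3, 5 × 5, and pooling branches concatenated along the channel axis:

```python
from tensorflow.keras import layers, Input, Model

def inception_module(x, f1=64, f3_reduce=96, f3=128,
                     f5_reduce=16, f5=32, fpool=32):
    """One Inception-style block with illustrative branch filter counts."""
    b1 = layers.Conv2D(f1, 1, padding="same", activation="relu")(x)
    b3 = layers.Conv2D(f3_reduce, 1, padding="same", activation="relu")(x)
    b3 = layers.Conv2D(f3, 3, padding="same", activation="relu")(b3)
    b5 = layers.Conv2D(f5_reduce, 1, padding="same", activation="relu")(x)
    b5 = layers.Conv2D(f5, 5, padding="same", activation="relu")(b5)
    bp = layers.MaxPooling2D(3, strides=1, padding="same")(x)
    bp = layers.Conv2D(fpool, 1, padding="same", activation="relu")(bp)
    # 64 + 128 + 32 + 32 = 256 output channels
    return layers.Concatenate()([b1, b3, b5, bp])

inp = Input(shape=(32, 32, 3))
model = Model(inp, inception_module(inp))
print(model.output_shape)  # (None, 32, 32, 256)
```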
3.2.2.3. Feature extraction using the proposed concatenated model. As previously stated, the
proposed model aims to accurately diagnose gastrointestinal disease by concatenating deep
features extracted from endoscopy images using two different models (the VGGNet and
InceptionNet CNN architectures). First, the VGGNet model extracts features from the endoscopy
images. The proposed InceptionNet model extracts features from the same images in parallel.
The extracted features from these models are then flattened and concatenated into
a single classification descriptor. Finally, the concatenated features are fed into the classifier.
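The concatenation step itself amounts to stacking the two flattened feature vectors side by side. A minimal NumPy illustration, with placeholder feature dimensions rather than the paper's actual layer sizes:

```python
import numpy as np

# Flattened per-image features from the two branches (placeholder sizes)
vgg_features = np.random.rand(5, 2048)        # 5 images, VGGNet branch
inception_features = np.random.rand(5, 256)   # 5 images, InceptionNet branch

# Concatenate along the feature axis to form one descriptor per image
descriptor = np.hstack([vgg_features, inception_features])
print(descriptor.shape)  # (5, 2304)
```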
classifying images (Noor Mohamed & Srinivasan, 2020; Lin et al., 2020). Deep Learning is now
widely used to solve complex problems like image classification, natural language processing, and
voice recognition. RBM, Autoencoder, DBN, CNN, and RNN are the most important deep learning
algorithms (Shetty & Siddiqa, 2019). According to Guo et al. (2016), CNN-based architectures are
most widely used for image classification problems. CNN is currently the most widely used
machine learning algorithm in medical image analysis because it preserves spatial relationships
when filtering input images, which is crucial in endoscopy, for example, where normal
colon tissue meets cancerous tissue (Khan et al., 2020). CNNs employ three basic types of neural
layers: convolutional layers, pooling layers, and fully connected layers. Convolutional layers: in the
convolutional layers, the CNN convolves learned filters across the full image to produce different
feature maps. Pooling layers: pooling layers are in charge of reducing the spatial dimensions
(width, height) of the input volume for the next convolutional layer. This operation is also known
as subsampling or downsampling; the reduction in size entails a loss of information, but such loss
is beneficial to the network because it means less computational overhead for the network's
subsequent layers and also helps to counter overfitting. The most commonly used techniques
are average pooling and max pooling. Furthermore, there are various variants of pooling
layers in the literature, each motivated by different considerations and serving specific needs, for
example, stochastic pooling, spatial pyramid pooling, and def-pooling. Fully connected layers: after
a few convolutional and pooling layers, the network's high-level reasoning is carried out through
fully connected layers. As the name implies, neurons in a fully connected layer have full
connections to all activations in the previous layer. Their activations can thus be computed by
a matrix multiplication followed by a bias offset. Finally, fully connected layers convert the
two-dimensional feature maps into a one-dimensional feature vector. The resulting vector can
either be mapped into a specific number of classes for classification or treated as a feature vector
for further processing (Escobar et al., 2021). Pre-trained models are deep learning models that
have been trained on a large dataset to solve classification problems (Al-Adhaileh et al., 2021).
VGGNet, ResNet, Inception, and Xception are some of the most prominent basic deep learning
architectures for CNNs. Simonyan and Zisserman introduced VGGNet in 2014; VGGNet was the
runner-up in the classification task of the ImageNet Large Scale Visual Recognition Challenge
(Simonyan & Zisserman, 2015; Sakib et al., 2018). The ResNet CNN architecture demonstrates an
exceptional, fundamental use of skip connections and batch normalization. Also, at the network's
end, there are no fully connected layers. The main drawback of this architecture is that, owing to
its large number of parameters, it is computationally expensive to use. Nonetheless, ResNet is
regarded as one of the best convolutional neural network models, having won the ILSVRC 2015
challenge. It is worth noting that, despite having many more layers than VGG, it requires much
less memory, nearly 5 times less. This is because, instead of dense layers in the classification
stage, this network employs global average pooling, a type of layer that converts the
two-dimensional activation maps of the final feature extraction layer into an n-class vector used
to calculate the probability of belonging to each class. Szegedy et al. (2015) proposed Inception,
a deep convolutional neural network design that set a new state of the art for classification and
detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14). The main
advantage of this design is that it achieves a significant quality gain for only a modest increase in
computational requirements compared to shallower and narrower networks. Chollet (2017)
designed Xception, which has 36 convolutional layers that serve as the foundation for extracting
network features. The Xception architecture is made up of a linear stack of depthwise separable
convolution layers with residual connections. It is an adaptation of Inception, with depthwise
separable convolutions replacing the Inception modules; furthermore, it has approximately the
same number of parameters as Inception-v1. We used InceptionNet and VGGNet because both
performed well in the 2014 ImageNet challenge, and InceptionNet gained attention due to its
architecture based on modules called Inception.
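The size-reducing behavior of the pooling layers discussed above can be illustrated with a minimal NumPy sketch of 2 × 2 max pooling with stride 2, which halves each spatial dimension while keeping the strongest response in each window:

```python
import numpy as np

def max_pool_2x2(fmap: np.ndarray) -> np.ndarray:
    """2x2 max pooling with stride 2 on an H x W feature map (H, W even)."""
    h, w = fmap.shape
    # Split into 2x2 blocks and take the maximum of each block
    return fmap.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

fmap = np.array([[1, 2, 5, 6],
                 [3, 4, 7, 8],
                 [9, 8, 1, 0],
                 [7, 6, 3, 2]])
print(max_pool_2x2(fmap))
# [[4 8]
#  [9 3]]
```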
validation phase for testing the performance of the trained model with the new image dataset.
First, the whole gastrointestinal image dataset is resized to 32 × 32. Secondly, the resized dataset
is filtered with the AMF technique. Thirdly, filtered images are augmented. Then, the augmented
images are subdivided into training dataset, validation dataset, and test dataset. Next, both the
training dataset and validation dataset were fed to the concatenated model. The next step is
feature extraction using the concatenated model of the VGGNet and InceptionNet architecture of
the CNN model, which is used to extract the useful high-level features from the images. Lastly,
extracted features of VGGNet and InceptionNet are concatenated together and fed into the
classifiers (Softmax, KNN, RF, and SVM). For the test set, both models' features (VGGNet and
InceptionNet) are concatenated together and fed to the previously trained SVM model. The details
of each phase of the proposed model are presented in Section 3.3.2.
(i) Learning method
The proposed concatenated model is evaluated in terms of accuracy and loss function. To measure
the difference between the output predictions and the target output, we used a categorical cross-
entropy loss function, which is commonly used for multi-class problems. The adaptive moment
estimation (Adam) optimization function with an initial learning rate of 0.001 was used to adjust
the learning rate and other parameters, to increase the accuracy of the model and decrease the
loss. The predicted classes are compared using the classification performance parameters, which
are calculated from the confusion matrix. Following Ayalew et al. (2022) and Salau (2021), each
performance measure is computed as follows:
● False-Positive Rate (FPR): indicates that non-disease regions are incorrectly detected as diseases.
● False-Negative Rate (FNR): indicates that the diseases are detected as non-disease regions.
● True Positive Rate (TPR): indicates that the diseases are detected as disease regions.
● True Negative Rate (TNR): indicates that a non-disease region is correctly recognized as a non-
disease region.
● Confusion matrix (CM): CM is used to find the correctness and accuracy of the model. It is obtained
using parameters such as TP, TN, FP, and FN of the matrix.
● Accuracy: Accuracy is the number of correct predictions divided by the total number of samples. The
formula for accuracy is given as follows:
Accuracy = (TP + TN) / (TP + TN + FP + FN)    (1)
● Precision: a measure that indicates the proportion of patients diagnosed as having a disease who
actually have the disease. Among the predicted positives (TP and FP), those who truly have the
disease are the TP:
Precision = TP / (TP + FP)    (2)
● Recall (Sensitivity): a measure that indicates the proportion of patients who have a disease and
were diagnosed by the model as having it. Among the actual positives (TP and FN), those diagnosed
by the model are the TP:
Sensitivity = TP / (TP + FN)    (3)
● Dice similarity coefficient (F1-score): the harmonic mean of precision and sensitivity; it measures
the similarity between predicted and ground-truth regions. It is calculated using Eq. (4):
F1-score = (2 × precision × recall) / (precision + recall)    (4)
● Matthews correlation coefficient (MCC): accuracy is sensitive to class imbalance, and precision,
recall, and F1-score are asymmetric, whereas MCC is symmetric, so no class is more important than
the other. It measures classification quality even when the classes are of different sizes, taking true
positives, false positives, and false negatives into account:
MCC = (TP × TN − FP × FN) / [(TP + FP)(TP + FN)(TN + FP)(TN + FN)]^(1/2)    (5)
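The five measures above follow directly from the confusion-matrix counts. A small Python sketch (the counts below are invented for illustration, not results from the paper):

```python
import math

def classification_metrics(tp, tn, fp, fn):
    """Eqs. (1)-(5): accuracy, precision, recall, F1, and MCC from counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return accuracy, precision, recall, f1, mcc

acc, prec, rec, f1, mcc = classification_metrics(tp=90, tn=85, fp=10, fn=15)
print(round(acc, 3), round(prec, 3), round(rec, 3),
      round(f1, 3), round(mcc, 3))
# → 0.875 0.9 0.857 0.878 0.751
```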
Table 3. Accuracy of the proposed model with resized data using noise filtering and segmentation techniques

Experiment number | Model | Noise filtering technique | Segmentation technique | Classifier | Test accuracy (%)
Experiment 1 | VGGNet-InceptionNet | None | None | Softmax | 71.8
Experiment 2 | VGGNet-InceptionNet | Gaussian filter | None | Softmax | 70.8
Experiment 3 | VGGNet-InceptionNet | Median filter | None | Softmax | 68.6
Experiment 4 | VGGNet-InceptionNet | Adaptive median filter | None | Softmax | 72.7
Experiment 5 | VGGNet-InceptionNet | Adaptive median filter | Otsu | Softmax | 54.7
Experiment 6 | VGGNet-InceptionNet | Adaptive median filter | Thresholding | Softmax | 52.3
Experiment 7 | VGGNet-InceptionNet | Adaptive median filter | K-means | Softmax | 56.3
We compared three segmentation techniques (Otsu, adaptive thresholding, and k-means) and achieved accuracies of 54.7%, 52.3%, and 56.3%, respectively. Since all three segmentation techniques achieved low accuracies, we did not use segmentation for the rest of the experiments. The results are shown in Table 3.
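Table 3's best pre-segmentation result used adaptive median filtering. As a rough illustration of how that filter differs from a plain median filter, here is a minimal NumPy sketch of the classic adaptive median algorithm; it is not the authors' implementation, and the maximum window size of 7 is an assumed parameter.

```python
import numpy as np

def adaptive_median_filter(img, max_window=7):
    """Minimal adaptive median filter for salt-and-pepper (impulse) noise.

    For each pixel, the window grows until its median is not an extreme value
    (min < median < max); the pixel is kept if it is itself not extreme,
    otherwise it is replaced by the window median.
    """
    off = max_window // 2
    padded = np.pad(img, off, mode="edge")
    out = img.astype(float)
    h, w = img.shape
    for y in range(h):
        for x in range(w):
            for k in range(1, off + 1):  # window radius 1, 2, ...
                win = padded[y + off - k:y + off + k + 1,
                             x + off - k:x + off + k + 1]
                mn, med, mx = win.min(), np.median(win), win.max()
                if mn < med < mx:                  # window median is not an impulse
                    if not (mn < img[y, x] < mx):  # but this pixel is
                        out[y, x] = med
                    break
            else:  # window limit reached: fall back to the median
                out[y, x] = med
    return out
```

Unlike a Gaussian or fixed-window median filter, which smooths every pixel, this variant only replaces pixels it judges to be impulses, which is consistent with it preserving the most detail among the filters compared in Table 3.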
Table 5. Per-class precision, recall, and F1-score, together with the training, validation, and testing accuracy, training and validation loss, and Matthews correlation coefficient of the proposed concatenated CNN model with SVM classifier

Class | Precision | Recall | F1-score
Barrett's | 0.98 | 0.94 | 0.96
Barrett's short-segment | 0.97 | 0.98 | 0.98
BBPS 0–1 | 0.99 | 0.99 | 0.99
BBPS 2–3 | 0.99 | 0.99 | 0.99
Dyed-lifted-polyps | 0.99 | 0.93 | 0.96
Dyed-resection-margins | 0.98 | 0.98 | 0.98
Esophagitis A | 0.98 | 0.97 | 0.97
Esophagitis B–D | 0.92 | 0.99 | 0.95
Hemorrhoids | 0.99 | 1.00 | 1.00
Ileum | 1.00 | 0.99 | 0.99
Impacted-stool | 0.99 | 1.00 | 1.00
Cecum | 0.97 | 0.97 | 0.97
Pylorus | 0.95 | 0.99 | 0.97
Z-line | 0.92 | 0.99 | 0.95
Polyps | 0.98 | 0.99 | 0.99
Retroflex-rectum | 1.00 | 0.98 | 0.99
Retroflex-stomach | 0.99 | 0.98 | 0.98
Ulcerative-colitis-grade-0–1 | 0.99 | 0.98 | 0.99
Ulcerative-colitis-grade-1 | 0.98 | 0.98 | 0.98
Ulcerative-colitis-grade-1–2 | 0.99 | 1.00 | 1.00
Ulcerative-colitis-grade-2 | 0.98 | 0.97 | 0.97
Ulcerative-colitis-grade-2–3 | 1.00 | 1.00 | 1.00
Ulcerative-colitis-grade-3 | 1.00 | 0.94 | 0.97

Training accuracy | 0.987
Testing accuracy | 0.98
Validation accuracy | 0.982
Training loss | 0.107
Validation loss | 0.071
Matthews correlation coefficient | 0.978
KNN, RF, and SVM, respectively. The last two experiments were used to demonstrate the performance of the pre-trained VGGNet16 and pre-trained Inceptionv3 models, which achieved accuracies of 89% and 91%, respectively. The results are shown in Table 4.
4.4. Performance of the proposed concatenated model using the SVM classifier
The results of the experiments performed to measure the training accuracy, validation accuracy, testing accuracy, and Matthews correlation coefficient of the proposed concatenated CNN model with SVM classifier are presented in Table 5. The experiments were performed using the concatenated model with a resized and filtered (AMF technique) image dataset and the SVM classifier, using 75 epochs with a batch size of 32. We trained the model with 33,944 training images and 3,844 validation images, and tested it with 9,610 images. The model validation is
done at the end of each training epoch. At the end of the last epoch (epoch 75), our proposed concatenated model with the SVM classifier obtains a training accuracy of 98.7%, validation accuracy of 98.2%, testing accuracy of 98%, and Matthews correlation coefficient of 97.8%. These results are shown in Figures 6 and 7.
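The fusion step behind these numbers, concatenating the two CNN branches' feature vectors and handing them to an SVM, can be sketched as follows. The CNN branches themselves are omitted: the two feature matrices below are random, class-dependent stand-ins for the VGGNet and InceptionNet outputs, and the 128-dimensional feature size is an assumption for illustration.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Stand-ins for deep features: in the paper these would be the activations of
# the trained VGGNet and InceptionNet branches on HyperKvasir images; here
# they are random class-dependent vectors so the sketch runs on its own.
n_per_class, n_classes = 60, 3
labels = np.repeat(np.arange(n_classes), n_per_class)
vgg_feats = rng.normal(labels[:, None], 1.0, size=(n_per_class * n_classes, 128))
incep_feats = rng.normal(labels[:, None], 1.0, size=(n_per_class * n_classes, 128))

# Feature-level fusion: concatenate the two branches along the feature axis.
fused = np.concatenate([vgg_feats, incep_feats], axis=1)  # shape (180, 256)

X_tr, X_te, y_tr, y_te = train_test_split(
    fused, labels, test_size=0.25, stratify=labels, random_state=0)

clf = SVC(kernel="rbf")  # the SVM replaces the softmax head
clf.fit(X_tr, y_tr)
acc = clf.score(X_te, y_te)
```

Swapping the softmax head for the SVM in this way only requires the fused feature matrix, which is why the same extracted features can also be fed to KNN or RF classifiers for comparison.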
Table 6. Comparison of the proposed study with existing research related to gastrointestinal disease

Author | Method | End goal | Classifier | Strength | Weakness | Accuracy/MCC
Borgli et al. (2020) | Averaged ResNet-152 + DenseNet-161 | Classification of GI diseases | Multiclass | Uses the power of two models | Misclassification between classes | MCC = 0.902
Khan, Khan et al. (2020) | (RNN, pre-trained ResNet101) + SVM | Detection of ulcers and classification of gastrointestinal diseases | Multiclass | High accuracy | Fails to segment polyp and bleeding regions and does not rely on the correctness of ground-truth data; limited number of classes | 99.13%
Gamage et al. (2019) | Combined pre-trained DenseNet-201, ResNet-18, and VGG-16 CNNs + ANN | Predicts anomalies of the digestive tract | Multiclass | Combines pre-trained models as feature extractors | Predicts only eight classes of digestive tract anomalies | 97%
Takiyama et al. (2018) | GoogLeNet | Anatomical classification of GI images | Multiclass | High classification performance; computationally efficient | Limited number of classes; only used for anatomical classification | 97%
Owais et al. (2019) | CNN (ResNet) + LSTM | Classification of multiple GI diseases | Multiclass | Computationally efficient; high classification performance | Cascaded training of CNN and LSTM requires more time | 92.57%
Sharif et al. (2019) | (VGG16, VGG19) + K-nearest neighbor | Gastrointestinal tract disease detection and classification | Multiclass | Merges deep convolutional neural networks (CNNs) | Limited dataset and number of classes | 99.42%
Shichijo et al. (2017) | CNN | H. pylori infection detection | Multiclass | Performance of the second CNN is comparable with the clinical diagnosis reference standard | CAD performance should be enhanced; limited number of classes | 83.1%
Segu et al. (2016) | CNN | Small intestine movement characterization | Multiclass | High classification performance | Limited number of classes | 96%
Li et al. (2012) | K-nearest neighbor classifier | Distinguish lymph node metastasis from non-lymph node metastasis | Binary | Combines several feature selection algorithms and metric learning methods for high efficiency | Insufficient number of clinical cases | 96.33%
Wang et al. (2015) | Edge cross-section profile (ECSP) features | Detecting polyps in colonoscopy | - | High performance | Did not classify polyps into different types | 97.7%
Souaidi et al. (2019) | Texture patterns (LBP, Laplacian) + SVM | Detection of ulcers | Binary | Several experiments comparing against state-of-the-art methods | Limited dataset and number of classes | 93.88%
(Continued)
experiment, we confirmed that the prediction ability of the model increased when we used the concatenated features of the proposed VGGNet and InceptionNet models. Before applying noise filtering and augmentation, the accuracy of the model was 71.8%; after noise removal, it increased by 0.9% (to 72.7%). We obtained the best classification accuracy with the adaptive median noise filtering technique. We also applied augmentation to control overfitting and to let the proposed concatenated model learn varied image features, which helps increase its classification ability; after augmentation, the accuracy of the proposed end-to-end model was 94%. We further applied different segmentation techniques to identify regions of interest in the images; however, these techniques lowered the classification performance of the model. When the accuracy of each model was examined with the SVM classifier after features were extracted with each feature extractor individually, VGGNet and InceptionNet achieved accuracies of 85% and 93.4%, respectively. Concatenating the VGGNet and InceptionNet features then improved the classification performance to 98% accuracy. In this study, satisfactory results were obtained after 21 experiments. Finally, state-of-the-art classifiers were compared using the proposed concatenated model. Our proposed concatenated model with the SVM classifier achieves a training accuracy of 98.7%, validation accuracy of 98.2%, testing accuracy of 98%, and Matthews correlation coefficient of 97.8%. These results outperform the proposed VGGNet, the proposed InceptionNet, and the state-of-the-art pre-trained VGGNet16 and Inceptionv3 models. Overall, our proposed model improves on the Matthews correlation coefficient of the previous work by 7.6%, which indicates that the proposed concatenated model is effective for diagnosing gastrointestinal disease from endoscopy images. A comparative analysis of our proposed method against existing works is presented in Table 6. The results show that the proposed method achieves high accuracy compared to other methods.
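The augmentation mentioned above can be as simple as a handful of deterministic geometric transforms; the sketch below shows flips and rotations with NumPy. The specific operations are illustrative assumptions, since this excerpt does not list the augmentation set used.

```python
import numpy as np

def augment(image):
    """Generate simple geometric variants of an endoscopy frame.

    Flips and 90-degree rotations are illustrative stand-ins for the
    augmentation operations used to enlarge the training set.
    """
    return [
        image,
        np.fliplr(image),      # horizontal flip
        np.flipud(image),      # vertical flip
        np.rot90(image, k=1),  # 90-degree rotation
        np.rot90(image, k=3),  # 270-degree rotation
    ]

frame = np.arange(16).reshape(4, 4)
variants = augment(frame)
```

Because endoscopy views have no canonical orientation, such label-preserving transforms expose the model to varied appearances of the same lesion without collecting new images.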
5. Conclusion
The digestive system consists of the gastrointestinal tract and other organs which help the body to
break down and absorb food. The gastrointestinal tract is affected by a variety of diseases which
prevent it from functioning properly. The domain of gastrointestinal endoscopy covers the endoscopic diagnosis of various digestive diseases using image analysis and various devices. Endoscopy is currently the preferred method for examining the gastrointestinal tract; however, its effectiveness is limited by variation in model performance. This study presents a model that concatenates the features of VGGNet and InceptionNet for the classification of multiple gastrointestinal diseases from endoscopy images. The proposed model was trained and tested using a publicly available dataset from the HyperKvasir database. The extracted features were used to train Softmax, KNN, RF, and SVM classifiers. The model achieves a test accuracy of 98% with the SVM classifier, a strong result compared with state-of-the-art approaches, and improves on the previous best Matthews correlation coefficient of 90.2% by 7.6%. As a result, the proposed model is among the best gastrointestinal disease classification models. Based on the experimental outcomes of this study, we suggest the following recommendations as future work:
● Segmentation plays a critical role in the early and accurate diagnosis of various diseases using computerized systems. For the future, we recommend a simultaneous segmentation method that takes advantage of Mask R-CNN for detecting the disease location and GrabCut for segmentation, to improve the performance of the model.
● This study covered only 23 gastrointestinal disease types; other types of gastrointestinal disease can also be considered.
● In future work, it is important to evaluate different feature selection algorithms that can help determine the smallest subset of features that aids accurate classification of gastrointestinal disease types.
recognition, CVPR 2017, 2017 January, 1800–1807. https://doi.org/10.1109/CVPR.2017.195

Deng, J., Dong, W., Socher, R., Li, L., Li, K., & Fei-Fei, L. (2009). ImageNet: A large-scale hierarchical image database. IEEE Conference on Computer Vision and Pattern Recognition (pp. 248–255). https://doi.org/10.1109/CVPR.2009.5206848

Escobar, J., Sanchez, K., Hinojosa, C., Arguello, H., & Castillo, S. (2021). Accurate deep learning-based gastrointestinal disease classification via transfer learning strategy. XXIII Symposium on Image, Signal Processing and Artificial Vision (STSIVA), 1–5. https://doi.org/10.1109/STSIVA53688.2021.9591995

Gamage, C., Wijesinghe, I., Chitraranjan, C., & Perera, I. (2019). GI-Net: Anomalies classification in gastrointestinal tract through endoscopic imagery with deep learning. MERCon 2019 - Proceedings, 5th International Multidisciplinary Moratuwa Engineering Research Conference, 66–71. https://doi.org/10.1109/MERCon.2019.8818929

Guan, F., Ton, P., Ge, S., & Zhao, L. (2014). Anisotropic diffusion filtering for ultrasound speckle reduction. Science China Technological Sciences, 57(3), 607–614. https://doi.org/10.1007/s11431-014-5483-7

Guo, Y., Liu, Y., Oerlemans, A., Lao, S., Wu, S., & Lew, M. S. (2016). Deep learning for visual understanding: A review. Neurocomputing, 187, 27–48. https://doi.org/10.1016/j.neucom.2015.09.116

Hirasawa, T., Aoyama, K., Tanimoto, T., Ishihara, S., Shichijo, S., Ozawa, T., Ohnishi, T., Fujishiro, M., Matsuo, K., Fujisaki, J., & Tada, T. (2018). Application of artificial intelligence using a convolutional neural network for detecting gastric cancer in endoscopic images. Gastric Cancer, 21(4), 653–660. https://doi.org/10.1007/s10120-018-0793-2

Jain, S., & Salau, A. O. (2019). An image feature selection approach for dimensionality reduction based on kNN and SVM for AKT proteins. Cogent Engineering, 6(1), 1–14. https://doi.org/10.1080/23311916.2019.1599537

Jha, D., Ali, S., Hicks, S., Thambawita, V., Borgli, H., Smedsrud, H. P., & Halvorsen, P. (2021). A comprehensive analysis of classification methods in gastrointestinal endoscopy imaging. Medical Image Analysis, 70(1), 102007. https://doi.org/10.1016/j.media.2021.102007

Karargyris, A., & Bourbakis, N. (2011). Detection of small bowel polyps and ulcers in wireless capsule endoscopy videos. IEEE Transactions on Biomedical Engineering, 58(10), 2777–2786. https://doi.org/10.1109/TBME.2011.2155064

Khan, A., Sohail, A., Zahoora, U., & Qureshi, A. S. (2020). A survey of the recent architectures of deep convolutional neural networks. Artificial Intelligence Review, 53(8), 5455–5516. https://doi.org/10.1007/s10462-020-09825-6

Khan, M. A., Khan, M. A., Ahmed, F., Mittal, M., Goyal, L. M., Jude Hemanth, D., & Satapathy, S. C. (2020). Gastrointestinal diseases segmentation and classification based on duo-deep architectures. Pattern Recognition Letters, 131(1), 193–204. https://doi.org/10.1016/j.patrec.2019.12.024

Khan, M. A., Sarfraz, M. S., Alhaisoni, M., Albesher, A. A., Wang, S., & Ashraf, I. (2020). StomachNet: Optimal deep learning features fusion for stomach abnormalities classification. IEEE Access, 8(1), 197969–197981. https://doi.org/10.1109/ACCESS.2020.3034217

Khan, M. A., Majid, A., Hussain, N., Alhaisoni, M., Zhang, Y., Kadry, S., & Nam, Y. (2021). Multiclass stomach diseases classification using deep learning features optimization. Computers, Materials & Continua, 67(3), 3382–3398. https://doi.org/10.32604/cmc.2021.014983

Khan, M. A., Nasir, I. M., Sharif, M., Alhaisoni, M., Kadry, S., Bukhari, S. A. C., & Nam, Y. (2021). A blockchain based framework for stomach abnormalities recognition. Computers, Materials & Continua, 67(1), 141–158. https://doi.org/10.32604/cmc.2021.013217

Li, B., & Meng, M. Q. (2012). Automatic polyp detection for wireless capsule endoscopy images. Expert Systems with Applications, 39(12), 10952–10958. https://doi.org/10.1016/j.eswa.2012.03.029

Li, B., & Meng, M. Q. H. (2012). Tumor recognition in wireless capsule endoscopy images using textural features and SVM-based feature selection. IEEE Transactions on Information Technology in Biomedicine, 16(3), 323–329. https://doi.org/10.1109/TITB.2012.2185807

Li, C., Zhang, S., Zhang, H., Pang, L., Lam, K., Hui, C., & Zhang, S. (2012). Using the K-nearest neighbor algorithm for the classification of lymph node metastasis in gastric cancer. Computational and Mathematical Methods in Medicine, 2012, 1–11. https://doi.org/10.1155/2012/876545

Lin, W., Hasenstab, K., Cunha, G. M., Schwartzman, A., & Elbe-Bürger, A. (2020). Comparison of handcrafted features and convolutional neural networks for liver MR image adequacy assessment. Scientific Reports, 10(1), 1–11. https://doi.org/10.1038/s41598-020-77264-y

Lonseko, Z. M., Adjei, P. E., Du, W., Luo, C., Hu, D., Zhu, L., Gan, T., & Rao, N. (2021). Gastrointestinal disease classification in endoscopic images using attention-guided convolutional neural networks. Applied Sciences, 11(23), 11136. https://doi.org/10.3390/app112311136

Maier-Hein, L., Vedula, S. S., Speidel, S., Navab, N., Kikinis, R., Park, A., Eisenmann, M., Feussner, H., Forestier, G., Giannarou, S., Hashizume, M., Katic, D., Kenngott, H., Kranzfelder, M., Malpani, A., März, K., Neumuth, T., Padoy, N., Pugh, C., Schoch, N., Stoyanov, D., Taylor, R., Wagner, M., Hager, D. G., & Jannin, P. (2017). Surgical data science for next-generation interventions. Nature Biomedical Engineering, 1(9), 691–696. https://doi.org/10.1038/s41551-017-0132-7

Majid, A., Khan, M. A., Yasmin, M., Rehman, A., Yousafzai, A., & Tariq, U. (2020). Classification of stomach infections: A paradigm of convolutional neural network along with classical features fusion and selection. Microscopy Research and Technique, 83(5), 562–576. https://doi.org/10.1002/jemt.23447

Mengistie, Y. (2020). The pattern and outcome of upper gastrointestinal bleeding at St. Paul's Millenium Medical College, Addis Ababa, Ethiopia. Ethiopian Medical Journal, 58(4), 323–327.

Naz, J., Sharif, M., Yasmin, M., Raza, M., & Khan, M. A. (2021). Detection and classification of gastrointestinal diseases using machine learning. Current Medical Imaging, 17(4), 479–490. https://doi.org/10.2174/1573405616666200928144626

Noor Mohamed, S. S., & Srinivasan, K. (2020). Comparative analysis of deep neural networks for crack image classification. In Lecture Notes on Data Engineering and Communications Technologies (Vol. 38, pp. 434–443). Springer. https://doi.org/10.1007/978-3-030-34080-3_49

Owais, M., Arsalan, M., Choi, J., Mahmood, T., & Park, K. R. (2019). Artificial intelligence-based classification of multiple gastrointestinal diseases using endoscopy videos for clinical diagnosis. Journal of Clinical Medicine, 8(7), 986. https://doi.org/10.3390/jcm8070986

Öztürk, S., & Özkaya, U. (2020). Residual LSTM layered CNN for classification of gastrointestinal tract diseases. Journal of Biomedical Informatics, 103638, 1–31. https://doi.org/10.1016/j.jbi.2020.103638

Paoletti, M. E., Haut, J. M., Tao, X., Miguel, J. P., & Plaza, A. (2020). A new GPU implementation of support vector machines for fast hyperspectral image classification. Remote Sensing, 12(8), 1257. https://doi.org/10.3390/RS12081257

Ramzan, M., Raza, M., Sharif, M., Khan, M. A., & Nam, Y. (2021). Gastrointestinal tract infections classification using deep learning. Computers, Materials & Continua, 67(3), 3239–3257. https://doi.org/10.32604/cmc.2021.015920

Sakib, S., Nazib, A., Jawad, A., Kabir, J., & Ahmed, H. (2018). An overview of convolutional neural network: Its architecture and applications. https://doi.org/10.20944/preprints201811.0546.v1

Salau, A. O., & Jain, S. (2019). Feature extraction: A survey of the types, techniques, and applications. 5th IEEE International Conference on Signal Processing and Communication (ICSC), Noida, India, IEEE, 158–164. https://doi.org/10.1109/ICSC45622.2019.8938371

Salau, A. O., & Jain, S. (2021). Adaptive diagnostic machine learning technique for classification of cell decisions for AKT protein. Informatics in Medicine Unlocked, 23(1), 1–9. https://doi.org/10.1016/j.imu.2021.100511

Salau, A. O. (2021). Detection of corona virus disease using a novel machine learning approach. International Conference on Decision Aid Sciences and Application (DASA), 587–590. https://doi.org/10.1109/DASA53625.2021.9682267

Segu, S., Drozdzal, M., Pascual, G., Radeva, P., Malagelada, C., Azpiroz, F., & Vitrià, J. (2016). Generic feature learning for wireless capsule endoscopy analysis. Computers in Biology and Medicine, 79(1), 163–172. https://doi.org/10.1016/j.compbiomed.2016.10

Sharif, M., Attique Khan, M., Rashid, M., Yasmin, M., Afza, F., & Tanik, U. J. (2019). Deep CNN and geometric features-based gastrointestinal tract diseases detection and classification from wireless capsule endoscopy images. Journal of Experimental and Theoretical Artificial Intelligence, 1–23. https://doi.org/10.1080/0952813X.2019.1572657

Shetty, S. K., & Siddiqa, A. (2019). Deep learning algorithms and applications in computer vision. International Journal of Computer Sciences and Engineering, 7(7), 195–201. https://doi.org/10.26438/ijcse/v7i7.195201

Shichijo, S., Nomura, S., Aoyama, K., Nishikawa, Y., Miura, M., Shinagawa, T., Takiyama, H., Tanimoto, T., Ishihara, S., Matsuo, K., & Tada, T. (2017). Application of convolutional neural networks in the diagnosis of Helicobacter pylori infection based on endoscopic images. EBioMedicine, 25(1), 106–111. https://doi.org/10.1016/j.ebiom.2017.10.014

Shorten, C., & Khoshgoftaar, T. M. (2019). A survey on image data augmentation for deep learning. Journal of Big Data, 6(1), 1–48. https://doi.org/10.1186/s40537-019-0197-0

Simonyan, K., & Zisserman, A. (2015). Very deep convolutional networks for large-scale image recognition. 3rd International Conference on Learning Representations, ICLR 2015 - Conference Track Proceedings, 1–14.

Souaidi, M., Ait, A., & El Ansari, M. (2019). Multi-scale completed local binary patterns for ulcer detection in wireless capsule endoscopy images. Multimedia Tools and Applications, 78(10), 13091–13108. https://doi.org/10.1007/s11042-018-6086-2

Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., & Rabinovich, A. (2015). Going deeper with convolutions. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 07–12 June, 1–9. https://doi.org/10.1109/CVPR.2015.7298594

Takiyama, H., Ozawa, T., Ishihara, S., Fujishiro, M., Shichijo, S., Nomura, S., Miura, M., & Tada, T. (2018). Automatic anatomical classification of esophagogastroduodenoscopy images using deep convolutional neural networks. Scientific Reports, 8(1), 1–8. https://doi.org/10.1038/s41598-018-25842-6

Wang, Y., Tavanapong, W., Wong, J., Hwan, J., & De Groen, P. C. (2015). Polyp-Alert: Near real-time feedback during colonoscopy. Computer Methods and Programs in Biomedicine, 120(3), 164–179. https://doi.org/10.1016/j.cmpb.2015.04.002

Woreta, S. A., Yassin, M. O., Teklie, S. Y., Getahun, G. M., & Abubeker, Z. A. (2015). Upper gastrointestinal endoscopy findings at Gondar University. International Journal of Pharmaceuticals and Health Care Research, 3(2), 60–65.

Yuan, Y., & Meng, M. Q. (2014). Polyp classification based on bag of features and saliency in wireless capsule endoscopy. IEEE International Conference on Robotics and Automation (ICRA), 3930–3935.