Mtap S 23 08869

Multimedia Tools and Applications
Segmentation using Hybrid Neural Encoder Decoder based Unet Hybrid Inception on
IDRiD Dataset.
--Manuscript Draft--
Manuscript Number:
Full Title: Segmentation using Hybrid Neural Encoder Decoder based Unet Hybrid Inception on
IDRiD Dataset.
Article Type: Track 6: Computer Vision for Multimedia Applications
Keywords: Fundus images; UNET; Deep Learning; Diabetic Retinopathy (DR); Indian Diabetic
Retinopathy Image Dataset (IDRID)
Corresponding Author: AKANKSHA BALI, Research Scholar

University of Jammu
jammu, Jammu and Kashmir INDIA
Corresponding Author Secondary

Information:
Corresponding Author's Institution: University of Jammu
Corresponding Author's Secondary

Institution:
First Author: AKANKSHA BALI, Research Scholar
First Author Secondary Information:
Order of Authors: AKANKSHA BALI, Research Scholar
Vibhakar Mansotra
Order of Authors Secondary Information:
Funding Information:
Abstract: One of the most common causes of visual difficulties in the globe is diabetic
retinopathy (DR). Early detection of DR can stop the disease's progression with the
right care. In this research, we describe a combinatorial approach for the diagnosis of
both diseases that combines U-Net with a modified Inception architecture. The
suggested approach builds on deep neural architecture and formalizes encoder-
decoder modeling using the Inception and Residual Connection convolutional designs.
The IDRid APTOS 2019 contest dataset was used to validate the suggested model's
performance. Through comparison to Resnet, experiments show that the modified
Inception deep feature extractor enhances DR classification, with a classification
accuracy of 99.34% in IDRid across classes. The study tests the suggested Hybrid
Dense-ED-UHI: Encoder Decoder based model on the dataset. The suggested Hybrid
Dense-ED-UHI: Encoder Decoder based U-Net Hybrid Inception model is tested on the
dataset in the study with 15 fold cross validation. The proposed model's numerous
metrics are thoroughly discussed in the paper, along with several visual
representations and multifield validations.
Powered by Editorial Manager® and ProduXion Manager® from Aries Systems Corporation
Manuscript Click here to access/download;Manuscript;paper multimedia
tools and applications.docx
Click here to view linked References
1
2
3
4
5
6
7
Segmentation using Hybrid Neural Encoder
8
9
10
Decoder based Unet Hybrid Inception on
11
12
13
IDRiD Dataset.
Akanksha Bali1*, Vibhakar Mansotra2
14
15 1,2
Department of Computer Science and IT, University of Jammu, Jammu-180016, J&K, India
16
1
17 Corresponding Author:akankshabali5@gmail.com
18
19 ABSTRACT
20
21 One of the most common causes of visual difficulties in the globe is diabetic retinopathy (DR). Early detection of DR
22
23 can stop the disease's progression with the right care. In this research, we describe a combinatorial approach for the
24 diagnosis of both diseases that combines U-Net with a modified Inception architecture. The suggested approach builds
25
26 on deep neural architecture and formalizes encoder-decoder modeling using the Inception and Residual Connection
27 convolutional designs. The IDRid APTOS 2019 contest dataset was used to validate the suggested model's
28
29 performance. Through comparison to Resnet, experiments show that the modified Inception deep feature extractor
30 enhances DR classification, with a classification accuracy of 99.34% in IDRid across classes. The study tests the
31
32 suggested Hybrid Dense-ED-UHI: Encoder Decoder based model on the dataset. The suggested Hybrid Dense-ED-
33 UHI: Encoder Decoder based U-Net Hybrid Inception model is tested on the dataset in the study with 15 fold cross
34
35 validation. The proposed model's numerous metrics are thoroughly discussed in the paper, along with several visual
36 representations and multifield validations.
37
38
39 Keywords: Fundus images; UNET; Deep Learning; Diabetic Retinopathy (DR); Indian Diabetic Retinopathy Image
40 Dataset (IDRID).
41
42 1. INTRODUCTION
43
44 Diseases of the fundus can cause blindness and vision loss. Age-related macular degeneration, diabetic retinopathy,
45
46 and cataract are a few common fundus disorders that impair vision. [1] Worldwide, and notably in Indian culture,
47 diabetes is a significant pandemic. Since there is too much diabetes, diseases like DR spread as a result.
48
49 Over 422 million people around the world have diabetes, with India having the third-highest percentage of diabetics.
50
51 India, China, the USA, Brazil, and Indonesia currently house half of the population, which has expanded from 108
52
million to 422 million in recent years [2]. Diabetic retinopathy is an eye ailment that can be brought on by diabetes. It
53
54 happens when high blood sugar levels damage the retina's blood vessels, impairing vision. Although diabetes can
55
cause a number of issues, this essay will mostly focus on diabetic retinopathy. An eye ailment called diabetic
56
57 retinopathy appears after having diabetes mellitus for some time. It damages the retina's small veins, causing blockage,
58
damage, and irregular vein growth. A side effect of diabetes caused by high blood sugar levels is the constriction of
59
60
61
62
63
64
65
1
2
3
4 the blood vessels in the eyes. Blindness may result from a decrease in the blood supply to the retina. Lesions that form
5
6 on the retinal surface are one of the signs of DR. [3]
7
8 Chronically high blood sugar levels damage the blood vessels in the retina, which causes diabetic retinopathy. As a
9
result, the arteries get weaker and develop blockages and fluid leaks. In response, the body might produce unusual
10
11 new blood vessels, which could bleed and cause further damage. Factors including excessive blood sugar, high blood
12
pressure, high cholesterol, and chronic inflammation all contribute to its development. Diabetes management and
13
14 routine vision exams are crucial for early detection and treatment. High blood glucose levels damage the tiny blood
15
16 vessels in the retina, which causes DR. Due to the retina's increased production of blood, cholesterol, and other lipids,
17 the macula grows and thickens [4]. IrMAs (Intraretinal Microvascular Abnormalities) (new, abnormally delicate blood
18
19 vessels) begin to form when sample blood is administered to the retina[5]. The optic nerve could become damaged as
20 a result of the eye's increasing pressure. Regular screening of diabetic patients using fundus examinations is the most
21
22 effective way to find early signs of anomalies in the current clinical diagnosis. If DR is discovered early on and treated
23 soon away, blindness can be avoided [6].
24
25 The many stages of DR are depicted in the diagram below. The stages of diabetic retinopathy are depicted in the figure
26
27 below. Microaneurysms and fluid leakage are characteristics of moderate non-proliferative retinopathy. Edoema,
28
distortion, and blood vessel blockages are signs of mild non-proliferative retinopathy. Severe non-proliferative
29
30 retinopathy causes more blocked blood vessels. The growth of abnormal blood vessels, which can lead to hemorrhage
31
and retinal detachment, is a sign of proliferative retinopathy. Grading and detecting illnesses effectively may be
32
33 challenging given the variety of fundus imaging methods. Occlusion, shadow, reflection, or inadequate lighting in DR
34
35 fundus images may also make it difficult to tell apart between diseased and normal regions (figure 1).
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55 Figure 1: Dataset Visualisation
56
57
58 Benchmark testing is the process of evaluating the effectiveness and performance of a unique methodology in a
59 regulated and controlled environment. Manual retinal lesion assessment is time-consuming and requires high accuracy
60
61
62
63
64
65
1
2
3
4 due to the diverse morphologies, sizes, and colors of lesions. Accurate and speedy diagnosis of diabetic retinopathy
5
6 (DR) is made possible by computer-assisted diagnostic (CAD) techniques, which aids in determining the optimum
7 course of treatment. DR lesion segmentation is crucial for examining the Region of Interest (ROI) and assessing the
8
9 grade and severity of the disease. The diseases that attracted the most research focus were diabetic retinopathy, AMD,
10 and glaucoma. [7] ML techniques have taken hold because they are more accurate than conventional segmentation
11
12 techniques, which rely on a limited number of markers. DL, a modern technology, has shown enhanced performance
13 in DR lesion classification and segmentation when compared to traditional ML algorithms [8].
14
15
16 Palm Myopia and Glaucoma are the two datasets that were used. With nearsightedness, also known as myopia, objects
17 up close are clearly seen but those farther away appear fuzzy. It occurs when the eyeball is too long or the cornea is
18
19 excessively curved, which causes light rays to focus in front of the retina rather than on it. The anonymous eye
20 measurements and associated information from a group of myopic people make up the Palm Myopia dataset.
21
22 Glaucoma is a group of eye disorders that damage the optic nerve and cause progressive and irreversible vision loss.
23 Increased intraocular pressure is a common risk factor, however it can also occur when eye pressure is normal. Early
24
25 detection and treatment are crucial to preventing vision loss or blindness because symptoms frequently manifest later
26 than anticipated. The glaucoma dataset is a collection of clinical data and anonymized eye measurements from
27
28 glaucoma patients and healthy individuals, and it is also used for benchmark testing. Plans for care and diagnosis are
29 frequently aided by the results of visual field tests, measurements of the optic nerve, and demographic data.
30
31
Deep neural networks and specific machine learning models have recently demonstrated promise in a number of
32
33 computer vision applications, particularly in the processing of medical images for diabetic retinopathy. Medical
34
35 decision-making can be enhanced and patient care improved with the use of deep learning-based computer-aided
36 diagnostic (CAD) systems that can classify anomalies [9]. The remainder of the essay is organized as follows. Section
37
38 II presents related works on DR image classification. Section III provides an explanation of the facts and the suggested
39 course of action. Section IV presents the experimental analysis and commentary. The final section, Section V, contains
40
41 the conclusions.
42
43 2. LITERATURE REVIEW
44
45 Researchers have made significant progress in the field of diabetic retinopathy, depending on their area of interest and
46
47 study concentration. As shown by the work that links the domains of medical sciences and machine learning,
48 researchers have proposed and used a range of machine learning techniques. However, for diabetic retinopathy, a
49
50 comparative study of various deep learning techniques is still lacking. The completed work shows that when
51 examining the results and contrasting several machine learning methods for DR, a novel approach can be applied.
52
53 Image processing was employed by [10] for the automatic and early identification of diabetic retinopathy in addition
54 to a number of other analysis methods. With threshold-centered static wavelet transforms for the retinal fundus image
55
56 and CLAHE (Contrast Limited Adaptive Histogram Equalization) for vessel enhancement, [11] proposed a technique
57 for morphological operations-based picture enhancement. The approach utilized by [12] is based on a mixed classifier
58
59 that recognizes retinal lesions through feature formation, lesion extraction from candidates, preprocessing, and
60
61
62
63
64
65
1
2
3
4 classification. As a result of the study, the m-Mediods-based modeling methodology will be improved upon and
5
6 combined with the Gaussian Mixture Model to produce a hybrid classifier with improved classification accuracy. [13]
7 tested the hypothesis that neural networks could identify diabetes symptoms in fundus photographs by comparing the
8
9 network to a collection of fundus photographs used for ophthalmologist screening. The study showed that it was
10 possible to locate arteries, exudates, and hemorrhages.Compared to ophthalmologists, their network was more
11
12 accurate in detecting diabetic retinopathy. [14] contributed to the reduction in the quantity of features required for
13 lesion categorization in their study using feature ranking and Adaboost. They proposed a brand-new two-step
14
15 hierarchical classification system that eliminates false positives or non-lesions in the first stage.The second stage
16 further categorizes bright lesions into cotton wool patches and hard exudates. Red lesions are still classified as
17
18 haemorrhages and micro-aneurysms (MA), in addition.
19
A multi-scale feature fusion technique was proposed by Guo et al. [15] to solve issues with micro lesion detection.
20
21 Binary cross-entropy (BCE) loss and balancing coefficients were used to increase sensitivity. Full-resolution images
22
that were scaled to pixels without any additional preprocessing were used to train the computational model. In order
23
24 to solve the drawbacks of patchwise training, which does not adequately capture global context, Yan et al. [16]
25
introduced a mutually local-global U-Net.
26
27 Other techniques that researchers have developed over time for segmentation purposes include region growing
28
29 techniques, fuzzy C-Means (FCM) clustering algorithms, which are frequently used to divide image pixels into
30 different clusters [17], and mathematical morphology operations, which are carried out by examining the geometrical
31
32 composition of specific retinal components [19, 20]. It is, however, not viable to simultaneously extract spatial and
33 spectral properties across several spectral slices because the majority of available techniques are restricted to
34
35 functioning on a single retinal image.
36 Various segmentation strategies have been recommended by various researchers. In any case, these methods are
37
38 pathologically unrelated and only apply to fundus images [21]. The vascular vessel treemap is difficult to segment
39 without discontinuities. The Bar-selective Combination Of Shifted Filter Responses (BCOSFIRE) filter was
40
41 developed by Azzopardi et al. Its parameters have an effect on how well the filter works. Wilfred Franklin and Edward
42 Rajan [23] developed the Multilayer Perceptron Neural Network to identify retinal arteries. The backpropagation
43
44 technique is used to change the weight of the feedforward network for its representation. The reason for its lower level
45 of accuracy of 95.03% is because it is reliant on pixel processing. An autonomous network that relies on error-based
46
47 classification was proposed by Partovi et al. [24] for the classification of photographs. A dataset of images from remote
48 sensing was used to validate it. The Deep CNN (DCNN) model offered an expanded feature extraction-classification
49
50 method for categorizing medical images. The Deep Convolutional Neural Network (DCNN) was trained by Gulshan
51 et al. [25] to recognize DR in retinal fundus pictures. DR and diabetic macular edema can be automatically detected
52
53 by algorithms built using deep learning and retinal fundus images. Based on the important choice made by the
54
ophthalmology team, the method's specificity and sensitivity for measuring DR expressed as moderate or worse DR
55
56 or both were developed. Deep convolutional neural networks were used to build the approach, which has a sensitivity
57
and specificity of 96.5% and 92.4%, respectively.
58
59
60
61
62
63
64
65
1
2
3
4 The authors of (Noh et al., 2015) [26] proposed an enhanced semantic segmentation method that involves learning a
5
6 deconvolution network. The deconvolution network is composed of deconvolution and unpooling layers that identify
7 pixel-wise class labels and forecast segmentation masks, and the convolutional layers are also borrowed from VGG16.
8
9 On the identical PASCAL VOC 2012 dataset, the suggested method attained 72.5% IoU. Shen et al.'s [27] multicrop
10 pooling approach was employed in DCNN to classify lung nodules on CT images by capturing object salient attributes.
11
12 DR images were categorized using a Gabor filtering and SVM classification model in the literature [28]. The Circular
13 Hough Transform (CHT) and CLAHE models were used before the classifier, and they produced a detection rate of
14
15 91.4% on the STARE dataset.
16 [29] used feature ranking and Adaboost in their work, which helped to reduce the number of features required for
17
18 categorizing lesions. They proposed a brand-new two-step hierarchical classification system that eliminates false
19
positives or non-lesions in the first stage. In the second stage, bright lesions are further divided into two groups: cotton
20
21 wool patches and hard exudates. Red lesions are still classified as hemorrhages and micro-aneurysms (MA), in
22
addition. The division of the blood artery is an essential precondition for the diagnosis of many retinal disorders. On
23
24 the DRIVE dataset, the cascaded U-Net architecture utilized by [29] had an accuracy of 96.92% and a precision of
25
97.40%.
26
27 Arterioles and venous arteries are simultaneously cut from the DR images using a deep CNN network that is proposed
28
29 by [30]. The framework demonstrated 87% sensitivity and 98% specificity on the DRIVE dataset. In order to segment
30 blood vessels, data augmentation using the U-Net model was performed over the DRIVE dataset by [31], which
31
32 resulted in an AUROC score of 0.97.
33 Applying the necessary pooling, softmax, and ReLU layers, [6] categorized DR fundus images using a modified
34
35 AlexNet architecture on the MESSIDOR dataset.The suggested model had a 96.6% accuracy rate according to the
36 MESSIDOR dataset.[32] used a short dataset transfer learning approach and the InceptionV3 architecture to develop
37
38 a DR model for classification. Hagos' method achieved 90.9% accuracy using an SGD optimizer with the cosine loss
39 function for binary classification. In a work by [33], a generalization of the backpropagation approach and a weakly
40
41 supervised model were used to investigate the referable lesion locations in DR pictures. The suggested approach by
42 [34] has an area around the ROC curve (AUC) of 95.50% on the Kaggle dataset and 94.90% on the E-Ophtha dataset,
43
44 respectively. In a study [35], red lesions in fundus images were identified using a group of models that had undergone
45 deep learning training. In that method, a Deep CNN was created by first extracting 3232 pixel patches. Hand-crafted
46
47 characteristics were additionally retrieved and fed into the random forest (RF) classifier. Orlando's method showed
48 how a hybrid feature vector containing both manually created and features based on deep learning might improve the
49
50 networks' performance and achieve 89.32% AUC.
51
52 Table 1: Literature Review Comparison
53
54
Study Methodology Advantages Disadvantages Specificity Sensitivity AUC
55
56 [25] DL: High Large labelled 0.99 0.90 -
57
58 Convolutional accuracy and datasets required
59 precision
60
61
62
63
64
65
1
2
3
4 Neural
5
6 Network
7 [36] DL: Ensemble Automated Computationall 0.87 0.96 0.98
8
9 of CNNs feature y intensive
10 learning
11
12 [37] DL: CNN takes care of Prone to 0.98 0.94 0.97
13
various overfitting
14
15 lesion
16
morphologie
17
18 s
19
20 [38] DL: CNN robust to Data 0.91 0.90 0.93
21 fluctuations augmentation
22
23 and noise may be required
24 [39] ML: Random can handle Parameter 0.70 0.80 -
25
26 Forest skewed or tuning may be
27 tiny datasets required
28
29 [40] ADL (Active With the Can be 0.95 0.92 -
30 Deep availability improved
31
32 Learning) of (ROI), the through data
33 degree of mining and data
34
35 severity of exchange
36
retinal methods
37
38 imaging
39
relevant
40
41 patches may
42
be defined.
43
44
[41] Deep DR For DR The model is too 0.97 0.97 -
45
46 severity complex
47
classification
48
49 , it is helpful.
50
51 [42] Deep learning It assigns Parameter 0.90 0.90 -
52 methods pixel values tuning may be
53
54 to assess the required
55 significance
56
57 of the
58 projected
59
60 category.
61
62
63
64
65
1
2
3
4 Proposed UHI Network Described in After 0.997 0.989 0.999
5
6 Model section 3 Hyperparamet
7 er Tunning
8
9 The creation of an automated DR diagnosis approach is the aim of this work. The segmentation of the fundus image
10 using a variety of deep learning approaches is covered in the study. The main IDRID dataset and the sub datasets for
11
12 glaucoma and myopia are used as benchmarks for the paper's classifications. The paper is able to hybridize the
13
upgraded Dense-ED-UHI: Encoder Decoder based UNet Hybrid Inception architectures with Inception V3 to calculate
14
15 complex features for better extractions, in addition to speeding up computation for these designs.
16
17 We created a special architecture in light of the aforementioned review to get beyond the aforementioned limitations.
18
19 The primary contributions of the suggested research projects are summarized as follows:
20
21 1) In this study, retinal lesions are semantically segmented using a customized U-Net model (UHI). A deep
22
network encoder with the name of "InceptionV3" is the one employed in the model. The InceptionV3 model's
23
24 smaller convolutions enable a more rapid training process.
25
2) The model uses a special up-sampling technique that utilizes pixel-wise periodic shuffling convolution to
26
27 resize the convolution nearest neighbor. Our method accelerates network convergence compared to
28
29 conventional methods and generates retinal images with great quality and free of checkerboard artifacts.
30 3) The proposed approach showed cutting-edge findings for the diagnosis of DR lesions when compared to
31
32 other published publications.
33 4) The study infects deep encoding using a hybrid U-Net architecture with inception design and various
34
35 kernel extractions.
36 5) The study assesses the model's effectiveness for DR across several classes.
37
38 3. METHODOLOGY
39
40 This section fully describes the dataset using the methodology, experiment design, and workflow utilized in the
41
42 proposed framework in figure 2. The suggested architecture is depicted below.
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37 Figure 2 Flow Diagram
38
39
40 3.1 Dataset Description
41
42 The diabetic retinopathy, an eye-related complication of diabetes, is the focus of the open-access Indian Diabetic
43 Retinopathy Image Dataset (IDRid). The dataset is meant to aid research and development in the field of retinal
44
45 image analysis, notably for the identification and classification of diabetic retinopathy. Overall, there are 516
46 retinal images with a resolution of 4288 2848. Each different DR deformity, such as hemorrhages, hard-exudates,
47
48 soft-exudates, and microaneurysms, has a binary mask thanks to the pixel-level annotation. Additionally, the DR
49 severity has been assessed and assigned to each of the 516 images [43]. The table 2 data description.
50
51
52 Table 2: Data Description
53
54 S.No Key Points Description
55
1 Content Images of the retina's fundus are included in the IDRid dataset. It includes
56
57 occurrences of both common and pathological diabetic retinopathy. The full
58
59 collection consists of 516 images of the retina at high quality.
60
61
62
63
64
65
1
2
3
4 2 Annotation Each image in the collection receives expert-verified ground truth
5
6 annotations explaining the presence and severity of diabetic retinopathy.
7 From "No DR" (zero) to "Proliferative DR" (four), the designations, which
8
9 range from 0 to 4, represent the International Clinical Diabetic Retinopathy
10 severity categories.
11
12 3 Image formats The IDRid dataset's uncompressed TIFF format maintains the high-resolution
13
features needed for analysis and diagnosis.
14
15 4 Dataset The dataset was used to construct two sets: a training set and a testing set.
16
division The training set has 413 photos, whereas the testing set contains 103 images.
17
18 Researchers can evaluate the performance of their algorithms on fictitious
19
20 data using this category.
21 5 Terms of The IDRid dataset is accessible for research purposes in accordance with the
22
23 Usage terms of the Creative Commons Attribution 4.0 International (CC BY 4.0)
24 license. This indicates that you are permitted to use the dataset, alter it, and
25
26 redistribute it for any non-commercial use as long as the original authors are
27 properly attributed.
28
29
30
31 3.2 Implementation Details
32
33 A GPU with 8GB of RAM is used to train models. Both works by [44, 45] attempted to use U-Net for picture
34
35 segmentation, and they were among the first to do so. In these papers, it is addressed how to use the U-Net design to
36 segment retinal lesions in images of diabetic retinopathy. Since then, many medical picture segmentation applications,
37
38 such as the detection of diabetic retinopathy, have made extensive use of the U-Net architecture. The U-Net
39
architecture, which was first introduced for general biomedical image segmentation tasks, is now widely employed in
40
41 a variety of medical imaging fields, including diabetic retinopathy.
42
43 The U-Net architecture has been found to be beneficial for segmenting retinal structures and lesions in images of
44
45 diabetic retinopathy, although there are some limitations as well. There are several downsides to using U-Net for
46 diabetic retinopathy, such as a limited ability to gather global context. Because U-Net's encoder-decoder architecture
47
48 prioritizes local information over larger spatial relationships and contextual signals, it may not be able to successfully
49 take these factors into account. Due to this restriction, the model may not be able to incorporate all the essential global
50
51 data for accurate categorization of diabetic retinopathy, which could have a negative impact on the model's overall
52 effectiveness. In order to overcome this restriction, additional architectural modifications or the use of alternative
53
54 approaches may be needed to enhance the integration of the global environment and boost classification accuracy.
55
56 Instead than employing the standard U-Net architecture, the study uses elements from the Inception design and
57
58 incorporates them into the neural network. As the Inception architecture mixes several kernel sizes to gather data on
59 various scales and at various granular levels, the detection process becomes easier. Additionally, U-Net Inception
60
61
62
63
64
65
1
2
3
4 enhanced feature representation, contributed to a deeper understanding of the retinal image, and could enhance the
5
6 overall performance of the proposed model. The expansive to contract cross-linking also contributes to the creation
7 of the incredibly accurate segmented output image. The U-Net contract chain consists of four encoder unit blocks.
8
9 There are two convolution layers in each encoder unit, which are followed by a max-pooling layer. The feature
10 elements are multiplied by two each time a pooling procedure is applied. The key difference between contractile and
11
12 expanded approaches is the bottleneck, which consists of two convolution layers with one at the top. Using the contract
13 trail, two convolution layers, a de-convolution layer, and two comparable feature maps make up a U-Net expanding
14
15 path that is being executed by four decoder units. The short differences between the two are covered in table 3 below.
16
17 Table 3: Comparison between U-Net Inception and UHI model
18
19
20
21
Aspect UHI: U-Net Inception Standard U-Net
22
23
24
25 Architecture Concepts from the Inception architecture are The initial U-Net architecture had skip
26
27 incorporated into the U-Net framework. connections and encoder-decoders.
28
29
30 Multi-scale feature varies the kernel size to effectively capture data significantly emphasizes the usage of skip
31
32 extraction at different scales. connections to extract local features.
33
34
35
36 Global context The inclusion of Inception-inspired modules The ability to understand the overall situation
37 may make it easier to capture the overall is somewhat constrained.
38
39 context.
40
41
42 Fine detail Possibility of improved retinal image To convey fine information, may rely more on
43
44 representation representation of fine detail. skip connections.
45
46
47
Performance the ability to increase classification precision Performance on tasks involving the
48
49 potential by utilizing data from various scales. categorization of diabetic retinopathy is well-
50
51 established.
52
53
54 Complexity may make the architecture and processing more less complicated architecture with less
55
56 complex. unnecessary elements.
57
58
59
60
61
62
63
64
65
1
2
3
4 The figure below has a detailed discussion of the proposed work using the dataset divided into the training, validation,
5
6 and testing set for DR classification in retinal images.
7
8 The author suggests a thorough overflow of using four benchmark datasets to train the pictures, follow the techniques,
9
and update as needed (see figure 3 below). The dataset is then put to the test, and if the model meets expectations, it
10
11 can be used to the field of medical imaging.
12
13 The key steps of data preprocessing are shown in figures 3 and 4 in the following order:
14
15 1. As mentioned, the training data makes up 70% of the testing and validation data, which is utilized to train
16 and validate the model. To analyze the fitting parameter and guarantee accurate modeling, testing and
17
18 validation procedures are utilized.
19 2. Data preparation, which entails several stages like scaling, filtering, and normalizing the input sequences
20
21 that have been provided.
22 3. To make a binary mask for several classes being used, mask encoding or one hot encoding is employed.
23
24 4. The discussion of model construction, hyperparameters, and other units follows.
25 5. After the training cycle, the model is thoroughly evaluated for Cross Validation across all criteria, including
26
27 recall, sensitivity, and specificity.
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51 Figure 3 Data Split and Implementation
52
53
54 The proposed U-Net Inception model for categorizing diabetic retinopathy is shown in the image below. The training
55 and testing sets are passed through various preprocessing pipelines, and the train images are then separated using the
56 deep network's loss function and back propagation tools, which include weights, biases, layers, and hyperparameters,
57 as shown in figure 4.
58
59
60
61
62
63
64
65
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28 Figure 4 Data Evaluation and Cross Validation set
29
30
31 The projected architectural elements of the various building blocks are described as follows:
32
33
3.2.1 Encoder Unit
34
35
36 The encoder is necessary for the U-Net Inception Net structure's categorization of diabetic retinopathy in order to
37 recognize and extract hierarchical properties from the input retinal images. Convolutional layers are used in its
38
39 construction, and they mix input data with filters to extract features. The filter depth is gradually increased by the
40 encoding layers, allowing the neural network to collect complex data necessary for classifying diabetic retinopathy.
41
42
The encoder increases the number of feature channels while reducing the spatial dimensions of the feature map using
43
44 down sampling techniques like max pooling or stride convolution. This downsampling helps with data summarization
45
and creating more sophisticated visualizations of the input. One of the distinctive characteristics of the U-Net Inception
46
47 Net is the integration of the Inception module within the encoder.The Inception module has parallel convolutional
48
branches that use various kernel sizes, including 1x1, 3x3, and 5x5, as well as a max pooling branch. These branches
49
50 capture features at many scales and let the network learn both local and global information at the same time. By
51
52 incorporating the Inception module, the encoder of the U-Net Inception Net enhances the network's ability to gather
53 the complex and multi-scale data required for accurate categorization of diabetic retinopathy.
54
55 Overall, to identify diabetic retinopathy, the U-Net Inception Net encoder gathers relevant and discriminative
56
57 properties from the input retinal images, producing a rich representation.
58
59 3.2.2 Inception Unit
60
61
62
63
64
65
1
2
3
4 The Inception module, which was first introduced in Google Net, is part of the U-Net Inception Net design for
5
6 categorizing diabetic retinopathy. It enhances the capability of feature extraction in a network by acquiring multi-scale
7 and multi-level characteristics.
8
9
The Inception module has parallel convolutional branches with 1x1, 3x3, and 5x5 kernel sizes, as well as a max
10
11 pooling branch. Every branch of the network gathers characteristics at a different receptive field size, allowing the
12
network to collect data at different scales. Dimensionality and input channel reduction are handled by the 1x1
13
14 convolutions, which contributes to a decrease in computational complexity.
15
16 By incorporating the Inception module, the U-Net Inception Net can record both local and international data. The
17
18 module's concurrent branches let the network simultaneously learn complex and varied information. This is especially
19 useful when categorizing diabetic retinopathy because the network needs to gather data at many sizes because there
20
21 are a variety of lesions present, including microaneurysms and exudates.
22
23 The Inception module enhances the ability of the U-Net Inception Net to extract discriminative properties relevant to
24
25 diabetic retinopathy. By mixing many scale features, the network can gather both high-level semantic information and
26 fine-grained details. This improves the accuracy and robustness of the classification process, enabling more accurate
27
28 detection and evaluation of diabetic retinopathy (DR) in retinal images.
29
30 3.2.3 Skip Connections
31
32 Skip connections are an important component of the U-Net design, which includes the U-Net Inception Net, for the
33
34 classification of diabetic retinopathy. Enhancing information flow and maintaining spatial details across the network
35 depend on these linkages.In the U-Net Inception Net, skip links instantly connect the encoding and decoding paths.
36
37 These connections enable the blending of multi-scale and multi-level features, which facilitates the integration of low-
38 level and high-level data. The skip connections are made by concatenating the feature maps from the corresponding
39
40 encoder layers to the decoder layers.
41
42 By using skip connections, the U-Net Inception Net successfully combines semantic detail from the decoder with low-
43
level fine-grained data from the encoder. This enables the network to comprehend the related elements in the input
44
45 retinal images thoroughly. It aids in the retention of spatial information in addition to enhancing the network's
46
capability to properly classify diabetic retinopathy lesions.
47
48
49 Skip connections also assist in resolving the problem of a potential data bottleneck in deep networks. By simply
50 connecting the encoding and decoding paths, a network can access feature maps at different scales and levels,
51
52 improving gradient flow and resolving the vanishing gradient problem.
53
54 The U-Net architecture's initial use of skip connections By integrating local data with global context, net architecture
55
generally supports effective knowledge transmission, enables the integration of multi-scale data, and aids in the
56
57 accurate detection of diabetic retinopathy lesions.
58
59 3.2.4 Bottleneck
60
61
62
63
64
65
1
2
3
4 The bottleneck serves as a bridge to gather the input data in its most compressed and abstract form. It is located in the
5
6 center of the encoder and decoder paths.
7
8 The U-Net Inception Net bottleneck frequently consists of multiple convolutional layers, including Inception
9
modules.These layers and modules are designed to capture highly discriminative qualities, which are necessary for
10
11 accurately categorizing lesions connected to diabetic retinopathy.
12
13 By incorporating Inception modules within the bottleneck, the U-Net Inception Net is given the ability to gather
14
15 characteristics at many scales and levels of abstraction.Parallel convolutional branches of the Inception modules
16 facilitate the extraction of detailed information, enabling the network to recognize complex structures and patterns
17
18 observed in retinal images. Compiling the encoder's learned representations and preparing them for the decoding path
19 is the bottleneck's responsibility. It aims to capture the essential qualities most important to the classification task
20
21 while reducing the computational complexity of the network.
22
23 The data is effectively compressed by the bottleneck, which increases the expressiveness and concision of the retinal
24
25 pictures' representation. By gathering the most discriminative features, it enables the network to produce accurate
26 predictions throughout the subsequent decoding and classifying stages.
27
28 3.2.5 Activation Function
29
30
31 Activation functions are important to provide non-linearity and enhance the network's ability to learn complex
32 connections between the input data and the target labels.
33
In the U-Net Inception Net, the Rectified Linear Unit (ReLU), Leaky ReLU, and Exponential Linear Unit (ELU) are
34
35 often employed activation functions. Popular activation function ReLU keeps positive values while zeroing out
36
negative values. The dying ReLU problem is avoided by leaky ReLU by allowing a little non-zero slope for negative
37
38 values. ReLU does not have a smooth curve for negative values, whereas ELU does. These activation functions imitate
39
40 the non-linear mapping between labels for diabetic retinopathy (DR) and the input retinal images. The U-Net Inception
41 Net may use non-linearity to capture complicated patterns and data, which improves its capacity to recognize and
42
43 classify various forms of diabetic retinopathy lesions. The specific criteria for the classification task and the features
44 of the dataset both have an impact on the choice of the activation function.
45
46 3.2.6 Decoder Unit
47
48
49 The Decoder is responsible for upsampling the feature maps and restoring the lost spatial resolution.
50
51 Bilinear interpolation or transposed convolutions are two popular upsampling layers used in the decoder for the U-
52 Net Inception Net. These layers increase the spatial dimensions of the feature maps, allowing the network to reproduce
53
54 the finer details of the retinal images.
55
56 The linked encoder layers' feature maps are combined via skip connections in the decoder as well. These skip
57
58 connections are crucial for storing geographic data and combining low-level and high-level features in order to
59 effectively classify data. Because of the skip connections in the U-Net Inception Net, it is simpler to mix multi-scale
60
61
62
63
64
65
1
2
3
4 features, providing the network access to high-level semantic information as well as fine-grained minutiae. This
5
6 improves the network's understanding of diabetic retinopathy lesions and aids in the collection of pertinent data from
7 various levels of abstraction.
8
9
Using upsampling operations and skip connections, the decoder in the U-Net Inception Net successfully reconstructs
10
11 the spatial details of the retinal pictures, providing the information necessary for accurate categorization of diabetic
12
retinopathy.
13
14
15 3.2.7 Hyperparameter Set
16
17 Several hyperparameters have a significant impact on how well the U-Net Inception Net architecture performs for
18
categorizing diabetic retinopathy. The learning rate determines the step size during optimization, which has an impact
19
20 on the stability and speed of convergence. Gradient estimation's efficiency and accuracy are influenced by the batch
21
22 size, which controls how many data are processed during each training iteration. The number of layers affects the U-
23 Net Inception Net's capacity to learn complex properties. Although they require more computer power, deeper
24
25 networks can capture more precise patterns. Dropout and L1 or L2 regularisation are regularisation parameters that
26 prevent overfitting and control the amount of regularisation utilized.
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44 (a) Unet Architecture
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59 (b) Inception V3 convolutional network
60 Figure 5: UHI Architecture
61
62
63
64
65
1
2
3
4 The choice of activation functions, such as ReLU, Leaky ReLU, or ELU, has a substantial impact on the model's non-
5 linear behavior shown in figure 5 and its ability to reflect complex relationships. It usually takes testing and validation
6 on various validation sets to determine the appropriate hyperparameter values for the U-Net Inception Net
7
architecture's classification of diabetic retinopathy. To investigate various setups, methods like grid search or random
8
9 search may be utilized.
10 3.3 Metrics Used
11
12 3.3.1 IOU Score
13
Using the IOU Score (Intersection over Union), image segmentation algorithms' accuracy is commonly assessed in
14
15 diabetic retinopathy (DR). It calculates how much the segmented regions created by the algorithm and the ground
16
truth regions in DR pictures overlap. The IOU Score is calculated by dividing the area of overlap between the two
17
18 regions by the total area that is covered by both regions. A higher IOU Score indicates better segmentation
19
20 performance, indicating that the algorithm's projections are more closely aligned to the actual areas that are significant
21 in DR, including blood vessels in the retina and lesions. This metric is crucial for evaluating how well algorithms
22
23 identify and delineate these zones, which is required for the recognition and monitoring of DR progression.
24
25 The formula for IOU is:
26
27
28
𝑇𝑎𝑟𝑔𝑒𝑡 ⋂ 𝑃𝑟𝑒𝑑𝑖𝑐𝑡𝑒𝑑
29 𝐼𝑂𝑈 =
30 𝑇𝑎𝑟𝑔𝑒𝑡 ⋃ 𝑃𝑟𝑒𝑑𝑖𝑐𝑡𝑒𝑑
31
32 3.3.2 Focal Loss
33 Focal Loss is a loss function to address class inconsistencies in computer vision applications, such as diabetic
34
35 retinopathy (DR) analysis. It also includes a modulator that reduces the loss caused by successfully identified cases
36 while drawing attention to challenging ones. Both the binary cross-entropy loss and the modulating factor, which
37
38 dynamically alters the weight assigned to each example based on its anticipated likelihood, are incorporated in the
39 formula. Focal Loss allows the algorithm to focus more on correctly classifying the minority class, such as DR cases,
40
41 which improves the identification and diagnosis of DR-related abnormalities. It decreases the consequences of class
42 imbalance and increases the effectiveness of DR analysis techniques.
43
44
In DR, the Focal Loss increases the weights of misclassified or challenging samples, particularly those in the class of
45
46 diabetic retinopathy. This enables the model to focus on recognizing DR-related issues and prioritize them more
47
effectively. This is accomplished by adding a modulating component to the loss function, which reduces the
48
49 contribution of cases with obvious classifications and emphasizes difficult examples.
50
51 Focal Loss enables DR models to manage class imbalances and improve performance by concentrating more on
52
53 challenging occurrences. It aids in better monitoring and screening of the illness as well as better detection and
54 diagnosis of diabetic retinopathy.
55
56
57 3.3.3 Dice Loss
58 Dice loss is a commonly used loss function in medical image segmentation applications, including the analysis of
59 diabetic retinopathy (DR). It makes use of the Dice coefficient, which determines how much of the segmentations that
60
61
62
63
64
65
1
2
3
4 are anticipated and those that are observed overlap. The Dice Loss aims to maximize segmentation accuracy by
5
6 directly maximising the Dice coefficient.
7
8 The formula for Dice loss is:
9
10
11
12 𝑌 ⋂ 𝑌𝑝𝑟𝑒𝑑
𝐷𝑖𝑐𝑒 𝐿𝑜𝑠𝑠 = 1 − 2 ∗
13 𝑌 + 𝑌𝑝𝑟𝑒𝑑
14
15 How many pixels or voxels from the predicted and ground truth divisions overlap each other is indicated by the
16
17 intersection in this formula. The sum of the predicted and ground truth pixels or voxels in each region represents the
18 total number of pixels or voxels in that area.
19
20
During training, the Dice Loss should be as minimal as feasible. By encouraging the model to produce divisions that
21
22 closely resemble the actual data by minimizing the dice loss, DR-related features can be distinguished more precisely.
23
24 The Dice Loss is a dependable loss function for segmentation model training in DR analysis.
25
26 3.3.4 Accuracy
27 How well a model predicts outcomes in categorization tasks is measured by a performance metric called accuracy.
28 The ratio of correctly predicted samples to all samples in a dataset is displayed. The accuracy score indicates how well
29
30 a model does overall at categorizing situations into the correct subcategories.
31
32 The formula for Accuracy is:
33
𝑇𝑃 + 𝑇𝑁
34 𝐴𝑐𝑐𝑢𝑟𝑎𝑐𝑦 =
35 𝑇𝑃 + 𝑇𝑁 + 𝐹𝑃 + 𝐹𝑁
36
37 In this context, TP stands for True Positive, and TN for True Negative. While FP and FN stand for false positive and
38 false negative, respectively. A higher percentage of correct forecasts is correlated with increased accuracy. In the
39
40 context of DR, accuracy refers to the model's ability to differentiate between DR and non-DR cases. It provides a
41
thorough assessment of the algorithm's performance in terms of correctly identifying the presence or absence of
42
43 diabetic retinopathy.
44
45 Precision might not be sufficient in circumstances when there is a class imbalance, though. Precision, recall, and F1-
46
47 score are usually paired with accuracy to provide a more complete picture of how well the model performs in DR
48 analysis.
49
50
3.3.5 Binary Accuracy
51
52 Binary accuracy, also called binary classification accuracy, is a specific application of the accuracy measure to binary
53 classification issues. It measures the proportion of positively classified examples (typically denoted by "1") that are
54
55 accurately expected as opposed to all other instances.
56
57 Binary accuracy is widely used to assess the performance of models for tasks like diabetic retinopathy (DR) analysis,
58 where the goal is to accurately classify instances as either DR-positive or DR-negative. It provides a definite indication
59
60
61
62
63
64
65
1
2
3
4 of the model's likelihood to correctly forecast the positive class, which is crucial for locating and categorizing DR-
5
6 related abnormalities.
7
8 3.3.6 AUC and ROC
9 AUC (Area Under the ROC Curve), a useful evaluation statistic for binary classification tasks, gauges a model's
10
11 performance. It assesses the forecasts of the model's general accuracy at several classification criteria.
12
13 The ROC (Receiver Operating Characteristic) curve graphically illustrates the model's performance as the
14 classification threshold is altered. It plots the true positive rate (TPR) vs the false positive rate (FPR) at various
15
16 threshold levels. The AUC stands for the area under this curve. An increase in AUC, which ranges from 0 to 1,
17 indicates better classification ability. An AUC of 0.5 denotes guessing at random, whereas an AUC of 1 indicates a
18
19 perfect classifier. In the context of DR analysis, a larger AUC indicates that the model is more accurate at
20 differentiating between DR-positive and DR-negative circumstances. A high AUC value shows that the model is
21
22 successful in identifying individuals with diabetic retinopathy, which aids in the condition's diagnosis and treatment.
23
24 3.3.7 Specificity
25 Specificity, often known as the true negative rate, is a performance metric for binary classification issues. It tests a
26
27 model's ability to identify instances of negativity or the absence of a condition.
28
29 The specificity formula is shown below:
30
𝑇𝑁
31 𝑆𝑝𝑒𝑐𝑖𝑓𝑖𝑐𝑖𝑡𝑦 =
32 𝑇𝑁 + 𝐹𝑃
33
34 False positives (FP) are situations that this formula incorrectly anticipated as positive, whereas true negatives (TN)
35 are situations that were correctly predicted as negative.Specificity gives information about the model's ability to avoid
36
37 false positive errors. It shows the proportion of all correctly recognized bad circumstances that actually occurred. In
38 the context of diabetic retinopathy (DR) analysis, specificity refers to how well a model can identify non-DR
39
40 circumstances. A higher specificity indicates that the model is successful in identifying individuals without DR since
41 it translates into a lower rate of predictions that are mistakenly positive.
42
43
44 Specificity is typically combined with other metrics including as sensitivity, accuracy, and AUC to provide a
45 comprehensive evaluation of how well a model performs in binary classification tasks.
46
47 3.3.8 Sensitivity
48
Sensitivity, often known as the true positive rate or recall, is a performance metric for binary classification issues. It
49
50 measures how well a model can identify positive examples or the presence of a condition.
51
52 The formula of Sensitivity is shown below:
53
54 𝑇𝑃
𝑆𝑒𝑛𝑠𝑖𝑡𝑖𝑣𝑖𝑡𝑦 =
55 𝑇𝑃 + 𝐹𝑁
56
57 False negatives (FN) are events that this formula incorrectly anticipated as negative, whereas true positives (TP) are
58
events that it correctly predicted as positive.Sensitivity informs us of the model's capacity to guard against false
59
60 negative errors. It shows the proportion of all correctly identified positive events overall. In the context of diabetic
61
62
63
64
65
1
2
3
4 retinopathy (DR) analysis, sensitivity refers to the model's capacity to reliably identify DR instances. A higher
5
6 sensitivity results in a lower rate of incorrect negative predictions, which shows that the model is effective at
7 identifying individuals with DR.
8
9
Sensitivity is typically combined with other metrics to provide a comprehensive evaluation of a model's performance
10
11 in binary classification tasks, such as specificity, accuracy, and AUC.
12
13 4 RESULTS AND DISCUSSION
14
4.2 IDRiD
15
16 Visual images of specific diseases are typically encountered with hard exudates, soft exudates, hemorrhages, and
17 microaneurysms. These lesions are frequently an indication of a number of visual disorders, including retinopathy
18
19 brought on by diabetes. The Indian Diabetic Retinopathy Image Collection, also known as IDrid, is a widely used
20 collection that includes retinal images that have been annotated for various abnormalities. Let's discuss the
21
22 characteristics of these tumors and the standards required for segmenting them.
23
24 4.2.1 Hard Exudates: Triglycerides leak into the retina through damaged veins, generating brownish or white
25
plaques that are thought to be hard discharges. They frequently appear as small, spherical, longitudinal, or
26
27 marginated tumors. These hard discharges must be separated in order to be quantified and their progression
28
monitored because their accumulation might result in impaired vision and serve as a warning indicator of the
29
30 severity of diabetes-related retinopathy.
31
4.2.2 Soft Exudates: Soft discharges, also known as cotton fiber regions or puffy white or yellowish blemishes,
32
33 are caused by inadequate blood flow-induced strokes of the ocular sensory fiber layer. These have fuzzy
34
35 borders and resemble amorphous tumors. Soft exudates can be a sign of ocular hypoxia, hence it is important
36 to segment them in order to establish their quantity and location.
37
38 4.2.3 Haemorrhages: Haemorrhages, which occur when veins rupture, cause blood to flow into the retinal cells.
39 They resemble irregularly shaped, dark or reddish-colored tumors that are black in color. Haemorrhages need
40
41 to be segmented in order to find them and define their location because they can show the degree of retinal
42 damage and the development of diabetes-related retinopathy.
43
44 4.2.4 Microaneurysms: These little, circular capillary extensions in the retina are thought to represent the initial
45 signs of diabetic retinopathy. They resemble tiny, spherical tumors that are bright red in appearance.
46
47 Segmenting microaneurysms is essential for their measurement and management as their presence and
48 proliferation may indicate a potential progression to more severe retinal stages.
49
50
51 4.3 Retinopathy Grade Classification
52
53
54
55
56
57
58
59
60
61
62
63
64
65
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21 (a) Loss Training Validation Plot (b) Accuracy Training validation Plot
22 Figure 6: Loss and Accuracy plot for Retinopathy Grade Classification
23
24
25 Table 4: Class Wise Metric Distribution (Testing)
26
27
28
29 from class Accuracy Precision Recall or Sensitivity F1 Score Specificity
30
31
32
33
34 Retinopathy grade = 0 0.912621 0.955224 0.914286 0.934307 0.909091
35
36
37
39
40
41
43
44
45
47
48
49
50
52
53
54 Figure 6 and Table 4 Cases lacking Retinopathy are very precisely discernible, with a specificity of 0.909091 for type
55 0 retinal degeneration. Third-degree retinopathy can be predicted with a sufficient level of recall and accuracy,
56
57 according to the F1-score value of 0.917197 for this condition. In this list, both recall and sensitivity are equally
58 important. For instance, the recall/sensitivity for grade 2 retinopathy is 0.972222, which indicates that around 97.2%
59
60 of grade 2 retinal instances were successfully identified by the model. The detection rate for grade 1 retinopathy is
61
62
63
64
65
1
2
3
4 0.941176, which indicates that 94.1% of cases of grade 1 retinal degeneration were accurately detected by the model.
5
6 prediction of classes separated by category.
7
8 Retinopathy Grade 2
9
10
11
12
13
14
15
16
17
18
19
20
21 Retinopathy Grade 4
22
23
24
25
26
27
28
29
30
31
32
33
34 Figure 7 Class wise Prediction
35
36
37 4.4 Classification Risk of macular edema
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54 (a) Loss Training Validation Plot (b) Accuracy Training validation Plot
55
56 Figure 8: Loss and Accuracy Risk of Macular Edema
57
58
59 Table 5: Class Wise Metric Distribution (Training)
60
61
62
63
64
65
1
2
3
4
5 class Accuracy Precision Recall or Sensitivity F1 Score Specificity
6
7
8
9
10 Risk of macular edema = 0 0.951456 0.950000 0.966102 0.957983 0.931818
11
12
13
15
16
17
19
20
21
22 The term "accuracy," which gauges overall proficiency, refers to the proportion of cases that were correctly predicted
23 to all occurrences. The following table shows the accuracy for each category. By way of illustration, the prediction
24
25 accuracy for category 0 is 0.953995 (according to training loss 8), which indicates that in nearly 95.4% of cases, the
26 algorithm accurately identified the possibility of macular edema level 0.
27
28 The "Precision" row measures how well the model predicts True Positives among the positive Predicted Classes. For
29
30 instance, the precision for class 1 is 0.991736, suggesting a high precision of 99.2% for the algorithm's successful
31 identification of cases of macula edema level 1 cases.
32
33
34 Table 6: Class Wise Metric Distribution (Testing)
35
36
37 class Accuracy Precision Recall or Sensitivity F1 Score Specificity
38
39
40
42
43
44
46
47
48
49
51
52
53 The sensitivity is the same as that in tables 6 and 5. For instance, category 2 has an ideal recall/sensitivity of 1.000000,
54 indicating that every instance of macular edema level 2 was properly identified by the algorithm. A higher F1 score
55
56 denotes better performance. For instance, the F1 grade for level 1 of macular edema is 0.979592, indicating a great
57 equilibrium between recall and precision. The specificity for class 0, for instance, is 0.971910, demonstrating a high
58
59 level of accuracy in identifying instances without macular edema at level 0. Figure 9 displays the prediction by class.
60
61
62
63
64
65
1
2
3
4 Edema Grade 0
5
6
7
8
9
10
11
12
13
14
15
16
17 Edema Grade 2
18
19
20
21
22
23
24
25
26
27
28
29 Figure 9: Edema Classwise Prediction
30
31
32 Table 7: Data Samples per Class
33
34 Lesion Type Number of Samples
35
36
37
38 Hard Exudates 350
39
40
41
42 Soft Exudates 150
43
44
45 Hemorrhages 120
46
47
48
49 Microaneurysms 110
50
51
52
53 Table 7 shows the distribution of data samples across various lesion types, presumably related to a study of medical
54 imaging. The total number of samples for each type of lesion is shown in the table. With 350 examples, "Hard
55
56 Exudates" is the most common class, followed by "Soft Exudates" (150 samples), "Hemorrhages" (120 samples), and
57 "Microaneurysms" (110 samples). These numbers show the frequency with which each type of lesion appears in the
58
59 dataset, providing crucial insight into its composition. Since the class distribution may affect the model's accuracy and
60 generalizability, this information is essential for building trustworthy artificial intelligence models.
61
62
63
64
65
1
2
3
4 The provided table demonstrates the diversity of the dataset and reveals potential trends or anomalies in lesion
5
6 incidence. In medical settings, this information aids both researchers and clinicians in understanding the root causes
7 of certain issues, directing diagnosis, treatment options, and further research. It is critical to keep in mind that the
8
9 application of these data should be chosen taking into account the distinct clinical setting and the objectives of the
10 study, whether they are to increase diagnostic accuracy or discover disease symptoms.
11
12
13
14
15 4.4.1 Hard Exudates
16
17
18
19
20
21
22
23
24
25
26
27
28 (a) IOU Score (b) Focal + Dice Loss
29
30
31 Figure 10: Training IOU and Dice Loss
32
33
34
accuracy binary accuracy auc Specificty Sensitivity
35
36 1.02
37
38 0.999257 0.999835 0.999327 0.999873 0.999873 0.999326 0.999871 0.999326
1
39
40
41 0.98 0.969272 0.969272 0.968051 0.968795
42
43 0.96
44
45 0.94 0.934805 0.932559 0.934836
46
47 0.92
48
49
0.9
50
51
52 0.88
Train Test Valid
53
accuracy 0.999257 0.999327 0.999326
54
55 binary accuracy 0.969272 0.968051 0.968795
56 auc 0.999835 0.999873 0.999871
57 Specificty 0.934805 0.932559 0.934836
58
59 Sensitivity 0.969272 0.999873 0.999326
60
61
62
63
64
65
1
2
3
4 Figure 11: UHI Model Performance
5
6
7 A loss operation is used to calculate the discrepancy between the anticipated and actual classes of hard exudates, and
8
9 the median loss for the entire train of data is 0.057894. A lesser loss demonstrates better model performance in
10 accurately capturing the properties of hard exudates. Based on figure 10 and 11, the model's accuracy appears to have
11
12 remained constant across numerous training iterations, as seen by the small variance of 0.014040. The range of losses,
13 from the smallest loss of 0.035471 to the highest loss of 0.090734, represents the volatility in the model's performance.
14
15 According to the mean accuracy of 0.999293, the model does well generally in classifying pixels into either hard
16 exudates or non-hard exudates. This demonstrates that the model accurately recognizes the traits that distinguish hard
17
18 exudates from other ocular components and can correctly make this distinction. The modest standard deviation of
19 0.000162 demonstrates that the model's accuracy is consistent across assessments. The least accurate value of
20
21 0.998942 and the highest accuracy of 0.999645 demonstrate the wide range of variation in the model's performance
22 across various samples.
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27 Figure 12: Mask prediction and BBox using Dense-ED-UHI: Encoder Decoder based Unet Hybrid Inception (
28
29 Proposed Model )
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
Figure 13: Comparison with Other state of the art backbone ( Resent50)
16
17
18 According to Table 8, the median true positive number, which is 34116.49, represents the median amount of hard
19
20 exudates that were correctly anticipated. This rating reveals the algorithm's accuracy in the segmentation and detection
21
of hard exudates. The standard deviation of 13782.728251 illustrates the range in true positives across multiple
22
23 evaluations. The model's capacity to identify hard exudates exhibits a wide range of variance, as evidenced by the
24
lowest value of 11497 and the maximum number of 71688.
25
26
27 The algorithm's overall effectiveness in differentiating between hard and soft exudates is shown by the AUC (Area
28 Under the Curve), which has a value of 0.966475. A higher AUC indicates better discrimination abilities, with values
29
30 close to 1 denoting exceptional outcomes. The model's average specific of 0.999851 shows that it can reliably identify
31 non-hard secretions. A median sensitivity rating of 0.929403 indicates that the system is capable of appropriately
32
33 identifying hard exudate locations. The two criteria are essential for evaluating the model's performance in precisely
34 recognizing hard exudates.
35
36
37 Overall, the statistical evaluation of rigid exudates provides illuminating data about the efficacy of the categorization
38 strategy. The model's low loss & great accuracy show how well it captures the characteristics of hard exudates. The
39
40 simulation can distinguish between hard and soft exudates, as shown by its high AUC and accuracy values. Despite
41 this, the model might occasionally miss some hard exudates due to the drastically decreased sensitivity. The
42
43 effectiveness of the categorization algorithm for hard exudates in ocular images can be evaluated and improved using
44 these data points, which serve as critical indicators.
45
46
Table 8: Statistical Analysis
47
48
49 STATISTICAL LOSS ACCURACY TRUE AUC SPECIFICTY SENSITIVITY
50
51 TEST POSITIVES
52
53
54
55 MEAN 0.057894 0.999293 34116.490000 0.966475 0.999851 0.929403
56
57
58
59
60
61
62
63
64
65
1
2
3
4
5 STD 0.014040 0.000162 13782.728251 0.009651 0.000046 0.019548
6
7
8
9
MIN 0.035471 0.998942 11497.000000 0.942636 0.999746 0.881022
10
11
12
13
14 25% 0.046062 0.999188 23017.000000 0.959663 0.999817 0.915989
15
16
17
18 50% 0.057263 0.999304 31707.500000 0.967367 0.999855 0.930924
19
20
21
22 75% 0.069872 0.999411 43709.000000 0.974321 0.999885 0.945510
23
24
25
26 MAX 0.090734 0.999645 71688.000000 0.982697 0.999944 0.962151
27
28
29
4.4.2 HAEMORHAGES
30
31
32
33
34
35
36
37
38
39
40
41
42 Figure 14: Training IOU and Loss
43
44
45 Table 9: Statistical Analysis
46
47 Statistical accuracy binary accuracy auc Specificty Sensitivity
48 Test
49
50
51 mean 0.999264 0.999264 0.974184 0.999812 0.939016
52
53
54
55 std 0.000230 0.000230 0.008255 0.000072 0.018514
56
57
58
59
60
61
62
63
64
65
1
2
3
4 min 0.998607 0.998607 0.956402 0.999574 0.901336
5
6
7
8 25% 0.999184 0.999184 0.968939 0.999759 0.927588
9
10
11
12 50% 0.999331 0.999331 0.973982 0.999836 0.938238
13
14
15 75% 0.999403 0.999403 0.979575 0.999862 0.948270
16
17
18
19 max 0.999657 0.999657 0.990752 0.999934 0.976834
20
21
22
23
24
25
1.01 0.999244
0.999244 0.999805 0.999469
0.999469 0.999862 0.998732
0.998732 0.999685
26 1
27 0.99 0.977122 0.979326
0.98 0.972912
28
29 0.97
0.96 0.946274 0.947909
30 0.95 0.939895
31 0.94
32 0.93
0.92
33 0.91
34 0.9
35 Train Test Valid
36 accuracy 0.999244 0.999469 0.998732
37 binary accuracy 0.999244 0.999469 0.998732
38 auc 0.977122 0.979326 0.972912
39
40 Specificty 0.999805 0.999862 0.999685
41 Sensitivity 0.946274 0.947909 0.939895
42
43 accuracy binary accuracy auc Specificty Sensitivity
44
45
46 Figure 15: UHI Metric analysis in haemorrhages
47
48
49 The statistical analysis of the haemorrhages dataset can provide important insight into its effectiveness in locating and
50 classifying haemorrhages in retinal images. With an average accuracy and absolute accuracy of 0.999264, the model
51
52 correctly categorizes pixels as hemorrhages or non-hemorrhages. This demonstrates how the simulation successfully
53 distinguishes between different ocular forms and accurately captures the specific features of hemorrhages. The low
54
55 standard deviation of 0.000230 (as shown in table 9), which demonstrates that the model's accuracy is constant across
56 several assessments, is another factor supporting the model's longevity. The least precision of 0.998607 and the highest
57
58 confidence of 0.999657 serve to demonstrate the wide range of variance in the model's performance across various
59 samples. The training and UHI networks on the hemorrhages dataset are shown in Figures 14 and 15.
60
61
62
63
64
65
1
2
3
4 The average AUC (Area Under the Curve) value of 0.974184 shows that the algorithm has a strong ability to
5
6 distinguish between hemorrhages and non-hemorrhages. AUC that is larger indicates greater accuracy, with figures
7 close to 1 indicating good discrimination. A little variation of 0.008255 points to consistent success in terms of AUC.
8
9 The range of variance in the model's ability to distinguish between different samples is shown by the AUC values of
10 0.956402, 0.990752, and 0.956402 in that order.
11
12
Additionally, the algorithm's average specific of 0.999812, which increases its overall accuracy, demonstrates its
13
14 capacity to accurately identify non-hemorrhage pixels. The model's average sensibility, which is 0.939016, illustrates
15
16 how well it can identify pixels with hemorrhage.
17
18 Both measures are crucial for evaluating the algorithm's performance in correctly identifying hemorrhages. The
19 quartile ranges (25%, 50%, and 75%) provide details on the metrics' dispersion and display the difference in the
20
21 model's performance across different samples.
22
23 The statistical analysis of the haemorrhages dataset clearly demonstrates the algorithm's outstanding precision, good
24
25 discrimination ability, and even efficacy in correctly identifying haemorrhage and non-hemorrhage pixels. The results
26 demonstrate how effectively the system detects and segments hemorrhages in ocular images, which can be helpful for
27
28 locating and monitoring retinal problems.
29
30 The first image in this example is a macular fundus image with symptoms of retinopathy brought on by diabetes. The
31 image shows numerous anomalies, including exudates and microaneurysms. Experts design the proper mask,
32
33 highlighting the crucial regions where these abnormalities occur. The mask helps to highlight the areas that need to
34
be examined and evaluated more closely.
35
36
37 The network uses the Dense-ED-UHI: Encoder Decoder based Unet Hybrid Inception model to process the image for
38 classification. The UNet Fusion framework enables the capturing of intricate attributes and the execution of precise
39
40 segmentation procedures. The segments map created by UNet Inception, which closely resembles the surface truth
41 mask, properly identifies and highlights the aneurysm and exudate zones. Figures 16 and 17 show how the Unet
42
43 Inception and Resnet Models predict.
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
55
56 Proposed Model )
57
58
59
60
61
62
63
64
65
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
Figure 17: Comparison with Other state of the art backbone ( Resent50)
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
1
2
3
4
5
4.3.3 Microaneurysms
6
7
8
9
10
11
12
13
14
15
16
Figure 18: Training IOU and Loss.
17
18
19
20
21
Table 10: Statistical Analysis
22
23
24 Statistical accuracy binary accuracy auc Specificty Sensitivity
25
26 Test
27
28 mean 0.999264 0.999264 0.974184 0.999812 0.939016
29
30
31
32 std 0.000230 0.000230 0.008255 0.000072 0.018514
33
34
35
36 min 0.998607 0.998607 0.956402 0.999574 0.901336
37
38
39
25% 0.999184 0.999184 0.968939 0.999759 0.927588
40
41
42
43 50% 0.999331 0.999331 0.973982 0.999836 0.938238
44
45
46
47 75% 0.999403 0.999403 0.979575 0.999862 0.948270
48
49
50
51 max 0.999657 0.999657 0.990752 0.999934 0.976834
52
53
54
55
56
57
58
59
60
61
62
63
64
65
1
2
3
4
5 1.05
0.99974
0.99974 0.99997 0.999681 0.999965
0.999681 0.99964
0.99964 0.999958
6 1 0.9761484 0.9768875
0.9677475
7
8 0.95 0.914899
0.913566
9
10 0.9 0.875894
11
0.85
12
13 0.8
14
15 0.75
Train Test Valid
16
17 accuracy 0.99974 0.999681 0.99964
18 binary accuracy 0.99974 0.999681 0.99964
19 auc 0.913566 0.875894 0.914899
20
Specificty 0.99997 0.999965 0.999958
21
22 Sensitivity 0.9761484 0.9677475 0.9768875
23
25
26
27 Figure 19: UHI Metric Comparison for Microaneurysms
28
29
30 The method has a great level of overall precision in detecting microaneurysms, as shown by a median accuracy of
31 0.999264, as seen in Table 10 and Figure 19. It shows that the model correctly classifies a significant portion of the
32
33 image's pixels, including all real positives and real negatives.
34
35 Precision in Binary: Similar to the fundamental measure, the accuracy displays a high average value of 0.999264. This
36
statistic evaluates the accuracy of the microaneurysm classification by only considering true positives and true
37
38 negatives. The high score demonstrates how well the microaneurysms may be distinguished from other portions of
39
the image by the depiction.
40
41
42 Specificity: The system's average specific of 0.999812 for the dataset demonstrates its ability to precisely identify
43 non-microaneurysm regions in the image. This indicates that the algorithm has a low proportion of false positives and
44
45 a high proportion of real negatives.
46
47 Area Under the Curve (AUC): The AUC value of 0.974184 represents the area below the Receiver Operating
48
49 Characteristics (ROC) curve. It demonstrates how well the simulation can distinguish between regions that are
50 susceptible to microaneurysms and those that are not. Based on the area under the curve (AUC), the statistical model
51
52 performs better at discriminating between the two groups.
53
54 Sensitivity: The algorithm's ability to detect microaneurysms is indicated by a median score of 0.939016. The
55 algorithm's ability to identify all of the microaneurysms seen in the image is demonstrated by the genuine negative
56
57 percentage, which is displayed.
58
59
60
61
62
63
64
65
1
2
3
4 The statistical findings show that the algorithm generally does an excellent job of recognizing microaneurysms. The
5
6 model's exceptional accuracy, dichotomous accuracy, and AUC scores provide as evidence of its effectiveness in
7 accurately classifying microaneurysm regions. Additionally, the model's outstanding sensitivity and specificity scores
8
9 demonstrate that it can accurately identify non-microaneurysm locations and detect the majority of aneurysms.
10
11 It is crucial to remember that the usual deviation figures provide information about the model's consistency and
12
effectiveness across various examples or datasets. The accuracy, category precision, AUC, particularity, and
13
14 sensitivity all exhibit quite low variances, demonstrating the algorithm's consistency in performance across the whole
15
16 set of evaluated data.
17
18 The statistical analysis concludes that the algorithm regularly outperforms itself across a variety of data sets and
19 achieves a high level of efficacy and precision in identifying microaneurysms. These results suggest that the algorithm
20
21 may aid in the rapid identification and diagnosis of microaneurysms, which is crucial for the treatment of patients with
22 diabetes and other ocular diseases. Figures 20 and 21 show how the Unet Inception and Resnet Models predict.
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
Figure 20: Mask prediction and BBox using Dense-ED-UHI: Encoder Decoder based Unet Hybrid Inception (
18
19
Proposed Model )
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50 Figure 21: Comparison with Other state of the art backbone ( Resent50)
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
1
2
3
4
5
4.3.4 Soft Exudates
6
7
8
9
10
11
12
13
14
Figure 22: Training Validation Graph
15
16
17 The training and UHI networks on the hemorrhages dataset are shown in Figure 22. The method has an outstanding
18
level of general precision in recognizing soft exudates, as shown by a median accuracy of 0.999521, as shown in
19
20 table 11 and Figure 23.
21 1.02
22
23 1
24
0.98
25
26 0.96
27
28 0.94
29
0.92
30
31 0.9
32
33 0.88
Train Test Valid
34
35 accuracy 0.999545 0.999485 0.999671
36 binary accuracy 0.999545 0.999485 0.999671
37 auc 0.981519 0.999047 0.999875
38 Specificty 0.999788 0.999668 0.999818
39 Sensitivity 0.929511 0.960885 0.963076
40
42
43
44 Figure 23: Internal Metric Comparison
45
46
47 Table 11: Statistical Analysis
48 SENSITIVITY
49
LOSS ACCURA BINARY FALSE FALSE TRUE TRUE AUC SPECIFICTY
50
CY ACCURA NEGATIV POSITIVE NEGATIVES POSITIVES
51 CY ES S
52
53
54
55 COUNT 10.00 10.0000 10.0000 10.00000 10.00000 1.000000e 10.000000 10.0000 10.0000 10.000000
56 0000 00 00 0 0 +01 00 00
57
58
59
60
61
62
63
64
65
1
2
3
4
5 MEAN 0.300 0.99952 0.99952 521.5000 483.9000 2.088782e 7364.5000 0.98436 0.99976 0.930090
6 426 1 1 00 00 +06 00 3 8
7
8
9
10 STD 0.056 0.00010 0.00010 227.5537 115.5541 2.290535e 2232.9307 0.01617 0.00005 0.034264
11 855 9 9 79 72 +03 22 4 5
12
13
14 MIN 0.206 0.99938 0.99938 238.0000 312.0000 2.084710e 4638.0000 0.96175 0.99969 0.883597
15 693 4 4 00 00 +06 00 0 1
16
17
18
19 25% 0.258 0.99945 0.99945 327.0000 387.7500 2.087039e 6092.2500 0.97042 0.99973 0.896406
20 258 9 9 00 00 +06 00 5 5
21
22
23
50% 0.306 0.99950 0.99950 519.5000 500.5000 2.089254e 6821.5000 0.98724 0.99976 0.938353
24
825 2 2 00 00 +06 00 6 0
25
26
27
28 75% 0.331 0.99957 0.99957 736.7500 554.0000 2.090006e 8945.7500 0.99933 0.99981 0.958866
29 238 3 3 00 00 +06 00 9 4
30
31
32
33 MAX 0.390 0.99973 0.99973 796.0000 645.0000 2.091786e 11514.000 0.99968 0.99985 0.968947
34 825 8 8 00 00 +06 000 5 1
35
36
37
38 The Hybrid UNet Inception model appears to be incredibly effective when used to the Soft Exudates IRID datasets.
39
40 With absolute accuracy ratings of about 99.95% on the initial training, testing, and validation sets, the system exhibits
41 its outstanding ability to properly categorize soft exudates. The AUC values are also quite good, particularly on the
42
43 test and validation sets, where they are 99.90% and 99.99%, respectively, highlighting the model's remarkable ability
44 to distinguish between positive and negative instances. The model also performs exceptionally well in specificity,
45
46 scoring above 99.97% on all datasets, which helps to lower the frequency of false positives in medical imaging
47
analysis. Additionally, metrics show that the model's sensitivity—a measure of its ability to find positive cases—
48
49 remains high, especially for the test and validation sets, where it is 96.08% and 96.31%, respectively. This slight trade-
50
off between specificity and sensitivity may be acceptable in clinical contexts where avoiding false positives is critical.
51
52 The Hybrid UNet Inception model demonstrates robust and encouraging performance after additional validation and
53
testing, highlighting its potential relevance in practical clinical applications. Figures 24 and 25 show how the Unet
54
55 Inception and Resnet Models predict.
56
57
58
59
60
61
62
63
64
65
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
25
26 Proposed Model )
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54 Figure 25: Comparison with Other state of the art backbone ( Resent50)
55
56
57
5 CONCLUSION
58
59 Despite the fact that one is brought on by aging and the other by a variety of factors, diabetic retinopathy is a complex,
60 multifaceted illness, and our knowledge of these conditions is constantly evolving. Although myopia and glaucoma
61
62
63
64
65
1
2
3
4 datasets were included in the article, it has mostly focused on these two. We have had to make multiple adjustments
5
6 to our categorization schemes throughout the years in order to stay up with advances in technology and medical
7 expertise. The enormous advances in disease pathophysiology, imaging technology, artificial intelligence, and
8
9 treatment over the past few decades must be taken into account while developing a new classification system.
10
11 The model's accuracy was 92.5% in both the training and test sets, illustrating its ability to generate reliable
12
assumptions about unseen data. With accuracy scores exceeding 99.9% throughout the learning, test, and verification
13
14 sets, the system demonstrates its ability to properly segregate myopia in retinal images. The level of sensitivity, or
15
16 genuine positive rate, is similarly astronomically high at 99.66% on the test set. With an average precision of 99.93%,
17 the simulations demonstrate impressive accuracy for the assessment of hard exudates. The sensitiveness scores
18
19 demonstrate how well the models can identify genuine positives, with a mean sensitivity of 92.94%. The AUC values,
20 which vary from 0.9426 to 0.9827, demonstrate the models' excellent ability to distinguish between both positive and
21
22 negative scenarios. The models' enhanced specific values, which range from 0.9996 to 0.9999, demonstrate their exact
23 classification of unfavourable cases. According to the data analysis, the models are good at identifying and classifying
24
25 hard exudates in general.
26
27 With a mean accuracy and binary accuracy of 99.93% in the examination of the hemorrhages dataset, the hypotheses
28
performed remarkably well. The models' ability to distinguish between advantageous and disadvantageous scenarios
29
30 is demonstrated by the AUC values, which vary from 0.9564 to 0.9908. The incredibly sensitive values of the
31
algorithms, which vary from 0.9013 to 0.9768, show their effectiveness in identifying true positives. The models' high
32
33 sensitivity values, which vary from 0.9996 to 0.9999, indicate how precisely the negative cases are classified by them.
34
35 These results demonstrate the algorithms' capability to recognize and classify hemorrhages.
36
37 The latest DR classification algorithms conceptually segment images using Inception, a spatial up-sampling technique
38 that employs pixel-wise periodic shuffling convolution, employing a customized UHI or U-Net model. It has received
39
40 extensive validation through comparison to the body of research, and it should result in better treatments and outcomes
41 for the millions of people who have vision loss worldwide. The Dense-ED-UHI: Encoder Decoder based Unet Hybrid
42
43 Inception models outperformed the other models in terms of correctness and accuracy. High accuracy scores were
44 reached, proving the datasets' ability to distinguish and diagnose retinal disorders correctly. Additionally, the
45
46 algorithm shown strong specificity, proving its ability to precisely identify the negative cases (non-pathological areas)
47
and avoid false positives. These results show that the Dense-ED-UHI: Encoder Decoder based Unet Hybrid Inception
48
49 framework for segmenting retinal disease is effective and reliable.
50
51 When it comes to microaneurysms, the mathematical models are extremely precise and binary correct, with a median
52
53 accuracy of 99.93%. The AUC values, which range from 0.9564 to 0.9908, demonstrate how well the models can
54 differentiate between advantageous and unfavorable instances. The sensitive values, which span the range of 0.9013
55
56 to 0.9768, demonstrate how effectively the models can identify genuine positives. The models' high sensitivity values,
57 which vary from 0.9996 to 0.9999, illustrate how precisely they classify unfavorable cases.
58
59
60
61
62
63
64
65
1
2
3
4
5
6 One of the key advantages of the Dense-ED-UHI: Encoder Decoder based Unet Hybrid Inception structure is its
7 ability to employ the Inception structure and collect both global and local pigment data inside the image. By utilizing
8
9 multi-scale properties, this enables the model to efficiently record fine features in retinal images.
10
11 In conclusion, the use of deep learning models, notably the UHI system and the potential integration of the
12
Classification network for broad category classification, can substantially assist segmenting retinal illness. These
13
14 simulators have demonstrated to be extremely reliable, strong, and able to automate and help in the early detection
15
16 and recognition of retinal problems. The research and growth of this discipline could lead to improved healthcare.
17
18 Authors Contribution: AB and VM created the idea for the study. AB led the writing of this paper. AB analyzed
19 and interpreted the data. AB contributed to the implementation of the paper as well as to the identification of
20 theoretical problems. AB contributed to the writing of the paper as well as participated in revising this manuscript.
21 VM encouraged AB to investigate and supervised the findings of this work. All authors contributed significantly
22 to the writing of the paper, discussed the results, reviewed the final draft, and approved the final version of this
23 manuscript.
24
25 Funding: This research received no specific grant from any funding agency in the public, commercial, or
26 not-for-profit sectors.
27
28 Data Availability:
29
30 The dataURL supporting this article has shown in table 16.
31
32 Table 16: Data Availability Statement
33
Dataset Name Availability of Data URL
34
35 IDRiD Openly accessible data stored in The data that support the
36 a public database. findings of this study are
37 available in Grand-Challenge
38 at https://idrid.grand-
39 challenge.org.
40
41 Declarations
42
43 a) Ethical Approval: Since the data were truly anonymous, ethical approval was not required.
44 b) Informed Consent: Not Applicable.
45 c) Conflict of interests:
46
47 No authors have disclosed any conflicts of interest.The authors affirm that they have no known financial or
48 interpersonal conflicts that would have appeared to have an impact on the research presented in this study.
49
50
51
52 References
53
54 [1]. A. Bali and V. Mansotra, "Deep Learning-based Techniques for the Automatic Classification of Fundus Images: A Comparative Study," 2021
55 3rd International Conference on Advances in Computing, Communication Control and Networking (ICAC3N), Greater Noida, India, 2021,
56 pp. 351-359, doi: 10.1109/ICAC3N53548.2021.9725464.
57 [2]. Di Cesare, M., 2019. Global trends of chronic non-communicable diseases risk factors. European Journal of Public Health, 29(Supplement_4),
58 pp.ckz185-196.
59 [3]. Bali, A., Mansotra, V. Analysis of Deep Learning Techniques for Prediction of Eye Diseases: A Systematic Review. Arch Computat Methods
60 Eng (2023). https://doi.org/10.1007/s11831-023-09989-8.
61
62
63
64
65
1
2
3
4 [4]. Orlando, J.I., Prokofyeva, E., Del Fresno, M. and Blaschko, M.B., 2018. An ensemble deep learning based approach for red lesion detection
5 in fundus images. Computer methods and programs in biomedicine, 153, pp.115-127.
6
7 [5]. Early Treatment Diabetic Retinopathy Study Research Group, 1991. Fundus photographic risk factors for progression of diabetic retinopathy:
8 ETDRS report number 12. Ophthalmology, 98(5), pp.823-833
9 [6]. Shanthi, T. and Sabeenian, R.S., 2019. Modified Alexnet architecture for classification of diabetic retinopathy images. Computers & Electrical
10
Engineering, 76, pp.56-64.
11
12 [7]. Bali, A., Mansotra, V. (2022). FUNDUS and OCT Image Classification Using DL Techniques. In: Rathore, V.S., Sharma, S.C., Tavares, J.M.R.,
13 Moreira, C., Surendiran, B. (eds) Rising Threats in Expert Applications and Solutions. Lecture Notes in Networks and Systems, vol 434.
Springer, Singapore. https://doi.org/10.1007/978-981-19-1122-4_8
14
[8]. A. Bali and V. Mansotra, "An Overview of Retinal Image Classification-Techniques and Challenges," 2021 First International Conference on
15 Advances in Computing and Future Communication Technologies (ICACFCT), Meerut, India, 2021, pp. 91-97, doi:
16 10.1109/ICACFCT53978.2021.9837371.
17 [9]. Mansour, R.F., 2018. Deep-learning-based automatic computer-aided diagnosis system for diabetic retinopathy. Biomedical engineering
18 letters, 8, pp.41-57.
19
[10]. Kassani, S.H., Kassani, P.H., Khazaeinezhad, R., Wesolowski, M.J., Schneider, K.A. and Deters, R., 2019, December. Diabetic retinopathy
20
21 classification using a modified xception architecture. In 2019 IEEE international symposium on signal processing and information technology
22 (ISSPIT) (pp. 1-6). IEEE.
23
[11]. Soomro, T.A., Gao, J., Khan, T., Hani, A.F.M., Khan, M.A. and Paul, M., 2017. Computerised approaches for the detection of diabetic
24
25 retinopathy using retinal fundus images: a survey. Pattern Analysis and Applications, 20, pp.927-961.
26 [12]. Zhao, Y.Q., Wang, X.H., Wang, X.F. and Shih, F.Y., 2014. Retinal vessels segmentation based on level set and region growing. Pattern
27 Recognition, 47(7), pp.2437-2446.
28
29 [13]. Akram, M.U., Khalid, S. and Khan, S.A., 2013. Identification and classification of microaneurysms for early detection of diabetic
30 retinopathy. Pattern recognition, 46(1), pp.107-116.
31 [14]. Gardner, G.G., Keating, D., Williamson, T.H. and Elliott, A.T., 1996. Automatic detection of diabetic retinopathy using an artificial neural
32
network: a screening tool. British journal of Ophthalmology, 80(11), pp.940-944.
33
34 [15]. Guo, S., Li, T., Kang, H., Li, N., Zhang, Y. and Wang, K., 2019. L-Seg: An end-to-end unified framework for multi-lesion segmentation of fundus
35 images. Neurocomputing, 349, pp.52-63.
36 [16]. Yan, Z., Han, X., Wang, C., Qiu, Y., Xiong, Z. and Cui, S., 2019, April. Learning mutually local-global u-nets for high-resolution retinal lesion
37 segmentation in fundus images. In 2019 IEEE 16th International Symposium on Biomedical Imaging (ISBI 2019) (pp. 597-600). IEEE.
38 [17]. Osareh, A., Mirmehdi, M., Thomas, B., & Markham, R. (2003). Automated identification of diabetic retinal exudates in digital colour
39 images. The British journal of ophthalmology, 87(10), 1220–1223. https://doi.org/10.1136/bjo.87.10.1220
40 [18]. C. Sinthanayothin, J. F. Boyce, T. H. Williamson, H. L. Cook, E. Mensah, S. Lal, and D. Usher, “Automated detection of diabetic retinopathy
41 on digital fundus images,” Diabetic medicine, vol. 19, no. 2, pp. 105– 112, 2002.
[19]. A. Sopharak, B. Uyyanonvara, S. Barman, and T. H. Williamson, “Automatic detection of diabetic retinopathy exudates from non-dilated
42
retinal images using mathematical morphology methods,” Computerized medical imaging and graphics, vol. 32, no. 8, pp. 720–727, 2008.
43 [20]. T. Walter, P. Massin, A. Erginay, R. Ordonez, C. Jeulin, and J.-C. Klein, “Automatic detection of microaneurysms in color fundus images,”
44 Medical image analysis, vol. 11, no. 6, pp. 555–566, 2007
45 [21]. Fraz, M.M., Barman, S.A., Remagnino, P., Hoppe, A., Basit, A., Uyyanonvara, B., Rudnicka, A.R. and Owen, C.G., 2012. An approach to
46 localize the retinal blood vessels using bit planes and centerline detection. Computer methods and programs in biomedicine, 108(2),
47 pp.600-616.
[22]. Azzopardi, G., Strisciuglio, N., Vento, M. and Petkov, N., 2015. Trainable COSFIRE filters for vessel delineation with application to retinal
48
images. Medical image analysis, 19(1), pp.46-57.
49 [23]. Franklin, S.W. and Rajan, S.E., 2014. Computerized screening of diabetic retinopathy employing blood vessel segmentation in retinal
50 images. biocybernetics and biomedical engineering, 34(2), pp.117-124.
51 [24]. Partovi, M., Rasta, S.H. and Javadzadeh, A., 2016. Automatic detection of retinal exudates in fundus images of diabetic retinopathy
52 patients. Journal of Research in Clinical Medicine, 4(2), pp.104-109.
53 [25]. Gulshan, V., Peng, L., Coram, M., Stumpe, M.C., Wu, D., Narayanaswamy, A., Venugopalan, S., Widner, K., Madams, T., Cuadros, J. and
54 Kim, R., 2016. Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus
photographs. jama, 316(22), pp.2402-2410.
55 [26]. Noh, H., Hong, S. and Han, B., 2015. Learning deconvolution network for semantic segmentation. In Proceedings of the IEEE international
56 conference on computer vision (pp. 1520-1528).
57 [27]. Shen, W., Zhou, M., Yang, F., Yu, D., Dong, D., Yang, C., Zang, Y. and Tian, J., 2017. Multi-crop convolutional neural networks for lung
58 nodule malignancy suspiciousness classification. Pattern Recognition, 61, pp.663-673.
59 [28]. Haldorai, A., Ramu, A. and Chow, C.O., 2019. Big data innovation for sustainable cognitive computing. Mobile networks and
60 applications, 24, pp.221-223.
61
62
63
64
65
1
2
3
4 [29]. Roychowdhury, S., Koozekanani, D.D. and Parhi, K.K., 2013. DREAM: diabetic retinopathy analysis using machine learning. IEEE journal of
5 biomedical and health informatics, 18(5), pp.1717-1728.
6
7 [30]. Lian, S., Li, L., Lian, G., Xiao, X., Luo, Z. and Li, S., 2019. A global and local enhanced residual u-net for accurate retinal vessel
8 segmentation. IEEE/ACM transactions on computational biology and bioinformatics, 18(3), pp.852-862.
9 [31]. Xu, X., Tan, T. and Xu, F., 2018. An improved U-net architecture for simultaneous arteriole and venule segmentation in fundus image.
10
In Medical Image Understanding and Analysis: 22nd Conference, MIUA 2018, Southampton, UK, July 9-11, 2018, Proceedings 22 (pp. 333-
11
12 340). Springer International Publishing.
13 [32]. Xiancheng, W., Wei, L., Bingyi, M., He, J., Jiang, Z., Xu, W., Ji, Z., Hong, G. and Zhaomeng, S., 2018, June. Retina blood vessel segmentation
14
using a U-net based Convolutional neural network. In Procedia Computer Science: International Conference on Data Science (ICDS 2018) (pp.
15
16 8-9).
17 [33]. Hagos, M.T. and Kant, S., 2019. Transfer learning based detection of diabetic retinopathy from small dataset. arXiv preprint
18 arXiv:1905.07203.
19
[34]. Quellec, G., Charriere, K., Boudi, Y., Cochener, B. and Lamard, M., 2017. Deep image mining for diabetic retinopathy screening. Medical
20
21 image analysis, 39, pp.178-193.
22 [35]. Orlando, J.I., Prokofyeva, E., Del Fresno, M. and Blaschko, M.B., 2018. An ensemble deep learning based approach for red lesion detection
23
in fundus images. Computer methods and programs in biomedicine, 153, pp.115-127.
24
25 [36]. Abràmoff, M.D., Lavin, P.T., Birch, M., Shah, N. and Folk, J.C., 2018. Pivotal trial of an autonomous AI-based diagnostic system for detection
26 of diabetic retinopathy in primary care offices. NPJ digital medicine, 1(1), p.39.
27 [37]. Gargeya, R. and Leng, T., 2017. Automated identification of diabetic retinopathy using deep learning. Ophthalmology, 124(7), pp.962-969.
28
29 [38]. Ting, D.S.W., Tan, G.S.W., Agrawal, R., Yanagi, Y., Sie, N.M., Wong, C.W., San Yeo, I.Y., Lee, S.Y., Cheung, C.M.G. and Wong, T.Y., 2017. Optical
30 coherence tomographic angiography in type 2 diabetes and diabetic retinopathy. JAMA ophthalmology, 135(4), pp.306-312.
31 [39]. Zhang, X., Thibault, G., Decencière, E., Marcotegui, B., Laÿ, B., Danno, R., Cazuguel, G., Quellec, G., Lamard, M., Massin, P. and Chabouis, A.,
32
2014. Exudate detection in color retinal images for mass screening of diabetic retinopathy. Medical image analysis, 18(7), pp.1026-1043
33
34 [40]. Qureshi, I., Ma, J. and Abbas, Q., 2021. Diabetic retinopathy detection and stage classification in eye fundus images using active deep
35 learning. Multimedia Tools and Applications, 80, pp.11691-11721.
36 [41]. Zhang, W., Zhong, J., Yang, S., Gao, Z., Hu, J., Chen, Y. and Yi, Z., 2019. Automated identification and grading system of diabetic retinopathy
37
38 using deep neural networks. Knowledge-Based Systems, 175, pp.12-25.
39 [42]. Romero-Aroca, P., Verges-Puig, R., de la Torre, J., Valls, A., Relano-Barambio, N., Puig, D. and Baget-Bernaldiz, M., 2020. Validation of a deep
40 learning algorithm for diabetic retinopathy. Telemedicine and e-Health, 26(8), pp.1001-1009.
41
[43]. Doe, J. (2020). Indian Diabetic Retinopathy Image Dataset (IDRID). IEEE Dataport.
42
43 [44]. Zhao, H. and Sun, N., 2017. Improved U-net model for nerve segmentation. In Image and Graphics: 9th International Conference, ICIG 2017,
44 Shanghai, China, September 13-15, 2017, Revised Selected Papers, Part II 9 (pp. 496-504). Springer International Publishing.
45
[45]. Ronneberger, O., Fischer, P. and Brox, T., 2015. U-net: Convolutional networks for biomedical image segmentation. In Medical Image
46
47 Computing and Computer-Assisted Intervention–MICCAI 2015: 18th International Conference, Munich, Germany, October 5-9, 2015,
48 Proceedings, Part III 18 (pp. 234-241). Springer International Publishing
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65

Mtap S 23 08869

Uploaded by

Copyright:

Available Formats

You might also like

Mtap S 23 08869

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Mtap S 23 08869

Uploaded by

Copyright:

Available Formats

Multimedia Tools and Applications

Article Type: Track 6: Computer Vision for Multimedia Applications

Corresponding Author: AKANKSHA BALI, Research Scholar

Corresponding Author Secondary

Corresponding Author's Institution: University of Jammu

Corresponding Author's Secondary

First Author: AKANKSHA BALI, Research Scholar

First Author Secondary Information:

Order of Authors: AKANKSHA BALI, Research Scholar

Order of Authors Secondary Information:

You might also like