
REVIEW ARTICLE

Artificial Intelligence in Diabetic Eye Disease Screening


Carol Y. Cheung, PhD,* Fangyao Tang, PhD,* Daniel Shu Wei Ting, MD, PhD,†
Gavin Siew Wei Tan, MD,† and Tien Yin Wong, MD, PhD†

Abstract: Systematic or national screening programs for diabetic retinopathy (DR) and diabetic macular edema (DME), using digital fundus photography and optical coherence tomography (OCT), are currently implemented at primary care level, aiming to provide timely referral for vision-threatening DR and DME to ophthalmologists for timely treatment and vision loss prevention. However, interpretation of retinal images requires specialized knowledge and expertise in diabetic eye disease. Furthermore, current DR screening programs are capital- and labor-intensive, which makes it difficult to rapidly scale up and expand diabetic eye screening to meet the needs of this growing global epidemic. Deep learning (DL), a new branch of machine learning technology under the broad term of artificial intelligence (AI), has made remarkable breakthroughs in medical imaging, in particular for pattern recognition and image classification. In ophthalmology, AI and DL technology has been developed from big image datasets in assessment of retinal photographs for detection and screening of DR as well as the segmentation and assessment of OCT images for diagnosis and screening of DME. This review aimed to summarize the current progress and the development of using AI and DL technology for diabetic eye disease screening as well as current challenges in the actual implementation of DL in screening programs, and translating DL research into direct clinical applications of screening in a community setting.

Key Words: artificial intelligence, deep learning, diabetic retinopathy, optical coherence tomography, screening

(Asia Pac J Ophthalmol (Phila) 2019;8:158–164)

From the *Department of Ophthalmology and Visual Sciences, The Chinese University of Hong Kong, Hong Kong; and †Singapore Eye Research Institute, Singapore National Eye Centre, Singapore.
Submitted March 25, 2019; accepted April 8, 2019.
Supported by the Research Grants Council – General Research Fund, Hong Kong (Ref: 14102418); National Medical Research Council, Singapore (Ref: NMRC/STaR/0016/2013). D.S.W.T. and T.Y.W. are co-inventors of a patent on a deep learning system.
Reprints: Tien Y. Wong, Singapore National Eye Center, 11 Third Hospital Avenue, Singapore 168751. E-mail: wong.tien.yin@snec.com.sg.
Copyright © 2019 by Asia-Pacific Academy of Ophthalmology
ISSN: 2162-0989
DOI: 10.22608/APO.201976

Artificial intelligence (AI) using deep learning (DL) algorithms or systems has been identified as a major ‘disruptive innovation’ in medicine and healthcare. In ophthalmology, AI and DL technology has been developed in several areas, with the two most prominent being in assessment of retinal photographs for detection and screening of diabetic retinopathy (DR),1–6 age-related macular degeneration (AMD),7–9 glaucoma,10,11 and retinopathy of prematurity,12,13 as well as the segmentation and assessment of optical coherence tomography (OCT) images for diagnosis and screening of major retinal diseases.14–16

For DR screening, DL has made remarkable breakthroughs with some of the first few landmark studies in this area, as well as approval and registration of a fundus camera device for DR screening using DL technology by the US Food and Drug Administration (FDA).17 A study by Google Health, in particular, has drawn substantial media publicity, showing extremely high sensitivity and specificity in detecting referable DR from fundus photographs.3 These developments have been cited in mainstream media and editorials in non-ophthalmology literature.18–21

Despite the promise and hype, there remain significant challenges in the actual implementation of DL in DR screening, and translating DL research into direct clinical applications of screening in a community setting. This review will focus on the current progress and the development of using AI and DL technology for DR screening.

WHY IS IT IMPORTANT TO SCREEN FOR DR?
Screening for DR is based on a high level of evidence. By 2040, 642 million people worldwide are projected to have diabetes mellitus (DM).22 This is a global epidemic with a heavy health burden to individuals and societies across the world, affecting populations not only in highly developed countries, but also increasingly in developing countries.22–24 Of note, DR is already the most specific microvascular complication of DM that leads to progressive visual impairment and even blindness. The prevalence of DR ranges from 30% to 45%, with 10% having vision-threatening DR (VTDR), defined as severe non-proliferative DR, proliferative DR, and diabetic macular edema (DME).25–28 Diabetes is estimated to account for 11.6% of the annual healthcare budgets in most countries.29 The cost of care for patients with DR is higher than for those without DR, and increases with severity.30 With a projected increase in the incidence of DM, increased life expectancy and an aging population, it is anticipated that the impact of vision impairment associated with DR will inevitably increase in the coming years. Early identification of eyes at risk of VTDR and timely treatment have been shown to be effective in preventing the onset of visual impairment.25 Recent results of randomized controlled trials have also shown that intraocular injections of drugs targeting vascular endothelial growth factor or inflammation in DME can lead to better visual outcomes.31 Early detection through regular surveillance by clinical examination or grading of retinal photographs is, therefore, the key to preventing vision loss and even blindness.

158 | www.apjo.org Asia-Pacific Journal of Ophthalmology • Volume 8, Number 2, March/April 2019

Copyright © 2019 Asia-Pacific Academy of Ophthalmology. Unauthorized reproduction of this article is prohibited.

CURRENT CHALLENGES IN DR SCREENING PROGRAMS
While the rationale to screen for DR among patients with DM is clear, there are significant challenges in designing and sustaining a DR screening program in many communities and countries.

Systematic or national screening programs for DR using 2-dimensional (2D) non-stereoscopic digital fundus photography are currently implemented at primary care level in many countries, aiming to provide timely referral for more severe levels of DR to ophthalmologists for timely treatment and vision loss prevention.32–34 Diagnosis of DME requires identification of thickening of the macula; therefore, screening for DME using non-stereoscopic fundus photographs is bound to have a very high false-positive rate (eg, >86% in Hong Kong35 and >79% in the UK36). Currently, spectral-domain optical coherence tomography (SD-OCT), which provides a 3D volumetric data cube of the layered retinal structures, is being piloted as an addition to DR screening programs in the UK, China and Singapore, aiming to reduce the false-positive rate of DME detection from fundus photography.37–39 When signs of referable DR or other referable eye diseases are identified from retinal photographs and SD-OCT, patients are referred to eye clinics at tertiary eye hospitals for clinical examination and management to prevent vision loss.

However, interpretation of both retinal photographs and SD-OCT requires specialized knowledge and expertise in diabetic eye diseases and retinal imaging. Inter-individual differences in interpretation, as well as the potential for missing important retinal pathologies, especially with the engagement of non-ophthalmologist graders of varying skills and experience, are always a concern. Furthermore, current DR screening programs are capital- and labor-intensive, which makes it difficult to rapidly scale up and expand diabetic eye screening to meet the needs of this growing global epidemic.40 Given that the number of people with DM is expected to rise rapidly, manual grading of DR and DME from retinal images in DR screening programs will not be sustainable.

THE CASE FOR “AUTOMATED” MACHINE-BASED ASSESSMENT OF DR
In view of the challenges associated with DR screening, an automated program to assess DR from photographs has significant potential to increase the efficiency, reproducibility, and coverage of screening programs, while at the same time reducing cost and barriers to access, and improving patient outcomes by providing early detection and treatment.

Image interpretation using AI has been developed in the field of medicine. Over the last 2 decades, automated analysis of retinal photographs for DR has been studied using traditional machine learning methods based on feature extraction or pattern recognition specified by experts manually for DR characterization and classification.41–43 However, the detection of DR is a complex image-interpretation task. Based on feature extraction or pattern recognition using traditional machine learning algorithms, these systems still cannot reach a good level of specificity. Furthermore, traditional machine learning algorithms typically plateau in performance after they reach a threshold of training data.

Recently, DL has been applied in the field of ophthalmology to outstrip the ceiling performance of prior technology, and has revolutionized the detection of DR/DME from retinal photographs and SD-OCT to achieve excellent diagnostic performance.

DL AS THE BREAKTHROUGH IN AI TECHNOLOGY
Deep learning is a new branch of machine learning technology under the broad term of AI. The concept is that DL permits software algorithms to be trained on vast amounts of data through a large mathematical function with millions of parameters, called a convolutional neural network (CNN). Such a network is inspired by the ability of brains to learn complicated patterns in data by changing the strengths of synaptic connections between neurons. Deep learning uses deep networks with many intermediate layers of artificial “neurons” between the input and the output, and, like the visual cortex, these artificial neurons learn a hierarchy of progressively more complex feature detectors.44 In recent years, AI-DL with CNNs has been developed rapidly and has demonstrated strengths in intricate pattern recognition from big datasets in the medical field. For example, it has been shown that the performance of automated classification by DL systems for DR,1–5 retinal diseases,14–16 glaucoma,10,11 as well as other medical conditions including malignant melanoma,45 tuberculosis,46 and lung cancer47 is equal to or better than that of board-certified specialists. These studies underscoring the ability of automated image interpretation using DL are promising.

For image interpretation, DL is superior to traditional machine learning. In traditional machine learning, features need to be extracted manually before they are fed into the machine learning algorithm. Using DL, the features are automatically learnt by neural networks in the feature extraction stage and then fed into a classifier for classification. No lesion-based features need to be specified in DL, and thus the feature calculation or lesion segmentation steps of traditional machine learning algorithms, which required considerable domain expertise and careful engineering, can be skipped. Furthermore, such a DL approach may identify “different features” that are associated with the desired classification and were previously unknown to humans. Table 1 summarizes recent DL systems for screening DR and DME.1–6,14,15,48,49

DL IN RETINAL PHOTOGRAPH FOR DETECTING DR
There have been several DL systems developed by different groups for detecting DR and VTDR from retinal photographs.
1. Abràmoff et al2 were the first to use a DL-enhanced algorithm for the automated detection of referable DR and VTDR using 1748 retinal photographs from the Messidor-2 dataset. They found that the sensitivity of the DL system was not statistically different from their previously published algorithm based on traditional machine learning methods, but with a significantly better specificity.50 The DL system has recently been evaluated using data collected prospectively from 819 subjects recruited from 10 primary care sites in the US with high sensitivity (87.2%) and specificity (90.7%).48 The DL system has also been evaluated in primary care settings outside the US.51,52 In April 2018, the DL algorithm developed by Abràmoff et al, called IDx-DR, received the first approval from the FDA for detecting more-than-mild DR in adults who have DM using DL without assisted interpretation by a clinician.17 The case of IDx-DR highlights one of the earliest successes of a DL-based screening tool in completing the regulatory process.
TABLE 1. A Summary of Recent Deep Learning Systems for Screening DR and DME

| Study | Year | Imaging | Output(s) of DR/DME | Deep Learning Techniques | Size of Dataset for Development | Size of Dataset for External Validation | Other Outputs | Sensitivity for Output of DR/DME | Specificity for Output of DR/DME | AUC for Output of DR/DME |
|---|---|---|---|---|---|---|---|---|---|---|
| Abràmoff et al2,48 | 2016, 2018 | Fundus photograph | mtmDR (defined as ETDRS level ≥35 and/or DME present) | AlexNet and VGG network architectures | Messidor-2: 1748 images | 819 subjects recruited from 10 primary care sites | Sufficient image quality: yes/no | 87.2% | 90.7% | Not reported |
| Gulshan et al3 | 2016 | Fundus photograph | Referable DR | Inception-V3 architecture | 128,175 images | EyePACS-1: 9963 images; Messidor-2: 1748 images | All-cause referable | EyePACS-1: 97.5%; Messidor-2: 96.1% | EyePACS-1: 93.4%; Messidor-2: 93.9% | EyePACS-1: 0.991; Messidor-2: 0.990 |
| Gargeya and Leng4 | 2017 | Fundus photograph | Any stage of DR, mild DR | Customized CNNs | 75,137 images | Messidor-2: 1748 images; E-Ophtha: 405 images | Nil | Messidor-2 (no DR vs any stage of DR): 93%; Messidor-2 (no DR vs mild DR): 74%; E-Ophtha (no DR vs mild DR): 90% | Messidor-2 (no DR vs any stage of DR): 87%; Messidor-2 (no DR vs mild DR): 80%; E-Ophtha (no DR vs mild DR): 94% | Messidor-2 (no DR vs any stage of DR): 0.94; Messidor-2 (no DR vs mild DR): 0.83; E-Ophtha (no DR vs mild DR): 0.95 |
| Ting et al1 | 2017 | Fundus photograph | Referable DR and VTDR | Adapted VGGNet architecture | DR: 76,370 images; Possible glaucoma: 125,189 images; AMD: 72,610 images | Primary: 71,896 images; 10 additional datasets: 40,752 images | Referable possible glaucoma; Referable AMD | Referable DR: 90.5%; VTDR: 100% | Referable DR: 91.6%; VTDR: 91.1% | Referable DR: 0.936; VTDR: 0.958 |
| Li et al5 | 2018 | Fundus photograph | Vision-threatening referable DR | Inception V3 architecture | 71,043 images | 35,201 images | Nil | 92.5% | 98.5% | 0.955 |
| Kanagasingam et al6 | 2018 | Fundus photograph | DR | Inception V3 architecture | 30,000 images | 193 subjects | Nil | 100% | 92% | Nil |
| Kermany et al14 | 2018 | OCT | DME | Inception V3 architecture and transfer learning | 108,312 images | 1000 images | CNV; drusen; urgent referral | Multiclass model: 97.8%; Limited model: 96.6%; Binary model: 96.8% | Multiclass model: 97.4%; Limited model: 94.0%; Binary model: 99.6% | Multiclass model: 0.999; Limited model: 0.988; Binary model: 0.999 |
| De Fauw et al15 | 2018 | OCT | Diagnosis probability of macular edema | 3D U-Net architecture for segmentation network; Customized CNNs for classification network | 877 segmented scans for the segmentation network; 14,884 scans with diagnoses and referral decisions for classification network | 997 subjects | Referral suggestion, tissue volume of drusen and epiretinal membrane, and diagnosis probability of other retinal abnormalities | Not reported | Not reported | 0.990 |
| Li et al49 | 2019 | OCT | DME | VGG-16 network architecture | 109,312 images | 1000 images | CNV; drusen | 98.8% | 98.8% | 0.999 |

AMD indicates age-related macular degeneration; AUC, area under the curve; CNN, convolutional neural network; CNV, choroidal neovascularization; DME, diabetic macular edema; DR, diabetic retinopathy; ETDRS, Early Treatment Diabetic Retinopathy Study; mtmDR, more-than-mild diabetic retinopathy; OCT, optical coherence tomography; VGG, Visual Geometry Group; VTDR, vision-threatening diabetic retinopathy.
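The sensitivity and specificity figures summarized in Table 1 do not, on their own, say how many positive calls are correct in a low-prevalence screening population. As a minimal sketch, the Python snippet below derives expected confusion-matrix counts and the positive predictive value (PPV) from sensitivity, specificity, and the number of diseased subjects. The inputs are the figures this review cites for Kanagasingam et al6 (193 patients, 2 with DR, sensitivity 100%, specificity 92%). The function names, and the simplification of treating these rates as expected counts, are assumptions made for illustration and are not taken from the original study.

```python
def screening_counts(n_total, n_diseased, sensitivity, specificity):
    """Expected confusion-matrix counts for a screening test.

    Counts are expectations (may be fractional), not observed tallies.
    """
    n_healthy = n_total - n_diseased
    tp = sensitivity * n_diseased   # diseased subjects correctly flagged
    fn = n_diseased - tp            # diseased subjects missed
    tn = specificity * n_healthy    # healthy subjects correctly passed
    fp = n_healthy - tn             # healthy subjects wrongly flagged
    return tp, fp, tn, fn


def positive_predictive_value(tp, fp):
    """Fraction of positive calls that are true positives."""
    return tp / (tp + fp)


# Figures cited in this review for Kanagasingam et al (illustrative use only).
tp, fp, tn, fn = screening_counts(
    n_total=193, n_diseased=2, sensitivity=1.00, specificity=0.92
)
ppv = positive_predictive_value(tp, fp)

print(f"expected false positives: {fp:.1f}")
print(f"PPV: {ppv:.1%}")
```

With only 2 true cases among 193 patients, roughly 15 false positives are expected at 92% specificity, so the PPV lands near 12% even with perfect sensitivity. This is consistent with the low PPV the review attributes to the low incidence of DR in that cohort.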

2. A major development in DL for DR detection was the study by Gulshan et al3 from Google Healthcare which developed a DL system for detecting referable DR using 128,175 retinal photographs which were graded 3 to 7 times for DR, DME, and image gradability by a panel of 54 US-licensed ophthalmologists and ophthalmology senior residents. The DL algorithm achieved a high sensitivity (≥87%), specificity (≥90%) and area under the receiver operating characteristic curve (AUC) (≥0.99) in the external validation using 2 public databases (EyePACS-1: n = 9963 and Messidor-2: n = 1748); both graded by at least 7 US board-certified ophthalmologists.3 Other groups have reported similar results. Gargeya and Leng4 developed a DL algorithm for detecting DR (any DR and mild DR) using 75,137 retinal photographs obtained from the EyePACS public dataset. The DL algorithm has further been evaluated using 2 other public databases (Messidor-2: n = 1748 and E-Ophtha: n = 405) with a sensitivity of ≥74%, specificity of ≥80%, and AUC of ≥0.83.4 Li et al5 developed another DL algorithm and internally validated it using 71,043 retinal photographs acquired from a total of 36 hospital ophthalmology departments, optometry clinics, and screening settings in China. In the external validation, the DL algorithm was evaluated using 35,201 retinal photographs from population-based cohorts of Malay, Caucasian Australians, and Indigenous Australians with a sensitivity of 92.5%, a specificity of 98.5% and an AUC of 0.955.5 Kanagasingam et al6 developed a DL algorithm based on several training datasets including DiaRetDB1, EyePACS, and the Australian Tele-eye care DR database. They deployed the DL algorithm prospectively and evaluated the performance in 193 patients at a primary care clinic in Australia with a specificity of 92% and a positive predictive value of just 12% driven by the low incidence of DR in their cohort (2 of 193).6
3. A third major development addresses a major gap in previous studies in which the DL algorithms were not trained to detect other common sight-threatening eye conditions, such as glaucoma and AMD. To address this gap, Ting et al1 developed a DL system to screen for possible glaucoma and AMD, in addition to referable DR and VTDR. The DL system was composed of 8 CNNs, all using an adaptation of the VGGNet architecture: (i) an ensemble of 2 networks for the classification of DR severity; (ii) an ensemble of 2 networks for the identification of referable possible glaucoma; (iii) an ensemble of 2 networks for the identification of referable AMD; (iv) 1 network to assess image quality; and (v) 1 network to reject invalid non-retinal images. The DL system was validated using retinal photographs collected from the Singapore Integrated Diabetic Retinopathy Program (SIDRP) and from 10 additional multiethnic datasets from different countries with diverse community- and clinic-based populations with DM.1 In the primary validation dataset of 71,896 images from 17,974 patients, the AUC of the DL system was 0.936 for referable DR, 0.958 for VTDR, 0.941 for glaucoma suspect, and 0.931 for referable AMD. In the secondary validation on the 10 additional datasets (n = 40,752 images), the DL system had AUCs between 0.889 and 0.983 for referable DR. The DL system could also achieve high sensitivity (>90%), specificity (>73.3%) and AUC (>0.89) for identifying other referable eye conditions.

DL IN OCT FOR DETECTING DME
A major issue in DR screening is the detection of DME, which is not easily captured from 2-dimensional (2D) fundus photographs.35,38 Thus, recent studies have focused on demonstrating that DL algorithms can be trained using OCT images to detect DME and other retinal diseases. Kermany et al14 first applied DL and transfer learning techniques in the detection of DME as well as choroidal neovascularization, drusen, and normal from 2D OCT images. Two models (multiclass model and binary model) were first developed using a training dataset of 108,312 2D OCT images, and a “limited model” was further trained using 1000 2D OCT images randomly selected from each class during training to compare transfer learning performance using limited data with results using a large dataset.14 In the validation, they evaluated the performance of the models on a dataset of 1000 2D OCT images and found that the models achieved high performance, even the limited model (sensitivity ≥96%, specificity ≥94%, and AUC ≥0.99).14 This study highlighted the application of transfer learning in conditions with small datasets.

Another major study was performed by Google’s DeepMind in collaboration with Moorfields Eye Hospital, where the group applied combined 3D segmentation and classification networks for interpreting 3D OCT images in predicting tissue volumes and identifying and referring different retinal abnormalities.15 They developed the DL algorithm using 877 segmented scans for the segmentation network and 14,884 scans from 7621 patients with multiple clinical diagnoses and referral decisions for the classification network. The classification network learned to map a segmentation map to 4 referral decisions (urgent, semi-urgent, routine, and observation only) and 10 different retinal pathologies (macular retinal edema, choroidal neovascularization, full-thickness macular hole, partial-thickness macular hole, epiretinal membrane, geographic atrophy, drusen, vitreomacular traction, central serous retinopathy, and normal). Their framework (predicting tissue volumes, identifying retinal abnormalities, and providing referral suggestions) closely matched the clinical decision-making process, separating judgments about the scan itself from the subsequent referral decision, allowing a clinician to inspect and visualize an interpretable segmentation, rather than simply being presented with a diagnosis and referral suggestion.

VISUALIZING DL AND EXPLAINABILITY
A key challenge for clinical adoption of DL algorithms is a major mindset shift in how clinicians entrust clinical care to machines. The ‘black box’ problem has been identified as an impediment to the application of DL in healthcare.53 Previous reports have already performed visualization heat map analysis to represent intuitively the learning procedure of the CNNs for DR and DME.4,14 Recently, Keel et al54 conducted a study to analyze the CNN learning procedure for 2 validated DL systems including one for the detection of referable DR. They developed and applied a CNN-independent adaptive kernel visualization technique for visualizing the learning procedure of the networks. Using this method, discriminative image regions were highlighted when the
classification probability output of being diagnosed was greater than 70%, and a heat map was then generated highlighting highly prognostic regions in the input image. They found that typical referable DR lesions (retinal hemorrhage, exudate, intraretinal microvascular abnormality, venous beading) could be identified as the most important prognostic regions in 96% of true-positive referable DR cases whereas nontraditional fundus regions (eg, optic disc and fundus areas adjacent to vessels) were identified in 85% of false-positive referable DR cases. This study may further promote the clinical adoption of DL algorithms and enable clinicians to understand important exposure variables in real time.

HOW DOES DL FIT IN CURRENT DR SCREENING STRATEGY?
A critical question in the conceptualization of using DL technology for DR screening is “where does the DL system fit?”. Deep learning systems could potentially be deployed in 2 different settings.1,18 Figure 1 illustrates the 2 proposed DL-based screening models for DR, compared with an existing model. First, the DL system could be incorporated in existing DR screening programs such as in the UK and Singapore to assist human assessors in centralized reading centers (a semi-automated model).1,55,56 The retinal images can first be analyzed by the DL system, and the human assessors only review images flagged as “referable” or “ungradable” by the DL system (2nd screening) for further confirmation. This model not only decreases the massive workload of reading non-referable retinal images but also avoids referring false-positive cases to ophthalmologists (Fig. 1A). Second, the DL system could be a fully automated model in which all retinal images are analyzed by the DL system, which will be useful in communities without any existing DR screening programs (Fig. 1B). By using this screening model, the sensitivity of the DL system may need to be set higher in order to minimize false-negative cases.

FIGURE 1. Two artificial intelligence–based screening models for diabetic eye diseases have been proposed. A, A semi-automated model in which retinal images are first analyzed by the deep learning system; images flagged by the system are then read by doctors or trained graders (2nd screening). B, A fully automated model in which retinal images are analyzed by the deep learning system. C, Existing model by human assessors.

CURRENT CHALLENGES
The performance of DL algorithms is extremely promising as seen from current literature. However, there are still several challenges and complexities that limit the ability to move forward quickly.

First, as mentioned above, the “black box” approach is still unacceptable. Deep learning uses millions of image features most predictive for classification rather than explicitly detecting clinical features that physicians are familiar with (eg, microaneurysms, hard exudates). It is unclear exactly what the machine “thinks” and “sees”.18 Transparency of DL algorithms is required in order to let users understand the basis to justify a particular diagnosis, treatment recommendation, or outcome prediction that the DL algorithm offered, and also to allow examination for any potential bias.57

Second, AI is most likely to succeed when it is used with high-quality “labeled” input data validated by multiple medical specialists for learning and classifying images in relation to outcomes. Also, DL algorithms are highly data hungry, often requiring millions of observations to reach acceptable performance levels.58 Moreover, training using a single clinical dataset is always limited by potential biases (eg, same ethnicity, same device) which may affect both performance and generalizability.59,60 Using datasets from diverse settings and populations for training DL algorithms is ideal. Nevertheless, such datasets are very limited in availability and very costly.

Third, “real-world” experiments are essential in the
validation of DL algorithms. For example, prospective study design in appropriate clinical settings and testing DL algorithms using independent validation datasets under different settings and conditions that played no role in model development are critical to evaluate the performance of DL algorithms. Several recent studies have already reported the performance of DL algorithms using a prospective study design.6,48,61

Fourth, a continuous data supply is necessary for ongoing improvement and refinement of the DL algorithms, and data may need to be shared across multiple centers for widespread implementation. However, the current healthcare environment holds little incentive for data sharing.57

Lastly, before DL algorithms or systems can be deployed in the clinical workflow of diabetic eye disease screening programs, there are still more questions that need to be addressed. For example, “If a patient suffers an adverse event due to an AI-based technology, who is responsible?”.57 In addition, more rigorous scientific evidence on safety, validity, reproducibility, reliability, usability, and patient and clinician acceptability is also critical before clinical deployment.

CONCLUSIONS
In summary, AI and DL are entering the mainstream of clinical medicine. This technology can augment human intelligence to improve decision making and operational processes. Screening for DR is almost an ideal task for AI in health care. Looking forward, AI will become ubiquitous and indispensable for screening, with the expectation of improving the efficiency and accessibility of screening programs, and will therefore be able to prevent visual loss and blindness from this devastating disease.

REFERENCES
1. Ting DSW, Cheung CY, Lim G, et al. Development and validation of a deep learning system for diabetic retinopathy and related eye diseases using retinal images from multiethnic populations with diabetes. JAMA. 2017;318:2211–2223.
2. Abràmoff MD, Lou Y, Erginay A, et al. Improved automated detection of diabetic retinopathy on a publicly available dataset through integration of deep learning. Invest Ophthalmol Vis Sci. 2016;57:5200–5206.
3. Gulshan V, Peng L, Coram M, et al. Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. JAMA. 2016;316:2402–2410.
4. Gargeya R, Leng T. Automated identification of diabetic retinopathy using deep learning. Ophthalmology. 2017;124:962–969.
5. Li Z, Keel S, Liu C, et al. An automated grading system for detection of vision-threatening referable diabetic retinopathy on the basis of color fundus photographs. Diabetes Care. 2018;41:2509–2516.
6. Kanagasingam Y, Xiao D, Vignarajan J, et al. Evaluation of artificial intelligence-based grading of diabetic retinopathy in primary care. JAMA Netw Open. 2018;1:e182665.
7. Burlina PM, Joshi N, Pacheco KD, et al. Assessment of deep generative
[...]
age-related macular degeneration. JAMA Ophthalmol. 2018;136:1359–1366.
10. Li Z, He Y, Keel S, et al. Efficacy of a deep learning system for detecting glaucomatous optic neuropathy based on color fundus photographs. Ophthalmology. 2018;125:1199–1206.
11. Medeiros FA, Jammal AA, Thompson AC. From machine to machine: An OCT-trained deep learning algorithm for objective quantification of glaucomatous damage in fundus photographs. Ophthalmology. 2019;126:513–521.
12. Redd TK, Campbell JP, Brown JM, et al. Evaluation of a deep learning image assessment system for detecting severe retinopathy of prematurity. Br J Ophthalmol. 2018 Nov 23. Epub ahead of print.
13. Brown JM, Campbell JP, Beers A, et al. Automated diagnosis of plus disease in retinopathy of prematurity using deep convolutional neural networks. JAMA Ophthalmol. 2018;136:803–810.
14. Kermany DS, Goldbaum M, Cai W, et al. Identifying medical diagnoses and treatable diseases by image-based deep learning. Cell. 2018;172:1122–1131.e9.
15. De Fauw J, Ledsam JR, Romera-Paredes B, et al. Clinically applicable deep learning for diagnosis and referral in retinal disease. Nat Med. 2018;24:1342–1350.
16. Schlegl T, Waldstein SM, Bogunovic H, et al. Fully automated detection and quantification of macular fluid in OCT using deep learning. Ophthalmology. 2018;125:549–558.
17. FDA permits marketing of artificial intelligence-based device to detect certain diabetes-related eye problems [news release]. US Food & Drug Administration; April 11, 2018. https://www.fda.gov/NewsEvents/Newsroom/PressAnnouncements/ucm604357.htm?fbclid=IwAR2h4qBpSAD6vx8EErts2nhhundX8654ZUL3xwvVahXyLxnSYSxCAOMEias. Accessed February 24, 2019.
18. Wong TY, Bressler NM. Artificial intelligence with deep learning technology looks into diabetic retinopathy screening. JAMA. 2016;316:2366–2367.
19. Beam AL, Kohane IS. Translating artificial intelligence into clinical care. JAMA. 2016;316:2368–2369.
20. Google AI matches diabetic retinopathy screens [medical news]. Medscape; November 30, 2016. https://www.medscape.com/viewarticle/872555. Accessed February 24, 2019.
21. Metz C. India fights diabetic blindness with help from A.I. New York Times. March 10, 2019: https://www.nytimes.com/2019/03/10/technology/artificial-intelligence-eye-hospital-india.html. Accessed March 12, 2019.
22. Ogurtsova K, da Rocha Fernandes JD, Huang Y, et al. IDF Diabetes Atlas: Global estimates for the prevalence of diabetes for 2015 and 2040. Diabetes Res Clin Pract. 2017;128:40–50.
23. Chan JC, Malik V, Jia W, et al. Diabetes in Asia: epidemiology, risk factors, and pathophysiology. JAMA. 2009;301:2129–2140.
24. Chan JC, Zhang Y, Ning G. Diabetes in China: a societal solution for a personal challenge. Lancet Diabetes Endocrinol. 2014;2:969–979.
25. Wong TY, Cheung CM, Larsen M, et al. Diabetic retinopathy. Nat Rev Dis Primers. 2016;2:16012.
26. Cheung N, Mitchell P, Wong TY. Diabetic retinopathy. Lancet. 2010;376:124–136.
27. Yau JW, Rogers SL, Kawasaki R, et al. Global prevalence and major risk factors of diabetic retinopathy. Diabetes Care. 2012;35:556–564.
models for high-resolution synthetic retinal image generation of age-related 28. Cheung CY, Ikram MK, Klein R, et al. The clinical implications of recent
macular degeneration. JAMA Ophthalmol. 2019 Jan 10. Epub ahead of print. studies on the structure and function of the retinal microvasculature in
8. Peng Y, Dharssi S, Chen Q, et al. DeepSeeNet: A deep learning model for diabetes. Diabetologia. 2015;58:871–885.
automated classification of patient-based age-related macular degeneration 29. Zhang P, Zhang X, Brown JB, et al. In: Economic Impact of Diabetes. IDF
severity from color fundus photographs. Ophthalmology. 2019;126:565–575. Diabetes Atlas. 4th ed. International Diabetes Federation; 2010.
9. Burlina PM, Joshi N, Pacheco KD, et al. Use of deep learning for detailed 30. Zhang X, Low S, Kumari N, et al. Direct medical cost associated with
severity characterization and estimation of 5-year risk among patients with diabetic retinopathy severity in type 2 diabetes in Singapore. PLoS One.

© 2019 Asia-Pacific Academy of Ophthalmology www.apjo.org | 163

Copyright © 2019 Asia-Pacific Academy of Ophthalmology. Unauthorized reproduction of this article is prohibited.
Cheung et al Asia-Pacific Journal of Ophthalmology • Volume 8, Number 2, March/April 2019

2017;12:e0180949. skin cancer with deep neural networks. Nature. 2017;542:115–118.


31. Diabetic Retinopathy Clinical Research Network, Wells JA, Glassman AR, 46. Lakhani P, Sundaram B. Deep learning at chest radiography: automated
et al. Aflibercept, bevacizumab, or ranibizumab for diabetic macular edema. classification of pulmonary tuberculosis by using convolutional neural
N Engl J Med. 2015;372:1193–1203. networks. Radiology. 2017;284:574–582.
32. Wang LZ, Cheung CY, Tapp RJ, et al. Availability and variability in 47. Yu KH, Zhang C, Berry GJ, et al. Predicting non-small cell lung cancer
guidelines on diabetic retinopathy screening in Asian countries. Br J prognosis by fully automated microscopic pathology image features. Nat
Ophthalmol. 2017;101:1352–1360. Commun. 2016;7:12474.
33. Bhargava M, Cheung CY, Sabanayagam C, et al. Accuracy of diabetic 48. Abràmoff MD, Lavin PT, Birch M, et al. Pivotal trial of an autonomous AI-
retinopathy screening by trained non-physician graders using non-mydriatic based diagnostic system for detection of diabetic retinopathy in primary
fundus camera. Singapore Med J. 2012;53:715–719. care offices. NPJ Digit Med. 2018;1:39.
34. Wong TY, Sun J, Kawasaki R, et al. Guidelines on Diabetic Eye Care: 49. Li F, Chen H, Liu Z, et al. Fully automated detection of retinal disorders by
The International Council of Ophthalmology Recommendations for image-based deep learning. Graefes Arch Clin Exp Ophthalmol. 2019;257:
Screening, Follow-up, Referral, and Treatment Based on Resource Settings. 495–505.
Ophthalmology. 2018;125:1608–1622. 50. Abràmoff MD, Folk JC, Han DP, et al. Automated analysis of retinal
35. Wong RL, Tsang CW, Wong DS, et al. Are we making good use of images for detection of referable diabetic retinopathy. JAMA Ophthalmol.
our public resources? The false-positive rate of screening by fundus 2013;131:351–357.
photography for diabetic macular oedema. Hong Kong Med J. 2017;23: 51. Verbraak FD, Abramoff MD, Bausch GCF, et al. Diagnostic accuracy of a
356–364. device for the automated detection of diabetic retinopathy in a primary care
36. Jyothi S, Elahi B, Srivastava A, et al. Compliance with the quality standards setting. Diabetes Care. 2019;42:651–656.
of National Diabetic Retinopathy Screening Committee. Prim Care 52. van der Heijden AA, Abramoff MD, Verbraak F, et al. Validation of
Diabetes. 2009;3:67–72. automated screening for referable diabetic retinopathy with the IDx-DR
37. Olson J, Sharp P, Goatman K, et al. Improving the economic value of device in the Hoorn Diabetes Care System. Acta Ophthalmol. 2018;96:
photographic screening for optical coherence tomography-detectable 63–68.
macular oedema: a prospective, multicentre, UK study. Health Technol 53. Castelvecchi D. Can we open the black box of AI? Nature. 2016;538:20–23.
Assess. 2013;17:1–142. 54. Keel S, Wu J, Lee PY, et al. Visualizing deep learning models for
38. Goh JK, Cheung CY, Sim SS, et al. Retinal imaging techniques for diabetic the detection of referable diabetic retinopathy and glaucoma. JAMA
retinopathy screening. J Diabetes Sci Technol. 2016;10:282–294. Ophthalmol. 2018 Dec 20. Epub ahead of print.
39. Leal J, Luengo-Fernandez R, Stratton IM, et al. Cost-effectiveness of digital 55. Nguyen HV, Tan GS, Tapp RJ, et al. Cost-effectiveness of a National
surveillance clinics with optical coherence tomography versus hospital eye Telemedicine Diabetic Retinopathy Screening Program in Singapore.
service follow-up for patients with screen-positive maculopathy. Eye (Lond). Ophthalmology. 2016;123:2571–2580.
2018 Nov 30. Epub ahead of print. 56. Scanlon PH. The English National Screening Programme for diabetic
40. Scanlon PH, Aldington SJ, Leal J, et al. Development of a cost-effectiveness retinopathy 2003-2016. Acta Diabetol. 2017;54:515–525.
model for optimisation of the screening interval in diabetic retinopathy 57. He J, Baxter SL, Xu J, et al. The practical implementation of artificial
screening. Health Technol Assess. 2015;19:1–116. intelligence technologies in medicine. Nat Med. 2019;25:30–36.
41. Abràmoff MD, Reinhardt JM, Russell SR, et al. Automated early detection 58. Obermeyer Z, Emanuel EJ. Predicting the future – big data, machine
of diabetic retinopathy. Ophthalmology. 2010;117:1147–1154. learning, and clinical medicine. N Engl J Med. 2016;375:1216–1219.
42. Tufail A, Rudisill C, Egan C, et al. Automated diabetic retinopathy image 59. Gianfrancesco MA, Tamang S, Yazdany J, et al. Potential biases in machine
assessment software: diagnostic accuracy and cost-effectiveness compared learning algorithms using electronic health record data. JAMA Intern Med.
with human graders. Ophthalmology. 2017;124:343–351. 2018;178:1544–1547.
43. Fleming AD, Goatman KA, Philip S, et al. Automated grading for diabetic 60. Zou J, Schiebinger L. AI can be sexist and racist – it’s time to make it fair.
retinopathy: a large-scale audit using arbitration by clinical experts. Br J Nature. 2018;559:324–326.
Ophthalmol. 2010;94:1606–1610. 61. Keel S, Lee PY, Scheetz J, et al. Feasibility and patient acceptability of a
44. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521:436–444. novel artificial intelligence-based screening model for diabetic retinopathy
45. Esteva A, Kuprel B, Novoa RA, et al. Dermatologist-level classification of at endocrinology outpatient services: a pilot study. Sci Rep. 2018;8:4330.


Copyright © 2019 Asia-Pacific Academy of Ophthalmology. Unauthorized reproduction of this article is prohibited.
