
Artificial Intelligence (AI) for Screening Mammography, From the AJR Special Series on AI Applications

Leslie R. Lamb, MD, MSc1; Constance D. Lehman, MD, PhD1; Aimilia Gastounioti, PhD2,3; Emily F. Conant, MD2; Manisha Bahl, MD, MPH1

Breast Imaging · Special Series Review



Keywords: artificial intelligence, breast cancer, implementation, machine learning, screening mammography

Submitted: Nov 1, 2021; revision requested: Nov 15, 2021; revision received: Dec 14, 2021; accepted: Jan 3, 2022; first published online: Jan 12, 2022

Disclosures: C. D. Lehman receives institutional research support from the Breast Cancer Research Foundation, GE Healthcare, and Hologic and is a cofounder of Clairity. A. Gastounioti receives research support from iCAD. E. F. Conant receives research support from Hologic, iCAD, and OM1; is a member of advisory panels of Hologic and iCAD; and has received speaker fees from AuntMinnie.com. M. Bahl is a consultant for Lunit and an expert panelist for 2nd.MD. The remaining author declares that there are no additional disclosures relevant to the subject matter of this article.

Supported by the Susan G. Komen Foundation (grant PDF17479714 to A. Gastounioti) and the NIH (grant K08CA241365 to M. Bahl). The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH or the Susan G. Komen Foundation.

ARRS is accredited by the Accreditation Council for Continuing Medical Education (ACCME) to provide continuing medical education activities for physicians. The ARRS designates this journal-based CME activity for a maximum of 1.00 AMA PRA Category 1 Credits™ and 1.00 American Board of Radiology MOC Part II Self-Assessment CME (SA-CME). Physicians should claim only the credit commensurate with the extent of their participation in the activity. To access the article for credit, follow the prompts associated with the online version of this article.

doi.org/10.2214/AJR.21.27071 · AJR 2022; 219:369–381 · ISSN-L 0361-803X/22/2193-369 · © American Roentgen Ray Society

1 Department of Radiology, Massachusetts General Hospital, 55 Fruit St, WAC 240, Boston, MA 02114. Address correspondence to M. Bahl (mbahl1@mgh.harvard.edu).
2 Department of Radiology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA.
3 Present affiliation: Mallinckrodt Institute of Radiology, Washington University School of Medicine, St. Louis, MO.

Artificial intelligence (AI) applications for screening mammography are being marketed for clinical use in the interpretive domains of lesion detection and diagnosis, triage, and breast density assessment and in the noninterpretive domains of breast cancer risk assessment, image quality control, image acquisition, and dose reduction. Evidence in support of these nascent applications, particularly for lesion detection and diagnosis, is largely based on multireader studies with cancer-enriched datasets rather than rigorous clinical evaluation aligned with the application's specific intended clinical use. This article reviews commercial AI algorithms for screening mammography that are currently available for clinical practice, their use, and the evidence supporting their performance. Clinical implementation considerations, such as workflow integration, governance, and ethical issues, are also described. In addition, the future of AI for screening mammography is discussed, including the development of interpretive and noninterpretive AI applications and strategic priorities for research and development.

Artificial intelligence (AI) for breast cancer screening has rapidly advanced from feasibility and reader studies to clinical implementation. As of December 2021, there are more than 15 FDA-approved AI models for screening mammography [1]. These models can be used for interpretive applications, including lesion detection and diagnosis, triage, and density assessment, and for noninterpretive applications, including risk assessment, image quality control, image acquisition, and dose reduction [1–7]. AI, a branch of computer science, includes machine learning (ML), which refers to computers learning from data without being explicitly programmed, and deep learning (DL), which relies on multilayer neural networks to extract features from raw data [8]. The most common network for image analysis is the convolutional neural network (CNN). The history of AI, DL, and CNNs can be traced back several decades; however, recent advances in computational resources, pivotal developments in AI algorithms, and the growth of digital health data have fueled a revolution that has brought AI to the mainstream in computational medical imaging [9].

Although multiple trials have proved that screening mammography leads to fewer breast cancer deaths, digital mammography (DM) remains an imperfect tool [10]. For example, the sensitivity of DM is estimated to be 87% [11]. In the United States, 1 in 10 women is recalled from screening mammography for further evaluation, but only 5% of those recalled have breast cancer [12]. Furthermore, radiologist performance varies widely, with 41% exceeding recommended recall rates [11]. In addition, the shortage of subspecialized breast imaging radiologists and the longer interpretation times required for digital breast tomosynthesis (DBT) have fueled interest in approaches that improve efficiency without adversely affecting performance [13]. AI algorithms have the potential to improve the interpretation of screening mammography by increasing sensitivity for breast cancer detection, reducing avoidable recalls and biopsies, and decreasing interpretation times.

Clinical implementation of AI algorithms has progressed rapidly, leading to concerns that the pace has been ahead of rigorous evaluation of approved algorithms and their impact on patient outcomes [14–16]. This article reviews commercial AI algorithms for screening mammography, their use, and evidence supporting their performance. Clinical implementation considerations are also discussed, in addition to the future of AI for screening mammography.

HIGHLIGHTS

- Commercial artificial intelligence (AI) interpretive applications for screening mammography have been shown to have high diagnostic accuracy in multireader studies.
- Evidence supporting these applications is largely based on multireader studies with cancer-enriched datasets; however, rigorous clinical evaluation aligned with the application's intended use is needed.
- Steps to support successful implementation of an AI tool into clinical practice include internal validation, workflow integration, user education, and continuous monitoring.

Learning From Our Past

Computer-aided detection (CAD) became a major focus for mammography in the 1980s and 1990s [17]. The first CAD system gained FDA approval in 1998 and was integrated into practices across the United States after reimbursement for its use was granted by the Centers for Medicare & Medicaid Services in 2002 [18, 19]. The studies leading to FDA approval and early adoption were largely reader studies and single-center retrospective studies, but multiple subsequent studies have shown that CAD in clinical practice does not improve any metric of screening performance [18, 20]. In particular, CAD has a high false-positive rate, which has led radiologists to disregard most of its annotations [19].

An early postimplementation study of CAD in the United States by the Breast Cancer Surveillance Consortium (BCSC) showed that CAD was associated with a decrease in AUC from 0.92 to 0.87 (p = .005) [20]. Specificity was reduced from 90.2% without CAD to 87.2% with CAD (p < .001), and sensitivity was unchanged (80.4% vs 84.0%, p = .32) [20]. A subsequent larger BCSC study with DM reported similar cancer detection rates with or without CAD [18]. Postimplementation clinical studies did suggest, however, that single reading with CAD could replace double reading without CAD in certain practices [21, 22].

The primary feature that differentiates new DL-based AI algorithms from traditional CAD is that a DL-based AI algorithm identifies the imaging features that are useful for a particular image analysis task, whereas the features used by traditional CAD are human-derived [16, 23] (Fig. 1). DL-based AI algorithms can, therefore, uncover features and relationships among features that are not perceived by humans [8]. DL has revived the promise of AI and has pervaded mammographic screening as one of the most promising computerized breast imaging tools.
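To make the distinction concrete, below is a minimal sketch of the kind of CNN referenced above, written in PyTorch (our choice for illustration; commercial architectures are proprietary and far larger). The convolutional filters start out random and are learned from labeled examples, rather than being hand-engineered as in traditional CAD.

```python
import torch
import torch.nn as nn

class TinyMammoCNN(nn.Module):
    """Minimal CNN for illustration: classifies an image patch as
    suspicious vs not suspicious. Its convolutional filters are
    learned from labeled data, not hand-engineered."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),   # learned edge/texture filters
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),  # learned higher-order features
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 16 * 16, 2),  # two classes: not suspicious / suspicious
        )

    def forward(self, x):
        return self.classifier(self.features(x))

model = TinyMammoCNN()
patch = torch.randn(1, 1, 64, 64)      # one grayscale 64 x 64 patch
probabilities = model(patch).softmax(dim=1)  # P(suspicious) = probabilities[0, 1]
print(probabilities)
```

During training, backpropagation adjusts the filter weights to minimize a loss on labeled examples, so the learned "features" are whatever best separates the classes, including patterns not explicitly encoded by humans.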

Fig. 1—Diagram shows comparison of traditional machine learning and deep learning for assessment of mammogram. Deep learning–based algorithm self-learns
imaging features for image analysis task in lieu of relying on human-engineered features. (Image by Loomis S, used with permission)




TABLE 1: Examples of FDA-Approved Lesion Detection and Diagnosis Applications for Screening Mammography^a

| Tool (Company) | No. of Cases | No. of Radiologists in Reader Study | Reported AUCs: Radiologists Aided vs Unaided | FDA Class | Type of Mammogram | Vendor |
|---|---|---|---|---|---|---|
| MammoScreen 2.0 (Therapixel)^b [25, 26] | 240 for DM; 240 for DBT | 14 for DM; 20 for DBT | 0.80 vs 0.77 for DM; 0.83 vs 0.79 for DBT | II | DM and DBT | GE Healthcare (DM); Hologic (DM and DBT) |
| Genius AI Detection (Hologic) [27] | 390 | 17 | 0.83 vs 0.79 | II | DBT | Hologic |
| ProFound AI Software 3.0 (iCAD)^c [28, 29] | 260 | 24 | 0.85 vs 0.80 | II | DBT | GE Healthcare, Hologic, Siemens Healthineers |
| Transpara 1.7.0 (ScreenPoint Medical)^d [30, 33, 100] | 240 for DM; 240 for DBT | 14 for DM; 18 for DBT | 0.89 vs 0.87 for DM; 0.86 vs 0.83 for DBT | II | DM and DBT | GE Healthcare (DM); Philips Healthcare (DM); Fujifilm (DM and DBT); Hologic (DM and DBT); Siemens Healthineers (DM and DBT) |
| Lunit INSIGHT MMG (Lunit) [31] | 240 | 12 | 0.81 vs 0.75 | II | DM | GE Healthcare, Hologic, Siemens Healthineers |

Note—DM = digital 2D mammography, DBT = digital breast tomosynthesis.
^a The information in this table is based on the referenced FDA documentation and, where indicated, verification by company representatives.
^b De Snoeck Q, Therapixel representative, written communication, 2021.
^c Hawkins R, iCAD representative, written and oral communications, 2021.
^d Karssemeijer N, ScreenPoint Medical representative, written communication, 2021.

Commercial Artificial Intelligence Algorithms Approved for Clinical Practice

Lesion Detection and Diagnosis

AI-based CAD systems offer both computer-aided detection (which refers to identification of lesions) and computer-aided diagnosis (which refers to classification of lesions as likely benign or malignant) [24]. As of December 2021, there are at least five FDA-approved AI applications for lesion detection and diagnosis [1, 25–31] (Table 1). These tools offer traditional lesion markers with corresponding level-of-suspicion scores, in addition to scores at the breast and/or examination levels (Fig. 2). Four of the five tools can be used for DBT examinations.

Evidence supporting these applications is largely based on reader studies with cancer-enriched datasets. Although early research is promising, postimplementation evaluation in clinical settings is limited.

Fig. 2—Example of lesion detection and diagnosis by artificial intelligence (AI) algorithm for digital breast tomosynthesis in 76-year-old woman. A and B, Right craniocaudal (A) and mediolateral oblique (B) views show focal asymmetry (box) in lower, slightly inner quadrant of right breast, which was detected by AI algorithm and assigned lesion score (42, A; 64, B) reflecting percentage likelihood of malignancy (i.e., 42% on craniocaudal view and 64% on mediolateral oblique view). Histopathology showed grade 2 invasive ductal cancer. Asymmetry in lateral right breast was also detected by AI algorithm (circle, A) and assigned lesion score of 8 (A; i.e., 8% likelihood of malignancy); this latter finding was deemed benign on basis of long-term mammographic stability.
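The per-view lesion scores in Figure 2 are reported on a 0–100 scale reflecting likelihood of malignancy, and the tools additionally report breast- or examination-level scores. How a vendor rolls per-view scores up to the examination level is proprietary; the snippet below is only a plausible sketch that takes the maximum across all findings and views.

```python
def exam_level_score(view_lesion_scores: dict) -> int:
    """Illustrative aggregation: the examination is as suspicious as its
    most suspicious finding on any view. Vendor implementations differ."""
    all_scores = [s for view in view_lesion_scores.values() for s in view]
    return max(all_scores) if all_scores else 0

# Scores from Figure 2: the focal asymmetry scored 42 (CC) and 64 (MLO);
# a second, ultimately benign asymmetry scored 8 on the CC view.
scores = {"RCC": [42, 8], "RMLO": [64]}
print(exam_level_score(scores))  # 64
```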




Fig. 3—Mean ROC curves with artificial intelligence (AI) support from lesion detection and diagnosis algorithm and without AI support. Means are calculated across 14 radiologists who participated in reader study [33]. Values in parentheses are AUCs (unaided, 0.866; AI support, 0.886). Adapted from Rodríguez-Ruiz A, Krupinski E, Mordang JJ, Schilling K, Heywang-Köbrunner SH, Sechopoulos I, Mann RM, 2019, doi.org/10.1148/radiol.2018181371, according to CC BY 4.0 license (creativecommons.org/licenses/by/4.0/legalcode), including disclaimer in Section 5, with adjustments to wording of axis labels, adjusted coloring within graph, addition of text within graph, removal of grid lines, and horizontal widening of graph area.

Fig. 4—Bar chart shows differences between mammographic reading times with artificial intelligence (AI) support and without AI support in reader study of 24 radiologists. Reading time improved with AI support, with mean decrease of 35 seconds. Each blue bar represents individual reader. Only one reader (leftmost bar) had increase in reading time with AI support (1.4 seconds). Data shown were extracted from Conant et al. [35]. (Image by Loomis S, used with permission)

Factors that may affect the utility of AI applications in practice include the radiologist's confidence in the AI system and in their own interpretation, the radiologist's training and experience in interpreting mammography with and without AI support, interaction with the AI system, and the transparency of the rationale being used by the AI system for its predictions [32].

In a multireader study evaluating one of the commercial applications, the AUC was reported to be higher with the use of the AI system (0.89 vs 0.87, p = .002), although this increase was not as pronounced for experienced radiologists [33] (Fig. 3). Sensitivity improved with AI support compared with sensitivity without AI support (86% vs 83%, p = .046), and there was a nonsignificant increase in specificity (79% vs 77%, p = .06) [33]. A multireader study of the same AI system found that its stand-alone performance was not inferior to that of 101 radiologists (AUC, 0.84 vs 0.81) [34].

With regard to AI-based CAD for DBT, a multireader study of a commercial AI system (which provides localization of suspicious findings within the DBT dataset) reported a reduction in mean reading time with AI support (64.1 vs 30.4 seconds) [35]; however, these results come from a reader study without simulation of routine clinical workflow [35] (Fig. 4).

A head-to-head comparison of three AI-based CAD systems reported AUCs of 0.96, 0.92, and 0.92, with sensitivities of 81.9%, 67.0%, and 67.4% when matching the specificity of the radiologists [36]. Best performance was achieved, however, by combining the high-performing AI system with the expertise of radiologists. In a highly publicized report from Google Health and DeepMind Technologies, their AI system (not currently approved by the FDA) outperformed all six radiologists in a reader study, with the AI system's AUC shown to be higher than the mean radiologist AUC by an absolute margin of 11.5% [37]. The limitations of that study include the use of a nonrepresentative U.S. dataset (cancer-enriched and from a single site) and images from a single vendor. A systematic review of published reports on lesion detection and diagnosis algorithms, such as the aforementioned algorithm, found that most studies in this domain have poor methodologic quality with applicability concerns and high bias risk [15].
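The comparison above reports AI sensitivities "when matching the specificity of the radiologists." The sketch below shows one common way such matching is done, using scikit-learn on synthetic data: sweep the AI scores' ROC curve and read off the sensitivity at the operating point whose specificity meets the radiologists' level (the actual study's method may differ).

```python
import numpy as np
from sklearn.metrics import roc_curve

rng = np.random.default_rng(0)
# Synthetic ground truth and AI scores for 1000 screening exams.
y_true = rng.binomial(1, 0.05, size=1000)            # ~5% cancer prevalence
ai_score = rng.normal(loc=y_true * 1.5, scale=1.0)   # cancers score higher on average

fpr, tpr, thresholds = roc_curve(y_true, ai_score)
specificity = 1 - fpr

target_specificity = 0.90  # e.g., the radiologists' pooled specificity
# Highest-sensitivity operating point that still meets the specificity target.
eligible = specificity >= target_specificity
matched_sensitivity = tpr[eligible].max()
print(f"AI sensitivity at >= {target_specificity:.0%} specificity: {matched_sensitivity:.1%}")
```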

TABLE 2: Examples of FDA-Approved Triage Applications for Screening Mammography^a

| Tool (Company) | No. of Cases in Performance Study | Reported AUCs | FDA Class | Type of Mammogram | Vendor |
|---|---|---|---|---|---|
| Saige-Q (DeepHealth)^b [38] | 1333 for DM and 1528 for DBT | 0.97 for DM and 0.99 for DBT | II | DM and DBT | Hologic |
| cmTriage (CureMetrix) [39] | 1255 | 0.95 | II | DM | Agnostic |
| HealthMammo (Zebra Medical Vision) [40] | 835 | 0.97 | II | DM | Hologic |

Note—DM = digital 2D mammography, DBT = digital breast tomosynthesis.
^a The information in this table is based on the referenced FDA documentation and, in one case, verification by a company representative.
^b Lotter W, DeepHealth representative, written communication, 2021.
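Operationally, triage tools such as those in Table 2 reduce to ordering the PACS worklist by an AI suspicion score and flagging cases above an operating threshold. A minimal sketch follows; the field names and threshold are hypothetical, not any vendor's API.

```python
from dataclasses import dataclass

@dataclass
class ScreeningExam:
    accession: str
    ai_suspicion: float  # 0.0-1.0 score from the triage algorithm (hypothetical)

def prioritize(worklist, flag_threshold=0.5):
    """Sort the most suspicious exams first and mark those above the
    operating threshold for expedited review. Illustrative only."""
    ordered = sorted(worklist, key=lambda e: e.ai_suspicion, reverse=True)
    return [(e.accession, e.ai_suspicion >= flag_threshold) for e in ordered]

worklist = [
    ScreeningExam("A1001", 0.12),
    ScreeningExam("A1002", 0.91),
    ScreeningExam("A1003", 0.47),
]
print(prioritize(worklist))
# [('A1002', True), ('A1003', False), ('A1001', False)]
```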




TABLE 3: Examples of FDA-Approved Density Assessment Applications for Screening Mammography^a

| Tool (Company) | FDA Class | Type of Mammogram | Vendor |
|---|---|---|---|
| DM-Density (Densitas)^b [45] | II | DM | GE Healthcare, Hologic, Siemens Healthineers |
| densityai (Densitas)^b [46] | II | DM and synthetic DM | GE Healthcare (DM), Siemens Healthineers (DM), Hologic (DM^c) |
| DenSeeMammo (Statlife) [47] | II | DM | GE Healthcare, Hologic |
| Insight BD (Siemens Healthineers) [48] | II | DM and DBT | Siemens Healthineers |
| PowerLook Density Assessment 2.1 and 4.0 (iCAD)^d [49] | II | DM (2.1) and synthetic DM (4.0) | Siemens Healthineers (DM), GE Healthcare (DM^e), Hologic (DM^c) |
| Quantra 2.1 and 2.2 (Hologic) [50] | II | DM (2.1 and 2.2) and DBT (2.2) | GE Healthcare (2.1), Siemens Healthineers (2.1), Hologic (2.1 and 2.2) |
| Visage Breast Density (Visage Imaging) [51] | II | DM and DBT | Hologic |
| Volpara Imaging Software (Volpara Health Technologies)^f [52] | II | DM and DBT | Canon (DM), Fujifilm (DM), Medi-Future (DM), Metaltronica (DM), Philips Healthcare (DM), PlanMed (DM), Sectra (DM), GE Healthcare (DM and DBT), Hologic (DM and DBT), IMS Giotto (DM and DBT), Siemens Healthineers (DM and DBT) |
| WRDensity (Whiterabbit.ai) [53] | II | DM and synthetic DM | Hologic |

Note—DM = digital 2D mammography, DBT = digital breast tomosynthesis.
^a The information in this table is based on the referenced FDA documentation and, where indicated, verification by company representatives.
^b Abdolell M, Densitas representative, written communication, 2021.
^c With C-View software (Hologic).
^d Hawkins R, iCAD representative, written and oral communications, 2021.
^e With V-Preview software (GE Healthcare).
^f Chan A, Volpara Health Technologies representative, written communication, 2021.
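Several of the tools in Table 3 compute a numeric (e.g., volumetric percent) density value and report a BI-RADS-style category alongside it, as discussed in the Breast Density Assessment section below. A minimal sketch of such a mapping; the thresholds here are purely illustrative, not any vendor's calibrated cutpoints.

```python
def birads_density_category(percent_density: float) -> str:
    """Map a volumetric breast density percentage to a BI-RADS-style
    category (a-d). Thresholds are illustrative placeholders only."""
    if percent_density < 4.0:
        return "a"  # almost entirely fatty
    elif percent_density < 8.0:
        return "b"  # scattered areas of fibroglandular density
    elif percent_density < 15.0:
        return "c"  # heterogeneously dense
    else:
        return "d"  # extremely dense

for pd in (2.5, 6.0, 12.0, 20.0):
    print(pd, "->", birads_density_category(pd))
```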

Triage

Triage tools can be used to prioritize patients within the PACS worklist by flagging mammograms with one or more suspicious findings. At least one of the previously described applications can also be used for triage purposes (specifically, the case score can be used for prioritization) [29]. In addition, at least three AI applications that are specifically marketed for triage have been approved by the FDA as of December 2021 [1, 38–40] (Table 2). One of the triage tools can be used for both DBT and DM examinations [38]. Two of the tools can be used only with Hologic, and one tool is vendor-agnostic.

For one of the triage tools, performance data cited in the FDA summary are based on a multicenter cancer-enriched dataset [39]. The AUC is 0.95 and is similar across different breast densities and mammographic lesion types [39]. In a published case study based on 2 years of clinical experience with this tool, improvements in mean turnaround time from examination completion to reporting (from 9.6 days in 2019 to 3.9 days in 2021, p < .05) were largely attributed to the triage tool [41]; however, other factors, such as the number of radiologists, their experience, examination volumes, and workflow changes, could have differed between 2019 and 2021 and could have contributed to the observed reduction. The authors also highlighted "a perception of greater ease" when interpreting batched mammograms without any suspicious findings flagged by the tool [41].

Breast Density Assessment

Breast density notification legislation has heightened interest in the importance of dense breast tissue, which is a risk factor for breast cancer and can also mask cancers on mammography; however, considerable intrareader and interreader variability exists with regard to breast density assessment [42–44]. At least nine AI applications for breast density assessment have been approved by the FDA as of December 2021 [1, 45–53] (Table 3). The tools provide numeric density values for each breast, in addition to providing BI-RADS density categories [54]. The tools may be used for DM, synthetic DM, or DBT and may be vendor-specific or vendor-agnostic.

In a head-to-head comparison of three fully automated methods and visual assessment of breast density, two automated methods were strongly associated with breast cancer risk, although the strongest predictor was visual assessment [55].
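Agreement between automated and visual density assessment in comparisons such as these is typically quantified with Cohen's kappa (see the κ values reported after the table and figures below); values near 0.4–0.6 are conventionally read as moderate agreement. A small sketch with scikit-learn on made-up category assignments:

```python
from sklearn.metrics import cohen_kappa_score

# BI-RADS density categories (a-d) for ten exams; made-up data.
visual    = ["a", "b", "b", "c", "c", "c", "d", "b", "c", "d"]
automated = ["a", "b", "c", "c", "b", "c", "d", "b", "d", "d"]

kappa = cohen_kappa_score(visual, automated)
print(f"Cohen's kappa: {kappa:.2f}")  # ~0.58 here: moderate agreement
```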

TABLE 4: Potential Use Case Scenarios for Improving Breast Imaging Services With Artificial Intelligence

| Use Case | Purpose |
|---|---|
| Breast cancer risk assessment | Develop a reliable standard for breast cancer risk based on information from multiple sources |
| Classification of calcifications | Detect the morphology and distribution of calcifications on mammography and recommend follow-up based on risk level |
| Classification of high-risk breast lesions | Automate high-risk lesion classification based on level of suspicion of upgrade to malignancy |

Note—The information in this table is based on the Breast Imaging section of the American College of Radiology Data Science Institute website [88].




Fig. 5—Example of ROC curve. Blue line represents model that is perfect classifier (AUC = 1.0), and red line represents model that is random classifier (AUC = 0.5). Most ROC curves fall between these two lines, as shown by green and purple lines. For any given clinical scenario, sensitivity and specificity are considered to be optimal at operating point or cutoff point on ROC curve. (Image by Loomis S, used with permission)

Fig. 6—Example of confusion matrix. Confusion matrix allows reader to easily determine how artificial intelligence algorithm is performing. In this example, algorithm predicts true class of 2 with 83% accuracy. (Image by Loomis S, used with permission)
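As a minimal sketch of how the quantities in Figures 5 and 6 are computed in practice, the snippet below uses scikit-learn on synthetic scores: the ROC curve and AUC summarize threshold-free discrimination, an operating point is chosen on the curve (here by Youden's J statistic, one common rule), and the confusion matrix tabulates performance at that cutoff.

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score, confusion_matrix

rng = np.random.default_rng(1)
y_true = rng.binomial(1, 0.3, size=500)              # synthetic ground truth
score = rng.normal(loc=y_true * 1.2, scale=1.0)      # synthetic model scores

auc = roc_auc_score(y_true, score)
fpr, tpr, thresholds = roc_curve(y_true, score)

# One common cutoff rule: maximize Youden's J = sensitivity + specificity - 1.
best = np.argmax(tpr - fpr)
cutoff = thresholds[best]

y_pred = (score >= cutoff).astype(int)
print(f"AUC = {auc:.2f}, cutoff = {cutoff:.2f}")
print(confusion_matrix(y_true, y_pred))  # rows: true class; columns: predicted class
```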

In a second head-to-head comparison of two automated methods and visual assessment, all had similar associations with breast cancer risk; however, the level of agreement of the two automated methods with visual assessment was only moderate (κ = 0.57 and 0.46, respectively), with observed differences of 6–14% in dense tissue categorization [56]. Given these findings, the authors suggested using the same technique to evaluate individual changes over time, especially when density serves as a marker for treatment response.

Breast Cancer Risk Assessment

Individualized risk assessment may be used to determine which patients are at high breast cancer risk and who thus may benefit from MRI screening, chemoprevention, or genetic testing. In fact, before a patient undergoes screening with MRI, the patient's risk level (≥ 20%) must be documented to ensure insurance coverage of the examination [57]. Traditional risk assessment models incorporating patient-specific features are calibrated at the population level rather than the individual level and were largely developed based on White women [58]. More recently, mammographic density has been added to risk prediction models [59].

One of the companies with a commercial AI-based breast density algorithm has also developed a risk assessment application, which combines AI-based breast density, breast volume, age, and other clinical risk factors to generate a current risk prediction [2]. The AUC for the model based on imaging features (AI-based breast density and breast volume) has been reported to be 0.60 [60]. Similarly, another company has combined AI-based breast density (which is FDA-approved) with a traditional risk prediction model (Tyrer-Cuzick, version 8, which does not require FDA approval) to calculate lifetime breast cancer risk [3].

Breast density alone, however, does not capture the rich information contained in a mammogram [61]. At least one commercial risk prediction model uses age and more information from the mammogram than just density, including mammographic features such as masses and calcifications and differences in the right and left breasts [4]. This 2-year risk prediction model is marketed as a clinical decision support tool, thereby exempting it from FDA regulation [62]. The DM-based model has an AUC of 0.73 and was externally validated in three patient cohorts [63]. A 1-year risk prediction model based on DBT images by the same company was recently released [4]. These short-term risk prediction models could potentially be used to determine screening intervals (1 year vs 2 years) for individual women.

Image Quality Control, Image Acquisition, and Dose Reduction

The application of AI to the image acquisition process can improve image quality, process efficiency, and technologist performance [5, 6, 19]. For example, a commercial AI application (not requiring FDA approval) provides real-time patient position and compression feedback to technologists [6]. Positioning parameters include nipple in profile, visibility of the inframammary fold, visualization of the pectoralis muscle extending inferiorly to at least 1 cm above the posterior nipple line, and adequate visualization of the pectoralis muscle angle, length, and width. With real-time feedback, repeat imaging can be immediately performed, thus reducing technical recalls. This AI system also provides automated analytic reports with image quality performance metrics. In addition, DL-based reconstruction of synthetic images from DBT examinations is currently being performed and has led to radiation dose reductions (since 2D images need not be obtained) [7].




Clinical Implementation

Clinical Adoption

The potential of AI for screening mammography has not yet been realized, with most research based on reader studies with cancer-enriched datasets and limited postimplementation clinical evaluation [15, 16, 33–35, 37]. Rigorous clinical evaluation aligned with the AI tool's specific intended use will help determine its true impact on patient outcomes. In addition to the lack of evidence, and therefore uncertainty about the value of AI in clinical practice, other obstacles to adoption include concerns about decreased clinical productivity with AI, inconsistent technical performance of AI systems, the absence of guidelines and best practices, and variability in the level of trust in AI by radiologists, other clinicians, and patients [64, 65]. Practical factors to consider include the mammographic views and image format required for model processing, compatibility of the model with local practice techniques (e.g., DM, synthetic DM, or DBT) and mammography equipment, and algorithm processing time and storage needs in the context of local resources. When purchasing an AI tool, cost considerations include software and hardware versus software-only acquisitions, pricing based on the number of mammography machines and/or the volume of examinations, and the possibility of additional charges associated with software updates and license transfers.

Adoption of AI is likely to be influenced by radiologists' knowledge, perception, and attitude. For example, a survey of radiologists and radiology residents found that limited AI knowledge was associated with fear, whereas greater AI knowledge was related to a positive attitude toward AI, suggesting that exposure and training may help accelerate clinical adoption [66]. According to a 2020 survey of 1427 American College of Radiology (ACR) members, one-third of radiologists were using AI as part of their practice, and most respondents who were not using AI responded that they "see no benefit" to AI [65]. Survey respondents expressed a desire for methods to evaluate and report the performance of AI tools on representative datasets, to evaluate tools before purchase, and to monitor their accuracy over time. Addressing radiologists' concerns and mitigating barriers to adoption will provide opportunities for the future growth of AI applications.

Evaluation

Implementation of AI tools requires that the tool works (performance validation), that the user understands the tool (trust), and that the tool is integrated into the existing workflow (user experience design) [19]. The performance of an AI tool can be assessed using ROC curves (Fig. 5) and/or confusion matrices [8] (Fig. 6). After installation of an AI tool, a practice can perform its own validation by allowing a test period during which radiologists do not have access to AI output [67]. Radiologists can later review the predictions made by the AI tool and evaluate its performance across a large volume of cases. This test period will also facilitate learning of the strengths and limitations of the tool and allow users to gain trust in the AI tool's predictions. Even the most trusted AI tool will fail, however, if not appropriately integrated into existing workflows [19].

Workflow Integration

Steps involved in integrating an AI tool into clinical practice include internal validation, workflow integration, user education, and continuous monitoring [68]. The first step involves the practice conducting its own internal validation of the AI tool, since regulatory approval does not ensure that the tool will be generalizable to all practices [67, 69]. The integration phase involves assimilating the AI tool into the PACS and reporting systems workflow in collaboration with the AI vendor and the practice's informatics team [68]. To increase awareness and acceptance, education should involve not only radiologists but also technologists and administrators. The last phase is continuous monitoring of the AI tool after deployment to ensure adequate performance [68, 69].

Certain AI tools for screening mammography may be integrated into the clinical workflow based on local practices and needs (Fig. 7). For example, a lesion detection and diagnosis application could replace the second reader at sites that offer double-reading or could potentially be used for stand-alone interpretation in communities with limited breast imaging expertise, though none of the commercial tools are currently approved for stand-alone interpretation [25–31]. Practices could also use AI tools to prioritize examinations to be read immediately or to generate customized worklists for different readers. One of the commercial tools for lesion detection and diagnosis, for example, offers a case complexity index (multiple findings, single finding, or no findings), a read time indicator (high, medium, or low), and a reading priority indication (which flags the most concerning cases), which are metrics that can be used to categorize cases and enhance workflow [70]. Algorithms that provide imaging data–driven risk scores (incorporating lesion scores, asymmetries between breasts, and quantitative density measures) could be used to help refine recommendations for supplemental screening [59].

Multidisciplinary Collaboration

Essential to the development and implementation of AI algorithms is a strong relationship between domain experts (i.e., radiologists) and technical experts (i.e., data scientists and engineers), who may be in the academic or private sectors [71, 72]. Radiologic technologists have also expressed a strong interest in being involved in development and implementation [73]. Communication among the multidisciplinary team members is necessary to understand and agree on the goals of the algorithm. The potential of AI will be realized only when these algorithms are deployed in practice; are adopted by radiologists, other clinicians, technologists, and patients; and have their impact measured using relevant and specific metrics. Radiologists must ultimately determine the intended impact of, and clinically relevant metrics for, the AI system and should continue to communicate with the model developers [71].

Governance

The FDA regulates AI algorithms that are medical devices and defines "medical devices," according to the 21st Century Cures Act, as follows: "[tools] intended for use in the diagnosis of disease or other conditions, or in the cure, mitigation, treatment, or prevention of disease… or intended to affect the structure or any function of the body of man or other animals…" [74, 75]. AI tools used in the interpretation of imaging, such as the applications for lesion detection and diagnosis, triage, and density assessment, are thus regulated by the FDA. The FDA classifies these devices into one of three classes according to their risk level and intended use. Approved AI algorithms for mammography are categorized as class II devices, which are considered to be moderate risk [1, 76].

Certain AI algorithms, such as those considered to be clinical decision support software, are exempt from FDA regulation [62].




Clinical decision support software is a tool that provides the following: "knowledge and person-specific information, intelligently filtered or presented at appropriate times to enhance health and health care" [62]. The software can offer recommendations to a health care professional, but it cannot be used to replace professional judgment. The previously discussed short-term risk prediction tool and the image quality control tools are considered clinical decision support software and are therefore not subject to FDA regulation [4–6].

Ethical and Medicolegal Issues

The use of AI in health care raises ethical questions with regard to algorithm bias, resource allocation, conflicts of interest, and the doctor-patient relationship, in addition to liability concerns. Recent research has shown that AI algorithms have been disproportionately trained on patients from New York, Massachusetts, and California, raising concern that these algorithms may generalize poorly and thus highlighting the importance of validation across diverse populations [77]. Regarding resource allocation, the development and implementation of AI tools require large datasets, the technology and technical skills to manage the data, and the computational power to develop and maintain AI systems. Thus, small and/or resource-poor practices may not be able to take advantage of AI systems, which could result in disparities in the use of AI and its potential benefits [78]. As with all agreements between commercial entities and health care organizations, potential conflicts of interest between the institution's members and the company must be monitored following standard practices of public disclosure, institutional monitoring, and recusal of members with a significant conflict of interest from the decision-making process [79].

The integration of AI into clinical practice will undoubtedly impact the doctor-patient relationship, and uncertainty exists regarding patients' acceptance of AI. In one survey, more than three-fourths of women did not support the use of a fully autonomous AI system for screening mammography [80]. Survey respondents were receptive to the idea of the AI tool serving as a second reader, although more than 40% did not believe that AI scores should be used as a method to determine which patients require a second read. Education about the strengths and limitations of AI applications could help mitigate patient concerns and increase their acceptance.

Fig. 7—Diagram shows integration of artificial intelligence (AI) into screening workflow. AI applications can be used for noninterpretive functions, such as breast
cancer risk assessment, image quality control, image acquisition, and dose reduction, and for interpretive purposes, such as triage, breast density assessment, and
lesion detection and diagnosis. (Image by Loomis S, used with permission)
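As a rough sketch of the flow in Figure 7, the snippet below routes a screening examination through noninterpretive and interpretive AI steps before radiologist review. Every function here is a hypothetical stand-in, not a vendor API; the point is only the ordering of the steps.

```python
# Hypothetical stand-ins for the AI components shown in Figure 7.
def acquisition_quality_ok(exam): return exam.get("positioning_ok", True)
def assess_density(exam): return "c"
def estimate_risk(exam): return 0.02
def detect_lesions(exam): return [{"view": "RMLO", "score": 64}]
def triage_score(exam): return max((f["score"] for f in exam["findings"]), default=0)

def screening_pipeline(exam: dict) -> str:
    """Illustrative flow: noninterpretive AI first (image quality control
    at acquisition, risk assessment), then interpretive AI (density,
    lesion detection, triage) ahead of radiologist review."""
    if not acquisition_quality_ok(exam):
        return "repeat imaging before the patient leaves"
    exam["density"] = assess_density(exam)
    exam["risk_score"] = estimate_risk(exam)
    exam["findings"] = detect_lesions(exam)
    exam["priority"] = triage_score(exam)
    return "route to radiologist worklist"

print(screening_pipeline({"positioning_ok": True}))
```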




In terms of practice and radiologist liability, questions have arisen about the legal responsibility of AI manufacturers for errors made by AI tools. As of December 2021, however, radiologists continue to bear ultimate responsibility, though the issue of liability is complex and will continue to evolve [78].

Future of Artificial Intelligence for Screening Mammography

Artificial Intelligence Model Development

Mammographic images present multiple technical challenges for AI model development. Most efforts have focused on the application of existing DL models to mammographic images rather than the development of new architectures suited to this domain [81]. However, customized architectures that can leverage intrinsic characteristics of mammographic imaging, such as the high image resolution, the multiple mammographic views, and varying reconstruction algorithms, could further boost model performance [37, 82, 83]. Moreover, extensive work is needed to determine if and how the models developed for DM can be applied to DBT [81].

Access to high-quality expert annotations (or labels) is also a critical factor for model development. AI models that learn in a supervised fashion from both pixel-level annotations (e.g., a circle around a cancer) and examination-level annotations (e.g., a mammogram labeled as positive or negative for cancer) typically show better performance [37, 84]. However, obtaining high-quality pixel-level labels is a major challenge, as these annotations can be time-consuming and tedious, especially with DBT. Recognizing the need for annotated datasets, the ACR provides resources in the ACR AI-LAB to annotate datasets [85].

A challenge to be addressed before AI becomes commonplace in mammographic screening is interpretability [86]. Models that use an explainable, transparent rationale to make predictions are more likely to be trusted, which can accelerate their adoption into clinical practice. In addition, AI models must generalize well to unseen heterogeneous datasets. Therefore, large retrospective studies involving diverse populations, different vendors, and various image acquisition settings are essential. Although evaluation with retrospective datasets provides insight into the potential performance of AI models depending on the intended clinical use, validation studies in real-world clinical settings are necessary to fully appreciate the performance of AI applications, their impact on radiologist performance, and the complex interaction between the two [87].

Future Artificial Intelligence Applications for Screening Mammography

The ACR Data Science Institute has outlined a series of use case scenarios for improving breast imaging services with AI [88] (Table 4). For example, research is in progress for interpretive AI applications that detect and classify calcifications on mammography, with preliminary results showing that CNNs can outperform handcrafted features for this task [89]. In addition, an AI algorithm is being investigated to detect arterial calcifications on mammography, which can help identify women at risk of coronary artery disease [90]. Research is also ongoing for noninterpretive applications, including long-term risk prediction models (e.g., 5-year, 10-year, and lifetime) that incorporate more mammographic information than just breast density [91, 92].

Early claims that AI-based stand-alone interpretation of screening mammograms was imminent proved false, but the goal of AI assisting radiologists with triaging normal cases out of their work queue remains. In fact, DL models have been shown to confidently classify mammograms as cancer-free, which could lead to a reduction in radiologist workload [93, 94]. AI applications that provide rapid and accurate assessment of a mammogram as cancer-free may also be valuable in communities that lack access to breast imaging expertise.

Strategic Priorities for Research and Development

To facilitate AI algorithm development, the following priorities in AI research were identified at an international workshop organized by the NIH: new image reconstruction and enhancement methods, automated labeling and annotation methods, novel algorithms developed for the complexity of imaging data, methods that explain AI advice given to users, and methods for deidentification of imaging data and data sharing [95]. Priorities for promoting AI include creating standards and methods for data sharing and management, improving the user interface and experience, ensuring patient safety and health equity, and developing efficient pathways to commercialize AI tools [96]. Active collaboration among radiologists, model developers, research scientists, and regulatory bodies is critical to translate research into practice and to promote AI use.

Value Proposition of Artificial Intelligence for Screening Mammography

The so-called "volume to value" transition is a primary component of the ACR's Imaging 3.0 initiative [97]. AI must show its added value to mammography through the standard metrics of improved performance, increased efficiency, and a preferential cost-benefit ratio [19]. To show improved performance, audit reports for screening mammography must show evidence of increased cancer detection rates and decreased recall rates after AI implementation. Early studies show that one of the benefits of AI-based CAD over traditional CAD may be higher specificity, with fewer annotations made by the AI system [35, 41]. To increase efficiency, AI algorithms could aid radiologists by detecting and characterizing lesions, autocategorizing breast density, prepopulating reports, and reducing screening workload by autoreporting normal examinations [19].

As CAD is currently part of a bundled rather than separate charge for screening mammograms, the possibility of introducing a new billing code for AI remains uncertain. Payment in our fee-for-service environment could pose challenges, but future value-based payment models may recognize the value of AI tools that improve quality and equity [98]. In the more immediate future, AI could lead to reduced costs through increased efficiency (for example, if a subset of screening mammograms are accurately autoreported as normal without radiologist involvement or if the perceived need for double-reading by radiologists is reduced) and/or through a reduction in false-positive examinations, which are estimated to cost nearly $3 billion per year [99].
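Regarding the interpretability challenge noted under Artificial Intelligence Model Development, one widely used (if basic) way to give a DL model a more transparent rationale is a gradient saliency map: the gradient of the predicted suspicion score with respect to the input pixels highlights the regions driving the prediction. A minimal PyTorch sketch, using a toy model in place of a trained network:

```python
import torch
import torch.nn as nn

# Any trained image classifier would do; an untrained toy model is used here.
model = nn.Sequential(
    nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(), nn.Flatten(),
    nn.Linear(8 * 64 * 64, 2),
)
model.eval()

patch = torch.randn(1, 1, 64, 64, requires_grad=True)
suspicion_logit = model(patch)[0, 1]   # score for the "suspicious" class
suspicion_logit.backward()             # gradients flow back to the input pixels

saliency = patch.grad.abs().squeeze()  # 64 x 64 map; bright pixels most influence the score
print(saliency.shape)                  # torch.Size([64, 64])
```

Overlaying such a map on the mammogram gives the reader a visual check that the model is attending to the finding rather than to an artifact, which supports the trust-building described above.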




Conclusion

Breast imaging has been at the forefront of AI in health care, with AI applications for breast cancer screening having quickly advanced from feasibility and reader studies to clinical implementation. Commercial AI applications for screening mammography are available in the interpretive domains of lesion detection and diagnosis, triage, and density assessment and in the noninterpretive domains of risk assessment, image quality control, image acquisition, and dose reduction. The rapid deployment of AI tools into clinical practice, however, may outpace the rigorous evaluation of these tools and their potential impact on patient care and outcomes. The success of AI for breast cancer screening will require that AI algorithms not only are validated in the clinical environment but also are appropriately integrated into existing workflows and show added value to screening mammography through improved patient outcomes, increased efficiency, and a preferential cost-benefit ratio.

Acknowledgment

We thank Susanne L. Loomis (Medical and Scientific Communications, Strategic Communications, Department of Radiology, Massachusetts General Hospital, Boston, MA) for creating Figures 1 and 4–7 in this article.

References

1. American College of Radiology Data Science Institute website. AI Central. models.acrdsi.org/. Accessed August 25, 2021
2. Densitas website. Densitas riskai. densitas.health/riskai. Accessed September 15, 2021
3. Volpara Health website. Scorecard. www.volparahealth.com/breast-health-platform/products/scorecard/. Accessed September 15, 2021
4. iCAD website. ProFound AI Risk. www.icadmed.com/profoundai-risk.html. Accessed September 15, 2021
5. Densitas website. Densitas qualityai. densitas.health/solutions/quality/. Accessed September 20, 2021
6. Volpara Health website. Live. www.volparahealth.com/breast-health-platform/products/live/. Accessed September 20, 2021
7. Hologic website. 3DQuorum imaging technology. www.hologic.com/sites/default/files/downloads/WP-00152_Rev001_3DQuorum_Imaging_Technology_Whitepaper%20%20(1).pdf. Published October 2019. Accessed October 15, 2021
8. Bahl M. Artificial intelligence: a primer for breast imaging radiologists. J Breast Imaging 2020; 2:304–314
9. Trister AD, Buist DSM, Lee CI. Will machine learning tip the balance in breast cancer screening? JAMA Oncol 2017; 3:1463–1464
10. Nyström L, Andersson I, Bjurstam N, Frisell J, Nordenskjöld B, Rutqvist LE. Long-term effects of mammography screening: updated overview of the Swedish randomised trials. Lancet 2002; 359:909–919
11. Lehman CD, Arao RF, Sprague BL, et al. National performance benchmarks for modern screening digital mammography: update from the Breast Cancer Surveillance Consortium. Radiology 2017; 283:49–58
12. Lee CS, Parise C, Burleson J, Seidenwurm D. Assessing the recall rate for screening mammography: comparing the Medicare Hospital Compare dataset with the National Mammography Database. AJR 2018; 211:127–132
13. Dang PA, Freer PE, Humphrey KL, Halpern EF, Rafferty EA. Addition of tomosynthesis to conventional digital mammography: effect on image interpretation time of screening examinations. Radiology 2014; 270:49–56
14. Lehman CD. Artificial intelligence to support independent assessment of screening mammograms: the time has come. JAMA Oncol 2020; 6:1588–1589
15. Freeman K, Geppert J, Stinton C, et al. Use of artificial intelligence for image analysis in breast cancer screening programmes: systematic review of test accuracy. BMJ 2021; 374:n1872
16. Hickman SE, Woitek R, Le EPV, et al. Machine learning for workflow applications in screening mammography: systematic review and meta-analysis. Radiology 2022; 302:88–104
17. Giger ML, Chan HP, Boone J. Anniversary paper: history and status of CAD and quantitative image analysis: the role of Medical Physics and AAPM. Med Phys 2008; 35:5799–5820
18. Lehman CD, Wellman RD, Buist DS, Kerlikowske K, Tosteson AN, Miglioretti DL; Breast Cancer Surveillance Consortium. Diagnostic accuracy of digital screening mammography with and without computer-aided detection. JAMA Intern Med 2015; 175:1828–1837
19. Morgan MB, Mates JL. Applications of artificial intelligence in breast imaging. Radiol Clin North Am 2021; 59:139–148
20. Fenton JJ, Taplin SH, Carney PA, et al. Influence of computer-aided detection on performance of screening mammography. N Engl J Med 2007; 356:1399–1409
21. Gilbert FJ, Astley SM, Gillan MG, et al.; CADET II Group. Single reading with computer-aided detection for screening mammography. N Engl J Med 2008; 359:1675–1684
22. Gromet M. Comparison of computer-aided detection to double reading of screening mammograms: review of 231,221 mammograms. AJR 2008; 190:854–859
23. Bahl M. Detecting breast cancers with mammography: will AI succeed where traditional CAD failed? Radiology 2019; 290:315–316
24. Gao Y, Geras KJ, Lewin AA, Moy L. New frontiers: an update on computer-aided diagnosis for breast imaging in the age of artificial intelligence. AJR 2019; 212:300–307
25. FDA website. Approval document for MammoScreen. www.accessdata.fda.gov/cdrh_docs/pdf19/K192854.pdf. Published March 25, 2020. Accessed September 10, 2021
26. FDA website. Approval document for MammoScreen 2.0. www.accessdata.fda.gov/cdrh_docs/pdf21/K211541.pdf. Published November 26, 2021. Accessed December 29, 2021
27. FDA website. Approval document for Genius AI Detection. www.accessdata.fda.gov/cdrh_docs/pdf20/K201019.pdf. Published November 18, 2020. Accessed September 10, 2021
28. FDA website. Approval document for ProFound AI Software V2.1. www.accessdata.fda.gov/cdrh_docs/pdf19/K191994.pdf. Published October 4, 2019. Accessed September 10, 2021
29. FDA website. 510(k) premarket notification for ProFound AI Software V3.0. www.accessdata.fda.gov/scripts/cdrh/cfdocs/cfpmn/pmn.cfm?ID=K203822. Published March 12, 2021. Accessed October 15, 2021
30. FDA website. Approval document for Transpara 1.7.0. www.accessdata.fda.gov/cdrh_docs/pdf21/K210404.pdf. Published July 30, 2021. Accessed September 10, 2021
31. FDA website. Approval document for Lunit INSIGHT MMG. www.accessdata.fda.gov/cdrh_docs/pdf21/K211678.pdf. Published November 17, 2021. Accessed December 12, 2021
32. Hsu W, Hoyt AC. Using time as a measure of impact for AI systems: implications in breast screening. Radiol Artif Intell 2019; 1:e190107
33. Rodríguez-Ruiz A, Krupinski E, Mordang JJ, et al. Detection of breast cancer with mammography: effect of an artificial intelligence support system. Radiology 2019; 290:305–314
34. Rodríguez-Ruiz A, Lång K, Gubern-Merida A, et al. Stand-alone artificial intelligence for breast cancer detection in mammography: comparison with 101 radiologists. J Natl Cancer Inst 2019; 111:916–922
35. Conant EF, Toledano AY, Periaswamy S, et al. Improving accuracy and efficiency with concurrent use of artificial intelligence for digital breast tomosynthesis. Radiol Artif Intell 2019; 1:e180096




36. Salim M, Wåhlin E, Dembrower K, et al. External evaluation of 3 commercial artificial intelligence algorithms for independent assessment of screening mammograms. JAMA Oncol 2020; 6:1581–1588
37. McKinney SM, Sieniek M, Godbole V, et al. International evaluation of an AI system for breast cancer screening. Nature 2020; 577:89–94
38. FDA website. Approval document for Saige-Q. www.accessdata.fda.gov/cdrh_docs/pdf20/K203517.pdf. Published April 16, 2021. Accessed September 10, 2021
39. FDA website. Approval document for cmTriage. www.accessdata.fda.gov/cdrh_docs/pdf18/K183285.pdf. Published March 8, 2019. Accessed September 10, 2021
40. FDA website. Approval document for HealthMammo. www.accessdata.fda.gov/cdrh_docs/pdf20/K200905.pdf. Published July 16, 2020. Accessed September 10, 2021
41. Tartar M, Le L, Watanabe AT, Enomoto AJ. Artificial intelligence support for mammography: in-practice clinical experience. J Am Coll Radiol 2021; 18:1510–1513
42. Sprague BL, Conant EF, Onega T, et al.; PROSPR Consortium. Variation in mammographic breast density assessments among radiologists in clinical practice: a multicenter observational study. Ann Intern Med 2016; 165:457–464
43. Lehman CD, Yala A, Schuster T, et al. Mammographic breast density assessment using deep learning: clinical implementation. Radiology 2019; 290:52–58
44. Gastounioti A, Pantalone L, Scott CG, et al. Fully automated volumetric breast density estimation from digital breast tomosynthesis. Radiology 2021; 301:561–568
45. FDA website. Approval document for DM-Density. www.accessdata.fda.gov/cdrh_docs/pdf17/K170540.pdf. Published February 23, 2018. Accessed September 26, 2021
46. FDA website. Approval document for densitas densityai. www.accessdata.fda.gov/cdrh_docs/pdf19/K192973.pdf. Published February 19, 2020. Accessed September 26, 2021
47. FDA website. Approval document for DenSeeMammo. www.accessdata.fda.gov/cdrh_docs/pdf17/K173574.pdf. Published July 26, 2018. Accessed September 26, 2021
48. FDA website. Approval document for Insight BD. www.accessdata.fda.gov/cdrh_docs/pdf17/K172832.pdf. Published February 6, 2018. Accessed September 26, 2021
49. FDA website. Approval document for PowerLook Density Assessment V4.0. www.accessdata.fda.gov/cdrh_docs/pdf21/K211506.pdf. Published July 12, 2021. Accessed September 26, 2021
50. FDA website. Approval document for Quantra. www.accessdata.fda.gov/cdrh_docs/pdf16/K163623.pdf. Published October 20, 2017. Accessed September 26, 2021
51. FDA website. Approval document for Visage Breast Density. www.accessdata.fda.gov/cdrh_docs/pdf20/K201411.pdf. Published January 29, 2021. Accessed September 26, 2021
52. FDA website. Approval document for Volpara Imaging Software. www.accessdata.fda.gov/cdrh_docs/pdf21/K211279.pdf. Published July 27, 2021. Accessed September 26, 2021
53. FDA website. Approval document for WRDensity by Whiterabbit.ai. www.accessdata.fda.gov/cdrh_docs/pdf20/K202013.pdf. Published October 30, 2020. Accessed September 26, 2021
54. Sickles EA, D'Orsi CJ, Bassett LW, et al. ACR BI-RADS Mammography, 5th ed. In: D'Orsi CJ, Sickles EA, Mendelson EB, et al. ACR BI-RADS Atlas, Breast Imaging Reporting and Data System. American College of Radiology, 2013
55. Astley SM, Harkness EF, Sergeant JC, et al. A comparison of five methods of measuring mammographic density: a case-control study. Breast Cancer Res 2018; 20:10
56. Brandt KR, Scott CG, Ma L, et al. Comparison of clinical and automated breast density measurements: implications for risk prediction and supplemental screening. Radiology 2016; 279:710–719
57. Commonwealth of Massachusetts website. Guidelines for medical necessity determination for breast MRI. www.mass.gov/doc/breast-mri/download. Published June 9, 2016. Accessed October 15, 2021
58. Kim G, Bahl M. Assessing risk of breast cancer: a review of risk prediction models. J Breast Imaging 2021; 3:144–155
59. Brentnall AR, Cuzick J, Buist DSM, Bowles EJA. Long-term accuracy of breast cancer risk assessment combining classic risk factors and breast density. JAMA Oncol 2018; 4:e180174
60. Abdolell M, Payne JI, Caines J, et al. Assessing breast cancer risk within the general screening population: developing a breast cancer risk model to identify higher risk women at mammographic screening. Eur Radiol 2020; 30:5417–5426
61. Bahl M. Harnessing the power of deep learning to assess breast cancer risk. Radiology 2020; 294:273–274
62. FDA website. Guidance document: clinical decision support software. www.fda.gov/regulatory-information/search-fda-guidance-documents/clinical-decision-support-software. Published September 27, 2019. Accessed October 1, 2021
63. Eriksson M, Czene K, Strand F, et al. Identification of women at high risk of breast cancer who need supplemental screening. Radiology 2020; 297:327–333
64. Strohm L, Hehakaya C, Ranschaert ER, Boon WPC, Moors EHM. Implementation of artificial intelligence (AI) applications in radiology: hindering and facilitating factors. Eur Radiol 2020; 30:5525–5532
65. Allen B, Agarwal S, Coombs L, Wald C, Dreyer K. 2020 ACR Data Science Institute artificial intelligence survey. J Am Coll Radiol 2021; 18:1153–1159
66. Huisman M, Ranschaert E, Parker W, et al. An international survey on AI in radiology in 1,041 radiologists and radiology residents part 1: fear of replacement, knowledge, and attitude. Eur Radiol 2021; 31:7058–7066
67. Chan HP, Samala RK, Hadjiiski LM. CAD and AI for breast cancer—recent development and challenges. Br J Radiol 2020; 93:20190580
68. Pierce JD, Rosipko B, Youngblood L, Gilkeson RC, Gupta A, Bittencourt LK. Seamless integration of artificial intelligence into the clinical environment: our experience with a novel pneumothorax detection artificial intelligence algorithm. J Am Coll Radiol 2021; 18:1497–1505
69. Allen B, Dreyer K, Stibolt R Jr, et al. Evaluation and real-world performance monitoring of artificial intelligence models in clinical practice: try it, buy it, check it. J Am Coll Radiol 2021; 18:1489–1496
70. Kshirsagar A, Keller B, Smith A. Genius AI Detection for breast tomosynthesis. Hologic website. www.hologic.com/sites/default/files/2020_12/WP-00178_Rev02_GeniusAI_Detection-white-paper-6979r10p.pdf. Published December 2020. Accessed October 20, 2021
71. Kshirsagar M, Robinson C, Yang S, et al. Becoming good at AI for good. arXiv website. arxiv.org/abs/2104.11757. Published May 3, 2021. Accessed October 22, 2021
72. Spilseth B, McKnight CD, Li MD, et al. AUR-RRA review: logistics of academic-industry partnerships in artificial intelligence. Acad Radiol 2022; 29:119–128
73. Ryan ML, O'Donovan T, McNulty JP. Artificial intelligence: the opinions of radiographers and radiation therapists in Ireland. Radiography (Lond) 2021; 27(suppl 1):S74–S82

75. FDA website. How to determine if your product is a medical device. www.fda.gov/MedicalDevices/DeviceRegulationandGuidance/Overview/ClassifyYourDevice/ucm051512.htm. Published December 16, 2019. Accessed October 15, 2021
76. FDA website. Learn if a medical device has been cleared by FDA for marketing. www.fda.gov/medical-devices/consumers-medical-devices/learn-if-medical-device-has-been-cleared-fda-marketing. Published December 29, 2017. Accessed October 15, 2021
77. Kaushal A, Altman R, Langlotz C. Geographic distribution of U.S. cohorts used to train deep learning algorithms. JAMA 2020; 324:1212–1213
78. Geis JR, Brady AP, Wu CC, et al. Ethics of artificial intelligence in radiology: summary of the joint European and North American multisociety statement. Radiology 2019; 293:436–440
79. Bero L. Addressing bias and conflict of interest among biomedical researchers. JAMA 2017; 317:1723–1724
80. Ongena YP, Yakar D, Haan M, Kwee TC. Artificial intelligence in screening mammography: a population survey of women’s preferences. J Am Coll Radiol 2021; 18(1 Pt A):79–86
81. Geras KJ, Mann RM, Moy L. Artificial intelligence for mammography and digital breast tomosynthesis: current concepts and future perspectives. Radiology 2019; 293:246–259
82. Geras KJ, Wolfson S, Shen Y, et al. High-resolution breast cancer screening with multi-view deep convolutional neural networks. arXiv website. arxiv.org/abs/1703.07047. Published June 28, 2018. Accessed October 28, 2021
83. Wu N, Jastrzębski S, Park J, Moy L, Cho K, Geras KJ. Improving the ability of deep neural networks to use information from multiple views in breast cancer screening. Proc Mach Learn Res 2020; 121:827–842
84. Lotter W, Diab AR, Haslam B, et al. Robust breast cancer detection in mammography and digital breast tomosynthesis using an annotation-efficient deep learning approach. Nat Med 2021; 27:244–249
85. American College of Radiology Data Science Institute website. ACR AI-LAB. ailab.acr.org/Account/Home. Accessed December 1, 2021
86. Gastounioti A, Kontos D. Is it time to get rid of black boxes and cultivate trust in AI? Radiol Artif Intell 2020; 2:e200088
87. Hickman SE, Baxter GC, Gilbert FJ. Adoption of artificial intelligence in breast imaging: evaluation, ethical constraints and limitations. Br J Cancer 2021; 125:15–22
88. American College of Radiology Data Science Institute website. Define-AI Directory. www.acrdsi.org/DSI-Services/Define-AI. Accessed December 1, 2021
89. Cai H, Huang Q, Rong W, et al. Breast microcalcification diagnosis using deep convolutional neural network from digital mammograms. Comput Math Methods Med 2019; 2019:2717454
90. CureMetrix website. cmAngio. curemetrix.com/cmangio/. Accessed December 8, 2021
91. Yala A, Lehman C, Schuster T, Portnoi T, Barzilay R. A deep learning mammography-based model for improved breast cancer risk prediction. Radiology 2019; 292:60–66
92. Yala A, Mikhael PG, Strand F, et al. Toward robust mammography-based models for breast cancer risk. Sci Transl Med 2021; 13:eaba4373
93. Rodríguez-Ruiz A, Lång K, Gubern-Merida A, et al. Can we reduce the workload of mammographic screening by automatic identification of normal exams with artificial intelligence? A feasibility study. Eur Radiol 2019; 29:4825–4832
94. Yala A, Schuster T, Miles R, Barzilay R, Lehman C. A deep learning model to triage screening mammograms: a simulation study. Radiology 2019; 293:38–46
95. Langlotz CP, Allen B, Erickson BJ, et al. A roadmap for foundational research on artificial intelligence in medical imaging: from the 2018 NIH/RSNA/ACR/The Academy Workshop. Radiology 2019; 291:781–791
96. Allen B Jr, Seltzer SE, Langlotz CP, et al. A road map for translational research on artificial intelligence in medical imaging: from the 2018 National Institutes of Health/RSNA/ACR/The Academy Workshop. J Am Coll Radiol 2019; 16(9 Pt A):1179–1189
97. Sogani J, Allen B Jr, Dreyer K, McGinty G. Artificial intelligence in radiology: the ecosystem essential to improving patient care. Clin Imaging 2020; 59:A3–A6
98. Chen MM, Golding LP, Nicola GN. Who will pay for AI? Radiol Artif Intell 2021; 3:e210030
99. Ong MS, Mandl KD. National expenditure for false-positive mammograms and breast cancer overdiagnoses estimated at $4 billion a year. Health Aff (Millwood) 2015; 34:576–583
100. van Winkel SL, Rodríguez-Ruiz A, Appelman L, et al. Impact of artificial intelligence support on accuracy and reading time in breast tomosynthesis image interpretation: a multi-reader multi-case study. Eur Radiol 2021; 31:8682–8691

Editorial Comment: Artificial Intelligence in Mammography—Our New Reality


In this rapidly evolving environment of artificial intelligence (AI), mammography has been a prime target for machine learning, particularly deep learning (DL), applications. Historically, breast imaging has been one of the most commonly targeted imaging applications for computer-based detection and classification systems [1]. However, computer-aided detection (CAD) showed limited value in screening mammography [2]. The prior failure of CAD in mammography was attributed predominantly to supervised learning and insufficient processing power [3]. In comparison, modern DL overcomes these problems through the tremendous computing power now available and through self-identification of important imaging features.
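To make that contrast concrete, the minimal sketch below (an illustration only, not a description of any commercial product discussed here) shows the kind of convolutional network the DL literature builds on: the filter weights are learned from raw pixel data by backpropagation rather than handcrafted, which is precisely what older CAD pipelines lacked. The framework (PyTorch), the architecture, the layer sizes, and the two-class output are all assumptions made for this example.

import torch
import torch.nn as nn

class TinyMammoCNN(nn.Module):
    # Toy network: every convolutional filter below starts as random numbers
    # and is shaped entirely by the training data; nothing is a human-designed
    # texture or shape descriptor.
    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),  # single-channel (grayscale) input
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),                     # one summary value per feature map
        )
        self.classifier = nn.Linear(32, num_classes)     # e.g., normal vs. suspicious

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x).flatten(1))

# Dummy forward pass on a random 256 x 256 "image" (batch size 1):
model = TinyMammoCNN()
logits = model(torch.randn(1, 1, 256, 256))
print(logits.shape)  # torch.Size([1, 2])

In a handcrafted-feature CAD system, the role of the features block would instead be played by fixed, engineer-specified detectors; here, training adjusts those weights end to end, which is the "self-identification of important imaging features" described above.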
There has been a spurt of mammography AI products in the marketplace recently. The authors of this article provide a comprehensive overview of commercially available AI products for lesion detection and diagnosis in screening mammography, triage of mammograms, breast density assessment, breast cancer risk assessment, and improvements in image quality and patient positioning. They also describe the diagnostic accuracy of AI algorithms as reported in the literature and explore issues related to clinical implementation.

Although many mammography AI products have transitioned from research to clinical practice, questions remain about their true impact on patient outcomes, as most studies of these algorithms have been multireader studies of cancer-enriched datasets, which are not representative of real-world practice. Lack of clinical validation and the need for better integration into the workflow are immediate challenges that these products face as they transition from research to clinical practice. Medicolegal issues, governance and ethical issues, and the cost-benefit proposition must also be addressed, not just for mammography AI products but for all AI-based imaging products. Nonetheless, as AI in mammography is now a reality, breast imaging radiologists must have an understanding of the available products and their clinical utility. This article provides such an overview.

Sadia Khanani, MD
Mayo Clinic
Rochester, MN
khanani.sadia@mayo.edu

S. Khanani has received research funding from Siemens Healthineers, and S. Khanani and Mayo Clinic have a research and development agreement with Imago Systems.

doi.org/10.2214/AJR.22.27345

References
1. Giger ML, Chan HP, Boone J. Anniversary paper: history and status of CAD and quantitative image analysis: the role of medical physics and AAPM. Med Phys 2008; 35:5799–5820
2. Lehman CD, Wellman RD, Buist DS, Kerlikowske K, Tosteson AN, Miglioretti DL; Breast Cancer Surveillance Consortium. Diagnostic accuracy of digital screening mammography with and without computer-aided detection. JAMA Intern Med 2015; 175:1828–1837
3. Kohli A, Jha S. Why CAD failed in mammography. J Am Coll Radiol 2018; 15(3 Pt B):535–537
