
Circulation Research

COMPENDIUM ON ATRIAL FIBRILLATION

How Will Machine Learning Inform the Clinical Care of Atrial Fibrillation?
Konstantinos C. Siontis, Xiaoxi Yao, James P. Pirruccello, Anthony A. Philippakis, Peter A. Noseworthy

ABSTRACT: Machine learning applications in cardiology have rapidly evolved in the past decade. With the availability of machine
learning tools coupled with vast data sources, the management of atrial fibrillation (AF), a common chronic disease with
significant associated morbidity and socioeconomic impact, is undergoing a knowledge and practice transformation in the
increasingly complex healthcare environment. Among other advances, deep-learning machine learning methods, including
convolutional neural networks, have enabled the development of AF screening pathways using the ubiquitous 12-lead ECG to
detect asymptomatic paroxysmal AF in at-risk populations (such as those with cryptogenic stroke), the refinement of AF and
stroke prediction schemes through comprehensive digital phenotyping using structured and unstructured data abstraction
from the electronic health record or wearable monitoring technologies, and the optimization of treatment strategies, ranging
from stroke prophylaxis to monitoring of antiarrhythmic drug (AAD) therapy. Although the clinical and population-wide impact
of these tools continues to be elucidated, such transformative progress does not come without challenges, such as the
concerns about adopting black box technologies, assessing input data quality for training such models, and the risk of
perpetuating rather than alleviating health disparities. This review critically appraises the advances of machine learning
related to the care of AF thus far, their potential future directions, and their potential limitations and challenges.

Key Words: artificial intelligence ◼ atrial fibrillation ◼ ECG ◼ electronic health record ◼ machine learning ◼ natural language processing
Downloaded from http://ahajournals.org by on June 24, 2020

Machine learning (ML) has the potential to impact the practice of clinical medicine. However, because ML is a broad term that encompasses the set of algorithms that learn from data (as opposed to human-crafted rules), the term could reasonably describe much of classical statistics. So, what has changed in the past decade that has caused so much attention? Breakthroughs in deep neural networks (deep learning), paired with biobank-scale data, explain the increasing interest in applying ML to medicine, from fully automated interpretation of echocardiograms,1 to the detection of aortic valve pathology without high-definition visualization of the valve,2 to the detection of electrocardiographic (ECG) patterns that indicate a patient’s age and sex3 or ejection fraction,4 to the prediction of acute kidney injury.5 In this article, we review the contributions from deep learning and other ML techniques to the clinical care of atrial fibrillation (AF) and discuss where we perceive the field to be headed (Figure 1).

PRIMER ON MACHINE LEARNING TECHNIQUES

Increasingly voluminous and complex data are being analyzed in efforts to understand the basis for the development of AF, its prognosis, and potential therapies. These data include electrocardiographic signals, raw echocardiographic images, genome sequences, and the entire contents of electronic health records (EHRs). The vastness and complexity of these data have spurred the use of sophisticated analytic techniques from the field of ML, which sits at the confluence of statistics and computer science.6 Although ML leverages traditional statistical methods, ML approaches may have incrementally superior performance when the inputs are complex (image data, for instance), the features of the input data cannot be readily discerned (subtle patterns only seen by a convolutional neural network [CNN], for instance), or the relationships between input data are complex and

Correspondence to: Peter A. Noseworthy, MD, Department of Cardiovascular Diseases, Mayo Clinic, 200 1st St SW, Rochester, MN 55905. Email noseworthy.peter@mayo.edu
For Disclosures, see page 166.
© 2020 American Heart Association, Inc.
Circulation Research is available at www.ahajournals.org/journal/res

Circulation Research. 2020;127:155–169. DOI: 10.1161/CIRCRESAHA.120.316401 June 19, 2020   155


Siontis et al Machine Learning in AF

nonlinear. Here, we aim to provide a brief overview of ML concepts, as well as intuitive explanations for a focused set of specific machine learning algorithms from the research that is reviewed in the remainder of this article.

Nonstandard Abbreviations and Acronyms
AAD   antiarrhythmic drug
AF    atrial fibrillation
AI    artificial intelligence
AUC   area under the curve
BNP   B-type natriuretic peptide
CNN   convolutional neural network
EHR   electronic health record
FHS   Framingham Heart Study
LA    left atrial
LAA   left atrial appendage
ML    machine learning
NLP   natural language processing
OAC   oral anticoagulation

At the most abstract level, machine learning methods fall into 1 of the 3 categories: supervised, unsupervised, and reinforcement learning. Most methods reviewed in this study rely on supervised learning, which requires data to have a label when training an algorithm. That label is often as simple as a binary outcome: for example, the development of the CHA2DS2-VASc stroke risk model required a data set in which individuals’ baseline characteristics and their stroke outcomes were known.7 Unsupervised learning approaches detect relationships within the data itself, without requiring a specific label to train against. A common example is clustering, used by Levy et al,8 to identify groups of similar patients in an analysis of dofetilide dosing. That same paper then applies reinforcement learning, which takes the perspective of an agent, tasked with maximizing a future reward (eg, successful dofetilide initiation) by making decisions at each step in time (eg, choosing the dose of dofetilide). In principle, such a system can permit discovery of an improved set of decisions (such as choice of antiarrhythmic medication dose), even when humans do not know how to decide the value of such decisions. These approaches hold promise for the discovery of treatment patterns and outcomes that may emerge in large data sets but may not be obvious to observers of routine clinical practice.

While ML is a broad field, most of the work cited in this review applies supervised learning using 1 of 3 specific algorithms: ordinary regression, random forests, or deep CNNs. In ordinary regression, an outcome of interest (such as stroke) is predicted by projecting out the linear contributions from variables that are included in the model because they are thought to be correlated (such as age and sex). The predictors that are incorporated into the regression model are chosen by humans, but the way they are weighted in making a prediction is learned from the data. Many routinely used clinical risk stratification tools (for instance, those used in the prediction of stroke or bleeding risk in patients with known AF) were derived using regression.

Random forests combine the categorical predictions made by a set of decision trees. Decision trees perform classification (such as whether an individual will develop a stroke); a series of branching steps is performed, and at each step, the tree chooses a feature (such as sex or an age cutoff) that best splits the remaining data in its branch. Single decision trees tend to overfit to the training data, which causes poor generalization to new data, and therefore random forests are popular because they overcome this tendency. Like regression, the predictors in a random forest are preselected, but the relative importance of each predictor, as well as the thresholds chosen for each split, are learned from the data. This approach may be useful to help select among various treatment choices using categorical variables.

The nodes of the first layer in a neural network pass input (eg, the pixels from a transthoracic echocardiographic image) to hidden layers. Each node in a hidden layer receives input from several (or potentially all) nodes in the prior layer. Ultimately (in the papers reviewed in this article), a final output layer predicts a label (eg, whether or not a left atrial appendage [LAA] thrombus is present). The way each node weights its inputs is adjusted during iterative training to produce more accurate responses. Deep neural networks simply have many hidden layers between the input layer and the output layer; however, depth imposes substantial computational costs, which arise due to the large number of connections.

Many neural networks, especially those that interpret image data, use a convolutional process that mimics how the visual cortex processes images. In a CNN, an image (or other input data) is broken down into components/feature abstractions, and convolutions are used to identify local correlations between the input data. Convolution permits a given neuron in a deeper layer to receive input from only a small subset of nearby nodes in the prior layer. Within a given layer, this preserves only local relationships, although long-range relationships can be learned in deeper layers of a convolutional network. Unlike many other ML methods, not only can deep-learning models associate input features with an output of interest, they can also learn features from raw data itself. Such models are particularly valuable for the classification of images or complex biophysical signals like ECG or echocardiographic data, though their value in interpreting these tests as a whole is not always evident because these feature abstractions do not always yield clear explanations.9

The tools of ML (particularly neural networks) often have vast capacity to effectively memorize (overfit to) data sets if care is not taken to guard against this. A model that is overfit to a particular data set is likely to

Figure 1. Applications of machine learning in the clinical care of AF.
AAD indicates antiarrhythmic drug; AF, atrial fibrillation; EHR, electronic health record; ML, machine learning; OAC, oral anticoagulation; and LAA,
left atrial appendage.
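
The guard against overfitting discussed in this section, in which a validation set is monitored while a model trains, can be sketched in a few lines. This is a minimal illustration and not the procedure of any study cited here: the function name `early_stopping_epoch`, the `patience` parameter, and both loss curves are hypothetical, chosen only to show training error continuing to fall while validation error turns upward.

```python
def early_stopping_epoch(validation_losses, patience=2):
    """Return (stop_epoch, best_epoch) for a list of per-epoch validation losses.

    Training is halted once the validation loss has failed to improve for
    `patience` consecutive epochs. `best_epoch` is the epoch with the lowest
    validation loss seen so far, i.e. the model one would keep.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(validation_losses):
        if loss < best_loss:
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch
    # Validation loss never stalled for `patience` epochs: ran to the end.
    return len(validation_losses) - 1, best_epoch


# Hypothetical curves (assumed values, not from any cited study):
# the training loss keeps improving, the validation loss stops improving.
training_losses = [0.90, 0.70, 0.55, 0.45, 0.38, 0.33, 0.29]
validation_losses = [0.92, 0.75, 0.63, 0.60, 0.62, 0.66, 0.71]

stop_epoch, best_epoch = early_stopping_epoch(validation_losses, patience=2)
print(f"stop at epoch {stop_epoch}, keep model from epoch {best_epoch}")
```

In this toy example the training loss would suggest continuing, but the validation loss identifies the point after which further training only fits the training set more closely.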

generalize poorly to new data. To minimize this risk, data sets are commonly split into separate training and validation sets to allow derivation and testing of networks on nonoverlapping data. For example, when training a deep neural network, ML practitioners use a training set to update model parameters, and a validation set to assess whether each step of training is continuing to produce improvement in the model’s performance. When the training set still shows improved outcome prediction with each additional step of training but the validation set is no longer doing so, the model is at the point where it is likely to begin overfitting. In this case, the model is finding a solution that improves performance only on the training data set, but this is not generalizable to a separate data set. The optimal solution, on the other hand, will have similar performance in both training and validation data sets. Neural networks commonly have other tools to prevent overfitting, such as dropout, in which nodes of the network are probabilistically (and temporarily) dropped from the network during training. This effectively adds noise to the process and reduces the tendency to overfit.

For additional details of ML techniques in medicine, several recent reviews provide more in-depth descriptions.10–12 In the remainder of this review, we address specific past, present, and potential future applications of ML in the clinical care of AF.

AF SCREENING

AF is notoriously difficult to screen for as it is often transient and asymptomatic. Moreover, AF symptoms may be vague and attributed to other conditions, making AF diagnoses elusive.13 An estimated 1 million Americans may have undiagnosed AF.14 Among patients with implanted cardiac devices (pacemakers and defibrillators), up to half of them have asymptomatic high-rate atrial episodes over long-term follow-up, and a dose-response relationship has been proposed between the duration of these episodes and the risk of stroke.15 ML approaches can allow streamlined, targeted, and higher-yield screening for AF. In this section, we first describe the current status of AF screening and challenges; then, we review existing AF risk scores that could be used to target AF screening initiatives and an ML algorithm to screen for unrecognized AF based on sinus rhythm ECGs; finally, we explore how ML approaches could facilitate translation of AF prediction algorithms to clinical practice and discuss future directions.

Current Status of AF Screening and Challenges

AF screening can be done directly by ECG recordings or indirectly by assessing pulse irregularity. Screening methods can be largely grouped into 4 categories: (1) pulse palpation; (2) blood pressure monitors; (3) ECGs, including standard 12-lead ECG, single-lead ECG in handheld devices, patches, belts, smartphones, watches, etc; and (4) photo-plethysmography, either by stand-alone devices or by smartphone cameras and applications.16,17 The performance of these screening tests varies (Table 1), but a recent meta-analysis of 19 AF screening studies found that the yield of screening was


Table 1. Performance of Different AF Screening Methods

Screening method | Sensitivity, % | Specificity, % | References
Pulse palpation | 87–97 | 70–81 | Harris et al,18 Cooke et al19
BP monitors | 30–100 | 86–97 | Wiesel et al,20,21 Kearley et al,22 Stergiou et al,23 Marazzi et al,24 Wiesel et al25
Single-lead ECG screening | 94–100 | 76–97 | Kearley et al,22 Lau et al,26 Tieleman et al,27 Vaes et al,28 Doliwa et al,29 Kaasenbrood,30 Jacobs et al,31 Orchard et al,32 Lowres et al,33 Bumgarner et al34
Photo-plethysmography via smartphone applications | 87–100 | 80–100 | Tison et al,35 Lewis et al,36 McManus et al,37 Guo et al,38 Dorr et al,39 Bonomi et al,40 Chan et al,41 Krivoshei et al,42 Conroy et al,43 Bashar et al,44 Brasier et al45

AF indicates atrial fibrillation; and BP, blood pressure.


Adapted from Lowres et al.46
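
As a reminder of how the sensitivity and specificity figures in Table 1 are defined, the sketch below derives both from paired results of a screening test and a reference standard. It is illustrative only: the function names and the 10-person data set are hypothetical and are not taken from any of the cited studies.

```python
def confusion_counts(reference, test):
    """Count true/false positives and negatives.

    reference: list of bools, True where the reference standard shows AF.
    test:      list of bools, True where the screening test flags AF.
    Returns (tp, fn, tn, fp).
    """
    if len(reference) != len(test):
        raise ValueError("reference and test must have the same length")
    pairs = list(zip(reference, test))
    tp = sum(1 for r, t in pairs if r and t)
    fn = sum(1 for r, t in pairs if r and not t)
    tn = sum(1 for r, t in pairs if not r and not t)
    fp = sum(1 for r, t in pairs if not r and t)
    return tp, fn, tn, fp


def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical cohort: 4 people with AF on the reference standard, 6 without.
reference = [True, True, True, True, False, False, False, False, False, False]
test = [True, True, True, False, False, False, False, False, True, False]

tp, fn, tn, fp = confusion_counts(reference, test)
sensitivity, specificity = sensitivity_specificity(tp, fn, tn, fp)
print(f"sensitivity={sensitivity:.2f}, specificity={specificity:.2f}")
```

Neither quantity depends on how common AF is in the screened group, which is one reason the yield of screening can vary with the population even when the test itself is unchanged.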

not influenced by the screening method used.46 The most powerful driver of the screening yield appears to be the age of the study population. Intermittent or continuous screening via an ECG monitoring patch or repeated recordings using a handheld ECG can improve the yield; however, they are also more costly and burdensome.47,48 ML approaches can improve the yield of these screening methods. Notably, ML approaches have been applied to signals obtained from the single-lead ECG or photo-plethysmography. For instance, a deep neural network has been developed to passively detect AF from photo-plethysmography signals obtained from the Apple Watch.35

Existing AF Risk Scores to Guide AF Screening Initiatives

Over the past decade, several risk scores to predict the risk of AF have been developed and validated (Table 2).49–52 These models largely include similar risk factors, for example, age, sex, race, height, weight, blood pressure, heart failure, etc. In fact, even the simple CHA2DS2-VASc score, designed as a stroke risk stratification tool in AF patients, has demonstrated a C statistic of 0.69 to 0.74 for predicting AF.52–59 Minor differences of these models are attributable to the fact that including certain risk factors improved model performance in one cohort but not in another cohort. For example, in the CHARGE-AF study (Cohorts for Heart and Aging Research in Genomic Epidemiology Atrial Fibrillation), the addition of left ventricular hypertrophy and PR interval did not meaningfully improve the findings.51 Notably, these models were all developed using regression methods. Although ML methods can potentially increase the model performance in comparison to traditional statistical methods, the improvement is not likely to be clinically meaningful without the addition of other types of data beyond these clinical risk factors. Population-specific bias and residual confounding are also likely to affect ML-based prediction models, not unlike traditional prediction models.

Novel risk factors have also been explored, including blood and imaging biomarkers and genetic markers.60,61 However, adding these risk factors to the existing models often leads to only small improvements in model performance. For example, in the FHS study (Framingham Heart Study), 3 echocardiographic measurements (left atrial [LA] diameter, left ventricular wall thickness, and left ventricular fractional shortening) were all associated with AF, but the addition of these risk factors did not meaningfully improve the performance of the prediction model,49 and the addition of BNP (B-type natriuretic peptide) improved the C statistic from 0.78 to 0.80.62 The addition of BNP also improved the C statistic of the CHARGE-AF score from 0.77 to 0.79.63 In another study, the addition of genetic information improved the C statistic from 0.72 to 0.74.64 Most importantly, these novel risk factors are typically not available in large asymptomatic populations and, therefore, may have limited applicability to everyday practice. Thus, rather than having patients undergo testing for biomarkers, echocardiography, or genetic tests to predict AF risk, it may be less expensive and potentially of higher yield to simply use prolonged monitoring approaches, for example, a patch, to diagnose AF.

Artificial Intelligence Algorithm to Predict Unrecognized AF Based on Sinus Rhythm ECGs

None of the above-mentioned risk prediction models are used in routine practice to guide AF screening. A key reason is that they all predict the mid-to-long-term risk of AF, for example, 5-year or 10-year, rather than the contemporaneous risk of unrecognized AF. However, the ECG may offer a window into the electrophysiological remodeling occurring in patients with AF such that systematic feature extraction from the ECG may detect patients who are more likely to have a history of or a risk of paroxysmal AF. There is indeed a long record of investigation assessing such features, including, for example, P wave fractionation by signal-averaged ECG65 and other P wave indices on the standard ECG.66 Taking such approaches several steps further, the availability of powerful computational tools and large data sets has led to the development of methods for the comprehensive association of the sinus rhythm ECG with probability of concomitant or imminent AF that is based on the


Table 2. Existing AF Prediction Models

Study Name | Derivation Cohort | Outcome | Risk Factors in the Model | Performance
FHS49 | 4764 participants in FHS | 10-y risk of atrial fibrillation or atrial flutter on ECG | Age, sex, BMI, SBP, HTN treatment, PR interval, clinically significant cardiac murmur, HF | Internal validation C statistic 0.78 (ref 49); external validation 0.65–0.73 (refs 50,51,53,54)
ARIC50 | 14 546 participants from ARIC | 10-y risk of atrial fibrillation or atrial flutter based on standard 12-lead ECG and diagnosis codes listed on hospital discharge and death certificate | Age, race, height, SBP, HTN treatment, smoking status, precordial murmur, LVH, left atrial enlargement, diabetes mellitus, coronary heart disease, and HF | Internal validation C statistic 0.78 (ref 50)
CHARGE-AF51 | 18 556 participants from ARIC, CHS, and FHS | 5-y risk of atrial fibrillation ascertained from ECGs and hospital discharge diagnosis codes | Age, race, height, weight, SBP, DBP, current smoking, HTN treatment, diabetes mellitus, MI, and HF | Internal validation C statistic 0.765; external validation C statistic 0.66–0.75 (refs 51,54–56,58)
C2HEST52 | 471 446 Chinese subjects | Atrial fibrillation on ECG or Holter over 11 y | CAD/COPD, HTN, age ≥75 y, systolic HF, thyroid disease | Internal validation C statistic 0.75; external validation C statistic 0.65 (ref 52)
Partners57 | 206 042 participants from Partners | 5-y risk of atrial fibrillation ascertained from hospital diagnostic codes, cardiology tests, and medications | Sex, age, race, smoking status, height, weight, blood pressure, and cardiovascular and cardiometabolic disease labels | Split-sample internal validation with C statistic of 0.77

ARIC indicates Atherosclerosis Risk in Communities; AF, atrial fibrillation; BMI, body mass index; CAD, coronary artery disease; CHARGE-AF, Cohorts for Heart and
Aging Research in Genomic Epidemiology Atrial Fibrillation; CHS, Cardiovascular Health Study; COPD, chronic obstructive pulmonary disease; DBP, diastolic blood
pressure; FHS, Framingham Heart Study; HF, heart failure; HTN, hypertension; LVH, left ventricular hypertrophy; MI, myocardial infarction; and SBP, systolic blood pressure.
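
The C statistic reported for each model in Table 2 can be read as the probability that a randomly chosen person who developed AF was assigned a higher predicted risk than a randomly chosen person who did not. A minimal sketch of that pairwise definition follows; it ignores censoring and follow-up time, which time-to-event versions of the C statistic must handle, and the function name, risk scores, and outcomes are hypothetical.

```python
def c_statistic(risk_scores, outcomes):
    """Concordance for a binary outcome (equivalent to the area under the ROC curve).

    risk_scores: predicted risks, one per person.
    outcomes:    1 if the person developed the outcome, else 0.
    Each (event, non-event) pair scores 1 if the event case has the higher
    predicted risk, 0.5 if the two risks are tied, and 0 otherwise.
    """
    event_scores = [s for s, y in zip(risk_scores, outcomes) if y == 1]
    nonevent_scores = [s for s, y in zip(risk_scores, outcomes) if y == 0]
    if not event_scores or not nonevent_scores:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for e in event_scores:
        for n in nonevent_scores:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5
    return concordant / (len(event_scores) * len(nonevent_scores))


# Hypothetical predicted risks for 6 people; the first 3 developed AF.
risk_scores = [0.90, 0.80, 0.35, 0.60, 0.40, 0.20]
outcomes = [1, 1, 1, 0, 0, 0]
c = c_statistic(risk_scores, outcomes)
print(f"C statistic = {c:.3f}")
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which gives a scale for the values of roughly 0.65 to 0.78 reported in Table 2.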

assessment of the ECG as a whole and extends beyond any a priori defined ECG features.

To assess the likelihood of AF in the short term, a group of Mayo Clinic investigators developed a CNN to predict AF based on a standard 12-lead ECG obtained during sinus rhythm.67 The algorithm was developed using 454 789 digitally stored ECGs recorded from 126 526 patients; it was validated in a separate internal validation data set containing 64 340 ECGs from 18 116 patients and tested on another 130 802 ECGs from 36 280 patients. The model applied convolutions on a temporal axis and across multiple leads to extract morphological and temporal features during the training and validation process. Patients with at least 1 ECG showing AF within 31 days after the sinus rhythm ECG were classified as positive for AF. In the testing data set, the algorithm demonstrated a C statistic of 0.87, a sensitivity of 79.0%, a specificity of 79.5%, and an accuracy of 79.4% in detecting patients with documentation of AF within 31 days following the sinus rhythm ECG using only information from the sinus rhythm ECG (Figure 2). Therefore, the algorithm aims to detect nearly concomitant unrecognized AF, rather than predicting long-term AF risk. This algorithm may facilitate targeted AF surveillance (eg, using an ambulatory rhythm monitoring patch or implantable loop recorder) in subsets of high-risk patients. ECGs are ubiquitously performed for a variety of screening, diagnostic, and monitoring purposes, thereby providing ample opportunities for the application of this algorithm. The ultimate clinical utility of this approach will be determined by the observed positive and negative predictive values of the algorithm when applied to a given population and by the cost and downstream consequences, particularly for patient outcomes, related to follow-up diagnostic testing and therapies.

Unlike the traditional risk prediction models that comprise predefined variables, the CNN described above is agnostic as we do not know what ECG features the CNN is seeing and which factors drive its performance. It is likely that the algorithm performance is based on numerous ECG signatures that are known risk factors of AF (eg, left ventricular hypertrophy, P wave amplitude, atrial ectopy, and heart rate variability), as well as others that are currently unknown or are not obvious to the human eye.68 It is also likely that the ECG contains information that correlates with known clinical risk factors, as indeed the ECG has been shown to predict age, sex, and ejection fraction.3,4

Important questions need to be addressed before this tool can be widely used clinically. The algorithm was derived from a selected population in a large tertiary care institution with higher prevalence of AF than in less selected, general populations. Therefore, external validity and generalizability remain to be determined. The incremental value of this approach above and beyond clinical factors and risk scores requires further investigation. In fact, the value of standard clinical factors in identifying concomitant (rather than future) AF is largely unknown. Also, the impact of widespread application of this tool in general or primary care populations in terms of downstream testing, treatment utilization, costs, patient-reported quality of life, and hard outcomes (stroke, bleeding, death) is yet unknown.

ML for Implementing AF Prediction Algorithms

ML algorithms may facilitate implementation of traditional risk prediction models like CHARGE-AF but also of the artificial intelligence (AI)-enabled ECG algorithm. In a large-scale retrospective analysis using data from almost 3 million individuals with no history of AF in the UK Clinical


Figure 2. Development of a convolutional network to detect nearly concomitant AF based on a single 12-lead sinus rhythm ECG.
AF indicates atrial fibrillation; AUC, area under curve; NSR, normal sinus rhythm; and SR, sinus rhythm. Reprinted from Attia et al67 with
permission. Copyright ©2020, Elsevier.
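
The text notes that the clinical utility of the ECG algorithm will depend on its positive and negative predictive values in the population to which it is applied. The sketch below illustrates why, using Bayes' rule with the sensitivity (79.0%) and specificity (79.5%) reported for the testing data set. The prevalence values are assumptions chosen for illustration, not estimates from the study, and the function name is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence.

    All three arguments are proportions; prevalence must lie strictly
    between 0 and 1.
    """
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must be strictly between 0 and 1")
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Sensitivity and specificity as reported for the testing data set.
SENSITIVITY = 0.790
SPECIFICITY = 0.795

# Assumed prevalences of unrecognized AF (illustrative only).
for prevalence in (0.01, 0.05, 0.20):
    ppv, npv = predictive_values(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

With sensitivity and specificity held fixed, the positive predictive value falls as the condition becomes rarer, which is consistent with the concern that a model derived in a tertiary care population may behave differently in less selected populations.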

Practice Research Datalink, Hill et al69 developed and comparatively assessed traditional statistical approaches and novel ML models for AF prediction based on routinely collected patient data. These models included logistic least absolute shrinkage and selection operator, random forests, support vector machines, CNNs, Cox regression, and published AF risk models, such as CHARGE-AF. A time-varying CNN, which considered 100 different baseline predictors, was identified as the optimal model achieving an area under the curve (AUC) of 0.83 for AF prediction as opposed to 0.73 with the CHARGE-AF score and 0.70 with logistic regression. Importantly, the optimal time-varying CNN added useful insights into the risk prediction of AF by confirming known baseline AF risk factors (such as age, previous cardiovascular disease, antihypertensive medication usage) and identifying additional time-varying predictors (such as proximity of cardiovascular events, body mass index, pulse pressure, and the frequency of blood pressure measurements).

To calculate risk scores like CHARGE-AF or the optimal CNN described by Hill et al, data will need to be abstracted from the EHR. Some information, for example, age, sex, and blood pressure, might be directly obtained from predefined areas in the record such as the Demographics or Vital Signs tabs where such information is entered for each patient (structured data fields). Other diagnoses, like diabetes mellitus, heart failure, and myocardial infarction are more ambiguous. Relying on diagnosis or procedure codes to determine medical history assumes that such codes have been correctly adjudicated by the healthcare professional and correctly entered into the EHR. However, the complexity of current and previous coding systems allows the potential for significant inconsistencies in coding between different professionals, thereby reducing the reliability of prognostic scores derived solely based on diagnostic codes. Natural language processing (NLP) has the potential to improve the determination of diagnoses by extracting information from free-text clinical documentation from the EHR. The goal of NLP is to structure unstructured free-text data by using a variety of approaches, ranging from rule-based model training for recognition of linguistic patterns, to hybrid training of a model using text


vectorization and output tags that are fed into ML models, to completely unsupervised topic modeling. In turn, several different word embedding, text classification, text extraction, and topic modeling algorithms are used and continuously refined in this active area of investigation. Because NLP is not restricted by predefined diagnostic codes, an optimized NLP model has the ability to recognize even complex language patterns by comprehensively assessing all available documentation, thus improving the accuracy in capturing potentially ambiguous diagnoses. Second, the goal of screening for AF is to identify the patient who may benefit from oral anticoagulation (OAC) should AF be detected since not all patients with AF need OAC. Some patients may have a low stroke risk that does not warrant intervention; others may have contraindications to OAC so that a diagnosis of AF would not change their management. Ideally, screening for otherwise clinically silent AF should only be implemented for patients who are eligible for OAC per current guidelines (eg, CHA2DS2-VASc ≥2 in men or ≥3 in women70) and who do not have contraindications (eg, recurrent major bleeding or severe coagulopathy). The determination of eligibility for OAC in large numbers of patients will also require abstracting data from the EHR for calculation of their CHA2DS2-VASc scores. However, it should also be noted that ML may allow for better stroke risk stratification than the CHA2DS2-VASc score (discussed later in this review).

Recently, investigators have developed digital phenotyping algorithms to use both structured data (eg, diagnosis and procedure codes, prescription drugs, vital signs, laboratory tests, etc) and unstructured data (eg, clinical notes abstracted via NLP) to determine medical history, such as diabetes mellitus, heart failure, and peripheral artery disease, which have superior performance compared with relying on structured data alone.71–73 Another group of researchers has also developed a framework for phenotyping and developed R packages that can be easily executed by other investigators to develop digital phenotyping algorithms if all data are available.74,75 In repositories, such as the Phenotype KnowledgeBase, researchers can upload and share their algorithms as well as implement and validate algorithms developed by other groups.76 The Phenotype KnowledgeBase repository already contains algorithms for 50 to 60 medical conditions and many have demonstrated good performance when implemented across different health systems.77–81

Future Directions for Leveraging ML to Improve AF Screening

For large health systems, the AI-enabled ECG-based AF prediction could serve as a low-cost mass screening

further improve the model performance. For example, addition of other established risk factors to the AI ECG algorithm may further increase its prognostic performance in a synergistic fashion without necessarily increasing the costs and complexities of implementing the prediction algorithm as long as the information on such additional risk factors can be readily obtained via the EHR.

Furthermore, the predicted risk, regardless of CHARGE-AF, AI ECG, or polygenic risk scores, is a continuous value, whereas the clinical action is a binary decision (eg, whether or not to use prolonged AF monitoring). To translate the prediction model to everyday practice, a threshold must be established to help providers decide which patients should receive long-term monitoring. To facilitate such decisions, providers will need to know the positive predictive value and negative predictive value associated with different risk thresholds. A study is underway to address this question for the AI-enabled ECG algorithm in a general asymptomatic population.82

A subsequent question would be how to scale the AI algorithms to the broad population. One challenge is that the algorithms need to be revalidated or adapted when applied to a different population. The utilization of blockchain technology may allow the generation of a decentralized marketplace for the secure and traceable sharing of large amounts of patient data across institutions for the retraining and testing of AI tools.83,84 Moreover, not every clinic has the capacity and infrastructure for data acquisition and processing like the large health systems where the AI algorithms were originally developed. Successful implementation of these AI solutions will also require extensive training and support, for example, helping clinicians understand what they are supposed to do when they receive AI-generated results.

From an individual’s perspective, the availability of smart devices to capture the patterns and changes of physical activity, resting heart rate, weight, and sleep may provide additional opportunities to refine AF prediction. One example is an initiative of Yale University to build a patient-centered health data sharing platform, called Hugo, which is a smartphone application that aggregates data from EHRs of multiple health systems, pharmacies, personal devices (eg, activity monitors, digital weight scales, and single-lead ECGs) and allows the collection of patient-reported outcomes.85 This new tool could provide patients a comprehensive overview of their healthcare data and allow them to share data with researchers and interact with clinicians and researchers in real time (eg, report symptoms and respond to questionnaires). This comprehensive data collection tool could provide an opportunity to leverage multiple data sources to refine AF risk prediction.
tool,67 and a next step would be to investigate whether
including additional clinical risk factors from CHARGE- AF TREATMENT
AF and other novel risk factors (eg, blood biomarkers, Advances in pharmacological and interventional therapies
imaging modalities, and even polygenic risk scores) can have led to improvement in quality of life and reduction in

Circulation Research. 2020;127:155–169. DOI: 10.1161/CIRCRESAHA.120.316401 June 19, 2020   161
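As an illustration of the screening eligibility rule quoted in this review (CHA2DS2-VASc ≥2 in men or ≥3 in women, with no contraindication to OAC), the following is a minimal Python sketch. The `Patient` record and its field names are hypothetical and do not correspond to any EHR schema; the point weights are the standard published CHA2DS2-VASc weights, which the review itself does not list; and the contraindication flag is assumed to have been derived elsewhere (eg, by a phenotyping algorithm).

```python
from dataclasses import dataclass


@dataclass
class Patient:
    """Hypothetical patient record; field names are illustrative only."""
    age: int
    female: bool
    heart_failure: bool = False
    hypertension: bool = False
    diabetes: bool = False
    prior_stroke_tia: bool = False      # stroke, TIA, or thromboembolism
    vascular_disease: bool = False      # eg, prior MI or peripheral artery disease
    oac_contraindicated: bool = False   # assumed to be precomputed upstream


def cha2ds2_vasc(p: Patient) -> int:
    """Sum the standard CHA2DS2-VASc points for one patient."""
    score = 0
    score += 1 if p.heart_failure else 0
    score += 1 if p.hypertension else 0
    if p.age >= 75:
        score += 2
    elif p.age >= 65:
        score += 1
    score += 1 if p.diabetes else 0
    score += 2 if p.prior_stroke_tia else 0
    score += 1 if p.vascular_disease else 0
    score += 1 if p.female else 0
    return score


def eligible_for_screening(p: Patient) -> bool:
    """Apply the sex-specific threshold and exclude OAC contraindications."""
    threshold = 3 if p.female else 2
    return cha2ds2_vasc(p) >= threshold and not p.oac_contraindicated


if __name__ == "__main__":
    example = Patient(age=70, female=False, hypertension=True)
    print(cha2ds2_vasc(example), eligible_for_screening(example))
```

In practice the hard part is not the arithmetic but ascertaining each input reliably from structured and unstructured EHR data, which is where the phenotyping approaches described in the text come in.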


Siontis et al Machine Learning in AF
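The C statistics quoted in the Stroke Risk Stratification section (eg, 0.606 for CHA2DS2-VASc in its derivation cohort) measure discrimination: the probability that a randomly chosen patient who had the event received a higher predicted risk than a randomly chosen patient who did not. The sketch below computes this for a binary outcome, where it equals the AUC. It is a simplified illustration, not the method of any cited study; the function name is hypothetical, and time-to-event analyses typically use a censoring-aware variant that is not implemented here.

```python
from itertools import product
from typing import Sequence


def c_statistic(scores: Sequence[float], events: Sequence[int]) -> float:
    """Concordance for a binary outcome (equivalent to the AUC).

    scores: predicted risk or risk score per patient (higher means riskier).
    events: 1 if the patient had the event (eg, stroke), otherwise 0.
    """
    cases = [s for s, e in zip(scores, events) if e]
    controls = [s for s, e in zip(scores, events) if not e]
    if not cases or not controls:
        raise ValueError("need at least one event and one non-event")

    concordant = 0.0
    for case, control in product(cases, controls):
        if case > control:
            concordant += 1.0   # case ranked above control
        elif case == control:
            concordant += 0.5   # ties count as half
    return concordant / (len(cases) * len(controls))


if __name__ == "__main__":
    # Synthetic toy data, for illustration only.
    print(c_statistic([2, 3, 3, 5], [0, 0, 1, 1]))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which is why a C statistic near 0.6 is described in the text as relatively weak discrimination.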

AF-related morbidity in patients with AF over the past 2 decades. Nevertheless, AF is a heterogeneous syndrome, and not all patients carry the same risks of AF-related complications or respond to the same set of treatments in a similar manner. The challenge of truly individualized risk stratification (particularly for stroke) and treatment selection remains largely unmet. In this section, we review the current status and future directions of ML methods to help improve risk stratification for AF-related morbidity and optimize the delivery of care, including treatment selection and monitoring in patients with AF.

Stroke Risk Stratification

Since AF was recognized as a major risk factor for stroke >40 years ago, efforts to achieve meaningful individualized risk stratification have evolved.86 Current guidelines endorse the widely used CHA2DS2-VASc score as the preferred risk stratification scheme to identify patients in whom initiation of OAC may be warranted.7 The score is based on simple clinical variables, is generally easy to apply in daily practice, and is familiar to most healthcare providers. However, it is not always straightforward to accurately ascertain all the clinical risk factors to calculate the CHA2DS2-VASc score for the individual patient at a certain time point or over longitudinal follow-up due to inaccuracies of sampling the EHR or when one has to rely on patient-reported history. As discussed in the previous section, ML approaches using both structured and unstructured data with NLP offer promise in automating and improving risk phenotyping and identifying high-risk patients who will benefit from OAC initiation.

Despite its widespread utilization, the CHA2DS2-VASc score has been criticized for its relatively weak discriminatory ability. The C statistic was only 0.606 in its derivation cohort.7 Subsequent attempts to refine the score with the addition of other clinical risk factors resulted in only modest improvements in prognostic performance, and these iterations have not gained popularity in clinical practice.87–89 The common denominator of all these scores, and potentially a reason why they fail to completely capture the magnitude of risk, is that they are solely based on demographic and comorbid clinical factors. This oversimplified predictive modeling does not reflect the complex pathogenesis of thromboembolism in AF. Other key phenotypic characteristics, such as the type of AF (paroxysmal versus nonparoxysmal) and burden, as well as electrophysiological characteristics and LA and LAA macro- and micro-anatomy and function, are not considered. For example, mounting data suggest that thromboembolic risk may be higher with persistent rather than paroxysmal AF90 and with increasing AF burden determined by implantable device monitoring.91 An increasing body of evidence also supports the extent of LA fibrosis quantified by cardiac magnetic resonance imaging92 and even the LAA morphology,93 among other imaging variables, as incremental markers of stroke risk above and beyond the CHA2DS2-VASc score. Similarly, the potential incremental value of circulating biomarkers (myocardial, inflammatory, and procoagulant)94 and electrocardiographic markers (such as P wave amplitude, duration, and axis)95 has not been routinely assessed in risk calculations in practice alongside conventional clinical factors.

The feasibility and accuracy of ECG and image analysis integration into AI methods have been well demonstrated. Harnessing the power derived from large data sets, CNNs may be developed to allow the use of each of the above markers as readily applicable and low-cost predictors of stroke risk and to refine decision-making regarding OAC use. Indeed, in a recent analysis of rhythm monitoring data from implantable devices in >3000 patients with established AF diagnoses in the Veterans Health Administration, 3 ML models based on AF burden indices were trained for the prediction of stroke, including CNNs, random forest, and L1 regularized logistic regression.96 Random forest achieved a C statistic of 0.662 (test data set) and a CNN had a C statistic of 0.702 (validation data set), whereas CHA2DS2-VASc had an AUC of ≤0.5 in both data sets for stroke prediction. That study only included 71 stroke cases. Performance may be improved by accumulation of much larger data sets for derivation and validation. Integration of the AF burden signature models with clinical variables may also lead to incremental improvements. Indeed, when the CHA2DS2-VASc score was combined with random forest and CNN, this resulted in an AUC of 0.696 in the validation data set and an AUC of 0.634 in the test data set, yielding the highest average AUC on nontraining data. Similar approaches can be undertaken for any of the potential risk markers described above. However, beyond single biomarkers, ML approaches to automatically and reliably capture the full breadth of demographic, clinical, imaging, circulating biomarker, and ECG data from EHR sources can allow the development of inexpensive and scalable multicomponent tools for automatic detection of patients who are at risk of both AF and AF-related stroke based on structured and unstructured EHR data. Such models may be incorporated into EHR systems whereby they can be continuously internally validated and refined based on newly collected clinical information and events (such as a new stroke event), thereby providing the most up-to-date and precise estimations of risk to the clinician. Similarly, the application of ML methods to the vast amount of data collected via wearable ECG technologies or implanted cardiac devices can enable the real-time application of AF and stroke risk modeling based on clinical factors and rhythm signatures (heart rate variability, atrial ectopy burden, AF patterns, among others). Inohara et al97 recently performed unsupervised cluster analysis in ≈10 000 patients with AF in the ORBIT-AF registry (Outcomes Registry for Better Informed Treatment of Atrial Fibrillation) incorporating patient demographics, medical


history, medications, vital signs, laboratory data, imaging parameters, and electrocardiographic parameters. They identified 4 clinically relevant phenotypes of AF, each with distinct associations with clinical outcomes (low comorbidity, behavioral comorbidity, device implantation, and atherosclerotic comorbidity clusters). Importantly, conventional risk factors, such as AF type and LA size, were not significant drivers of cluster classification to further refine stroke risk modeling.

Hypothesizing that stroke risk may vary longitudinally based on rhythm patterns, a rhythm-guided, non–vitamin K antagonist OAC (NOAC)-in-pocket approach may prove feasible and safe. The stroke risk can be estimated in a time-to-event manner, and the predicted risks at different time points (1-year risk, 5-year risk, etc) can be calculated. Early experiences of intermittent NOAC use for AF detected by implanted cardiac device monitoring were underpowered to demonstrate a reduction in clinical end points,98,99 but the effectiveness of this approach may be improved with the real-time application of ML methods, allowing short-term prediction of AF and patient-directed OAC even before AF becomes manifest.

OAC in High-Risk Groups Without Known AF

For patients with cryptogenic stroke, current guidelines recommend ambulatory rhythm monitoring to assess for otherwise asymptomatic AF. However, the sensitivity of this approach to detect AF is rather low, particularly with short duration of monitoring. Longer periods of monitoring, for example, with the routinely used implantable loop recorders, can lead to a higher rate of AF diagnoses, but this invasive approach is costly and may be unnecessary in some cases. Bypassing prolonged rhythm monitoring, the routine use of OAC even before AF is documented has been considered and tested in 2 clinical trials using rivaroxaban and dabigatran.100,101 Both of these trials demonstrated an increase in the risk of bleeding compared to aspirin and no reduction in recurrent strokes, indicating that the benefit of OAC may not outweigh the harm of increased bleeding in patients without documented AF. It is also possible that any effect of these NOACs on recurrent stroke risk reduction was attenuated due to the fact that only a minority of these cryptogenic strokes was truly related to AF. A tool to better detect unrecognized AF may help limit OAC use to patients who are most likely to benefit. Applying the paradigm described previously,67 a patient with cryptogenic stroke could be treated with OAC based on an AI-enhanced ECG demonstrating a high AF probability. This would obviate the need for prolonged ambulatory rhythm monitoring with implantable or other monitors, reduce unnecessary resource utilization, and reduce the time off therapeutic OAC. Similarly, an AI-enabled ECG or other AI-enabled diagnostic tool that can identify with high fidelity patients with heart failure who are likely to have undiagnosed or future AF and subsequent risk for complications (stroke and worsening heart failure) may allow the early institution of effective therapies. This goal was not met when low-dose rivaroxaban was tested against placebo in patients with heart failure with reduced ejection fraction, coronary artery disease, and no AF in the COMMANDER HF trial (A Study to Assess the Effectiveness and Safety of Rivaroxaban in Reducing the Risk of Death, Myocardial Infarction, or Stroke in Participants With Heart Failure and Coronary Artery Disease Following an Episode of Decompensated Heart Failure).102 Finally, the potential value of empirical OAC in asymptomatic all-comers who are predicted to have an elevated risk for both AF and AF-related stroke on the basis of AI-enabled prognostication is a thought-provoking approach that requires further testing and clinical validation.

Optimizing Treatment Choices

AF treatment is truly multidimensional. It includes considerations of symptom management and prevention of AF-related complications with pharmacological (antiarrhythmic medications, rate control medications, OACs), catheter ablation, and device-based interventions (LAA occlusion and pacemakers). Despite improvements in outcomes of AF patients treated with contemporary approaches in centers with expertise, significant uncertainties remain, further complicating decision-making for clinicians and patients. The availability of large data sets, such as administrative claims-based data sets and those originating from large clinical trials, allows the in-depth phenotyping of associations between different treatments and clinical outcomes. In the study by Inohara et al,97 unsupervised learning identified phenotypically and prognostically distinct clusters of AF patients, which were also characterized by distinct treatment patterns, such as rate versus rhythm control, specific AAD use, and antithrombotic strategies. Traditional methodologies can be augmented with ML approaches integrating routinely collected information in the setting of retrospective and prospective studies enriched with widely available EHR-derived information. These tools can inform shared decision-making across a wide range of current treatment dilemmas. Specific examples include the following:

1. NOAC versus warfarin: The use of NOACs has increased rapidly over the past decade, and their advantages over warfarin are clear and well demonstrated. However, the NOACs are not without limitations, and in some populations, warfarin may be the preferred OAC either due to specific clinical reasons or due to healthcare resource limitations. This is particularly relevant for populations where there is paucity of clinical trial data on NOAC outcomes, such as in patients with end-stage renal disease in whom warfarin remains heavily used.103
2. OAC versus LAA occlusion: In the United States, the Watchman device is the only currently


FDA-approved device for percutaneous LAA occlusion in high-risk patients intolerant of long-term OAC.104,105 Both OAC and LAA occlusion, however, can be associated with treatment failures, that is, they do not universally eliminate stroke risk. Also, the Watchman implantation procedure carries periprocedural risks and significant healthcare expenditures, which prohibit more liberal use in most AF patients. On the other hand, OACs are associated with non-negligible bleeding risks and the inconvenience of a daily medication. Thus, identifying patients who are likely to benefit the most or the least from the Watchman procedure becomes a priority. ML methods integrating clinical factors with advanced imaging analytics, including LA structure and function, as well as LAA size and morphology, are likely to play a role in that regard.
3. Catheter ablation versus AAD: Numerous clinical trials, including the recently completed CABANA trial (Catheter Ablation vs Antiarrhythmic Drug Therapy for Atrial Fibrillation),106,107 have shown catheter ablation to be a reasonable first-line treatment for patients with symptomatic AF, particularly paroxysmal. Nevertheless, significant treatment effect heterogeneity exists among patient subgroups, such that generalizability of clinical trial results to clinical practice becomes challenging. ML approaches that consider the entire patient phenotype may offer a means of discrimination among those who are likely to respond to ablation versus those more likely to respond to AADs.

Warfarin Treatment Monitoring

Despite the uptake of NOACs, warfarin remains widely used in the United States. Among other known limitations, warfarin has a narrow therapeutic window. Poor anticoagulation control as indicated by the time in therapeutic range has been associated with adverse clinical outcomes in AF patients.108 Also, defining an optimal dose can be challenging in patients newly starting the medication as there is significant variability in dosing requirements among individuals. ML approaches integrating demographic, clinical, and even pharmacogenomic data have been successfully developed to improve the ability to predict warfarin dosing in the individual patient.109,110 Furthermore, adherence to OACs is generally suboptimal and has only modestly improved with the increased use of NOACs.111 In a randomized trial, Labovitz et al112 tested the impact of a smartphone-based monitoring intervention that automates directly observed therapy using AI to visually confirm medication ingestion. Adherence was 100% and 50% in the intervention and control groups, respectively, as determined by plasma drug concentration levels,112 suggesting the feasibility and potential clinical utility of such tools.

Antiarrhythmic Drug Management

AADs are commonly used in the management of AF and other arrhythmias but require close monitoring due to the risk of potentially fatal ventricular proarrhythmia related to drug-induced QT prolongation, a side effect that is most prominent with dofetilide and sotalol. These medications are initiated under ECG monitoring in the hospital. Dose adjustments may subsequently be needed depending on the degree of QT prolongation, concomitant QT-prolonging drugs, and potential changes in renal function as both sotalol and dofetilide are predominantly renally eliminated. However, the QT interval is an imperfect marker for monitoring because it does not accurately reflect plasma concentration or proarrhythmic potential. A group of Mayo Clinic investigators developed a deep-learning model to predict dofetilide plasma concentration in 42 subjects using the ECG.113 This deep-learning approach predicted plasma dofetilide concentrations with good correlation (r=0.58). In a different multicenter study (n=354 patients), Levy et al8 used ML approaches including supervised, unsupervised, and reinforcement learning to identify the optimal dosing strategy for dofetilide loading. A reinforcement learning algorithm informed by unsupervised learning was able to predict dosing decisions with an accuracy of 96.1%.8

Beyond dofetilide, Hu et al114 tested ML methods to predict appropriateness of initial digoxin dosing in 307 patients, including patients with AF, distinguishing between patients with drug-drug interactions and those without interactions. Among several ML models tested using demographic, clinical, and laboratory data, random forest provided the highest AUC (0.912) followed by multilayer perceptron with an AUC of 0.813 in the no drug interactions group. The AUC of random forest was the best for the group of patients with interactions (0.892), followed by a classification and regression tree model (AUC, 0.795) and multilayer perceptron (AUC, 0.777).114

Applications for Catheter Ablation

Pulmonary vein isolation is the cornerstone of the current ablative management of AF. Pulmonary vein isolation is effective in paroxysmal trigger-dependent AF, but outcomes of ablation are at best modest in persistent AF, and many patients require repeated procedures. A very large body of literature over >10 years has examined the incremental value of linear ablation, ablation of nonpulmonary vein triggers, complex fractionated electrograms or targets guided by phase mapping in addition to pulmonary vein isolation, but none of these approaches has been consistently shown to further improve long-term success rates of ablation.115,116 These investigations, however, have been hampered by relatively small sample sizes and limited power. Mounting evidence suggests the role of nonpulmonary vein triggers and substrate particularly in nonparoxysmal AF and in patients with comorbidities, such as obstructive


sleep apnea.117 Indeed, the pathogenesis of AF, especially persistent, is so heterogeneous that a one-size-fits-all ablation approach is unlikely to benefit all patients.

Tailored approaches to AF ablation have been many years in the making. Recently, Boyle et al118 have reported pioneering work attempting to determine atrial ablation target sites based on computational modeling of MRI-defined atrial fibrosis. AI methods have the potential to even further enhance decision-making for a truly personalized selection of an ablative approach. Development of such AI models would incorporate not only imaging data but also clinical factors, data from ECG in sinus rhythm and/or AF, and information from invasive electroanatomic mapping, including, for example, voltage and activation maps in sinus rhythm and AF. Ultimately, a trained model would interface with the proceduralist in real time to indicate the critical areas that should be targeted for ablation in that particular patient. Large-scale multicenter collaborations are required for such a model to be realized because information from a very large number of patients and procedures, as well as rigorous post-ablation follow-up, is needed.

LIMITATIONS AND CHALLENGES OF ML

While large amounts of data can strengthen the power of ML models, it becomes increasingly difficult to critically assess their quality. This becomes even more critical when deep-learning convolutional models are used whereby nonlinear data transformations and multiple convolutions make it difficult to track how the data were internally handled and what aspects of the input data most heavily weighed on the model output.119

One of the major concerns with deep-learning ML methods is that they are black boxes given the agnostic nature of how a set of input data is analyzed to derive an output. Explainability of deep-learning algorithms is therefore an important area of ongoing investigation. Another concern is that such black box models may not allow patients and providers to engage in meaningful shared decision making because it is unknown what drives the recommendation provided by the model. The counterargument to that concern is that such an unbiased approach may actually be able to enhance shared decision making as it does not provide any recommendations based on preconceived notions that humans unavoidably carry into shared decision-making interactions to some extent.

SOCIETAL AND LEGAL CONSIDERATIONS

As ML evolves from a purely research tool to one that is directly used in clinical care, it is important to consider its societal and legal implications. A recent example underscored the potential for ML to reflect and even compound disparities in care. Obermeyer et al120 assessed a risk model that is in active clinical use to predict healthcare system expenditure. Those at the highest end of predicted healthcare use were assigned to a special program seeking to improve care coordination. They observed that for any given set of inputs, black patients were assigned a lower score than white patients; if this racial disparity was corrected, the number of black patients who qualified for assistance would have tripled. Because black patients accrued less healthcare cost for any given degree of wellness, the algorithm was correctly predicting cost, but was seemingly doing so by encoding the racial disparities seen in the training data. Aware of such potential harms, several groups have started developing frameworks to guard against bias and potentially to decrease healthcare disparities across populations.121,122

In order for ML applications to have a meaningful impact on population health and well-being, these applications need to be easily scalable so that they can reach the target populations who are most likely to benefit. In the example of screening for AF, the populations needing it the most are primary care populations outside of cardiology subspecialty care and typically outside of academic medical centers. In these settings, the familiarity with AI may be lower than in specialty practices actively engaged in research. We must remain mindful in our application of AI technologies so as not to perpetuate or exacerbate health disparities.

Finally, the regulatory and legal landscape may pose a challenge to fully realizing the benefits of ML in clinical care. For example, tort law favors the standard of care, so when an ML algorithm makes recommendations that deviate from the standard (which may be seen as a goal of personalized medicine), the liability risk may hinder physicians from following the potentially more accurate guidance provided by ML tools.123

CONCLUSIONS

The clinical application of ML is in its infancy, but it has the potential to contribute to the care of AF in the modern era (Figure 1). With several remaining challenges and uncertainties ranging from screening to risk stratification and treatment of AF and its complications, the ongoing advances driven by ML show promise as discussed in this review. What will it take for the full potential of ML to be realized? Clinical validation, internal and external replication consistency, and generalizability with an eye toward application in various healthcare settings regardless of available resources, as well as consideration of regulatory issues, should all be key components of a rigorous research platform that will allow us to realize the potential of ML while ensuring a patient-oriented focus in the care of AF.

ARTICLE INFORMATION

Affiliations

From the Department of Cardiovascular Medicine (K.C.S., P.A.N.), Robert D and Patricia E Kern Center for the Science of Health Care Delivery (X.Y.), and Division


of Health Care Policy and Research, Department of Health Sciences Research (X.Y.), Mayo Clinic, Rochester, MN; Broad Institute, Cambridge, MA (J.P.P., A.A.P.); and Division of Cardiology, Massachusetts General Hospital, Boston (J.P.P.).

Disclosures

A. Philippakis is a Venture Partner at GV, the corporate venture capital group of Alphabet. J. Pirruccello is supported by the John S. LaDue Memorial Fellowship for Cardiovascular Research. The other authors report no conflicts.

REFERENCES

1. Zhang J, Gajjala S, Agrawal P, Tison GH, Hallock LA, Beussink-Nelson L, Lassen MH, Fan E, Aras MA, Jordan C, et al. Fully automated echocardiogram interpretation in clinical practice. Circulation. 2018;138:1623–1635. doi: 10.1161/CIRCULATIONAHA.118.034338
2. Fries JA, Varma P, Chen VS, Xiao K, Tejeda H, Saha P, Dunnmon J, Chubb H, Maskatia S, Fiterau M, et al. Weakly supervised classification of aortic valve malformations using unlabeled cardiac MRI sequences. Nat Commun. 2019;10:3111. doi: 10.1038/s41467-019-11012-3
3. Attia ZI, Friedman PA, Noseworthy PA, Lopez-Jimenez F, Ladewig DJ, Satam G, Pellikka PA, Munger TM, Asirvatham SJ, Scott CG, et al. Age and sex estimation using artificial intelligence from standard 12-lead ECGs. Circ Arrhythm Electrophysiol. 2019;12:e007284. doi: 10.1161/CIRCEP.119.007284
4. Attia ZI, Kapa S, Lopez-Jimenez F, McKie PM, Ladewig DJ, Satam G, Pellikka PA, Enriquez-Sarano M, Noseworthy PA, Munger TM, et al. Screening for cardiac contractile dysfunction using an artificial intelligence-enabled electrocardiogram. Nat Med. 2019;25:70–74. doi: 10.1038/s41591-018-0240-2
5. Tomašev N, Glorot X, Rae JW, Zielinski M, Askham H, Saraiva A, Mottram A, Meyer C, Ravuri S, Protsyuk I, et al. A clinically applicable approach to continuous prediction of future acute kidney injury. Nature. 2019;572:116–119. doi: 10.1038/s41586-019-1390-1
6. Koohy H. The rise and fall of machine learning methods in biomedical research. F1000Res. 2017;6:2012. doi: 10.12688/f1000research.13016.2
7. Lip GY, Nieuwlaat R, Pisters R, Lane DA, Crijns HJ. Refining clinical risk stratification for predicting stroke and thromboembolism in atrial fibrillation using a novel risk factor-based approach: the euro heart survey on atrial fibrillation. Chest. 2010;137:263–272. doi: 10.1378/chest.09-1584
8. Levy AE, Biswas M, Weber R, Tarakji K, Chung M, Noseworthy PA, Newton-Cheh C, Rosenberg MA. Applications of machine learning in decision analysis for dose management for dofetilide. PLoS One. 2019;14:e0227324. doi: 10.1371/journal.pone.0227324
9. Rudin C. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nature Machine Intelligence. 2019;1:206–215.
10. Beam AL, Kohane IS. Big data and machine learning in health care. JAMA. 2018;319:1317–1318. doi: 10.1001/jama.2017.18391
11. Deo RC. Machine learning in medicine. Circulation. 2015;132:1920–1930. doi: 10.1161/CIRCULATIONAHA.115.001593
12. Gottesman O, Johansson F, Komorowski M, Faisal A, Sontag D, Doshi-Velez F, Celi LA. Guidelines for reinforcement learning in healthcare. Nat Med. 2019;25:16–18. doi: 10.1038/s41591-018-0310-5
13. Siontis KC, Gersh BJ, Killian JM, Noseworthy PA, McCabe P, Weston SA, Roger VL, Chamberlain AM. Typical, atypical, and asymptomatic presentations of new-onset atrial fibrillation in the community: characteristics and prognostic implications. Heart Rhythm. 2016;13:1418–1424. doi: 10.1016/j.hrthm.2016.03.003

review and meta-analysis. Eur J Prev Cardiol. 2016;23:1330–1338. doi: 10.1177/2047487315611347
18. Harris K, Edwards D, Mant J. How can we best detect atrial fibrillation? J R Coll Physicians Edinb. 2012;42(Suppl 18):5–22. doi: 10.4997/JRCPE.2012.S02
19. Cooke G, Doust J, Sanders S. Is pulse palpation helpful in detecting atrial fibrillation? A systematic review. J Fam Pract. 2006;55:130–134.
20. Wiesel J, Wiesel D, Suri R, Messineo FC. The use of a modified sphygmomanometer to detect atrial fibrillation in outpatients. Pacing Clin Electrophysiol. 2004;27:639–643. doi: 10.1111/j.1540-8159.2004.00499.x
21. Wiesel J, Fitzig L, Herschman Y, Messineo FC. Detection of atrial fibrillation using a modified microlife blood pressure monitor. Am J Hypertens. 2009;22:848–852. doi: 10.1038/ajh.2009.98
22. Kearley K, Selwood M, Van den Bruel A, Thompson M, Mant D, Hobbs FR, Fitzmaurice D, Heneghan C. Triage tests for identifying atrial fibrillation in primary care: a diagnostic accuracy study comparing single-lead ECG and modified BP monitors. BMJ Open. 2014;4:e004565. doi: 10.1136/bmjopen-2013-004565
23. Stergiou GS, Karpettas N, Protogerou A, Nasothimiou EG, Kyriakidis M. Diagnostic accuracy of a home blood pressure monitor to detect atrial fibrillation. J Hum Hypertens. 2009;23:654–658. doi: 10.1038/jhh.2009.5
24. Marazzi G, Iellamo F, Volterrani M, Lombardo M, Pelliccia F, Righi D, Grieco F, Cacciotti L, Iaia L, Caminiti G, et al. Comparison of Microlife BP A200 Plus and Omron M6 blood pressure monitors to detect atrial fibrillation in hypertensive patients. Adv Ther. 2012;29:64–70. doi: 10.1007/s12325-011-0087-0
25. Wiesel J, Arbesfeld B, Schechter D. Comparison of the Microlife blood pressure monitor with the Omron blood pressure monitor for detecting atrial fibrillation. Am J Cardiol. 2014;114:1046–1048. doi: 10.1016/j.amjcard.2014.07.016
26. Lau J, Lowres N, Neubeck L, Brieger D, Sy R, Galloway C, Albert D, Freedman S. Performance of an automated iPhone ECG algorithm to diagnose atrial fibrillation in a community AF screening program (SEARCH-AF). Heart Lung Circ. 2013;22:S205.
27. Tieleman RG, Plantinga Y, Rinkes D, Bartels GL, Posma JL, Cator R, Hofman C, Houben RP. Validation and clinical use of a novel diagnostic device for screening of atrial fibrillation. Europace. 2014;16:1291–1295. doi: 10.1093/europace/euu057
28. Vaes B, Stalpaert S, Tavernier K, Thaels B, Lapeire D, Mullens W, Degryse J. The diagnostic accuracy of the MyDiagnostick to detect atrial fibrillation in primary care. BMC Fam Pract. 2014;15:113. doi: 10.1186/1471-2296-15-113
29. Doliwa PS, Frykman V, Rosenqvist M. Short-term ECG for out of hospital detection of silent atrial fibrillation episodes. Scand Cardiovasc J. 2009;43:163–168. doi: 10.1080/14017430802593435
30. Kaasenbrood F, Hollander M, Rutten FH, Gerhards LJ, Hoes AW, Tieleman RG. Yield of screening for atrial fibrillation in primary care with a hand-held, single-lead electrocardiogram device during influenza vaccination. Europace. 2016;18:1514–1520. doi: 10.1093/europace/euv426
31. Jacobs MS, Kaasenbrood F, Postma MJ, van Hulst M, Tieleman RG. Cost-effectiveness of screening for atrial fibrillation in primary care with a hand-held, single-lead electrocardiogram device in the Netherlands. Europace. 2018;20:12–18. doi: 10.1093/europace/euw285
32. Orchard J, Lowres N, Freedman SB, Ladak L, Lee W, Zwar N, Peiris D, Kamaladasa Y, Li J, Neubeck L. Screening for atrial fibrillation during influ-
14. Turakhia MP, Shafrin J, Bognar K, Trocio J, Abdulsattar Y, Wiederkehr enza vaccinations by primary care nurses using a smartphone electrocar-
D, Goldman DP. Estimated prevalence of undiagnosed atrial fibril- diograph (iECG): a feasibility study. Eur J Prev Cardiol. 2016;23:13–20. doi:
lation in the United States. PLoS One. 2018;13:e0195088. doi: 10.1177/2047487316670255
10.1371/journal.pone.0195088 33. Lowres N, Neubeck L, Salkeld G, Krass I, McLachlan AJ, Redfern J, Bennett
15. Steinberg BA, Piccini JP. When low-risk atrial fibrillation is not so AA, Briffa T, Bauman A, Martinez C, et al. Feasibility and cost-effectiveness
low risk: beast of burden. JAMA Cardiol. 2018;3:558–560. doi: of stroke prevention through community screening for atrial fibrillation using
10.1001/jamacardio.2018.1205 iPhone ECG in pharmacies. The SEARCH-AF study. Thromb Haemost.
16. Mairesse GH, Moran P, Van Gelder IC, Elsner C, Rosenqvist M, Mant 2014;111:1167–1176. doi: 10.1160/TH14-03-0231
J, Banerjee A, Gorenek B, Brachmann J, Varma N, et al; ESC Scientific 34. Bumgarner JM, Lambert CT, Hussein AA, Cantillon DJ, Baranowski B,
Document Group. Screening for atrial fibrillation: a European Heart Rhythm Wolski K, Lindsay BD, Wazni OM, Tarakji KG. Smartwatch algorithm for
Association (EHRA) consensus document endorsed by the Heart Rhythm automated detection of atrial fibrillation. J Am Coll Cardiol. 2018;71:2381–
Society (HRS), Asia Pacific Heart Rhythm Society (APHRS), and Sociedad 2388. doi: 10.1016/j.jacc.2018.03.003
Latinoamericana de Estimulación Cardíaca y Electrofisiología (SOLAECE). 35. Tison GH, Sanchez JM, Ballinger B, Singh A, Olgin JE, Pletcher MJ,
Europace. 2017;19:1589–1623. doi: 10.1093/europace/eux177 Vittinghoff E, Lee ES, Fan SM, Gladstone RA, et al. Passive detection of
17. Taggar JS, Coleman T, Lewis S, Heneghan C, Jones M. Accuracy of methods atrial fibrillation using a commercially available Smartwatch. JAMA Cardiol.
for detecting an irregular pulse and suspected atrial fibrillation: a systematic 2018;3:409–416. doi: 10.1001/jamacardio.2018.0136

166   June 19, 2020 Circulation Research. 2020;127:155–169. DOI: 10.1161/CIRCRESAHA.120.316401


Siontis et al Machine Learning in AF

36. Lewis M, Parker D, Weston C, Bowes M. Screening for atrial fibrillation: sen- atrial fibrillation risk algorithm in whites and African Americans. Arch Intern

COMPENDIUM ON ATRIAL
sitivity and specificity of a new methodology. Br J Gen Pract. 2011;61:38– Med. 2010;170:1909–1917. doi: 10.1001/archinternmed.2010.434
39. doi: 10.3399/bjgp11X548956 54. Shulman E, Kargoli F, Aagaard P, Hoch E, Di Biase L, Fisher J, Gross J,

FIBRILLATION
37. McMANUS DD, Chong JW, Soni A, Saczynski JS, Esa N, Napolitano C, Kim S, Krumerman A, Ferrick KJ. Validation of the Framingham Heart Study
Darling CE, Boyer E, Rosen RK, Floyd KC, et al. PULSE-SMART: pulse- and CHARGE-AF risk scores for atrial fibrillation in hispanics, African-
based arrhythmia discrimination using a novel smartphone application. J Americans, and non-Hispanic whites. Am J Cardiol. 2016;117:76–83. doi:
Cardiovasc Electrophysiol. 2016;27:51–57. doi: 10.1111/jce.12842 10.1016/j.amjcard.2015.10.009
38. Guo Y, Wang H, Zhang H, Liu T, Liang Z, Xia Y, Yan L, Xing Y, Shi H, Li 55 Pfister R, Brägelmann J, Michels G, Wareham NJ, Luben R, Khaw KT.
S, et al; MAFA II Investigators. Mobile photoplethysmographic technology Performance of the CHARGE-AF risk model for incident atrial fibrillation
to detect atrial fibrillation. J Am Coll Cardiol. 2019;74:2365–2375. doi: in the EPIC Norfolk cohort. Eur J Prev Cardiol. 2015;22:932–939. doi:
10.1016/j.jacc.2019.08.019 10.1177/2047487314544045
39. Dorr M, Nohturfft V, Brasier N, Bosshard E, Djurdjevic A, Gross S, Raichle CJ, 56. Kolek MJ, Graves AJ, Xu M, Bian A, Teixeira PL, Shoemaker MB, Parvez B,
Rhinisperger M, Stockli R, Eckstein J. The WATCH AF Trial: SmartWATCHes Xu H, Heckbert SR, Ellinor PT, et al. Evaluation of a prediction model for the
for detection of atrial fibrillation. JACC Clin Electrophysiol. 2019;5:199–208. development of atrial fibrillation in a repository of electronic medical records.
40. Bonomi AG, Schipper F, Eerikäinen LM, Margarito J, van Dinther R, Muesch JAMA Cardiol. 2016;1:1007–1013. doi: 10.1001/jamacardio.2016.3366
G, de Morree HM, Aarts RM, Babaeizadeh S, McManus DD, et al. Atrial 57. Hulme OL, Khurshid S, Weng LC, Anderson CD, Wang EY, Ashburner JM, Ko
fibrillation detection using a novel cardiac ambulatory monitor based on D, McManus DD, Benjamin EJ, Ellinor PT, et al. Development and validation of
photo-plethysmography at the wrist. J Am Heart Assoc. 2018;7:e009351. a prediction model for atrial fibrillation using electronic health records. JACC
doi: 10.1161/JAHA.118.009351 Clin Electrophysiol. 2019;5:1331–1341. doi: 10.1016/j.jacep.2019.07.016
41. Chan PH, Wong CK, Poh YC, Pun L, Leung WW, Wong YF, Wong MM, 58. Christophersen IE, Yin X, Larson MG, Lubitz SA, Magnani JW, McManus
Poh MZ, Chu DW, Siu CW. Diagnostic performance of a smartphone- DD, Ellinor PT, Benjamin EJ. A comparison of the CHARGE-AF and
based photoplethysmographic application for atrial fibrillation screen- the CHA2DS2-VASc risk scores for prediction of atrial fibrillation
ing in a primary care setting. J Am Heart Assoc. 2016;5:e003428. doi: in the Framingham Heart Study. Am Heart J. 2016;178:45–54. doi:
10.1161/JAHA.116.003428 10.1016/j.ahj.2016.05.004
42. Krivoshei L, Weber S, Burkard T, Maseli A, Brasier N, Kühne M, Conen D, 59. Saliba W, Gronich N, Barnett-Griness O, Rennert G. Usefulness of CHADS2
Huebner T, Seeck A, Eckstein J. Smart detection of atrial fibrillation. Euro- and CHA2DS2-VASc scores in the prediction of new-onset atrial fibril-
pace. 2017;19:753–757. doi: 10.1093/europace/euw125 lation: a population-based study. Am J Med. 2016;129:843–849. doi:
43. Conroy T, Guzman JH, Hall B, Tsouri G, Couderc JP. Detection of atrial 10.1016/j.amjmed.2016.02.029
fibrillation using an earlobe photoplethysmographic sensor. Physiol Meas. 60. Siontis KC, Geske JB, Gersh BJ. Atrial fibrillation pathophysiology and
2017;38:1906–1918. doi: 10.1088/1361-6579/aa8830 prognosis: insights from cardiovascular imaging. Circ Cardiovasc Imaging.
44. Bashar SK, Han D, Hajeb-Mohammadalipour S, Ding E, Whitcomb C, 2015;8:e003020. doi: 10.1161/CIRCIMAGING.115.003020
McManus DD, Chon KH. Atrial fibrillation detection from wrist photople- 61. Weng LC, Preis SR, Hulme OL, Larson MG, Choi SH, Wang B, Trinquart L,
thysmography signals using smartwatches. Sci Rep. 2019;9:15054. doi: McManus DD, Staerk L, Lin H, et al. Genetic predisposition, clinical risk fac-
10.1038/s41598-019-49092-2 tor burden, and lifetime risk of atrial fibrillation. Circulation. 2018;137:1027–
45. Brasier N, Raichle CJ, Dörr M, Becke A, Nohturfft V, Weber S, Bulacher 1038. doi: 10.1161/CIRCULATIONAHA.117.031431
F, Salomon L, Noah T, Birkemeyer R, et al. Detection of atrial fibrillation 62. Schnabel RB, Larson MG, Yamamoto JF, Sullivan LM, Pencina MJ,
with a smartphone camera: first prospective, international, two-centre, Meigs JB, Tofler GH, Selhub J, Jacques PF, Wolf PA, et al. Relations of
clinical validation study (DETECT AF PRO). Europace. 2019;21:41–47. doi: biomarkers of distinct pathophysiological pathways and atrial fibrilla-
Downloaded from http://ahajournals.org by on June 24, 2020

10.1093/europace/euy176 tion incidence in the community. Circulation. 2010;121:200–207. doi:


46. Lowres N, Olivier J, Chao TF, Chen SA, Chen Y, Diederichsen A, Fitzmaurice 10.1161/CIRCULATIONAHA.109.882241
DA, Gomez-Doblas JJ, Harbison J, Healey JS, et al. Estimated stroke 63. Sinner MF, Stepas KA, Moser CB, Krijthe BP, Aspelund T, Sotoodehnia N,
risk, yield, and number needed to screen for atrial fibrillation detected Fontes JD, Janssens AC, Kronmal RA, Magnani JW, et al. B-type natriuretic
through single time screening: a multicountry patient-level meta-analysis peptide and C-reactive protein in the prediction of atrial fibrillation risk: the
of 141,220 screened individuals. PLoS Med. 2019;16:e1002903. doi: CHARGE-AF Consortium of community-based cohort studies. Europace.
10.1371/journal.pmed.1002903 2014;16:1426–1433. doi: 10.1093/europace/euu175
47. Svennberg E, Engdahl J, Al-Khalili F, Friberg L, Frykman V, Rosenqvist M. Mass 64. Everett BM, Cook NR, Conen D, Chasman DI, Ridker PM, Albert CM. Novel
screening for untreated atrial fibrillation: the STROKESTOP Study. Circulation. genetic markers improve measures of atrial fibrillation risk prediction. Eur
2015;131:2176–2184. doi: 10.1161/CIRCULATIONAHA.114.014343 Heart J. 2013;34:2243–2251. doi: 10.1093/eurheartj/eht033
48. Steinhubl SR, Waalen J, Edwards AM, Ariniello LM, Mehta RR, Ebner GS, 65. Darbar D, Jahangir A, Hammill SC, Gersh BJ. P wave signal-averaged
Carter C, Baca-Motes K, Felicione E, Sarich T, et al. Effect of a home- electrocardiography to identify risk for atrial fibrillation. Pacing Clin Electro-
based wearable continuous ECG monitoring patch on detection of undi- physiol. 2002;25:1447–1453. doi: 10.1046/j.1460-9592.2002.01447.x
agnosed atrial fibrillation: the mSToPS Randomized Clinical Trial. JAMA. 66. German DM, Kabir MM, Dewland TA, Henrikson CA, Tereshchenko LG. Atrial
2018;320:146–155. doi: 10.1001/jama.2018.8102 fibrillation predictors: importance of the electrocardiogram. Ann Noninvasive
49. Schnabel RB, Sullivan LM, Levy D, Pencina MJ, Massaro JM, D’Agostino Electrocardiol. 2016;21:20–29. doi: 10.1111/anec.12321
RB Sr, Newton-Cheh C, Yamamoto JF, Magnani JW, Tadros TM, et al. 67. Attia ZI, Noseworthy PA, Lopez-Jimenez F, Asirvatham SJ, Deshmukh
Development of a risk score for atrial fibrillation (Framingham Heart AJ, Gersh BJ, Carter RE, Yao X, Rabinstein AA, Erickson BJ, et al.
Study): a community-based cohort study. Lancet. 2009;373:739–745. doi: An artificial intelligence-enabled ECG algorithm for the identifica-
10.1016/S0140-6736(09)60443-8 tion of patients with atrial fibrillation during sinus rhythm: a retrospec-
50. Chamberlain AM, Agarwal SK, Folsom AR, Soliman EZ, Chambless LE, Crow tive analysis of outcome prediction. Lancet. 2019;394:861–867. doi:
R, Ambrose M, Alonso A. A clinical risk score for atrial fibrillation in a biracial 10.1016/S0140-6736(19)31721-0
prospective cohort (from the Atherosclerosis Risk in Communities [ARIC] 68. Dewland TA, Vittinghoff E, Mandyam MC, Heckbert SR, Siscovick
study). Am J Cardiol. 2011;107:85–91. doi: 10.1016/j.amjcard.2010.08.049 DS, Stein PK, Psaty BM, Sotoodehnia N, Gottdiener JS,
51. Alonso A, Krijthe BP, Aspelund T, Stepas KA, Pencina MJ, Moser CB, Marcus GM. Atrial ectopy as a predictor of incident atrial fibril-
Sinner MF, Sotoodehnia N, Fontes JD, Janssens AC, et al. Simple risk lation: a cohort study. Ann Intern Med. 2013;159:721–728. doi:
model predicts incidence of atrial fibrillation in a racially and geographi- 10.7326/0003-4819-159-11-201312030-00004
cally diverse population: the CHARGE-AF consortium. J Am Heart Assoc. 69. Hill NR, Ayoubkhani D, McEwan P, Sugrue DM, Farooqui U, Lister S, Lumley
2013;2:e000102. doi: 10.1161/JAHA.112.000102 M, Bakhai A, Cohen AT, O’Neill M, et al. Predicting atrial fibrillation in pri-
52. Li YG, Pastori D, Farcomeni A, Yang PS, Jang E, Joung B, Wang YT, Guo mary care using machine learning. PLoS One. 2019;14:e0224582. doi:
YT, Lip GYH. A Simple Clinical Risk Score (C2HEST) for predicting incident 10.1371/journal.pone.0224582
atrial fibrillation in Asian subjects: derivation in 471,446 Chinese subjects, 70. January CT, Wann LS, Calkins H, Chen LY, Cigarroa JE, Cleveland JC Jr,
with internal validation and external application in 451,199 Korean subjects. Ellinor PT, Ezekowitz MD, Field ME, Furie KL, et al. 2019 AHA/ACC/HRS
Chest. 2019;155:510–518. doi: 10.1016/j.chest.2018.09.011 focused update of the 2014 AHA/ACC/HRS guideline for the manage-
53. Schnabel RB, Aspelund T, Li G, Sullivan LM, Suchy-Dicey A, Harris TB, ment of patients with atrial fibrillation: a Report of the American College
Pencina MJ, D’Agostino RB Sr, Levy D, Kannel WB, et al. Validation of an of Cardiology/American Heart Association task force on clinical practice

Circulation Research. 2020;127:155–169. DOI: 10.1161/CIRCRESAHA.120.316401 June 19, 2020   167


Siontis et al Machine Learning in AF

guidelines and the heart rhythm society. J Am Coll Cardiol. 2019;74:104– 89. Kabra R, Girotra S, Vaughan Sarrazin M. Refining stroke prediction
COMPENDIUM ON ATRIAL

132. doi: 10.1016/j.jacc.2019.01.011 in atrial fibrillation patients by addition of African-American ethnic-


71. Liu H, Bielinski SJ, Sohn S, Murphy S, Wagholikar KB, Jonnalagadda SR, ity to CHA2DS2-VASc score. J Am Coll Cardiol. 2016;68:461–470. doi:
FIBRILLATION

Ravikumar KE, Wu ST, Kullo IJ, Chute CG. An information extraction frame- 10.1016/j.jacc.2016.05.044
work for cohort identification using electronic health records. AMIA Jt Sum- 90. Ganesan AN, Chew DP, Hartshorne T, Selvanayagam JB, Aylward
mits Transl Sci Proc. 2013;2013:149–153. PE, Sanders P, McGavigan AD. The impact of atrial fibrillation type
72. Upadhyaya SG, Murphree DH Jr, Ngufor CG, Knight AM, Cronk DJ, on the risk of thromboembolism, mortality, and bleeding: a system-
Cima RR, Curry TB, Pathak J, Carter RE, Kor DJ. Automated diabe- atic review and meta-analysis. Eur Heart J. 2016;37:1591–1602. doi:
tes case identification using electronic health record data at a tertiary 10.1093/eurheartj/ehw007
care facility. Mayo Clin Proc Innov Qual Outcomes. 2017;1:100–110. doi: 91. Kaplan RM, Koehler J, Ziegler PD, Sarkar S, Zweibel S, Passman
10.1016/j.mayocpiqo.2017.04.005 RS. Stroke risk as a function of atrial fibrillation duration and
73. Afzal N, Sohn S, Abram S, Scott CG, Chaudhry R, Liu H, Kullo IJ, Arruda-Olson CHA2DS2-VASc score. Circulation. 2019;140:1639–1646. doi:
AM. Mining peripheral arterial disease cases from narrative clinical notes 10.1161/CIRCULATIONAHA.119.041303
using natural language processing. J Vasc Surg. 2017;65:1753–1761. doi: 92. Daccarett M, Badger TJ, Akoum N, Burgon NS, Mahnkopf C, Vergara G,
10.1016/j.jvs.2016.11.031 Kholmovski E, McGann CJ, Parker D, Brachmann J, et al. Association of
74. Liao KP, Cai T, Savova GK, Murphy SN, Karlson EW, Ananthakrishnan AN, left atrial fibrosis detected by delayed-enhancement magnetic resonance
Gainer VS, Shaw SY, Xia Z, Szolovits P, et al. Development of phenotype imaging and the risk of stroke in patients with atrial fibrillation. J Am Coll
algorithms using electronic medical records and incorporating natural lan- Cardiol. 2011;57:831–838. doi: 10.1016/j.jacc.2010.09.049
guage processing. BMJ. 2015;350:h1885. doi: 10.1136/bmj.h1885 93. Lupercio F, Carlos Ruiz J, Briceno DF, Romero J, Villablanca PA, Berardi C,
75. Zhang Y, Cai T, Yu S, Cho K, Hong C, Sun J, Huang J, Ho YL, Ananthakrishnan Faillace R, Krumerman A, Fisher JD, Ferrick K, et al. Left atrial appendage
AN, Xia Z, et al. High-throughput phenotyping with electronic medical morphology assessment for risk stratification of embolic stroke in patients
record data using a common semi-supervised approach (PheCAP). Nat with atrial fibrillation: a meta-analysis. Heart Rhythm. 2016;13:1402–
Protoc. 2019;14(12):3426–3444. doi: 10.1038/s41596-019-0227-6 1409. doi: 10.1016/j.hrthm.2016.03.042
76. Kirby JC, Speltz P, Rasmussen LV, Basford M, Gottesman O, Peissig 94. Hijazi Z, Lindbäck J, Alexander JH, Hanna M, Held C, Hylek EM,
PL, Pacheco JA, Tromp G, Pathak J, Carrell DS, et al. PheKB: a cata- Lopes RD, Oldgren J, Siegbahn A, Stewart RA, et al; ARISTOTLE
log and workflow for creating electronic phenotype algorithms for and STABILITY Investigators. The ABC (age, biomarkers, clinical his-
transportability. J Am Med Inform Assoc. 2016;23:1046–1052. doi: tory) stroke risk score: a biomarker-based risk score for predicting
10.1093/jamia/ocv202 stroke in atrial fibrillation. Eur Heart J. 2016;37:1582–1590. doi:
77. Carroll RJ, Thompson WK, Eyler AE, Mandelin AM, Cai T, Zink RM, Pacheco 10.1093/eurheartj/ehw054
JA, Boomershine CS, Lasko TA, Xu H, et al. Portability of an algorithm to 95. Maheshwari A, Norby FL, Roetker NS, Soliman EZ, Koene RJ,
identify rheumatoid arthritis in electronic health records. J Am Med Inform Rooney MR, O’Neal WT, Shah AM, Claggett BL, Solomon SD, et
Assoc. 2012;19:e162–e169. doi: 10.1136/amiajnl-2011-000583 al. Refining prediction of atrial fibrillation-related stroke using the
78. Kho AN, Hayes MG, Rasmussen-Torvik L, Pacheco JA, Thompson WK, P2-CHA2DS2-VASc score. Circulation. 2019;139:180–191. doi:
Armstrong LL, Denny JC, Peissig PL, Miller AW, Wei WQ, et al. Use of 10.1161/CIRCULATIONAHA.118.035411
diverse electronic medical record systems to identify genetic risk for type 2 96. Han L, Askari M, Altman RB, Schmitt SK, Fan J, Bentley JP, Narayan SM,
diabetes within a genome-wide association study. J Am Med Inform Assoc. Turakhia MP. Atrial fibrillation burden signature and near-term prediction
2012;19:212–218. doi: 10.1136/amiajnl-2011-000439 of stroke: a machine learning analysis. Circ Cardiovasc Qual Outcomes.
2019;12:e005595. doi: 10.1161/CIRCOUTCOMES.118.005595
Downloaded from http://ahajournals.org by on June 24, 2020

79. Ritchie MD, Denny JC, Zuvich RL, Crawford DC, Schildcrout JS, Bastarache
L, Ramirez AH, Mosley JD, Pulley JM, Basford MA, et al; Cohorts for Heart 97. Inohara T, Shrader P, Pieper K, Blanco RG, Thomas L, Singer DE,
and Aging Research in Genomic Epidemiology (CHARGE) QRS Group. Freeman JV, Allen LA, Fonarow GC, Gersh B, et al. Association of
Genome- and phenome-wide analyses of cardiac conduction identi- atrial fibrillation clinical phenotypes with treatment patterns and out-
fies markers of arrhythmia risk. Circulation. 2013;127:1377–1385. doi: comes: a Multicenter Registry Study. JAMA Cardiol. 2018;3:54–63. doi:
10.1161/CIRCULATIONAHA.112.000604 10.1001/jamacardio.2017.4665
80. Kawatkar A, Chu LH, Iyer R, Yen L, Chen W, Erder MH, Hodgkins P, Longstreth 98. Passman R, Leong-Sit P, Andrei AC, Huskin A, Tomson TT, Bernstein R, Ellis
G. Development and validation of algorithms to identify acute diverticulitis. E, Waks JW, Zimetbaum P. Targeted anticoagulation for atrial fibrillation
Pharmacoepidemiol Drug Saf. 2015;24:27–37. doi: 10.1002/pds.3708 guided by continuous rhythm assessment with an insertable cardiac moni-
81. Denny JC, Crawford DC, Ritchie MD, Bielinski SJ, Basford MA, Bradford Y, tor: the Rhythm Evaluation for Anticoagulation With Continuous Monitoring
Chai HS, Bastarache L, Zuvich R, Peissig P, et al. Variants near FOXE1 are (REACT.COM) Pilot Study. J Cardiovasc Electrophysiol. 2016;27:264–270.
associated with hypothyroidism and other thyroid conditions: using elec- doi: 10.1111/jce.12864
tronic medical records for genome- and phenome-wide studies. Am J Hum 99. Martin DT, Bersohn MM, Waldo AL, Wathen MS, Choucair WK, Lip GY, Ip
Genet. 2011;89:529–542. doi: 10.1016/j.ajhg.2011.09.008 J, Holcomb R, Akar JG, Halperin JL; IMPACT Investigators. Randomized
82. ClinicalTrials.gov. Batch Enrollment for AI-Guided Intervention to Lower trial of atrial arrhythmia monitoring to guide anticoagulation in patients with
Neurologic Events in Unrecognized AF. 2020. implanted defibrillator and cardiac resynchronization devices. Eur Heart J.
83. Krittanawong C, Rogers AJ, Aydar M, Choi E, Johnson KW, Wang Z, 2015;36:1660–1668. doi: 10.1093/eurheartj/ehv115
Narayan SM. Integrating blockchain technology with artificial intelli- 100. Hart RG, Sharma M, Mundl H, Kasner SE, Bangdiwala SI, Berkowitz SD,
gence for cardiovascular medicine. Nat Rev Cardiol. 2020;17:1–3. doi: Swaminathan B, Lavados P, Wang Y, Wang Y, et al; NAVIGATE ESUS
10.1038/s41569-019-0294-y Investigators. Rivaroxaban for stroke prevention after embolic stroke
84. Krittanawong C, Johnson KW, Rosenson RS, Wang Z, Aydar M, Baber U, of undetermined source. N Engl J Med. 2018;378:2191–2201. doi:
Min JK, Tang WHW, Halperin JL, Narayan SM. Deep learning for cardiovas- 10.1056/NEJMoa1802686
cular medicine: a practical primer. Eur Heart J. 2019;40:2058–2073. doi: 101. Diener HC, Sacco RL, Easton JD, Granger CB, Bernstein RA, Uchiyama S,
10.1093/eurheartj/ehz056 Kreuzer J, Cronin L, Cotton D, Grauer C, et al; RE-SPECT ESUS Steering
85. Hoffman C. Patient access to medical information can spur research. 2016. Committee and Investigators. Dabigatran for prevention of stroke after
86. Alkhouli M, Friedman PA. Ischemic stroke risk in patients with nonval- embolic stroke of undetermined source. N Engl J Med. 2019;380:1906–
vular atrial fibrillation: JACC review topic of the week. J Am Coll Cardiol. 1917. doi: 10.1056/NEJMoa1813959
2019;74:3050–3065. doi: 10.1016/j.jacc.2019.10.040 102. Zannad F, Anker SD, Byra WM, Cleland JGF, Fu M, Gheorghiade M,
87. Chao TF, Lip GY, Liu CJ, Tuan TC, Chen SJ, Wang KL, Lin YJ, Chang Lam CSP, Mehra MR, Neaton JD, Nessel CC, et al; COMMANDER HF
SL, Lo LW, Hu YF, et al. Validation of a modified CHA2DS2-VASc Investigators. Rivaroxaban in patients with heart failure, sinus rhythm,
score for stroke risk stratification in Asian patients with atrial fibrilla- and coronary disease. N Engl J Med. 2018;379:1332–1342. doi:
tion: a Nationwide Cohort Study. Stroke. 2016;47:2462–2469. doi: 10.1056/NEJMoa1808848
10.1161/STROKEAHA.116.013880 103. Siontis KC, Zhang X, Eckard A, Bhave N, Schaubel DE, He K, Tilea A,
88. Singer DE, Chang Y, Borowsky LH, Fang MC, Pomernacki NK, Udaltsova N, Stack AG, Balkrishnan R, Yao X, et al. Outcomes associated with apix-
Reynolds K, Go AS. A new risk scheme to predict ischemic stroke and other aban use in patients with end-stage kidney disease and atrial fibril-
thromboembolism in atrial fibrillation: the ATRIA study stroke risk score. J lation in the United States. Circulation. 2018;138:1519–1529. doi:
Am Heart Assoc. 2013;2:e000250. doi: 10.1161/JAHA.113.000250 10.1161/CIRCULATIONAHA.118.035418

168   June 19, 2020 Circulation Research. 2020;127:155–169. DOI: 10.1161/CIRCRESAHA.120.316401


Siontis et al Machine Learning in AF

104. Holmes DR Jr, Kar S, Price MJ, Whisenant B, Sievert H, Doshi SK, Huber K, 113. Attia ZI, Sugrue A, Asirvatham SJ, Ackerman MJ, Kapa S, Friedman PA,

COMPENDIUM ON ATRIAL
Reddy VY. Prospective randomized evaluation of the Watchman Left Atrial Noseworthy PA. Noninvasive assessment of dofetilide plasma concentra-
Appendage Closure device in patients with atrial fibrillation versus long- tion using a deep learning (neural network) analysis of the surface electro-
term warfarin therapy: the PREVAIL trial. J Am Coll Cardiol. 2014;64:1–12.

FIBRILLATION
cardiogram: a proof of concept study. PLoS One. 2018;13:e0201059. doi:
doi: 10.1016/j.jacc.2014.04.029 10.1371/journal.pone.0201059
105. Reddy VY, Sievert H, Halperin J, Doshi SK, Buchbinder M, Neuzil P, Huber 114. Hu YH, Tai CT, Tsai CF, Huang MW. Improvement of adequate digoxin
K, Whisenant B, Kar S, Swarup V, et al; PROTECT AF Steering Committee dosage: an application of machine learning approach. J Healthc Eng.
and Investigators. Percutaneous left atrial appendage closure vs warfarin 2018;2018:3948245. doi: 10.1155/2018/3948245
for atrial fibrillation: a randomized clinical trial. JAMA. 2014;312:1988– 115. Buch E, Share M, Tung R, Benharash P, Sharma P, Koneru J, Mandapati
1998. doi: 10.1001/jama.2014.15192 R, Ellenbogen KA, Shivkumar K. Long-term clinical outcomes of fo-
106. Siontis KC, Ioannidis JP, Katritsis GD, Noseworthy PA, Packer DL, Hummel cal impulse and rotor modulation for treatment of atrial fibrillation:
JD, Jais P, Krittayaphong R, Mont L, Morillo CA, et al. Radiofrequency abla- a multicenter experience. Heart Rhythm. 2016;13:636–641. doi:
tion versus antiarrhythmic drug therapy for atrial fibrillation: meta-analysis of 10.1016/j.hrthm.2015.10.031
quality of life, morbidity, and mortality. J Am Coll Cardiol EP. 2015;2:170–180. 116. Verma A, Jiang CY, Betts TR, Chen J, Deisenhofer I, Mantovan R,
107. Packer DL, Mark DB, Robb RA, Monahan KH, Bahnson TD, Poole JE, Macle L, Morillo CA, Haverkamp W, Weerasooriya R, et al; STAR
Noseworthy PA, Rosenberg YD, Jeffries N, Mitchell LB, et al; CABANA AF II Investigators. Approaches to catheter ablation for persis-
Investigators. Effect of catheter ablation vs antiarrhythmic drug therapy on tent atrial fibrillation. N Engl J Med. 2015;372:1812–1822. doi:
mortality, stroke, bleeding, and cardiac arrest among patients with atrial fi- 10.1056/NEJMoa1408288
brillation: the CABANA Randomized Clinical Trial. JAMA. 2019;321:1261– 117. Anter E, Di Biase L, Contreras-Valdes FM, Gianni C, Mohanty S, Tschabrunn
1274. doi: 10.1001/jama.2019.0693 CM, Viles-Gonzalez JF, Leshem E, Buxton AE, Kulbak G, et al. Atrial sub-
108. White HD, Gruber M, Feyzi J, Kaatz S, Tse HF, Husted S, Albers GW. strate and triggers of paroxysmal atrial fibrillation in patients with obstruc-
Comparison of outcomes among patients randomized to warfarin therapy tive sleep apnea. Circ Arrhythm Electrophysiol. 2017;10:e005407. doi:
according to anticoagulant control: results from SPORTIF III and V. Arch 10.1161/CIRCEP.117.005407
Intern Med. 2007;167:239–245. doi: 10.1001/archinte.167.3.239 118. Boyle PM, Zghaib T, Zahid S, Ali RL, Deng D, Franceschi WH, Hakim JB,
109. Grossi E, Podda GM, Pugliano M, Gabba S, Verri A, Carpani G, Buscema Murphy MJ, Prakosa A, Zimmerman SL, et al. Computationally guided per-
M, Casazza G, Cattaneo M. Prediction of optimal warfarin maintenance sonalized targeted ablation of persistent atrial fibrillation. Nat Biomed Eng.
dose using advanced artificial neural networks. Pharmacogenomics. 2019;3:870–879. doi: 10.1038/s41551-019-0437-9
2014;15:29–37. doi: 10.2217/pgs.13.212 119. Loring Z, Mehrotra S, Piccini JP. Machine learning in “big data”: handle with
110. Ma Z, Wang P, Gao Z, Wang R, Khalighi K. Ensemble of machine learning al- care. Europace. 2019;21:1284–1285. doi: 10.1093/europace/euz130
gorithms using the stacked generalization approach to estimate the warfarin 120. Obermeyer Z, Powers B, Vogeli C, Mullainathan S. Dissecting racial
dose. PLoS One. 2018;13:e0205872. doi: 10.1371/journal.pone.0205872 bias in an algorithm used to manage the health of populations. Science.
111. Yao X, Abraham NS, Alexander GC, Crown W, Montori VM, Sangaralingham 2019;366:447–453. doi: 10.1126/science.aax2342
LR, Gersh BJ, Shah ND, Noseworthy PA. Effect of adherence to 121. Parikh RB, Teeple S, Navathe AS. Addressing bias in artifi-
oral anticoagulants on risk of stroke and major bleeding among pa- cial intelligence in health care. JAMA. 2019;322:2377–2378. doi:
tients with atrial fibrillation. J Am Heart Assoc. 2016;5:e003074. doi: 10.1001/jama.2019.18058
10.1161/JAHA.115.003074 122. Hosny A, Aerts HJWL. Artificial intelligence for global health. Science.
112. Labovitz DL, Shafner L, Reyes Gil M, Virmani D, Hanina A. Using 2019;366:955–956. doi: 10.1126/science.aay5189
artificial intelligence to reduce the risk of nonadherence in pa- 123. Sullivan HR, Schweikart SJ. Are current tort liability doctrines adequate for
Downloaded from http://ahajournals.org by on June 24, 2020

tients on anticoagulation therapy. Stroke. 2017;48:1416–1419. doi: addressing injury caused by AI? AMA J Ethics. 2019;21:E160–E166. doi:
10.1161/STROKEAHA.116.016281 10.1001/amajethics.2019.160

Circulation Research. 2020;127:155–169. DOI: 10.1161/CIRCRESAHA.120.316401 June 19, 2020   169
