
AI IN HEALTHCARE

CMED 301
COLLEGE OF MEDICINE, KING SAUD UNIVERSITY
Shauna M Overgaard, PhD

Center for Digital Health


Mayo Clinic

©2022 Mayo Foundation for Medical Education and Research | slide-1


DISCLOSURES

• I do not intend to discuss an off-label/investigative use of a commercial


product/device.
• I do not have financial relationships with entities producing, marketing,
selling, reselling, or distributing health care products used by or on
patients.



LEARNING OBJECTIVES

1. Define the Role of AI in Healthcare
2. Identify Key AI Concepts
3. Recognize AI Applications in Diagnosis and Treatment
4. Discuss Enhanced Patient Care with AI
5. Analyze Ethical and Regulatory Aspects
6. Anticipate Future AI Trends in Healthcare
7. Analyze Case Studies and Real-World Examples



1 DEFINE THE ROLE OF AI IN
HEALTHCARE



Artificial Intelligence

Machine Learning

Deep Learning

Administrative Efficiency Diagnosis and Disease Detection

Predictive Analytics Drug Discovery and Development

2 IDENTIFY KEY AI CONCEPTS

ARTIFICIAL INTELLIGENCE KEY CONCEPTS

Self-supervised learning: learning from unlabeled data by leveraging information extracted from the data itself

Semi-supervised learning: learning from a small amount of labeled data combined with a large amount of unlabeled data

Causal inference: finding the effect of a component or treatment on a system using data

Reinforcement learning: learning in an interactive environment using feedback from actions and past experiences

AI setups beyond supervised learning

Rajpurkar et al., 2022, Nat Med
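Of these setups, semi-supervised learning is the easiest to sketch with standard tooling. A minimal illustration (not from the lecture) using scikit-learn's SelfTrainingClassifier on synthetic data: unlabeled samples are marked with -1, and the model pseudo-labels the points it is confident about before refitting.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.semi_supervised import SelfTrainingClassifier

# Toy dataset: 200 samples, but only the first 20 keep their labels;
# the rest are marked -1, scikit-learn's convention for "unlabeled".
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
y_partial = y.copy()
y_partial[20:] = -1

# Self-training: fit on the labeled subset, pseudo-label confident
# unlabeled points, and refit until no confident points remain.
model = SelfTrainingClassifier(LogisticRegression(max_iter=1000))
model.fit(X, y_partial)
print(f"Accuracy against all true labels: {model.score(X, y):.2f}")
```

In a clinical setting the analogue would be a small set of chart-reviewed cases plus a large pool of unreviewed records.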



OVERVIEW OF THE
PROGRESS,
CHALLENGES AND
OPPORTUNITIES FOR
AI IN HEALTH

Rajpurkar et al., 2022, Nat Med



3 RECOGNIZE AI APPLICATIONS IN
DIAGNOSIS AND TREATMENT

4 DISCUSS ENHANCED PATIENT
CARE WITH AI

Attia, 2019, Lancet
Polyp and Adenoma/Carcinoma Detection Rates in the Second Colonoscopy (Per Patient
Analysis): FAS Population

Wallace et al., 2022, Gastroenterology


https://unsplash.com/photos/snNHKZ-mGfE
NATURAL
LANGUAGE
PROCESSING
• An NLP algorithm was developed
to identify twenty specific skeletal
site fractures from radiology
reports.
• Empirical experiments were
conducted to validate the
effectiveness of the NLP algorithm
using radiology reports from a
community-based cohort at Mayo
Clinic.
• Microaveraged results of the NLP
algorithm for the twenty fractures
include sensitivity (0.930),
specificity (1.0), PPV (positive
predictive value, 1.0), NPV
(negative predictive value, 0.941),
and F1-score (0.961).
Wang, 2019, BMC
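Metrics like those reported above follow directly from pooled (micro-averaged) confusion-matrix counts. A sketch with illustrative counts, not the study's actual numbers:

```python
# Micro-averaged classification metrics from pooled confusion-matrix counts.
# tp/fp/tn/fn below are illustrative, not the Mayo study's actual figures.
tp, fp, tn, fn = 93, 0, 100, 7

sensitivity = tp / (tp + fn)   # recall
specificity = tn / (tn + fp)
ppv = tp / (tp + fp)           # positive predictive value (precision)
npv = tn / (tn + fn)           # negative predictive value
f1 = 2 * ppv * sensitivity / (ppv + sensitivity)

print(f"sensitivity={sensitivity:.3f} specificity={specificity:.3f} "
      f"PPV={ppv:.3f} NPV={npv:.3f} F1={f1:.3f}")
```

With zero false positives, specificity and PPV are exactly 1.0, which is why the F1-score here is dominated by sensitivity.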
NURSING CARE

• AI-enhanced patient monitoring systems, particularly for tracking vital signs, reduced nurses' workload, giving them more time for non-routine and less mentally taxing tasks.

• AI systems are trained to automatically monitor patient data and detect abnormalities, especially in critical care situations. Timely alerts can help in managing patients effectively.

• Real-world implementation remains limited.

Diagrammatic summary of the mechanism by which artificial intelligence (AI) improves nursing care
Ng, 2021, J Nurs Manag
Overview of the analytical pipeline to predict behavioral impairments in 100 stroke patients based on
structural and functional MRI.
Bonkhoff, 2022, Brain
Bonkhoff, 2022, Brain

SVM-based prediction of motor deficits after stroke
DERMATOLOGY

• Teledermatology using AI offers access to dermatological expertise.

• Nondermatologist clinicians gain confidence and improve the appropriateness of referrals to dermatologists.

• AI can reduce unnecessary referrals and enhance the detection of medically concerning cases, potentially expediting access to dermatologists.

• Dermatologists have a crucial role in ensuring that AI is patient-centered, clinician-led, high-quality, equitable, and accessible.

Application of deep neural networks in the identification of skin neoplasms.
Beltrami, 2022, JMACD
5 ANALYZE ETHICAL AND
REGULATORY ASPECTS



SHARED ACCOUNTABILITY
EXAMPLES

Physician
Standard of Care
User

Developer
Product Claims
Product Design
Product Testing
Product Changes

Healthcare Org
Vendor Assessment
Evaluating fitness for the org.
Integration
Scope of Use


BUILDING A HEALTHCARE AI FRAMEWORK
HUB AND SPOKE MODEL



ENTERPRISE AI TRANSLATIONAL ADVISORY BOARD



USA STANDARDS,
IS YOUR PRODUCT A MEDICAL DEVICE?
ANALYZING EACH PRODUCT



TRANSLATING FROM RESEARCH TO PRACTICE
HEALTHCARE ORGANIZATIONS ENTERING NEW TERRITORY

Research Black Box Clinical Application



AI/ML RISKS

IEC 62366-1: 2015

TIR34971:2023

https://www.mckinsey.com/business-functions/quantumblack/our-insights/confronting-the-risks-of-artificial-intelligence

Evaluate and consider AI/ML risks for downstream impact


OUR DEPLOYMENT OPTIONS
BUILDING ON A FIRM FOUNDATION



MAYO CLINIC
SAMD REVIEW BOARD
SAFE, EFFECTIVE, AND ETHICAL
SOFTWARE FOR OUR PATIENTS

Regulatory Determinations

Risk Assessment

Recommended Risk Controls



SOFTWARE AS A MEDICAL DEVICE

Regulatory Strategy: Work with the SaMD Review Board to create a regulatory pathway.

Roles & Responsibilities: Determine who needs to do what, when.

Development & Testing Process: Follow the regulations to prove your medical device system is designed, developed, and tested to be safe and effective.

Validation Study: Work with the IRB to conduct a study that proves you built the right system in its intended setting.

Submit to the FDA: Put all your hard work together and submit to the FDA.

Post Deployment Operations: After deployment, conduct oversight, monitoring, and change management.



CHALLENGES

Ethical challenges for AI in medicine

Rajpurkar et al., 2022, Nat Med



AI FAIRNESS

Conceptual model towards clinical AI fairness.


Liu et al., NPJ Digit Med. 2023
VULNERABLE POPULATIONS
AGE
- Include the patient's chronological age at the time of study enrollment
- When applicable and available, include the patient's developmental age. If unavailable, state as such
- Attempt to include developmental stages and relevant milestone metrics of CYP (e.g., height and weight percentile upon enrollment) to capture the heterogeneity of participants. If developmental metrics are unavailable, state them as such
- Include the age(s) of intended algorithm users, e.g., pediatric only, pediatric and adult, or adult only

COMMUNICATION
- Communication of study purpose to CYP as key stakeholders with developmentally appropriate communication strategies
- Communication with parent(s) or legal guardians as key stakeholders
- Tailor communication to social circumstance, addressing family complexities, including any court involvement
- Clear communication of technology-specific study purpose, risks, benefits, and alternatives with all key stakeholders
- Consider the use of videos, written material, and decision aids to facilitate education and enhance communication
- Involve stakeholders, including CYP and parents, in focus groups for design feedback where possible and relevant legal and institutional permissions are obtained
- State efforts taken to involve potential users in feedback on the research idea and invest in community-level digital literacy
- Where possible, document and articulate model explainability

CONSENT AND ASSENT
- Record mode of consent, who provided consent (e.g., parent, legal guardian), and how it was obtained
- Document any complex parental relations, dynamics, or court involvement that impact consent
- Document children's social circumstances as relevant to safety, participation, and evaluation
- For children in state custody, ensure consent is obtained from relevant legal guardians or custodians and documented accurately
- Document relevant child protection laws pertinent to individual cases
- Attain assent when developmentally appropriate and/or required by regulations
- Ensure minors participate in the assent process in accordance with their developmental skills (e.g., appropriate modifications for children with clinically relevant developmental delay)
- Record age when assent is provided
- Ensure local laws for adolescent assent/consent are followed

EQUITY
- Ensure inclusion and exclusion criteria are clearly defined and specify the disease, symptom, or condition of interest, with developmental stages considered as appropriate
- State processes employed to reduce selection bias
- Transparent demographic reporting, including race, documented sex, gender, and socioeconomic factors
- Provide details on how gender and documented sex have been incorporated into the study design
- Incorporate accessible research design to facilitate the inclusion of patients with disabilities (developmental and otherwise)
- If skin tone could influence algorithmic outputs, ensure it is documented
- Indicate the source of demographic information (e.g., self-reported) as well as details on non-reporting and missingness
- Discuss the role of community engagement in the study

PROTECTION OF DATA
- State how data collection aligns with study objectives
- State data-sharing plans when relevant
- State if data is identifiable or de-identified
- If data is de-identified, state compliance with the relevant legal frameworks, e.g., HIPAA, Common Rule, GDPR
- State data protection plans, addressing unique data risks in AI/ML, including protections against cybersecurity breaches
- Disclose whether data can or cannot be retrieved/removed in the future by parents and CYP
- Ensure the social context of the child (e.g., suspected or confirmed child abuse or complex social circumstance) is accounted for, if available, prior to any data releases that may involve parental requests or involvement

TECHNOLOGICAL CONSIDERATIONS (TRANSPARENCY OF TECHNIQUES, TRAINING, AND TESTING METHODOLOGY)
- Ensure algorithmic studies are tailored to the needs of the pediatric population and clearly documented in the study protocol
- Ensure AI/ML techniques are only used when potentially beneficial to the pediatric population, and that such benefits are clearly detailed in the study protocol
- Detail any potential harms that pediatric subjects may incur as a result of the study
- Identify measures taken to minimize risk to pediatric subjects throughout the study and post-implementation
- State measures taken to monitor and document adverse events that may affect pediatric subjects
- State outcome measures and plans to clinically evaluate performance of algorithms on pediatric subjects
- When available, utilize validated pediatric clinical scales in the clinical algorithm evaluation
- Articulate how AI/ML will be trained to recognize/account for developmental heterogeneity
- Document AI/ML methods using validated guidelines, e.g., CONSORT-AI and SPIRIT-AI
- Define data input and output (e.g., images, text) as well as the source (e.g., public dataset)
- Account for age-specific factors related to disability and developmental conditions (e.g., natural disease progression) as relevant in study design, testing, and evaluation
- State if the study involves adult, pediatric, or mixed data in training and/or testing
- If the study involves both adult and pediatric data, state the purpose for this combination
- If the study involves both adult and pediatric data, state whether the same or separate algorithms were used to assess each group

ACCEPT-AI Framework: Key recommendations for pediatric data use in AI/ML research

Muraldharan et al., 2023, npj Dig Med


AI BIAS

• Bias can originate at various stages of the TPLC and can be addressed using specific metrics and tailored approaches.

• Following an expanded TPLC approach, equity analysis and bias mitigation will help stakeholders understand how bias may affect healthcare decisions and outcomes.

Total Product LifeCycle (TPLC) equity expanded framework with examples for each phase.
Abramoff et al., 2023, npj Dig Med
EXAMPLES OF AI BIAS SOURCES

• Biases in study inclusion criteria, e.g., using eGFR to select study participants
• Study data collection biases (including misclassification of race/ethnicity)
• Lack of representation / selection in the dataset
• Missing data
• Biases in imputed data
• Biases in learning and training data
• Collapsing race variables
• Insufficient sample size
• Labeling bias
• Overfitting
• Interpretation bias
• Correlation bias
• Training-validation data skew
• Lack of external validation
• Lack of performance assessments, such as calibration and discrimination
• Lack of reporting for methodological approach
DATA GENERATION

A Framework for Understanding Sources of Harm throughout the Machine Learning Life Cycle
Harini Suresh & John Guttag, 2021



DATA GENERATION BIAS CONSIDERATIONS
Bias type: Historical / preexisting
Risk: Existing societal stereotypes reflected in datasets.
Potential mitigation: Debiasing Variational Autoencoder (Amini et al., 2019)

Bias type: Representation
Risk: Nonrepresentative data for the population, leading to systematic errors in ML model predictions.
Potential mitigation: Targeted data augmentation (Sontag & Johansson, 2018)

Bias type: Measurement / cognitive
Risk: Chosen features and labels are imperfect proxies for the real variables of interest.
Potential mitigation: Exchanging with domain experts about features (Baer, 2019)


MODEL BUILDING AND IMPLEMENTATION

A Framework for Understanding Sources of Harm throughout the Machine Learning Life Cycle
Harini Suresh & John Guttag, 2021



MODEL BUILDING AND IMPLEMENTATION BIAS
CONSIDERATIONS
Bias type: Learning
Risk: Amplification of performance disparities.
Potential mitigation: Reevaluate model design choices (Hooker, 2021)

Bias type: Aggregation
Risk: Incorrect assumption that the mapping from inputs to labels is consistent across subsets of the data.
Potential mitigation: Coupled learning methods (Dwork et al., 2018)

Bias type: Evaluation
Risk: A non-representative testing population or inappropriate performance metrics are used to evaluate the ML model.
Potential mitigation: Representativeness of benchmark dataset (Ryu et al., 2017); subgroup validity (Suresh & Guttag, 2019)

Bias type: Deployment
Risk: The ML model is used and interpreted in a different context than it was built for.
Potential mitigation: Holistic evaluation (Estiri et al., 2022); monitoring plan & human supervision (Giffen et al., 2022)
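The evaluation row above suggests checking performance on subgroups rather than only in aggregate. A toy sketch of that check, with synthetic scores and group labels that are purely illustrative:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Hypothetical risk scores for 400 patients from two subgroups
# (toy data, not from any real model or cohort).
y_true = rng.integers(0, 2, size=400)
group = rng.integers(0, 2, size=400)

# Scores are informative overall...
scores = 0.6 * y_true + 0.7 * rng.random(400)
# ...but deliberately degraded for group 1 to mimic a subgroup gap.
mask1 = group == 1
scores[mask1] = 0.5 * scores[mask1] + 0.5 * rng.random(mask1.sum())

# The overall AUC can hide the gap; per-subgroup AUC surfaces it.
print(f"overall: AUC = {roc_auc_score(y_true, scores):.3f}")
for g in (0, 1):
    m = group == g
    print(f"group {g}: AUC = {roc_auc_score(y_true[m], scores[m]):.3f}")
```

The same per-subgroup pattern applies to calibration and other metrics named elsewhere on these slides.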



AI EVALUATION: PROCESS STANDARDIZATION
Leveraging Clinical Research Pathway

Artificial Intelligence Translation in Health Care: Framework for Evaluation and Documentation.
Overgaard, et al. [V2 in progress] V1 publication: https://pubmed.ncbi.nlm.nih.gov/35854754/



MODEL DOCUMENTATION FRAMEWORK

Phase 1: Prepare. Phase subgroups: Patient Impact, Purpose and Indications.

Phase 2: Develop. Phase subgroups: Model Planning and Architecture, Risk Assessment, Usability Formative.

Phase 3: Validate. Phase subgroups: Deployment Planning, Clinical Validation and Reporting, Corrective and Preventative Actions.

Phase 4: Deploy. Phase subgroups: User Education and Training, Risk, Quality Monitoring and Audit.

Phase 5: Maintain. Phase subgroups: Post-Deployment Monitoring and Maintenance.

Reporting Guidelines Used: CONSORT-AI, Model Card, TRIPOD | Updates include IEC 62304, ISO 14971
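One lightweight way to operationalize a phase-based documentation framework like this is a structured record kept alongside each model. A hypothetical sketch; the field names are illustrative, not Mayo's actual schema:

```python
# A minimal, hypothetical model-documentation record spanning the five
# phases above. Every field name here is illustrative.
model_card = {
    "prepare": {"intended_use": "pediatric asthma risk prediction",
                "patient_impact": "supports, never replaces, clinician judgment"},
    "develop": {"architecture": "logistic regression",
                "risk_assessment": "completed"},
    "validate": {"clinical_validation": "prospective study", "capa_log": []},
    "deploy": {"user_training": True, "monitoring_plan": "monthly drift review"},
    "maintain": {"audit_interval_days": 90},
}

for phase, fields in model_card.items():
    print(phase, "->", ", ".join(fields))
```

Keeping the record machine-readable makes it easy to check completeness automatically before a model advances to the next phase.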



CHAI/NIST-ALIGNED CORE HEALTH AI PRINCIPLES

• Privacy, Security & Resilience
• Explainability & Interpretability
• Fairness & Equity
• Trustworthiness
• Accountability & Transparency
• Usefulness
• Safety

Aligned to the NIST AI Risk Management Framework and the White House Blueprint for an AI Bill of Rights



HOW CAN WE ENSURE WE ARE ADVOCATING FOR
THE SAFETY AND EFFICACY OF AI ON BEHALF OF
OUR PATIENTS?
• What policies, processes, procedures, and practices across the organization related to the mapping, measuring, and managing of AI risks are in place, transparent, and implemented effectively?

• What are we doing to ensure that workforce diversity, equity, inclusion, and accessibility processes are prioritized in the mapping, measuring, and managing of AI risks throughout the lifecycle?

• What accountability structures are in place so that the appropriate teams and individuals are empowered, responsible, and trained for mapping, measuring, and managing AI risks?

• How are we building organizational teams that are committed to a culture that considers and communicates risk?

• What processes are in place for robust stakeholder engagement?

• Are there clear policies and procedures in place to address AI risks?
WORK GROUPS
Aligned to
NAM Code of
Conduct

Assurance Standards / Rubric Work Group (Leads), Maturity Framework Work Group, Evaluation Sandbox Work Group, Registration Portal Work Group, Assurance Lab Work Group

Privacy & Security Work Group: Privacy-Enhanced; Secure & Resilient
Fairness Work Group: Fair, with harmful bias managed (systemic, computational, statistical, human-cognitive)
Transparency Work Group: Accountable; Transparent; Explainable; Interpretable
Usefulness Work Group: Valid for accuracy, operability, and meeting its intended purpose and benefit (clinical validation); Testable; Reliable; Usable; Robust/Generalizable
Safety Work Group: Safe
CHAI Community Work Group


HOW CAN WE ENSURE WE ARE ADVOCATING FOR
THE SAFETY AND EFFICACY OF AI ON BEHALF OF
OUR PATIENTS?
• What policies, processes, procedures, and practices across the organization related to the mapping, measuring, and managing of AI risks are in place, transparent, and implemented effectively?

• What are we doing to ensure that workforce diversity, equity, inclusion, and accessibility processes are prioritized in the mapping, measuring, and managing of AI risks throughout the lifecycle?

• What accountability structures are in place so that the appropriate teams and individuals are empowered, responsible, and trained for mapping, measuring, and managing AI risks?

• How are we building organizational teams that are committed to a culture that considers and communicates risk?

• What processes are in place for robust stakeholder engagement?

• Are there clear policies and procedures in place to address AI risks?
The Global Digital
Health
Partnership
• Take a global approach to
evidence building
• Sharing Insights and
considering risk-based
frameworks as a priority
in digital health
evaluation
• Evolve the workstream;
move towards ‘Evidence
Translation and
Implementation’

https://gdhp.health/
https://www.itu.int/en/ITU-T/focusgroups/ai4h/Pages/default.aspx
6 ANTICIPATE FUTURE AI TRENDS
IN HEALTHCARE



FUTURE OPPORTUNITIES FOR AI IN HEALTHCARE

• Adaptive AI
• Generative AI
• Diverse use cases



CONVENTIONAL VS. NEW

Opportunities for the development of AI algorithms.

Rajpurkar et al., 2022, Nat Med


PRINCIPLES FOR RESPONSIBLE DEVELOPMENT
1. Alleviate healthcare disparities
• What health disparities are reported for the present AI application?
• How can the AI tool be designed to be accessible to and improve outcomes for the disadvantaged population?
• What clinical interventions are needed to realize the benefit, and are these accessible?
• How can data collection be supported in underserved communities for tool retraining over time?

2. Report clinically meaningful outcomes
• How is clinical benefit defined in this domain?
• What is the present threshold for the clinical benefit of existing tools, and how can the AI tool improve upon this threshold?

3. Reduce overdiagnosis and overtreatment
• What disease state is an overdiagnosis?
• For every case of overdiagnosis, what are the downstream costs to the patient and healthcare system?
• How can this AI application reduce the number of overdiagnoses compared to existing approaches?

4. Have high healthcare value
• Is this AI tool addressing a high-priority healthcare need?
• What would be the cost to the healthcare system in implementation, maintenance, and update?
• What would be the cost to the patient who does and does not benefit from this tool?
• Does this tool have high healthcare value, and if not, how can it be improved?

5. Incorporate biography
• What biographical data can be collected or carefully coded for the intended population?
• How do these factors vary in the intended population?
• How can these factors be included when developing AI tools?

6. Be easily tailored to the local population
• Can the training features be easily collected in different settings?
• Are these features reliable for training across different populations?
• Will the AI/ML workflow be made open-access?

7. Promote a learning healthcare system
• How will this AI application be evaluated over time, and at what intervals?
• What are acceptable thresholds for performance?
• How will the evaluation results contribute to continuous improvement?

8. Facilitate shared decision-making
• Have AI explainability tools been explored and utilized?
• Do clinicians and patients find the explainability results helpful?
• Have simpler, explainable algorithms been tried and compared to 'black-box' algorithms to determine if a simpler model performs just as well?
• How can patient values be easily integrated into the use of the AI tool?

Questions that can be used when considering each principle in the AI development process.
Badal et al., 2023, Commun Med
7 ANALYZE CASE STUDIES AND
REAL-WORLD EXAMPLES



PHASED RESEARCH FRAMEWORK OF A-GPS

Stage I paper: A Technical Performance Study and Proposed Systematic and Comprehensive Evaluation of an ML-based CDS Solution for Pediatric Asthma (1)

Stage II paper

Stage III paper

2022 Informatics Summit | amia.org 63


AIM OF A-GPS
(ASTHMA GUIDANCE AND PREDICTION SYSTEM)
• Predict: predict the likelihood that the patient will experience an asthma exacerbation (AE)

• Explain: explain the predicted risk by detailing the relevant factors contributing to the predicted likelihood of AE

• Contextualize: place the outcome within a visualization of relevant clinical information pertaining to a patient's asthma status

• Operationalize: offer actionable, timely interventions (precision asthma care)

METHODOLOGY

MODEL DEVELOPMENT
Cohort Evaluation
Patients aged 6-17 with a diagnosis of active asthma (≥1 clinical visit with an associated asthma diagnosis in the last 12 months).

Model Feature Selection
28 variables, including both structured and unstructured data, identified as clinically significant factors by the domain experts. Unstructured data: NLP with MedTagger. Structured data: queried from the Mayo database IE architecture using SQL.

Data Quality Evaluation
Data definition and category for each feature; data provenance and distribution of data availability (missingness); distribution of comorbidities aligned with clinical criteria, clinical description, and interpretation.

Model Evaluation
Conducted using 5-fold cross-validation; binary dependent variable evaluated with the area under the receiver operating characteristic curve (AUC-ROC); F1 evaluation of output at a 0.5 positive/negative threshold.

DATA CARD
Metadata Prototype Data Card Example



RESULTS
Prediction Task: predict asthma exacerbation (AE) within the next 12 months

Comparison of AUC score of various models

Logistic Regression model output
Threshold Mean F1
0.1 0.421527
0.2 0.522789
0.3 0.531575
0.4 0.484955
0.5 0.456748
0.6 0.430991
0.7 0.4161
0.8 0.388956
0.9 0.339465

Final parameters chosen for the logistic regression model were: C=0.23357214690901212, max_iter=10000, penalty='l1', solver='liblinear'. All other parameters of logistic regression were applied with default values defined in the Scikit-learn package.
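The reported setup can be approximated in a few lines of scikit-learn. The sketch below uses the hyperparameters quoted above but synthetic data in place of the A-GPS cohort, so its printed F1 values will not match the table:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import cross_val_predict

# Synthetic, imbalanced stand-in for the A-GPS cohort
# (the real inputs are 28 clinical variables).
X, y = make_classification(n_samples=500, n_features=28,
                           weights=[0.7], random_state=0)

# Hyperparameters reported on the slide; everything else is the
# scikit-learn default, as the slide states.
clf = LogisticRegression(C=0.23357214690901212, max_iter=10000,
                         penalty="l1", solver="liblinear")

# 5-fold cross-validated probabilities, then F1 at a threshold sweep,
# mirroring the threshold/F1 table above.
proba = cross_val_predict(clf, X, y, cv=5, method="predict_proba")[:, 1]
for threshold in (0.3, 0.5, 0.7):
    f1 = f1_score(y, (proba >= threshold).astype(int))
    print(f"threshold={threshold:.1f}  F1={f1:.3f}")
```

Sweeping the threshold as in the table is what lets the team pick an operating point (here, reportedly 0.3) instead of defaulting to 0.5.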

MULTIDISCIPLINARY TEAM
• United objective to provide solutions that are:
1. People centric
2. Value added
3. Evidence-based

Multidisciplinary Team to Facilitate Translation

Translational Informatics Analysts, Clinicians, Data Scientists, Ethicists, MLOps, UX Researchers and Designers, Product Owner, Operational Staff, Patients, Quality Management Team

TRANSLATION AND DEPLOYMENT OF A-GPS
Translation and Deployment of AI Models into Production

Diagram Credit: John Skiffington, MGR SaMD Software Engineering

PHASED RESEARCH FRAMEWORK AND
MODEL DOCUMENTATION FOR EVALUATION
OF A-GPS
Phased Research Framework and Model Documentation for Evaluation of AI models

Based on clinical research phases for AI evaluation: Park Y, Jackson GP, Foreman MA, Gruen D, Hu J, Das AK. Evaluating artificial intelligence in medicine: phases of clinical research. JAMIA Open. 2020;3(3):326-31.
AI/ML Documentation
• Primary values driving the model documentation framework: explainability, transparency, accountability, trustworthiness
• Harmonizes AI development and ethical considerations
• Complements and works in tandem with model development
• Communicates model components and features
• Addresses limitations and risks
• Avoids complex "black-box" models
• Values and documents transparency and accountability in model development to standardize explainable AI and gain trust

Accountability: clarity about who is responsible for the outcomes of technology use, for when things go wrong, and how those responsible should be held to account (IEEE)

Explainability: the extent to which information is made transparently available to stakeholders, at a level and in a meaningful form that can be readily interpreted and understood (IEEE)

Transparency: the transfer of truthful, relevant information, presented from the autonomous system or its designers to stakeholders (IEEE)

Trustworthiness: earns and sustains user engagement and confidence, which is crucial for successful model application; created and maintained in three ways: Human Trust, Technical Trust, and Regulatory Trust (CTA and ANSI)

Sources: ethical-dilemmas-in-ai-report.pdf / ANSI/CTA "The Use of Artificial Intelligence in Health Care: Trustworthiness"



MODEL DOCUMENTATION FRAMEWORK

Phase 1: Prepare. Phase subgroups: Patient Impact, Purpose and Indications.

Phase 2: Develop. Phase subgroups: Model Planning and Architecture, Risk Assessment, Usability Formative.

Phase 3: Validate. Phase subgroups: Deployment Planning, Clinical Validation and Reporting, Corrective and Preventative Actions.

Phase 4: Deploy. Phase subgroups: User Education and Training, Risk, Quality Monitoring and Audit.

Phase 5: Maintain. Phase subgroups: Post-Deployment Monitoring and Maintenance.

Reporting Guidelines Used: CONSORT-AI, Model Card, TRIPOD | Updates include IEC 62304, ISO 14971



MODEL DOCUMENTATION FRAMEWORK

Tool Link


REFERENCES (1/3)
• Harini Suresh and John Guttag. 2021. A Framework for Understanding Sources of Harm throughout the Machine
Learning Life Cycle. In Equity and Access in Algorithms, Mechanisms, and Optimization (EAAMO '21). Association
for Computing Machinery, New York, NY, USA, Article 17, 1–9. https://doi.org/10.1145/3465416.3483305
• Overgaard SM, Peterson KJ, Wi CI, et al. A Technical Performance Study and Proposed Systematic and Comprehensive
Evaluation of an ML-based CDS Solution for Pediatric Asthma. AMIA Annu Symp Proc. 2022;2022:25-35. Published 2022
May 23. https://pubmed.ncbi.nlm.nih.gov/35854754/
• Alexander Amini, Ava Soleimany, Wilko Schwarting, Sangeeta Bhatia, and Daniela Rus. Uncovering and mitigating
algorithmic bias through learned latent structure. AAAI/ACM Conference on Artificial Intelligence, Ethics, and Society.
2019.
https://static1.squarespace.com/static/5e96377a2c8aee64d168790e/t/6193f15d3ab904678ab86953/1637085537126
/Amini+-+2019.pdf
• Sontag, David and Johansson, Fredrik D. 2018. "Why is my classifier discriminatory?." Advances in Neural Information
Processing Systems, 2018-December. https://dspace.mit.edu/bitstream/handle/1721.1/137319/NeurIPS-2018-why-is-
my-classifier-discriminatory-Paper.pdf?sequence=2&isAllowed=y
• Baer, Tobias. Understand, manage, and prevent algorithmic bias: A guide for business users and data scientists. New
York, NY: Apress, 2019. https://link.springer.com/book/10.1007/978-1-4842-4885-0
• Hossein Estiri, Zachary H Strasser, Sina Rashidian, Jeffrey G Klann, Kavishwar B Wagholikar, Thomas H McCoy, Jr, Shawn
N Murphy, An objective framework for evaluating unrecognized bias in medical AI models predicting COVID-19
outcomes, Journal of the American Medical Informatics Association, Volume 29, Issue 8, August 2022, Pages 1334–
1341, https://doi.org/10.1093/jamia/ocac070

©2022 Mayo Foundation for Medical Education and Research | slide-76


REFERENCES (2/3)

• Suresh, H., & Guttag, J. V. (2019). A framework for understanding unintended consequences of machine learning. arXiv
preprint arXiv:1901.10002.
• Cynthia Dwork, Nicole Immorlica, Adam Tauman Kalai, and Max Leiserson. 2018. Decoupled Classifiers for Group-Fair
and Efficient Machine Learning. In Proceedings of the 1st Conference on Fairness, Accountability and Transparency
(Proceedings of Machine Learning Research, Vol. 81), Sorelle A. Friedler and Christo Wilson (Eds.). PMLR, New York, NY,
USA, 119–133. http://proceedings.mlr.press/v81/dwork18a.html
• Sara Hooker. 2021. Moving beyond “algorithmic bias is a data problem”. Patterns 2, 4 (2021), 100241.
• ANSI/CTA-2090, The Use of Artificial Intelligence in Health Care: Trustworthiness.
https://shop.cta.tech/products/the-use-of-artificial-intelligence-in-healthcare-trustworthiness-cta-2090
• Brereton TA, Malik M, Lifson MA, Greenwood JD, Peterson KJ, Overgaard SM. The Role of AI Model Documentation in
Translational Science: A Scoping Review. medRxiv. 2023:2023.01.21.23284858. doi:10.1101/2023.01.21.23284858.
https://www.medrxiv.org/content/10.1101/2023.01.21.23284858v1
• Brereton TA, Overgaard SM, et al. A Proposed Model Documentation Framework to Facilitate Translation of AI Models
in Healthcare. CIC AMIA 2022; invited for Applied Clinical Informatics Special Issue (under review). Model
Documentation Framework Prototype: https://docs.google.com/spreadsheets/d/1jenXP5miRxcteV6XRU71e-
A7sJB46Ztz/edit?usp=sharing&ouid=101115941916710306157&rtpof=true&sd=true
• Coalition for Health AI: Blueprint for Trustworthy AI Implementation Guidance and Assurance for Healthcare:
https://www.coalitionforhealthai.org/insights
• National Institute of Standards and Technology, AI Risk Management Framework: https://www.nist.gov/itl/ai-risk-
management-framework
©2022 Mayo Foundation for Medical Education and Research | slide-77
REFERENCES (3/3)
• Doyen S, Dadario NB. 12 Plagues of AI in Healthcare: A Practical Guide to Current Issues With Using Machine Learning in a Medical Context.
Front Digit Health. 2022 May 3;4:765406. doi: 10.3389/fdgth.2022.765406. PMID: 35592460; PMCID: PMC9110785.

• Attia ZI, Noseworthy PA, Lopez-Jimenez F, Asirvatham SJ, Deshmukh AJ, Gersh BJ, Carter RE, Yao X, Rabinstein AA, Erickson BJ, Kapa S,
Friedman PA. An artificial intelligence-enabled ECG algorithm for the identification of patients with atrial fibrillation during sinus rhythm: a
retrospective analysis of outcome prediction. Lancet. 2019 Sep 7;394(10201):861-867. doi: 10.1016/S0140-6736(19)31721-0. Epub 2019 Aug 1.
PMID: 31378392.

• Beltrami EJ, Brown AC, Salmon PJM, Leffell DJ, Ko JM, Grant-Kels JM. Artificial intelligence in the detection of skin cancer. J Am Acad Dermatol.
2022 Dec;87(6):1336-1342. doi: 10.1016/j.jaad.2022.08.028. Epub 2022 Aug 23. PMID: 35998842.

• Wallace MB, Sharma P, Bhandari P, East J, Antonelli G, Lorenzetti R, Vieth M, Speranza I, Spadaccini M, Desai M, Lukens FJ, Babameto G,
Batista D, Singh D, Palmer W, Ramirez F, Palmer R, Lunsford T, Ruff K, Bird-Liebermann E, Ciofoaia V, Arndtz S, Cangemi D, Puddick K, Derfus
G, Johal AS, Barawi M, Longo L, Moro L, Repici A, Hassan C. Impact of Artificial Intelligence on Miss Rate of Colorectal Neoplasia.
Gastroenterology. 2022 Jul;163(1):295-304.e5. doi: 10.1053/j.gastro.2022.03.007. Epub 2022 Mar 15. PMID: 35304117.

• Wang, Y., Mehrabi, S., Sohn, S. et al. Natural language processing of radiology reports for identification of skeletal site-specific fractures. BMC
Med Inform Decis Mak 19 (Suppl 3), 73 (2019). https://doi.org/10.1186/s12911-019-0780-5

• Rajpurkar P, Chen E, Banerjee O, Topol EJ. AI in health and medicine. Nat Med. 2022 Jan;28(1):31-38. doi: 10.1038/s41591-021-01614-0. Epub
2022 Jan 20. PMID: 35058619.

• Liu M, Ning Y, Teixayavong S, Mertens M, Xu J, Ting DSW, Cheng LT, Ong JCL, Teo ZL, Tan TF, RaviChandran N, Wang F, Celi LA, Ong MEH,
Liu N. A translational perspective towards clinical AI fairness. NPJ Digit Med. 2023 Sep 14;6(1):172. doi: 10.1038/s41746-023-00918-4. PMID:
37709945; PMCID: PMC10502051.

• Bonkhoff AK, Grefkes C. Precision medicine in stroke: towards personalized outcome predictions using artificial intelligence. Brain. 2022 Apr
18;145(2):457-475. doi: 10.1093/brain/awab439. PMID: 34918041; PMCID: PMC9014757.

• Abràmoff MD, Tarver ME, Loyo-Berrios N, Trujillo S, Char D, Obermeyer Z, Eydelman MB; Foundational Principles of Ophthalmic Imaging and
©2022 Mayo Foundation for Medical Education and Research | slide-78
