
Derisking AI by design: How to build risk management into AI development

The compliance and reputational risks of artificial intelligence pose a challenge to traditional risk-management functions. Derisking by design can help.

by Juan Aristi Baquero, Roger Burkhardt, Arvind Govindarajan, and Thomas Wallace

August 2020
Artificial intelligence (AI) is poised to redefine how businesses work. Already it is unleashing the power of data across a range of crucial functions, such as customer service, marketing, training, pricing, security, and operations. To remain competitive, firms in nearly every industry will need to adopt AI and the agile development approaches that enable building it efficiently to keep pace with existing peers and digitally native market entrants. But they must do so while managing the new and varied risks posed by AI and its rapid development.

The reports of AI models gone awry due to the COVID-19 crisis have only served as a reminder that using AI can create significant risks. The reliance of these models on historical data, which the pandemic rendered near useless in some cases by driving sweeping changes in human behaviors, makes them far from perfect.

In a previous article, we described the challenges posed by new uses of data and innovative applications of AI. Since then, we've seen rapid change in formal regulation and societal expectations around the use of AI and the personal data that are AI's essential raw material. This is creating compliance pressures and reputational risk for companies in industries that have not typically experienced such challenges. Even within regulated industries, the pace of change is unprecedented.

In this complex and fast-moving environment, traditional approaches to risk management may not be the answer (see sidebar "Why traditional model risk management is insufficient").

Why traditional model risk management is insufficient

Model risk management (MRM) in regulated industries such as banking is currently performed by dedicated and independent teams reporting to the chief risk officer. While these firms have developed a robust MRM approach to improve the governance and control of their critical models determining capital requirements and lending decisions, this approach is usually not ideal for firms with different requirements or in less heavily regulated industries, for the following reasons:

— MRM is typically based on a point-in-time model assessment (for example, once every one to five years), which assumes that the models are largely static between reviews. AI models learn from data, and their logic changes when they are retrained to learn from new data. For example, a fraud model is retrained weekly in order to adapt to new scams.

— Traditional MRM workflows are often sequential and require six to 12 weeks of review time after the model development is complete, which delays deployment. These workflows are not easily adapted to the agile and iterative development cycles frequently used in AI model development.

— MRM is often focused more on traditional risk types (primarily financial risks, such as capital adequacy and credit risk) and may not fully cover the new and more diverse risks arising from widespread use of AI, such as reputational risk, consumer and conduct risk, and employee risk.

— Some applications and use cases, such as chatbots, natural-language processing, and HR analytics, can qualify as "models" under regulatory definitions used in banking. But these applications are very different from the traditional model types (for example, capital models, stress-testing models, and credit-risk models), and traditional MRM approaches are not easily applied.

— AI and machine-learning algorithms are often embedded in larger AI application systems, such as software-as-a-service (SaaS) offerings from vendors, in ways that are significantly more complex and more opaque than traditional models. This greatly complicates coordination between those who review the model and those who assess the application and platform (IT risk) or the vendor (third-party risk).



Risk management cannot be an afterthought or addressed only by model-validation functions such as those that currently exist in financial services. Companies need to build risk management directly into their AI initiatives, so that oversight is constant and concurrent with internal development and external provisioning of AI across the enterprise. We call this approach "derisking AI by design."

Why managing AI risks presents new challenges
While all companies deal with many kinds of risks, managing risks associated with AI can be particularly challenging, due to a confluence of three factors.

AI poses unfamiliar risks and creates new responsibilities
Over the past two years, AI has increasingly affected a wide range of risk types, including model, compliance, operational, legal, reputational, and regulatory risks. Many of these risks are new and unfamiliar in industries without a history of widespread analytics use and established model management. And even in industries that have a history of managing these risks, AI makes the risks manifest in new and challenging ways. For example, banks have long worried about bias among individual employees when providing consumer advice. But when employees are delivering advice based on AI recommendations, the risk is not that one piece of individual advice is biased but that, if the AI recommendations are biased, the institution is actually systematizing bias into the decision-making process. How the organization controls bias is very different in these two cases.

These additional risks also stand to tax risk-management teams that are already being stretched thin. For example, as companies grow more concerned about reputational risk, leaders are asking risk-management teams to govern a broader range of models and tools, supporting anything from marketing and internal business decisions to customer service. In industries with less defined risk governance, leaders will have to grapple with figuring out who should be responsible for identifying and managing AI risks.

AI is difficult to track across the enterprise
As AI has become more critical to driving performance and as user-friendly machine-learning software has become increasingly viable, AI use is becoming widespread and, in many institutions, decentralized across the enterprise, making it difficult for risk managers to track. Also, AI solutions are increasingly embedded in vendor-provided software, hardware, and software-enabled services deployed by individual business units, potentially introducing new, unchecked risks. A global product-sales organization, for example, might choose to take advantage of a new AI feature offered in a monthly update to their vendor-provided customer-relationship-management (CRM) package without realizing that it raises new and diverse data-privacy and compliance risks in several of their geographies.

Compounding the challenge is the fact that AI risks cut across traditional control areas—model, legal, data privacy, compliance, and reputational—that are often siloed and not well coordinated.

AI risk management involves many design choices for firms without an established risk-management function
Building capabilities in AI risk management from the ground up has its advantages but also poses challenges. Without a legacy structure to build upon, companies must make numerous design choices without a lot of internal expertise, while trying to build the capability rapidly. What level of MRM investment is appropriate, given the AI risk assessments across the portfolio of AI applications? Should reputational risk management for a global organization be governed at headquarters or on a national basis? How should we combine AI risk management with the management of other risks, such as data privacy, cybersecurity, and data ethics? These are just a few of the many choices that organizations must make.



Baking risk management into AI development
To tackle these challenges without constraining AI innovation and disrupting the agile ways of working that enable it, we believe companies need to adopt a new approach to risk management: derisking AI by design.

Risk management by design allows developers and their business stakeholders to build AI models that are consistent with the company's values and risk appetite. Tools such as model interpretability, bias detection, and performance monitoring are built in so that oversight is constant and concurrent with AI development activities and consistent across the enterprise. In this approach, standards, testing, and controls are embedded into various stages of the analytics model's life cycle, from development to deployment and use (Exhibit 1).

Typically, controls to manage analytics risk are applied after development is complete. For example, in financial services, model review and validation often begin when the model is ready for implementation. In a best-case scenario, the control function finds no problems, and the deployment is delayed only as long as the time to perform those checks. But in a worst-case scenario, the checks turn up problems that require another full development cycle to resolve. This obviously hurts efficiency and puts the company at a disadvantage relative to nimbler firms (see sidebar "Learning the value of derisking by design the hard way").

Similar issues can occur when organizations source AI solutions from vendors. It is critical for control teams to engage with business teams and vendors early in the solution-ideation process, so they understand the potential risks and the controls to mitigate them. Once the solution is in production, it is also important for organizations to understand when updates to the solution are being pushed through the platform and to have automated processes in place for identifying and monitoring changes to the models.

It's possible to reduce costly delays by embedding risk identification and assessment, together with associated control requirements, directly into the development and procurement cycles. This approach also speeds up pre-implementation checks, since the majority of risks have already been accounted for and mitigated.

Learning the value of derisking by design the hard way

A large food manufacturer developed an analytics solution to forecast demand for each of its products across geographies in order to optimize manufacturing, logistics, and the overall supply chain. The new model showed higher accuracy compared with the company's existing expert-based approach.

But before the model was deployed, the manufacturer initiated an independent third-party review of the model, which uncovered several problems with the model, including a critical data leakage. The model had accidentally included a feature that captured the actual demand. Once the feature was removed, the model accuracy dropped below the existing expert-based approach.

This revelation led to a complete redesign of the model architecture and the realization that the company needed to undertake a broader initiative to embed risk management into model development to prevent this and other issues from recurring. The manufacturer began the effort by creating new roles within the group to perform model review, defining roles and responsibilities for model checks throughout the modeling pipeline, and implementing standards for development and documentation of analytics.
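The leakage the reviewers found, a feature that effectively encoded the target, is the kind of problem a lightweight pre-implementation screen can surface earlier. Below is a minimal sketch of such a check, assuming a pandas DataFrame of engineered features plus a known target column; the function name, threshold, and column names are illustrative assumptions, not a prescribed control.

import pandas as pd

def flag_leakage_candidates(df: pd.DataFrame, target: str, threshold: float = 0.95) -> pd.Series:
    """Flag numeric features whose absolute correlation with the target is suspiciously high."""
    numeric = df.select_dtypes("number").drop(columns=[target], errors="ignore")
    corr = numeric.corrwith(df[target]).abs().sort_values(ascending=False)
    # A near-perfect correlation often means a feature encodes the outcome itself,
    # as the demand feature did above; flagged columns go to manual review, not automatic removal.
    return corr[corr >= threshold]

# Hypothetical usage: demand_df holds engineered features plus the target column "actual_demand".
# suspects = flag_leakage_candidates(demand_df, target="actual_demand")

A high correlation is only a prompt for human review at the approval-to-implement gate, not proof of leakage on its own.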



Exhibit 1
Risk management by design embeds controls across the algorithmic model's life cycle.

The life cycle runs from ideation through getting data, building, industrializing, and monitoring and maintaining the model, with three approval gates: (1) approval to develop a proof of concept or minimum viable product, (2) approval to implement, and (3) approval to go live.

A. Designing the solution. Controls examples: scoping review, evaluation metrics, assessment of environment including available data.

B. Obtaining reliable data required to build and train the model. Controls examples: data-pipeline testing, data-sourcing analysis, statistical-data checks, process and data-usage fairness, automated documentation generation.

C. Building a model that achieves good performance in solving the problem specified during ideation. Controls examples: model-robustness review, business-context metrics testing, data-leakage controls, label-quality assessment, data availability in production.

D. Evaluating performance of the model and engaging the business regularly to ensure business fit. Controls examples: standardized performance testing, feature-set review, rule-based threshold setting, model-output review by subject-matter expert, business requirements, business restrictions, risk assessment, automated document generation, predictive-outcome fairness.

E. Moving the model to the production environment. Controls examples: nonfunctional-requirements checklist, data-source revalidation, full data-pipeline test, operational-performance thresholds, external-interface warnings.

F. Deploying the model where it starts being used by the business. Controls examples: colleague responsibility assignment and training, escalation mechanisms, workflow management, audit-trail generation.

G. Inventory management of all models. Controls examples: search tool, automated inventory statistical assessment and risk overview by department.

H. Live monitoring in production. Controls examples: degradation flagging, retraining scheduler, periodic testing such as Bayesian hypothesis testing, automated logging, and audit-trail generation.

Review and approval for continued use. Controls example: verification that the algorithm continues to work as intended and its use continues to be appropriate in the current environment.
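As one illustration of the live-monitoring controls under item H, a degradation flag can be as simple as comparing a model's recent accuracy against the baseline agreed at the approval-to-go-live gate. The sketch below assumes those baseline and tolerance values were set during that review; the class name, threshold, and escalation behavior are illustrative only.

from dataclasses import dataclass

@dataclass
class DegradationMonitor:
    """Flag a production model whose recent accuracy drops below an agreed floor."""
    baseline_accuracy: float
    tolerance: float = 0.05  # allowed absolute drop before escalation

    def check(self, recent_accuracy: float) -> bool:
        floor = self.baseline_accuracy - self.tolerance
        degraded = recent_accuracy < floor
        if degraded:
            # In a real platform this would open a ticket, feed the audit trail,
            # or trigger the retraining scheduler rather than print to the console.
            print(f"ALERT: accuracy {recent_accuracy:.3f} breached floor {floor:.3f}")
        return degraded

# Hypothetical usage with weekly scoring results:
monitor = DegradationMonitor(baseline_accuracy=0.91)
monitor.check(recent_accuracy=0.84)  # returns True; escalate per the audit-trail controls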

In practice, creating a detailed control framework that sufficiently covers all these different risks is a granular exercise. For example, enhancing our own internal model-validation framework to accommodate AI-related risks results in a matrix of 35 individual control elements covering eight separate dimensions of model governance.



Embedding appropriate controls directly into the development and provisioning routines of business and data-science teams is especially helpful in industries without well-established analytics development teams and risk managers who conduct independent review of analytics or manage associated risk. They can move toward a safe and agile approach to analytics much faster than if they had to create a stand-alone control function for review and validation of models and analytics solutions (see sidebar "An energy company takes steps toward derisking by design").

As an example, one of the most relevant risks of AI and machine learning is bias in data and analytics methodologies that might lead to unfair decisions for consumers or employees. To mitigate this category of risk, leading firms are embedding several types of controls into their analytics-development processes (Exhibit 2):

— Ideation. They first work to understand the business use case and its regulatory and reputational context. An AI-driven decision engine for consumer credit, for example, poses a much higher bias risk than an AI-driven chatbot that provides information to the same customers. An early understanding of the risks of the use case will help define the appropriate requirements around the data and methodologies. All the stakeholders ask, "What could go wrong?" and use their answers to create appropriate controls at the design phase.

— Data sourcing. An early risk assessment helps define which data sets are "off-limits" (for example, because of personal-privacy considerations) and which bias tests are required. In many instances, the data sets that capture past behaviors from employees and customers will incorporate biases. These biases can become systemic if they are incorporated into the algorithm of an automated process.

— Model development. The transparency and interpretability of analytical methods strongly influence bias risk. Leading firms decide which methodologies are appropriate for each use case (for example, some black-box methods will not be allowed in high-risk use cases) and what post hoc explainability techniques can increase the transparency of model decisions.

An energy company takes steps toward derisking by design

Companies in industries that have been running analytical models for decades under the scrutiny of regulators, such as financial services, often have a foundation for moving to a derisk-by-design model. Organizations in industries that have adopted analytics more recently and are less regulated (at least in the area of model outputs) will need to build their capabilities nearly from scratch.

One large North American energy company initiated a multiyear analytics transformation in order to improve the efficiency of current assets—for example, to produce higher-quality coal. The company set up an analytics center of excellence (CoE), which discovered that thousands of analytics use cases had been developed and deployed across the organization without any clear oversight, creating risks for human health and safety, financial performance, and company reputation.

In response, the CoE appointed a model manager to oversee the model-governance rollout across the organization. The manager's team identified six key priorities: implementing a process to identify models as they are developed; creating a centralized inventory for all analytics use cases and related information (such as developer and owners); establishing a tiering system to identify the most material models; creating standards for model development and documentation; defining and implementing requirements for model review and monitoring for all models; and defining model-governance processes, roles, and responsibilities for all stakeholders across the modeling pipeline. These changes helped the organization take a giant step toward embedding risk management into the end-to-end process of model development.
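A centralized inventory like the one the CoE built is, at its core, a structured record per model plus a few routine checks run against it. The sketch below shows what a minimal inventory entry might look like; the field names, the tiering convention, and the one-year review rule are illustrative assumptions, not the company's actual schema.

from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional

@dataclass
class ModelInventoryRecord:
    """One entry in a centralized model inventory; field names are illustrative."""
    model_id: str
    owner: str
    developer: str
    business_use: str
    tier: int                                  # materiality tier (1 = most material)
    last_review: Optional[date] = None
    monitoring_metrics: List[str] = field(default_factory=list)

    def review_overdue(self, today: date, max_age_days: int = 365) -> bool:
        """Simple example check; review frequency would normally depend on the tier."""
        return self.last_review is None or (today - self.last_review).days > max_age_days

# Hypothetical usage:
record = ModelInventoryRecord(
    model_id="coal-quality-forecast-v3",
    owner="Operations analytics CoE",
    developer="site-analytics-team",
    business_use="Optimize coal processing settings",
    tier=1,
    last_review=date(2019, 6, 30),
    monitoring_metrics=["mape", "data_drift_psi"],
)
print(record.review_overdue(today=date(2020, 8, 1)))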



Exhibit 2
Bias is one important risk that can be mitigated by embedding controls into the model-development process.

Illustrative questions asked at ideation: How can we set up a team to reduce or mitigate the risk of bias? What legal and reputational constraints should we take into account? How will we measure bias for this use case in this usage context? What is the level of our analytics capabilities?

Guidance, checklists, analytical methods, and data tools by stage:

— Ideation (determine the level of bias risk, given model use and context): bias and explainability risk assessment; guidance on convening a diverse team; scoping and regulatory guidance; creation of bias-risk metrics; capability context assessment.

— Data sourcing (detect and mitigate bias risk in data): bias-detection techniques; fair-representation analysis; evaluation of risk from the choice of data sets and collection methods; mitigation of risk in feature selection and engineering; documentation with data sheets for data sets.

— Model development (find and reduce bias through modeling): explainable-AI techniques to explain root cause; counterfactual techniques; review of underlying hypotheses; fairness-aware algorithms; remediation with post-processing techniques on output; documentation with model cards.

— Industrialization, monitoring, and maintenance (continuously monitor and manage bias risk in production): context monitoring (regulatory changes, legal changes, company-policy changes, usage appropriateness); model monitoring (data drift, model metrics, bias metrics in outcomes); model maintenance (database of metrics and trend tracking, updates of documentation).

Throughout development, teams execute checks and controls to manage the risk of bias and monitor the model for bias metrics in production.
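The "bias metrics in outcomes" item in Exhibit 2 can start from very simple measures. The sketch below computes a demographic-parity difference on a batch of binary decisions; the attribute values, the sample data, and the choice of metric are illustrative assumptions, and the acceptable range would come from the ideation-stage risk assessment rather than from the code.

def demographic_parity_difference(decisions, groups, protected_value):
    """Difference in approval rates between a protected group and everyone else.

    decisions are binary model outcomes (1 = approved); groups is the attribute
    used for the fairness check. A value of 0.0 would mean identical approval rates.
    """
    in_group = [d for d, g in zip(decisions, groups) if g == protected_value]
    out_group = [d for d, g in zip(decisions, groups) if g != protected_value]
    rate_in = sum(in_group) / len(in_group)
    rate_out = sum(out_group) / len(out_group)
    return rate_in - rate_out

# Hypothetical usage on a small batch of scored credit applications:
decisions = [1, 0, 1, 1, 0, 1, 0, 1]
groups = ["A", "A", "B", "B", "B", "A", "B", "A"]
print(demographic_parity_difference(decisions, groups, protected_value="A"))  # 0.25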

— Monitoring and maintenance. Leading firms define the performance-monitoring requirements, including types of tests and frequency. These requirements will depend on the risk of the use case, the frequency with which the model is used, and the frequency with which the model is updated or recalibrated. As more dynamic models become available (for example, reinforcement learning, self-learning), leading firms use technology platforms that can specify and execute monitoring tests automatically.
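One monitoring test such a platform might run automatically is a population-stability check that compares the distribution of a key input in production against its distribution at training time. The sketch below uses the population stability index; the bin count, the 0.25 rule of thumb in the comment, and the synthetic data are illustrative assumptions, not a regulatory standard.

import numpy as np

def population_stability_index(expected, actual, bins=10):
    """Population stability index between training-time and production feature values.

    A common rule of thumb treats PSI > 0.25 as a material shift worth escalating
    to the model owner; the right threshold depends on the use case's risk tier.
    """
    edges = np.histogram_bin_edges(expected, bins=bins)
    exp_counts, _ = np.histogram(expected, bins=edges)
    act_counts, _ = np.histogram(actual, bins=edges)
    exp_pct = np.clip(exp_counts / exp_counts.sum(), 1e-6, None)  # avoid log(0)
    act_pct = np.clip(act_counts / act_counts.sum(), 1e-6, None)
    return float(np.sum((act_pct - exp_pct) * np.log(act_pct / exp_pct)))

# Hypothetical usage: compare last week's scored population with the training sample.
rng = np.random.default_rng(0)
training_values = rng.normal(0, 1, 10_000)
production_values = rng.normal(0.4, 1, 10_000)  # drifted inputs
print(population_stability_index(training_values, production_values))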



Putting risk managers in a position to succeed—and providing a supporting cast
To deploy AI at scale, companies need to tap an array of external and unstructured data sources, connect to a range of new third-party applications, decentralize the development of analytics (although common tooling, standards, and other centralized capabilities help speed the development process), and work in agile teams that rapidly develop and update analytics in production.

These requirements make large-scale and rapid deployment incredibly difficult for traditional risk managers to support. To adjust, they will need to integrate their review and approvals into agile or sprint-based development approaches, relying more on developer testing and input from analytics teams, so they can focus on review rather than taking responsibility for the majority of testing and quality control. Additionally, they will need to reduce one-off "static" exercises and build in the capability to monitor AI on a dynamic, ongoing basis and support iterative development processes.

But monitoring AI risk cannot fall solely on risk managers. Different teams affected by analytics risk need to coordinate oversight to ensure end-to-end coverage without overlap, support agile ways of working, and reduce the time from analytics concept to value (Exhibit 3).

AI risk management requires that each team expand its skills and capabilities, so that skill sets in different functions overlap more than they do in historical siloed approaches.

Exhibit 3
The responsibilities for enabling safe and ethical innovation with artificial intelligence span multiple parts of the organization.

Business
— Front line: Confirm soundness of predictive drivers, modeling approach, and results based on business experience.
— Operations: Validate insights against business experience; ensure appropriate use-case calibration (eg, clarity on modeling objectives).
— Business-unit control: Ensure tests required by second-line-of-defense functions are performed, including ongoing monitoring and testing of models in use.

Analytics, data, and technology
— Analytics (data scientists, developers): Develop best-in-class models in line with second-line-of-defense standards; provide transparency into model behavior (ie, explainability).
— Data (data engineers/strategists): Maintain data quality; ensure applicability of new features (ie, feature engineering) to modeling objectives.
— Technology (IT, software and hardware): Mitigate implementation risks by ensuring adequacy of the production environment (eg, scalability, preventing data leakage).

Risk and control functions
— Model risk management: Develop standards providing guardrails on AI/ML model development; assess AI/ML model risk.
— Compliance and legal: Provide guidance on compliance risks (eg, prevent bias arising from use of certain restricted customer characteristics).
— Cloud risk, vendor risk, etc: Provide guidance on mitigating key nonfinancial risks (eg, reputational damage, third-party) linked to AI/ML models.



Someone with a core skill—in this case, risk management, compliance, or vendor risk—needs enough analytics know-how to engage with the data scientists. Similarly, data scientists need to understand the risks in analytics, so they are aware of these risks as they do their work.

In practice, analytics teams need to manage model risk and understand the impact of these models on business results, even as the teams adapt to an influx of talent from less traditional modeling backgrounds, who may not have a grounding in existing model-management techniques. Meanwhile, risk managers need to build expertise—through either training or hiring—in data concepts, methodologies, and AI and machine-learning risks, to ensure they can coordinate and interact with analytics teams (Exhibit 4).

This integration and coordination between analytics teams and risk managers across the model life cycle requires a shared technology platform that includes the following elements:

— an agreed-upon documentation standard that satisfies the needs of all stakeholders (including developers, risk, compliance, and validation)

— a single workflow tool to coordinate and document the entire life cycle from initial concept through iterative development stages, releases into production, and ultimately model retirement

— access to the same data, development environment, and technology stack to streamline testing and review

Exhibit 4
Both analytics and risk professionals will need to complement their traditional skill sets with sufficient knowledge of the other's function.

Data and analytics professionals
— Core competencies: math, statistics, machine learning, deep learning; building algorithmic models; collecting, cleansing, and structuring data; creating data visualizations and dashboards; explaining model drivers.
— New complementary skills: awareness of analytics risks, including bias, fairness, and instability; understanding of where risks can arise in the analytics-development life cycle; ability to use risk-management tools as part of the analytics-development process (eg, explainability and bias testing, model-performance-monitoring dashboards); understanding of the risk-control team's role and responsibilities and ability to engage with them effectively.

Risk and control officers
— Core competencies: knowledge of applicable regulations; identification and analysis of risks; credible and independent review of business activities.
— New complementary skills: general understanding of analytics techniques and their implications, including performance vs interpretability trade-offs; awareness of best practices in testing for bias, fairness, and stability and ability to understand results from risk-management tools such as explainability reports; understanding of data/feature-selection practices and their effect on risks (eg, bias); understanding of analytics teams' roles and responsibilities and ability to engage with data and analytics professionals.



— tools to support automated and frequent (even real-time) AI model monitoring, including, most critically, when in production

— a consistent and comprehensive set of explainability tools to interpret the behavior of all AI technologies, especially for technologies that are inherently opaque
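As one example of what such explainability tooling can produce, permutation importance is a model-agnostic way to show which inputs drive a model's predictions, giving analytics teams and risk reviewers a common artifact to discuss. The sketch below uses scikit-learn on synthetic data standing in for a production model and its holdout set; it illustrates the technique, not the shared platform itself.

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

# Illustrative stand-in for a production model and its holdout data.
X, y = make_classification(n_samples=500, n_features=6, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)

# Shuffle each feature in turn and measure how much the score degrades;
# larger drops indicate inputs the model relies on more heavily.
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
for i, score in enumerate(result.importances_mean):
    print(f"feature_{i}: {score:.3f}")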
Getting started
The practical challenges of altering an organization's ingrained policies and procedures are often formidable. But whether or not an established risk function already exists, leaders can take these basic steps to begin putting derisking AI by design into practice:

— Articulate the company's ethical principles and vision. Senior executives should create a top-down view of how the company will use data, analytics, and AI. This should include a clear statement of the value these tools bring to the organization, recognition of the associated risks, and clear guidelines and boundaries that can form the basis for more detailed risk-management requirements further down in the organization (see sidebar "Building risk management into AI design requires a coordinated approach").

— Create the conceptual design. Build on the overarching principles to establish the basic framework for AI risk management. Ensure this covers the full model-development life cycle outlined earlier: ideation, data sourcing, model building and evaluation, industrialization, and monitoring. Controls should be in place at each stage of the life cycle, so engage early with analytics teams to ensure that the design can be integrated into their existing development approach.

— Establish governance and key roles. Identify key people in analytics teams and related risk-management roles, clarify their roles within the risk-management framework, and define their mandate and responsibilities in relation to AI controls. Provide risk managers with training and guidance that ensure they develop knowledge beyond their previous experience with traditional analytics, so they are equipped to ask new questions about what could go wrong with today's advanced AI models.
Building risk management into AI design requires a coordinated approach

While AI applications can be developed in a decentralized fashion across an organization, managing AI risk should be coordinated more centrally in order to be effective. A major North American bank learned this lesson when it set out to create a new set of AI risk-management capabilities to complement its existing risk frameworks. Initially, multiple groups began their own AI risk-management efforts. This fragmentation created a host of challenges around key risk processes, including tracking and assessing the risks of AI embedded in vendor technologies, triaging and risk oversight of AI tools, building controls into AI model development involving multiple analytics groups, and operationalizing ethical principles on data and AI approved by the board. As a result, the bank struggled to demonstrate that all AI risks were managed through the development life cycle.

The bank alleviated these issues by establishing one multidisciplinary team to define a clear target state of AI risk management, build alignment across stakeholders, clarify AI governance requirements, and specify the engagement model and technical requirements to achieve the target state.



— Adopt an agile engagement model. Bring together analytics teams and risk managers to understand their mutual responsibilities and working practices, allowing them to solve conflicts and determine the most efficient way of interacting fluidly during the course of the development life cycle. Integrate review and approvals into agile or sprint-based development approaches, and push risk managers to rely on input from analytics teams, so they can focus on reviews rather than taking responsibility for the majority of testing and quality control.

— Access transparency tools. Adopt essential tools for gaining explainability and interpretability. Train teams to use these tools to identify the drivers of model results and to understand the outputs they need in order to make use of the results. Analytics teams, risk managers, and partners outside the company should have access to these same tools in order to work together effectively.

— Develop the right capabilities. Build an understanding of AI risks throughout the organization. Awareness campaigns and basic training can build institutional knowledge of new model types. Teams with regular review responsibilities (risk, legal, and compliance) will need to become adept "translators," capable of understanding and interpreting analytics use cases and approaches. Critical teams will need to build and hire in-depth technical capabilities to ensure risks are fully understood and appropriately managed.

AI is changing the rules of engagement across industries. The possibilities and promise are exciting, but executive teams are only beginning to grasp the scope of the new risks involved. Existing approaches to model risk-management functions may not be ready to support deployment of these new techniques at the scale and pace expected by business leaders. Derisking AI by design will give companies the oversight they need to run AI ethically, legally, and profitably.

Juan Aristi Baquero and Roger Burkhardt are partners in McKinsey’s New York office, Arvind Govindarajan is a partner in the
Boston office, and Thomas Wallace is a partner in the London office.

The authors wish to thank Rahul Agarwal for his contributions to this article.

Copyright © 2020 McKinsey & Company. All rights reserved.
