
What Information is Required for Explainable AI?: A Provenance-based Research Agenda and Future Challenges

2020 IEEE 6th International Conference on Collaboration and Internet Computing (CIC) | 978-1-7281-4146-6/20/$31.00 ©2020 IEEE | DOI: 10.1109/CIC50333.2020.00030

Fariha Tasmin Jaigirdar, Dept. of Software Systems and Cybersecurity, Monash University, Melbourne, Australia (fariha.jaigirdar@monash.edu)
Carsten Rudolph, Dept. of Software Systems and Cybersecurity, Monash University, Melbourne, Australia (carsten.rudolph@monash.edu)
Gillian Oliver, Dept. of Human Centred Computing, Monash University, Melbourne, Australia (Gillian.Oliver@monash.edu)
David Watts, La Trobe Law School, La Trobe University, Melbourne, Australia (D.Watts@latrobe.edu.au)
Chris Bain, Monash University Digital Health, Monash University, Melbourne, Australia (chris.a.bain@monash.edu)

Abstract—Deriving explanations of an Artificial Intelligence-based system's decision making is becoming increasingly essential to address requirements that meet quality standards and operate in a transparent, comprehensive, understandable, and explainable manner. Furthermore, security issues as well as concerns from human perspectives emerge when describing the explainability properties of AI. A full system view is required to enable humans to properly estimate risks when dealing with such systems. This paper introduces open issues in this research area to present the overall picture of explainability and the information required for an explanation that makes a decision-oriented AI system transparent to humans. It illustrates the potential contribution of proper provenance data to AI-based systems by describing a provenance graph-based design. This paper proposes a six-Ws framework to demonstrate how a security-aware provenance graph-based design can build the basis for providing end-users with sufficient meta-information on AI-based decision systems. An example scenario is then presented that highlights the required information for better explainability both from human and security-aware aspects. Finally, associated challenges are discussed to provoke further research and commentary.

Index Terms—artificial intelligence, data provenance, explainable AI, decision-oriented systems, cybersecurity, human-centric policy

I. INTRODUCTION

Artificial Intelligence (AI) has become a topic of central importance because of the ways in which it impacts various sectors, including transportation, finance, legal, military, and medicine [1]. With machine learning (ML) and deep learning (DL) algorithms, AI provides the core technology for autonomous decision-making platforms, from mission-critical services (e.g., autonomous driving) to life-critical tasks (e.g., remote healthcare monitoring, criminal justice). All of these have a direct influence on different areas of people's lives and often induce human decisions. Thus, AI has a considerable impact on people's everyday lives.

That said, there is a growing body of research that highlights problems associated with AI-based decision-making. These arise in particular with 'second wave' AI [2], which is based on statistical learning and probability theory, where statistical models are created for solving specific problem domains. These are 'trained' on big data and analyzed using machine learning and deep neural networks to produce highly nuanced classification and prediction capabilities. Almost invariably, this interplay of data and algorithmic-based analysis occurs within what has been referred to as a 'black box', where the reasons why a decision or recommendation is made, or an action is taken, are opaque and unknown. Further, the relation between input data, training data, and resulting classifications, as well as the provenance of these various inputs, are not obvious to the human user.

One core issue is that autonomous decision-making systems follow post-hoc methods, where the roots of included artifacts are not made explicit. Without documenting the detailed connection of the training data-set with the decision generated, the overall system's acceptability is questionable. Thus, the autonomous recommendations require strong 'explainable properties' at each step of processing. Without this explanation, the training data-set used and the overall system may face several vulnerabilities. For example, a woman walking across a road in Arizona on March 18, 2018, was struck and killed by a self-driving car, and it proved difficult to identify the cause of the accident [3]. Moreover, a recent report highlights various incidents of misunderstanding caused by unclear sourcing of data, particularly when one is the CEO, an owner, or even an operator of a decision-making software/system platform [4]. Therefore, for AI-based decision-support systems [5], an in-
depth explanation of their knowledge is vital. Even if the overall system works perfectly, it is essential to know the very root of the decision, especially when the decision or prediction is crucial [6] [7]. In this paper, we adopt a human-centric perspective, looking at the data used for decision generation in an AI-based system. We identify and discuss four open issues (OI) in this paper. The first open issue focuses on developing a better understanding of requirements from a human perspective.

OI1. What are the necessary requirements for explainability that need to be included in an AI-based decision support system? What kind of information would an end-user need to get a proper idea of the system's behavior in a decision support system?

Interestingly, even if we were to know the 'explainability' of the result in a system, there are still some remaining concerns. For example, deliberate bias might be built into a system that influences the result. Training data has been shown to affect AI outcomes significantly. This can be unintentional (e.g., bias introduced or accuracy depending on different factors) or intentional (e.g., introducing back-doors or favoring particular decisions). Debugging or actual reasoning in AI based on statistical processing or probabilistic analysis is very difficult or nearly impossible in a classical sense [8]. Furthermore, adaptive systems continue learning, and hence, they can also change behavior. Therefore, adding explanations in a system does not guarantee that the explanation is reliable, as there are other factors that can change the explanation, and a deliberate or intentional bias or system fault is always possible. There are repeated examples [9] of the problematic nature of this type of black-box decision-making that point to inaccurate and unsafe decision-making as well as instances of bias, discrimination, lack of due process, and breaches of human rights. Moreover, while expecting 'explanations', it is possible that the 'black-box' nature of an AI system may create an illogical explanation without proper documentation. Therefore, if the system could document the security evidence on the security controls involved in a system, it could provide an overview of the overall system security/safety specifications. Thus, with these discussions, we sketch our second and third open questions as follows.

OI2. Does the end-user have any means to check that the generated decision is correct, that it authentically comes from the right ML or DL algorithms, and that all processing steps have been as expected?

OI3. How can security evidence be added to an AI-based system, where approaches to detect or prevent 'misrepresentation', 'deliberate bias', 'safety-mismatch', or 'back-doors' are yet to be discovered?

Policymakers and lawmakers are increasingly focusing on how best to address potential harm induced by AI-based systems. One response is the EU General Data Protection Regulation's [10] right to an explanation of automated decision-making. It requires that individuals affected by AI-based decisions be provided with 'meaningful information about the logic involved and the significance and envisaged consequences' of the automated decision-making (see Articles 13, 21, and 22). Other responses take narrower, sector-specific approaches, for example, safety and quality requirements for autonomous vehicles, medical devices, or civil aviation. Whichever approach is taken, the key problem is to identify suitable, necessary, desirable, and practicable steps to open the AI black box in a way that humans can digest and understand. Therefore, the fourth open issue adds an ethical and legal perspective on human rights for critical decision making.

OI4. How should requirements from the ethical and policy sectors be added to a critical decision-based system so that policies are checked appropriately?

Data provenance documents and explains the data stream by answering the 'who-what-when-where' of each step of data propagation [11] [12]. It identifies the data origin as well as the full data transmission procedure and helps in determining data trustworthiness by portraying data transparency. In this paper, we discuss the four open research issues from a user-centric, provenance-based security perspective. We discuss the role of implementing provenance graphs for explaining AI properties so that information from data origin to data modification via data processing can be documented in an AI-based decision system. We emphasize the need to add security and legal information to a provenance graph to achieve transparency and comprehensibility of data propagation in the system. Our specific contributions can be summarised as follows.

1) We discuss the roles of adding information to a layer-wise data provenance graph in explaining AI in a system-wide view.
2) We propose a 'Six Ws' structure to present the information required for a security-aware data provenance graph.
3) We propose to include evidence of explanation of an AI system from both the human and security point of view, including evidence for secure data propagation, and highlight the required information/metadata to be added to the provenance graphs.

The rest of the paper is organized as follows. Section II demonstrates the necessary evidence for explanation from both the human and security point of view; the role of provenance information in a decision-oriented AI system with the proposed framework is described in Section III; Section IV illustrates an AI-based loan processing scenario and discusses the required information for explanation; challenges and future research directions are included in Section V; and finally, the paper concludes in Section VI.

II. EVIDENCE FOR EXPLANATION

In this section, we discuss the evidence (information) required for a decision-based AI system's explainability from human perspectives and security points of view.



A. The Human Point of View

AI brings various critical sectors under the umbrella of technological sophistication, for example, criminal justice, forensic analysis, clinical health research, and remote healthcare monitoring [13] [14]. With the immense growth of AI in these dynamic sectors, certain risk factors, including maintaining human values and human rights, also come under consideration [15] [16]. The challenge is that machine learning operations are often not transparent even for the researchers involved in system design [15]. While this may seem less problematic in some areas of applied machine learning, for the critical cases of judicial settings or healthcare analysis, evidence for the transparency of the reasoning (from human-centric perspectives), explainability of decision operations (process-explanation perspective), and documentation of security evidence (security perspective) are significant. Therefore, when discussing and referring to the information necessary for explainable AI (XAI), documenting evidence relating to human-centric values and human rights is vital.

To illustrate further, when algorithms are implemented in judicial decision-making, questions arise as to whether the 'fair trial' requirements of existing law and policy are followed. Article 6 of the European Convention on Human Rights [17] guarantees the accused the right to participate effectively in the trial and includes the presumption of innocence. Thus, implementing AI-based decision procedures in such systems provokes serious concerns, and the procedures need a proper justification of whether this criterion is maintained in the overall explainability of the system procedure. Researchers are concerned about law/policy not being maintained from human-centric perspectives in these critical domains and identify that the 'black-box' nature of the statistical analysis in AI-based systems makes the system even more complex in terms of transparency. However, no research has indicated 'adding and checking these policies' as an explainability property. Therefore, in this paper, we focus on adding information regarding governance and policy in different applications. When representing an information flow as data provenance, we propose to add it as required information for explaining AI-based decision systems.

B. The Security Point of View

One noteworthy point about investigating ways for 'explanation' in an AI-based system is that generated explanations can be unreliable and misleading. Deliberate manipulation of data sets for specific motivations in different applications of AI-based systems is not new [18]. Moreover, the training data and even the model used to draw a prediction in a decision-based system can be manipulated. Currently, there is no way to find out whether there is any 'deliberate' or 'intentional' bias in an AI black-box, and whether any 'unexpected' result is the outcome of that bias or not. Furthermore, as the AI's development depends on input data, the security properties guaranteed for this data need to be included in the evaluation. In this paper, we propose to add 'security-related information' to a data provenance graph so that a user-centric view can be generated at the end to show relevant risks (if any).

III. THE ROLE OF PROVENANCE INFORMATION

A standard for data provenance is presented in the PROV-DM model [12]. It provides the history of data records by documenting a set of metadata that describes the agents, activities, and entities involved in creating, manipulating, and delivering a piece of data, a document, and/or an automated decision in a data provenance graph. Certain relationships among the different attributes of a provenance graph are also added to the graph to illustrate their inter-connections. Entities can be physical or conceptual things, which can be digital or virtual, and are usually identified by a unique ID. Activities can be described as generating, processing, using, or transforming entities and are related to the time involved in that action. Documentation of an activity includes its start time and end time, and each activity is identified by a unique ID. Agents are related to activities, entities, and other agents according to their responsibilities.

Provenance linking the input and output of AI systems is commonly identified as an important issue, as analyzing provenance data is a valuable source for deriving explanations about the decisions/recommendations made by algorithmic systems. In these systems, the resulting data, as seen by the end-user (e.g., a medical doctor using a decision support system or a lawyer working with the metadata for a forensic analysis or criminal verdict), is derived from a classification resulting from the AI system. Hence, provenance metadata plays a vital role in describing the 'explainable' properties of AI-based decision-oriented systems. Although some researchers have highlighted provenance as an emerging research area to explain AI-based systems (see [19] [20] [4]), no details or research road map have been identified to present the required information for a transparent and comprehensive XAI system. An important consideration when using 'provenance' as a measure of 'explainable AI' is not only to document the metadata involved, but also to confirm that the metadata collected is authentic and comes from an authorized source. A user's trust and confidence in an interpretable learning system is a function of the user's capability to understand the machine's input/output mapping behavior, which is broadly dependent on the AI system being 'explainable' [21]. Therefore, the information supporting explainability also needs to be extended with security information about the security controls in the system to enable users to analyze the risks. While some existing provenance models [12] [22] describe complete data transformations by semantic relations among entities in data histories, using provenance in AI with detailed descriptions, including the source of 'security metadata' or 'legal metadata', to generate clarifications about automated decisions that affect users is yet to be developed. Thus, approaching 'Explainable AI', i.e., documenting and understanding how the output is derived from the input and the system used in between, is an essential consideration and needs to be covered in the provenance graph [11].
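To make the PROV-DM concepts above concrete, the following is a minimal sketch, assuming the Python 'prov' package (an implementation of W3C PROV-DM) is available, of how the 'who-what-when-where' of a single processing step in a decision pipeline could be recorded; all identifiers, types, and timestamps are illustrative only and are not drawn from a real system.

    # A minimal sketch, assuming the Python 'prov' package (W3C PROV-DM).
    # It records the 'who-what-when-where' of one processing step; all
    # identifiers and values are illustrative only.
    from datetime import datetime
    from prov.model import ProvDocument

    doc = ProvDocument()
    doc.add_namespace("ex", "http://example.org/xai/")

    # 'What': the data consumed and the decision produced (entities)
    raw_data = doc.entity("ex:raw_data", {"prov:type": "ex:InputData"})
    decision = doc.entity("ex:classification", {"prov:type": "ex:Decision"})

    # 'When'/'Where': the processing step with its start and end time (activity)
    inference = doc.activity(
        "ex:model_inference",
        startTime=datetime(2020, 3, 1, 10, 0),
        endTime=datetime(2020, 3, 1, 10, 5),
    )

    # 'Who': the responsible parties (agents)
    operator = doc.agent("ex:system_operator", {"prov:type": "prov:Person"})
    ml_service = doc.agent("ex:ml_service", {"prov:type": "prov:SoftwareAgent"})

    # Relations connecting the records into a provenance graph
    doc.used(inference, raw_data)
    doc.wasGeneratedBy(decision, inference)
    doc.wasDerivedFrom(decision, raw_data)
    doc.wasAssociatedWith(inference, ml_service)
    doc.actedOnBehalfOf(ml_service, operator)

    print(doc.serialize(indent=2))  # PROV-JSON view of the recorded metadata

Serializing such a document (e.g., to PROV-JSON) yields the kind of metadata that a later explanation service could query.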



[Fig. 1 summarizes the six Ws around a provenance graph-based XAI core: Who? (agent): who is responsible for data generation/modification/explanation (users/actors/auditors involved, machine/automated system); What? (entity): what is being processed/explained (data/document/decision, system information); When? and Where? (activity): when/where the explanation is given (system workflow, overall process, design/modification/implementation); Which? (security metadata): which security/safety information is added for explanation (security protocol, system specification, security controls on system configuration, authentication mechanisms); Why? (law/policy): why the explanation is generated (legislation/policy, governance, human-centric).]

Fig. 1. The Six Ws of Provenance-based Explainable AI (XAI)

A. Proposed 'Six Ws' Framework for Provenance Graph-based XAI Design

In this section, we propose a 'Six Ws' structure (see Fig. 1), which represents the necessary attributes to present provenance-based explainable AI properties. While the PROV-DM model illustrates the basic definitions that can be represented as 'who-what-when-where' in a data provenance graph, we propose to use it to identify the information required for the explainable properties of AI along with two new 'Ws': 'Which' and 'Why'.

We consider decision-oriented AI systems for this paper, which are generally based on training data [18]. The provenance graph of the system should document the information needed to understand the overall system so that an end-user at any point in the system can get the overall idea and estimate the risks of the system. The six Ws are shown in Fig. 1. Firstly, 'who' represents an 'agent' in the PROV-DM model, where this agent can be an actor, user, or machine involved in the decision-based system. Thus, this agent is responsible for the data generation, modification, or action involved in an appropriate explanation. Secondly, 'what' represents an 'entity' in the PROV-DM model to delineate the data and/or document included in the system, which generates the 'decision' after certain manipulations in the different layers of a training-based AI system. Thirdly, 'when' and 'where' represent specific dimensions of the 'activity' used in the PROV-DM model to illustrate the workflow involved at a layer or the design/modification/implementation involved in the overall process.

We argue that providing information only on these factors (who-what-when-where) in a data provenance graph is not sufficient, as the black box and the overall transmission of data in an AI-based system may have certain security holes or system limitations in between. Therefore, these security holes may have different impacts according to various applications. A provenance graph should include information on the security of all layers in a decision-support system, from the source of the data via training to decision-making. This training data explanation is essential in order to stop blindly believing in the black-box or hidden-layer decision generation of a machine learning algorithm. Moreover, a detailed analysis of the complete system, including all internal components, is essential in order to understand the appropriate security controls (if any) placed in the overall system for detecting and mitigating system vulnerabilities. Hence, as a vital attribute for describing the 'second wave' of AI, we propose to include 'which': which security information needs to be added to the provenance graph to represent the security controls involved in an AI-based decision system. This 'security metadata' may represent specific system specifications, security protocols used for data propagation, security controls on system configuration, authentication mechanisms, safety measures provided by the system, etc. So, 'which' provides the security evidence in the provenance graph involved in a decision-oriented AI system. This inclusion in a provenance graph helps a user to get an overview of the system and estimate risks, system faults, or status. The big open issue in collecting such security and explainability information lies on the side of the AI itself. While we have established best practices for other security controls, the security and explainability of the AI itself are work in progress. Examples of options currently available are as follows.

• Collect sufficient information for auditing the AI's behavior to ensure a lack of bias or backdoors. The results of audits and some guarantees can be included with provenance information.
• Document which features are actually used in the AI and remove unwanted features (e.g., gender) from the input data before processing.
• Use and document the use of privacy-enhancing technologies, such as differential privacy.

The last 'W' in the proposed structure is the 'why', which represents the mandate for, or reasoning of, the explainability from a law/policy perspective in an AI system. It can be designed for specific applications so that the related legislation or policy can be reported in the provenance graph. This attribute enables the option to design a system from a domain-specific and/or person-centered perspective of why this explanation is generated, and whether the specific policy or governance requirements have been adhered to when generating a decision. This 'why' attribute aims to empower users against any undesired design of a black-box automated decision-oriented system, which may violate their ethical rights and freedom.
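As a rough sketch of how the two additional Ws might be realized on top of such a graph, the snippet below (again assuming the Python 'prov' package) attaches 'which' (security metadata) and 'why' (law/policy metadata) as custom attributes under a hypothetical 'xai:' namespace; the attribute names and values are our own illustrative assumptions and are not part of the PROV-DM standard.

    # A rough sketch of the two additional Ws, again assuming the Python 'prov'
    # package. The 'xai:' namespace and attribute names are illustrative
    # assumptions; they are not defined by the PROV-DM standard.
    from prov.model import ProvDocument

    doc = ProvDocument()
    doc.add_namespace("ex", "http://example.org/xai/")
    doc.add_namespace("xai", "http://example.org/xai-extensions/")

    # 'Which': security metadata documenting the controls active at this step
    processing = doc.activity("ex:data_processing", other_attributes={
        "xai:securityProtocol": "TLS 1.3",
        "xai:authentication": "OAuth 2.0 bearer token",
        "xai:systemConfiguration": "audited baseline, configuration hash recorded",
    })

    # 'Why': the legal/policy mandate under which the explanation is generated
    decision = doc.entity("ex:decision", other_attributes={
        "xai:legalBasis": "GDPR Articles 13/22: meaningful information about the logic involved",
        "xai:governancePolicy": "domain-specific fairness policy, version 2",
    })

    doc.wasGeneratedBy(decision, processing)
    print(doc.serialize(indent=2))

In this way, the same graph that answers 'who-what-when-where' can also carry the security evidence and the policy mandate, rather than keeping them in separate, unlinked documentation.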
B. Provenance Graph-based XAI Properties in a Decision-oriented AI System

In this section, we present how our proposed structure of 'six Ws' is relevant to a currently operational paradigm of an AI-based decision support system and big data analysis.

Fig. 2 shows the different steps of big data analysis in AI, where each step demonstrates data generation to visualization via processing steps. The data is generated from an actor in a decision-based AI system and is visualized by an end-user after applying the ML algorithm. Actors can be any source of data generation, and end-users can be a machine or a



person working on the visualized data or generated decision. If we try to map the provenance graph concept onto a decision support system, we can see that the actors and users in the system are the 'agents' in the graph. The workflows involved in the overall processing of the system represent the steps of various 'activities', and Datax and Datay represent 'entities' at different stages with different data representations. Steps one and two in Fig. 2 indicate that the actor involved in the system starts working with the initial data. The relevant provenance questions are introduced in the figure to indicate the open issues identified by the authors. These questions highlight the significance of including the 'which' and 'why' in the provenance graph. For example, 'Is the exact representation of data reported?', 'Is the process transparency reported?', or 'Is the exact prediction mechanism documented?' can be addressed by documenting 'security evidence' or 'security metadata' for the relevant layers. Moreover, the question 'Is the appropriate law/policy maintained?' can be answered by reporting whether the appropriate law/policy has been followed from human-centric perspectives.

Step three in Fig. 2 constitutes an ML algorithm, which does not reveal the hidden-layer processing. Moreover, deep learning algorithms or neural network designs for decision-making or recommendation generation act as a black box, where one may know which service the AI is running on, but not where the neural network's knowledge originates and how it propagates. The fact is, provenance is usually exploited in a relatively coarse-grained manner, in which whole algorithms or data transformations are just described by semantic relations. As a result, whole pipelines may be documented with provenance, but individual algorithms remain black boxes. We need details about the system used, how it was trained, and the additional information required for the system to make the explanation clear both from human and security points of view. Therefore, although we may generate provenance-based explanations, back-door attacks or a deliberate, misrepresenting 'bias' in the training dataset are hard to determine.

IV. AN EXAMPLE SCENARIO AND REQUIRED INFORMATION FOR EXPLANATION

This section discusses a bank loan processing scenario and designs its associated provenance graph to demonstrate the essential information for an explanation. AI support has been available for loan processing for quite some time [23], and the need for explainable AI has been identified in this context. First solutions providing some explanation have been suggested in [24]. Although that work provides some explanation, it does not link the explanation to any input data. Therefore, it does not deliver the information humans need to fully backtrack the decision to the input data and verify both.

Let us consider a hypothetical scenario of an AI-based decision-oriented loan processing system, where a female customer of a bank requests a loan to buy a new car. A provenance graph is able to track the connection between the application data (input) and the decision (prediction outcome) by getting information from each processing step (activities) with the data (entities) and from the agents involved. Fig. 3 illustrates the related agents, activities, and entities in the provenance graph. The meanings of the shapes used are indicated in the lower-left corner of the figure. The graph also indicates the relationships among the different attributes (entity, activity, and agent). For ease of representation, we indicate the relevant relationships [12] as WasGeneratedBy (WGeB), WasAssociatedWith (WAsW), WasInformedBy (WInB), and WasDerivedFrom (WDeF). While the details of the agents, activities, and entities involved in the graph delineate the explanation of the 'who-what-when-where' of the system, we include 'security controls' and 'legal controls' to document the security-related and law/policy-related information, respectively. Therefore, it creates the opportunity to add 'which'- and 'why'-related information for security-aware and human-centric explanations.

Fig. 3 illustrates the internal attributes involved in a loan processing provenance graph. After the customer requests a loan, the 'loan processing' activity evaluates the loan application according to some machine learning model and based on the training data of the system. The result is a recommendation to approve or reject based on the risk of the loan defaulting. We assume that the loan is not approved, as the system model is designed in a way that does not sufficiently value financial benefits to a female customer, while the underlying assumption in the system model is 'female = less creditworthy'. From the provider's design view and a purely mathematical risk estimation, such a bias might be logical, for example, if the data used for training the AI showed that women had a record of lower income and thus posed a higher chance of financial instability. The system would therefore reject loans to women with a higher probability. Such behavior is currently not transparent to an individual applicant. With a provenance-based explanation, it would be possible to get the justification of the decision. In particular, it would link the input data parameters to the decisions. Combined with XAI data, such as statistics on previous decisions and the ability to audit, there would be higher transparency on the causes of the decisions. Therefore, from a human-centric view, one could actually analyze the results and conclude that 'the logic' is improper and unacceptable. Such an explanation should be added to justify the decision to the customer, as the decisive factor should actually be 'income', not 'gender'. Another angle to be considered is security. If a data forgery attack is mounted by an adversary between the loan processing steps, the decision could be changed, and the user would get a skewed view of the AI's behavior. Therefore, if we do not document the security information for the complete system, starting from the input data through the different layers of the system, there would be no possibility of determining whether there has been any 'deliberate bias', 'system fault', or manipulations.
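The provenance graph of Fig. 3 could be sketched in code in the same way as the earlier examples. The snippet below (assuming the Python 'prov' package) records the customer's request, the loan processing activity with hypothetical 'which' and 'why' annotations, and the relations named above (WGeB, WAsW, WInB, WDeF); identifiers, attribute names, and values are illustrative assumptions for this hypothetical scenario.

    # Sketch of the Fig. 3 loan-processing graph, assuming the Python 'prov'
    # package as in the earlier sketches. Identifiers, attribute names, and
    # values are illustrative assumptions for the hypothetical scenario.
    from prov.model import ProvDocument

    doc = ProvDocument()
    doc.add_namespace("ex", "http://example.org/loan/")
    doc.add_namespace("xai", "http://example.org/xai-extensions/")

    # Agents ('who')
    customer = doc.agent("ex:customer_C", {"prov:type": "prov:Person"})
    bank = doc.agent("ex:bank_B", {"prov:type": "prov:Organization"})

    # Entities ('what'): application data, training data, model, and outcome
    data_x = doc.entity("ex:Datax", {"ex:income": 54000, "ex:loanAmount": 20000})
    training_data = doc.entity("ex:training_data")
    model = doc.entity("ex:risk_model")
    data_y = doc.entity("ex:Datay", {"ex:approval": "no"})

    # Activities ('when'/'where'), annotated with 'which' and 'why' metadata
    request = doc.activity("ex:requesting_bank_loan", other_attributes={
        "xai:securityControl": "authenticated customer session, TLS 1.3",
    })
    processing = doc.activity("ex:loan_processing", other_attributes={
        "xai:securityControl": "signed model artifact, access-controlled training set",
        "xai:legalControl": "anti-discrimination policy: gender excluded from features",
    })

    # Relations as labelled in Fig. 3
    doc.wasAssociatedWith(request, customer)   # WAsW
    doc.wasGeneratedBy(data_x, request)        # WGeB
    doc.wasInformedBy(processing, request)     # WInB
    doc.wasAssociatedWith(processing, bank)    # WAsW
    doc.used(processing, training_data)
    doc.used(processing, model)
    doc.wasGeneratedBy(data_y, processing)     # WGeB
    doc.wasDerivedFrom(data_y, data_x)         # WDeF

    print(doc.serialize(indent=2))

A user-facing justification could then be derived by traversing this graph from the outcome (ex:Datay) back to the application data (ex:Datax) and reporting the recorded input attributes together with the attached security and legal annotations along that path.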
[Fig. 2 shows the data provenance-based explanation schema as five steps of big data analysis in AI, connected by workflows from an actor to an end-user: Step 1, gathering data from various sources (Datax); Step 2, cleaning data; Step 3, model building and selecting the right ML algorithm (input, hidden, and output layers); Step 4, getting insight/AI prediction; Step 5, data visualization (Datay). The steps are annotated with the relevant provenance questions: 'Is the exact representation of data reported?', 'Is the process transparency reported?', 'Is the appropriate law/policy maintained?', and 'Is the exact prediction mechanism documented?'.]

Fig. 2. Data provenance based explanation schema in a decision-oriented AI system

[Fig. 3 shows the example loan-processing provenance graph: the customer C and bank B (agents) are associated with the 'Requesting Bank Loan' and 'Loan Processing' activities; Datax (BID, Cinfo, CID), the training data, the model, the prediction, and the approval outcome Datay (yes/no) are entities; security controls, legal controls, and law/policy annotations are attached to the activities; the relations WGeB, WAsW, WInB, and WDeF connect the records, and the end-user receives the outcome. The legend in the lower-left corner distinguishes entity, activity, agent, and security evidence shapes.]

Fig. 3. An Example Provenance Graph representing an AI-based Loan Processing System

V. CHALLENGES AND FUTURE RESEARCH DIRECTIONS

Working with data provenance generates several challenges, as the provenance graph and the provenance records are themselves vulnerable to various security threats. Hence, maintaining a secured record of data provenance is an important issue [25] [11]. We are working on a comprehensive data provenance model and introducing the concept of adding security- and legal-related metadata at each step of data processing. Such metadata can include evidence on active security controls, various authorization and/or authentication mechanisms for users or software running on a device, and legal perspectives on generating any decision. While in a communication architecture we know which protocols to document for further validation of communications, in the case of an AI-based system this is unknown. Therefore, a crucial challenge is where to include and document the relevant security metadata.

Another challenge in making a fully designed data provenance-based system is that it should produce explanations interpretable by the target community of users. This target community may differ for different applications, and the system administrator must design specific requirements accordingly.

Since we are also focusing on the human-centric values of an explanation in a provenance graph, considerations should be designed from the human and policy levels. Social scientists, psychologists, policymakers, and lawyers should determine how the practice needs to be imposed for better efficiency among different applications/scenarios. The fact is, working with these diverse groups may introduce further challenges. For example, the information essential from a lawyer's perspective may affect certain human rights described by a social scientist. A plan for recognizing and reconciling these diverse agendas is vital. Therefore, this work reveals an interdisciplinary research opportunity and creates prospects for industry and academic collaboration.

Provenance data in any framework should include a granularity policy [26] so that users can have sufficient documentation at any point. As a result, firstly, the system can restrict/limit the use of biased or poor training sets. Secondly, with proper documentation of security metadata, it can estimate the risk (if any). However, if the system does not know what information to include and where, there are always possibilities of misconduct.

If we get all the relevant explanations for an AI-based system, the next challenge is how we can present them to an end-user to give an overall view of the system. Therefore,


visualization is another issue to explore in the implementation of a provenance-based explainable AI system. Researchers need to maintain a trade-off between retaining any required 'secrecy' and 'transparency' to the target group of users in order to maintain security, trust, and useful data privacy.

VI. CONCLUSION

Several researchers have highlighted AI's vulnerabilities associated with its development processes, or the chain of reasoning in AI-based systems. Hence, maintaining the transparency and explainability of AI-based systems, particularly in the second wave of AI, is crucial. Solutions that also add meaningful security and legal metadata are required for AI-based systems in any critical scenario. This paper introduces four open issues and discusses possible contributions of provenance graphs to big data analysis and AI interpretability, both from a security-aware and a human-centric view. In this paper, we first demonstrate how the basic design of a provenance graph can help describe the overall system. Afterward, we propose a framework representing the six necessary elements of information to be added to a provenance graph. This framework presents the information needed for the explainability and transparency of an overall AI-based decision system. Our research identifies some essential points for the future design of explainable AI. Finally, we identify specific challenges in this area that would provoke interesting future research directions.

Acknowledgement

Special thanks to the ICT Division, Ministry of Posts, Telecommunication and IT, People's Republic of Bangladesh, for the student research fellowship.

REFERENCES

[1] S. A. Bini, "Artificial intelligence, machine learning, deep learning, and cognitive computing: What do these terms mean and how will they impact health care?" The Journal of Arthroplasty, vol. 33, no. 8, pp. 2358–2361, 2018.
[2] G. Hurlburt, "How much to trust artificial intelligence?" IT Professional, vol. 19, no. 4, pp. 7–11, 2017.
[3] "Autonomous vehicle," accessed: 2020-02-04. [Online]. Available: https://www.nytimes.com/interactive/2018/03/20/us/self-driving-uber-pedestrian-killed.html
[4] L. Frost, "Explainable AI and other questions where provenance matters," IEEE IoT Newsletter, https://iot.ieee.org/newsletter/january-2019/explainable-ai-and-other-questions-where-provenance-matters, accessed: 2020-03-04.
[5] A. S. Keshavarz, T. D. Huynh, and L. Moreau, "Provenance for online decision making," in International Provenance and Annotation Workshop. Springer, 2014, pp. 44–55.
[6] P. Buneman and W.-C. Tan, "Data provenance: What next?" ACM SIGMOD Record, vol. 47, no. 3, pp. 5–16, 2019.
[7] J. Shaw, F. Rudzicz, T. Jamieson, and A. Goldfarb, "Artificial intelligence and the implementation challenge," J Med Internet Res, vol. 21, no. 7, p. e13659, Jul 2019.
[8] A. Preece, "Asking 'why' in AI: Explainability of intelligent systems – perspectives and challenges," Intelligent Systems in Accounting, Finance and Management, vol. 25, no. 2, pp. 63–72, 2018.
[9] D. Castelvecchi, "Can we open the black box of AI?" Nature News, vol. 538, no. 7623, p. 20, 2016.
[10] P. Voigt and A. Von dem Bussche, "The EU General Data Protection Regulation (GDPR)," A Practical Guide, 1st ed., Cham: Springer International Publishing, 2017.
[11] F. T. Jaigirdar, C. Rudolph, and C. Bain, "Can I Trust the Data I See?" in Proceedings of the Australasian Computer Science Week Multiconference. Sydney, NSW, Australia: ACM, 2019, pp. 1–10.
[12] L. Moreau and P. Missier, "PROV-DM: The PROV data model. W3C recommendation," World Wide Web Consortium, 2013.
[13] C. Rigano, "Using artificial intelligence to address criminal justice needs," National Institute of Justice, no. 280, 2019.
[14] C. Kuziemsky, A. J. Maeder, O. John, S. B. Gogia, A. Basu, S. Meher, and M. Ito, "Role of artificial intelligence within the telehealth domain: Official 2019 yearbook contribution by the members of IMIA telehealth working group," Yearbook of Medical Informatics, vol. 28, no. 1, p. 35, 2019.
[15] A. Završnik, "Criminal justice, artificial intelligence systems, and human rights," in ERA Forum, vol. 20, no. 4. Springer, 2020, pp. 567–583.
[16] M. Latonero, "Governing artificial intelligence: Upholding human rights & dignity," Data & Society, pp. 1–37, 2018.
[17] R. Crawshaw and L. Holmström, "Convention for the protection of human rights and fundamental freedoms (European Convention on Human Rights)," in Essential Texts on Human Rights for the Police, 2008, pp. 323–340.
[18] D. Pedreschi, F. Giannotti, R. Guidotti, A. Monreale, S. Ruggieri, and F. Turini, "Meaningful explanations of black box AI decision systems," in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, 2019, pp. 9780–9784.
[19] S. F. Jentzsch and N. Hochgeschwender, "Don't forget your roots! Using provenance data for transparent and explainable development of machine learning models," in 2019 34th IEEE/ACM International Conference on Automated Software Engineering Workshop (ASEW). IEEE, 2019, pp. 37–40.
[20] S. Liu, X. Wang, M. Liu, and J. Zhu, "Towards better analysis of machine learning models: A visual analytics perspective," Visual Informatics, vol. 1, no. 1, pp. 48–56, 2017.
[21] D. Doran, S. Schulz, and T. R. Besold, "What does explainable AI really mean? A new conceptualization of perspectives," in CEUR Workshop Proceedings, vol. 2071, 2018.
[22] S. Salmin, G. Gabriel, B. Elisa, and S. Mohamed, "A lightweight secure provenance scheme for wireless sensor networks," in Proceedings of the International Conference on Parallel and Distributed Systems (ICPADS), vol. 6, no. 1, 2012.
[23] S. F. Eletter, S. G. Yaseen, and G. A. Elrefae, "Neuro-based artificial intelligence model for loan decisions," American Journal of Economics and Business Administration, vol. 2, no. 1, p. 27, 2010.
[24] S. Sachan, J.-B. Yang, D.-L. Xu, D. E. Benavides, and Y. Li, "An explainable AI decision-support-system to automate loan underwriting," Expert Systems with Applications, vol. 144, p. 113100, 2020.
[25] A. Alkhalil and R. A. Ramadan, "IoT data provenance implementation challenges," Procedia Computer Science, vol. 109, pp. 1134–1139, 2017.
[26] K. W. Hamlen, L. Kagal, and M. Kantarcioglu, "Policy enforcement framework for cloud data management," IEEE Data Eng. Bull., vol. 35, no. 4, pp. 39–45, 2012.



