E0234 PPT

AN ATTENTION MATRIX FOR EVERY
DECISION
Harsh Vishwakarma
21532
MTech, CSA
Deep Learning for NLP

E0-334
Table of Contents
Introduction
Optimus Transformer Interpretability
Operations over Attention Matrices
Metric to measure the interpretations
Experiments
Introduction
What is the most important aspect of a model in a real-world scenario?
- The model needs to be interpretable.
- Enhances performance on binary and multi-label data.
- The authors mainly focus on:
  - a new technique that selects the most faithful attention-based
    interpretation among the several that can be obtained by
    combining different head, layer, and matrix operations.
Interpretability
What is interpretability?
- A model's ability to provide insights into its decisions or inner
  workings, whether intrinsically or not, is referred to as
  interpretability.
- Complex models, such as transformers, cannot provide
  interpretations out of the box, so post-hoc techniques are
  typically applied. Representations of an interpretation include,
  among others, rules, heatmaps, and feature importance.
Interpretability of Transformer
How can we interpret the results generated by the transformer?
- The most popular transformer-specific interpretability
  approach is the use of self-attention scores.
- We can also generate attention maps to check which parts of the
  input receive the most attention for a particular input instance.
Feature importance based methods
- We can consider techniques like Layer-wise Relevance Propagation
  (LRP), which propagates the prediction backward through the
  network, layer by layer, to assign a relevance score to each
  input feature.
- Other ready-to-use feature-importance techniques include LIME,
  Integrated Gradients (IG), and SHAP.
How is interpretability evaluated?
- Comprehensibility: the percentage of non-zero weights in an
  interpretation. The lower this number, the easier it is for end
  users to comprehend the interpretation.
- Faithfulness Score: eliminates the token with the highest
  importance score from the examined instance and measures
  how much the prediction changes. Larger changes signify
  better interpretations.
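The two metrics above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: `toy_predict` is a hypothetical stand-in for a real model, and measuring the change as a signed difference between the original and reduced predictions is an assumption.

```python
def comprehensibility(weights, tol=1e-12):
    """Percentage of non-zero weights (lower is easier to comprehend)."""
    if len(weights) == 0:
        return 0.0
    non_zero = sum(1 for w in weights if abs(w) > tol)
    return 100.0 * non_zero / len(weights)


def faithfulness_score(predict, tokens, weights):
    """Prediction change after eliminating the highest-weighted token.

    Assumption: the change is the original prediction minus the
    prediction on the reduced instance.
    """
    top = max(range(len(tokens)), key=lambda i: weights[i])
    reduced = [t for i, t in enumerate(tokens) if i != top]
    return predict(tokens) - predict(reduced)


def toy_predict(tokens):
    """Hypothetical model: fraction of tokens found in a tiny lexicon."""
    negative = {"awful", "bad"}
    return sum(t in negative for t in tokens) / max(len(tokens), 1)


tokens = ["the", "movie", "was", "awful"]
weights = [0.0, 0.1, 0.0, 0.9]

comp = comprehensibility(weights)
faith = faithfulness_score(toy_predict, tokens, weights)
print(comp, faith)
```

Here two of the four weights are non-zero, and eliminating "awful" drops the toy prediction to zero, so the interpretation is judged faithful for this instance.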
Objective: Given a transformer model $f$ and an input sequence
$x = [t_1, \ldots, t_S]$ consisting of $S$ tokens $t_i$,
$i = 1, \ldots, S$, our goal is to extract a local interpretation
$z = [w_1, \ldots, w_S]$, where $w_i \in \mathbb{R}$ signifies the
influence of token $t_i$ on the model's decision $f(x)$, based on
the model's self-attention scores.
Using Attention scores
We know that the attention scores corresponding to each token
are generated as:
- $A = \mathrm{softmax}\left(\frac{QK^{T}}{\sqrt{d}} + \mathrm{mask}\right)$,
  where $A \in \mathbb{R}^{S \times S}$ and $S$ is the length of the
  sequence.
- To obtain scores with both polarities (positive and negative),
  the authors remove the softmax function and name the resulting
  matrix $A^{*}$.
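The difference between the two matrices can be sketched with NumPy for a single attention head. This is a minimal sketch under stated assumptions: the function name `attention_scores`, the random `Q` and `K`, and the sizes are illustrative, and the additive mask is optional.

```python
import numpy as np


def attention_scores(Q, K, mask=None):
    """Return (A, A_star) for one head.

    A_star holds the scaled dot-product scores before softmax, so it
    can contain negative values; A is the row-wise softmax of A_star.
    """
    d = Q.shape[-1]
    a_star = Q @ K.T / np.sqrt(d)
    if mask is not None:
        a_star = a_star + mask
    # Numerically stable row-wise softmax.
    shifted = a_star - a_star.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    a = exp / exp.sum(axis=-1, keepdims=True)
    return a, a_star


rng = np.random.default_rng(0)
S, d = 4, 8  # illustrative sequence length and head dimension
Q = rng.normal(size=(S, d))
K = rng.normal(size=(S, d))

A, A_star = attention_scores(Q, K)
print(A.shape, A_star.shape)
```

Each row of `A` is non-negative and sums to one, which is why it cannot express negative influence; `A_star` keeps the sign of the raw scores.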
How the attention matrix is interpreted
Aggregation of Attention Matrix: The process involves
aggregating attention matrices across all heads within each
self-attention layer.
Head Operations: Common operations, such as averaging and
summing, are applied to the attention matrices of each head.
Averaging and summing give the same token importance order and
differ only in the magnitude of the scores assigned to tokens.
Final Interpretation Vector:
- Operations like "From [CLS]" and "To [CLS]" extract the
  attention involving the special [CLS] token, which is typically
  prepended in text classification tasks. They consider the
  attention the [CLS] token gives to, and receives from, the
  other tokens.
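The aggregation steps above can be sketched as follows. This is an illustrative sketch, not the authors' code. It assumes the attention scores are stored in an array of shape (layers, heads, S, S), that row i holds the attention token i gives to the others, and that [CLS] sits at position 0. The dictionary names and the layer operations are assumptions.

```python
import numpy as np

# Each dictionary maps an operation name to a reduction (names are illustrative).
HEAD_OPS = {"mean": lambda a: a.mean(axis=1), "sum": lambda a: a.sum(axis=1)}
LAYER_OPS = {"mean": lambda a: a.mean(axis=0), "sum": lambda a: a.sum(axis=0)}
MATRIX_OPS = {
    "from_cls": lambda m: m[0, :],  # attention [CLS] gives to each token
    "to_cls": lambda m: m[:, 0],    # attention [CLS] receives from each token
}


def interpret(attn, head_op, layer_op, matrix_op):
    """Reduce a (layers, heads, S, S) array to one weight per token."""
    per_layer = HEAD_OPS[head_op](attn)      # shape (layers, S, S)
    matrix = LAYER_OPS[layer_op](per_layer)  # shape (S, S)
    return MATRIX_OPS[matrix_op](matrix)     # shape (S,)


rng = np.random.default_rng(0)
attn = rng.normal(size=(2, 3, 5, 5))  # 2 layers, 3 heads, 5 tokens

z_mean = interpret(attn, "mean", "mean", "from_cls")
z_sum = interpret(attn, "sum", "mean", "from_cls")
print(z_mean.shape)
```

With three heads, `z_sum` is three times `z_mean`, so both give the same token ranking and differ only in magnitude.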
Selecting the Best Interpretation
- Select the best set of operations by iterating through different
  combinations of operations across the layers and heads in the
  transformer model.
- Find the most faithful interpretation for a single instance. The
  method iterates through the combinations of head, layer, and
  matrix operations and evaluates the faithfulness of each
  combination using the metric. The combination with the
  highest faithfulness score is selected as the best interpretation.
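The selection loop described above can be sketched with `itertools.product`. This is a sketch under stated assumptions: `build` and `score` are hypothetical callables standing in for the interpretation builder and the faithfulness metric, and the toy versions below exist only to make the example runnable.

```python
from itertools import product


def select_best(head_ops, layer_ops, matrix_ops, build, score):
    """Return the (combination, interpretation, score) with the highest score."""
    best_combo, best_z, best_score = None, None, float("-inf")
    for combo in product(head_ops, layer_ops, matrix_ops):
        z = build(*combo)  # interpretation vector for this combination
        s = score(z)       # faithfulness of that interpretation
        if s > best_score:
            best_combo, best_z, best_score = combo, z, s
    return best_combo, best_z, best_score


def toy_build(head_op, layer_op, matrix_op):
    """Hypothetical builder: only the matrix operation changes the result."""
    base = [0.1, 0.7, 0.2]
    return base if matrix_op == "from_cls" else base[::-1]


def toy_score(z):
    """Hypothetical faithfulness proxy, not the metric from the slides."""
    return z[1] - z[0]


combo, z, s = select_best(
    ["mean", "sum"], ["mean", "sum"], ["from_cls", "to_cls"],
    toy_build, toy_score,
)
print(combo, z, s)
```

Because the search is run per instance, different inputs can end up with different winning combinations. Ties keep the first combination encountered.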
Ranked Faithful Truthfulness (RFT)
- Objective: RFT is designed to measure the importance of
  each token in the interpretation of the model's output. It
  evaluates the impact of removing each token from the input
  sequence on the model's prediction, taking into account the
  weight the interpretation assigns to that token.
Problems that arise from removing tokens
- In sequence-based models like recurrent neural networks and
  transformers, the removal of tokens disrupts the context for
  the remaining tokens.
- This disruption is especially critical in transformer models,
  which employ positional encoding to understand the sequence
  structure.
- Research suggests that simply removing words or tokens from
  a sequence can create texts that are out-of-distribution for
  the transformer model.
Proposed solution
- The authors suggest replacing tokens with a special token such as
  "[UNK]" (unknown token) instead of deleting them entirely.
  Replacing a token with "[UNK]" neutralizes its influence
  while minimally affecting the context for the remaining tokens.
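The difference between deleting a token and replacing it can be shown on a token list. This is a minimal sketch: the helper names `mask_token` and `delete_token` are illustrative, and a real pipeline would apply the replacement through the model's tokenizer.

```python
def mask_token(tokens, index, unk="[UNK]"):
    """Replace one token with the unknown token, keeping all positions."""
    if not 0 <= index < len(tokens):
        raise IndexError("token index out of range")
    masked = list(tokens)  # copy so the original sequence is untouched
    masked[index] = unk
    return masked


def delete_token(tokens, index):
    """Remove one token, which shifts every later token one position left."""
    return [t for i, t in enumerate(tokens) if i != index]


tokens = ["[CLS]", "this", "is", "awful", "[SEP]"]

masked = mask_token(tokens, 3)
deleted = delete_token(tokens, 3)
print(masked)
print(deleted)
```

The masked sequence keeps its length, so the positional encoding of every remaining token is unchanged. In the deleted sequence, "[SEP]" moves from position 4 to position 3.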
Experiments
The figures below show the ground-truth and Optimus Prime
interpretability heatmaps.
Results on HateSpeech dataset
Results on Assignment1 Classification1 dataset
Results on SNLI dataset

Experiments
Results on HoC dataset

Thank You!
