Open Ended GPT2 Text Generation Explanations

This notebook demonstrates how to obtain explanations for the output of GPT-2 used for open-ended text generation. In this demo, we use the pretrained GPT-2 model provided by Hugging Face (https://huggingface.co/gpt2) and explain the text it generates. We further show how to get explanations for custom generated output text and how to plot global input token importances for any generated output token.

[1]:
from transformers import AutoModelForCausalLM, AutoTokenizer

import shap

Load model and tokenizer

[2]:
tokenizer = AutoTokenizer.from_pretrained("gpt2", use_fast=True)
model = AutoModelForCausalLM.from_pretrained("gpt2").cuda()

Below, we set certain model configurations. We need to declare whether the model is a decoder or an encoder-decoder; this is set through the 'is_decoder' or 'is_encoder_decoder' attribute of the model's config. We can also set custom generation parameters, which will be used during the decoding process that produces the output text.

[3]:
# set model decoder to true
model.config.is_decoder = True
# set text-generation params under task_specific_params
model.config.task_specific_params["text-generation"] = {
    "do_sample": True,
    "max_length": 50,
    "temperature": 0.7,
    "top_k": 50,
    "no_repeat_ngram_size": 2,
}
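As a rough illustration of how such task-specific defaults interact with per-call settings, the sketch below merges a stored parameter dictionary with explicit overrides. This is a pure-Python sketch under our own assumptions, not SHAP or transformers internals; the helper name `generation_kwargs` is ours:

```python
# Sketch (not library internals): merge task-specific generation defaults,
# like the ones stored on model.config above, with per-call overrides.
def generation_kwargs(task_specific_params, task="text-generation", **overrides):
    kwargs = dict(task_specific_params.get(task, {}))  # copy the stored defaults
    kwargs.update(overrides)                           # explicit arguments win
    return kwargs

task_specific_params = {
    "text-generation": {
        "do_sample": True,
        "max_length": 50,
        "temperature": 0.7,
        "top_k": 50,
        "no_repeat_ngram_size": 2,
    }
}

kwargs = generation_kwargs(task_specific_params, max_length=30)
```

Here the stored defaults survive (for example `top_k` stays 50) while the explicit `max_length=30` takes precedence.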

Define initial text

[4]:
s = ["I enjoy walking with my cute dog"]

Create an explainer object and compute the SHAP values

[5]:
explainer = shap.Explainer(model, tokenizer)
shap_values = explainer(s)
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
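Before plotting, the returned Explanation object can also be inspected directly: for a text-to-text explanation, `shap_values.values[i]` is typically a matrix with one row per input token and one column per generated output token. A tiny stand-alone sketch with made-up numbers (not the values from this run) showing how to find the strongest driver of a given output token:

```python
import numpy as np

# Illustrative stand-in for shap_values.values[0]:
# rows = input tokens, columns = generated output tokens.
values = np.array([
    [-0.117, 0.02],   # "I"
    [ 0.072, 0.01],   # "with"
    [ 4.064, 0.50],   # "dog"
])
input_tokens = ["I", "with", "dog"]

# Input token with the largest positive contribution to output token 0:
top_driver = input_tokens[int(np.argmax(values[:, 0]))]
```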

Visualize the SHAP explanations

[6]:
shap.plots.text(shap_values)


[0]
Generated output: ", but I'm not sure if I'll ever be able to"
Input token SHAP values (base value -4.049, f(inputs) -1.275):

    I        -0.117
    enjoy    -0.431
    walking  -0.427
    with      0.072
    my       -0.150
    cute     -0.238
    dog       4.064

Another example…

[7]:
s = [
    "Scientists confirmed the worst possible outcome: the massive asteroid will collide with Earth"
]
[8]:
explainer = shap.Explainer(model, tokenizer)
shap_values = explainer(s)
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
[9]:
shap.plots.text(shap_values)


[0]
Generated output: "in the coming days."
Input token SHAP values (base value -4.740, f(inputs) -1.738):

    Scientists    0.442
    confirmed    -0.035
    the          -0.168
    worst         0.240
    possible     -0.156
    outcome       0.075
    :            -0.365
    the          -0.385
    massive       0.093
    asteroid      0.221
    will         -0.166
    collide       1.280
    with          0.489
    Earth         1.436

Custom text generation and debugging biased outputs

Below we demonstrate how to explain the likelihood of the model generating a particular output sentence given an input sentence. For example, we ask: which country's inhabitants (the target) in the sentence "I know many people who are [target]." would give a high likelihood of generating the token "vodka" in the output sentence "They love their vodka!"? For this, we first define input-output sentence pairs.

[10]:
# define input
x = [
    "I know many people who are Russian.",
    "I know many people who are Greek.",
    "I know many people who are Australian.",
    "I know many people who are American.",
    "I know many people who are Italian.",
    "I know many people who are Spanish.",
    "I know many people who are German.",
    "I know many people who are Indian.",
]
[11]:
# define output
y = [
    "They love their vodka!",
    "They love their vodka!",
    "They love their vodka!",
    "They love their vodka!",
    "They love their vodka!",
    "They love their vodka!",
    "They love their vodka!",
    "They love their vodka!",
]

We wrap the model with a Teacher Forcing scoring class and create a Text masker

[12]:
teacher_forcing_model = shap.models.TeacherForcing(model, tokenizer)
masker = shap.maskers.Text(tokenizer, mask_token="...", collapse_mask_token=True)
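The `mask_token` and `collapse_mask_token` options control how hidden input tokens appear to the model: masked-out tokens are replaced by "...", and consecutive masked tokens collapse into a single "..." marker. A rough pure-Python sketch of that behavior (our own illustration, not SHAP's actual implementation):

```python
def apply_text_mask(tokens, keep, mask_token="...", collapse_mask_token=True):
    """Replace masked-out tokens with mask_token, optionally collapsing runs."""
    out = []
    for token, keep_it in zip(tokens, keep):
        if keep_it:
            out.append(token)
        elif collapse_mask_token and out and out[-1] == mask_token:
            continue  # extend the previous "..." instead of adding another
        else:
            out.append(mask_token)
    return out

masked = apply_text_mask(["I", "know", "many", "people"], [True, False, False, True])
```

With collapsing enabled, masking "know" and "many" yields `["I", "...", "people"]` rather than two separate "..." tokens, which keeps the perturbed sentences closer to natural text.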

Create an explainer…

[13]:
explainer = shap.Explainer(teacher_forcing_model, masker)

Generate SHAP explanation values!

[14]:
shap_values = explainer(x, y)

Now that we have generated the SHAP values, we can look at the contribution of input tokens driving the token "vodka" in the output sentence using the text plot. Note: red indicates a positive contribution, blue indicates a negative contribution, and the intensity of the color shows the strength of the contribution in the respective direction.

[15]:
shap.plots.text(shap_values)


All eight inputs generate the output "They love their vodka!". Input token SHAP values toward the first output token "They" (base value -8.248 for every sentence):

Target       f(inputs)       I    know    many  people     who     are  target      .
Russian         -8.785  -0.377  -0.158  -0.157   0.124   0.035   0.109  -0.488  0.375
Greek           -8.949  -0.351  -0.125  -0.242   0.149   0.054   0.144  -0.716  0.387
Australian      -8.676  -0.410  -0.158  -0.176   0.144  -0.015   0.015  -0.529  0.701
American        -9.143  -0.439  -0.185  -0.162   0.134  -0.030   0.030  -0.632  0.390
Italian         -9.083  -0.454  -0.149  -0.240   0.106   0.079   0.155  -0.760  0.428
Spanish         -9.074  -0.399  -0.150  -0.225   0.106   0.156   0.288  -1.015  0.414
German          -8.999  -0.380  -0.125  -0.282   0.138   0.063   0.186  -0.811  0.460
Indian          -8.631  -0.484  -0.227  -0.210   0.128  -0.054  -0.011   0.100  0.374

To view which input tokens impact (positively or negatively) the likelihood of generating the word "vodka", we plot the global token importances for the word "vodka".

Voila! Russians love their vodka, don't they? :)

[16]:
shap.plots.bar(shap_values[0, :, "vodka"])
[Bar plot of input token importances for the output token "vodka"]
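The indexing `shap_values[0, :, "vodka"]` selects, for the first sentence, the SHAP values of every input token toward the output token "vodka". A pure-Python sketch of that selection, and of the sort-by-magnitude ordering a bar plot typically uses (all numbers are illustrative, not from this run):

```python
import numpy as np

input_tokens = ["I", "know", "many", "people", "who", "are", "Russian", "."]
output_tokens = ["They", "love", "their", "vodka", "!"]

# Illustrative (input tokens x output tokens) SHAP matrix for sentence 0.
rng = np.random.default_rng(0)
values = rng.normal(size=(len(input_tokens), len(output_tokens)))

# Rough equivalent of shap_values[0, :, "vodka"]:
# the column of contributions for that output token.
vodka_col = values[:, output_tokens.index("vodka")]

# Bar plots generally rank tokens by absolute contribution.
order = np.argsort(-np.abs(vodka_col))
ranked_tokens = [input_tokens[i] for i in order]
```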

Have an idea for more helpful examples? Pull requests that add to this documentation notebook are encouraged!