Using custom functions and tokenizers
This notebook demonstrates how to use the Partition explainer for a multiclass text classification scenario where we are using a custom Python function as our model.
[1]:
import datasets
import numpy as np
import pandas as pd
import scipy as sp
import scipy.special  # make sure sp.special is available for the logit call below
import torch
import transformers
import shap
# load the emotion dataset
dataset = datasets.load_dataset("emotion", split="train")
data = pd.DataFrame({"text": dataset["text"], "emotion": dataset["label"]})
Using custom data configuration default
Reusing dataset emotion (/home/slundberg/.cache/huggingface/datasets/emotion/default/0.0.0/aa34462255cd487d04be8387a2d572588f6ceee23f784f37365aa714afeb8fe6)
Define our model
While here we are using the transformers package, any python function that takes in a list of strings and outputs scores will work.
[2]:
# load the model and tokenizer
tokenizer = transformers.AutoTokenizer.from_pretrained("nateraw/bert-base-uncased-emotion", use_fast=True)
model = transformers.AutoModelForSequenceClassification.from_pretrained("nateraw/bert-base-uncased-emotion").cuda()
labels = sorted(model.config.label2id, key=model.config.label2id.get)
# this defines an explicit python function that takes a list of strings and outputs scores for each class
def f(x):
    tv = torch.tensor([tokenizer.encode(v, padding="max_length", max_length=128, truncation=True) for v in x]).cuda()
    attention_mask = (tv != 0).type(torch.int64).cuda()
    outputs = model(tv, attention_mask=attention_mask)[0].detach().cpu().numpy()
    scores = (np.exp(outputs).T / np.exp(outputs).sum(-1)).T
    val = sp.special.logit(scores)
    return val
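As a quick, illustrative sanity check (not part of the original notebook), we can call f directly on a couple of hypothetical example sentences; it should return one row of per-class scores for each input string, with columns matching labels.
print(labels)
sample_scores = f(["i feel great today", "this is so frustrating"])  # hypothetical example inputs
print(sample_scores.shape)  # expected: (2, number of emotion classes)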
Create an explainer
In order to build an Explainer we need both a model and a masker (the masker specifies how to hide portions of the input). Since we are using a custom function as our model, there is no way for SHAP to auto-infer a masker for us, so we need to provide one ourselves: either implicitly, by passing a transformers tokenizer, or explicitly, by building a shap.maskers.Text object.
[3]:
method = "custom tokenizer"
# build an explainer by passing a transformers tokenizer
if method == "transformers tokenizer":
    explainer = shap.Explainer(f, tokenizer, output_names=labels)

# build an explainer by explicitly creating a masker
elif method == "default masker":
    masker = shap.maskers.Text(r"\W")  # this will create a basic whitespace tokenizer
    explainer = shap.Explainer(f, masker, output_names=labels)

# build a fully custom tokenizer
elif method == "custom tokenizer":
    import re

    def custom_tokenizer(s, return_offsets_mapping=True):
        """Custom tokenizers conform to a subset of the transformers API."""
        pos = 0
        offset_ranges = []
        input_ids = []
        for m in re.finditer(r"\W", s):
            start, end = m.span(0)
            offset_ranges.append((pos, start))
            input_ids.append(s[pos:start])
            pos = end
        if pos != len(s):
            offset_ranges.append((pos, len(s)))
            input_ids.append(s[pos:])
        out = {}
        out["input_ids"] = input_ids
        if return_offsets_mapping:
            out["offset_mapping"] = offset_ranges
        return out

    masker = shap.maskers.Text(custom_tokenizer)
    explainer = shap.Explainer(f, masker, output_names=labels)
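To make the tokenizer contract concrete, here is a small illustrative call (assuming the "custom tokenizer" branch above was taken); the masker only needs the token strings as input_ids and their character spans as offset_mapping.
print(custom_tokenizer("I feel nothing but joy"))
# -> {'input_ids': ['I', 'feel', 'nothing', 'but', 'joy'],
#     'offset_mapping': [(0, 1), (2, 6), (7, 14), (15, 18), (19, 22)]}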
Compute SHAP values
Explainers have the same method signature as the models they are explaining, so we just pass a list of strings whose classifications we want to explain.
[4]:
shap_values = explainer(data["text"][:3])
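The call returns a shap.Explanation object; as a rough inspection sketch, each row holds one attribution matrix of shape (number of tokens, number of classes), and the class names we passed via output_names are attached to the output dimension.
print(shap_values[0].values.shape)  # (tokens in the first sentence, number of emotion classes)
print(shap_values.output_names)     # the labels list passed to the explainer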
Visualize the impact on all the output classes
In the plots below, when you hover your mouse over an output class you get the explanation for that output class. When you click an output class name, that class remains the focus of the explanation visualization until you click another class.
The base value is what the model outputs when the entire input text is masked, while \(f_{\text{output class}}(\text{inputs})\) is the output of the model for the full original input. The SHAP values explain in an additive way how unmasking each word moves the model output from the base value (where the entire input is masked) to the final prediction value.
[5]:
shap.plots.text(shap_values)
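As a hedged sketch of the additivity property described above (reusing the f defined earlier), the base value plus the sum of a sentence's SHAP values should reproduce the model's logit output for that sentence up to numerical error.
# additivity check for the first sentence: base value + summed token attributions ~= f(input)
reconstructed = shap_values[0].base_values + shap_values[0].values.sum(0)
print(np.allclose(reconstructed, f([data["text"][0]])[0], atol=1e-4))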