shap.models.TeacherForcing

class shap.models.TeacherForcing(model, tokenizer=None, similarity_model=None, similarity_tokenizer=None, batch_size=128, device=None)

Generates scores (log odds) for output text explanation algorithms using the teacher forcing technique.

This class supports generating log odds for transformer models as well as for arbitrary functions. In the model-agnostic case (when the model is a function), it expects a similarity_model and similarity_tokenizer to approximate the log odds scores of the target sentences generated by the model.

__init__(model, tokenizer=None, similarity_model=None, similarity_tokenizer=None, batch_size=128, device=None)

Build a teacher forcing model from the given text generation model.

Parameters:
model: object or function

An object of any pretrained transformer model, or a function, which is to be explained.

tokenizer: object

A tokenizer object (PreTrainedTokenizer/PreTrainedTokenizerFast) used to tokenize the source and target sentences.

similarity_model: object

A pretrained transformer model object used in the model-agnostic scenario to approximate log odds.

similarity_tokenizer: object

A tokenizer object (PreTrainedTokenizer/PreTrainedTokenizerFast) used to tokenize sentences in the model-agnostic scenario.

batch_size: int

Batch size for model inference and for computing log odds (default=128).

device: str

By default, the device is inferred based on whether the system has a GPU. Should be 'cpu' or 'cuda' for PyTorch models.

Returns:
numpy.ndarray

The scores (log odds) of generating the target sentence ids using the model.
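
For illustration, here is a minimal construction sketch under common assumptions (the Hugging Face model name, the generate_text function, and the masker/explainer pairing are illustrative, not part of this class's API):

    import transformers
    import shap

    # Transformer case: wrap a pretrained seq2seq model together with its tokenizer.
    tokenizer = transformers.AutoTokenizer.from_pretrained("t5-small")
    model = transformers.AutoModelForSeq2SeqLM.from_pretrained("t5-small")
    wrapped_model = shap.models.TeacherForcing(model, tokenizer)

    # Model-agnostic case: generate_text stands in for any function mapping a batch
    # of inputs to output sentences; a similarity model/tokenizer pair is then
    # needed to approximate the log odds of those outputs.
    # wrapped_model = shap.models.TeacherForcing(
    #     generate_text,
    #     similarity_model=model,
    #     similarity_tokenizer=tokenizer,
    # )

    # The wrapped model is typically paired with a text masker and an explainer:
    # masker = shap.maskers.Text(tokenizer)
    # explainer = shap.Explainer(wrapped_model, masker)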

Methods

__init__(model[, tokenizer, ...])

Build a teacher forcing model from the given text generation model.

get_inputs(X[, padding_side])

The function tokenizes source sentences.

get_logodds(logits)

Calculates log odds from logits.

get_output_names(output)

Gets the output tokens by computing the output sentence ids and output names using the similarity_tokenizer.

get_outputs(X)

The function tokenizes output sentences and returns ids.

get_teacher_forced_logits(X, Y)

The function generates logits for transformer models.

load(in_file[, instantiate])

This is meant to be overridden by subclasses and called with super.

model_inference(inputs, output_ids)

This function performs model inference for TensorFlow and PyTorch models.

save(out_file)

Save the model to the given file stream.

update_output_names(output)

The function updates output tokens.

get_inputs(X, padding_side='right')

The function tokenizes source sentences.

In the model-agnostic case, the function calls model(X), which is expected to return a batch of output sentences; these are then tokenized to compute the inputs.

Parameters:
X: numpy.ndarray

X can be a batch of text, or of images in the model-agnostic case.

Returns:
dict

Dictionary of padded source sentence ids and attention mask as tensors ("pt" or "tf", based on similarity_model_type).
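
As an informal illustration of this structure, a Hugging Face tokenizer call along the following lines produces such a dictionary (the sentences and the tokenizer variable are placeholders):

    # Sketch: padded source sentence ids and attention mask as PyTorch tensors ("pt");
    # a TensorFlow similarity model would use return_tensors="tf" instead.
    source_sentences = ["The weather is nice today.", "SHAP explains model output."]
    inputs = tokenizer(source_sentences, padding=True, return_tensors="pt")
    # inputs["input_ids"], inputs["attention_mask"]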

get_logodds(logits)

Calculates log odds from logits.

This function passes the logits through a softmax and then computes the log odds for the output (target sentence) ids.

Parameters:
logits: numpy.ndarray

An array of logits generated from the model.

Returns:
numpy.ndarray

The log odds for the corresponding output ids.
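
Conceptually, the log odds of a single target token can be derived from the logits as in the NumPy sketch below (assumed shapes; not the class's exact implementation):

    import numpy as np

    def logodds_for_token(logits_row, token_id):
        # logits_row: 1-D array of vocabulary logits at one output position.
        probs = np.exp(logits_row - logits_row.max())
        probs /= probs.sum()              # softmax over the vocabulary
        p = probs[token_id]               # probability of the target token id
        return np.log(p) - np.log(1 - p)  # log odds of generating that token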

get_output_names(output)

Gets the output tokens by computing the output sentence ids and output names using the similarity_tokenizer.

Parameters:
output: numpy.ndarray

Output (sentence/sentence ids) for an explanation row.

Returns:
list

A list of output tokens.
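
With a Hugging Face tokenizer playing the role of similarity_tokenizer, the equivalent step is roughly the following (illustrative only; the sentence is a placeholder):

    # Sketch: target sentence -> sentence ids -> readable output token names.
    output_ids = tokenizer("Bonjour le monde")["input_ids"]
    output_names = tokenizer.convert_ids_to_tokens(output_ids)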

get_outputs(X)

The function tokenizes output sentences and returns ids.

Parameters:
X: numpy.ndarray

Output (sentence/sentence ids) for an explanation row.

Returns:
numpy.ndarray

An array of output (target sentence) ids.
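
A rough equivalent with a Hugging Face tokenizer (placeholder sentence; not the method's exact code):

    # Sketch: target sentences -> padded target sentence ids as a NumPy array.
    target_sentences = ["Das Wetter ist heute schön."]
    output_ids = tokenizer(target_sentences, padding=True, return_tensors="np")["input_ids"]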

get_teacher_forced_logits(X, Y)

The function generates logits for transformer models.

It generates logits for encoder-decoder models as well as decoder-only models using the teacher forcing technique.

Parameters:
X: numpy.ndarray

An array containing a list of masked inputs.

Y: numpy.ndarray

An array containing a list of target sentences/ids.

Returns:
numpy.ndarray

Decoder output logits for the output (target sentence) ids.
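
As an informal sketch of the teacher forcing step for an encoder-decoder transformer (assuming a Hugging Face-style PyTorch model; the variable names follow the method descriptions above and are illustrative): the target ids are supplied as decoder inputs in a single forward pass, so logits for every output position are obtained without autoregressive sampling.

    import torch

    with torch.no_grad():
        # inputs: dict of padded source ids and attention mask (see get_inputs).
        # output_ids: target sentence ids (see get_outputs), used as decoder inputs.
        out = model(
            input_ids=inputs["input_ids"],
            attention_mask=inputs["attention_mask"],
            decoder_input_ids=torch.as_tensor(output_ids),
        )
        logits = out.logits.detach().cpu().numpy()  # (batch, target_length, vocab_size)
    # For decoder-only models, source and target ids are typically concatenated
    # and passed as a single input_ids sequence instead.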

classmethod load(in_file, instantiate=True)

This is meant to be overridden by subclasses and called with super.

We return constructor argument values when not being instantiated. Since there are no constructor arguments for the Serializable class we just return an empty dictionary.

model_inference(inputs, output_ids)

This function performs model inference for TensorFlow and PyTorch models.

Parameters:
inputs: dict

Dictionary of padded source sentence ids and attention mask as tensors.

output_ids: numpy.ndarray

An array of decoder output ids.

Returns:
numpy.ndarray

The output logits from the model.

save(out_file)

Save the model to the given file stream.

update_output_names(output: ndarray)

The function updates output tokens.

It mimics a caching mechanism, updating the output tokens for each new explanation row that is to be explained.

Parameters:
output: numpy.ndarray

Output (sentence/sentence ids) for an explanation row.