shap.utils.sample

shap.utils.sample(X, nsamples=100, random_state=0)

Performs sampling without replacement of the input data X.

This is a thin wrapper around scikit-learn’s shuffle function (sklearn.utils.shuffle). It is mainly used to downsample X for use as a background dataset in SHAP Explainer and its subclasses.
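A minimal sketch of what such a wrapper could look like, built directly on sklearn.utils.shuffle. The function name downsample is hypothetical and this is not shap's actual source. In particular, the guard that returns X unchanged when nsamples is at least the number of rows is an assumption made for this sketch.

```python
import numpy as np
from sklearn.utils import shuffle


def downsample(X, nsamples=100, random_state=0):
    """Hypothetical stand-in for illustration; not shap's actual source."""
    n_rows = X.shape[0] if hasattr(X, "shape") else len(X)
    # Assumption of this sketch: if X has no more rows than requested,
    # hand it back unchanged instead of raising an error.
    if nsamples >= n_rows:
        return X
    # shuffle permutes the rows and keeps the first `nsamples` of them,
    # so no row can be selected twice (sampling without replacement).
    return shuffle(X, n_samples=nsamples, random_state=random_state)


X = np.arange(1000).reshape(500, 2)  # 500 rows, each one unique
background = downsample(X, nsamples=50, random_state=0)
```

Because every row of X is unique here, the 50 rows in background are also unique, which is what sampling without replacement implies.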

Changed in version 0.42: sample now samples without replacement; earlier versions sampled with replacement. As a result, the same random_state can select different rows before and after 0.42, so results are not guaranteed to be reproducible across that version boundary.

Parameters:
X : array-like

Data to sample from. Input data can be arrays, lists, dataframes or scipy sparse matrices with a consistent first dimension.

nsamples : int

Number of samples to draw from X. Defaults to 100.

random_state

Determines random number generation for shuffling the data. Use this to ensure reproducibility across multiple function calls.
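The effect of random_state and of the supported input types can be illustrated with scikit-learn's shuffle, the function this page says sample wraps. The sketch below calls sklearn.utils.shuffle directly rather than shap.utils.sample, and assumes that sample passes nsamples and random_state through to it.

```python
import numpy as np
from scipy import sparse
from sklearn.utils import shuffle

X_dense = np.arange(40).reshape(20, 2)
X_sparse = sparse.csr_matrix(X_dense)

# Same seed, same rows: two calls with random_state=0 agree.
a = shuffle(X_dense, n_samples=5, random_state=0)
b = shuffle(X_dense, n_samples=5, random_state=0)

# A sparse matrix is subsampled along its first dimension and stays sparse.
s = shuffle(X_sparse, n_samples=5, random_state=0)
```

Fixing random_state makes repeated calls select the same rows within one installed version, regardless of whether the input is dense or sparse.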