shap.datasets.imagenet50

shap.datasets.imagenet50(resolution: int = 224, n_points: int | None = None) -> tuple[ndarray, ndarray]

Return a set of 50 images representative of ImageNet.

Parameters:
resolution : int

The resolution of the images. At present, the only supported value is 224.

n_points : int, optional

Number of data points to sample. If None, the entire dataset is used.

Returns:
X : np.ndarray

The images, at the requested resolution.

y : np.ndarray

The target variables, that is, the ImageNet classes.

Notes

This dataset was collected by randomly selecting a working ImageNet link and then running the original ImageNet image through a Google image search restricted to images licensed for reuse. A similar image, one that carries reuse rights, was downloaded as a rough replacement for the original ImageNet image. The goal is a random sample of ImageNet-like images that can serve as a background distribution when explaining models trained on ImageNet data.

Note that because the images are only rough replacements, the labels might no longer be correct.
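
Because the labels may be unreliable, a background distribution is typically built from the images alone. The sketch below shows one way to draw a smaller background set from an image array. It is an illustration, not shap's internal sampling logic: `sample_background` is a hypothetical helper, and the zero-filled stand-in array assumes the data has shape (50, 224, 224, 3) so that the example runs without downloading anything.

```python
from __future__ import annotations

import numpy as np


def sample_background(
    images: np.ndarray, n_points: int | None = None, seed: int = 0
) -> np.ndarray:
    """Return up to n_points images drawn without replacement.

    Hypothetical helper for illustration; not part of shap.
    """
    # None, or a request for at least as many images as exist, returns everything.
    if n_points is None or n_points >= len(images):
        return images
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(images), size=n_points, replace=False)
    return images[idx]


# Stand-in array; the (50, 224, 224, 3) shape is an assumption.
# With shap installed, use instead: images, _ = shap.datasets.imagenet50()
images = np.zeros((50, 224, 224, 3), dtype=np.float32)

background = sample_background(images, n_points=10)
print(background.shape)  # (10, 224, 224, 3)
```

The labels are discarded here on purpose: a background set only needs inputs, so possibly incorrect labels do not affect it.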

Examples

To get the processed images and labels:

import shap
images, labels = shap.datasets.imagenet50()