shap.datasets.imagenet50

shap.datasets.imagenet50(resolution=224, n_points=None)

Return a set of 50 images representative of ImageNet images.

Parameters:
resolution : int

The resolution of the images. At present the only supported value is 224.

n_points : int, optional

Number of data points to sample. If None, the entire dataset is used.

Returns:
A tuple (images, labels): a numpy array holding the images and a numpy array holding the corresponding labels.

Notes

This dataset was collected by picking a working ImageNet link at random and pasting the original ImageNet image into a Google image search restricted to images licensed for reuse. A similar image (one that carries reuse rights) was then downloaded as a rough replacement for the original ImageNet image. The goal is to provide a random sample of ImageNet that can serve as a background distribution when explaining models trained on ImageNet data.

Note that because the images are only rough replacements, the labels might no longer be correct.

Examples

To get the processed images and labels:

import shap
images, labels = shap.datasets.imagenet50()
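
The Notes describe this dataset as a background distribution for explaining ImageNet models. As a rough sketch of that use, the helper below draws a smaller background subset while keeping images and labels aligned. The name select_background is hypothetical and is not part of shap. The array layout (50, 224, 224, 3) is an assumption that this page does not state. A zero-filled stand-in array replaces the real download so that the snippet runs offline.

```python
import numpy as np


def select_background(images, labels, n_background=10, seed=0):
    """Hypothetical helper (not part of shap): pick a random, aligned
    subset of images and labels to use as a background sample."""
    rng = np.random.default_rng(seed)
    n = min(n_background, len(images))
    # Sample indices without replacement so no image is repeated.
    idx = rng.choice(len(images), size=n, replace=False)
    return images[idx], labels[idx]


# Stand-in data with an ASSUMED layout of (50, 224, 224, 3).
# In real use, replace these two lines with:
#     images, labels = shap.datasets.imagenet50()
images = np.zeros((50, 224, 224, 3), dtype=np.float32)
labels = np.arange(50, dtype=np.float64)

# Tag each stand-in image with its own index so alignment can be checked.
images[:, 0, 0, 0] = labels

bg_images, bg_labels = select_background(images, labels, n_background=10)
print(bg_images.shape, bg_labels.shape)
```

If only a sampled subset is needed and label alignment is not being checked by hand, passing n_points to shap.datasets.imagenet50 is likely the simpler route. Because the images are rough replacements, treat the returned labels as approximate in either case.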