shap.datasets.iris

shap.datasets.iris(display: Literal[False] = False, n_points: int | None = None) → tuple[DataFrame, ndarray]
shap.datasets.iris(display: Literal[True] = False, n_points: int | None = None) → tuple[DataFrame, list[str]]

Return the classic Iris dataset in a convenient package.

Parameters:
display : bool

If True, return the original feature matrix along with class labels (as strings). Default is False.

n_points : int, optional

Number of data points to sample at random. If None (the default), no sampling is performed and the full dataset is returned.

Returns:
X : pd.DataFrame

The feature matrix.

y : np.ndarray or list of str

If display is False, a NumPy array of class labels encoded as integers. If display is True, a list of class labels as strings.

Notes

  • The dataset includes measurements of sepal length, sepal width, petal length, and petal width for three species of iris flowers.

  • Class labels are encoded as integers (0, 1, 2) representing the species (setosa, versicolor, virginica).

  • If display is True, class labels are returned as strings.
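The correspondence between the integer codes and the species names described in the notes above can be sketched in plain Python. This is only an illustration: the helper decode_labels and the SPECIES list are hypothetical names that are not part of shap, and the exact strings returned with display=True are assumed to match the species names listed above.

```python
# Species names ordered by their integer code (0, 1, 2), as listed in the notes.
# Assumption: these are the strings used for the string labels.
SPECIES = ["setosa", "versicolor", "virginica"]


def decode_labels(int_labels):
    """Map integer class codes to species names (hypothetical helper)."""
    return [SPECIES[int(code)] for code in int_labels]


# A few hand-written integer labels, standing in for the encoded y values.
encoded = [0, 2, 1, 0]
decoded = decode_labels(encoded)
print(decoded)
```

The same helper should accept the integer array returned when display is False, since each element is converted with int() before indexing.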

Examples

To get the feature matrix and class labels:

import shap
features, labels = shap.datasets.iris()

To get the feature matrix and class labels as strings:

features, class_labels = shap.datasets.iris(display=True)