shap.datasets.iris

shap.datasets.iris(display=False, n_points=None)

Return the classic Iris dataset in a convenient package.

Parameters:
display : bool

If True, return the original feature matrix along with class labels (as strings). Default is False.

n_points : int, optional

Number of data points to include. Default is None, which includes all data points.

Returns:
A tuple of two items: a pandas DataFrame containing the feature matrix, and the class labels, given as a numpy array of integers or, if display is True, as a list of strings.

Notes

  • The dataset includes measurements of sepal length, sepal width, petal length, and petal width for three species of iris flowers.

  • Class labels are encoded as integers (0, 1, 2) representing the species (setosa, versicolor, virginica).

  • If display is True, class labels are returned as strings.
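The integer encoding in the notes above can be mapped back to species names by hand. The following is a minimal sketch, assuming the order setosa, versicolor, virginica stated above; decode_labels is a hypothetical helper (not part of shap) and the label values are made up for illustration.

```python
# Species names in the order implied by the integer encoding (0, 1, 2).
SPECIES = ["setosa", "versicolor", "virginica"]


def decode_labels(labels):
    """Map integer class labels to species names (hypothetical helper, not a shap function)."""
    return [SPECIES[int(v)] for v in labels]


# Made-up labels for illustration; not taken from the dataset.
example_labels = [0, 2, 1, 0]
decoded = decode_labels(example_labels)
print(decoded)
```

In practice, passing display=True should make this step unnecessary, since the labels are then returned as strings.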

Examples

To get the feature matrix and integer class labels:

import shap

features, labels = shap.datasets.iris()

To get the feature matrix and class labels as strings:

features, class_labels = shap.datasets.iris(display=True)