shap.plots.scatter

shap.plots.scatter(shap_values, color='#1E88E5', hist=True, axis_color='#333333', cmap=<matplotlib.colors.LinearSegmentedColormap object>, dot_size=16, x_jitter='auto', alpha=1, title=None, xmin=None, xmax=None, ymin=None, ymax=None, overlay=None, ax=None, ylabel='SHAP value', show=True)

Create a SHAP dependence scatter plot, optionally colored by an interaction feature.

Plots the value of the feature on the x-axis and the SHAP value of the same feature on the y-axis. This shows how the model depends on the given feature, and is like a richer extension of classical partial dependence plots. Vertical dispersion of the data points represents interaction effects. Grey ticks along the y-axis are data points where the feature’s value was NaN.

Note that if you want to change the data being displayed, you can update the shap_values.display_features attribute and it will then be used for plotting instead of shap_values.data.

Parameters:
shap_values : shap.Explanation

A single column of an Explanation object (i.e. shap_values[:,"Feature A"]).

color : string or shap.Explanation

How to color the scatter plot points. This can be a fixed color string or an Explanation object. If a full (multi-column) Explanation object is passed, the points are colored by the feature that appears to have the strongest interaction effect with the feature given by the shap_values argument; that feature is chosen using shap.utils.approximate_interactions(). If only a single column of an Explanation object is passed, that feature column is used to color the data points.

hist : bool

Whether to show a light histogram along the x-axis to show the density of the data. Note that the histogram is normalized such that if all the points were in a single bin, then that bin would span the full height of the plot. Defaults to True.

x_jitter : 'auto' or float

Adds random jitter to the feature values; the amount is given as a float between 0 and 1. This may make the plot easier to read when a feature is discrete. By default, x_jitter is chosen based on auto-detection of categorical features.

alpha : float

The transparency of the data points (between 0 and 1). This can be useful to show the density of the data points when using a large dataset.

xmin : float or string

The lower bound of the plot’s x-axis. It can also be a string of the form “percentile(float)”, in which case the bound is set to that percentile of the feature values plotted on the x-axis.

xmax : float or string

The upper bound of the plot’s x-axis. It can also be a string of the form “percentile(float)”, in which case the bound is set to that percentile of the feature values plotted on the x-axis.

ax : matplotlib Axes object

Optionally specify an existing matplotlib Axes object into which the plot will be placed. If one is given, no new Figure is created; otherwise a new Figure is created.

show : bool

Whether matplotlib.pyplot.show() is called before returning. Setting this to False allows the plot to be customized further after it has been created.
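Two of the parameter descriptions above are easier to follow with numbers: the “percentile(float)” form of xmin and xmax, and the normalization of the hist histogram. The following is a minimal, standard-library-only sketch of both ideas. It is not shap's implementation: the helper names resolve_bound and normalized_bin_heights are hypothetical, and the linear interpolation between sorted values and the skipping of NaN entries are assumptions made for illustration.

```python
import math
import re


def resolve_bound(bound, feature_values):
    """Hypothetical helper: turn a float or "percentile(p)" string into a number."""
    if not isinstance(bound, str):
        return bound  # a plain number is used as the axis bound directly
    match = re.fullmatch(r"percentile\(([0-9]*\.?[0-9]+)\)", bound)
    if match is None:
        raise ValueError(f"unrecognized bound: {bound!r}")
    p = float(match.group(1))
    # Assumption: NaN entries are ignored and percentiles are linearly
    # interpolated between neighbouring sorted values.
    finite = sorted(v for v in feature_values if not math.isnan(v))
    position = (len(finite) - 1) * p / 100.0
    lower, upper = math.floor(position), math.ceil(position)
    fraction = position - lower
    return finite[lower] + (finite[upper] - finite[lower]) * fraction


def normalized_bin_heights(feature_values, bin_edges):
    """Hypothetical helper: fraction of the binned points that fall in each bin."""
    counts = [0] * (len(bin_edges) - 1)
    for v in feature_values:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            in_bin = bin_edges[i] <= v < bin_edges[i + 1]
            # The last bin also includes its right edge; NaN matches no bin.
            if in_bin or (is_last and v == bin_edges[-1]):
                counts[i] += 1
                break
    total = sum(counts)
    # A height of 1.0 corresponds to a bar spanning the full plot height.
    return [c / total if total else 0.0 for c in counts]


ages = [18.0, 25.0, 31.0, float("nan"), 47.0, 90.0]
print(resolve_bound("percentile(50)", ages))
print(normalized_bin_heights(ages, [0.0, 30.0, 60.0, 90.0]))
print(normalized_bin_heights([1.0, 2.0, 3.0], [0.0, 5.0, 10.0]))
```

In the last call every point lands in the first bin, so that bin gets height 1.0, which matches the description of a single bin spanning the full height of the plot.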

Examples

See scatter plot examples.