shap.plots.scatter
- shap.plots.scatter(shap_values: Explanation, color: str | Explanation | None = '#1E88E5', hist: bool = True, axis_color='#333333', cmap=<matplotlib.colors.LinearSegmentedColormap object>, dot_size=16, x_jitter: float | Literal['auto'] = 'auto', alpha: float = 1.0, title: str | None = None, xmin: AxisLimitSpec = None, xmax: AxisLimitSpec = None, ymin: AxisLimitSpec = None, ymax: AxisLimitSpec = None, overlay: dict[str, Any] | None = None, ax: Axes | None = None, ylabel: str = 'SHAP value', show: bool = True)
Create a SHAP dependence scatter plot, optionally colored by an interaction feature.
Plots the value of the feature on the x-axis and the SHAP value of the same feature on the y-axis. This shows how the model depends on the given feature, and is like a richer extension of classical partial dependence plots. Vertical dispersion of the data points represents interaction effects. Grey ticks along the y-axis are data points where the feature’s value was NaN.
Note that if you want to change the data being displayed, you can update the shap_values.display_features attribute; it will then be used for plotting instead of shap_values.data.
- Parameters:
- shap_values: shap.Explanation
Typically a single column of an Explanation object (e.g. shap_values[:, "Feature A"]). Alternatively, pass multiple columns to create several subplots (e.g. shap_values[:, ["Feature A", "Feature B"]]).
- color: string or shap.Explanation, optional
How to color the scatter plot points. This can be a fixed color string or an Explanation object.
If it is an Explanation object, then the scatter plot points are colored by the feature that seems to have the strongest interaction effect with the feature given by the shap_values argument. This is calculated using shap.utils.approximate_interactions().
If only a single column of an Explanation object is passed, then that feature column will be used to color the data points.
- hist: bool
Whether to show a light histogram along the x-axis to show the density of the data. Note that the histogram is normalized such that if all the points were in a single bin, then that bin would span the full height of the plot. Defaults to True.
- x_jitter: 'auto' or float
Adds random jitter to feature values; specify a float between 0 and 1. May increase plot readability when a feature is discrete. By default, x_jitter is chosen based on auto-detection of categorical features.
- title: str, optional
Plot title.
- alpha: float
The transparency of the data points (between 0 and 1). This can be useful to show the density of the data points when using a large dataset.
- xmin, xmax, ymin, ymax: float, string, aggregated Explanation or None
Desired axis limits. Can be a float to specify a fixed limit.
It can be a string of the format "percentile(float)" to denote that percentile of the feature’s value. It can also be an aggregate of a single column of an Explanation, such as explanation[:, "feature_name"].percentile(20).
- overlay: dict, optional
Optional dictionary of up to three additional curves to overlay as line plots.
The dictionary maps a curve name to a list of (xvalues, yvalues) pairs, where there is one pair for each feature to be plotted.
- ax: matplotlib Axes, optional
Optionally specify an existing matplotlib.axes.Axes object, into which the plot will be placed. Only supported when plotting a single feature.
- show: bool
Whether matplotlib.pyplot.show() is called before returning. Setting this to False allows the plot to be customized further after it has been created.
- Returns:
- ax: matplotlib Axes object
Returns the Axes object with the plot drawn onto it. Only returned if show=False.
Examples