Migrating to the new “Explanation” API
This notebook demonstrates some of the changes to the shap API that were introduced in shap v0.36.0.
[1]:
# An example dataset and model
import xgboost
import shap
X, y = shap.datasets.adult(n_points=100)
model = xgboost.XGBClassifier().fit(X, y)
explainer = shap.TreeExplainer(model, X)
To summarise the main change in the API:
[2]:
shap_values = explainer.shap_values(X) # Old style
explanation = explainer(X) # New style
Calculating explanations
Old style
In versions of shap before v0.36.0, explanations were represented as plain numpy arrays and calculated using the .shap_values() method of an explainer:
[3]:
shap_values = explainer.shap_values(X)
shap_values[:2] # a numpy array
[3]:
array([[-0.54854601, 0.01639348, -0.46476041, 0.85896822, -1.36168788,
-0.64692199, 0.0254638 , -0.58422904, -0.02344483, 0. ,
0.1224989 , 0.01079906],
[-0.83802091, 0.01562196, 0.78349799, -1.10456323, -0.68524691,
-0.84828204, 0.03734176, -0.86151311, -0.02564897, 0. ,
-0.56183428, 0.00415988]])
Similarly, legacy plotting functions like shap.summary_plot expect the SHAP values to be passed as a numpy array.
New style
As of shap v0.36.0, explanations are now represented with the Explanation object, and are created by calling the explainer directly as a function:
[4]:
explanation = explainer(X)
explanation[:2] # a shap.Explanation object
[4]:
.values =
array([[-0.54854601, 0.01639348, -0.46476041, 0.85896822, -1.36168788,
-0.64692199, 0.0254638 , -0.58422904, -0.02344483, 0. ,
0.1224989 , 0.01079906],
[-0.83802091, 0.01562196, 0.78349799, -1.10456323, -0.68524691,
-0.84828204, 0.03734176, -0.86151311, -0.02564897, 0. ,
-0.56183428, 0.00415988]])
.base_values =
array([-2.70354599, -2.70354599])
.data =
array([[27., 4., 10., 0., 1., 1., 4., 0., 0., 0., 44., 39.],
[27., 4., 13., 4., 10., 0., 4., 0., 0., 0., 40., 39.]])
The shap.Explanation object is a much richer representation: it includes the SHAP values (accessible with the .values attribute) together with supporting contextual information such as the base values (.base_values), the input data being explained (.data), and the feature names.
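One reason it helps to carry the base values alongside the SHAP values is that, for each row, the base value plus the sum of the per-feature SHAP values reconstructs the model's raw output for that row. The sketch below checks this by hand with plain numpy, using the first row of .values and .base_values printed above. It assumes the raw output is a log-odds score (the usual default for a tree explainer on a binary classifier), so the final sigmoid step is an assumption rather than something the output above confirms.

```python
import numpy as np

# SHAP values for the first row, copied from the .values output above.
values = np.array([
    -0.54854601, 0.01639348, -0.46476041, 0.85896822,
    -1.36168788, -0.64692199, 0.0254638, -0.58422904,
    -0.02344483, 0.0, 0.1224989, 0.01079906,
])

# First entry of the .base_values output above.
base_value = -2.70354599

# Additivity: base value + sum of feature contributions = raw model output.
raw_output = base_value + values.sum()

# Assumption: the raw output is in log-odds, so a sigmoid gives a probability.
probability = 1.0 / (1.0 + np.exp(-raw_output))

print(f"raw output:  {raw_output:.4f}")
print(f"probability: {probability:.4f}")
```

With a real Explanation object, the same quantity could be computed for every row at once as explanation.base_values + explanation.values.sum(axis=1).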
The new-style plotting functions like shap.plots.bar and shap.plots.beeswarm accept these Explanation objects rather than numpy arrays.