shap.utils.hclust
- shap.utils.hclust(X, y=None, linkage='single', metric='auto', random_state=0)
Fit a hierarchical clustering model for features X relative to target variable y.
For more information on clustering methods see: https://docs.scipy.org/doc/scipy/reference/generated/scipy.cluster.hierarchy.linkage.html
- Parameters:
- X: np.array
Features to cluster
- y: np.array | None
Target variable
- linkage: str
Defines the method to calculate the distance between clusters. Must be one of “single”, “complete” or “average”.
- metric: str
Scipy distance metric or “xgboost_distances_r2”.
If “xgboost_distances_r2”, estimate redundancy distances between features X with respect to target variable y using shap.utils.xgboost_distances_r2(). Otherwise, calculate distances between features using the given distance metric.
If “auto” (default), use “xgboost_distances_r2” if the target variable is provided, or else the “cosine” distance metric.
- random_state: int
Seed for the NumPy random state.
- Returns:
- clustering: np.array
The hierarchical clustering encoded as a linkage matrix.
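To make the return value concrete, here is a minimal sketch of what a linkage matrix over features looks like. It uses SciPy directly rather than shap, and assumes features are the columns of X, the “cosine” metric, and “single” linkage. The synthetic data and variable names are illustrative only, and this is not necessarily how shap.utils.hclust computes its result internally.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import pdist

# Synthetic data (an assumption for illustration): 200 samples, 5 features.
rng = np.random.default_rng(0)
n_samples, n_features = 200, 5
X = rng.normal(size=(n_samples, n_features))

# Make feature 1 a near copy of feature 0 so the two are clearly redundant.
X[:, 1] = X[:, 0] + 0.01 * rng.normal(size=n_samples)

# Features are columns, so transpose to get one row per feature,
# then compute the condensed pairwise cosine distances between features.
dist = pdist(X.T, metric="cosine")

# Single linkage over the feature distances.
clustering = linkage(dist, method="single")

# One row per merge: (cluster index a, cluster index b, distance, cluster size).
print(clustering.shape)
print(clustering[0])
```

The result has n_features - 1 rows, and the first row merges features 0 and 1 because they are almost identical. With shap installed, the analogous call based on the signature above would be shap.utils.hclust(X, metric="cosine"), or shap.utils.hclust(X, y) to use the default “auto” behaviour with a target variable.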