shap.datasets.independentlinear60

shap.datasets.independentlinear60(n_points=1000)

A simulated dataset of 60 independent features, with labels that depend linearly on them.

Parameters:
n_points : int, optional

Number of data points to generate. Default is 1,000.

Returns:
A tuple of a pandas DataFrame containing the feature matrix and a NumPy array containing the labels.

Notes

  • The 60 features are generated independently of one another.

  • The labels are generated based on a linear function of the features with added random noise.
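
The label rule in the notes above can be illustrated with a minimal NumPy sketch. This is not the library's implementation: `make_linear_dataset` is a hypothetical helper, and the coefficient pattern, noise scale, and seed are assumptions chosen only for illustration. The features are drawn independently, as the function name suggests.

```python
import numpy as np


def make_linear_dataset(n_points=1000, n_features=60, noise_scale=0.01, seed=0):
    """Hypothetical helper: independent features with noisy linear labels."""
    rng = np.random.default_rng(seed)

    # Independent standard-normal features, centered column by column.
    X = rng.standard_normal((n_points, n_features))
    X -= X.mean(axis=0)

    # Assumed coefficients: a few features carry weight 1, the rest 0.
    beta = np.zeros(n_features)
    beta[:30:3] = 1.0

    # Labels are a linear function of the features plus small Gaussian noise.
    y = X @ beta + noise_scale * rng.standard_normal(n_points)
    return X, y, beta


X, y, beta = make_linear_dataset()
print(X.shape, y.shape)  # (1000, 60) (1000,)
```

Because the true coefficients are known in a dataset built this way, feature attributions computed for a model fitted to it can be checked against them.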

Examples

import shap
features, labels = shap.datasets.independentlinear60()