shap.datasets.diabetes
- shap.datasets.diabetes(n_points: int | None = None) -> tuple[DataFrame, ndarray]
Return the diabetes dataset as a feature DataFrame and a target array.
Commonly used for predictive regression tasks.
- Parameters:
- n_points : int, optional
Number of data points to sample. If provided, randomly samples the specified number of points.
- Returns:
- X : pd.DataFrame
The feature data.
- y : np.ndarray
The target variable.
Notes
Feature columns in X:
- age (float): Age in years
- sex (float): Sex
- bmi (float): Body mass index
- bp (float): Average blood pressure
- s1 (float): Total serum cholesterol
- s2 (float): Low-density lipoproteins (LDL cholesterol)
- s3 (float): High-density lipoproteins (HDL cholesterol)
- s4 (float): Total cholesterol / HDL cholesterol ratio
- s5 (float): Log of serum triglycerides level
- s6 (float): Blood sugar level
Target y: Progression of diabetes one year after baseline (float)
The diabetes dataset is a subset of the larger diabetes dataset from scikit-learn. More details:
sklearn.datasets.load_diabetes()
Examples
To get the feature data and target values:
import shap
data, target = shap.datasets.diabetes()