ClassDiscriminationDrift
Compares classification discrimination metrics between reference and monitoring datasets.
Purpose
The Class Discrimination Drift test evaluates changes in the model’s discriminative power over time. By comparing key discrimination metrics between the reference and monitoring datasets, it helps identify whether the model still separates classes in production as well as it did on the reference data. This shows whether the model’s predictive power remains stable and whether its decision boundaries still distinguish the classes effectively.
Test Mechanism
The test calculates three discrimination metrics on both the reference and monitoring datasets: ROC AUC (area under the receiver operating characteristic curve), the GINI coefficient, and the KS (Kolmogorov-Smirnov) statistic. For binary classification, it computes all three metrics. For multiclass problems, it focuses on macro-averaged ROC AUC. Drift is quantified as the percentage change in each metric between the reference and monitoring datasets.
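The binary-classification mechanism described above can be sketched in plain Python. This is a minimal illustration, not the test's actual implementation: the function names (`roc_auc`, `ks_statistic`, `discrimination_metrics`, `percent_drift`) are hypothetical, GINI is taken as `2 * AUC - 1`, KS is taken as the largest gap between the score distributions of the two classes, and the drift formula `(monitoring - reference) / |reference| * 100` is an assumption about how the percentage change is defined.

```python
from bisect import bisect_left, bisect_right


def _split_scores(y_true, y_score):
    """Return sorted scores of positives (label 1) and negatives (label 0)."""
    pos = sorted(s for y, s in zip(y_true, y_score) if y == 1)
    neg = sorted(s for y, s in zip(y_true, y_score) if y == 0)
    if not pos or not neg:
        raise ValueError("both classes must be present")
    return pos, neg


def roc_auc(y_true, y_score):
    """ROC AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos, neg = _split_scores(y_true, y_score)
    wins = 0.0
    for s in pos:
        below = bisect_left(neg, s)           # negatives scored strictly lower
        ties = bisect_right(neg, s) - below   # negatives with the same score
        wins += below + 0.5 * ties
    return wins / (len(pos) * len(neg))


def ks_statistic(y_true, y_score):
    """Largest gap between the empirical score CDFs of the two classes."""
    pos, neg = _split_scores(y_true, y_score)
    return max(
        abs(bisect_right(pos, t) / len(pos) - bisect_right(neg, t) / len(neg))
        for t in set(pos + neg)
    )


def discrimination_metrics(y_true, y_score):
    auc = roc_auc(y_true, y_score)
    return {"ROC_AUC": auc, "GINI": 2 * auc - 1, "KS": ks_statistic(y_true, y_score)}


def percent_drift(reference, monitoring):
    """Assumed drift definition: percentage change relative to the reference value."""
    return {
        name: (monitoring[name] - ref) / abs(ref) * 100 if ref != 0 else None
        for name, ref in reference.items()
    }


# Toy labels and scores, made up for illustration only.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
ref_scores = [0.1, 0.2, 0.3, 0.6, 0.4, 0.7, 0.8, 0.9]
mon_scores = [0.1, 0.5, 0.3, 0.6, 0.4, 0.2, 0.8, 0.9]

ref_metrics = discrimination_metrics(labels, ref_scores)
mon_metrics = discrimination_metrics(labels, mon_scores)
drift = percent_drift(ref_metrics, mon_metrics)
print(ref_metrics, mon_metrics, drift)
```

On this toy data the reference ROC AUC is 0.9375 and the monitoring ROC AUC is 0.6875, so every metric shows a negative drift. Because GINI is a linear rescaling of ROC AUC, its percentage change is larger than the ROC AUC change for the same underlying shift.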
Signs of High Risk
- Drift in any discrimination metric that exceeds the configured threshold
- Significant drops in ROC AUC indicating reduced ranking ability
- Decreased GINI coefficients showing diminished separation power
- Reduced KS statistics suggesting weaker class distinction
- Inconsistent changes across metrics, such as one metric degrading sharply while the others stay stable
- Systematic degradation in discriminative performance
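The threshold check behind the first sign above can be sketched as follows. This is an illustrative sketch only: the function name `flag_drift`, the comparison on the absolute value of the drift, and the 20 percent default are all assumptions, not values taken from the test itself.

```python
def flag_drift(drift_pct, threshold=20.0):
    """Map each metric to True (within threshold) or False (drifted too far).

    Assumes drift is a signed percentage and that both increases and
    decreases beyond the threshold count as drift.
    """
    return {name: abs(value) < threshold for name, value in drift_pct.items()}


# Made-up drift percentages, for illustration only.
example_drift = {"ROC_AUC": -4.2, "GINI": -9.8, "KS": -27.5}

passed = flag_drift(example_drift)
failing = [name for name, ok in passed.items() if not ok]
print(failing)  # ['KS']
```

Here only KS crosses the assumed threshold, which is the kind of single-metric degradation that warrants a closer look at the score distributions.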
Strengths
- Combines multiple complementary discrimination metrics
- Handles both binary and multiclass classification
- Provides clear quantitative drift assessment
- Enables early detection of model degradation
- Includes standardized drift threshold evaluation
- Supports comprehensive performance monitoring
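For the multiclass case, macro-averaged ROC AUC can be sketched with a one-vs-rest scheme: compute a binary ROC AUC for each class against all others, then take the unweighted mean. The one-vs-rest choice and the function names (`binary_auc`, `macro_ovr_auc`) are assumptions made for this illustration; the text above only states that the metric is macro-averaged.

```python
from bisect import bisect_left, bisect_right


def binary_auc(is_positive, scores):
    """Share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for p, s in zip(is_positive, scores) if p]
    neg = sorted(s for p, s in zip(is_positive, scores) if not p)
    if not pos or not neg:
        raise ValueError("each class needs at least one positive and one negative")
    wins = 0.0
    for s in pos:
        below = bisect_left(neg, s)
        wins += below + 0.5 * (bisect_right(neg, s) - below)
    return wins / (len(pos) * len(neg))


def macro_ovr_auc(y_true, y_proba):
    """Unweighted mean of per-class one-vs-rest ROC AUCs.

    y_proba[i][k] is the predicted probability of class k for sample i.
    """
    n_classes = len(y_proba[0])
    per_class = [
        binary_auc([y == k for y in y_true], [row[k] for row in y_proba])
        for k in range(n_classes)
    ]
    return sum(per_class) / n_classes


# Toy three-class data, made up for illustration only.
y_true = [0, 0, 1, 1, 2, 2]
y_proba = [
    [0.70, 0.20, 0.10],
    [0.50, 0.30, 0.20],
    [0.20, 0.60, 0.20],
    [0.45, 0.25, 0.30],
    [0.10, 0.20, 0.70],
    [0.30, 0.30, 0.40],
]

macro_auc = macro_ovr_auc(y_true, y_proba)
print(macro_auc)
```

In this toy example classes 0 and 2 are separated perfectly while class 1 reaches an AUC of 0.75, so the macro average is 11/12. Because the average is unweighted, a drop on a rare class affects the result as much as a drop on a common one.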
Limitations
- Does not identify root causes of discrimination drift
- May be sensitive to changes in class distribution
- Cannot suggest optimal decision threshold adjustments
- Limited to discrimination aspects of performance
- Requires sufficient data for reliable metric calculation
- May not capture subtle changes in decision boundaries