ConfusionMatrixDrift
Compares confusion matrix metrics between reference and monitoring datasets.
Purpose
The Confusion Matrix Drift test evaluates changes in the model’s error patterns over time. By comparing confusion matrix elements between reference and monitoring datasets, it helps determine whether the model maintains consistent prediction behavior in production. This is crucial for understanding whether the model’s error patterns have shifted and whether specific types of misclassification have become more prevalent.
Test Mechanism
The test generates confusion matrices for both the reference and monitoring datasets. For binary classification, it tracks True Positives, True Negatives, False Positives, and False Negatives as percentages of total predictions. For multiclass problems, it analyzes per-class metrics, including true positives and error rates. Drift is quantified as the percentage change in each of these metrics between the two datasets, providing detailed insight into how prediction patterns are shifting.
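The sketch below illustrates the comparison logic for the binary case, assuming scikit-learn's confusion_matrix and a user-chosen drift threshold; the function names (confusion_matrix_percentages, confusion_matrix_drift) and the threshold value are illustrative assumptions, not the library's actual API.

```python
# Minimal sketch of the binary-classification drift comparison.
# Assumes 0/1 labels; names and threshold are hypothetical.
import numpy as np
from sklearn.metrics import confusion_matrix


def confusion_matrix_percentages(y_true, y_pred):
    """Return TN/FP/FN/TP as percentages of all predictions."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    total = tn + fp + fn + tp
    return {
        "True Negatives (%)": 100 * tn / total,
        "False Positives (%)": 100 * fp / total,
        "False Negatives (%)": 100 * fn / total,
        "True Positives (%)": 100 * tp / total,
    }


def confusion_matrix_drift(ref_true, ref_pred, mon_true, mon_pred, threshold=5.0):
    """Compare reference and monitoring matrices and flag metrics whose
    percentage change exceeds the (hypothetical) drift threshold."""
    ref = confusion_matrix_percentages(ref_true, ref_pred)
    mon = confusion_matrix_percentages(mon_true, mon_pred)
    rows = []
    for metric in ref:
        # Percentage change relative to the reference value.
        drift = (
            100 * (mon[metric] - ref[metric]) / ref[metric]
            if ref[metric] else np.nan
        )
        rows.append({
            "Metric": metric,
            "Reference": round(ref[metric], 2),
            "Monitoring": round(mon[metric], 2),
            "Drift (%)": round(drift, 2),
            "Pass/Fail": "Pass" if abs(drift) < threshold else "Fail",
        })
    return rows
```

A multiclass version would apply the same percentage-change comparison to each class's per-class counts rather than to the four binary cells.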
Signs of High Risk
- Large drifts in confusion matrix elements that exceed the set threshold
- Systematic changes in false positive or false negative rates
- Inconsistent changes across different classes
- Significant shifts in error patterns for specific classes
- Unexpected improvements in certain metrics, which may reflect shifts in the data rather than genuine gains
- Divergent trends between different types of errors
Strengths
- Provides detailed analysis of prediction behavior
- Identifies specific types of prediction changes
- Enables early detection of systematic errors
- Includes comprehensive error pattern analysis
- Supports both binary and multiclass problems
- Maintains interpretable percentage-based metrics
Limitations
- May be sensitive to class distribution changes
- Cannot identify root causes of prediction drift
- Requires sufficient samples for reliable comparison
- Limited to hard predictions (not probabilities)
- May not capture subtle changes in decision boundaries
- Interpretation can become complex for multiclass problems