ValidMind for model development 4 — Finalize testing and documentation

Learn how to use ValidMind for your end-to-end model documentation process with our introductory notebook series. In this last notebook, finalize the testing and documentation of your model and have a fully documented sample model ready for review.

We'll first use run_documentation_tests() previously covered in 2 — Start the model development process to ensure that your custom test results generated in 3 — Integrate custom tests are included in your documentation. Then, we'll view and update the configuration for the entire model documentation template to suit your needs.

Learn by doing

Our course tailor-made for developers new to ValidMind combines this series of notebooks with a more in-depth introduction to the ValidMind Platform: Developer Fundamentals

Prerequisites

In order to finalize the testing and documentation for your sample model, you'll need to first have:

Need help with the above steps?

Refer to the first three notebooks in this series:

Setting up

This section should be very familiar to you by now, as we performed the same actions in the previous two notebooks in this series.

Initialize the ValidMind Library

As usual, let's first connect the ValidMind Library to the model we previously registered in the ValidMind Platform:

  1. In a browser, log in to ValidMind.

  2. In the left sidebar, navigate to Inventory and select the model you registered for this "ValidMind for model development" series of notebooks.

  3. Go to Getting Started and click Copy snippet to clipboard.

Next, load your model identifier credentials from an .env file or replace the placeholder with your own code snippet:

# Make sure the ValidMind Library is installed

%pip install -q validmind

# Load your model identifier credentials from an `.env` file

%load_ext dotenv
%dotenv .env

# Or replace with your code snippet

import validmind as vm

vm.init(
    # api_host="...",
    # api_key="...",
    # api_secret="...",
    # model="...",
)
Note: you may need to restart the kernel to use updated packages.
2026-01-10 01:54:17,394 - INFO(validmind.api_client): 🎉 Connected to ValidMind!
📊 Model: [ValidMind Academy] Model development (ID: cmalgf3qi02ce199qm3rdkl46)
📁 Document Type: model_documentation
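
Before calling vm.init(), it can help to confirm that the credentials from your .env file were actually loaded. The following is a minimal sketch, not part of the ValidMind Library: missing_credentials is a hypothetical helper, and the variable names VM_API_HOST, VM_API_KEY, VM_API_SECRET, and VM_API_MODEL are an assumption about what your .env file contains, so adjust them to match your own snippet.

```python
import os

# Assumed variable names; adjust them to match your own .env file.
REQUIRED_VARS = ("VM_API_HOST", "VM_API_KEY", "VM_API_SECRET", "VM_API_MODEL")


def missing_credentials(env=None, required=REQUIRED_VARS):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


# Check the current process environment (after `%dotenv .env` has run)
missing = missing_credentials()
if missing:
    print(f"Missing credentials: {', '.join(missing)}")
else:
    print("All expected credentials are set.")
```

Passing a plain dictionary as env makes the helper easy to try out without touching your real environment.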

Import sample dataset

Next, we'll import the same public Bank Customer Churn Prediction dataset from Kaggle we used in the last notebooks so that we have something to work with:

from validmind.datasets.classification import customer_churn as demo_dataset

print(
    f"Loaded demo dataset with: \n\n\t• Target column: '{demo_dataset.target_column}' \n\t• Class labels: {demo_dataset.class_labels}"
)

raw_df = demo_dataset.load_data()
Loaded demo dataset with: 

    • Target column: 'Exited' 
    • Class labels: {'0': 'Did not exit', '1': 'Exited'}

We'll apply a simple rebalancing technique to the dataset before continuing:

import pandas as pd

raw_copy_df = raw_df.sample(frac=1)  # Shuffle the rows; sample() returns a new DataFrame, so raw_df is unchanged

# Create a balanced dataset with the same number of exited and not exited customers
exited_df = raw_copy_df.loc[raw_copy_df["Exited"] == 1]
not_exited_df = raw_copy_df.loc[raw_copy_df["Exited"] == 0].sample(n=exited_df.shape[0])

balanced_raw_df = pd.concat([exited_df, not_exited_df])
balanced_raw_df = balanced_raw_df.sample(frac=1, random_state=42)
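
To see what this kind of undersampling does to the class counts, you can try it on a small toy frame. This is a minimal sketch that assumes pandas is installed: undersample_majority is a hypothetical helper, the toy data is made up for illustration, and unlike the cell above it passes a seed to every sampling call so the result is reproducible.

```python
import pandas as pd


def undersample_majority(df, target, positive=1, negative=0, seed=42):
    """Downsample the negative class to the size of the positive class.

    Assumes the positive class is the minority; sample() raises an error
    if more rows are requested than the negative class contains.
    """
    positives = df.loc[df[target] == positive]
    negatives = df.loc[df[target] == negative].sample(
        n=len(positives), random_state=seed
    )
    # Concatenate, then shuffle so the two classes are interleaved
    return pd.concat([positives, negatives]).sample(frac=1, random_state=seed)


# Toy data: 2 customers who exited, 6 who did not
toy_df = pd.DataFrame({"Exited": [1, 1, 0, 0, 0, 0, 0, 0], "Tenure": range(8)})
balanced = undersample_majority(toy_df, target="Exited")
print(balanced["Exited"].value_counts().to_dict())
```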

Remove highly correlated features

Let's also quickly remove highly correlated features from the dataset using the output from a ValidMind test.

As you learned previously, before we can run tests you'll need to initialize a ValidMind dataset object:

# Register new data and now 'balanced_raw_dataset' is the new dataset object of interest
vm_balanced_raw_dataset = vm.init_dataset(
    dataset=balanced_raw_df,
    input_id="balanced_raw_dataset",
    target_column="Exited",
)

With our balanced dataset initialized, we can then run our test and utilize the output to help us identify the features we want to remove:

# Run HighPearsonCorrelation test with our balanced dataset as input and return a result object
corr_result = vm.tests.run_test(
    test_id="validmind.data_validation.HighPearsonCorrelation",
    params={"max_threshold": 0.3},
    inputs={"dataset": vm_balanced_raw_dataset},
)

❌ High Pearson Correlation

High Pearson Correlation is designed to identify pairs of features within a dataset that exhibit strong linear relationships, with the primary purpose of detecting potential feature redundancy or multicollinearity. This is crucial for ensuring that the predictive model remains interpretable and robust, as high correlations between features can obscure the true impact of individual variables and may lead to overfitting or instability in model estimates.

The test operates by calculating the Pearson correlation coefficient for every possible pair of features in the dataset. The Pearson correlation coefficient is a statistical measure that quantifies the strength and direction of a linear relationship between two continuous variables, ranging from -1 (perfect negative linear relationship) to +1 (perfect positive linear relationship), with 0 indicating no linear relationship. The test systematically computes these coefficients for all feature pairs, removes self-correlations and duplicate pairs, and then sorts the results by the absolute value of the coefficient. A pre-defined threshold, set at 0.3 in this instance, is used to determine whether a pair is considered highly correlated. Any pair with an absolute coefficient exceeding this threshold is flagged as a potential risk for multicollinearity. The test then presents the top n strongest correlations, regardless of whether they pass or fail the threshold, providing a transparent view of the most significant linear relationships in the data.

The primary advantages of this test include its efficiency and clarity in surfacing linear dependencies between features, which is particularly valuable during the early stages of model development and risk assessment. By highlighting pairs of variables with strong linear associations, the test enables practitioners to quickly identify and address potential sources of redundancy or instability in the model. The transparent tabular output, which lists feature pairs, their correlation coefficients, and pass/fail status, supports clear communication and documentation. This approach is especially useful in regulated environments or when model interpretability is a priority, as it provides a straightforward mechanism for monitoring and managing multicollinearity risks.

It should be noted that the test is limited to detecting only linear relationships, meaning it may not capture more complex or nonlinear dependencies that could also impact model performance. The Pearson correlation coefficient is sensitive to outliers, which can distort the measure and potentially lead to misleading conclusions about the strength of relationships. Additionally, the test focuses exclusively on pairwise relationships and does not account for higher-order interactions among three or more variables, which may also contribute to multicollinearity. High correlation coefficients, particularly those exceeding the threshold, are indicative of potential risk, but the test does not provide direct guidance on how to address these relationships or their impact on downstream modeling.

This test shows its results in a tabular format, where each row represents a unique pair of features from the dataset. The columns include the feature pair, the calculated Pearson correlation coefficient, and a pass/fail status based on whether the absolute value of the coefficient exceeds the threshold of 0.3. The coefficient values range from -1 to 1, with positive values indicating direct relationships and negative values indicating inverse relationships. In this specific output, the table lists the ten strongest correlations, sorted by the absolute value of the coefficient. The highest observed coefficient is 0.3561 for the pair (Age, Exited), which is the only pair exceeding the threshold and thus marked as "Fail." All other pairs have coefficients below the threshold, with values ranging from -0.1905 to -0.0336, and are marked as "Pass." The table provides a clear and interpretable summary of the linear relationships present in the data, allowing users to quickly assess the extent and nature of feature correlations.

The test results reveal the following key insights:

  • Single Pair Exceeds Correlation Threshold: Only the feature pair (Age, Exited) has a Pearson correlation coefficient (0.3561) that exceeds the threshold of 0.3, resulting in a "Fail" status for this pair, while all other pairs remain below the threshold.
  • Majority of Feature Pairs Show Low Correlation: The remaining nine feature pairs have coefficients ranging from -0.1905 to -0.0336, all of which are well below the threshold and are marked as "Pass," indicating generally weak linear relationships among most features.
  • Negative and Positive Correlations Present: Both positive and negative correlations are observed, with the strongest negative correlation being between (IsActiveMember, Exited) at -0.1905, and the strongest positive correlation, aside from the failed pair, being (Balance, Exited) at 0.1354.
  • No Evidence of Widespread Multicollinearity: The distribution of coefficients suggests that, apart from the (Age, Exited) pair, there is no evidence of widespread or severe multicollinearity among the top correlated feature pairs.
  • Feature Relationships Are Generally Stable: The relatively narrow range of correlation coefficients and the absence of multiple pairs near the threshold indicate stable and distinct feature relationships within the dataset.

Based on these results, the dataset exhibits a generally low level of linear correlation among its features, with only one pair, (Age, Exited), surpassing the pre-defined threshold for high correlation. This suggests that, overall, the risk of multicollinearity affecting model interpretability or stability is minimal, as most feature pairs demonstrate weak linear associations. The presence of both positive and negative correlations, none of which approach the threshold except for the single failed pair, further supports the conclusion that the features are largely independent in their linear relationships. The clear separation between the failed pair and the rest of the feature pairs highlights the isolated nature of the observed high correlation, rather than a systemic pattern across the dataset. This pattern indicates that the model is unlikely to be adversely impacted by feature redundancy or instability due to linear dependencies, and the feature set maintains a high degree of distinctiveness in its predictive contributions.

Parameters:

{
  "max_threshold": 0.3
}
            

Tables

Columns                          Coefficient  Pass/Fail
(Age, Exited)                         0.3561  Fail
(IsActiveMember, Exited)             -0.1905  Pass
(Balance, NumOfProducts)             -0.1722  Pass
(Balance, Exited)                     0.1354  Pass
(NumOfProducts, IsActiveMember)       0.0448  Pass
(Tenure, IsActiveMember)             -0.0430  Pass
(NumOfProducts, Exited)              -0.0406  Pass
(Tenure, EstimatedSalary)             0.0404  Pass
(CreditScore, Exited)                -0.0399  Pass
(HasCrCard, IsActiveMember)          -0.0336  Pass
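
The pairwise check described in the test output can be approximated in a few lines of plain Python. This is only a sketch of the idea (compute a coefficient for each unordered pair, compare its absolute value to the threshold, sort by strength); it is not ValidMind's implementation, and pearson, flag_pairs, and the toy columns are hypothetical names used for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences.

    Raises ZeroDivisionError if either sequence is constant.
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def flag_pairs(columns, max_threshold=0.3):
    """Return (pair, coefficient, status) rows sorted by absolute coefficient."""
    rows = []
    # combinations() yields each unordered pair once and never a self-pair
    for a, b in combinations(columns, 2):
        coef = pearson(columns[a], columns[b])
        status = "Fail" if abs(coef) > max_threshold else "Pass"
        rows.append(((a, b), round(coef, 4), status))
    return sorted(rows, key=lambda row: abs(row[1]), reverse=True)


toy = {
    "x": [1, 2, 3, 4, 5],
    "y": [2, 4, 6, 8, 10],   # exactly 2 * x, so perfectly correlated with x
    "z": [1, -1, 0, -1, 1],  # constructed to have zero covariance with x
}
for row in flag_pairs(toy):
    print(row)
```

Only the (x, y) pair is flagged as "Fail" here; the two pairs involving z have a coefficient of zero and pass.
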
# From result object, extract table from `corr_result.tables`
features_df = corr_result.tables[0].data
features_df
   Columns                          Coefficient  Pass/Fail
0  (Age, Exited)                         0.3561  Fail
1  (IsActiveMember, Exited)             -0.1905  Pass
2  (Balance, NumOfProducts)             -0.1722  Pass
3  (Balance, Exited)                     0.1354  Pass
4  (NumOfProducts, IsActiveMember)       0.0448  Pass
5  (Tenure, IsActiveMember)             -0.0430  Pass
6  (NumOfProducts, Exited)              -0.0406  Pass
7  (Tenure, EstimatedSalary)             0.0404  Pass
8  (CreditScore, Exited)                -0.0399  Pass
9  (HasCrCard, IsActiveMember)          -0.0336  Pass
# Extract list of features that failed the test
high_correlation_features = features_df[features_df["Pass/Fail"] == "Fail"]["Columns"].tolist()
high_correlation_features
['(Age, Exited)']
# Extract feature names from the list of strings
high_correlation_features = [feature.split(",")[0].strip("()") for feature in high_correlation_features]
high_correlation_features
['Age']
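
The extraction above always takes the first name in each pair, which works here because the target column happens to come second in "(Age, Exited)". A slightly more defensive variant might skip the target explicitly and avoid duplicates. This is a sketch only: features_to_drop is a hypothetical helper, and it assumes column names contain no commas or parentheses.

```python
def features_to_drop(pair_labels, target):
    """Pick one non-target feature from each '(A, B)' pair label."""
    to_drop = []
    for label in pair_labels:
        names = [name.strip() for name in label.strip("()").split(",")]
        # Never drop the target; otherwise take the first feature in the pair
        candidates = [name for name in names if name != target]
        if candidates and candidates[0] not in to_drop:
            to_drop.append(candidates[0])
    return to_drop


print(features_to_drop(["(Age, Exited)"], target="Exited"))
# ['Age']
print(features_to_drop(["(Exited, Age)", "(Age, Balance)"], target="Exited"))
# ['Age']
```
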

We can then remove the highly correlated features, re-initialize the dataset with a different input_id, and re-run the test for confirmation:

# Remove the highly correlated features from the dataset
balanced_raw_no_age_df = balanced_raw_df.drop(columns=high_correlation_features)

# Re-initialize the dataset object
vm_raw_dataset_preprocessed = vm.init_dataset(
    dataset=balanced_raw_no_age_df,
    input_id="raw_dataset_preprocessed",
    target_column="Exited",
)
# Re-run the test with the reduced feature set
corr_result = vm.tests.run_test(
    test_id="validmind.data_validation.HighPearsonCorrelation",
    params={"max_threshold": 0.3},
    inputs={"dataset": vm_raw_dataset_preprocessed},
)

✅ High Pearson Correlation

High Pearson Correlation is designed to identify pairs of features within a dataset that exhibit strong linear relationships, with the primary purpose of detecting potential feature redundancy or multicollinearity. This is crucial for ensuring that the predictive model remains interpretable and robust, as highly correlated features can obscure the true impact of individual variables and may lead to overfitting or instability in model coefficients.

The test operates by calculating the Pearson correlation coefficient for every possible pair of features in the dataset. The Pearson correlation coefficient is a statistical measure that quantifies the strength and direction of a linear relationship between two continuous variables, producing values that range from -1 to 1. A value of 1 indicates a perfect positive linear relationship, -1 indicates a perfect negative linear relationship, and 0 indicates no linear relationship. The test systematically computes these coefficients for all feature pairs, removes self-correlations and duplicate pairs, and then compares the absolute value of each coefficient to a predefined threshold, which in this case is set at 0.3. If the absolute value exceeds this threshold, the pair is flagged as potentially problematic due to high correlation. The test then returns the top n pairs with the strongest correlations, regardless of whether they pass or fail the threshold, providing a transparent view of the most significant linear relationships in the data.

The primary advantages of this test include its efficiency and clarity in surfacing linear dependencies between features, which is particularly valuable during the early stages of model development and risk assessment. By highlighting pairs of features with strong linear associations, the test enables practitioners to quickly identify and address multicollinearity, which can otherwise compromise model interpretability and stability. The output is straightforward, listing feature pairs, their correlation coefficients, and pass/fail status, making it easy to communicate results to both technical and non-technical stakeholders. This transparency supports informed decision-making regarding feature selection, engineering, and model design, and helps ensure that the model's predictive power is not artificially inflated by redundant information.

It should be noted that the test is limited to detecting linear relationships and does not capture more complex, nonlinear dependencies that may exist between features. Additionally, the Pearson correlation coefficient is sensitive to outliers, which can distort the true strength of the relationship between variables. The test only examines pairwise relationships, so it may miss higher-order interactions involving three or more features. Furthermore, the presence of high correlation coefficients is a sign of potential risk, as it may indicate redundancy or multicollinearity, but the absence of high coefficients does not guarantee that the dataset is free from all forms of dependency or redundancy. Interpretation of the results should therefore be contextualized within the broader modeling and data exploration process.

This test shows its results in the form of a table, where each row represents a unique pair of features from the dataset. The columns include the feature pair, the Pearson correlation coefficient, and a pass/fail status based on whether the absolute value of the coefficient exceeds the threshold of 0.3. The coefficients are presented as decimal values, typically ranging from -1 to 1, with negative values indicating inverse relationships and positive values indicating direct relationships. In this particular output, all coefficients are well below the threshold, with the highest absolute value being -0.1905 for the pair (IsActiveMember, Exited). The table is sorted by the strength of the correlation, allowing for quick identification of the most strongly related pairs. Notably, all pairs in this result have a "Pass" status, indicating that none of the observed correlations exceed the threshold for high risk. The range of coefficients spans from -0.1905 to 0.0404, suggesting generally weak linear relationships among the top pairs. This format allows users to easily scan for both the magnitude and direction of relationships, as well as to assess compliance with the predefined risk criteria.

The test results reveal the following key insights:

  • No Feature Pairs Exceed Correlation Threshold: All observed Pearson correlation coefficients are below the threshold of 0.3, with the highest absolute value being -0.1905, indicating no pairs are flagged for high linear correlation.
  • Weak Linear Relationships Dominate: The coefficients for the top ten feature pairs range from -0.1905 to 0.0404, demonstrating that the strongest relationships in the dataset are weak and unlikely to contribute to multicollinearity.
  • Balanced Distribution of Positive and Negative Correlations: Both positive and negative correlations are present, with the most negative being between IsActiveMember and Exited (-0.1905) and the most positive between Tenure and EstimatedSalary (0.0404), suggesting no systematic directional bias.
  • No Evidence of Redundancy Among Key Features: Feature pairs involving critical variables such as Balance, NumOfProducts, and Exited all show low correlation coefficients, supporting the independence of these features in the dataset.
  • Consistent Pass Status Across All Pairs: Every feature pair in the top ten receives a "Pass" status, reinforcing the observation that the dataset does not exhibit problematic linear dependencies among its most strongly related features.

Based on these results, the dataset demonstrates a low degree of linear association among its top feature pairs, as evidenced by the absence of any coefficients exceeding the 0.3 threshold and the uniformly weak correlations observed. The distribution of both positive and negative coefficients, all within a narrow range, suggests that the features are largely independent in terms of linear relationships, reducing the likelihood of multicollinearity affecting model performance or interpretability. The consistent "Pass" status across all pairs further supports the conclusion that the dataset is structurally sound with respect to linear feature dependencies. These characteristics indicate that the model built on this dataset is unlikely to be compromised by feature redundancy or instability in coefficient estimation due to high pairwise correlations, thereby supporting robust and interpretable modeling outcomes.

Parameters:

{
  "max_threshold": 0.3
}
            

Tables

Columns                          Coefficient  Pass/Fail
(IsActiveMember, Exited)             -0.1905  Pass
(Balance, NumOfProducts)             -0.1722  Pass
(Balance, Exited)                     0.1354  Pass
(NumOfProducts, IsActiveMember)       0.0448  Pass
(Tenure, IsActiveMember)             -0.0430  Pass
(NumOfProducts, Exited)              -0.0406  Pass
(Tenure, EstimatedSalary)             0.0404  Pass
(CreditScore, Exited)                -0.0399  Pass
(HasCrCard, IsActiveMember)          -0.0336  Pass
(CreditScore, EstimatedSalary)       -0.0321  Pass

Train the model

We'll then train a simple logistic regression model on our prepared dataset using scikit-learn:

# First encode the categorical features in our dataset with the highly correlated features removed
balanced_raw_no_age_df = pd.get_dummies(
    balanced_raw_no_age_df, columns=["Geography", "Gender"], drop_first=True
)
balanced_raw_no_age_df.head()
      CreditScore  Tenure   Balance  NumOfProducts  HasCrCard  IsActiveMember  EstimatedSalary  Exited  Geography_Germany  Geography_Spain  Gender_Male
 134          721       2      0.00              2          1               1        106977.80       0              False             True        False
3230          640       4      0.00              2          1               0         44904.26       0              False            False        False
3895          605       5  91612.91              1          1               1         28427.84       0              False            False         True
1101          662       4  90350.77              1          1               0         75884.65       1              False             True        False
  69          777       2      0.00              1          1               0        136458.19       1              False            False        False
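
The encoded frame has no Geography_France or Gender_Female column because drop_first=True drops the first category of each encoded column and treats it as the baseline. A minimal sketch on made-up toy data, assuming pandas is installed, shows the effect:

```python
import pandas as pd

# Toy data for illustration only
toy_df = pd.DataFrame(
    {
        "Geography": ["France", "Germany", "Spain", "France"],
        "Gender": ["Female", "Male", "Male", "Female"],
        "Tenure": [2, 4, 5, 4],
    }
)

# One indicator column per category, minus the first category of each feature
encoded = pd.get_dummies(toy_df, columns=["Geography", "Gender"], drop_first=True)
print(list(encoded.columns))
# ['Tenure', 'Geography_Germany', 'Geography_Spain', 'Gender_Male']
```

A row with both Geography indicators set to False therefore represents France, which avoids a redundant column that the other indicators already determine.
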
# Split the processed dataset into train and test
from sklearn.model_selection import train_test_split

train_df, test_df = train_test_split(balanced_raw_no_age_df, test_size=0.20)

X_train = train_df.drop("Exited", axis=1)
y_train = train_df["Exited"]
X_test = test_df.drop("Exited", axis=1)
y_test = test_df["Exited"]
from sklearn.linear_model import LogisticRegression

# Logistic Regression grid params
log_reg_params = {
    "penalty": ["l1", "l2"],
    "C": [0.001, 0.01, 0.1, 1, 10, 100, 1000],
    "solver": ["liblinear"],
}

# Grid search for Logistic Regression
from sklearn.model_selection import GridSearchCV

grid_log_reg = GridSearchCV(LogisticRegression(), log_reg_params)
grid_log_reg.fit(X_train, y_train)

# Logistic Regression best estimator
log_reg = grid_log_reg.best_estimator_
/opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sklearn/linear_model/_logistic.py:1135: FutureWarning:

'penalty' was deprecated in version 1.8 and will be removed in 1.10. To avoid this warning, leave 'penalty' set to its default value and use 'l1_ratio' or 'C' instead. Use l1_ratio=0 instead of penalty='l2', l1_ratio=1 instead of penalty='l1', and C=np.inf instead of penalty=None.

/opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sklearn/linear_model/_logistic.py:1160: UserWarning:

Inconsistent values: penalty=l1 with l1_ratio=0.0. penalty is deprecated. Please use l1_ratio only.
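
The volume of warnings reflects how many models the grid search fits: a warning can be raised for each candidate on each cross-validation fold. As a rough sketch (this only counts combinations, it does not call scikit-learn), the grid above can be enumerated with the standard library. The fold count of 5 is GridSearchCV's default when cv is not passed.

```python
from itertools import product

log_reg_params = {
    "penalty": ["l1", "l2"],
    "C": [0.001, 0.01, 0.1, 1, 10, 100, 1000],
    "solver": ["liblinear"],
}

# Every combination of the listed values is one candidate model
names = list(log_reg_params)
candidates = [
    dict(zip(names, values)) for values in product(*log_reg_params.values())
]

cv_folds = 5  # GridSearchCV's default when `cv` is not passed
total_fits = len(candidates) * cv_folds
print(f"{len(candidates)} candidates x {cv_folds} folds = {total_fits} fits")
# 14 candidates x 5 folds = 70 fits
```

One additional fit follows when the best candidate is refit on the full training set, which is what best_estimator_ returns.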

/opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sklearn/linear_model/_logistic.py:1135: FutureWarning:

'penalty' was deprecated in version 1.8 and will be removed in 1.10. To avoid this warning, leave 'penalty' set to its default value and use 'l1_ratio' or 'C' instead. Use l1_ratio=0 instead of penalty='l2', l1_ratio=1 instead of penalty='l1', and C=np.inf instead of penalty=None.

/opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sklearn/linear_model/_logistic.py:1160: UserWarning:

Inconsistent values: penalty=l1 with l1_ratio=0.0. penalty is deprecated. Please use l1_ratio only.

/opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sklearn/linear_model/_logistic.py:1135: FutureWarning:

'penalty' was deprecated in version 1.8 and will be removed in 1.10. To avoid this warning, leave 'penalty' set to its default value and use 'l1_ratio' or 'C' instead. Use l1_ratio=0 instead of penalty='l2', l1_ratio=1 instead of penalty='l1', and C=np.inf instead of penalty=None.

/opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sklearn/linear_model/_logistic.py:1160: UserWarning:

Inconsistent values: penalty=l1 with l1_ratio=0.0. penalty is deprecated. Please use l1_ratio only.

/opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sklearn/linear_model/_logistic.py:1135: FutureWarning:

'penalty' was deprecated in version 1.8 and will be removed in 1.10. To avoid this warning, leave 'penalty' set to its default value and use 'l1_ratio' or 'C' instead. Use l1_ratio=0 instead of penalty='l2', l1_ratio=1 instead of penalty='l1', and C=np.inf instead of penalty=None.

/opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sklearn/linear_model/_logistic.py:1160: UserWarning:

Inconsistent values: penalty=l1 with l1_ratio=0.0. penalty is deprecated. Please use l1_ratio only.

/opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sklearn/linear_model/_logistic.py:1135: FutureWarning:

'penalty' was deprecated in version 1.8 and will be removed in 1.10. To avoid this warning, leave 'penalty' set to its default value and use 'l1_ratio' or 'C' instead. Use l1_ratio=0 instead of penalty='l2', l1_ratio=1 instead of penalty='l1', and C=np.inf instead of penalty=None.

/opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sklearn/linear_model/_logistic.py:1160: UserWarning:

Inconsistent values: penalty=l1 with l1_ratio=0.0. penalty is deprecated. Please use l1_ratio only.

/opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sklearn/linear_model/_logistic.py:1135: FutureWarning:

'penalty' was deprecated in version 1.8 and will be removed in 1.10. To avoid this warning, leave 'penalty' set to its default value and use 'l1_ratio' or 'C' instead. Use l1_ratio=0 instead of penalty='l2', l1_ratio=1 instead of penalty='l1', and C=np.inf instead of penalty=None.

/opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sklearn/linear_model/_logistic.py:1135: FutureWarning:

'penalty' was deprecated in version 1.8 and will be removed in 1.10. To avoid this warning, leave 'penalty' set to its default value and use 'l1_ratio' or 'C' instead. Use l1_ratio=0 instead of penalty='l2', l1_ratio=1 instead of penalty='l1', and C=np.inf instead of penalty=None.

/opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sklearn/linear_model/_logistic.py:1135: FutureWarning:

'penalty' was deprecated in version 1.8 and will be removed in 1.10. To avoid this warning, leave 'penalty' set to its default value and use 'l1_ratio' or 'C' instead. Use l1_ratio=0 instead of penalty='l2', l1_ratio=1 instead of penalty='l1', and C=np.inf instead of penalty=None.

/opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sklearn/linear_model/_logistic.py:1135: FutureWarning:

'penalty' was deprecated in version 1.8 and will be removed in 1.10. To avoid this warning, leave 'penalty' set to its default value and use 'l1_ratio' or 'C' instead. Use l1_ratio=0 instead of penalty='l2', l1_ratio=1 instead of penalty='l1', and C=np.inf instead of penalty=None.

/opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sklearn/linear_model/_logistic.py:1135: FutureWarning:

'penalty' was deprecated in version 1.8 and will be removed in 1.10. To avoid this warning, leave 'penalty' set to its default value and use 'l1_ratio' or 'C' instead. Use l1_ratio=0 instead of penalty='l2', l1_ratio=1 instead of penalty='l1', and C=np.inf instead of penalty=None.

/opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sklearn/linear_model/_logistic.py:1135: FutureWarning:

'penalty' was deprecated in version 1.8 and will be removed in 1.10. To avoid this warning, leave 'penalty' set to its default value and use 'l1_ratio' or 'C' instead. Use l1_ratio=0 instead of penalty='l2', l1_ratio=1 instead of penalty='l1', and C=np.inf instead of penalty=None.

/opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sklearn/linear_model/_logistic.py:1160: UserWarning:

Inconsistent values: penalty=l1 with l1_ratio=0.0. penalty is deprecated. Please use l1_ratio only.

/opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sklearn/linear_model/_logistic.py:1135: FutureWarning:

'penalty' was deprecated in version 1.8 and will be removed in 1.10. To avoid this warning, leave 'penalty' set to its default value and use 'l1_ratio' or 'C' instead. Use l1_ratio=0 instead of penalty='l2', l1_ratio=1 instead of penalty='l1', and C=np.inf instead of penalty=None.

/opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sklearn/linear_model/_logistic.py:1160: UserWarning:

Inconsistent values: penalty=l1 with l1_ratio=0.0. penalty is deprecated. Please use l1_ratio only.

/opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sklearn/linear_model/_logistic.py:1135: FutureWarning:

'penalty' was deprecated in version 1.8 and will be removed in 1.10. To avoid this warning, leave 'penalty' set to its default value and use 'l1_ratio' or 'C' instead. Use l1_ratio=0 instead of penalty='l2', l1_ratio=1 instead of penalty='l1', and C=np.inf instead of penalty=None.

/opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sklearn/linear_model/_logistic.py:1160: UserWarning:

Inconsistent values: penalty=l1 with l1_ratio=0.0. penalty is deprecated. Please use l1_ratio only.

/opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sklearn/linear_model/_logistic.py:1135: FutureWarning:

'penalty' was deprecated in version 1.8 and will be removed in 1.10. To avoid this warning, leave 'penalty' set to its default value and use 'l1_ratio' or 'C' instead. Use l1_ratio=0 instead of penalty='l2', l1_ratio=1 instead of penalty='l1', and C=np.inf instead of penalty=None.

/opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sklearn/linear_model/_logistic.py:1160: UserWarning:

Inconsistent values: penalty=l1 with l1_ratio=0.0. penalty is deprecated. Please use l1_ratio only.

/opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sklearn/linear_model/_logistic.py:1135: FutureWarning:

'penalty' was deprecated in version 1.8 and will be removed in 1.10. To avoid this warning, leave 'penalty' set to its default value and use 'l1_ratio' or 'C' instead. Use l1_ratio=0 instead of penalty='l2', l1_ratio=1 instead of penalty='l1', and C=np.inf instead of penalty=None.

/opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sklearn/linear_model/_logistic.py:1160: UserWarning:

Inconsistent values: penalty=l1 with l1_ratio=0.0. penalty is deprecated. Please use l1_ratio only.

/opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sklearn/linear_model/_logistic.py:1135: FutureWarning:

'penalty' was deprecated in version 1.8 and will be removed in 1.10. To avoid this warning, leave 'penalty' set to its default value and use 'l1_ratio' or 'C' instead. Use l1_ratio=0 instead of penalty='l2', l1_ratio=1 instead of penalty='l1', and C=np.inf instead of penalty=None.

/opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sklearn/linear_model/_logistic.py:1160: UserWarning:

Inconsistent values: penalty=l1 with l1_ratio=0.0. penalty is deprecated. Please use l1_ratio only.

Initialize the ValidMind objects

Let's initialize the ValidMind Dataset and Model objects in preparation for assigning model predictions to each dataset:

# Initialize the datasets into their own dataset objects
vm_train_ds = vm.init_dataset(
    input_id="train_dataset_final",
    dataset=train_df,
    target_column="Exited",
)

vm_test_ds = vm.init_dataset(
    input_id="test_dataset_final",
    dataset=test_df,
    target_column="Exited",
)

# Initialize a model object
vm_model = vm.init_model(log_reg, input_id="log_reg_model_v1")

Assign predictions

With the model object initialized, we'll assign predictions to the training and test datasets:

vm_train_ds.assign_predictions(model=vm_model)
vm_test_ds.assign_predictions(model=vm_model)

2026-01-10 01:55:12,568 - INFO(validmind.vm_models.dataset.utils): Running predict_proba()... This may take a while
2026-01-10 01:55:12,570 - INFO(validmind.vm_models.dataset.utils): Done running predict_proba()
2026-01-10 01:55:12,570 - INFO(validmind.vm_models.dataset.utils): Running predict()... This may take a while
2026-01-10 01:55:12,573 - INFO(validmind.vm_models.dataset.utils): Done running predict()
2026-01-10 01:55:12,575 - INFO(validmind.vm_models.dataset.utils): Running predict_proba()... This may take a while
2026-01-10 01:55:12,577 - INFO(validmind.vm_models.dataset.utils): Done running predict_proba()
2026-01-10 01:55:12,578 - INFO(validmind.vm_models.dataset.utils): Running predict()... This may take a while
2026-01-10 01:55:12,580 - INFO(validmind.vm_models.dataset.utils): Done running predict()
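The log lines above show two calls per dataset, predict_proba() and predict(). As a rough illustration of how those two outputs relate for a binary logistic regression, the pure-Python sketch below turns linear scores into class probabilities and then into class labels at the default 0.5 cutoff. The scores and the helper names (sigmoid, to_probabilities, to_labels) are made up for illustration and are not part of the ValidMind Library or scikit-learn.

```python
import math


def sigmoid(score):
    """Map a linear score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-score))


def to_probabilities(scores):
    """Return [P(class 0), P(class 1)] for each score, similar in shape to predict_proba output."""
    return [[1.0 - sigmoid(s), sigmoid(s)] for s in scores]


def to_labels(probabilities):
    """Pick the more probable class, which is a 0.5 cutoff on P(class 1)."""
    return [1 if p1 > p0 else 0 for p0, p1 in probabilities]


# Hypothetical linear scores for four customers (illustration only)
scores = [-2.0, -0.1, 0.3, 1.5]

probabilities = to_probabilities(scores)
labels = to_labels(probabilities)

for score, (p0, p1), label in zip(scores, probabilities, labels):
    print(f"score={score:+.1f}  P(0)={p0:.3f}  P(1)={p1:.3f}  label={label}")
```

Storing both the probabilities and the class labels on the dataset means later tests can use whichever form they need without calling the model again.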

Add custom tests

We'll also add the same custom tests we implemented in the previous notebook so that this session has access to the same custom inline test and local test provider.

Implement custom inline test

Let's set up a custom inline test that calculates the confusion matrix for a binary classification model:

# First create a confusion matrix plot
import matplotlib.pyplot as plt
from sklearn import metrics

# Get the predicted classes
y_pred = log_reg.predict(vm_test_ds.x)

# Store the matrix under a short name so it does not clash with the test function defined below
cm = metrics.confusion_matrix(y_test, y_pred)

cm_display = metrics.ConfusionMatrixDisplay(
    confusion_matrix=cm, display_labels=[False, True]
)
cm_display.plot()

# Create the reusable ConfusionMatrix inline test with an optional normalized matrix
@vm.test("my_custom_tests.ConfusionMatrix")
def confusion_matrix(dataset, model, normalize=False):
    """The confusion matrix is a table that is often used to describe the performance of a classification model on a set of data for which the true values are known.

    The confusion matrix is a 2x2 table that contains 4 values:

    - True Positive (TP): the number of correct positive predictions
    - True Negative (TN): the number of correct negative predictions
    - False Positive (FP): the number of incorrect positive predictions
    - False Negative (FN): the number of incorrect negative predictions

    The confusion matrix can be used to assess the holistic performance of a classification model because the accuracy, precision, recall, and F1 score of the model can all be derived from a single figure.
    """
    y_true = dataset.y
    y_pred = dataset.y_pred(model=model)

    # Use a distinct local name so the function's own name is not shadowed
    if normalize:
        # normalize="all" divides every cell by the total number of predictions
        cm = metrics.confusion_matrix(y_true, y_pred, normalize="all")
    else:
        cm = metrics.confusion_matrix(y_true, y_pred)

    cm_display = metrics.ConfusionMatrixDisplay(
        confusion_matrix=cm, display_labels=[False, True]
    )
    cm_display.plot()

    plt.close()  # close the plot to avoid displaying it

    return cm_display.figure_  # return the figure object itself

# Test dataset with normalize=True
result = vm.tests.run_test(
    "my_custom_tests.ConfusionMatrix:test_dataset_normalized",
    inputs={"model": vm_model, "dataset": vm_test_ds},
    params={"normalize": True},
)

Confusion Matrix Test Dataset Normalized

Confusion Matrix: Test Dataset Normalized is designed to provide a comprehensive summary of a classification model’s predictive performance by displaying the distribution of correct and incorrect predictions across the possible classes. The primary purpose of this test is to enable a clear, quantitative assessment of how well the model distinguishes between the positive and negative classes, using a structured tabular format that highlights both successes and errors in classification.

The test operates by constructing a two-by-two matrix that records the frequency of each prediction outcome: true positives (correctly predicted positives), true negatives (correctly predicted negatives), false positives (incorrectly predicted positives), and false negatives (incorrectly predicted negatives). In this normalized version, each cell value represents the proportion of total predictions falling into that category, rather than raw counts, allowing for direct comparison across different datasets or models regardless of sample size. The matrix is typically visualized as a heatmap, with axes representing the true and predicted labels. Key performance metrics such as accuracy, precision, recall, and F1 score can be derived from these proportions. Accuracy measures the overall proportion of correct predictions, precision quantifies the proportion of positive predictions that are correct, recall assesses the proportion of actual positives that are correctly identified, and the F1 score balances precision and recall. Each metric ranges from 0 to 1, where values closer to 1 indicate better performance. The normalized confusion matrix thus provides a holistic view of model behavior, highlighting both strengths and areas for improvement.
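To make the normalization step concrete, here is a minimal pure-Python sketch that counts the four outcomes and then divides every cell by the total number of predictions, which is what normalizing over all entries amounts to. The labels are invented for illustration, and the helper names (confusion_counts, normalize_all) are hypothetical rather than ValidMind or scikit-learn functions. The layout follows the description above: rows are true labels, columns are predicted labels.

```python
def confusion_counts(y_true, y_pred):
    """Return [[TN, FP], [FN, TP]]: rows are true labels, columns are predicted labels."""
    matrix = [[0, 0], [0, 0]]
    for true_label, predicted_label in zip(y_true, y_pred):
        matrix[int(true_label)][int(predicted_label)] += 1
    return matrix


def normalize_all(matrix):
    """Divide every cell by the total number of predictions."""
    total = sum(sum(row) for row in matrix)
    return [[cell / total for cell in row] for row in matrix]


# Invented labels for ten customers (illustration only)
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]
y_pred = [0, 0, 1, 0, 1, 1, 0, 1, 0, 0]

counts = confusion_counts(y_true, y_pred)
proportions = normalize_all(counts)

print("counts:     ", counts)
print("proportions:", proportions)
```

Because each cell is a share of all predictions, the normalized cells always add up to 1 regardless of how many rows the dataset has.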

The primary advantages of this test include its ability to present a detailed and interpretable breakdown of model performance across all prediction categories, making it especially useful for identifying specific types of errors such as false positives or false negatives. By normalizing the values, the test facilitates fair comparison between models or datasets of different sizes and class distributions. This is particularly valuable in domains where class imbalance is a concern, as it prevents misleading interpretations that could arise from raw counts. The visual representation as a heatmap further enhances interpretability, allowing stakeholders to quickly identify patterns and outliers. Additionally, the confusion matrix serves as a foundation for calculating a range of secondary metrics, supporting deeper diagnostic analysis and model refinement.

It should be noted that the confusion matrix, while informative, has certain limitations. It provides a static snapshot of model performance on a specific dataset and does not account for the underlying distribution of the data or the costs associated with different types of errors. In cases of severe class imbalance, even normalized values may obscure the practical impact of misclassifications. The test also does not capture the confidence of predictions or the model’s calibration, which can be critical in risk-sensitive applications. Interpretation challenges may arise if the matrix is used in isolation, without considering additional context such as the business domain or downstream effects of errors. Furthermore, the confusion matrix is limited to binary or multiclass classification tasks and is not applicable to regression or ranking problems.

This test shows a normalized confusion matrix presented as a color-coded heatmap, with the true labels on the vertical axis and the predicted labels on the horizontal axis. Each cell contains a value representing the proportion of total predictions for that true-predicted label combination, with the color intensity reflecting the magnitude of the value. The matrix is accompanied by a color bar indicating the scale, which ranges from approximately 0.18 to 0.33. The top-left cell (True Negative) has a value of 0.33, indicating that 33% of all predictions were correct negatives. The top-right cell (False Positive) shows 0.18, representing 18% incorrect positive predictions. The bottom-left cell (False Negative) is 0.20, meaning 20% of predictions were incorrect negatives, while the bottom-right cell (True Positive) is 0.30, indicating 30% correct positive predictions. The sum of all cells equals 1, as expected for a normalized matrix. The heatmap allows for immediate visual assessment of where the model performs well and where errors are concentrated, with the diagonal cells representing correct predictions and the off-diagonal cells representing misclassifications.

The test results reveal the following key insights:

  • Balanced Distribution of Correct Predictions: The model achieves similar proportions of true negatives (0.33) and true positives (0.30), indicating that it is comparably effective at identifying both negative and positive cases.
  • Moderate Rate of Misclassifications: The false positive rate (0.18) and false negative rate (0.20) are both below the rates of correct predictions, suggesting that the model makes fewer errors than correct classifications, but the error rates are still substantial.
  • Diagonal Dominance in the Matrix: The highest values are located on the diagonal (0.33 and 0.30), which is characteristic of a model that is generally able to distinguish between classes, though not with high precision.
  • Comparable Error Rates Across Classes: The false positive and false negative rates are close in value, indicating that the model does not disproportionately misclassify one class over the other.
  • Normalized Values Facilitate Interpretation: The use of normalized proportions allows for direct comparison and highlights that no single cell dominates the matrix, pointing to a relatively balanced model performance.

Based on these results, the model demonstrates a relatively balanced ability to correctly classify both positive and negative cases, as evidenced by the similar proportions of true positives and true negatives. The error rates for both false positives and false negatives are moderate and closely matched, indicating that the model does not exhibit a strong bias toward over-predicting or under-predicting either class. The diagonal dominance in the confusion matrix suggests that the model is generally effective at distinguishing between the two classes, though the presence of non-negligible off-diagonal values highlights areas where misclassifications occur. The normalized presentation of the results ensures that these observations are not confounded by class imbalance or dataset size, providing a clear and interpretable summary of model behavior. Overall, the confusion matrix reveals a model with balanced but not exceptional performance, with room for improvement in reducing both types of misclassification while maintaining its ability to correctly identify both classes.

Parameters:

{
  "normalize": true
}
            

Figures

ValidMind Figure my_custom_tests.ConfusionMatrix:test_dataset_normalized:b0ba
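
To make the derivation described in the test description concrete, here is a minimal sketch that computes accuracy, precision, recall, and F1 from the four normalized cell values reported for this run. The variable names are illustrative and not part of the ValidMind Library. Because the reported cells are rounded to two decimals they sum to 1.01, so the sketch divides by the actual total instead of assuming exactly 1.

```python
# Normalized cell values as reported in the test description above
# (rounded to two decimals, so they sum to 1.01 rather than exactly 1).
tn, fp, fn, tp = 0.33, 0.18, 0.20, 0.30

total = tn + fp + fn + tp

accuracy = (tp + tn) / total   # share of all predictions that are correct
precision = tp / (tp + fp)     # share of predicted positives that are correct
recall = tp / (tp + fn)        # share of actual positives that are identified
f1 = 2 * precision * recall / (precision + recall)

print(
    f"accuracy={accuracy:.3f} precision={precision:.3f} "
    f"recall={recall:.3f} f1={f1:.3f}"
)
```

With these inputs, precision comes out at about 0.625 and recall at about 0.6, which is consistent with the "balanced but not exceptional" reading in the description.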

Add a local test provider

Finally, let's save our custom inline test to our local test provider:

# Create custom tests folder
import os
import shutil

tests_folder = "my_tests"
os.makedirs(tests_folder, exist_ok=True)

# Remove existing test files and the bytecode cache left by previous runs
for f in os.listdir(tests_folder):
    path = os.path.join(tests_folder, f)
    if f.endswith(".py"):
        os.remove(path)
    elif f == "__pycache__":
        shutil.rmtree(path, ignore_errors=True)

# Save custom inline test to custom tests folder
confusion_matrix.save(
    tests_folder,
    imports=["import matplotlib.pyplot as plt", "from sklearn import metrics"],
)
2026-01-10 01:55:40,849 - INFO(validmind.tests.decorator): Saved to /home/runner/work/documentation/documentation/site/notebooks/EXECUTED/model_development/my_tests/ConfusionMatrix.py!Be sure to add any necessary imports to the top of the file.
2026-01-10 01:55:40,850 - INFO(validmind.tests.decorator): This metric can be run with the ID: <test_provider_namespace>.ConfusionMatrix
# Register local test provider
from validmind.tests import LocalTestProvider

# initialize the test provider with the tests folder we created earlier
my_test_provider = LocalTestProvider(tests_folder)

vm.tests.register_test_provider(
    namespace="my_test_provider",
    test_provider=my_test_provider,
)
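
The log message above notes that a saved test can be run with the ID <test_provider_namespace>.ConfusionMatrix. Purely as an illustration of that naming convention, the sketch below lists the IDs one would expect for the .py files directly inside a tests folder. The helper expected_test_ids is hypothetical and not part of the ValidMind Library. It assumes a flat folder in which each file name (minus .py) is the test name.

```python
import os
import tempfile


def expected_test_ids(tests_folder, namespace):
    """Hypothetical helper: build '<namespace>.<TestName>' for each .py file."""
    ids = []
    for filename in sorted(os.listdir(tests_folder)):
        stem, ext = os.path.splitext(filename)
        # Skip non-Python files and private modules such as __init__.py
        if ext == ".py" and not stem.startswith("_"):
            ids.append(f"{namespace}.{stem}")
    return ids


# Demonstrate with a throwaway folder standing in for "my_tests"
with tempfile.TemporaryDirectory() as demo_folder:
    open(os.path.join(demo_folder, "ConfusionMatrix.py"), "w").close()
    demo_ids = expected_test_ids(demo_folder, "my_test_provider")

print(demo_ids)
```

With the namespace registered above, the saved test would therefore be addressed as my_test_provider.ConfusionMatrix, which is the ID used in the configurations later in this notebook.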

Reconnect to ValidMind

After you insert test-driven blocks into your model documentation, changes should persist and become available every time you call vm.preview_template().

However, if you added test-driven blocks while the connection to the ValidMind Platform was already established, you'll need to reload that connection using reload():

vm.reload()

Now, when you run preview_template() again, the three test-driven blocks you added to your documentation in the last two notebooks should show up in the template in sections 2.3 Correlations and Interactions and 3.2 Model Evaluation:

vm.preview_template()
1. Conceptual Soundness ('conceptual_soundness')
2. Data Preparation ('data_preparation')
3. Model Development ('model_development')
4. Monitoring and Governance ('monitoring_governance')

Include custom test results

Since your custom test IDs are now part of your documentation template, you can run tests for an entire section and all additional custom tests should be loaded without any issues.

Let's run all tests in the Model Evaluation section of the documentation. Note that we have been running the sample custom confusion matrix with normalize=True to demonstrate the ability to provide custom parameters.

In the Run the model evaluation tests section of 2 — Start the model development process, you learned how to assign inputs to individual tests with run_documentation_tests(). Assigning parameters is similar: you only need to assign a params dictionary to a given test ID, my_test_provider.ConfusionMatrix in this case.

test_config = {
    "validmind.model_validation.sklearn.ClassifierPerformance:in_sample": {
        "inputs": {
            "dataset": vm_train_ds,
            "model": vm_model,
        },
    },
    "my_test_provider.ConfusionMatrix": {
        "params": {"normalize": True},
        "inputs": {"dataset": vm_test_ds, "model": vm_model},
    },
}
results = vm.run_documentation_tests(
    section=["model_evaluation"],
    inputs={
        "dataset": vm_test_ds,  # Any test that requires a single dataset will use vm_test_ds
        "model": vm_model,
        "datasets": (
            vm_train_ds,
            vm_test_ds,
        ),  # Any test that requires multiple datasets will use vm_train_ds and vm_test_ds
    },
    config=test_config,
)
2026-01-10 01:55:41,880 - WARNING(validmind.vm_models.test_suite.runner): Config key 'my_test_provider.ConfusionMatrix' does not match a test_id in the template.
    Ensure you registered a content block with the correct content_id in the template
    The configuration for this test will be ignored.
Test suite complete!
18/18 (100.0%)

Test Suite Results: Binary Classification V2


Check out the updated documentation on ValidMind.

Template for binary classification models.

Model Evaluation

Documentation template configuration

Let's call the utility function vm.get_test_suite().get_default_config(), which returns the default configuration for the entire documentation template as a dictionary:

  • This configuration will contain all the test IDs and their default parameters.
  • You can then modify this configuration as needed and pass it to run_documentation_tests() to run all tests in the documentation template.
  • You still have the option to continue running tests for one section at a time; get_default_config() simply provides a useful reference for providing default parameters to every test.
import json

model_test_suite = vm.get_test_suite()
config = model_test_suite.get_default_config()
print("Suite Config: \n", json.dumps(config, indent=2))
Suite Config: 
 {
  "validmind.data_validation.DatasetDescription": {
    "inputs": {
      "dataset": "dataset"
    },
    "params": {}
  },
  "validmind.data_validation.ClassImbalance": {
    "inputs": {
      "dataset": "dataset"
    },
    "params": {
      "min_percent_threshold": 10
    }
  },
  "validmind.data_validation.Duplicates": {
    "inputs": {
      "dataset": "dataset"
    },
    "params": {
      "min_threshold": 1
    }
  },
  "validmind.data_validation.HighCardinality": {
    "inputs": {
      "dataset": "dataset"
    },
    "params": {
      "num_threshold": 100,
      "percent_threshold": 0.1,
      "threshold_type": "percent"
    }
  },
  "validmind.data_validation.MissingValues": {
    "inputs": {
      "dataset": "dataset"
    },
    "params": {
      "min_threshold": 1
    }
  },
  "validmind.data_validation.Skewness": {
    "inputs": {
      "dataset": "dataset"
    },
    "params": {
      "max_threshold": 1
    }
  },
  "validmind.data_validation.UniqueRows": {
    "inputs": {
      "dataset": "dataset"
    },
    "params": {
      "min_percent_threshold": 1
    }
  },
  "validmind.data_validation.TooManyZeroValues": {
    "inputs": {
      "dataset": "dataset"
    },
    "params": {
      "max_percent_threshold": 0.03
    }
  },
  "validmind.data_validation.IQROutliersTable": {
    "inputs": {
      "dataset": "dataset"
    },
    "params": {
      "threshold": 1.5
    }
  },
  "validmind.data_validation.IQROutliersBarPlot": {
    "inputs": {
      "dataset": "dataset"
    },
    "params": {
      "threshold": 1.5,
      "fig_width": 800
    }
  },
  "validmind.data_validation.DescriptiveStatistics": {
    "inputs": {
      "dataset": "dataset"
    },
    "params": {}
  },
  "validmind.data_validation.PearsonCorrelationMatrix": {
    "inputs": {
      "dataset": "dataset"
    },
    "params": {}
  },
  "validmind.data_validation.HighPearsonCorrelation": {
    "inputs": {
      "dataset": "dataset"
    },
    "params": {
      "max_threshold": 0.3,
      "top_n_correlations": 10,
      "feature_columns": null
    }
  },
  "validmind.model_validation.ModelMetadata": {
    "inputs": {
      "model": "model"
    },
    "params": {}
  },
  "validmind.data_validation.DatasetSplit": {
    "inputs": {
      "datasets": "datasets"
    },
    "params": {}
  },
  "validmind.model_validation.sklearn.PopulationStabilityIndex": {
    "inputs": {
      "datasets": "datasets",
      "model": "model"
    },
    "params": {
      "num_bins": 10,
      "mode": "fixed"
    }
  },
  "validmind.model_validation.sklearn.ConfusionMatrix": {
    "inputs": {
      "dataset": "dataset",
      "model": "model"
    },
    "params": {
      "threshold": 0.5
    }
  },
  "validmind.model_validation.sklearn.ClassifierPerformance:in_sample": {
    "inputs": {
      "dataset": "dataset",
      "model": "model"
    },
    "params": {
      "average": "macro"
    }
  },
  "validmind.model_validation.sklearn.ClassifierPerformance:out_of_sample": {
    "inputs": {
      "dataset": "dataset",
      "model": "model"
    },
    "params": {
      "average": "macro"
    }
  },
  "validmind.model_validation.sklearn.PrecisionRecallCurve": {
    "inputs": {
      "model": "model",
      "dataset": "dataset"
    },
    "params": {}
  },
  "validmind.model_validation.sklearn.ROCCurve": {
    "inputs": {
      "model": "model",
      "dataset": "dataset"
    },
    "params": {}
  },
  "validmind.model_validation.sklearn.TrainingTestDegradation": {
    "inputs": {
      "datasets": "datasets",
      "model": "model"
    },
    "params": {
      "max_threshold": 0.1
    }
  },
  "validmind.model_validation.sklearn.MinimumAccuracy": {
    "inputs": {
      "dataset": "dataset",
      "model": "model"
    },
    "params": {
      "min_threshold": 0.7
    }
  },
  "validmind.model_validation.sklearn.MinimumF1Score": {
    "inputs": {
      "dataset": "dataset",
      "model": "model"
    },
    "params": {
      "min_threshold": 0.5
    }
  },
  "validmind.model_validation.sklearn.MinimumROCAUCScore": {
    "inputs": {
      "dataset": "dataset",
      "model": "model"
    },
    "params": {
      "min_threshold": 0.5
    }
  },
  "validmind.model_validation.sklearn.PermutationFeatureImportance": {
    "inputs": {
      "model": "model",
      "dataset": "dataset"
    },
    "params": {
      "fontsize": null,
      "figure_height": null
    }
  },
  "validmind.model_validation.sklearn.SHAPGlobalImportance": {
    "inputs": {
      "model": "model",
      "dataset": "dataset"
    },
    "params": {
      "kernel_explainer_samples": 10,
      "tree_or_linear_explainer_samples": 200,
      "class_of_interest": null
    }
  },
  "validmind.model_validation.sklearn.WeakspotsDiagnosis": {
    "inputs": {
      "datasets": "datasets",
      "model": "model"
    },
    "params": {
      "features_columns": null,
      "metrics": null,
      "thresholds": null
    }
  },
  "validmind.model_validation.sklearn.OverfitDiagnosis": {
    "inputs": {
      "model": "model",
      "datasets": "datasets"
    },
    "params": {
      "metric": null,
      "cut_off_threshold": 0.04
    }
  },
  "validmind.model_validation.sklearn.RobustnessDiagnosis": {
    "inputs": {
      "datasets": "datasets",
      "model": "model"
    },
    "params": {
      "metric": null,
      "scaling_factor_std_dev_list": [
        0.1,
        0.2,
        0.3,
        0.4,
        0.5
      ],
      "performance_decay_threshold": 0.05
    }
  }
}
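
As a minimal sketch of the "modify this configuration" step, the snippet below copies a config dictionary and overrides one parameter before it would be passed to run_documentation_tests(). To stay self-contained it uses a two-entry excerpt of the output above instead of calling get_default_config(). The helper with_param_overrides and the override value 20 are illustrative assumptions, not part of the ValidMind Library.

```python
import copy

# Two-entry excerpt of the default config printed above
default_config = {
    "validmind.data_validation.ClassImbalance": {
        "inputs": {"dataset": "dataset"},
        "params": {"min_percent_threshold": 10},
    },
    "validmind.data_validation.Skewness": {
        "inputs": {"dataset": "dataset"},
        "params": {"max_threshold": 1},
    },
}


def with_param_overrides(config, overrides):
    """Hypothetical helper: return a deep copy of config with params updated."""
    updated = copy.deepcopy(config)  # leave the original config untouched
    for test_id, params in overrides.items():
        if test_id not in updated:
            raise KeyError(f"Unknown test ID: {test_id}")
        updated[test_id].setdefault("params", {}).update(params)
    return updated


custom_config = with_param_overrides(
    default_config,
    {"validmind.data_validation.ClassImbalance": {"min_percent_threshold": 20}},
)
print(custom_config["validmind.data_validation.ClassImbalance"]["params"])
```

Working on a deep copy keeps the defaults available as a reference, and raising on unknown test IDs catches typos before a long test run starts.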

Update the config

The default config does not assign any of your own inputs to a test: the input values it lists are generic placeholders such as "dataset", "datasets", and "model". You can assign inputs to individual tests as needed, depending on the datasets and models you want to pass to them.

For this particular documentation template (binary classification), the ValidMind Library provides a sample configuration that can be used to populate the entire model documentation using the following inputs as placeholders:

  • A raw dataset (raw_dataset)
  • A training dataset (train_dataset)
  • A test dataset (test_dataset)
  • A trained model instance

As part of updating the config, you will need to ensure that the correct input_id values are used in the final config passed to run_documentation_tests().

from validmind.datasets.classification import customer_churn
from validmind.utils import preview_test_config

test_config = customer_churn.get_demo_test_config()
preview_test_config(test_config)
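
One way to swap placeholder input names for your own input_id values is a small remapping pass over the config. The sketch below is an illustration under stated assumptions: remap_input_ids is a hypothetical helper, the sample entries are written by hand in the same shape as the configs in this notebook (they are not the actual output of get_demo_test_config()), and the mapping uses the input_id values from this notebook series.

```python
def remap_input_ids(config, id_map):
    """Hypothetical helper: replace placeholder input names with real input_ids."""
    remapped = {}
    for test_id, test_cfg in config.items():
        new_cfg = dict(test_cfg)
        if "inputs" in test_cfg:
            new_inputs = {}
            for name, value in test_cfg["inputs"].items():
                if isinstance(value, (list, tuple)):
                    # Multi-dataset inputs are given as a sequence of input_ids
                    new_inputs[name] = [id_map.get(v, v) for v in value]
                else:
                    new_inputs[name] = id_map.get(value, value)
            new_cfg["inputs"] = new_inputs
        remapped[test_id] = new_cfg
    return remapped


# Hand-written sample in the same shape as the configs in this notebook
sample_config = {
    "validmind.model_validation.sklearn.ROCCurve": {
        "inputs": {"model": "model", "dataset": "test_dataset"},
    },
    "validmind.data_validation.DatasetSplit": {
        "inputs": {"datasets": ["train_dataset", "test_dataset"]},
    },
}

id_map = {
    "train_dataset": "train_dataset_final",
    "test_dataset": "test_dataset_final",
    "model": "log_reg_model_v1",
}

remapped_config = remap_input_ids(sample_config, id_map)
print(remapped_config)
```

Names that do not appear in id_map are passed through unchanged, so a partially remapped config stays valid for any inputs that already use the right input_id.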

Using this sample configuration, let's finish populating model documentation by running all tests for the Model Development section of the documentation.

Recall that the training and test datasets in our exercise have the following input_id values:

  • train_dataset_final for the training dataset
  • test_dataset_final for the test dataset
config = {
    "validmind.model_validation.ModelMetadata": {
        "inputs": {"model": "log_reg_model_v1"},
    },
    "validmind.data_validation.DatasetSplit": {
        "inputs": {"datasets": ["train_dataset_final", "test_dataset_final"]},
    },
    "validmind.model_validation.sklearn.PopulationStabilityIndex": {
        "inputs": {
            "model": "log_reg_model_v1",
            "datasets": ["train_dataset_final", "test_dataset_final"],
        },
        "params": {"num_bins": 10, "mode": "fixed"},
    },
    "validmind.model_validation.sklearn.ConfusionMatrix": {
        "inputs": {"model": "log_reg_model_v1", "dataset": "test_dataset_final"},
    },
    "my_test_provider.ConfusionMatrix": {
        "inputs": {"dataset": "test_dataset_final", "model": "log_reg_model_v1"},
    },
    "my_custom_tests.ConfusionMatrix:test_dataset_normalized": {
        "inputs": {"dataset": "test_dataset_final", "model": "log_reg_model_v1"},
    },
    "validmind.model_validation.sklearn.ClassifierPerformance:in_sample": {
        "inputs": {"model": "log_reg_model_v1", "dataset": "train_dataset_final"}
    },
    "validmind.model_validation.sklearn.ClassifierPerformance:out_of_sample": {
        "inputs": {"model": "log_reg_model_v1", "dataset": "test_dataset_final"}
    },
    "validmind.model_validation.sklearn.PrecisionRecallCurve": {
        "inputs": {"model": "log_reg_model_v1", "dataset": "test_dataset_final"},
    },
    "validmind.model_validation.sklearn.ROCCurve": {
        "inputs": {"model": "log_reg_model_v1", "dataset": "test_dataset_final"},
    },
    "validmind.model_validation.sklearn.TrainingTestDegradation": {
        "inputs": {
            "model": "log_reg_model_v1",
            "datasets": ["train_dataset_final", "test_dataset_final"],
        },
        "params": {
            "metrics": ["accuracy", "precision", "recall", "f1"],
            "max_threshold": 0.1,
        },
    },
    "validmind.model_validation.sklearn.MinimumAccuracy": {
        "inputs": {"model": "log_reg_model_v1", "dataset": "test_dataset_final"},
        "params": {"min_threshold": 0.7},
    },
    "validmind.model_validation.sklearn.MinimumF1Score": {
        "inputs": {"model": "log_reg_model_v1", "dataset": "test_dataset_final"},
        "params": {"min_threshold": 0.5},
    },
    "validmind.model_validation.sklearn.MinimumROCAUCScore": {
        "inputs": {"model": "log_reg_model_v1", "dataset": "test_dataset_final"},
        "params": {"min_threshold": 0.5},
    },
    "validmind.model_validation.sklearn.PermutationFeatureImportance": {
        "inputs": {"model": "log_reg_model_v1", "dataset": "test_dataset_final"},
    },
    "validmind.model_validation.sklearn.SHAPGlobalImportance": {
        "inputs": {"model": "log_reg_model_v1", "dataset": "test_dataset_final"},
        "params": {"kernel_explainer_samples": 10},
    },
    "validmind.model_validation.sklearn.WeakspotsDiagnosis": {
        "inputs": {
            "model": "log_reg_model_v1",
            "datasets": ["train_dataset_final", "test_dataset_final"],
        },
        "params": {
            "thresholds": {"accuracy": 0.75, "precision": 0.5, "recall": 0.5, "f1": 0.7}
        },
    },
    "validmind.model_validation.sklearn.OverfitDiagnosis": {
        "inputs": {
            "model": "log_reg_model_v1",
            "datasets": ["train_dataset_final", "test_dataset_final"],
        },
        # Parameter name as listed for OverfitDiagnosis in the default config
        "params": {"cut_off_threshold": 0.04},
    },
    "validmind.model_validation.sklearn.RobustnessDiagnosis": {
        "inputs": {
            "model": "log_reg_model_v1",
            "datasets": ["train_dataset_final", "test_dataset_final"],
        },
        "params": {
            "scaling_factor_std_dev_list": [0.0, 0.1, 0.2, 0.3, 0.4, 0.5],
            # Parameter name as listed for RobustnessDiagnosis in the default config
            "performance_decay_threshold": 0.04,
        },
    },
}


full_suite = vm.run_documentation_tests(
    section="model_development",
    config=config,
)
2026-01-10 01:56:29,215 - WARNING(validmind.vm_models.test_suite.runner): Config key 'my_test_provider.ConfusionMatrix' does not match a test_id in the template.
    Ensure you registered a content block with the correct content_id in the template
    The configuration for this test will be ignored.
2026-01-10 01:56:29,216 - WARNING(validmind.vm_models.test_suite.runner): Config key 'my_custom_tests.ConfusionMatrix:test_dataset_normalized' does not match a test_id in the template.
    Ensure you registered a content block with the correct content_id in the template
    The configuration for this test will be ignored.
Test suite complete!
34/34 (100.0%)

Test Suite Results: Binary Classification V2


Check out the updated documentation on ValidMind.

Template for binary classification models.

Model Development
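
The warnings in the log above are raised when a config key has no matching test ID in the template. A quick pre-flight check can surface such keys before a run. The sketch below assumes that the keys returned by get_default_config() reflect the test IDs in the template, which matches the output shown earlier. unmatched_config_keys is a hypothetical helper, and template_ids is a hand-copied subset used only to keep the example self-contained.

```python
def unmatched_config_keys(config, template_ids):
    """Hypothetical helper: return config keys absent from the template's test IDs."""
    return sorted(key for key in config if key not in template_ids)


# Hand-copied subset of the IDs returned by get_default_config() above
template_ids = {
    "validmind.model_validation.sklearn.ConfusionMatrix",
    "validmind.model_validation.sklearn.ROCCurve",
}

candidate_config = {
    "validmind.model_validation.sklearn.ConfusionMatrix": {},
    "my_test_provider.ConfusionMatrix": {},
    "my_custom_tests.ConfusionMatrix:test_dataset_normalized": {},
}

ignored = unmatched_config_keys(candidate_config, template_ids)
print(ignored)
```

In the notebook itself, you could pass set(vm.get_test_suite().get_default_config()) as template_ids. Any key reported here would be ignored by run_documentation_tests() until the corresponding test-driven block is added to the template and the connection is reloaded.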

In summary

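In this final notebook, you learned how to include your custom test results in your documentation with run_documentation_tests(), and how to view and update the configuration for the entire model documentation template.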

With our ValidMind for model development series of notebooks, you learned how to document a model end-to-end with the ValidMind Library by running through some common scenarios in a typical model development setting:

  • Running out-of-the-box tests
  • Documenting your model by adding evidence to model documentation
  • Extending the capabilities of the ValidMind Library by implementing custom tests
  • Ensuring that the documentation is complete by running all tests in the documentation template

Next steps

Work with your model documentation

Now that you've logged all your test results and generated a draft for your model documentation, head to the ValidMind Platform to wrap up your model documentation. Continue to work on your model documentation by:

  • Running and logging more tests: Use the skills you learned in this series of notebooks to run and log more individual tests, including custom tests, then insert them into your documentation as supplementary evidence. (Learn more: validmind.tests)

  • Inserting additional test results: Add Test-Driven Blocks under any relevant section of your model documentation. (Learn more: Work with test results)

  • Making qualitative edits to your test descriptions: Click on the description of any inserted test results to review and edit the ValidMind-generated test descriptions for quality and accuracy. (Learn more: Working with model documentation)

  • Viewing guidelines: In any section of your model documentation, click ValidMind Insights in the top right corner to reveal the Documentation Guidelines for each section to help guide the contents of your model documentation. (Learn more: View documentation guidelines)

  • Collaborating with other stakeholders: Use the ValidMind Platform's real-time collaborative features to work seamlessly together with the rest of your organization, including model validators. Review suggested changes in your content blocks, work with versioned history, and use comments to discuss specific portions of your model documentation. (Learn more: Collaborate with others)

When your model documentation is complete and ready for review, submit it for approval from the same ValidMind Platform where you made your edits and collaborated with the rest of your organization, ensuring transparency and a thorough model development history. (Learn more: Submit for approval)

Learn more

Now that you're familiar with the basics, you can explore the following notebooks to get a deeper understanding of how the ValidMind Library allows you to generate model documentation for any use case:

Use cases

More how-to guides and code samples

Discover more learning resources

All notebook samples can be found in the following directories of the ValidMind Library GitHub repository: