Developing
Challenger Models

Validator Fundamentals — Module 3 of 4

Click to start

Learning objectives

“As a validator who has already run and logged data quality tests with ValidMind, I want to next run both out-of-the-box and custom model evaluation tests for the champion model and a potential challenger model, and use the results of my testing to log model findings.”


This third module is part of a four-part series:

Validator Fundamentals

Module 3 — Contents


First, let’s make sure you can log in to ValidMind.

Training is interactive — you explore ValidMind live. Try it!

→ , ↓ , SPACE , N — next slide     ← , ↑ , P , H — previous slide     ? — all keyboard shortcuts

Before you begin

To continue, you need to have been onboarded onto ValidMind Academy with the Validator role and completed the first two modules of this course:

Already logged in and refreshed this module? Click to continue.

  1. Log in to check your access:

Be sure to return to this page afterwards.

  2. After you successfully log in, refresh the page to connect this training module to the ValidMind Platform:

ValidMind for model validation

Jupyter Notebook series

These notebooks walk you through how to validate a model using ValidMind, complete with supporting test results attached as evidence to your validation report.


You will need to have already completed notebooks 1 and 2 during the first and second modules to proceed.

​ValidMind for model validation

Our series of four introductory notebooks for model validators includes sample code and how-to information to get you started with ValidMind:

1 — Set up the ValidMind Library for validation
2 — Start the model validation process
3 — Developing a potential challenger model
4 — Finalize testing and reporting

In this third module, we’ll run through the remaining two notebooks together: notebook 3 in Section 1 and notebook 4 in Section 2.

Let’s continue our journey with Section 1 on the next page.

Section 1

3 — Developing a potential challenger model

This is the third notebook in our introductory series, which will walk you through how to evaluate your champion model against a potential challenger with ValidMind.

Scroll through this notebook to explore. When you are done, click to continue.

Get your code snippet

​ValidMind generates a unique code snippet for each registered model to connect with your validation environment:

  1. Select the name of the model you registered for this course to open the model details page.
  2. On the left sidebar that appears for your model, click Getting Started.
  3. Locate the code snippet and click Copy snippet to clipboard.
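For reference, the snippet you copy is a short block of Python that connects the ValidMind Library to your registered model. A minimal sketch of its shape is below; the values shown are placeholders, and your own snippet already contains the correct credentials for your model.

```python
import validmind as vm

# Placeholder values — your copied snippet contains the real ones for your model
vm.init(
    api_host="<your-api-host>",
    api_key="<your-api-key>",
    api_secret="<your-api-secret>",
    model="<your-model-identifier>",
)
```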

When you’re done, click to continue.

Can’t load the ValidMind Platform?

Make sure you’re logged in and have refreshed the page in a Chromium-based web browser.

Connect to your model

With your code snippet copied to your clipboard:

  1. Open 3 — Developing a potential challenger model: JupyterHub
  2. Run all the cells under the Setting up section.
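Under the hood, the Setting up cells initialize the connection with your copied snippet and then register the preprocessed datasets with ValidMind. A rough sketch of the dataset registration step, assuming preprocessed pandas DataFrames named train_df and test_df and a target column named "Exited":

```python
# Assumed names: train_df / test_df are the preprocessed DataFrames carried over
# from the earlier notebooks; "Exited" stands in for the target column
vm_train_ds = vm.init_dataset(
    dataset=train_df,
    input_id="train_dataset_final",
    target_column="Exited",
)
vm_test_ds = vm.init_dataset(
    dataset=test_df,
    input_id="test_dataset_final",
    target_column="Exited",
)
```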

When you’re done, return to this page and click to continue.

Import the champion model

Next, let’s import the champion model, submitted by the model development team as a .pkl file, for evaluation:

  1. Continue with 3 — Developing a potential challenger model: JupyterHub
  2. Run the cell under the Import the champion model section.
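If you’re curious what that cell does, loading a serialized champion model from a .pkl file is typically just a pickle (or joblib) load. A minimal sketch, with a hypothetical filename:

```python
import pickle

# Hypothetical filename — use the .pkl file provided by the model development team
with open("log_reg_model_champion.pkl", "rb") as f:
    log_reg = pickle.load(f)
```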

When you’re done, return to this page and click to continue.

Train a challenger model

Champion vs. challenger models

Try it live on the next pages.

We’re curious how an alternate model compares to our champion model, so let’s train a challenger model as a basis for our testing:

  • Our champion logistic regression model is a simpler, parametric model that assumes a linear relationship between the independent variables and the log-odds of the outcome.
  • While logistic regression may not capture complex patterns as effectively, it offers a high degree of interpretability and is easier to explain to stakeholders.
  • However, model risk is not assessed on the basis of a single factor in isolation, but rather by weighing trade-offs between predictive performance, ease of interpretability, and overall alignment with business objectives.
  • A random forest classification model is an ensemble machine learning algorithm that uses multiple decision trees to classify data. In ensemble learning, multiple models are combined to improve prediction accuracy and robustness.
  • Random forest classification models generally have higher accuracy because they capture complex, non-linear relationships, but as a result they lack transparency in their predictions.

Random forest classification model

Let’s train our potential challenger model:

  1. Continue with 3 — Developing a potential challenger model: JupyterHub
  2. Run the cell under the Training a potential challenger model section: Random forest classification model
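For context, training the challenger is standard scikit-learn code. A rough sketch, assuming the x_train / y_train splits carried over from the earlier notebooks:

```python
from sklearn.ensemble import RandomForestClassifier

# Assumed variable names for the training split prepared in earlier notebooks
rf_model = RandomForestClassifier(n_estimators=100, random_state=42)
rf_model.fit(x_train, y_train)
```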

When you’re done, return to this page and click to continue.

Initialize the model objects

In addition to the initialized datasets, you’ll also need to initialize a ValidMind model object (vm_model) for each of our two models using the vm.init_model() method. These model objects can then be passed to other functions for analysis and tests on the data:

  1. Continue with 3 — Developing a potential challenger model: JupyterHub
  2. Run all the cells under the section Initializing the model objects.
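The cells you just ran call vm.init_model() once per model so that both can be referenced in tests. A minimal sketch, assuming the champion is loaded as log_reg and the challenger as rf_model (the input_id values are illustrative):

```python
# Register both models with ValidMind; input_id lets tests reference each one
vm_log_reg = vm.init_model(log_reg, input_id="log_reg_model_v1")
vm_rf_model = vm.init_model(rf_model, input_id="rf_model")

# The notebook also assigns each model's predictions to the initialized datasets
vm_train_ds.assign_predictions(model=vm_log_reg)
vm_test_ds.assign_predictions(model=vm_log_reg)
vm_train_ds.assign_predictions(model=vm_rf_model)
vm_test_ds.assign_predictions(model=vm_rf_model)
```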

When you’re done, return to this page and click to continue.

Run model evaluation tests

Model evaluation testing

Try it live on the next pages.

With everything ready, let’s run the rest of our validation tests. Since we’ve already verified the data quality of the datasets used to train the champion model, we’ll focus on comprehensive performance testing of both the champion and challenger models:

We’ll start with some performance tests, beginning with independent testing of our champion logistic regression model, then moving on to our potential challenger model.

Next, we’ll compare the robustness and stability of our champion and challenger models.

Finally, we’ll examine the relative influence of different input features on our models’ predictions, and compare the champion and challenger models to see whether one offers more understandable or logical feature importance scores.

Run model performance tests

Use the list_tests() function to identify all the model performance tests for classification:

  1. Continue with 3 — Developing a potential challenger model: JupyterHub
  2. Run all the cells under the Running model evaluation tests section: Run model performance tests
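If you want a feel for what those cells do, the pattern is to discover tests with list_tests() and then run each one with run_test(), logging the result as evidence. A rough sketch (the filter strings and test ID are assumptions based on this notebook):

```python
# Discover out-of-the-box model performance tests for classification tasks
vm.tests.list_tests(filter="model_performance", task="classification")

# Run one of them against the champion model on the test dataset and log the result
result = vm.tests.run_test(
    "validmind.model_validation.sklearn.MinimumAccuracy",
    inputs={"dataset": vm_test_ds, "model": vm_log_reg},
)
result.log()
```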

When you’re done, return to this page and click to continue.

Log a model finding

(Scroll down for the full instructions.)


Try it live on the next page.

As we can observe from the output in our notebook, our champion model doesn’t pass the MinimumAccuracy test at the default threshold of this out-of-the-box test, so let’s log a model finding in the ValidMind Platform:

Create a finding via a validation report

  1. From the Inventory in the ValidMind Platform, go to the model you connected to earlier.

  2. In the left sidebar that appears for your model, click Validation Report.

  3. Locate the Data Preparation section and click on 2.2.2. Model Performance to expand that section.

  4. Under the Model Performance Metrics section, locate Findings then click Link Finding to Report:

    Screenshot: Validation report with the link finding option highlighted
  5. Click Create New Finding to add a finding.

  6. Enter the details for your finding, for example:

    • title — Champion Logistic Regression Model Fails Minimum Accuracy Threshold
    • risk area — Model Performance
    • documentation section — 3.2. Model Evaluation
    • description — The logistic regression champion model was subjected to a Minimum Accuracy test to determine whether its predictive accuracy meets the predefined performance threshold of 0.7. The model achieved an accuracy score of 0.6136, which falls below the required minimum. As a result, the test produced a Fail outcome.
  7. Click Save.

  8. Select the finding you just added to link to your validation report.

  9. Click Update Linked Findings to insert your finding.

  10. Confirm that the finding you inserted appears correctly in section 2.2.2. Model Performance of the report.

  11. Click on the finding to expand it, where you can adjust details such as severity, owner, due date, and status, as well as attach proposed remediation plans or supporting documentation.

Create a model finding

  1. Select the name of the model you registered for this course to open the model details page.
  2. In the left sidebar that appears for your model, click Validation Report.
  3. Locate the Data Preparation section and click on 2.2.2. Model Performance to expand that section.
  4. Under the Model Performance Metrics section, locate Findings then click Link Finding to Report.
  5. Click Create New Finding to add a finding.
  6. Enter the details for your finding and click Save.
  7. Select the finding you just added to link to your validation report.
  8. Click Update Linked Findings to insert your finding.

When you’re done, click to continue.

Run diagnostic tests

This time, use list_tests() to identify all the model diagnosis tests for classification:

  1. Continue with 3 — Developing a potential challenger model: JupyterHub
  2. Run all the cells under the Running model evaluation tests section: Run diagnostic tests
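The pattern mirrors the performance tests: filter the test catalog, then run and log. A brief sketch, with the filter string, test ID, and input keys again being assumptions:

```python
# Discover diagnostic (robustness/stability) tests for classification tasks
vm.tests.list_tests(filter="model_diagnosis", task="classification")

# Example: check the challenger model for overfitting across train and test sets
result = vm.tests.run_test(
    "validmind.model_validation.sklearn.OverfitDiagnosis",
    inputs={"datasets": [vm_train_ds, vm_test_ds], "model": vm_rf_model},
)
result.log()
```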

When you’re done, return to this page and click to continue.

Run feature importance tests

Use list_tests() again to identify all the feature importance tests for classification:

  1. Continue with 3 — Developing a potential challenger model: JupyterHub
  2. Run all the cells under the Running model evaluation tests section: Run feature importance tests
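Same pattern once more, this time filtering for feature importance tests; the filter string and test ID below are assumptions:

```python
# Discover feature importance tests for classification tasks
vm.tests.list_tests(filter="feature_importance", task="classification")

# Example: permutation feature importance for the challenger on the test dataset
result = vm.tests.run_test(
    "validmind.model_validation.sklearn.PermutationFeatureImportance",
    inputs={"dataset": vm_test_ds, "model": vm_rf_model},
)
result.log()
```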

When you’re done, return to this page and click to continue.

Section 2

4 — Finalize testing and reporting

This is the final notebook in our introductory series, which will walk you through how to supplement ValidMind tests with your own custom tests, include them as additional evidence in your validation report, and wrap up your validation testing.

Scroll through this notebook to explore. When you are done, click to continue.

Retrieve your code snippet

As usual, let’s connect back up to your model in the ValidMind Platform:

  1. Select the name of the model you registered for this course to open the model details page.
  2. On the left sidebar that appears for your model, click Getting Started.
  3. Locate the code snippet and click Copy snippet to clipboard.

When you’re done, click to continue.

Connect to your model

With your code snippet copied to your clipboard:

  1. Open 4 — Finalize testing and reporting: JupyterHub
  2. Run all the cells under the Setting up section.

When you’re done, return to this page and click to continue.

Implement custom tests

Custom tests


Try it live on the next pages.

Let’s implement a custom test that calculates a confusion matrix:

  • You’ll note that a custom test function is just a regular Python function that can import and use any Python library you see fit.
  • In a typical model validation situation, you would load a saved custom test provided by the model development team. In the following section, we’ll have you implement the same custom test and make it available for reuse, to familiarize you with the process.

Implement a custom inline test

An inline test refers to a test written and executed within the same environment as the code being tested — in the following example, right in our Jupyter Notebook — without requiring a separate test file or framework:

  1. Continue with 4 — Finalize testing and reporting: JupyterHub
  2. Run all the cells in the following sections under Implementing custom tests: Implement a custom inline test
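To give a sense of the shape of an inline custom test, here’s a sketch modeled on the confusion matrix example in the notebook: a decorated Python function that receives ValidMind dataset and model objects and returns a figure. The test ID namespace and the input names used here are assumptions for illustration.

```python
import matplotlib.pyplot as plt
from sklearn import metrics

# "my_custom_tests.ConfusionMatrix" is an illustrative test ID
@vm.test("my_custom_tests.ConfusionMatrix")
def confusion_matrix(dataset, model):
    """Plots a confusion matrix for the model's predictions on the given dataset."""
    y_true = dataset.y                 # ground-truth labels from the ValidMind dataset
    y_pred = dataset.y_pred(model)     # predictions previously assigned to the dataset

    cm = metrics.confusion_matrix(y_true, y_pred)
    display = metrics.ConfusionMatrixDisplay(confusion_matrix=cm)
    display.plot()
    plt.close()  # prevent the figure from rendering twice in the notebook

    return display.figure_

# Run the inline test against the champion model and log the result as evidence
result = vm.tests.run_test(
    "my_custom_tests.ConfusionMatrix",
    inputs={"dataset": vm_test_ds, "model": vm_log_reg},
)
result.log()
```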

When you’re done, return to this page and click to continue.

Use external test providers

Sometimes you may want to reuse the same set of custom tests across multiple models and share them with others in your organization, as the model development team would have done for you in the example workflow featured in this series of notebooks:

  1. Continue with 4 — Finalize testing and reporting: JupyterHub
  2. Run all the cells in the following sections under Implementing custom tests: Use external test providers
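Conceptually, a test provider points ValidMind at a folder of custom test files so they can be shared and reused. A minimal sketch, with the folder path and namespace as placeholders:

```python
from validmind.tests import LocalTestProvider

# Placeholder folder containing the shared custom test files
my_test_provider = LocalTestProvider("my_tests")

# Register the provider under a namespace so its tests can be run by ID,
# e.g. vm.tests.run_test("my_test_provider.ConfusionMatrix", inputs={...})
vm.tests.register_test_provider(
    namespace="my_test_provider",
    test_provider=my_test_provider,
)
```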

When you’re done, return to this page and click to continue.

Verify test runs

Verify model development testing

Our final task is to verify that all the tests provided by the model development team were run and reported accurately:

  1. Continue with 4 — Finalize testing and reporting: JupyterHub
  2. Run all the cells under the Verify test runs section.

When you’re done, return to this page and click to continue.

In summary

Developing challenger models

In this third module, you learned how to:


Continue your model validation journey with:

Finalizing Validation Reports