validmind.vm_models
Models entrypoint
Base class for ValidMind Input types
Allows for setting options on the input object that are passed by the user when using the input to run a test or set of tests
To allow options, just override this method in the subclass (see VMDataset) and ensure that it returns a new instance of the input with the specified options set.
Arguments:
- **kwargs: Arbitrary keyword arguments that will be passed to the input object
Returns:
VMInput: A new instance of the input with the specified options set
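The override pattern described above can be sketched without the library. This is a minimal illustration, not the ValidMind implementation: `InputSketch` and `DatasetSketch` are hypothetical stand-ins, and the base class rejecting every option is an assumption made here for the example.

```python
from dataclasses import dataclass, replace
from typing import Tuple


class InputSketch:
    """Hypothetical stand-in for the input base class (not the ValidMind API)."""

    def with_options(self, **kwargs):
        # Assumption for this sketch: the base class supports no options.
        if kwargs:
            raise ValueError(f"Options not supported: {sorted(kwargs)}")
        return self


@dataclass(frozen=True)
class DatasetSketch(InputSketch):
    """Hypothetical subclass that accepts a `columns` option."""

    input_id: str
    columns: Tuple[str, ...]

    def with_options(self, **kwargs):
        columns = kwargs.pop("columns", None)
        if kwargs:
            raise ValueError(f"Options not supported: {sorted(kwargs)}")
        if columns is None:
            return self
        # Return a new instance; the original object is left untouched.
        kept = tuple(c for c in self.columns if c in columns)
        return replace(self, columns=kept)


full = DatasetSketch("my_dataset_id", ("col1", "col2", "col3"))
subset = full.with_options(columns=["col1", "col2"])
```

Returning a new instance rather than mutating the original means the same registered input can be reused across tests with different options.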
Base class for VM datasets
Child classes should be used to support new dataset types (tensor, polars, etc.) by converting the user's dataset into a NumPy array, collecting metadata such as column names, and then calling this (parent) class's __init__ method.
This way, multiple dataset types can be supported while this class only needs to work with NumPy arrays and pandas DataFrames under the hood.
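The conversion pattern can be sketched as follows. To stay dependency-free, this sketch uses nested lists where the library uses a NumPy array; `BaseDatasetSketch` and `RecordsDatasetSketch` are hypothetical names, and the constructor signature is simplified.

```python
from typing import Dict, List, Sequence


class BaseDatasetSketch:
    """Hypothetical parent: works only with rows plus column metadata."""

    def __init__(self, raw_dataset: List[list], columns: List[str], input_id: str):
        self.raw_dataset = raw_dataset
        self.columns = columns
        self.input_id = input_id


class RecordsDatasetSketch(BaseDatasetSketch):
    """Hypothetical child: accepts a list of dict records."""

    def __init__(self, records: Sequence[Dict[str, object]], input_id: str):
        # Collect metadata (column names) from the user's format.
        columns = list(records[0].keys()) if records else []
        # Convert the data to the parent's common representation.
        rows = [[record[name] for name in columns] for record in records]
        # Hand both to the parent initializer.
        super().__init__(raw_dataset=rows, columns=columns, input_id=input_id)


ds = RecordsDatasetSketch(
    [{"col1": 1, "col2": 10}, {"col1": 2, "col2": 20}],
    input_id="my_dataset_id",
)
```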
Attributes:
- raw_dataset (np.ndarray): The raw dataset as a NumPy array.
- input_id (str): Identifier for the dataset.
- index (np.ndarray): The raw dataset index as a NumPy array.
- columns (Set[str]): The column names of the dataset.
- target_column (str): The target column name of the dataset.
- feature_columns (List[str]): The feature column names of the dataset.
- feature_columns_numeric (List[str]): The numeric feature column names of the dataset.
- feature_columns_categorical (List[str]): The categorical feature column names of the dataset.
- text_column (str): The text column name of the dataset for NLP tasks.
- target_class_labels (Dict): The class labels for the target columns.
- df (pd.DataFrame): The dataset as a pandas DataFrame.
- extra_columns (Dict): Extra columns to include in the dataset.
Initializes a VMDataset instance.
Arguments:
- raw_dataset (np.ndarray): The raw dataset as a NumPy array.
- input_id (str): Identifier for the dataset.
- model (VMModel): Model associated with the dataset.
- index (np.ndarray): The raw dataset index as a NumPy array.
- index_name (str): The name of the raw dataset index.
- date_time_index (bool): Whether the index is a datetime index.
- columns (List[str], optional): The column names of the dataset. Defaults to None.
- target_column (str, optional): The target column name of the dataset. Defaults to None.
- feature_columns (List[str], optional): The feature column names of the dataset. Defaults to None.
- text_column (str, optional): The text column name of the dataset for NLP tasks. Defaults to None.
- target_class_labels (Dict, optional): The class labels for the target columns. Defaults to None.
Support options provided when passing an input to run_test or run_test_suite
Example:
    # To use only a certain subset of columns in the dataset:
    run_test(
        "validmind.SomeTestID",
        inputs={
            "dataset": {
                "input_id": "my_dataset_id",
                "columns": ["col1", "col2"],
            }
        }
    )
    # Behind the scenes, this retrieves the dataset object (VMDataset) from the registry
    # and then calls the `with_options()` method, passing `{"columns": ...}`.
Arguments:
- **kwargs: Options:
- columns: Filter columns in the dataset
Returns:
VMDataset: A new instance of the dataset with only the specified columns
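The "behind the scenes" step in the example can be sketched roughly as below. `REGISTRY`, `resolve_input`, and `DatasetSketch` are hypothetical names for illustration, and accepting a plain string ID is an assumption; the library's actual lookup code may differ.

```python
from typing import Dict, List, Union


class DatasetSketch:
    """Hypothetical dataset with a `columns` option."""

    def __init__(self, input_id: str, columns: List[str]):
        self.input_id = input_id
        self.columns = columns

    def with_options(self, **kwargs):
        columns = kwargs.get("columns", self.columns)
        return DatasetSketch(self.input_id, [c for c in self.columns if c in columns])


# Hypothetical registry mapping input IDs to registered objects.
REGISTRY: Dict[str, DatasetSketch] = {
    "my_dataset_id": DatasetSketch("my_dataset_id", ["col1", "col2", "col3"]),
}


def resolve_input(spec: Union[str, dict]) -> DatasetSketch:
    """Turn one entry of `inputs={...}` into the object handed to the test."""
    if isinstance(spec, str):
        return REGISTRY[spec]  # plain ID: use the registered object as-is
    options = dict(spec)  # copy so the caller's dict is not mutated
    registered = REGISTRY[options.pop("input_id")]
    return registered.with_options(**options)


dataset = resolve_input({"input_id": "my_dataset_id", "columns": ["col1", "col2"]})
```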
Assign predictions and probabilities to the dataset.
Arguments:
- model (VMModel): The model used to generate the predictions.
- prediction_column (str, optional): The name of the column containing the predictions. Defaults to None.
- prediction_values (list, optional): The values of the predictions. Defaults to None.
- probability_column (str, optional): The name of the column containing the probabilities. Defaults to None.
- probability_values (list, optional): The values of the probabilities. Defaults to None.
- prediction_probabilities (list, optional): DEPRECATED: The values of the probabilities. Defaults to None.
- kwargs: Additional keyword arguments that will be passed through to the model's predict method.
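One way to read the arguments above is that predictions are either supplied directly or computed by the model, with extra keyword arguments forwarded to its predict method. The sketch below illustrates that reading only; `resolve_predictions` is a hypothetical helper, and the precedence given to precomputed values is an assumption rather than documented behavior.

```python
from typing import Callable, List, Optional


def resolve_predictions(
    predict: Callable[..., List[int]],
    features: List[list],
    prediction_values: Optional[List[int]] = None,
    **kwargs,
) -> List[int]:
    """Use precomputed values when given, otherwise call the model."""
    if prediction_values is not None:
        if len(prediction_values) != len(features):
            raise ValueError("prediction_values must have one entry per row")
        return list(prediction_values)
    # Extra keyword arguments are passed straight through to predict.
    return predict(features, **kwargs)


def predict(rows, threshold=0.5):
    # Toy model: classify on the first feature only.
    return [1 if row[0] >= threshold else 0 for row in rows]


rows = [[0.2], [0.7], [0.9]]
computed = resolve_predictions(predict, rows, threshold=0.8)
precomputed = resolve_predictions(predict, rows, prediction_values=[1, 1, 0])
```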
Adds an extra column to the dataset without modifying the dataset features and target columns.
Arguments:
- column_name (str): The name of the extra column.
- column_values (np.ndarray, optional): The values of the extra column.
Returns the dataset as a pandas DataFrame.
Returns:
pd.DataFrame: The dataset as a pandas DataFrame.
Returns the input features (X) of the dataset.
Returns:
np.ndarray: The input features.
Returns the target variables (y) of the dataset.
Returns:
np.ndarray: The target variables.
Returns the predictions for a given model.
Attempts to stack complex prediction types (e.g., embeddings) into a single, multi-dimensional array.
Arguments:
- model (VMModel): The model whose predictions are sought.
Returns:
np.ndarray: The predictions for the model
Returns the probabilities for a given model.
Arguments:
- model (str): The ID of the model whose probabilities are sought.
Returns:
np.ndarray: The probability variables.
Returns a dataframe containing the predictions for a given model
A base class that wraps a trained model instance and its associated data.
Attributes:
- model (object, optional): The trained model instance. Defaults to None.
- input_id (str, optional): The input ID for the model. Defaults to None.
- attributes (ModelAttributes, optional): The attributes of the model. Defaults to None.
- name (str, optional): The name of the model. Defaults to the class name.
Predict probabilities. Must be implemented by the subclass if needed.
Predict method for the model. This is a wrapper around the model's
Figure objects track the schema supported by the ValidMind API
Model attributes definition
A dataclass that holds the table summary of a result
Test result
Add a new table to the result
Arguments:
- table (Union[ResultTable, pd.DataFrame, List[Dict[str, Any]]]): The table to add
- title (Optional[str]): The title of the table (can optionally be provided for pd.DataFrame and List[Dict[str, Any]] tables)
Remove a table from the result by index
Arguments:
- index (int): The index of the table to remove (default is 0)
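The add/remove semantics for tables can be sketched with a plain list. `ResultSketch` and `TableSketch` are hypothetical stand-ins, and pandas DataFrame input is left out to keep the example dependency-free.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class TableSketch:
    data: List[Dict[str, Any]]
    title: Optional[str] = None


@dataclass
class ResultSketch:
    tables: List[TableSketch] = field(default_factory=list)

    def add_table(self, table, title: Optional[str] = None) -> None:
        # Wrap raw records so every stored entry has the same shape.
        if not isinstance(table, TableSketch):
            table = TableSketch(data=list(table), title=title)
        self.tables.append(table)

    def remove_table(self, index: int = 0) -> None:
        # Tables are addressed by position; the default removes the first one.
        self.tables.pop(index)


result = ResultSketch()
result.add_table([{"column": "col1", "missing": 0}], title="First")
result.add_table(TableSketch(data=[{"column": "col2", "missing": 3}], title="Second"))
result.remove_table()
```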
Add a new figure to the result
Arguments:
- figure (Union[matplotlib.figure.Figure, go.Figure, go.FigureWidget, bytes, Figure]): The figure to add (can be either a VM Figure object, a raw figure object from the supported libraries, or a png image as raw bytes)
Remove a figure from the result by index
Arguments:
- index (int): The index of the figure to remove (default is 0)
Create an ipywidget representation of the result. Must be overridden by subclasses
Log the result to ValidMind
Arguments:
- section_id (str): The section ID within the model document to insert the test result
- position (int): The position (index) within the section to insert the test result
- unsafe (bool): If True, log the result even if it contains sensitive data i.e. raw data from input datasets
Inherited Members
- validmind.vm_models.result.result.Result
- show
Base class for test suites. Test suites are used to define a grouping of tests that can be run as a suite against datasets and models. Test Suites can be defined by inheriting from this base class and defining the list of tests as a class variable.
Tests can be a flat list of strings or may be nested into sections by using a dict
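A suite that mixes a flat test ID with a nested section might look like the sketch below. `SuiteSketch` is a hypothetical stand-in for the base class, and the dict keys `section_id` and `section_tests` as well as the ID `validmind.AnotherTestID` are assumptions made for illustration.

```python
from typing import Dict, List, Union

Entry = Union[str, Dict[str, object]]


class SuiteSketch:
    """Hypothetical base class: subclasses list their tests as a class variable."""

    suite_id: str = ""
    tests: List[Entry] = []

    @classmethod
    def test_ids(cls) -> List[str]:
        """Flatten plain IDs and sectioned IDs into one ordered list."""
        ids: List[str] = []
        for entry in cls.tests:
            if isinstance(entry, str):
                ids.append(entry)
            else:
                ids.extend(entry["section_tests"])
        return ids


class MySuite(SuiteSketch):
    suite_id = "my_suite"
    tests = [
        "validmind.SomeTestID",
        {"section_id": "extra", "section_tests": ["validmind.AnotherTestID"]},
    ]


all_ids = MySuite.test_ids()
```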
Returns the default configuration for the test suite
Each test in a test suite can accept parameters, and those parameters can have default values. Both the parameters and their defaults are set in the test class, and a config object can be passed to the test suite's run method to override the defaults. This function returns a dictionary containing the parameters and their default values for every test so that users can view and set values.
Returns:
dict: A dictionary of test names and their default parameters
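The relationship between the default configuration and a user override can be sketched as a per-test dictionary merge. `merge_config` and the parameter names are hypothetical; how the library applies overrides internally is not shown in this section.

```python
from typing import Any, Dict

ParamMap = Dict[str, Dict[str, Any]]


def merge_config(defaults: ParamMap, overrides: ParamMap) -> ParamMap:
    """Return the defaults with user-supplied values layered on top."""
    # Copy each inner dict so the defaults themselves are never mutated.
    merged = {test_id: dict(params) for test_id, params in defaults.items()}
    for test_id, params in overrides.items():
        if test_id not in merged:
            raise KeyError(f"Unknown test ID: {test_id}")
        merged[test_id].update(params)
    return merged


# Hypothetical defaults, shaped like "test names and their default parameters".
defaults = {"validmind.SomeTestID": {"min_threshold": 1, "max_rows": 100}}
config = merge_config(defaults, {"validmind.SomeTestID": {"max_rows": 50}})
```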
Runs a test suite
Logs the results of the test suite to ValidMind
This method is called after the test suite has run and all results have been collected.
Runs the test suite, renders the summary and sends the results to ValidMind
Arguments:
- send (bool, optional): Whether to send the results to ValidMind. Defaults to True.
- fail_fast (bool, optional): Whether to stop running tests after the first failure. Defaults to False.
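The effect of `fail_fast` can be sketched as a loop that stops at the first failing test. `run_all` is a hypothetical helper, and treating both a raised exception and a falsy return value as a failure is an assumption of this sketch.

```python
from typing import Callable, Dict, List, Tuple


def run_all(
    tests: Dict[str, Callable[[], bool]], fail_fast: bool = False
) -> List[Tuple[str, bool]]:
    """Run tests in order and record (test_id, passed) pairs."""
    results: List[Tuple[str, bool]] = []
    for test_id, test in tests.items():
        try:
            passed = bool(test())
        except Exception:
            passed = False
        results.append((test_id, passed))
        if fail_fast and not passed:
            break  # skip the remaining tests after the first failure
    return results


tests = {"first": lambda: True, "second": lambda: False, "third": lambda: True}
stopped_early = run_all(tests, fail_fast=True)
ran_everything = run_all(tests)
```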