validmind.tests

ValidMind Tests Module

def list_tests( filter=None, task=None, tags=None, pretty=True, truncate=True, __as_class=False):

List all tests in the tests directory.

Arguments:
  • filter (str, optional): Find tests where the ID, tasks or tags match the filter string. Defaults to None.
  • task (str, optional): Find tests that match the task. Can be used to narrow down matches from the filter string. Defaults to None.
  • tags (list, optional): Find tests that match list of tags. Can be used to narrow down matches from the filter string. Defaults to None.
  • pretty (bool, optional): If True, returns a pandas DataFrame with a formatted table. Defaults to True.
  • truncate (bool, optional): If True, truncates the test description to the first line. Defaults to True. (only used if pretty=True)
Returns:

list or pandas.DataFrame: A list of all tests or a formatted table.

def load_test(test_id: str, reload=False):

Load a test by test ID

Test IDs are in the format namespace.path_to_module.TestClassOrFuncName[:result_id]. The result ID is optional and is used to distinguish between multiple results from running the same test.

Arguments:
  • test_id (str): The test ID in the format namespace.path_to_module.TestName[:result_id]
  • reload (bool, optional): Whether to reload the test module. Defaults to False.
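The test ID format described above can be illustrated with a small parser. This is a minimal sketch and not ValidMind's own parsing code: the names parse_test_id and ParsedTestID are hypothetical, and the sketch assumes the first dotted segment is the namespace and the last is the test name.

```python
from typing import NamedTuple, Optional


class ParsedTestID(NamedTuple):
    """Hypothetical container for the parts of a test ID."""
    namespace: str
    module_path: str
    name: str
    result_id: Optional[str]


def parse_test_id(test_id: str) -> ParsedTestID:
    """Split an ID of the form namespace.path_to_module.Name[:result_id]."""
    # The optional result ID follows the first colon.
    base, _, result_id = test_id.partition(":")
    parts = base.split(".")
    # Assumption: at least a namespace and a test name, with no empty segments.
    if len(parts) < 2 or not all(parts):
        raise ValueError(f"Malformed test ID: {test_id!r}")
    return ParsedTestID(
        namespace=parts[0],
        module_path=".".join(parts[1:-1]),
        name=parts[-1],
        result_id=result_id or None,
    )


parsed = parse_test_id("validmind.data_validation.ClassImbalance:train")
print(parsed.namespace, parsed.module_path, parsed.name, parsed.result_id)
```

The result ID "train" in the usage line is an illustrative value only.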
def describe_test( test_id: Literal['validmind.prompt_validation.Bias', 'validmind.prompt_validation.Clarity', 'validmind.prompt_validation.Specificity', 'validmind.prompt_validation.Robustness', 'validmind.prompt_validation.NegativeInstruction', 'validmind.prompt_validation.Conciseness', 'validmind.prompt_validation.Delimitation', 'validmind.model_validation.ModelPredictionResiduals', 'validmind.model_validation.BertScore', 'validmind.model_validation.TimeSeriesPredictionsPlot', 'validmind.model_validation.RegardScore', 'validmind.model_validation.BleuScore', 'validmind.model_validation.TimeSeriesPredictionWithCI', 'validmind.model_validation.RegressionResidualsPlot', 'validmind.model_validation.FeaturesAUC', 'validmind.model_validation.ContextualRecall', 'validmind.model_validation.MeteorScore', 'validmind.model_validation.RougeScore', 'validmind.model_validation.ModelMetadata', 'validmind.model_validation.ClusterSizeDistribution', 'validmind.model_validation.TokenDisparity', 'validmind.model_validation.ToxicityScore', 'validmind.model_validation.ModelMetadataComparison', 'validmind.model_validation.TimeSeriesR2SquareBySegments', 'validmind.model_validation.embeddings.CosineSimilarityComparison', 'validmind.model_validation.embeddings.EmbeddingsVisualization2D', 'validmind.model_validation.embeddings.StabilityAnalysisRandomNoise', 'validmind.model_validation.embeddings.TSNEComponentsPairwisePlots', 'validmind.model_validation.embeddings.CosineSimilarityDistribution', 'validmind.model_validation.embeddings.PCAComponentsPairwisePlots', 'validmind.model_validation.embeddings.CosineSimilarityHeatmap', 'validmind.model_validation.embeddings.StabilityAnalysisTranslation', 'validmind.model_validation.embeddings.EuclideanDistanceComparison', 'validmind.model_validation.embeddings.ClusterDistribution', 'validmind.model_validation.embeddings.EuclideanDistanceHeatmap', 'validmind.model_validation.embeddings.StabilityAnalysis', 
'validmind.model_validation.embeddings.StabilityAnalysisKeyword', 'validmind.model_validation.embeddings.StabilityAnalysisSynonyms', 'validmind.model_validation.embeddings.DescriptiveAnalytics', 'validmind.model_validation.ragas.ContextEntityRecall', 'validmind.model_validation.ragas.Faithfulness', 'validmind.model_validation.ragas.AspectCritique', 'validmind.model_validation.ragas.AnswerSimilarity', 'validmind.model_validation.ragas.AnswerCorrectness', 'validmind.model_validation.ragas.ContextRecall', 'validmind.model_validation.ragas.ContextPrecision', 'validmind.model_validation.ragas.AnswerRelevance', 'validmind.model_validation.sklearn.RegressionModelsPerformanceComparison', 'validmind.model_validation.sklearn.AdjustedMutualInformation', 'validmind.model_validation.sklearn.SilhouettePlot', 'validmind.model_validation.sklearn.RobustnessDiagnosis', 'validmind.model_validation.sklearn.AdjustedRandIndex', 'validmind.model_validation.sklearn.SHAPGlobalImportance', 'validmind.model_validation.sklearn.ConfusionMatrix', 'validmind.model_validation.sklearn.HomogeneityScore', 'validmind.model_validation.sklearn.CompletenessScore', 'validmind.model_validation.sklearn.OverfitDiagnosis', 'validmind.model_validation.sklearn.ClusterPerformanceMetrics', 'validmind.model_validation.sklearn.PermutationFeatureImportance', 'validmind.model_validation.sklearn.FowlkesMallowsScore', 'validmind.model_validation.sklearn.MinimumROCAUCScore', 'validmind.model_validation.sklearn.ClusterCosineSimilarity', 'validmind.model_validation.sklearn.PrecisionRecallCurve', 'validmind.model_validation.sklearn.ClassifierPerformance', 'validmind.model_validation.sklearn.VMeasure', 'validmind.model_validation.sklearn.MinimumF1Score', 'validmind.model_validation.sklearn.ROCCurve', 'validmind.model_validation.sklearn.RegressionR2Square', 'validmind.model_validation.sklearn.RegressionErrors', 'validmind.model_validation.sklearn.ClusterPerformance', 
'validmind.model_validation.sklearn.FeatureImportanceComparison', 'validmind.model_validation.sklearn.TrainingTestDegradation', 'validmind.model_validation.sklearn.RegressionErrorsComparison', 'validmind.model_validation.sklearn.HyperParametersTuning', 'validmind.model_validation.sklearn.KMeansClustersOptimization', 'validmind.model_validation.sklearn.ModelsPerformanceComparison', 'validmind.model_validation.sklearn.WeakspotsDiagnosis', 'validmind.model_validation.sklearn.RegressionR2SquareComparison', 'validmind.model_validation.sklearn.PopulationStabilityIndex', 'validmind.model_validation.sklearn.MinimumAccuracy', 'validmind.model_validation.statsmodels.RegressionModelsCoeffs', 'validmind.model_validation.statsmodels.BoxPierce', 'validmind.model_validation.statsmodels.RegressionCoeffsPlot', 'validmind.model_validation.statsmodels.RegressionModelSensitivityPlot', 'validmind.model_validation.statsmodels.RegressionModelForecastPlotLevels', 'validmind.model_validation.statsmodels.ScorecardHistogram', 'validmind.model_validation.statsmodels.LJungBox', 'validmind.model_validation.statsmodels.JarqueBera', 'validmind.model_validation.statsmodels.KolmogorovSmirnov', 'validmind.model_validation.statsmodels.ShapiroWilk', 'validmind.model_validation.statsmodels.CumulativePredictionProbabilities', 'validmind.model_validation.statsmodels.RegressionFeatureSignificance', 'validmind.model_validation.statsmodels.RegressionModelSummary', 'validmind.model_validation.statsmodels.Lilliefors', 'validmind.model_validation.statsmodels.RunsTest', 'validmind.model_validation.statsmodels.RegressionPermutationFeatureImportance', 'validmind.model_validation.statsmodels.PredictionProbabilitiesHistogram', 'validmind.model_validation.statsmodels.AutoARIMA', 'validmind.model_validation.statsmodels.GINITable', 'validmind.model_validation.statsmodels.RegressionModelForecastPlot', 'validmind.model_validation.statsmodels.DurbinWatsonTest', 'validmind.ongoing_monitoring.PredictionCorrelation', 
'validmind.ongoing_monitoring.PredictionAcrossEachFeature', 'validmind.ongoing_monitoring.FeatureDrift', 'validmind.ongoing_monitoring.TargetPredictionDistributionPlot', 'validmind.data_validation.IQROutliersTable', 'validmind.data_validation.Skewness', 'validmind.data_validation.Duplicates', 'validmind.data_validation.MissingValuesBarPlot', 'validmind.data_validation.DatasetDescription', 'validmind.data_validation.ZivotAndrewsArch', 'validmind.data_validation.ScatterPlot', 'validmind.data_validation.TimeSeriesOutliers', 'validmind.data_validation.TabularCategoricalBarPlots', 'validmind.data_validation.AutoStationarity', 'validmind.data_validation.DescriptiveStatistics', 'validmind.data_validation.TimeSeriesDescription', 'validmind.data_validation.TargetRateBarPlots', 'validmind.data_validation.PearsonCorrelationMatrix', 'validmind.data_validation.FeatureTargetCorrelationPlot', 'validmind.data_validation.TabularNumericalHistograms', 'validmind.data_validation.IsolationForestOutliers', 'validmind.data_validation.ChiSquaredFeaturesTable', 'validmind.data_validation.HighCardinality', 'validmind.data_validation.MissingValues', 'validmind.data_validation.PhillipsPerronArch', 'validmind.data_validation.RollingStatsPlot', 'validmind.data_validation.TabularDescriptionTables', 'validmind.data_validation.AutoMA', 'validmind.data_validation.UniqueRows', 'validmind.data_validation.TooManyZeroValues', 'validmind.data_validation.HighPearsonCorrelation', 'validmind.data_validation.ACFandPACFPlot', 'validmind.data_validation.WOEBinTable', 'validmind.data_validation.TimeSeriesFrequency', 'validmind.data_validation.DatasetSplit', 'validmind.data_validation.SpreadPlot', 'validmind.data_validation.TimeSeriesLinePlot', 'validmind.data_validation.KPSS', 'validmind.data_validation.AutoSeasonality', 'validmind.data_validation.BivariateScatterPlots', 'validmind.data_validation.EngleGrangerCoint', 'validmind.data_validation.TimeSeriesMissingValues', 
'validmind.data_validation.TimeSeriesHistogram', 'validmind.data_validation.LaggedCorrelationHeatmap', 'validmind.data_validation.SeasonalDecompose', 'validmind.data_validation.WOEBinPlots', 'validmind.data_validation.ClassImbalance', 'validmind.data_validation.IQROutliersBarPlot', 'validmind.data_validation.DFGLSArch', 'validmind.data_validation.TimeSeriesDescriptiveStatistics', 'validmind.data_validation.AutoAR', 'validmind.data_validation.TabularDateTimeHistograms', 'validmind.data_validation.ADF', 'validmind.data_validation.nlp.Toxicity', 'validmind.data_validation.nlp.PolarityAndSubjectivity', 'validmind.data_validation.nlp.Punctuations', 'validmind.data_validation.nlp.Sentiment', 'validmind.data_validation.nlp.CommonWords', 'validmind.data_validation.nlp.Hashtags', 'validmind.data_validation.nlp.LanguageDetection', 'validmind.data_validation.nlp.Mentions', 'validmind.data_validation.nlp.TextDescription', 'validmind.data_validation.nlp.StopWords'] = None, raw: bool = False, show: bool = True):

Get or show details about the test

This function can be used to see test details including the test name, description, required inputs and default params. It can also be used to get a dictionary of the above information for programmatic use.

Arguments:
  • test_id (str, optional): The test ID. Defaults to None.
  • raw (bool, optional): If True, returns a dictionary with the test details. Defaults to False.
def run_test( test_id: Literal['validmind.prompt_validation.Bias', 'validmind.prompt_validation.Clarity', 'validmind.prompt_validation.Specificity', 'validmind.prompt_validation.Robustness', 'validmind.prompt_validation.NegativeInstruction', 'validmind.prompt_validation.Conciseness', 'validmind.prompt_validation.Delimitation', 'validmind.model_validation.ModelPredictionResiduals', 'validmind.model_validation.BertScore', 'validmind.model_validation.TimeSeriesPredictionsPlot', 'validmind.model_validation.RegardScore', 'validmind.model_validation.BleuScore', 'validmind.model_validation.TimeSeriesPredictionWithCI', 'validmind.model_validation.RegressionResidualsPlot', 'validmind.model_validation.FeaturesAUC', 'validmind.model_validation.ContextualRecall', 'validmind.model_validation.MeteorScore', 'validmind.model_validation.RougeScore', 'validmind.model_validation.ModelMetadata', 'validmind.model_validation.ClusterSizeDistribution', 'validmind.model_validation.TokenDisparity', 'validmind.model_validation.ToxicityScore', 'validmind.model_validation.ModelMetadataComparison', 'validmind.model_validation.TimeSeriesR2SquareBySegments', 'validmind.model_validation.embeddings.CosineSimilarityComparison', 'validmind.model_validation.embeddings.EmbeddingsVisualization2D', 'validmind.model_validation.embeddings.StabilityAnalysisRandomNoise', 'validmind.model_validation.embeddings.TSNEComponentsPairwisePlots', 'validmind.model_validation.embeddings.CosineSimilarityDistribution', 'validmind.model_validation.embeddings.PCAComponentsPairwisePlots', 'validmind.model_validation.embeddings.CosineSimilarityHeatmap', 'validmind.model_validation.embeddings.StabilityAnalysisTranslation', 'validmind.model_validation.embeddings.EuclideanDistanceComparison', 'validmind.model_validation.embeddings.ClusterDistribution', 'validmind.model_validation.embeddings.EuclideanDistanceHeatmap', 'validmind.model_validation.embeddings.StabilityAnalysis', 
'validmind.model_validation.embeddings.StabilityAnalysisKeyword', 'validmind.model_validation.embeddings.StabilityAnalysisSynonyms', 'validmind.model_validation.embeddings.DescriptiveAnalytics', 'validmind.model_validation.ragas.ContextEntityRecall', 'validmind.model_validation.ragas.Faithfulness', 'validmind.model_validation.ragas.AspectCritique', 'validmind.model_validation.ragas.AnswerSimilarity', 'validmind.model_validation.ragas.AnswerCorrectness', 'validmind.model_validation.ragas.ContextRecall', 'validmind.model_validation.ragas.ContextPrecision', 'validmind.model_validation.ragas.AnswerRelevance', 'validmind.model_validation.sklearn.RegressionModelsPerformanceComparison', 'validmind.model_validation.sklearn.AdjustedMutualInformation', 'validmind.model_validation.sklearn.SilhouettePlot', 'validmind.model_validation.sklearn.RobustnessDiagnosis', 'validmind.model_validation.sklearn.AdjustedRandIndex', 'validmind.model_validation.sklearn.SHAPGlobalImportance', 'validmind.model_validation.sklearn.ConfusionMatrix', 'validmind.model_validation.sklearn.HomogeneityScore', 'validmind.model_validation.sklearn.CompletenessScore', 'validmind.model_validation.sklearn.OverfitDiagnosis', 'validmind.model_validation.sklearn.ClusterPerformanceMetrics', 'validmind.model_validation.sklearn.PermutationFeatureImportance', 'validmind.model_validation.sklearn.FowlkesMallowsScore', 'validmind.model_validation.sklearn.MinimumROCAUCScore', 'validmind.model_validation.sklearn.ClusterCosineSimilarity', 'validmind.model_validation.sklearn.PrecisionRecallCurve', 'validmind.model_validation.sklearn.ClassifierPerformance', 'validmind.model_validation.sklearn.VMeasure', 'validmind.model_validation.sklearn.MinimumF1Score', 'validmind.model_validation.sklearn.ROCCurve', 'validmind.model_validation.sklearn.RegressionR2Square', 'validmind.model_validation.sklearn.RegressionErrors', 'validmind.model_validation.sklearn.ClusterPerformance', 
'validmind.model_validation.sklearn.FeatureImportanceComparison', 'validmind.model_validation.sklearn.TrainingTestDegradation', 'validmind.model_validation.sklearn.RegressionErrorsComparison', 'validmind.model_validation.sklearn.HyperParametersTuning', 'validmind.model_validation.sklearn.KMeansClustersOptimization', 'validmind.model_validation.sklearn.ModelsPerformanceComparison', 'validmind.model_validation.sklearn.WeakspotsDiagnosis', 'validmind.model_validation.sklearn.RegressionR2SquareComparison', 'validmind.model_validation.sklearn.PopulationStabilityIndex', 'validmind.model_validation.sklearn.MinimumAccuracy', 'validmind.model_validation.statsmodels.RegressionModelsCoeffs', 'validmind.model_validation.statsmodels.BoxPierce', 'validmind.model_validation.statsmodels.RegressionCoeffsPlot', 'validmind.model_validation.statsmodels.RegressionModelSensitivityPlot', 'validmind.model_validation.statsmodels.RegressionModelForecastPlotLevels', 'validmind.model_validation.statsmodels.ScorecardHistogram', 'validmind.model_validation.statsmodels.LJungBox', 'validmind.model_validation.statsmodels.JarqueBera', 'validmind.model_validation.statsmodels.KolmogorovSmirnov', 'validmind.model_validation.statsmodels.ShapiroWilk', 'validmind.model_validation.statsmodels.CumulativePredictionProbabilities', 'validmind.model_validation.statsmodels.RegressionFeatureSignificance', 'validmind.model_validation.statsmodels.RegressionModelSummary', 'validmind.model_validation.statsmodels.Lilliefors', 'validmind.model_validation.statsmodels.RunsTest', 'validmind.model_validation.statsmodels.RegressionPermutationFeatureImportance', 'validmind.model_validation.statsmodels.PredictionProbabilitiesHistogram', 'validmind.model_validation.statsmodels.AutoARIMA', 'validmind.model_validation.statsmodels.GINITable', 'validmind.model_validation.statsmodels.RegressionModelForecastPlot', 'validmind.model_validation.statsmodels.DurbinWatsonTest', 'validmind.ongoing_monitoring.PredictionCorrelation', 
'validmind.ongoing_monitoring.PredictionAcrossEachFeature', 'validmind.ongoing_monitoring.FeatureDrift', 'validmind.ongoing_monitoring.TargetPredictionDistributionPlot', 'validmind.data_validation.IQROutliersTable', 'validmind.data_validation.Skewness', 'validmind.data_validation.Duplicates', 'validmind.data_validation.MissingValuesBarPlot', 'validmind.data_validation.DatasetDescription', 'validmind.data_validation.ZivotAndrewsArch', 'validmind.data_validation.ScatterPlot', 'validmind.data_validation.TimeSeriesOutliers', 'validmind.data_validation.TabularCategoricalBarPlots', 'validmind.data_validation.AutoStationarity', 'validmind.data_validation.DescriptiveStatistics', 'validmind.data_validation.TimeSeriesDescription', 'validmind.data_validation.TargetRateBarPlots', 'validmind.data_validation.PearsonCorrelationMatrix', 'validmind.data_validation.FeatureTargetCorrelationPlot', 'validmind.data_validation.TabularNumericalHistograms', 'validmind.data_validation.IsolationForestOutliers', 'validmind.data_validation.ChiSquaredFeaturesTable', 'validmind.data_validation.HighCardinality', 'validmind.data_validation.MissingValues', 'validmind.data_validation.PhillipsPerronArch', 'validmind.data_validation.RollingStatsPlot', 'validmind.data_validation.TabularDescriptionTables', 'validmind.data_validation.AutoMA', 'validmind.data_validation.UniqueRows', 'validmind.data_validation.TooManyZeroValues', 'validmind.data_validation.HighPearsonCorrelation', 'validmind.data_validation.ACFandPACFPlot', 'validmind.data_validation.WOEBinTable', 'validmind.data_validation.TimeSeriesFrequency', 'validmind.data_validation.DatasetSplit', 'validmind.data_validation.SpreadPlot', 'validmind.data_validation.TimeSeriesLinePlot', 'validmind.data_validation.KPSS', 'validmind.data_validation.AutoSeasonality', 'validmind.data_validation.BivariateScatterPlots', 'validmind.data_validation.EngleGrangerCoint', 'validmind.data_validation.TimeSeriesMissingValues', 
'validmind.data_validation.TimeSeriesHistogram', 'validmind.data_validation.LaggedCorrelationHeatmap', 'validmind.data_validation.SeasonalDecompose', 'validmind.data_validation.WOEBinPlots', 'validmind.data_validation.ClassImbalance', 'validmind.data_validation.IQROutliersBarPlot', 'validmind.data_validation.DFGLSArch', 'validmind.data_validation.TimeSeriesDescriptiveStatistics', 'validmind.data_validation.AutoAR', 'validmind.data_validation.TabularDateTimeHistograms', 'validmind.data_validation.ADF', 'validmind.data_validation.nlp.Toxicity', 'validmind.data_validation.nlp.PolarityAndSubjectivity', 'validmind.data_validation.nlp.Punctuations', 'validmind.data_validation.nlp.Sentiment', 'validmind.data_validation.nlp.CommonWords', 'validmind.data_validation.nlp.Hashtags', 'validmind.data_validation.nlp.LanguageDetection', 'validmind.data_validation.nlp.Mentions', 'validmind.data_validation.nlp.TextDescription', 'validmind.data_validation.nlp.StopWords'] = None, params: Dict[str, Any] = None, inputs: Dict[str, Any] = None, input_grid: Union[Dict[str, List[Any]], List[Dict[str, Any]]] = None, name: str = None, unit_metrics: List[Literal['validmind.prompt_validation.Bias', 'validmind.prompt_validation.Clarity', 'validmind.prompt_validation.Specificity', 'validmind.prompt_validation.Robustness', 'validmind.prompt_validation.NegativeInstruction', 'validmind.prompt_validation.Conciseness', 'validmind.prompt_validation.Delimitation', 'validmind.model_validation.ModelPredictionResiduals', 'validmind.model_validation.BertScore', 'validmind.model_validation.TimeSeriesPredictionsPlot', 'validmind.model_validation.RegardScore', 'validmind.model_validation.BleuScore', 'validmind.model_validation.TimeSeriesPredictionWithCI', 'validmind.model_validation.RegressionResidualsPlot', 'validmind.model_validation.FeaturesAUC', 'validmind.model_validation.ContextualRecall', 'validmind.model_validation.MeteorScore', 'validmind.model_validation.RougeScore', 
'validmind.model_validation.ModelMetadata', 'validmind.model_validation.ClusterSizeDistribution', 'validmind.model_validation.TokenDisparity', 'validmind.model_validation.ToxicityScore', 'validmind.model_validation.ModelMetadataComparison', 'validmind.model_validation.TimeSeriesR2SquareBySegments', 'validmind.model_validation.embeddings.CosineSimilarityComparison', 'validmind.model_validation.embeddings.EmbeddingsVisualization2D', 'validmind.model_validation.embeddings.StabilityAnalysisRandomNoise', 'validmind.model_validation.embeddings.TSNEComponentsPairwisePlots', 'validmind.model_validation.embeddings.CosineSimilarityDistribution', 'validmind.model_validation.embeddings.PCAComponentsPairwisePlots', 'validmind.model_validation.embeddings.CosineSimilarityHeatmap', 'validmind.model_validation.embeddings.StabilityAnalysisTranslation', 'validmind.model_validation.embeddings.EuclideanDistanceComparison', 'validmind.model_validation.embeddings.ClusterDistribution', 'validmind.model_validation.embeddings.EuclideanDistanceHeatmap', 'validmind.model_validation.embeddings.StabilityAnalysis', 'validmind.model_validation.embeddings.StabilityAnalysisKeyword', 'validmind.model_validation.embeddings.StabilityAnalysisSynonyms', 'validmind.model_validation.embeddings.DescriptiveAnalytics', 'validmind.model_validation.ragas.ContextEntityRecall', 'validmind.model_validation.ragas.Faithfulness', 'validmind.model_validation.ragas.AspectCritique', 'validmind.model_validation.ragas.AnswerSimilarity', 'validmind.model_validation.ragas.AnswerCorrectness', 'validmind.model_validation.ragas.ContextRecall', 'validmind.model_validation.ragas.ContextPrecision', 'validmind.model_validation.ragas.AnswerRelevance', 'validmind.model_validation.sklearn.RegressionModelsPerformanceComparison', 'validmind.model_validation.sklearn.AdjustedMutualInformation', 'validmind.model_validation.sklearn.SilhouettePlot', 'validmind.model_validation.sklearn.RobustnessDiagnosis', 
'validmind.model_validation.sklearn.AdjustedRandIndex', 'validmind.model_validation.sklearn.SHAPGlobalImportance', 'validmind.model_validation.sklearn.ConfusionMatrix', 'validmind.model_validation.sklearn.HomogeneityScore', 'validmind.model_validation.sklearn.CompletenessScore', 'validmind.model_validation.sklearn.OverfitDiagnosis', 'validmind.model_validation.sklearn.ClusterPerformanceMetrics', 'validmind.model_validation.sklearn.PermutationFeatureImportance', 'validmind.model_validation.sklearn.FowlkesMallowsScore', 'validmind.model_validation.sklearn.MinimumROCAUCScore', 'validmind.model_validation.sklearn.ClusterCosineSimilarity', 'validmind.model_validation.sklearn.PrecisionRecallCurve', 'validmind.model_validation.sklearn.ClassifierPerformance', 'validmind.model_validation.sklearn.VMeasure', 'validmind.model_validation.sklearn.MinimumF1Score', 'validmind.model_validation.sklearn.ROCCurve', 'validmind.model_validation.sklearn.RegressionR2Square', 'validmind.model_validation.sklearn.RegressionErrors', 'validmind.model_validation.sklearn.ClusterPerformance', 'validmind.model_validation.sklearn.FeatureImportanceComparison', 'validmind.model_validation.sklearn.TrainingTestDegradation', 'validmind.model_validation.sklearn.RegressionErrorsComparison', 'validmind.model_validation.sklearn.HyperParametersTuning', 'validmind.model_validation.sklearn.KMeansClustersOptimization', 'validmind.model_validation.sklearn.ModelsPerformanceComparison', 'validmind.model_validation.sklearn.WeakspotsDiagnosis', 'validmind.model_validation.sklearn.RegressionR2SquareComparison', 'validmind.model_validation.sklearn.PopulationStabilityIndex', 'validmind.model_validation.sklearn.MinimumAccuracy', 'validmind.model_validation.statsmodels.RegressionModelsCoeffs', 'validmind.model_validation.statsmodels.BoxPierce', 'validmind.model_validation.statsmodels.RegressionCoeffsPlot', 'validmind.model_validation.statsmodels.RegressionModelSensitivityPlot', 
'validmind.model_validation.statsmodels.RegressionModelForecastPlotLevels', 'validmind.model_validation.statsmodels.ScorecardHistogram', 'validmind.model_validation.statsmodels.LJungBox', 'validmind.model_validation.statsmodels.JarqueBera', 'validmind.model_validation.statsmodels.KolmogorovSmirnov', 'validmind.model_validation.statsmodels.ShapiroWilk', 'validmind.model_validation.statsmodels.CumulativePredictionProbabilities', 'validmind.model_validation.statsmodels.RegressionFeatureSignificance', 'validmind.model_validation.statsmodels.RegressionModelSummary', 'validmind.model_validation.statsmodels.Lilliefors', 'validmind.model_validation.statsmodels.RunsTest', 'validmind.model_validation.statsmodels.RegressionPermutationFeatureImportance', 'validmind.model_validation.statsmodels.PredictionProbabilitiesHistogram', 'validmind.model_validation.statsmodels.AutoARIMA', 'validmind.model_validation.statsmodels.GINITable', 'validmind.model_validation.statsmodels.RegressionModelForecastPlot', 'validmind.model_validation.statsmodels.DurbinWatsonTest', 'validmind.ongoing_monitoring.PredictionCorrelation', 'validmind.ongoing_monitoring.PredictionAcrossEachFeature', 'validmind.ongoing_monitoring.FeatureDrift', 'validmind.ongoing_monitoring.TargetPredictionDistributionPlot', 'validmind.data_validation.IQROutliersTable', 'validmind.data_validation.Skewness', 'validmind.data_validation.Duplicates', 'validmind.data_validation.MissingValuesBarPlot', 'validmind.data_validation.DatasetDescription', 'validmind.data_validation.ZivotAndrewsArch', 'validmind.data_validation.ScatterPlot', 'validmind.data_validation.TimeSeriesOutliers', 'validmind.data_validation.TabularCategoricalBarPlots', 'validmind.data_validation.AutoStationarity', 'validmind.data_validation.DescriptiveStatistics', 'validmind.data_validation.TimeSeriesDescription', 'validmind.data_validation.TargetRateBarPlots', 'validmind.data_validation.PearsonCorrelationMatrix', 
'validmind.data_validation.FeatureTargetCorrelationPlot', 'validmind.data_validation.TabularNumericalHistograms', 'validmind.data_validation.IsolationForestOutliers', 'validmind.data_validation.ChiSquaredFeaturesTable', 'validmind.data_validation.HighCardinality', 'validmind.data_validation.MissingValues', 'validmind.data_validation.PhillipsPerronArch', 'validmind.data_validation.RollingStatsPlot', 'validmind.data_validation.TabularDescriptionTables', 'validmind.data_validation.AutoMA', 'validmind.data_validation.UniqueRows', 'validmind.data_validation.TooManyZeroValues', 'validmind.data_validation.HighPearsonCorrelation', 'validmind.data_validation.ACFandPACFPlot', 'validmind.data_validation.WOEBinTable', 'validmind.data_validation.TimeSeriesFrequency', 'validmind.data_validation.DatasetSplit', 'validmind.data_validation.SpreadPlot', 'validmind.data_validation.TimeSeriesLinePlot', 'validmind.data_validation.KPSS', 'validmind.data_validation.AutoSeasonality', 'validmind.data_validation.BivariateScatterPlots', 'validmind.data_validation.EngleGrangerCoint', 'validmind.data_validation.TimeSeriesMissingValues', 'validmind.data_validation.TimeSeriesHistogram', 'validmind.data_validation.LaggedCorrelationHeatmap', 'validmind.data_validation.SeasonalDecompose', 'validmind.data_validation.WOEBinPlots', 'validmind.data_validation.ClassImbalance', 'validmind.data_validation.IQROutliersBarPlot', 'validmind.data_validation.DFGLSArch', 'validmind.data_validation.TimeSeriesDescriptiveStatistics', 'validmind.data_validation.AutoAR', 'validmind.data_validation.TabularDateTimeHistograms', 'validmind.data_validation.ADF', 'validmind.data_validation.nlp.Toxicity', 'validmind.data_validation.nlp.PolarityAndSubjectivity', 'validmind.data_validation.nlp.Punctuations', 'validmind.data_validation.nlp.Sentiment', 'validmind.data_validation.nlp.CommonWords', 'validmind.data_validation.nlp.Hashtags', 'validmind.data_validation.nlp.LanguageDetection', 'validmind.data_validation.nlp.Mentions', 
'validmind.data_validation.nlp.TextDescription', 'validmind.data_validation.nlp.StopWords']] = None, output_template: str = None, show: bool = True, __generate_description: bool = True, **kwargs) -> Union[validmind.vm_models.test.result_wrapper.MetricResultWrapper, validmind.vm_models.test.result_wrapper.ThresholdTestResultWrapper]:

Run a test by test ID

Arguments:
  • test_id (TestID, optional): The test ID to run. Not required if unit_metrics is provided.
  • params (dict, optional): A dictionary of parameters to pass into the test. Params are used to customize the test behavior and are specific to each test. See the test details for more information on the available parameters. Defaults to None.
  • inputs (Dict[str, Any], optional): A dictionary of test inputs to pass into the test. Inputs are either models or datasets that have been initialized using vm.init_model() or vm.init_dataset(). Defaults to None.
  • input_grid (Union[Dict[str, List[Any]], List[Dict[str, Any]]], optional): To run a comparison test, provide either a dictionary of inputs where the keys are the input names and the values are lists of different inputs, or a list of dictionaries where each dictionary is a set of inputs to run the test with. This will run the test multiple times with different sets of inputs and then combine the results into a single output. When passing a dictionary, the grid will be created by taking the Cartesian product of the input lists. It is simply a more convenient way of forming the input grid than passing a list of all possible combinations. Defaults to None.
  • name (str, optional): The name of the test (used to create a composite metric out of multiple unit metrics) - required when running multiple unit metrics
  • unit_metrics (list, optional): A list of unit metric IDs to run as a composite metric - required when running multiple unit metrics
  • output_template (str, optional): A jinja2 html template to customize the output of the test. Defaults to None.
  • show (bool, optional): Whether to display the results. Defaults to True.
  • **kwargs: Keyword inputs to pass into the test (same as inputs but as keyword args instead of a dictionary):
    • dataset: A validmind Dataset object or a Pandas DataFrame
    • model: A model to use for the test
    • models: A list of models to use for the test
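The two accepted shapes of input_grid can be related with a short sketch. The helper expand_input_grid below is hypothetical and is not part of ValidMind; it only shows how a dictionary of lists expands into one set of inputs per combination through a Cartesian product, as the argument description states. The string values stand in for initialized model and dataset objects.

```python
from itertools import product
from typing import Any, Dict, List, Union

InputGrid = Union[Dict[str, List[Any]], List[Dict[str, Any]]]


def expand_input_grid(input_grid: InputGrid) -> List[Dict[str, Any]]:
    """Return one inputs dictionary per combination in the grid."""
    # A list of dictionaries is already a fully spelled-out grid.
    if isinstance(input_grid, list):
        return [dict(combo) for combo in input_grid]
    # A dictionary of lists is expanded with a Cartesian product.
    names = list(input_grid)
    value_lists = [input_grid[name] for name in names]
    return [dict(zip(names, values)) for values in product(*value_lists)]


# Placeholder strings stand in for real model and dataset objects.
grid = {"model": ["model_a", "model_b"], "dataset": ["train_ds", "test_ds"]}
for inputs in expand_input_grid(grid):
    print(inputs)
```

Two models and two datasets yield four combinations, which is why the dictionary form is shorter to write than the equivalent list of four dictionaries.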
def register_test_provider( namespace: str, test_provider: TestProvider) -> None:

Register an external test provider

Arguments:
  • namespace (str): The namespace of the test provider
  • test_provider (TestProvider): The test provider
class LoadTestError(validmind.errors.BaseError):

Exception raised when an error occurs while loading a test

Inherited Members
validmind.errors.BaseError
BaseError
description
builtins.BaseException
with_traceback
add_note
class LocalTestProvider:

Test providers in ValidMind are responsible for loading tests from different sources, such as local files, databases, or remote services. The LocalTestProvider specifically loads tests from the local file system.

To use the LocalTestProvider, you need to provide the root_folder, which is the root directory for local tests. The test_id is a combination of the namespace (set when registering the test provider) and the path to the test class module, where slashes are replaced by dots and the .py extension is left out.

Example usage:

from validmind.tests import LocalTestProvider, register_test_provider

# Create an instance of LocalTestProvider with the root folder
test_provider = LocalTestProvider("/path/to/tests/folder")

# Register the test provider with a namespace
register_test_provider("my_namespace", test_provider)

# Load a test using the test_id (namespace + path to test class module)
test = test_provider.load_test("my_namespace.my_test_class")
# full path to the test class module is /path/to/tests/folder/my_test_class.py
Attributes:
  • root_folder (str): The root directory for local tests.
LocalTestProvider(root_folder: str)

Initialize the LocalTestProvider with the given root_folder (see class docstring for details)

Arguments:
  • root_folder (str): The root directory for local tests.
def load_test(self, test_id: str):

Load the test identified by the given test_id.

Arguments:
  • test_id (str): The identifier of the test. This corresponds to the relative path of the python file from the root folder, with slashes replaced by dots
Returns:

The test class that matches the last part of the test_id.

Raises:
  • Exception: If the test can't be imported or loaded.
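The relationship between a test ID and a file under root_folder can be sketched as follows. The function resolve_test_path is hypothetical and only mirrors the rule stated in the class description (namespace prefix, then the module path with dots in place of slashes and no .py extension); it does not import or load anything, and it is not the library's implementation.

```python
from pathlib import PurePosixPath


def resolve_test_path(root_folder: str, namespace: str, test_id: str) -> PurePosixPath:
    """Map 'namespace.pkg.module' to 'root_folder/pkg/module.py'."""
    prefix = namespace + "."
    if not test_id.startswith(prefix):
        raise ValueError(f"{test_id!r} is not in namespace {namespace!r}")
    # Drop the namespace, then turn the remaining dots back into path parts.
    relative = test_id[len(prefix):]
    if not relative:
        raise ValueError("Test ID has no module path after the namespace")
    return PurePosixPath(root_folder).joinpath(*relative.split(".")).with_suffix(".py")


path = resolve_test_path("/path/to/tests/folder", "my_namespace", "my_namespace.my_test_class")
print(path)
```

PurePosixPath is used so the sketch behaves the same on every platform; a real provider would work with actual file system paths.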
class TestProvider(typing.Protocol):

Protocol for user-defined test providers

TestProvider(*args, **kwargs)
def load_test(self, test_id: str):

Load the test by test ID

Arguments:
  • test_id (str): The test ID (does not contain the namespace under which the test is registered)
Returns:

Test: A test class or function

Raises:
  • FileNotFoundError: If the test is not found
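A user-defined provider only needs a load_test method with the documented behavior. The sketch below is self-contained and does not import validmind: ProviderProtocol is a local stand-in for the TestProvider protocol, and InMemoryTestProvider and row_count are hypothetical names chosen for illustration.

```python
from typing import Any, Callable, Dict, Protocol


class ProviderProtocol(Protocol):
    """Local stand-in for the documented protocol (assumption, not the real class)."""

    def load_test(self, test_id: str) -> Any:
        ...


class InMemoryTestProvider:
    """Hypothetical provider that serves tests from a dictionary."""

    def __init__(self, tests: Dict[str, Callable[..., Any]]) -> None:
        self._tests = dict(tests)

    def load_test(self, test_id: str) -> Callable[..., Any]:
        # test_id arrives without the namespace, as the protocol documents.
        try:
            return self._tests[test_id]
        except KeyError:
            raise FileNotFoundError(f"Test not found: {test_id}") from None


def row_count(dataset):
    """Toy test function: number of rows in the dataset."""
    return len(dataset)


provider: ProviderProtocol = InMemoryTestProvider({"row_count": row_count})
loaded = provider.load_test("row_count")
print(loaded([1, 2, 3]))
```

Because the protocol is structural, InMemoryTestProvider does not need to inherit from it; having a matching load_test method is enough for a type checker.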
def list_tags():

List unique tags from all test classes.

def list_tasks():

List unique tasks from all test classes.

def list_tasks_and_tags():

List all task types and their associated tags, with one row per task type and all tags for a task type in one row.

Returns:

pandas.DataFrame: A DataFrame with 'Task Type' and concatenated 'Tags'.

def test(func_or_id):

Decorator for creating and registering metrics with the ValidMind framework.

Creates a metric object and registers it with ValidMind under the provided ID. If no ID is provided, the function name is used to build one: a function named my_metric is registered under the ID validmind.custom_metrics.my_metric.

This decorator works by creating a new Metric class whose run method calls the decorated function. This function should take as arguments the inputs it requires (dataset, datasets, model, models) followed by any parameters. It can return any number of the following types:

  • Table: Either a list of dictionaries or a pandas DataFrame
  • Plot: Either a matplotlib figure or a plotly figure
  • Scalar: A single number or string

The function may also include a docstring. This docstring will be used and logged as the metric's description.

Arguments:
  • func: The function to decorate
  • test_id: The identifier for the metric. If not provided, the function name is used.
Returns:

The decorated function.
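The func_or_id argument suggests the decorator can be applied either bare or with an explicit ID. A minimal sketch of that pattern follows; REGISTRY and register are hypothetical stand-ins, and the sketch models only the ID selection and registration step, not the Metric class that the real decorator builds.

```python
from typing import Any, Callable, Dict, Optional, Union

# Hypothetical stand-in for the framework's test registry.
REGISTRY: Dict[str, Callable[..., Any]] = {}


def register(func_or_id: Union[str, Callable[..., Any]]):
    """Toy decorator usable as @register or @register("some.id")."""

    def _register(func: Callable[..., Any], test_id: Optional[str] = None):
        # Without an explicit ID, build one from the function name.
        test_id = test_id or f"validmind.custom_metrics.{func.__name__}"
        REGISTRY[test_id] = func
        return func

    if callable(func_or_id):
        # Bare usage: the decorated function itself was passed in.
        return _register(func_or_id)
    # Usage with an ID: return the actual decorator.
    return lambda func: _register(func, func_or_id)


@register
def my_metric(dataset):
    """Count the rows in the dataset."""
    return len(dataset)


@register("my_namespace.RowTable")
def row_table(dataset):
    # A table result expressed as a list of dictionaries.
    return [{"rows": len(dataset)}]


print(sorted(REGISTRY))
```

The ID my_namespace.RowTable is an illustrative value; the default ID follows the validmind.custom_metrics.my_metric convention described above.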

def metric(func_or_id):

DEPRECATED, use @vm.test instead

def tags(*tags):

Decorator for specifying tags for a metric.

Arguments:
  • *tags: The tags to apply to the metric.
def tasks(*tasks):

Decorator for specifying the task types that a metric is designed for.

Arguments:
  • *tasks: The task types that the metric is designed for.
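Decorators such as tags and tasks are commonly written by attaching metadata to the decorated function. The sketch below shows that general pattern under stated assumptions: example_tags and example_tasks are hypothetical names, and the attribute names __tags__ and __tasks__ are an assumption rather than something this reference documents.

```python
from typing import Any, Callable


def example_tags(*tags: str):
    """Toy decorator that records tags on the function (assumed attribute name)."""

    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        func.__tags__ = list(tags)
        return func

    return decorator


def example_tasks(*tasks: str):
    """Toy decorator that records task types on the function (assumed attribute name)."""

    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        func.__tasks__ = list(tasks)
        return func

    return decorator


@example_tags("tabular_data", "data_quality")
@example_tasks("classification")
def missing_ratio(dataset):
    """Fraction of None entries in a list of values."""
    return sum(value is None for value in dataset) / len(dataset)


print(missing_ratio.__tags__, missing_ratio.__tasks__)
```

The tag and task strings are illustrative values only. Both decorators return the original function unchanged, so they can be stacked in any order.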