AnswerRelevance

Assesses how pertinent the generated answer is to the given prompt.

The Answer Relevancy metric scores how directly the generated answer addresses the given prompt. Answers that are incomplete or contain redundant information receive lower scores; higher scores indicate better relevancy. The metric is computed from the question, the contexts, and the answer.

Answer Relevancy is defined as the mean cosine similarity between the original question and a number of artificial questions that are generated (reverse-engineered) from the answer:

\[ \text{answer relevancy} = \frac{1}{N} \sum_{i=1}^{N} \cos(E_{g_i}, E_o) \]

\[ \text{answer relevancy} = \frac{1}{N} \sum_{i=1}^{N} \frac{E_{g_i} \cdot E_o}{\|E_{g_i}\| \|E_o\|} \]

Where:

  • \(E_{g_i}\) is the embedding of the generated question \(i\).
  • \(E_o\) is the embedding of the original question.
  • \(N\) is the number of generated questions (3 by default).
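The formula above can be sketched in a few lines of plain Python. This is only an illustration of the arithmetic, not the library's implementation: the function names and the toy two-dimensional vectors are assumptions, and in practice the embeddings come from an embedding model.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def answer_relevancy(generated_embeddings, original_embedding):
    """Mean cosine similarity of each generated-question embedding (E_g_i)
    to the original-question embedding (E_o)."""
    sims = [cosine_similarity(e, original_embedding) for e in generated_embeddings]
    return sum(sims) / len(sims)

# Toy embeddings (hypothetical values), with N = 3 generated questions.
E_o = [1.0, 0.0]
E_g = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

score = answer_relevancy(E_g, E_o)
print(round(score, 4))
```

The three similarities here are 1, 0, and 1/√2, so the score is their mean. A generated question that points in the same direction as the original pulls the score up, while an unrelated one pulls it down.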

Note: This is a reference-free metric, meaning that it does not require a ground_truth answer to compare against. A similar metric that does evaluate the correctness of a generated answer with respect to a ground_truth answer is validmind.model_validation.ragas.AnswerCorrectness.

Configuring Columns

This metric requires the following columns in your dataset:

  • question (str): The text query that was input into the model.
  • contexts (List[str]): Any contextual information retrieved by the model before generating an answer.
  • answer (str): The response generated by the model.

If your dataset stores this data under different column names, you can specify them using the parameters question_column, answer_column, and contexts_column.

For example, if your dataset has this data stored in different columns, you can pass the following parameters:

params = {
    "question_column": "input_text",
    "answer_column": "output_text",
    "contexts_column": "context_info",
}

If answer and contexts are stored as a dictionary in another column, specify the column and key like this:

pred_col = dataset.prediction_column(model)
params = {
    "answer_column": f"{pred_col}.generated_answer",
    "contexts_column": f"{pred_col}.contexts",
}

For more complex data structures, you can use a function to extract the answers:

pred_col = dataset.prediction_column(model)
params = {
    "answer_column": lambda row: "\n\n".join(row[pred_col]["messages"]),
    "contexts_column": lambda row: [row[pred_col]["context_message"]],
}
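To see what such extractor functions produce, the sketch below applies them to a single row represented as a plain dictionary. The column name rag_model_prediction and the row contents are hypothetical, and the example assumes each function is called once per row. It does not show how the library invokes them internally.

```python
# Hypothetical prediction column name; in practice this would come from
# dataset.prediction_column(model).
pred_col = "rag_model_prediction"

# Hypothetical row: the prediction column holds a dictionary with several
# message strings and one context string.
row = {
    "question": "What is the capital of France?",
    pred_col: {
        "messages": ["Paris is the capital of France.", "It lies on the Seine."],
        "context_message": "France's capital city is Paris.",
    },
}

params = {
    # Join all messages into one answer string, separated by blank lines.
    "answer_column": lambda row: "\n\n".join(row[pred_col]["messages"]),
    # Wrap the single context string in a list to match List[str].
    "contexts_column": lambda row: [row[pred_col]["context_message"]],
}

answer = params["answer_column"](row)
contexts = params["contexts_column"](row)
print(answer)
print(contexts)
```

Each function returns a value of the type the metric expects for that column: a single string for the answer and a list of strings for the contexts.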