AnswerRelevance
Assesses how pertinent the generated answer is to the given prompt.
The Answer Relevancy metric assigns lower scores to answers that are incomplete or contain redundant information; higher scores indicate better relevancy. It is computed using the question, the contexts and the answer.
The Answer Relevancy is defined as the mean cosine similarity of the original question to a number of artificial questions, which are generated (reverse-engineered) based on the answer:
\[ \text{answer relevancy} = \frac{1}{N} \sum_{i=1}^{N} \cos(E_{g_i}, E_o) \]
\[ \text{answer relevancy} = \frac{1}{N} \sum_{i=1}^{N} \frac{E_{g_i} \cdot E_o}{\|E_{g_i}\| \|E_o\|} \]
Where:
- \(E_{g_i}\) is the embedding of the generated question \(i\).
- \(E_o\) is the embedding of the original question.
- \(N\) is the number of generated questions (3 by default).
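As a worked illustration of the formula above, the following minimal sketch computes the mean cosine similarity in plain Python. The function names and the toy 2-D embeddings are assumptions made up for this example; a real evaluation would obtain the embeddings from an embedding model, and this is not the library's implementation.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def answer_relevancy(generated_embeddings, original_embedding):
    """Mean cosine similarity between each E_{g_i} and E_o."""
    sims = [
        cosine_similarity(e_g, original_embedding)
        for e_g in generated_embeddings
    ]
    return sum(sims) / len(sims)


# Toy embeddings, invented for illustration (N = 3 generated questions).
E_o = [1.0, 0.0]
E_g = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

score = answer_relevancy(E_g, E_o)
print(round(score, 3))  # (1 + 0 + 1/sqrt(2)) / 3, roughly 0.569
```

The first generated question matches the original exactly (similarity 1), the second is orthogonal to it (similarity 0), and the third lies in between, so the mean falls well below 1.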
Note: This is a reference-free metric, meaning that it does not require a ground_truth answer to compare against. A similar metric that does evaluate the correctness of a generated answer with respect to a ground_truth answer is validmind.model_validation.ragas.AnswerCorrectness.
Configuring Columns
This metric requires the following columns in your dataset:
- question (str): The text query that was input into the model.
- contexts (List[str]): Any contextual information retrieved by the model before generating an answer.
- answer (str): The response generated by the model.
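To make the expected shape concrete, here is a small sketch of one dataset row together with a type check. The `check_row` helper and the sample values are hypothetical and exist only for this illustration; they are not part of the library.

```python
def check_row(row):
    """Return a list of problems found in one row (empty if none)."""
    problems = []
    if not isinstance(row.get("question"), str):
        problems.append("question must be a str")
    contexts = row.get("contexts")
    if not (
        isinstance(contexts, list)
        and all(isinstance(c, str) for c in contexts)
    ):
        problems.append("contexts must be a list of str")
    if not isinstance(row.get("answer"), str):
        problems.append("answer must be a str")
    return problems


# A row in the expected shape (values invented for illustration).
good_row = {
    "question": "What is the capital of France?",
    "contexts": ["Paris is the capital of France."],
    "answer": "The capital of France is Paris.",
}

# A common mistake: contexts given as a single string instead of a list.
bad_row = {
    "question": "What is the capital of France?",
    "contexts": "Paris is the capital of France.",
    "answer": "The capital of France is Paris.",
}

print(check_row(good_row))  # []
print(check_row(bad_row))   # ['contexts must be a list of str']
```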
If your dataset stores this data under different column names, you can specify them using the parameters question_column, answer_column, and contexts_column.
For example, if your dataset has this data stored in different columns, you can pass the following parameters:
params = {
    "question_column": "input_text",
    "answer_column": "output_text",
    "contexts_column": "context_info",
}
If answer and contexts are stored as a dictionary in another column, specify the column and key like this:
pred_col = dataset.prediction_column(model)
params = {
    "answer_column": f"{pred_col}.generated_answer",
    "contexts_column": f"{pred_col}.contexts",
}
For more complex data structures, you can use a function to extract the answers:
pred_col = dataset.prediction_column(model)
params = {
    "answer_column": lambda row: "\n\n".join(row[pred_col]["messages"]),
    "contexts_column": lambda row: [row[pred_col]["context_message"]],
}
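The three ways of specifying a column shown above (a plain column name, a dotted column.key path, and a function) can be pictured with the following sketch. The `resolve` helper is hypothetical and only illustrates how such a specification could be applied to a single row represented as a dictionary; it is not the library's actual resolution logic, and the row contents and the "rag_model_prediction" column name are invented.

```python
def resolve(row, spec):
    """Extract a value from a row using a column spec (illustrative only).

    spec may be a callable taking the row, a plain column name, or a
    dotted "column.key" path into a dictionary stored in that column.
    """
    if callable(spec):
        return spec(row)
    value = row
    for key in spec.split("."):
        value = value[key]
    return value


# Stand-in for dataset.prediction_column(model); the name is made up.
pred_col = "rag_model_prediction"

row = {
    "input_text": "What is RAG?",
    pred_col: {
        "generated_answer": "Retrieval-augmented generation.",
        "contexts": ["RAG combines retrieval with generation."],
        "messages": ["Retrieval-augmented", "generation."],
        "context_message": "RAG combines retrieval with generation.",
    },
}

question = resolve(row, "input_text")
answer = resolve(row, f"{pred_col}.generated_answer")
joined = resolve(row, lambda r: "\n\n".join(r[pred_col]["messages"]))
contexts = resolve(row, lambda r: [r[pred_col]["context_message"]])
```

One limitation of this simplified sketch is that a column whose own name contains a dot would be misread as a path.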