Faithfulness

Faithfulness Metric

The Faithfulness Metric measures how factually consistent a generated answer is with the provided context. It is computed from the answer and the retrieved context. The score is normalized to the range 0 to 1; a higher score indicates better faithfulness.

$$
\text{Faithfulness score} = {|\text{Number of claims in the generated answer that can be inferred from given context}| \over |\text{Total number of claims in the generated answer}|}
$$
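As a purely hypothetical illustration (the numbers are assumed, not taken from any real evaluation): if a generated answer contains three claims and two of them can be inferred from the given context, the ratio works out as follows.

```latex
\text{Faithfulness score} = \frac{2}{3} \approx 0.67
```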

Definition

An answer is considered faithful if all claims made in the answer can be logically inferred from the given context.

Calculation Process

  1. Claim Identification: A set of claims present in the generated answer is identified.

  2. Cross-Verification: Each claim is then checked against the provided context to determine whether the context substantiates it.
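The two steps above can be sketched in a few lines of Python. This is a minimal illustration, not the library's implementation: the names `faithfulness_score`, `is_supported`, and `naive_is_supported` are hypothetical, the claims are assumed to have been extracted already, and the substring check is only a toy stand-in for whatever verifier (typically a language model) decides if a claim can be inferred from the context.

```python
from typing import Callable, Sequence


def faithfulness_score(
    claims: Sequence[str],
    context: str,
    is_supported: Callable[[str, str], bool],
) -> float:
    """Return the fraction of claims the verifier judges inferable from the context."""
    if not claims:
        # Assumption: with no claims the ratio is undefined, so raise instead of guessing.
        raise ValueError("at least one claim is required to compute a score")
    supported = sum(1 for claim in claims if is_supported(claim, context))
    return supported / len(claims)


def naive_is_supported(claim: str, context: str) -> bool:
    """Toy stand-in verifier (assumption): case-insensitive substring match."""
    return claim.lower() in context.lower()


# Step 1 (claim identification) is assumed to have produced this list already.
context = "Einstein was born in Germany. He was born on 14 March 1879."
claims = [
    "Einstein was born in Germany",   # supported by the context
    "He was born on 20 March 1879",   # not supported: the date differs
]

# Step 2 (cross-verification) plus the final ratio.
score = faithfulness_score(claims, context, naive_is_supported)
print(score)
```

Passing the verifier in as a callable keeps the ratio computation separate from the judgment of each claim, so the toy check can be swapped for a stronger one without touching the scoring logic.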

Scoring

The final faithfulness score is the fraction of claims in the answer that the given context supports. A score of 1 means every claim can be inferred from the context; a score of 0 means none can.

Result

Once computed, the Faithfulness metric is ready for use in evaluations.

Example Code: Faithfulness Evaluation

This example demonstrates how to use the FaithfulnessEvaluator to assess the factual consistency of generated answers against given contexts using the OpenAI language model.
