# Answer Correctness

<figure><img src="/files/bRAmRXTtg9ujUeo0kQoy" alt=""><figcaption></figcaption></figure>

### Answer Correctness Evaluator

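The **Answer Correctness Evaluator** measures how well each answer aligns with its ground truth answer. It extracts key statements from both the answer and the ground truth, then classifies them as True Positives (TP: answer statements supported by the ground truth), False Positives (FP: answer statements not supported by the ground truth), or False Negatives (FN: ground truth statements missing from the answer). It also computes a similarity score between each answer and its ground truth.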
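To make the TP/FP/FN classification concrete, here is a minimal sketch. It assumes the statements have already been extracted, and it uses exact case-insensitive matching as a stand-in for the LLM's judgment. The `StatementClassification` class and `classify_statements` function are hypothetical names for illustration only and are not part of the `dynamiq` API.

```python
from dataclasses import dataclass, field


@dataclass
class StatementClassification:
    """Hypothetical container for the three statement buckets."""
    true_positives: list = field(default_factory=list)
    false_positives: list = field(default_factory=list)
    false_negatives: list = field(default_factory=list)


def classify_statements(answer_statements, ground_truth_statements):
    """Toy classifier: exact match stands in for the LLM-based comparison."""
    def normalize(text):
        return text.strip().lower()

    truth = {normalize(s) for s in ground_truth_statements}
    answered = {normalize(s) for s in answer_statements}

    result = StatementClassification()
    for statement in answer_statements:
        if normalize(statement) in truth:
            result.true_positives.append(statement)   # supported by ground truth
        else:
            result.false_positives.append(statement)  # not supported
    for statement in ground_truth_statements:
        if normalize(statement) not in answered:
            result.false_negatives.append(statement)  # missing from the answer
    return result


# Hand-written statements, loosely based on the sun example used on this page.
answer_statements = [
    "The sun is powered by nuclear fission.",
    "The sun provides light.",
]
ground_truth_statements = [
    "The sun is powered by nuclear fusion.",
    "The sun provides light.",
    "The sun provides heat.",
]

c = classify_statements(answer_statements, ground_truth_statements)
counts = (len(c.true_positives), len(c.false_positives), len(c.false_negatives))
print(counts)
```

In the real evaluator the comparison is semantic and is done by the language model, so paraphrased statements can still count as true positives. The sketch only shows how the three buckets relate to each other.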

#### Key Formulas

1. **Precision**:

$$
\text{Precision} = \frac{TP}{TP + FP}
$$

2. **Recall**:

$$
\text{Recall} = \frac{TP}{TP + FN}
$$

3. **F1 Score**:

$$
F1 = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}
$$

4. **Final Score** (combined from F1 score and similarity score):

$$
\text{Final Score} = w_1 \times F1 + w_2 \times \text{Similarity Score}
$$

where $$w_1$$ and $$w_2$$ are the weights assigned to the F1 score and the similarity score, respectively.
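The formulas above can be sketched in a few lines of Python. This is an illustration only, not the evaluator's implementation: the function name `answer_correctness_score` is hypothetical, the default weights `w1=0.75` and `w2=0.25` are assumed values chosen for the example, and returning zero when a denominator is zero is a convention adopted here rather than documented behavior.

```python
def answer_correctness_score(tp, fp, fn, similarity, w1=0.75, w2=0.25):
    """Combine statement-level F1 with a similarity score.

    The weights w1 and w2 are assumed example values, not library defaults.
    """
    # Guard each ratio against an empty denominator.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return w1 * f1 + w2 * similarity


# Example: 1 true positive, 1 false positive, 2 false negatives,
# and an assumed similarity score of 0.8.
# precision = 0.5, recall = 1/3, F1 = 0.4, final = 0.75 * 0.4 + 0.25 * 0.8
score = answer_correctness_score(tp=1, fp=1, fn=2, similarity=0.8)
print(round(score, 4))
```

With these assumed weights, the statement-level F1 dominates the result, so a fluent answer that is semantically close to the ground truth but gets key facts wrong still scores low.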

### Example Code: Answer Correctness Evaluation

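This example demonstrates how to compute the **Answer Correctness** metric using the `AnswerCorrectnessEvaluator` with an OpenAI language model. The first sample answer deliberately contains a factual error (nuclear fission instead of fusion) and omits part of the ground truth, while the second is correct but incomplete, so the two scores should differ.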

```python
import logging
import sys
from dotenv import find_dotenv, load_dotenv
from dynamiq.evaluations.metrics import AnswerCorrectnessEvaluator
from dynamiq.nodes.llms import OpenAI

# Load environment variables for the OpenAI API
load_dotenv(find_dotenv())

# Configure logging level
logging.basicConfig(stream=sys.stdout, level=logging.INFO)

# Initialize the OpenAI language model
llm = OpenAI(model="gpt-4o-mini")

# Sample data
questions = [
    "What powers the sun and what is its primary function?",
    "What is the boiling point of water?",
]
answers = [
    (
        "The sun is powered by nuclear fission, similar to nuclear reactors on Earth."
        " Its primary function is to provide light to the solar system."
    ),
    "The boiling point of water is 100 degrees Celsius at sea level.",
]
ground_truth_answers = [
    (
        "The sun is powered by nuclear fusion, where hydrogen atoms fuse to form helium."
        " This fusion process releases a tremendous amount of energy. The sun provides"
        " heat and light, which are essential for life on Earth."
    ),
    (
        "The boiling point of water is 100 degrees Celsius (212 degrees Fahrenheit) at"
        " sea level. The boiling point can change with altitude."
    ),
]

# Initialize evaluator
evaluator = AnswerCorrectnessEvaluator(llm=llm)

# Evaluate
correctness_scores = evaluator.run(
    questions=questions,
    answers=answers,
    ground_truth_answers=ground_truth_answers,
    verbose=False,  # Set verbose=True to enable logging
)

# Print the results
for idx, score in enumerate(correctness_scores):
    print(f"Question: {questions[idx]}")
    print(f"Answer Correctness Score: {score}")
    print("-" * 50)

print("Answer Correctness Scores:")
print(correctness_scores)
```

