Datasets
Creating an Evaluation Dataset
A well-prepared dataset is crucial for assessing the performance of AI workflows and for ensuring that evaluation metrics capture the nuances of different answers. This guide outlines how to create, manage, and use an evaluation dataset in Dynamiq.
Steps to Create an Evaluation Dataset
Navigate to Datasets:
In the Dynamiq portal, go to the Evaluations section and select Datasets. This is where you can manage your datasets.
Add New Dataset:
Click on the Add new dataset button to start creating a new dataset.
Dataset Details:
Name: Enter a descriptive name for your dataset.
Description: Provide a brief description of the dataset's purpose and contents.
Upload from File:
You can upload your dataset in JSON format. Click on the upload area or drag and drop your JSON file. If you need a reference, you can download the Sample JSON to see the required format.
JSON Structure:
Your JSON file should include the essential data for each entry, such as input prompts and desired outputs. Refer to the downloadable Sample JSON for the exact structure.
Create: Once your file is uploaded, click the Create button to finalize your dataset.
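As a rough illustration of preparing a file for upload, the sketch below writes a small dataset to a JSON file with Python's standard library. The key names (`context`, `question`, `ground_truth_answer`) and the flat list layout are assumptions loosely based on the entry fields this guide mentions; they are not confirmed to match the format Dynamiq expects, so compare the result against the Sample JSON before uploading.

```python
import json
from pathlib import Path


def build_entries():
    """Return example entries.

    The key names are hypothetical; check the Sample JSON for the real ones.
    """
    return [
        {
            "context": "Paris is the capital of France.",
            "question": "What is the capital of France?",
            "ground_truth_answer": "Paris",
        },
        {
            "context": "Water boils at 100 degrees Celsius at sea level.",
            "question": "At what temperature does water boil at sea level?",
            "ground_truth_answer": "100 degrees Celsius",
        },
    ]


def write_dataset(entries, path):
    """Serialize the entries as pretty-printed UTF-8 JSON and return the path."""
    path = Path(path)
    path.write_text(
        json.dumps(entries, indent=2, ensure_ascii=False),
        encoding="utf-8",
    )
    return path


if __name__ == "__main__":
    entries = build_entries()
    out = write_dataset(entries, "evaluation_dataset.json")
    print(f"Wrote {len(entries)} entries to {out}")
```

Generating the file from code rather than editing it by hand makes it easier to regenerate the dataset when entries change.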
Deploying the Dataset
After creating your dataset, deploy it so that it is available for use in evaluations. A deployed dataset is accessible during the evaluation process.
Updating the Dataset
If you need to make changes to your dataset, you can easily update it by creating a new version:
Navigate to Datasets and select the dataset you want to update.
Click on the New Dataset Version button and provide the updated JSON file or make changes in the UI.
This allows you to add new values or modify existing entries while maintaining a history of dataset versions.
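Before uploading a new version, it can help to see what changed relative to the previous file. The sketch below is a local helper, not part of Dynamiq: it compares two lists of entries and reports which were added, removed, or modified. It assumes each entry is a dictionary and that the `question` key (a hypothetical name) identifies an entry uniquely.

```python
def diff_versions(old_entries, new_entries, key="question"):
    """Compare two dataset versions, matching entries by `key`.

    `key` is a hypothetical field name assumed to be unique per entry.
    """
    old = {entry[key]: entry for entry in old_entries}
    new = {entry[key]: entry for entry in new_entries}
    return {
        "added": [k for k in new if k not in old],
        "removed": [k for k in old if k not in new],
        "modified": [k for k in new if k in old and new[k] != old[k]],
    }


if __name__ == "__main__":
    v1 = [
        {"question": "Q1", "ground_truth_answer": "A1"},
        {"question": "Q2", "ground_truth_answer": "A2"},
    ]
    v2 = [
        {"question": "Q1", "ground_truth_answer": "A1 (revised)"},
        {"question": "Q3", "ground_truth_answer": "A3"},
    ]
    print(diff_versions(v1, v2))
```

If questions are not unique in your dataset, pass a different `key` or add an explicit identifier field to each entry.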
Reviewing Your Dataset
After uploading, you can review your dataset entries:
Dataset Overview: View the dataset's version, creator, and last edited details.
Dataset Entries: Examine each entry's context, question, and ground truth answer to ensure accuracy and completeness.
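Part of the review described above can be approximated locally before upload. The sketch below flags entries whose context, question, or ground truth answer is missing or blank. The field names in `REQUIRED_FIELDS` are assumptions derived from the wording of this guide, not confirmed keys of the Dynamiq format.

```python
# Hypothetical key names; adjust them to match the Sample JSON.
REQUIRED_FIELDS = ("context", "question", "ground_truth_answer")


def find_incomplete_entries(entries, required=REQUIRED_FIELDS):
    """Return (index, missing_fields) pairs for entries that fail the check.

    A field counts as missing when it is absent, not a string, or blank.
    """
    problems = []
    for index, entry in enumerate(entries):
        missing = []
        for field in required:
            value = entry.get(field)
            if not isinstance(value, str) or not value.strip():
                missing.append(field)
        if missing:
            problems.append((index, missing))
    return problems


if __name__ == "__main__":
    sample = [
        {"context": "c", "question": "q", "ground_truth_answer": "a"},
        {"context": "", "question": "q"},
    ]
    for index, missing in find_incomplete_entries(sample):
        print(f"Entry {index} is missing: {', '.join(missing)}")
```

This only checks that fields are present; whether each ground truth answer is actually correct still needs a manual review.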
By following these steps, you can create a dataset that supports thorough testing and validation of your AI workflows.