
Datasets

Creating an Evaluation Dataset

A well-prepared dataset is crucial for assessing the performance of AI workflows and ensuring that evaluation metrics capture the nuances of different answers. In this guide, we will outline how to create, manage, and utilize your evaluation dataset in Dynamiq.

Steps to Create an Evaluation Dataset

  1. Navigate to Datasets:

    • In the Dynamiq portal, go to the Evaluations section and select Datasets. This is where you can manage your datasets.

  2. Add New Dataset:

    • Click on the Add new dataset button to start creating a new dataset.

  3. Dataset Details:

    • Name: Enter a descriptive name for your dataset.

    • Description: Provide a brief description of the dataset's purpose and contents.

  4. Upload from File:

    • You can upload your dataset in JSON format. Click on the upload area or drag and drop your JSON file. If you need a reference, you can download the Sample JSON to see the required format.

  5. JSON Structure:

    • Your JSON file should include essential data such as input prompts and desired outputs, with as many entries as you need. Here's an example structure (a Python sketch for generating this file programmatically follows the list below):

    [
      {
        "question": "What is the capital of France?",
        "context": "France, a country in Western Europe, has its capital in Paris, which is renowned for its art, culture, and the iconic Eiffel Tower.",
        "ground_truth_answer": "Paris"
      },
      {
        "question": "Who developed the theory of relativity?",
        "context": "The theory of relativity, which revolutionized our understanding of space, time, and gravity, was developed by the physicist Albert Einstein in the early 20th century.",
        "ground_truth_answer": "Albert Einstein"
      }
    ]
  6. Create: Once your file is uploaded, click the Create button to finalize your dataset.
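
If you prefer to assemble the dataset file in code rather than by hand, the minimal Python sketch below writes entries in the format shown above. The field names follow the sample JSON; the file name and entry values are illustrative placeholders, not part of any Dynamiq API.

    import json

    # Illustrative entries in the format expected by the Datasets upload.
    entries = [
        {
            "question": "What is the capital of France?",
            "context": "France, a country in Western Europe, has its capital in Paris.",
            "ground_truth_answer": "Paris",
        },
        {
            "question": "Who developed the theory of relativity?",
            "context": "The theory of relativity was developed by Albert Einstein in the early 20th century.",
            "ground_truth_answer": "Albert Einstein",
        },
    ]

    # Write a JSON array ready to upload in the Datasets UI.
    with open("evaluation_dataset.json", "w", encoding="utf-8") as f:
        json.dump(entries, f, ensure_ascii=False, indent=2)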

Deploying the Dataset

After creating your dataset, deploy it to make it available for use in evaluations; deployment is what makes the dataset accessible during evaluation runs.

Updating the Dataset

If you need to make changes to your dataset, you can easily update it by creating a new version:

  • Navigate to Datasets and select the dataset you want to update.

  • Click on the New Dataset Version button and provide the updated JSON file or make changes in the UI.

  • This allows you to add new entries or modify existing ones while maintaining a history of dataset versions; a sketch of preparing an updated file is shown after this list.
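
As an example of preparing the updated file, the sketch below loads an exported copy of the current dataset, appends a new entry, and saves the result for the New Dataset Version upload. The file names and the new entry are placeholders.

    import json

    # Load an exported copy of the current dataset (placeholder file name).
    with open("evaluation_dataset.json", "r", encoding="utf-8") as f:
        entries = json.load(f)

    # Append a new entry in the same format as the existing ones.
    entries.append(
        {
            "question": "Which planet is known as the Red Planet?",
            "context": "Mars is often called the Red Planet because of the reddish iron oxide on its surface.",
            "ground_truth_answer": "Mars",
        }
    )

    # Save the updated entries to a new file for the New Dataset Version upload.
    with open("evaluation_dataset_v2.json", "w", encoding="utf-8") as f:
        json.dump(entries, f, ensure_ascii=False, indent=2)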

Reviewing Your Dataset

After uploading, you can review your dataset entries:

  • Dataset Overview: View the dataset's version, creator, and last edited details.

  • Dataset Entries: Examine each entry's context, question, and ground truth answer to ensure accuracy and completeness. A small local check, like the sketch below, can complement this review.
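
Alongside the review in the UI, a quick local check can catch missing or empty fields in your JSON file. The sketch below is a plain Python script, not a Dynamiq API; the file name is a placeholder.

    import json

    # Fields every entry is expected to contain, per the sample JSON above.
    REQUIRED_FIELDS = ("question", "context", "ground_truth_answer")

    with open("evaluation_dataset.json", "r", encoding="utf-8") as f:
        entries = json.load(f)

    # Report any entry with a missing or empty field.
    for index, entry in enumerate(entries):
        missing = [field for field in REQUIRED_FIELDS if not entry.get(field)]
        if missing:
            print(f"Entry {index} has missing or empty fields: {missing}")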

By following these steps, you can create a comprehensive dataset that enhances the evaluation process, ensuring your AI workflows are thoroughly tested and validated.

