Python Code Metrics

Creating a Python-Based Metric in Dynamiq

The Power of Custom Python Metrics

One of Dynamiq's standout features is the ability to create custom Python-based metrics. This gives you far greater flexibility in evaluating your AI workflows: instead of being confined to predefined evaluation criteria, you can implement evaluation logic tailored to your specific use cases.

Creating Your Python Metric

Follow these steps to create a Python metric in Dynamiq:

  1. Navigate to the Metric Creation Interface:

    • Go to Evaluations → Metrics → Create New Metric.

  2. Select the Python Tab:

    • Once you arrive at the Python tab, you will see an example metric designed to help you get started. For quick development, you can also choose from several pre-built templates (a sketch of one appears after this list), including:

      • Exact Match

      • Email Presence

      • Phone Presence

      • String Presence

      • Arithmetic Sum

      • JSON Validity Check
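
For illustration, here is a rough sketch of what an Email Presence-style template could look like. This is a hypothetical version, not the template Dynamiq ships, and it assumes the re module is among the allowed libraries:

import re  # assumption: the re module is on Dynamiq's allow-list

# Sketch of an "Email Presence"-style metric: returns 1 if the answer
# contains at least one email-like string, 0 otherwise.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

def evaluate(answer: str) -> int:
    return 1 if EMAIL_PATTERN.search(answer) else 0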

How Python Metrics Work

Let’s look at a simple example using the Exact Match metric:

def evaluate(answer: str, expected: str) -> int:
    # Score 1 when the answer matches the expected value exactly, else 0
    if answer == expected:
        score = 1
    else:
        score = 0
    return score

This code checks if an answer matches the expected value exactly, returning a score of 1 for a match or 0 otherwise.

Important Note on Function Arguments

When defining a Python metric, Dynamiq automatically extracts the function arguments (in this case, answer and expected). During evaluation runs, you’ll need to map these arguments to specific fields from your dataset or workflow output.

This mapping system allows your metrics to dynamically access the relevant data during evaluation.
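
For example, a metric can declare whichever arguments it needs, and each one is mapped at evaluation time. In the hypothetical sketch below, answer might be mapped to your workflow's output field and context to a dataset column (the argument names are illustrative):

# Hypothetical metric whose arguments are mapped at evaluation time:
# "answer" to a workflow output field, "context" to a dataset column.
def evaluate(answer: str, context: str) -> float:
    # Fraction of answer words that also appear in the context,
    # used here as a rough groundedness signal
    answer_words = set(answer.lower().split())
    context_words = set(context.lower().split())
    if not answer_words:
        return 0.0
    return len(answer_words & context_words) / len(answer_words)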

Technical Implementation Details

Under the hood, Dynamiq uses RestrictedPython to safely execute your metric code. This means that certain operations may be prohibited for security reasons.
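
As an illustration, operations like the following are commonly blocked in sandboxed execution; the precise limits depend on Dynamiq's RestrictedPython configuration, so treat this as an assumption rather than a definitive list:

# Illustration only — exact restrictions depend on the sandbox configuration.
def evaluate(answer: str, expected: str) -> int:
    open("metric.log", "w")      # direct file I/O: usually unavailable
    __import__("subprocess")     # arbitrary imports: usually rejected
    return int(answer == expected)

If your metric needs a particular library, check first that it is on the allowed list (see the link at the end of this page).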

Taking Your Evaluations to the Next Level

With the power of custom Python metrics at your disposal, you can create sophisticated evaluation pipelines tailored to your precise requirements. This functionality enables you to:

  • Implement domain-specific evaluation logic.

  • Create composite metrics that take multiple factors into account (a sketch follows this list).

  • Evaluate complex data structures and relationships.

  • Build metrics that align directly with your business objectives.
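
As a taste of what a composite metric could look like, here is a hypothetical sketch that blends exact match with a brevity factor; the weights and logic are purely illustrative, not a Dynamiq-provided metric:

def evaluate(answer: str, expected: str) -> float:
    # Hypothetical composite score: 80% exact match, 20% brevity
    exact = 1.0 if answer.strip() == expected.strip() else 0.0
    # Penalize answers that are much longer than the reference
    brevity = min(1.0, len(expected) / max(len(answer), 1))
    return 0.8 * exact + 0.2 * brevity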

By leveraging the full power of Python within your evaluation pipeline, you can gain deeper insights into your AI applications and make informed, data-driven improvements with confidence.

For a detailed overview of allowed Python libraries and implementation details, you can review the official Dynamiq GitHub repository.