Metrics
Welcome to the first chapter dedicated to Metrics Creation using Dynamiq. Effective evaluation begins with the right metrics, and this guide shows you how to create, customize, and implement metrics that improve the performance of your AI-driven solutions.
In this chapter, we will cover the various types of metrics available in Dynamiq, breaking them down into three main categories:
LLM-as-a-Judge Metrics: Customize your evaluations by defining LLM prompts tailored to specific evaluation goals. Discover how to create metric prompts that assess criteria such as accuracy and relevance (see the prompt sketch after this list).
Predefined Metrics: Leverage a set of built-in metrics designed to evaluate answer quality in Retrieval-Augmented Generation (RAG) and agentic applications, and learn how to apply them to your specific needs.
Custom Python Code Metrics: For advanced users, Dynamiq allows the creation of custom metrics using Python code, giving you the flexibility to programmatically implement evaluation methods tailored to your unique requirements (see the function sketch after this list).
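To make the first category concrete, here is a minimal sketch of an LLM-as-a-Judge metric: a prompt template that asks the model to score an answer on a fixed scale. The template wording, the placeholder names, and the 1–5 scale are illustrative assumptions, not Dynamiq's exact prompt format:

```python
# A minimal LLM-as-a-Judge prompt sketch. The placeholder names
# ({question}, {answer}) and the 1-5 scale are illustrative
# assumptions, not Dynamiq's exact template format.
JUDGE_PROMPT = """You are an impartial evaluator.

Question: {question}
Answer: {answer}

Rate the answer's factual accuracy and relevance to the question
on a scale from 1 (poor) to 5 (excellent). Respond with only the
integer score."""


def build_judge_prompt(question: str, answer: str) -> str:
    """Fill the template with a concrete question/answer pair."""
    return JUDGE_PROMPT.format(question=question, answer=answer)
```

Constraining the judge to emit a single integer keeps the response easy to parse and compare across evaluation runs.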
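For the third category, a custom Python metric is, in essence, a function that receives the fields being evaluated and returns a score. The function name, parameters, and scoring logic below are a hypothetical sketch, not Dynamiq's required interface:

```python
# Hypothetical custom metric: an F1-style overlap score between the
# unique tokens of the generated answer and a reference answer. The
# signature is an illustrative assumption, not Dynamiq's required
# interface.
def answer_token_f1(answer: str, reference: str) -> float:
    """Return a set-based token F1 score in [0.0, 1.0]."""
    answer_tokens = set(answer.lower().split())
    reference_tokens = set(reference.lower().split())
    if not answer_tokens or not reference_tokens:
        return 0.0
    common = answer_tokens & reference_tokens
    if not common:
        return 0.0
    precision = len(common) / len(answer_tokens)
    recall = len(common) / len(reference_tokens)
    return 2 * precision * recall / (precision + recall)


print(answer_token_f1("Paris is the capital of France",
                      "The capital of France is Paris"))  # 1.0
```

Because the metric is plain Python, you can swap in any scoring logic you need, from string matching to calls into your own evaluation libraries.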
In the upcoming subpages, you’ll find detailed information on each metric category, including:
Step-by-step guides on how to create and customize metrics.
Examples and best practices for implementing each metric type.
Tips on how to integrate your metrics into your evaluation workflows effectively.
By the end of this chapter, you will have a comprehensive understanding of how to create metrics that meet your specific evaluation needs, empowering you to drive continuous improvement in your AI applications.
Let’s dive in!