Build Accurate vs. Inaccurate Workflows

Creating Example Workflows to Showcase the Power of Metrics

In this section, we will demonstrate the effectiveness of evaluation metrics by creating two distinct workflows: one that generates accurate answers and another that produces incorrect answers. This comparison will highlight how various metrics can differentiate between high-quality and low-quality outputs.

Workflow Overview

We will create the following two workflows:

  1. Accurate Workflow: This workflow generates correct, precise answers to questions.

  2. Inaccurate Workflow: This workflow intentionally includes errors and irrelevant information in its responses.

Prompt 1: Accurate Answers

Instructions for the Accurate Workflow

You are an expert assistant providing precise and accurate answers to questions. 
Ensure that your answers are correct and concise, and, where appropriate, include brief explanations to enhance understanding.

Instructions:
- Provide accurate information in response to the question.
- Keep the answer clear and concise.
- Include a brief explanation if it adds value.
- Do not include irrelevant information.
- Use proper grammar and spelling.
- Maintain a professional tone.

Question: {{question}}
Context: {{context}}

Answer:

Prompt 2: Inaccurate Answers

Instructions for the Inaccurate Workflow

You are an assistant providing answers to questions, but you often make mistakes and include irrelevant information.

Instructions:
- Provide an answer to the question that includes:
  - Major and minor factual errors.
  - Incomplete or insufficient information.
  - Irrelevant or off-topic details.
  - Grammatical and spelling mistakes.
- Aim for a casual tone.

Question: {{question}}
Context: {{context}}

Answer:
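
Both prompts use {{question}} and {{context}} placeholders that are filled in at run time from the workflow's input. As a minimal sketch of how that substitution works, the snippet below renders the tail of a prompt with the jinja2 library; the sample question and context are made up for illustration.

```python
from jinja2 import Template

# Illustrative question/context pair -- made up for this example.
item = {
    "question": "What is the boiling point of water at sea level?",
    "context": "At standard atmospheric pressure (1 atm), water boils at 100 degrees Celsius.",
}

# Tail of the prompts shown above; the full instruction text would
# precede these lines in the real templates.
template = Template(
    "Question: {{question}}\n"
    "Context: {{context}}\n\n"
    "Answer:"
)

print(template.render(**item))
```

The same question and context are rendered into both the accurate and the inaccurate prompt, so any difference in the generated answers comes from the instructions alone.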

Creating and Deploying the Workflows

Steps to Create the Workflows

  1. Navigate to Workflows: In the Dynamiq portal, go to the Workflows section.

  2. Create New Workflow: Click on the Create button to start a new workflow.

  3. Configure Workflow:

    • Name: Assign a descriptive name to each workflow (e.g., "Accurate Workflow" and "Inaccurate Workflow").

    • Prompt: Use the templates provided above for each workflow.

    • LLM Selection: Choose the LLM provider and model that will generate the responses. Using the same provider and model for both workflows keeps the prompt as the only variable in the comparison.

  4. Deploy Workflows: Once configured, deploy the workflows to start generating answers based on the provided prompts.

By setting up these workflows, you can clearly observe how various evaluation metrics can distinguish between accurate and inaccurate responses, demonstrating their effectiveness in evaluating AI-generated content.
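
Once deployed, each workflow exposes an endpoint you can call to collect answers for evaluation. The sketch below is only an illustration: the endpoint URLs, payload fields, and access-key header are assumptions, so use the actual values shown for your deployments in the Dynamiq portal.

```python
import requests

# Hypothetical endpoint URLs and access key -- replace with the values
# shown for your own deployments in the Dynamiq portal.
ENDPOINTS = {
    "accurate": "https://<your-dynamiq-host>/workflows/<accurate-workflow-id>/run",
    "inaccurate": "https://<your-dynamiq-host>/workflows/<inaccurate-workflow-id>/run",
}
HEADERS = {"Authorization": "Bearer <your-access-key>"}

item = {
    "question": "What is the boiling point of water at sea level?",
    "context": "At standard atmospheric pressure (1 atm), water boils at 100 degrees Celsius.",
}

answers = {}
for name, url in ENDPOINTS.items():
    # Assumed payload shape: the workflow's input transformer maps these
    # fields onto the {{question}} and {{context}} placeholders.
    response = requests.post(url, json=item, headers=HEADERS, timeout=60)
    response.raise_for_status()
    answers[name] = response.json()  # structure depends on your workflow's output

print(answers)
```

Collecting answers from both workflows for the same questions gives the evaluation run directly comparable outputs.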

Running and Reviewing an Evaluation

Once your workflows are deployed, follow these steps to assess their performance using the evaluation metrics you defined.

Steps to Execute an Evaluation Run

  1. Initiate Evaluation Run: After configuring your evaluation settings, click Create to start the evaluation job. The system will begin processing the workflows with the selected metrics.

  2. Monitor Evaluation Status: In the Evaluations section, you can check the status of your evaluation runs. It will initially show as "Running" and change to "Succeeded" once completed (a minimal polling sketch for checking status outside the UI follows these steps).

  3. Review Results: Once the evaluation is complete, you can review the answers and their corresponding metrics.
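
If you prefer to track run status programmatically rather than in the UI, a simple polling loop is enough. Note that the status endpoint and response field below are hypothetical placeholders used for illustration, not a documented Dynamiq API.

```python
import time
import requests

# Hypothetical status endpoint and access key -- illustrative only.
STATUS_URL = "https://<your-dynamiq-host>/evaluations/runs/<run-id>"
HEADERS = {"Authorization": "Bearer <your-access-key>"}

while True:
    status = requests.get(STATUS_URL, headers=HEADERS, timeout=30).json().get("status")
    print(f"Evaluation run status: {status}")
    if status in ("Succeeded", "Failed"):
        break
    time.sleep(15)  # check again in 15 seconds
```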

Reviewing Evaluation Results

  • Evaluation Runs Overview: The main screen will list all evaluation runs, showing their names, statuses, and creators. Successful runs will be marked as "Succeeded."

  • Detailed Results: Click on an evaluation run to see more detailed insights, such as:

    • Context and Question: The input data used for generating answers.

    • Ground Truth Answer: The correct answer for comparison.

    • Workflow Outputs: The answers generated by each workflow version.

    • Metrics Scores: The scores for each metric, including but not limited to Clarity, Coherence, Ethical Compliance, Language Quality, and Factual Accuracy (a sketch for comparing these scores across the two workflows follows this list).
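
To make the contrast between the two workflows explicit, you can aggregate the per-answer scores by workflow and metric. The sketch below uses plain Python with made-up scores; the data layout is an assumption for illustration, not an export format.

```python
from statistics import mean

# Made-up example scores on a 0-1 scale -- illustrative only, not real results.
results = [
    {"workflow": "Accurate Workflow", "metric": "Factual Accuracy", "score": 0.95},
    {"workflow": "Accurate Workflow", "metric": "Clarity", "score": 0.90},
    {"workflow": "Inaccurate Workflow", "metric": "Factual Accuracy", "score": 0.30},
    {"workflow": "Inaccurate Workflow", "metric": "Clarity", "score": 0.55},
]

# Average each metric per workflow to see how clearly the metrics separate
# the accurate workflow from the intentionally inaccurate one.
by_key = {}
for row in results:
    by_key.setdefault((row["workflow"], row["metric"]), []).append(row["score"])

for (workflow, metric), scores in sorted(by_key.items()):
    print(f"{workflow:22s} {metric:18s} {mean(scores):.2f}")
```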

Conclusion

By following these steps to create accurate and inaccurate workflows, you will gain a comprehensive understanding of how various evaluation metrics can be applied to evaluate AI-generated content. This process not only highlights the effectiveness of the metrics but also helps in identifying areas for improvement in your AI workflows.