RAG Nodes

Creating a RAG Application with Dynamiq

What is RAG?

RAG (Retrieval-Augmented Generation) is a natural language processing technique that combines information retrieval with text generation. Before the language model answers, the system retrieves relevant content from an external knowledge source and supplies it to the model as context, so the response can be more accurate and contextually relevant. This is particularly useful when the model needs up-to-date or domain-specific information that is not part of its training data.

Why is RAG Needed?

Traditional language models are limited by the data they were trained on, which can become outdated or insufficient for specific queries. RAG addresses this limitation by retrieving relevant documents or data from external sources at query time and adding them to the model's input, so responses reflect current and precise information. This makes RAG well suited to applications such as customer support, knowledge management, and other domains where answers depend on changing or specialized information.

Building RAG Applications

Creating a RAG application involves two main phases:

1. Indexing Phase

This phase involves preparing and storing the data that the application will retrieve during inference. It includes steps like data preprocessing, chunking, vectorization, and storage.

2. Inference Phase

This phase involves processing user queries to retrieve relevant information and generate responses. It includes components like input nodes, text embedders, retrievers, generators, and output nodes.

High-Level Components of a RAG Application

To build a RAG application using Dynamiq's Workflow Builder, you will need the following components:

Indexing Phase Components

Pre-processing: Converts raw data files into a structured format.
Chunking: Splits documents into manageable pieces.
Vectorization: Converts text into vector representations.
Storage: Saves the vectorized data in vector storage for retrieval.
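
The four indexing steps above can be sketched in plain Python. This is only an illustration of the data flow, not Dynamiq's API: the function names (`preprocess`, `chunk`, `embed`, `index_documents`), the hashed bag-of-words "embedding", and the in-memory list used as storage are all assumptions made for the example. A real workflow would use an embedding model and a vector database instead.

```python
import hashlib
import math
import re

DIM = 64  # assumed size of the toy embedding vectors


def preprocess(raw: str) -> str:
    """Pre-processing: collapse whitespace so every document has a uniform format."""
    return re.sub(r"\s+", " ", raw).strip()


def chunk(text: str, size: int = 8, overlap: int = 2) -> list[str]:
    """Chunking: windows of `size` words, each sharing `overlap` words with the previous one."""
    words = text.split()
    step = size - overlap  # assumes size > overlap
    chunks = []
    start = 0
    while start < len(words):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the current window already reaches the end of the text
        start += step
    return chunks


def embed(text: str) -> list[float]:
    """Vectorization: toy hashed bag-of-words vector, scaled to unit length."""
    vec = [0.0] * DIM
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        digest = hashlib.md5(token.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def index_documents(raw_docs: list[str]) -> list[dict]:
    """Storage: an in-memory list standing in for a vector store."""
    store = []
    for doc_id, raw in enumerate(raw_docs):
        for piece in chunk(preprocess(raw)):
            store.append({"doc_id": doc_id, "text": piece, "vector": embed(piece)})
    return store


store = index_documents([
    "Dynamiq   workflows connect nodes.\nEach node performs one step.",
    "A retriever searches a vector store for chunks similar to the query.",
])
for entry in store:
    print(entry["doc_id"], repr(entry["text"]))
```

The overlap between neighboring chunks is a common way to avoid cutting a relevant sentence in half at a chunk boundary; the sizes used here are arbitrary.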

Inference Phase Components

Input Node: Captures the user's query or question.
Text Embedder: Converts the input query into a vector representation.
Retriever: Searches the vector storage for documents whose vectors are most similar to the query's vector.
Generator: Uses the retrieved documents to generate a response.
Output Node: Displays the generated response to the user.
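
The flow through these five components can be sketched as follows. As a hedged illustration only: none of these names come from Dynamiq, the word-count "embedding" is a stand-in for a real embedding model, and `generate` returns the augmented prompt rather than calling a language model, since the actual generator depends on the LLM you configure.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Text Embedder: toy word-count vector standing in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query_vec: Counter, store: list[dict], top_k: int = 2) -> list[str]:
    """Retriever: return the texts of the top_k chunks most similar to the query vector."""
    ranked = sorted(store, key=lambda item: cosine(query_vec, item["vector"]), reverse=True)
    return [item["text"] for item in ranked[:top_k]]


def generate(query: str, context: list[str]) -> str:
    """Generator (stub): build the augmented prompt that a real LLM call would receive."""
    context_block = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{context_block}\n\nQuestion: {query}"


def answer(query: str, store: list[dict]) -> str:
    query_vec = embed(query)              # Text Embedder
    context = retrieve(query_vec, store)  # Retriever
    return generate(query, context)       # Generator


# Hypothetical chunks, as they might look after the indexing phase.
CHUNKS = [
    "The indexing phase stores chunk vectors in a vector store.",
    "The retriever searches the vector store with the query vector.",
    "Bananas are rich in potassium.",
]
STORE = [{"text": c, "vector": embed(c)} for c in CHUNKS]

query = "What does the retriever search?"  # Input Node
print(answer(query, STORE))                # Output Node
```

Note that the query must be embedded with the same method used for the stored chunks; otherwise the similarity scores are not meaningful.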

Together, these components let the RAG application provide accurate and contextually relevant answers to user queries.

The following sections cover each component in more detail, including what it does and how to configure it within Dynamiq's Workflow Builder.

