Document retrievers

In the inference workflow of a Retrieval-Augmented Generation (RAG) application, document retrievers fetch the stored, vectorized documents that are most relevant to a query. The retrieved documents give the system the context it needs to produce accurate and relevant responses.

Available Document Retrievers

Dynamiq offers a variety of document retrievers, each with unique features and configurations. Let's explore these options:

Weaviate Retriever

Configuration

  • Name: Provide a name for the retriever.

  • Connection: Establish a connection to Weaviate, a vector database optimized for retrieval.

  • Index Name: Specify the index name for retrieval.

  • Max Documents: Set the maximum number of documents to retrieve.

  • Filters: Apply filters to refine search results.

  • Options:

    • Use hybrid search: Combines keyword search with vector search.

      • Alpha: Sets the weighting between keyword and vector results when hybrid search is enabled.

  • Advanced configuration:

    • Content Key: Specify the custom field name under which content is stored.
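
To make the Alpha setting more concrete, the sketch below blends a keyword score and a vector score with a single weight. It is a simplified illustration, not Dynamiq's or Weaviate's implementation: `hybrid_score`, `rank`, and the sample scores are hypothetical, both scores are assumed to be normalized to the range 0 to 1, and the direction of the scale (0 as keyword-only, 1 as vector-only) is assumed here, so confirm it for the database you use. Real databases may fuse rankings in other ways.

```python
def hybrid_score(keyword_score: float, vector_score: float, alpha: float) -> float:
    """Blend two normalized scores.

    Assumed convention: alpha=0 is keyword-only, alpha=1 is vector-only.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return (1.0 - alpha) * keyword_score + alpha * vector_score


# Hypothetical candidates: (doc id, keyword score, vector score).
candidates = [
    ("doc-a", 0.9, 0.2),  # strong keyword match, weak semantic match
    ("doc-b", 0.3, 0.8),  # weak keyword match, strong semantic match
]


def rank(alpha: float) -> list[str]:
    """Return doc ids ordered by blended score, best first."""
    scored = sorted(
        candidates,
        key=lambda c: hybrid_score(c[1], c[2], alpha),
        reverse=True,
    )
    return [doc_id for doc_id, _, _ in scored]
```

With these sample scores, a low alpha favors the keyword match (`doc-a`) and a high alpha favors the semantic match (`doc-b`).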

Pinecone Retriever

Configuration

  • Name: Provide a name for the retriever.

  • Connection: Connect to Pinecone, a scalable vector database service.

  • Index Name: Specify the index name for retrieval.

  • Namespace: Use namespaces to segment data.

  • Max Documents: Limit the number of documents retrieved.

  • Filters: Use filters to narrow down results.

  • Advanced configuration:

    • Content Key: Specify the custom field name under which content is stored.
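
The way Namespace, Filters, and Max Documents interact can be sketched with a small in-memory stand-in. This is an assumption-laden illustration rather than Pinecone's or Dynamiq's API: the `index` dictionary, the `fetch` function, and the equality-only filter format are all hypothetical, and real filter syntax is richer and database-specific.

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for one index split into namespaces.
index: dict[str, list[dict]] = defaultdict(list)
index["support"].append({"id": "s1", "content": "Reset your password", "lang": "en"})
index["support"].append({"id": "s2", "content": "Passwort zurücksetzen", "lang": "de"})
index["sales"].append({"id": "p1", "content": "Pricing overview", "lang": "en"})


def fetch(namespace: str, filters: dict, max_documents: int) -> list[dict]:
    """Return up to max_documents records from a single namespace.

    A record matches when every filter field equals the given value.
    """
    matches = [
        record
        for record in index.get(namespace, [])
        if all(record.get(field) == value for field, value in filters.items())
    ]
    return matches[:max_documents]


# Only the "support" namespace is searched; "sales" records are never considered.
english_support = fetch("support", {"lang": "en"}, max_documents=5)
```

The namespace narrows the search space first, the filter narrows it further, and the document limit caps whatever remains.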

Chroma Retriever

Configuration

  • Name: Provide a name for the retriever.

  • Connection: Connect to Chroma for managing vector data.

  • Index Name: Specify the index name for retrieval.

  • Max Documents: Define the maximum documents to fetch.

  • Filters: Apply filters for targeted retrieval.

Qdrant Retriever

Configuration

  • Name: Set a name for easy reference.

  • Connection: Establish a connection to Qdrant, a high-performance vector database.

  • Index Name: Specify the index name for retrieval.

  • Max Documents: Specify the maximum number of documents to retrieve.

  • Filters: Use filters to refine search results.

  • Advanced configuration:

    • Content Key: Specify the custom field name under which content is stored.

Milvus Retriever

Configuration

  • Name: Provide a name for the retriever.

  • Connection: Establish a connection to Milvus, a highly performant, scalable vector database.

  • Index Name: Specify the index name for retrieval.

  • Max Documents: Specify the maximum number of documents to retrieve.

  • Filters: Use filters to refine search results.

  • Advanced configuration:

    • Content Key: Specify the name of the storage field that holds the content.

    • Embedding key: Specify the name of the storage field that holds the vector.
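
As a rough illustration of what the Content Key and Embedding key settings control, the sketch below maps a raw stored record to a document by looking up the configured field names. The function `to_document`, the default names `"content"` and `"embedding"`, and the sample record are hypothetical and are not taken from Dynamiq or Milvus.

```python
def to_document(
    record: dict,
    content_key: str = "content",      # assumed default, for illustration only
    embedding_key: str = "embedding",  # assumed default, for illustration only
) -> dict:
    """Map a raw stored record to a document using the configured field names."""
    if content_key not in record:
        raise KeyError(f"record has no field named {content_key!r}")
    reserved = {content_key, embedding_key}
    return {
        "content": record[content_key],
        "embedding": record.get(embedding_key),
        # Every other field is treated as metadata.
        "metadata": {k: v for k, v in record.items() if k not in reserved},
    }


# A collection that stores text under "body" and vectors under "vec".
raw = {"body": "Milvus stores vectors.", "vec": [0.1, 0.2], "source": "faq"}
doc = to_document(raw, content_key="body", embedding_key="vec")
```

If the keys do not match the field names used when the data was written, the retriever has no way to locate the text or the vector, which is why these settings belong with the collection's schema.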

Elasticsearch Retriever

Configuration

  • Name: Provide a name for the retriever.

  • Connection: Establish a connection to Elasticsearch, a distributed search and analytics engine.

  • Index Name: Specify the index name for retrieval.

  • Max Documents: Specify the maximum number of documents to retrieve.

  • Embedding dimension: Specify the dimension of the embeddings stored in the vector store.

  • Filters: Use filters to refine search results.

  • Advanced configuration:

    • Content Key: Specify the name of the storage field that holds the content.

    • Embedding key: Specify the name of the storage field that holds the vector.
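
The Embedding dimension setting matters because a query vector can only be compared with stored vectors of the same length. The check below is a minimal sketch of that constraint; `check_dimension` is a hypothetical helper, not part of Dynamiq or Elasticsearch.

```python
def check_dimension(query_vector: list[float], embedding_dimension: int) -> None:
    """Raise ValueError if the query vector does not match the configured dimension."""
    if len(query_vector) != embedding_dimension:
        raise ValueError(
            f"query vector has {len(query_vector)} values, "
            f"but the index expects {embedding_dimension}"
        )


# Passes silently: a 3-value vector against a 3-dimensional index.
check_dimension([0.1, 0.2, 0.3], embedding_dimension=3)
```

In practice this means the value should match the output size of the embedding model used to index the documents.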

PGvector Retriever

Configuration

  • Name: Provide a name for the retriever.

  • Connection: Establish a connection to pgvector, an open-source vector similarity search extension for Postgres.

  • Index Name: Specify the index name for retrieval.

  • Max Documents: Specify the maximum number of documents to retrieve.

  • Schema name: Enter the name of the schema in the database.

  • Keyword index name: Specify the name for the keyword index.

  • Filters: Use filters to refine search results.

  • Options:

    • Use hybrid search: Combines keyword search with vector search.

      • Alpha: Sets the weighting between keyword and vector results when hybrid search is enabled.

  • Advanced configuration:

    • Content Key: Specify the name of the storage field that holds the content.

    • Embedding key: Specify the name of the storage field that holds the vector.

How to Use Document Retrievers

Input:

  • Provide the query vector to initiate the retrieval process.

Configuration:

  • Select the appropriate retriever based on your retrieval needs.

  • Configure necessary parameters such as connection, index name, and filters.

Output:

  • The retriever fetches relevant documents, making them available for further processing in the RAG application.
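
The input, configuration, and output steps above can be sketched end to end in a few lines. This is a conceptual illustration under stated assumptions, not Dynamiq's implementation: `cosine_similarity`, `retrieve`, and the sample documents are hypothetical, and the sketch scans every document, whereas vector databases typically use approximate nearest-neighbor indexes.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query_vector: list[float], documents: list[dict], max_documents: int) -> list[dict]:
    """Return the max_documents documents most similar to the query vector."""
    ranked = sorted(
        documents,
        key=lambda d: cosine_similarity(query_vector, d["embedding"]),
        reverse=True,
    )
    return ranked[:max_documents]


# Hypothetical stored documents with 2-dimensional embeddings.
documents = [
    {"content": "How to reset a password", "embedding": [1.0, 0.0]},
    {"content": "Pricing tiers", "embedding": [0.0, 1.0]},
    {"content": "Account recovery steps", "embedding": [0.8, 0.6]},
]

# Input: a query vector. Output: the closest documents, best first.
results = retrieve([1.0, 0.0], documents, max_documents=2)
```

The returned documents would then be passed to the next step of the RAG workflow, for example as context for the language model.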

Benefits of Document Retrievers

  • Efficient Retrieval: Quickly accesses relevant data for accurate responses.

  • Scalability: Handles large datasets, supporting extensive knowledge bases.

  • Flexibility: Offers various configurations to suit different retrieval needs.

Used effectively, document retrievers let your RAG application deliver precise, contextually relevant information.
