Document retrievers

In the inference workflow of a Retrieval-Augmented Generation (RAG) application, document retrievers query a vector store for the stored documents most relevant to a query. The retrieved documents give the language model the context it needs to produce accurate, relevant responses.

Available Document Retrievers

Dynamiq offers a variety of document retrievers, each with unique features and configurations. Let's explore these options:

Weaviate Retriever

Configuration

Name: Provide a name for the retriever.
Connection: Establish a connection to Weaviate, a vector database optimized for retrieval.
Index Name: Specify the index name for retrieval.
Max Documents: Set the maximum number of documents to retrieve.
Filters: Apply filters to refine search results.
Options:
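- Use hybrid search: Enables hybrid search, which combines keyword and vector search.
  - Alpha: Adjusts the balance between keyword and vector search. In Weaviate, 0 is pure keyword search and 1 is pure vector search.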
Advanced configuration:
- Content Key: Specify the custom field name under which content is stored.
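
The Alpha setting can be pictured as a weighted blend of two relevance scores. The sketch below is a simplified illustration, not the retriever's actual fusion method: it assumes both scores are already normalized to the 0 to 1 range and follows the Weaviate convention that an alpha of 1 means vector-only and 0 means keyword-only. The function names `hybrid_score` and `rank_hybrid` are hypothetical.

```python
def hybrid_score(vector_score: float, keyword_score: float, alpha: float) -> float:
    """Blend two normalized scores: alpha=1.0 is vector-only, alpha=0.0 is keyword-only."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return alpha * vector_score + (1.0 - alpha) * keyword_score


def rank_hybrid(candidates, alpha):
    """Sort (doc_id, vector_score, keyword_score) tuples by blended score, best first."""
    return sorted(
        candidates,
        key=lambda c: hybrid_score(c[1], c[2], alpha),
        reverse=True,
    )


# Hypothetical candidates: doc-a is a strong semantic match, doc-b a strong keyword match.
candidates = [
    ("doc-a", 0.9, 0.1),
    ("doc-b", 0.4, 0.8),
]

vector_heavy = rank_hybrid(candidates, alpha=0.75)   # favors doc-a
keyword_heavy = rank_hybrid(candidates, alpha=0.25)  # favors doc-b
```

Raising alpha favors documents that are semantically close to the query, while lowering it favors documents that share exact terms with the query.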

Pinecone Retriever

Configuration

Name: Provide a name for the retriever.
Connection: Connect to Pinecone, a scalable vector database service.
Index Name: Specify the index name for retrieval.
Namespace: Use namespaces to segment data.
Max Documents: Limit the number of documents retrieved.
Filters: Use filters to narrow down results.
Advanced configuration:
- Content Key: Specify the custom field name under which content is stored.
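
To show how Namespace, Filters, and Max Documents narrow a result set together, here is a minimal in-memory sketch. It borrows Pinecone-style metadata operators (`$eq`, `$ne`, `$in`, `$gte`) purely for illustration; the filter schema Dynamiq itself expects is not described on this page. The record shape and the `matches` and `select` helpers are assumptions.

```python
# A small subset of Pinecone-style comparison operators, for illustration only.
OPERATORS = {
    "$eq": lambda value, expected: value == expected,
    "$ne": lambda value, expected: value != expected,
    "$in": lambda value, expected: value in expected,
    "$gte": lambda value, expected: value is not None and value >= expected,
}


def matches(metadata: dict, filters: dict) -> bool:
    """Return True when metadata satisfies every condition in filters."""
    for field_name, condition in filters.items():
        value = metadata.get(field_name)
        # A bare value is shorthand for an equality check.
        if not isinstance(condition, dict):
            condition = {"$eq": condition}
        for op, expected in condition.items():
            if op not in OPERATORS:
                raise ValueError(f"unsupported operator: {op}")
            if not OPERATORS[op](value, expected):
                return False
    return True


def select(records, namespace, filters, max_documents):
    """Keep records in the namespace that pass the filters, best score first, capped."""
    hits = [
        r for r in records
        if r["namespace"] == namespace and matches(r["metadata"], filters)
    ]
    hits.sort(key=lambda r: r["score"], reverse=True)
    return hits[:max_documents]


# Hypothetical records with precomputed similarity scores.
records = [
    {"id": "a", "namespace": "docs", "metadata": {"lang": "en", "year": 2023}, "score": 0.91},
    {"id": "b", "namespace": "docs", "metadata": {"lang": "de", "year": 2024}, "score": 0.88},
    {"id": "c", "namespace": "docs", "metadata": {"lang": "en", "year": 2021}, "score": 0.95},
    {"id": "d", "namespace": "blog", "metadata": {"lang": "en", "year": 2024}, "score": 0.99},
    {"id": "e", "namespace": "docs", "metadata": {"lang": "en", "year": 2024}, "score": 0.70},
]

filters = {"lang": "en", "year": {"$gte": 2022}}
top = select(records, namespace="docs", filters=filters, max_documents=5)
```

Note that record `d` has the highest score but is excluded because it lives in a different namespace, and record `c` is excluded by the year filter.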

Chroma Retriever

Configuration

Name: Provide a name for the retriever.
Connection: Connect to Chroma for managing vector data.
Index Name: Specify the index name for retrieval.
Max Documents: Define the maximum documents to fetch.
Filters: Apply filters for targeted retrieval.

Qdrant Retriever

Configuration

Name: Set a name for easy reference.
Connection: Establish a connection to Qdrant, a high-performance vector database.
Index Name: Specify the index name for retrieval.
Max Documents: Specify the maximum number of documents to retrieve.
Filters: Use filters to refine search results.
Advanced configuration:
- Content Key: Specify the custom field name under which content is stored.

Milvus Retriever

Configuration

Name: Provide a name for the retriever.
Connection: Establish a connection to Milvus, a highly performant, scalable vector database.
Index Name: Specify the index name for retrieval.
Max Documents: Specify the maximum number of documents to retrieve.
Filters: Use filters to refine search results.
Advanced configuration:
- Content Key: Specify a unique name for the storage field that holds the content.
- Embedding Key: Specify a unique name for the storage field that holds the vector.
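
The Content Key and Embedding Key settings tell the retriever which stored fields hold the text and the vector. The sketch below illustrates that mapping with a hypothetical `Document` class and `to_document` helper; the default key names `content` and `embedding` are assumptions for the example, not confirmed Dynamiq defaults.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    content: str
    embedding: list
    metadata: dict = field(default_factory=dict)


def to_document(record: dict, content_key: str = "content",
                embedding_key: str = "embedding") -> Document:
    """Build a Document from a raw stored record using configurable field names."""
    missing = [k for k in (content_key, embedding_key) if k not in record]
    if missing:
        raise KeyError(f"record is missing configured field(s): {missing}")
    # Everything that is not content or vector is treated as metadata.
    metadata = {
        k: v for k, v in record.items()
        if k not in (content_key, embedding_key)
    }
    return Document(
        content=record[content_key],
        embedding=list(record[embedding_key]),
        metadata=metadata,
    )


# A stored record whose fields use custom names ("body" and "vec").
record = {"body": "Retrievers fetch documents.", "vec": [0.1, 0.2], "source": "faq"}
doc = to_document(record, content_key="body", embedding_key="vec")
```

If the configured keys do not match the field names used when the data was written, the lookup fails, which is why these settings should mirror the ones used at indexing time.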

Elasticsearch Retriever

Configuration

Name: Provide a name for the retriever.
Connection: Establish a connection to Elasticsearch, a distributed search and analytics engine.
Index Name: Specify the index name for retrieval.
Max Documents: Specify the maximum number of documents to retrieve.
Embedding Dimension: Specify the dimension of the embeddings in the vector store.
Filters: Use filters to refine search results.
Advanced configuration:
- Content Key: Specify a unique name for the storage field that holds the content.
- Embedding Key: Specify a unique name for the storage field that holds the vector.
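
The embedding dimension must agree with the length of the query vector, since vectors of different lengths cannot be compared. As a minimal sketch of that constraint, the hypothetical helper below rejects a query vector whose length differs from the configured dimension; it does not reflect how the Elasticsearch retriever performs this check internally.

```python
def validate_query_vector(query_vector: list, embedding_dimension: int) -> list:
    """Raise if the query vector length differs from the configured dimension."""
    if len(query_vector) != embedding_dimension:
        raise ValueError(
            f"query vector has {len(query_vector)} dimensions, "
            f"but the index is configured for {embedding_dimension}"
        )
    return query_vector


# A 3-dimensional vector passes when the index is configured for 3 dimensions.
checked = validate_query_vector([0.1, 0.2, 0.3], embedding_dimension=3)
```

A mismatch usually means the query was embedded with a different model than the one used to index the documents.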

PGvector Retriever

Configuration

Name: Provide a name for the retriever.
Connection: Establish a connection to pgvector, an open-source vector similarity search extension for Postgres.
Index Name: Specify the index name for retrieval.
Max Documents: Specify the maximum number of documents to retrieve.
Schema Name: Enter the name of the schema in the database.
Keyword Index Name: Specify the name for the keyword index.
Filters: Use filters to refine search results.
Options:
- Use hybrid search: Enables hybrid search, which combines keyword and vector search.
  - Alpha: Adjusts the balance between keyword and vector search.
Advanced configuration:
- Content Key: Specify a unique name for the storage field that holds the content.
- Embedding Key: Specify a unique name for the storage field that holds the vector.

How to Use Document Retrievers

Input:

Provide the query vector (the embedding of the user's query) to initiate the retrieval process.

Configuration:

Select the appropriate retriever based on your retrieval needs.
Configure necessary parameters such as connection, index name, and filters.

Output:

The retriever fetches relevant documents, making them available for further processing in the RAG application.
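
The input, configuration, and output steps above can be sketched end to end with a tiny in-memory index. This is an illustration under stated assumptions, not Dynamiq's implementation: the `retrieve` function, the `documents` output key, the sample index, and the use of cosine similarity are all hypothetical choices made for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query_vector, index, max_documents=2):
    """Return the max_documents entries of index most similar to query_vector."""
    scored = [
        {
            "content": item["content"],
            "score": cosine_similarity(query_vector, item["embedding"]),
        }
        for item in index
    ]
    scored.sort(key=lambda d: d["score"], reverse=True)
    return {"documents": scored[:max_documents]}


# Hypothetical index with toy 3-dimensional embeddings.
index = [
    {"content": "Pinecone supports namespaces.", "embedding": [1.0, 0.0, 0.0]},
    {"content": "Weaviate supports hybrid search.", "embedding": [0.0, 1.0, 0.0]},
    {"content": "Qdrant stores vectors with payloads.", "embedding": [0.7, 0.7, 0.0]},
]

# Input: the query vector. Output: ranked documents for the next RAG step.
result = retrieve([0.9, 0.1, 0.0], index, max_documents=2)
context = "\n".join(doc["content"] for doc in result["documents"])
```

The `context` string is the kind of value a downstream prompt or LLM step would consume.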

Benefits of Document Retrievers

Efficient Retrieval: Quickly accesses relevant data for accurate responses.
Scalability: Handles large datasets, supporting extensive knowledge bases.
Flexibility: Offers various configurations to suit different retrieval needs.

By effectively utilizing document retrievers, your RAG application can deliver precise and contextually relevant information efficiently.
