Retrievers & Rankers
Query vector stores with per-store retriever nodes, bundle retrieval into an agent tool, and re-rank results with Cohere, an LLM, or time weighting.
Retrievers (dynamiq.nodes.retrievers) search a vector store for the chunks closest to a query embedding. Rankers (dynamiq.nodes.rankers) reorder and trim those results before they reach the LLM. This page covers both, plus VectorStoreRetriever — the composite tool that packages embed → retrieve → re-rank for agents.
Retriever nodes
One retriever per store, all with the same shape: ChromaDocumentRetriever, ElasticsearchDocumentRetriever, MilvusDocumentRetriever, OpenSearchDocumentRetriever, PGVectorDocumentRetriever, PineconeDocumentRetriever, QdrantDocumentRetriever, WeaviateDocumentRetriever.
from dynamiq.connections import Pinecone as PineconeConnection
from dynamiq.nodes.retrievers import PineconeDocumentRetriever
retriever = PineconeDocumentRetriever(
    connection=PineconeConnection(),
    index_name="quickstart",
    top_k=5,
)
Configuration shared by all retrievers: top_k (int), filters (dict), and similarity_threshold (float).
At run time a retriever takes the query embedding (produced by a text embedder) and optional per-call overrides for top_k, filters, and similarity_threshold; it returns documents, each carrying content, metadata, and a similarity score. The RAG Pipeline page shows the full embedder → retriever → LLM wiring.
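To make the three shared settings concrete, here is a minimal in-memory sketch of what a retriever does conceptually. It is not Dynamiq's implementation and does not call any vector store: the Doc class, the cosine and retrieve helpers, and the sample data are all hypothetical. It also assumes that filters is an exact-match metadata dict and that similarity_threshold is a minimum cosine similarity; real stores may support richer filter syntax and other metrics.

```python
from dataclasses import dataclass, field
from math import sqrt


@dataclass
class Doc:
    # Hypothetical stand-in for a stored chunk; not the library's Document type.
    content: str
    embedding: list
    metadata: dict = field(default_factory=dict)
    score: float = 0.0


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_embedding, docs, top_k=5, filters=None, similarity_threshold=None):
    """Toy retriever: metadata filter, score cut-off, then keep the top_k best."""
    filters = filters or {}
    hits = []
    for doc in docs:
        # Assumption: a filter is a plain key/value match on metadata.
        if any(doc.metadata.get(key) != value for key, value in filters.items()):
            continue
        score = cosine(query_embedding, doc.embedding)
        # Assumption: the threshold is a lower bound on similarity.
        if similarity_threshold is not None and score < similarity_threshold:
            continue
        hits.append(Doc(doc.content, doc.embedding, doc.metadata, score))
    hits.sort(key=lambda d: d.score, reverse=True)
    return hits[:top_k]


docs = [
    Doc("Refunds for digital goods", [1.0, 0.0], {"lang": "en"}),
    Doc("Shipping times", [0.0, 1.0], {"lang": "en"}),
    Doc("Remboursement des biens", [0.9, 0.1], {"lang": "fr"}),
]
hits = retrieve([1.0, 0.0], docs, top_k=2, filters={"lang": "en"}, similarity_threshold=0.5)
print([(d.content, round(d.score, 2)) for d in hits])
```

Only the first document survives here: the French one is removed by the filter, and "Shipping times" falls below the threshold before top_k is ever applied.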
Stores with hybrid search (for example Weaviate and Elasticsearch) also accept the raw query string and an alpha parameter that blends keyword and vector scoring (0 = pure keyword, 1 = pure vector).
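The effect of alpha can be sketched as a simple weighted average of two scores. This is an illustration only: each store applies its own fusion and normalisation method, and the hybrid_score and rank helpers and the score values below are hypothetical. The sketch assumes both scores are already normalised to the range 0 to 1.

```python
def hybrid_score(keyword_score, vector_score, alpha):
    """Blend two normalised scores: alpha=0 is pure keyword, alpha=1 is pure vector."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return (1.0 - alpha) * keyword_score + alpha * vector_score


# Hypothetical (keyword_score, vector_score) pairs for two candidate documents.
candidates = {
    "exact-term match": (0.9, 0.3),
    "paraphrase": (0.2, 0.8),
}


def rank(alpha):
    """Order the candidate names by blended score, best first."""
    return sorted(
        candidates,
        key=lambda name: hybrid_score(*candidates[name], alpha),
        reverse=True,
    )


print(rank(0.0))  # keyword scoring only
print(rank(1.0))  # vector scoring only
print(rank(0.5))  # equal blend
```

With these sample scores the exact-term match wins at alpha=0 and at alpha=0.5, while the paraphrase wins at alpha=1, which is the kind of trade-off the parameter is meant to expose.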
VectorStoreRetriever: RAG as an agent tool
VectorStoreRetriever bundles a text embedder, a store retriever, and an optional reranker behind a single query interface. Because it is a tool-group node, you can hand it to an agent:
from dynamiq.connections import (
    Cohere as CohereConnection,
    OpenAI as OpenAIConnection,
    Pinecone as PineconeConnection,
)
from dynamiq.nodes.agents import Agent
from dynamiq.nodes.embedders import OpenAITextEmbedder
from dynamiq.nodes.llms import OpenAI
from dynamiq.nodes.rankers import CohereReranker
from dynamiq.nodes.retrievers import PineconeDocumentRetriever
from dynamiq.nodes.retrievers.retriever import VectorStoreRetriever
rag_tool = VectorStoreRetriever(
    name="knowledge-search",
    text_embedder=OpenAITextEmbedder(
        connection=OpenAIConnection(), model="text-embedding-3-small"
    ),
    document_retriever=PineconeDocumentRetriever(
        connection=PineconeConnection(), index_name="quickstart", top_k=20
    ),
    document_reranker=CohereReranker(connection=CohereConnection(), top_k=5),
)
llm = OpenAI(connection=OpenAIConnection(), model="gpt-4o")
agent = Agent(
    name="kb-agent",
    llm=llm,
    tools=[rag_tool],
    role="Answer questions using the knowledge-search tool and cite the sources you used.",
    max_loops=6,
)
result = agent.run(input_data={"input": "What does our refund policy say about digital goods?"})
print(result.output["content"])
A common pattern: retrieve generously (top_k=20 on the retriever), then let the reranker keep the best 5. The counterpart for writes is VectorStoreWriter (dynamiq.nodes.writers.writer), which pairs a document embedder with a store writer so an agent can persist new documents.
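The retrieve-generously-then-rerank pattern can be shown without any external service. The sketch below is a toy: word_overlap stands in for cheap vector similarity and phrase_match stands in for a more precise reranker. Both scorers, the retrieve_then_rerank helper, and the corpus are hypothetical and unrelated to how Dynamiq or Cohere score documents.

```python
def retrieve_then_rerank(query, corpus, cheap_score, precise_score, retrieve_k=20, keep_k=5):
    """Stage 1: wide, cheap recall. Stage 2: precise ordering of the shortlist."""
    shortlist = sorted(corpus, key=lambda text: cheap_score(query, text), reverse=True)[:retrieve_k]
    return sorted(shortlist, key=lambda text: precise_score(query, text), reverse=True)[:keep_k]


def word_overlap(query, text):
    """Cheap stand-in for vector similarity: number of shared words."""
    return len(set(query.lower().split()) & set(text.lower().split()))


def phrase_match(query, text):
    """Stand-in for a reranker: exact phrase first, word overlap as tie-breaker."""
    return (query.lower() in text.lower(), word_overlap(query, text))


corpus = [
    "policy on refund timing",
    "our refund policy covers digital goods",
    "digital goods catalogue",
    "office opening hours",
]
top = retrieve_then_rerank(
    "refund policy", corpus, word_overlap, phrase_match, retrieve_k=3, keep_k=1
)
print(top)
```

The first two documents tie under the cheap scorer, so the first stage alone cannot separate them; the second stage promotes the one that contains the exact phrase. The wide first stage exists so that such a document is still in the shortlist when the precise scorer runs.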
Rankers
All three rankers take query + documents (the TimeWeightedDocumentRanker only needs documents) and return a reordered, trimmed documents list, so they slot between a retriever and an LLM — or into document_reranker above.
CohereReranker
Cross-encoder re-ranking through Cohere's rerank API:
from dynamiq.connections import Cohere
from dynamiq.nodes.rankers import CohereReranker
from dynamiq.types import Document
ranker = CohereReranker(connection=Cohere()) # reads COHERE_API_KEY
output = ranker.run(
    input_data={
        "query": "What is machine learning?",
        "documents": [
            Document(content="Machine learning is a branch of AI...", score=0.8),
            Document(content="Deep learning is a subset of machine learning...", score=0.7),
        ],
    }
)
print(output.output["documents"])
Configuration: model (str), top_k (int), threshold (float).
LLMDocumentRanker
Uses any LLM node to judge relevance — no extra vendor, fully customizable via prompt_template:
from dynamiq.connections import OpenAI as OpenAIConnection
from dynamiq.nodes.llms import OpenAI
from dynamiq.nodes.rankers import LLMDocumentRanker
ranker = LLMDocumentRanker(
    llm=OpenAI(connection=OpenAIConnection(), model="gpt-4o-mini"),
    top_k=5,
)
TimeWeightedDocumentRanker
Boosts recent documents based on a date stored in metadata — useful for news, tickets, and logs:
from dynamiq.nodes.rankers import TimeWeightedDocumentRanker
ranker = TimeWeightedDocumentRanker(
    top_k=5,
    date_field="date",  # metadata key holding the date
    date_format="%d %B, %Y",
    max_days=3600,  # age horizon for the decay
    min_coefficient=0.9,  # floor for the recency multiplier
)
Choosing a ranker
| Ranker | Best when | Cost |
|---|---|---|
| CohereReranker | You want the strongest general-purpose relevance and already use Cohere | Per-call API usage |
| LLMDocumentRanker | You need custom relevance criteria or want to stay on one provider | LLM tokens |
| TimeWeightedDocumentRanker | Freshness matters as much as similarity | Free (pure computation) |
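As a rough idea of what the pure computation behind time weighting looks like, the sketch below multiplies each score by a recency factor. The linear decay from 1.0 down to min_coefficient over max_days is an assumption made for illustration; the exact formula used by TimeWeightedDocumentRanker may differ. The recency_multiplier and time_weighted helpers, the dict-based documents, and the explicit today argument (used to keep the result deterministic) are hypothetical.

```python
from datetime import datetime


def recency_multiplier(doc_date, today, max_days, min_coefficient):
    """Assumed linear decay: 1.0 for today, min_coefficient at max_days or older."""
    age_days = max((today - doc_date).days, 0)
    fraction = min(age_days / max_days, 1.0)
    return 1.0 - (1.0 - min_coefficient) * fraction


def time_weighted(docs, today, date_field="date", date_format="%d %B, %Y",
                  max_days=3600, min_coefficient=0.9, top_k=5):
    """Scale each score by its recency multiplier, then sort and trim."""
    rescored = []
    for doc in docs:
        doc_date = datetime.strptime(doc["metadata"][date_field], date_format)
        factor = recency_multiplier(doc_date, today, max_days, min_coefficient)
        rescored.append({**doc, "score": doc["score"] * factor})
    rescored.sort(key=lambda d: d["score"], reverse=True)
    return rescored[:top_k]


docs = [
    {"content": "old but similar", "score": 0.80, "metadata": {"date": "01 January, 2015"}},
    {"content": "fresh", "score": 0.75, "metadata": {"date": "01 January, 2025"}},
]
ranked = time_weighted(docs, today=datetime(2025, 1, 11))
print([d["content"] for d in ranked])
```

Under these assumptions the ten-year-old document is scaled to the 0.9 floor (0.80 becomes 0.72), while the ten-day-old one keeps almost all of its 0.75, so the fresher document ranks first despite its lower similarity.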