Embedders & Vector Stores

Eight embedding providers and eight vector stores — the provider/store matrix with writer configuration for each.

Embedders turn text into vectors; vector stores persist them. Every embedding provider ships in two flavors: a document embedder (embeds a list of Document chunks at indexing time) and a text embedder (embeds a single query string at retrieval time). Every store ships a writer node for indexing and a retriever node for search.

Embedding providers

All embedders live in dynamiq.nodes.embedders. Connections read their API keys from environment variables — see Connections & Credentials.

| Provider | Document embedder | Text embedder | Default model |
| --- | --- | --- | --- |
| OpenAI | OpenAIDocumentEmbedder | OpenAITextEmbedder | text-embedding-3-small |
| Cohere | CohereDocumentEmbedder | CohereTextEmbedder | cohere/embed-english-v2.0 |
| AWS Bedrock | BedrockDocumentEmbedder | BedrockTextEmbedder | amazon.titan-embed-text-v1 |
| Mistral | MistralDocumentEmbedder | MistralTextEmbedder | mistral/mistral-embed |
| Gemini | GeminiDocumentEmbedder | GeminiTextEmbedder | gemini/gemini-embedding-exp-03-07 |
| Hugging Face | HuggingFaceDocumentEmbedder | HuggingFaceTextEmbedder | huggingface/BAAI/bge-large-zh (document) / huggingface/microsoft/codebert-base (text) |
| IBM watsonx | WatsonXDocumentEmbedder | WatsonXTextEmbedder | watsonx/ibm/slate-30m-english-rtrvr |
| Vertex AI | VertexAIDocumentEmbedder | VertexAITextEmbedder | vertex_ai/text-embedding-005 |

from dynamiq.connections import OpenAI as OpenAIConnection
from dynamiq.nodes.embedders import OpenAIDocumentEmbedder, OpenAITextEmbedder
from dynamiq.types import Document

connection = OpenAIConnection()  # reads OPENAI_API_KEY

# Indexing side: documents in, documents-with-embeddings out
doc_embedder = OpenAIDocumentEmbedder(connection=connection, model="text-embedding-3-small")
docs = doc_embedder.run(
    input_data={"documents": [Document(content="Machine learning is a branch of AI.")]}
).output["documents"]

# Retrieval side: query in, embedding out
text_embedder = OpenAITextEmbedder(connection=connection, model="text-embedding-3-small")
out = text_embedder.run(input_data={"query": "What is machine learning?"}).output
embedding = out["embedding"]   # list[float]
query = out["query"]           # original string, handy for prompts downstream

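Index and query with the same provider and model. The document embedder defines the vector space, and the text embedder must produce query vectors in that same space. If you switch models, re-index. The Hugging Face defaults in the table above differ between the document and text embedders, so set model explicitly on both.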
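To see why the two embedders must share a vector space, it helps to look at how a query vector is typically scored against a stored one. The sketch below is plain Python, not Dynamiq code: cosine_similarity is a hypothetical helper and the short vectors are made-up values standing in for real embeddings. It refuses to compare vectors of different lengths, which is the most visible symptom of a model mismatch.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two vectors; raises if their dimensions differ."""
    if len(a) != len(b):
        raise ValueError(
            f"dimension mismatch: {len(a)} vs {len(b)} "
            "(were the index and the query embedded with different models?)"
        )
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


# Toy values standing in for real embeddings (assumption, for illustration only)
doc_vector = [1.0, 0.0, 0.0]
query_same_model = [1.0, 1.0, 0.0]
query_other_model = [1.0, 0.0]  # a different model with a different output size

score = cosine_similarity(doc_vector, query_same_model)

try:
    cosine_similarity(doc_vector, query_other_model)
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
```

Two different models that happen to share an output size would not raise here. The scores would simply be meaningless, which is why re-indexing after a model change matters even when dimension stays the same.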

Vector stores

Writers live in dynamiq.nodes.writers, and the underlying store clients live in dynamiq.storages.vector. Each store has a writer node and a retriever node:

| Store | Writer node | Retriever node |
| --- | --- | --- |
| Pinecone | PineconeDocumentWriter | PineconeDocumentRetriever |
| Weaviate | WeaviateDocumentWriter | WeaviateDocumentRetriever |
| Qdrant | QdrantDocumentWriter | QdrantDocumentRetriever |
| Milvus | MilvusDocumentWriter | MilvusDocumentRetriever |
| Chroma | ChromaDocumentWriter | ChromaDocumentRetriever |
| Elasticsearch | ElasticsearchDocumentWriter | ElasticsearchDocumentRetriever |
| OpenSearch | OpenSearchDocumentWriter | OpenSearchDocumentRetriever |
| pgvector | PGVectorDocumentWriter | PGVectorDocumentRetriever |

Writers take documents (already embedded) as input and report upserted_count in their output. Set create_if_not_exist=True to create the index programmatically.
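Because writers expect documents that already carry embeddings, a pre-write check can catch chunks that skipped the embedder. The following is a minimal sketch in plain Python. Doc is a stand-in dataclass, not dynamiq.types.Document, and its field names as well as the split_ready helper are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Doc:
    """Stand-in for a document; NOT dynamiq.types.Document (field names assumed)."""
    content: str
    embedding: Optional[list[float]] = None


def split_ready(docs: list[Doc], dimension: int) -> tuple[list[Doc], list[Doc]]:
    """Split docs into (ready to write, rejected) by embedding presence and size."""
    ready: list[Doc] = []
    rejected: list[Doc] = []
    for doc in docs:
        if doc.embedding is not None and len(doc.embedding) == dimension:
            ready.append(doc)
        else:
            rejected.append(doc)
    return ready, rejected


docs = [
    Doc("embedded", embedding=[0.1, 0.2, 0.3]),
    Doc("never embedded"),
    Doc("wrong size", embedding=[0.1, 0.2]),
]
ready, rejected = split_ready(docs, dimension=3)
```

Only the documents in ready would be handed to a writer; the rejected list can be logged or sent back through the document embedder.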

Pinecone

Serverless deployment:

from dynamiq.connections import Pinecone as PineconeConnection
from dynamiq.nodes.writers import PineconeDocumentWriter

writer = PineconeDocumentWriter(
    connection=PineconeConnection(),
    index_name="quickstart",
    dimension=1536,
    create_if_not_exist=True,
    index_type="serverless",
    cloud="aws",
    region="us-east-1",
)

Pod-based deployment:

writer = PineconeDocumentWriter(
    connection=PineconeConnection(),
    index_name="quickstart",
    dimension=1536,
    create_if_not_exist=True,
    index_type="pod",
    environment="us-west1-gcp",
    pod_type="p1.x1",
    pods=1,
)
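The two configurations above differ only in their deployment fields, so the choice can be factored into a small helper. This is a sketch under stated assumptions: pinecone_writer_kwargs is a hypothetical function, and treating cloud/region and environment/pod_type/pods as required is inferred from the two examples rather than taken from the library, which may define defaults.

```python
def pinecone_writer_kwargs(
    index_name: str, dimension: int, index_type: str, **deployment
) -> dict:
    """Build keyword arguments for a Pinecone writer, checking deployment fields."""
    # Assumption: required fields per index type, inferred from the examples above.
    required = {
        "serverless": ("cloud", "region"),
        "pod": ("environment", "pod_type", "pods"),
    }
    if index_type not in required:
        raise ValueError(f"unknown index_type: {index_type!r}")
    missing = [key for key in required[index_type] if key not in deployment]
    if missing:
        raise ValueError(f"{index_type} index is missing: {', '.join(missing)}")
    return {
        "index_name": index_name,
        "dimension": dimension,
        "create_if_not_exist": True,
        "index_type": index_type,
        # Keep only the fields that belong to this deployment type.
        **{key: deployment[key] for key in required[index_type]},
    }


serverless = pinecone_writer_kwargs(
    "quickstart", 1536, "serverless", cloud="aws", region="us-east-1"
)
pod = pinecone_writer_kwargs(
    "quickstart", 1536, "pod", environment="us-west1-gcp", pod_type="p1.x1", pods=1
)
# Intended use (not executed here):
# writer = PineconeDocumentWriter(connection=PineconeConnection(), **serverless)
```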

Elasticsearch

from dynamiq.connections import Elasticsearch as ElasticsearchConnection
from dynamiq.nodes.writers import ElasticsearchDocumentWriter

writer = ElasticsearchDocumentWriter(
    connection=ElasticsearchConnection(
        url="https://localhost:9200",
        api_key="your-api-key",
    ),
    index_name="documents",
    dimension=1536,
    similarity="cosine",
)

For Elastic Cloud, authenticate with username, password, and cloud_id on the connection instead of url/api_key, and optionally pass index_settings / mapping_settings dicts when creating the index.
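The two authentication styles can be kept apart with a small guard before the connection is built. The sketch below only assembles keyword arguments using the field names mentioned above (url, api_key, username, password, cloud_id). The helper elasticsearch_connection_kwargs is hypothetical, and rejecting mixed combinations is an assumption of this sketch, not documented library behavior.

```python
from typing import Optional


def elasticsearch_connection_kwargs(
    *,
    url: Optional[str] = None,
    api_key: Optional[str] = None,
    username: Optional[str] = None,
    password: Optional[str] = None,
    cloud_id: Optional[str] = None,
) -> dict[str, str]:
    """Return connection kwargs for exactly one of the two auth styles."""
    fields = {
        "url": url,
        "api_key": api_key,
        "username": username,
        "password": password,
        "cloud_id": cloud_id,
    }
    given = {name: value for name, value in fields.items() if value is not None}
    # Assumption: the two styles are mutually exclusive.
    if set(given) in ({"url", "api_key"}, {"username", "password", "cloud_id"}):
        return given
    raise ValueError(
        "provide either url + api_key, or username + password + cloud_id"
    )


direct = elasticsearch_connection_kwargs(
    url="https://localhost:9200", api_key="your-api-key"
)
cloud = elasticsearch_connection_kwargs(
    username="elastic", password="your-password", cloud_id="your-cloud-id"
)
# Intended use (not executed here):
# connection = ElasticsearchConnection(**cloud)
```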

Weaviate

from dynamiq.nodes.writers import WeaviateDocumentWriter
from dynamiq.storages.vector import WeaviateVectorStore

writer = WeaviateDocumentWriter(
    vector_store=WeaviateVectorStore(index_name="Documents", create_if_not_exist=True)
)

Any writer can be built either from a connection (the node constructs the store) or from a prebuilt vector_store instance, as shown here.

A complete embed-and-store fragment

from dynamiq import Workflow
from dynamiq.connections import OpenAI as OpenAIConnection, Pinecone as PineconeConnection
from dynamiq.nodes.embedders import OpenAIDocumentEmbedder
from dynamiq.nodes.writers import PineconeDocumentWriter
from dynamiq.types import Document

wf = Workflow()

embedder = OpenAIDocumentEmbedder(
    connection=OpenAIConnection(), model="text-embedding-3-small"
)
writer = (
    PineconeDocumentWriter(
        connection=PineconeConnection(),
        index_name="quickstart",
        dimension=1536,
        create_if_not_exist=True,
        index_type="serverless",
        cloud="aws",
        region="us-east-1",
    )
    .inputs(documents=embedder.outputs.documents)
    .depends_on(embedder)
)
wf.flow.add_nodes(embedder, writer)

result = wf.run(
    input_data={
        "documents": [
            Document(content="Dynamiq is an operating platform for agentic AI."),
        ]
    }
)
print(result.output[writer.id]["output"]["upserted_count"])  # 1

dimension must match the embedding model's output size: text-embedding-3-small produces 1536-dimensional vectors, which is why the writer sets dimension=1536. On the platform, the same writers back Knowledge Base storage (see Vector Store vs Knowledge Base).
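One way to keep the writer's dimension in step with the embedder is to derive both from a single model name. The sketch below is plain Python with a hypothetical dimension_for helper. The table covers a few OpenAI models at their default output sizes and is not exhaustive, so other providers would need their own entries.

```python
# Default output sizes for a few OpenAI embedding models (not exhaustive).
EMBEDDING_DIMENSIONS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
    "text-embedding-ada-002": 1536,
}


def dimension_for(model: str) -> int:
    """Look up the vector size for a model, failing loudly if it is unknown."""
    try:
        return EMBEDDING_DIMENSIONS[model]
    except KeyError:
        raise ValueError(
            f"unknown embedding model {model!r}; add its output size to the table"
        ) from None


MODEL = "text-embedding-3-small"
writer_dimension = dimension_for(MODEL)
# Intended use (not executed here): pass model=MODEL to the embedder and
# dimension=writer_dimension to the writer so the two cannot drift apart.
```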
