Embedders & Vector Stores

Eight embedding providers and eight vector stores: the provider/store matrix, with writer configuration examples for Pinecone, Elasticsearch, and Weaviate.

Embedders turn text into vectors; vector stores persist them. Every embedding provider ships in two flavors: a document embedder (embeds a list of Document chunks at indexing time) and a text embedder (embeds a single query string at retrieval time). Every store ships a writer node for indexing and a retriever node for search.

Embedding providers

All embedders live in dynamiq.nodes.embedders. Connections read their API keys from environment variables — see Connections & Credentials.

Provider	Document embedder	Text embedder	Default model
OpenAI	`OpenAIDocumentEmbedder`	`OpenAITextEmbedder`	`text-embedding-3-small`
Cohere	`CohereDocumentEmbedder`	`CohereTextEmbedder`	`cohere/embed-english-v2.0`
AWS Bedrock	`BedrockDocumentEmbedder`	`BedrockTextEmbedder`	`amazon.titan-embed-text-v1`
Mistral	`MistralDocumentEmbedder`	`MistralTextEmbedder`	`mistral/mistral-embed`
Gemini	`GeminiDocumentEmbedder`	`GeminiTextEmbedder`	`gemini/gemini-embedding-exp-03-07`
Hugging Face	`HuggingFaceDocumentEmbedder`	`HuggingFaceTextEmbedder`	`huggingface/BAAI/bge-large-zh` (document) / `huggingface/microsoft/codebert-base` (text)
IBM watsonx	`WatsonXDocumentEmbedder`	`WatsonXTextEmbedder`	`watsonx/ibm/slate-30m-english-rtrvr`
Vertex AI	`VertexAIDocumentEmbedder`	`VertexAITextEmbedder`	`vertex_ai/text-embedding-005`

from dynamiq.connections import OpenAI as OpenAIConnection
from dynamiq.nodes.embedders import OpenAIDocumentEmbedder, OpenAITextEmbedder
from dynamiq.types import Document

connection = OpenAIConnection()  # reads OPENAI_API_KEY

# Indexing side: documents in, documents-with-embeddings out
doc_embedder = OpenAIDocumentEmbedder(connection=connection, model="text-embedding-3-small")
docs = doc_embedder.run(
    input_data={"documents": [Document(content="Machine learning is a branch of AI.")]}
).output["documents"]

# Retrieval side: query in, embedding out
text_embedder = OpenAITextEmbedder(connection=connection, model="text-embedding-3-small")
out = text_embedder.run(input_data={"query": "What is machine learning?"}).output
embedding = out["embedding"]   # list[float]
query = out["query"]           # original string, handy for prompts downstream

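Index and query with the same provider and model. The document embedder defines the vector space, and the text embedder must produce query vectors in that same space. If you switch models, re-index. The Hugging Face defaults in the table above differ between the document and text embedder, so set model explicitly on both.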
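
To see why a model mismatch breaks retrieval, here is a minimal sketch of the similarity check a store performs, written in plain Python rather than against any Dynamiq API. The function name is illustrative. The length check catches only the obvious case: two different models with the same output size still produce incompatible vectors, and no runtime check will flag that.

```python
import math


def cosine_similarity(query_vec: list[float], doc_vec: list[float]) -> float:
    """Cosine similarity between two embeddings of the same dimension."""
    if len(query_vec) != len(doc_vec):
        # Different lengths mean the two vectors cannot share a vector space.
        raise ValueError(
            f"dimension mismatch: query has {len(query_vec)}, "
            f"document has {len(doc_vec)}"
        )
    dot = sum(q * d for q, d in zip(query_vec, doc_vec))
    norm = math.sqrt(sum(q * q for q in query_vec)) * math.sqrt(
        sum(d * d for d in doc_vec)
    )
    # A zero vector has no direction; treat it as unrelated to everything.
    return dot / norm if norm else 0.0
```

Comparing the embedding returned by the text embedder with the embedding on any indexed document this way is a quick sanity check after changing models.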

Vector stores

Writers live in dynamiq.nodes.writers and retrievers in dynamiq.nodes.retrievers; the underlying store clients live in dynamiq.storages.vector:

Store	Writer node	Retriever node
Pinecone	`PineconeDocumentWriter`	`PineconeDocumentRetriever`
Weaviate	`WeaviateDocumentWriter`	`WeaviateDocumentRetriever`
Qdrant	`QdrantDocumentWriter`	`QdrantDocumentRetriever`
Milvus	`MilvusDocumentWriter`	`MilvusDocumentRetriever`
Chroma	`ChromaDocumentWriter`	`ChromaDocumentRetriever`
Elasticsearch	`ElasticsearchDocumentWriter`	`ElasticsearchDocumentRetriever`
OpenSearch	`OpenSearchDocumentWriter`	`OpenSearchDocumentRetriever`
pgvector	`PGVectorDocumentWriter`	`PGVectorDocumentRetriever`

Writers take already-embedded documents as input and report upserted_count in their output. Set create_if_not_exist=True to have the writer create the index when it does not already exist.
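
Because writers expect embeddings to be present already, it can help to check a batch before handing it to a writer. The helper below is a sketch, not part of Dynamiq: it assumes each document exposes an embedding attribute (None when the document has not been embedded), and find_unembedded is a hypothetical name.

```python
def find_unembedded(documents, dimension: int) -> list[int]:
    """Return the positions of documents that are not ready to be written.

    A document is flagged when its embedding is missing or its length
    differs from the dimension configured on the writer.
    """
    bad = []
    for position, doc in enumerate(documents):
        # Assumption: documents carry their vector in an `embedding` attribute.
        embedding = getattr(doc, "embedding", None)
        if embedding is None or len(embedding) != dimension:
            bad.append(position)
    return bad
```

An empty result means every document in the batch carries a vector of the expected size.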

Pinecone

Serverless deployment:

from dynamiq.connections import Pinecone as PineconeConnection
from dynamiq.nodes.writers import PineconeDocumentWriter

writer = PineconeDocumentWriter(
    connection=PineconeConnection(),
    index_name="quickstart",
    dimension=1536,
    create_if_not_exist=True,
    index_type="serverless",
    cloud="aws",
    region="us-east-1",
)

Pod-based deployment:

writer = PineconeDocumentWriter(
    connection=PineconeConnection(),
    index_name="quickstart",
    dimension=1536,
    create_if_not_exist=True,
    index_type="pod",
    environment="us-west1-gcp",
    pod_type="p1.x1",
    pods=1,
)

Elasticsearch

from dynamiq.connections import Elasticsearch as ElasticsearchConnection
from dynamiq.nodes.writers import ElasticsearchDocumentWriter

writer = ElasticsearchDocumentWriter(
    connection=ElasticsearchConnection(
        url="https://<elasticsearch-host>:9200",
        api_key="your-api-key",
    ),
    index_name="documents",
    dimension=1536,
    similarity="cosine",
)

For Elastic Cloud, authenticate with username, password, and cloud_id on the connection instead of url/api_key, and optionally pass index_settings / mapping_settings dicts when creating the index.
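
The two authentication modes can be selected from configuration. The sketch below builds the keyword arguments for the connection from environment variables; the variable names (ELASTIC_CLOUD_ID and so on) and the helper name are hypothetical, and only the keyword names url, api_key, username, password, and cloud_id come from the text above.

```python
import os
from collections.abc import Mapping


def elasticsearch_connection_kwargs(env: Mapping[str, str] = os.environ) -> dict:
    """Choose Elastic Cloud or url/api_key credentials from the environment.

    The environment variable names are placeholders; rename them to match
    your deployment.
    """
    if env.get("ELASTIC_CLOUD_ID"):
        # Elastic Cloud: username, password, and cloud_id.
        return {
            "cloud_id": env["ELASTIC_CLOUD_ID"],
            "username": env["ELASTIC_USERNAME"],
            "password": env["ELASTIC_PASSWORD"],
        }
    # Self-managed cluster: url and api_key.
    return {
        "url": env["ELASTICSEARCH_URL"],
        "api_key": env["ELASTICSEARCH_API_KEY"],
    }
```

The result would then be unpacked into the connection, for example ElasticsearchConnection(**elasticsearch_connection_kwargs()).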

Weaviate

from dynamiq.nodes.writers import WeaviateDocumentWriter
from dynamiq.storages.vector import WeaviateVectorStore

writer = WeaviateDocumentWriter(
    vector_store=WeaviateVectorStore(index_name="Documents", create_if_not_exist=True)
)

Any writer can be built either from a connection (the node constructs the store) or from a prebuilt vector_store instance, as shown here.

A complete embed-and-store fragment

from dynamiq import Workflow
from dynamiq.connections import OpenAI as OpenAIConnection, Pinecone as PineconeConnection
from dynamiq.nodes.embedders import OpenAIDocumentEmbedder
from dynamiq.nodes.writers import PineconeDocumentWriter
from dynamiq.types import Document

wf = Workflow()

embedder = OpenAIDocumentEmbedder(
    connection=OpenAIConnection(), model="text-embedding-3-small"
)
writer = (
    PineconeDocumentWriter(
        connection=PineconeConnection(),
        index_name="quickstart",
        dimension=1536,
        create_if_not_exist=True,
        index_type="serverless",
        cloud="aws",
        region="us-east-1",
    )
    .inputs(documents=embedder.outputs.documents)
    .depends_on(embedder)
)
wf.flow.add_nodes(embedder, writer)

result = wf.run(
    input_data={
        "documents": [
            Document(content="Dynamiq is an operating platform for agentic AI."),
        ]
    }
)
print(result.output[writer.id]["output"]["upserted_count"])  # 1

dimension must match the embedding model's output size — text-embedding-3-small produces 1536-dimensional vectors. On the platform, the same writers back Knowledge Base storage — see Vector Store vs Knowledge Base.
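
One way to keep the writer's dimension in step with the embedding model is to derive it from the model name instead of hard-coding it twice. This is a sketch with a hypothetical helper; the table holds the default output sizes of three OpenAI models and should be extended with whichever models you actually use.

```python
# Default output sizes for a few OpenAI embedding models.
KNOWN_DIMENSIONS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
    "text-embedding-ada-002": 1536,
}


def dimension_for(model: str) -> int:
    """Look up the vector size for a model, failing loudly on unknown names."""
    try:
        return KNOWN_DIMENSIONS[model]
    except KeyError:
        raise KeyError(
            f"unknown embedding model {model!r}; add its dimension first"
        ) from None
```

With this in place, the embedder and the writer can share a single model constant, and the writer receives dimension=dimension_for(model).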

Next steps

Retrievers & Rankers

RAG Pipeline

Connections & Credentials

Memory
