Extract structured data from a document
Runs OCR on a PDF or image and then extracts structured data matching a JSON schema template. The `options` form field selects the OCR LLM, the structured-output LLM, and the extraction template. With `"stream": true` the response is an SSE stream instead of JSON.
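As a minimal sketch of how the `options` value might be assembled, the snippet below serializes the three selections described above plus the `stream` flag into one JSON string. Only `"stream"` appears in this reference; the key names `ocr_llm`, `structured_llm`, and `template`, the model identifiers, the placement of `stream` inside `options`, and the template shape (which simply mirrors the example response) are all assumptions made for illustration.

```python
import json


def build_extract_options(ocr_llm, structured_llm, template, stream=False):
    """Serialize extraction settings into the JSON string sent as `options`.

    Every key except "stream" is a hypothetical placeholder, not a
    documented field name.
    """
    options = {
        "ocr_llm": ocr_llm,                # hypothetical key: LLM used for OCR
        "structured_llm": structured_llm,  # hypothetical key: LLM used for structured output
        "template": template,              # hypothetical key: extraction template
        "stream": stream,                  # from this reference: true switches to SSE
    }
    return json.dumps(options)


# Hypothetical template mirroring the shape of the example response.
template = {"invoice_number": "", "charges": [{"amount": 0}]}

options = build_extract_options(
    {"model": "example-ocr-model"},         # placeholder model selection
    {"model": "example-structured-model"},  # placeholder model selection
    template,
)
```

Keeping the settings in a dictionary until the last moment avoids hand-written JSON, which is easy to break with a stray quote inside a form field.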
Authorization
Organization or project Access Key created in the Dynamiq console. It is used for deployed-app (Runs API), AI Gateway, traces collector, and management API requests. Send it as `Authorization: Bearer <access-key>`.
In: header
Request Body
multipart/form-data
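The following sketch shows one way to build the multipart request using only the Python standard library. It prepares the request without sending it. The URL is the placeholder from the curl example, and the file name, content type, access key, and `options` payload are illustrative assumptions. The encoder does not escape quotes in field or file names, so it is suited to simple names only.

```python
import urllib.request
import uuid


def encode_multipart(fields, files):
    """Encode text fields and files as a multipart/form-data body.

    fields: dict of name -> str
    files:  dict of name -> (filename, content_bytes, content_type)
    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        head = f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"\r\n\r\n'
        parts.append(head.encode("utf-8") + value.encode("utf-8") + b"\r\n")
    for name, (filename, content, content_type) in files.items():
        head = (
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"; filename="{filename}"\r\n'
            f'Content-Type: {content_type}\r\n\r\n'
        )
        parts.append(head.encode("utf-8") + content + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))  # closing delimiter
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_extract_request(access_key, file_bytes, options_json,
                          url="https://example.com/v1/ocr/extract"):
    """Prepare (but do not send) the POST request with `file` and `options`."""
    body, content_type = encode_multipart(
        {"options": options_json},
        {"file": ("document.pdf", file_bytes, "application/pdf")},  # assumed name/type
    )
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_key}",
            "Content-Type": content_type,
        },
    )


# Placeholder key, file bytes, and options; pass the result to
# urllib.request.urlopen() to actually send it.
request = build_extract_request("my-access-key", b"%PDF-1.4 ...", '{"stream": false}')
```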
Response Body
application/json
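The example responses for this endpoint come in two shapes: a top-level `data` object on success, and a top-level `error` object with `code`, `message`, and an optional `details` map. A client could unwrap them roughly as follows. The names `ApiError` and `unwrap_response` are hypothetical helpers, not part of any SDK, and the sketch assumes every non-streaming response follows one of these two shapes.

```python
import json


class ApiError(Exception):
    """Raised when a response body carries an `error` object."""

    def __init__(self, code, message, details=None):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message
        self.details = details or {}


def unwrap_response(body_text):
    """Return the `data` payload, or raise ApiError for an `error` payload."""
    payload = json.loads(body_text)
    if "error" in payload:
        err = payload["error"]
        raise ApiError(err.get("code", "unknown"), err.get("message", ""), err.get("details"))
    return payload["data"]


# Success body taken from the example response on this page.
success_body = '{"data": {"invoice_number": "4812", "charges": [{"amount": 49}, {"amount": 49}]}}'
result = unwrap_response(success_body)
total = sum(charge["amount"] for charge in result["charges"])
```

Raising a typed exception keeps the `details` map available to callers, which is useful for validation failures such as `"input": "cannot be blank"`.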
curl -X POST "https://example.com/v1/ocr/extract" \
  -F file="string" \
  -F options="string"

{
  "data": {
    "invoice_number": "4812",
    "charges": [
      {
        "amount": 49
      },
      {
        "amount": 49
      }
    ]
  }
}

{
  "error": {
    "code": "bad_request",
    "message": "Bad Request",
    "details": {
      "input": "cannot be blank"
    }
  }
}

{
  "error": {
    "code": "unauthorized",
    "message": "Unauthorized"
  }
}

{
  "error": {
    "code": "bad_request",
    "message": "Failed to parse data",
    "details": {
      "error": "1 validation error for LLMRequest"
    }
  }
}

Parse a document with OCR
Extracts the text of a PDF or image as Markdown using an LLM-based OCR pipeline. The `options` form field is a JSON string selecting the LLM. With `"stream": true` the response is an SSE stream of extraction events instead of JSON.
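When streaming is enabled, the client has to read Server-Sent Events instead of a single JSON body. The sketch below is a small parser for the standard SSE wire format (`event:` and `data:` fields, blank line as the event terminator). The event name `chunk` and the payload contents in the sample are invented for illustration; this reference does not list the actual extraction events.

```python
def iter_sse_events(lines):
    """Yield (event_name, data) tuples from an iterable of decoded SSE lines."""
    event, data = "message", []  # "message" is the SSE default event name
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event; skip it if no data arrived.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # one leading space after the colon is not part of the value
        if field == "event":
            event = value
        elif field == "data":
            data.append(value)  # multiple data lines are joined with newlines


# Hypothetical stream content; real event names and payloads may differ.
sample = [
    "event: chunk\n",
    'data: {"text": "# Title"}\n',
    "\n",
    ": ping\n",
    "data: line1\n",
    "data: line2\n",
    "\n",
]
events = list(iter_sse_events(sample))
```

An event that is still pending when the stream ends without a closing blank line is dropped, which matches how the SSE format treats incomplete events.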
Ingest trace runs
Ingests a batch of workflow/flow/node trace runs, as produced by the Dynamiq Python SDK's `TracingCallbackHandler` / `DynamiqTracingClient`. The collector validates `id`, `name`, `type`, `trace_id`, `source_id`, `start_time`, `end_time`, and `status`; the remaining fields are stored as-is. SDK-emitted fields not listed in the schema (such as `session_id` and `tags`) are accepted and ignored by the collector.
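Because the collector validates a fixed set of fields, a client that builds runs by hand (rather than through the SDK) may want to check for them before sending a batch. The following is a minimal sketch of such a check. It tests only that each key is present, since the exact validation rules are not stated here, and every value in the sample run (the type, status, and timestamp format included) is a hypothetical placeholder.

```python
import uuid
from datetime import datetime, timezone

# Fields this reference says the collector validates.
REQUIRED_RUN_FIELDS = (
    "id", "name", "type", "trace_id", "source_id",
    "start_time", "end_time", "status",
)


def missing_required_fields(run):
    """Return the validated field names that are absent from a run dict."""
    return [field for field in REQUIRED_RUN_FIELDS if field not in run]


now = datetime.now(timezone.utc).isoformat()  # assumed timestamp format

# Hypothetical run; the values are placeholders, not documented enums.
run = {
    "id": str(uuid.uuid4()),
    "name": "invoice-extraction",
    "type": "workflow",
    "trace_id": str(uuid.uuid4()),
    "source_id": str(uuid.uuid4()),
    "start_time": now,
    "end_time": now,
    "status": "succeeded",
    "metadata": {"note": "a non-validated field, stored as-is per the description above"},
}

problems = missing_required_fields(run)
```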