POST /v1/agentic/documents/tabularize
Tabularize
curl --request POST \
  --url https://api.example.com/v1/agentic/documents/tabularize \
  --header 'Content-Type: application/json' \
  --data '
{
  "documents": [
    {
      "document_id": "<string>",
      "upload_key": "<string>"
    }
  ],
  "schema": {
    "skill": "<string>",
    "use_case": "<string>",
    "questions": [
      {
        "id": "<string>",
        "question": "<string>",
        "rationale": "<string>",
        "unit": "<string>",
        "enum": [
          "<string>"
        ]
      }
    ],
    "metadata": {}
  },
  "web_augmentation": false
}
'
{
  "job_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
  "status": "pending",
  "message": "<string>",
  "dataset_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a"
}

Headers

authorization
string | null
refresh-token
string | null

Body

application/json

POST /v1/agentic/documents/tabularize -- async tabularization job.

Extracts answers from every uploaded document according to schema and assembles one row per document. Submit the job, poll its status, then fetch the result via GET /v1/agentic/documents/tables/{dataset_id}.

Fan-out concurrency is governed server-side by the Lambda fleet's reserved concurrency; there is no client-supplied concurrency knob.
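
The submit, poll, fetch flow described above can be sketched in Python as a small polling helper. This page does not document the status endpoint, so the sketch takes a caller-supplied `get_status` callable rather than guessing a URL. The names `wait_for_job`, `interval_s`, and `timeout_s` are illustrative assumptions, not part of the API; the four status values are the ones listed under Response.

```python
import time
from typing import Callable

# Status values listed in the Response section of this page.
KNOWN_STATUSES = {"pending", "running", "completed", "failed"}
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_job(
    get_status: Callable[[], str],
    interval_s: float = 2.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the job reaches a terminal status and return that status.

    `get_status` is a hypothetical callable supplied by the caller; it should
    return the job's current status string however the status is obtained.
    Raises TimeoutError if no terminal status is seen within `timeout_s`.
    """
    waited = 0.0
    while True:
        status = get_status()
        if status not in KNOWN_STATUSES:
            raise ValueError(f"unexpected status: {status!r}")
        if status in TERMINAL_STATUSES:
            return status
        if waited >= timeout_s:
            raise TimeoutError(f"job still {status!r} after {waited:.0f}s")
        sleep(interval_s)
        waited += interval_s
```

When the helper returns `completed`, the result can be requested from GET /v1/agentic/documents/tables/{dataset_id}. A fixed polling interval keeps the sketch short; a backoff schedule may suit long-running jobs better.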

documents
DocumentRef · object[]
required
Minimum array length: 1
schema
Schema · object
required

A frozen list of Question objects for one document type and one use case.

web_augmentation
boolean
default:false

If true, allow the agent to issue web searches to corroborate answers.
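
As a rough illustration of the body fields above, the following Python sketch assembles a request body and enforces the one constraint this page states, namely that `documents` has at least one entry. The function name `build_tabularize_body` is hypothetical, and every value in the demo (ids, keys, question text) is a made-up placeholder. Which schema and question fields are required is not stated here, so the sketch passes them through unchecked.

```python
from __future__ import annotations

import json
from typing import Any


def build_tabularize_body(
    documents: list[dict[str, str]],
    schema: dict[str, Any],
    web_augmentation: bool = False,
) -> dict[str, Any]:
    """Assemble the JSON body for POST /v1/agentic/documents/tabularize.

    Only the documented minimum array length of 1 is checked; everything
    else is passed through as given.
    """
    if len(documents) < 1:
        raise ValueError("documents must contain at least one entry")
    return {
        "documents": list(documents),
        "schema": schema,
        "web_augmentation": web_augmentation,
    }


if __name__ == "__main__":
    # All values below are invented placeholders for illustration.
    body = build_tabularize_body(
        documents=[{"document_id": "doc-1", "upload_key": "uploads/doc-1"}],
        schema={
            "skill": "example-skill",
            "use_case": "example-use-case",
            "questions": [
                {
                    "id": "q1",
                    "question": "What is the contract term?",
                    "rationale": "Needed to compare renewal dates.",
                    "unit": "months",
                    "enum": ["12", "24", "36"],
                }
            ],
            "metadata": {},
        },
    )
    print(json.dumps(body, indent=2))
```

The serialized output has the same shape as the `--data` payload in the curl example at the top of this page.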

Response

Successful Response

POST /v1/agentic/documents/tabularize -- async submission response.

job_id is the canonical job handle. dataset_id is the id of the dataset being produced — used to fetch the typed result via GET /v1/agentic/documents/tables/{dataset_id} once the job completes.

job_id
string<uuid>
required

Server-assigned id of the submitted job

status
enum<string>
required
Available options:
pending,
running,
completed,
failed
message
string
required
dataset_id
string<uuid>
required
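
A client might model the submission response along these lines. This is a minimal sketch, assuming the response is a JSON object with exactly the four required fields listed above; the class name `TabularizeSubmission` and its `from_json` and `result_path` members are hypothetical helpers, not part of any published SDK.

```python
from dataclasses import dataclass
from typing import Any, Dict

# Status values listed under "Available options" on this page.
JOB_STATUSES = ("pending", "running", "completed", "failed")
REQUIRED_FIELDS = ("job_id", "status", "message", "dataset_id")


@dataclass(frozen=True)
class TabularizeSubmission:
    job_id: str
    status: str
    message: str
    dataset_id: str

    @classmethod
    def from_json(cls, payload: Dict[str, Any]) -> "TabularizeSubmission":
        """Validate the required fields and the status enum, then build the record."""
        missing = [key for key in REQUIRED_FIELDS if key not in payload]
        if missing:
            raise ValueError(f"missing required fields: {missing}")
        if payload["status"] not in JOB_STATUSES:
            raise ValueError(f"unknown status: {payload['status']!r}")
        return cls(**{key: payload[key] for key in REQUIRED_FIELDS})

    @property
    def result_path(self) -> str:
        """Path of the result endpoint for this submission's dataset."""
        return f"/v1/agentic/documents/tables/{self.dataset_id}"
```

Keeping `job_id` and `dataset_id` as separate fields mirrors the distinction drawn above: the first identifies the job, the second identifies the dataset whose result is fetched once the job completes.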