POST /v1/uploads
Create Upload
curl --request POST \
  --url https://api.example.com/v1/uploads \
  --header 'Content-Type: application/json' \
  --data '{"file_format": "csv"}'
{
  "connection_config": {
    "uri": "<string>",
    "id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
    "connector_type": "file_upload"
  },
  "upload_url": "<string>",
  "content_type": "<string>",
  "expires_in": 123
}

Headers

authorization
string | null
refresh-token
string | null

Body

application/json

POST /v1/uploads: request a presigned URL for a direct-to-S3 upload.

file_format
enum<string>
required

Format of the dataset you will PUT to the returned URL: 'pkl' = a pickled pandas DataFrame, 'csv' = RFC 4180 CSV with a header row, 'parquet' = Apache Parquet. The uploaded table must contain the label column. You supply that column's name as label_column on the subsequent /v1/trainer/run or /v1/reasoning/fit call, not on this request.

Available options:
pkl,
csv,
parquet
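
As a minimal sketch of the request body described above, the following Python builds (without sending) a POST /v1/uploads request and rejects any file_format outside the listed options. The host comes from the curl example; the function name and the client-side validation are illustrative assumptions, not part of the API.

```python
import json
import urllib.request

# Host taken from the curl example on this page.
API_BASE = "https://api.example.com"

# The three options listed for file_format.
ALLOWED_FORMATS = ("pkl", "csv", "parquet")


def build_create_upload_request(file_format: str) -> urllib.request.Request:
    """Build, but do not send, a POST /v1/uploads request.

    Hypothetical helper: the name and the client-side check are assumptions
    made for illustration only.
    """
    if file_format not in ALLOWED_FORMATS:
        raise ValueError(
            f"file_format must be one of {ALLOWED_FORMATS}, got {file_format!r}"
        )
    body = json.dumps({"file_format": file_format}).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}/v1/uploads",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send it. Any authorization or refresh-token header would need to be added by the caller.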

Response

Successful Response

POST /v1/uploads response.

Creates the file_upload_connection_config row, so its id is generated by Postgres rather than derived from the S3 key, and returns it as connection_config alongside the presigned PUT URL. The caller PUTs the file bytes to upload_url, sending content_type as the Content-Type header, then registers a dataset over connection_config via POST /v1/datasets. That registration call is what assigns the dataset id; no dataset_id is returned here.
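
The PUT step of that flow might look like the sketch below, which builds the upload request from a POST /v1/uploads response. The helper name and the values in example_response are hypothetical placeholders. The follow-up POST /v1/datasets call is omitted because its request body is not described on this page.

```python
import urllib.request


def build_put_request(upload_response: dict, payload: bytes) -> urllib.request.Request:
    """Build, but do not send, the PUT that uploads the dataset bytes.

    Hypothetical helper: it reads upload_url and content_type from the
    POST /v1/uploads response and uses content_type as the Content-Type header.
    """
    return urllib.request.Request(
        url=upload_response["upload_url"],
        data=payload,
        headers={"Content-Type": upload_response["content_type"]},
        method="PUT",
    )


# Placeholder response with made-up values, shaped like the example above.
example_response = {
    "connection_config": {
        "uri": "s3://example-bucket/example-key",  # hypothetical
        "id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
        "connector_type": "file_upload",
    },
    "upload_url": "https://example-bucket.s3.amazonaws.com/example-key?signature=abc",  # hypothetical
    "content_type": "text/csv",  # hypothetical
    "expires_in": 123,
}

put_request = build_put_request(example_response, b"feature,label\n1,0\n")
# urllib.request.urlopen(put_request) would perform the upload.
```

Keep connection_config from the response, since it is what the dataset registration call refers to.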

connection_config
FileUploadConnectionConfigResponse · object
required

FileUploadConnectionConfig plus the server-assigned row identity.

This is the response shape for POST /v1/uploads. It is never used as a request, so id and state are always present; a dedicated response type is used instead of making those fields nullable on the base type. The base FileUploadConnectionConfig is what is carried in the ConnectionConfig union on requests and inference, and this type adds the persisted id and lifecycle state.

upload_url
string
required
content_type
string
required
expires_in
integer
required