Trainer Run - OuterProduct

curl --request POST \
  --url https://api.example.com/v1/trainer/run \
  --header 'Content-Type: application/json' \
  --data '
{
  "dataset_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
  "filter": {
    "col": "<string>",
    "kind": "cmp",
    "value": "<string>"
  },
  "task": {
    "label_column": "<string>",
    "task_kind": "regression"
  },
  "model_types": [
    "<string>"
  ],
  "metrics": [
    "<string>"
  ],
  "strategy": "optuna",
  "n_trials": 4,
  "n_splits": 123,
  "ensemble": false,
  "grid_size": 123,
  "teacher_model_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
  "teacher_predict_url": "<string>",
  "teacher_predict_headers": {},
  "random_state": 42
}
'

Headers

authorization

string | null

refresh-token

string | null

Body

application/json

POST /v1/trainer/run — configure a Trainer and run HPO across a model matrix.

The training dataset is referenced by dataset_id (register it first via POST /v1/datasets). task is a per-fit choice so the same dataset can be trained against different targets/tasks.
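As a rough illustration of how a client might assemble this body, the sketch below builds a minimal payload from the fields documented on this page and wraps the POST in a small helper. The helper names (`build_run_payload`, `submit_run`) and the `price` label column are hypothetical; only the URL, field names, and defaults are taken from this page.

```python
import json
import urllib.request
from typing import Any, Optional

API_URL = "https://api.example.com/v1/trainer/run"  # URL from the curl example


def build_run_payload(
    dataset_id: str,
    label_column: str,
    task_kind: str = "regression",
    n_trials: int = 4,
    strategy: str = "optuna",
    **optional: Any,
) -> dict:
    """Assemble a request body. Hypothetical helper, not part of any SDK."""
    if n_trials < 1:
        raise ValueError("n_trials must be >= 1")
    payload = {
        "dataset_id": dataset_id,
        "task": {"label_column": label_column, "task_kind": task_kind},
        "strategy": strategy,
        "n_trials": n_trials,
    }
    # Optional fields left as None are dropped from the body entirely.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


def submit_run(payload: dict, authorization: Optional[str] = None) -> dict:
    """POST the payload and return the decoded JSON response."""
    headers = {"Content-Type": "application/json"}
    if authorization is not None:
        headers["authorization"] = authorization  # header name from this page
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


payload = build_run_payload(
    "3c90c3cc-0d44-4b50-8888-8dd25736052a",
    label_column="price",      # hypothetical column name
    model_types=["xgboost"],
    n_splits=None,             # None, so it is left out of the body
)
print(json.dumps(payload, indent=2))
```

Because task is passed per request, the same dataset_id can be reused with a different label_column or task_kind on the next call.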

dataset_id

required

Id of a dataset registered via POST /v1/datasets. The server looks it up and proceeds with the fit pipeline.

filter

Comparison · object

A single column comparison, e.g. col("ts") >= "2024-01-01".

value is unused for is_null / is_not_null and is a list for in / not_in. Timestamps travel as ISO-8601 strings and are coerced to the column dtype trainer-side.
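The value rules above can be sketched as a small client-side builder. This is an illustration only: the page shows the `col`, `kind`, and `value` keys, but the key that carries the operator is hidden under the child attributes, so the `"op"` key and the `">="` spelling used here are assumptions, as are the column names.

```python
from datetime import date, datetime
from typing import Any

NULL_OPS = {"is_null", "is_not_null"}  # value is unused
LIST_OPS = {"in", "not_in"}            # value is a list


def _to_wire(value: Any) -> Any:
    """Serialise timestamps as ISO-8601 strings; pass other values through."""
    if isinstance(value, (datetime, date)):
        return value.isoformat()
    return value


def comparison(col: str, op: str, value: Any = None) -> dict:
    """Build a Comparison filter node. The "op" key is an assumed name."""
    node = {"kind": "cmp", "col": col, "op": op}
    if op in NULL_OPS:
        if value is not None:
            raise ValueError(f"{op} takes no value")
        return node
    if op in LIST_OPS:
        if not isinstance(value, (list, tuple)):
            raise TypeError(f"{op} expects a list of values")
        node["value"] = [_to_wire(v) for v in value]
        return node
    node["value"] = _to_wire(value)
    return node


print(comparison("ts", ">=", date(2024, 1, 1)))
print(comparison("region", "in", ["eu", "us"]))
print(comparison("churned_at", "is_null"))
```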

Accepted variants: Comparison, BoolExpr, Not

task

Regression · object

Predict a continuous target.

model_types

string[] | null

Candidate model-family identifiers, e.g. ['tabm', 'xgboost', 'xrfm', 'tabicl']. Resolved server-side. If omitted, the server picks a default set based on dataset shape.

metrics

string[] | null

Metric names to optimise, e.g. ['auc'] or ['auc', 'neg_class_error']. Multi-metric requests trigger a Pareto sweep when grid_size is set. Custom Python callables are not supported in v1.

strategy

string

default: optuna

HPO strategy: 'optuna' or 'random'. Resolved server-side.

n_trials

integer

default: 4

Number of HPO trials per matrix row.

Required range: x >= 1

n_splits

integer | null

Number of K-fold cross-validation folds. null means a single holdout split.

ensemble

boolean

default: false

If true, build a Caruana-style ensemble across stage-1 fold/trial models instead of refitting the winner on full data.
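For readers unfamiliar with the term, Caruana-style ensemble selection is usually described as greedy forward selection with replacement over held-out predictions. The sketch below shows that general technique in plain Python; it is not the server's implementation, and the mean-squared-error loss, the `n_rounds` parameter, and the model names are assumptions chosen for illustration.

```python
from collections import Counter
from typing import Dict, List, Sequence


def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def caruana_select(
    preds: Dict[str, List[float]], target: List[float], n_rounds: int = 10
) -> Dict[str, float]:
    """Greedy forward selection with replacement; returns model weights."""
    running_sum = [0.0] * len(target)  # sum of the predictions picked so far
    picks: Counter = Counter()
    for round_idx in range(1, n_rounds + 1):
        best_name, best_loss = None, float("inf")
        for name, p in preds.items():
            # Loss of the ensemble average if this model were added next.
            candidate = [(s + x) / round_idx for s, x in zip(running_sum, p)]
            loss = mse(candidate, target)
            if loss < best_loss:
                best_name, best_loss = name, loss
        picks[best_name] += 1
        running_sum = [s + x for s, x in zip(running_sum, preds[best_name])]
    # A model's weight is the fraction of rounds in which it was picked.
    return {name: count / n_rounds for name, count in picks.items()}


target = [1.0, 2.0, 3.0]
held_out = {
    "a": [1.5, 2.5, 3.5],  # overshoots by 0.5
    "b": [0.5, 1.5, 2.5],  # undershoots by 0.5
    "c": [3.0, 3.0, 3.0],  # constant guess
}
weights = caruana_select(held_out, target, n_rounds=4)
print(weights)
```

In this toy example the two models with opposite biases cancel out, so they share the weight equally and the constant model is never selected.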

grid_size

number | null

Simplex grid step for Pareto-style multi-metric sweeps.
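One way to picture a simplex grid is as the set of metric-weight vectors whose entries are multiples of the step and sum to 1. The sketch below enumerates such vectors. Reading grid_size as the spacing between adjacent weights is an assumption, and the function name `simplex_grid` is hypothetical; the server's actual sweep may differ.

```python
from itertools import product
from typing import List, Tuple


def simplex_grid(n_metrics: int, step: float) -> List[Tuple[float, ...]]:
    """Enumerate weight vectors whose entries are multiples of step and sum to 1."""
    n = round(1.0 / step)  # number of steps along each axis
    if n < 1 or abs(n * step - 1.0) > 1e-9:
        raise ValueError("step must divide 1 evenly, e.g. 0.5, 0.25, 0.1")
    grid = []
    # Choose the first n_metrics - 1 weights freely; the last one is the remainder.
    for head in product(range(n + 1), repeat=n_metrics - 1):
        last = n - sum(head)
        if last >= 0:
            grid.append(tuple(k / n for k in (*head, last)))
    return grid


for weights in simplex_grid(n_metrics=2, step=0.25):
    print(weights)
```

Each weight vector would correspond to one scalarised objective in the sweep, so a smaller step yields more points along the trade-off curve at the cost of more runs.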

teacher_model_id

Distill from a same-server trained model identified by id. The synth-gen worker calls the inference worker in-process — no HTTP self-loop. Mutually exclusive with teacher_predict_url.

teacher_predict_url

string | null

If set, distill from an external teacher at this URL. The worker POSTs a JSON body of the form {"samples": [[...]], "feature_names": [...]}. For same-server teachers, use teacher_model_id instead.

teacher_predict_headers

Teacher Predict Headers · object

Headers to send when calling teacher_predict_url.
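Since the two teacher sources are mutually exclusive, a client may want to check the combination before submitting. The following is a minimal sketch of such a check, with a hypothetical helper name (`teacher_fields`), URL, and header value. Rejecting headers that arrive without a URL is an extra client-side assumption, not a documented server rule.

```python
from typing import Any, Dict, Optional


def teacher_fields(
    teacher_model_id: Optional[str] = None,
    teacher_predict_url: Optional[str] = None,
    teacher_predict_headers: Optional[Dict[str, str]] = None,
) -> Dict[str, Any]:
    """Return the teacher-related body fields after validating the combination."""
    if teacher_model_id and teacher_predict_url:
        raise ValueError(
            "teacher_model_id and teacher_predict_url are mutually exclusive"
        )
    # Assumed strictness: headers are only meaningful alongside a URL.
    if teacher_predict_headers and not teacher_predict_url:
        raise ValueError(
            "teacher_predict_headers only applies with teacher_predict_url"
        )
    fields: Dict[str, Any] = {}
    if teacher_model_id:
        fields["teacher_model_id"] = teacher_model_id
    if teacher_predict_url:
        fields["teacher_predict_url"] = teacher_predict_url
        if teacher_predict_headers:
            fields["teacher_predict_headers"] = dict(teacher_predict_headers)
    return fields


external = teacher_fields(
    teacher_predict_url="https://teacher.example.com/predict",  # hypothetical URL
    teacher_predict_headers={"x-api-key": "secret"},            # hypothetical header
)
print(external)
```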

random_state

integer

default: 42

Response

Successful Response

POST /v1/trainer/run: response to an asynchronous trainer job submission.

dataset_id echoes the dataset this run trained against.

job_id

required

Server-assigned id of the submitted job

status

enum<string>

required

Available options: pending, running, completed, failed

message

string

required

dataset_id

required

Id of the dataset this run trained against.
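A client could decode the response into a small typed object, as sketched below. The class names are hypothetical, job_id is treated as a string because its type is not shown on this page, and treating completed and failed as terminal states is an inference from the option names rather than documented behaviour. The sample values are made up.

```python
from dataclasses import dataclass
from enum import Enum


class JobStatus(str, Enum):
    PENDING = "pending"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"


@dataclass(frozen=True)
class TrainerRunResponse:
    job_id: str
    status: JobStatus
    message: str
    dataset_id: str

    @classmethod
    def from_json(cls, body: dict) -> "TrainerRunResponse":
        """Validate the four required fields and coerce status to the enum."""
        required = ("job_id", "status", "message", "dataset_id")
        missing = [key for key in required if key not in body]
        if missing:
            raise KeyError(f"response is missing required fields: {missing}")
        return cls(
            job_id=str(body["job_id"]),
            status=JobStatus(body["status"]),  # raises ValueError if unknown
            message=body["message"],
            dataset_id=str(body["dataset_id"]),
        )

    @property
    def is_terminal(self) -> bool:
        # Assumption: no further state changes after completed or failed.
        return self.status in (JobStatus.COMPLETED, JobStatus.FAILED)


sample = {
    "job_id": "job-123",  # made-up value
    "status": "pending",
    "message": "queued",  # made-up value
    "dataset_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
}
response = TrainerRunResponse.from_json(sample)
print(response.status.value, response.is_terminal)
```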