Trainer Run
Compile the trainer graph and hand it to the scheduler.
The router’s responsibility ends at “ship the DTO”: validate, build
the dataset, compile the graph, cloudpickle the trainer to S3, and
put a RequestToComputeGraph on the in-process inbox.
The scheduler owns the jobs and computation_node tables and
INSERTs both atomically; the router awaits the one-shot reply
future on the submission for the minted job_id and returns it
synchronously, preserving the existing SDK contract.
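The hand-off described above can be sketched with an asyncio queue and a one-shot future. This is a minimal illustration, not the service's actual code: Submission, scheduler, router_submit, and the "job-1" id format are hypothetical names chosen for the example, and the database INSERTs are only indicated by a comment.

```python
import asyncio
from dataclasses import dataclass, field


def _new_future() -> asyncio.Future:
    # Must be called from inside a running event loop.
    return asyncio.get_running_loop().create_future()


@dataclass
class Submission:
    """Hypothetical stand-in for the DTO placed on the inbox."""
    payload: dict
    # One-shot reply channel: the scheduler resolves it exactly once.
    reply: asyncio.Future = field(default_factory=_new_future)


async def scheduler(inbox: asyncio.Queue) -> None:
    """Consume one submission, mint a job id, and resolve the reply future."""
    submission = await inbox.get()
    # A real scheduler would INSERT the jobs and computation_node rows here.
    submission.reply.set_result("job-1")


async def router_submit(inbox: asyncio.Queue, payload: dict) -> str:
    """Put the submission on the inbox and await the minted job id."""
    submission = Submission(payload)
    await inbox.put(submission)
    return await submission.reply


async def demo() -> str:
    inbox: asyncio.Queue = asyncio.Queue()
    worker = asyncio.create_task(scheduler(inbox))
    job_id = await router_submit(inbox, {"dataset_id": "ds-example"})
    await worker
    return job_id
```

Because the router awaits the future rather than polling, the caller still sees a synchronous-looking return value even though the scheduler owns the write.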
Body
POST /v1/trainer/run — configure a Trainer and run HPO across a model matrix.
The training dataset is referenced by dataset_id (register it first
via POST /v1/datasets). task is a per-fit choice so the same dataset
can be trained against different targets/tasks.
Id of a dataset registered via POST /v1/datasets. The server looks it up and proceeds with the fit pipeline.
A single column comparison, e.g. col("ts") >= "2024-01-01".
value is unused for is_null / is_not_null and is a list for
in / not_in. Timestamps travel as ISO-8601 strings and are coerced
to the column dtype trainer-side.
- Comparison
- BoolExpr
- Not
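The value rules for a comparison can be sketched as a small client-side builder. The dict keys "column" and "op" and the operator spelling ">=" are assumptions made for illustration; only value, is_null, is_not_null, in, and not_in come from the description above.

```python
from datetime import datetime

NULLARY_OPS = {"is_null", "is_not_null"}
LIST_OPS = {"in", "not_in"}


def make_comparison(column: str, op: str, value=None) -> dict:
    """Build a comparison dict, enforcing the value rules described above."""
    if op in NULLARY_OPS:
        value = None  # value is unused for null checks
    elif op in LIST_OPS and not isinstance(value, list):
        raise ValueError(f"{op} expects a list value")
    if isinstance(value, datetime):
        value = value.isoformat()  # timestamps travel as ISO-8601 strings
    return {"column": column, "op": op, "value": value}
```

For example, make_comparison("ts", ">=", datetime(2024, 1, 1)) carries the value "2024-01-01T00:00:00", which the trainer side would then coerce to the column dtype.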
Predict a continuous target.
- Regression
- Binclass
- Multiclass
- Forecasting
- SequenceRegression
- SequenceBinclass
- SequenceMulticlass
Candidate model-family identifiers, e.g. ['tabm', 'xgboost', 'xrfm', 'tabicl']. Resolved server-side. If omitted, the server picks a default set based on dataset shape.
Metric names to optimise, e.g. ['auc'] or ['auc', 'neg_class_error']. Multi-metric requests trigger a Pareto sweep when grid_size is set. Custom Python callables are not supported in v1.
HPO strategy: 'optuna' or 'random'. Resolved server-side.
Number of HPO trials per matrix row.
x >= 1
K-fold cross-validation folds. None means a single holdout split.
If true, build a Caruana-style ensemble across stage-1 fold/trial models instead of refitting the winner on full data.
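For readers unfamiliar with the term, Caruana-style ensembling usually means greedy forward selection with replacement over held-out predictions. The sketch below shows that general technique in pure Python with mean squared error as the loss; the function name, the loss, and the data layout are assumptions for illustration, and the server's actual procedure may differ.

```python
def caruana_select(preds: list[list[float]], y: list[float], rounds: int) -> list[int]:
    """Greedy forward selection with replacement.

    preds[m][i] is model m's held-out prediction for sample i.
    Returns how many times each model was picked; the ensemble is the
    count-weighted average of the models' predictions.
    """
    n_models, n = len(preds), len(y)
    counts = [0] * n_models
    running_sum = [0.0] * n  # sum of predictions of the models picked so far
    for step in range(1, rounds + 1):
        best_m, best_loss = -1, float("inf")
        for m in range(n_models):
            # Mean squared error of the ensemble if model m were added.
            loss = sum(
                ((running_sum[i] + preds[m][i]) / step - y[i]) ** 2
                for i in range(n)
            ) / n
            if loss < best_loss:  # strict: ties keep the earlier model
                best_m, best_loss = m, loss
        counts[best_m] += 1
        running_sum = [s + p for s, p in zip(running_sum, preds[best_m])]
    return counts
```

With preds = [[0.0, 0.0], [2.0, 2.0], [5.0, 5.0]] and y = [1.0, 1.0], two rounds pick model 0 and then model 1, whose average matches the target exactly.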
Simplex grid step for Pareto-style multi-metric sweeps.
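One common reading of a simplex grid is the set of metric-weight vectors whose entries are multiples of the step and sum to 1. The sketch below enumerates those points under that assumption; simplex_grid is a hypothetical helper, and whether the server interprets the step this way is not stated in this page.

```python
from itertools import product


def simplex_grid(n_metrics: int, step: float) -> list[tuple[float, ...]]:
    """Enumerate weight vectors on the simplex with the given step."""
    n = round(1 / step)  # number of steps between 0 and 1
    if n < 1 or abs(n * step - 1) > 1e-9:
        raise ValueError("step must divide 1 evenly")
    points = []
    # Choose integer weights for all but the last metric; the last one
    # takes whatever remains so that every vector sums to exactly n.
    for head in product(range(n + 1), repeat=n_metrics - 1):
        rest = n - sum(head)
        if rest >= 0:
            points.append(tuple(k / n for k in (*head, rest)))
    return points
```

Working in integer steps and dividing once at the end avoids accumulated floating-point drift in the weights.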
Distill from a same-server trained model identified by id. The synth-gen worker calls the inference worker in-process — no HTTP self-loop. Mutually exclusive with teacher_predict_url.
If set, distill from an external teacher at this URL. The worker POSTs {'samples': [[...]], 'feature_names': [...]}. Use teacher_model_id for same-server teachers instead.
Headers to send when calling teacher_predict_url.
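The teacher options above can be sketched as two small helpers: one that rejects requests setting both teacher fields, and one that builds the JSON body in the documented {'samples', 'feature_names'} shape. Both function names are hypothetical, and the row-width check is an added sanity check rather than documented server behaviour.

```python
import json


def check_teacher_fields(teacher_model_id=None, teacher_predict_url=None) -> None:
    """Raise if both teacher options are set; they are mutually exclusive."""
    if teacher_model_id is not None and teacher_predict_url is not None:
        raise ValueError("teacher_model_id and teacher_predict_url are mutually exclusive")


def build_teacher_payload(samples: list[list[float]], feature_names: list[str]) -> str:
    """Serialize the body POSTed to an external teacher."""
    width = len(feature_names)
    for row in samples:
        if len(row) != width:
            raise ValueError("each sample must have one value per feature name")
    return json.dumps({"samples": samples, "feature_names": feature_names})
```

An external teacher implementing this contract would parse the same two keys from the request body.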
Response
Successful Response
POST /v1/trainer/run -- async trainer job submission response.
dataset_id echoes the dataset this run trained against.