# ClearTrace Client

## Attributes
| Name | Type | Description |
|---|---|---|
| `model_id` | `str \| None` | Model ID for the current job. |
| `feature_names` | `list[str] \| None` | Column names captured during fit. |
| `status` | `StatusResult` | Current job status. |
## Methods

### `outerproduct.client.ClearTrace`
Client for the OuterProduct ClearTrace API.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `base_url` | `str` | API server URL. | `None` |
| `api_key` | `str` | Bearer token for authentication. | `None` |
| `timeout` | `float` | Max seconds for HTTP requests and async job polling. | `300` |
| `poll_interval` | `float` | Initial delay between status polls (exponential backoff). | `2.0` |
| `max_retries` | `int` | Retries on transient HTTP errors (502/503/504). | `3` |
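
A minimal connection sketch; the URL and key below are placeholders for your own deployment:

```python
from outerproduct.client import ClearTrace

# Placeholder credentials; substitute your deployment's values.
ct = ClearTrace(
    base_url="https://cleartrace.example.com",
    api_key="YOUR_API_KEY",
    timeout=300,        # cap HTTP calls and async job polling at 5 minutes
    poll_interval=2.0,  # first status poll after 2 s, then exponential backoff
)
```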
Methods:
| Name | Description |
|---|---|
| `create_upload` | Request a presigned upload URL from the API. |
| `upload_fileobj` | Upload data to S3 using a presigned URL. |
| `upload_file` | Upload a data file to S3 via a presigned URL. |
| `fit` | Train a ClearTrace model on labelled data. |
| `fit_distill` | Distill a black-box model via its predict URL. |
| `predict` | Batch predictions for `X`. Returns a numpy array. |
| `explain` | Batch prediction + AGOP-based explanation. |
| `predict_and_explain` | Batch predict and explain in one call. |
| `interpret` | Global feature importance via AGOP. |
| `get_schema` | Retrieve the persisted schema manifest for the loaded model. |
| `scenario` | Counterfactual search with constraints. |
| `segment` | Supervised segmentation (async). |
| `get_segments` | Retrieve completed segmentation results. |
| `narrative` | LLM-generated natural language summary (async). |
| `get_narrative` | Retrieve completed narrative results. |
| `health` | Check API server health. |
| `load` | Attach to an existing model by ID. |
#### `create_upload(file_format, *, model_id=None)`
Request a presigned upload URL from the API.
This is the first step of a multi-step upload flow. The returned
UploadResult contains the upload_url to PUT data to.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `file_format` | `str` | File format of the data to upload (CSV, Parquet, or pickle). | required |
| `model_id` | `str` | Custom model ID. Auto-generated by the server if omitted. | `None` |

Returns:

| Type | Description |
|---|---|
| `UploadResult` | The upload descriptor, including the presigned `upload_url` to PUT data to. |
#### `upload_fileobj(upload, fileobj)`
Upload data to S3 using a presigned URL.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `upload` | `UploadResult` | The result returned by `create_upload`. | required |
| `fileobj` | `bytes` or binary file-like object | The data to upload, as raw `bytes` or an open binary file object. | required |
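
A sketch of the two-step upload flow; the file name is hypothetical and `"csv"` is an assumed `file_format` value:

```python
# Step 1: request a presigned S3 URL from the API.
upload = ct.create_upload("csv")  # "csv" is an assumed format string

# Step 2: PUT the raw bytes to that URL.
with open("train.csv", "rb") as f:
    ct.upload_fileobj(upload, f)
```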
#### `upload_file(path)`
Upload a data file to S3 via a presigned URL.
Convenience method that combines create_upload and upload_fileobj into a single call.
The file is uploaded as raw bytes — CSV, Parquet, or pickle format is auto-detected from the file extension.
After uploading, call fit or fit_distill with
target to specify the label column in the uploaded file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `path` | `str` or `PathLike` | Path to the file to upload. Accepted formats: CSV, Parquet, or pickle (detected from the file extension). | required |

Returns:

| Type | Description |
|---|---|
| `UploadResult` | The upload descriptor for the stored file. |
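
A sketch of the pre-upload training path, with hypothetical file and column names; `ct.status` is the documented client attribute for the current job:

```python
ct.upload_file("train.parquet")  # format auto-detected from the extension
result = ct.fit(target="label")  # train on the pre-uploaded file
print(ct.status)                 # current job status (StatusResult)
```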
#### `fit(data=None, *, target, feature_fields=None, feature_schema=None, wait=True, **config)`
Train a ClearTrace model on labelled data.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `data` | `DataFrame`, `ndarray`, or `None` | Dataset containing features (and optionally the target column). Omit when using pre-uploaded data (see `upload_file`). | `None` |
| `target` | `str`, array-like, or `None` | If `data` is a DataFrame, the name of the target column. If `data` is an ndarray, the target values directly. Omit when using pre-uploaded data. | required |
| `feature_fields` | `list[str]` | Column names to use as features. Only applies when `data` is a DataFrame. If omitted, all columns except `target` are used. | `None` |
| `wait` | `bool` | Block until training completes (default `True`). | `True` |
| `**config` | `Any` | Forwarded to the server as flat fields. | `{}` |

Returns:

| Type | Description |
|---|---|
| `JobResult` | The training job result. |
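
A minimal training sketch, assuming a pandas DataFrame with a hypothetical `churn` label column:

```python
import pandas as pd

df = pd.read_csv("customers.csv")  # hypothetical dataset
job = ct.fit(
    df,
    target="churn",
    feature_fields=["age", "income", "tenure"],  # subset of df's columns
)
```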
#### `fit_distill(data=None, predict_url=None, *, target=None, predict_headers=None, labels=None, feature_fields=None, feature_schema=None, wait=True, **config)`
Distill a black-box model via its predict URL.
The server calls predict_url to obtain teacher predictions, then trains an xRFM student model.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `data` | `DataFrame`, `ndarray`, or `None` | Feature matrix. Omit when using pre-uploaded data (see `upload_file`). | `None` |
| `predict_url` | `str` | URL of the black-box model's predict endpoint. | `None` |
| `target` | `str` | Name of the label column in the uploaded file. Only used with pre-uploaded data. Optional for distill (the teacher predictions can drive training alone). | `None` |
| `predict_headers` | `dict[str, str]` | Headers to include when calling `predict_url`. | `None` |
| `labels` | array-like | Optional ground-truth labels for evaluation. | `None` |
| `feature_fields` | `list[str]` | Column names to use as features. Only applies when `data` is a DataFrame. If omitted, all columns are used. | `None` |
| `wait` | `bool` | Block until training completes (default `True`). | `True` |
| `**config` | `Any` | Forwarded to the server (same options as `fit`). | `{}` |

Returns:

| Type | Description |
|---|---|
| `JobResult` | The distillation job result. |
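
A distillation sketch; the teacher endpoint and header values are placeholders:

```python
job = ct.fit_distill(
    df[["age", "income", "tenure"]],                    # feature matrix only
    predict_url="https://teacher.example.com/predict",  # placeholder endpoint
    predict_headers={"Authorization": "Bearer TEACHER_KEY"},
    labels=df["churn"],  # optional ground truth for evaluating the student
)
```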
#### `predict(X)`

Batch predictions for `X`. Returns a numpy array.
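
Usage sketch, assuming a trained or loaded model and the `df` from the `fit` example:

```python
import numpy as np

preds = ct.predict(df[["age", "income", "tenure"]].head(10))
assert isinstance(preds, np.ndarray)  # predictions arrive as a numpy array
```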
#### `explain(X, *, feature_names=None, use_sqrt=True, raw_gradient=True)`
Batch prediction + AGOP-based explanation.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `X` | `DataFrame`, `ndarray`, or nested list | 2-D feature matrix (n_samples, n_features). | required |
| `feature_names` | `list[str]` | Feature names. Auto-extracted from DataFrame columns. | `None` |
| `use_sqrt` | `bool` | Use sqrt scaling (default `True`). | `True` |
| `raw_gradient` | `bool` | Return raw gradient (default `True`). | `True` |

Returns:

| Type | Description |
|---|---|
| `ExplanationResult` | Prediction plus AGOP-based explanation for each row. |
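
A sketch of the ndarray path, where `feature_names` must be passed explicitly (the values are made up):

```python
import numpy as np

X = np.array([[34, 52_000.0, 3], [61, 88_000.0, 12]])
explanation = ct.explain(X, feature_names=["age", "income", "tenure"])
```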
#### `predict_and_explain(X, *, feature_names=None, use_sqrt=False, raw_gradient=True, with_persona=False, rule_kwargs=None)`
Batch predict and explain in one call.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `X` | `DataFrame`, `ndarray`, or nested list | 2-D feature matrix (n_samples, n_features). | required |
| `feature_names` | `list[str]` | Feature names. Auto-extracted from DataFrame columns. | `None` |
| `use_sqrt` | `bool` | Use sqrt scaling (default `False`). | `False` |
| `raw_gradient` | `bool` | Return raw gradient (default `True`). | `True` |
| `with_persona` | `bool` | Include persona information (default `False`). | `False` |
| `rule_kwargs` | `dict[str, Any]` | When provided, enables local-rule computation. | `None` |

Returns:

| Type | Description |
|---|---|
| `PredictAndExplainResult` | |
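
A combined-call sketch; `rule_kwargs` is omitted because its accepted keys are not listed in this reference:

```python
result = ct.predict_and_explain(
    df[["age", "income", "tenure"]].head(5),
    with_persona=True,  # also include persona information
)
```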
#### `interpret()`

Global feature importance via AGOP.

#### `get_schema()`

Retrieve the persisted schema manifest for the loaded model.
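
Both calls take no arguments and operate on the currently loaded model; a usage sketch:

```python
importances = ct.interpret()  # global feature importance via AGOP
schema = ct.get_schema()      # persisted schema manifest for the loaded model
```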
#### `scenario(queries, *, feature_names=None, desired_class=1, n_walks=500, max_steps=30, epsilon=0.2, random_state=42, constraints=None)`
Counterfactual search with constraints.
Finds counterfactual points that flip the model prediction to
desired_class while respecting any feature constraints.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `queries` | `DataFrame`, `ndarray`, `list[dict]`, or nested list | Query records. Accepts a pandas DataFrame, numpy array, list of dicts, or nested list of values. | required |
| `feature_names` | `list[str]` | Feature names. Auto-extracted from DataFrame columns or dict keys. Required when `queries` is an ndarray or nested list. | `None` |
| `desired_class` | `int` | Target class for the counterfactual (default `1`). | `1` |
| `n_walks` | `int` | Number of random walks (default `500`). | `500` |
| `max_steps` | `int` | Maximum steps per walk (default `30`). | `30` |
| `epsilon` | `float` | Step size (default `0.2`). | `0.2` |
| `random_state` | `int` or `None` | Random seed for reproducibility (default `42`). | `42` |
| `constraints` | `dict[str, dict[str, Any]]` | Per-feature constraints keyed by feature name, e.g. `{"age": {"immutable": True}}`. | `None` |

Returns:

| Type | Description |
|---|---|
| `ScenarioResult` | Supports indexing (`result[0]`) and iteration over its counterfactual candidates; see the examples below. |
Examples:

From a DataFrame:

```python
>>> result = ct.scenario(df[["age", "income"]].head(2))
```

From dicts:

```python
>>> result = ct.scenario(
...     [{"age": 25, "income": 50000}],
...     constraints={"age": {"immutable": True}},
... )
>>> result[0].baseline_prediction
0.3
>>> for candidate in result[0]:
...     print(candidate.changes)
```
#### `segment(*, data=None, target_values=None, feature_names=None, min_clusters=4, max_clusters=10, n_search_steps=50, use_agent=None, kpi_field=None, problem_context=None, wait=True)`
Supervised segmentation (async).
Groups the data into clusters where each cluster has distinct explanation patterns.
If `wait=True` (default), polls `GET /v1/models/{model_id}/segments` until complete and returns a `SegmentationResult`. If `wait=False`, returns a `JobResult` immediately.
#### `get_segments()`
Retrieve completed segmentation results.
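
A sketch of both polling modes described above:

```python
# Blocking: returns a SegmentationResult once the job completes.
segments = ct.segment(min_clusters=4, max_clusters=8)

# Async: returns a JobResult immediately; fetch results later.
job = ct.segment(wait=False)
segments = ct.get_segments()  # after the job has completed
```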
#### `narrative(data, *, feature_names=None, kpi_name, context=None, max_tool_calls=6, wait=True)`

LLM-generated natural language summary (async).
#### `get_narrative()`
Retrieve completed narrative results.
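
A sketch of the async path; the `kpi_name` and `context` strings are illustrative:

```python
job = ct.narrative(df, kpi_name="churn", context="Quarterly retention review", wait=False)
summary = ct.get_narrative()  # retrieve once the job has completed
```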
#### `health()`
Check API server health.
#### `load(model_id)`
Attach to an existing model by ID.
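
A reattachment sketch with a hypothetical model ID:

```python
ct2 = ClearTrace(base_url="https://cleartrace.example.com", api_key="YOUR_API_KEY")
ct2.load("model-abc123")  # hypothetical ID from an earlier fit
preds = ct2.predict(df[["age", "income", "tenure"]])
```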