outerproduct.reasoning - OuterProduct

The outerproduct.reasoning module is the entry point for training reasoning models and analysing their behaviour across populations. op.reasoning.fit() trains a ReasoningModel that produces both predictions and feature-level attributions. op.reasoning.pattern_tracker.fit() takes a fitted ReasoningModel and condenses its explanations over a target prediction band into a small set of named, executable filter patterns you can apply to any new data.

`op.reasoning.fit()`

op.reasoning.fit(
    dataset: Dataset,
    task: Task | None = None,
    teacher: Model | Predictor | None = None,
    model_types: list[str] | None = None,
    metric: Metric | list[Metric] | None = None,
    n_hyperopt_steps: int = 5,
    random_state: int = 42,
) -> ReasoningFitJob

Submits a training job and returns a non-blocking Job handle immediately. Call .wait() on the handle to block until training completes and receive the trained ReasoningModel.

Parameters

dataset

Dataset

required

Training data wrapped in an op.Dataset. Build one with op.LocalDataset.from_pandas(df).upload(), op.LocalDataset.from_csv(...).upload(), or any other LocalDataset constructor, or reference a connector. The column you name in task’s label_column (or the labels supplied by teacher) is used as the target; all remaining columns are treated as features.

task

Task | None

The supervised learning task, which carries the target label_column. Pass the config that matches your problem:

Config	Use when
`op.Binclass(label_column=...)`	Binary classification (e.g. churn, fraud, approval)
`op.Multiclass(label_column=...)`	Multi-class classification (three or more discrete classes)
`op.Regression(label_column=...)`	Continuous numerical target
`op.Forecasting(label_column=..., id_column=..., timestamp_column=..., horizon=..., lookback=...)`	Time-series forecasting
`op.SequenceBinclass(label_column=..., id_column=..., timestamp_column=...)`	Per-entity binary classification over each entity’s sequence
`op.SequenceMulticlass(label_column=..., id_column=..., timestamp_column=...)`	Per-entity multi-class classification over each entity’s sequence
`op.SequenceRegression(label_column=..., id_column=..., timestamp_column=...)`	Per-entity regression over each entity’s sequence (accepted, not yet executable)

Required unless teacher is provided. When a teacher is set, OuterProduct queries it to generate training labels and task becomes optional.

model_types

list[str] | None

Restrict the model family search to a specific list, e.g. ["tabm", "xgboost"]. When None, OuterProduct selects from its full portfolio of model families. Providing this list is useful when you have a latency or interpretability constraint that rules out certain architectures.

metric

Metric | list[Metric] | None

The optimization target(s). Pass a single op.Metric to optimize directly, or a list to trigger a Pareto sweep across their weightings. When None, OuterProduct picks a sensible default for the task.

n_hyperopt_steps

int

default: 5

Number of hyperparameter optimisation trials to run. Higher values improve model quality at the cost of longer training time. Defaults to 5.

random_state

int

default: 42

Seed for the training search, for reproducible runs. Defaults to 42.

teacher

Model | Predictor | None

A teacher model for knowledge distillation. Accepts either a previously trained OuterProduct Model / ReasoningModel or a Predictor wrapping an external HTTP scoring endpoint. When set, OuterProduct trains the new ReasoningModel to mimic the teacher’s output, adding full reasoning to a black-box predictor.

Return value

returns

Job[ReasoningModel]

A non-blocking job handle. The job runs on OuterProduct’s hosted infrastructure. Use the methods below to interact with it.

.wait()

ReasoningModel

Blocks the current thread until the job completes, then returns the trained ReasoningModel. This is the most common pattern for scripts and notebooks.

.status()

str

Returns the current job state without blocking: "pending", "running", "completed", or "failed".

.results()

any

Returns the raw result payload once the job has completed. Prefer .wait() for typed access to the model.

Examples

import outerproduct as op

op.init(api_key="your-api-key")

dataset = op.LocalDataset.from_csv("customers.csv").upload()

model = op.reasoning.fit(
    dataset,
    task=op.Binclass(label_column="churn"),
).wait()  # ReasoningModel

# Upload the new rows once and reuse the same dataset handle for both calls.
new_data = op.LocalDataset.from_csv("new_customers.csv").upload()

predictions = model.predict(new_data)
reasoning   = model.explain(new_data)

For large datasets or long hyperopt runs, use the non-blocking pattern and poll job.status() so your script can do other work while training proceeds.

`op.reasoning.pattern_tracker.fit()`

op.reasoning.pattern_tracker.fit(
    model: ReasoningModel,
    dataset: Dataset,
    target_range: tuple[float | None, float | None],
) -> Job[PatternTracker]

Distils a ReasoningModel’s explanation behaviour on a specific prediction band into a compact, portable set of named filter patterns. The fitted PatternTracker can then be applied to any schema-compatible dataset to score rows against those patterns.

Parameters

model

ReasoningModel

required

A trained ReasoningModel (produced by op.reasoning.fit().wait()). The tracker learns from this model’s explanations over the supplied dataset.

dataset

Dataset

required

The dataset used to fit the tracker. The tracker analyses the model’s predictions and attributions over these rows to extract recurring patterns within the target_range.

target_range

tuple[float | None, float | None]

required

An inclusive prediction band that defines which rows are considered “positive” examples for pattern extraction. Either bound may be None for an open-ended range; at least one bound must be set.

`target_range`	Selects
`(0.5, None)`	`pred >= 0.5`, likely-positive cohort
`(None, 0.5)`	`pred <= 0.5`, likely-negative cohort
`(0.4, 0.6)`	`0.4 <= pred <= 0.6`, borderline / uncertain band
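The band semantics in the table can be sketched in a few lines of plain Python. `in_target_range` is an illustrative helper, not an OuterProduct function; it only restates the inclusive-bound and at-least-one-bound rules described above.

```python
def in_target_range(pred, target_range):
    """Return True if pred lies in the inclusive band (low, high).

    None leaves the band open on that side; at least one bound must be set.
    """
    low, high = target_range
    if low is None and high is None:
        raise ValueError("target_range needs at least one bound")
    if low is not None and pred < low:
        return False
    if high is not None and pred > high:
        return False
    return True


preds = [0.10, 0.40, 0.50, 0.60, 0.95]
for band in [(0.5, None), (None, 0.5), (0.4, 0.6)]:
    selected = [p for p in preds if in_target_range(p, band)]
    print(band, selected)
```

Because both bounds are inclusive, a prediction of exactly 0.5 is selected by `(0.5, None)` and by `(None, 0.5)`.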

Return value

returns

Job[PatternTracker]

A non-blocking job handle. Call .wait() to block until fitting completes and receive the PatternTracker. You can also poll with .status() or access the raw payload with .results().

import outerproduct as op

op.init(api_key="your-api-key")

dataset = op.LocalDataset.from_csv("customers.csv").upload()

model = op.reasoning.fit(
    dataset, task=op.Binclass(label_column="churn")
).wait()

pt = op.reasoning.pattern_tracker.fit(
    model,
    dataset,
    target_range=(0.5, None),  # analyse rows where pred >= 0.5
).wait()

print(f"{len(pt.patterns)} patterns; coverage={pt.coverage_fit:.0%}")

PatternTracker

PatternTracker is produced by op.reasoning.pattern_tracker.fit().wait(). It holds a set of named filter patterns and can score any schema-compatible dataset against them.

Attributes

patterns

list[FilterPattern]

The patterns discovered during fitting. Each entry is a FilterPattern with a human-readable label and quality metrics. See FilterPattern below.

coverage_fit

float

Fraction of rows in the fitting dataset that are matched by at least one pattern. A value of 0.82 means 82 % of the fitting set falls under at least one named pattern.
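As a worked illustration of that definition, the sketch below computes coverage from a small boolean match matrix. It uses plain lists and made-up rows, and shows how the number is defined rather than how OuterProduct computes it internally.

```python
def coverage(match_rows):
    """Fraction of rows matched by at least one pattern.

    match_rows: one list of booleans per row, one boolean per pattern.
    """
    if not match_rows:
        return 0.0
    covered = sum(1 for row in match_rows if any(row))
    return covered / len(match_rows)


# Five rows scored against two hypothetical patterns.
match_rows = [
    [True, False],
    [False, True],
    [True, True],    # matching both patterns still counts once
    [False, False],  # the only uncovered row
    [True, False],
]
print(f"coverage={coverage(match_rows):.0%}")  # coverage=80%
```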

Methods

`pt.transform()`

pt.transform(X: Dataset) -> pd.DataFrame

Returns a boolean DataFrame of shape (n_rows, n_patterns). Each column corresponds to one pattern (named by FilterPattern.label); a cell is True if that row matches that pattern. Rows can match multiple patterns simultaneously.

X

Dataset

required

New data to score. Must be schema-compatible with the dataset used to fit the tracker.

returns

pd.DataFrame

Boolean DataFrame of shape (n_rows, n_patterns). Column names match the FilterPattern.label values in pt.patterns.

`pt.distribution()`

pt.distribution(X: Dataset) -> pd.Series

Returns match rates (the fraction of rows in X that match each pattern) as a pd.Series indexed by pattern label.

X

Dataset

required

New data to score.

returns

pd.Series

Float Series of shape (n_patterns,), indexed by FilterPattern.label. Values are in [0, 1].

`pt.partition()`

pt.partition(X: Dataset) -> dict[str, np.ndarray]

Returns a dict mapping each pattern label to the integer row indices in X that match it. Useful when you want to extract the actual rows belonging to each pattern segment.

X

Dataset

required

New data to score.

returns

dict[str, np.ndarray]

Mapping from pattern label to a 1-D integer array of matching row indices. A row may appear under multiple pattern labels.
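The match rates and row indices are two views of the same row-by-pattern match matrix that `transform()` describes. A minimal sketch of that relationship, using plain Python lists in place of the pandas and NumPy return types: the two functions are local stand-ins named after the methods for readability, and the data is made up.

```python
def distribution(matches):
    """Match rate per pattern: fraction of rows flagged True in each column."""
    return {label: sum(flags) / len(flags) for label, flags in matches.items()}


def partition(matches):
    """Row indices per pattern: positions where each column is True."""
    return {
        label: [i for i, flag in enumerate(flags) if flag]
        for label, flags in matches.items()
    }


# Hypothetical match matrix for four rows, stored column-wise by pattern label.
matches = {
    "high_credit_low_income": [True, False, True, True],
    "recent_late_payments": [False, False, True, False],
}

print(distribution(matches))
print(partition(matches))
```

Row 2 matches both patterns, so it appears in both index lists, and the match rates do not need to sum to 1.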

Full usage example

import outerproduct as op
import pandas as pd

op.init(api_key="your-api-key")

# --- Fit ---
dataset = op.LocalDataset.from_csv("customers.csv").upload()

model = op.reasoning.fit(
    dataset, task=op.Binclass(label_column="churn")
).wait()

pt = op.reasoning.pattern_tracker.fit(
    model,
    dataset,
    target_range=(0.5, None),
).wait()

# Inspect the discovered patterns
print(f"{len(pt.patterns)} patterns; coverage={pt.coverage_fit:.0%}")
for fp in pt.patterns:
    print(f"  {fp.label}: precision={fp.precision:.2f}, lift={fp.lift:.2f}")

# --- Apply to new data ---
X_new = op.LocalDataset.from_pandas(pd.read_csv("new_customers.csv")).upload()

# Boolean match matrix
match_matrix = pt.transform(X_new)
print(match_matrix.head())

# Match rates per pattern
print(pt.distribution(X_new))

# Row indices per pattern
segments = pt.partition(X_new)
for label, indices in segments.items():
    print(f"{label}: {len(indices)} matching rows")

transform(), distribution(), and partition() require X to be schema-compatible with the dataset used during pattern_tracker.fit(). Mismatched column names will raise a local validation error before any network call is made.

FilterPattern

A single named pattern discovered by the PatternTracker.

label

str

A human-readable name for the pattern, generated by OuterProduct to summarise the feature conditions that define it (e.g. "high_credit_low_income").

precision

float

Fraction of rows matching this pattern whose prediction also falls within target_range. Higher precision means the pattern is a more reliable indicator of the target cohort. Values are in [0, 1].

lift

float

Ratio of the pattern’s precision to the base rate of target_range in the fitting dataset. A lift of 2.0 means rows matching this pattern are twice as likely to fall in the target band as a randomly chosen row.
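The two metrics can be worked through by hand on a toy dataset. The sketch below follows the definitions above using per-row boolean flags; the helper name and the ten rows are hypothetical, and it is not the library's implementation.

```python
def precision_and_lift(in_band, matched):
    """Compute a pattern's precision and lift from per-row boolean flags.

    in_band[i] : row i's prediction falls within target_range
    matched[i] : row i matches the pattern
    """
    n_matched = sum(matched)
    n_hits = sum(1 for b, m in zip(in_band, matched) if b and m)
    precision = n_hits / n_matched          # share of matched rows in the band
    base_rate = sum(in_band) / len(in_band)  # share of all rows in the band
    return precision, precision / base_rate


# Ten hypothetical rows: five are in the target band, and the pattern matches
# four rows, three of which are in the band.
in_band = [True, True, True, True, True, False, False, False, False, False]
matched = [True, True, True, False, False, True, False, False, False, False]

precision, lift = precision_and_lift(in_band, matched)
print(f"precision={precision:.2f}, lift={lift:.2f}x")  # precision=0.75, lift=1.50x
```

Here the base rate is 0.5, so a precision of 0.75 gives a lift of 1.5: rows matching the pattern are one and a half times as likely to be in the band as a randomly chosen row.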

for fp in pt.patterns:
    print(f"{fp.label}")
    print(f"  precision : {fp.precision:.2%}")
    print(f"  lift      : {fp.lift:.2f}x")
# high_credit_low_income
#   precision : 91.30%
#   lift      : 2.41x
# recent_late_payments
#   precision : 87.50%
#   lift      : 2.31x
