MLNode
Graph node that runs a classical-ML model through one of three runtimes:
sklearn, xgboost, or onnx. Wraps the loaders in stargraph.ml.loaders.
Constructor
| Parameter | Type | Default | Description |
|---|---|---|---|
| `model_id` | `str` | required | Unique identifier within the model registry. |
| `version` | `str` | required | Semver-ish version; forms the cache key with `model_id` for the ONNX session pool. |
| `runtime` | `Literal["sklearn", "xgboost", "onnx"]` | required | Runtime selector. |
| `file_uri` | `str \| None` | `None` | `file://` URI of the model bytes. `None` defers to a registry lookup at execute time (Phase-3 stub; the registry lands in task 3.38). |
| `allow_unsafe_pickle` | `bool` | `False` | Default-deny gate for the sklearn unsafe-deserialize path (FR-30 antipattern guard #4). No effect on xgboost / onnx. |
| `expected_sha256` | `str \| None` | `None` | Optional pinned SHA-256 of the model file; verified before any deserialize step. |
| `input_field` | `str` | `"x"` | State field to read inference inputs from. |
| `output_field` | `str` | `"y"` | State field to write predictions to. |
All parameters are keyword-only.
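The `expected_sha256` pin can be pictured as hashing the file bytes and comparing the digest before any deserializer sees them. The following is a minimal sketch under stated assumptions, not the code in `stargraph.ml.loaders`: `verify_sha256` is a hypothetical helper name, and `ValueError` stands in for whatever error type the loaders actually raise.

```python
import hashlib
import hmac
from pathlib import Path
from typing import Optional


def verify_sha256(path: Path, expected_sha256: Optional[str]) -> bytes:
    """Return the file bytes, raising if they do not match the pinned digest.

    Hypothetical helper for illustration; not part of stargraph.
    """
    data = path.read_bytes()
    if expected_sha256 is not None:
        actual = hashlib.sha256(data).hexdigest()
        # Constant-time comparison of the two hex digests.
        if not hmac.compare_digest(actual, expected_sha256.lower()):
            raise ValueError(
                f"sha256 mismatch: expected {expected_sha256}, got {actual}"
            )
    # Only bytes that passed the check (or were unpinned) reach the caller.
    return data
```

Returning the verified bytes, rather than re-reading the file afterwards, keeps the check and the later deserialize step working on the same data.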
Eager construction
Construction is eager-validated: the runtime is checked, the safe-deserialize
gate fires for `runtime="sklearn"` when `allow_unsafe_pickle=False`, and the
underlying ONNX session is warmed via the module-scope cache. No combination
of `allow_unsafe_pickle=False` and a sklearn `file://` URI builds a usable
node; the failure mode is identical whether the graph is built up-front or
lazily.
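The constructor-time checks described above can be sketched as follows. This is an illustrative stand-in, not the real class: `SketchMLNode`, `_SESSION_POOL`, and `_warm_session` are hypothetical names, the local `MLNodeError` only mirrors the documented exception so the block runs on its own, and the placeholder session object stands in for a real ONNX session. Only the `set allow_unsafe_pickle=True to opt in` fragment of the message comes from this page.

```python
from typing import Dict, Optional, Tuple

SUPPORTED_RUNTIMES = ("sklearn", "xgboost", "onnx")

# Module-scope pool: one warmed session per (model_id, version) key.
_SESSION_POOL: Dict[Tuple[str, str], object] = {}


class MLNodeError(Exception):
    """Stand-in for stargraph's MLNodeError so the sketch is self-contained."""


def _warm_session(model_id: str, version: str) -> object:
    """Return the cached session for this key, creating it on first use."""
    key = (model_id, version)
    if key not in _SESSION_POOL:
        _SESSION_POOL[key] = object()  # placeholder for a real ONNX session
    return _SESSION_POOL[key]


class SketchMLNode:
    """Illustrative stand-in that models only the constructor-time checks."""

    def __init__(self, *, model_id: str, version: str, runtime: str,
                 allow_unsafe_pickle: bool = False) -> None:
        if runtime not in SUPPORTED_RUNTIMES:
            raise MLNodeError(f"unsupported runtime: {runtime!r}")
        if runtime == "sklearn" and not allow_unsafe_pickle:
            raise MLNodeError(
                "sklearn deserialization is unsafe; "
                "set allow_unsafe_pickle=True to opt in"
            )
        self.runtime = runtime
        # Warm eagerly so a bad model fails at build time, not at first execute.
        self.session: Optional[object] = (
            _warm_session(model_id, version) if runtime == "onnx" else None
        )
```

Because the pool is keyed on `(model_id, version)`, two nodes built for the same model and version share one session, while a version bump gets a fresh one.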
State contract
- Reads: `state.<input_field>` (default `state.x`).
- Writes: `{output_field: predictions}` (default `{"y": predictions}`).
Inference is offloaded to a worker thread via asyncio.to_thread so the event
loop is never blocked by a sync .predict(...) call.
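A minimal sketch of this read, offload, and write pattern, assuming a state object with attribute access and a model exposing a synchronous `predict`. `BlockingModel` and `execute` are hypothetical names used for illustration; the real node's method names and state type may differ.

```python
import asyncio
import threading
from types import SimpleNamespace


class BlockingModel:
    """Hypothetical stand-in for a loaded model with a synchronous predict()."""

    def predict(self, xs):
        # Record the executing thread so the offload is observable.
        self.ran_on = threading.current_thread().name
        return [2 * x for x in xs]


async def execute(model, state, input_field="x", output_field="y"):
    """Read state.<input_field>, predict off the event loop, return the update."""
    inputs = getattr(state, input_field)
    # asyncio.to_thread runs the blocking call in a worker thread and
    # suspends this coroutine until the result is ready.
    predictions = await asyncio.to_thread(model.predict, inputs)
    return {output_field: predictions}


model = BlockingModel()
caller_thread = threading.current_thread().name
update = asyncio.run(execute(model, SimpleNamespace(x=[1, 2, 3])))
print(update)
```

While the worker thread is inside `predict`, the event loop stays free to schedule other coroutines, which is the point of the offload.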
Side effects + replay
- `side_effects = none`: inference is a pure function of the loaded model and the input.
- Replay re-executes natively unless the registered `content_hash` changed (a registry mismatch raises `IncompatibleModelHashError`, which the FR-21 `must-stub` envelope routes through the recorded cassette).
YAML
nodes:
  - id: infer_node
    kind: ml
    spec:
      model_id: "$state.model_id"
      version: "$state.version"
      runtime: sklearn
      input_field: x
      output_field: y
      allow_unsafe_pickle: true
See tests/fixtures/training-subgraph.yaml for the full training-as-subgraph
recipe (FR-32, design §3.9.4).
Errors
All errors are MLNodeError:
- Unsupported runtime (not one of `sklearn` / `xgboost` / `onnx`).
- `runtime="sklearn"` with `allow_unsafe_pickle=False`: the message includes `set allow_unsafe_pickle=True to opt in`.
- `runtime="onnx"` reaching `_predict` with no warmed session (`onnx session not initialised`).
- Other runtimes reaching `_predict` with no loaded model (`model not loaded`).
- Loader errors (sidecar skew, hash mismatch, `.bin` xgboost) propagate from `stargraph.ml.loaders` at construction time.
See also
- `NodeBase`: abstract contract.
- `SubGraphNode`: composes train + register + infer per design §3.9.4.