Skills¶

Reference for stargraph.skills — the Skill base class, the SkillKind taxonomy, the salience scorer protocol, and the in-tree reference skills (RAG, Shipwright).

Public surface¶

from stargraph.skills import (
    Example,
    ReactSkill, ReactState, ReactStep, ToolCallRecord,
    RuleBasedScorer, SalienceContext, SalienceScorer,
    Skill, SkillKind,
    refs,
)

Skill model¶

stargraph.skills.base.Skill is a pydantic.BaseModel (FR-21..FR-24). The plugin loader pre-validates each instance and registers it via register_skills.

Field	Type	Default	Purpose
`name`	`str`	—	Slug; same validator surface as `ToolSpec.name`.
`version`	`str`	—	SemVer.
`kind`	`SkillKind`	—	Role taxonomy.
`description`	`str`	—	Human-readable.
`tools`	`list[str]`	`[]`	Fully-qualified tool ids the skill may call (`<ns>.<name>@<ver>`).
`subgraph`	`str \\| None`	`None`	Path to an IR document or inline `IRDocument` reference. Phase 3 routes execution through `SubGraphNode` using this reference.
`system_prompt`	`str \\| None`	`None`	Instruction template.
`state_schema`	`type[BaseModel]`	—	Pydantic model whose field names are the FR-23 declared output channels.
`requires`	`list[str]`	`[]`	Capability strings checked by the FR-7 capabilities gate.
`examples`	`list[Example]`	`[]`	Few-shot examples.
`bubble_events`	`bool`	`True`	FR-24 default-on (LangGraph #2484 mitigation).
`declared_output_keys`	`frozenset[str]`	derived	Set automatically from `state_schema.model_fields`.
`site_id` (computed)	`str`	derived	`f"{name}@{version}"` (POC formula). Stable handle the engine uses for checkpointing.

Replay-safe state

The _validate_declared_outputs model validator walks state_schema.model_fields and rejects any field typed as set or set[X] (or with a nested set). Replay-safe state requires hashable, immutable collections — use frozenset (NFR-2). The validator also populates declared_output_keys so the engine SubGraphNode boundary translator can enforce the write whitelist at registration time, not at runtime.

`SkillKind` enum¶

stargraph.skills.base.SkillKind is a StrEnum:

Value	Meaning
`agent`	Agentic loop (e.g. ReAct, planner-executor). Free to call tools and terminate by its own criteria.
`workflow`	Deterministic multi-step flow with a fixed control structure.
`utility`	Single-purpose helper (formatter, validator, classifier).

`Example` model¶

class Example(BaseModel):
    inputs: dict[str, Any]
    expected_output: dict[str, Any] | None = None

Carried in Skill.examples. Used both for documentation and for downstream eval harnesses.

Registration¶

Skills register through the standard pluggy hookspec:

from stargraph.plugin import hookimpl
from stargraph.skills import Skill, SkillKind
from pydantic import BaseModel

class MyState(BaseModel):
    answer: str = ""

MY_SKILL = Skill(
    name="my-skill",
    version="0.1.0",
    kind=SkillKind.utility,
    description="...",
    state_schema=MyState,
)

@hookimpl
def register_skills() -> list[Skill]:
    return [MY_SKILL]

The plugin loader (see PluginManifest) pre-validates each instance and pre-checks namespace conflicts before any register_skills hookimpl executes its module body.

SkillSpec vs Skill

stargraph.ir.SkillSpec is the portable IR record — no Pydantic validators, no Python-only types, safe to serialise across languages. stargraph.skills.Skill is the runtime Pydantic class with validators, computed fields, and state_schema: type[BaseModel]. Hookspecs declare list[SkillSpec] for IR portability; the registry accepts both shapes in the runtime path.

state_schema and declared output channels¶

state_schema is the contract between the skill and the engine SubGraphNode boundary translator (FR-23, design §3.7):

The validator collects state_schema.model_fields.keys() into declared_output_keys.
At graph-construction time the engine builds a write whitelist from declared_output_keys.
Any attempt by the skill subgraph to write a key not in the whitelist is rejected at registration / boundary translation time — not at runtime. This is the replay-first stance from design §3.7.

set / set[X] annotations on state_schema fields raise at construction time; replace them with frozenset.

bubble_events¶

bubble_events: bool = True  # FR-24 default-on

When True, events emitted inside the skill subgraph bubble to the parent graph's event stream. Default-on mitigates LangGraph issue #2484, where subgraph events were silently dropped at the parent boundary. Skills that want self-contained event scopes can set it to False.

ReactSkill (POC)¶

stargraph.skills.react.ReactSkill (design §3.9, FR-25, AC-7.5, AC-10.4) is the in-tree think → act → observe tool-loop reference skill.

`ToolCallRecord`¶

class ToolCallRecord(BaseModel):
    name: str                 # fully-qualified tool id
    arguments: dict[str, Any] = {}
    result: Any = None
    error: str | None = None  # str-formatted exception when dispatch raised

Mirrors the shape engine FR-24 dispatchers consume. No regex parsing — attribute access on the dict only.

`ReactStep`¶

class ReactStep(BaseModel):
    thought: str
    tool_call: ToolCallRecord | None = None
    observation: str | None = None

A single trajectory entry.

`ReactState`¶

ReactState is the state_schema for ReactSkill (engine SubGraphNode honors its field names as the declared output channels).

Field	Type	Default	Purpose
`trajectory`	`list[ReactStep]`	`[]`	Append-only trajectory.
`tool_calls`	`list[ToolCallRecord]`	`[]`	Mirror of every dispatched tool call (independent of trajectory ordering).
`done`	`bool`	`False`	Set by `_think` when a final answer is reached.
`error_budget`	`int`	`3`	Decremented on each tool exception. Termination at `<= 0`.
`final_answer`	`str \\| None`	`None`	Populated when `done` flips True with a final answer.
`step_index`	`int`	`0`	Counts completed iterations; pinned by `test_max_steps_termination`.

`ReactSkill` constructor wiring¶

Field	Default	Purpose
`kind`	`SkillKind.agent`	Hard-coded agentic.
`state_schema`	`ReactState`	—
`max_steps`	`10`	Manifest-level wall-clock cap.
`llm_stub`	`None`	Callable `(state, ctx) -> {"reasoning", "tool_call", "done", "final_answer"}`. Required at run time; production wires through the engine model registry in Phase 3.
`tool_impls`	`{}`	Maps `tool_call["name"]` → callable invoked with `**arguments`. Distinct from `Skill.tools` which lists declared tool ids.

Termination order (whichever fires first):

state.done flipped True by _think.
state.error_budget exhausted (decremented on tool exception).
self.max_steps reached.

Salience scoring¶

stargraph.skills.salience provides the pluggable scorer used to gate episodic → semantic memory consolidation (FR-31, design §3.6). Episodes scoring below a caller-chosen threshold are filtered before the consolidation rule body fires, so noise never gets promoted (AC-5.5).

`SalienceScorer` Protocol¶

@runtime_checkable
class SalienceScorer(Protocol):
    async def score(self, memory: Episode, context: SalienceContext) -> float: ...

Returns a float in [0, 1]. Stable across v1 (rule-based) → v2 (embedding similarity) → v3 (learned scorer) — only weights and the scorer instance change.

`SalienceContext`¶

Field	Type	Default	Purpose
`query_embedding`	`list[float] \\| None`	`None`	Reserved for v2 relevance scoring; v1 ignores.
`last_access_ts`	`datetime`	—	Recency anchor (Park 2023 §4.1: decay since last access, not creation).
`access_count`	`int`	—	Frequency factor input.
`rule_match_count`	`int`	—	Rule-affinity factor input.
`weights`	`dict[str, float]`	`{"recency": 1.0, "relevance": 0.0, "importance": 0.0}`	v1 default weights gate relevance and importance to zero.
`decay_tau_seconds`	`float`	`86400.0`	Recency decay constant.

`RuleBasedScorer`¶

v1 default. Implements the Park et al. 2023 formula structurally:

score = w_recency * exp(-Δt / τ)
      + w_relevance * cos(q_emb, m_emb)
      + w_importance * imp(m)

…multiplied by tanh(access_count / 10) (frequency) and tanh(rule_match_count / 5) (rule affinity), then clamped to [0, 1]. v1 weights default relevance=0.0 and importance=0.0 (rule-based-only constraint per epic decision); v2 swaps to embedding similarity, v3 swaps to a learned scorer behind the same Protocol.

refs subpackage¶

stargraph.skills.refs ships the in-tree reference Skill implementations (FR-32..FR-34):

Module	Skill	Status
`stargraph.skills.refs.rag`	`RagSkill` (retrieval + LLM-stub answer assembly)	POC (FR-32 / AC-7.1)
`stargraph.skills.refs.autoresearch`	autoresearch reference
`stargraph.skills.refs.wiki`	wiki reference

RagSkill composes stargraph.nodes.retrieval.RetrievalNode (vector + doc fan-out) with a deterministic LLM stub and an answer-assembly step. Capability requirements declared on the manifest: db.vectors:read, db.docs:read, llm.generate. Subgraph IR lives at tests/fixtures/skills/rag/example.yaml; Phase 2 loads it via Skill.subgraph and routes execution through SubGraphNode.

For the bigger picture and rationale see Reference skills knowledge and Skills knowledge.

Shipwright skill example¶

stargraph.skills.shipwright is the canonical "skill bundle" example — a Stargraph graph that authors Stargraph graphs from a brief. It lives in src/stargraph/skills/shipwright/ and is structured like a real out-of-tree plugin so contributors can copy the layout.

File	Role
`manifest.yaml`	Skill identity (`id`, `version`, `kind: workflow`, `description`, `state_schema` reference).
`stargraph.yaml`	Graph definition: `state` reference, node list (`triage_gate`, `parse_brief`, `gap_check`, `propose_questions`, `human_input`, `synthesize_graph`, `verify_static`, `verify_tests`, `verify_smoke`, `fix_loop`), Bosun rule packs, governance packs, store providers, checkpoint config.
`_pack.py`	Loader for `stargraph.bosun.shipwright.*` sub-packs. Splits `rules.clp` into top-level constructs and feeds each to `fathom.Engine._env.build` for precise compile-error attribution. Mirrors `tests/integration/bosun/_helpers.py` so production nodes (`GapCheck`, `FixLoop`) and tests share one canonical loader.
`state.py`	The `State` Pydantic model referenced from `manifest.yaml#state_schema` and `stargraph.yaml#state`.
`nodes/`	Per-node modules (`triage`, `parse`, `interview`, `synthesize`, `verify`, `fix`).
`templates/`	Prompt fragments and rendered output templates.
`graph.yaml`	Companion graph artifact.

Snippet from manifest.yaml:

id: stargraph.skills.shipwright
version: "0.1.0"
kind: workflow
description: |
  Shipwright — a Stargraph graph that authors Stargraph graphs from a brief.
  Interview-driven (rules + LLM dual-truth), verified end-to-end, replayable.
state_schema: stargraph.skills.shipwright.state:State

Snippet from stargraph.yaml:

name: shipwright
state: ./state.py:State

nodes:
  - name: gap_check
    type: stargraph.skills.shipwright.nodes.interview:GapCheck
  - name: human_input
    type: stargraph.nodes.human_input
    expected_input_schema_from: open_questions

rules:
  - pack: stargraph.bosun.shipwright.gaps
  - pack: stargraph.bosun.shipwright.edits

stores:
  doc: sqlite:./.shipwright/docs.db
  fact: sqlite:./.shipwright/facts.db

checkpoints:
  every: node-exit
  store: sqlite:./.shipwright/checkpoints.db

Skills¶

Public surface¶

Skill model¶

SkillKind enum¶

Example model¶

Registration¶

state_schema and declared output channels¶

bubble_events¶

ReactSkill (POC)¶

ToolCallRecord¶

ReactStep¶

ReactState¶

ReactSkill constructor wiring¶