Skills¶
Reference for stargraph.skills — the Skill base class, the SkillKind taxonomy, the salience scorer protocol, and the in-tree reference skills (RAG, Shipwright).
See also: PluginManifest, Hookspec catalog, Reference skills knowledge, Skills knowledge.
Public surface¶
from stargraph.skills import (
Example,
ReactSkill, ReactState, ReactStep, ToolCallRecord,
RuleBasedScorer, SalienceContext, SalienceScorer,
Skill, SkillKind,
refs,
)
Skill model¶
stargraph.skills.base.Skill is a pydantic.BaseModel (FR-21..FR-24). The plugin loader pre-validates each instance and registers it via register_skills.
| Field | Type | Default | Purpose |
|---|---|---|---|
name |
str |
— | Slug; same validator surface as ToolSpec.name. |
version |
str |
— | SemVer. |
kind |
SkillKind |
— | Role taxonomy. |
description |
str |
— | Human-readable. |
tools |
list[str] |
[] |
Fully-qualified tool ids the skill may call (<ns>.<name>@<ver>). |
subgraph |
str \| None |
None |
Path to an IR document or inline IRDocument reference. Phase 3 routes execution through SubGraphNode using this reference. |
system_prompt |
str \| None |
None |
Instruction template. |
state_schema |
type[BaseModel] |
— | Pydantic model whose field names are the FR-23 declared output channels. |
requires |
list[str] |
[] |
Capability strings checked by the FR-7 capabilities gate. |
examples |
list[Example] |
[] |
Few-shot examples. |
bubble_events |
bool |
True |
FR-24 default-on (LangGraph #2484 mitigation). |
declared_output_keys |
frozenset[str] |
derived | Set automatically from state_schema.model_fields. |
site_id (computed) |
str |
derived | f"{name}@{version}" (POC formula). Stable handle the engine uses for checkpointing. |
Replay-safe state
The _validate_declared_outputs model validator walks state_schema.model_fields and rejects any field typed as set or set[X] (or with a nested set). Replay-safe state requires hashable, immutable collections — use frozenset (NFR-2). The validator also populates declared_output_keys so the engine SubGraphNode boundary translator can enforce the write whitelist at registration time, not at runtime.
SkillKind enum¶
stargraph.skills.base.SkillKind is a StrEnum:
| Value | Meaning |
|---|---|
agent |
Agentic loop (e.g. ReAct, planner-executor). Free to call tools and terminate by its own criteria. |
workflow |
Deterministic multi-step flow with a fixed control structure. |
utility |
Single-purpose helper (formatter, validator, classifier). |
Example model¶
Carried in Skill.examples. Used both for documentation and for downstream eval harnesses.
Registration¶
Skills register through the standard pluggy hookspec:
from stargraph.plugin import hookimpl
from stargraph.skills import Skill, SkillKind
from pydantic import BaseModel
class MyState(BaseModel):
answer: str = ""
MY_SKILL = Skill(
name="my-skill",
version="0.1.0",
kind=SkillKind.utility,
description="...",
state_schema=MyState,
)
@hookimpl
def register_skills() -> list[Skill]:
return [MY_SKILL]
The plugin loader (see PluginManifest) pre-validates each instance and pre-checks namespace conflicts before any register_skills hookimpl executes its module body.
SkillSpec vs Skill
stargraph.ir.SkillSpec is the portable IR record — no Pydantic validators, no Python-only types, safe to serialise across languages. stargraph.skills.Skill is the runtime Pydantic class with validators, computed fields, and state_schema: type[BaseModel]. Hookspecs declare list[SkillSpec] for IR portability; the registry accepts both shapes in the runtime path.
state_schema and declared output channels¶
state_schema is the contract between the skill and the engine SubGraphNode boundary translator (FR-23, design §3.7):
- The validator collects
state_schema.model_fields.keys()intodeclared_output_keys. - At graph-construction time the engine builds a write whitelist from
declared_output_keys. - Any attempt by the skill subgraph to write a key not in the whitelist is rejected at registration / boundary translation time — not at runtime. This is the replay-first stance from design §3.7.
set / set[X] annotations on state_schema fields raise at construction time; replace them with frozenset.
bubble_events¶
When True, events emitted inside the skill subgraph bubble to the parent graph's event stream. Default-on mitigates LangGraph issue #2484, where subgraph events were silently dropped at the parent boundary. Skills that want self-contained event scopes can set it to False.
ReactSkill (POC)¶
stargraph.skills.react.ReactSkill (design §3.9, FR-25, AC-7.5, AC-10.4) is the in-tree think → act → observe tool-loop reference skill.
ToolCallRecord¶
class ToolCallRecord(BaseModel):
name: str # fully-qualified tool id
arguments: dict[str, Any] = {}
result: Any = None
error: str | None = None # str-formatted exception when dispatch raised
Mirrors the shape engine FR-24 dispatchers consume. No regex parsing — attribute access on the dict only.
ReactStep¶
class ReactStep(BaseModel):
thought: str
tool_call: ToolCallRecord | None = None
observation: str | None = None
A single trajectory entry.
ReactState¶
ReactState is the state_schema for ReactSkill (engine SubGraphNode honors its field names as the declared output channels).
| Field | Type | Default | Purpose |
|---|---|---|---|
trajectory |
list[ReactStep] |
[] |
Append-only trajectory. |
tool_calls |
list[ToolCallRecord] |
[] |
Mirror of every dispatched tool call (independent of trajectory ordering). |
done |
bool |
False |
Set by _think when a final answer is reached. |
error_budget |
int |
3 |
Decremented on each tool exception. Termination at <= 0. |
final_answer |
str \| None |
None |
Populated when done flips True with a final answer. |
step_index |
int |
0 |
Counts completed iterations; pinned by test_max_steps_termination. |
ReactSkill constructor wiring¶
| Field | Default | Purpose |
|---|---|---|
kind |
SkillKind.agent |
Hard-coded agentic. |
state_schema |
ReactState |
— |
max_steps |
10 |
Manifest-level wall-clock cap. |
llm_stub |
None |
Callable (state, ctx) -> {"reasoning", "tool_call", "done", "final_answer"}. Required at run time; production wires through the engine model registry in Phase 3. |
tool_impls |
{} |
Maps tool_call["name"] → callable invoked with **arguments. Distinct from Skill.tools which lists declared tool ids. |
Termination order (whichever fires first):
state.doneflipped True by_think.state.error_budgetexhausted (decremented on tool exception).self.max_stepsreached.
Salience scoring¶
stargraph.skills.salience provides the pluggable scorer used to gate episodic → semantic memory consolidation (FR-31, design §3.6). Episodes scoring below a caller-chosen threshold are filtered before the consolidation rule body fires, so noise never gets promoted (AC-5.5).
SalienceScorer Protocol¶
@runtime_checkable
class SalienceScorer(Protocol):
async def score(self, memory: Episode, context: SalienceContext) -> float: ...
Returns a float in [0, 1]. Stable across v1 (rule-based) → v2 (embedding similarity) → v3 (learned scorer) — only weights and the scorer instance change.
SalienceContext¶
| Field | Type | Default | Purpose |
|---|---|---|---|
query_embedding |
list[float] \| None |
None |
Reserved for v2 relevance scoring; v1 ignores. |
last_access_ts |
datetime |
— | Recency anchor (Park 2023 §4.1: decay since last access, not creation). |
access_count |
int |
— | Frequency factor input. |
rule_match_count |
int |
— | Rule-affinity factor input. |
weights |
dict[str, float] |
{"recency": 1.0, "relevance": 0.0, "importance": 0.0} |
v1 default weights gate relevance and importance to zero. |
decay_tau_seconds |
float |
86400.0 |
Recency decay constant. |
RuleBasedScorer¶
v1 default. Implements the Park et al. 2023 formula structurally:
…multiplied by tanh(access_count / 10) (frequency) and tanh(rule_match_count / 5) (rule affinity), then clamped to [0, 1]. v1 weights default relevance=0.0 and importance=0.0 (rule-based-only constraint per epic decision); v2 swaps to embedding similarity, v3 swaps to a learned scorer behind the same Protocol.
refs subpackage¶
stargraph.skills.refs ships the in-tree reference Skill implementations (FR-32..FR-34):
| Module | Skill | Status |
|---|---|---|
stargraph.skills.refs.rag |
RagSkill (retrieval + LLM-stub answer assembly) |
POC (FR-32 / AC-7.1) |
stargraph.skills.refs.autoresearch |
autoresearch reference | |
stargraph.skills.refs.wiki |
wiki reference |
RagSkill composes stargraph.nodes.retrieval.RetrievalNode (vector + doc fan-out) with a deterministic LLM stub and an answer-assembly step. Capability requirements declared on the manifest: db.vectors:read, db.docs:read, llm.generate. Subgraph IR lives at tests/fixtures/skills/rag/example.yaml; Phase 2 loads it via Skill.subgraph and routes execution through SubGraphNode.
For the bigger picture and rationale see Reference skills knowledge and Skills knowledge.
Shipwright skill example¶
stargraph.skills.shipwright is the canonical "skill bundle" example — a Stargraph graph that authors Stargraph graphs from a brief. It lives in src/stargraph/skills/shipwright/ and is structured like a real out-of-tree plugin so contributors can copy the layout.
| File | Role |
|---|---|
manifest.yaml |
Skill identity (id, version, kind: workflow, description, state_schema reference). |
stargraph.yaml |
Graph definition: state reference, node list (triage_gate, parse_brief, gap_check, propose_questions, human_input, synthesize_graph, verify_static, verify_tests, verify_smoke, fix_loop), Bosun rule packs, governance packs, store providers, checkpoint config. |
_pack.py |
Loader for stargraph.bosun.shipwright.* sub-packs. Splits rules.clp into top-level constructs and feeds each to fathom.Engine._env.build for precise compile-error attribution. Mirrors tests/integration/bosun/_helpers.py so production nodes (GapCheck, FixLoop) and tests share one canonical loader. |
state.py |
The State Pydantic model referenced from manifest.yaml#state_schema and stargraph.yaml#state. |
nodes/ |
Per-node modules (triage, parse, interview, synthesize, verify, fix). |
templates/ |
Prompt fragments and rendered output templates. |
graph.yaml |
Companion graph artifact. |
Snippet from manifest.yaml:
id: stargraph.skills.shipwright
version: "0.1.0"
kind: workflow
description: |
Shipwright — a Stargraph graph that authors Stargraph graphs from a brief.
Interview-driven (rules + LLM dual-truth), verified end-to-end, replayable.
state_schema: stargraph.skills.shipwright.state:State
Snippet from stargraph.yaml:
name: shipwright
state: ./state.py:State
nodes:
- name: gap_check
type: stargraph.skills.shipwright.nodes.interview:GapCheck
- name: human_input
type: stargraph.nodes.human_input
expected_input_schema_from: open_questions
rules:
- pack: stargraph.bosun.shipwright.gaps
- pack: stargraph.bosun.shipwright.edits
stores:
doc: sqlite:./.shipwright/docs.db
fact: sqlite:./.shipwright/facts.db
checkpoints:
every: node-exit
store: sqlite:./.shipwright/checkpoints.db