Steward H2 proof — 2026-06-10

Status: H2 smoke passed · Audience: maintainers and agents

Claim

mcp_flutter is the first external Skill Steward adoption target with a current steward/v1 contract and a passing H2 smoke loop. This proves contract discovery, safe action inspection, quick probe execution, and one strict dogfood benchmark. It does not prove product runtime correctness, full H4 workflow breadth, or H5 promoted harness maturity.

Repo state

Branch: codex/steward-adoption-h2
Contract commit: af35ac3185c31fb26edb4aa546834f70208a5600
Redacted proof snapshot commit: f6fd961bf3ba34bc9b663469169733fc3bfa091c
Worktree state during H2 loop: clean
Native gate: make check-contracts
Steward scenario: mcp_flutter.web-dogfood-warm
Local ignored benchmark summary: .steward/benchmark-summaries/mcp_flutter.web-dogfood-warm.strict.json
Tracked redacted review summary: docs/evidence/generated/mcp_flutter.web-dogfood-warm.strict.redacted.json

Contract refresh

repo.archetype now uses the public Skill Steward archetype vocabulary: harness.
stewardship.repo_quality is declared with contract_spec, maturity_model, and an evidence path.
AGENTS.md now records the released steward executable as the reusable command surface. The earlier tool/steward/run.sh bridge was temporary proof scaffolding for a stale local binary and is not the adoption pattern for future repos.

Portable commands

Run from the repository root after installing a current steward binary.

make check-contracts
steward doctor --json
steward actions list --json
steward action inspect fmt.check.tool-prefix --json
steward probe --profile quick --json
steward benchmark --scenario mcp_flutter.web-dogfood-warm --strict --output .steward/benchmark-summaries/mcp_flutter.web-dogfood-warm.strict.json --json
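
The sequence above can be scripted so that no later gate runs after an earlier one fails. This is a minimal sketch, assuming each command signals failure through a non-zero exit code (the source does not state this). `run_gates` and its injectable `runner` parameter are hypothetical names used for illustration only.

```python
import shlex
import subprocess
from typing import Callable, List, Optional

SCENARIO = "mcp_flutter.web-dogfood-warm"
SUMMARY = f".steward/benchmark-summaries/{SCENARIO}.strict.json"

# Commands copied from the list above, in the same order.
GATES: List[str] = [
    "make check-contracts",
    "steward doctor --json",
    "steward actions list --json",
    "steward action inspect fmt.check.tool-prefix --json",
    "steward probe --profile quick --json",
    f"steward benchmark --scenario {SCENARIO} --strict --output {SUMMARY} --json",
]


def _run(argv: List[str]) -> int:
    """Run one command in the current directory and return its exit code."""
    return subprocess.run(argv).returncode


def run_gates(
    gates: List[str] = GATES,
    runner: Callable[[List[str]], int] = _run,
) -> Optional[str]:
    """Run gates in order; return the first failing command, or None.

    Assumption: a non-zero exit code means the gate failed.
    """
    for command in gates:
        if runner(shlex.split(command)) != 0:
            return command
    return None
```

Called from the repository root, `run_gates()` returns `None` when every gate exits with zero and otherwise returns the text of the first command that failed. The `runner` parameter exists so the ordering logic can be exercised without a steward binary installed.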

Local provenance

The original run used the maintainer's private Dart SDK path plus a sibling Skill Steward checkout. A later proof used a temporary repo-local wrapper to bridge a stale global binary. Both forms are recorded as provenance only and are not meant to be copied; future evidence should use the released steward executable, Skill Steward's setup action, or an explicit maintainer source checkout.

Results

Gate	Result
`make check-contracts`	Passed; existing skill metadata/source warnings only
`doctor --json`	Passed; `config.valid: true`, `repo.archetype: "harness"`
`actions list --json`	Passed; exposed `fmt.check.tool-prefix`
`action inspect fmt.check.tool-prefix --json`	Passed; action is `bounded_local` / `auto`, with no writes, no network, no secrets, and no destructive effects
`probe --profile quick --json`	Passed; selected `fmt.check.tool-prefix`
`benchmark --strict --output ...`	Passed; `result: "pass"`, `blocked_by: null`, `durability.status: "ready"`, `proof.status: "ready"`
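
The pass criteria in the benchmark row can be checked mechanically against the summary file. This is a minimal sketch, assuming the summary is a JSON object in which the dotted names in the table (`durability.status`, `proof.status`) correspond to nested keys. The actual summary layout is not shown in this document, and `load_summary` and `benchmark_passed` are hypothetical helpers.

```python
import json
from typing import Any, Dict


def load_summary(path: str) -> Dict[str, Any]:
    """Read a benchmark summary from a JSON file."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def benchmark_passed(summary: Dict[str, Any]) -> bool:
    """Return True only if all four fields match the values in the table.

    Assumed layout: top-level "result" and "blocked_by", plus nested
    "durability" and "proof" objects that each carry a "status" key.
    """
    durability = summary.get("durability") or {}
    proof = summary.get("proof") or {}
    return (
        summary.get("result") == "pass"
        # A missing "blocked_by" key is treated as a failure, not as null.
        and summary.get("blocked_by", "missing") is None
        and durability.get("status") == "ready"
        and proof.get("status") == "ready"
    )
```

Requiring all four fields together reflects the table: a `result` of `"pass"` alone would not show that proof and durability were also ready.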

The tracked redacted summary preserves the benchmark result, proof status, durability status, subject commit, and warnings while omitting machine-local command paths.
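
The redaction described above behaves like an allowlist: only named fields are copied, so a machine-local path cannot leak through a field nobody anticipated. This is a minimal sketch, assuming top-level keys named `result`, `warnings`, `proof`, `durability`, and `source`. These key names are inferred from the fields mentioned in this document, not confirmed against the real summary format, and `redact_summary` is a hypothetical helper rather than the tool that produced the tracked file.

```python
from typing import Any, Dict, Optional, Tuple

# Assumed field layout. None keeps the value whole; a tuple lists the
# only sub-keys copied from a nested object.
KEEP: Dict[str, Optional[Tuple[str, ...]]] = {
    "result": None,
    "warnings": None,
    "proof": ("status",),
    "durability": ("status",),
    "source": ("commit",),
}


def redact_summary(summary: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the summary holding only allowlisted fields."""
    redacted: Dict[str, Any] = {}
    for key, subkeys in KEEP.items():
        if key not in summary:
            continue
        value = summary[key]
        if subkeys is None:
            redacted[key] = value
        elif isinstance(value, dict):
            # Copy only the listed sub-keys; everything else is dropped.
            redacted[key] = {k: value[k] for k in subkeys if k in value}
    return redacted
```

Note that this sketch copies `warnings` verbatim, so a warning message that itself embeds a local path would need separate handling.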

Non-claims

This does not prove the WebMCP runtime dogfood path.
This does not promote a new diagnostic or action to H5.
This does not prove every mcp_flutter workflow is agent-operable.
source.commit is treated as the benchmark subject commit. If a later local HEAD differs from it, a warning is still produced; this is not remote-equivalence proof.
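
The last point can be made concrete: comparing the recorded subject commit with the current HEAD shows whether the evidence still describes the checked-out tree, but it says nothing about the remote. This is a minimal sketch, assuming `source.commit` is stored as a `commit` key nested under `source` and holds a full SHA. `subject_commit_warning` is a hypothetical helper, and its message text is illustrative, not steward's actual warning.

```python
import subprocess
from typing import Any, Dict, Optional


def current_head() -> str:
    """Return the SHA of the local HEAD using `git rev-parse HEAD`."""
    completed = subprocess.run(
        ["git", "rev-parse", "HEAD"],
        check=True,
        capture_output=True,
        text=True,
    )
    return completed.stdout.strip()


def subject_commit_warning(summary: Dict[str, Any], head: str) -> Optional[str]:
    """Return a warning string if HEAD is not the benchmark subject commit.

    Returns None when the two match. A match is a local fact only; it
    does not show that the remote branch points at the same commit.
    """
    subject = (summary.get("source") or {}).get("commit")
    if subject is None:
        return "summary records no source.commit"
    if subject != head:
        return (
            f"local HEAD {head[:12]} differs from "
            f"benchmark subject {subject[:12]}"
        )
    return None
```

A typical call would be `subject_commit_warning(summary, current_head())`, run from inside the repository.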
