Steward H2 proof — 2026-06-10
Status: H2 smoke passed · Audience: maintainers and agents
Claim
mcp_flutter is the first external Skill Steward adoption target with a steward/v1 contract and a passing H2 smoke loop. This proves contract discovery, safe action inspection, quick probe execution, and one strict contract benchmark. It does not prove product runtime correctness, full H4 workflow breadth, full native-gate execution through Steward, or H5 promoted harness maturity.
Repo state
- Branch: `codex/steward-adoption-h2`
- Contract commit: `af35ac3185c31fb26edb4aa546834f70208a5600`
- Redacted proof snapshot commit: `f6fd961bf3ba34bc9b663469169733fc3bfa091c`
- Worktree state during H2 loop: clean
- Native gate: `make check-contracts`
- Steward scenario: `mcp_flutter.web-dogfood-warm`
- Local ignored benchmark summary: `.steward/benchmark-summaries/mcp_flutter.web-dogfood-warm.strict.json`
- Tracked redacted review summary: `docs/evidence/generated/mcp_flutter.web-dogfood-warm.strict.redacted.json`
Contract refresh
- `repo.archetype` now uses the public Skill Steward archetype vocabulary: `harness.stewardship.repo_quality` is declared with `contract_spec`, `maturity_model`, and evidence path.
- `AGENTS.md` now records the released `steward` executable as the reusable command surface. The earlier `tool/steward/run.sh` bridge was temporary proof scaffolding for a stale local binary and is not the adoption pattern for future repos.
- Current contract management now uses `mcp_flutter.contract-status-smoke` as the first smoke scenario. The historical `mcp_flutter.web-dogfood-warm` name is superseded because it never proved live WebMCP runtime behavior.
- `steward.yaml` exposes additional quick-safe contract slices and a non-quick `fmt.check.contracts-full` action so local Steward CLI users can inspect the full native gate's effects before running `make check-contracts` directly.
Portable commands
Run from the repository root after installing a current steward binary.
make check-contracts
steward doctor --json
steward actions list --json
steward action inspect fmt.check.tool-prefix --json
steward probe --profile quick --json
steward benchmark --scenario mcp_flutter.contract-status-smoke --strict --output .steward/benchmark-summaries/mcp_flutter.contract-status-smoke.strict.json --json
steward action inspect fmt.check.contracts-full --json
make check-contracts
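The command sequence above can also be driven from a small script. The following is a minimal sketch, assuming that stopping at the first non-zero exit code is the desired behavior (the note itself does not specify this); `PORTABLE_COMMANDS` and `run_sequence` are hypothetical names introduced for illustration, and the command strings are copied from the list above.

```python
import shlex
import subprocess

# Commands copied verbatim from the "Portable commands" list.
PORTABLE_COMMANDS = [
    "make check-contracts",
    "steward doctor --json",
    "steward actions list --json",
    "steward action inspect fmt.check.tool-prefix --json",
    "steward probe --profile quick --json",
    "steward benchmark --scenario mcp_flutter.contract-status-smoke --strict"
    " --output .steward/benchmark-summaries/mcp_flutter.contract-status-smoke.strict.json --json",
    "steward action inspect fmt.check.contracts-full --json",
    "make check-contracts",
]


def run_sequence(commands, run=subprocess.run):
    """Run commands in order and return (command, returncode) pairs.

    Assumption: the sequence stops at the first failing command.
    `run` is injectable so the loop can be exercised without a steward binary.
    """
    results = []
    for command in commands:
        completed = run(shlex.split(command), check=False)
        results.append((command, completed.returncode))
        if completed.returncode != 0:
            break
    return results


if __name__ == "__main__":
    # Run from the repository root.
    for command, code in run_sequence(PORTABLE_COMMANDS):
        print(f"{code}\t{command}")
```

Injecting `run` keeps the loop testable on machines where `steward` is not installed.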
Local provenance
The original run used the maintainer's private Dart SDK path plus a sibling Skill Steward checkout. A later proof used a temporary repo-local wrapper to bridge a stale global binary. Both forms are non-copyable provenance only; future evidence should use the released steward executable, Skill Steward's setup action, or an explicit maintainer source checkout.
Results
| Gate | Result |
|---|---|
| `make check-contracts` | Passed; existing skill metadata/source warnings only |
| `doctor --json` | Passed; `config.valid: true`, `repo.archetype: "harness"` |
| `actions list --json` | Passed; exposed `fmt.check.tool-prefix` |
| `action inspect fmt.check.tool-prefix --json` | Passed; action is `bounded_local` / `auto`, with no writes, no network, no secrets, and no destructive effects |
| `probe --profile quick --json` | Passed; selected `fmt.check.tool-prefix` |
| `benchmark --strict --output ...` | Passed; `result: "pass"`, `blocked_by: null`, `durability.status: "ready"`, `proof.status: "ready"` |
The tracked redacted summary preserves the benchmark result, proof status, durability status, subject commit, and warnings while omitting machine-local command paths.
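A reviewer could check a summary file against the values in the Results table with a few lines of Python. This is a sketch only: it assumes the dotted names in the table (`durability.status`, `proof.status`) correspond to nested JSON objects, which the note does not confirm, and `check_strict_summary` is a hypothetical helper, not part of the Steward CLI.

```python
import json
from pathlib import Path

# Expected values taken from the Results table.
# Assumption: dotted names map to nested JSON objects.
EXPECTED = {
    ("result",): "pass",
    ("blocked_by",): None,
    ("durability", "status"): "ready",
    ("proof", "status"): "ready",
}

_MISSING = object()


def lookup(data, path):
    """Walk nested dicts along `path`; return _MISSING if a key is absent."""
    current = data
    for key in path:
        if not isinstance(current, dict) or key not in current:
            return _MISSING
        current = current[key]
    return current


def check_strict_summary(summary):
    """Return a list of mismatches; an empty list means every field matched."""
    problems = []
    for path, expected in EXPECTED.items():
        actual = lookup(summary, path)
        dotted = ".".join(path)
        if actual is _MISSING:
            problems.append(f"{dotted}: missing")
        elif actual != expected:
            problems.append(f"{dotted}: expected {expected!r}, got {actual!r}")
    return problems


def check_summary_file(path):
    """Load a summary JSON file and check it."""
    return check_strict_summary(json.loads(Path(path).read_text(encoding="utf-8")))
```

For example, `check_summary_file("docs/evidence/generated/mcp_flutter.web-dogfood-warm.strict.redacted.json")` would return an empty list if the tracked summary has the assumed layout and the values shown in the table.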
Current status update - 2026-06-17
- `steward doctor --json` and `steward actions list --json` pass with the expanded action set.
- `steward probe --profile quick --json` passes and runs seven read-only bounded-local checks.
- Strict benchmark reruns for the new/changed scenarios are expected to report `durability_blocked` until `steward.yaml` and the new scenario manifests are committed; that is dirty-input protection, not a contract execution failure.
- Current `steward benchmark` still rejects non-quick actions as safe first probes, so the full native gate remains inspectable through Steward but executable through `make check-contracts`.
- The local `steward` binary in this environment does not expose `schema check-outputs` or `schema drift`; use `doctor`, `actions list`, `action inspect`, `probe`, and `benchmark` here unless the CLI is upgraded.
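Because a strict rerun is expected to be blocked while contract inputs are uncommitted, it can help to check the worktree before running the benchmark. The sketch below parses `git status --porcelain` output and lists dirty paths under a set of watched inputs. It illustrates the idea only and is not how Steward itself detects dirty inputs; the `scenarios` directory name is a hypothetical stand-in for wherever the scenario manifests live.

```python
import subprocess


def parse_porcelain(output):
    """Extract paths from `git status --porcelain` (v1) output.

    Each line is two status characters, a space, then the path.
    """
    paths = []
    for line in output.splitlines():
        if len(line) < 4:
            continue
        status, path = line[:2], line[3:]
        # Renames and copies are reported as "old -> new"; keep the new path.
        if ("R" in status or "C" in status) and " -> " in path:
            path = path.split(" -> ", 1)[1]
        paths.append(path.strip('"'))
    return paths


def dirty_inputs(output, watched):
    """Return dirty paths equal to, or located under, any watched entry."""
    hits = []
    for path in parse_porcelain(output):
        for entry in watched:
            base = entry.rstrip("/")
            if path == base or path.startswith(base + "/"):
                hits.append(path)
                break
    return hits


def git_status(repo_root="."):
    """Return porcelain status, listing untracked files individually."""
    result = subprocess.run(
        ["git", "status", "--porcelain", "--untracked-files=all"],
        cwd=repo_root, check=True, capture_output=True, text=True,
    )
    return result.stdout


if __name__ == "__main__":
    # "scenarios" is a hypothetical manifest location.
    for path in dirty_inputs(git_status(), ["steward.yaml", "scenarios"]):
        print(f"uncommitted contract input: {path}")
```

An empty result suggests the watched inputs are committed, so a blocked strict run would need another explanation.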
Non-claims
- This does not prove the WebMCP runtime dogfood path.
- This does not promote a new diagnostic or action to H5.
- This does not prove every mcp_flutter workflow is agent-operable.
- The contract-smoke scenario proves deterministic contract slices, not release publishing, runtime launch, or visual/product correctness.
- `source.commit` is treated as the benchmark subject commit. A later local `HEAD` can differ and still produce a warning; this is not remote-equivalence proof.
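The last point can be illustrated with a small comparison between the recorded subject commit and the local `HEAD`. This is a sketch under assumptions: `same_commit` and `describe_subject` are hypothetical helpers, abbreviated hashes of at least seven characters are accepted as prefixes, and the check says nothing about whether either commit exists on a remote.

```python
import subprocess

MIN_PREFIX = 7  # Assumption: shortest abbreviated hash accepted for comparison.


def same_commit(source_commit, head_commit):
    """Compare two commit hashes, allowing one to abbreviate the other."""
    a = source_commit.strip().lower()
    b = head_commit.strip().lower()
    if len(a) < MIN_PREFIX or len(b) < MIN_PREFIX:
        return False
    return a.startswith(b) or b.startswith(a)


def describe_subject(source_commit, head_commit):
    """Return a one-line status for the subject commit versus local HEAD."""
    if same_commit(source_commit, head_commit):
        return "ok: local HEAD matches the benchmark subject commit"
    return "warning: local HEAD differs from the benchmark subject commit"


def local_head(repo_root="."):
    """Read the current HEAD hash from git."""
    result = subprocess.run(
        ["git", "rev-parse", "HEAD"],
        cwd=repo_root, check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()
```

With the two hashes recorded under "Repo state", `describe_subject("af35ac3185c31fb26edb4aa546834f70208a5600", "f6fd961bf3ba34bc9b663469169733fc3bfa091c")` returns the warning line, which matches the situation the bullet describes: the snapshot commit came after the contract commit.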
