Interaction Cookbook
Copy-paste recipes for driving a running Flutter app from an agent using the Playwright-style interaction layer. Every example assumes:
WS='ws://127.0.0.1:8181/<token>/ws' # from `discover_debug_apps`
CLI='flutter-mcp-toolkit' # the built CLI binary
All interaction tools run against the shared command catalog and take the
usual nested connection object for targeting.
1. Tap a button
# 1. Snapshot → get refs for every interactive widget on screen.
$CLI exec --name semantic_snapshot \
--args "{\"connection\":{\"targetId\":\"$WS\"}}"
# 2. Tap the ref you want. Pass snapshotId to detect staleness.
$CLI exec --name tap_widget \
--args "{\"ref\":\"s_1\",\"snapshotId\":4,\"connection\":{\"targetId\":\"$WS\"}}"
The response reports which path handled the tap: via: "semantic_action" when
Tier 1 (SemanticsOwner.performAction) succeeded, or via: "pointer_events" when
the tool fell back to synthetic taps.
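To act on that field in a script, one option is to pull via out of the response text. The sketch below assumes the response is printed as JSON and that via is a plain string field. The helper name response_via and the sample response body are made up for illustration and are not part of the toolkit.

```shell
# Extract the value of the "via" field from a JSON response on stdin.
# Assumption: "via" is a plain string without escaped quotes.
response_via() {
  sed -n 's/.*"via"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Hypothetical response body, for illustration only.
sample='{"ok":true,"via":"pointer_events"}'
via=$(printf '%s\n' "$sample" | response_via)

if [ "$via" = "pointer_events" ]; then
  echo "tap fell back to synthetic pointer events"
fi
```

In a real run, the output of the tap_widget call would be piped into response_via in place of the sample string.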
2. Fill a form field
# ref comes from a previous semantic_snapshot call.
$CLI exec --name enter_text \
--args "{\"ref\":\"s_3\",\"text\":\"hello\",\"connection\":{\"targetId\":\"$WS\"}}"
enter_text prefers SemanticsAction.setText and falls back to driving
EditableTextState.userUpdateTextEditingValue directly. TextInputFormatters
and onChanged fire correctly.
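Hand-escaped JSON like the payload above breaks as soon as the text contains a double quote or a backslash. A minimal sketch of a payload builder follows, assuming no JSON tool such as jq is at hand. The names json_escape and enter_text_args are hypothetical helpers, and only backslashes and double quotes are escaped (newlines and control characters are not handled).

```shell
# Escape backslashes and double quotes so a value can sit inside a JSON string.
# Limitation: newlines and control characters are not handled.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Build the --args payload for enter_text.
# $1 = ref, $2 = text to enter, $3 = target WebSocket URI.
enter_text_args() {
  printf '{"ref":"%s","text":"%s","connection":{"targetId":"%s"}}' \
    "$(json_escape "$1")" "$(json_escape "$2")" "$(json_escape "$3")"
}

# Usage (WS and CLI as defined at the top of this page):
#   $CLI exec --name enter_text --args "$(enter_text_args s_3 'say "hi"' "$WS")"
```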
3. Scroll to reveal more content
# Pass a scrollable ref for the deterministic semantic-action path.
$CLI exec --name scroll \
--args "{\"ref\":\"s_6\",\"direction\":\"down\",\"connection\":{\"targetId\":\"$WS\"}}"
# Without a ref, scroll dispatches a PointerScrollEvent at screen centre
# (desktop-friendly wheel path).
$CLI exec --name scroll \
--args "{\"direction\":\"down\",\"distance\":400,\"connection\":{\"targetId\":\"$WS\"}}"
# Re-snapshot — off-screen widgets are only in the tree after they scroll in.
$CLI exec --name semantic_snapshot \
--args "{\"connection\":{\"targetId\":\"$WS\"}}"
Direction follows the Playwright convention: direction: "down" reveals
content below (the finger swipes up).
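The scroll-then-re-snapshot steps can be wrapped in a loop that stops once the wanted widget shows up. This is a sketch under two assumptions: the snapshot output contains the widget's label as plain text, and a fixed 400-unit wheel scroll is an acceptable step. The function name scroll_until_visible is hypothetical; it relies on the CLI and WS variables defined at the top of this page.

```shell
# Scroll down until the snapshot output contains the wanted text, or give up.
# $1 = text to look for, $2 = maximum number of snapshot checks.
# The snapshot is checked before each scroll, so no scroll happens if the
# text is already visible.
scroll_until_visible() {
  want=$1
  attempts=$2
  conn="{\"connection\":{\"targetId\":\"$WS\"}}"
  while [ "$attempts" -gt 0 ]; do
    if $CLI exec --name semantic_snapshot --args "$conn" | grep -qF -- "$want"; then
      return 0
    fi
    $CLI exec --name scroll \
      --args "{\"direction\":\"down\",\"distance\":400,\"connection\":{\"targetId\":\"$WS\"}}" \
      >/dev/null
    attempts=$((attempts - 1))
  done
  return 1
}
```

A call such as scroll_until_visible "Submit" 5 returns 0 once the text appears and 1 after five unsuccessful checks. Where a scrollable ref is known, passing it to scroll instead of relying on the wheel path should be more deterministic.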
4. Read runtime state without registering a tool
$CLI exec --name evaluate_dart_expression \
--args "{\"expression\":\"AgentState.instance.counter\",\"connection\":{\"targetId\":\"$WS\"}}"
Evaluates in the app's root library and returns {result, kind, classRef}.
Useful for asserting "did the tap I just issued actually update state?"
without waiting for a visual diff.
5. Edit → hot reload → see what changed
# After editing a Dart file:
$CLI exec --name hot_reload_and_capture \
--args "{\"connection\":{\"targetId\":\"$WS\"}}"
Returns a single response containing the hotReload report, a fresh screenshot,
fresh semantics (with a new snapshot_id), and any app errors raised during reassembly.
One round trip replaces the classic "reload, then snapshot, then errors" chain.
Staleness handshake
Every interaction tool accepts an optional snapshotId. If it doesn't match
the server's current snapshot, the call returns:
{
"ok": false,
"error": "stale_snapshot",
"providedSnapshotId": 4,
"currentSnapshotId": 7,
"message": "Snapshot is stale. Call semantic_snapshot to get fresh refs."
}
Handle this by re-issuing semantic_snapshot, remapping your refs to the new snapshot, then retrying the call.
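A script can detect this case by inspecting the response before deciding to retry. The sketch below parses the error body shown above with grep and sed; the helper names is_stale and current_snapshot_id are hypothetical, and the parsing assumes each field sits on a single line as in that sample.

```shell
# Succeeds (exit 0) when the response on stdin is a stale_snapshot error.
is_stale() {
  grep -q '"error"[[:space:]]*:[[:space:]]*"stale_snapshot"'
}

# Print the numeric currentSnapshotId from a response on stdin.
current_snapshot_id() {
  sed -n 's/.*"currentSnapshotId"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# The error body from this section, used here as sample input.
response='{
  "ok": false,
  "error": "stale_snapshot",
  "providedSnapshotId": 4,
  "currentSnapshotId": 7,
  "message": "Snapshot is stale. Call semantic_snapshot to get fresh refs."
}'

if printf '%s\n' "$response" | is_stale; then
  fresh=$(printf '%s\n' "$response" | current_snapshot_id)
  echo "stale: server is at snapshot $fresh; re-run semantic_snapshot and remap refs"
fi
```

The extracted ID is useful for logging only. Refs still have to come from a new semantic_snapshot call, because refs from the old snapshot no longer resolve.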
Known limits (quick reference)
- Refs only resolve against the most recent snapshot.
- Off-screen widgets aren't in the snapshot until they scroll into view.
- Scroll by ref is the most reliable path; no-ref scroll uses a PointerScrollEvent that a full-screen overlay can swallow.
- Widgets without Semantics are invisible to the snapshot; reach them via inspect_widget_at_point + Tier 2 pointer events.
- Platform views / custom text input can't be filled through userUpdateTextEditingValue; use evaluate_dart_expression to set state directly.
See CLI quick recipes for copy-paste coverage of the shared command catalog.