Observation-first architecture: eliminate agent lockout from blocking wait tools #61

@tony

1. Problems

Agents get trapped when a pane wait hides the terminal behind a boolean. In a live dev loop, the agent is often tailing a server, watching tests, checking several panes, or deciding whether to interrupt. A blocking wait_for_text call gives it only found or timeout, so the agent cannot see that the server picked a different port, printed an error, is waiting for input, already finished, or that another pane now needs attention. This is the lockout reported in #59.

The deeper problem is tool mismatch. wait_for_text is useful for known future output from a process the agent does not control. It is a poor default for command completion, current-screen inspection, fuzzy diagnosis, or multi-pane monitoring. Those cases need different primitives:

| Agent intent | Better primitive | Common trap |
| --- | --- | --- |
| "Did the command I sent finish?" | wait_for_channel with tmux wait-for -S | Waiting for a prompt or regex in wait_for_text |
| "What is visible right now?" | snapshot_pane / capture_pane | Calling a future-only wait |
| "Which pane mentions this?" | search_panes | Listing pane metadata and missing terminal text |
| "What changed since I last looked?" | Proposed capture_since (#60) | Repeated full snapshots or a long blocking wait |
| "Did this exact third-party line appear?" | wait_for_text | Treating it as a generic completion primitive |
| "Did anything change?" | wait_for_content_change, then inspect | Treating changed=True as semantic success |

The filed wait-family issues (#50 through #55) are concrete instances of that mismatch.

The upstream substrates matter:

  • tmux wait-for is the right zero-poll primitive for authored command synchronization; see cmd-wait-for.c.
  • tmux grid/capture APIs are snapshots, not a durable log stream. History rollover, resize, alternate screen, and wrap behavior all affect what can be reconstructed from capture-pane, grid_collect_history(), and alternate-screen handling in screen.c.
  • FastMCP can report progress through Context.report_progress(), and background tasks can send task status notifications, but UI/client notifications are not a substitute for agent-visible tool results.
  • Python cancellation and timeouts need explicit async subprocess care. CPython documents asyncio.create_subprocess_exec() and asyncio.wait_for() as the async path, while subprocess.run(timeout=...) kills only the child it is managing after timeout; see asyncio-subprocess.rst and subprocess.py.
  • Shell composition is portable only in its simplest form: "cmd; tmux wait-for -S channel". zsh documents list/separator and status behavior in grammar.yo and params.yo. fish has different status variables, events, and prompt markers; see fish_for_bash_users.rst, language.rst events, and mark-prompt. Status-preserving recipes should be documented per shell, not implied to be universal.
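
The cancellation point above can be made concrete with a small sketch. It assumes nothing about libtmux-mcp's internals: run_with_timeout is a hypothetical helper name, and the only library calls used are asyncio.create_subprocess_exec() and asyncio.wait_for(). The point it illustrates is that wait_for() cancels the awaiting coroutine but leaves the child process running, so the caller has to kill and reap it.

```python
from __future__ import annotations

import asyncio
import sys


async def run_with_timeout(argv: list[str], timeout: float) -> tuple[int | None, bytes]:
    """Run argv and return (returncode, stdout); returncode is None on timeout.

    Hypothetical helper for illustration, not part of any library.
    """
    proc = await asyncio.create_subprocess_exec(
        *argv,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.DEVNULL,
    )
    try:
        stdout, _ = await asyncio.wait_for(proc.communicate(), timeout=timeout)
        return proc.returncode, stdout
    except asyncio.TimeoutError:
        return None, b""
    finally:
        # wait_for() only cancels communicate(); the child keeps running
        # unless it is killed and reaped here. This also covers the case
        # where the surrounding task itself is cancelled.
        if proc.returncode is None:
            proc.kill()
            await proc.wait()
```

A caller would use it as `asyncio.run(run_with_timeout(["tmux", "list-panes"], 5.0))`, or with any other argv; the sketch does not handle grandchildren that inherit the pipe.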

2. Avenues of investigation

Primary investment: build capture_since (#60) as the observation-first primitive. It should be non-blocking, cursor-based, and bounded by max_lines / max_bytes, returning actual pane output plus cursor and correctness metadata. The first call should be immediately useful by returning visible screen content and establishing a cursor; later calls should return only deltas. This lets the agent tail dev servers, watch tests, inspect multiple panes, and self-correct without betting on one regex.
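
To make the cursor semantics concrete, here is a minimal in-memory sketch. It is not the #60 design: PaneBuffer, Delta, and the gap / truncated field names are assumptions made for illustration, a deque stands in for tmux scrollback, and only max_lines is modelled (max_bytes, resize, alternate screen, and wrap handling are left out).

```python
from __future__ import annotations

from collections import deque
from dataclasses import dataclass


@dataclass
class Delta:
    lines: list[str]
    cursor: int      # pass this back on the next call
    gap: bool        # True if lines after the old cursor were lost to rollover
    truncated: bool  # True if max_lines cut the result short


class PaneBuffer:
    """Hypothetical stand-in for a pane's scrollback with a history limit."""

    def __init__(self, history_limit: int) -> None:
        self._lines: deque[str] = deque(maxlen=history_limit)
        self._next_seq = 0  # absolute sequence number of the next appended line

    def append(self, line: str) -> None:
        self._lines.append(line)  # silently drops the oldest line when full
        self._next_seq += 1

    def capture_since(self, cursor: int | None, max_lines: int) -> Delta:
        oldest = self._next_seq - len(self._lines)
        if cursor is None:
            # First call: return the most recent lines and establish a cursor.
            start = max(oldest, self._next_seq - max_lines)
            gap = False
        else:
            # Later calls: return only what arrived after the cursor.
            gap = cursor < oldest
            start = max(cursor, oldest)
        end = min(self._next_seq, start + max_lines)
        lines = [self._lines[seq - oldest] for seq in range(start, end)]
        return Delta(lines, cursor=end, gap=gap, truncated=end < self._next_seq)
```

The first call is useful on its own, later calls return deltas, and the metadata tells the agent when it should not trust the delta as complete: gap signals history rollover, truncated signals that another call with the returned cursor will yield more.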

Keep wait_for_channel as the deterministic authored-command primitive. When the agent sends the command, compose a signal into the command text and wait on tmux's channel:

$ command; tmux wait-for -S libtmux_mcp_done

That path is zero-poll inside tmux and avoids the prompt/regex/stale-output trap. The docs should keep the simple recipe portable, then add shell-specific status-preserving variants only after they are verified for zsh, fish, and POSIX-like shells.
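
A rough Python sketch of the same recipe follows. The helper names with_done_signal and wait_for_channel are hypothetical and do not claim to match the server's actual tool implementation; the only external commands assumed are tmux wait-for and tmux wait-for -S. Quoting uses shlex.quote, which assumes a POSIX-like shell in the target pane.

```python
from __future__ import annotations

import shlex
import subprocess


def with_done_signal(command: str, channel: str) -> str:
    """Return command text that signals `channel` once `command` has run.

    Uses ';' so the signal fires whatever the exit status, matching the
    simple portable recipe. Exit status is not preserved.
    """
    return f"{command}; tmux wait-for -S {shlex.quote(channel)}"


def wait_for_channel(channel: str, timeout: float) -> bool:
    """Block until `channel` is signalled; return False on timeout.

    subprocess.run() kills the `tmux wait-for` child it started when the
    timeout expires, so no waiter is left behind by this call.
    """
    try:
        subprocess.run(["tmux", "wait-for", channel], check=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return False
    return True
```

Typical use would be to send `with_done_signal("make test", "libtmux_mcp_done")` to the pane and then call `wait_for_channel("libtmux_mcp_done", timeout=120)`. The blocking subprocess.run call is kept for brevity; an async server would want the asyncio subprocess path instead.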

Investigate FastMCP task=True as an optional enhancement, not the foundation. FastMCP supports task-enabled async tools through TaskConfig, server-side task docs (docs/servers/tasks.mdx), and client-side task=True calls (docs/clients/tasks.mdx). This may reduce client/server lockout for long operations, but it does not by itself solve the core problem: the agent still needs model-visible pane output while deciding what to do.

Investigate event-driven pane streams separately. tmux has real streaming surfaces: pipe-pane and control-mode output via control_write_output(). These may support a future stream_pane, but they introduce lifecycle management, cleanup, raw PTY bytes, ANSI/control-sequence handling, buffering, safety-tier concerns, and stateful subscriptions. They should not block the simpler capture_since path.

Tighten the existing wait/search tools while the observation primitive is explored. #50-#55 still reduce real traps: race accounting, same-row behavior, adaptive backoff, lifecycle parity, structured risk fields, and wrap-aware search fallback all improve the current surface even if the recommended workflow shifts away from blocking waits.
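
As one illustration of the adaptive backoff item, a bounded polling loop might look like the sketch below. This is a generic pattern, not the design from the ticket: poll_with_backoff and all of its parameters are hypothetical, and the clock and sleep functions are injectable only so the loop can be exercised without real delays.

```python
from __future__ import annotations

import time
from typing import Callable


def poll_with_backoff(
    check: Callable[[], bool],
    timeout: float,
    initial: float = 0.05,
    factor: float = 2.0,
    ceiling: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call check() until it returns True or `timeout` seconds elapse.

    The delay between checks starts at `initial`, grows by `factor`, and
    is capped at `ceiling`, so short waits stay responsive and long
    waits poll less often.
    """
    deadline = clock() + timeout
    delay = initial
    while True:
        if check():
            return True
        remaining = deadline - clock()
        if remaining <= 0:
            return False
        # Never sleep past the deadline.
        sleep(min(delay, remaining))
        delay = min(delay * factor, ceiling)
```

A False result means only that the deadline passed; the caller should still inspect the pane before drawing conclusions.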

Rewrite docs and server instructions around scenarios rather than tool names:

| Scenario | Recommended path |
| --- | --- |
| I launched a command and need completion | send_keys("cmd; tmux wait-for -S chan") + wait_for_channel("chan") |
| I need to inspect an already-running pane | snapshot_pane / capture_pane first |
| I need to tail or monitor output | capture_since once available |
| I need to search visible/current pane text | search_panes; use slow path where wrap-spanning text matters |
| I need a known future line from third-party output | wait_for_text with bounded timeout |
| I only need to know something changed | wait_for_content_change, followed by inspection |

Recommended ordering:

  1. Ship documentation improvements immediately so agents stop reaching for wait_for_text as a generic completion tool.
  2. Implement capture_since as the main new primitive (feat: capture_since — non-blocking, cursor-based delta capture for agentic flows #60).
  3. Continue the targeted correctness tickets (#50, "wait_for_text: two-call state/capture race within a single poll tick", through #55, "search_panes: visual-row capture misses wrap-spanning patterns") where they remain valuable.
  4. Add optional FastMCP task support only after client behavior and dependency cost are understood.
  5. Treat pipe-pane / control-mode streaming as a separate future stream_pane design, not a retrofit of blocking wait tools.

This keeps the architecture simple: make observation cheap, explicit, and model-visible; reserve blocking waits for cases where blocking is actually the right abstraction.
