1. Problems
Agents get trapped when a pane wait hides the terminal behind a boolean. In a live dev loop, the agent is often tailing a server, watching tests, checking several panes, or deciding whether to interrupt. A blocking wait_for_text call gives it only found or timeout, so the agent cannot see that the server picked a different port, printed an error, is waiting for input, already finished, or that another pane now needs attention. This is the lockout reported in #59.
The deeper problem is tool mismatch. wait_for_text is useful for known future output from a process the agent does not control. It is a poor default for command completion, current-screen inspection, fuzzy diagnosis, or multi-pane monitoring. Those cases need different primitives:
wait_for_channel with tmux wait-for -S
wait_for_text
snapshot_pane / capture_pane
search_panes
capture_since (#60)
wait_for_text
wait_for_content_change, then inspect
changed=True as semantic success
The filed wait-family issues are concrete instances of that mismatch:
#45 ("wait_for_text matches stale scrollback immediately - needs baseline anchor") fixed stale scrollback instant matches by redefining wait_for_text as "new text after the call begins." That makes search_panes / snapshot_pane the correct answer for "is this already on screen?"
#59 ("wait_for_text's delays (90 seconds) is can lock agentic flows - blocking") covers the 90-second agent lockout when the expected pattern is wrong, misspelled, already scrolled away, or semantically too narrow.
printf, carriage-return progress bars, spinners, and status redraws can stay on the baseline row that wait_for_text deliberately skips to avoid stale matches.
window_pane_search() searches visual rows, so long build/test messages can split across the wrap boundary and silently miss on the fast path.
wait_for_text samples pane state, then calls capture-pane; tmux capture math is evaluated against live history in cmd-capture-pane, while history can move through grid_collect_history().
wait_for_content_change: pane death, respawn, and pane_pid changes are observation facts, not just content-diff edge cases.
ctx.warning() may not reach the model or programmatic caller, so best-effort correctness must be exposed in structured tool results.
#18 ("wait_for_channel and signal_channel block the FastMCP event loop (sync subprocess.run)") fixed the server-side version of this class for wait_for_channel: blocking subprocess.run() on the FastMCP event loop could stall the server. That is separate from agent lockout, where the server may stay healthy but the agent is still trapped inside an uninformative call.
The upstream substrates matter:
tmux wait-for is the right zero-poll primitive for authored command synchronization; see cmd-wait-for.c.
tmux grid/capture APIs are snapshots, not a durable log stream. History rollover, resize, alternate screen, and wrap behavior all affect what can be reconstructed from capture-pane, grid_collect_history(), and alternate-screen handling in screen.c.
FastMCP can report progress through Context.report_progress(), and background tasks can send task status notifications, but UI/client notifications are not a substitute for agent-visible tool results.
Python cancellation and timeouts need explicit async subprocess care. CPython documents asyncio.create_subprocess_exec() and asyncio.wait_for() as the async path, while subprocess.run(timeout=...) kills only the child it is managing after timeout; see asyncio-subprocess.rst and subprocess.py.
Shell composition is portable only at the simple cmd; tmux wait-for -S channel level. zsh documents list/separator and status behavior in grammar.yo and params.yo. fish has different status variables, events, and prompt markers; see fish_for_bash_users.rst, language.rst events, and mark-prompt. Status-preserving recipes should be shell-specific, not implied as universal.
2. Avenues of investigation
Primary investment: build capture_since (#60) as the observation-first primitive. It should be non-blocking, cursor-based, and bounded by max_lines / max_bytes, returning actual pane output plus cursor and correctness metadata. The first call should be immediately useful by returning visible screen content and establishing a cursor; later calls should return only deltas. This lets the agent tail dev servers, watch tests, inspect multiple panes, and self-correct without betting on one regex.
Keep wait_for_channel as the deterministic authored-command primitive. When the agent sends the command, compose a signal into the command text and wait on tmux's channel:
$ command; tmux wait-for -S libtmux_mcp_done
That path is zero-poll inside tmux and avoids the prompt/regex/stale-output trap. The docs should keep the simple recipe portable, then add shell-specific status-preserving variants only after they are verified for zsh, fish, and POSIX-like shells.
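The recipe above, combined with the asyncio subprocess path noted under the upstream substrates, can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: compose_signal, run_until_exit, and wait_on_channel are hypothetical helper names, and the only tmux invocations assumed are tmux wait-for -S <channel> (signal) and tmux wait-for <channel> (wait).

```python
import asyncio
import shlex
from typing import List, Optional


def compose_signal(command: str, channel: str) -> str:
    """Append the portable `; tmux wait-for -S <channel>` suffix to a command."""
    return f"{command}; tmux wait-for -S {shlex.quote(channel)}"


async def run_until_exit(argv: List[str], timeout: float) -> Optional[int]:
    """Run argv without blocking the event loop.

    Returns the child's exit status, or None if it did not exit within
    `timeout` seconds. On timeout the child is killed and reaped.
    """
    proc = await asyncio.create_subprocess_exec(
        *argv,
        stdout=asyncio.subprocess.DEVNULL,
        stderr=asyncio.subprocess.DEVNULL,
    )
    try:
        return await asyncio.wait_for(proc.wait(), timeout=timeout)
    except asyncio.TimeoutError:
        proc.kill()        # stop the waiter so it does not outlive the call
        await proc.wait()  # reap it to avoid leaving a zombie process
        return None


async def wait_on_channel(channel: str, timeout: float) -> bool:
    """Hypothetical helper: True only if `tmux wait-for` exited cleanly in time."""
    status = await run_until_exit(["tmux", "wait-for", channel], timeout)
    return status == 0
```

Under these assumptions, the agent would send compose_signal("make test", "libtmux_mcp_done") to the pane and then await wait_on_channel("libtmux_mcp_done", timeout=...). Checking the exit status rather than mere termination avoids reporting success when tmux itself fails, for example because no server is running.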
Investigate FastMCP task=True as an optional enhancement, not the foundation. FastMCP supports task-enabled async tools through TaskConfig, server-side task docs (docs/servers/tasks.mdx), and client-side task=True calls (docs/clients/tasks.mdx). This may reduce client/server lockout for long operations, but it does not by itself solve the core problem: the agent still needs model-visible pane output while deciding what to do.
Investigate event-driven pane streams separately. tmux has real streaming surfaces: pipe-pane and control-mode output via control_write_output(). These may support a future stream_pane, but they introduce lifecycle management, cleanup, raw PTY bytes, ANSI/control-sequence handling, buffering, safety-tier concerns, and stateful subscriptions. They should not block the simpler capture_since path.
Tighten the existing wait/search tools while the observation primitive is explored. #50-#55 still reduce real traps: race accounting, same-row behavior, adaptive backoff, lifecycle parity, structured risk fields, and wrap-aware search fallback all improve the current surface even if the recommended workflow shifts away from blocking waits.
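To make the wrap-aware search fallback concrete, here is a small sketch of the idea: merge visual rows back into logical lines before matching, so a message split at the wrap boundary is still found. The (text, wrapped) row model and both function names are assumptions for illustration; how the wrap flag would be obtained from tmux is out of scope here.

```python
import re
from typing import List, Tuple


def join_wrapped_rows(rows: List[Tuple[str, bool]]) -> List[str]:
    """Merge visual rows into logical lines.

    Each item is (text, wrapped); wrapped=True means the row continues on
    the next visual row (an assumed flag, not a real tmux data structure).
    """
    logical: List[str] = []
    current = ""
    for text, wrapped in rows:
        current += text
        if not wrapped:
            logical.append(current)
            current = ""
    if current:  # last row was still marked as wrapped
        logical.append(current)
    return logical


def search_logical(rows: List[Tuple[str, bool]], pattern: str) -> List[str]:
    """Return the logical lines matching pattern, ignoring wrap boundaries."""
    regex = re.compile(pattern)
    return [line for line in join_wrapped_rows(rows) if regex.search(line)]


rows = [
    ("$ pytest", False),
    ("FAILED tests/test_pane.py::test_wa", True),  # row cut at the pane width
    ("it - AssertionError", False),
]
matches = search_logical(rows, r"test_wait - AssertionError")
```

A search over the individual visual rows would miss this pattern, because no single row contains it; the joined form matches once. tmux's capture-pane -J flag joins wrapped lines at capture time, which may be one way to feed such a fallback.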
Rewrite docs and server instructions around scenarios rather than tool names:
send_keys("cmd; tmux wait-for -S chan") + wait_for_channel("chan")
snapshot_pane / capture_pane first
capture_since once available
search_panes; use slow path where wrap-spanning text matters
wait_for_text with bounded timeout
wait_for_content_change, followed by inspection
Recommended ordering:
[…] wait_for_text as a generic completion tool.
[…] capture_since as the main new primitive (#60, "feat: capture_since - non-blocking, cursor-based delta capture for agentic flows").
Add optional FastMCP task support only after client behavior and dependency cost are understood.
Treat pipe-pane / control-mode streaming as a separate future stream_pane design, not a retrofit of blocking wait tools.
This keeps the architecture simple: make observation cheap, explicit, and model-visible; reserve blocking waits for cases where blocking is actually the right abstraction.
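As an illustration of what "cheap, explicit, and model-visible" observation could look like, the following toy model sketches the cursor semantics proposed for capture_since (#60). Nothing here is the real tool: PaneLog stands in for a pane with bounded history, and the result fields (lines, cursor, truncated, gap) and the visible_rows parameter are hypothetical names. Only max_lines comes from the proposal; max_bytes is omitted for brevity.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CaptureResult:
    lines: List[str]  # actual pane output returned to the agent
    cursor: int       # absolute position to pass to the next call
    truncated: bool   # True if max_lines cut the result short
    gap: bool         # True if output after the old cursor rolled out of history


class PaneLog:
    """Toy stand-in for a pane: bounded history plus a count of dropped lines."""

    def __init__(self, history_limit: int) -> None:
        self.history_limit = history_limit
        self.lines: List[str] = []
        self.dropped = 0  # lines that have already rolled out of history

    def write(self, line: str) -> None:
        self.lines.append(line)
        if len(self.lines) > self.history_limit:
            self.lines.pop(0)
            self.dropped += 1

    def capture_since(
        self, cursor: Optional[int], max_lines: int, visible_rows: int = 24
    ) -> CaptureResult:
        total = self.dropped + len(self.lines)  # one past the newest line
        if cursor is None:
            # First call: return the visible screen and establish a cursor.
            start = max(self.dropped, total - visible_rows)
            gap = False
        else:
            # Later calls: return only the delta, flagging lost history.
            gap = cursor < self.dropped
            start = max(cursor, self.dropped)
        end = min(total, start + max_lines)
        chunk = self.lines[start - self.dropped : end - self.dropped]
        return CaptureResult(chunk, end, truncated=end < total, gap=gap)
```

Every call returns immediately with real output and a new cursor, so an agent polling several panes can see a changed port or an error message directly. The truncated and gap flags carry the correctness metadata in the structured result rather than in a side-channel warning.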