feat: add pty_snapshot and pty_snapshot_wait tools with seq-based diffing #31

Open
JosXa wants to merge 4 commits into shekohex:main from JosXa:feat/pty-snapshot

JosXa commented Apr 1, 2026

Summary

Adds pty_snapshot and pty_snapshot_wait tools that capture the parsed, visible terminal screen of any PTY session, with seq-based history and line-level diffing. They complement the existing pty_read tool, which returns the raw output buffer (ANSI codes and all).

Motivation

When pty_read is used on TUI applications such as OpenCode itself, the output is an unreadable flood of ANSI escape sequences, cursor movements, and control characters. This makes it effectively impossible for AI agents to understand what is actually displayed on screen.

pty_snapshot solves this by maintaining a headless terminal emulator (@xterm/headless) alongside each PTY session, producing clean screen captures that look exactly like what a human would see.
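To illustrate how a parsed screen can be read back from a terminal emulator, here is a minimal sketch. It does not import @xterm/headless; instead it declares a small structural interface modelled on the subset of xterm.js's buffer API that such a capture would need (getLine, translateToString, viewportY, cursorX, cursorY). The names BufferLike, ScreenCapture, and captureScreen are hypothetical and not taken from this PR.

```typescript
// Structural subset of a terminal buffer, modelled on xterm.js's IBuffer.
// Any object with this shape works, including a test double.
interface BufferLineLike {
  translateToString(trimRight?: boolean): string;
}

interface BufferLike {
  viewportY: number; // buffer index of the first visible row
  cursorX: number;
  cursorY: number;
  getLine(y: number): BufferLineLike | undefined;
}

// Hypothetical result shape: visible lines plus cursor position.
interface ScreenCapture {
  lines: string[];
  cursor: { x: number; y: number };
}

// Read the visible rows as plain text, trimming trailing whitespace.
// Rows the buffer does not have yet are reported as empty strings.
function captureScreen(buffer: BufferLike, rows: number): ScreenCapture {
  const lines: string[] = [];
  for (let row = 0; row < rows; row++) {
    const line = buffer.getLine(buffer.viewportY + row);
    lines.push(line ? line.translateToString(true) : "");
  }
  return { lines, cursor: { x: buffer.cursorX, y: buffer.cursorY } };
}
```

Because escape sequences are interpreted by the emulator before the buffer is read, the returned lines contain only the characters that ended up on screen.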

What is included

pty_snapshot tool

  • Returns visible screen text, cursor position, terminal size, content hash, and a monotonic seq number
  • Optional since parameter: pass a previous seq to get only changed lines (line-level diff) instead of the full screen
  • Seq only increments on actual content changes (hash-based dedup)
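The hash-based dedup described above can be sketched as follows. This is an illustration under assumptions, not the PR's code: fnv1a64 is a standard FNV-1a 64-bit hash over the UTF-8 bytes of the screen text, and SeqTracker is a hypothetical helper that bumps the seq only when that hash changes.

```typescript
// Standard FNV-1a 64-bit parameters.
const FNV_OFFSET_BASIS = BigInt("0xcbf29ce484222325");
const FNV_PRIME = BigInt("0x100000001b3");

// FNV-1a: XOR each byte into the hash, then multiply by the prime (mod 2^64).
// Returns a zero-padded 16-digit hex string.
function fnv1a64(text: string): string {
  let hash = FNV_OFFSET_BASIS;
  for (const byte of new TextEncoder().encode(text)) {
    hash ^= BigInt(byte);
    hash = BigInt.asUintN(64, hash * FNV_PRIME);
  }
  return hash.toString(16).padStart(16, "0");
}

// Hypothetical tracker: seq increases only when the content hash differs
// from the previously observed one.
class SeqTracker {
  private seq = 0;
  private lastHash: string | null = null;

  observe(screenText: string): { seq: number; hash: string; changed: boolean } {
    const hash = fnv1a64(screenText);
    const changed = hash !== this.lastHash;
    if (changed) {
      this.seq += 1;
      this.lastHash = hash;
    }
    return { seq: this.seq, hash, changed };
  }
}
```

With this shape, a caller that polls an idle screen keeps receiving the same seq, so an unchanged seq can be treated as "nothing new to read".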

pty_snapshot_wait tool

  • Blocks until a condition is met, then returns the matching snapshot
  • search: regex pattern against screen text, resolves on first match
  • hashStableMs: resolves when screen content is unchanged for N milliseconds
  • timeout: max wait (default 30s)
  • Also supports since for diff output
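One way to picture the wait conditions is as a small state machine that is fed a snapshot on each poll. The sketch below is an assumption about how such logic could be structured, not the PR's implementation: WaitState, Snap, and WaitOutcome are hypothetical names, the 100 ms poll interval comes from the implementation notes in this description, and the 30 s default matches the documented timeout.

```typescript
interface WaitOptions {
  search?: RegExp;       // resolves on first match against the screen text
  hashStableMs?: number; // resolves once the hash is unchanged for this long
  timeoutMs?: number;    // maximum wait, default 30 000 ms
}

interface Snap {
  text: string;
  hash: string;
}

type WaitOutcome = "matched" | "stable" | "timeout" | "pending";

// Pure step logic: the caller supplies the clock, which keeps it testable.
class WaitState {
  private lastHash: string | null = null;
  private lastChangeAt: number;
  private readonly startedAt: number;
  private readonly opts: WaitOptions;

  constructor(opts: WaitOptions, now: number) {
    this.opts = opts;
    this.startedAt = now;
    this.lastChangeAt = now;
  }

  check(snap: Snap, now: number): WaitOutcome {
    if (snap.hash !== this.lastHash) {
      this.lastHash = snap.hash;
      this.lastChangeAt = now; // any change restarts the stability window
    }
    const { search, hashStableMs, timeoutMs = 30_000 } = this.opts;
    if (search) {
      search.lastIndex = 0; // guard against stateful /g or /y regexes
      if (search.test(snap.text)) return "matched";
    }
    if (hashStableMs !== undefined && now - this.lastChangeAt >= hashStableMs) {
      return "stable";
    }
    if (now - this.startedAt >= timeoutMs) return "timeout";
    return "pending";
  }
}

// Polling wrapper around the step logic.
async function waitForCondition(
  getSnapshot: () => Snap,
  opts: WaitOptions,
  pollMs: number = 100,
): Promise<{ outcome: WaitOutcome; snapshot: Snap }> {
  const state = new WaitState(opts, Date.now());
  for (;;) {
    const snapshot = getSnapshot();
    const outcome = state.check(snapshot, Date.now());
    if (outcome !== "pending") return { outcome, snapshot };
    await new Promise<void>((resolve) => setTimeout(resolve, pollMs));
  }
}
```

Separating the step logic from the polling loop is a design choice of this sketch: it lets the conditions be exercised with synthetic timestamps instead of real delays.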

Seq-based history

  • Ring buffer of 200 deduped snapshots per session
  • Each frame stores line-by-line content for efficient diffing
  • Diff format shows changed/added/removed lines with line numbers
  • If the requested seq has already fallen out of the buffer, returns the full screen with a historyTruncated flag
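The history and diff behaviour listed above could look roughly like this. It is a sketch under assumptions: the LineDiff, Frame, and SnapshotHistory names are hypothetical, line numbers are assumed to be 1-based, lines are compared by position only, and a bounded array stands in for a true ring buffer.

```typescript
interface LineDiff {
  line: number; // 1-based line number (an assumption of this sketch)
  kind: "changed" | "added" | "removed";
  content: string; // new content, or the old content for removed lines
}

// Positional line-by-line comparison of two frames.
function diffLines(prev: string[], next: string[]): LineDiff[] {
  const diffs: LineDiff[] = [];
  const max = Math.max(prev.length, next.length);
  for (let i = 0; i < max; i++) {
    const before: string | undefined = prev[i];
    const after: string | undefined = next[i];
    if (before !== undefined && after !== undefined) {
      if (before !== after) diffs.push({ line: i + 1, kind: "changed", content: after });
    } else if (after !== undefined) {
      diffs.push({ line: i + 1, kind: "added", content: after });
    } else if (before !== undefined) {
      diffs.push({ line: i + 1, kind: "removed", content: before });
    }
  }
  return diffs;
}

interface Frame {
  seq: number;
  lines: string[];
}

type SinceResult =
  | { historyTruncated: true; lines: string[] }
  | { historyTruncated: false; diff: LineDiff[] };

// Bounded history of deduped frames; the oldest frame is evicted first.
class SnapshotHistory {
  private frames: Frame[] = [];
  private readonly capacity: number;

  constructor(capacity: number = 200) {
    this.capacity = capacity;
  }

  push(frame: Frame): void {
    this.frames.push(frame);
    if (this.frames.length > this.capacity) this.frames.shift();
  }

  // Diff the latest frame against the frame with the given seq.
  // Falls back to the full screen when that frame is not in the history.
  diffSince(since: number): SinceResult {
    const latest = this.frames[this.frames.length - 1];
    if (latest === undefined) throw new Error("no snapshots recorded yet");
    const base = this.frames.find((frame) => frame.seq === since);
    if (base === undefined) return { historyTruncated: true, lines: latest.lines };
    return { historyTruncated: false, diff: diffLines(base.lines, latest.lines) };
  }
}
```

Storing each frame as an array of lines is what makes the diff cheap: no re-splitting or re-parsing is needed when a since value is supplied.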

Other improvements

  • Each PTY session's output now feeds both the raw ring buffer and a headless xterm.js emulator
  • The web UI plain-text buffer endpoint now uses the parsed screen instead of naive ANSI stripping
  • FNV-1a content hash for efficient change detection without full-text comparison

Example: full snapshot

<pty_snapshot id="pty_abc" seq="5" hash="123...">
Size: 120x40
Cursor: (20, 50) visible=true
---
                         OpenCode banner and clean screen text...

Example: diff since seq 3

<pty_snapshot id="pty_abc" seq="5" hash="456..." since="3">
Changed lines:
  5 [changed]: pineapple pineapple pineapple pineapple pineapple
  7 [changed]: Build gpt-5.4 2.0s

Implementation details

  • Uses @xterm/headless (xterm.js, the same terminal emulator that powers the VS Code terminal)
  • Writes are queued and tracked via xterm's write callback, because xterm parses input asynchronously in time slices
  • Content hash uses FNV-1a 64-bit, a fast non-cryptographic hash whose accidental collisions are unlikely enough for change detection
  • Synchronous cached snapshot state avoids async API churn across manager/tools/web layers
  • waitForCondition polls at 100ms intervals, checks search regex and/or hash stability
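The point about write callbacks can be illustrated with a small tracker that knows when every queued write has been parsed. This is a guess at the general pattern rather than the PR's code: TerminalLike mirrors the shape of xterm.js's write(data, callback) method, and TrackedWriter, whenIdle, and pendingWrites are hypothetical names.

```typescript
// Structural stand-in for a terminal whose write() parses asynchronously
// and invokes the callback once the chunk has been processed.
interface TerminalLike {
  write(data: string, callback?: () => void): void;
}

// Counts in-flight writes so a snapshot can be taken only after the
// emulator has caught up with everything written so far.
class TrackedWriter {
  private pending = 0;
  private idleWaiters: Array<() => void> = [];
  private readonly term: TerminalLike;

  constructor(term: TerminalLike) {
    this.term = term;
  }

  get pendingWrites(): number {
    return this.pending;
  }

  write(data: string): void {
    this.pending += 1;
    this.term.write(data, () => {
      this.pending -= 1;
      if (this.pending === 0) {
        // Swap the list first so waiters registered during a callback
        // are not invoked in the same pass.
        const waiters = this.idleWaiters;
        this.idleWaiters = [];
        for (const waiter of waiters) waiter();
      }
    });
  }

  // Runs the callback immediately if nothing is in flight, otherwise
  // once the last outstanding write has been parsed.
  whenIdle(callback: () => void): void {
    if (this.pending === 0) callback();
    else this.idleWaiters.push(callback);
  }
}
```

A cached snapshot could then be refreshed inside whenIdle, which is one way to keep the read path synchronous while still reflecting all parsed output.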

JosXa force-pushed the feat/pty-snapshot branch from bc5856b to d49d4ce on April 1, 2026 22:56
JosXa added 2 commits April 2, 2026 01:13
Add a headless terminal emulator (@xterm/headless) to each PTY session
that processes raw output in parallel with the existing ring buffer.
This enables clean, ANSI-free screen capture via a new pty_snapshot tool.

The snapshot returns the visible terminal screen text, cursor position,
terminal size, and a content hash for efficient change detection -
useful for monitoring TUI applications during development.

Also improves the web UI plain-text buffer endpoint to use the parsed
screen instead of naive ANSI regex stripping.

- pty_snapshot_wait: blocks until search regex matches or screen stabilizes (hashStableMs)
- seq-based history: each content change gets a monotonic sequence number
- since parameter on both tools: returns line-level diff against historical frame
- Ring buffer of 200 deduped snapshots per session for diffing
- LineDiff format: changed/added/removed lines with content and line numbers
JosXa force-pushed the feat/pty-snapshot branch from d49d4ce to 1abc157 on April 2, 2026 00:43
JosXa changed the title from "feat: add pty_snapshot tool for parsed terminal screen capture" to "feat: add pty_snapshot and pty_snapshot_wait tools with seq-based diffing" on Apr 2, 2026

- Document snapshot and snapshot_wait tools in tools table
- Add usage examples for screen capture, seq-based diffs, and conditional waiting
- Add 'Debugging OpenCode itself' walkthrough as a real-world use case
- Update 'How It Works' section with snapshot/diff/wait pipeline
- Add xterm.js to credits