fix(buzz-agent): charge images a token-equivalent for the handoff gate #1332

Merged
tlongwell-block merged 1 commit into main from fix/image-context-accounting on Jun 27, 2026
@tlongwell-block

Collaborator

Problem

A single view_image of a multi-MiB screenshot trips the context-handoff loop on a fresh context — the symptom Tyler hit with a 2.23 MiB / 1254×1254 PNG.

Root cause: the handoff gate (should_handoff) measures context pressure by summing HistoryItem::estimated_bytes and mapping bytes to tokens at ~1:1 (handoff::CONSERVATIVE_BYTES_PER_TOKEN = 1). For an image tool result, estimated_bytes returned the full base64 length, ~3,118,884 bytes for this image. That exceeds the default pre-usage threshold (min(200_000·0.9, 200_000−32_768) = 167_232) by ~18×, forcing an immediate handoff. The provider, however, bills the image as visual tiles (~2K tokens, see llm.rs image-block serialization), so the gate over-counted by ~1500×.
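The arithmetic behind those figures can be reproduced with a small sketch. The function and parameter names here (`pre_usage_threshold_tokens`, `window`, `reserve`, `bytes_to_tokens`, `overcount_factor`) are illustrative assumptions, not the crate's actual API; only `CONSERVATIVE_BYTES_PER_TOKEN` and the numbers come from the description above.

```rust
/// Bytes-to-tokens ratio used by the gate, as quoted in the description.
const CONSERVATIVE_BYTES_PER_TOKEN: u64 = 1;

/// Hypothetical helper: the default pre-usage threshold,
/// min(window * 0.9, window - reserve), in integer arithmetic.
fn pre_usage_threshold_tokens(window: u64, reserve: u64) -> u64 {
    let ninety_percent = window * 9 / 10;
    let minus_reserve = window.saturating_sub(reserve);
    ninety_percent.min(minus_reserve)
}

/// Hypothetical helper: what the gate charges for a given byte count.
fn bytes_to_tokens(bytes: u64) -> u64 {
    bytes / CONSERVATIVE_BYTES_PER_TOKEN
}

/// Ratio between what was charged and a reference token count.
fn overcount_factor(charged_tokens: u64, reference_tokens: u64) -> f64 {
    charged_tokens as f64 / reference_tokens as f64
}
```

With a window of 200_000 and a reserve of 32_768, the threshold is 167_232 tokens. Charging the 3,118,884-byte base64 payload at 1 byte per token lands about 18.6× over that threshold, and about 1,500× over a ~2K-token (taken here as 2,048) visual cost.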

Fix

The gate and truncate_history need two different notions of size, so split them:

  • estimated_bytes stays the real serialized (base64) size. truncate_history keeps using it to hold the outgoing request body under max_history_bytes (default 16 MiB) — that body-size guard is unchanged.
  • context_pressure_bytes (new) charges an image a flat IMAGE_CONTEXT_TOKEN_EQUIV = 16 KiB token-equivalent. That is a generous ceiling over the real ~2K cost, yet ~190× smaller than this image's ~3.1M-byte base64 blob. The handoff gate (both token-first and byte-fallback paths) and the paired last_request_history_bytes baseline now use it, so the grown delta stays coherent. Text is sized identically under both measures.
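A minimal sketch of the two measures, assuming a simplified two-variant `HistoryItem` (the real type in types.rs is not shown in this PR description and very likely has more variants and fields). The `history_pressure` helper is hypothetical as well.

```rust
/// Flat per-image charge for context-pressure accounting (16 KiB).
const IMAGE_CONTEXT_TOKEN_EQUIV: usize = 16 * 1024;

/// Hypothetical, simplified stand-in for the real `HistoryItem`.
enum HistoryItem {
    Text(String),
    /// Image tool result carrying its base64-encoded payload.
    Image { base64: String },
}

impl HistoryItem {
    /// Real serialized size: what the outgoing request body carries.
    /// This is the measure a body-size guard should keep using.
    fn estimated_bytes(&self) -> usize {
        match self {
            HistoryItem::Text(s) => s.len(),
            HistoryItem::Image { base64 } => base64.len(),
        }
    }

    /// Size as seen by the handoff gate: text is unchanged,
    /// an image costs a flat constant regardless of payload size.
    fn context_pressure_bytes(&self) -> usize {
        match self {
            HistoryItem::Text(_) => self.estimated_bytes(),
            HistoryItem::Image { .. } => IMAGE_CONTEXT_TOKEN_EQUIV,
        }
    }
}

/// Hypothetical helper: total gate-side pressure of a history slice.
fn history_pressure(history: &[HistoryItem]) -> usize {
    history.iter().map(HistoryItem::context_pressure_bytes).sum()
}
```

Keeping both methods on the same type means each call site picks the measure that matches its purpose, rather than one method trying to serve both the body cap and the gate.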

Why a flat constant rather than megapixel scaling: it keeps the change minimal, it over-estimates the true cost (the fail-safe direction for the gate), and a session would need dozens of images to legitimately pressure the window.
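As a rough check on that margin, the sketch below counts how many images it takes for image charges alone to reach the pre-usage threshold. It assumes an otherwise empty history, the 1 byte per token mapping, and ~2K read as 2,048 tokens; `images_to_reach` is a hypothetical helper, not part of the crate.

```rust
/// Flat per-image charge from this PR (16 KiB token-equivalent).
const IMAGE_CONTEXT_TOKEN_EQUIV: u64 = 16 * 1024;
/// Assumption: "~2K tokens" per image taken as exactly 2,048.
const APPROX_REAL_IMAGE_TOKENS: u64 = 2_048;
/// Default pre-usage threshold quoted in the description.
const PRE_USAGE_THRESHOLD_TOKENS: u64 = 167_232;

/// Smallest image count whose summed charge reaches the threshold
/// (ceiling division; `per_image` must be non-zero).
fn images_to_reach(threshold: u64, per_image: u64) -> u64 {
    (threshold + per_image - 1) / per_image
}
```

Under those assumptions the flat charge reaches the threshold at 11 images, while the real visual cost would need 82. The gate therefore stays conservative for image-heavy sessions while no longer tripping on a single screenshot.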

Tests

New unit tests in types.rs:

  • image context_pressure_bytes is bounded and independent of base64 length
  • a single 3.1M-byte image stays under the default pre-usage handoff threshold (the regression)
  • estimated_bytes still reports real wire size (body-cap safety preserved)
  • text sizes identically under both measures
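The regression case in the list above can be illustrated with a toy gate. This is a guess at the shape of the logic based only on the terms used in this description (token-first path, byte-fallback path, last_request_history_bytes baseline, grown delta). The `GateInput` struct, `estimated_context_tokens`, and this `should_handoff` signature are hypothetical and may differ from the real code.

```rust
const CONSERVATIVE_BYTES_PER_TOKEN: u64 = 1;

/// Hypothetical bundle of what the gate looks at.
struct GateInput {
    /// Tokens the provider reported for the last request, if any.
    last_usage_tokens: Option<u64>,
    /// History pressure recorded when that request was sent.
    last_request_history_bytes: u64,
    /// History pressure right now (same measure as the baseline).
    current_history_bytes: u64,
}

fn estimated_context_tokens(input: &GateInput) -> u64 {
    match input.last_usage_tokens {
        // Token-first path: trust reported usage, add only the growth
        // since the baseline. Both sides must use the same measure,
        // otherwise the delta is meaningless.
        Some(used) => {
            let grown = input
                .current_history_bytes
                .saturating_sub(input.last_request_history_bytes);
            used + grown / CONSERVATIVE_BYTES_PER_TOKEN
        }
        // Byte-fallback path: no usage yet, estimate from bytes alone.
        None => input.current_history_bytes / CONSERVATIVE_BYTES_PER_TOKEN,
    }
}

fn should_handoff(input: &GateInput, threshold_tokens: u64) -> bool {
    estimated_context_tokens(input) >= threshold_tokens
}
```

On a fresh context, feeding this gate the raw 3,118,884-byte base64 length trips the 167_232 threshold, while the flat 16,384 charge does not. If the baseline were still recorded in raw bytes while the current value used the flat charge, the saturating subtraction would hide real growth, which is presumably why both moved to the same measure.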

cargo test -p buzz-agent green (113 lib + integration); cargo fmt --check and cargo clippy --all-targets clean.


🤖 Diagnosed and authored by Eva. Requesting review from Max.

A single `view_image` of a multi-MiB screenshot tripped the context
handoff gate on a fresh context. The gate counts history bytes ~1:1 with
tokens (CONSERVATIVE_BYTES_PER_TOKEN=1), and an image's
`estimated_bytes` returned its full base64 length (~3.1M for a 2.23 MiB
PNG). That blew past the default ~167K-token handoff threshold instantly,
even though the provider bills the image as visual tiles (~2K tokens) — a
~1500x over-count.

Split the two notions the gate and `truncate_history` actually need:

- `estimated_bytes` stays the real serialized (base64) size, so
  `truncate_history` keeps the request body under `max_history_bytes`.
- new `context_pressure_bytes` charges an image a flat
  `IMAGE_CONTEXT_TOKEN_EQUIV` (16 KiB) token-equivalent — a generous
  ceiling on the real ~2K cost. The handoff gate and the
  `last_request_history_bytes` baseline now use it, so the `grown`
  delta stays coherent.

Adds regression tests: an image's context pressure is bounded and
independent of base64 length, a single 3.1M-byte image stays under the
default pre-usage handoff threshold, and `estimated_bytes` still reports
real wire size for body-cap safety.

Co-authored-by: Tyler Longwell <tlongwell@block.xyz>
Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
@tlongwell-block

Collaborator Author

Max review: no blockers from me. I tried to submit an approving review, but GitHub rejected it as same-author credentials (Can not approve your own pull request).

I read TESTING.md, inspected the diff and relevant call sites, and agree with the load-bearing split:
- estimated_bytes remains real serialized/wire size for truncate_history and request-body guarding.
- context_pressure_bytes is used only for the handoff gate and its growth baseline, where image blocks should be charged by visual-token pressure rather than base64 length.

The 16 KiB/image constant looks reasonable: deliberately over the observed ~2K visual-token cost, but far below multi-MiB base64 payloads, so it fixes the fresh-context handoff loop without making image-heavy sessions invisible.

Verified locally on head 6fb99ea1:
- cargo test -p buzz-agent
- cargo fmt --check
- cargo clippy -p buzz-agent --all-targets -- -D warnings

@tlongwell-block tlongwell-block merged commit 744c77b into main Jun 27, 2026
27 of 29 checks passed
@tlongwell-block tlongwell-block deleted the fix/image-context-accounting branch June 27, 2026 20:16