Summary
A 30+ minute streaming session captured with --memprofile shows the TUI allocates 890 GB cumulatively (yes, GB) over its lifetime, of which 76% (677 GB) flows through messages.(*model).ensureAllItemsRendered. Live heap (inuse_space) stayed at a healthy 44 MB, so this isn't a leak; it's an allocation-rate / GC-pressure problem distinct from #2861 (per-message retention) and #2884 (notification stacking).
Repro
1. Build with the --memprofile flag plumbed in pkg/profiling.
2. Run a normal interactive session that includes long-form streaming responses (tens of KB of markdown + reasoning blocks + animated tool calls).
3. After ~30 min of typical usage, exit cleanly so pprof.WriteHeapProfile runs.
Mechanism
Any animation tick (spinner, fade, pulse) or any child view emitting a non-nil cmd dirties the entire message list. ensureAllItemsRendered then walks every view and rebuilds renderedLines. The per-item LRU (renderedItems) absorbs most of the cost for finalized messages, but three categories bypass the cache:
The currently-streaming assistant message (shouldCacheMessage returns false while Content is empty/whitespace, and even when populated, the streaming target is invalidated on every chunk).
Reasoning blocks (MessageTypeAssistantReasoningBlock is hard-coded to return false in shouldCacheMessage because of embedded spinners).
The selected/hovered message.
So during streaming, every animation tick (~10–60 Hz depending on platform) re-renders the active message and any reasoning block at full width through markdown → lipgloss → ANSI styling. With long messages and a multi-hour session, total churn easily reaches hundreds of GB allocated.
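As a rough sanity check on the scale, here is a back-of-the-envelope sketch. It assumes, purely for illustration, that the 677 GB attributed to ensureAllItemsRendered was spread evenly over exactly 30 minutes of continuous ticking and that GB means 10^9 bytes; a real session will not be that uniform. Only the 677 GB figure and the 10–60 Hz range come from the report.

```go
package main

import "fmt"

// bytesPerPass estimates the average allocation per render pass if totalBytes
// were spread evenly over a session of the given length at a fixed tick rate.
func bytesPerPass(totalBytes, seconds, hz float64) float64 {
	return totalBytes / (seconds * hz)
}

func main() {
	const total = 677e9  // bytes attributed to ensureAllItemsRendered in the profile
	const session = 1800 // seconds; assumed: exactly 30 minutes of ticking

	for _, hz := range []float64{10, 30, 60} {
		mb := bytesPerPass(total, session, hz) / 1e6
		fmt.Printf("%2.0f Hz: ~%.1f MB allocated per pass\n", hz, mb)
	}
}
```

Under those assumptions each pass would allocate somewhere between roughly 6 MB (at 60 Hz) and roughly 38 MB (at 10 Hz), which is plausible for re-rendering long markdown messages on every tick.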
Why this matters even though inuse_space is small
CPU. The 119% CPU I observed during a streaming turn (Activity Monitor) is consistent with one core saturated in markdown/ANSI rendering on top of model decode.
GC pressure. runtime.mallocgc shows up as the 4th-largest inuse_space consumer (4.6 MB): small, but a constant fraction of bookkeeping for a lot of churn.
Suggested fix axes
1. Per-item invalidation. Most m.renderDirty = true sites know which view changed (the for i, view := range m.views loop has the index). Only that view needs re-rendering; the surrounding lines in renderedLines can be patched at lineOffsets[i] rather than rebuilt.
2. Stream coalescing. During an active stream, the assistant message receives many chunks per second, but the user can't perceive more than ~30 fps. Throttle re-renders of the streaming target to a fixed rate via a tea.Tick-driven flush, independent of chunk arrival.
3. Reuse buffers across re-renders of the same message. strings.(*Builder).WriteString is 48% of alloc_space, much of it inside lipgloss.Style.Render. A sync.Pool of strings.Builder (or a buffer pool wired through IncrementalRenderer) wouldn't change correctness but would dramatically cut allocations. One caveat: strings.Builder.Reset drops its backing array, so pooling bytes.Buffer instead is what actually reuses memory. Lipgloss is upstream, so this would need a small wrapper.
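A minimal sketch of the buffer-reuse idea, assuming a wrapper that assembles already-styled lines. renderLines and bufPool are hypothetical names, and bytes.Buffer is used rather than strings.Builder because Builder.Reset discards its backing array. Allocations made inside lipgloss itself would not be affected by this.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool hands out reusable buffers. bytes.Buffer keeps its capacity across
// Reset, which is what makes pooling worthwhile.
var bufPool = sync.Pool{
	New: func() any { return new(bytes.Buffer) },
}

// renderLines is a hypothetical stand-in for a render step: it joins styled
// lines into one string using a pooled buffer.
func renderLines(lines []string) string {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset() // empties the buffer but keeps capacity from earlier renders
	defer bufPool.Put(buf)

	for i, l := range lines {
		if i > 0 {
			buf.WriteByte('\n')
		}
		buf.WriteString(l)
	}
	return buf.String() // one copy into the final string
}

func main() {
	fmt.Println(renderLines([]string{"hello", "world"}))
}
```

The final buf.String() still allocates once per render; what goes away is the repeated growth of the intermediate buffer.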
(1) alone should kill ≥80% of the churn and is the cheapest.
Profile (anonymized)
Inspect the captured profile with go tool pprof -top -alloc_space heap.pprof. The -cum view confirms that the call chain dominates the program. For reference, the same profile's inuse_space was only 44 MB: nothing is being retained. All those bytes are GC churn. The code that dirties the message list is Update in pkg/tui/components/messages/messages.go.
Repro environment
docker-agent HEAD as of 2026-05-22
Multi-hour interactive session ending in a clean exit (so pprof.WriteHeapProfile runs; note that a jetsam SIGKILL bypasses this, so capturing requires exiting before the kill).
Related
… inuse_space linearly; this one keeps inuse_space flat but burns alloc_space and CPU.