tui: per-tick re-render of active streaming message allocates 100s of GB during long sessions #2886

## Summary

A 30+ minute streaming session captured with `--memprofile` shows the TUI allocates **890 GB cumulatively** (yes, GB) over its lifetime, of which **76% (677 GB) flows through `messages.(*model).ensureAllItemsRendered`**. Live heap (`inuse_space`) stayed at a healthy 44 MB, so this isn't a leak — it's an **allocation rate / GC pressure** problem distinct from #2861 (per-message retention) and #2884 (notification stacking).

## Repro

1. Build with the `--memprofile` flag plumbed in `pkg/profiling`.
2. Run a normal interactive session that includes long-form streaming responses (10s of KB of markdown + reasoning blocks + animated tool calls).
3. After ~30 min of typical usage, exit cleanly so `pprof.WriteHeapProfile` runs.
4. `go tool pprof -top -alloc_space heap.pprof`.
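For anyone reproducing this without the `pkg/profiling` plumbing, here is a minimal sketch of what a `--memprofile` flag might do on clean exit. The function name `writeHeapProfile` and the flag wiring are assumptions for illustration, not the project's actual code; only the use of `pprof.WriteHeapProfile` comes from this issue.

```go
package main

import (
	"flag"
	"log"
	"os"
	"runtime"
	"runtime/pprof"
)

// writeHeapProfile (hypothetical helper) writes the heap profile to path and
// returns the number of bytes written. The same file carries both the
// inuse_space and the cumulative alloc_space sample types.
func writeHeapProfile(path string) (int64, error) {
	f, err := os.Create(path)
	if err != nil {
		return 0, err
	}
	defer f.Close()

	runtime.GC() // bring the profile's allocation statistics up to date
	if err := pprof.WriteHeapProfile(f); err != nil {
		return 0, err
	}
	info, err := f.Stat()
	if err != nil {
		return 0, err
	}
	return info.Size(), nil
}

func main() {
	memprofile := flag.String("memprofile", "", "write a heap profile to this file on clean exit")
	flag.Parse()

	if *memprofile != "" {
		// Deferred so it only runs on a clean return from main;
		// a SIGKILL never reaches this point.
		defer func() {
			if _, err := writeHeapProfile(*memprofile); err != nil {
				log.Printf("memprofile: %v", err)
			}
		}()
	}

	// ... run the TUI here ...
}
```

Because the dump happens in a deferred call, the process has to exit normally for the file to exist, which is why step 3 asks for a clean exit.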

## Profile (anonymized)

```
File: docker-agent
Type: alloc_space
Showing nodes accounting for 858946.90MB, 96.43% of 890731.07MB total
      flat  flat%   sum%        cum   cum%
425369.83MB 47.76% 47.76% 425369.83MB 47.76%  strings.(*Builder).WriteString (partial-inline)
 81999.61MB  9.21% 56.96%  81999.61MB  9.21%  bytes.growSlice
 55231.66MB  6.20% 63.16%  55231.66MB  6.20%  github.com/charmbracelet/ultraviolet.NewBuffer
 …
 12293.53MB  1.38% 87.37% 454751.76MB 51.05%  pkg/tui/components/message.(*messageModel).render
  4966.89MB  0.56% 93.71% 677505.05MB 76.06%  pkg/tui/components/messages.(*model).ensureAllItemsRendered
```

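`-cum` view confirms the call chain dominates the program. Each row is that function's cumulative total across all of its callers, which is why `lipgloss.Style.Render` (566 GB) can exceed the row above it (454 GB): presumably other callers reach it too.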

```
                                            cum     cum%
ensureAllItemsRendered                      677 GB  76.06%
└─ renderItem                               673 GB  75.61%
   └─ messageModel.render                   454 GB  51.05%
      └─ lipgloss.Style.Render              566 GB  63.55%
         └─ strings.Builder.WriteString     425 GB  47.76%
```

For reference the same profile's `inuse_space` was only 44 MB — nothing is being retained. All those bytes are GC churn.

## Mechanism

`pkg/tui/components/messages/messages.go`, `Update`:

```go
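case animation.TickMsg:
    // Any tick marks the whole list dirty while animated content exists.
    if m.hasAnimatedContent() {
        m.renderDirty = true
    }

// Every child view sees the message. A non-nil cmd from any one of them
// also dirties the whole list, even though the index i is known here.
for i, view := range m.views {
    updatedView, cmd := view.Update(msg)
    m.views[i] = updatedView
    if cmd != nil {
        cmds = append(cmds, cmd)
        m.renderDirty = true
    }
}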
```

Any animation tick (spinner, fade, pulse) or any child view emitting a non-nil cmd dirties the **entire** message list. `ensureAllItemsRendered` then walks every view and rebuilds `renderedLines`. The per-item LRU (`renderedItems`) absorbs most of the cost for finalized messages, but three categories bypass the cache:

- The currently-streaming assistant message (`shouldCacheMessage` returns false while `Content` is empty/whitespace, and even when populated, the streaming target is invalidated on every chunk).
- Reasoning blocks (`MessageTypeAssistantReasoningBlock` is hard-coded to `return false` in `shouldCacheMessage` because of embedded spinners).
- The selected/hovered message.

So during streaming, every animation tick (~10–60 Hz depending on platform) re-renders the active message and any reasoning block at full width through markdown → lipgloss → ANSI styling. With long messages and a multi-hour session, total churn easily reaches hundreds of GB allocated.
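To put the profile in per-tick terms, the back-of-the-envelope sketch below divides the cumulative total under `ensureAllItemsRendered` by session length and tick rate. It assumes a 30-minute session with ticks firing the whole time; that is an assumption, not something the profile records. Since ticks only dirty the list while animated content exists, the real per-tick figure is likely higher.

```go
package main

import "fmt"

// perTickMB returns the average allocation per animation tick, in MB, given
// a cumulative allocation, a session length in seconds, and a tick rate in Hz.
func perTickMB(totalMB, seconds, tickHz float64) float64 {
	return totalMB / seconds / tickHz
}

func main() {
	const totalMB = 677505.05 // cum alloc_space of ensureAllItemsRendered, from the profile
	const seconds = 30 * 60   // ASSUMPTION: 30-minute session, ticking throughout

	for _, hz := range []float64{10, 30, 60} {
		fmt.Printf("%2.0f Hz: ~%.1f MB per tick\n", hz, perTickMB(totalMB, seconds, hz))
	}
}
```

Under those assumptions this prints roughly 37.6 MB, 12.5 MB, and 6.3 MB per tick, several orders of magnitude more than the ~10 KB of markdown being displayed.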

## Why this matters even though `inuse_space` is small

- **CPU.** The 119% CPU I observed during a streaming turn (Activity Monitor) is consistent with one core saturated in markdown/ANSI rendering on top of model decode.
- **Possible jetsam co-contributor.** macOS jetsam sums VM-compressor activity into its "largest process" decision. A high allocation rate means many pages cycle through the compressor; combined with #2861's per-message retention, this can push docker-agent over the threshold faster than retention alone would. After #2866 lands the retention growth stops, but the per-tick re-render storm remains.
- **GC pressure.** `runtime.mallocgc` shows up as the 4th-largest `inuse_space` consumer (4.6 MB). That is small in absolute terms, but it is allocator bookkeeping that exists only to service the churn.

## Suggested fix axes

1. **Per-item invalidation.** Most `m.renderDirty = true` sites know which view changed (the `for i, view := range m.views` loop has the index). Only that view needs re-rendering; the surrounding lines in `renderedLines` can be patched at `lineOffsets[i]` rather than rebuilt.

2. **Stream coalescing.** During an active stream, the assistant message receives many chunks per second but the user can't perceive >~30 fps. Throttle re-renders of the streaming target to a fixed rate via a `tea.Tick`-driven flush, independent of chunk arrival.

3. **Reuse buffers across re-renders of the same message.** `strings.(*Builder).WriteString` is 48% of `alloc_space`, much of it inside `lipgloss.Style.Render`. A `sync.Pool` of `bytes.Buffer` (or a buffer pool wired through `IncrementalRenderer`) wouldn't change correctness but could substantially cut allocations. Pooling `strings.Builder` itself would not help, because `Builder.Reset` discards the underlying buffer. Lipgloss is upstream, so this would need a small wrapper.
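As a rough illustration of the buffer-reuse idea in (3), the sketch below pools `bytes.Buffer` values with `sync.Pool`. It uses `bytes.Buffer` rather than `strings.Builder` because `Buffer.Reset` keeps the backing array while `Builder.Reset` drops it. `renderLines` is a hypothetical stand-in for a render step and is not part of docker-agent or lipgloss.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool hands out reusable buffers whose backing arrays survive Reset.
var bufPool = sync.Pool{
	New: func() any { return new(bytes.Buffer) },
}

// renderLines (hypothetical) prefixes each line and joins the result,
// borrowing its scratch buffer from the pool instead of growing a new one.
func renderLines(prefix string, lines []string) string {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset() // clear leftover content from a previous use
	defer bufPool.Put(buf)

	for i, line := range lines {
		if i > 0 {
			buf.WriteByte('\n')
		}
		buf.WriteString(prefix)
		buf.WriteString(line)
	}
	// String copies once at the final size; the growth steps are what the
	// pool avoids on later calls.
	return buf.String()
}

func main() {
	fmt.Println(renderLines("> ", []string{"hello", "world"}))
}
```

This removes the repeated grow-and-copy steps but still pays for one final string per render, so it complements per-item invalidation rather than replacing it.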

(1) alone should kill ≥80% of the churn and is the cheapest.
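A minimal sketch of what (1) could look like, assuming per-item dirty flags in place of the single `renderDirty` bool. The `list` type, its fields other than `lineOffsets`, and the `renders` counter are hypothetical simplifications for illustration; the real model has caching and selection state that this ignores.

```go
package main

import (
	"fmt"
	"strings"
)

// list is a simplified, hypothetical stand-in for the messages model.
type list struct {
	items       []string // source content per item
	dirty       []bool   // per-item flags instead of one renderDirty
	lineOffsets []int    // first line of item i in rendered
	lineCounts  []int    // lines item i currently occupies
	rendered    []string // flattened output (renderedLines in this issue)
	renders     int      // number of item renders performed, for illustration
}

func newList(items ...string) *list {
	n := len(items)
	l := &list{items: items, dirty: make([]bool, n),
		lineOffsets: make([]int, n), lineCounts: make([]int, n)}
	for i := range l.dirty {
		l.dirty[i] = true
	}
	l.ensureRendered()
	return l
}

// set replaces item i's content and marks only that item dirty.
func (l *list) set(i int, content string) {
	l.items[i] = content
	l.dirty[i] = true
}

// renderItem stands in for the expensive markdown/lipgloss pipeline.
func (l *list) renderItem(i int) []string {
	l.renders++
	return strings.Split(l.items[i], "\n")
}

// ensureRendered re-renders dirty items only and splices their lines into
// place at lineOffsets[i], shifting the offsets of the items that follow.
func (l *list) ensureRendered() {
	for i := range l.items {
		if !l.dirty[i] {
			continue
		}
		lines := l.renderItem(i)
		start, end := l.lineOffsets[i], l.lineOffsets[i]+l.lineCounts[i]
		// Copy the tail first: the append below may overwrite it in place.
		tail := append([]string(nil), l.rendered[end:]...)
		l.rendered = append(append(l.rendered[:start], lines...), tail...)
		delta := len(lines) - l.lineCounts[i]
		l.lineCounts[i] = len(lines)
		for j := i + 1; j < len(l.items); j++ {
			l.lineOffsets[j] += delta
		}
		l.dirty[i] = false
	}
}

func main() {
	l := newList("a\nb", "c", "d\ne")
	l.renders = 0
	l.set(1, "c1\nc2\nc3") // e.g. a streaming chunk arrived for item 1
	l.ensureRendered()
	fmt.Println(strings.Join(l.rendered, "|"), l.renders)
}
```

The splice still copies line headers for the tail, but it performs exactly one item render per changed item, which is the part the profile shows to be expensive.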

## Repro environment

- macOS 26.x, Apple Silicon, 64 GB
- `docker-agent` HEAD as of 2026-05-22
- Multi-agent config; long streaming responses (~10 KB markdown + reasoning blocks + animated tool calls)
- Multi-hour interactive session ending in clean exit (so `pprof.WriteHeapProfile` runs — note that jetsam SIGKILL bypasses this, so capturing requires exiting before the kill).

## Related

- #2861 — per-message retention leak (memory growth → jetsam kill). Independent: that grows `inuse_space` linearly; this one keeps `inuse_space` flat but burns `alloc_space` and CPU.
- #2884 — persistent toolset-failure notifications stacking. UX layer, separate.

