feat(0.14.0): runDurableTurn + DurableChatTurnEngine #22

Merged: drewstone merged 1 commit into main from feat/durable-turn-primitive on May 20, 2026

@drewstone
Contributor

Summary

Two reusable primitives so every product chat handler routes durability through one place instead of copy-pasting it across legal/gtm/creative/tax.

runDurableTurn — streaming, backend-agnostic, checkpoint+replay durable turn

  • Fresh run: producer runs, events forward live (streaming preserved), final text checkpointed on drain.
  • Replay: a completed turn re-emits cached text as one synthetic event — producer never constructed, no LLM call, no double-billing.
  • Mid-stream crash: re-runs from the top. The substrate checkpoints JSON at step granularity — there is no partial-stream checkpoint. This is the honest durability ceiling, documented in the source.
  • Generic over event type — products stream their own NDJSON shape or RuntimeStreamEvent.
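
The fresh-run and replay behavior above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `CheckpointStore`, `MemoryStore`, `Producer`, and `durableTurnSketch` are hypothetical names, and the real `runDurableTurn` signature and store interface may differ.

```typescript
// Hypothetical checkpoint store; the real substrate's interface may differ.
interface CheckpointStore {
  get(runId: string): Promise<string | undefined>;
  put(runId: string, finalText: string): Promise<void>;
}

// In-memory stand-in used only for this sketch.
class MemoryStore implements CheckpointStore {
  private readonly data = new Map<string, string>();
  async get(runId: string): Promise<string | undefined> {
    return this.data.get(runId);
  }
  async put(runId: string, finalText: string): Promise<void> {
    this.data.set(runId, finalText);
  }
}

// A producer streams events and reports its final text once drained.
interface Producer<E> {
  events: AsyncIterable<E>;
  finalText(): string;
}

// Generic over the event type E: events are forwarded, never inspected.
async function* durableTurnSketch<E>(
  store: CheckpointStore,
  runId: string,
  makeProducer: () => Producer<E>,
  replayEvent: (text: string) => E,
): AsyncGenerator<E, void, undefined> {
  const cached = await store.get(runId);
  if (cached !== undefined) {
    // Replay: the producer factory is never called, so no LLM call happens.
    yield replayEvent(cached);
    return;
  }
  const producer = makeProducer();
  for await (const event of producer.events) {
    yield event; // forwarded live, so streaming is preserved
  }
  // Reached only after a full drain. A crash mid-stream leaves no
  // checkpoint, so the next attempt re-runs from the top.
  await store.put(runId, producer.finalText());
}
```

In this sketch, calling `durableTurnSketch` twice with the same `runId` streams the producer's chunks the first time and yields a single synthetic event carrying the cached text the second time.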

DurableChatTurnEngine — framework-neutral chat-turn orchestrator

Owns what was duplicated four times: durable checkpointing, NDJSON line protocol, session.run.* lifecycle envelope, ordered persist/post-process hooks, trace flush. Product-specific behavior lives in hooks: produce / persistAssistantMessage / onTurnComplete / onEvent / transformFinalText / traceFlush. The engine takes resolved values (identity, store, waitUntil), never a Request or Context, so React Router and Hono products use it identically.

  • Replay skips persist + post-process → a retried turn never double-writes.
  • Producer failure → error + session.run.failed; stream always closes.
  • Hook errors swallowed + logged → a post-process failure never fails a streamed turn.
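
As a rough illustration of the replay-skip and hook-isolation rules above, the post-turn step might look like the sketch below. The hook names come from the description; the `TurnHooks` shape, the `runPostTurnHooks` helper, and its parameters are assumptions made for the example, not the engine's actual API.

```typescript
// Hypothetical hook shape; only the hook names are taken from the PR text.
interface TurnHooks {
  persistAssistantMessage?: (finalText: string) => Promise<void>;
  onTurnComplete?: (finalText: string) => Promise<void>;
}

// Runs hooks in a fixed order. A failing hook is logged and skipped so it
// cannot fail the turn. Returns the names of the hooks that succeeded.
async function runPostTurnHooks(
  hooks: TurnHooks,
  finalText: string,
  replayed: boolean,
  log: (message: string) => void,
): Promise<string[]> {
  const ran: string[] = [];
  // A replayed turn was already persisted once, so skip persist and
  // post-process entirely to avoid double-writes.
  if (replayed) return ran;

  const ordered: Array<[string, ((text: string) => Promise<void>) | undefined]> = [
    ["persistAssistantMessage", hooks.persistAssistantMessage],
    ["onTurnComplete", hooks.onTurnComplete],
  ];
  for (const [name, hook] of ordered) {
    if (!hook) continue;
    try {
      await hook(finalText);
      ran.push(name);
    } catch (err) {
      // Swallow and log: later hooks still get their chance to run.
      log(`${name} failed: ${err instanceof Error ? err.message : String(err)}`);
    }
  }
  return ran;
}
```

Persisting before `onTurnComplete` is one plausible reading of "ordered"; the actual order is defined in the engine source.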

Test plan

  • 36 new tests (durable-turn 15, chat-engine 21)
  • Each runs against the full InMemory / FileSystem / D1-over-better-sqlite3 store matrix
  • pnpm test — 213/213 pass
  • pnpm typecheck + pnpm lint clean

Follow-up

Migrate all 4 product chat handlers onto DurableChatTurnEngine — each api.chat collapses to a thin route adapter (auth + parse + engine.runTurn). Per-product durable-chat.ts deleted. Supersedes legal#75, gtm#126, creative#102.

…at layer

Two reusable primitives so every product chat handler routes durability
through one place instead of copy-pasting it four times.

runDurableTurn (src/durable/turn.ts) — a streaming, backend-agnostic,
checkpoint+replay durable turn. Generic over the event type; never
inspects events, only forwards them and reads finalText() after drain.
  - Fresh run: producer runs, events forward live (streaming preserved),
    final text checkpointed on drain.
  - Replay: a completed turn re-emits cached text as one synthetic event;
    the producer is never constructed — no LLM call, no double-billing.
  - Mid-stream crash: a turn that died while streaming re-runs from the
    top (the substrate checkpoints JSON at step granularity — there is no
    partial-stream checkpoint; this is the honest durability ceiling).
  - One step, lease claimed once via startOrResume; concurrent workers on
    the same runId are rejected with DurableRunLeaseHeldError.
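
The single-lease rule can be pictured with a small in-memory sketch. The names `startOrResume` and `DurableRunLeaseHeldError` appear in the commit message, but the class bodies, the `LeaseTable` helper, and the `workerId` parameter below are assumptions for illustration; a real lease would also need expiry so a crashed worker does not hold a run forever.

```typescript
// Illustrative stand-in for the error named in the commit message.
class DurableRunLeaseHeldError extends Error {
  readonly runId: string;
  readonly holder: string;
  constructor(runId: string, holder: string) {
    super(`run ${runId} is leased by ${holder}`);
    this.name = "DurableRunLeaseHeldError";
    this.runId = runId;
    this.holder = holder;
    // Keeps instanceof working when compiled to older targets.
    Object.setPrototypeOf(this, DurableRunLeaseHeldError.prototype);
  }
}

// Hypothetical in-memory lease table keyed by runId.
class LeaseTable {
  private readonly holders = new Map<string, string>();

  // The same worker may resume its own run; a different worker is rejected.
  startOrResume(runId: string, workerId: string): void {
    const holder = this.holders.get(runId);
    if (holder !== undefined && holder !== workerId) {
      throw new DurableRunLeaseHeldError(runId, holder);
    }
    this.holders.set(runId, workerId);
  }

  // Only the current holder can release the lease.
  release(runId: string, workerId: string): void {
    if (this.holders.get(runId) === workerId) {
      this.holders.delete(runId);
    }
  }
}
```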

DurableChatTurnEngine (src/durable/chat-engine.ts) — the framework-neutral
chat-turn orchestrator. Owns what was duplicated across legal/gtm/creative/
tax: durable checkpointing, the NDJSON StreamEvent line protocol, the
session.run.started/completed/failed lifecycle envelope, ordered persist/
post-process hooks, trace flush. Everything product-specific is a hook:
produce / persistAssistantMessage / onTurnComplete / onEvent /
transformFinalText / traceFlush. Takes resolved values (identity tuple,
store, waitUntil) — never a Request or a Context — so React Router and
Hono products use it identically. Replay skips persist + post-process so
a retried turn never double-writes. Producer failure becomes an error +
session.run.failed pair; the stream always closes; hook errors are
swallowed + logged so a post-process failure never fails a streamed turn.
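
To make the line protocol and lifecycle envelope concrete, here is a hedged sketch of how one turn could be serialized as NDJSON. The `session.run.*` type names come from the text above; the field names (`runId`, `text`, `message`), the `text` and `error` event shapes, and the helper names are invented for the example and may not match the real StreamEvent type.

```typescript
// Assumed event shapes; only the session.run.* names come from the source.
type LifecycleEvent =
  | { type: "session.run.started"; runId: string }
  | { type: "session.run.completed"; runId: string }
  | { type: "session.run.failed"; runId: string; message: string };
type TurnTextEvent = { type: "text"; text: string };
type TurnErrorEvent = { type: "error"; message: string };
type ChatLineEvent = LifecycleEvent | TurnTextEvent | TurnErrorEvent;

// NDJSON: one JSON object per line, each terminated by a newline.
function toNdjsonLine(event: ChatLineEvent): string {
  return JSON.stringify(event) + "\n";
}

// Wraps the streamed chunks in a started/completed envelope, or in a
// started/error/failed sequence when the producer fails.
function envelopeLines(runId: string, chunks: string[], failure?: Error): string[] {
  const events: ChatLineEvent[] = [{ type: "session.run.started", runId }];
  for (const text of chunks) {
    events.push({ type: "text", text });
  }
  if (failure) {
    events.push({ type: "error", message: failure.message });
    events.push({ type: "session.run.failed", runId, message: failure.message });
  } else {
    events.push({ type: "session.run.completed", runId });
  }
  return events.map(toNdjsonLine);
}
```

Because `JSON.stringify` escapes newlines inside strings, each emitted line stays a single parseable record even when a chunk contains line breaks.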

36 new tests (durable-turn 15, chat-engine 21), each run against the full
InMemory / FileSystem / D1-over-better-sqlite3 store matrix. Total suite:
213 tests, typecheck + biome clean.
drewstone merged commit 03c7426 into main on May 20, 2026
1 check passed
drewstone deleted the feat/durable-turn-primitive branch on May 20, 2026 at 16:58