feat: chat runtime - pause/resume, SSE transport, React bindings#5

Merged
yyyyaaa merged 26 commits into main from feat/chat-runtime
May 13, 2026
Conversation

@marslavish
Contributor

@marslavish marslavish commented Apr 27, 2026

Builds on feat/features-complete. Adds the chat-runtime layer on top of the redesigned core: pausable tool execution, an SSE-serializable run handle, a headless React hook, a Next.js reference demo, and shared test infrastructure.

Summary

  • @agentic-kit/agent — pausable tools, AgentRunHandle (events / ReadableStream / SSE Response), maxSteps, decision lookup by toolCallId.
  • @agentic-kit/react (new package) — useChat hook that POSTs to an SSE endpoint and folds events into messages, streaming snapshot, pending decisions, and executing tools.
  • apps/nextjs-chat-demo (new) — Next.js App Router demo wiring agent.prompt(...).toResponse() to useChat, with a tool-approval UI.
  • agentic-kit — injectDeferralResults helper for the "user types instead of approving" flow; cross-fetch dropped in the OpenAI adapter in favor of native fetch.
  • Test infra — shared helpers under tools/test/ (scripted provider, SSE stub, fixtures), SSE parser tests, run-handle tests (443 LOC), useChat tests (1011 LOC).

What's New

@agentic-kit/agent — pause/resume + SSE

  • Pausable tools. Tools declare an optional decision JSON Schema. When the agent reaches a call with no attached decision, it emits tool_decision_pending and stops. Attach the decision to the matching toolCall block and call continue() to resume.
  • AgentRunHandle returned by prompt() / continue(), consumable exactly once as:
    • await handle — run to completion
    • handle.events() — async iterator of AgentEvents
    • handle.toReadableStream() — ReadableStream<AgentEvent>
    • handle.toResponse() — SSE Response ready to return from a Next.js / Hono / Express handler
  • parseSSEStream() exported from the package for clients consuming toResponse().
  • maxSteps cap on model invocations per run (resets in prompt(), persists across continue()); stopReason: 'completed' | 'max_steps' on agent_end.
  • Decision lookup by id. continue() and the underlying loop walk the message log backwards to find the most recent un-decided toolCall matching a given toolCallId, so callers may append unrelated messages between the pause and the response.
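The decision lookup can be sketched as a backwards walk over the message log. The type shapes below are simplified stand-ins for illustration, not the package's real types:

```typescript
// Simplified stand-ins; the real agentic-kit message types are richer.
type Block =
  | { type: "text"; text: string }
  | { type: "toolCall"; toolCallId: string; decision?: unknown };
type Message = { role: "user" | "assistant" | "tool"; blocks: Block[] };

// Walk the log backwards and return the most recent toolCall with the
// given id that has no decision attached yet, so callers may append
// unrelated messages between the pause and the response.
function findPendingToolCall(log: Message[], toolCallId: string) {
  for (let i = log.length - 1; i >= 0; i--) {
    for (const block of log[i].blocks) {
      if (
        block.type === "toolCall" &&
        block.toolCallId === toolCallId &&
        block.decision === undefined
      ) {
        return block;
      }
    }
  }
  return undefined;
}
```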

@agentic-kit/react — new package

  • Single hook useChat({ api, body?, initialMessages?, fetch?, on* }).
  • State: messages, streamingMessage, isStreaming, pendingDecisions: ReadonlyMap<string, ToolDecisionPendingEvent>, executingToolCallIds: ReadonlySet<string>, error.
  • Actions: send, sendMessages, setMessages (array or updater), respondWithDecision(toolCallId, value), abort().
  • abort() finalizes any visible streamed text as an assistant message and drops orphan toolCall blocks so the next call doesn't re-pause.
  • Callbacks: onMessage, onFinish, onDecisionPending, onToolExecutionStart/End, onError.
  • Headless — no UI, no run store, no runId. State lives in the message log.
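A minimal sketch of the event-folding idea behind the hook. The event shapes here are hypothetical; the real AgentEvent union and message model are richer:

```typescript
// Hypothetical event shapes for illustration only.
type ChatEvent =
  | { type: "text_delta"; delta: string }
  | { type: "message_end" };

type ChatState = { messages: string[]; streaming: string };

// Fold one streamed event into the hook's state: deltas accumulate into
// the streaming snapshot; message_end finalizes it as a message.
function reduceChat(state: ChatState, ev: ChatEvent): ChatState {
  switch (ev.type) {
    case "text_delta":
      return { ...state, streaming: state.streaming + ev.delta };
    case "message_end":
      return { messages: [...state.messages, state.streaming], streaming: "" };
  }
}
```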

agentic-kit — injectDeferralResults

For the case where the user types a new message while a tool is paused: synthesizes a stand-in toolResult for every toolCall that lacks both a decision and a paired result, so the server picks up a well-formed transcript.

```ts
import { injectDeferralResults, createUserMessage } from 'agentic-kit';

await sendMessages([
  ...injectDeferralResults(messages),
  createUserMessage(text),
]);
```
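Under the hood, such a helper might work roughly like this. Everything below (type shapes, the stand-in wording) is an illustrative guess, not the actual agentic-kit implementation:

```typescript
// Hypothetical message shapes; the real agentic-kit types differ.
type Block =
  | { type: "toolCall"; toolCallId: string; decision?: unknown }
  | { type: "toolResult"; toolCallId: string; content: string };
type Msg = { role: string; blocks: Block[] };

// For every toolCall that has neither a decision nor a paired toolResult,
// append a stand-in result so the server picks up a well-formed transcript.
function injectDeferralResultsSketch(messages: Msg[]): Msg[] {
  const resolved = new Set<string>();
  for (const m of messages)
    for (const b of m.blocks) {
      if (b.type === "toolResult") resolved.add(b.toolCallId);
      if (b.type === "toolCall" && b.decision !== undefined) resolved.add(b.toolCallId);
    }
  const standIns: Block[] = [];
  for (const m of messages)
    for (const b of m.blocks)
      if (b.type === "toolCall" && !resolved.has(b.toolCallId))
        standIns.push({
          type: "toolResult",
          toolCallId: b.toolCallId,
          content: "Deferred: the user replied without deciding.",
        });
  return standIns.length
    ? [...messages, { role: "tool", blocks: standIns }]
    : messages;
}
```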

apps/nextjs-chat-demo

  • /api/chat/route.ts constructs an Agent, applies prior messages, and returns agent.prompt(...).toResponse().
  • Client uses useChat with chat-input, chat-messages, tool-call-card, tool-approval-card components.

Test infrastructure

  • tools/test/ — repo-internal helpers (no package.json, imported via tsconfig paths). Scripted provider, SSE stub, fixtures, shared index.
  • Provider unit suites refactored onto the shared helpers; default pnpm test stays deterministic and offline.
  • New suites: sse.test.ts (parser), run-handle.test.ts (443 LOC), use-chat.test.ts (1011 LOC under jsdom), inject-deferral-results.test.ts.
  • @agentic-kit/react is the only package on jsdom; everything else stays on node.

Cleanup

  • cross-fetch removed from the OpenAI adapter — runtimes are expected to provide fetch.
  • Packages expose a source export condition so workspace consumers can resolve TypeScript directly.

Test Plan

  • pnpm install && pnpm build && pnpm test is green across packages
  • apps/nextjs-chat-demo boots, streams a chat turn, and a paused tool can be approved/denied via respondWithDecision
  • Abort mid-stream preserves visible text and clears orphan toolCalls; next send() does not re-pause
  • injectDeferralResults flow: pause a tool, send a fresh user message instead of deciding, verify the next request carries synthesized stand-in results

@marslavish marslavish changed the base branch from main to feat/features-complete April 27, 2026 14:44
@marslavish marslavish changed the title from feat: chat runtime foundation — pausable tools and test infra to (WIP) feat: pause/resume runtime, run store, useChat Apr 27, 2026
@marslavish marslavish changed the title from (WIP) feat: pause/resume runtime, run store, useChat to feat: pause/resume runtime, run store, useChat May 12, 2026
@marslavish marslavish changed the title from feat: pause/resume runtime, run store, useChat to feat: chat runtime - pause/resume, SSE transport, React bindings May 12, 2026
@marslavish marslavish changed the base branch from feat/features-complete to main May 12, 2026 02:09
Comment thread packages/agent/src/run-handle.ts
@yyyyaaa
Contributor

yyyyaaa commented May 13, 2026

wow great work, this is pretty complicated. I just have some design questions:

  1. Decision-resume ordering. When a user message arrives during a pause, continue() appends the tool result at the tail of the log instead of
    adjacent to its assistant block — the transform layer then synthesizes a placeholder for OpenAI/Anthropic. Is the non-trailing case in scope, and
    how do you want to handle it (insert adjacent? reorder? reject continue() if a user message intervened? document as a constraint)?

  2. Concurrency contract for prompt() / continue(). isStreaming is set on consumption, not on call, so two synchronous prompt()s both build handles
    and race. What's the intended contract — reject-second, queue/steering, preempt, or doc-only?

  3. events() early-break does not cancel. readableStreamToAsyncIterable releases the lock but never cancels (run-handle.ts:187), so breaking out of
    for await parks the producer forever. toReadableStream/toResponse cancel correctly. Should events() match them, or is the asymmetry intentional?

  4. Semantics of abort() during tool execution. executeOneTool catches the abort, records it as an isError tool result, and the loop calls the model
    again. Abort during stream-generation does terminate. What is abort() supposed to mean during tool exec — stop immediately, drain remaining tools then stop, or per-tool only (current)?

@marslavish
Contributor Author

Thanks for the deep review — all four questions addressed in the latest push:

1. Decision-resume ordering. Went with reject-with-pointer. continue() now throws when non-toolResult messages have been appended after the pending assistant, with the error message pointing at injectDeferralResults() + prompt() for the user-typed-instead-of-approving flow. The transform-layer placeholder remains a fallback for legacy data but the typed path is now enforced. Added a test covering the throw.

2. Concurrency contract for prompt() / continue(). Reject-second. The agent now tracks an outstandingHandle and assertIdle() throws if prompt() or continue() is invoked while a prior handle hasn't been consumed (or before abort()). The handle clears itself as soon as its binder runs, so single-use is enforced at call time rather than at consumption time. Added a test.

3. events() early-break does not cancel. Was a bug, now matches toReadableStream / toResponse. readableStreamToAsyncIterable calls reader.cancel() in finally if the iteration didn't drain, so breaking out of for await propagates cancellation upstream and aborts the producer. Added a test.
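The cancel-on-early-break pattern described here can be sketched as follows. Structural stand-ins mirror the ReadableStream reader interface so the snippet is self-contained; the real run-handle code operates on actual web streams:

```typescript
// Structural stand-ins mirroring the ReadableStream reader interface.
type ReadResult<T> = { done: true; value?: undefined } | { done: false; value: T };
interface ReaderLike<T> {
  read(): Promise<ReadResult<T>>;
  cancel(): Promise<void>;
  releaseLock(): void;
}
interface StreamLike<T> { getReader(): ReaderLike<T>; }

// Yield values until the stream ends; if the consumer breaks out early,
// the finally block cancels the reader so the producer is not parked.
async function* streamToAsyncIterable<T>(stream: StreamLike<T>): AsyncGenerator<T> {
  const reader = stream.getReader();
  let drained = false;
  try {
    for (;;) {
      const result = await reader.read();
      if (result.done) { drained = true; return; }
      yield result.value;
    }
  } finally {
    if (!drained) await reader.cancel(); // propagate cancellation upstream
    reader.releaseLock();
  }
}
```

Breaking out of a for await loop calls the generator's return(), which resumes execution in the finally block, so the cancel reaches the producer.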

4. abort() during tool execution. Went with "stop after current". The already-running tool receives the AbortSignal and can opt to abort itself; the loop no longer dispatches subsequent tools and won't re-invoke the model. agent_end.stopReason now carries 'completed' | 'max_steps' | 'aborted' so consumers can distinguish how a run ended. Added a test.
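A rough sketch of the "stop after current" semantics, simplified to the tool-dispatch loop only (the real loop also re-invokes the model and emits events):

```typescript
// A tool receives the run's AbortSignal and may opt to abort itself.
type Tool = (signal: AbortSignal) => Promise<string>;

// Let the in-flight tool settle, then short-circuit: no further tools
// are dispatched and the caller would not re-invoke the model.
async function runTools(
  tools: Tool[],
  signal: AbortSignal
): Promise<{ results: string[]; stopReason: "completed" | "aborted" }> {
  const results: string[] = [];
  for (const tool of tools) {
    if (signal.aborted) return { results, stopReason: "aborted" };
    results.push(await tool(signal)); // current tool runs to completion
    if (signal.aborted) return { results, stopReason: "aborted" }; // stop after current
  }
  return { results, stopReason: "completed" };
}
```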

@yyyyaaa
Contributor

yyyyaaa commented May 13, 2026

Looks good, thanks! I'll merge and publish now

@yyyyaaa yyyyaaa merged commit cd00eaf into main May 13, 2026
12 checks passed
yyyyaaa added a commit that referenced this pull request May 13, 2026
…gaps (#7)

* fix(react): close three useChat gaps surfaced by PR #5 review

- send(): sync messagesRef before runStream so rapid synchronous sends
  both reach the outgoing request body
- useState init: hydrate pendingDecisions from initialMessages so
  rehydrated paused tool calls render decision UI immediately
- unmount: abort the in-flight fetch on cleanup to prevent leaked
  streams when the consumer unmounts mid-request

* test: lock down regressions surfaced by PR #5 review

Adds four regression tests that act as acceptance criteria for the
fixes shipped in PR #5 and the companion useChat fixes in this branch.

agent.test.ts:
- injectDeferralResults() + prompt() places the synthetic toolResult
  adjacent to its assistant block (verifies the documented "user typed
  instead of deciding" recovery pattern produces provider-valid order).

use-chat.test.ts:
- initialMessages with a paused tool call hydrates pendingDecisions.
- Two rapid synchronous send() calls both reach the outgoing body.
- Unmount aborts the in-flight fetch.
