Closed
6 changes: 6 additions & 0 deletions .changeset/recover-interrupted-tool-exchange.md
@@ -0,0 +1,6 @@
---
"@moonshot-ai/agent-core": patch
"@moonshot-ai/kimi-code": patch
---

Recover resumed sessions that were interrupted after recording a tool call but before recording its tool result.
23 changes: 23 additions & 0 deletions packages/agent-core/src/agent/context/index.ts
@@ -19,6 +19,8 @@ const TOOL_EMPTY_STATUS = '<system>Tool output is empty.</system>';
const TOOL_EMPTY_ERROR_STATUS =
  '<system>ERROR: Tool execution failed. Tool output is empty.</system>';
const TOOL_OUTPUT_EMPTY_TEXT = 'Tool output is empty.';
const INTERRUPTED_TOOL_RESULT =
  'Kimi Code was interrupted before this tool call could record a result. Treat this tool call as failed and continue from the latest user instruction.';

export class ContextMemory {
  private _history: ContextMessage[] = [];
@@ -206,6 +208,27 @@ export class ContextMemory {
    this.pushHistory(message);
  }

  recoverInterruptedToolExchanges(): number {
    // A sealed step can intentionally keep waiting for async tool output across
    // context operations. An open step at replay EOF means the process stopped
    // before the loop could finish pairing recorded tool calls.
    const missingToolResultIds = this.openSteps.size > 0 ? [...this.pendingToolResultIds] : [];
[P2] Recover pending calls after compaction clears the step

If manual or auto compaction completes after a `tool.call` is recorded but before its `tool.result`, `context.applyCompaction` preserves the assistant/tool-call message but clears `openSteps`. After a crash at that point, replay reaches EOF with `pendingToolResultIds` still populated and `openSteps.size === 0`, so this guard synthesizes no result. The next model request then still contains the compacted history with an assistant tool call and no matching tool message, which reproduces the provider 400 error that this recovery is meant to prevent.
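
A minimal sketch of one possible direction, not a patch against the real class: key the recovery on `pendingToolResultIds` at replay EOF instead of on `openSteps`. `SketchContext` is a hypothetical stand-in that models only the two fields named in this comment. `recoverPendingCalls` and `sealedAsyncToolCallIds` are assumed names, the latter standing in for whatever bookkeeping would be needed so that sealed steps that intentionally wait for async tool output are not failed by mistake.

```typescript
// Hypothetical stand-in for the ContextMemory fields discussed in this comment.
// Nothing here is the real agent-core API.
class SketchContext {
  openSteps = new Set<string>();
  pendingToolResultIds = new Set<string>();
  // Assumed bookkeeping: calls whose step was sealed on purpose to await async output.
  sealedAsyncToolCallIds = new Set<string>();
  recovered: string[] = [];

  // Guard as written in the diff: recover only if a step is still open at replay EOF.
  recoverWithOpenStepGuard(): number {
    const missing = this.openSteps.size > 0 ? [...this.pendingToolResultIds] : [];
    return this.synthesize(missing);
  }

  // Alternative guard: recover every pending call that is not intentionally waiting.
  recoverPendingCalls(): number {
    const missing = [...this.pendingToolResultIds].filter(
      (id) => !this.sealedAsyncToolCallIds.has(id),
    );
    return this.synthesize(missing);
  }

  // Stands in for appending a synthetic error tool.result for each id.
  private synthesize(ids: string[]): number {
    for (const id of ids) {
      this.recovered.push(id);
      this.pendingToolResultIds.delete(id);
    }
    this.openSteps.clear();
    return ids.length;
  }
}

// State at replay EOF after compaction cleared openSteps while a call was pending.
function compactedCrashState(): SketchContext {
  const ctx = new SketchContext();
  ctx.pendingToolResultIds.add('call_task_output');
  return ctx;
}
```

In this model, `compactedCrashState().recoverWithOpenStepGuard()` leaves the call pending, while `recoverPendingCalls()` closes it. Whether the sealed-step exclusion is the right discriminator depends on how async tool output is tracked in the real code.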


    for (const toolCallId of missingToolResultIds) {
      this.appendLoopEvent({
[P2] Include recovered messages in replay output

When the first resume repairs an interrupted tool exchange, this call runs after `AgentRecords.restore` has cleared `records.restoring`. `ReplayBuilder.push` only records messages while restoring, so the synthetic tool result created here, along with any deferred user/background messages flushed immediately afterward, is absent from `ResumedAgentState.replay`. The TUI renders that replay on resume, so the first resume after a crash can still show an unresolved tool call and omit the queued continuation even though context and persistence were repaired. The transcript only looks correct after a second resume from the newly appended record.
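
The ordering problem can be illustrated with a small model. This is a sketch under assumptions, not the real `ReplayBuilder`: `SketchReplayBuilder`, `ReplayMessage`, and both helper functions are hypothetical names, and the only behavior taken from this comment is that `push` records messages solely while the restoring flag is set.

```typescript
// Hypothetical model of the ordering described in this comment; names are illustrative.
type ReplayMessage = { role: 'assistant' | 'tool' | 'user'; text: string };

class SketchReplayBuilder {
  readonly replay: ReplayMessage[] = [];
  restoring = false;

  // Mirrors the described behavior: messages are captured only while restoring.
  push(message: ReplayMessage): void {
    if (this.restoring) {
      this.replay.push(message);
    }
  }
}

const toolCall: ReplayMessage = { role: 'assistant', text: 'tool call' };
const syntheticResult: ReplayMessage = { role: 'tool', text: 'interrupted' };

// Recovery runs after the restoring flag is cleared, so the synthetic result is dropped.
function restoreThenRecover(): ReplayMessage[] {
  const builder = new SketchReplayBuilder();
  builder.restoring = true;
  builder.push(toolCall);
  builder.restoring = false;
  builder.push(syntheticResult);
  return builder.replay;
}

// Recovery runs before the flag is cleared, so the replay includes the synthetic result.
function recoverWhileRestoring(): ReplayMessage[] {
  const builder = new SketchReplayBuilder();
  builder.restoring = true;
  builder.push(toolCall);
  builder.push(syntheticResult);
  builder.restoring = false;
  return builder.replay;
}
```

In this model the first ordering yields a replay with an unresolved assistant tool call, and the second yields the call followed by its tool message. Moving the recovery inside the restoring window is only one option; pushing the recovered messages into the replay explicitly would be another.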


        type: 'tool.result',
        parentUuid: toolCallId,
        toolCallId,
        result: {
          output: INTERRUPTED_TOOL_RESULT,
          isError: true,
        },
      });
    }
    this.openSteps.clear();
    this.flushDeferredMessagesIfToolExchangeClosed();
    return missingToolResultIds.length;
  }

  private flushDeferredMessagesIfToolExchangeClosed(): void {
    if (this.pendingToolResultIds.size > 0 || this.deferredMessages.length === 0) {
      return;
4 changes: 4 additions & 0 deletions packages/agent-core/src/agent/records/index.ts
@@ -185,6 +185,10 @@ export class AgentRecords {
      this.persistence.rewrite(replayedRecords);
      await this.persistence.flush();
    }
    const recoveredToolResults = this.agent.context.recoverInterruptedToolExchanges();
    if (recoveredToolResults > 0) {
      await this.persistence.flush();
    }
    if (this.agent.blobStore !== undefined) {
      for (const msg of this.agent.context.history) {
        await this.agent.blobStore.rehydrateParts(msg.content);
81 changes: 81 additions & 0 deletions packages/agent-core/test/agent/records/index.test.ts
@@ -91,6 +91,87 @@ describe('AgentRecords persistence metadata', () => {
    expect(persistence.records.filter((record) => record.type === 'metadata')).toHaveLength(1);
  });

  it('repairs orphan tool calls after replaying an interrupted session', async () => {
    const stepUuid = 'interrupted-step';
    const persistence = new InMemoryAgentRecordPersistence([
      {
        type: 'metadata',
        protocol_version: AGENT_WIRE_PROTOCOL_VERSION,
        created_at: 1,
      },
      {
        type: 'context.append_message',
        message: {
          role: 'user',
          content: [{ type: 'text', text: 'wait for task output' }],
          toolCalls: [],
          origin: { kind: 'user' },
        },
      },
      {
        type: 'context.append_loop_event',
        event: { type: 'step.begin', uuid: stepUuid, turnId: '0', step: 1 },
      },
      {
        type: 'context.append_loop_event',
        event: {
          type: 'tool.call',
          uuid: 'call_task_output',
          turnId: '0',
          step: 1,
          stepUuid,
          toolCallId: 'call_task_output',
          name: 'TaskOutput',
          args: { block: true },
        },
      },
      {
        type: 'context.append_message',
        message: {
          role: 'user',
          content: [{ type: 'text', text: 'continue' }],
          toolCalls: [],
          origin: { kind: 'user' },
        },
      },
    ]);
    const ctx = testAgent({ persistence });

    await ctx.agent.records.replay();

    expect(ctx.agent.context.history.map((message) => message.role)).toEqual([
      'user',
      'assistant',
      'tool',
      'user',
    ]);
    expect(ctx.agent.context.history[2]).toMatchObject({
      role: 'tool',
      toolCallId: 'call_task_output',
      isError: true,
    });
    expect(ctx.agent.context.messages[2]?.content).toEqual([
      {
        type: 'text',
        text: expect.stringContaining(
          'Kimi Code was interrupted before this tool call could record a result.',
        ),
      },
    ]);
    expect(persistence.records.at(-1)).toMatchObject({
      type: 'context.append_loop_event',
      event: {
        type: 'tool.result',
        parentUuid: 'call_task_output',
        toolCallId: 'call_task_output',
        result: {
          output: expect.stringContaining('interrupted before this tool call'),
          isError: true,
        },
      },
    });
  });

  it('does not rewrite records that already use the current wire version', async () => {
    const persistence = new RecordingInMemoryAgentRecordPersistence([
      {