Skip to content

Fix local Codex queued messages getting stuck#2185

Closed
fredluz wants to merge 1 commit into
PostHog:mainfrom
fredluz:fix/codex-queued-messages-stuck
Closed

Fix local Codex queued messages getting stuck#2185
fredluz wants to merge 1 commit into
PostHog:mainfrom
fredluz:fix/codex-queued-messages-stuck

Conversation

@fredluz
Copy link
Copy Markdown

@fredluz fredluz commented May 18, 2026

Problem

Local Codex queued follow-up messages can remain stuck in the conversation after the active turn completes.

Closes #2182

Changes

_posthog/turn_complete can arrive through the streamed ACP subscription before the matching JSON-RPC session/prompt response for local Codex sessions. The renderer was clearing currentPromptId for all sessions when that notification arrived. When the later prompt response arrived, it was treated as stale and returned before the local queue-drain path could run.

This changes turn_complete prompt-state clearing to apply only to cloud sessions. Local sessions keep using the matching prompt response as the queue-drain boundary.

Added a regression test for the local Codex ordering case: queued message present, turn_complete arrives first, then the matching prompt response drains the queue.

How did you test this?

Manual:

  • Reproduced before the fix: start a local Codex turn, send a follow-up while pending, observe the follow-up remains stuck as queued after the first turn completes.
  • Retested after the fix: same flow, queued follow-up sends automatically after the first turn completes.

Automated:

pnpm --filter code exec biome check src/renderer/features/sessions/service/service.ts src/renderer/features/sessions/service/service.test.ts
pnpm --filter code typecheck
pnpm --filter code exec vitest run src/renderer/features/sessions/service/service.test.ts

Also run by pre-commit hook:

biome check --write --unsafe --files-ignore-unknown=true --no-errors-on-unmatched
pnpm typecheck

Publish to changelog?

no

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0093fedd79

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

});
if (isTurnCompleteEvent(acpMsg)) {
const session = sessionStoreSetters.getSessions()[taskRunId];
if (session?.isCloud) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Clear cancelled local turns before their response

When a user cancels a local Codex turn that has queued follow-up messages, Codex emits _posthog/turn_complete before the matching JSON-RPC prompt response, but this new session?.isCloud guard leaves the old currentPromptId in place for local sessions. Since cancelPrompt() only clears isPromptPending, the later cancelled response still matches the stale id in handleSessionEvent and drains the queued messages, even though the turn was cancelled; previously the turn-complete event cleared the id so that response was ignored as stale.

Useful? React with 👍 / 👎.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 18, 2026

Prompt To Fix All With AI
Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 1
apps/code/src/renderer/features/sessions/service/service.test.ts:2304-2379
**Test accesses private method via type cast — first such pattern in this file**

The test casts `service as unknown as { handleSessionEvent: ... }` to call a `private` method directly. This is the first occurrence of this pattern in the test file, and it makes the test brittle to any rename or visibility change on `handleSessionEvent`. The existing approach for similar event-ordering tests is to capture the `onData` callback from `mockTrpcAgent.onSessionEvent.subscribe` and call it directly in sequence, which exercises the actual public subscription path. The test is also placed in `describe("sendPrompt")` despite not calling `sendPrompt` at all; it fits better alongside the cloud queue-flush test under `watchCloudTask` or in a dedicated queue-management block.

Reviews (1): Last reviewed commit: "Fix local Codex queued messages getting ..." | Re-trigger Greptile

Comment on lines 2304 to 2379
it("drains local queued messages when Codex turn_complete arrives before prompt response", async () => {
const service = getSessionService() as unknown as {
handleSessionEvent: (
taskRunId: string,
acpMsg: {
type: "acp_message";
ts: number;
message: Record<string, unknown>;
},
) => void;
};
const queuedSession = createMockSession({
taskRunId: "run-123",
taskId: "task-123",
status: "connected",
isPromptPending: true,
currentPromptId: 42,
messageQueue: [
{ id: "q-1", content: "investigate another", queuedAt: 1700000001 },
],
});
const sessions: Record<string, AgentSession> = {
"run-123": queuedSession,
};
mockSessionStoreSetters.getSessions.mockImplementation(() => sessions);
mockSessionStoreSetters.getSessionByTaskId.mockImplementation(
(taskId: string) =>
taskId === "task-123" ? sessions["run-123"] : undefined,
);
mockSessionStoreSetters.updateSession.mockImplementation(
(taskRunId: string, updates: Partial<AgentSession>) => {
const existing = sessions[taskRunId];
if (!existing) return;
sessions[taskRunId] = { ...existing, ...updates };
},
);
mockSessionStoreSetters.dequeueMessagesAsText.mockReturnValue(
"investigate another",
);
mockTrpcAgent.prompt.mutate.mockResolvedValue({ stopReason: "end_turn" });

service.handleSessionEvent("run-123", {
type: "acp_message",
ts: 1700000002,
message: {
jsonrpc: "2.0",
method: "_posthog/turn_complete",
params: { sessionId: "acp-session", stopReason: "end_turn" },
},
});
expect(sessions["run-123"].currentPromptId).toBe(42);

service.handleSessionEvent("run-123", {
type: "acp_message",
ts: 1700000003,
message: {
jsonrpc: "2.0",
id: 42,
result: { stopReason: "end_turn" },
},
});

await vi.waitFor(() => {
expect(
mockSessionStoreSetters.dequeueMessagesAsText,
).toHaveBeenCalledWith("task-123");
expect(mockTrpcAgent.prompt.mutate).toHaveBeenCalledWith({
sessionId: "run-123",
prompt: [{ type: "text", text: "investigate another" }],
});
});
});

it("queues message when compaction is in progress", async () => {
const service = getSessionService();
mockSessionStoreSetters.getSessionByTaskId.mockReturnValue(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Test accesses private method via type cast — first such pattern in this file

The test casts service as unknown as { handleSessionEvent: ... } to call a private method directly. This is the first occurrence of this pattern in the test file, and it makes the test brittle to any rename or visibility change on handleSessionEvent. The existing approach for similar event-ordering tests is to capture the onData callback from mockTrpcAgent.onSessionEvent.subscribe and call it directly in sequence, which exercises the actual public subscription path. The test is also placed in describe("sendPrompt") despite not calling sendPrompt at all; it fits better alongside the cloud queue-flush test under watchCloudTask or in a dedicated queue-management block.

Prompt To Fix With AI
This is a comment left during a code review.
Path: apps/code/src/renderer/features/sessions/service/service.test.ts
Line: 2304-2379

Comment:
**Test accesses private method via type cast — first such pattern in this file**

The test casts `service as unknown as { handleSessionEvent: ... }` to call a `private` method directly. This is the first occurrence of this pattern in the test file, and it makes the test brittle to any rename or visibility change on `handleSessionEvent`. The existing approach for similar event-ordering tests is to capture the `onData` callback from `mockTrpcAgent.onSessionEvent.subscribe` and call it directly in sequence, which exercises the actual public subscription path. The test is also placed in `describe("sendPrompt")` despite not calling `sendPrompt` at all; it fits better alongside the cloud queue-flush test under `watchCloudTask` or in a dedicated queue-management block.

How can I resolve this? If you propose a fix, please make it concise.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

@fredluz fredluz force-pushed the fix/codex-queued-messages-stuck branch from 0093fed to a05668a Compare May 18, 2026 13:59
@fredluz
Copy link
Copy Markdown
Author

fredluz commented May 18, 2026

Thanks — updated.

Cloud sessions still use _posthog/turn_complete as the turn boundary for prompt-state clearing and queue flushing. The change only prevents local sessions from clearing currentPromptId on that notification, because local Codex can receive turn_complete before the matching prompt response.

I also added currentPromptId: null to cancelPrompt() so cancelled local prompt responses are treated as stale and cannot drain queued messages.

I manually tested the local Codex flow before and after the fix. I can’t manually run cloud tasks from my current local setup, but the existing session-service cloud queue/hydration coverage still passes.

@k11kirky
Copy link
Copy Markdown
Contributor

Thanks @fredluz - I actually had a fix on a branch for this already here: #2197

Looks very similar so will merge and close

@k11kirky k11kirky closed this May 18, 2026
@k11kirky
Copy link
Copy Markdown
Contributor

duplicate PR #2197

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Codex queued messages not working

2 participants