Add stream-stall telemetry to Anthropic Messages streaming (#321432) #321671
Open
meganrogge wants to merge 4 commits into
The Anthropic Messages streaming body can return 200 with headers and then hang mid-stream with no further chunk and no error, leaving the for-await loop pending indefinitely. Add a 120s idle watchdog that resets on each chunk, emits a messagesApi.streamIdleTimeout telemetry event when it trips (so the stall is observable in the wild), rejects the iterator, and cancels the underlying reader so the stream settles instead of hanging.
Drop the abort/reject behavior so request behavior is unchanged. The idle watchdog now only detects a stalled Anthropic Messages stream and emits the messagesApi.streamIdleTimeout event once, so we can first measure how often the #321432 stall happens in the wild before changing behavior.
kycutler previously approved these changes on Jun 16, 2026
bhavyaus reviewed on Jun 16, 2026
lszomoru approved these changes on Jun 16, 2026
What
Adds an observe-only idle watchdog to processResponseFromMessagesEndpoint in messagesApi.ts (the Anthropic Messages API path used by vscode-copilot-cli with Claude models). It detects when the streaming body stalls and emits a new messagesApi.streamIdleTimeout telemetry event. It does not change request behavior: the stream is not aborted.
Why
This addresses the streaming-stall class of hang in #321432: the HTTP response can return 200 with headers and then stall mid-stream. The body iterator (for await (const chunk of response.body)) stops producing chunks but never completes and never errors, so the request hangs indefinitely. Today only the success path is instrumented, so when this happens the stall is invisible in telemetry.
This was the root cause of 4 of the 6 X_AGENT_STILL_RESPONDING timeouts in MSBench run 27640643629 (the others were genuine 60-min budget exhaustion). The stall is environment-agnostic (the eval harness just amplifies its frequency), so it likely affects real users too. We want to measure how often this happens in the wild before changing any request behavior.
What this change does
- Emits messagesApi.streamIdleTimeout once when the watchdog trips, and otherwise leaves the stream untouched.
- [...] finally.
This is intentionally non-invasive: no abort, no reject, no behavior change. A follow-up can add mitigation once we have data on frequency and shape.
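For readers who want a concrete picture, the observe-only watchdog described above could be sketched roughly as follows. This is not the code in messagesApi.ts: IdleWatchdog, StallInfo, Schedule, and consumeWithWatchdog are hypothetical names, and the injectable scheduler and clock are an assumption made here so the timer logic can be exercised without waiting 120 seconds.

```typescript
// Illustrative sketch only; every name here is hypothetical, not the PR's code.
type Cancel = () => void;
type Schedule = (fn: () => void, ms: number) => Cancel;

// Default scheduler backed by the real timer APIs.
const realSchedule: Schedule = (fn, ms) => {
  const handle = setTimeout(fn, ms);
  return () => clearTimeout(handle);
};

interface StallInfo {
  idleMs: number;          // ms since the last chunk when the watchdog tripped
  elapsedMs: number;       // ms since the stream started
  chunksReceived: number;  // chunks seen before the stall
}

class IdleWatchdog {
  private cancel: Cancel | undefined;
  private fired = false;
  private chunksReceived = 0;
  private readonly startedAt: number;
  private lastChunkAt: number;
  private readonly idleTimeoutMs: number;
  private readonly onStall: (info: StallInfo) => void;
  private readonly schedule: Schedule;
  private readonly now: () => number;

  constructor(
    idleTimeoutMs: number,
    onStall: (info: StallInfo) => void,
    schedule: Schedule = realSchedule,
    now: () => number = () => Date.now(),
  ) {
    this.idleTimeoutMs = idleTimeoutMs;
    this.onStall = onStall;
    this.schedule = schedule;
    this.now = now;
    this.startedAt = now();
    this.lastChunkAt = this.startedAt;
    this.arm();
  }

  /** Call once per chunk: records it and restarts the idle timer. */
  onChunk(): void {
    this.chunksReceived++;
    this.lastChunkAt = this.now();
    this.arm();
  }

  /** Call from `finally` so no timer outlives the stream. */
  dispose(): void {
    this.cancel?.();
    this.cancel = undefined;
  }

  private arm(): void {
    this.cancel?.();
    this.cancel = undefined;
    if (this.fired) {
      return; // report at most once; never re-arm after tripping
    }
    this.cancel = this.schedule(() => {
      this.fired = true;
      const t = this.now();
      // Observe only: report the stall, do not abort or reject anything.
      this.onStall({
        idleMs: t - this.lastChunkAt,
        elapsedMs: t - this.startedAt,
        chunksReceived: this.chunksReceived,
      });
    }, this.idleTimeoutMs);
  }
}

// The read loop is unchanged apart from notifying the watchdog.
async function consumeWithWatchdog<T>(
  body: AsyncIterable<T>,
  watchdog: IdleWatchdog,
  handleChunk: (chunk: T) => void,
): Promise<void> {
  try {
    for await (const chunk of body) {
      watchdog.onChunk();
      handleChunk(chunk);
    }
  } finally {
    watchdog.dispose();
  }
}
```

In this sketch the stall callback is where a telemetry event would be sent. Because the callback never touches the stream, a stalled request still hangs exactly as it does today; the only difference is that the stall becomes visible.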
Telemetry dimensions
messagesApi.streamIdleTimeout (classification SystemMetaData, purpose PerformanceAndHealth):
- requestId, ghRequestId: correlate with server-side logs
- idleMs: ms since the last chunk when the watchdog tripped
- elapsedMs: ms since the stream started
- chunksReceived: chunks received before the stall
- completionsEmitted: completions emitted before the stall (0 = first-turn hang, >0 = mid-stream hang)
Related: #321640
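As a rough illustration of the dimensions listed above, the event payload could be modeled as shown below. The field names come from this description; the interface shape, the split into string properties and numeric measurements, and the helpers buildStreamIdleTimeoutEvent and classifyStall are assumptions made for the example, not the PR's actual code.

```typescript
// Illustrative sketch only; type and helper names are hypothetical.
interface StreamIdleTimeoutEvent {
  name: 'messagesApi.streamIdleTimeout';
  // String dimensions, used to correlate with server-side logs.
  properties: { requestId: string; ghRequestId: string };
  // Numeric dimensions describing the stall.
  measurements: {
    idleMs: number;
    elapsedMs: number;
    chunksReceived: number;
    completionsEmitted: number;
  };
}

// Snapshot of the stream state at the moment the watchdog trips.
interface StallSnapshot {
  requestId: string;
  ghRequestId: string;
  startedAt: number;    // timestamp (ms) when streaming began
  lastChunkAt: number;  // timestamp (ms) of the most recent chunk
  now: number;          // timestamp (ms) when the watchdog tripped
  chunksReceived: number;
  completionsEmitted: number;
}

function buildStreamIdleTimeoutEvent(s: StallSnapshot): StreamIdleTimeoutEvent {
  return {
    name: 'messagesApi.streamIdleTimeout',
    properties: { requestId: s.requestId, ghRequestId: s.ghRequestId },
    measurements: {
      idleMs: s.now - s.lastChunkAt,
      elapsedMs: s.now - s.startedAt,
      chunksReceived: s.chunksReceived,
      completionsEmitted: s.completionsEmitted,
    },
  };
}

// completionsEmitted distinguishes the two stall shapes named above.
function classifyStall(event: StreamIdleTimeoutEvent): 'first-turn' | 'mid-stream' {
  return event.measurements.completionsEmitted === 0 ? 'first-turn' : 'mid-stream';
}
```

Deriving the first-turn versus mid-stream distinction from completionsEmitted at query time, rather than sending it as a separate field, keeps the event small while still allowing the two cases to be separated in analysis.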