fix: collapse BaseExceptionGroup to surface real errors from task groups #2179
Open
giulio-leone wants to merge 10 commits into modelcontextprotocol:main
When a task in an anyio task group fails, sibling tasks are cancelled. The resulting BaseExceptionGroup contains the real error alongside Cancelled exceptions from those siblings. This makes error classification extremely difficult for callers.

Add open_task_group() context manager and collapse_exception_group() utility that detect this pattern and re-raise just the original error, keeping the full group as __cause__ for debugging.

Applied to all 16 create_task_group() sites across:
- Client transports (sse, stdio, websocket, streamable_http)
- Server transports (sse, stdio, websocket, streamable_http)
- Session __aexit__
- Server lowlevel run loop
- StreamableHTTP session manager
- SessionGroup, InMemoryTransport
- Experimental task support

Fixes modelcontextprotocol#2114
On Python < 3.11, BaseExceptionGroup is not a builtin and must be imported from the exceptiongroup backport package (transitive dep via anyio).
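The version-gated import described in this commit typically takes the following shape. This is a minimal sketch, assuming the `exceptiongroup` backport is importable on older interpreters; the `group` variable is only there to show that the name resolves on every version.

```python
import sys

if sys.version_info < (3, 11):
    # BaseExceptionGroup is not a builtin before 3.11; the backport
    # package provides a class with the same interface.
    from exceptiongroup import BaseExceptionGroup

# On 3.11+ the name below is the builtin; on older versions it is the backport.
group = BaseExceptionGroup("demo", [ValueError("bad input")])
```

On Python 3.11 and later the `if` branch is skipped entirely, which is why each interpreter version only ever covers one side of the check.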
- Remove unused `import anyio` from 4 modules where anyio.create_task_group was replaced by open_task_group
- Add `# pragma: no cover` to sys.version_info < (3, 11) checks since coverage is per-Python-version and each version only covers one branch
- Add `# pragma: lax no cover` to defensive raise paths in open_task_group and BaseSession.__aexit__ (triggered when the exception group has no cancellation noise, which is extremely rare with anyio task groups)
strict-no-cover flags 'pragma: no cover' as incorrect when the lines ARE covered on the running Python version. Use 'pragma: lax no cover' instead, which is excluded from both coverage counting and strict checking.
Pre-commit end-of-file fixer requires a blank line between the except block and the next method definition. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add _get_exceptions() helper to provide typed access to BaseExceptionGroup.exceptions, avoiding reportUnknownMemberType errors. Use pyright: ignore[reportUnknownArgumentType] for the narrowed BaseExceptionGroup[Unknown] type after isinstance checks. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
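A typed accessor of the kind this commit describes could look roughly like the sketch below. The name `typed_exceptions` and the use of `typing.cast` are assumptions made for illustration; the PR's actual `_get_exceptions()` body is not shown on this page and may differ.

```python
import sys
from typing import cast

if sys.version_info < (3, 11):
    from exceptiongroup import BaseExceptionGroup


def typed_exceptions(eg: "BaseExceptionGroup[BaseException]") -> "tuple[BaseException, ...]":
    """Hypothetical helper: return eg.exceptions with an explicit static type."""
    # cast() has no runtime effect; it only tells the type checker what
    # the attribute holds, so callers do not see an Unknown member type.
    return cast("tuple[BaseException, ...]", eg.exceptions)


members = typed_exceptions(BaseExceptionGroup("demo", [ValueError("x"), KeyboardInterrupt()]))
```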
Resolves pyright reportUnusedVariable error.
The split() return value for the cancelled subgroup is intentionally discarded. Use bare _ instead of _cancelled so pyright strict mode recognises it as an unused-by-design binding. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
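For context, `BaseExceptionGroup.split()` returns a `(match, rest)` pair, so discarding the cancelled half means binding the first element to `_`. The sketch below uses `asyncio.CancelledError` as a stand-in for the backend's cancellation class, which is an assumption made so the example runs without anyio.

```python
import asyncio
import sys

if sys.version_info < (3, 11):
    from exceptiongroup import BaseExceptionGroup

eg = BaseExceptionGroup(
    "task group failed",
    [ConnectionError("boom"), asyncio.CancelledError(), asyncio.CancelledError()],
)

# split() returns (matching subgroup, remaining subgroup); the matching
# (cancelled) half is not needed, so it is bound to a bare underscore.
_, rest = eg.split(asyncio.CancelledError)
```

Here `rest` holds only the `ConnectionError`; it would be `None` if every member had been a cancellation.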
Author
All 25 CI checks pass (the 26th, claude-review, is skipped). The previously flaky Windows test_stdio is now stable. Ready for review.
Summary
Fixes #2114

When a task in an anyio task group fails, sibling tasks are cancelled. The resulting `BaseExceptionGroup` contains the real error alongside `Cancelled` exceptions from those siblings. This makes error classification extremely difficult for callers: they cannot reliably determine the root cause of a failure.

Root Cause
There are 16 `create_task_group()` usages across the SDK with no `except*` syntax or ExceptionGroup unwrapping anywhere. A single connection failure produces a `BaseExceptionGroup` containing the real error plus multiple `Cancelled` exceptions.

Solution
New utility module (`src/mcp/shared/_exception_utils.py`):
- `collapse_exception_group(eg, cancelled_type)`: extracts the single real error from a group if there's exactly one non-cancelled exception; preserves the full group for multiple concurrent failures
- `open_task_group()`: drop-in replacement for `anyio.create_task_group()` that automatically collapses on exit

Applied to all 16 task group sites:
- `BaseSession.__aexit__`

Behavior

| Before | After |
| --- | --- |
| `BaseExceptionGroup([ConnectionError, Cancelled, Cancelled])` | `ConnectionError` (with group as `__cause__`) |
| `BaseExceptionGroup([Cancelled, Cancelled])` | `Cancelled` |
| `BaseExceptionGroup([ValueError, RuntimeError, Cancelled])` | `BaseExceptionGroup([ValueError, RuntimeError])` (Cancelled stripped) |

The original `BaseExceptionGroup` is always preserved as `__cause__` for debugging.

Testing
`collapse_exception_group`, 4 integration tests for `open_task_group`)
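The three Behavior cases can be illustrated with a small stand-alone sketch. This is not the PR's implementation: `collapse_sketch` is a hypothetical name, `asyncio.CancelledError` stands in for the backend's cancellation class, and nested groups are deliberately not handled.

```python
import asyncio
import sys

if sys.version_info < (3, 11):
    from exceptiongroup import BaseExceptionGroup

Cancelled = asyncio.CancelledError  # stand-in for the real cancellation type


def collapse_sketch(eg, cancelled_type):
    """Illustrative only: pick the exception a caller should see for a flat group."""
    _, rest = eg.split(cancelled_type)
    if rest is None:
        # Nothing but cancellations: surface a single cancellation.
        return eg.exceptions[0]
    if len(rest.exceptions) == 1:
        # Exactly one real error: surface it directly.
        return rest.exceptions[0]
    # Several real errors: keep them grouped, minus the cancellations.
    return rest


noisy = BaseExceptionGroup("tg", [ConnectionError("down"), Cancelled(), Cancelled()])
only_cancelled = BaseExceptionGroup("tg", [Cancelled(), Cancelled()])
multi = BaseExceptionGroup("tg", [ValueError("a"), RuntimeError("b"), Cancelled()])

# Re-raise the collapsed error with the original group attached as __cause__.
try:
    try:
        raise noisy
    except BaseExceptionGroup as group:
        raise collapse_sketch(group, Cancelled) from group
except ConnectionError as exc:
    surfaced = exc
```

With this shape a caller can write a plain `except ConnectionError:` and still inspect `surfaced.__cause__` to see every exception the task group raised.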