Skip to content

Fix AsyncioConnection race conditions causing EBADF errors (#614)#697

Draft
dkropachev wants to merge 5 commits intomasterfrom
fix/asyncio-close-race-614
Draft

Fix AsyncioConnection race conditions causing EBADF errors (#614)#697
dkropachev wants to merge 5 commits intomasterfrom
fix/asyncio-close-race-614

Conversation

@dkropachev
Copy link
Collaborator

Summary

  • Fix close() to wait for _close() completion from non-event-loop threads, eliminating the window where is_closed=True but the socket fd is still open (root cause of EBADF)
  • Set last_error on server EOF in handle_read() so factory() detects dead connections instead of returning them
  • Add is_closed/is_defunct guard in push() to reject writes on closed connections
  • Treat BrokenPipeError/ConnectionResetError in handle_write() as clean peer disconnections instead of defuncting, and skip defunct() in both I/O handlers if the connection is already shutting down

Test plan

  • New unit tests in tests/unit/io/test_asyncio_race_614.py cover all four race conditions using real socket pairs
  • Full unit test suite passes with no regressions
  • Integration tests with TLS and node restart scenarios

Fix four race conditions in AsyncioConnection that cause "[Errno 9] Bad
file descriptor" errors during node restarts, especially with TLS:

1. close() now waits for _close() to complete when called from outside
   the event loop thread, eliminating the window where is_closed=True
   but the socket fd is still open.

2. handle_read() sets last_error on server EOF so factory() detects dead
   connections instead of returning them to callers.

3. push() rejects data when the connection is already closed, preventing
   writes from being enqueued to a shutting-down connection.

4. handle_write() treats BrokenPipeError/ConnectionResetError as clean
   peer disconnections instead of defuncting, and both I/O handlers skip
   defunct() if the connection is already shutting down.
ProactorEventLoop does not support remove_reader/remove_writer
(raises NotImplementedError). Wrap these calls so the socket is
always closed regardless, and use try/finally to ensure
connected_event is always set even if cleanup fails.
On Windows, ProactorEventLoop may raise plain OSError with
winerror=10054 (WSAECONNRESET) or 10053 (WSAECONNABORTED) instead
of ConnectionResetError. Use ConnectionError base class plus
winerror check for robust cross-platform detection.
macOS raises OSError(57, "Socket is not connected") (ENOTCONN) when
writing to a peer-disconnected socket. Unlike BrokenPipeError or
ConnectionResetError, this is a plain OSError — not a ConnectionError
subclass — so it was not caught by the isinstance(err, ConnectionError)
check. Add errno-based detection for ENOTCONN, ESHUTDOWN, ECONNRESET,
and ECONNABORTED to handle these platform-specific disconnect errors
cleanly instead of defuncting the connection.
- Guard SocketPairAsyncioConnection definition with ASYNCIO_AVAILABLE
  check to avoid subclassing NoneType when asyncio is unavailable
- Add debug logging in close() except block instead of bare pass
- Add inline comments to except blocks in _close() for
  remove_reader/remove_writer
@dkropachev dkropachev force-pushed the fix/asyncio-close-race-614 branch from fe6be33 to e4f0c6b Compare February 12, 2026 22:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant