Python: Fix Responses API handoff state handling and add focused tests #4057

Open
alliscode wants to merge 2 commits into microsoft:main from alliscode:investigation/issue-4053-1771455544

@alliscode
Member

This pull request fixes two issues with agent handoff behavior when using Responses API-style clients: the receiving agent lost conversation context, and a stale session ID could cause API errors. It updates the orchestration logic so that conversation context and session state are managed correctly across handoffs, and it adds regression tests to verify both invariants.

Bug fixes for handoff and session management:

  • Ensured the agent receives the full conversation history (_full_conversation) as input after a handoff, rather than just the latest broadcast, to preserve context for APIs like the Responses API.
  • Cleared the session's service_session_id after a handoff to prevent sending a stale previous_response_id, which could otherwise cause "No tool output found" errors with the Responses API.
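
The two fixes above can be sketched roughly as follows. This is a minimal illustration, not the code in `_handoff.py`: the `Session` and `HandoffParticipant` classes and the `prepare_input` and `on_handoff` method names are hypothetical; only `_full_conversation`, `_cache`, `_session`, and `service_session_id` come from this PR.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Session:
    """Hypothetical stand-in for the framework's session object."""
    service_session_id: Optional[str] = None


@dataclass
class HandoffParticipant:
    """Hypothetical, simplified executor; not the real class in _handoff.py."""
    _session: Session = field(default_factory=Session)
    _full_conversation: List[str] = field(default_factory=list)
    _cache: List[str] = field(default_factory=list)

    def prepare_input(self) -> List[str]:
        # Fix 1: give the agent the whole history, not only the latest broadcast.
        self._cache = list(self._full_conversation)
        return self._cache

    def on_handoff(self) -> None:
        # Fix 2: drop the stale pointer so that no previous_response_id
        # is sent on the next run.
        if self._session and self._session.service_session_id:
            self._session.service_session_id = None
```

Copying the list in `prepare_input` keeps later changes to the cache from mutating the canonical history; whether the real implementation copies or aliases is an assumption here.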

Testing and regression coverage:

  • Added a new test suite (test_handoff_responses.py) with regression tests to verify that (1) handoffs correctly clear the session's conversation pointer and (2) agents receive the complete conversation context after a handoff. This includes a mock client and agent simulating Responses API behavior.
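
For illustration, the two invariants could be exercised with a toy mock along these lines. This is a sketch under stated assumptions and not the contents of `test_handoff_responses.py`: `MockResponsesClient`, `create`, `emits_tool_call`, and `run_handoff` are hypothetical names, and the mock's rule (reject a `previous_response_id` that points at a response with an unanswered tool call) is a simplification of the behavior described in this PR.

```python
from typing import Dict, List, Optional


class MockResponsesClient:
    """Hypothetical mock that tracks responses ending in an unanswered tool call."""

    def __init__(self) -> None:
        self._pending_tool_call: Dict[str, bool] = {}
        self.calls: List[dict] = []

    def create(
        self,
        messages: List[str],
        previous_response_id: Optional[str] = None,
        emits_tool_call: bool = False,
    ) -> str:
        # Simplified server rule: chaining onto a response whose tool call
        # never received an output is rejected.
        if previous_response_id and self._pending_tool_call.get(previous_response_id):
            raise ValueError("No tool output found")
        self.calls.append(
            {"messages": list(messages), "previous_response_id": previous_response_id}
        )
        response_id = f"resp_{len(self.calls)}"
        self._pending_tool_call[response_id] = emits_tool_call
        return response_id


def run_handoff(client: MockResponsesClient, clear_pointer: bool) -> None:
    history = ["user: I need a refund"]
    # The first agent answers with a handoff function_call.
    pointer: Optional[str] = client.create(history, emits_tool_call=True)
    history.append("triage: handoff_to_refund_agent()")
    if clear_pointer:
        pointer = None  # invariant 1: the conversation pointer is cleared
    # Invariant 2: the next agent is called with the complete history.
    client.create(history, previous_response_id=pointer)
```

Calling `run_handoff(client, clear_pointer=True)` completes and records a second call with both messages and no `previous_response_id`; with `clear_pointer=False` the mock raises, mirroring the failure mode described above.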

Closes #4053

Contribution Checklist

  • The code builds clean without any errors or warnings
  • The PR follows the Contribution Guidelines
  • All unit tests pass, and I have added new tests where possible
  • Is this a breaking change? If yes, add "[BREAKING]" prefix to the title of the PR.

Copilot AI review requested due to automatic review settings February 18, 2026 23:58
@github-actions github-actions bot changed the title from "Fix Responses API handoff state handling and add focused tests" to "Python: Fix Responses API handoff state handling and add focused tests" on Feb 18, 2026
@markwallace-microsoft
Member

markwallace-microsoft commented Feb 19, 2026

Python Test Coverage Report

| File | Stmts | Miss | Cover | Missing |
|---|---|---|---|---|
| packages/orchestrations/agent_framework_orchestrations | | | | |
| _handoff.py | 342 | 57 | 83% | 105–106, 108, 163–173, 175, 177, 179, 184, 282, 335, 360, 402–403, 417, 475–476, 508, 514, 518, 522–523, 561–563, 568–570, 686, 689, 696, 701, 763, 768, 775, 785, 787, 806, 808, 890–891, 923–924, 1006, 1013, 1085–1086, 1088 |
| TOTAL | 21247 | 3302 | 84% | |

Python Unit Test Overview

| Tests | Skipped | Failures | Errors | Time |
|---|---|---|---|---|
| 4192 | 239 💤 | 0 ❌ | 0 🔥 | 1m 12s ⏱️ |

Contributor

Copilot AI left a comment

Pull request overview

This pull request fixes two critical bugs in the HandoffBuilder's handling of Responses API-style clients, ensuring conversation context and session state are correctly managed across agent handoffs. The fixes prevent "No tool output found" API errors and restore complete conversation history to agents after handoffs.

Changes:

  • Fixed stale previous_response_id issue by clearing service_session_id after handoffs
  • Restored full conversation context by passing _full_conversation to agents instead of partial _cache
  • Added regression tests that use a mock Responses API client to cover both fixes

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
|---|---|
| python/packages/orchestrations/tests/test_handoff_responses.py | New test suite with mock Responses API client to verify handoff invariants: session clearing and context preservation |
| python/packages/orchestrations/agent_framework_orchestrations/_handoff.py | Two-line fix: copy full conversation to cache before agent runs, and clear service_session_id after handoffs |

# from being sent on the next run. The handoff response contained a function_call
# for the handoff tool; referencing it via previous_response_id after the tool
# output has been cleaned would cause "No tool output found" API errors.
if self._session and self._session.service_session_id:
Contributor

Is this like saying each time it will create a new session in the service?

@moonbox3
Contributor

moonbox3 commented Feb 19, 2026

I've got these same fixes in #3911, plus others that I found through testing workflows with handoff + AG-UI. We could add the regression tests from yours to 3911.

Development

Successfully merging this pull request may close these issues.

Python: [Bug]: HandoffBuilder with AzureOpenAIResponsesClient fails with stale previous_response_id and loses conversation context on handoff

4 participants
