Skip to content

fix(runners): dispatch run_after_run_callback in _run_node_async (#5282)#5323

Open
surfai wants to merge 1 commit intogoogle:v2from
surfai:fix/5282-dispatch-after-run-callback
Open

fix(runners): dispatch run_after_run_callback in _run_node_async (#5282)#5323
surfai wants to merge 1 commit intogoogle:v2from
surfai:fix/5282-dispatch-after-run-callback

Conversation

@surfai
Copy link
Copy Markdown

@surfai surfai commented Apr 14, 2026

Summary

Fixes #5282 by wiring plugin_manager.run_after_run_callback into Runner._run_node_async, closing the runners.py:427 TODO for node runtime plugin lifecycle.

Companion to test-only PR #5301 (which established the xfail regression anchor on stock 2.0.0a3). This PR ships the source fix plus flips the anchor by replacing the WorkaroundRunner scaffolding with a direct positive assertion.

The fix (3 lines)

try:
  async for event in self._consume_event_queue(ic, done_sentinel):
    yield event
  # 5. Run the after_run callbacks. Mirrors _exec_with_plugin:1230.
  # This does NOT emit any event.
  await ic.plugin_manager.run_after_run_callback(invocation_context=ic)
finally:
  await self._cleanup_root_task(task, self.agent.name)

Placement mirrors the legacy dispatch at runners.py:1230 exactly: outside the finally block, so semantics match legacy —

Exit path Legacy _exec_with_plugin This fix
Natural drain ✓ fires ✓ fires
Error in loop ✗ skipped ✗ skipped (exception jumps to finally)
Caller break ✗ skipped ✗ skipped (GeneratorExit at yield jumps to finally)

Full semantic equivalence. Single insertion point covers both LlmAgent and BaseNode root types because run_async Branches A (line 807) and B (line 838) both funnel into _run_node_async. This also removes the last pre-condition to collapsing the two branches per the TODO at runners.py:836-837.

Also in this PR

Alternatives considered and rejected

  1. Dispatch in finally block — would fire after_run_callback on error and caller-break, changing the legacy contract. Plugin authors wrote handlers expecting "successful completion" semantics; flipping to "always fires" is a breaking change.
  2. Lift dispatch to run_async call sites (lines 825, 841)ic is created inside _run_node_async at line 446, so this would require plumbing ic out of the generator. Expands blast radius.
  3. Extract _with_plugin_lifecycle(ic) shared context manager — right long-term shape, but refactors the legacy _exec_with_plugin too (which has early-exit logic inside before_run_callback that's awkward to extract). Out of scope for a fix PR.

Test plan

  • pytest tests/unittests/runners/test_issue_5282.py -v — 2 passed
  • pytest tests/unittests/runners/ (full subdir) — 69 passed, 1 skipped, 5 xfailed (all pre-existing), 0 new failures
  • isort --check — clean
  • pyink --check --config pyproject.toml — clean
  • CI: pyink.yml, isort.yml, python-unit-tests.yml, mypy.yml, mypy-new-errors.yml

Links

🤖 Generated with Claude Code

@google-cla
Copy link
Copy Markdown

google-cla bot commented Apr 14, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

…gle#5282)

Runner._run_node_async had a TODO at runners.py:427 for node runtime
plugin lifecycle, and never dispatched plugin_manager.run_after_run_callback.
Because both Workflow(BaseNode) and (post a3 refactor) LlmAgent roots funnel
through _run_node_async, any BasePlugin subclass that overrides
after_run_callback silently no-op'd — breaking the canonical pattern for
memory persistence, metrics emission, and post-run audit hooks.

This wires one dispatch call after _consume_event_queue drains, mirroring
the legacy path in _exec_with_plugin at runners.py:1230 exactly: outside
the finally block, so semantics match legacy (fires on natural drain only,
skipped on error in the loop, skipped on caller-break via GeneratorExit).

Also:
- Narrows the TODO at runners.py:427 from "tracing and plugin lifecycle"
  to "tracing" since plugin lifecycle is now wired.
- Incidental pyink reformat at runners.py:1451: pre-existing 82-char line
  on v2 HEAD, unrelated to google#5282 but required by the pyink CI gate.

Companion to test-only PR google#5301 (which established the regression anchor).
This PR ships the fix plus flips the anchor by replacing the WorkaroundRunner
scaffolding with a direct positive assertion.

Alternatives considered and rejected:
- Wrap in finally to fire on error/break — changes legacy contract.
- Lift dispatch to run_async call sites — requires plumbing ic out of
  _run_node_async, bigger blast radius.
- Extract a shared _with_plugin_lifecycle context manager — right long-term
  shape, but refactors the legacy path too and expands scope.

Fixes google#5282
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant