[Live/bidi] transfer_to_agent drops pending user message on round-trip — no response after live-to-live handoff on Agent Engine #5238

@smwitkowski

Required Information

Describe the Bug

When two agents both use gemini-live-2.5-flash-native-audio and are connected via sub_agents=[helper_agent] (enabling transfer_to_agent), the transfer back to the parent agent intermittently drops the user's pending message on Agent Engine via bidi_stream_query().

After transfer_to_agent fires, the target agent starts a new Gemini Live session, but the pending user message is not delivered to that session. The target agent starts with empty context and then either produces an empty turn_complete (no audio, no text, no transcription) or makes a spurious re-transfer based on stale session state.

Effective failure rate: 75–100% when transfer_to_agent actually fires on the round-trip path.

The apparent 30–60% overall failure rate in multi-turn tests is lower because the LLM sometimes answers directly without transferring. When isolating only the trials where transfer_to_agent was invoked, 6 out of 7 transfer attempts produced no response (86%).

This is distinct from #5195 (model mismatch). In #5195, a gemini-live-* parent delegates to a gemini-2.5-flash child and crashes with 1011 every time. Here, both agents use the same live model and the failure is intermittent message loss rather than a hard crash. The workaround (AgentTool()) is the same.

Steps to Reproduce

1. Create app/__init__.py — minimal 2-agent setup (both live models):

import json
from datetime import datetime, timezone

from google.adk.agents import Agent
from google.adk.tools import FunctionTool
from google.genai.types import GenerateContentConfig


def get_current_time() -> str:
    """Get the current date and time in UTC."""
    # Use an aware UTC timestamp so the value matches the "timezone" label.
    return json.dumps({"time": datetime.now(timezone.utc).isoformat(), "timezone": "UTC"})


def get_weather(city: str = "Miami") -> str:
    """Get the current weather for a city."""
    return json.dumps({
        "city": city,
        "temperature": "75F",
        "condition": "Sunny",
        "humidity": "60%",
    })


helper_agent = Agent(
    model="gemini-live-2.5-flash-native-audio",
    name="helper_agent",
    description="Handles weather and travel queries.",
    instruction=(
        "You are a helpful weather assistant. "
        "When the user asks about weather, call get_weather and respond. "
        "If the user asks about ANYTHING other than weather or travel, "
        "you MUST immediately transfer back to root_agent. "
        "Do NOT answer non-weather questions yourself."
    ),
    tools=[FunctionTool(get_weather)],
    generate_content_config=GenerateContentConfig(temperature=0.0),
)


root_agent = Agent(
    model="gemini-live-2.5-flash-native-audio",
    name="root_agent",
    description="Main coordinator. Delegates weather to helper_agent.",
    instruction=(
        "You are a concierge assistant. "
        "When the user asks about weather or travel, transfer to helper_agent. "
        "For all other questions, answer directly. "
        "Keep responses brief (1-2 sentences)."
    ),
    sub_agents=[helper_agent],
    tools=[FunctionTool(get_current_time)],
    generate_content_config=GenerateContentConfig(temperature=0.0),
)

2. Create app/agent_engine_app.py:

from vertexai.preview.reasoning_engines.templates.adk import AdkApp
from app import root_agent

class AgentEngineApp(AdkApp):
    def register_operations(self):
        operations = super().register_operations()
        operations["bidi_stream"] = ["bidi_stream_query"]
        operations["async_stream"] = ["async_stream_query"]
        return operations

agent_engine = AgentEngineApp(agent=root_agent)

3. Deploy to Agent Engine and run the round-trip test:

import asyncio
import vertexai

async def test_round_trip():
    client = vertexai.Client(project="YOUR_PROJECT", location="us-central1")
    
    async with client.aio.live.agent_engines.connect(
        agent_engine="projects/.../reasoningEngines/...",
        config={"class_method": "bidi_stream_query"},
    ) as session:
        await session.send(query_input={"user_id": "test-handoff"})
        
        # Step 1: trigger delegation root -> helper_agent
        await session.send(query_input={
            "content": {"role": "user", "parts": [{"text": "What's the weather in Miami?"}]}
        })
        # Collect events until turn_complete (helper responds with audio -- works)
        while True:
            event = await asyncio.wait_for(session.receive(), timeout=30)
            ev = event.get("bidiStreamOutput", event)
            if ev.get("turn_complete") or ev.get("turnComplete"):
                break
        
        await asyncio.sleep(1)
        
        # Step 2: trigger transfer back helper -> root_agent
        await session.send(query_input={
            "content": {"role": "user", "parts": [{"text": "What is 2 + 2?"}]}
        })
        # Observe: helper calls transfer_to_agent(root_agent),
        # but root_agent produces empty turn_complete -- no audio, no text
        while True:
            event = await asyncio.wait_for(session.receive(), timeout=30)
            ev = event.get("bidiStreamOutput", event)
            # Report "done" using the same two key spellings as the loop exit check.
            print(f"author={ev.get('author')}  "
                  f"content_parts={len(ev.get('content', {}).get('parts', []))}  "
                  f"transfer={ev.get('actions', {}).get('transfer_to_agent', '')}  "
                  f"done={bool(ev.get('turn_complete') or ev.get('turnComplete'))}")
            if ev.get("turn_complete") or ev.get("turnComplete"):
                break

asyncio.run(test_round_trip())

Expected Behavior

After transfer_to_agent(root_agent) fires:

  1. root_agent receives the pending user message ("What is 2 + 2?")
  2. root_agent responds with audio/text: "2 + 2 is 4."
  3. turn_complete with content

This is what happens in only ~25% of transfer-back attempts.

Observed Behavior (75% of transfer-back attempts)

author=helper_agent  content_parts=0  transfer=           done=False
author=helper_agent  content_parts=1  transfer=           done=False   # transfer_to_agent call
author=helper_agent  content_parts=1  transfer=root_agent done=False   # transfer executes
author=root_agent    content_parts=0  transfer=           done=False   # root starts -- no message
author=root_agent    content_parts=0  transfer=           done=False
author=root_agent    content_parts=0  transfer=           done=True    # empty turn_complete

root_agent never receives the user message. It starts with empty context, produces nothing, and completes the turn.

In some cases root_agent makes a spurious re-transfer to helper_agent (based on stale context from turn 1), creating a transfer loop that also ends with an empty turn.
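
The event pattern above can be checked programmatically rather than by reading the printed log. The following is a minimal sketch, assuming events are plain dicts with the same keys used in the repro script (author, content.parts, actions.transfer_to_agent, turn_complete / turnComplete). The helper name classify_turn is hypothetical and not part of ADK or the Vertex AI SDK.

```python
from typing import Any, Dict, List


def _is_done(ev: Dict[str, Any]) -> bool:
    # Accept both key spellings, as the repro script does.
    return bool(ev.get("turn_complete") or ev.get("turnComplete"))


def classify_turn(events: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Summarize one turn: did a transfer fire, and did the target respond?"""
    target = None        # agent named by the most recent transfer_to_agent
    target_parts = 0     # content parts produced by that agent since the transfer
    completed = False
    for ev in events:
        transfer = (ev.get("actions") or {}).get("transfer_to_agent")
        if transfer:
            # A later transfer (e.g. a spurious re-transfer) resets the target.
            target, target_parts = transfer, 0
        elif target is not None and ev.get("author") == target:
            target_parts += len((ev.get("content") or {}).get("parts") or [])
        if _is_done(ev):
            completed = True
    return {
        "transfer_fired": target is not None,
        "target": target,
        "no_response": target is not None and completed and target_parts == 0,
    }


# Events shaped like the failing trace above (part contents are placeholders).
failing_trace = [
    {"author": "helper_agent"},
    {"author": "helper_agent", "content": {"parts": [{}]}},
    {"author": "helper_agent", "content": {"parts": [{}]},
     "actions": {"transfer_to_agent": "root_agent"}},
    {"author": "root_agent"},
    {"author": "root_agent"},
    {"author": "root_agent", "turn_complete": True},
]
print(classify_turn(failing_trace))
```

Counting only turns where transfer_fired is true gives the conditional failure rate reported in this issue, as opposed to the lower overall rate.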

Environment

Tested on two configurations with identical results:

| | Config A | Config B |
|---|---|---|
| Python | 3.10.11 | 3.13.9 |
| google-adk | 1.28.1 | 1.29.0 |
| google-cloud-aiplatform | 1.145.0 | 1.147.0 |
| Model | gemini-live-2.5-flash-native-audio | gemini-live-2.5-flash-native-audio |
| Path | bidi_stream_query on Agent Engine | bidi_stream_query on Agent Engine |
| NR (no response) rate when transfer fires | 75% (3/4) | 75% (3/4) |

The bug reproduces identically across Python versions and SDK versions, which suggests a server-side issue in the AE assembly service / ADK runner rather than a client SDK bug.


Optional Information

Timing analysis

All transfer-back attempts across both test configurations followed the same sequence:

| Metric | PASS (1 trial) | FAIL (3 trials) |
|---|---|---|
| transfer fires | t=1.25s | t=1.09s |
| root_agent appears | t=2.97s (gap: 1.72s) | t=3.05s (gap: 1.96s) |
| root produces content | t=2.97s (4ms after appearing) | NEVER |

The gap between transfer and the target agent appearing (~1.7-2.0s) is consistent across pass and fail. The difference is binary: either the user message is delivered to the new session or it isn't. There's no timing threshold -- it appears to be a message-routing bug inside the runner where pending user content is dropped during Live session handoff.
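
For anyone reproducing the timing numbers, the gap can be derived from client-side timestamps. This is a rough sketch, assuming each received event dict has been annotated by the test client with a hypothetical "t" field (seconds since the user message was sent). That field is not something Agent Engine returns, and handoff_gap is an illustrative name.

```python
from typing import Any, Dict, List, Optional


def handoff_gap(events: List[Dict[str, Any]]) -> Optional[float]:
    """Seconds between the transfer event and the target agent's first event."""
    transfer_t = None
    target = None
    for ev in events:
        transfer = (ev.get("actions") or {}).get("transfer_to_agent")
        if transfer and transfer_t is None:
            # First transfer in the turn: remember when it fired and who it names.
            transfer_t, target = ev["t"], transfer
        elif target is not None and ev.get("author") == target:
            return ev["t"] - transfer_t
    return None  # no transfer, or the target never appeared


# Timestamps taken from the PASS column of the table above.
pass_trial = [
    {"t": 1.25, "author": "helper_agent",
     "actions": {"transfer_to_agent": "root_agent"}},
    {"t": 2.97, "author": "root_agent", "content": {"parts": [{}]}},
]
print(f"gap: {handoff_gap(pass_trial):.2f}s")  # gap: 1.72s
```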

Cloud Logging evidence (historical 1011 crash)

On an earlier AE/ADK version (April 7), the same transfer_to_agent round-trip produced a hard crash in Cloud Logging:

16:09:29 -- transfer_to_agent: root_agent
16:10:41 -- APIError: 1007 None. error when processing input audio, please check
           if the inputaudio is in valid format: 16khz s16le pcm, mono channel
16:10:41 -- StatusCode.ABORTED: Agent Engine Error
16:10:41 -- RpcError raised by BidiStreamQuery

The 1007 audio format error on the new Live session propagated as 1011 to the client. Current AE versions appear to handle this more gracefully (no crash) but the underlying issue -- message loss during transfer -- persists.

Test results

Minimal agent (both agents gemini-live-2.5-flash-native-audio, connected via sub_agents=[helper_agent]):

| Run | Python / ADK / aiplatform | Trials | Transfer fired | NR when transfer fires |
|---|---|---|---|---|
| 1 | 3.10 / 1.28.1 / 1.145.0 | 5 | 2 | 100% (2/2) |
| 2 | 3.10 / 1.28.1 / 1.145.0 | 10 | 4 | 75% (3/4) |
| 3 | 3.13 / 1.29.0 / 1.147.0 | 10 | 4 | 75% (3/4) |

Existing test agent (different agent configuration, same sub_agents setup):

| Run | Python / ADK / aiplatform | Trials | Transfer fired | NR when transfer fires |
|---|---|---|---|---|
| 1 | 3.10 / system | 10 | 3 | 100% (3/3) |

Combined across all runs: 13 transfer-back attempts, 11 failures (85%).
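
The combined figure can be re-derived from the per-run rows above. The short calculation below also shows why the overall rate looks lower than the conditional one. It assumes every trial where no transfer fired produced a response, which the tables do not state explicitly.

```python
# (label, trials, transfers fired, no-response when transfer fired)
runs = [
    ("minimal agent, run 1", 5, 2, 2),
    ("minimal agent, run 2", 10, 4, 3),
    ("minimal agent, run 3", 10, 4, 3),
    ("existing test agent, run 1", 10, 3, 3),
]

trials = sum(r[1] for r in runs)
transfers = sum(r[2] for r in runs)
failures = sum(r[3] for r in runs)

# Failure rate conditioned on transfer_to_agent actually firing.
conditional_rate = failures / transfers
# Apparent rate over all trials, including ones answered without a transfer.
overall_rate = failures / trials

print(f"{failures}/{transfers} = {conditional_rate:.0%} when transfer fires")  # 11/13 = 85%
print(f"{failures}/{trials} = {overall_rate:.0%} over all trials")  # 11/35 = 31%
```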

Hypothesis: message not forwarded to new Live session

When transfer_to_agent fires on the bidi path:

  1. The ADK runner (runners.py:run_live) hands off to the target agent
  2. The target agent opens a new Gemini Live API session (via base_llm_flow.py:run_live)
  3. The user's pending message (which was already consumed from the input queue) is supposed to be forwarded to the new session
  4. Intermittently, the message is not forwarded -- the new session starts with empty context

This may be a race condition in how _forward_requests() interacts with the agent switch, or the message may be consumed by the outgoing agent's Live session before the incoming agent can receive it.
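
To make the second possibility concrete, here is a toy model of the suspected failure mode. It is not ADK code and does not reflect the actual internals of runners.py or _forward_requests(). It only illustrates, under the assumption of a single shared input queue, how a message already consumed by the outgoing session is invisible to the incoming one unless it is explicitly forwarded. It also does not model the intermittency. The function name handoff and the requeue_pending flag are hypothetical.

```python
import asyncio
from typing import Optional


async def handoff(requeue_pending: bool) -> Optional[str]:
    """Toy live handoff; returns what the incoming session receives."""
    inbox: asyncio.Queue = asyncio.Queue()
    await inbox.put("What is 2 + 2?")

    # The outgoing agent's session consumes the user message,
    # then decides to call transfer_to_agent.
    pending = await inbox.get()

    # Handoff: optionally forward the already-consumed message
    # so the incoming agent's session can see it.
    if requeue_pending:
        await inbox.put(pending)

    # The incoming agent's session reads from the same queue.
    try:
        return inbox.get_nowait()
    except asyncio.QueueEmpty:
        return None  # new session starts with no user message


print(asyncio.run(handoff(requeue_pending=False)))  # None
print(asyncio.run(handoff(requeue_pending=True)))   # What is 2 + 2?
```

If the real runner behaves like the first case on some code path, the incoming agent would see exactly the empty-context behavior described above.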

Workaround

Replace sub_agents=[helper_agent] with tools=[AgentTool(helper_agent)]. AgentTool wraps the sub-agent as a regular function call -- the parent agent stays in control, receives the sub-agent's response as a tool result, and there is no Live session handoff.

from google.adk.tools import AgentTool

root_agent = Agent(
    ...
    sub_agents=[],  # remove sub_agents
    tools=[AgentTool(helper_agent)],  # use AgentTool instead
)

Behavioral difference: with AgentTool, the sub-agent cannot interact with the user over multiple turns. For single-turn delegation, the behavior is equivalent.

Labels

live: [Component] This issue is related to live, voice and video chat
