[Live/bidi] transfer_to_agent drops pending user message on round-trip — no response after live-to-live handoff on Agent Engine

## Required Information

### Describe the Bug

When two agents both use `gemini-live-2.5-flash-native-audio` and are connected via `sub_agents` (which enables `transfer_to_agent`), the transfer back to the parent agent intermittently drops the user's pending message. This occurs on Agent Engine via `bidi_stream_query()`.

After `transfer_to_agent` fires, the target agent starts a new Gemini Live session, but the pending user message is **not delivered** to that session. The target agent starts with empty context and either produces an empty `turn_complete` (no audio, no text, no transcription) or makes a spurious re-transfer based on stale session state.

**Effective failure rate: 75–100%** when `transfer_to_agent` actually fires on the round-trip path.

The apparent 30–60% overall failure rate in multi-turn tests is lower because the LLM sometimes answers directly without transferring. When isolating only the trials where `transfer_to_agent` was invoked, 6 out of 7 transfer attempts produced no response (86%).

**This is distinct from #5195** (model mismatch). In #5195, a `gemini-live-*` parent delegates to a `gemini-2.5-flash` child and crashes with 1011 every time. Here, **both agents use the same live model** and the failure is intermittent message loss rather than a hard crash. The workaround (`AgentTool()`) is the same.

### Steps to Reproduce

#### 1. Create `app/__init__.py` — minimal 2-agent setup (both live models):

```python
import json
from datetime import datetime, timezone

from google.adk.agents import Agent
from google.adk.tools import FunctionTool
from google.genai.types import GenerateContentConfig


def get_current_time() -> str:
    """Get the current date and time."""
    # Use a timezone-aware UTC timestamp so the value matches the "UTC" label.
    now = datetime.now(timezone.utc)
    return json.dumps({"time": now.isoformat(), "timezone": "UTC"})


def get_weather(city: str = "Miami") -> str:
    """Get the current weather for a city."""
    return json.dumps({
        "city": city,
        "temperature": "75F",
        "condition": "Sunny",
        "humidity": "60%",
    })


helper_agent = Agent(
    model="gemini-live-2.5-flash-native-audio",
    name="helper_agent",
    description="Handles weather and travel queries.",
    instruction=(
        "You are a helpful weather assistant. "
        "When the user asks about weather, call get_weather and respond. "
        "If the user asks about ANYTHING other than weather or travel, "
        "you MUST immediately transfer back to root_agent. "
        "Do NOT answer non-weather questions yourself."
    ),
    tools=[FunctionTool(get_weather)],
    generate_content_config=GenerateContentConfig(temperature=0.0),
)


root_agent = Agent(
    model="gemini-live-2.5-flash-native-audio",
    name="root_agent",
    description="Main coordinator. Delegates weather to helper_agent.",
    instruction=(
        "You are a concierge assistant. "
        "When the user asks about weather or travel, transfer to helper_agent. "
        "For all other questions, answer directly. "
        "Keep responses brief (1-2 sentences)."
    ),
    sub_agents=[helper_agent],
    tools=[FunctionTool(get_current_time)],
    generate_content_config=GenerateContentConfig(temperature=0.0),
)
```

#### 2. Create `app/agent_engine_app.py`:

```python
from vertexai.preview.reasoning_engines.templates.adk import AdkApp
from app import root_agent

class AgentEngineApp(AdkApp):
    def register_operations(self):
        operations = super().register_operations()
        operations["bidi_stream"] = ["bidi_stream_query"]
        operations["async_stream"] = ["async_stream_query"]
        return operations

agent_engine = AgentEngineApp(agent=root_agent)
```

#### 3. Deploy to Agent Engine and run the round-trip test:

```python
import asyncio
import vertexai

async def test_round_trip():
    client = vertexai.Client(project="YOUR_PROJECT", location="us-central1")
    
    async with client.aio.live.agent_engines.connect(
        agent_engine="projects/.../reasoningEngines/...",
        config={"class_method": "bidi_stream_query"},
    ) as session:
        await session.send(query_input={"user_id": "test-handoff"})
        
        # Step 1: trigger delegation root -> helper_agent
        await session.send(query_input={
            "content": {"role": "user", "parts": [{"text": "What's the weather in Miami?"}]}
        })
        # Collect events until turn_complete (helper responds with audio -- works)
        while True:
            event = await asyncio.wait_for(session.receive(), timeout=30)
            ev = event.get("bidiStreamOutput", event)
            if ev.get("turn_complete") or ev.get("turnComplete"):
                break
        
        await asyncio.sleep(1)
        
        # Step 2: trigger transfer back helper -> root_agent
        await session.send(query_input={
            "content": {"role": "user", "parts": [{"text": "What is 2 + 2?"}]}
        })
        # Observe: helper calls transfer_to_agent(root_agent),
        # but root_agent produces empty turn_complete -- no audio, no text
        while True:
            event = await asyncio.wait_for(session.receive(), timeout=30)
            ev = event.get("bidiStreamOutput", event)
            # Events may use snake_case or camelCase keys, so check both.
            done = bool(ev.get("turn_complete") or ev.get("turnComplete"))
            # Guard against missing or null "content" / "actions" fields.
            parts = (ev.get("content") or {}).get("parts") or []
            actions = ev.get("actions") or {}
            print(f"author={ev.get('author')}  "
                  f"content_parts={len(parts)}  "
                  f"transfer={actions.get('transfer_to_agent', '')}  "
                  f"done={done}")
            if done:
                break

asyncio.run(test_round_trip())
```
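Both receive loops in the script have the same shape, so they can be factored into one helper that also keeps the events for later inspection. This is a minimal sketch, not part of the SDK: `collect_turn` and `is_turn_complete` are hypothetical names, and the only assumption is that `session.receive()` is awaitable and returns dicts shaped like the ones handled above.

```python
import asyncio


def is_turn_complete(ev: dict) -> bool:
    """True if the event marks the end of a turn (snake_case or camelCase key)."""
    return bool(ev.get("turn_complete") or ev.get("turnComplete"))


async def collect_turn(session, timeout: float = 30.0) -> list[dict]:
    """Receive events from `session` until one signals turn completion.

    Hypothetical helper: it only relies on `session.receive()` being
    awaitable and returning a dict, as in the test script.
    """
    events = []
    while True:
        event = await asyncio.wait_for(session.receive(), timeout=timeout)
        ev = event.get("bidiStreamOutput", event)  # unwrap the envelope if present
        events.append(ev)
        if is_turn_complete(ev):
            return events
```

With this in place, each step of the test reduces to one `session.send(...)` followed by `events = await collect_turn(session)`.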

### Expected Behavior

After `transfer_to_agent(root_agent)` fires:
1. `root_agent` receives the pending user message ("What is 2 + 2?").
2. `root_agent` responds with audio/text: "2 + 2 is 4."
3. The turn ends with a `turn_complete` event that carries content.

This is what happens in only ~25% of transfer-back attempts.

### Observed Behavior (75% of transfer-back attempts)

```
author=helper_agent  content_parts=0  transfer=           done=False
author=helper_agent  content_parts=1  transfer=           done=False   # transfer_to_agent call
author=helper_agent  content_parts=1  transfer=root_agent done=False   # transfer executes
author=root_agent    content_parts=0  transfer=           done=False   # root starts -- no message
author=root_agent    content_parts=0  transfer=           done=False
author=root_agent    content_parts=0  transfer=           done=True    # empty turn_complete
```

root_agent never receives the user message. It starts with empty context, produces nothing, and completes the turn.

In some cases root_agent makes a spurious re-transfer to helper_agent (based on stale context from turn 1), creating a transfer loop that also ends with an empty turn.
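When tallying trials, the outcomes described above (normal answer, empty turn, transfer loop) can be told apart mechanically. The following is a minimal sketch rather than anything from ADK: `classify_turn` is a hypothetical helper, and it assumes events are dicts carrying the `author`, `content.parts` and `actions.transfer_to_agent` fields that the test script prints. It treats any content part from the target agent as a response, which is a simplification.

```python
def classify_turn(events: list[dict], target: str = "root_agent") -> str:
    """Label one turn as 'no_transfer', 'pass', 'no_response' or 'transfer_loop'."""
    transferred = False
    answered = False
    for ev in events:
        dest = (ev.get("actions") or {}).get("transfer_to_agent")
        if not transferred:
            # Still waiting for the hand-off to the target agent.
            transferred = dest == target
            continue
        if ev.get("author") != target:
            continue
        if dest:
            # The target agent transferred again instead of answering.
            return "transfer_loop"
        if (ev.get("content") or {}).get("parts"):
            answered = True
    if not transferred:
        return "no_transfer"
    return "pass" if answered else "no_response"


# Events mirroring the observed log above (part payloads are placeholders).
observed = [
    {"author": "helper_agent", "content": {"parts": []}},
    {"author": "helper_agent", "content": {"parts": [{}]}},
    {"author": "helper_agent", "content": {"parts": [{}]},
     "actions": {"transfer_to_agent": "root_agent"}},
    {"author": "root_agent"},
    {"author": "root_agent"},
    {"author": "root_agent", "turn_complete": True},
]
print(classify_turn(observed))  # no_response
```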

### Environment

Tested on two configurations with identical results:

| | Config A | Config B |
|---|---|---|
| Python | 3.10.11 | 3.13.9 |
| `google-adk` | 1.28.1 | 1.29.0 |
| `google-cloud-aiplatform` | 1.145.0 | 1.147.0 |
| Model | `gemini-live-2.5-flash-native-audio` | `gemini-live-2.5-flash-native-audio` |
| Path | `bidi_stream_query` on Agent Engine | `bidi_stream_query` on Agent Engine |
| NR rate when transfer fires | **75%** (3/4) | **75%** (3/4) |

The bug reproduces identically across Python and SDK versions, which suggests a server-side issue in the AE assembly service or ADK runner rather than a client SDK bug.
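To make further reports comparable with the table above, the package versions can be captured programmatically. This is a small sketch using the standard library's `importlib.metadata`; `installed_versions` is a hypothetical helper name, and it reports only the environment in which it is run.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages: list[str]) -> dict[str, str]:
    """Map each distribution name to its installed version, or 'not installed'."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = "not installed"
    return found


if __name__ == "__main__":
    for name, ver in installed_versions(["google-adk", "google-cloud-aiplatform"]).items():
        print(f"{name}=={ver}")
```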

---

## Optional Information

### Timing analysis

All transfer-back attempts across both test configurations followed the same sequence:

| Metric | PASS (1 trial) | FAIL (3 trials) |
|--------|----------------|------------------|
| transfer fires | t=1.25s | t=1.09s |
| root_agent appears | t=2.97s (gap: 1.72s) | t=3.05s (gap: 1.96s) |
| root produces content | t=2.97s (4ms after appearing) | **NEVER** |

The gap between transfer and the target agent appearing (~1.7-2.0s) is consistent across pass and fail. The difference is binary: either the user message is delivered to the new session or it isn't. There's no timing threshold -- it appears to be a message-routing bug inside the runner where pending user content is dropped during Live session handoff.
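The metrics in the table can be derived from timestamped events. The sketch below is illustrative only: `handoff_timing` is a hypothetical helper, it assumes each event was recorded as a `(seconds_since_send, event_dict)` pair on the client, and the sample trials are synthetic events built from the timestamps in the table.

```python
def handoff_timing(events, target="root_agent"):
    """Extract hand-off timing from (t_seconds, event_dict) pairs in arrival order."""
    t_transfer = t_appears = t_content = None
    for t, ev in events:
        dest = (ev.get("actions") or {}).get("transfer_to_agent")
        if t_transfer is None:
            if dest == target:
                t_transfer = t  # the transfer to the target fires here
            continue
        if ev.get("author") != target:
            continue
        if t_appears is None:
            t_appears = t  # first event authored by the target agent
        if t_content is None and (ev.get("content") or {}).get("parts"):
            t_content = t  # first event from the target that carries content
    gap = None
    if t_transfer is not None and t_appears is not None:
        gap = round(t_appears - t_transfer, 2)
    return {"transfer": t_transfer, "target_appears": t_appears,
            "gap": gap, "first_content": t_content}


transfer = {"author": "helper_agent", "actions": {"transfer_to_agent": "root_agent"}}
pass_trial = [(1.25, transfer),
              (2.97, {"author": "root_agent", "content": {"parts": [{"text": "4"}]}})]
fail_trial = [(1.09, transfer), (3.05, {"author": "root_agent"})]

print(handoff_timing(pass_trial)["gap"])            # 1.72
print(handoff_timing(fail_trial)["gap"])            # 1.96
print(handoff_timing(fail_trial)["first_content"])  # None
```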

### Cloud Logging evidence (historical 1011 crash)

On an earlier AE/ADK version (April 7), the same `transfer_to_agent` round-trip produced a hard crash in Cloud Logging:

```
16:09:29 -- transfer_to_agent: root_agent
16:10:41 -- APIError: 1007 None. error when processing input audio, please check
           if the inputaudio is in valid format: 16khz s16le pcm, mono channel
16:10:41 -- StatusCode.ABORTED: Agent Engine Error
16:10:41 -- RpcError raised by BidiStreamQuery
```

The 1007 audio format error on the new Live session propagated as 1011 to the client. Current AE versions appear to handle this more gracefully (no crash) but the underlying issue -- message loss during transfer -- persists.

### Test results

**Minimal agent** (both agents `gemini-live-2.5-flash-native-audio`, connected via `sub_agents`):

| Run | Python / ADK / aiplatform | Trials | Transfer fired | NR when transfer fires |
|-----|--------------------------|--------|----------------|----------------------|
| 1 | 3.10 / 1.28.1 / 1.145.0 | 5 | 2 | **100%** (2/2) |
| 2 | 3.10 / 1.28.1 / 1.145.0 | 10 | 4 | **75%** (3/4) |
| 3 | 3.13 / 1.29.0 / 1.147.0 | 10 | 4 | **75%** (3/4) |

**Existing test agent** (different agent configuration, same `sub_agents` setup):

| Run | Python / ADK / aiplatform | Trials | Transfer fired | NR when transfer fires |
|-----|--------------------------|--------|----------------|----------------------|
| 1 | 3.10 / system | 10 | 3 | **100%** (3/3) |

**Combined across all runs: 13 transfer-back attempts, 11 failures (85%).**
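The combined figure follows directly from the per-run counts in the two tables. A short tally, with the counts copied from the tables above (the labels are only for readability):

```python
# (label, transfers_fired, no_response) per run, copied from the tables above.
RUNS = [
    ("minimal agent, run 1", 2, 2),
    ("minimal agent, run 2", 4, 3),
    ("minimal agent, run 3", 4, 3),
    ("existing test agent, run 1", 3, 3),
]


def failure_rate(runs):
    """Return (transfers fired, no-response count, failure ratio) over all runs."""
    fired = sum(f for _, f, _ in runs)
    failed = sum(n for _, _, n in runs)
    return fired, failed, failed / fired


fired, failed, rate = failure_rate(RUNS)
print(f"{failed}/{fired} = {rate:.0%}")  # 11/13 = 85%
```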

### Hypothesis: message not forwarded to new Live session

When `transfer_to_agent` fires on the bidi path:

1. The ADK runner (`runners.py:run_live`) hands off to the target agent
2. The target agent opens a **new** Gemini Live API session (via `base_llm_flow.py:run_live`)
3. The user's pending message (which was already consumed from the input queue) is supposed to be forwarded to the new session
4. **Intermittently, the message is not forwarded** -- the new session starts with empty context

This may be a race condition in how `_forward_requests()` interacts with the agent switch, or the message may be consumed by the outgoing agent's Live session before the incoming agent can receive it.
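The second possibility (the outgoing agent consuming the message before the incoming agent can receive it) can be illustrated with a toy model. This is not ADK code and does not reproduce `_forward_requests()`; `forward` and `simulate_handoff` are hypothetical names, and the model only shows how a message that has already been dequeued by one consumer is invisible to the next consumer of the same queue.

```python
import asyncio


async def forward(queue, session_log, dequeued, may_deliver):
    """Toy forwarder: take one message off the shared queue, then deliver it."""
    msg = await queue.get()       # the message leaves the shared queue here
    dequeued.set()
    await may_deliver.wait()      # stand-in for any work before delivery
    session_log.append(msg)       # delivery to this agent's session


async def cancel_and_wait(task):
    """Cancel a task and wait until the cancellation has taken effect."""
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass


async def simulate_handoff():
    queue = asyncio.Queue()
    await queue.put("What is 2 + 2?")
    old_session, new_session = [], []

    # The old agent's forwarder dequeues the message but has not delivered it.
    dequeued, never_set = asyncio.Event(), asyncio.Event()
    old = asyncio.create_task(forward(queue, old_session, dequeued, never_set))
    await dequeued.wait()

    # The transfer tears down the old forwarder while it still holds the message.
    await cancel_and_wait(old)

    # The new agent's forwarder reads the same queue, which is now empty.
    ready = asyncio.Event()
    ready.set()
    new = asyncio.create_task(forward(queue, new_session, asyncio.Event(), ready))
    await asyncio.wait({new}, timeout=0.05)
    await cancel_and_wait(new)
    return old_session, new_session


if __name__ == "__main__":
    print(asyncio.run(simulate_handoff()))  # ([], [])
```

In this model neither session receives the message. If the real runner behaves similarly, a fix would need to re-queue or explicitly hand over any content that was dequeued but not yet delivered at the moment of the agent switch.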

### Workaround

Replace `sub_agents=[helper_agent]` with `tools=[AgentTool(helper_agent)]`. `AgentTool` wraps the sub-agent as a regular function call -- the parent agent stays in control, receives the sub-agent's response as a tool result, and there is no Live session handoff.

```python
from google.adk.tools import AgentTool

root_agent = Agent(
    ...
    sub_agents=[],  # remove sub_agents
    tools=[AgentTool(helper_agent)],  # use AgentTool instead
)
```

Behavioral difference: with `AgentTool`, the sub-agent cannot interact with the user over multiple turns. For single-turn delegation, the behavior is equivalent.

### Related Issues

- **#5195** -- `transfer_to_agent` crashes with 1011 when root uses `gemini-live-*` and sub uses `gemini-2.5-flash` (model mismatch -- 100% crash). Different root cause but same mechanism.
- **#5113** -- `bidi_stream_query` silently drops `state` from first queue message (fixed in aiplatform 1.146.0). Same `bidi_stream_query` pipeline, different field.
- **#4996** -- Session resumption dead code in `run_live()` (fixed April 8). Related: the reconnection path in `base_llm_flow.py` is the same code that handles agent transfers.

[Live/bidi] transfer_to_agent drops pending user message on round-trip — no response after live-to-live handoff on Agent Engine #5238
