Streaming tool yields in live mode cause model to re-invoke the tool in a loop #5947

@kazunori279

Description

When using an async generator streaming tool in live (bidi) mode, each yield from the tool causes the model to be interrupted and re-invoke the same tool, creating an infinite loop of duplicate tool calls.

Root Cause

In _process_function_live_helper (functions.py), streaming tool results are sent back to the model via live_request_queue.send_content():

async for result in agen:
    updated_content = types.Content(
        role='user',
        parts=[
            types.Part.from_text(
                text=f'Function {tool.name} returned: {result}'
            )
        ],
    )
    invocation_context.live_request_queue.send_content(updated_content)

In GeminiLlmConnection.send_content(), this is forwarded as:

await self._gemini_session.send_client_content(turns=[content], turn_complete=True)

The combination of role='user' and turn_complete=True makes the model treat each streaming update as a new complete user turn. The model:

  1. Gets interrupted (stops current audio/text generation)
  2. Receives what looks like a new user message: "Function run_my_tool returned: {'status': 'running', ...}"
  3. Responds to it — often by calling the same tool again
  4. The tool is already running, so it either starts a duplicate or returns "busy"
  5. Repeat

This produces a rapid interrupted → turn_complete → tool_call → interrupted → ... loop visible in the logs, with the model never completing a coherent response.
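The feedback loop described above can be illustrated with a toy simulation. This is only a sketch: `ToyLiveModel` and `simulate` are hypothetical names that exist neither in ADK nor in the Gemini API, and the model behavior (always re-calling the tool on a complete user turn, one yield per tool run) is an assumed simplification of what the logs show.

```python
from dataclasses import dataclass, field


@dataclass
class ToyLiveModel:
    """Hypothetical stand-in for a live model; not an ADK or Gemini class."""

    tool_calls: int = 0
    interruptions: int = 0
    log: list = field(default_factory=list)

    def receive(self, role: str, text: str, turn_complete: bool) -> bool:
        """Return True if the model reacts by calling the tool again."""
        if role == "user" and turn_complete:
            # Assumption: a complete user turn interrupts generation
            # and the model answers it with another tool call.
            self.interruptions += 1
            self.tool_calls += 1
            self.log += ["interrupted", "tool_call"]
            return True
        # Anything else is treated as background context in this toy model.
        self.log.append("context")
        return False


def simulate(turn_complete: bool, max_updates: int = 5) -> ToyLiveModel:
    """Feed streaming updates to the toy model, capped at max_updates."""
    model = ToyLiveModel(tool_calls=1)  # the initial, legitimate tool call
    pending = 1  # tool runs that still have an update to yield
    updates = 0
    text = "Function run_my_tool returned: {'status': 'running'}"
    while pending and updates < max_updates:
        pending -= 1
        updates += 1
        if model.receive("user", text, turn_complete):
            pending += 1  # the re-invoked tool will yield again
    return model
```

With `turn_complete=True` the simulated loop only stops because of the `max_updates` cap; with `turn_complete=False` the single update is absorbed as context and no further tool call occurs.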

Steps to Reproduce

  1. Define an async generator tool that yields progress updates:
import asyncio

async def long_running_task(query: str):
    """A tool that takes time and yields progress updates."""
    yield {"status": "running", "message": "Step 1: fetching data..."}
    await asyncio.sleep(5)
    yield {"status": "running", "message": "Step 2: processing..."}
    await asyncio.sleep(5)
    yield {"status": "completed", "result": "Done!"}
  2. Create an agent with this tool and run in live mode with a native-audio model
  3. Send a message that triggers the tool
  4. Observe: the model gets interrupted by each yield, re-calls the tool repeatedly, and never delivers a coherent spoken response

Expected Behavior

Streaming tool updates should be delivered to the model in a way that does not interrupt the current generation or trigger re-invocation. The model should be able to narrate progress updates naturally between yields.

Possible Fixes

  • Send streaming updates with turn_complete=False so the model treats them as partial/background context rather than a new turn requiring a response
  • Use a different role (e.g., 'tool' or function_response) instead of 'user' so the model doesn't interpret yields as user messages
  • Queue streaming updates and let the model consume them after its current generation completes, rather than interrupting it
  • Provide a mechanism to send out-of-band status updates that bypass the model entirely (e.g., via a callback to the application layer)
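As a rough illustration of the last option, intermediate yields could be routed to an application callback while only the final yield is returned for delivery to the model. This is a sketch under assumptions, not a proposed patch: `drain_tool` and `on_progress` are hypothetical names, the tool is a shortened copy of the repro tool, and the sketch assumes the last yield is the tool's result.

```python
import asyncio
from typing import AsyncIterator, Callable, Optional


async def long_running_task(query: str) -> AsyncIterator[dict]:
    """Shortened version of the repro tool (sleeps reduced for the demo)."""
    yield {"status": "running", "message": "Step 1: fetching data..."}
    await asyncio.sleep(0.01)
    yield {"status": "running", "message": "Step 2: processing..."}
    await asyncio.sleep(0.01)
    yield {"status": "completed", "result": "Done!"}


async def drain_tool(
    agen: AsyncIterator[dict],
    on_progress: Callable[[dict], None],
) -> Optional[dict]:
    """Send intermediate yields out of band; return only the last yield."""
    final = None
    async for update in agen:
        if final is not None:
            # The previously held yield was not the last one,
            # so it is a progress update for the application layer.
            on_progress(final)
        final = update
    return final


async def main() -> None:
    progress = []
    final = await drain_tool(long_running_task("demo"), progress.append)
    print("out-of-band updates:", progress)
    print("single result for the model:", final)


if __name__ == "__main__":
    asyncio.run(main())
```

The trade-off is that the model never sees progress and therefore cannot narrate it; an application that wants spoken progress would still need one of the other options.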

Environment

  • google-adk: 2.1.0
  • Model: gemini-live-2.5-flash-native-audio (Vertex AI)
  • Python: 3.12

Metadata

Labels

  • live: [Component] This issue is related to live, voice and video chat
  • needs review: [Status] The PR/issue is awaiting review from the maintainer
