feat(eot): add audio models AGT-2520 #4722

Merged
chenghao-mou merged 120 commits into main from feat/AGT-2520-multimodal-EOU
Jun 16, 2026

chenghao-mou commented Feb 5, 2026
Member

Adds streaming audio end-of-turn detection. A single user-facing AudioTurnDetector selects between two backends:

  • turn-detector
  • turn-detector-mini

On cloud transport error or predict_end_of_turn timeout, the session swaps to mini/local for the rest of the stream (sticky per session, one warning per failure mode).
Local failures emit the default 1.0 prediction and retry on the next turn.
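
The fallback behavior described above can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: `FallbackSession`, the `cloud_predict` / `local_predict` callables, the 0.5 s timeout, and the use of `ConnectionError` as a stand-in for a cloud transport error are all hypothetical.

```python
import asyncio
import logging

logger = logging.getLogger("eot-fallback-sketch")

# Prediction emitted when the local backend fails (value taken from the PR description).
DEFAULT_PREDICTION = 1.0


class FallbackSession:
    """Hypothetical per-session wrapper illustrating a sticky cloud-to-local fallback."""

    def __init__(self, cloud_predict, local_predict, *, timeout=0.5):
        self._cloud = cloud_predict  # async callable returning a float
        self._local = local_predict  # async callable returning a float
        self._timeout = timeout
        self._use_local = False      # once True, stays True for the session
        self._warned = set()         # failure modes that were already logged

    def _fall_back(self, mode):
        self._use_local = True
        if mode not in self._warned:  # one warning per failure mode
            self._warned.add(mode)
            logger.warning("cloud turn detector %s; using local backend", mode)

    async def predict_end_of_turn(self):
        if not self._use_local:
            try:
                return await asyncio.wait_for(self._cloud(), timeout=self._timeout)
            except asyncio.TimeoutError:
                self._fall_back("timeout")
            except ConnectionError:  # assumed stand-in for a transport error
                self._fall_back("transport error")
        try:
            return await self._local()
        except Exception:
            # Local failure: emit the default and try again on the next turn.
            return DEFAULT_PREDICTION
```

Note that the local branch does not flip any sticky state, so a local failure only affects the current turn.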

A user-set unlikely_threshold is scaled multiplicatively against the cloud default so the operating point survives a fallback.
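
One possible reading of "scaled multiplicatively" is that the ratio between the user's threshold and the cloud default is carried over to the local backend's default. The sketch below assumes that reading; the function name `rescale_threshold` and the default values in the usage note are hypothetical and are not taken from the PR.

```python
def rescale_threshold(
    user_threshold: float, cloud_default: float, local_default: float
) -> float:
    """Carry a user-set threshold over to another backend by preserving its
    ratio to the cloud default (assumed interpretation of the PR text)."""
    if cloud_default <= 0:
        raise ValueError("cloud_default must be positive")
    ratio = user_threshold / cloud_default
    # Clamp so the result is still a valid probability threshold.
    return min(1.0, max(0.0, local_default * ratio))
```

With made-up defaults of 0.5 (cloud) and 0.3 (local), a user threshold of 0.25 is half the cloud default, so `rescale_threshold(0.25, 0.5, 0.3)` lands at half the local default.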

hsjun99 commented Feb 25, 2026

@chenghao-mou Excited to see this! A couple of questions:

  1. Will the multimodal EOT model be publicly accessible via model weights or agent-gateway.livekit.cloud, or in some other way?
  2. Any rough timeline for when MultiModalTurnDetector gets fully wired up?

@chenghao-mou

Member Author

> @chenghao-mou Excited to see this! A couple of questions:
>
> 1. Will the multimodal EOT model be publicly accessible via model weights or agent-gateway.livekit.cloud, or in some other way?
> 2. Any rough timeline for when MultiModalTurnDetector gets fully wired up?

Thanks for your patience! We don't have an official decision or timeline yet, but hopefully I can get it ready within a month or two.

@chenghao-mou chenghao-mou marked this pull request as ready for review April 22, 2026 07:38
@chenghao-mou chenghao-mou requested a review from a team April 22, 2026 07:38
devin-ai-integration[bot]

This comment was marked as resolved.

Drop duplicate worker token header declaration

Co-authored-by: devin-ai-integration[bot] <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Comment thread examples/frontdesk/agent.py Outdated
Comment on lines +7 to +33
def resolve_env_var(val: NotGivenOr[str], *env_vars: str, default: str = "") -> str:
    """
    Resolve an environment variable from a list of potential sources.

    Args:
        val: The value to resolve.
        *env_vars: The environment variables to check. Order matters, the first non-empty value will be returned.
        default: The default value to return if no environment variables are set.

    Returns:
        The resolved environment variable.

    Examples:
        >>> resolve_env_var(
        ...     NOT_GIVEN,
        ...     "ABC_URL",
        ...     default="https://agent-gateway.livekit.cloud/v1",
        ... )
        'https://agent-gateway.livekit.cloud/v1'
    """
    if is_given(val):
        return val
    for env_var in env_vars:
        curr_val = os.getenv(env_var, None)
        if curr_val is not None and curr_val != "":
            return curr_val
    return default

Member

Do we need that?

Isn't it just

curr_val = os.getenv(env_var, None) or "default"?

Member Author

This is really just to handle LIVEKIT_INFERENCE_* before LIVEKIT_*
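
To illustrate the precedence point, here is a self-contained sketch. It re-implements the helper with a local sentinel instead of livekit's `NOT_GIVEN` / `is_given`, and the variable names `LIVEKIT_INFERENCE_URL` and `LIVEKIT_URL` are used only as examples of the `LIVEKIT_INFERENCE_*` before `LIVEKIT_*` ordering.

```python
import os

# Local stand-in for livekit's NOT_GIVEN sentinel (assumption for this sketch).
_NOT_GIVEN = object()


def resolve_env_var(val, *env_vars, default=""):
    """Return val if given, else the first non-empty environment variable,
    else the default."""
    if val is not _NOT_GIVEN:
        return val
    for name in env_vars:
        current = os.getenv(name)
        if current:  # skips both unset and empty-string values
            return current
    return default


def resolve_url(val=_NOT_GIVEN):
    # The more specific variable is listed first, so it wins when both are set.
    return resolve_env_var(
        val,
        "LIVEKIT_INFERENCE_URL",
        "LIVEKIT_URL",
        default="https://agent-gateway.livekit.cloud/v1",
    )
```

A plain `os.getenv(name) or default` covers a single variable; the helper adds the ordered lookup across several names plus the explicit-argument override.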

Comment on lines +51 to +95
@runtime_checkable
class _StreamingTurnDetectorStream(_TurnDetector, Protocol):
    # allow None chat_ctx for the streaming model
    async def predict_end_of_turn(
        self, chat_ctx: ChatContext | None = None, *, timeout: float | None = None
    ) -> float: ...

    @property
    def is_active(self) -> bool: ...
    @property
    def is_inference_running(self) -> bool: ...
    @property
    def preemptive_request_id(self) -> str | None: ...
    @property
    def last_prediction(self) -> TurnDetectionEvent | None: ...

    def update_language(self, language: LanguageCode | None) -> None: ...

    def warmup(self) -> asyncio.Future[float]: ...
    def activate(self, trigger: str | None = None) -> None: ...
    def deactivate(self, trigger: str | None = None) -> None: ...
    def flush(self, reason: str | None = None) -> None: ...
    def push_audio(self, frame: rtc.AudioFrame) -> None: ...
    def end_input(self) -> None: ...
    async def aclose(self) -> None: ...


@runtime_checkable
class _StreamingTurnDetector(Protocol):
    """Turn detector that hands out a per-session stream instead of resolving
    inline. Per-language threshold lookups (``unlikely_threshold`` /
    ``supports_language``) live on the stream, not the detector — after a
    cloud→local fallback they need to reflect the active backend's rescaled
    view, which is per-session state."""

    @property
    def model(self) -> str: ...
    @property
    def provider(self) -> str: ...

    def stream(
        self,
        *,
        conn_options: APIConnectOptions = DEFAULT_API_CONNECT_OPTIONS,
    ) -> _StreamingTurnDetectorStream: ...
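
For readers less familiar with `@runtime_checkable`, the reduced sketch below shows what such a protocol buys at runtime. `StreamLike`, `FakeStream`, and `NotAStream` are hypothetical names, and frames are plain `bytes` here rather than `rtc.AudioFrame`; only two of the methods from the excerpt are reproduced.

```python
from typing import Optional, Protocol, runtime_checkable


@runtime_checkable
class StreamLike(Protocol):
    """Cut-down stand-in for the stream protocol in the excerpt above."""

    def push_audio(self, frame: bytes) -> None: ...
    def flush(self, reason: Optional[str] = None) -> None: ...


class FakeStream:
    """Matches the protocol structurally without inheriting from it."""

    def __init__(self) -> None:
        self.frames = []
        self.flushed = False

    def push_audio(self, frame: bytes) -> None:
        self.frames.append(frame)

    def flush(self, reason: Optional[str] = None) -> None:
        self.flushed = True


class NotAStream:
    """Lacks flush(), so it does not satisfy the protocol."""

    def push_audio(self, frame: bytes) -> None:
        pass
```

`isinstance(FakeStream(), StreamLike)` is true and `isinstance(NotAStream(), StreamLike)` is false. The runtime check only looks at whether the members exist, not at their signatures, so it is a weaker guarantee than static type checking.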

Member

FWIW, it's very unlikely we will support third-party turn detectors, so we could simplify the abstraction by not having any Protocol or ABC class.

Member Author

I've simplified this a lot in the new commit: d74df8d (this PR)

devin-ai-integration[bot]

This comment was marked as resolved.

@theomonnom

Member

Nice work! 🚀

devin-ai-integration[bot]

This comment was marked as resolved.

@theomonnom

Member

LGTM. Let's make sure we fall back correctly when infserve has issues (long timeouts, stale connections, errors).

# Conflicts:
#	livekit-agents/livekit/agents/job.py
#	livekit-agents/livekit/agents/version.py
#	livekit-agents/livekit/agents/voice/agent_activity.py
#	livekit-agents/pyproject.toml
#	livekit-plugins/livekit-plugins-anam/livekit/plugins/anam/version.py
#	livekit-plugins/livekit-plugins-anam/pyproject.toml
#	livekit-plugins/livekit-plugins-anthropic/livekit/plugins/anthropic/version.py
#	livekit-plugins/livekit-plugins-anthropic/pyproject.toml
#	livekit-plugins/livekit-plugins-assemblyai/livekit/plugins/assemblyai/version.py
#	livekit-plugins/livekit-plugins-assemblyai/pyproject.toml
#	livekit-plugins/livekit-plugins-asyncai/livekit/plugins/asyncai/version.py
#	livekit-plugins/livekit-plugins-asyncai/pyproject.toml
#	livekit-plugins/livekit-plugins-avatario/livekit/plugins/avatario/version.py
#	livekit-plugins/livekit-plugins-avatario/pyproject.toml
#	livekit-plugins/livekit-plugins-avatartalk/livekit/plugins/avatartalk/version.py
#	livekit-plugins/livekit-plugins-avatartalk/pyproject.toml
#	livekit-plugins/livekit-plugins-aws/livekit/plugins/aws/version.py
#	livekit-plugins/livekit-plugins-aws/pyproject.toml
#	livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/version.py
#	livekit-plugins/livekit-plugins-azure/pyproject.toml
#	livekit-plugins/livekit-plugins-baseten/livekit/plugins/baseten/version.py
#	livekit-plugins/livekit-plugins-baseten/pyproject.toml
#	livekit-plugins/livekit-plugins-bey/livekit/plugins/bey/version.py
#	livekit-plugins/livekit-plugins-bey/pyproject.toml
#	livekit-plugins/livekit-plugins-bithuman/livekit/plugins/bithuman/version.py
#	livekit-plugins/livekit-plugins-bithuman/pyproject.toml
#	livekit-plugins/livekit-plugins-browser/livekit/plugins/browser/version.py
#	livekit-plugins/livekit-plugins-browser/pyproject.toml
#	livekit-plugins/livekit-plugins-cambai/livekit/plugins/cambai/version.py
#	livekit-plugins/livekit-plugins-cambai/pyproject.toml
#	livekit-plugins/livekit-plugins-cartesia/livekit/plugins/cartesia/version.py
#	livekit-plugins/livekit-plugins-cartesia/pyproject.toml
#	livekit-plugins/livekit-plugins-cerebras/livekit/plugins/cerebras/version.py
#	livekit-plugins/livekit-plugins-cerebras/pyproject.toml
#	livekit-plugins/livekit-plugins-clova/livekit/plugins/clova/version.py
#	livekit-plugins/livekit-plugins-clova/pyproject.toml
#	livekit-plugins/livekit-plugins-deepgram/livekit/plugins/deepgram/version.py
#	livekit-plugins/livekit-plugins-deepgram/pyproject.toml
#	livekit-plugins/livekit-plugins-did/livekit/plugins/did/version.py
#	livekit-plugins/livekit-plugins-did/pyproject.toml
#	livekit-plugins/livekit-plugins-elevenlabs/livekit/plugins/elevenlabs/version.py
#	livekit-plugins/livekit-plugins-elevenlabs/pyproject.toml
#	livekit-plugins/livekit-plugins-fal/livekit/plugins/fal/version.py
#	livekit-plugins/livekit-plugins-fal/pyproject.toml
#	livekit-plugins/livekit-plugins-fireworksai/livekit/plugins/fireworksai/version.py
#	livekit-plugins/livekit-plugins-fireworksai/pyproject.toml
#	livekit-plugins/livekit-plugins-fishaudio/livekit/plugins/fishaudio/version.py
#	livekit-plugins/livekit-plugins-fishaudio/pyproject.toml
#	livekit-plugins/livekit-plugins-gladia/livekit/plugins/gladia/version.py
#	livekit-plugins/livekit-plugins-gladia/pyproject.toml
#	livekit-plugins/livekit-plugins-gnani/livekit/plugins/gnani/version.py
#	livekit-plugins/livekit-plugins-gnani/pyproject.toml
#	livekit-plugins/livekit-plugins-google/livekit/plugins/google/version.py
#	livekit-plugins/livekit-plugins-google/pyproject.toml
#	livekit-plugins/livekit-plugins-gradium/livekit/plugins/gradium/version.py
#	livekit-plugins/livekit-plugins-gradium/pyproject.toml
#	livekit-plugins/livekit-plugins-groq/livekit/plugins/groq/version.py
#	livekit-plugins/livekit-plugins-groq/pyproject.toml
#	livekit-plugins/livekit-plugins-hamming/livekit/plugins/hamming/version.py
#	livekit-plugins/livekit-plugins-hamming/pyproject.toml
#	livekit-plugins/livekit-plugins-hedra/livekit/plugins/hedra/version.py
#	livekit-plugins/livekit-plugins-hedra/pyproject.toml
#	livekit-plugins/livekit-plugins-hume/livekit/plugins/hume/version.py
#	livekit-plugins/livekit-plugins-hume/pyproject.toml
#	livekit-plugins/livekit-plugins-inworld/livekit/plugins/inworld/version.py
#	livekit-plugins/livekit-plugins-inworld/pyproject.toml
#	livekit-plugins/livekit-plugins-keyframe/livekit/plugins/keyframe/version.py
#	livekit-plugins/livekit-plugins-keyframe/pyproject.toml
#	livekit-plugins/livekit-plugins-krisp/livekit/plugins/krisp/version.py
#	livekit-plugins/livekit-plugins-krisp/pyproject.toml
#	livekit-plugins/livekit-plugins-langchain/livekit/plugins/langchain/version.py
#	livekit-plugins/livekit-plugins-langchain/pyproject.toml
#	livekit-plugins/livekit-plugins-lemonslice/livekit/plugins/lemonslice/version.py
#	livekit-plugins/livekit-plugins-lemonslice/pyproject.toml
#	livekit-plugins/livekit-plugins-liveavatar/livekit/plugins/liveavatar/version.py
#	livekit-plugins/livekit-plugins-liveavatar/pyproject.toml
#	livekit-plugins/livekit-plugins-lmnt/livekit/plugins/lmnt/version.py
#	livekit-plugins/livekit-plugins-lmnt/pyproject.toml
#	livekit-plugins/livekit-plugins-minimal/livekit/plugins/minimal/version.py
#	livekit-plugins/livekit-plugins-minimal/pyproject.toml
#	livekit-plugins/livekit-plugins-minimax/livekit/plugins/minimax/version.py
#	livekit-plugins/livekit-plugins-minimax/pyproject.toml
#	livekit-plugins/livekit-plugins-mistralai/livekit/plugins/mistralai/version.py
#	livekit-plugins/livekit-plugins-mistralai/pyproject.toml
#	livekit-plugins/livekit-plugins-murf/livekit/plugins/murf/version.py
#	livekit-plugins/livekit-plugins-murf/pyproject.toml
#	livekit-plugins/livekit-plugins-neuphonic/livekit/plugins/neuphonic/version.py
#	livekit-plugins/livekit-plugins-neuphonic/pyproject.toml
#	livekit-plugins/livekit-plugins-nltk/livekit/plugins/nltk/version.py
#	livekit-plugins/livekit-plugins-nltk/pyproject.toml
#	livekit-plugins/livekit-plugins-nvidia/livekit/plugins/nvidia/version.py
#	livekit-plugins/livekit-plugins-nvidia/pyproject.toml
#	livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/version.py
#	livekit-plugins/livekit-plugins-openai/pyproject.toml
#	livekit-plugins/livekit-plugins-perplexity/livekit/plugins/perplexity/version.py
#	livekit-plugins/livekit-plugins-perplexity/pyproject.toml
#	livekit-plugins/livekit-plugins-phonic/livekit/plugins/phonic/version.py
#	livekit-plugins/livekit-plugins-phonic/pyproject.toml
#	livekit-plugins/livekit-plugins-resemble/livekit/plugins/resemble/version.py
#	livekit-plugins/livekit-plugins-resemble/pyproject.toml
#	livekit-plugins/livekit-plugins-respeecher/livekit/plugins/respeecher/version.py
#	livekit-plugins/livekit-plugins-respeecher/pyproject.toml
#	livekit-plugins/livekit-plugins-rime/livekit/plugins/rime/version.py
#	livekit-plugins/livekit-plugins-rime/pyproject.toml
#	livekit-plugins/livekit-plugins-rtzr/livekit/plugins/rtzr/version.py
#	livekit-plugins/livekit-plugins-rtzr/pyproject.toml
#	livekit-plugins/livekit-plugins-runway/livekit/plugins/runway/version.py
#	livekit-plugins/livekit-plugins-runway/pyproject.toml
#	livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/version.py
#	livekit-plugins/livekit-plugins-sarvam/pyproject.toml
#	livekit-plugins/livekit-plugins-silero/livekit/plugins/silero/version.py
#	livekit-plugins/livekit-plugins-silero/pyproject.toml
#	livekit-plugins/livekit-plugins-simli/livekit/plugins/simli/version.py
#	livekit-plugins/livekit-plugins-simli/pyproject.toml
#	livekit-plugins/livekit-plugins-simplismart/livekit/plugins/simplismart/version.py
#	livekit-plugins/livekit-plugins-simplismart/pyproject.toml
#	livekit-plugins/livekit-plugins-slng/livekit/plugins/slng/version.py
#	livekit-plugins/livekit-plugins-slng/pyproject.toml
#	livekit-plugins/livekit-plugins-smallestai/livekit/plugins/smallestai/version.py
#	livekit-plugins/livekit-plugins-smallestai/pyproject.toml
#	livekit-plugins/livekit-plugins-soniox/livekit/plugins/soniox/version.py
#	livekit-plugins/livekit-plugins-soniox/pyproject.toml
#	livekit-plugins/livekit-plugins-speechify/livekit/plugins/speechify/version.py
#	livekit-plugins/livekit-plugins-speechify/pyproject.toml
#	livekit-plugins/livekit-plugins-speechmatics/livekit/plugins/speechmatics/version.py
#	livekit-plugins/livekit-plugins-speechmatics/pyproject.toml
#	livekit-plugins/livekit-plugins-spitch/livekit/plugins/spitch/version.py
#	livekit-plugins/livekit-plugins-spitch/pyproject.toml
#	livekit-plugins/livekit-plugins-tavus/livekit/plugins/tavus/version.py
#	livekit-plugins/livekit-plugins-tavus/pyproject.toml
#	livekit-plugins/livekit-plugins-telnyx/livekit/plugins/telnyx/version.py
#	livekit-plugins/livekit-plugins-telnyx/pyproject.toml
#	livekit-plugins/livekit-plugins-trugen/livekit/plugins/trugen/version.py
#	livekit-plugins/livekit-plugins-trugen/pyproject.toml
#	livekit-plugins/livekit-plugins-turn-detector/livekit/plugins/turn_detector/version.py
#	livekit-plugins/livekit-plugins-turn-detector/pyproject.toml
#	livekit-plugins/livekit-plugins-ultravox/livekit/plugins/ultravox/version.py
#	livekit-plugins/livekit-plugins-ultravox/pyproject.toml
#	livekit-plugins/livekit-plugins-upliftai/livekit/plugins/upliftai/version.py
#	livekit-plugins/livekit-plugins-upliftai/pyproject.toml
#	livekit-plugins/livekit-plugins-xai/livekit/plugins/xai/version.py
#	livekit-plugins/livekit-plugins-xai/pyproject.toml
@chenghao-mou chenghao-mou force-pushed the feat/AGT-2520-multimodal-EOU branch from c5a7863 to 671ef28 Compare June 16, 2026 10:03
devin-ai-integration[bot]

This comment was marked as resolved.

devin-ai-integration[bot]

This comment was marked as resolved.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@chenghao-mou chenghao-mou merged commit 61b14fd into main Jun 16, 2026
23 checks passed
@chenghao-mou chenghao-mou deleted the feat/AGT-2520-multimodal-EOU branch June 16, 2026 18:29
5 participants