From 78ca7dd9fe524cec3d72175fe8ed787bb0e953b1 Mon Sep 17 00:00:00 2001
From: noahraynor-vapi <244848814+noahraynor-vapi@users.noreply.github.com>
Date: Thu, 12 Mar 2026 01:44:38 +0000
Subject: [PATCH 1/2] fix: add Soniox provider to speech config docs and remove Vapi text-based entry

Co-Authored-By: Claude Opus 4.6
---
 fern/customization/speech-configuration.mdx | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fern/customization/speech-configuration.mdx b/fern/customization/speech-configuration.mdx
index 0471be6d8..f84ca0da2 100644
--- a/fern/customization/speech-configuration.mdx
+++ b/fern/customization/speech-configuration.mdx
@@ -55,11 +55,12 @@ This plan defines the parameters for when the assistant begins speaking after th

  - **Assembly**: Transcriber that also reports end-of-turn detection. To use Assembly, choose it as your transcriber without setting a separate smart endpointing plan. As transcripts arrive, we consider the `end_of_turn` flag that Assembly sends to mark the end-of-turn, stream to the LLM, and generate a response.

+ - **Soniox v4**: Transcriber with built-in semantic endpointing across 60+ languages. Soniox uses a single universal model to detect end-of-turn based on intonation and conversational context rather than silence duration, producing fewer false triggers. To use Soniox, choose it as your transcriber and turn off "Smart Endpointing" in the Start Speaking Plan. Recommended for multi-lingual agents.
+
 **Text-based providers:**

  - **Off**: Disabled by default. When smart endpointing is set to "Off", the system will automatically use the transcriber's end-of-turn detection if available. If no transcriber EOT detection is available, the system defaults to LiveKit if the language is set to English or to Vapi's standard endpointing mode.
  - **LiveKit**: Recommended for English conversations as it provides the most sophisticated solution for detecting natural speech patterns and pauses. LiveKit can be fine-tuned using the `waitFunction` parameter to adjust response timing based on the probability that the user is still speaking.
- - **Vapi**: Recommended for non-English conversations or as an alternative when LiveKit isn't suitable

 ![LiveKit Smart Endpointing Configuration](../static/images/advanced-tab/livekit-smart-endpointing.png)

From 8bb24dba3b58dc14b6ac8e6e6bea77e73f78d688 Mon Sep 17 00:00:00 2001
From: Vapi Tasker
Date: Thu, 12 Mar 2026 01:48:47 +0000
Subject: [PATCH 2/2] fix: use 'multilingual' consistently (no hyphen)

Matches the convention used across all 265 other occurrences in the docs.
---
 fern/customization/speech-configuration.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fern/customization/speech-configuration.mdx b/fern/customization/speech-configuration.mdx
index f84ca0da2..ef34ee7b7 100644
--- a/fern/customization/speech-configuration.mdx
+++ b/fern/customization/speech-configuration.mdx
@@ -55,7 +55,7 @@ This plan defines the parameters for when the assistant begins speaking after th

  - **Assembly**: Transcriber that also reports end-of-turn detection. To use Assembly, choose it as your transcriber without setting a separate smart endpointing plan. As transcripts arrive, we consider the `end_of_turn` flag that Assembly sends to mark the end-of-turn, stream to the LLM, and generate a response.

- - **Soniox v4**: Transcriber with built-in semantic endpointing across 60+ languages. Soniox uses a single universal model to detect end-of-turn based on intonation and conversational context rather than silence duration, producing fewer false triggers. To use Soniox, choose it as your transcriber and turn off "Smart Endpointing" in the Start Speaking Plan. Recommended for multi-lingual agents.
+ - **Soniox v4**: Transcriber with built-in semantic endpointing across 60+ languages. Soniox uses a single universal model to detect end-of-turn based on intonation and conversational context rather than silence duration, producing fewer false triggers. To use Soniox, choose it as your transcriber and turn off "Smart Endpointing" in the Start Speaking Plan. Recommended for multilingual agents.

 **Text-based providers:**