feat: add smallest ai asr and tts extensions by harshitajain165 · Pull Request #2200 · TEN-framework/ten-framework

harshitajain165 · 2026-07-03T10:44:13Z

What

Adds two new vendor extensions for Smallest AI:

smallest_asr_python: real-time speech-to-text using the Pulse live WebSocket
API (wss://api.smallest.ai/waves/v1/stt/live). 38 languages,
64 ms time-to-first-transcript.
smallest_tts_python: text-to-speech using the Lightning SSE streaming endpoint
(/waves/v1/tts/live). ~100 ms to first audio chunk, 12 languages, voice
cloning; lightning_v3.1 and lightning_v3.1_pro models.

Implementation notes

ASR (extends AsyncASRBaseExtension):

Streams raw PCM16 binary frames; interim results with final=false, finals on
Pulse's is_final.
asr_finalize maps to Pulse's {"type": "finalize"} control message (session
stays open).
word_timestamps=true by default so final results carry accurate
start_ms/duration_ms.
Reconnect with exponential backoff (5 attempts) before reporting a fatal error.
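
For reviewers, the reconnect policy can be pictured roughly as below. This is a minimal sketch and not the extension's actual code: only the 5-attempt limit comes from this PR, while the base delay, the delay cap, the helper names (backoff_delays, connect_with_backoff) and the choice of OSError as the retryable error are assumptions made for illustration.

```python
import asyncio

MAX_ATTEMPTS = 5    # from the PR description
BASE_DELAY_S = 0.5  # assumed starting delay
MAX_DELAY_S = 8.0   # assumed upper bound on a single delay


def backoff_delays(attempts=MAX_ATTEMPTS, base=BASE_DELAY_S, cap=MAX_DELAY_S):
    """Return one delay per attempt, doubling each time and clamped to cap."""
    return [min(cap, base * (2 ** i)) for i in range(attempts)]


async def connect_with_backoff(connect, sleep=asyncio.sleep):
    """Call connect() until it succeeds or the attempts are used up.

    connect is a hypothetical coroutine function that opens the WebSocket.
    Once every attempt has failed, a ConnectionError is raised so the caller
    can report a fatal error.
    """
    delays = backoff_delays()
    last_error = None
    for attempt, delay in enumerate(delays):
        try:
            return await connect()
        except OSError as exc:
            last_error = exc
            # No point sleeping after the final failed attempt.
            if attempt < len(delays) - 1:
                await sleep(delay)
    raise ConnectionError("reconnect attempts exhausted") from last_error
```

Passing sleep as a parameter keeps the policy testable without real waiting, which matches the mocked, key-free style of the test suites described under Testing.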

TTS (extends AsyncTTS2HttpExtension):

SSE frames (data: {"audio": "<base64>"} / {"done": true}) decoded
incrementally; a partial line is never split across chunk boundaries.
output_format pinned to pcm: Lightning's raw PCM is already signed 16-bit
LE mono, matching the pcm_frame contract with no conversion, and skipping the
container header keeps time-to-first-audio low.
401/403 and invalid_api_key classified as INVALID_KEY_ERROR (fatal);
everything else non-fatal.
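
The incremental SSE decoding described above could look something like the following. It is a hedged sketch only: the frame shapes (data: {"audio": "<base64>"} and {"done": true}) are taken from this description, but the class name SSEAudioDecoder and its feed method are hypothetical and do not mirror the extension's internals.

```python
import base64
import json
from typing import List


class SSEAudioDecoder:
    """Buffers raw HTTP chunks and yields PCM bytes per complete SSE line."""

    def __init__(self) -> None:
        self._buffer = b""
        self.done = False

    def feed(self, chunk: bytes) -> List[bytes]:
        """Consume one network chunk and return any fully decoded audio."""
        self._buffer += chunk
        audio: List[bytes] = []
        # Only complete lines are parsed; a trailing partial line stays in
        # the buffer until the rest of it arrives in a later chunk.
        while b"\n" in self._buffer:
            line, self._buffer = self._buffer.split(b"\n", 1)
            line = line.strip()
            if not line.startswith(b"data:"):
                continue  # blank separator lines and other SSE fields
            payload = json.loads(line[len(b"data:"):])
            if payload.get("done"):
                self.done = True
            elif "audio" in payload:
                audio.append(base64.b64decode(payload["audio"]))
        return audio
```

Because output_format is pinned to pcm, the decoded bytes are already signed 16-bit LE mono and can be forwarded as pcm_frame data without a conversion step.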

Both authenticate via params.api_key / SMALLEST_API_KEY and send
X-Source: ten-framework for API-side attribution. All other params keys pass
through verbatim (query string for ASR, request body for TTS).
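
As a rough illustration of the key resolution and param pass-through, here is a small sketch. The WebSocket URL, the X-Source header, the SMALLEST_API_KEY variable and the pcm pin come from this PR; the function names, the Bearer authorization scheme and the lowercase rendering of booleans in the query string are assumptions and may differ from the real code.

```python
import os
from urllib.parse import urlencode

ASR_URL = "wss://api.smallest.ai/waves/v1/stt/live"  # from the PR description


def resolve_api_key(params, env=None):
    """Prefer params.api_key, then fall back to SMALLEST_API_KEY."""
    env = os.environ if env is None else env
    key = params.get("api_key") or env.get("SMALLEST_API_KEY", "")
    if not key:
        raise ValueError("set params.api_key or SMALLEST_API_KEY")
    return key


def _passthrough(params):
    """Every key except api_key is forwarded to the vendor unchanged."""
    return {k: v for k, v in params.items() if k != "api_key"}


def _query_value(value):
    # Assumption: booleans are sent as lowercase true/false.
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def build_asr_url(params):
    """ASR: pass-through params become the query string."""
    items = {k: _query_value(v) for k, v in _passthrough(params).items()}
    query = urlencode(items)
    return f"{ASR_URL}?{query}" if query else ASR_URL


def build_tts_body(params):
    """TTS: pass-through params become the request body, pcm is enforced."""
    body = _passthrough(params)
    body["output_format"] = "pcm"
    return body


def build_headers(params, env=None):
    return {
        # Assumption: the Bearer scheme is illustrative, not confirmed here.
        "Authorization": f"Bearer {resolve_api_key(params, env)}",
        "X-Source": "ten-framework",
    }
```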

Testing

Extension-local mocked test suites included for both (no API key needed):
ASR — result shape, finalize, dump, metrics, reconnect, vendor error, invalid
params; TTS — basic/dump/flush, error classification, metrics, params/URL
resolution, robustness, state machine.
python -m black --check --line-length 80 clean; syntax verified.
task asr-guarder-test EXTENSION=smallest_asr_python (pending — will run
before marking ready for review)
task tts-guarder-test EXTENSION=smallest_tts_python (pending)
End-to-end voice-assistant graph smoke test (pending)

Add two new vendor extensions for Smallest AI:

- smallest_asr_python: real-time speech-to-text via the Pulse live WebSocket API (binary PCM16 in, interim/final transcripts out, finalize control message, reconnect with exponential backoff).
- smallest_tts_python: text-to-speech via the Lightning SSE streaming endpoint (base64 PCM16 chunks decoded on the fly, ~100 ms to first audio, output_format pinned to pcm).

Both include mocked extension-local test suites, test configs, and README docs. Configured via params.api_key or SMALLEST_API_KEY.

harshitajain165 wants to merge 1 commit into TEN-framework:main from harshitajain165:feat/smallest-ai-integration
