deepgram · jeniya-tabassum · Jan 29, 2026 · Jan 15, 2026 · Jan 24, 2026
@@ -38,12 +38,27 @@ uv run stream_audio_file.py --ui --live \
 
 ### Save & Print Mode
 
-**Stream to file (realtime mode - natural pace):**
+**Stream and save JSON output:**
 ```bash
-uv run stream_audio_file.py -o output.json -f audio.wav --realtime \
+uv run stream_audio_file.py -f audio.wav \
   --url "wss://api.deepgram.com/v1/listen?model=nova-3&interim_results=true"
 ```
 
+Output is automatically saved to `audio.json` (derived from input filename).
+
+**Specify a custom output file:**
+```bash
+uv run stream_audio_file.py -o output.json -f audio.wav \
+  --url "wss://api.deepgram.com/v1/listen?model=nova-3&interim_results=true"
+```
+
+**Live recording saves to a timestamped file:**
+```bash
+uv run stream_audio_file.py --live \
+  --url "wss://api.deepgram.com/v1/listen?model=nova-3&interim_results=true"
+# Saves to recording_20250114_153022.json (or similar)
+```
+
 **Print basic transcript:**
 ```bash
 uv run print_transcript.py -f output.json
@@ -56,7 +71,7 @@ uv run print_transcript.py -f output.json
 **Print with all the details:**
 ```bash
 uv run print_transcript.py -f output.json \
-  --print-speakers --print-channels --print-interim --print-delay --colorize
+  --print-speakers --print-channels --print-interim --print-latency --colorize
 ```
 ```
 [18:30:24.066 (0.665s since EOS)] [00:00:00.00 - 00:00:03.48] [Speaker 0] [Channel 0] [IsFinal]: The missile knows where it is at all times.
@@ -74,24 +89,51 @@ It knows this because it knows where it isn't.
 ## Key Options
 
 ### stream_audio_file.py
-- `--ui` - Interactive terminal UI with live updates
-- `-f, --audio` - Audio file to stream
-- `-l, --live` - Stream from microphone
-- `-o, --output` - Save JSON messages to file
-- `-v, -vv, -vvv` - Increase verbosity
+
+| Option | Description |
+|--------|-------------|
+| `-u, --url` | Deepgram WebSocket URL (required) |
+| `--ui` | Interactive terminal UI with live updates |
+| `-f, --audio` | Audio file to stream |
+| `-l, --live` | Stream from microphone |
+| `-o, --output` | Save JSON messages to file (defaults to input filename or timestamped name) |
+| `-v, -vv, -vvv` | Increase verbosity |
 
 ### print_transcript.py
-- `--print-speakers` - Show speaker labels
-- `--print-channels` - Show audio channels
-- `--print-interim` - Include interim results
-- `--print-delay` - Show latency (time since end of speech)
-- `--colorize` - Color words by confidence
-- `--only-transcript` - Just the text, no metadata
+
+| Option | Description |
+|--------|-------------|
+| `--print-speakers` | Show speaker labels |
+| `--print-channels` | Show audio channels |
+| `--print-interim` | Include interim results |
+| `--print-received` | Show received timestamp for streamed messages |
+| `--print-latency` | Show latency metrics (TTFT, update frequency, message latency, EOT latency) |
+| `--print-entities` | Show detected entities |
+| `--colorize` | Color words by confidence |
+| `--only-transcript` | Just the text, no metadata |
 
 Run either script with `--help` for full options.
 
-## What's Happening?
+### Shell Completion
+
+Generate shell completions for your preferred shell:
+
+```bash
+uv run stream_audio_file.py completion bash  # or zsh, fish
+```
+
+## Metrics Calculated
+
+When using `--print-latency`, the following metrics are computed:
 
-The UI mode shows transcription speed in real-time - watch words appear as you speak and see exactly how fast Deepgram processes your audio. The `--print-delay` option reveals latency metrics, perfect for testing different models and configurations.
+**Session-level:**
+- **TTFT (Time To First Transcript)**: Wall-clock time from when audio streaming begins to when the first transcript message is received. Measures initial responsiveness.
+- **Update Frequency**: Number of interim transcript updates per second of audio. Higher values mean a more fluid, responsive transcription experience.
 
+**Per-message:**
+- **Message Latency**: How far the transcript lags behind the audio already sent, calculated as `audio_cursor - transcript_cursor`. Measured on interim results only, per Deepgram's methodology.
+- **EOT Latency (End-of-Turn Latency)**: Time between the last interim result and the finalizing event (e.g., `speech_final`, `UtteranceEnd`, `EndOfTurn`). Critical for voice agents, which can't respond until they know the user has stopped speaking.
+
+## What's Happening?
 
+The UI mode shows transcription speed in real time: watch words appear as you speak and see how quickly Deepgram processes your audio. The `--print-latency` option reports latency metrics, which is useful for comparing models and configurations.
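
The four metrics listed under "Metrics Calculated" can be illustrated with a minimal Python sketch. This is not the code from `print_transcript.py`. The `Message` record and all of its field names are hypothetical simplifications chosen for illustration, and the only formula taken directly from the README is `audio_cursor - transcript_cursor`. The other three functions are one plausible reading of the prose definitions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Message:
    """Hypothetical, simplified record of one received transcript message."""
    received_at: float         # wall-clock seconds when the message arrived
    audio_cursor: float        # seconds of audio sent by the time it arrived
    transcript_cursor: float   # end of the audio span the transcript covers
    is_interim: bool           # True for interim results
    is_finalizing: bool = False  # True for an event such as speech_final


def ttft(stream_start: float, messages: List[Message]) -> Optional[float]:
    """Wall-clock time from stream start to the first transcript message."""
    if not messages:
        return None
    return min(m.received_at for m in messages) - stream_start


def update_frequency(messages: List[Message], audio_seconds: float) -> float:
    """Number of interim updates per second of audio."""
    interim_count = sum(1 for m in messages if m.is_interim)
    return interim_count / audio_seconds if audio_seconds > 0 else 0.0


def message_latencies(messages: List[Message]) -> List[float]:
    """audio_cursor - transcript_cursor, computed on interim results only."""
    return [m.audio_cursor - m.transcript_cursor for m in messages if m.is_interim]


def eot_latencies(messages: List[Message]) -> List[float]:
    """Time from the last interim result to each finalizing event."""
    latencies: List[float] = []
    last_interim: Optional[float] = None
    for m in sorted(messages, key=lambda msg: msg.received_at):
        if m.is_finalizing:
            if last_interim is not None:
                latencies.append(m.received_at - last_interim)
                last_interim = None  # start tracking the next turn
        elif m.is_interim:
            last_interim = m.received_at
    return latencies
```

For example, two interim messages received at 10.5 s and 11.0 s followed by a finalizing message at 11.5 s, on a stream that started at 10.0 s, would give a TTFT of 0.5 s and an EOT latency of 0.5 s under these assumptions.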