50 changes: 31 additions & 19 deletions src/pages/docs/observe/concepts/voice-observability.mdx
@@ -5,27 +5,42 @@ description: "Turning every voice call into a trace you can debug and score."
 
 ## A voice call is a trace
 
-**Voice observability** captures each voice call as a [trace](/docs/observe/concepts/traces), the same tree of [spans](/docs/observe/concepts/spans) you get from a text app. One call becomes one trace: each back-and-forth turn is a span inside it, and the whole thing carries the transcript, the recording, and the call's duration, turn count, and cost. A spoken conversation lands in the same place as every other request, ready for the same evals, alerts, and filters.
+**Voice observability** captures each voice call as a [trace](/docs/observe/concepts/traces), the same tree of [spans](/docs/observe/concepts/spans) you get from a text app. One call becomes one trace, and each back-and-forth turn is a span inside it. The call carries the transcript, the recording, and its duration, turn count, and cost. A spoken conversation lands in the same place as every other request, ready for the same [evals](/docs/observe/features/evals), alerts, and filters.
 
-A voice call reaches Observe by one of two paths:
+## Inside a voice call
 
-- **Managed ingestion**, for hosted providers like Vapi and Retell. Connect the provider once with its API key and assistant ID and switch observability on. Observe pulls the provider's call logs and writes each finished call in as a trace, with no SDK and no code on your side.
-- **Auto-instrumentation**, for voice apps you build on LiveKit or Pipecat. Your app emits spans through [traceAI](/docs/observe/concepts/traceai), exactly like any other instrumented service.
-
-Either way you land on the same thing: one voice call you can open as a trace.
+A turn is more than a single step. In an app you instrument on LiveKit or Pipecat, a turn's span breaks down into speech-to-text, the model call, and text-to-speech, so you can see exactly where a turn went wrong, not just that it did. Managed calls arrive at the turn level, and the transcript and recording sit on the call either way.
 
 <Mermaid chart={`flowchart TD
-accTitle: Two paths turn a voice call into a trace
-accDescr: Vapi and Retell call logs are pulled into Observe as a trace. LiveKit and Pipecat apps emit spans through traceAI into the same trace. The trace holds the transcript, recording, turns, and cost.
-A["Vapi / Retell call"] -->|"Observe pulls call logs"| T["One voice call, as a trace"]
-B["LiveKit / Pipecat app"] -->|"traceAI emits spans"| T
-T --> D["Transcript · recording · turns · cost"]`} />
+accTitle: What a voice call looks like as a trace
+accDescr: A voice call is one trace made of turn spans. In an instrumented app each turn breaks down into speech-to-text, the model call, and text-to-speech. The call carries the transcript, recording, turn count, and cost.
+C["Voice call · one trace"] --> T1["Turn · span"]
+C --> T2["Turn · span"]
+T1 --> STT["speech to text"]
+T1 --> LLM["model call"]
+T1 --> TTS["text to speech"]
+C --> M["Transcript · recording · turns · cost"]`} />
+
+Because it is an ordinary trace, a voice call fits the same [observability model](/docs/observe/concepts/observability-model) as your text traces.
+
+## How a call reaches Observe
+
+A voice call reaches Observe by one of two paths. Whichever it takes, it lands as the same trace; what differs is who produces the spans.
 
-Take a support line running on a Vapi assistant. Observe pulls each finished call in as its own trace: read it top to bottom and you follow the conversation turn by turn, with the transcript and recording sitting right on the call and its duration, turn count, and cost totalled up. When a caller reports the agent misheard their order number, you open that one call, jump to the turn where it happened, and play the audio back, instead of guessing from a dashboard.
+| Path | For | How spans are produced | What you write |
+|---|---|---|---|
+| **Managed ingestion** | Hosted agents on Vapi or Retell | Observe pulls the provider's call logs | No code: connect the provider and turn observability on |
+| **Auto-instrumentation** | Apps built on LiveKit or Pipecat | Your app emits a span per turn through [traceAI](/docs/observe/concepts/traceai) | A few lines of traceAI setup |
+
+For the managed-ingestion setup, see [Voice observability](/docs/observe/features/voice).
+
+## Debugging a call
+
+Take a support line running on a Vapi assistant. Observe pulls each finished call in as its own trace, so you read it top to bottom and follow the conversation turn by turn. When a caller reports the agent misheard their order number, you open that one call, jump to the turn where it happened, and play the audio back, instead of guessing from a dashboard.
 
 ## When to use
 
-Reach for voice observability when what you are debugging is a spoken conversation: a caller who got the wrong answer, an agent that ran long, a call that cost more than it should. It earns its place when you want those calls sitting alongside the rest of your traces, ready to score and monitor.
+Reach for voice observability when what you are debugging is a spoken conversation: a caller who got the wrong answer, an agent that ran long, a call that cost more than it should. It also fits when you want those calls sitting alongside the rest of your traces, ready to score and monitor.
 
 When the grain is wrong, reach elsewhere:
 
@@ -34,18 +49,15 @@ When the grain is wrong, reach elsewhere:
 
 ## Why it matters
 
-Voice failures are the ones you hear about from a customer, not a log: the agent talked over the caller, misheard a number, or trailed off. A spoken call leaves nothing behind to inspect. Capturing it as a trace changes that. You get the transcript to read, the recording to play, and per-turn timing to see where it dragged, all on one call. And because it is just a trace, everything else works on it too: run [evals](/docs/observe/features/evals) on the conversation, set alerts when calls start failing, filter and export like anything else.
+Voice failures are the ones you hear about from a customer, not a log. A spoken call normally leaves nothing behind to inspect; capturing it as a trace changes that, so a complaint becomes a call you can open, read, and replay instead of a guess.
 
 ## Keep exploring
 
-<CardGroup cols={3}>
-<Card title="Trace a LiveKit voice app" icon="plug" href="/docs/traceai/auto/livekit">
-Auto-instrument a LiveKit voice agent through traceAI
-</Card>
+<CardGroup cols={2}>
 <Card title="Run evals on traces" icon="chart-line" href="/docs/observe/features/evals">
 Score voice conversations for quality and safety
 </Card>
-<Card title="Observability model" icon="compass" href="/docs/observe/concepts/observability-model">
+<Card title="Observability model" icon="layer-group" href="/docs/observe/concepts/observability-model">
 How spans, traces, sessions, and users fit together
 </Card>
 </CardGroup>
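
As a rough illustration of the structure the page describes (one trace per call, one span per turn, and speech-to-text, model-call, and text-to-speech children on instrumented turns), here is a minimal Python sketch. Every name in it (`Span`, `VoiceCallTrace`, `slowest_stage`, `recording_url`, and the other fields) is hypothetical and chosen for illustration; it is not the traceAI or Observe schema, and the sample values are made up.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Span:
    """One node in the trace tree (hypothetical shape, not the traceAI schema)."""
    name: str
    duration_ms: float
    children: List["Span"] = field(default_factory=list)


@dataclass
class VoiceCallTrace:
    """One voice call as one trace: call-level fields plus a span per turn."""
    transcript: str
    recording_url: str
    duration_s: float
    cost_usd: float
    turns: List[Span] = field(default_factory=list)

    @property
    def turn_count(self) -> int:
        return len(self.turns)


def slowest_stage(turn: Span) -> Optional[str]:
    """Name of the slowest child stage, or None when the turn has no children
    (as with a managed call that arrives at the turn level)."""
    if not turn.children:
        return None
    return max(turn.children, key=lambda s: s.duration_ms).name


# An instrumented turn breaks down into three stages (illustrative timings).
instrumented_turn = Span("turn", 1900.0, [
    Span("speech_to_text", 300.0),
    Span("model_call", 1200.0),
    Span("text_to_speech", 400.0),
])
# A managed-ingestion turn carries no per-stage children.
managed_turn = Span("turn", 1500.0)

call = VoiceCallTrace(
    transcript="Caller: my order number is 4 1 7.",
    recording_url="https://example.invalid/recording.wav",
    duration_s=3.4,
    cost_usd=0.02,
    turns=[instrumented_turn, managed_turn],
)

print(call.turn_count, slowest_stage(instrumented_turn), slowest_stage(managed_turn))
```

Keeping the transcript, recording, duration, and cost on the call object rather than on individual turns mirrors the page's statement that those fields sit on the call whichever ingestion path produced the spans.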