diff --git a/src/pages/docs/observe/concepts/voice-observability.mdx b/src/pages/docs/observe/concepts/voice-observability.mdx
index 747272cb..a2f2571e 100644
--- a/src/pages/docs/observe/concepts/voice-observability.mdx
+++ b/src/pages/docs/observe/concepts/voice-observability.mdx
@@ -5,27 +5,42 @@ description: "Turning every voice call into a trace you can debug and score."
 ## A voice call is a trace
-**Voice observability** captures each voice call as a [trace](/docs/observe/concepts/traces), the same tree of [spans](/docs/observe/concepts/spans) you get from a text app. One call becomes one trace: each back-and-forth turn is a span inside it, and the whole thing carries the transcript, the recording, and the call's duration, turn count, and cost. A spoken conversation lands in the same place as every other request, ready for the same evals, alerts, and filters.
+**Voice observability** captures each voice call as a [trace](/docs/observe/concepts/traces), the same tree of [spans](/docs/observe/concepts/spans) you get from a text app. One call becomes one trace, and each back-and-forth turn is a span inside it. The call carries the transcript, the recording, and its duration, turn count, and cost. A spoken conversation lands in the same place as every other request, ready for the same [evals](/docs/observe/features/evals), alerts, and filters.
-A voice call reaches Observe by one of two paths:
+## Inside a voice call
-- **Managed ingestion**, for hosted providers like Vapi and Retell. Connect the provider once with its API key and assistant ID and switch observability on. Observe pulls the provider's call logs and writes each finished call in as a trace, with no SDK and no code on your side.
-- **Auto-instrumentation**, for voice apps you build on LiveKit or Pipecat. Your app emits spans through [traceAI](/docs/observe/concepts/traceai), exactly like any other instrumented service.
-
-Either way you land on the same thing: one voice call you can open as a trace.
+A turn is more than a single step. In an app you instrument on LiveKit or Pipecat, a turn's span breaks down into speech-to-text, the model call, and text-to-speech, so you can see exactly where a turn went wrong, not just that it did. Managed calls arrive at the turn level, and the transcript and recording sit on the call either way.
|"Observe pulls call logs"| T["One voice call, as a trace"]
- B["LiveKit / Pipecat app"] -->|"traceAI emits spans"| T
- T --> D["Transcript · recording · turns · cost"]`} />
+ accTitle: What a voice call looks like as a trace
+ accDescr: A voice call is one trace made of turn spans. In an instrumented app each turn breaks down into speech-to-text, the model call, and text-to-speech. The call carries the transcript, recording, turn count, and cost.
+ C["Voice call · one trace"] --> T1["Turn · span"]
+ C --> T2["Turn · span"]
+ T1 --> STT["speech to text"]
+ T1 --> LLM["model call"]
+ T1 --> TTS["text to speech"]
+ C --> M["Transcript · recording · turns · cost"]`} />
+
+Because it is an ordinary trace, a voice call fits the same [observability model](/docs/observe/concepts/observability-model) as your text traces.
+
+## How a call reaches Observe
+
+A voice call reaches Observe by one of two paths. Whichever it takes, it lands as the same trace; what differs is who produces the spans.
-Take a support line running on a Vapi assistant. Observe pulls each finished call in as its own trace: read it top to bottom and you follow the conversation turn by turn, with the transcript and recording sitting right on the call and its duration, turn count, and cost totalled up. When a caller reports the agent misheard their order number, you open that one call, jump to the turn where it happened, and play the audio back, instead of guessing from a dashboard.
+| Path | For | How spans are produced | What you write |
+|---|---|---|---|
+| **Managed ingestion** | Hosted agents on Vapi or Retell | Observe pulls the provider's call logs | No code: connect the provider and turn observability on |
+| **Auto-instrumentation** | Apps built on LiveKit or Pipecat | Your app emits a span per turn through [traceAI](/docs/observe/concepts/traceai) | A few lines of traceAI setup |
+
+For the managed-ingestion setup, see [Voice observability](/docs/observe/features/voice).
+
+## Debugging a call
+
+Take a support line running on a Vapi assistant. Observe pulls each finished call in as its own trace, so you read it top to bottom and follow the conversation turn by turn. When a caller reports the agent misheard their order number, you open that one call, jump to the turn where it happened, and play the audio back, instead of guessing from a dashboard.
 ## When to use
-Reach for voice observability when what you are debugging is a spoken conversation: a caller who got the wrong answer, an agent that ran long, a call that cost more than it should. It earns its place when you want those calls sitting alongside the rest of your traces, ready to score and monitor.
+Reach for voice observability when what you are debugging is a spoken conversation: a caller who got the wrong answer, an agent that ran long, a call that cost more than it should. It also fits when you want those calls sitting alongside the rest of your traces, ready to score and monitor.
 When the grain is wrong, reach elsewhere:
@@ -34,18 +49,15 @@ When the grain is wrong, reach elsewhere:
 ## Why it matters
-Voice failures are the ones you hear about from a customer, not a log: the agent talked over the caller, misheard a number, or trailed off. A spoken call leaves nothing behind to inspect. Capturing it as a trace changes that. You get the transcript to read, the recording to play, and per-turn timing to see where it dragged, all on one call. And because it is just a trace, everything else works on it too: run [evals](/docs/observe/features/evals) on the conversation, set alerts when calls start failing, filter and export like anything else.
+Voice failures are the ones you hear about from a customer, not a log. A spoken call normally leaves nothing behind to inspect; capturing it as a trace changes that, so a complaint becomes a call you can open, read, and replay instead of a guess.
 ## Keep exploring
-
-
- Auto-instrument a LiveKit voice agent through traceAI
-
+
 Score voice conversations for quality and safety
-
+
 How spans, traces, sessions, and users fit together
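
Reviewer note: the "Inside a voice call" section added by this diff describes a call as one trace whose turns are spans, with instrumented turns breaking down into speech-to-text, the model call, and text-to-speech. A minimal Python sketch of that shape might help readers picture it. Every name here (`Span`, `VoiceCallTrace`, `slowest_stage`, the field names, the sample durations) is hypothetical and chosen for illustration only; none of it is the traceAI API or Observe's actual schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Span:
    """Hypothetical span: a named step with a duration and optional child spans."""
    name: str
    duration_ms: float
    children: List["Span"] = field(default_factory=list)


@dataclass
class VoiceCallTrace:
    """Hypothetical trace for one voice call: turn spans plus call-level data."""
    call_id: str
    turns: List[Span]
    transcript: str = ""
    cost_usd: float = 0.0

    @property
    def turn_count(self) -> int:
        return len(self.turns)

    @property
    def duration_ms(self) -> float:
        # Illustrative only: sums turn durations and ignores gaps between turns.
        return sum(turn.duration_ms for turn in self.turns)


def slowest_stage(turn: Span) -> Optional[Span]:
    """Return the slowest child stage of a turn, or None if it has no breakdown."""
    return max(turn.children, key=lambda s: s.duration_ms, default=None)


# Made-up sample data: one instrumented turn with a stage breakdown,
# and one turn that arrives at the turn level only (as managed calls do).
call = VoiceCallTrace(
    call_id="call-001",
    turns=[
        Span("turn-1", 1900.0, [
            Span("speech to text", 300.0),
            Span("model call", 1200.0),
            Span("text to speech", 400.0),
        ]),
        Span("turn-2", 900.0),
    ],
)
```

With a structure like this, "where did the turn go wrong" becomes a lookup on the turn's children: `slowest_stage(call.turns[0])` picks the model call in the sample data, while a turn-level-only span has no stage to pick.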