Commit 0228b80: Add livekit agents skill (#49)
---
name: livekit-agents
description: 'Build voice AI agents with LiveKit Cloud and the Agents SDK. Use when the user asks to "build a voice agent", "create a LiveKit agent", "add voice AI", "implement handoffs", "structure agent workflows", or is working with LiveKit Agents SDK. Provides opinionated guidance for the recommended path: LiveKit Cloud + LiveKit Inference. REQUIRES writing tests for all implementations.'
license: MIT
metadata:
  author: livekit
  version: "0.3.0"
---

# LiveKit Agents Development for LiveKit Cloud

This skill provides opinionated guidance for building voice AI agents with LiveKit Cloud. It assumes you are using LiveKit Cloud (the recommended path) and encodes *how to approach* agent development, not API specifics. All factual information about APIs, methods, and configurations must come from live documentation.

**This skill is for LiveKit Cloud developers.** If you're self-hosting LiveKit, some recommendations (particularly around LiveKit Inference) won't apply directly.

## MANDATORY: Read This Checklist Before Starting

Before writing ANY code, complete this checklist:

1. **Read this entire skill document** - Do not skip sections even if MCP is available
2. **Ensure your LiveKit Cloud project is connected** - You need `LIVEKIT_URL`, `LIVEKIT_API_KEY`, and `LIVEKIT_API_SECRET` from your Cloud project
3. **Set up documentation access** - Use MCP if available, otherwise use web search
4. **Plan to write tests** - Every agent implementation MUST include tests (see the testing section below)
5. **Verify all APIs against live docs** - Never rely on model memory for LiveKit APIs

This checklist applies regardless of whether MCP is available. MCP provides documentation access but does NOT replace the guidance in this skill.

## LiveKit Cloud Setup

LiveKit Cloud is the fastest way to get a voice agent running. It provides:

- Managed infrastructure (no servers to deploy)
- **LiveKit Inference** for AI models (no separate API keys needed)
- Built-in noise cancellation, turn detection, and other voice features
- Simple credential management

### Connect to Your Cloud Project

1. Sign up at [cloud.livekit.io](https://cloud.livekit.io) if you haven't already
2. Create a project (or use an existing one)
3. Get your credentials from the project settings:
   - `LIVEKIT_URL` - Your project's WebSocket URL (e.g., `wss://your-project.livekit.cloud`)
   - `LIVEKIT_API_KEY` - API key for authentication
   - `LIVEKIT_API_SECRET` - API secret for authentication
4. Set these as environment variables (typically in `.env.local`):

   ```bash
   LIVEKIT_URL=wss://your-project.livekit.cloud
   LIVEKIT_API_KEY=your-api-key
   LIVEKIT_API_SECRET=your-api-secret
   ```

The LiveKit CLI can automate credential setup. Consult the CLI documentation for current commands.
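
It also helps to fail fast on missing credentials rather than debug opaque connection errors later. A minimal sketch in plain Python (no LiveKit APIs involved; the variable names follow the `.env.local` example above):

```python
import os

# The three credentials a LiveKit Cloud project provides (see above).
REQUIRED_VARS = ("LIVEKIT_URL", "LIVEKIT_API_KEY", "LIVEKIT_API_SECRET")

def missing_credentials(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]

if __name__ == "__main__":
    missing = missing_credentials()
    if missing:
        raise SystemExit("Missing LiveKit credentials: " + ", ".join(missing))
    print("LiveKit credentials present")
```

Run this at worker startup so a misconfigured deployment exits immediately with a clear message.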

### Use LiveKit Inference for AI Models

**LiveKit Inference is the recommended way to use AI models with LiveKit Cloud.** It provides access to leading AI model providers—all through your LiveKit credentials, with no separate API keys needed.

Benefits of LiveKit Inference:

- No separate API keys to manage for each AI provider
- Billing consolidated through your LiveKit Cloud account
- Optimized for voice AI workloads

Consult the documentation for available models, supported providers, and current usage patterns. The documentation always has the most up-to-date information.

## Critical Rule: Never Trust Model Memory for LiveKit APIs

LiveKit Agents is a fast-evolving SDK. Model training data is outdated the moment it's created. When working with LiveKit:

- **Never assume** API signatures, method names, or configuration options from memory
- **Never guess** SDK behavior or default values
- **Always verify** against live documentation before writing code
- **Always cite** the documentation source when implementing features

This rule applies even when you are confident about an API. Verify anyway.

## REQUIRED: Use LiveKit MCP Server for Documentation

Before writing any LiveKit code, ensure access to the LiveKit documentation MCP server. It provides current, verified API information and prevents reliance on stale model knowledge.

### Check for MCP Availability

Look for `livekit-docs` MCP tools. If available, use them for all documentation lookups:

- Search documentation before implementing any feature
- Verify API signatures and method parameters
- Look up configuration options and their valid values
- Find working examples for the specific task at hand

### If MCP Is Not Available

If the LiveKit MCP server is not configured, inform the user and recommend installation. Installation instructions for all supported platforms are available at:

**https://docs.livekit.io/intro/mcp-server/**

Fetch the installation instructions appropriate for the user's coding agent from that page.

### Fallback When MCP Cannot Be Installed

If MCP cannot be installed in the current session:

1. **Inform the user immediately** that documentation cannot be verified in real time
2. Use web search to fetch current documentation from docs.livekit.io
3. **Explicitly mark all LiveKit-specific code** with a comment like `# UNVERIFIED: Please check docs.livekit.io for current API`
4. **State clearly** when you cannot verify something: "I cannot verify this API signature against current documentation"
5. Recommend that the user verify against https://docs.livekit.io before using the code

## Voice Agent Architecture Principles

Voice AI agents have fundamentally different requirements than text-based agents or traditional software. Internalize these principles:

### Latency Is Critical

Voice conversations are real-time. Users expect responses within hundreds of milliseconds, not seconds. Every architectural decision should consider latency impact:

- Minimize LLM context size to reduce inference time
- Avoid unnecessary tool calls during active conversation
- Prefer streaming responses over batch responses
- Design for the unhappy path (network delays, API timeouts)
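
Time-to-first-chunk is usually the number to watch: it is what the caller perceives as responsiveness. A framework-agnostic sketch in plain Python (the generator below is a hypothetical stand-in for whatever streaming response your pipeline produces):

```python
import time

def consume_stream(chunks):
    """Drain a streamed response, returning (seconds_to_first_chunk, full_text).

    Measures from the moment we start waiting, which is what the
    listener experiences as the response delay.
    """
    start = time.perf_counter()
    first_chunk_at = None
    parts = []
    for chunk in chunks:
        if first_chunk_at is None:
            first_chunk_at = time.perf_counter() - start
        parts.append(chunk)
    return first_chunk_at, "".join(parts)

def fake_llm_stream(delay_s=0.05):
    """Stand-in for a streaming LLM response (hypothetical timing)."""
    time.sleep(delay_s)  # simulated time to first token
    yield "Hi! "
    yield "How can I help?"
```

Logging this measurement on every turn in development makes latency regressions visible before they reach production.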

### Context Bloat Kills Performance

Large system prompts and extensive tool lists directly increase latency. A voice agent with 50 tools and a 10,000-token system prompt will feel sluggish regardless of model speed.

Design agents with the minimal viable context:

- Include only tools relevant to the current conversation phase
- Keep system prompts focused and concise
- Remove tools and context that aren't actively needed
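
One way to keep context minimal is to gate tools by conversation phase. A sketch in plain Python (the phase names and tool names are hypothetical, not from any LiveKit API):

```python
# Hypothetical mapping from conversation phase to the only tools that
# phase needs; everything else stays out of the LLM context.
PHASE_TOOLS = {
    "greeting": ["lookup_account"],
    "intake": ["lookup_account", "create_ticket"],
    "resolution": ["create_ticket", "escalate_to_human"],
}

def tools_for_phase(phase):
    """Return the minimal tool set for a phase (empty for unknown phases)."""
    return PHASE_TOOLS.get(phase, [])
```

The same idea applies to instructions: each phase gets only the prompt fragments it actually needs.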

### Users Don't Read, They Listen

Voice interface constraints differ from text:

- Long responses frustrate users—keep outputs concise
- Users cannot scroll back—ensure clarity on first delivery
- Interruptions are normal—design for graceful handling
- Silence feels broken—acknowledge processing when needed

## Workflow Architecture: Handoffs and Tasks

Complex voice agents should not be monolithic. LiveKit Agents supports structured workflows that maintain low latency while handling sophisticated use cases.

### The Problem with Monolithic Agents

A single agent handling an entire conversation flow accumulates:

- Tools for every possible action (bloated tool list)
- Instructions for every conversation phase (bloated context)
- State management for all scenarios (complexity)

This creates latency and reduces reliability.

### Handoffs: Agent-to-Agent Transitions

Handoffs allow one agent to transfer control to another. Use handoffs to:

- Separate distinct conversation phases (greeting → intake → resolution)
- Isolate specialized capabilities (general support → billing specialist)
- Manage context boundaries (each agent has only what it needs)

Design handoffs around natural conversation boundaries where context can be summarized rather than transferred wholesale.
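
The shape of a handoff can be sketched without any LiveKit APIs. Everything below is hypothetical (the agent names, the keyword routing rule, the last-few-turns summarizer); the real handoff mechanics come from the documentation:

```python
from dataclasses import dataclass

@dataclass
class Handoff:
    """An agent's decision to transfer the conversation."""
    target: str   # name of the next agent
    summary: str  # condensed context carried across the boundary

def summarize_for_handoff(transcript, max_turns=3):
    """Carry a short summary across the boundary, not the whole transcript.
    A production agent would likely use an LLM-generated summary; keeping
    the last few turns is a stand-in."""
    return " / ".join(transcript[-max_turns:])

def intake_agent(transcript):
    """Hypothetical intake agent: route billing questions to a specialist."""
    target = "billing" if any("invoice" in turn.lower() for turn in transcript) else "support"
    return Handoff(target=target, summary=summarize_for_handoff(transcript))
```

The key design choice: the `summary` field forces you to decide what context actually crosses the boundary, instead of forwarding everything by default.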

### Tasks: Scoped Operations

Tasks are tightly scoped prompts designed to achieve a specific outcome. Use tasks for:

- Discrete operations that don't require full agent capabilities
- Situations where a focused prompt outperforms a general-purpose agent
- Reducing context when only a specific capability is needed

Consult the documentation for implementation details on handoffs and tasks.
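
Conceptually, a task is a focused prompt plus an explicit notion of "done". A sketch (the dataclass and the email-collection example are illustrative, not a LiveKit API):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Task:
    """A tightly scoped prompt with an explicit completion check."""
    prompt: str
    is_complete: Callable[[str], bool]

# Illustrative task: collect one piece of information, nothing more.
collect_email = Task(
    prompt="Ask the caller for their email address and read it back to confirm.",
    is_complete=lambda reply: "@" in reply and "." in reply.split("@")[-1],
)
```

Making completion explicit is what keeps a task from drifting into a general-purpose conversation.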

## REQUIRED: Write Tests for Agent Behavior

Voice agent behavior is code. Every agent implementation MUST include tests. Shipping an agent without tests is shipping untested code.

### Mandatory Testing Workflow

When building or modifying a LiveKit agent:

1. **Create a `tests/` directory** if one doesn't exist
2. **Write at least one test** before considering the implementation complete
3. **Test the core behavior** the user requested
4. **Run the tests** to verify they pass

A minimal test file structure:

```
project/
├── agent.py (or src/agent.py)
└── tests/
    └── test_agent.py
```
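
Since the real testing helpers must be looked up in the docs, here is a framework-agnostic sketch of what `tests/test_agent.py` might assert. `respond` is a hypothetical stand-in for your agent's turn handler; swap in LiveKit's testing framework per the documentation:

```python
def respond(user_message):
    """Hypothetical agent under test: greet, otherwise ask for detail."""
    lowered = user_message.lower()
    if "hello" in lowered or "hi" in lowered:
        return "Hi! How can I help you today?"
    return "Could you tell me a bit more about that?"

def test_greeting():
    # Basic conversation flow: a greeting gets a helpful reply.
    assert "help" in respond("Hello there").lower()

def test_unclear_input():
    # Error handling: garbage input gets a clarifying question, not a crash.
    assert "more" in respond("asdfghjkl").lower()
```

Even this minimal shape catches the most common regression: a prompt edit that silently changes how the agent opens a conversation.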

### Test-Driven Development Process

When modifying agent behavior—instructions, tool descriptions, workflows—begin by writing tests for the desired behavior:

1. Define what the agent should do in specific scenarios
2. Write test cases that verify this behavior
3. Implement the feature
4. Iterate until the tests pass

This approach prevents shipping agents that "seem to work" but fail in production.

### What Every Agent Test Should Cover

At minimum, write tests for:

- **Basic conversation flow**: The agent responds appropriately to a greeting
- **Error handling**: The agent handles unexpected input gracefully
- **Tool invocation** (if tools exist): Does the agent call the right tools with correct parameters?

As the agent grows, also cover:

- **Response quality**: Does the agent produce appropriate responses for given inputs?
- **Workflow transitions**: Do handoffs and tasks trigger correctly?
- **Edge cases**: How does the agent handle unexpected input, interruptions, and silence?
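
Tool-invocation tests become straightforward with a recording test double. A sketch in plain Python (the tool name and the agent stub are hypothetical; LiveKit's framework has its own assertion helpers, which the docs cover):

```python
class RecordingTool:
    """Test double: records every call so tests can assert on parameters."""

    def __init__(self, name, result=None):
        self.name = name
        self.result = result
        self.calls = []  # one kwargs dict per invocation

    def __call__(self, **kwargs):
        self.calls.append(kwargs)
        return self.result

# Hypothetical tool the agent under test would call mid-turn.
lookup_account = RecordingTool("lookup_account", result={"status": "active"})

def handle_turn(user_message, lookup):
    """Stub agent turn: call the tool when the user mentions their account."""
    if "account" in user_message.lower():
        info = lookup(account_id="42")
        return f"Your account is {info['status']}."
    return "How can I help?"
```

Asserting on `lookup_account.calls` verifies both that the tool fired and that it received the parameters you expected.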

### Test Implementation Pattern

Use LiveKit's testing framework. Consult the testing documentation via MCP for current patterns:

```
search: "livekit agents testing"
```

The framework supports:

- Simulated user input
- Verification of agent responses
- Tool call assertions
- Workflow transition testing

### Why This Is Non-Negotiable

Agents that "seem to work" in manual testing frequently fail in production:

- Prompt changes silently break behavior
- Tool descriptions affect when tools are called
- Model updates change response patterns

Tests catch these issues before users do.

### Skipping Tests

If a user explicitly requests no tests, proceed without them but inform them:

> "I've built the agent without tests as requested. I strongly recommend adding tests before deploying to production. Voice agents are difficult to verify manually, and tests prevent silent regressions."

## Common Mistakes to Avoid

### Overloading the Initial Agent

Starting with one agent that "does everything" and adding tools and instructions over time. Instead, design the workflow structure upfront, even if the initial implementation is simple.

### Ignoring Latency Until It's a Problem

Latency issues compound. An agent that feels "a bit slow" in development becomes unusable in production under real network conditions. Measure and optimize latency continuously.

### Copying Examples Without Understanding

Examples in documentation demonstrate specific patterns. Copying code without understanding its purpose leads to bloated, poorly structured agents. Understand what each component does before including it.

### Skipping Tests Because "It's Just Prompts"

Agent behavior is code. Prompt changes affect behavior as much as code changes. Test agent behavior with the same rigor as traditional software. **Never deliver an agent implementation without at least one test file.**

### Assuming Model Knowledge Is Current

To reiterate the critical rule: never trust model memory for LiveKit APIs. The SDK evolves faster than model training cycles. Verify everything.

## When to Consult Documentation

**Always consult documentation for:**

- API method signatures and parameters
- Configuration options and their valid values
- SDK version-specific features or changes
- Deployment and infrastructure setup
- Model provider integration details
- CLI commands and flags

**This skill provides guidance on:**

- Architectural approach and design principles
- Workflow structure decisions
- Testing strategy
- Common pitfalls to avoid

The distinction matters: this skill tells you *how to think* about building voice agents. The documentation tells you *how to implement* specific features.

## Feedback Loop

When using LiveKit documentation via MCP, note any gaps, outdated information, or confusing content. Reporting documentation issues helps improve the ecosystem for all developers.

## Summary

Building effective voice agents with LiveKit Cloud requires:

1. **Use LiveKit Cloud + LiveKit Inference** as the foundation—it's the fastest path to production
2. **Verify everything** against live documentation—never trust model memory
3. **Minimize latency** at every architectural decision point
4. **Structure workflows** using handoffs and tasks to manage complexity
5. **Test behavior** before and after changes—never ship without tests
6. **Keep context minimal**—only include what's needed for the current phase

These principles remain valid regardless of SDK version or API changes. For all implementation specifics, consult the LiveKit documentation via MCP.
