Pydantic Agents

Render Developer Q&A Assistant showcasing observable AI with Pydantic Agents, Pydantic Embedder, and Logfire

An intelligent question-answering system that demonstrates real-world AI observability patterns. This example project shows how to build, instrument, and monitor a multi-stage LLM pipeline with full cost tracking, quality evaluation, and performance monitoring.

What This App Does

This is an AI-powered Q&A assistant for Render documentation. Users can ask questions about Render's platform, and the app provides accurate, well-researched answers backed by the official documentation.

User Experience

Ask a question - "How do I deploy a Node.js app on Render?" or "What database plans are available?"
Watch the pipeline - Track progress as the run moves through all 7 stages, grouped as embedding → retrieval → generation → verification
Get accurate answers - Receive detailed responses with sources from Render docs
Quality checked - Every answer is verified against the retrieved sources and rated by two AI evaluators (OpenAI and Anthropic)

Key Features

Hybrid search - Combines semantic understanding with keyword matching for better retrieval
Three verification capabilities - Each answer goes through Grounding (extract claims, verify them against the retrieved sources), Accuracy (a factual-correctness review), and Quality (a dual-model developer-experience rating) — distinct checks, not redundant ones
Neutral, grounded answers - The assistant answers only from retrieved documentation, with no product-favorable steering in the prompt; relevant Render docs surface through retrieval rather than being pushed by the prompt
Cost tracking - See exactly how much each question costs to answer
Concurrent verification - The Accuracy + dual-model Quality checks run concurrently (a single asyncio.gather) so the three independent LLM calls overlap
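The Grounding check can be pictured with a small sketch. This is not the project's code: it assumes claims and source chunks are already embedded as plain float lists, and that a claim counts as supported when its best cosine similarity against any source meets a threshold. The function names and the 0.75 default are hypothetical; the project exposes a VERIFICATION_THRESHOLD setting, but how it is applied is an assumption here.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def grounded_fraction(
    claim_vectors: list[list[float]],
    source_vectors: list[list[float]],
    threshold: float = 0.75,  # hypothetical value, not taken from the project
) -> float:
    """Fraction of claims whose best-matching source meets the threshold."""
    if not claim_vectors:
        return 0.0  # nothing was extracted, so nothing counts as verified
    supported = 0
    for claim in claim_vectors:
        best = max(
            (cosine_similarity(claim, source) for source in source_vectors),
            default=0.0,
        )
        if best >= threshold:
            supported += 1
    return supported / len(claim_vectors)
```

A score like this is cheap to compute per answer, which is why it pairs well with the more expensive LLM-based Accuracy and Quality checks.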

What This Demonstrates

Render Capabilities

PostgreSQL with pgvector + full-text - Managed hybrid search database
Web Service + Static Site - FastAPI backend (pipeline runs in-process) + Next.js frontend
Cron Jobs - Scheduled ingestion refresh that re-embeds the live RAG sources
Blueprint deploy + env groups - render.yaml provisions everything; shared config lives in one env group
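How the vector and full-text rankings are merged is covered in docs/HYBRID_SEARCH.md, not here. As an illustration only, the sketch below uses reciprocal rank fusion (RRF), one common way to combine two ranked lists. Treating RRF as the merge step is an assumption, and the function name and the constant k=60 are not taken from the project.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of document ids into one ranking.

    Each document scores 1 / (k + rank) for every list it appears in,
    so documents ranked well by both retrievers rise to the top.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest score first; ties broken by id so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result ids from a vector search and a keyword search.
vector_hits = ["a", "b", "c"]
keyword_hits = ["b", "d", "a"]
merged = reciprocal_rank_fusion([vector_hits, keyword_hits])
```

Documents that appear in both lists ("a" and "b" here) outrank documents found by only one retriever, which is the behavior hybrid search is after.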

Logfire Features

LLM Traces - Complete visibility into every AI call (OpenAI + Anthropic auto-instrumented)
HTTP Tracing - FastAPI auto-instrumentation for request/response tracking
Database Monitoring - AsyncPG auto-instrumentation for query performance
Cost Tracking - Per-stage and per-execution cost attribution with custom metrics
Multi-Model Evals - Dual-rater quality assessment (OpenAI + Anthropic)
Session Tracking - End-to-end user journey with distributed tracing
Custom Metrics - Business-specific metrics (cost, quality, accuracy)
SQL Queries - Custom analytics on AI performance

Pydantic Stack

This project is built end-to-end on the Pydantic ecosystem:

Pydantic AI Agents — every pipeline stage (generation, claims extraction, accuracy check, dual-rater evaluation) is a pydantic_ai.Agent with a typed output_type. Multi-provider orchestration (Claude + GPT) runs through OpenAIProvider / AnthropicProvider in a single pipeline. See backend/pipeline/.
Pydantic Embedder — pydantic_ai.Embedder with OpenAIEmbeddingModel powers question embedding (embed_query) and batch claim embedding (embed_documents) for verification. Auto-instrumented by logfire.instrument_pydantic_ai(). See backend/pipeline/embeddings.py and backend/pipeline/verification.py.
Pydantic Models — Claims, accuracy scores, eval dimensions, and pipeline state are parsed directly into Pydantic models (e.g. ClaimsOutput, EvaluationOutput). pydantic-settings manages config in backend/config.py.
Pydantic GenAI Prices — model pricing is loaded dynamically from the pydantic/genai-prices registry, then combined with per-agent token counts from result.usage() to produce per-stage cost attribution. See backend/prices.py.
Logfire — distributed traces, custom metrics, dual-model evals, and cost attribution. Auto-instruments FastAPI, AsyncPG, HTTPX, and Pydantic AI. See backend/observability.py.
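Per-stage cost attribution comes down to simple arithmetic: token counts multiplied by per-token prices, summed across stages. The sketch below shows that arithmetic with the standard library only. The StageUsage type, the function names, and the prices are hypothetical placeholders; the project reads real prices from the genai-prices registry and token counts from result.usage().

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class StageUsage:
    """Token usage for one pipeline stage plus placeholder prices."""

    stage: str
    input_tokens: int
    output_tokens: int
    input_price_per_mtok: Decimal   # USD per 1M input tokens (placeholder)
    output_price_per_mtok: Decimal  # USD per 1M output tokens (placeholder)


def stage_cost(usage: StageUsage) -> Decimal:
    """Cost in USD for a single stage."""
    total = (
        usage.input_tokens * usage.input_price_per_mtok
        + usage.output_tokens * usage.output_price_per_mtok
    )
    return total / Decimal(1_000_000)


def run_cost(usages: list[StageUsage]) -> dict[str, Decimal]:
    """Per-stage costs keyed by stage name, plus a 'total' entry."""
    costs = {usage.stage: stage_cost(usage) for usage in usages}
    costs["total"] = sum(costs.values(), Decimal(0))
    return costs
```

Decimal is used instead of float so that many small per-call amounts add up without rounding drift.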

Architecture

The frontend connects to a FastAPI backend that runs the 7-stage Q&A pipeline in-process as a background task. POST /ask launches the run and returns immediately; the frontend polls GET /ask/{run_id} for live per-stage progress and the final result.
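The launch-then-poll flow can be sketched as a small client loop. It is written in Python for consistency with the backend, although the real client is the Next.js frontend. The "status" field and its terminal values are assumptions about the response shape, and the fetch function is injected so the sketch runs without a server.

```python
import time
from typing import Callable


def poll_run(
    fetch_status: Callable[[], dict],
    interval_s: float = 1.0,
    timeout_s: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_status until the run reaches a terminal state.

    fetch_status stands in for GET /ask/{run_id}. The 'status' key and
    the values 'completed' and 'failed' are hypothetical.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        payload = fetch_status()
        if payload.get("status") in {"completed", "failed"}:
            return payload
        if time.monotonic() >= deadline:
            raise TimeoutError("run did not finish before the timeout")
        sleep(interval_s)
```

Because POST /ask returns immediately, a loop like this is the only place the client waits, and the timeout keeps a stuck run from hanging the UI indefinitely.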

┌─────────────────────────────────────────────────────────────┐
│  Frontend (Next.js + TypeScript)                            │
│  Deployed as: Render Static Site                            │
│  - Question input UI                                        │
│  - Progress via polling (POST /ask → poll GET /ask/{id})    │
│  - Answer display with metrics                              │
└─────────────────────────────────────────────────────────────┘
                          ↓ HTTPS
┌─────────────────────────────────────────────────────────────┐
│  Backend (FastAPI + Logfire)                                │
│  Deployed as: Render Web Service (Python 3.13)              │
│  - POST /ask        → launch run_qa_pipeline (background)   │
│  - GET  /ask/{id}   → read run status + progress            │
│  - /health, /history, /stats, /sessions/{id}/logs           │
│                                                             │
│  In-process orchestrator: backend/pipeline/orchestrator.py  │
│  ┌────────────────────────────────────────────────────────┐ │
│  │ Retrieval  [1] Question Embedding  (OpenAI)            │ │
│  │            [2] RAG Retrieval (pgvector+BM25)           │ │
│  │ Generate   [3] Answer Generation   (Claude)            │ │
│  │ Grounding  [4] Claims Extraction   (GPT)               │ │
│  │            [5] Claims Verification (RAG)               │ │
│  │ Accuracy   [6] Factual-grounding   (Claude)  ┐         │ │
│  │ Quality    [7] Dual-model rating   (OpenAI+  ├─ gather │ │
│  │                                    Anthropic)┘ parallel│ │
│  └────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
            ↓                                    ↓
┌──────────────────────┐           ┌───────────────────────────┐
│  PostgreSQL          │           │  Logfire                  │
│  (Render Managed)    │           │  (Pydantic)               │
│  - pgvector ext      │           │  - Distributed traces     │
│  - RAG embeddings    │           │  - Cost attribution       │
│  - Full-text search  │           │  - Quality metrics        │
│  - pipeline_runs +   │           │  - Custom dashboards      │
│    pipeline_progress │           └───────────────────────────┘
└──────────────────────┘

  Cron (daily) ─ data/scripts/ingest_pages.py ─▶ re-embed live sources

In-process, not a separate service. The pipeline is plain async code: stages 1–5 run sequentially, and the three independent verification calls (6 + 7) overlap via a single asyncio.gather. POST /ask records a row in pipeline_runs and runs the orchestrator as a detached task; GET /ask/{run_id} reads that row plus the live pipeline_progress updates. See backend/pipeline/orchestrator.py.
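The concurrent verification step described above can be sketched as follows. The three coroutines are hypothetical stand-ins for the real Accuracy and dual-model Quality agents, and asyncio.sleep replaces the LLM calls; only the use of a single asyncio.gather comes from the description.

```python
import asyncio


async def check_accuracy(answer: str) -> dict:
    """Stand-in for the Accuracy review (stage 6)."""
    await asyncio.sleep(0.01)  # simulates waiting on an LLM call
    return {"check": "accuracy"}


async def rate_quality(answer: str, rater: str) -> dict:
    """Stand-in for one of the two Quality raters (stage 7)."""
    await asyncio.sleep(0.01)  # simulates waiting on an LLM call
    return {"check": f"quality:{rater}"}


async def verify(answer: str) -> list[dict]:
    """Run the three independent checks concurrently.

    gather returns results in argument order, regardless of which
    call finishes first.
    """
    return await asyncio.gather(
        check_accuracy(answer),
        rate_quality(answer, "openai"),
        rate_quality(answer, "anthropic"),
    )
```

Since the three calls do not depend on each other, the wall-clock time of this step is roughly that of the slowest call rather than the sum of all three.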

Project Structure

render-qa-assistant/
├── backend/
│   ├── main.py                    # FastAPI app (launches + polls in-process runs)
│   ├── api/
│   │   └── logs.py                # Logfire logs API endpoint
│   ├── pipeline/                  # 7-stage pipeline implementation
│   │   └── orchestrator.py        # In-process run_qa_pipeline orchestrator
│   ├── ingestion.py               # Shared embed + replace-by-source helpers
│   ├── models.py                  # Pydantic models
│   ├── database.py                # PostgreSQL + pgvector (+ pipeline_runs/progress)
│   ├── observability.py           # Logfire configuration
│   └── config.py                  # Settings management
├── frontend/
│   ├── src/                       # Next.js + TypeScript UI
│   └── package.json
├── data/
│   ├── embeddings/                # Pre-embedded documentation
│   ├── curated/                   # Hand-curated source content (markdown)
│   ├── sources.py                 # Live-source registry (build strategies + metadata)
│   └── scripts/                   # Data ingestion scripts
├── docs/
│   ├── PIPELINE.md                # Detailed pipeline guide
│   ├── OBSERVABILITY.md           # Logfire instrumentation guide
│   ├── CONFIGURATION.md           # Configuration reference
│   └── HYBRID_SEARCH.md           # Hybrid search deep-dive
├── pyproject.toml                 # Python dependencies (uv)
├── uv.lock                        # Locked dependency versions
├── .python-version                # Pins Python to 3.13
├── render.yaml                    # Infrastructure as code
├── .env.example                   # Environment variables template
└── README.md                      # This file

Quick Start

Prerequisites

uv (manages Python 3.13 automatically)
Node.js 18+
PostgreSQL 16+ (with pgvector extension >= 0.5.0, for the HNSW vector index)
OpenAI API key
Anthropic API key
Logfire account — sign in at logfire.pydantic.dev, create a project (US region), then:
1. Settings → Write Tokens → create a token → LOGFIRE_TOKEN in .env
2. Settings → Read Tokens → create a token → LOGFIRE_READ_TOKEN in .env
3. View traces in the Live panel under your project

Local Development (with Make)

# 1. Install everything (uv installs Python 3.13 automatically)
make install

# 2. Set up .env file (copy from example and fill in your keys)
cp .env.example .env

# 3. Start database
make db-start

# 4. Load documentation (this step might take a while!)
make ingest

# 5. Run backend (in one terminal)
make run-backend

# 6. Run frontend (in another terminal)
make run-frontend

The full stack runs locally — no Render cloud resources, no extra API key. The Q&A pipeline runs in-process inside the backend, so POST /ask works against your local DATABASE_URL and History populates normally. make run-backend and make run-frontend (steps 5–6 above) are all you need; ask questions at http://localhost:3000.

Local config → deployed env group. Locally every process reads one .env (copied from .env.example). On deploy that same config lives in the Render env group in render.yaml — see Deploy → Environment groups. The Docker DATABASE_URL isn't in the group; in the cloud DATABASE_URL is injected from the database.

make ingest runs the full ingestion pipeline: bulk doc embeddings, plus the curated "special pages" that are explicitly injected into the RAG context (pricing, AI agent, autoscaling, Node.js). These live sources are defined in the data/sources.py registry and ingested through the shared build → embed → replace-by-source helpers. To reload just one of them after editing its registry entry (or curated content), use the per-target shortcuts:

make add-pricing      # render.com/pricing tables
make add-ai-agent     # render.com/tutorials/agents-on-render-workflows (AI agents → Render Workflows)
make add-autoscaling  # render.com/docs/scaling
make add-nodejs       # render.com/docs/deploy-node-express-app
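What "replace-by-source" means is not spelled out here. One plausible reading, assumed in this sketch, is that re-ingesting a source first drops every stored chunk for that source and then writes the fresh ones, so repeated runs never accumulate stale or duplicate rows. The function name and the in-memory list standing in for the database table are hypothetical.

```python
def replace_by_source(store: list[dict], source: str, new_chunks: list[str]) -> list[dict]:
    """Return a new store where all rows for `source` are replaced.

    Rows belonging to other sources are kept untouched, which is what
    makes re-running a single per-target shortcut safe.
    """
    kept = [row for row in store if row["source"] != source]
    fresh = [{"source": source, "text": text} for text in new_chunks]
    return kept + fresh


# Hypothetical rows: one stale pricing chunk and one unrelated chunk.
store = [
    {"source": "pricing", "text": "old pricing table"},
    {"source": "scaling", "text": "autoscaling guide"},
]
updated = replace_by_source(store, "pricing", ["new plan A", "new plan B"])
```

Running the same replacement twice yields the same store, so a daily refresh job can be retried without side effects.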

Access locally: the frontend runs at http://localhost:3000.

Deploy to Render

1. Set up a Logfire account.

Before clicking the deploy button, sign in at logfire.pydantic.dev, create a project (US region), and generate two tokens:

Preferences → Write Tokens → create token → save as LOGFIRE_TOKEN
Preferences → Read Tokens → create token → save as LOGFIRE_READ_TOKEN

You'll paste both into the Render Dashboard in step 3.

2. One-click deploy

Render reads render.yaml and provisions:

PostgreSQL database with pgvector (pydantic-agents-db)
Backend web service (pydantic-agents-api, FastAPI + Logfire — runs the pipeline in-process)
Ingestion refresh cron (pydantic-agents-ingest, re-embeds the live sources daily)
Frontend static site (pydantic-agents-frontend, Next.js)
One environment group that holds all shared config (see below)

On Apply, Render prompts once for the secret values in the env group (OPENAI_API_KEY, ANTHROPIC_API_KEY, and both Logfire tokens). You fill these at the group level, not per service.

Environment groups

render.yaml defines one reusable env group so config lives in one place instead of being duplicated across services:

| Group | Contents | Linked to |
| --- | --- | --- |
| `pydantic-agents-pipeline` | LLM/Logfire secrets + all pipeline, RAG, and model config (~18 vars) | Backend API and the ingest cron |

The backend and the cron both run the same backend.config.Settings, so they need identical config — linking one group beats pasting ~18 variables per service. DATABASE_URL stays per-service (it's injected from the database, which can't live in a group), and the frontend's NEXT_PUBLIC_API_URL stays inline (unique, build-time).

3. Fill in the env-group values

Because the backend and cron read everything from the env group, you set values on the group, not on each service — every linked service picks them up automatically. Set the four secrets once, when you apply the Blueprint in step 2:

| Variable | Source |
| --- | --- |
| `OPENAI_API_KEY` | platform.openai.com |
| `ANTHROPIC_API_KEY` | console.anthropic.com |
| `LOGFIRE_TOKEN` | Logfire write token from step 1 |
| `LOGFIRE_READ_TOKEN` | Logfire read token from step 1 |

The first three are required — the services crash on startup without them (no defaults in backend/config.py).
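The fail-fast behavior can be illustrated with a standard-library sketch. The project itself uses pydantic-settings in backend/config.py, so this is not its implementation; the function name is hypothetical, and only the three variable names come from the table above.

```python
import os
from typing import Mapping, Optional

REQUIRED = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "LOGFIRE_TOKEN")


def load_required(env: Optional[Mapping[str, str]] = None) -> dict[str, str]:
    """Return the required secrets, or raise if any is missing or empty.

    Raising at startup surfaces a misconfigured env group immediately
    instead of failing later on the first LLM call.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return {name: env[name] for name in REQUIRED}
```

Passing a plain dict instead of os.environ makes the check easy to exercise in tests without touching the real environment.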

Edit the group under Dashboard → Env Groups → pydantic-agents-pipeline. Saving re-deploys every service linked to it.

Auto-filled, no action needed: DATABASE_URL (injected from the database service) and the rest of pydantic-agents-pipeline's config (MAX_TOKENS, TIMEOUT_SECONDS, RAG_TOP_K, SIMILARITY_THRESHOLD, VERIFICATION_THRESHOLD, EMBEDDING_MODEL, EMBEDDING_DIMENSIONS, the model-selection vars, ENABLE_CACHING, LOG_LEVEL) ship with sensible defaults in render.yaml.

4. Wire the frontend to the backend

After the backend deploys, copy its public URL from the service's Dashboard page and set it as the NEXT_PUBLIC_API_URL env var on the frontend service, then redeploy the frontend so the value takes effect. For this deploy that's:

NEXT_PUBLIC_API_URL=https://pydantic-agents-api.onrender.com

Use the base origin only — no trailing slash and no /api path (the frontend appends /ask, /health, etc. itself). If your service name isn't globally unique, Render adds a random suffix (…-api-xxxx.onrender.com), so always copy the exact URL shown in the Dashboard.
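The "base origin only" rule is easy to get wrong when pasting a URL, so a quick check can help before redeploying. The helper below is a hypothetical sketch using the standard library and is not part of the project.

```python
from urllib.parse import urlsplit


def is_base_origin(url: str) -> bool:
    """True when url is scheme + host only: no path, query, or fragment.

    A trailing slash counts as a path ("/"), so it is rejected too.
    """
    parts = urlsplit(url)
    return (
        parts.scheme in {"http", "https"}
        and bool(parts.netloc)
        and parts.path == ""
        and not parts.query
        and not parts.fragment
    )
```

For example, https://pydantic-agents-api.onrender.com passes, while the same URL with a trailing slash or an /api suffix does not.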

5. Done — the corpus seeds itself

The backend's preDeployCommand runs data/scripts/ingest_pages.py on every deploy, loading the core corpus and re-embedding the live sources, so the database is seeded by the time the service goes live. The daily pydantic-agents-ingest cron keeps it fresh. Ask a question at your frontend URL — for example, "How do I deploy an AI agent on Render?"

Documentation

Core Guides

docs/PIPELINE.md - Detailed breakdown of the 7-stage pipeline
docs/OBSERVABILITY.md - Comprehensive Logfire instrumentation guide
docs/CONFIGURATION.md - All configuration options and tuning
docs/HYBRID_SEARCH.md - Technical deep-dive on hybrid search

External Resources

Logfire Documentation: https://docs.pydantic.dev/logfire/
Pydantic AI Documentation: https://ai.pydantic.dev/
Render Documentation: https://docs.render.com/

Contributing

This is a demo project, but improvements are welcome!

Fork the repository
Create a feature branch
Make your changes
Add tests
Submit a pull request

License

MIT License - see LICENSE file for details

Acknowledgments

Built to showcase:

Logfire by Pydantic - AI observability platform
Render - Modern cloud platform
Pydantic AI - Type-safe AI agent framework
OpenAI & Anthropic - LLM providers

Ready to build observable AI? Fork this repo and deploy to Render to get started!
