render-examples/pydantic-agents

Pydantic Agents

Render Developer Q&A Assistant showcasing observable AI with Pydantic Agents, Pydantic Embedder, and Logfire

Deploy to Render

Intelligent question-answering system that demonstrates real-world AI observability patterns. This example project shows how to build, instrument, and monitor a multi-stage LLM pipeline with full cost tracking, quality evaluation, and performance monitoring.

What This App Does

This is an AI-powered Q&A assistant for Render documentation. Users can ask questions about Render's platform, and the app provides accurate, well-researched answers backed by the official documentation.

User Experience

  1. Ask a question - "How do I deploy a Node.js app on Render?" or "What database plans are available?"
  2. Watch the pipeline - Track progress as the run moves through 7 stages (embedding → retrieval → generation → verification)
  3. Get accurate answers - Receive detailed responses with sources from Render docs
  4. Quality checked - Every answer is verified for accuracy and rated by two AI evaluators (OpenAI and Anthropic)

Key Features

  • Hybrid search - Combines semantic understanding with keyword matching for better retrieval
  • Three verification capabilities - Each answer goes through Grounding (extract claims, verify them against the retrieved sources), Accuracy (a factual-correctness review), and Quality (a dual-model developer-experience rating). These are distinct checks, not redundant ones
  • Neutral, grounded answers - The assistant answers only from retrieved documentation, with no product-favorable steering in the prompt; relevant Render docs surface through retrieval, not by being force-sold
  • Cost tracking - See exactly how much each question costs to answer
  • Concurrent verification - The Accuracy + dual-model Quality checks run concurrently (a single asyncio.gather) so the three independent LLM calls overlap
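
The Grounding check above can be pictured as a similarity test between each extracted claim and the retrieved sources. The following is a minimal sketch under stated assumptions: claims and sources are already embedded as vectors, and the names `cosine_similarity`, `verify_claims`, and the threshold value are illustrative rather than taken from the repository.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def verify_claims(
    claim_vectors: dict[str, list[float]],
    source_vectors: list[list[float]],
    threshold: float,
) -> dict[str, bool]:
    """Mark a claim as grounded when its best-matching source clears the threshold.

    Hypothetical helper: the real pipeline may score claims differently.
    """
    results: dict[str, bool] = {}
    for claim, vector in claim_vectors.items():
        best = max(
            (cosine_similarity(vector, source) for source in source_vectors),
            default=0.0,
        )
        results[claim] = best >= threshold
    return results
```

A claim with no sufficiently similar source comes back as `False`, which is the signal a grounding stage would need in order to flag unsupported statements.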

What This Demonstrates

Render Capabilities

  • PostgreSQL with pgvector + full-text - Managed hybrid search database
  • Web Service + Static Site - FastAPI backend (pipeline runs in-process) + Next.js frontend
  • Cron Jobs - Scheduled ingestion refresh that re-embeds the live RAG sources
  • Blueprint deploy + env groups - render.yaml provisions everything; shared config lives in one env group
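
Hybrid search needs a way to merge a vector-similarity ranking with a full-text ranking. This README does not say which fusion method the project uses (docs/HYBRID_SEARCH.md is the authoritative source), so the sketch below shows reciprocal rank fusion, one common choice, purely as an assumption. The document IDs and the constant `k = 60` are illustrative.

```python
from collections import defaultdict


def reciprocal_rank_fusion(
    rankings: list[list[str]], k: int = 60
) -> list[tuple[str, float]]:
    """Fuse several ranked lists of document IDs into one scored ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so documents ranked well by both retrievers rise to the top.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties broken by ID for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical results from the two retrievers.
vector_hits = ["deploy-node", "web-services", "pricing"]
keyword_hits = ["pricing", "deploy-node", "cron-jobs"]
fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
```

Here `deploy-node` and `pricing` outrank the other two documents because both retrievers returned them.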

Logfire Features

  • LLM Traces - Complete visibility into every AI call (OpenAI + Anthropic auto-instrumented)
  • HTTP Tracing - FastAPI auto-instrumentation for request/response tracking
  • Database Monitoring - AsyncPG auto-instrumentation for query performance
  • Cost Tracking - Per-stage and per-execution cost attribution with custom metrics
  • Multi-Model Evals - Dual-rater quality assessment (OpenAI + Anthropic)
  • Session Tracking - End-to-end user journey with distributed tracing
  • Custom Metrics - Business-specific metrics (cost, quality, accuracy)
  • SQL Queries - Custom analytics on AI performance

Pydantic Stack

This project is built end-to-end on the Pydantic ecosystem:

  • Pydantic AI Agents - every pipeline stage (generation, claims extraction, accuracy check, dual-rater evaluation) is a pydantic_ai.Agent with a typed output_type. Multi-provider orchestration (Claude + GPT) runs through OpenAIProvider / AnthropicProvider in a single pipeline. See backend/pipeline/.
  • Pydantic Embedder - pydantic_ai.Embedder with OpenAIEmbeddingModel powers question embedding (embed_query) and batch claim embedding (embed_documents) for verification. Auto-instrumented by logfire.instrument_pydantic_ai(). See backend/pipeline/embeddings.py and backend/pipeline/verification.py.
  • Pydantic Models - Claims, accuracy scores, eval dimensions, and pipeline state are parsed directly into Pydantic models (e.g. ClaimsOutput, EvaluationOutput). pydantic-settings manages config in backend/config.py.
  • Pydantic GenAI Prices - model pricing is loaded dynamically from the pydantic/genai-prices registry, then combined with per-agent token counts from result.usage() to produce per-stage cost attribution. See backend/prices.py.
  • Logfire - distributed traces, custom metrics, dual-model evals, and cost attribution. Auto-instruments FastAPI, AsyncPG, HTTPX, and Pydantic AI. See backend/observability.py.
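
Per-stage cost attribution comes down to multiplying each stage's token counts by the price of the model that served it. The sketch below shows only that arithmetic. It does not call genai-prices or Pydantic AI, and the `StageUsage` type, the model names, and the prices are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StageUsage:
    """Token counts for one pipeline stage (hypothetical shape)."""

    stage: str
    model: str
    input_tokens: int
    output_tokens: int


# Hypothetical USD prices per million tokens: (input, output).
PRICES_PER_MTOK: dict[str, tuple[float, float]] = {
    "model-a": (3.00, 15.00),
    "model-b": (0.50, 1.50),
}


def stage_cost(usage: StageUsage) -> float:
    """Cost in USD of a single stage."""
    input_price, output_price = PRICES_PER_MTOK[usage.model]
    return (
        usage.input_tokens * input_price + usage.output_tokens * output_price
    ) / 1_000_000


def cost_by_stage(usages: list[StageUsage]) -> dict[str, float]:
    """Attribute cost to each stage; sum the values for the per-run total."""
    return {usage.stage: stage_cost(usage) for usage in usages}
```

In the real project the prices would come from the registry and the token counts from each agent's usage data; only the multiplication is shown here.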

Architecture

The frontend connects to a FastAPI backend that runs the 7-stage Q&A pipeline in-process as a background task. POST /ask launches the run and returns immediately; the frontend polls GET /ask/{run_id} for live per-stage progress and the final result.
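
The launch-then-poll contract can be illustrated without a web framework. In this sketch an in-memory dictionary stands in for the pipeline_runs table, `launch_run` plays the role of POST /ask, and `get_run` plays the role of GET /ask/{run_id}. All function names, the stage labels, and the record fields are assumptions made for illustration.

```python
import asyncio
import uuid

RUNS: dict[str, dict] = {}  # stand-in for the pipeline_runs table
_TASKS: set[asyncio.Task] = set()  # keep references so tasks are not collected


async def run_pipeline(run_id: str, question: str) -> None:
    """Placeholder pipeline that records progress as it moves along."""
    run = RUNS[run_id]
    for stage in ("embedding", "retrieval", "generation", "verification"):
        run["stage"] = stage
        await asyncio.sleep(0)  # real work would happen here
    run["answer"] = f"(answer to: {question})"
    run["status"] = "completed"


def launch_run(question: str) -> str:
    """Like POST /ask: record the run, start it in the background, return at once."""
    run_id = uuid.uuid4().hex
    RUNS[run_id] = {"status": "running", "stage": None, "answer": None}
    task = asyncio.create_task(run_pipeline(run_id, question))
    _TASKS.add(task)
    task.add_done_callback(_TASKS.discard)
    return run_id


def get_run(run_id: str) -> dict:
    """Like GET /ask/{run_id}: return a snapshot of the run record."""
    return dict(RUNS[run_id])


async def poll_until_done(run_id: str, interval: float = 0.01, max_polls: int = 100) -> dict:
    """What a polling client does: re-read the run until it stops running."""
    for _ in range(max_polls):
        run = get_run(run_id)
        if run["status"] != "running":
            return run
        await asyncio.sleep(interval)
    raise TimeoutError(f"run {run_id} did not finish in time")


async def demo() -> tuple[dict, dict]:
    run_id = launch_run("How do I deploy a Node.js app on Render?")
    first = get_run(run_id)  # taken before the background task has run
    final = await poll_until_done(run_id)
    return first, final
```

The first snapshot still reports `running` because the background task has not been scheduled yet, which is why the client has to poll instead of reading the answer from the launch response.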

┌─────────────────────────────────────────────────────────────┐
│  Frontend (Next.js + TypeScript)                            │
│  Deployed as: Render Static Site                            │
│  - Question input UI                                        │
│  - Progress via polling (POST /ask → poll GET /ask/{id})    │
│  - Answer display with metrics                              │
└─────────────────────────────────────────────────────────────┘
                          ↓ HTTPS
┌─────────────────────────────────────────────────────────────┐
│  Backend (FastAPI + Logfire)                                │
│  Deployed as: Render Web Service (Python 3.13)              │
│  - POST /ask        → launch run_qa_pipeline (background)   │
│  - GET  /ask/{id}   → read run status + progress            │
│  - /health, /history, /stats, /sessions/{id}/logs           │
│                                                             │
│  In-process orchestrator: backend/pipeline/orchestrator.py  │
│  ┌────────────────────────────────────────────────────────┐ │
│  │ Retrieval  [1] Question Embedding  (OpenAI)            │ │
│  │            [2] RAG Retrieval (pgvector+BM25)           │ │
│  │ Generate   [3] Answer Generation   (Claude)            │ │
│  │ Grounding  [4] Claims Extraction   (GPT)               │ │
│  │            [5] Claims Verification (RAG)               │ │
│  │ Accuracy   [6] Factual-grounding   (Claude) ┐          │ │
│  │ Quality    [7] Dual-model rating   (OpenAI+ ├─ gather  │ │
│  │                                   Anthropic)┘(parallel)│ │
│  └────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
            ↓                                    ↓
┌──────────────────────┐           ┌───────────────────────────┐
│  PostgreSQL          │           │  Logfire                  │
│  (Render Managed)    │           │  (Pydantic)               │
│  - pgvector ext      │           │  - Distributed traces     │
│  - RAG embeddings    │           │  - Cost attribution       │
│  - Full-text search  │           │  - Quality metrics        │
│  - pipeline_runs +   │           │  - Custom dashboards      │
│    pipeline_progress │           └───────────────────────────┘
└──────────────────────┘

  Cron (daily) ─ data/scripts/ingest_pages.py ─▶ re-embed live sources

In-process, not a separate service. The pipeline is plain async code: stages 1–5 run sequentially, and the three independent verification calls (6 + 7) overlap via a single asyncio.gather. POST /ask records a row in pipeline_runs and runs the orchestrator as a detached task; GET /ask/{run_id} reads that row plus the live pipeline_progress updates. See backend/pipeline/orchestrator.py.
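
The control flow described above (five sequential stages, then three overlapping verification calls) can be sketched with plain asyncio. The stage names and delays below are placeholders, and the `stage` coroutine is a stand-in for the real LLM and database calls.

```python
import asyncio


async def stage(name: str, delay: float, log: list[str]) -> str:
    """Placeholder for one pipeline stage; logs when it starts and ends."""
    log.append(f"start {name}")
    await asyncio.sleep(delay)
    log.append(f"end {name}")
    return name


async def run_stages(log: list[str]) -> list[str]:
    # Stages 1-5 depend on each other, so each one is awaited in turn.
    for name in ("embed", "retrieve", "generate", "extract_claims", "verify_claims"):
        await stage(name, 0.01, log)

    # Stages 6 and 7 are three independent calls, so one gather lets them overlap.
    accuracy, rating_openai, rating_anthropic = await asyncio.gather(
        stage("accuracy", 0.01, log),
        stage("quality_openai", 0.01, log),
        stage("quality_anthropic", 0.01, log),
    )
    return [accuracy, rating_openai, rating_anthropic]


if __name__ == "__main__":
    events: list[str] = []
    asyncio.run(run_stages(events))
    print("\n".join(events))
```

In the printed log, the three verification calls all start before any of them ends, whereas each earlier stage ends before the next one starts.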

Project Structure

render-qa-assistant/
├── backend/
│   ├── main.py                    # FastAPI app (launches + polls in-process runs)
│   ├── api/
│   │   └── logs.py                # Logfire logs API endpoint
│   ├── pipeline/                  # 7-stage pipeline implementation
│   │   └── orchestrator.py        # In-process run_qa_pipeline orchestrator
│   ├── ingestion.py               # Shared embed + replace-by-source helpers
│   ├── models.py                  # Pydantic models
│   ├── database.py                # PostgreSQL + pgvector (+ pipeline_runs/progress)
│   ├── observability.py           # Logfire configuration
│   └── config.py                  # Settings management
├── frontend/
│   ├── src/                       # Next.js + TypeScript UI
│   └── package.json
├── data/
│   ├── embeddings/                # Pre-embedded documentation
│   ├── curated/                   # Hand-curated source content (markdown)
│   ├── sources.py                 # Live-source registry (build strategies + metadata)
│   └── scripts/                   # Data ingestion scripts
├── docs/
│   ├── PIPELINE.md                # Detailed pipeline guide
│   ├── OBSERVABILITY.md           # Logfire instrumentation guide
│   ├── CONFIGURATION.md           # Configuration reference
│   └── HYBRID_SEARCH.md           # Hybrid search deep-dive
├── pyproject.toml                 # Python dependencies (uv)
├── uv.lock                        # Locked dependency versions
├── .python-version                # Pins Python to 3.13
├── render.yaml                    # Infrastructure as code
├── .env.example                   # Environment variables template
└── README.md                      # This file

Quick Start

Prerequisites

  • uv (manages Python 3.13 automatically)
  • Node.js 18+
  • PostgreSQL 16+ (with pgvector extension >= 0.5.0, for the HNSW vector index)
  • OpenAI API key
  • Anthropic API key
  • Logfire account - sign in at logfire.pydantic.dev, create a project (US region), then:
    1. Settings → Write Tokens → create a token → LOGFIRE_TOKEN in .env
    2. Settings → Read Tokens → create a token → LOGFIRE_READ_TOKEN in .env
    3. View traces in the Live panel under your project

Local Development (with Make)

# 1. Install everything (uv installs Python 3.13 automatically)
make install

# 2. Set up .env file (copy from example and fill in your keys)
cp .env.example .env

# 3. Start database
make db-start

# 4. Load documentation (this step might take a while!)
make ingest

# 5. Run backend (in one terminal)
make run-backend

# 6. Run frontend (in another terminal)
make run-frontend

The full stack runs locally: no Render cloud resources, no extra API key. The Q&A pipeline runs in-process inside the backend, so POST /ask works against your local DATABASE_URL and History populates normally. make run-backend and make run-frontend (steps 5–6 above) are all you need; ask questions at http://localhost:3000.

Local config → deployed env group. Locally every process reads one .env (copied from .env.example). On deploy that same config lives in the Render env group in render.yaml (see Deploy → Environment groups). The Docker DATABASE_URL isn't in the group; in the cloud DATABASE_URL is injected from the database.

make ingest runs the full ingestion: bulk doc embeddings, plus the curated "special pages" that are explicitly injected into the RAG context (pricing, AI agent, autoscaling, Node.js). These live sources are defined in the data/sources.py registry and ingested through the shared build → embed → replace-by-source helpers. To reload just one of them after editing its registry entry (or curated content), use the per-target shortcuts:

make add-pricing      # render.com/pricing tables
make add-ai-agent     # render.com/tutorials/agents-on-render-workflows (AI agents β†’ Render Workflows)
make add-autoscaling  # render.com/docs/scaling
make add-nodejs       # render.com/docs/deploy-node-express-app
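
The build → embed → replace-by-source flow mentioned above can be sketched against an in-memory list instead of PostgreSQL. Everything here is a hypothetical stand-in for the helpers in backend/ingestion.py: the chunking rule (split on blank lines), the `fake_embed` function, and the row shape are assumptions.

```python
from typing import Callable

Embedder = Callable[[list[str]], list[list[float]]]


def build_chunks(text: str) -> list[str]:
    """Build step: split source text into paragraph-sized chunks."""
    return [part.strip() for part in text.split("\n\n") if part.strip()]


def fake_embed(chunks: list[str]) -> list[list[float]]:
    """Stand-in embedder; a real one would call an embedding model."""
    return [[float(len(chunk))] for chunk in chunks]


def replace_by_source(
    store: list[dict], source: str, text: str, embed: Embedder
) -> int:
    """Re-ingest one source: drop its old rows, insert the fresh ones.

    Embedding happens before anything is deleted, so a failed embedding
    call leaves the previous rows for that source untouched.
    """
    chunks = build_chunks(text)
    vectors = embed(chunks)
    store[:] = [row for row in store if row["source"] != source]
    store.extend(
        {"source": source, "content": chunk, "embedding": vector}
        for chunk, vector in zip(chunks, vectors)
    )
    return len(chunks)
```

Replacing rows by source keeps re-ingestion idempotent: running it twice for the same source leaves one copy of its chunks, and rows from other sources are never touched.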

Access locally:


Deploy to Render

1. Set up a Logfire account.

Before clicking the deploy button, sign in at logfire.pydantic.dev, create a project (US region), and generate two tokens:

  • Preferences → Write Tokens → create token → save as LOGFIRE_TOKEN
  • Preferences → Read Tokens → create token → save as LOGFIRE_READ_TOKEN

You'll paste both into the Render Dashboard in step 3.

2. One-click deploy

Deploy to Render

Render reads render.yaml and provisions:

  • PostgreSQL database with pgvector (pydantic-agents-db)
  • Backend web service (pydantic-agents-api, FastAPI + Logfire; runs the pipeline in-process)
  • Ingestion refresh cron (pydantic-agents-ingest, re-embeds the live sources daily)
  • Frontend static site (pydantic-agents-frontend, Next.js)
  • One environment group that holds all shared config (see below)

On Apply, Render prompts once for the secret values in the env group (OPENAI_API_KEY, ANTHROPIC_API_KEY, and both Logfire tokens). You fill these at the group level, not per service.

Environment groups

render.yaml defines one reusable env group so config lives in one place instead of being duplicated across services:

| Group | Contents | Linked to |
| --- | --- | --- |
| pydantic-agents-pipeline | LLM/Logfire secrets + all pipeline, RAG, and model config (~18 vars) | Backend API and the ingest cron |

The backend and the cron both run the same backend.config.Settings, so they need identical config; linking one group beats pasting ~18 variables per service. DATABASE_URL stays per-service (it's injected from the database, which can't live in a group), and the frontend's NEXT_PUBLIC_API_URL stays inline (unique, build-time).

3. Fill in the env-group values

Because the backend and cron read everything from the env group, you set values on the group, not on each service; every linked service picks them up automatically. Set the four secrets once, when you apply the Blueprint in step 2:

| Variable | Source |
| --- | --- |
| OPENAI_API_KEY | platform.openai.com |
| ANTHROPIC_API_KEY | console.anthropic.com |
| LOGFIRE_TOKEN | Logfire write token from step 1 |
| LOGFIRE_READ_TOKEN | Logfire read token from step 1 |

The first three are required: the services crash on startup without them (no defaults in backend/config.py).
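
The fail-fast behavior is easy to picture with a standard-library stand-in. The project itself manages config with pydantic-settings in backend/config.py; the sketch below only mimics the idea of required fields with no defaults. The `Settings` fields and the `rag_top_k` default value are assumptions.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

REQUIRED = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "LOGFIRE_TOKEN")


@dataclass(frozen=True)
class Settings:
    openai_api_key: str
    anthropic_api_key: str
    logfire_token: str
    logfire_read_token: Optional[str] = None
    rag_top_k: int = 5  # hypothetical default, not the project's value


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build settings from the environment, failing loudly on missing secrets."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return Settings(
        openai_api_key=env["OPENAI_API_KEY"],
        anthropic_api_key=env["ANTHROPIC_API_KEY"],
        logfire_token=env["LOGFIRE_TOKEN"],
        logfire_read_token=env.get("LOGFIRE_READ_TOKEN"),
        rag_top_k=int(env.get("RAG_TOP_K", "5")),
    )
```

Calling `load_settings()` once at startup surfaces a missing key as an immediate, named error instead of a failure on the first LLM call.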

Edit the group under Dashboard β†’ Env Groups β†’ pydantic-agents-pipeline. Saving re-deploys every service linked to it.

Auto-filled, no action needed: DATABASE_URL (injected from the database service) and the rest of pydantic-agents-pipeline's config (MAX_TOKENS, TIMEOUT_SECONDS, RAG_TOP_K, SIMILARITY_THRESHOLD, VERIFICATION_THRESHOLD, EMBEDDING_MODEL, EMBEDDING_DIMENSIONS, the model-selection vars, ENABLE_CACHING, LOG_LEVEL) ship with sensible defaults in render.yaml.

4. Wire the frontend to the backend

After the backend deploys, copy its public URL from the service's Dashboard page and set it as the NEXT_PUBLIC_API_URL env var on the frontend service, then redeploy the frontend so the value takes effect. For this deploy that's:

NEXT_PUBLIC_API_URL=https://pydantic-agents-api.onrender.com

Use the base origin only β€” no trailing slash and no /api path (the frontend appends /ask, /health, etc. itself). If your service name isn't globally unique, Render adds a random suffix (…-api-xxxx.onrender.com), so always copy the exact URL shown in the Dashboard.
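
A small check can catch a malformed value before it reaches the frontend build. This is an illustrative helper rather than part of the project: the name `normalize_api_base` is hypothetical, and it strips a lone trailing slash while rejecting any real path such as /api.

```python
from urllib.parse import urlsplit


def normalize_api_base(url: str) -> str:
    """Return the bare origin (scheme://host[:port]) or raise ValueError."""
    parts = urlsplit(url.strip())
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not an absolute http(s) URL: {url!r}")
    if parts.path not in ("", "/") or parts.query or parts.fragment:
        raise ValueError(f"expected a bare origin with no path: {url!r}")
    return f"{parts.scheme}://{parts.netloc}"
```

For example, `normalize_api_base("https://pydantic-agents-api.onrender.com/")` returns the origin without the slash, while a value ending in `/api` raises `ValueError`.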

5. Done: the corpus seeds itself

The backend's preDeployCommand runs data/scripts/ingest_pages.py on every deploy, loading the core corpus and re-embedding the live sources, so the database is seeded by the time the service goes live. The daily pydantic-agents-ingest cron keeps it fresh. Ask a question at your frontend URL, for example, "How do I deploy an AI agent on Render?"

Documentation

Core Guides

External Resources


Contributing

This is a demo project, but improvements are welcome!

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests
  5. Submit a pull request

License

MIT License - see LICENSE file for details


Acknowledgments

Built to showcase:

  • Logfire by Pydantic - AI observability platform
  • Render - Modern cloud platform
  • Pydantic AI - Type-safe AI agent framework
  • OpenAI & Anthropic - LLM providers

Ready to build observable AI? Fork this repo and deploy to Render to get started!
