Render Developer Q&A Assistant showcasing observable AI with Pydantic Agents, Pydantic Embedder, and Logfire
Intelligent question-answering system that demonstrates real-world AI observability patterns. This example project shows how to build, instrument, and monitor a multi-stage LLM pipeline with full cost tracking, quality evaluation, and performance monitoring.
- What This App Does
- What This Demonstrates
- Architecture
- Quick Start
- Deploy to Render
- Example Metrics
- Documentation
- Contributing
- License
This is an AI-powered Q&A assistant for Render documentation. Users can ask questions about Render's platform, and the app provides accurate, well-researched answers backed by the official documentation.
- Ask a question - "How do I deploy a Node.js app on Render?" or "What database plans are available?"
- Watch the pipeline - Track progress as the run moves through 7 stages (embedding → retrieval → generation → verification)
- Get accurate answers - Receive detailed responses with sources from Render docs
- Quality guaranteed - Every answer is verified for accuracy and rated by dual AI evaluators
- Hybrid search - Combines semantic understanding with keyword matching for better retrieval
- Three verification capabilities - Each answer goes through Grounding (extract claims, verify them against the retrieved sources), Accuracy (a factual-correctness review), and Quality (a dual-model developer-experience rating). These are distinct checks, not redundant ones
- Neutral, grounded answers - The assistant answers only from retrieved documentation, with no product-favorable steering in the prompt; relevant Render docs surface through retrieval, not by being force-sold
- Cost tracking - See exactly how much each question costs to answer
- Concurrent verification - The Accuracy + dual-model Quality checks run concurrently (a single asyncio.gather) so the three independent LLM calls overlap
- PostgreSQL with pgvector + full-text - Managed hybrid search database
- Web Service + Static Site - FastAPI backend (pipeline runs in-process) + Next.js frontend
- Cron Jobs - Scheduled ingestion refresh that re-embeds the live RAG sources
- Blueprint deploy + env groups - render.yaml provisions everything; shared config lives in one env group
- LLM Traces - Complete visibility into every AI call (OpenAI + Anthropic auto-instrumented)
- HTTP Tracing - FastAPI auto-instrumentation for request/response tracking
- Database Monitoring - AsyncPG auto-instrumentation for query performance
- Cost Tracking - Per-stage and per-execution cost attribution with custom metrics
- Multi-Model Evals - Dual-rater quality assessment (OpenAI + Anthropic)
- Session Tracking - End-to-end user journey with distributed tracing
- Custom Metrics - Business-specific metrics (cost, quality, accuracy)
- SQL Queries - Custom analytics on AI performance
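The hybrid search listed among the features above merges a semantic (vector) ranking with a keyword ranking. The README does not say how the two result lists are combined, so the sketch below uses reciprocal rank fusion, a common fusion technique, purely as an illustration. The function name, the constant k = 60, and the document IDs are all assumptions, not taken from this project; docs/HYBRID_SEARCH.md describes the real approach.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse several ranked lists of document IDs into one scored ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so documents ranked well by both retrievers rise to the top.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties broken by ID so the output is deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical result lists: one from vector similarity, one from keyword search.
semantic_hits = ["docs/deploy-node", "docs/scaling", "docs/pricing"]
keyword_hits = ["docs/pricing", "docs/deploy-node", "docs/cron"]

fused = reciprocal_rank_fusion([semantic_hits, keyword_hits])
for doc_id, score in fused:
    print(f"{doc_id}: {score:.5f}")
```

Documents that appear in both lists outrank documents that appear in only one, which is the behaviour a hybrid retriever is after.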
This project is built end-to-end on the Pydantic ecosystem:
- Pydantic AI Agents - every pipeline stage (generation, claims extraction, accuracy check, dual-rater evaluation) is a pydantic_ai.Agent with a typed output_type. Multi-provider orchestration (Claude + GPT) runs through OpenAIProvider / AnthropicProvider in a single pipeline. See backend/pipeline/.
- Pydantic Embedder - pydantic_ai.Embedder with OpenAIEmbeddingModel powers question embedding (embed_query) and batch claim embedding (embed_documents) for verification. Auto-instrumented by logfire.instrument_pydantic_ai(). See backend/pipeline/embeddings.py and backend/pipeline/verification.py.
- Pydantic Models - Claims, accuracy scores, eval dimensions, and pipeline state are parsed directly into Pydantic models (e.g. ClaimsOutput, EvaluationOutput). pydantic-settings manages config in backend/config.py.
- Pydantic GenAI Prices - model pricing is loaded dynamically from the pydantic/genai-prices registry, then combined with per-agent token counts from result.usage() to produce per-stage cost attribution. See backend/prices.py.
- Logfire - distributed traces, custom metrics, dual-model evals, and cost attribution. Auto-instruments FastAPI, AsyncPG, HTTPX, and Pydantic AI. See backend/observability.py.
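The per-stage cost attribution described above comes down to multiplying each stage's token counts by the model's price. The sketch below shows that arithmetic with plain dataclasses. It does not call genai-prices or Pydantic AI; the model names, prices, and token counts are made up for illustration, and the real logic lives in backend/prices.py.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    input_per_mtok: float   # USD per million input tokens
    output_per_mtok: float  # USD per million output tokens


@dataclass(frozen=True)
class StageUsage:
    stage: str
    model: str
    input_tokens: int
    output_tokens: int


# Placeholder prices, NOT real provider pricing.
PRICES = {
    "model-a": ModelPrice(input_per_mtok=3.0, output_per_mtok=15.0),
    "model-b": ModelPrice(input_per_mtok=0.5, output_per_mtok=1.5),
}


def stage_cost(usage: StageUsage, prices: dict[str, ModelPrice]) -> float:
    """Cost in USD of one stage, from its token counts and its model's price."""
    price = prices[usage.model]
    return (
        usage.input_tokens * price.input_per_mtok
        + usage.output_tokens * price.output_per_mtok
    ) / 1_000_000


def cost_by_stage(usages: list[StageUsage], prices: dict[str, ModelPrice]) -> dict[str, float]:
    """Map each stage name to its cost so a total and a breakdown are both available."""
    return {usage.stage: stage_cost(usage, prices) for usage in usages}


# Hypothetical token counts, standing in for what result.usage() would report.
usages = [
    StageUsage("generation", "model-a", input_tokens=2000, output_tokens=500),
    StageUsage("claims_extraction", "model-b", input_tokens=1000, output_tokens=200),
]
breakdown = cost_by_stage(usages, PRICES)
total = sum(breakdown.values())
```

Keeping the breakdown as a dictionary keyed by stage makes it easy to attach each value to the matching trace span or metric.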
The frontend connects to a FastAPI backend that runs the 7-stage Q&A pipeline in-process as a
background task. POST /ask launches the run and returns immediately; the frontend polls
GET /ask/{run_id} for live per-stage progress and the final result.
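The launch-then-poll flow described above can be sketched without any web framework. In this minimal, in-memory version, launch_run plays the role of POST /ask and get_run plays the role of GET /ask/{run_id}. The dictionary store, the function names, and the stub stage list are assumptions for illustration; the real backend persists state in PostgreSQL and serves it through FastAPI.

```python
import asyncio
import uuid

RUNS: dict[str, dict] = {}              # stand-in for the pipeline_runs table
_TASKS: set[asyncio.Task] = set()       # keep references so tasks are not garbage collected

STAGES = ["embedding", "retrieval", "generation", "verification"]


async def run_pipeline(run_id: str) -> None:
    """Stub pipeline: record the current stage, then mark the run completed."""
    for stage in STAGES:
        RUNS[run_id]["stage"] = stage
        await asyncio.sleep(0.01)       # stand-in for real work
    RUNS[run_id].update(status="completed", answer="(stub answer)")


def launch_run(question: str) -> str:
    """Roughly what POST /ask does: record the run, start it, return at once."""
    run_id = uuid.uuid4().hex
    RUNS[run_id] = {"status": "running", "stage": None, "question": question, "answer": None}
    task = asyncio.create_task(run_pipeline(run_id))
    _TASKS.add(task)
    task.add_done_callback(_TASKS.discard)
    return run_id


def get_run(run_id: str) -> dict:
    """Roughly what GET /ask/{run_id} does: return a snapshot of the run."""
    return dict(RUNS[run_id])


async def ask_and_wait(question: str, poll_interval: float = 0.005) -> dict:
    """Client side: launch a run, then poll until it is no longer running."""
    run_id = launch_run(question)
    while True:
        state = get_run(run_id)
        if state["status"] != "running":
            return state
        await asyncio.sleep(poll_interval)


if __name__ == "__main__":
    print(asyncio.run(ask_and_wait("How do I deploy a Node.js app on Render?")))
```

Returning the run ID immediately keeps the request short, and the poll loop lets the UI show per-stage progress while the run is still in flight.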
┌─────────────────────────────────────────────────────────────┐
│ Frontend (Next.js + TypeScript)                             │
│ Deployed as: Render Static Site                             │
│ - Question input UI                                         │
│ - Progress via polling (POST /ask → poll GET /ask/{id})     │
│ - Answer display with metrics                               │
└─────────────────────────────────────────────────────────────┘
                              │ HTTPS
┌─────────────────────────────────────────────────────────────┐
│ Backend (FastAPI + Logfire)                                 │
│ Deployed as: Render Web Service (Python 3.13)               │
│ - POST /ask → launch run_qa_pipeline (background)           │
│ - GET /ask/{id} → read run status + progress                │
│ - /health, /history, /stats, /sessions/{id}/logs            │
│                                                             │
│ In-process orchestrator: backend/pipeline/orchestrator.py   │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ Retrieval [1] Question Embedding  (OpenAI)              │ │
│ │           [2] RAG Retrieval       (pgvector+BM25)       │ │
│ │ Generate  [3] Answer Generation   (Claude)              │ │
│ │ Grounding [4] Claims Extraction   (GPT)                 │ │
│ │           [5] Claims Verification (RAG)                 │ │
│ │ Accuracy  [6] Factual-grounding   (Claude)  ┐           │ │
│ │ Quality   [7] Dual-model rating   (OpenAI+  ├─ gather   │ │
│ │                                   Anthropic)┘ (parallel)│ │
│ └─────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
           │                            │
┌──────────────────────┐   ┌───────────────────────────┐
│ PostgreSQL           │   │ Logfire                   │
│ (Render Managed)     │   │ (Pydantic)                │
│ - pgvector ext       │   │ - Distributed traces      │
│ - RAG embeddings     │   │ - Cost attribution        │
│ - Full-text search   │   │ - Quality metrics         │
│ - pipeline_runs +    │   │ - Custom dashboards       │
│   pipeline_progress  │   └───────────────────────────┘
└──────────────────────┘
Cron (daily) → data/scripts/ingest_pages.py → re-embed live sources
In-process, not a separate service. The pipeline is plain async code: stages 1–5 run sequentially, and the three independent verification calls (6 + 7) overlap via a single asyncio.gather. POST /ask records a row in pipeline_runs and runs the orchestrator as a detached task; GET /ask/{run_id} reads that row plus the live pipeline_progress updates. See backend/pipeline/orchestrator.py.
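The shape described above, five sequential stages followed by three overlapping verification calls, can be sketched as follows. The stage function, the stage names, and the sleep durations are placeholders for the real LLM and database calls; this is not the code in backend/pipeline/orchestrator.py.

```python
import asyncio


async def stage(name: str, delay: float, log: list[str]) -> str:
    """Placeholder for one pipeline stage: wait, record completion, return a result."""
    await asyncio.sleep(delay)
    log.append(name)
    return name


async def run_qa_pipeline_sketch() -> dict:
    log: list[str] = []

    # Stages 1-5 depend on each other, so they run one after another.
    for name in [
        "question_embedding",
        "rag_retrieval",
        "answer_generation",
        "claims_extraction",
        "claims_verification",
    ]:
        await stage(name, 0.01, log)

    # Stages 6 + 7: three independent calls that overlap in a single gather.
    # gather returns results in argument order, whichever call finishes first.
    accuracy, quality_openai, quality_anthropic = await asyncio.gather(
        stage("accuracy", 0.03, log),
        stage("quality_openai", 0.02, log),
        stage("quality_anthropic", 0.01, log),
    )
    return {
        "order": log,
        "verification": [accuracy, quality_openai, quality_anthropic],
    }


if __name__ == "__main__":
    print(asyncio.run(run_qa_pipeline_sketch()))
```

With overlapping calls the verification step takes about as long as its slowest call rather than the sum of all three.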
render-qa-assistant/
├── backend/
│   ├── main.py                # FastAPI app (launches + polls in-process runs)
│   ├── api/
│   │   └── logs.py            # Logfire logs API endpoint
│   ├── pipeline/              # 7-stage pipeline implementation
│   │   └── orchestrator.py    # In-process run_qa_pipeline orchestrator
│   ├── ingestion.py           # Shared embed + replace-by-source helpers
│   ├── models.py              # Pydantic models
│   ├── database.py            # PostgreSQL + pgvector (+ pipeline_runs/progress)
│   ├── observability.py       # Logfire configuration
│   └── config.py              # Settings management
├── frontend/
│   ├── src/                   # Next.js + TypeScript UI
│   └── package.json
├── data/
│   ├── embeddings/            # Pre-embedded documentation
│   ├── curated/               # Hand-curated source content (markdown)
│   ├── sources.py             # Live-source registry (build strategies + metadata)
│   └── scripts/               # Data ingestion scripts
├── docs/
│   ├── PIPELINE.md            # Detailed pipeline guide
│   ├── OBSERVABILITY.md       # Logfire instrumentation guide
│   ├── CONFIGURATION.md       # Configuration reference
│   └── HYBRID_SEARCH.md       # Hybrid search deep-dive
├── pyproject.toml             # Python dependencies (uv)
├── uv.lock                    # Locked dependency versions
├── .python-version            # Pins Python to 3.13
├── render.yaml                # Infrastructure as code
├── .env.example               # Environment variables template
└── README.md                  # This file
- uv (manages Python 3.13 automatically)
- Node.js 18+
- PostgreSQL 16+ (with pgvector extension >= 0.5.0, for the HNSW vector index)
- OpenAI API key
- Anthropic API key
- Logfire account - sign in at logfire.pydantic.dev, create a project (US region), then:
  - Settings → Write Tokens → create a token → LOGFIRE_TOKEN in .env
  - Settings → Read Tokens → create a token → LOGFIRE_READ_TOKEN in .env
  - View traces in the Live panel under your project
# 1. Install everything (uv installs Python 3.13 automatically)
make install
# 2. Set up .env file (copy from example and fill in your keys)
cp .env.example .env
# 3. Start database
make db-start
# 4. Load documentation (this step might take a while!)
make ingest
# 5. Run backend (in one terminal)
make run-backend
# 6. Run frontend (in another terminal)
make run-frontend
The full stack runs locally: no Render cloud resources, no extra API key. The Q&A pipeline runs in-process inside the backend, so POST /ask works against your local DATABASE_URL and History populates normally. make run-backend and make run-frontend (steps 5–6 above) are all you need; ask questions at http://localhost:3000.
Local config → deployed env group. Locally every process reads one .env (copied from .env.example). On deploy that same config lives in the Render env group in render.yaml; see Deploy → Environment groups. The Docker DATABASE_URL isn't in the group; in the cloud DATABASE_URL is injected from the database.
make ingest runs the full ingestion: bulk doc embeddings, plus the curated "special pages" that are explicitly injected into the RAG context (pricing, AI agent, autoscaling, Node.js). These live sources are defined in the data/sources.py registry and ingested through the shared build → embed → replace-by-source helpers. To reload just one of them after editing its registry entry (or curated content), use the per-target shortcuts:
make add-pricing # render.com/pricing tables
make add-ai-agent # render.com/tutorials/agents-on-render-workflows (AI agents → Render Workflows)
make add-autoscaling # render.com/docs/scaling
make add-nodejs # render.com/docs/deploy-node-express-app
Access locally:
- Frontend: http://localhost:3000
- API docs: http://localhost:8000/docs
- Logfire: https://logfire.pydantic.dev
Before clicking the deploy button, sign in at logfire.pydantic.dev, create a project (US region), and generate two tokens:
- Preferences → Write Tokens → create token → save as LOGFIRE_TOKEN
- Preferences → Read Tokens → create token → save as LOGFIRE_READ_TOKEN
You'll paste both into the Render Dashboard in step 3.
Render reads render.yaml and provisions:
- PostgreSQL database with pgvector (pydantic-agents-db)
- Backend web service (pydantic-agents-api, FastAPI + Logfire; runs the pipeline in-process)
- Ingestion refresh cron (pydantic-agents-ingest, re-embeds the live sources daily)
- Frontend static site (pydantic-agents-frontend, Next.js)
- One environment group that holds all shared config (see below)
On Apply, Render prompts once for the secret values in the env group
(OPENAI_API_KEY, ANTHROPIC_API_KEY, and both Logfire tokens). You fill these at the group
level, not per service.
render.yaml defines one reusable env group
so config lives in one place instead of being duplicated across services:
| Group | Contents | Linked to |
|---|---|---|
| pydantic-agents-pipeline | LLM/Logfire secrets + all pipeline, RAG, and model config (~18 vars) | Backend API and the ingest cron |
The backend and the cron both run the same backend.config.Settings, so they need identical
config; linking one group beats pasting ~18 variables per service. DATABASE_URL stays
per-service (it's injected from the database, which can't live in a group), and the frontend's
NEXT_PUBLIC_API_URL stays inline (unique, build-time).
Because the backend and cron read everything from the env group, you set values on the group, not on each service; every linked service picks them up automatically. Set the four secrets once, when you apply the Blueprint in step 2:
| Variable | Source |
|---|---|
| OPENAI_API_KEY | platform.openai.com |
| ANTHROPIC_API_KEY | console.anthropic.com |
| LOGFIRE_TOKEN | Logfire write token from step 1 |
| LOGFIRE_READ_TOKEN | Logfire read token from step 1 |
The first three are required: the services crash on startup without them (no defaults in
backend/config.py).
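The fail-fast behaviour described above can be illustrated with a small standard-library stand-in. The project itself uses pydantic-settings in backend/config.py, so the class and function below (SettingsSketch, load_settings) are hypothetical, and treating LOGFIRE_READ_TOKEN as optional is an inference from "the first three are required".

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# The three secrets the README says have no defaults.
REQUIRED_SECRETS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "LOGFIRE_TOKEN")


@dataclass(frozen=True)
class SettingsSketch:
    openai_api_key: str
    anthropic_api_key: str
    logfire_token: str
    logfire_read_token: Optional[str] = None  # assumed optional


def load_settings(env: Optional[Mapping[str, str]] = None) -> SettingsSketch:
    """Build settings from the environment, failing at startup if a secret is missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_SECRETS if not env.get(name)]
    if missing:
        # Report every missing variable at once instead of one per restart.
        raise RuntimeError("Missing required environment variables: " + ", ".join(missing))
    return SettingsSketch(
        openai_api_key=env["OPENAI_API_KEY"],
        anthropic_api_key=env["ANTHROPIC_API_KEY"],
        logfire_token=env["LOGFIRE_TOKEN"],
        logfire_read_token=env.get("LOGFIRE_READ_TOKEN") or None,
    )
```

Failing at startup surfaces a missing secret in the deploy logs right away, rather than on the first question a user asks.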
Edit the group under Dashboard → Env Groups → pydantic-agents-pipeline. Saving re-deploys every service linked to it.
Auto-filled, no action needed: DATABASE_URL (injected from the database service) and the
rest of pydantic-agents-pipeline's config (MAX_TOKENS, TIMEOUT_SECONDS, RAG_TOP_K,
SIMILARITY_THRESHOLD, VERIFICATION_THRESHOLD, EMBEDDING_MODEL, EMBEDDING_DIMENSIONS, the
model-selection vars, ENABLE_CACHING, LOG_LEVEL) ship with sensible defaults in render.yaml.
After the backend deploys, copy its public URL from the service's Dashboard page and set it as
the NEXT_PUBLIC_API_URL env var on the frontend service, then redeploy the frontend so the
value takes effect. For this deploy that's:
NEXT_PUBLIC_API_URL=https://pydantic-agents-api.onrender.com
Use the base origin only: no trailing slash and no /api path (the frontend appends
/ask, /health, etc. itself). If your service name isn't globally unique, Render adds a random
suffix (…-api-xxxx.onrender.com), so always copy the exact URL shown in the Dashboard.
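The "base origin only" rule above is easy to get wrong when pasting a URL, so a quick check before setting the variable can help. The helper below is a hypothetical convenience that is not part of this repo; it rejects a trailing slash, any path such as /api, and any query string or fragment.

```python
from urllib.parse import urlsplit


def validate_api_base_url(value: str) -> str:
    """Return the value unchanged if it is a bare origin, otherwise raise ValueError."""
    parts = urlsplit(value)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not an http(s) URL: {value!r}")
    if parts.path or parts.query or parts.fragment:
        # A trailing slash shows up here as path == "/".
        raise ValueError(f"use the base origin only (no trailing slash or path): {value!r}")
    return f"{parts.scheme}://{parts.netloc}"


# The value from this README passes; the two common mistakes do not.
print(validate_api_base_url("https://pydantic-agents-api.onrender.com"))
for bad in ("https://pydantic-agents-api.onrender.com/", "https://pydantic-agents-api.onrender.com/api"):
    try:
        validate_api_base_url(bad)
    except ValueError as exc:
        print("rejected:", exc)
```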
The backend's preDeployCommand runs data/scripts/ingest_pages.py
on every deploy, loading the core corpus and re-embedding the live sources, so the database is
seeded by the time the service goes live. The daily pydantic-agents-ingest cron keeps it fresh.
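Because ingestion runs on every deploy and again each day, it has to be safe to repeat. The replace-by-source idea mentioned earlier gives that property: rows for a source are swapped out rather than appended. The sketch below shows this with an in-memory list and a fake embedder. The function names, the row shape, and the embedder are assumptions; the real helpers live in backend/ingestion.py and write to PostgreSQL.

```python
from typing import Callable

Embedder = Callable[[list[str]], list[list[float]]]


def fake_embed(texts: list[str]) -> list[list[float]]:
    """Stand-in embedder: one tiny vector per text, no network call."""
    return [[float(len(text))] for text in texts]


def replace_by_source(store: list[dict], source: str, chunks: list[str], embed: Embedder) -> int:
    """Embed the chunks, then swap them in for every existing row from the same source."""
    vectors = embed(chunks)
    new_rows = [
        {"source": source, "content": chunk, "embedding": vector}
        for chunk, vector in zip(chunks, vectors)
    ]
    # Drop the old rows for this source only; other sources are untouched.
    store[:] = [row for row in store if row["source"] != source] + new_rows
    return len(new_rows)


# Hypothetical store with one row from another source already present.
store: list[dict] = [{"source": "docs/scaling", "content": "old scaling text", "embedding": [16.0]}]

replace_by_source(store, "pricing", ["plan table", "bandwidth"], fake_embed)
replace_by_source(store, "pricing", ["plan table", "bandwidth"], fake_embed)  # second run, same result
```

Running the same ingest twice leaves exactly one copy of each chunk, so repeated deploys and the daily cron do not inflate the retrieval index.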
Ask a question at your frontend URL, for example "How do I deploy an AI agent on Render?"
- docs/PIPELINE.md - Detailed breakdown of the 7-stage pipeline
- docs/OBSERVABILITY.md - Comprehensive Logfire instrumentation guide
- docs/CONFIGURATION.md - All configuration options and tuning
- docs/HYBRID_SEARCH.md - Technical deep-dive on hybrid search
- Logfire Documentation: https://docs.pydantic.dev/logfire/
- Pydantic AI Documentation: https://ai.pydantic.dev/
- Render Documentation: https://docs.render.com/
This is a demo project, but improvements are welcome!
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests
- Submit a pull request
MIT License - see LICENSE file for details
Built to showcase:
- Logfire by Pydantic - AI observability platform
- Render - Modern cloud platform
- Pydantic AI - Type-safe AI agent framework
- OpenAI & Anthropic - LLM providers
Ready to build observable AI? Fork this repo and deploy to Render to get started!