⚡ SignalPilot Data Agent

🏆 #1 on Spider 2.0-DBT — the public state of the art for AI data engineering

65.6 — +5.3 ahead of the next-best system (Databao by JetBrains, May 2026) · 96.9% accuracy on ADE-bench

Governed AI agents for your data stack — db, dbt, and more. Optimized and secured by AutoFyn.

☁️ Try SignalPilot Cloud — free

🚀 Self-host · 📖 Docs · ⭐ Star the repo · 📊 See benchmarks · 🌐 signalpilot.ai · ⚙️ Try AutoFyn · 📅 Book a demo

For Agentic Data and Platform Teams

Governed production access — Snowflake, BigQuery, Postgres, dbt with enterprise guardrails
Optimized & hardened by AutoFyn: the autonomous security agent that has surfaced vulnerabilities in Next.js, MetaMask, and other major platforms now tunes prompts, skills, and security guardrails for your data agents
Benchmark-driven evaluation — same eval rigor that earned #1 on Spider 2.0-DBT
Enterprise support — SSO, private deployments, SLAs

Talk to us · signalpilot.ai

What SignalPilot Is

Two entrypoints, three pieces of infrastructure on the same gateway.

The supported entrypoints are Claude Code and Codex — both via a native plugin. Underneath them, the same components do the work:

Plugin (skills + agents) — adds 23 dbt/SQL skills + 2 verifier agents + 40 MCP tools to your agent session. This is the recommended way to use SignalPilot, on Claude Code or Codex.
MCP server — standard streamable-http, the layer the plugins talk to. Experimental for other clients: Cursor / custom Agent SDK builds can connect and call the 40 MCP tools, but the skills are plugin-specific and don't run there. Use at your own risk until other platforms ship a skill-equivalent surface.
Observability platform — docker compose up -d brings up the gateway, web UI (:3200), audit log, query history, latency/error dashboards, encrypted credential storage, and SignalPilot Workspaces. Or use SignalPilot Cloud for SSO and hosted history.

Benchmarks

Most data agents look impressive in a demo and fall apart on a real warehouse. We measure ours against the field — and lead it.

🥇 #1 on Spider 2.0-DBT

The hardest public benchmark for AI data engineering — 68 tasks that demand real project navigation, complex SQL environments, and long-context reasoning. SignalPilot tops the leaderboard at 65.6 — +5.3 ahead of the next-best system (Databao by JetBrains, May 2026), and well clear of the GPT-5-based agents below it. See the leaderboard →

🎯 96.9% on ADE-bench

Across 64 real analytics-engineering tasks — building dbt models against live warehouses — SignalPilot's agents delivered correct, production-ready output 62 of 64 times. Not plausible-looking SQL: models you can ship.

That gap is the difference between an agent you babysit and one you trust in production.

Index — What It Is · Benchmarks · How It Works · Try · Architecture · Workspaces · MCP Tools · Community

How It Works

Five stages, every task: plan → scan → govern → build → report.

01 — Describe what you need

Plain-English goal in chat (e.g. "Build shopify__daily_shop — orders, abandoned checkouts, fulfillment counts by day")
Parsed into a structured task — no SQL written, no warehouse touched yet

02 — Agent scans your project

Inspects dbt project + warehouse: sources, staging, marts, missing models
Flags date hazards (current_date, now())
Resolves build order across the DAG — deterministic, not a guess
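
The scan step can be pictured with a small Python sketch. It is not SignalPilot's implementation: the regex only covers the two hazards named above, the `stg_*` model names are hypothetical, and the ordering uses the standard-library `graphlib` as a stand-in for whatever the gateway does internally.

```python
import re
from graphlib import TopologicalSorter

# Only the two hazards named in this README; a real scanner would cover more.
DATE_HAZARDS = re.compile(r"\bcurrent_date\b|\bnow\s*\(\s*\)", re.IGNORECASE)


def find_date_hazards(sql: str) -> list[str]:
    """Return the non-deterministic date expressions found in a SQL string."""
    return [match.group(0).lower() for match in DATE_HAZARDS.finditer(sql)]


def build_order(deps: dict[str, set[str]]) -> list[str]:
    """Order models so each one comes after its upstream dependencies.

    `deps` maps a model to the set of models it selects from. Ready models
    are emitted in sorted order, so the same DAG always yields the same list.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the DAG has a cycle
    order: list[str] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        order.extend(ready)
        sorter.done(*ready)
    return order


# Hypothetical project: two staging models feeding one mart.
example_deps = {
    "stg_orders": set(),
    "stg_checkouts": set(),
    "shopify__daily_shop": {"stg_orders", "stg_checkouts"},
}
print(build_order(example_deps))
print(find_date_hazards("select * from orders where created_at < CURRENT_DATE"))
```

Sorting each batch of ready models is what makes the order reproducible between runs rather than dependent on dictionary or set iteration order.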

03 — Every query is governed

DDL (DROP, CREATE, ALTER) and DML (INSERT, UPDATE, DELETE) blocked at the parser
Auto-LIMIT injection on unbounded SELECT
Per-session budget cap kills queries whose scan cost would exceed your dollar threshold
Every query audited: timestamp, agent ID, policy reason, full SQL
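
A deliberately simplified sketch of the governance rules above, in Python. It checks only the first keyword of a statement and uses a coarse `LIMIT` test, whereas the gateway is described as working at the parser level. The function name `govern`, the `AuditRecord` shape, and the default limit of 1000 are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional

# Statement types this README lists as blocked.
BLOCKED_KEYWORDS = {"DROP", "CREATE", "ALTER", "INSERT", "UPDATE", "DELETE"}
DEFAULT_LIMIT = 1000  # hypothetical default, not taken from SignalPilot


@dataclass
class AuditRecord:
    timestamp: str
    agent_id: str
    policy_reason: str
    sql: str


def govern(sql: str, agent_id: str, audit_log: List[AuditRecord]) -> Optional[str]:
    """Return the SQL that may run, or None if policy blocks it."""
    statement = sql.strip().rstrip(";").strip()
    parts = statement.split(None, 1)
    keyword = parts[0].upper() if parts else ""

    if keyword in BLOCKED_KEYWORDS:
        governed, reason = None, f"blocked: {keyword} statements are not allowed"
    elif keyword == "SELECT" and not re.search(r"\blimit\b", statement, re.IGNORECASE):
        # Coarse check: any LIMIT anywhere in the statement counts as bounded.
        governed = f"{statement} LIMIT {DEFAULT_LIMIT}"
        reason = f"allowed: injected LIMIT {DEFAULT_LIMIT}"
    else:
        governed, reason = statement, "allowed"

    # Every decision is recorded, whether the query runs or not.
    audit_log.append(AuditRecord(
        timestamp=datetime.now(timezone.utc).isoformat(),
        agent_id=agent_id,
        policy_reason=reason,
        sql=sql,
    ))
    return governed


log: List[AuditRecord] = []
print(govern("DROP TABLE orders;", "agent-1", log))
print(govern("select * from orders", "agent-1", log))
print(log[0].policy_reason)
```

A first-keyword check misses cases such as a write hidden behind a CTE, which is one reason to parse the statement fully instead of pattern-matching.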

04 — Plan, build, and verify

Writes a structured build spec first — sources, joins, grain, filters, expected rows per model — before a line of SQL
dbt parse catches structural errors; models build in dependency order
Two read-only subagents verify every built model in parallel — neither edits files, runs dbt, or touches state
The 7-check verifier audits structure: table existence, column completeness, row count vs source, fan-out, cardinality/grain, non-deterministic SQL, and source-table preservation
The value-verifier audits values: sample spot-checks, aggregate cross-validation (total vs distinct, measured against source), and status-column filtering for returns/cancellations
Each verifier returns PASS/FAIL/WARN per check and prescribes the exact CHANGE: fix — no editorializing, numbers measured from source
The agent applies only FAIL fixes, rebuilds, and re-dispatches both verifiers — looping until every check passes, then stops
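
The verify, fix, and rebuild loop described above might look roughly like this in Python. This is a sketch under assumptions: `CheckResult`, `verify_until_clean`, and the callback signatures are hypothetical, the verifiers run one after another rather than in parallel, and `max_rounds` is a safety cap added here that the README does not mention.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class CheckResult:
    check: str
    status: str                   # "PASS", "FAIL", or "WARN"
    change: Optional[str] = None  # prescribed fix, present on FAIL


Verifier = Callable[[str], List[CheckResult]]


def verify_until_clean(
    model: str,
    verifiers: List[Verifier],
    apply_fix: Callable[[str, Optional[str]], None],
    rebuild: Callable[[str], None],
    max_rounds: int = 5,  # assumption: a cap so the sketch cannot loop forever
) -> bool:
    """Run every verifier, apply only FAIL fixes, rebuild, and repeat."""
    for _ in range(max_rounds):
        results = [result for verify in verifiers for result in verify(model)]
        failures = [result for result in results if result.status == "FAIL"]
        if not failures:
            return True  # WARN results are reported but never acted on
        for failure in failures:
            apply_fix(model, failure.change)
        rebuild(model)
    return False
```

Keeping the verifiers as plain callables that only return results mirrors the read-only constraint: all file edits and rebuilds happen in the caller.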

05 — Full audit receipt

Structured summary: duration · agent turns · governed queries · queries blocked · models built · columns validated
Every line traces back to a specific MCP tool call
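
For readers who want to picture the receipt as data, here is one possible shape in Python. The field names follow the summary line above, but the `AuditReceipt` class, its types, and the numbers in the example are placeholders rather than SignalPilot's actual schema or real results.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class AuditReceipt:
    duration_seconds: float
    agent_turns: int
    governed_queries: int
    queries_blocked: int
    models_built: int
    columns_validated: int

    def summary(self) -> str:
        """Render the receipt as a single line, one field per segment."""
        return " · ".join(
            f"{name.replace('_', ' ')}: {value}"
            for name, value in asdict(self).items()
        )


# Placeholder values, for illustration only.
receipt = AuditReceipt(93.5, 12, 40, 2, 3, 27)
print(receipt.summary())
```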

Try SignalPilot Data Agent

Give your AI agent governed, production-ready access to your data stack — db, dbt, and more. Schema discovery, read-only SQL, dbt project management, all through a single MCP server. No hallucinated tables. No dropped rows. No unbounded queries.

# Start SignalPilot
git clone https://github.com/SignalPilot-Labs/signalpilot.git
cd signalpilot
docker compose up -d
# Web UI available at http://localhost:3200

# Connect the MCP server to Claude Code
claude mcp add --transport http signalpilot http://localhost:3300/mcp

# (Recommended) Install the plugin for skills + agents (Claude Code)
claude plugin marketplace add SignalPilot-Labs/signalpilot-plugin
claude plugin install signalpilot-dbt@signalpilot

# (Recommended) Install the plugin for skills + agents (Codex)
codex plugin marketplace add SignalPilot-Labs/codex-signalpilot-plugin
codex plugin add signalpilot@signalpilot

That's it. Claude Code (or Codex) now has governed access to your databases.

Codex: if you previously added a broken marketplace snapshot, refresh it with codex plugin marketplace upgrade signalpilot && codex plugin add signalpilot@signalpilot.

Architecture

Other MCP-DB servers don't enforce LIMIT injection, DDL blocking, dangerous function blocking, or audit logging by default. SignalPilot does — that's why agents on it set the SOTA on Spider 2.0-DBT.

┌─────────────────────────────────────────────────────────────┐
│  Your AI Agent (Claude Code, Codex, Agent SDK, MCP client)  │
└────────────────────────────┬────────────────────────────────┘
                             │ MCP Protocol (streamable-http)
┌────────────────────────────▼────────────────────────────────┐
│  SignalPilot Gateway                                         │
│  ┌────────────┐ ┌──────────────┐ ┌───────────────────────┐ │
│  │ Governance │ │ Schema       │ │ dbt Project           │ │
│  │ • LIMIT    │ │ • DDL        │ │ • Map / Validate      │ │
│  │ • DDL block│ │ • Explore    │ │ • Model verification  │ │
│  │ • Audit    │ │ • Join paths │ │ • Date boundaries     │ │
│  └────────────┘ └──────────────┘ └───────────────────────┘ │
│  ┌────────────────────────┐  ┌────────────────────────────┐ │
│  │ Workspaces (notebooks) │  │ Knowledge base · Notion    │ │
│  │ sandboxed pods (gVisor)│  │ web UI · audit · history   │ │
│  └────────────────────────┘  └────────────────────────────┘ │
└────────────────────────────┬────────────────────────────────┘
                             │  11 governed connectors
   ┌──────────┬──────────┬───┴──────┬──────────┬──────────┐
   ▼          ▼          ▼          ▼          ▼          ▼
 Postgres  Snowflake  BigQuery   DuckDB    Redshift   …+6
 MySQL · SQLite · Databricks · ClickHouse · MSSQL · Trino

Plugin (Claude Code and Codex)

The SignalPilot plugin adds 23 battle-tested skills + 2 verifier agents to Claude Code and Codex — the same skills that power the Spider 2.0-DBT SOTA.

# Claude Code
claude plugin marketplace add SignalPilot-Labs/signalpilot-plugin
claude plugin install signalpilot-dbt@signalpilot

# Codex
codex plugin marketplace add SignalPilot-Labs/codex-signalpilot-plugin
codex plugin add signalpilot@signalpilot

Core skills: dbt-workflow (full 8-step lifecycle), dbt-write, dbt-debugging, dbt-testing, dbt-snapshots, dbt-versioning, dbt-knowledgebase, sql-workflow, plus dialect skills (duckdb-sql, snowflake-sql, bigquery-sql, sqlite-sql) and domain packs (ecommerce, financial, healthcare, marketing, media, product, hr).

Agents: verifier (post-build 7-check protocol) and value-verifier (aggregate value checks).

Use With Any MCP Client

⚠️ Experimental for clients without a plugin. The 40 MCP tools work over streamable-http from any MCP client (Cursor, custom Agent SDK) — but the SignalPilot skills only run inside the Claude Code and Codex plugins. Elsewhere you'll have the tools without skill orchestration. The plugins are the supported path; treat the configs below as best-effort.

Claude Code (one-liner)

claude mcp add --transport http signalpilot http://localhost:3300/mcp

Claude Desktop / Cursor / Any MCP Client

Add to your MCP config (.mcp.json, .cursor/mcp.json, etc.):

{
  "mcpServers": {
    "signalpilot": {
      "type": "http",
      "url": "http://localhost:3300/mcp"
    }
  }
}

When API keys are configured, add an Authorization: Bearer YOUR_API_KEY header.
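
If you generate the config from a script, a sketch like the following adds the header only when a key is present. It assumes your MCP client reads static headers from a `headers` object on the server entry. Several clients do, but check your client's documentation. The `mcp_config` helper is hypothetical.

```python
import json
from typing import Optional


def mcp_config(
    url: str = "http://localhost:3300/mcp",
    api_key: Optional[str] = None,
) -> dict:
    """Build the MCP client config shown above, with an optional bearer token."""
    server = {"type": "http", "url": url}
    if api_key:
        # Assumption: the client accepts static headers under a "headers" key.
        server["headers"] = {"Authorization": f"Bearer {api_key}"}
    return {"mcpServers": {"signalpilot": server}}


print(json.dumps(mcp_config(api_key="YOUR_API_KEY"), indent=2))
```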

Connect a Database

Via the web UI at http://localhost:3200, or via API:

curl -X POST http://localhost:3300/api/connections \
  -H "Content-Type: application/json" \
  -d '{
    "name": "my-warehouse",
    "db_type": "duckdb",
    "database": "/path/to/warehouse.duckdb"
  }'
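
The same request can be made from Python with only the standard library. This sketch mirrors the curl call above field for field. The helper names are hypothetical, and the response format is not documented here, so the body is returned as raw text.

```python
import json
import urllib.request


def connection_request(
    name: str,
    db_type: str,
    database: str,
    base_url: str = "http://localhost:3300",
) -> urllib.request.Request:
    """Build the POST /api/connections request without sending it."""
    body = json.dumps(
        {"name": name, "db_type": db_type, "database": database}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/connections",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def create_connection(name: str, db_type: str, database: str) -> str:
    """Send the request to a running gateway and return the raw response body."""
    request = connection_request(name, db_type, database)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


# With the stack running:
# create_connection("my-warehouse", "duckdb", "/path/to/warehouse.duckdb")
```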

Supported (11 connectors): PostgreSQL, MySQL, Snowflake, BigQuery, Redshift, Databricks, DuckDB, SQLite, Microsoft SQL Server, ClickHouse, Trino — plus connection pooling and SSH tunneling.

SignalPilot Workspaces

Reactive Python notebooks your agent can drive end-to-end. The run_notebook MCP tool spins up a sandboxed, git-backed workspace; the agent writes and runs cells, and the notebook's Data SDK (import signalpilot as sp) queries your warehouse through the same governed gateway. Open and edit them in the web UI, or let the agent build them headless.

Each workspace runs as an isolated, sandboxed pod (gVisor) with per-org network isolation; projects are git-backed, so no shared filesystem is required.

MCP Tools

40 governed tools across query execution, schema discovery, dbt intelligence, model verification, workspaces, knowledge base, and Notion integration.

Category	Tools
Query	`query_database`, `validate_sql`, `explain_query`, `estimate_query_cost`, `debug_cte_query`, `check_budget`, `query_history`
Schema	`list_tables`, `describe_table`, `explore_table`, `explore_column`, `explore_columns`, `schema_overview`, `schema_ddl`, `schema_diff`, `schema_statistics`, `get_date_boundaries`
Relationships	`get_relationships`, `find_join_path`, `schema_link`
dbt & verification	`dbt_error_parser`, `generate_sql_skeleton`, `check_model_schema`, `validate_model_output`, `verify_model_values`, `audit_model_sources`, `analyze_grain`, `compare_join_types`
Workspaces	`list_workspace_projects`, `run_notebook`
Knowledge base	`get_knowledge`, `search_knowledge`, `propose_knowledge`
Notion	`list_notion_integrations`, `notion_search`, `notion_fetch_page`, `notion_create_page`
Connections	`list_database_connections`, `connection_health`, `connector_capabilities`

See the full tools reference in the docs.
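
On the wire, an MCP client invokes any of these tools with a JSON-RPC `tools/call` request. The Python sketch below builds such a payload. The argument names (`connection`, `sql`) are guesses for illustration, so consult the tools reference for the real parameters. A full session also needs the MCP initialize handshake, which an MCP client library normally handles for you.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests need unique ids within a session


def tool_call(name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 payload for the MCP tools/call method."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Hypothetical arguments; the real parameter names may differ.
payload = tool_call(
    "query_database",
    {"connection": "my-warehouse", "sql": "select count(*) from orders"},
)
print(json.dumps(payload, indent=2))
```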

Security

Read-only governance — queries parsed to AST, DDL/DML blocked, 79+ dangerous functions blocked across 7 dialects
Tenant isolation — API keys, connections, and audit logs are org-scoped
Encryption at rest — Fernet (AES-128-CBC + HMAC-SHA256)
Audit logging — every query logged with PII redaction
Rate limiting — per-IP, per-key, and per-org with brute-force protection
Non-root containers — gateway runs as UID 10001
Sandboxed Workspaces — notebook pods run under gVisor with per-org NetworkPolicy isolation, read-only rootfs, and IMDS egress blocked

See Security docs for our full security model and vulnerability reporting policy.

Project Structure

SignalPilot/
├── signalpilot/
│   ├── gateway/              # FastAPI backend — MCP server, REST API, governance
│   │   └── gateway/
│   │       ├── api/          # REST API modules
│   │       ├── connectors/   # 11 database connectors + pooling + SSH tunneling
│   │       ├── governance/   # Budget, cache, PII redaction, annotations
│   │       ├── mcp/          # 40 MCP tool definitions (modular package)
│   │       ├── engine/       # SQL validation, LIMIT injection, function denylist
│   │       ├── dbt/          # Project scanning, validation, hazard detection
│   │       ├── orchestrator/ # Notebook pod orchestration (K8s/EKS, namespaces)
│   │       ├── db/           # SQLAlchemy ORM models + async engine
│   │       └── auth/         # Clerk JWT (cloud) / local auth + org role enforcement
│   ├── notebook-server/      # SignalPilot Workspaces — reactive notebook runtime
│   └── web/                  # Next.js 16 frontend + embedded notebook UI, Tailwind CSS
│       ├── app/              # App router pages (dashboard, connections, query, etc.)
│       ├── notebook/         # Embedded reactive-notebook frontend (source)
│       └── lib/              # API client, auth context, hooks
├── plugin/                   # Claude Code + Codex plugin (23 skills, 2 verifier agents)
├── sp-sandbox/               # gVisor sandboxed Python execution
├── deploy/k8s/               # Kubernetes manifests: gateway RBAC + admission policies
├── benchmark/                # Spider 2.0-DBT (SOTA 65.6) + ADE-bench (96.9%) suites
└── docker-compose.yml        # Full stack: web, gateway, postgres, sandbox

Community

🐛 Open an issue — bugs, feature requests, connector requests
💬 GitHub Discussions — questions, ideas, show-and-tell
🔒 Security policy — report vulnerabilities responsibly

Star History

If SignalPilot is useful, please ⭐ — it helps a ton.

---

Contributing

Contributions are welcome — bug fixes, new connectors, docs, and features. Start with the Contributing Guide for setup, coding standards, and the PR process. For anything non-trivial, open an issue first so we can align on the approach.

🛠️ Contributing Guide — dev setup, standards, PR workflow
🐛 Report a bug · 💬 Discussions · 🔒 Security policy

License

Apache 2.0 — see LICENSE.
