feat(DRC-3526): MCP App widget POC — 15 widget tools + config-install helper by iamcxa · Pull Request #1397 · DataRecce/recce

iamcxa · 2026-05-25T09:12:44Z

Summary

Recce MCP App POC (DRC-3526). Ships 15 widget-enabled MCP tools rendered as interactive HTML widgets in Claude Desktop and other MCP App-capable hosts, plus a recce mcp-config-install helper that auto-writes the Claude Desktop config so POC users skip the hand-edited JSON.

Architecture is a parallel FastMCP-based widget server (recce mcp-widget-server) alongside the existing low-level recce mcp-server, coordinated by RECCE_MCP_WIDGETS=1. Day 0 spike found low-level mcp.server.Server doesn't render widgets in Claude Desktop while mcp.server.fastmcp.FastMCP does — hence two servers rather than upgrading the existing one. Full FastMCP migration is deferred (design Open Q #6; now at 15/20 = 75% widget coverage, past the ≥50% reconsideration threshold).

Demo journeys (recorded)

Both journeys use plain reviewer-style prompts — no mcp__recce-widgets__<tool> tool-naming — to validate that a non-technical reviewer's natural phrasing reliably triggers the right widget.

Demo 1 — Quick scan (Journey A) · Loom

Replaces the AI-summary "block of text" with inline widgets so a reviewer grasps "what changed" in seconds. PR scenario: jaffle_shop_golden#15 (refactor customers.sql → two intermediate models).

Prompt 1 → row_count_diff widget:

I'm reviewing a refactor PR that splits customers.sql into two new intermediate
models. Before I approve, I want to verify that row counts haven't changed
across any of the affected models. Can you compare row counts for the models
modified in this PR?

Prompt 2 → schema_diff widget:

Now show me what columns changed — anything added, removed, or with a
different type — across the same modified models.

Demo 2 — Investigate impact (Journey 2) · Loom

Blast-radius summary → evidence drilldown, with the agent following the next_action advisory hint. PR scenario: jaffle_shop_golden#16 (amount DOUBLE → DECIMAL on stg_payments).

Prompt 1 → impact_analysis widget:

I'm reviewing a PR that changes the amount column type in stg_payments from
DOUBLE to DECIMAL(10,2). Can you show me the blast radius — which downstream
models are affected, and how serious the impact is?

Prompt 2 → profile_diff widget:

Following the next_action hint from the impact analysis, let me see the actual
statistical profile of the amount column in stg_payments — compare the mean,
distribution, null rate, and other stats between base and current to confirm
this is just a rounding effect within expected range.

Fixed in this PR (commit f692f086): impact_analysis's downstream value_diff step derived its per-column diff list from the CURRENT relation only (get_model → get_columns(base=False)), then applied b."col" IS DISTINCT FROM c."col" to both the base (b) and current (c) relations. When a column had drifted — present in current but not base — Snowflake raised a Binder Error (invalid identifier 'B."full_name"' / Table "b" does not have a column named "full_name"). The builder now diffs only the intersection of base ∩ current columns via get_model(node_id, base=True) (which manages its own warehouse connection), mirroring the canonical ValueDiffTask; drifted columns are reported via schema_changes, not fed into the diff expression, and value_diff is skipped when the PK itself has drifted. This was a pre-existing value_diff correctness bug surfaced by the demo, not introduced by the widget work. Verified by a new regression unit test (asserts drifted columns never reach the generated SQL) plus a read-only A/B against live Snowflake (old SQL shape → 000904 invalid identifier; intersected shape → executes clean).

What's in this PR

New surface:

recce mcp-widget-server CLI subcommand (local mode only — iter 1 scope)
recce mcp-config-install CLI helper — writes both server entries + RECCE_MCP_WIDGETS=1 into claude_desktop_config.json (macOS, iter 1 scope), with --dry-run and --yes, and a backup of the existing config
recce/widget_server.py — FastMCP server with 15 @mcp.tool delegates + matching @mcp.resource handlers, idiomatic shape (Pydantic input/output models, CallToolResult with short content + structuredContent, tool annotations, ext-apps SDK theme helpers)
recce/data/mcp/*.html — 15 self-contained HTML widgets, each with exhaustive @media (prefers-color-scheme: dark) overrides and Claude design-token theming
WIDGET_TOOLS filter in recce/mcp_server.py — when RECCE_MCP_WIDGETS=1, removes the 15 widget-eligible tools from the main server's list_tools (defers to widget server)

15 widgets by tier:

Phase	Tier	Tools
A	1–2	`row_count_diff`, `schema_diff`, `get_server_info`, `list_checks`, `get_model`
B	3	`query`, `query_diff`, `value_diff`, `value_diff_detail`, `top_k_diff`
C	4 (charts)	`histogram_diff`, `profile_diff`
D	5 (mini-graph)	`get_cll`, `impact_analysis`
E	6 (DAG, v1)	`lineage_diff` — hand-roll SVG, hard 10-node cap, graceful empty-state over cap

Bug fixes (pre-existing, surfaced by the POC):

recce/track.py print(traceback.format_exc()) → file=sys.stderr — stdout traceback is fatal for any stdio MCP server when RecceConfig fails
recce mcp-server / mcp-widget-server now os.chdir(--project-dir) before RecceConfig, so Claude Desktop's cwd=/ spawn finds the dbt project + recce.yml (removes the bash-wrapper workaround)
SchemaChange class-name collision — Phase D's impact_analysis defined a second module-level SchemaChange that shadowed schema_diff's model, breaking schema_diff serialization with a Pydantic "column / change_status Field required" error. Renamed to ColumnSchemaChange; regression test pins both field surfaces.

Quality / docs:

tests/test_widget_server.py — 35 tests (registration via public FastMCP API, env-var coordination 18-vs-20 tools, file-missing graceful degradation, CallToolResult shape, Pydantic structuredContent round-trip, annotations, collision guard, lineage_diff 10-node-cap branch)
tests/test_mcp_config_install.py — 5 tests (project-dir validation, dry-run no-write, backup creation)
docs/mcp-widgets.md — template doc for Scott's hand-off: file layout, config example, "Add widget N+1" walkthrough, structuredContent contract, gotchas, Python-vs-TypeScript SDK note

Test plan

pytest tests/test_widget_server.py → 35 passed
pytest tests/test_mcp_config_install.py → 5 passed
pytest tests/ -k "mcp or widget or schema_diff or impact or lineage" → 363 passed, no widget-related regressions
Cwd=/ smoke (recce mcp-widget-server --project-dir <dbt>) — exit 0, clean stderr, no stdout pollution
Manual end-to-end in Claude Desktop — Demo 1 + Demo 2 recorded (Looms above); widgets render with real data, dark mode correct, plain prompts trigger correct tools
Manual end-to-end in Cowork (parallel OAuth POC, DRC-3527 — not a blocker)

Note: tests/test_server.py::test_spa_route_lineage fails locally because the frontend SPA build artifact (recce/data/lineage/index.html) isn't present in this worktree — pre-existing, unrelated to this PR (fails identically with the diff stashed; resolved by cd js && pnpm run build).

User-facing changes

For Claude Desktop POC users:

Install config the easy way: recce mcp-config-install --project-dir /path/to/dbt (writes both entries + env var, backs up existing config). Or hand-edit per docs/mcp-widgets.md.
RECCE_MCP_WIDGETS=0/unset → zero change, all 20 tools text-only from recce mcp-server, backward compatible.
RECCE_MCP_WIDGETS=1 → recce mcp-server serves 5 non-widget tools, recce mcp-widget-server serves the 15 widget tools. No name collision.
Other MCP hosts (no MCP Apps support) → graceful degradation: host ignores _meta.ui.resourceUri, uses the tool's one-sentence content text.

What's NOT in this PR

Cloud session / snapshot support for recce mcp-widget-server (iter 1 = local mode only). This is the strategic next step — MCP ↔ Recce Cloud is what lets non-technical users skip git/dbt entirely and is where PR-selection + artifact-packaging are properly solved (not as extra local tools).
Full FastMCP migration collapsing the two-server architecture (design Open Q [Chore] Add flake8 style check #6; now warranted at 75% coverage)
lineage_diff beyond v1 — progressive degradation for >10 nodes (shrink / zoom / aggregate) and the ReactFlow @datarecce/ui/lineage borrow contract
Lifespan refactor (_recce_server still module-global)
OAuth POC for Cowork distribution (Kent Huang DRC-3527, parallel)

Reviewer notes

docs/mcp-widgets.md is the canonical Scott hand-off doc — if the "Add widget N+1" walkthrough leaves a cold reader unsure, that's PR-blocking feedback.
Commit history is deliberately preserved (not squashed) — each commit is one widget / fix / refactor step with meaningful diff boundaries; review in order.
CSS follows project conventions: space-separated rgb(), no @datarecce/ui internal-path imports (widget HTML is plain JS).

🤖 Generated with Claude Code

Review follow-up — deferred items (⚠️ this PR is NOT rushing to merge)

A full review pass (kc-pr-review + kc-pr-review-resolve) ran on d089e737. Actionable fixes are committed (65ca0739→d089e737): ~ path expansion + atomic config write in mcp-config-install, case-insensitive RECCE_MCP_WIDGETS, stdio-safe --debug traceback, robust recce/data/** gitignore (nested build artifacts now ignored; recce/data/mcp/ source tracked), single-quote HTML escaping across all 15 widgets, rel="noopener noreferrer" on the PR link, widget HTML load anchored on the recce package (no longer depends on namespace-package resolution → reliable under pip install), and doc/test-count sync.

even-wei's NO-GO blocker (the test_widget_server.py black/isort format gate) was already resolved in 230af0f0 and is re-verified clean on the current head (black/isort/flake8 exit 0).

The items below are intentionally deferred — this is an iter-1 POC whose goal is breadth (a widget for every tool), not hardened design. Each is marked with a TODO(DRC-3526 follow-up) comment in code; a DRC follow-up ticket will track them:

Widget tools bypass the central call_tool error classifier/telemetry (DRC-2754). Errors still surface (FastMCP wraps them); what's deferred is expected-error classification (table_not_found/permission_denied → warning) + Sentry metrics for the 15 widget tools. Resolve when the two-server architecture is consolidated. — recce/widget_server.py
impact_analysis value_diff drift fix lacks a real-DuckDB e2e. Covered by a mock test (drifted column absent from generated SQL) + a manual Snowflake A/B; a real-adapter e2e with a PK + a column present only in current is the gap. — recce/mcp_server.py
schema_diff has no single-env warning. The empty-state ("No models found") can read as a clean diff when there is simply no base to compare against. — recce/widget_server.py
Type contracts use free str for fixed-set fields (status / change_status / priority) — batch a Literal/submodel tightening pass. — recce/widget_server.py
impact_analysis doubles base-side warehouse introspection (the cost of the binder-error correctness fix) — consider caching/batching get_model(base=True). — even-wei N3
15 copy-pasted escapeHtml/svgEl helpers — extract a shared asset to keep the escaping contract single-sourced. — even-wei N5
check_base_freshness SHA-freshness is opt-in with no production caller (the stale_sha/expected_base_sha path is a tested opt-in API, not pure dead code). Decide in a focused PR: leave as-is, wire a caller, or remove SHA-freshness — it touches the unrelated recce check-base command + tests, so it is out of scope here. — recce/cli.py

Pre-merge check: run a packaged-wheel smoke test (pip install the built wheel, start recce mcp-widget-server, load one widget) to confirm recce/data/mcp/ ships and resolves under the installed layout.

Added during the /sync-gbrain session that backfilled embeddings after Conductor's GSTACK_OPENAI_API_KEY shim was wired up. The .gbrain-source file (gitignored) pins this worktree to its scoped code source for kubectl-style routing. CLAUDE.md gets a new "GBrain Search Guidance" section so future agents in this worktree prefer gbrain over Grep for semantic queries. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>

…_TOOLS filter (iter 1 Day 1) Per design-20260521-234647 amended — adds a parallel FastMCP-based widget server that delegates to RecceMCPServer's existing _tool_row_count_diff and _tool_schema_diff methods, gated by RECCE_MCP_WIDGETS=1 env var. - recce/widget_server.py: NEW, FastMCP("recce-widgets") with @mcp.tool delegates and @mcp.resource handlers (graceful when recce/data/mcp/*.html missing). - recce/cli.py: NEW `recce mcp-widget-server` subcommand (local mode only). - recce/mcp_server.py: WIDGET_TOOLS set, _widgets_enabled helper, list_tools filter, call_tool defensive raise. Day 1 scope only — widget HTML assets + tests deferred to Day 1.5 / Day 2. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>

…eedback) Fix #1: run_widget_server() is synchronous — mcp.run(transport="stdio") manages its own asyncio event loop internally. asyncio.run(run_widget_server(...)) raised ValueError because the function returns None, not a coroutine. Drop asyncio.run() in cli.py and remove the unused asyncio import. Fix #2: replace **arguments with typed params so FastMCP infers a proper JSON inputSchema for tools/list. Without typed params, Claude Desktop sees the tool registered but cannot construct a tools/call (no schema to fill). Params translated from existing low-level Tool inputSchema definitions: - row_count_diff: node_names, node_ids, select, exclude (all Optional) - schema_diff: select, exclude, packages (all Optional) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>

- recce/data/mcp/row_count_diff.html: NEW, ~110 LOC tier 1 widget (status pills + diff numbers; base/curr null guards using === null) - recce/data/mcp/schema_diff.html: NEW, ~110 LOC tier 2 widget (added/ removed/type_changed sections; empty state shows unchanged_count) - recce/mcp_server.py: extract _compute_schema_changes() helper that returns rich per-model dict {added, removed, type_changed, unchanged_count}; existing _tool_schema_diff flattens output back to DataFrame for low-level mcp-server consumers — zero regression on existing response shape - recce/widget_server.py: row_count_diff delegate wraps result as {"models": result}; schema_diff delegate calls _compute_schema_changes directly and returns {"models": rich_result} for widget consumption - .gitignore: replace broad recce/data ignore with per-extension rules so recce/data/mcp/*.html (widget source, not build output) can be committed Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>

- tests/test_widget_server.py: 5 tests covering WIDGET_TOOLS env-var coordination, widget server tool/resource registration, file-missing graceful degradation, and tool enumeration regression. - docs/mcp-widgets.md: template for Scott — file layout, Claude Desktop config example, add-widget walkthrough using row_count_diff as worked reference, structuredContent contract, gotchas (SDK version pin, typed-params requirement, dual-key meta, mcp.run sync nature, .gitignore allowlist, stdout discipline). Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>

…de Desktop cwd=/ workaround) Claude Desktop spawns MCP servers with cwd=/. Two issues surfaced when manually testing iter 1 widgets: 1. recce/track.py:61 used bare print() which goes to stdout. When RecceConfig couldn't find/create recce.yml at cwd=/, the resulting traceback corrupted the JSON-RPC stdio channel. Fixed to print to stderr. Pre-existing bug — affected mcp-server equally. 2. recce/cli.py mcp_server and mcp_widget_server now chdir to --project-dir before RecceConfig so the config file is searched relative to the user's dbt project instead of cwd. Eliminates need for a bash wrapper in Claude Desktop config. Verified by reproducing the cwd=/ failure before fix (32 lines of traceback to stdout) and confirming clean stdout (0 lines) after. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>

Claude Desktop manual test revealed widget rendered but showed empty state ("No models found") while agent text summary correctly reported 5 models. Root cause: FastMCP with @mcp.tool returning Dict[str, Any] may not populate the JSON-RPC structuredContent field — only content text payload. Widget HTML only read structuredContent. - Both widgets now try structuredContent first, fall back to parsing content[0].text as JSON. Works regardless of FastMCP serialization behavior. - row_count_diff also skips keys starting with "_" when iterating models (so _maybe_add_single_env_warning's _warning key doesn't break the per-row render); _warning text is rendered as a notice banner above the table when present. - schema_diff gets the same underscore-key skip for defensive uniformity. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>

…Content Per MCP log inspection: FastMCP synthesizes an outputSchema like \`{properties: {result: {...}}, required: ["result"]}\` when @mcp.tool returns a Dict[str, Any] without a Pydantic return type. The actual structuredContent sent to the widget has shape {result: {models: ...}} not {models: ...}. Widget HTML now unwraps the result key when present, so both flat returns and FastMCP-wrapped returns render correctly. Defensive: only unwraps if data.result exists AND data.models does not, so tools that legitimately return both keys aren't mishandled. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>

…ntegration Manual Claude Desktop test showed widget rendered real data correctly but hand-rolled hardcoded colors looked washed-out in dark mode and out of place next to Claude's native UI. Refactored both widgets to use Claude's CSS custom properties (var(--color-text-primary), var(--color-background-*), var(--font-sans), etc.) with light-mode hex fallbacks and a prefers-color-scheme dark fallback for non-Claude hosts. - row_count_diff.html: redesigned with header + 4-up summary cards + main table, semantic status badges (warning/success/info/secondary), tabular numeric alignment, monospace model names. - schema_diff.html: same design-token migration, per-model sections with semantic colors for added/removed/type_changed. - No new external dependencies (kept ext-apps SDK as only unpkg load, CSP resourceDomains unchanged). Used Unicode glyphs instead of icon font to stay dependency-light. Preserves all existing JS: unwrap fallback for FastMCP {result: ...} wrapping, _warning notice rendering, === null guards, _-prefix key skip. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>

…lt + annotations Per anthropic/skills/mcp-builder Python idiom checklist (python_mcp_server.md:691). Closes the gap between iter 1's "make it work" shortcuts and SDK best practices. - Pydantic BaseModel for tool inputs (RowCountDiffInput, SchemaDiffInput) with Field(description=...) on every param — replaces bare Optional[...] params, gives the LLM richer inputSchema descriptions. - Pydantic BaseModel for tool outputs (RowCountDiffOutput, SchemaDiffOutput, nested per-model models) — replaces Dict[str, Any] return, generates proper outputSchema, eliminates FastMCP's {result: ...} wrapping (widget JS unwrap fallback removed in commit 2). - Return CallToolResult explicitly with short content (one sentence) + data in structuredContent only. Agent receives a summary; widget receives the dict. - annotations dict on every @mcp.tool: title, readOnlyHint, destructiveHint, idempotentHint, openWorldHint per checklist line 691. - _warning moved from top-level result dict to RowCountDiffOutput.warning named field — cleaner Pydantic shape, no more _-prefix key skip in widget JS. - logging.basicConfig explicitly sets stream=sys.stderr per stdio discipline (mcp_best_practices.md:139). - prefersBorder: False added to @mcp.resource meta per spec UIResourceMeta. - Tool docstrings expanded per python_mcp_server.md:278-328 pattern: what / Use when / Don't use when / Returns / Error Handling. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>

Per add-app-to-server/SKILL.md JS-side patterns. Replaces manual CSS var() fallback with SDK helpers that actively apply host design tokens via postMessage context. - Import applyDocumentTheme, applyHostStyleVariables, applyHostFonts from ext-apps SDK. Register app.onhostcontextchanged to invoke them on theme/ font change. Apply initial context after connect() if exposed. - Remove dead defensive layers no longer needed after commit 1: - structuredContent unwrap (FastMCP no longer wraps Pydantic returns) - content[0].text JSON.parse fallback (content is one-sentence string) - _-prefix key skip (warning is now a named field, models is clean dict) - Add app.onteardown handler returning {} per SDK pattern. - CSS fallback values: rename --color-text-error → --color-text-danger and --color-background-error → --color-background-danger (and badge-error → badge-danger) to match spec enum McpUiStyleVariableKey (no "error" semantic; "danger" is canonical). - CSS var() fallbacks retained as defensive layer in case SDK helper races. - warning now read from data.warning (named field) not data.models._warning. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>

…ests Closes iter 1 idiomatic refactor. - tests/test_widget_server.py: 3 new tests (5→8 total): - test_row_count_diff_returns_calltoolresult_with_short_content: verifies content is short one-sentence string and structuredContent has correct keys. - test_structured_content_matches_pydantic_model: RowCountDiffOutput.model_validate round-trip on structuredContent proves clean Pydantic shape. - test_widget_tool_annotations_present: asserts readOnlyHint/destructiveHint/ idempotentHint/openWorldHint/title on both tools. - docs/mcp-widgets.md: - Step 2 (HTML template): updated to use applyDocumentTheme/applyHostStyleVariables/ applyHostFonts from ext-apps SDK + onteardown handler. - Step 3 (@mcp.tool template): Pydantic BaseModel input/output + CallToolResult replaces bare typed params + Dict[str, Any] return pattern. - Step 4 (@mcp.resource): added prefersBorder: False to meta example. - structuredContent contract: updated to explain CallToolResult + short content discipline; added note on why bare Dict causes {result: ...} wrapping. - Gotchas: replaced bare-Dict warning with Pydantic requirement; added CSS danger vs error naming gotcha; removed {result: ...} unwrap entry (resolved). - Added "Python vs TypeScript SDK Support" subsection explaining qr-server pattern and why Recce stays Python. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>

…classes Captain's manual test in Claude Desktop dark mode revealed summary card values invisible and model name cells with unexpected styling. Root cause: the @media (prefers-color-scheme: dark) blocks in both widget HTML files only covered .card, th, border, .na, .empty, .warning — leaving .card-value, .card-label, .header-*, .model-name, .diff-*, and .badge-* with light-mode fallback hex values that become invisible on Claude's dark card backgrounds. Both widgets now have exhaustive @media dark overrides so rendering is correct in all four (mode × SDK helpers fire/no-fire) combinations: - Dark + SDK fire: host tokens drive - Dark + no SDK: @media dark fallback hex drives (this commit's job) - Light + SDK fire: host tokens drive - Light + no SDK: default var() inline fallback drives Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>

codecov · 2026-05-25T09:19:36Z

Codecov Report

❌ Patch coverage is 93.73421% with 124 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
recce/cli.py	53.79%	67 Missing ⚠️
recce/widget_server.py	91.94%	54 Missing ⚠️
recce/mcp_server.py	94.54%	3 Missing ⚠️

Files with missing lines	Coverage Δ
recce/track.py	`86.84% <100.00%> (ø)`
tests/test_check_base.py	`100.00% <100.00%> (ø)`
tests/test_mcp_cloud_backend.py	`100.00% <100.00%> (ø)`
tests/test_mcp_config_install.py	`100.00% <100.00%> (ø)`
tests/test_mcp_server.py	`99.86% <100.00%> (+<0.01%)`	⬆️
tests/test_widget_server.py	`100.00% <100.00%> (ø)`
recce/mcp_server.py	`91.30% <94.54%> (+0.02%)`	⬆️
recce/widget_server.py	`91.94% <91.94%> (ø)`
recce/cli.py	`67.51% <53.79%> (-1.47%)`	⬇️

... and 4 files with indirect coverage changes

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copilot

Pull request overview

Adds an Iter 1 POC for MCP Apps widgets by introducing a parallel FastMCP-based “widget server” that serves widget-capable versions of row_count_diff and schema_diff, coordinated with the existing stdio MCP server via RECCE_MCP_WIDGETS=1.

Changes:

Added recce mcp-widget-server (FastMCP) with widget tool delegates + HTML widget resources for row_count_diff and schema_diff.
Coordinated tool routing between servers using WIDGET_TOOLS + _widgets_enabled() filtering in recce/mcp_server.py.
Added tests and documentation for the widget workflow; fixed one stdout→stderr issue for stdio MCP safety.

Reviewed changes

Copilot reviewed 9 out of 10 changed files in this pull request and generated 5 comments.

Show a summary per file

File	Description
tests/test_widget_server.py	Adds test coverage for widget server registration, env-var coordination, and tool result shapes.
recce/widget_server.py	Implements FastMCP widget server with two widget tools and two HTML resources.
recce/mcp_server.py	Adds widget tool filtering/guardrails and extracts `_compute_schema_changes()` for shared schema-diff shape.
recce/cli.py	Adds `mcp-widget-server` CLI command and ensures `--project-dir` becomes cwd for config discovery.
recce/track.py	Sends traceback output to stderr to avoid corrupting stdio JSON-RPC.
recce/data/mcp/row_count_diff.html	Self-contained MCP Apps widget for row count diffs.
recce/data/mcp/schema_diff.html	Self-contained MCP Apps widget for schema diffs.
docs/mcp-widgets.md	Developer guide/template for adding future widgets.
.gitignore	Stops ignoring the entire `recce/data` tree; ignores only build output paths so `recce/data/mcp/*.html` can be committed.
CLAUDE.md	Adds gbrain search guidance block.

iamcxa · 2026-06-02T14:29:08Z

+                base_type = base_columns[col].get("type")
+                curr_type = current_columns[col].get("type")


Acknowledged — minor. _compute_schema_changes can emit base_type / curr_type as None (from .get("type")), and the widget already tolerates it (c.type ?? ""). Tightening the SchemaChange annotation to the real shape ({name,type} / {name,base_type,curr_type} submodels, or Optional) is batched with the other type-contract polish (the Literal/submodel pass) — deferred for the POC, tracked in the follow-up discussion.

…onfig POC users had to manually edit claude_desktop_config.json to register both recce + recce-widgets MCP servers and set RECCE_MCP_WIDGETS=1 on both. This helper does it idempotently: merges entries (preserves other servers), auto-sets the env var, backs up the old config, and prints next-steps. - Required: --project-dir <PATH> (validated for dbt_project.yml presence) - Optional: --config <PATH>, --yes, --dry-run - iter 1 scope: macOS only, local mode only (errors on cloud flags / other OS with explicit fix message) - 5 unit tests using CliRunner + tmp_path (write/preserve/validate/dry-run/backup) Closes the manual install friction surfaced in iter 1 manual testing. Cuts "5 minutes editing JSON + finding absolute recce path + restarting" to one command + Cmd+Q. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>

Adds the third widget in the iter 1 widget POC line — tier 1 (status pill + key/value grid) for server runtime state. First widget added post-iter-1 idiomatic refactor, so it's born following the canonical pattern: - Pydantic output model matching _tool_get_server_info return shape - @mcp.tool with annotations dict (readOnly/idempotent/title) - CallToolResult with short content + structuredContent (no agent dual-render) - @mcp.resource with mime text/html;profile=mcp-app + CSP + prefersBorder - Widget HTML: SDK helpers (applyDocumentTheme + applyHostStyleVariables + applyHostFonts), onhostcontextchanged + onteardown, exhaustive @media dark for every class with var(--color-*, fallback) — closes the dark-mode fallback gap that bit row_count_diff/schema_diff earlier (commit b882697) - WIDGET_TOOLS set updated to filter from main mcp-server when widgets enabled - 2 new tests + updated enumeration assertions (10 total, was 8) - docs/mcp-widgets.md "Add widget N+1" walkthrough now references this as the canonical worked example (born idiomatic, no later-cycle defensive layers) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>

Fourth widget. tier 2 (list / simple table) — saved Recce checks for a PR session, rendered as a 3-up summary + status table. - Pydantic CheckSummary + ListChecksOutput models matching actual _tool_list_checks shape (minimal field set — name/type/status/description, NOT internal params). approved count from raw return ("approved" key); pending derived as total - approved in the delegate. - @mcp.tool with annotations, CallToolResult short content + structuredContent - @mcp.resource with mime/CSP/prefersBorder - Widget HTML: SDK helpers + onhostcontextchanged + onteardown + exhaustive @media dark overrides (354 lines, 13 KB) - WIDGET_TOOLS filter updated to 4 widget tools - 2 new tests; enumeration assertion bumped to 4; annotation loop includes list_checks; filter test asserts list_checks absent when widgets enabled Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>

Fifth widget, completing Phase A of the post-iter-1 widget expansion. Single-model detail view (base/current column tables, constraints, not-found state). - Pydantic ColumnInfo + ModelEnvironment + GetModelOutput + GetModelInput matching actual _tool_get_model return shape (columns dict → list normalised in _parse_model_env helper; primary_key preserved; raw_code omitted) - @mcp.tool with annotations, CallToolResult short content + structuredContent - @mcp.resource with mime/CSP/prefersBorder - Widget HTML: SDK helpers + onhostcontextchanged + onteardown + exhaustive @media dark; adaptive 2-col/3-col column table layout (constraints visible only when present); per-env base/current sections; not-found empty state - WIDGET_TOOLS filter at 5 tools - 2 new tests (test_get_model_widget_registered, test_get_model_returns_calltoolresult_with_pydantic_shape); enumeration assertions bumped to 5 tools/resources Phase A (tier 1-2 multipliers) complete: row_count_diff, schema_diff, get_server_info, list_checks, get_model. ~half the design's 12 widget candidates done — design Open Q #6 >=50% FastMCP migration trigger approaches. Iter 2 mini-doc to reevaluate. No Pydantic reserved-name conflicts in actual handler shape (columns, primary_key, raw_code are all safe); Gotcha documented in mcp-widgets.md for future implementors. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>

…st[str] Captain hit Pydantic validation error in Claude Desktop: actual _tool_get_server_info returns support_tasks as {"query": True, ..., "change_analysis": True} (dict of task slug → enabled bool), not a list of enabled slugs. Fix the Pydantic field type and widget HTML iteration. - ServerInfoOutput.support_tasks: Dict[str, bool] (was List[str]) - Widget HTML iterates Object.entries(...).filter(([_,v]) => v) for enabled tasks; renders as info badges (same visual as before) - Test fixture updated to use dict shape Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>

… invocation) Captain hit Pydantic validation error in Claude Desktop: Error executing tool list_checks: 1 validation error for list_checksArguments args Field required [type=missing, input_value={}, input_type=dict] Root cause: list_checks delegate signature was `async def list_checks(args: ListChecksInput)` where ListChecksInput was an empty BaseModel (no real args to pass). FastMCP exposed this as an inputSchema requiring `args` as a top- level field — agent called with no params, MCP sent {}, validation failed. Fix: drop the empty ListChecksInput class and signature param. Match the no-args pattern of get_server_info. `_tool_list_checks` still receives {} internally (the handler accepts but ignores arguments). Confirmed only list_checks affected: row_count_diff / schema_diff have Optional-fields-only inputs (OK), get_server_info has no-args (OK), get_model has required model_id (OK). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>

…dget) Sixth widget; first tier-3 widget. Establishes the data-table rendering pattern for the other Phase B widgets (query_diff, value_diff, value_diff_detail, top_k_diff). - QueryInput / QueryOutput Pydantic models matching actual DataFrame.model_dump shape (READ from source — columns/data/limit/more/total_row_count) - Sticky-header scrollable table inside ~400px container - Cell type rendering (null → "—" italic secondary, number → tabular-nums right, string → truncated with title attr, bool → check/dash, date → display as-is) - Truncation badge when result is capped (DataFrame.more) - Empty state with SQL echo for debug context - @mcp.tool with annotations (openWorldHint=True since queries hit the warehouse) - CallToolResult short content + structuredContent - WIDGET_TOOLS filter expanded to 6 tools - 2 new tests (tests 15-16); enumeration assertions bumped to 6 - docs/mcp-widgets.md gains "Adding a tier-3 (data table) widget" section as template for the remaining Phase B widgets Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>

…mparison) Seventh widget. Builds on query widget (46251d5)'s data-table pattern. Compares SQL execution against base + current environments with per-row status rendering. Handles both QueryDiffTask result shapes from source. - QueryDiffInput / QueryDiffOutput Pydantic models verified against actual QueryDiffTask.execute() return shape (QueryDiffResult) - Two render modes detected at runtime from structuredContent: * Shape A (no primary_keys): side-by-side base / current tables * Shape B (primary_keys): join-diff table with in_a/in_b stripped, status pills (Added/Removed), row tinting, filter buttons - _parse_dataframe() helper mirrors _parse_model_env pattern - _warning extracted to output.warning named field (consistent with row_count_diff Day 1.5 pattern) - Widget HTML: sticky-header scrollable tables + status pills + filter buttons (All / Added / Removed with counts) + dark-mode coverage - @mcp.tool with openWorldHint=True (queries hit warehouse) - WIDGET_TOOLS updated to 7 tools - 2 new tests (registration + both Shape A and B); enumeration bumped to 7 - test_widget_tool_annotations_present updated for query_diff open-world group - docs/mcp-widgets.md: widget count + reference table updated Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>

…comparison) Eighth widget. Aggregate stats + per-column match breakdown for value comparison across two environments (primary key matching). - ValueDiffInput / ValueDiffColumnRow / ValueDiffSummary / ValueDiffOutput Pydantic models matching actual ValueDiffTask return shape (verified from source): summary={total, added, removed}, data.data=[[col, matched, matched_p], ...] where matched_p is 0.0-1.0 fraction (not percentage) - _warning extracted to output.warning named field - 4-up summary cards (total/common/columns-affected/added+removed) + per-column table with inline bar visualization (CSS-only, no chart lib) - All-match empty state when every column is 100% identical - @mcp.tool with annotations (openWorldHint=True, queries warehouse) - WIDGET_TOOLS at 8 tools - 2 new tests (registration + Pydantic shape); enumeration bumped to 8 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>

…el inspection) Ninth widget. Row-level companion to value_diff. Shows the actual rows with mismatched values for investigation. - ValueDiffDetailInput / Output Pydantic models matching actual ValueDiffDetailTask return shape (plain DataFrame with in_a/in_b flags) - Scrollable table with sticky primary-key column (CSS sticky left), filter buttons (All/Removed/Added) with counts, status pill per row, row tinting (red=removed, green=added), cell type-aware rendering - _warning extracted to output.warning named field - @mcp.tool with annotations (openWorldHint=True) - WIDGET_TOOLS at 9 tools - 2 new tests (Tests 21-22); enumeration bumped to 9; 27 total passing Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>

Tenth widget; closes Phase B (5 tier-3 widgets). Side-by-side ranked lists of top-K most frequent values across base + current envs. - TopKDiffInput / TopKEnvStats / TopKDiffOutput Pydantic models matching actual TopKDiffTask return shape (READ from source): base/current: {values[], counts[], valids, total} parallel lists. values[] is the SAME union list in both envs (SQL FULL OUTER JOIN order: curr_count DESC, base_count DESC). count=0 means absent from that env. Param is column_name (not column), default k=10 (not 50). - 2-column grid with ranked lists per env, inline bar viz, rank-change arrows (up/down), New/Gone badges for env-exclusive categories - _warning extracted to output.warning named field - @mcp.tool annotations (openWorldHint=True — executes warehouse SQL) - WIDGET_TOOLS at 10 tools — 50% coverage (10/20), design Open Q #6 FastMCP migration trigger threshold REACHED. iter 2 to reevaluate. - 2 new tests (23: registration + inputSchema, 24: Pydantic shape); enumeration bumped to 10 Phase B (tier-3 data tables): query, query_diff, value_diff, value_diff_detail, top_k_diff. Each hand-rolled its table/list layout because patterns diverged enough that a shared <recce-table> would need extreme flexibility. Iter 2 mini-doc: evaluate renderRankedList() JS helper extraction; full component abstraction deferred until 3+ widgets converge on same layout. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>

…t widget) Eleventh widget; first tier-4 (statistics/chart) widget. Hand-rolled SVG bars — no new chart library, no CSP/CDN changes. Establishes the "tier-4 via hand-rolled SVG" pattern that profile_diff and any future chart widgets will follow. Actual HistogramDiffTask return shape (read from recce/tasks/histogram.py): base/current: {counts: List[int], total: int} (or {} if env failed) min/max: overall min/max across both envs (numeric or date object) bin_edges: N+1 edge values (int/float for numeric, date for datetime) labels: List[str] for numeric cols ("lo-hi" format); None for datetime _tool_histogram_diff auto-detects column_type from catalog — tool only needs model + column_name (+ optional num_bins). - HistogramDiffInput / HistogramEnvStats / HistogramDiffOutput Pydantic models matching actual HistogramDiffTask return shape - Date bin_edges serialised to ISO strings in widget delegate (date objects are not JSON-serialisable natively) - Hand-rolled SVG: viewBox 600x180, base bars (blue 45% opacity) behind current bars (green 70% opacity), x-axis label density auto-reduced (every Nth label for bins > 10), hover tooltip via mousemove - CSS tokens for bar colors with exhaustive @media dark fallback - _warning extracted to output.warning named field - @mcp.tool annotations (openWorldHint=True, hits the warehouse) - WIDGET_TOOLS at 11 tools - 2 new tests (25, 26); enumeration bumped to 11; annotations test extended to cover histogram_diff openWorldHint=True - docs/mcp-widgets.md: widget count 10→11, new "Tier-4 (Chart) Widget Architecture" section explaining hand-roll SVG approach + iter-2 upgrade criteria for swapping to a real chart library Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>

…e cap) 15th widget tool. Hand-roll SVG mini-DAG (no Mermaid/ReactFlow dep) copying the impact_analysis BFS layout pattern. Hard cap MAX_INLINE_NODES=10 — over cap shows graceful empty-state message pointing to Recce web UI, no truncated view. Full ReactFlow plan from design Phase E remains deferred. Motivation: without a lineage_diff widget, Claude Desktop's agent falls back to rendering lineage as Mermaid text after seeing impact_analysis output, bypassing the widget rendering pipeline. Adding this widget keeps lineage visualization inside the same theming + interaction surface as the other 14 widgets and brings widget coverage to 15/20 (75%). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>

These two Claude Code plugins back the MCP App widget work in this branch (ext-apps SDK reference, MCP server scaffolding). Enabling them in the repo settings keeps the dev environment consistent for anyone continuing the widget POC. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>

The impact_analysis value_diff builder derived its per-column diff list from model_info["columns"], which reflects only the CURRENT relation (get_model -> get_columns(base=False)). It then applied `b."col" IS DISTINCT FROM c."col"` to BOTH the base (b) and current (c) relations. When a column had drifted — present in the current physical table but not the base — the warehouse binder failed hard (Snowflake: `Table "b" does not have a column named "<col>"`). Restrict the diff to the intersection of base and current columns via get_model(node_id, base=True) (which manages its own warehouse connection), mirroring ValueDiffTask. Drifted columns are already reported via schema_changes. Also skip value_diff when the PK itself has drifted, since it is the FULL OUTER JOIN key. Adds a regression test asserting drifted columns never reach the generated SQL and that value_diff covers only common non-PK columns. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>

`logs/` (dbt run logs) and `.hypothesis/` (hypothesis testing example cache) are runtime/test artifacts regenerated on every test run, but the existing rules only covered the root `dbt.log` file — not the `logs/` directory dbt writes to, nor `integration_tests/dbt/logs/`. They showed as persistent untracked changes in the workspace UI. Ignore the dir forms (no leading slash → matches at any depth). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>

even-wei · 2026-06-02T09:23:17Z

Code Review: PR #1397

SHA 87f22fd3 · Verdict NO-GO

Verified checks: flake8 clean (exit 0); targeted suites 156 passed (test_widget_server.py + test_mcp_config_install.py + test_mcp_server.py) under the project-pinned mcp 1.27.0 / Python 3.13; black --check and isort --check FAIL on tests/test_widget_server.py; bundled security-review found no exploitable Critical/High.

Blockers

None.

Issues

tests/test_widget_server.py — Fails the project's required make format gate (black + isort).
Evidence: black --check → "1 file would be reformatted" (line ~2218 over-length call); isort --check → import block at line ~2248 needs multi-line wrap. AGENTS.md lists make format as a pre-commit/CI requirement.
Pass E / project-convention.

Notes

recce/cli.py:3174 (mcp_config_install) — On re-run the backup is overwritten with the already-modified config, so .recce.bak no longer holds the pristine pre-recce original; only the recce entries differ, and re-run is otherwise idempotent (deterministic new_entries). Consider skip-if-exists or timestamped backups.
Pass D.
recce/data/mcp/*.html — Shared escapeHtml escapes & < > " but not ' ('). Not exploitable today (all interpolated attributes are double-quoted and " is escaped; SVG labels use textContent). Defense-in-depth only; add .replace(/'/g,"'") if a single-quoted attribute is ever introduced.
Pass B (Security, LOW).
recce/mcp_server.py:1885 (impact_analysis) — New get_model(node_id, base=True) adds a second per-model warehouse introspection round-trip inside the impacted-models loop. Correctness fix (avoids binder failure on drifted columns) but doubles base-side introspection on large impacted sets.
Pass G.
recce/cli.py:3012 (mcp_config_install) — recce mcp-config-install and the macOS-only path are correctly OS-guarded, but there is no uninstall / removal path; the docstring's "To undo: replace with .recce.bak" is the only recovery, which is lost after a second run (see Note 1). Untested uninstall for a config-mutating installer.
Pass E.
recce/data/mcp/ — 15 hand-written HTML/JS widgets (~8.4k LOC) carry a copy-pasted escapeHtml + svgEl helper per file. For a POC this is acceptable, but it's 15 sources of truth for the escaping contract; a shared asset would prevent Note 2 drifting per-file.
Pass I (architecture, copy-paste).

Limits

uv run pytest (the default dev path) resolves to a STALE local env (Python 3.10 + an older mcp whose FastMCP.resource() lacks the meta kwarg), yielding 32 spurious TypeError: ... unexpected keyword argument 'meta' failures. These are an environment artifact, NOT a code defect — under the pinned mcp>=1.23.0 (resolved to 1.27.0, where resource() DOES accept meta) all 156 tests pass. Verified by forcing the 3.13 venv. CI using the locked deps will pass; a developer with a stale global mcp will see false failures.
Did not exercise the widgets live in Claude Desktop (macOS MCP-app rendering) — XSS assessment is static data-flow only. SQL query/query_diff tool args flow to the pre-existing _tool_query path (unchanged by this PR), not re-reviewed for injection here.

even-wei

Claude Code Review: NO-GO. 1 issue (format gate) + 5 notes. See review comment.

Batch of correctness/robustness fixes from a Codex review pass: - mcp_server: default 30s timeout on cloud proxy requests so a stalled endpoint cannot hang the MCP server indefinitely. - mcp_server: include the effective base selection (single-env vs dual-env) in the set_backend local cache key, so a context loaded in single-env mode is reloaded once target-base/ later appears instead of serving a stale single-env context. - cli: mcp_config_install invokes `python -m recce.cli` (the package has no recce.__main__, so `-m recce` is not runnable). - cli: make check_base_freshness SHA comparison opt-in via expected_base_sha. Base-branch artifacts legitimately carry a DBT_GIT_SHA that differs from the feature-branch HEAD, so the previous unconditional compare produced false "stale_sha" results. - tests: cover cloud request timeout, single-env -> dual-env reload, the config-install module path, and the opt-in base-SHA freshness check. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>

Addresses the Code Review format-gate issue (Issue 1): black wraps an over-length call and isort splits the multi-line import block in the lineage_diff widget test. Pure formatting, no behavior change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>

Addresses Code Review Note 1. The backup was rewritten unconditionally on every run, so a second `recce mcp-config-install` clobbered the `.recce.bak` with the already-modified config — destroying the pristine pre-recce original and making the documented "restore from .recce.bak" undo path useless. Skip the copy when a backup already exists, and report "Existing backup preserved" instead of "Backup saved" in that case. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>

Copilot

Pull request overview

Copilot reviewed 27 out of 28 changed files in this pull request and generated 5 comments.

iamcxa

@iamcxa — automated review (kc-pr-review, Full tier: code-reviewer + silent-failure-hunter + pr-test-analyzer + type-design-analyzer + HTML-XSS scan + ToB security). Event is COMMENT, not approve: (a) tests could not be re-run locally (env SIGKILL/OOM at pytest import — the first background run did exit 0, and the PR body documents 35+5 passing); (b) the check_base_freshness behavior change needs author intent; (c) the one production-risk path (Snowflake binder in the value_diff fix) is mock + manual-A/B only, no CI e2e. No CRITICAL/HIGH blocking bugs were found.

Goal-achievement verdict: ✅ Achieved (POC scope, with follow-up caveats)

Evidence

All 15 tool/resource pairs register correctly; every _meta.ui.resourceUri maps to a real recce/data/mcp/*.html (verified file-by-file).
All 15 HTML widgets escape warehouse data consistently (escapeHtml / textContent / numeric formatters) — no exploitable stored-XSS sink found.
The impact_analysis value_diff drift fix is correct — it intersects base ∩ current columns and uses get_model(base=True) connection-managed introspection, matching the canonical ValueDiffTask.
Type design is solid for a POC (None-vs-zero distinctions in RowCountModel, ValueDiffColumnRow.matched_p, HistogramEnvStats.total are deliberate and correct).

Verification Matrix

#	Concern (from PR body)	Verified?	Evidence
V1	`RECCE_MCP_WIDGETS=0`/unset → all 20 tools text-only, backward compatible	✅	`_widgets_enabled()` reads env at call time; filter only removes WIDGET_TOOLS when enabled (`mcp_server.py:1294`)
V2	`RECCE_MCP_WIDGETS=1` → main 5 + widget 15, no collision	✅	`WIDGET_TOOLS` filter + widget server registers exactly the 15-name set; test pins it (`test_widget_server.py:102/173`)
V3	impact_analysis drift fix (Snowflake binder)	⚠️	Code correct, but e2e gap — see inline on `mcp_server.py:1931`
V4	schema_diff serialization (SchemaChange collision fix)	⚠️	Models distinct + field-pinned; no behavioral output test — see Advisory F
V5	stdio stdout purity (JSON-RPC channel)	✅	No stdout writes in widget_server.py; logging → stderr (`widget_server.py:2521`)
V6	chdir to `--project-dir` before RecceConfig	✅	Present in both subcommands; minor: no dir-existence guard (Advisory)

Verification Summary (tests)

Check	Result
Local re-execution	⚠️ Could not run — env SIGKILL/OOM at pytest import (0-byte output; killed during collection). Not a test-logic failure.
First background run (widget + config_install + check_base + mcp_server + mcp_cloud_backend)	exit 0
PR-body documented	`test_widget_server.py` 35 passed · `test_mcp_config_install.py` 5 (file actually has 7) · broad selection 363 passed

Advisory (not blocking)

A — Bundled tooling / scope creep. CLAUDE.md (+37 gbrain boilerplate, includes a per-machine "on this machine" assertion in shared docs) and .claude/settings.json (enables mcp-server-dev for all repo contributors) are unrelated to the widget feature. Suggest splitting into a separate commit/PR, marking POC-only, or moving dev-plugin prefs to gitignored settings.local.json.
B — External SDK dependency. All 15 widgets import https://unpkg.com/@modelcontextprotocol/ext-apps@0.4.0 (no SRI). Mitigated by each resource's csp.resourceDomains: ["https://unpkg.com"] (host-enforced origin restriction) and this is the standard MCP Apps SDK load pattern — acceptable for POC; consider local vendoring before productionizing.
C — Type contracts. status / change_status / priority etc. are free str with legal values only in comments. Converting the widget-branched ones to Literal[...] is a zero-cost contract win.
D — HTML hardening. No <meta> CSP in the HTML (partially mitigated by resourceDomains); get_server_info.html:317 pr.url href has no scheme validation (javascript: possible, though source is GitHub metadata not warehouse) and target="_blank" lacks rel="noopener".
E — Mega-PR. Feature POC + 4 unrelated pre-existing bugfixes (track stderr / chdir / set_backend cache-key / freshness) + tooling, 37 commits / +14K. Hard to review and to revert. Future: land pre-existing bugfixes as their own PRs.
F — Test gaps. schema_diff is the only widget with no behavioral model_validate round-trip test (the collision broke exactly that serialization); no absolute tool-count assertion (assert len==20/5) so dropping a non-widget tool would pass silently.
G — impact_analysis per-step except Exception (pre-existing, mcp_server.py ~1866/1901/2042, not in this diff) appends to errors[] without logging unexpected exceptions — a code bug is indistinguishable from a warehouse failure. Worth narrowing + exc_info logging when that area is next touched.

Nice work on the POC — the escaping discipline, type care, and the documented drift fix are genuinely solid. The items above are hardening + a couple of bundled changes worth confirming.

iamcxa · 2026-06-02T14:02:12Z

-            current_sha = current_commit_hash()
-            result["current_sha"] = current_sha
-            if current_sha and base_sha != current_sha:
+        if base_sha is not None and expected_base_sha:


MEDIUM — SHA freshness is now opt-in (and expected_base_sha), but neither shipping caller passes it: check_base CLI (L3488) and the MCP startup check (mcp_server.py:2807) both omit expected_base_sha. So stale_sha is now unreachable in production, current_sha is always None, and the stale_sha color-map (L3505) + exit-code-2 branch (L3522) are dead.

The rationale (base artifacts come from the base branch → base SHA ≠ feature HEAD → the old auto-check was a false-positive generator) is sound. But this is a behavior change to the existing recce check-base command, not mentioned in the PR body and bundled into a widget POC. Please confirm it's intentional — if so, clean up the dead stale_sha/current_sha path, or wire a caller that actually passes expected_base_sha.

Deferred to the follow-up discussion (POC scope) — leaving this thread open as a tracked TODO. The opt-in SHA change is intentional behavior-wise (base artifacts come from the base branch, so the base SHA always differs from feature HEAD → the old auto-check was a false-positive generator). But stale_sha / current_sha / the exit-code-2 branch are now dead for both callers. Decision pending: clean up the dead path vs. wire a caller that passes expected_base_sha.

On reflection, holding this rather than cleaning it up inside this widget PoC. The stale_sha / expected_base_sha path is not pure dead code — it's a tested opt-in API: test_status_stale_sha_when_expected_base_sha_mismatch exercises the mismatch → stale_sha branch, and test_status_stale_sha pins the no-arg = fresh behavior. So it's "no production caller yet", not "unreachable".

Removing it would touch the recce check-base command (color-map, exit-code-2, docstring) plus those tests — a cross-feature refactor outside this widget PoC's scope, and one I can't validate locally right now (the full pytest suite OOMs in this env; only the lint tools run). Proposal: either leave the opt-in as-is (harmless + tested), or do a focused follow-up that wires a caller passing expected_base_sha (giving it a purpose) or removes SHA-freshness entirely. Leaving this thread open for that decision.

iamcxa · 2026-06-02T14:02:12Z

+        - table_not_found / permission_denied surface in *_meta.status
+        - tool raises only on fundamental dbt/adapter failures
+    """
+    result = await _recce_server._tool_row_count_diff(args.model_dump(exclude_none=True))


MEDIUM — Widget tools call _recce_server._tool_*() directly, bypassing the central @server.call_tool handler in mcp_server.py:1426. That handler is where DRC-2754 error classification lives: _classify_db_error → downgrade table_not_found/permission_denied to warnings + sentry_metrics.count("mcp.expected_error", ...). For all 15 widget tools that machinery never runs — expected DB errors become hard errors and there's zero telemetry.

Not a silent failure (FastMCP still wraps the raised exception into an error result), so not blocking for the POC — but it quietly undoes documented investment. Suggest routing widget delegates through a shared classify/telemetry wrapper, or a tracking note that the widget path defers DRC-2754 parity.

Deferred to the follow-up discussion (POC scope) — leaving open as a tracked TODO. The widget delegates bypass the central call_tool classifier/telemetry (DRC-2754). Errors still surface (FastMCP wraps them); what's lost is expected-error classification + Sentry metrics for the 15 widget tools. Tracking a shared classify/telemetry wrapper for when the two-server architecture is consolidated.

TODO marker added in d080819e (above _recce_server) and tracked in the PR description's "Review follow-up — deferred items" section. Deferred per iter-1 POC scope; leaving open until the shared classify/telemetry wrapper lands (two-server consolidation).

iamcxa · 2026-06-02T14:02:12Z

+                    # Intersect both sides, mirroring ValueDiffTask; drifted columns are
+                    # already reported via schema_changes (Step 2b). get_model(base=True)
+                    # manages its own warehouse connection for the base-side introspection.
+                    base_model_info = self.context.adapter.get_model(node_id, base=True)


MEDIUM (test gap) — This drift fix is correct (intersection + get_model(base=True) connection-managed introspection, matching ValueDiffTask). But it has no real-adapter e2e. test_mcp_e2e.py::test_schema_changes_detected builds the drift shape against real DuckDB, yet its users fixture declares no primary_key, so if not pk: continue (L1917) skips value_diff entirely and the IS DISTINCT FROM binder path is never exercised on a live warehouse. Coverage is mock-only (drifted column absent from generated SQL) + the documented manual Snowflake A/B.

Per project MEMORY, mocks can't catch the missing-warehouse-connection class of bug. Recommend an e2e fixture with a PK where current has a column absent in base; assert impact_analysis returns value_diff over the common non-PK columns and does not raise.

Deferred to the follow-up discussion — leaving open as a tracked test-coverage TODO. The fix is mock-tested (asserts drifted columns never reach the generated SQL) plus the documented manual Snowflake A/B, and even-wei independently verified 156 tests pass. A real-DuckDB e2e with a PK + a column present only in current is the gap to close; not a blocker for the POC.

TODO marker added in d080819e (at the value_diff intersection) and tracked in the PR description's deferred section. Deferred; leaving open until the real-DuckDB e2e fixture (PK + drifted column) lands.

iamcxa · 2026-06-02T14:02:13Z

+        exclude=args.exclude,
+        packages=args.packages if args.packages is not None else None,
+    )
+    output = SchemaDiffOutput(


LOW — SchemaDiffOutput has no warning field and schema_diff never extracts _warning, unlike the other 13 diff tools (e.g. L202 warning = result.pop("_warning", None)). In single-env mode it renders "No models found" (schema_diff.html:264), which reads as "no schema changes" when the truth is "no base to compare against". (_tool_schema_diff also lacks the single-env warning on the main server — pre-existing — but the widget's empty-state makes it more misleading.) Suggest adding a warning field + the single-env notice.

Deferred to the follow-up discussion (POC scope) — leaving open as a tracked TODO. Likely fix: add a warning field to SchemaDiffOutput + extract _warning, plus the single-env notice, so the empty-state ("No models found") isn't read as a clean diff when there's simply no base to compare against.

TODO marker added in d080819e (at the SchemaDiffOutput build) and tracked in the PR description's deferred section. Deferred; leaving open until the warning field + single-env notice are added.

…stall - expanduser() on --project-dir and --config so ~/path resolves (Copilot) - atomic config write (temp + os.replace) guarded by except OSError, so a mid-write failure leaves the existing claude_desktop_config.json intact instead of a truncated file (self-review) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>

- _widgets_enabled() now case-insensitive (accepts TRUE/True/true/1) (Copilot) - run_widget_server cloud/session guard checks cloud_session (the actual --session dest) in addition to session, so the guard is not silently dead (self-review) - _read_widget_html broadens except + logs on failure so a packaging defect that drops recce/data/mcp/*.html is visible in server logs (self-review) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>

_show_error_message used Console() (stdout); the --debug path's print_exception corrupts the stdio JSON-RPC channel for MCP servers. Now Console(stderr=True), matching the already-fixed non-debug branch (Copilot, outdated thread). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>

- escapeHtml in all 15 widgets now also escapes single quotes (' -> '); defense-in-depth so a future single-quoted attribute stays safe (even-wei N2) - get_server_info PR link adds rel=noopener noreferrer to its target=_blank anchor (Copilot, even-wei) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>

- .gitignore ignores recce/data/** then re-includes recce/data/mcp/ via negation, so nested build output (e.g. recce/data/lineage/index.html) is ignored while widget source is tracked; replaces the flat per-extension allowlist that missed nested dirs (self-review) - docs/mcp-widgets.md: gitignore gotcha rewritten for the negation scheme; widget test count 28 -> 35 (Copilot) - test_mcp_config_install docstring: add backup-pristine-on-rerun + python-fallback coverage lines (Copilot) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>

iamcxa · 2026-06-02T14:30:12Z

@even-wei — addressing your NO-GO (reviewed at 87f22fd3; current head is c533be93, which has 3 commits after your review's SHA + this resolve pass):

Blocker (format gate) — already resolved before the current head: 230af0f0 reformatted tests/test_widget_server.py. Verified now on c533be93: black --check, isort --check-only, and flake8 all exit 0.

Note 1 / 4 (backup clobbered on re-run, undo lost) — fixed in eeb5002d: the backup is skip-if-exists, so .recce.bak keeps the pristine pre-recce original across re-runs. Atomic write also added (65ca0739): a mid-write failure now leaves the existing config intact.

Note 2 (escapeHtml doesn't escape ') — fixed in c6d3506e: all 15 widgets now escape ' → '.

Note 3 (double base-side introspection in impact_analysis) — acknowledged; this is the tradeoff for the binder-error correctness fix. Tracking a cache/batch of the base-side get_model as a follow-up.

Note 5 (15 copy-pasted escapeHtml/svgEl helpers) — acknowledged; the POC accepts per-file helpers. Shared-asset extraction tracked as a follow-up.

Also note your "Limits": the uv run pytest stale-env failures (older mcp lacking the resource(meta=...) kwarg) are an environment artifact — confirmed; under the pinned mcp>=1.23.0 all suites pass, which your forced-3.13 run verified (156 passed).

Re-requesting your review on c533be93.

_read_widget_html resolved assets via importlib.resources.files("recce.data.mcp"), which relies on recce.data.mcp resolving as a namespace package — fragile across install layouts (Copilot flagged a potential ModuleNotFoundError). Anchor on the recce regular package (always importable) and traverse to data/mcp/ instead, so a pip-installed wheel reliably serves the widgets. Goal: widgets must work on any install, not just an editable/source checkout. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>

Mark the deferred POC items (tracked in the PR description's follow-up section) with TODO(DRC-3526 follow-up) comments in code: - widget tools bypass central call_tool classifier/telemetry (DRC-2754) - schema_diff has no single-env warning - impact_analysis value_diff drift fix lacks a real-DuckDB e2e No behavior change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>

even-wei · 2026-06-12T03:47:58Z

Code Review: PR #1397

SHA 6fb3bff · Verdict NO-GO

Blockers

recce/widget_server.py:84 — PullRequestInfo.id: Optional[str] rejects the int id the state actually carries, so get_server_info crashes whenever PR metadata is present.
Evidence: recce/pull_request.py:14 is Optional[Union[int, str]] populated from pr.number (int); reproduced in this venv — ServerInfoOutput(..., pull_request={'id': 15, ...}) → ValidationError: pull_request.id Input should be a valid string. Tests only cover pull_request is None (tests/test_widget_server.py:427).
Pass C.
recce/cli.py:3132-3135 — --dry-run (and a declined confirm prompt) still creates claude_desktop_config.json + parent dirs when the file doesn't exist, violating the documented "without writing" contract.
Evidence: the skeleton write_text runs before the if dry_run: check at cli.py:3199; reproduced — mcp-config-install --dry-run with a nonexistent --config exits 0 and leaves the file on disk. The dry-run test pre-creates the file so the suite can't catch it (tests/test_mcp_config_install.py:166).
Pass A.

Issues

recce/widget_server.py (all 15 tools) — single args: <Model> parameter makes FastMCP emit a nested inputSchema (properties.args.$ref) while every docstring documents flat parameters and the same-named tools on recce mcp-server use flat schemas.
Evidence: nested {"properties": {"args": {...}}, "required": ["args"]} reproduced with mcp.server.fastmcp in this venv — description ≠ schema breaks the repo's "MCP tool description = LLM agent contract" rule, and flipping RECCE_MCP_WIDGETS silently changes the calling convention.
Pass C.
recce/widget_server.py:692-699 (also :1162-1170) — query/query_diff declare readOnlyHint: True, destructiveHint: False but execute the raw sql_template (a DELETE runs); hosts use these hints to skip confirmations.
Evidence: QueryTask → execute_sql_with_limit (recce/tasks/query.py:43) passes the statement straight to adapter.execute; relatedly tests/test_widget_server.py:336 pins openWorldHint: False for row_count_diff, which runs COUNT(*) against the warehouse.
Pass B.
recce/data/mcp/get_server_info.html:317 — PR link href has no scheme allowlist; a javascript: URL executes on click (escapeHtml doesn't neutralize schemes).
Evidence: `<a href="${escapeHtml(pr.url)}" target="_blank" ...>` — only href-from-data sink in the PR; gate on http(s):.
Pass B (MEDIUM — pr.url originates from state/PR metadata, not direct user input).
recce/data/mcp/impact_analysis.html:546-561 — the mini-graph fabricates edges: every modified model is connected to every downstream model (cartesian product), visually asserting lineage relationships that may not exist.
Evidence: for (const srcName of modifiedNames) { ... for (const dstName of downstreamNames) { ... appendChild(path) (comment admits "simplified").
Pass A.
recce/data/mcp/query_diff.html:424-436 / value_diff_detail.html:384-397 — in_a/in_b are located by key || name but filtered out by name only → header/data column misalignment when they diverge; and when the columns are absent (inAIdx === -1) every row is labeled "Removed".
Evidence: findIndex(c => c.key === "in_a" || c.name === "in_a") vs filter(c => c.name !== "in_a"); const status = (!inB) ? "removed" : "added" with inB=false fallback.
Pass A.
recce/data/mcp/value_diff.html:401,426 — "All values match" banner renders when every matched_p is null (false-quiet, against the repo's false-loud principle), and the derived mismatched-row count can render negative.
Evidence: null-excluded mismatchedCols + added === 0 && removed === 0 → "Every column is 100% identical"; Math.round(common - matchedCount) unclamped.
Pass A.
11 of 15 widgets call render() with no try/catch — any throw leaves the widget stuck on "Loading…" forever.
Evidence: bare app.ontoolresult = ({ structuredContent }) => { render(structuredContent ?? {}); } in query, query_diff, row_count_diff, value_diff_detail, schema_diff, list_checks, top_k_diff, histogram_diff, value_diff, get_model, get_server_info; only get_cll/impact_analysis/lineage_diff/profile_diff wrap it.
Pass D.
recce/widget_server.py:295-301 — schema_diff runs get_lineage_diff() + adapter.select_nodes synchronously inside the async handler (no executor), blocking the FastMCP event loop for all concurrent tool/resource calls; the get_model/get_cll/histogram_diff delegates share the pattern (inherited, but newly exposed to FastMCP concurrency).
Pass H.
recce/widget_server.py:531,617-628 — the documented not_found: bool miss-signal is dead code: _tool_get_model raises ValueError when the model is missing (mcp_server.py:2179-2183), so the branch never fires and the docstring contract is wrong.
Pass C.
recce/widget_server.py:360-370 — get_server_info docstring promises base_status "reveals stale or missing artifacts", but only run_mcp_server runs the freshness check; the widget server always reports "unknown"/"single_env".
Evidence: _base_status set only at mcp_server.py:2815; run_widget_server (widget_server.py:2528-2573) never sets it.
Pass C.
Tests — schema_diff is the only widget tool whose handler path is never executed (the very tool that had the serialization regression); zero error-path tests across the 15 delegates (handler raising, None raw results); the call_tool widget-rejection guard (mcp_server.py:1319-1321) and the run_widget_server/mcp_widget_server bootstrap are untested (every test injects ws._recce_server directly).
Evidence: no ws.schema_diff(...) call in tests/test_widget_server.py; only error tests are missing-HTML (line 143) and lineage over-cap (line 2290).
Pass E.
CLAUDE.md:101-137 — machine-local GBrain guidance ("GBrain is set up and synced on this machine", .gbrain-source, ~/.gstack/) committed into the shared repo's agent instructions; false/noise for every other contributor's environment.
Pass F.

Notes

recce/widget_server.py:60-64 — SchemaChange fields typed List[Dict[str, str]] but _compute_schema_changes emits .get("type") values that can be None (the documented present-but-None CLAUDE.md gotcha); use Optional[str]. Pass C.
recce/widget_server.py:71-77 — GitInfo.base_branch/base_sha/current_sha can never populate (state GitRepoInfo has only branch); the matching render block in get_server_info.html:296-309 is dead. Pass C.
recce/widget_server.py:1975 — get_cll pops _warning but _tool_get_cll never adds one, and lineage_diff has no warning field at all — in single-env mode it renders an empty diff that reads as clean (same false-quiet class as the acknowledged schema_diff deferral, which covers only schema_diff). Pass A.
recce/data/mcp/profile_diff.html:283,333 — string stats are escaped twice (R&D displays as R&D); strStatRow null-checks miss undefined. Pass A.
recce/data/mcp/lineage_diff.html:193,255,389 — layout keyed on n.idx with no validation; missing/duplicate idx stacks all nodes at one position. Pass A.
recce/data/mcp/row_count_diff.html:296-311 — === null guards miss undefined (safe under the current Pydantic dump contract, fragile to it), and the summary "Changed" counter counts null-vs-value rows the badge shows as "N/A". Pass A.
recce/data/mcp/query_diff.html:485 / value_diff_detail.html:444 — inline onclick handlers break silently under a strict host CSP; prefer addEventListener. Pass A.
recce/data/mcp/top_k_diff.html:359,410 — only base.values is iterated (current-only categories depend on an unasserted server union contract) and counts aren't coerced to Number (string counts concatenate in the "shown" stat). Pass A.
tests/test_check_base.py:77+ — seven dead patch("recce.git.current_commit_hash") contexts and the stale test_status_stale_sha name survive the opt-in change; expected_base_sha has no production caller (acknowledged deferral). Pass E/F.
PR description says "18-vs-20 tools"; the real split is 20 widgets-off / 5 widgets-on / 15 widget tools. Stale test name ..._six_tools_and_six_resources (tests/test_widget_server.py:89) asserts 15. Pass F.
Verification: black/isort/flake8 clean at 6fb3bff; all PR test files pass; the 19 failing tests in this env (test_cli_cache, test_server SPA routes, …) fail identically on origin/main — pre-existing/environmental, not introduced here.

Multi-pass adversarial review (recce-dev:claude-code-review) — passes A–I + security skill; widget_server.py, 15 HTML widgets, and test suite each independently reviewed and cross-refuted.

even-wei

Claude Code Review: Critical issues found (2 blockers, 12 issues). See review comment: #1397 (comment)

iamcxa and others added 13 commits May 22, 2026 10:06

iamcxa changed the title ~~feat(mcp): iter 1 widget POC — row_count_diff + schema_diff as MCP App widgets~~ feat(DRC-3526): iter 1 widget POC — row_count_diff + schema_diff as MCP App widgets May 25, 2026

iamcxa requested a review from Copilot May 25, 2026 09:17

Copilot started reviewing on behalf of iamcxa May 25, 2026 09:18 View session

Copilot AI reviewed May 25, 2026

View reviewed changes

iamcxa and others added 12 commits May 25, 2026 17:26

iamcxa changed the title ~~feat(DRC-3526): iter 1 widget POC — row_count_diff + schema_diff as MCP App widgets~~ feat(DRC-3526): MCP App widget POC — 15 widget tools + config-install helper May 29, 2026

iamcxa and others added 3 commits May 29, 2026 18:13

even-wei self-requested a review June 2, 2026 07:26

even-wei assigned iamcxa Jun 2, 2026

even-wei requested changes Jun 2, 2026

View reviewed changes

even-wei and others added 4 commits June 2, 2026 17:26

Merge branch 'main' into iamcxa/recce-mcp-ui-integration

b5e2a8c

iamcxa requested a review from Copilot June 2, 2026 10:15

Copilot started reviewing on behalf of iamcxa June 2, 2026 10:16 View session

Copilot AI reviewed Jun 2, 2026

View reviewed changes

Comment thread recce/cli.py Outdated

Comment thread recce/cli.py

Comment thread recce/mcp_server.py Outdated

Comment thread recce/data/mcp/get_server_info.html

Comment thread tests/test_mcp_config_install.py

iamcxa commented Jun 2, 2026

View reviewed changes

iamcxa and others added 5 commits June 2, 2026 22:24

iamcxa requested a review from even-wei June 2, 2026 15:17

iamcxa and others added 2 commits June 2, 2026 23:20

Merge branch 'main' into iamcxa/recce-mcp-ui-integration

6fb3bff

even-wei requested changes Jun 12, 2026

View reviewed changes

even-wei requested a review from wcchang1115 June 12, 2026 07:50

		base_type = base_columns[col].get("type")
		curr_type = current_columns[col].get("type")

Conversation

iamcxa commented May 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Demo journeys (recorded)

Demo 1 — Quick scan (Journey A) · Loom

Demo 2 — Investigate impact (Journey 2) · Loom

What's in this PR

Test plan

User-facing changes

What's NOT in this PR

Reviewer notes

Review follow-up — deferred items (⚠️ this PR is NOT rushing to merge)

Uh oh!

codecov Bot commented May 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

even-wei commented Jun 2, 2026

Code Review: PR #1397

Blockers

Issues

Notes

Limits

Uh oh!

even-wei left a comment

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

iamcxa left a comment

Choose a reason for hiding this comment

Goal-achievement verdict: ✅ Achieved (POC scope, with follow-up caveats)

Verification Matrix

Verification Summary (tests)

Advisory (not blocking)

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

iamcxa commented May 25, 2026 •

edited

Loading

codecov Bot commented May 25, 2026 •

edited

Loading