feat(DRC-3526): MCP App widget POC — 15 widget tools + config-install helper#1397
feat(DRC-3526): MCP App widget POC — 15 widget tools + config-install helper#1397iamcxa wants to merge 45 commits into
Conversation
Added during the /sync-gbrain session that backfilled embeddings after Conductor's GSTACK_OPENAI_API_KEY shim was wired up. The .gbrain-source file (gitignored) pins this worktree to its scoped code source for kubectl-style routing. CLAUDE.md gets a new "GBrain Search Guidance" section so future agents in this worktree prefer gbrain over Grep for semantic queries. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>
…_TOOLS filter (iter 1 Day 1)
Per design-20260521-234647 amended — adds a parallel FastMCP-based widget server
that delegates to RecceMCPServer's existing _tool_row_count_diff and
_tool_schema_diff methods, gated by RECCE_MCP_WIDGETS=1 env var.
- recce/widget_server.py: NEW, FastMCP("recce-widgets") with @mcp.tool delegates
and @mcp.resource handlers (graceful when recce/data/mcp/*.html missing).
- recce/cli.py: NEW `recce mcp-widget-server` subcommand (local mode only).
- recce/mcp_server.py: WIDGET_TOOLS set, _widgets_enabled helper, list_tools
filter, call_tool defensive raise.
Day 1 scope only — widget HTML assets + tests deferred to Day 1.5 / Day 2.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
…eedback) Fix #1: run_widget_server() is synchronous — mcp.run(transport="stdio") manages its own asyncio event loop internally. asyncio.run(run_widget_server(...)) raised ValueError because the function returns None, not a coroutine. Drop asyncio.run() in cli.py and remove the unused asyncio import. Fix #2: replace **arguments with typed params so FastMCP infers a proper JSON inputSchema for tools/list. Without typed params, Claude Desktop sees the tool registered but cannot construct a tools/call (no schema to fill). Params translated from existing low-level Tool inputSchema definitions: - row_count_diff: node_names, node_ids, select, exclude (all Optional) - schema_diff: select, exclude, packages (all Optional) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>
- recce/data/mcp/row_count_diff.html: NEW, ~110 LOC tier 1 widget (status
pills + diff numbers; base/curr null guards using === null)
- recce/data/mcp/schema_diff.html: NEW, ~110 LOC tier 2 widget (added/
removed/type_changed sections; empty state shows unchanged_count)
- recce/mcp_server.py: extract _compute_schema_changes() helper that returns
rich per-model dict {added, removed, type_changed, unchanged_count};
existing _tool_schema_diff flattens output back to DataFrame for low-level
mcp-server consumers — zero regression on existing response shape
- recce/widget_server.py: row_count_diff delegate wraps result as
{"models": result}; schema_diff delegate calls _compute_schema_changes
directly and returns {"models": rich_result} for widget consumption
- .gitignore: replace broad recce/data ignore with per-extension rules so
recce/data/mcp/*.html (widget source, not build output) can be committed
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
- tests/test_widget_server.py: 5 tests covering WIDGET_TOOLS env-var coordination, widget server tool/resource registration, file-missing graceful degradation, and tool enumeration regression. - docs/mcp-widgets.md: template for Scott — file layout, Claude Desktop config example, add-widget walkthrough using row_count_diff as worked reference, structuredContent contract, gotchas (SDK version pin, typed-params requirement, dual-key meta, mcp.run sync nature, .gitignore allowlist, stdout discipline). Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>
…de Desktop cwd=/ workaround) Claude Desktop spawns MCP servers with cwd=/. Two issues surfaced when manually testing iter 1 widgets: 1. recce/track.py:61 used bare print() which goes to stdout. When RecceConfig couldn't find/create recce.yml at cwd=/, the resulting traceback corrupted the JSON-RPC stdio channel. Fixed to print to stderr. Pre-existing bug — affected mcp-server equally. 2. recce/cli.py mcp_server and mcp_widget_server now chdir to --project-dir before RecceConfig so the config file is searched relative to the user's dbt project instead of cwd. Eliminates need for a bash wrapper in Claude Desktop config. Verified by reproducing the cwd=/ failure before fix (32 lines of traceback to stdout) and confirming clean stdout (0 lines) after. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>
Claude Desktop manual test revealed widget rendered but showed empty
state ("No models found") while agent text summary correctly reported
5 models. Root cause: FastMCP with @mcp.tool returning Dict[str, Any]
may not populate the JSON-RPC structuredContent field — only content
text payload. Widget HTML only read structuredContent.
- Both widgets now try structuredContent first, fall back to parsing
content[0].text as JSON. Works regardless of FastMCP serialization
behavior.
- row_count_diff also skips keys starting with "_" when iterating
models (so _maybe_add_single_env_warning's _warning key doesn't
break the per-row render); _warning text is rendered as a notice
banner above the table when present.
- schema_diff gets the same underscore-key skip for defensive uniformity.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
…Content
Per MCP log inspection: FastMCP synthesizes an outputSchema like
\`{properties: {result: {...}}, required: ["result"]}\` when @mcp.tool
returns a Dict[str, Any] without a Pydantic return type. The actual
structuredContent sent to the widget has shape {result: {models: ...}}
not {models: ...}. Widget HTML now unwraps the result key when present,
so both flat returns and FastMCP-wrapped returns render correctly.
Defensive: only unwraps if data.result exists AND data.models does not,
so tools that legitimately return both keys aren't mishandled.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
…ntegration
Manual Claude Desktop test showed widget rendered real data correctly but
hand-rolled hardcoded colors looked washed-out in dark mode and out of place
next to Claude's native UI. Refactored both widgets to use Claude's CSS
custom properties (var(--color-text-primary), var(--color-background-*),
var(--font-sans), etc.) with light-mode hex fallbacks and a
prefers-color-scheme dark fallback for non-Claude hosts.
- row_count_diff.html: redesigned with header + 4-up summary cards + main
table, semantic status badges (warning/success/info/secondary), tabular
numeric alignment, monospace model names.
- schema_diff.html: same design-token migration, per-model sections with
semantic colors for added/removed/type_changed.
- No new external dependencies (kept ext-apps SDK as only unpkg load,
CSP resourceDomains unchanged). Used Unicode glyphs instead of icon
font to stay dependency-light.
Preserves all existing JS: unwrap fallback for FastMCP {result: ...}
wrapping, _warning notice rendering, === null guards, _-prefix key skip.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
…lt + annotations
Per anthropic/skills/mcp-builder Python idiom checklist (python_mcp_server.md:691).
Closes the gap between iter 1's "make it work" shortcuts and SDK best practices.
- Pydantic BaseModel for tool inputs (RowCountDiffInput, SchemaDiffInput) with
Field(description=...) on every param — replaces bare Optional[...] params,
gives the LLM richer inputSchema descriptions.
- Pydantic BaseModel for tool outputs (RowCountDiffOutput, SchemaDiffOutput,
nested per-model models) — replaces Dict[str, Any] return, generates proper
outputSchema, eliminates FastMCP's {result: ...} wrapping (widget JS unwrap
fallback removed in commit 2).
- Return CallToolResult explicitly with short content (one sentence) + data in
structuredContent only. Agent receives a summary; widget receives the dict.
- annotations dict on every @mcp.tool: title, readOnlyHint, destructiveHint,
idempotentHint, openWorldHint per checklist line 691.
- _warning moved from top-level result dict to RowCountDiffOutput.warning
named field — cleaner Pydantic shape, no more _-prefix key skip in widget JS.
- logging.basicConfig explicitly sets stream=sys.stderr per stdio discipline
(mcp_best_practices.md:139).
- prefersBorder: False added to @mcp.resource meta per spec UIResourceMeta.
- Tool docstrings expanded per python_mcp_server.md:278-328 pattern:
what / Use when / Don't use when / Returns / Error Handling.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
Per add-app-to-server/SKILL.md JS-side patterns. Replaces manual CSS var()
fallback with SDK helpers that actively apply host design tokens via
postMessage context.
- Import applyDocumentTheme, applyHostStyleVariables, applyHostFonts from
ext-apps SDK. Register app.onhostcontextchanged to invoke them on theme/
font change. Apply initial context after connect() if exposed.
- Remove dead defensive layers no longer needed after commit 1:
- structuredContent unwrap (FastMCP no longer wraps Pydantic returns)
- content[0].text JSON.parse fallback (content is one-sentence string)
- _-prefix key skip (warning is now a named field, models is clean dict)
- Add app.onteardown handler returning {} per SDK pattern.
- CSS fallback values: rename --color-text-error → --color-text-danger and
--color-background-error → --color-background-danger (and badge-error →
badge-danger) to match spec enum McpUiStyleVariableKey (no "error"
semantic; "danger" is canonical).
- CSS var() fallbacks retained as defensive layer in case SDK helper races.
- warning now read from data.warning (named field) not data.models._warning.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
…ests
Closes iter 1 idiomatic refactor.
- tests/test_widget_server.py: 3 new tests (5→8 total):
- test_row_count_diff_returns_calltoolresult_with_short_content: verifies
content is short one-sentence string and structuredContent has correct keys.
- test_structured_content_matches_pydantic_model: RowCountDiffOutput.model_validate
round-trip on structuredContent proves clean Pydantic shape.
- test_widget_tool_annotations_present: asserts readOnlyHint/destructiveHint/
idempotentHint/openWorldHint/title on both tools.
- docs/mcp-widgets.md:
- Step 2 (HTML template): updated to use applyDocumentTheme/applyHostStyleVariables/
applyHostFonts from ext-apps SDK + onteardown handler.
- Step 3 (@mcp.tool template): Pydantic BaseModel input/output + CallToolResult
replaces bare typed params + Dict[str, Any] return pattern.
- Step 4 (@mcp.resource): added prefersBorder: False to meta example.
- structuredContent contract: updated to explain CallToolResult + short content
discipline; added note on why bare Dict causes {result: ...} wrapping.
- Gotchas: replaced bare-Dict warning with Pydantic requirement; added CSS
danger vs error naming gotcha; removed {result: ...} unwrap entry (resolved).
- Added "Python vs TypeScript SDK Support" subsection explaining qr-server
pattern and why Recce stays Python.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
…classes Captain's manual test in Claude Desktop dark mode revealed summary card values invisible and model name cells with unexpected styling. Root cause: the @media (prefers-color-scheme: dark) blocks in both widget HTML files only covered .card, th, border, .na, .empty, .warning — leaving .card-value, .card-label, .header-*, .model-name, .diff-*, and .badge-* with light-mode fallback hex values that become invisible on Claude's dark card backgrounds. Both widgets now have exhaustive @media dark overrides so rendering is correct in all four (mode × SDK helpers fire/no-fire) combinations: - Dark + SDK fire: host tokens drive - Dark + no SDK: @media dark fallback hex drives (this commit's job) - Light + SDK fire: host tokens drive - Light + no SDK: default var() inline fallback drives Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>
Codecov Report❌ Patch coverage is
... and 4 files with indirect coverage changes 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Pull request overview
Adds an Iter 1 POC for MCP Apps widgets by introducing a parallel FastMCP-based “widget server” that serves widget-capable versions of row_count_diff and schema_diff, coordinated with the existing stdio MCP server via RECCE_MCP_WIDGETS=1.
Changes:
- Added
recce mcp-widget-server(FastMCP) with widget tool delegates + HTML widget resources forrow_count_diffandschema_diff. - Coordinated tool routing between servers using
WIDGET_TOOLS+_widgets_enabled()filtering inrecce/mcp_server.py. - Added tests and documentation for the widget workflow; fixed one stdout→stderr issue for stdio MCP safety.
Reviewed changes
Copilot reviewed 9 out of 10 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/test_widget_server.py | Adds test coverage for widget server registration, env-var coordination, and tool result shapes. |
| recce/widget_server.py | Implements FastMCP widget server with two widget tools and two HTML resources. |
| recce/mcp_server.py | Adds widget tool filtering/guardrails and extracts _compute_schema_changes() for shared schema-diff shape. |
| recce/cli.py | Adds mcp-widget-server CLI command and ensures --project-dir becomes cwd for config discovery. |
| recce/track.py | Sends traceback output to stderr to avoid corrupting stdio JSON-RPC. |
| recce/data/mcp/row_count_diff.html | Self-contained MCP Apps widget for row count diffs. |
| recce/data/mcp/schema_diff.html | Self-contained MCP Apps widget for schema diffs. |
| docs/mcp-widgets.md | Developer guide/template for adding future widgets. |
| .gitignore | Stops ignoring the entire recce/data tree; ignores only build output paths so recce/data/mcp/*.html can be committed. |
| CLAUDE.md | Adds gbrain search guidance block. |
| base_type = base_columns[col].get("type") | ||
| curr_type = current_columns[col].get("type") |
There was a problem hiding this comment.
Acknowledged — minor. _compute_schema_changes can emit base_type / curr_type as None (from .get("type")), and the widget already tolerates it (c.type ?? ""). Tightening the SchemaChange annotation to the real shape ({name,type} / {name,base_type,curr_type} submodels, or Optional) is batched with the other type-contract polish (the Literal/submodel pass) — deferred for the POC, tracked in the follow-up discussion.
…onfig POC users had to manually edit claude_desktop_config.json to register both recce + recce-widgets MCP servers and set RECCE_MCP_WIDGETS=1 on both. This helper does it idempotently: merges entries (preserves other servers), auto-sets the env var, backs up the old config, and prints next-steps. - Required: --project-dir <PATH> (validated for dbt_project.yml presence) - Optional: --config <PATH>, --yes, --dry-run - iter 1 scope: macOS only, local mode only (errors on cloud flags / other OS with explicit fix message) - 5 unit tests using CliRunner + tmp_path (write/preserve/validate/dry-run/backup) Closes the manual install friction surfaced in iter 1 manual testing. Cuts "5 minutes editing JSON + finding absolute recce path + restarting" to one command + Cmd+Q. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>
Adds the third widget in the iter 1 widget POC line — tier 1 (status pill + key/value grid) for server runtime state. First widget added post-iter-1 idiomatic refactor, so it's born following the canonical pattern: - Pydantic output model matching _tool_get_server_info return shape - @mcp.tool with annotations dict (readOnly/idempotent/title) - CallToolResult with short content + structuredContent (no agent dual-render) - @mcp.resource with mime text/html;profile=mcp-app + CSP + prefersBorder - Widget HTML: SDK helpers (applyDocumentTheme + applyHostStyleVariables + applyHostFonts), onhostcontextchanged + onteardown, exhaustive @media dark for every class with var(--color-*, fallback) — closes the dark-mode fallback gap that bit row_count_diff/schema_diff earlier (commit b882697) - WIDGET_TOOLS set updated to filter from main mcp-server when widgets enabled - 2 new tests + updated enumeration assertions (10 total, was 8) - docs/mcp-widgets.md "Add widget N+1" walkthrough now references this as the canonical worked example (born idiomatic, no later-cycle defensive layers) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>
Fourth widget. tier 2 (list / simple table) — saved Recce checks for a PR
session, rendered as a 3-up summary + status table.
- Pydantic CheckSummary + ListChecksOutput models matching actual
_tool_list_checks shape (minimal field set — name/type/status/description,
NOT internal params). approved count from raw return ("approved" key);
pending derived as total - approved in the delegate.
- @mcp.tool with annotations, CallToolResult short content + structuredContent
- @mcp.resource with mime/CSP/prefersBorder
- Widget HTML: SDK helpers + onhostcontextchanged + onteardown + exhaustive
@media dark overrides (354 lines, 13 KB)
- WIDGET_TOOLS filter updated to 4 widget tools
- 2 new tests; enumeration assertion bumped to 4; annotation loop includes
list_checks; filter test asserts list_checks absent when widgets enabled
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
Fifth widget, completing Phase A of the post-iter-1 widget expansion. Single-model detail view (base/current column tables, constraints, not-found state). - Pydantic ColumnInfo + ModelEnvironment + GetModelOutput + GetModelInput matching actual _tool_get_model return shape (columns dict → list normalised in _parse_model_env helper; primary_key preserved; raw_code omitted) - @mcp.tool with annotations, CallToolResult short content + structuredContent - @mcp.resource with mime/CSP/prefersBorder - Widget HTML: SDK helpers + onhostcontextchanged + onteardown + exhaustive @media dark; adaptive 2-col/3-col column table layout (constraints visible only when present); per-env base/current sections; not-found empty state - WIDGET_TOOLS filter at 5 tools - 2 new tests (test_get_model_widget_registered, test_get_model_returns_calltoolresult_with_pydantic_shape); enumeration assertions bumped to 5 tools/resources Phase A (tier 1-2 multipliers) complete: row_count_diff, schema_diff, get_server_info, list_checks, get_model. ~half the design's 12 widget candidates done — design Open Q #6 >=50% FastMCP migration trigger approaches. Iter 2 mini-doc to reevaluate. No Pydantic reserved-name conflicts in actual handler shape (columns, primary_key, raw_code are all safe); Gotcha documented in mcp-widgets.md for future implementors. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>
…st[str]
Captain hit Pydantic validation error in Claude Desktop: actual
_tool_get_server_info returns support_tasks as {"query": True, ...,
"change_analysis": True} (dict of task slug → enabled bool), not a list
of enabled slugs. Fix the Pydantic field type and widget HTML iteration.
- ServerInfoOutput.support_tasks: Dict[str, bool] (was List[str])
- Widget HTML iterates Object.entries(...).filter(([_,v]) => v) for enabled
tasks; renders as info badges (same visual as before)
- Test fixture updated to use dict shape
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
… invocation)
Captain hit Pydantic validation error in Claude Desktop:
Error executing tool list_checks: 1 validation error for list_checksArguments
args Field required [type=missing, input_value={}, input_type=dict]
Root cause: list_checks delegate signature was `async def list_checks(args:
ListChecksInput)` where ListChecksInput was an empty BaseModel (no real args
to pass). FastMCP exposed this as an inputSchema requiring `args` as a top-
level field — agent called with no params, MCP sent {}, validation failed.
Fix: drop the empty ListChecksInput class and signature param. Match the
no-args pattern of get_server_info. `_tool_list_checks` still receives {}
internally (the handler accepts but ignores arguments).
Confirmed only list_checks affected: row_count_diff / schema_diff have
Optional-fields-only inputs (OK), get_server_info has no-args (OK),
get_model has required model_id (OK).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
…dget) Sixth widget; first tier-3 widget. Establishes the data-table rendering pattern for the other Phase B widgets (query_diff, value_diff, value_diff_detail, top_k_diff). - QueryInput / QueryOutput Pydantic models matching actual DataFrame.model_dump shape (READ from source — columns/data/limit/more/total_row_count) - Sticky-header scrollable table inside ~400px container - Cell type rendering (null → "—" italic secondary, number → tabular-nums right, string → truncated with title attr, bool → check/dash, date → display as-is) - Truncation badge when result is capped (DataFrame.more) - Empty state with SQL echo for debug context - @mcp.tool with annotations (openWorldHint=True since queries hit the warehouse) - CallToolResult short content + structuredContent - WIDGET_TOOLS filter expanded to 6 tools - 2 new tests (tests 15-16); enumeration assertions bumped to 6 - docs/mcp-widgets.md gains "Adding a tier-3 (data table) widget" section as template for the remaining Phase B widgets Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>
…mparison) Seventh widget. Builds on query widget (46251d5)'s data-table pattern. Compares SQL execution against base + current environments with per-row status rendering. Handles both QueryDiffTask result shapes from source. - QueryDiffInput / QueryDiffOutput Pydantic models verified against actual QueryDiffTask.execute() return shape (QueryDiffResult) - Two render modes detected at runtime from structuredContent: * Shape A (no primary_keys): side-by-side base / current tables * Shape B (primary_keys): join-diff table with in_a/in_b stripped, status pills (Added/Removed), row tinting, filter buttons - _parse_dataframe() helper mirrors _parse_model_env pattern - _warning extracted to output.warning named field (consistent with row_count_diff Day 1.5 pattern) - Widget HTML: sticky-header scrollable tables + status pills + filter buttons (All / Added / Removed with counts) + dark-mode coverage - @mcp.tool with openWorldHint=True (queries hit warehouse) - WIDGET_TOOLS updated to 7 tools - 2 new tests (registration + both Shape A and B); enumeration bumped to 7 - test_widget_tool_annotations_present updated for query_diff open-world group - docs/mcp-widgets.md: widget count + reference table updated Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>
…comparison)
Eighth widget. Aggregate stats + per-column match breakdown for value
comparison across two environments (primary key matching).
- ValueDiffInput / ValueDiffColumnRow / ValueDiffSummary / ValueDiffOutput
Pydantic models matching actual ValueDiffTask return shape (verified from source):
summary={total, added, removed}, data.data=[[col, matched, matched_p], ...]
where matched_p is 0.0-1.0 fraction (not percentage)
- _warning extracted to output.warning named field
- 4-up summary cards (total/common/columns-affected/added+removed) + per-column
table with inline bar visualization (CSS-only, no chart lib)
- All-match empty state when every column is 100% identical
- @mcp.tool with annotations (openWorldHint=True, queries warehouse)
- WIDGET_TOOLS at 8 tools
- 2 new tests (registration + Pydantic shape); enumeration bumped to 8
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
…el inspection) Ninth widget. Row-level companion to value_diff. Shows the actual rows with mismatched values for investigation. - ValueDiffDetailInput / Output Pydantic models matching actual ValueDiffDetailTask return shape (plain DataFrame with in_a/in_b flags) - Scrollable table with sticky primary-key column (CSS sticky left), filter buttons (All/Removed/Added) with counts, status pill per row, row tinting (red=removed, green=added), cell type-aware rendering - _warning extracted to output.warning named field - @mcp.tool with annotations (openWorldHint=True) - WIDGET_TOOLS at 9 tools - 2 new tests (Tests 21-22); enumeration bumped to 9; 27 total passing Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>
Tenth widget; closes Phase B (5 tier-3 widgets). Side-by-side ranked
lists of top-K most frequent values across base + current envs.
- TopKDiffInput / TopKEnvStats / TopKDiffOutput Pydantic models matching
actual TopKDiffTask return shape (READ from source):
base/current: {values[], counts[], valids, total} parallel lists.
values[] is the SAME union list in both envs (SQL FULL OUTER JOIN order:
curr_count DESC, base_count DESC). count=0 means absent from that env.
Param is column_name (not column), default k=10 (not 50).
- 2-column grid with ranked lists per env, inline bar viz, rank-change
arrows (up/down), New/Gone badges for env-exclusive categories
- _warning extracted to output.warning named field
- @mcp.tool annotations (openWorldHint=True — executes warehouse SQL)
- WIDGET_TOOLS at 10 tools — 50% coverage (10/20), design Open Q #6
FastMCP migration trigger threshold REACHED. iter 2 to reevaluate.
- 2 new tests (23: registration + inputSchema, 24: Pydantic shape);
enumeration bumped to 10
Phase B (tier-3 data tables): query, query_diff, value_diff,
value_diff_detail, top_k_diff. Each hand-rolled its table/list layout
because patterns diverged enough that a shared <recce-table> would
need extreme flexibility. Iter 2 mini-doc: evaluate renderRankedList()
JS helper extraction; full component abstraction deferred until 3+
widgets converge on same layout.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
…t widget)
Eleventh widget; first tier-4 (statistics/chart) widget. Hand-rolled SVG
bars — no new chart library, no CSP/CDN changes. Establishes the
"tier-4 via hand-rolled SVG" pattern that profile_diff and any future
chart widgets will follow.
Actual HistogramDiffTask return shape (read from recce/tasks/histogram.py):
base/current: {counts: List[int], total: int} (or {} if env failed)
min/max: overall min/max across both envs (numeric or date object)
bin_edges: N+1 edge values (int/float for numeric, date for datetime)
labels: List[str] for numeric cols ("lo-hi" format); None for datetime
_tool_histogram_diff auto-detects column_type from catalog — tool only
needs model + column_name (+ optional num_bins).
- HistogramDiffInput / HistogramEnvStats / HistogramDiffOutput Pydantic
models matching actual HistogramDiffTask return shape
- Date bin_edges serialised to ISO strings in widget delegate
(date objects are not JSON-serialisable natively)
- Hand-rolled SVG: viewBox 600x180, base bars (blue 45% opacity) behind
current bars (green 70% opacity), x-axis label density auto-reduced
(every Nth label for bins > 10), hover tooltip via mousemove
- CSS tokens for bar colors with exhaustive @media dark fallback
- _warning extracted to output.warning named field
- @mcp.tool annotations (openWorldHint=True, hits the warehouse)
- WIDGET_TOOLS at 11 tools
- 2 new tests (25, 26); enumeration bumped to 11; annotations test
extended to cover histogram_diff openWorldHint=True
- docs/mcp-widgets.md: widget count 10→11, new "Tier-4 (Chart) Widget
Architecture" section explaining hand-roll SVG approach + iter-2
upgrade criteria for swapping to a real chart library
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
…e cap) 15th widget tool. Hand-roll SVG mini-DAG (no Mermaid/ReactFlow dep) copying the impact_analysis BFS layout pattern. Hard cap MAX_INLINE_NODES=10 — over cap shows graceful empty-state message pointing to Recce web UI, no truncated view. Full ReactFlow plan from design Phase E remains deferred. Motivation: without a lineage_diff widget, Claude Desktop's agent falls back to rendering lineage as Mermaid text after seeing impact_analysis output, bypassing the widget rendering pipeline. Adding this widget keeps lineage visualization inside the same theming + interaction surface as the other 14 widgets and brings widget coverage to 15/20 (75%). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>
These two Claude Code plugins back the MCP App widget work in this branch (ext-apps SDK reference, MCP server scaffolding). Enabling them in the repo settings keeps the dev environment consistent for anyone continuing the widget POC. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>
The impact_analysis value_diff builder derived its per-column diff list from model_info["columns"], which reflects only the CURRENT relation (get_model -> get_columns(base=False)). It then applied `b."col" IS DISTINCT FROM c."col"` to BOTH the base (b) and current (c) relations. When a column had drifted — present in the current physical table but not the base — the warehouse binder failed hard (Snowflake: `Table "b" does not have a column named "<col>"`). Restrict the diff to the intersection of base and current columns via get_model(node_id, base=True) (which manages its own warehouse connection), mirroring ValueDiffTask. Drifted columns are already reported via schema_changes. Also skip value_diff when the PK itself has drifted, since it is the FULL OUTER JOIN key. Adds a regression test asserting drifted columns never reach the generated SQL and that value_diff covers only common non-PK columns. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>
`logs/` (dbt run logs) and `.hypothesis/` (hypothesis testing example cache) are runtime/test artifacts regenerated on every test run, but the existing rules only covered the root `dbt.log` file — not the `logs/` directory dbt writes to, nor `integration_tests/dbt/logs/`. They showed as persistent untracked changes in the workspace UI. Ignore the dir forms (no leading slash → matches at any depth). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>
Code Review: PR #1397SHA Verified checks: BlockersNone. Issues
Notes
Limits
|
even-wei
left a comment
There was a problem hiding this comment.
Claude Code Review: NO-GO. 1 issue (format gate) + 5 notes. See review comment.
Batch of correctness/robustness fixes from a Codex review pass: - mcp_server: default 30s timeout on cloud proxy requests so a stalled endpoint cannot hang the MCP server indefinitely. - mcp_server: include the effective base selection (single-env vs dual-env) in the set_backend local cache key, so a context loaded in single-env mode is reloaded once target-base/ later appears instead of serving a stale single-env context. - cli: mcp_config_install invokes `python -m recce.cli` (the package has no recce.__main__, so `-m recce` is not runnable). - cli: make check_base_freshness SHA comparison opt-in via expected_base_sha. Base-branch artifacts legitimately carry a DBT_GIT_SHA that differs from the feature-branch HEAD, so the previous unconditional compare produced false "stale_sha" results. - tests: cover cloud request timeout, single-env -> dual-env reload, the config-install module path, and the opt-in base-SHA freshness check. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>
Addresses the Code Review format-gate issue (Issue 1): black wraps an over-length call and isort splits the multi-line import block in the lineage_diff widget test. Pure formatting, no behavior change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>
Addresses Code Review Note 1. The backup was rewritten unconditionally on every run, so a second `recce mcp-config-install` clobbered the `.recce.bak` with the already-modified config — destroying the pristine pre-recce original and making the documented "restore from .recce.bak" undo path useless. Skip the copy when a backup already exists, and report "Existing backup preserved" instead of "Backup saved" in that case. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>
iamcxa
left a comment
There was a problem hiding this comment.
@iamcxa — automated review (kc-pr-review, Full tier: code-reviewer + silent-failure-hunter + pr-test-analyzer + type-design-analyzer + HTML-XSS scan + ToB security). Event is COMMENT, not approve: (a) tests could not be re-run locally (env SIGKILL/OOM at pytest import — the first background run did exit 0, and the PR body documents 35+5 passing); (b) the check_base_freshness behavior change needs author intent; (c) the one production-risk path (Snowflake binder in the value_diff fix) is mock + manual-A/B only, no CI e2e. No CRITICAL/HIGH blocking bugs were found.
Goal-achievement verdict: ✅ Achieved (POC scope, with follow-up caveats)
Evidence
- All 15 tool/resource pairs register correctly; every
_meta.ui.resourceUrimaps to a realrecce/data/mcp/*.html(verified file-by-file). - All 15 HTML widgets escape warehouse data consistently (
escapeHtml/textContent/ numeric formatters) — no exploitable stored-XSS sink found. - The
impact_analysisvalue_diff drift fix is correct — it intersects base ∩ current columns and usesget_model(base=True)connection-managed introspection, matching the canonicalValueDiffTask. - Type design is solid for a POC (None-vs-zero distinctions in
RowCountModel,ValueDiffColumnRow.matched_p,HistogramEnvStats.totalare deliberate and correct).
Verification Matrix
| # | Concern (from PR body) | Verified? | Evidence |
|---|---|---|---|
| V1 | RECCE_MCP_WIDGETS=0/unset → all 20 tools text-only, backward compatible |
✅ | _widgets_enabled() reads env at call time; filter only removes WIDGET_TOOLS when enabled (mcp_server.py:1294) |
| V2 | RECCE_MCP_WIDGETS=1 → main 5 + widget 15, no collision |
✅ | WIDGET_TOOLS filter + widget server registers exactly the 15-name set; test pins it (test_widget_server.py:102/173) |
| V3 | impact_analysis drift fix (Snowflake binder) | Code correct, but e2e gap — see inline on mcp_server.py:1931 |
|
| V4 | schema_diff serialization (SchemaChange collision fix) | Models distinct + field-pinned; no behavioral output test — see Advisory F | |
| V5 | stdio stdout purity (JSON-RPC channel) | ✅ | No stdout writes in widget_server.py; logging → stderr (widget_server.py:2521) |
| V6 | chdir to --project-dir before RecceConfig |
✅ | Present in both subcommands; minor: no dir-existence guard (Advisory) |
Verification Summary (tests)
| Check | Result |
|---|---|
| Local re-execution | |
| First background run (widget + config_install + check_base + mcp_server + mcp_cloud_backend) | exit 0 |
| PR-body documented | test_widget_server.py 35 passed · test_mcp_config_install.py 5 (file actually has 7) · broad selection 363 passed |
Advisory (not blocking)
- A — Bundled tooling / scope creep.
CLAUDE.md(+37 gbrain boilerplate, includes a per-machine "on this machine" assertion in shared docs) and.claude/settings.json(enablesmcp-server-devfor all repo contributors) are unrelated to the widget feature. Suggest splitting into a separate commit/PR, marking POC-only, or moving dev-plugin prefs to gitignoredsettings.local.json. - B — External SDK dependency. All 15 widgets
importhttps://unpkg.com/@modelcontextprotocol/ext-apps@0.4.0(no SRI). Mitigated by each resource'scsp.resourceDomains: ["https://unpkg.com"](host-enforced origin restriction) and this is the standard MCP Apps SDK load pattern — acceptable for POC; consider local vendoring before productionizing. - C — Type contracts.
status/change_status/priorityetc. are freestrwith legal values only in comments. Converting the widget-branched ones toLiteral[...]is a zero-cost contract win. - D — HTML hardening. No
<meta>CSP in the HTML (partially mitigated byresourceDomains);get_server_info.html:317pr.urlhref has no scheme validation (javascript:possible, though source is GitHub metadata not warehouse) andtarget="_blank"lacksrel="noopener". - E — Mega-PR. Feature POC + 4 unrelated pre-existing bugfixes (track stderr / chdir / set_backend cache-key / freshness) + tooling, 37 commits / +14K. Hard to review and to revert. Future: land pre-existing bugfixes as their own PRs.
- F — Test gaps.
schema_diffis the only widget with no behavioralmodel_validateround-trip test (the collision broke exactly that serialization); no absolute tool-count assertion (assert len==20/5) so dropping a non-widget tool would pass silently. - G — impact_analysis per-step
except Exception(pre-existing,mcp_server.py~1866/1901/2042, not in this diff) appends toerrors[]without logging unexpected exceptions — a code bug is indistinguishable from a warehouse failure. Worth narrowing +exc_infologging when that area is next touched.
Nice work on the POC — the escaping discipline, type care, and the documented drift fix are genuinely solid. The items above are hardening + a couple of bundled changes worth confirming.
| current_sha = current_commit_hash() | ||
| result["current_sha"] = current_sha | ||
| if current_sha and base_sha != current_sha: | ||
| if base_sha is not None and expected_base_sha: |
There was a problem hiding this comment.
MEDIUM — SHA freshness is now opt-in (and expected_base_sha), but neither shipping caller passes it: check_base CLI (L3488) and the MCP startup check (mcp_server.py:2807) both omit expected_base_sha. So stale_sha is now unreachable in production, current_sha is always None, and the stale_sha color-map (L3505) + exit-code-2 branch (L3522) are dead.
The rationale (base artifacts come from the base branch → base SHA ≠ feature HEAD → the old auto-check was a false-positive generator) is sound. But this is a behavior change to the existing recce check-base command, not mentioned in the PR body and bundled into a widget POC. Please confirm it's intentional — if so, clean up the dead stale_sha/current_sha path, or wire a caller that actually passes expected_base_sha.
There was a problem hiding this comment.
Deferred to the follow-up discussion (POC scope) — leaving this thread open as a tracked TODO. The opt-in SHA change is intentional behavior-wise (base artifacts come from the base branch, so the base SHA always differs from feature HEAD → the old auto-check was a false-positive generator). But stale_sha / current_sha / the exit-code-2 branch are now dead for both callers. Decision pending: clean up the dead path vs. wire a caller that passes expected_base_sha.
There was a problem hiding this comment.
On reflection, holding this rather than cleaning it up inside this widget PoC. The stale_sha / expected_base_sha path is not pure dead code — it's a tested opt-in API: test_status_stale_sha_when_expected_base_sha_mismatch exercises the mismatch → stale_sha branch, and test_status_stale_sha pins the no-arg = fresh behavior. So it's "no production caller yet", not "unreachable".
Removing it would touch the recce check-base command (color-map, exit-code-2, docstring) plus those tests — a cross-feature refactor outside this widget PoC's scope, and one I can't validate locally right now (the full pytest suite OOMs in this env; only the lint tools run). Proposal: either leave the opt-in as-is (harmless + tested), or do a focused follow-up that wires a caller passing expected_base_sha (giving it a purpose) or removes SHA-freshness entirely. Leaving this thread open for that decision.
| - table_not_found / permission_denied surface in *_meta.status | ||
| - tool raises only on fundamental dbt/adapter failures | ||
| """ | ||
| result = await _recce_server._tool_row_count_diff(args.model_dump(exclude_none=True)) |
There was a problem hiding this comment.
MEDIUM — Widget tools call _recce_server._tool_*() directly, bypassing the central @server.call_tool handler in mcp_server.py:1426. That handler is where DRC-2754 error classification lives: _classify_db_error → downgrade table_not_found/permission_denied to warnings + sentry_metrics.count("mcp.expected_error", ...). For all 15 widget tools that machinery never runs — expected DB errors become hard errors and there's zero telemetry.
Not a silent failure (FastMCP still wraps the raised exception into an error result), so not blocking for the POC — but it quietly undoes documented investment. Suggest routing widget delegates through a shared classify/telemetry wrapper, or a tracking note that the widget path defers DRC-2754 parity.
There was a problem hiding this comment.
Deferred to the follow-up discussion (POC scope) — leaving open as a tracked TODO. The widget delegates bypass the central call_tool classifier/telemetry (DRC-2754). Errors still surface (FastMCP wraps them); what's lost is expected-error classification + Sentry metrics for the 15 widget tools. Tracking a shared classify/telemetry wrapper for when the two-server architecture is consolidated.
There was a problem hiding this comment.
TODO marker added in d080819e (above _recce_server) and tracked in the PR description's "Review follow-up — deferred items" section. Deferred per iter-1 POC scope; leaving open until the shared classify/telemetry wrapper lands (two-server consolidation).
| # Intersect both sides, mirroring ValueDiffTask; drifted columns are | ||
| # already reported via schema_changes (Step 2b). get_model(base=True) | ||
| # manages its own warehouse connection for the base-side introspection. | ||
| base_model_info = self.context.adapter.get_model(node_id, base=True) |
There was a problem hiding this comment.
MEDIUM (test gap) — This drift fix is correct (intersection + get_model(base=True) connection-managed introspection, matching ValueDiffTask). But it has no real-adapter e2e. test_mcp_e2e.py::test_schema_changes_detected builds the drift shape against real DuckDB, yet its users fixture declares no primary_key, so if not pk: continue (L1917) skips value_diff entirely and the IS DISTINCT FROM binder path is never exercised on a live warehouse. Coverage is mock-only (drifted column absent from generated SQL) + the documented manual Snowflake A/B.
Per project MEMORY, mocks can't catch the missing-warehouse-connection class of bug. Recommend an e2e fixture with a PK where current has a column absent in base; assert impact_analysis returns value_diff over the common non-PK columns and does not raise.
There was a problem hiding this comment.
Deferred to the follow-up discussion — leaving open as a tracked test-coverage TODO. The fix is mock-tested (asserts drifted columns never reach the generated SQL) plus the documented manual Snowflake A/B, and even-wei independently verified 156 tests pass. A real-DuckDB e2e with a PK + a column present only in current is the gap to close; not a blocker for the POC.
There was a problem hiding this comment.
TODO marker added in d080819e (at the value_diff intersection) and tracked in the PR description's deferred section. Deferred; leaving open until the real-DuckDB e2e fixture (PK + drifted column) lands.
| exclude=args.exclude, | ||
| packages=args.packages if args.packages is not None else None, | ||
| ) | ||
| output = SchemaDiffOutput( |
There was a problem hiding this comment.
LOW — SchemaDiffOutput has no warning field and schema_diff never extracts _warning, unlike the other 13 diff tools (e.g. L202 warning = result.pop("_warning", None)). In single-env mode it renders "No models found" (schema_diff.html:264), which reads as "no schema changes" when the truth is "no base to compare against". (_tool_schema_diff also lacks the single-env warning on the main server — pre-existing — but the widget's empty-state makes it more misleading.) Suggest adding a warning field + the single-env notice.
There was a problem hiding this comment.
Deferred to the follow-up discussion (POC scope) — leaving open as a tracked TODO. Likely fix: add a warning field to SchemaDiffOutput + extract _warning, plus the single-env notice, so the empty-state ("No models found") isn't read as a clean diff when there's simply no base to compare against.
There was a problem hiding this comment.
TODO marker added in d080819e (at the SchemaDiffOutput build) and tracked in the PR description's deferred section. Deferred; leaving open until the warning field + single-env notice are added.
…stall - expanduser() on --project-dir and --config so ~/path resolves (Copilot) - atomic config write (temp + os.replace) guarded by except OSError, so a mid-write failure leaves the existing claude_desktop_config.json intact instead of a truncated file (self-review) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>
- _widgets_enabled() now case-insensitive (accepts TRUE/True/true/1) (Copilot) - run_widget_server cloud/session guard checks cloud_session (the actual --session dest) in addition to session, so the guard is not silently dead (self-review) - _read_widget_html broadens except + logs on failure so a packaging defect that drops recce/data/mcp/*.html is visible in server logs (self-review) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>
_show_error_message used Console() (stdout); the --debug path's print_exception corrupts the stdio JSON-RPC channel for MCP servers. Now Console(stderr=True), matching the already-fixed non-debug branch (Copilot, outdated thread). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>
- escapeHtml in all 15 widgets now also escapes single quotes (' -> ');
defense-in-depth so a future single-quoted attribute stays safe (even-wei N2)
- get_server_info PR link adds rel=noopener noreferrer to its target=_blank
anchor (Copilot, even-wei)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
- .gitignore ignores recce/data/** then re-includes recce/data/mcp/ via negation, so nested build output (e.g. recce/data/lineage/index.html) is ignored while widget source is tracked; replaces the flat per-extension allowlist that missed nested dirs (self-review) - docs/mcp-widgets.md: gitignore gotcha rewritten for the negation scheme; widget test count 28 -> 35 (Copilot) - test_mcp_config_install docstring: add backup-pristine-on-rerun + python-fallback coverage lines (Copilot) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>
|
@even-wei — addressing your NO-GO (reviewed at Blocker (format gate) — already resolved before the current head: Note 1 / 4 (backup clobbered on re-run, undo lost) — fixed in Note 2 (escapeHtml doesn't escape Note 3 (double base-side introspection in impact_analysis) — acknowledged; this is the tradeoff for the binder-error correctness fix. Tracking a cache/batch of the base-side Note 5 (15 copy-pasted escapeHtml/svgEl helpers) — acknowledged; the POC accepts per-file helpers. Shared-asset extraction tracked as a follow-up. Also note your "Limits": the Re-requesting your review on |
_read_widget_html resolved assets via importlib.resources.files("recce.data.mcp"),
which relies on recce.data.mcp resolving as a namespace package — fragile across
install layouts (Copilot flagged a potential ModuleNotFoundError). Anchor on the
recce regular package (always importable) and traverse to data/mcp/ instead, so a
pip-installed wheel reliably serves the widgets. Goal: widgets must work on any
install, not just an editable/source checkout.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
Mark the deferred POC items (tracked in the PR description's follow-up section) with TODO(DRC-3526 follow-up) comments in code: - widget tools bypass central call_tool classifier/telemetry (DRC-2754) - schema_diff has no single-env warning - impact_analysis value_diff drift fix lacks a real-DuckDB e2e No behavior change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Kent <iamcxa@gmail.com>
Code Review: PR #1397SHA Blockers
Issues
Notes
Multi-pass adversarial review (recce-dev:claude-code-review) — passes A–I + security skill; widget_server.py, 15 HTML widgets, and test suite each independently reviewed and cross-refuted. |
even-wei
left a comment
There was a problem hiding this comment.
Claude Code Review: Critical issues found (2 blockers, 12 issues). See review comment: #1397 (comment)
Summary
Recce MCP App POC (DRC-3526). Ships 15 widget-enabled MCP tools rendered as interactive HTML widgets in Claude Desktop and other MCP App-capable hosts, plus a
recce mcp-config-installhelper that auto-writes the Claude Desktop config so POC users skip the hand-edited JSON.Architecture is a parallel FastMCP-based widget server (
recce mcp-widget-server) alongside the existing low-levelrecce mcp-server, coordinated byRECCE_MCP_WIDGETS=1. Day 0 spike found low-levelmcp.server.Serverdoesn't render widgets in Claude Desktop whilemcp.server.fastmcp.FastMCPdoes — hence two servers rather than upgrading the existing one. Full FastMCP migration is deferred (design Open Q #6; now at 15/20 = 75% widget coverage, past the ≥50% reconsideration threshold).Demo journeys (recorded)
Both journeys use plain reviewer-style prompts — no
mcp__recce-widgets__<tool>tool-naming — to validate that a non-technical reviewer's natural phrasing reliably triggers the right widget.Demo 1 — Quick scan (Journey A) · Loom
Replaces the AI-summary "block of text" with inline widgets so a reviewer grasps "what changed" in seconds. PR scenario: jaffle_shop_golden#15 (refactor
customers.sql→ two intermediate models).Prompt 1 →
row_count_diffwidget:Prompt 2 →
schema_diffwidget:Demo 2 — Investigate impact (Journey 2) · Loom
Blast-radius summary → evidence drilldown, with the agent following the
next_actionadvisory hint. PR scenario: jaffle_shop_golden#16 (amountDOUBLE → DECIMAL onstg_payments).Prompt 1 →
impact_analysiswidget:Prompt 2 →
profile_diffwidget:What's in this PR
New surface:
recce mcp-widget-serverCLI subcommand (local mode only — iter 1 scope)recce mcp-config-installCLI helper — writes both server entries +RECCE_MCP_WIDGETS=1intoclaude_desktop_config.json(macOS, iter 1 scope), with--dry-runand--yes, and a backup of the existing configrecce/widget_server.py— FastMCP server with 15@mcp.tooldelegates + matching@mcp.resourcehandlers, idiomatic shape (Pydantic input/output models,CallToolResultwith shortcontent+structuredContent, tool annotations, ext-apps SDK theme helpers)recce/data/mcp/*.html— 15 self-contained HTML widgets, each with exhaustive@media (prefers-color-scheme: dark)overrides and Claude design-token themingWIDGET_TOOLSfilter inrecce/mcp_server.py— whenRECCE_MCP_WIDGETS=1, removes the 15 widget-eligible tools from the main server'slist_tools(defers to widget server)15 widgets by tier:
row_count_diff,schema_diff,get_server_info,list_checks,get_modelquery,query_diff,value_diff,value_diff_detail,top_k_diffhistogram_diff,profile_diffget_cll,impact_analysislineage_diff— hand-roll SVG, hard 10-node cap, graceful empty-state over capBug fixes (pre-existing, surfaced by the POC):
recce/track.pyprint(traceback.format_exc())→file=sys.stderr— stdout traceback is fatal for any stdio MCP server when RecceConfig failsrecce mcp-server/mcp-widget-servernowos.chdir(--project-dir)beforeRecceConfig, so Claude Desktop'scwd=/spawn finds the dbt project + recce.yml (removes the bash-wrapper workaround)SchemaChangeclass-name collision — Phase D'simpact_analysisdefined a second module-levelSchemaChangethat shadowedschema_diff's model, breakingschema_diffserialization with a Pydantic "column / change_status Field required" error. Renamed toColumnSchemaChange; regression test pins both field surfaces.Quality / docs:
tests/test_widget_server.py— 35 tests (registration via public FastMCP API, env-var coordination 18-vs-20 tools, file-missing graceful degradation,CallToolResultshape, Pydantic structuredContent round-trip, annotations, collision guard,lineage_diff10-node-cap branch)tests/test_mcp_config_install.py— 5 tests (project-dir validation, dry-run no-write, backup creation)docs/mcp-widgets.md— template doc for Scott's hand-off: file layout, config example, "Add widget N+1" walkthrough, structuredContent contract, gotchas, Python-vs-TypeScript SDK noteTest plan
pytest tests/test_widget_server.py→ 35 passedpytest tests/test_mcp_config_install.py→ 5 passedpytest tests/ -k "mcp or widget or schema_diff or impact or lineage"→ 363 passed, no widget-related regressionsrecce mcp-widget-server --project-dir <dbt>) — exit 0, clean stderr, no stdout pollutionNote:
tests/test_server.py::test_spa_route_lineagefails locally because the frontend SPA build artifact (recce/data/lineage/index.html) isn't present in this worktree — pre-existing, unrelated to this PR (fails identically with the diff stashed; resolved bycd js && pnpm run build).User-facing changes
For Claude Desktop POC users:
recce mcp-config-install --project-dir /path/to/dbt(writes both entries + env var, backs up existing config). Or hand-edit perdocs/mcp-widgets.md.RECCE_MCP_WIDGETS=0/unset → zero change, all 20 tools text-only fromrecce mcp-server, backward compatible.RECCE_MCP_WIDGETS=1→recce mcp-serverserves 5 non-widget tools,recce mcp-widget-serverserves the 15 widget tools. No name collision._meta.ui.resourceUri, uses the tool's one-sentencecontenttext.What's NOT in this PR
recce mcp-widget-server(iter 1 = local mode only). This is the strategic next step — MCP ↔ Recce Cloud is what lets non-technical users skip git/dbt entirely and is where PR-selection + artifact-packaging are properly solved (not as extra local tools).lineage_diffbeyond v1 — progressive degradation for >10 nodes (shrink / zoom / aggregate) and the ReactFlow@datarecce/ui/lineageborrow contract_recce_serverstill module-global)Reviewer notes
docs/mcp-widgets.mdis the canonical Scott hand-off doc — if the "Add widget N+1" walkthrough leaves a cold reader unsure, that's PR-blocking feedback.rgb(), no@datarecce/uiinternal-path imports (widget HTML is plain JS).🤖 Generated with Claude Code
Review follow-up — deferred items (⚠️ this PR is NOT rushing to merge)
A full review pass (kc-pr-review + kc-pr-review-resolve) ran on
d089e737. Actionable fixes are committed (65ca0739→d089e737):~path expansion + atomic config write inmcp-config-install, case-insensitiveRECCE_MCP_WIDGETS, stdio-safe--debugtraceback, robustrecce/data/**gitignore (nested build artifacts now ignored;recce/data/mcp/source tracked), single-quote HTML escaping across all 15 widgets,rel="noopener noreferrer"on the PR link, widget HTML load anchored on thereccepackage (no longer depends on namespace-package resolution → reliable underpip install), and doc/test-count sync.even-wei's NO-GO blocker (the
test_widget_server.pyblack/isortformat gate) was already resolved in230af0f0and is re-verified clean on the current head (black/isort/flake8exit 0).The items below are intentionally deferred — this is an iter-1 POC whose goal is breadth (a widget for every tool), not hardened design. Each is marked with a
TODO(DRC-3526 follow-up)comment in code; a DRC follow-up ticket will track them:call_toolerror classifier/telemetry (DRC-2754). Errors still surface (FastMCP wraps them); what's deferred is expected-error classification (table_not_found/permission_denied→ warning) + Sentry metrics for the 15 widget tools. Resolve when the two-server architecture is consolidated. —recce/widget_server.pyimpact_analysisvalue_diff drift fix lacks a real-DuckDB e2e. Covered by a mock test (drifted column absent from generated SQL) + a manual Snowflake A/B; a real-adapter e2e with a PK + a column present only incurrentis the gap. —recce/mcp_server.pyschema_diffhas no single-env warning. The empty-state ("No models found") can read as a clean diff when there is simply no base to compare against. —recce/widget_server.pystrfor fixed-set fields (status/change_status/priority) — batch aLiteral/submodel tightening pass. —recce/widget_server.pyimpact_analysisdoubles base-side warehouse introspection (the cost of the binder-error correctness fix) — consider caching/batchingget_model(base=True). — even-wei N3escapeHtml/svgElhelpers — extract a shared asset to keep the escaping contract single-sourced. — even-wei N5check_base_freshnessSHA-freshness is opt-in with no production caller (thestale_sha/expected_base_shapath is a tested opt-in API, not pure dead code). Decide in a focused PR: leave as-is, wire a caller, or remove SHA-freshness — it touches the unrelatedrecce check-basecommand + tests, so it is out of scope here. —recce/cli.pyPre-merge check: run a packaged-wheel smoke test (
pip installthe built wheel, startrecce mcp-widget-server, load one widget) to confirmrecce/data/mcp/ships and resolves under the installed layout.