feat(mcp): surface run_id in run-backed tool results for citation (DRC-3532)#1418

Open
iamcxa wants to merge 1 commit into main from feature/drc-3532-mcp-expose-run-id

iamcxa (Contributor) commented Jun 3, 2026

Why (cross-repo, DRC-3532)

The recce-cloud summary agent wants to deep-link each factual statement in its PR
summary to the exact run it executed, via deterministic {{run:<run_id>}} inline
markers (server-side marker replacement already exists in recce-cloud-infra's
recce_task_func). But the agent never received the run_id: the cloud-backend
MCP tools that back the agent's analysis (row_count_diff, profile_diff,
value_diff, query, query_diff, top_k_diff, histogram_diff) route through
_tool_run_backed, which returned only run["result"] and dropped run_id.

Without run_id the agent cannot cite runs deterministically, forcing fragile
server-side fuzzy prose matching (low/unstable coverage, occasional wrong links).

What

_tool_run_backed now merges run_id into the result dict (additive; existing
result fields are preserved). run_id is added only when the response actually
carries one and the result is a dict, so run-less or non-dict responses are
untouched (the agent must never be handed, or synthesize, a run_id it was not given).
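
As a rough illustration of the merge rule described above, here is a minimal sketch. It is not the actual _tool_run_backed code: the helper name `surface_run_id` is hypothetical, and treating "carries a run_id" as "the `run_id` key is present and not None" is an assumption. How the real implementation handles a result that already contains its own `run_id` key is also not stated in this PR.

```python
from typing import Any


def surface_run_id(run: dict[str, Any]) -> Any:
    """Hypothetical helper: return run["result"], merging in run_id when safe."""
    result = run.get("result")
    run_id = run.get("run_id")
    # Merge only when a run_id was actually supplied and the result is a dict.
    # Every other shape is returned unchanged, so no run_id is ever invented.
    if run_id is not None and isinstance(result, dict):
        # Build a new dict so the caller's original result is not mutated.
        return {**result, "run_id": run_id}
    return result
```

Under these assumptions, a response such as `{"run_id": "r1", "result": {"a": 1}}` would yield a result carrying both `a` and `run_id`, while a response without `run_id`, or with a list result, would pass through as-is.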

Test

tests/test_mcp_cloud_backend.py:

  • new: run-backed result surfaces run_id (merged, fields preserved)
  • new: run-less response leaves the result untouched (no invented run_id)
  • updated: the existing run-tool routing test now expects the surfaced run_id

40 cloud-backend tests + 123 local mcp_server tests green.

Paired change

recce-cloud-infra: the summary agent prompt emits {{run:<run_id>}} markers
using this run_id; fuzzy linkify is demoted to a legacy fallback. Tracked under
DRC-3532. Durable structured-citation design is DRC-3634.
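
For readers unfamiliar with the marker scheme, the following sketch shows one way deterministic {{run:<run_id>}} replacement could work. It is not the recce_task_func implementation, which lives in recce-cloud-infra and is not shown in this PR. The function name `replace_run_markers`, the allowed run_id characters, the `[run](...)` link text, and the example URL are all assumptions made for illustration.

```python
import re

# Assumed marker grammar: {{run:<run_id>}}, where run_id is letters, digits, "_" or "-".
RUN_MARKER = re.compile(r"\{\{run:([A-Za-z0-9_-]+)\}\}")


def replace_run_markers(text: str, run_urls: dict[str, str]) -> str:
    """Hypothetical helper: swap each known run marker for a Markdown link."""

    def _sub(match: re.Match) -> str:
        url = run_urls.get(match.group(1))
        # Unknown run_ids are left as-is rather than linked to a guessed target.
        if url is None:
            return match.group(0)
        return f"[run]({url})"

    return RUN_MARKER.sub(_sub, text)
```

Because the lookup is keyed on the exact run_id, a marker either resolves to the run the agent actually executed or stays unlinked, which is the property that fuzzy prose matching cannot guarantee.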

🤖 Generated with Claude Code

…C-3532)

The cloud-backend MCP tools (row_count_diff, profile_diff, value_diff, query,
query_diff, top_k_diff, histogram_diff) routed through _tool_run_backed returned
only run["result"], dropping run_id. The recce-cloud summary agent therefore
never saw the run_id and could not emit deterministic {{run:<run_id>}} citation
markers, forcing fragile server-side fuzzy prose matching.

Merge run_id into the result dict (additive; existing result fields preserved).
Only added when the response carries a run_id and the result is a dict, so
run-less or non-dict responses are untouched (never synthesize a run_id).

Cross-repo (DRC-3532): the recce-cloud summary agent prompt is updated to emit
the markers; server-side marker replacement already exists in recce_task_func.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>

codecov Bot commented Jun 3, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

Files with missing lines          Coverage Δ
recce/mcp_server.py               91.30% <100.00%> (+0.03%) ⬆️
tests/test_mcp_cloud_backend.py   100.00% <100.00%> (ø)

... and 3 files with indirect coverage changes
