Cache vMCP advertised set to avoid audit re-aggregation#5512
Cache vMCP advertised set to avoid audit re-aggregation#5512tgrunnagle wants to merge 3 commits into
Conversation
On the Serve path, audit backend-enrichment resolved a request's backend by calling core.Lookup*, which — by the core's stateless design — re-aggregates every backend's capabilities on every audited request (and fans out to remote backends with discovery). That doubled the aggregation work per audited tools/call, resources/read, and prompts/get. Implement the "Serve caches, core is stateless" pattern for issue #5493: - Add a bounded, per-session advertised-set cache (name->backend display name for tools and resources), populated once at session registration from the same ListTools/ListResources aggregation already performed. - Read enrichment from the cache (keyed by Mcp-Session-Id) instead of core.Lookup*; a miss degrades the audit label gracefully and never re-aggregates. - Evict on the transport-driven session-end paths; the LRU bound backstops the DELETE/TTL paths the session manager evicts lazily without notifying the transport. The legacy server.New path is unchanged (cache is nil there). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Fixed issues from code review: - MEDIUM: Align advertised-set cache capacity validation with the session manager — newAdvertisedSetCache now rejects a negative capacity and treats 0 as the default, instead of silently clamping. Both caches read the same CacheCapacity field, so they must agree on what a value means. Updated the cache validation test and retargeted the Serve construction-failure test (which used a negative CacheCapacity) to a nil FactoryConfig.Base, since a negative capacity is now caught before sessionmanager.New. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #5512 +/- ##
==========================================
+ Coverage 69.66% 69.72% +0.06%
==========================================
Files 644 646 +2
Lines 65548 65640 +92
==========================================
+ Hits 45661 45765 +104
+ Misses 16539 16523 -16
- Partials 3348 3352 +4 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
tgrunnagle
left a comment
There was a problem hiding this comment.
Multi-agent review — Cache vMCP advertised set (#5512)
Reviewed by 6 specialist agents (concurrency/lifecycle, vMCP architecture & anti-patterns, Go style/comments, test coverage, security/audit, general quality). No HIGH-severity findings; no authorization, routing, or admission bypass. The design is sound and a near-textbook satisfaction of the vMCP anti-pattern rules it operates under: the optimization is evidence-backed and was explicitly deferred to this exact landing point (#9 satisfied), the abstraction is package-internal and touches no shared interface (#8), sessionID is threaded as an explicit parameter rather than via ctx.Value (#1 improved), and advertisedSet is immutable-after-construction with reconstruct-and-replace updates (#10).
The findings below are non-blocking for the current test-only Serve path, but several should be resolved before audit ships on a live Serve composition root.
Consensus findings (≥7/10)
| # | Severity | Finding |
|---|---|---|
| F1 | MEDIUM | lazyInjectSessionTools tools-only store can clobber a fuller cached entry on the owning replica for a zero-tool/has-resource session; the "never clobbers a fuller entry" comment is false in that case |
| F2 | LOW | Lifecycle comment claims DELETE/TTL paths "do not notify the transport" — the mcp-go SDK in fact fires OnUnregisterSession on both; vMCP simply doesn't register the hook |
| F3 | MEDIUM | Backend label is silently dropped on a cache miss (cross-pod / post-eviction); the path it replaces always resolved one — an audit-completeness regression gated by #5493's own "hard blocker before live audit" |
| F4 | LOW | core.LookupTool/LookupResource/LookupPrompt now have zero production callers — dead interface surface (acknowledged out-of-scope; track + annotate) |
| F5 | LOW | Test gaps: lazy-inject cache re-population and registration-failure eviction are both untested |
| F6 | LOW | Orphan-backend (advertised but absent from registry) records the raw BackendID on Serve but empty WorkloadName on legacy — the "audit parity" comment overstates equivalence (pre-existing behavior, carried over) |
| F7 | LOW | fakeCore.lookupResCalls/lookupPromptCalls counters are incremented but never asserted; the resources/read re-aggregation guard is missing |
What this PR gets right (verified)
- AC1 is genuinely guarded: the "does not re-aggregate" test asserts
lookupToolCalls==0/listToolsCalls==0against a fresh fakeCore whose counters are actually wired — not vacuous. - Cache is bounded (LRU, validated capacity; 0→default, negative rejected) and nil-receiver-safe (the legacy
server.Newpath leaves it nil and calls evict unconditionally). - Stores logical backend display names keyed by a server-generated UUID session ID — no internal addressing in shared state; a miss never affects enforcement (calls still route through
enforceSessionBinding+ core admission).
Dropped (< 7, for audit only)
const-coupling drift (F8), 22-field Server struct (F9, pre-existing), capacity divergence when audit is disabled (F10), forged-header audit-label confusion (F11, no auth impact), minor test nits (F12), lazy re-inject identity mismatch (F13, mitigated by binding check).
Overall: COMMENT — approve-with-suggestions. None of the findings block merge for a test-only path; F1/F3/F5 are worth resolving (or explicitly tracking against the live-audit blocker) before the production Serve composition root lands.
🤖 Generated with Claude Code
Addresses #5512 review comments: - MEDIUM pkg/vmcp/server/server.go (3405516256): F1 — gate lazy re-population on a cache miss so a tools-only store never clobbers a fuller {tools,resources} entry on the owning replica (a zero-tool/has-resource session reaches this path). - MEDIUM pkg/vmcp/server/backend_enrichment.go (3405516282): F3 — log a Serve-path enrichment cache miss at DEBUG so the audit-label gap is observable; deliberately no core.Lookup* fallback (that would reintroduce the re-aggregation the cache removes). - LOW pkg/vmcp/server/serve_advertised_set.go + serve.go (3405516280): F2 — correct the lifecycle comment (mcp-go v0.54.1 fires OnUnregisterSession on DELETE/TTL) and wire AddOnUnregisterSession for eager eviction; the LRU bound stays as a backstop. - LOW pkg/vmcp/server/serve_handlers.go (3405516287): F6 — fix backendDisplayName doc to stop claiming legacy parity in the orphan case (Serve records the ID, legacy records ""). - LOW pkg/vmcp/server/serve_session_test.go (3405516296): F7 — assert resources/read and prompts/get do not re-aggregate (lookupResCalls/lookupPromptCalls/listResourcesCalls == 0). - LOW pkg/vmcp/server/serve_session_test.go (3405516300): F5 — test lazy-inject cache re-population and registration-failure eviction. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
jerm-dro
left a comment
There was a problem hiding this comment.
blocker: This is the overuse of middleware coming home to roost, and the cache is complexity added to compensate for it rather than removing it. The backend-enrichment middleware checks the request method and no-ops for most of them, re-reads and re-Unmarshals the request body, and exists solely to stuff one value into BackendInfo context state — and now it needs a per-session cache to do that without re-aggregating on the Serve path. We're spending a cache, four eviction sites, an LRU backstop, and a capacity knob to keep a misplaced layer performant.
The backend name is produced at the call path, and the audit middleware emits after the handler returns (next.ServeHTTP → logAuditEvent, reading BackendName at emit time). So the handler can write the label directly into the BackendInfo the audit middleware already created — and the enrichment middleware comes off the Serve chain entirely.
// coreSessionTools — pass the name (already computed here) into the closure:
Handler: s.coreToolHandler(sessionID, domainTool.Name,
s.backendDisplayName(ctx, domainTool.BackendID)),// coreToolHandler — label where the call is handled:
func (s *Server) coreToolHandler(sessionID, toolName, backendName string) server.ToolHandlerFunc {
return func(ctx context.Context, req mcp.CallToolRequest) (*mcp.CallToolResult, error) {
if bi, ok := audit.BackendInfoFromContext(ctx); ok && bi != nil {
bi.BackendName = backendName
}
// ... existing binding check + core.CallTool, unchanged ...
}
}Same one-liner in coreResourceHandler. Perf should be unchanged (the name is computed once at registration; per-request is an O(1) write, no core.Lookup*) and the log identical (same field, value, emitter).
Net effect: removes serve_advertised_set.go + tests, the advertisedSets field, all four evict calls + the LRU backstop, the resolveBackendName/cachedBackendName split, and drops the enrichment middleware from the Serve path. A layer comes out and no cache goes in. The middleware stays wired only for the legacy path (s.core == nil) until #5445 retires it.
This rests on two things I sketched but didn't implement: that the audit event is emitted after the handler runs, and that the SDK propagates the request context into the handler so BackendInfo is reachable there (coreToolHandler already reads identity from ctx, which suggests it does). Is this approach feasible? WDYT?
Summary
On the vMCP
Servepath, audit backend-enrichment resolved a request's backend by callingcore.Lookup*, which — by the core's deliberate stateless design — re-aggregates every backend's capabilities on every audited request (O(backends·capabilities), plus a remote discovery fan-out) purely to label the audit event. This was a documented deferral from PR #5491 (P2.4) and is a hard blocker before audit can ship on a liveServepath.This PR implements the "Serve caches, core is stateless" pattern:
Servepath. Backend-enrichment middleware runs for every MCP request whenAuditConfig != nil, so each auditedtools/call/resources/readwas paying for a full backend aggregation (and remote fan-out) on top of the aggregation the call itself performs — just to attach a backend label.core.ListTools/core.ListResourcesaggregation the session already performs (no extra aggregation), read by audit enrichment keyed by theMcp-Session-Idheader, and evicted on transport-driven session-end paths. A cache miss (e.g. a cross-pod request without session affinity) degrades the audit label gracefully and never falls back to re-aggregation. The legacyserver.Newpath is unchanged (the cache is nil there).Closes #5493
Type of change
Test plan
task test)task lint-fix)Verification run on the touched package:
go test -race ./pkg/vmcp/server/...passes.golangci-lint run ./pkg/vmcp/server/...reports 0 issues.go build ./...is clean.task lint-fixsurfaces only a pre-existing, unrelated G115 incmd/thv/app/upgrade.go(not touched by this PR).Acceptance criteria verified by tests:
Serve-path requests no longer trigger a second backend aggregation —TestBackendEnrichmentMiddleware_ServePath/"does not re-aggregate via the core"assertscore.LookupToolandcore.ListToolscall counts are 0.Serveaudit is resolved.Changes
pkg/vmcp/server/serve_advertised_set.goadvertisedSet(immutable name→backend maps) andadvertisedSetCache(bounded LRU, nil-safe, concurrent-safe).pkg/vmcp/server/serve_advertised_set_test.gopkg/vmcp/server/backend_enrichment.goServereads the per-session cache (nocore.Lookup*); legacy reads the context routing table.pkg/vmcp/server/backend_enrichment_test.goServe-path coverage, including the no-re-aggregation assertion.pkg/vmcp/server/serve_handlers.gopkg/vmcp/server/serve.goCacheCapacity) and wires it onto theServer.pkg/vmcp/server/server.goadvertisedSetsfield; re-populates the cache on lazy tool injection (cross-replica) and evicts on registration-failure cleanup.pkg/vmcp/server/serve_session_test.goDoes this introduce a user-facing change?
No. This is a performance/internal change with no user-facing behavior change — the audit event content is identical to the legacy path (both record the backend's human-readable name).
Special notes for reviewers
DELETEand TTL/capacity eviction). The transport-driven session-end paths evict eagerly; the LRU guarantees the cache can never grow without limit even when one of the lazy paths leaves an entry behind for a now-dead session ID. Stale entries are harmless (session IDs are UUIDs, never reused) and are evicted under capacity pressure. The default capacity mirrors the session manager's default so the two node-local caches hold roughly the same number of live sessions.MultiSessionruntime layer. A request landing on a replica that did not register the session misses and records the event without a backend label rather than re-aggregating, soServe-path enrichment is aggregation-free regardless of cache state. On lazy tool re-injection, only the tools mapping is re-populated, so aresources/readaudit on a non-owning replica misses gracefully.core.Lookup*interface surface is a candidate for retirement.resources/readaudit labels (tools-only on non-owning replicas) can be revisited when the liveServecomposition root lands.