perf(rest): cache hostname→env resolution (P1-4); document cluster pub/sub durability (P1-5)#1589

Merged
xuyushun441-sys merged 1 commit into main from fix/p1-control-plane on Jun 5, 2026
@xuyushun441-sys

Closes launch-readiness P1-4 (real fix) and P1-5 (verified by-design; documented & accepted). Both hand-verified against main first.

P1-4 — hostname→env resolution cache (real fix)

RestServer.resolveProtocol() called envRegistry.resolveByHostname() on every unscoped request — a control-plane lookup (typically a DB query) in the hot path, at 3 call sites, with no caching.

Fix: resolveHostnameCached() caches hostname → environmentId in-memory with a 30s TTL, used at all 3 sites. It caches negative results too (so unknown hosts don't hammer the registry) but not errors (a transient control-plane blip self-heals next request).
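
The caching behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `EnvRegistry`, `HostnameEnvCache`, and the injectable `now` clock are hypothetical names, and the registry is assumed to return a promise of `string | null`.

```typescript
// Hypothetical shape: the PR does not show the real registry interface.
interface EnvRegistry {
  resolveByHostname(hostname: string): Promise<string | null>;
}

interface CacheEntry {
  environmentId: string | null; // null means the hostname is known to be unmapped
  expiresAt: number; // epoch milliseconds
}

class HostnameEnvCache {
  private readonly entries = new Map<string, CacheEntry>();
  private readonly registry: EnvRegistry;
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(
    registry: EnvRegistry,
    ttlMs: number = 30 * 1000, // 30s TTL, matching the description above
    now: () => number = () => Date.now(), // injectable clock so tests can advance time
  ) {
    this.registry = registry;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  async resolve(hostname: string): Promise<string | null> {
    const hit = this.entries.get(hostname);
    if (hit !== undefined && hit.expiresAt > this.now()) {
      return hit.environmentId; // positive or negative result, still fresh
    }
    // If the lookup throws, nothing is stored, so the next request retries.
    const environmentId = await this.registry.resolveByHostname(hostname);
    this.entries.set(hostname, {
      environmentId,
      expiresAt: this.now() + this.ttlMs,
    });
    return environmentId;
  }
}
```

Note that this sketch does not deduplicate concurrent misses for the same hostname, and it never evicts expired entries for hosts that are not requested again; whether the real implementation does either is not stated here.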

P1-5 — cluster pub/sub durability (verified by-design, documented)

Redis pub/sub is at-most-once by design. publish() already awaits the PUBLISH command, but there's no subscriber-delivery guarantee. Verified that metadata.changed is a cache-invalidation hint only: the durable source of truth is the transactional sys_metadata (+ sys_metadata_history) write. A node that misses the event serves a stale cached schema until its next reload and loses no data (self-heals against the DB).

No delivery-semantics change. Recorded the durability contract in pubsub.ts publish() and the doc; risk accepted for v1 (no exactly-once state may flow through this channel — durable state uses an outbox).
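
To illustrate why a missed event is tolerable under this contract, here is a small in-memory simulation. Every name in it (`MetadataStore`, `LossyBus`, `NodeCache`) is a hypothetical stand-in: the store plays the role of the transactional sys_metadata write, and the bus plays the role of an at-most-once channel that may drop a message. It is a sketch of the idea, not the project's pubsub.ts.

```typescript
// All names here are hypothetical stand-ins, not the project's real types.
type Schema = { version: number; fields: string[] };

// Stands in for the transactional sys_metadata write: the durable source of truth.
class MetadataStore {
  private current: Schema = { version: 1, fields: ["id"] };

  read(): Schema {
    return { version: this.current.version, fields: [...this.current.fields] };
  }

  write(fields: string[]): void {
    this.current = { version: this.current.version + 1, fields: [...fields] };
  }
}

// Stands in for an at-most-once channel: a message may silently not arrive.
class LossyBus {
  private handlers: Array<() => void> = [];
  dropNext = false;

  subscribe(handler: () => void): void {
    this.handlers.push(handler);
  }

  publish(): void {
    if (this.dropNext) {
      this.dropNext = false; // simulate one missed event
      return;
    }
    for (const handler of this.handlers) handler();
  }
}

class NodeCache {
  private cached: Schema | null = null;
  private readonly store: MetadataStore;

  constructor(store: MetadataStore, bus: LossyBus) {
    this.store = store;
    // The event carries no state; it only says "your cache may be stale".
    bus.subscribe(() => {
      this.cached = null;
    });
  }

  schema(): Schema {
    if (this.cached === null) this.cached = this.store.read();
    return this.cached;
  }

  // Periodic or on-demand reload: always re-reads the durable store.
  reload(): void {
    this.cached = this.store.read();
  }
}

const store = new MetadataStore();
const bus = new LossyBus();
const node = new NodeCache(store, bus);

node.schema(); // warms the cache at version 1
bus.dropNext = true;
store.write(["id", "name"]); // durable write succeeds
bus.publish(); // the hint is lost
const staleVersion = node.schema().version; // still the old cached schema
node.reload(); // next reload re-reads the store
const healedVersion = node.schema().version;
```

Because the event carries no payload, losing it only delays the refresh; the written schema is never at risk, which is the property the accepted risk relies on.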

Tests

+3 (rest 80 green): caches within TTL (1 lookup for 3 calls), refreshes after TTL, caches negative results.

🤖 Generated with Claude Code

…b/sub durability (P1-5)

P1-4: resolveByHostname() ran on every unscoped request (a control-plane DB
lookup in the hot path). RestServer.resolveHostnameCached() now caches
hostname→environmentId for 30s across all 3 call sites, including negative
results so unknown hosts don't hammer the registry; registry errors aren't
cached so a transient blip self-heals. +3 tests.

P1-5: verified the Redis pub/sub fire-and-forget is by design (at-most-once) and
already implied in code. Recorded the durability contract in pubsub.ts:
metadata.changed is a cache-invalidation hint only — the durable source of truth
is the transactional sys_metadata (+ sys_metadata_history) write, so a missed
event self-heals on reload and never loses data. No delivery-semantics change;
risk accepted + documented.

docs/launch-readiness.md P1-4/P1-5 updated (Verify ✅, Sign-off left for the team).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@vercel

vercel Bot commented Jun 5, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project: spec | Deployment: Ready | Actions: Preview, Comment | Updated (UTC): Jun 5, 2026 1:44am

github-actions bot added the size/m, documentation, tests, and tooling labels on Jun 5, 2026
xuyushun441-sys merged commit 0a6438e into main on Jun 5, 2026
12 checks passed
xuyushun441-sys deleted the fix/p1-control-plane branch on June 5, 2026 01:46