test(cli-e2e): capability probes + capability-based live gating #5749
avallete wants to merge 6 commits into
Adds a declarative way to mark live e2e tests by the runtime capabilities they need (docker / internet / external-tool) and skip them when the target env can't provide them, so one suite runs against staging (all capabilities, the oracle), supabox (whatever it currently supports), and Antithesis (offline subset), each skipping only what it genuinely can't do.
- env.ts: `Capability` type + `PROVIDED_CAPABILITIES`, from `CLI_E2E_CAPABILITIES` with per-target defaults (staging = all; supabox = none until opened up).
- live-context.ts: `testLiveRequires(caps)` → `testLive` or `testLive.skip`.
- capabilities.live.e2e.test.ts: one minimal probe per category, each forcing its capability so a failure/skip is a precise supabox-gap signal: C1 mgmt-api (projects list), C2 docker/offline (db push+pull shadow), C3 external-tool (db dump via native pg_dump), C4 internet (deploy jsr fn), C5 docker+internet (--use-docker deploy of a jsr fn). Reuses existing fixtures.
Staging is the oracle: probes green there prove soundness, so any supabox skip/red is a real gap for the CLI-in-supabox work (supabox#106) to close.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Supabase CLI preview
npx --yes https://pkg.pr.new/supabase/cli/supabase@26e70380d24ba3ac920b987842da439f25f25c6a
Preview package for commit
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 26e70380d2
    export function testLiveRequires(required: readonly Capability[]): typeof testLive {
      const missing = required.filter((capability) => !PROVIDED_CAPABILITIES.has(capability));
      return missing.length === 0 ? testLive : testLive.skip;
Gate the rest of the live suite by capabilities
In a supabox/Antithesis run with missing Docker or internet, this helper only skips tests that opt into testLiveRequires; vitest.live.config.ts:10 still includes every *.live.e2e.test.ts, and existing live files still call testLive directly (for example functions-deploy.live.e2e.test.ts:20 runs the matrix containing --use-docker, and db-sync.live.e2e.test.ts:13 runs db push/db pull). Those tests will execute and fail instead of being skipped, so CLI_E2E_CAPABILITIES does not actually make the live suite run only supported cases. Please either convert the existing live tests to capability requirements or limit the supabox probe run to the capability file.
Partially addressed (8bcc653) — tagged the clearly docker/internet existing tests: db-sync requires docker, functions deploy --use-docker requires docker, and deploy-all requires internet. Full suite-wide gating of the env-dependent db dump and the data-plane-provisioned tests (storage, pooler) is coupled to the global-setup change in the sibling comment and lands with the supabox-enablement follow-up; until then capability-limited runs target the probe file + the tagged subset. Leaving open to track that.
🤖 Addressed by Claude Code
- env.ts: reject unknown CLI_E2E_CAPABILITIES tokens (a typo like `external_tools` silently skipped tests and left the run green).
- C3 probe: pass SUPABASE_DB_USE_LOCAL_TOOLS=1 so `db dump` exercises the native pg_dump path (external-tool), not the default container path.
- C2 probe: unique migration timestamp so it can't collide with db-sync's on the shared per-run project.
- C5 probe: deploy a distinct slug (deploy-e2e-npm) so the invoke proves the --use-docker deploy produced the function, not C4's earlier server-bundled one.
- Tag the existing docker/internet live tests so CLI_E2E_CAPABILITIES gates them too: db-sync (docker), functions deploy --use-docker (docker), deploy-all (internet).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
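The token validation described in this commit could look roughly like the sketch below. It is an illustration under stated assumptions, not the code from `env.ts`: `ALL_CAPABILITIES`, `TARGET_DEFAULTS`, `isCapability`, and `parseCapabilities` are hypothetical names, and treating an unset variable as "use the per-target default" is an assumption drawn from the PR description.

```typescript
// Hypothetical sketch; the real env.ts may be structured differently.
const ALL_CAPABILITIES = ["docker", "internet", "external-tool"] as const;
type Capability = (typeof ALL_CAPABILITIES)[number];

// Per-target defaults as described in the PR: staging provides everything,
// supabox starts empty until it is opened up.
const TARGET_DEFAULTS: Record<string, readonly Capability[]> = {
  staging: ALL_CAPABILITIES,
  supabox: [],
};

// Type guard so unknown tokens can be separated from valid ones.
function isCapability(token: string): token is Capability {
  return (ALL_CAPABILITIES as readonly string[]).includes(token);
}

// Parse a comma-separated capability list. An unset value falls back to the
// target default (assumption); an unknown token throws instead of being
// silently dropped, so a typo cannot turn into a green run of skipped tests.
function parseCapabilities(raw: string | undefined, target: string): Set<Capability> {
  if (raw === undefined) {
    return new Set(TARGET_DEFAULTS[target] ?? []);
  }
  const tokens = raw
    .split(",")
    .map((token) => token.trim())
    .filter((token) => token.length > 0);
  const unknown = tokens.filter((token) => !isCapability(token));
  if (unknown.length > 0) {
    throw new Error(`Unknown CLI_E2E_CAPABILITIES token(s): ${unknown.join(", ")}`);
  }
  return new Set(tokens.filter(isCapability));
}
```

Failing fast here trades a little convenience for a louder signal: a misspelled capability aborts the run at startup rather than quietly shrinking the set of tests that execute.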
Extends the capability set with the data-plane axes and tags every live test with its full requirements, so CLI_E2E_CAPABILITIES gates the entire suite (not just the probes):
- Add `database` (project Postgres reachable via the pooler) and `storage` (the Storage API) to the capability set; staging provides all, supabox opts in.
- Tag data-plane tests: database db-stats/migration-list → [database], db dump → [database, docker]; db-sync → [database, docker]; gen-types → [database, docker]; storage → [database, storage]; capability probes C2/C3 gain [database].
- Management-API-only tests (projects, link, secrets, branches, functions-lifecycle) need no extra capability and stay on the `testLive` baseline.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…eline
The project data plane (Postgres via the pooler, the Storage API) is always present on a live target, so it isn't a capability a test opts into. Revert to the three runtime capabilities (docker / internet / external-tool) and re-tag:
- storage, database db-stats/migration-list → back to the bare `testLive` baseline (pooler connection needs no runtime capability).
- database db dump, gen-types, db-sync, probe C2 → `docker` only (db pull's shadow DB / gen-types' postgres-meta / db dump's pg_dump container); probe C3 → `external-tool` only.
db push+pull keeps `docker` because `db pull` starts a shadow postgres container (PrepareRawShadow → DockerStart), not because of the --db-url connection.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Move the C2 (docker) probe before C3 (external-tool) so the probes read in numeric order.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…nitize
Validated the capability probes against real staging (all 5 pass); two harness
flakes surfaced and are fixed:
- createStorageBucket retries on 400 TenantNotFound: the storage tenant is
registered asynchronously, after the project reports ACTIVE_HEALTHY, so the
first bucket call raced it and aborted global setup (upstreams the supabox
cli-storage-bucket-retry patch into staging-project.ts).
- Sanitize the workspace temp-dir name: a test title with ':' or spaces leaks
into the temp path, which the cli mounts as a Docker volume for docker-backed
commands (functions bundling, db diff shadow), breaking the src:dst:mode spec
("too many colons"). Collapse non-alphanumerics so any test name is volume-safe.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
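The temp-dir sanitization described above can be sketched as follows. This is a minimal illustration, not the harness code: `sanitizeForTempDir` and the `"test"` fallback are hypothetical, and the exact character class the real helper collapses is an assumption based on the phrase "collapse non-alphanumerics".

```typescript
// Hypothetical helper: turn an arbitrary test title into a path segment that
// is safe inside a Docker src:dst:mode volume spec (no ':' and no spaces).
function sanitizeForTempDir(testTitle: string): string {
  const collapsed = testTitle
    // Collapse every run of non-alphanumeric characters into a single '-'.
    .replace(/[^A-Za-z0-9]+/g, "-")
    // Trim separators left over at either end.
    .replace(/^-+|-+$/g, "");
  // Assumed fallback so the directory name is never empty.
  return collapsed.length > 0 ? collapsed : "test";
}

// A title containing ':' and spaces becomes volume-safe.
console.log(sanitizeForTempDir("C5: docker + internet")); // prints "C5-docker-internet"
```

Collapsing to a small allowed alphabet, rather than escaping individual characters, keeps the helper independent of which separators a given Docker volume syntax treats as special.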
Adds a declarative way to mark `apps/cli-e2e` live tests by the runtime capabilities they need, plus one minimal capability probe per category. Groundwork for running the live e2e suite against the CLI-in-supabox model (supabox#106): the probes systematically reveal which capabilities supabox can/can't provide.
What's here
- `src/tests/env.ts`: a `Capability` type (`docker` / `internet` / `external-tool`) and `PROVIDED_CAPABILITIES`, read from `CLI_E2E_CAPABILITIES` (comma list) with per-target defaults: `staging` provides everything (the oracle), `supabox` starts empty and is opened up as it gains support.
- `src/tests/live/live-context.ts`: `testLiveRequires(caps)` → `testLive` or `testLive.skip`, so a test skips (not fails) when the target can't provide a required capability.
- `src/tests/live/capabilities.live.e2e.test.ts`: one probe per category, each forcing its capability so a skip/red is a precise gap signal:
  - C1 (mgmt-api): `projects list` includes the project
  - C2 (docker/offline): `db push` + `db pull` (pull's shadow DB requires DockerStart; pre-built images)
  - C3 (external-tool): `db dump` remote schema via native `pg_dump` (`SUPABASE_DB_USE_LOCAL_TOOLS`)
  - C4 (internet): `jsr.io`-importing function (server-side bundler) + invoke
  - C5 (docker+internet): `--use-docker` deploy of the same jsr function + invoke
Reuses the existing live harness (`live-setup.ts` provisioning, `deploy-e2e-jsr` / `deploy-e2e-mode-docker` fixtures, `invoke.ts`); no new fixtures.
How it's used
- `CLI_E2E_MODE=live CLI_E2E_TARGET_ENV=staging` → all five run and must pass, proving the probes are sound.
- Against supabox, set `CLI_E2E_CAPABILITIES`; anything green on staging but skipped/red on supabox is a concrete gap for the supabox-side work to close (delegated).
Inert on normal PR/replay runs: `testLive*` skips unless `CLI_E2E_MODE=live`.
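To make the gating behaviour concrete, here is a self-contained sketch of how a test file might consume `testLiveRequires`. The `testLive` stub, the `log` array, and the `makeTestLiveRequires` factory are hypothetical stand-ins so the example runs without the real harness or a test runner; only the filter-then-choose logic mirrors the helper shown in the review thread.

```typescript
type Capability = "docker" | "internet" | "external-tool";
type TestFn = (name: string, body: () => void) => void;
interface LiveTest extends TestFn {
  skip: TestFn;
}

// Stub standing in for the real `testLive`; it records what would run or skip.
const log: string[] = [];
const testLive: LiveTest = Object.assign(
  (name: string, body: () => void) => {
    log.push(`run:${name}`);
    body();
  },
  {
    skip: (name: string, _body: () => void) => {
      log.push(`skip:${name}`);
    },
  },
);

// Hypothetical factory so the provided set can be injected for the example;
// the PR's helper reads PROVIDED_CAPABILITIES from env.ts instead.
function makeTestLiveRequires(provided: ReadonlySet<Capability>) {
  return (required: readonly Capability[]): TestFn => {
    const missing = required.filter((capability) => !provided.has(capability));
    return missing.length === 0 ? testLive : testLive.skip;
  };
}

// A target that only provides internet access.
const testLiveRequires = makeTestLiveRequires(new Set<Capability>(["internet"]));

testLiveRequires(["internet"])("deploys a jsr function", () => {});
testLiveRequires(["docker", "internet"])("deploys with --use-docker", () => {});

console.log(log); // [ 'run:deploys a jsr function', 'skip:deploys with --use-docker' ]
```

Because the choice between run and skip is made when the test is declared, a capability-limited run reports the gated tests as skipped rather than failed, which is what lets a skip be read as a gap signal.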