🤖 feat: support all-namespaces LIST across multiple CoderControlPlane instances#85
Merged
@codex review

Codex Review: Didn't find any major issues. Keep it up!
Summary
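Enable all-namespaces LIST to aggregate results across every eligible `CoderControlPlane` instance (one per namespace) for `CoderTemplate` and `CoderWorkspace` resources. Previously, `kubectl get codertemplates -A` failed with: `multiple eligible CoderControlPlane instances across namespaces; multi-instance support is planned`

Background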
The aggregated API server's storage assumed exactly one eligible `CoderControlPlane` when handling all-namespaces LIST (request namespace is empty). When multiple eligible control planes exist across namespaces, client list-watch loops fail, blocking `kubectl get codertemplates -A`, `kubectl get coderworkspaces -A`, and controllers/informers that use list+watch.

Implementation
New interface: `coder.NamespaceLister` with `EligibleNamespaces(ctx) ([]string, error)`.

Provider implementations:
- `ControlPlaneClientProvider.EligibleNamespaces`: discovers eligible CPs via `findEligibleControlPlanes`, groups them by namespace, rejects duplicates within a namespace, and returns a sorted namespace list.
- `StaticClientProvider.EligibleNamespaces`: returns the pinned namespace.

Storage fan-out in `TemplateStorage.List` and `WorkspaceStorage.List`:

- Providers that implement `NamespaceLister`: fan out across all eligible namespaces, query each, convert with correct per-namespace metadata, and return aggregated results sorted by `(namespace, name)`.
- Providers without `NamespaceLister` keep existing behavior unchanged.

Validation
- `make verify-vendor` ✅
- `make test` ✅ (8 new tests + no regressions)
- `make build` ✅
- `make lint` ✅

Risks
Low risk: the fan-out path only activates for all-namespaces LIST when the provider supports `NamespaceLister`. Single-namespace requests and the static provider path are unchanged. Multiple eligible CPs within the same namespace are still explicitly rejected.

📋 Implementation Plan
Plan: Support querying multiple CoderControlPlane instances (multi-namespace aggregation)
Context / Why
Today, the aggregated API server’s `CoderTemplate` / `CoderWorkspace` storage assumes exactly one eligible `CoderControlPlane` when handling all-namespaces LIST (i.e. request namespace is empty). When multiple eligible control planes exist across namespaces, client list-watch loops fail with:

`multiple eligible CoderControlPlane instances across namespaces; multi-instance support is planned`

This blocks workflows like:
- `kubectl get codertemplates -A`
- list+watch clients of `aggregation.coder.com/v1alpha1`

Goal
Enable multi-instance querying by making all-namespaces LIST aggregate results across every eligible `CoderControlPlane` (one per namespace) for:

- `aggregation.coder.com/v1alpha1/CoderTemplate`
- `aggregation.coder.com/v1alpha1/CoderWorkspace`

Namespaced requests (`-n <namespace>`) must keep current behavior.

Non-goals (v1)
- Supporting multiple eligible `CoderControlPlane` objects within the same Kubernetes namespace.

Acceptance Criteria
- `kubectl get codertemplates -A` returns templates from all eligible `CoderControlPlane` namespaces.
- `kubectl get coderworkspaces -A` returns workspaces from all eligible `CoderControlPlane` namespaces.
- `kubectl get codertemplates -A --watch` no longer fails due to the multi-instance discovery error (because the initial LIST succeeds).
- `aggregated-apiserver` mode (static provider pinned to `--coder-namespace`) continues working unchanged.

Evidence (code references)
- `internal/aggregated/coder/controlplane_provider.go`
  - `ClientForNamespace` errors on `len(eligible) > 1`
  - `DefaultNamespace` errors on multiple eligible across namespaces
  - `multipleEligibleControlPlaneMessage("")` returns the exact string shown in the screenshot
- `internal/aggregated/storage/template.go`: `(*TemplateStorage).List`
- `internal/aggregated/storage/workspace.go`: `(*WorkspaceStorage).List`
  - Both call `namespaceForListConversion(...)` and then `clientForNamespace(ctx, requestNamespace)`, where `requestNamespace == ""` triggers the provider’s “pick exactly one CP” logic
- `internal/aggregated/storage/watch.go`, `template.go`, `workspace.go`

Implementation Details
1) Add an optional provider capability: list eligible namespaces
File: `internal/aggregated/coder/provider.go`

Add a small, opt-in interface that lets storage enumerate the set of namespaces it can serve.
Rationale: keeps storage decoupled from concrete provider types (`StaticClientProvider` vs `ControlPlaneClientProvider`).

2) Implement `NamespaceLister`

2a) Dynamic control-plane provider
File: `internal/aggregated/coder/controlplane_provider.go`

Implement:

`func (p *ControlPlaneClientProvider) EligibleNamespaces(ctx context.Context) ([]string, error)`

Algorithm:

1. `eligibleCPs, err := p.findEligibleControlPlanes(ctx, "")`
2. If `len(eligibleCPs) == 0`: return `ServiceUnavailable(noEligibleControlPlaneMessage(""))`.
3. Group by `cp.Namespace`.
4. If any namespace has more than one eligible CP: return `BadRequest(multipleEligibleControlPlaneMessage(namespace))` (still not supported).
5. Return the sorted namespace list.

Defensive programming:
2b) Static provider
File: `internal/aggregated/coder/provider.go`

Implement:

`func (p *StaticClientProvider) EligibleNamespaces(ctx context.Context) ([]string, error)`

Behavior:

- If `p.Namespace == ""`: return `ServiceUnavailable("static provider has no default namespace")` (consistent with existing behavior).
- Otherwise: return `[]string{p.Namespace}`.

3) Update storage LIST to fan out when request namespace is empty
3a) Templates
File: `internal/aggregated/storage/template.go`

Update `func (s *TemplateStorage) List(ctx context.Context, _ *metainternalversion.ListOptions) ...`:

- If the request namespace is empty and `s.provider` implements `coder.NamespaceLister`:
  - `namespaces := lister.EligibleNamespaces(ctx)`
  - For each namespace:
    - `sdk := s.clientForNamespace(ctx, namespace)`
    - `templates := sdk.Templates(ctx, codersdk.TemplateFilter{})`
    - `convert.TemplateToK8s(namespace, template)` and append.
  - Sort `list.Items` by `(namespace, name)` for deterministic output.
- Otherwise: keep the existing `namespaceForListConversion` + `clientForNamespace(ctx, "")` behavior.

3b) Workspaces
File: `internal/aggregated/storage/workspace.go`

Update `func (s *WorkspaceStorage) List(ctx context.Context, _ *metainternalversion.ListOptions) ...` with the same fan-out strategy.

Concurrency (recommended, not required for correctness):
- Use `errgroup.Group` + a small semaphore/limit (e.g. 4) so large numbers of namespaces don’t produce N sequential slow requests.

Error semantics (v1):
4) Tests
4a) Provider unit tests
File: `internal/aggregated/coder/controlplane_provider_test.go`

Add tests for `EligibleNamespaces`:

- Returns `BadRequest` when a single namespace has multiple eligible CPs.
- Returns `ServiceUnavailable` when no eligible CPs exist.

4b) Storage aggregation tests
File: `internal/aggregated/storage/storage_test.go`

Add tests verifying all-namespaces aggregation:

- Use a test provider that implements:
  - `coder.ClientProvider` (map namespace → client)
  - `coder.NamespaceLister` (returns both namespaces)
- `TemplateStorage.List(context.Background(), nil)` returns items from both namespaces and each item has the correct `metadata.namespace`.
- Same for `WorkspaceStorage.List`.

5) Docs / behavior notes
File: `internal/aggregated/storage/doc.go`

Update the “v1 semantics” comment to note:

- All-namespaces LIST aggregates across eligible `CoderControlPlane` namespaces when the provider supports it.

6) Validation
Run locally after implementation:

- `make test`
- `make build`
- `make lint`
- `make verify-vendor` (expected no-op; only if deps are added)

Optional follow-ups (not required to fix the observed error)
Upstream-backed watch: implement per-control-plane polling/websocket to generate watch events even when changes happen directly in Coder.
Partial failure mode: consider returning partial results for all-namespaces LIST when one instance is down (would require a clear API/UX decision; Kubernetes APIs generally favor fail-fast).
Performance & caching: optionally cache per-namespace SDK clients/tokens (with invalidation on secret changes) to reduce per-request secret reads.
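One possible shape for the caching follow-up, as a sketch only. It assumes the caller can learn the credential secret's current version cheaply (for example from an informer cache), which this plan does not specify. The `client`, `clientCache`, and `get` names are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// client is a hypothetical stand-in for a per-namespace SDK client.
type client struct{ token string }

type cacheEntry struct {
	secretVersion string
	client        *client
}

// clientCache keeps one client per namespace and rebuilds it when the
// version of the backing secret changes.
type clientCache struct {
	mu      sync.Mutex
	entries map[string]cacheEntry
	builds  int // number of client constructions, kept for the demo
}

func newClientCache() *clientCache {
	return &clientCache{entries: make(map[string]cacheEntry)}
}

// get returns the cached client when secretVersion matches the cached entry;
// otherwise it calls build and replaces the entry. The lock is held during
// build, which serializes construction and keeps the sketch simple.
func (c *clientCache) get(namespace, secretVersion string, build func() (*client, error)) (*client, error) {
	c.mu.Lock()
	defer c.mu.Unlock()

	if entry, ok := c.entries[namespace]; ok && entry.secretVersion == secretVersion {
		return entry.client, nil
	}
	built, err := build()
	if err != nil {
		return nil, err
	}
	c.entries[namespace] = cacheEntry{secretVersion: secretVersion, client: built}
	c.builds++
	return built, nil
}

func main() {
	cache := newClientCache()
	build := func() (*client, error) { return &client{token: "example"}, nil }

	cache.get("team-a", "v1", build) // builds a client
	cache.get("team-a", "v1", build) // served from the cache
	cache.get("team-a", "v2", build) // secret changed, so the client is rebuilt
	fmt.Println("builds:", cache.builds) // prints: builds: 2
}
```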
Generated with `mux` • Model: `anthropic:claude-opus-4-6` • Thinking: `xhigh` • Cost: `$3.28`