Skip to content

feat: per-tenant Valkey ACL provisioning#3

Open
nicacioliveira wants to merge 1 commit intomainfrom
feat/valkey-tenant-acl
Open

feat: per-tenant Valkey ACL provisioning#3
nicacioliveira wants to merge 1 commit intomainfrom
feat/valkey-tenant-acl

Conversation

@nicacioliveira
Copy link
Copy Markdown

@nicacioliveira nicacioliveira commented Apr 9, 2026

Summary

  • Adds NamespaceReconciler that watches namespaces annotated with deco.sites/valkey-acl: "true" and provisions a Valkey ACL user + K8s Secret (valkey-acl) per site
  • Each ACL user is restricted to ~{site}:* and ~lock:{site}:* key patterns — enforces tenant isolation at the Valkey protocol level
  • Mutating webhook now injects envFrom: secretRef: valkey-acl optional: true into every site pod — pods degrade gracefully to FILE_SYSTEM cache if the secret isn't provisioned yet
  • Adds cmd/bootstrap CLI to annotate existing sites-* namespaces for migration (--dry-run supported)
  • Uses NoopClient when VALKEY_SENTINEL_URLS is unset — safe to deploy before Valkey auth is enabled

New env vars (operator)

Var Description
VALKEY_SENTINEL_URLS Comma-separated Sentinel addresses. Empty = ACL provisioning disabled
VALKEY_SENTINEL_MASTER_NAME Default: mymaster
VALKEY_ADMIN_PASSWORD Admin password for issuing ACL commands

Validation required before enabling

  • Deploy operator with VALKEY_SENTINEL_URLS configured
  • Annotate a test namespace: kubectl annotate ns sites-test deco.sites/valkey-acl=true
  • Verify Secret created: kubectl get secret valkey-acl -n sites-test
  • Verify ACL user in Valkey: ACL LIST shows user test ...
  • Verify key restriction: attempt to write other-site:key as test user → NOPERM
  • Run bootstrap dry-run: go run ./cmd/bootstrap --dry-run

Dependencies

  • Depends on Valkey auth being enabled in infra_applications (separate step)
  • deco-cx/deco#feat/valkey-tenant-acl must be merged and deployed after operator validation

🤖 Generated with Claude Code


Summary by cubic

Adds per-tenant Valkey ACLs for site namespaces to enforce protocol-level key isolation. A new controller provisions an ACL user and a valkey-acl Secret per site, and the webhook injects it into pods.

  • New Features

    • Namespace controller watches deco.sites/valkey-acl: "true" and creates a Valkey ACL user restricted to ~{site}:* and ~lock:{site}:*, plus a valkey-acl Secret with LOADER_CACHE_REDIS_USERNAME/LOADER_CACHE_REDIS_PASSWORD.
    • Mutating webhook injects envFrom for valkey-acl (optional=true) so pods start and fall back to FILE_SYSTEM cache if the Secret isn’t ready.
    • cmd/bootstrap CLI to annotate existing sites-* namespaces (--dry-run supported).
    • Safe rollout: uses NoopClient when VALKEY_SENTINEL_URLS is unset; periodic reconcile self-heals missing ACLs after Valkey restarts.
  • Migration

    • Enable Valkey auth in infra, then set VALKEY_SENTINEL_URLS, VALKEY_ADMIN_PASSWORD, and optionally VALKEY_SENTINEL_MASTER_NAME (default mymaster) for the operator.
    • Annotate namespaces with deco.sites/valkey-acl=true or run go run ./cmd/bootstrap --namespace-pattern sites- [--dry-run].
    • Verify: Secret valkey-acl exists in the namespace and ACL GETUSER {site} returns the user; cross-site writes should fail with NOPERM.

Written for commit e0581ef. Summary will update on new commits.

Adds a Namespace controller that provisions isolated Valkey ACL users
for each site namespace annotated with deco.sites/valkey-acl: "true".
Each site gets a K8s Secret (valkey-acl) with its own credentials,
restricted to ~{site}:* key patterns. The mutating webhook injects
the secret into site pods via envFrom (optional=true, so pods degrade
gracefully to FILE_SYSTEM cache if not yet provisioned). A bootstrap
CLI annotates existing namespaces for migration.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 issues found across 8 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="internal/valkey/client.go">

<violation number="1" location="internal/valkey/client.go:68">
P1: Missing `resetpass` before setting the password causes old passwords to accumulate. Each `UpsertUser` call *adds* the new password but never removes previous ones, so all historical passwords remain valid. This breaks password rotation and undermines tenant isolation.

Add `"resetpass"` before `">" + password` to match the reset-then-set pattern already used for keys, channels, and commands.</violation>
</file>

<file name="internal/controller/namespace_controller.go">

<violation number="1" location="internal/controller/namespace_controller.go:204">
P1: Annotation is set on the Service's top-level metadata (`svc.Annotations`) instead of the template metadata (`svc.Spec.Template.Annotations`). Knative only creates a new Revision when `spec.template` changes, so this patch will never trigger the rollout needed for pods to pick up the new `valkey-acl` Secret.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

func (c *sentinelClient) UpsertUser(ctx context.Context, username, password string) error {
args := []interface{}{
"ACL", "SETUSER", username,
"on",
Copy link
Copy Markdown

@cubic-dev-ai cubic-dev-ai bot Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: Missing resetpass before setting the password causes old passwords to accumulate. Each UpsertUser call adds the new password but never removes previous ones, so all historical passwords remain valid. This breaks password rotation and undermines tenant isolation.

Add "resetpass" before ">" + password to match the reset-then-set pattern already used for keys, channels, and commands.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At internal/valkey/client.go, line 68:

<comment>Missing `resetpass` before setting the password causes old passwords to accumulate. Each `UpsertUser` call *adds* the new password but never removes previous ones, so all historical passwords remain valid. This breaks password rotation and undermines tenant isolation.

Add `"resetpass"` before `">" + password` to match the reset-then-set pattern already used for keys, channels, and commands.</comment>

<file context>
@@ -0,0 +1,121 @@
+func (c *sentinelClient) UpsertUser(ctx context.Context, username, password string) error {
+	args := []interface{}{
+		"ACL", "SETUSER", username,
+		"on",
+		">" + password,
+		"resetkeys",
</file context>
Fix with Cubic

if svc.Annotations == nil {
svc.Annotations = make(map[string]string)
}
svc.Annotations[valkeyProvisionedAnnot] = now
Copy link
Copy Markdown

@cubic-dev-ai cubic-dev-ai bot Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: Annotation is set on the Service's top-level metadata (svc.Annotations) instead of the template metadata (svc.Spec.Template.Annotations). Knative only creates a new Revision when spec.template changes, so this patch will never trigger the rollout needed for pods to pick up the new valkey-acl Secret.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At internal/controller/namespace_controller.go, line 204:

<comment>Annotation is set on the Service's top-level metadata (`svc.Annotations`) instead of the template metadata (`svc.Spec.Template.Annotations`). Knative only creates a new Revision when `spec.template` changes, so this patch will never trigger the rollout needed for pods to pick up the new `valkey-acl` Secret.</comment>

<file context>
@@ -0,0 +1,225 @@
+		if svc.Annotations == nil {
+			svc.Annotations = make(map[string]string)
+		}
+		svc.Annotations[valkeyProvisionedAnnot] = now
+		if err := r.Patch(ctx, svc, patch); err != nil {
+			log.Error(err, "Failed to patch Knative Service", "service", svc.Name)
</file context>
Fix with Cubic

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant