fix(runner): stamp Workflow ownerRef on applied SeiNodeTask CRs by bdchatham · Pull Request #350 · sei-protocol/sei-k8s-controller

bdchatham · 2026-05-21T23:03:36Z

Summary

Closes the last gap in the workflow ownership chain. Before this change, the seitask runner subcommand applied SeiNodeTask (SNT) CRs via server-side apply (SSA) but never stamped an ownerRef pointing at the parent Workflow, so per-run SNTs survived workflow deletion and accumulated indefinitely across nightly runs.

keygen, provision-snd, and taskruntime.EnsureWorkflowVarsCM already stamp the Workflow ownerRef on what they create. This PR brings runner in line with that pattern:

Add OwnerRef *metav1.OwnerReference to runner.DefaultRenderer.
In cmd/seitask/runner.go, call taskruntime.LoadWorkflowIdentity(ctx, cliClient) at subcommand startup; pass a pointer to the value returned by wf.OwnerRef() into the renderer.
In RenderBytes, replace (not merge) metadata.ownerReferences so a template-smuggled bogus ref can't leak through — mirrors provisionsnd.stampMetadata exactly.

Result: kubectl delete workflow major-upgrade-<run-id> (or the gc-cronjob age sweep at 24h) cascades to every per-run resource — SND, workflow-vars CM, admin Secret, SeiNodeTasks, Task pods.
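
The replace-not-merge step described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `ownerRef` struct is a hypothetical stand-in for `metav1.OwnerReference`, `stampOwnerRef` is a hypothetical helper name, the manifest is modelled as a plain decoded map rather than whatever type `RenderBytes` actually works on, and the `apiVersion`, name, and UID values are placeholders.

```go
package main

import "fmt"

// ownerRef is a hypothetical stand-in for metav1.OwnerReference,
// reduced to the fields this sketch needs.
type ownerRef struct {
	APIVersion string
	Kind       string
	Name       string
	UID        string
}

// stampOwnerRef replaces (does not merge) metadata.ownerReferences on a
// decoded manifest. A nil ref leaves the manifest untouched, which keeps
// any template-declared refs in place.
func stampOwnerRef(manifest map[string]any, ref *ownerRef) {
	if ref == nil {
		return
	}
	meta, ok := manifest["metadata"].(map[string]any)
	if !ok {
		meta = map[string]any{}
		manifest["metadata"] = meta
	}
	// Overwrite the whole list so a ref declared in the template is dropped.
	meta["ownerReferences"] = []any{
		map[string]any{
			"apiVersion": ref.APIVersion,
			"kind":       ref.Kind,
			"name":       ref.Name,
			"uid":        ref.UID,
		},
	}
}

func main() {
	manifest := map[string]any{
		"kind": "SeiNodeTask",
		"metadata": map[string]any{
			"name": "example-task",
			"ownerReferences": []any{
				map[string]any{"kind": "Bogus", "name": "smuggled"},
			},
		},
	}
	// All field values below are placeholders for illustration only.
	stampOwnerRef(manifest, &ownerRef{
		APIVersion: "example.invalid/v1",
		Kind:       "Workflow",
		Name:       "major-upgrade-example",
		UID:        "placeholder-uid",
	})
	refs := manifest["metadata"].(map[string]any)["ownerReferences"].([]any)
	fmt.Println(len(refs), refs[0].(map[string]any)["kind"]) // prints "1 Workflow"
}
```

Overwriting the list, rather than appending to it, is what keeps a ref declared in the template from surviving into the applied object.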

Test plan

New TestRenderBytes_StampsOwnerRef covers both the stamping path (single Workflow ref replaces any template-declared refs) and the nil-ownerRef path (template-declared refs preserved for backward compat).
go test ./... passes.
Verify on the next nightly fire that newly-applied SNTs carry a Workflow ownerRef (kubectl get snt -o yaml | grep -A3 ownerReferences).

The runner subcommand was the last seitask command not stamping the parent Workflow's ownerRef on what it creates. provision-snd, keygen, and EnsureWorkflowVarsCM already do; runner-applied SeiNodeTask CRs silently leaked across runs because no ownerRef linked them to the Workflow lifecycle.

Add an OwnerRef field to DefaultRenderer; populate it from taskruntime.LoadWorkflowIdentity at subcommand startup; replace (not merge) ownerReferences on the rendered manifest before SSA. Mirrors provisionsnd.stampMetadata exactly so a template-smuggled bogus ref can't leak through.

With this in place, deleting the Workflow CR cascades the per-run SeiNodeTask CRs along with the existing SND + workflow-vars CM + admin Secret. The platform-side gc-cronjob can sweep just Workflows; cascade reaps the rest.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

cursor · 2026-05-21T23:03:42Z

PR Summary

Medium Risk
Medium risk because it changes Kubernetes metadata (ownerReferences) for all runner-applied tasks, affecting lifecycle/GC behavior and potentially overriding template-provided refs when enabled.

Overview
seitask runner now loads the parent Workflow identity at startup and stamps its ownerReference onto rendered/applied SeiNodeTask manifests, enabling cascading deletion when the workflow is removed.

runner.DefaultRenderer and RenderBytes were extended to accept an optional owner ref; when provided, rendering replaces (not merges) metadata.ownerReferences to prevent template-smuggled refs. Tests were updated and a new case added to verify both stamping and the nil/no-stamp path.

Reviewed by Cursor Bugbot for commit 906e520. Bugbot is set up for automated code reviews on this repo.

… on runner Task containers (#351)

#350 made taskruntime.LoadWorkflowIdentity mandatory at runner startup so applied SeiNodeTask CRs get a Workflow ownerRef. LoadWorkflowIdentity reads SEI_WORKFLOW_NAME + SEI_NAMESPACE from env (downward API).

The keygen/provision-snd/upload-report Task containers already project these via downward API, but the 12 `runner` Task containers in scenarios/major-upgrade.yaml did NOT; pre-#350 the runner didn't need them.

Result: every runner step in major-upgrade exits 2 with `seitask: infra: downward-API env not projected: [SEI_WORKFLOW_NAME SEI_NAMESPACE]` before doing any work. The gov pipeline silently failed end-to-end on the first post-#350 fire even though chaos-mesh reported the Workflow Accomplished (because chaos-mesh Accomplished doesn't mean container exit 0).

Add the env block to all 12 runner Task containers. scenarios/release-test and scenarios/load-test don't invoke `runner`, so they're unaffected.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
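
The failure mode in this commit message can be reproduced with a small sketch of an env-based identity loader. This is an assumption about the general shape only: `workflowIdentity` and `loadWorkflowIdentity` are hypothetical names, and the real `taskruntime.LoadWorkflowIdentity` takes a context and a client and is not shown in this PR. Only the two env var names and the error text come from the message above.

```go
package main

import "fmt"

// workflowIdentity is a hypothetical stand-in for whatever
// taskruntime.LoadWorkflowIdentity returns.
type workflowIdentity struct {
	Name      string
	Namespace string
}

// loadWorkflowIdentity reads the two downward-API env vars through the
// supplied lookup function and reports every missing one in a single error.
func loadWorkflowIdentity(getenv func(string) string) (workflowIdentity, error) {
	var missing []string
	name := getenv("SEI_WORKFLOW_NAME")
	if name == "" {
		missing = append(missing, "SEI_WORKFLOW_NAME")
	}
	ns := getenv("SEI_NAMESPACE")
	if ns == "" {
		missing = append(missing, "SEI_NAMESPACE")
	}
	if len(missing) > 0 {
		return workflowIdentity{}, fmt.Errorf("downward-API env not projected: %v", missing)
	}
	return workflowIdentity{Name: name, Namespace: ns}, nil
}

func main() {
	// A container without the env block: both lookups come back empty.
	empty := func(string) string { return "" }
	if _, err := loadWorkflowIdentity(empty); err != nil {
		fmt.Println("seitask: infra:", err)
	}

	// A container with the env block projected (placeholder values).
	env := map[string]string{
		"SEI_WORKFLOW_NAME": "major-upgrade-example",
		"SEI_NAMESPACE":     "example-ns",
	}
	wf, err := loadWorkflowIdentity(func(k string) string { return env[k] })
	fmt.Println(wf.Name, wf.Namespace, err)
}
```

Taking the lookup as a parameter keeps the sketch testable; a real command would presumably pass `os.Getenv` and exit with status 2 on error, as the message describes.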

bdchatham merged commit cd33795 into main May 21, 2026
5 checks passed

bdchatham mentioned this pull request May 22, 2026

fix(scenarios/major-upgrade): downward-API env on runner Tasks + tolerate space in gov REST status #351

Merged

2 tasks

bdchatham merged 1 commit into main from fix/runner-stamp-workflow-ownerref
