From 8a1255944cba2b4854b87a6b27527710c2e6d816 Mon Sep 17 00:00:00 2001
From: Nikolai Emil Damm
Date: Fri, 29 May 2026 15:43:42 +0200
Subject: [PATCH] fix(flux): spread controllers across workers to prevent GitOps deadlock
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The four Flux controllers (source/kustomize/helm/notification) are
single-replica Deployments with no topology spread, so the scheduler can
stack them on one worker. On 2026-05-28 kustomize-controller landed on
prod-worker-2 when that node's Cilium ClusterIP datapath degraded after
an OOMKill; it then crash-looped on "dial tcp 10.96.0.1:443: i/o
timeout" and GitOps reconciliation stalled — so the fix for the
underlying OOM (#1649) could not even be applied. A single bad worker
decapitated reconciliation: a deadlock GitOps cannot self-heal from.

Add a soft topologySpreadConstraint (maxSkew 1, ScheduleAnyway, keyed on
app.kubernetes.io/part-of=flux) to every controller via the prod
FluxInstance kustomize.patches, so the set spreads across the three
workers. Soft (ScheduleAnyway) so it never blocks scheduling on the
capacity-constrained cluster.

Verified with a standalone kustomize build that the JSON6902 patch
injects the constraint as intended.

Co-Authored-By: Claude Opus 4.8
---
 .../flux-instance/flux-instance.yaml | 25 +++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/k8s/providers/hetzner/infrastructure/controllers/flux-instance/flux-instance.yaml b/k8s/providers/hetzner/infrastructure/controllers/flux-instance/flux-instance.yaml
index 5d5370064..7e7fca1b7 100644
--- a/k8s/providers/hetzner/infrastructure/controllers/flux-instance/flux-instance.yaml
+++ b/k8s/providers/hetzner/infrastructure/controllers/flux-instance/flux-instance.yaml
@@ -38,3 +38,28 @@ spec:
           - op: add
             path: /spec/template/spec/containers/0/args/-
             value: --requeue-dependency=5s
+      # Spread the four Flux controllers across worker nodes. They are
+      # single-replica Deployments that all carry app.kubernetes.io/part-of=flux,
+      # so a per-controller topology spread keyed on that label distributes the
+      # set (skew <= 1) instead of letting the scheduler stack them on one node.
+      # On 2026-05-28 kustomize-controller landed on prod-worker-2 when that
+      # node's Cilium ClusterIP datapath degraded after an OOMKill; it then
+      # crash-looped on "dial tcp 10.96.0.1:443: i/o timeout" and GitOps
+      # reconciliation stalled — so the fix for the underlying OOM could not be
+      # applied (a deadlock GitOps cannot self-heal from). Keeping the
+      # controllers spread means a single bad worker cannot decapitate
+      # reconciliation. Soft (ScheduleAnyway) so it never blocks scheduling on
+      # the capacity-constrained 3-worker cluster.
+      - target:
+          kind: Deployment
+          labelSelector: app.kubernetes.io/part-of=flux
+        patch: |
+          - op: add
+            path: /spec/template/spec/topologySpreadConstraints
+            value:
+              - maxSkew: 1
+                topologyKey: kubernetes.io/hostname
+                whenUnsatisfiable: ScheduleAnyway
+                labelSelector:
+                  matchLabels:
+                    app.kubernetes.io/part-of: flux
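
To illustrate what maxSkew 1 asks of the scheduler, here is a minimal Python sketch (not part of the patch). It computes skew as the difference between the fullest and the emptiest worker for pods matching the selector, which is a simplification of the scheduler's per-candidate-node calculation. The names prod-worker-1 and prod-worker-3 and the short controller keys are hypothetical; the commit only names prod-worker-2 and states that there are three workers.

```python
from collections import Counter

# Hypothetical node names: only prod-worker-2 appears in the commit message.
WORKERS = ["prod-worker-1", "prod-worker-2", "prod-worker-3"]


def skew(placement, nodes=WORKERS):
    """Return (max - min) matching pods per node, the quantity maxSkew bounds.

    `placement` maps a controller name to the node it runs on. Nodes with
    no matching pod count as zero.
    """
    counts = Counter(placement.values())
    per_node = [counts.get(node, 0) for node in nodes]
    return max(per_node) - min(per_node)


# All four controllers stacked on one worker: the pre-patch failure mode.
stacked = {
    "source": "prod-worker-2",
    "kustomize": "prod-worker-2",
    "helm": "prod-worker-2",
    "notification": "prod-worker-2",
}

# Four controllers over three workers: one node must hold two, so the
# best achievable skew is 1.
spread = {
    "source": "prod-worker-1",
    "kustomize": "prod-worker-2",
    "helm": "prod-worker-3",
    "notification": "prod-worker-1",
}

print(skew(stacked))  # 4
print(skew(spread))   # 1
```

Because the constraint uses ScheduleAnyway, the scheduler treats lower skew as a scoring preference rather than a hard requirement, so a stacked placement is still possible when the other workers lack capacity.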