Deploy releases/k8s-manifests 454249c #146
Merged
…tion (phase 1)
Bumps civic-cloud blueprint v1.7.7 → v1.9.2, which brings:
- cert-manager 1.13.3 → 1.20.2 (Gateway API integration + ListenerSets gate)
- Gateway API v1.5.1 CRDs (standard channel)
- Envoy Gateway v1.7.3 controller (installs to envoy-gateway-system)
- hairpin-proxy removed (Linode LKE now supports LB hairpin natively)
- Server-side apply for CRDs in deploy workflow
Adds _infra/envoy-gateway/ with the three foundation resources copied from sandbox: GatewayClass `eg` references EnvoyProxy `shared` with mergeGateways enabled (single LB for all Gateways), and main-gateway has an HTTP catchall listener used by both cert-manager solver routes and the global HTTP→HTTPS redirect (added in phase 3.5).
Traffic still flows through ingress-nginx after this deploys; phase 1 is foundation only.
Refs: #144
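For orientation, the three foundation resources described above might look roughly like the sketch below. This is not the content of _infra/envoy-gateway/. The namespace of the EnvoyProxy and Gateway, the listener name `http`, and the `allowedRoutes` setting are assumptions. Only the names `eg`, `shared`, `main-gateway`, and the mergeGateways setting come from the commit message.

```yaml
# Sketch only: resource names come from the commit message; namespaces,
# listener name, and allowedRoutes are assumptions for illustration.
apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  name: eg
spec:
  controllerName: gateway.envoyproxy.io/gatewayclass-controller
  parametersRef:
    group: gateway.envoyproxy.io
    kind: EnvoyProxy
    name: shared
    namespace: envoy-gateway-system   # assumed namespace
---
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: EnvoyProxy
metadata:
  name: shared
  namespace: envoy-gateway-system     # assumed namespace
spec:
  mergeGateways: true                 # one shared LB for all Gateways of this class
---
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: main-gateway
  namespace: envoy-gateway-system     # assumed namespace
spec:
  gatewayClassName: eg
  listeners:
    - name: http                      # assumed listener name
      protocol: HTTP
      port: 80
      allowedRoutes:
        namespaces:
          from: All                   # assumed: lets routes in other namespaces attach
```

With no `hostname` on the listener, it accepts any host, which is what lets both the ACME solver routes and a global redirect route share it.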
Adds `letsencrypt-prod-gateway` and `letsencrypt-staging-gateway` ClusterIssuers using cert-manager 1.20's gatewayHTTPRoute solver against main-gateway. The existing nginx-solver issuers in cert-manager.issuers.yaml stay untouched so existing Ingress-managed Certs continue to renew normally, keeping the two paths cleanly separated until each app cuts over.
Lesson from sandbox: mutating the existing solver in place couples Ingress and Gateway renewal behavior in a way that's hard to reason about and hard to revert. Running the issuers in parallel is the safer pattern.
Refs: #144
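A minimal sketch of what a parallel Gateway-solver issuer could look like follows. It assumes main-gateway lives in envoy-gateway-system. The contact email and the account-key Secret name are placeholders, not values from this repository.

```yaml
# Sketch only: email, Secret name, and the Gateway namespace are assumptions.
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod-gateway
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: ops@example.org                    # placeholder contact address
    privateKeySecretRef:
      name: letsencrypt-prod-gateway-account  # hypothetical Secret name
    solvers:
      - http01:
          gatewayHTTPRoute:
            parentRefs:
              - kind: Gateway
                name: main-gateway
                namespace: envoy-gateway-system   # assumed namespace
```

A staging variant would differ only in its name, Secret, and the Let's Encrypt staging directory URL (https://acme-staging-v02.api.letsencrypt.org/directory).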
One file per app in _gateways/: a Gateway with per-hostname HTTPS listeners (each with its own cert-manager-managed cert via the letsencrypt-prod-gateway ClusterIssuer added in phase 2) plus a single HTTPRoute matching all the app's hostnames and routing to its backend Service.
Cert Secret naming uses the `-gw-tls` suffix to avoid collision with existing Ingress-managed `<app>-tls` Certs; both coexist until each app's Ingress is removed (phase 5).
Per-app HTTPRoutes attach only to the per-app HTTPS Gateway; HTTP→HTTPS redirect is handled globally on main-gateway (phase 3.5), not per-app.
Apex domains (balancerproject.org, choosenativeplants.com, codeforphilly.org, penn-chime.phl.io, vaultwarden.phl.io, bitwarden.phl.io) will not issue certs until their DNS cuts over to Envoy, because the HTTP-01 challenge needs to reach Envoy. Plan DNS cutover + cert issuance together for each apex.
For initial verification per app, the letsencrypt-prod-gateway annotation can be swapped to letsencrypt-staging-gateway to avoid Let's Encrypt rate limits during smoke testing, then flipped back to prod.
Refs: #144
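The per-app pattern might be sketched as below for a hypothetical app. The name `example-app`, its namespace, hostname, listener name, and Service port are all invented for illustration. Only the `-gw-tls` suffix, the issuer name, and GatewayClass `eg` come from the commit messages.

```yaml
# Sketch only: "example-app", its hostname, and the backend port are hypothetical.
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: example-app
  namespace: example-app
  annotations:
    # Swap to letsencrypt-staging-gateway while smoke testing.
    cert-manager.io/cluster-issuer: letsencrypt-prod-gateway
spec:
  gatewayClassName: eg
  listeners:
    - name: https-example-app           # one HTTPS listener per hostname
      hostname: example-app.live.k8s.phl.io
      protocol: HTTPS
      port: 443
      tls:
        mode: Terminate
        certificateRefs:
          - name: example-app-gw-tls    # -gw-tls avoids the Ingress-managed <app>-tls Secret
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: example-app
  namespace: example-app
spec:
  parentRefs:
    - name: example-app                 # attaches only to the per-app HTTPS Gateway
  hostnames:
    - example-app.live.k8s.phl.io
  rules:
    - backendRefs:
        - name: example-app             # hypothetical backend Service
          port: 80
```

An app with several hostnames would add one listener (and one `-gw-tls` Secret reference) per hostname, and list every hostname on the single HTTPRoute.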
…3.5)
Single HTTPRoute on main-gateway with a RequestRedirect filter (no hostnames, no path match → matches everything that hits the HTTP listener).
ACME challenges bypass the redirect via Gateway API conflict resolution: cert-manager's solver HTTPRoute carries both a hostname and an Exact path match on /.well-known/acme-challenge/<token>, both of which are more specific than the catch-all rule, so the solver route takes precedence.
Safe to deploy any time after phase 2; it doesn't depend on per-app Gateways being ready. Once DNS cuts over per host, HTTP requests to that host get a 301 to HTTPS instead of falling through to ingress-nginx.
Refs: #144
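A catch-all redirect of this kind could be written roughly as follows. The route name, namespace, and the `sectionName` are assumptions. The absence of `hostnames` and `matches` is what makes it match everything on the HTTP listener.

```yaml
# Sketch only: route name, namespace, and sectionName are assumptions.
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: http-to-https-redirect          # hypothetical name
  namespace: envoy-gateway-system       # assumed namespace
spec:
  parentRefs:
    - name: main-gateway
      sectionName: http                 # assumed name of the HTTP listener
  rules:
    - filters:                          # no hostnames, no matches: catch-all
        - type: RequestRedirect
          requestRedirect:
            scheme: https
            statusCode: 301
```

Because a rule with no `matches` defaults to a path-prefix match on `/`, any route with a hostname or an exact path wins over it under Gateway API precedence rules.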
Adopts the convention sandbox settled on: top-level directories under the
workspace root use the `_` prefix when they hold infrastructure / glue /
admin manifests that aren't tied to a single workload. Workloads stay bare.
Renames:
admins/ → _admins/
docs/ → _docs/
Updates `.holo/branches/docs-site/{_,docs/}_cfp-live-cluster.toml` to read
from `_docs/`, and the k8s-manifests exclude to skip `_docs/**`. The docs-site
branch still publishes `docs/` at root — only the workspace source path moved.
Already on the `_` convention: `_infra/`, `_gateways/` (added in the in-flight
Envoy Gateway migration on this branch).
Refs: cfp-sandbox-cluster@d7af5bd8 + @4763b70e
Adapted from cfp-sandbox-cluster@fadcf31c. Same structure (projection model, required local-diff QA, guardrails) but rewritten for live's situation:
- Migration is in flight (#144), not complete; sandbox is the source for patterns, live trails it
- Parallel ClusterIssuers `letsencrypt-{prod,staging}-gateway` coexist with the legacy nginx-solver `letsencrypt-{prod,staging}` at the repo root
- Wildcard DNS is `*.live.k8s.phl.io`, not `*.sandbox.k8s.phl.io`
- Apex domains documented (balancerproject.org, codeforphilly.org, etc.) + the ACME-DNS-cutover dependency for them
- No cnpg / shared-cluster; per-app PostgreSQL StatefulSets where needed
- ingress-nginx + hairpin-proxy noted as currently-present, scheduled for removal in #144
Refs: cfp-sandbox-cluster@fadcf31c
Envoy Gateway migration: phases 1–3.5 prep + workspace refactor
`kubectl diff` reports that applying 454249c will change:
Errors/Warnings