Implement service name caching for UniquePodNames#1796
Conversation
|
when updating the operators on an existing/deployed env, the ctplane CR gets updated to reflect the existing serviceName via the webhook: |
875f1ca to
9173588
Compare
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: stuggi The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
9173588 to
e044551
Compare
|
rebased |
|
The patch looks good to me and having a mechanism that migrates fields at webhook layer seems a good approach. In addition this mechanism can be either extended to support other services or reused for other purposes (see the rabbitmq patch series). |
c8d0b81 to
fa2c1ec
Compare
**Problem:** When UniquePodNames is enabled, GetServiceName() uses instance.UID[:5] to generate unique service names. During CREATE webhook, UID is empty (assigned after mutating webhooks), causing panic: "slice bounds out of range [:5] with length 0". **Solution - Hybrid Approach:** 1. **Webhook handles CREATE and UPDATE**: - CREATE: Generate random 5-char hex ID (UID not available yet) - UPDATE: Look up existing service CR by owner reference, cache its name - Uses lib-common's object.CheckOwnerRefExist() for robust ownership checking 2. **Controller triggers webhook for operator upgrades**: - When UniquePodNames=true but ServiceName="", controller adds annotation - Uses lib-common's webhook.EnsureWebhookTrigger() function directly - Sets annotation with timestamp, defer block saves it (triggers UPDATE webhook) - Webhook caches service name and removes annotation - Controller NEVER directly updates spec (follows Kubernetes best practices) **Key Implementation Details:** **GetServiceName() fix** (api/core/v1beta1/openstackcontrolplane_types.go): - Check if len(instance.UID) >= 5 before slicing - Return base name if UID unavailable (no panic) **Webhook caching** (api/core/v1beta1/openstackcontrolplane_webhook.go): - CacheServiceNameForCreate(): Uses random ID since UID unavailable - CacheServiceNameForUpdate(): Uses object.CheckOwnerRefExist(ownerUID, cr.GetOwnerReferences()) to find existing CR owned by this controlplane - Preserves existing CR name regardless of format (handles flips correctly) - Cleans up reconcile-trigger annotation after caching **Controller annotation trigger** (internal/openstack/cinder.go, glance.go): - Directly calls webhook.EnsureWebhookTrigger() from lib-common - Passes ReconcileTriggerAnnotation constant: "openstack.org/reconcile-trigger" - Passes logger and default timeout (0 = 5 minutes) - Returns Requeue=true, defer block saves annotation - No direct spec mutation (avoids GitOps conflicts, SSA issues) **Benefits:** - ✅ Consistent naming across reconciliations - ✅ Handles operator upgrades (existing CRs get names cached) - ✅ GitOps friendly (spec only mutated by webhook, not controller) - ✅ No SSA field ownership conflicts, if we'd call update in the controller if setting the spec. - ✅ Robust flip detection via owner references - ✅ Works with existing environments (no manual CR updates needed) - ✅ Uses generic webhook trigger mechanism from lib-common **Testing:** - Added comprehensive tests for all scenarios: - CREATE with UniquePodNames (uses random ID) - UPDATE with UniquePodNames (caches from existing CR or generates) - Operator upgrade scenario (controller triggers webhook) - Flip scenarios (UniquePodNames false→true, true→false) - Existing CR discovery by owner reference Depends-On: openstack-k8s-operators/lib-common#671 Jira: OSPRH-26002 Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> Signed-off-by: Martin Schuppert <mschuppert@redhat.com>
fa2c1ec to
10ac79c
Compare
3602e24
into
openstack-k8s-operators:main
Problem:
When UniquePodNames is enabled, GetServiceName() uses instance.UID[:5] to generate
unique service names. During CREATE webhook, UID is empty (assigned after mutating
webhooks), causing panic: "slice bounds out of range [:5] with length 0".
Solution - Hybrid Approach:
Webhook handles CREATE and UPDATE:
Controller triggers webhook for operator upgrades:
Key Implementation Details:
GetServiceName() fix (api/core/v1beta1/openstackcontrolplane_types.go):
Webhook caching (api/core/v1beta1/openstackcontrolplane_webhook.go):
to find existing CR owned by this controlplane
Controller annotation trigger (internal/openstack/cinder.go, glance.go):
Benefits:
controller if setting the spec.
Testing:
Depends-On: openstack-k8s-operators/lib-common#671
Jira: OSPRH-26002