OCPBUGS-76943: Fix dualstack BMC secret and BMH lookup using short name extraction#30788
OCPBUGS-76943: Fix dualstack BMC secret and BMH lookup using short name extraction#30788clobrano wants to merge 5 commits intoopenshift:mainfrom
Conversation
|
Skipping CI for Draft Pull Request. |
|
@clobrano: This pull request references Jira Issue OCPBUGS-76943, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
No GitHub users were found matching the public email listed for the QA contact in Jira (dhensel@redhat.com), skipping review request. The bug has been updated to refer to the pull request using the external bug tracker. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
Pipeline controller notification For optional jobs, comment This repository is configured in: automatic mode |
|
@clobrano: This pull request references Jira Issue OCPBUGS-76943, which is valid. 3 validation(s) were run on this bug
No GitHub users were found matching the public email listed for the QA contact in Jira (dhensel@redhat.com), skipping review request. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/payload-job periodic-ci-openshift-release-main-nightly-4.22-e2e-metal-ovn-two-node-fencing-dualstack-recovery-techpreview |
|
@clobrano: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/ab6d40c0-0b4c-11f1-9908-74254d90a978-0 |
|
/payload-job periodic-ci-openshift-release-main-nightly-4.22-e2e-metal-ovn-two-node-fencing-dualstack-recovery-techpreview |
|
@clobrano: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/1fd7e7f0-0bdc-11f1-8d5c-7afa9c9c7395-0 |
|
/payload-job periodic-ci-openshift-release-main-nightly-4.22-e2e-metal-ovn-two-node-fencing-ipv6-recovery-techpreview |
|
@clobrano: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/ec76d5a0-0bf5-11f1-9a34-d2a9bac30969-0 |
|
/payload-job periodic-ci-openshift-release-main-nightly-4.22-e2e-metal-ovn-two-node-fencing-recovery-techpreview |
|
@clobrano: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/349ed1f0-0c1b-11f1-96bc-ebc7b2439b73-0 |
|
/payload-job periodic-ci-openshift-release-main-nightly-4.22-e2e-metal-ovn-two-node-fencing-ipv6-recovery-techpreview |
|
@clobrano: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/63497a90-0c21-11f1-80c9-e7d6709efd00-0 |
|
/payload-job periodic-ci-openshift-release-main-nightly-4.22-e2e-metal-ovn-two-node-fencing-ipv6-recovery-techpreview |
|
@clobrano: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/93dcdc30-0c2e-11f1-93da-bfdc0119b53e-0 |
|
/payload-job periodic-ci-openshift-release-main-nightly-4.22-e2e-metal-ovn-two-node-fencing-dualstack-recovery-techpreview |
|
@clobrano: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/098d1e60-0ca0-11f1-812d-e0af8d5b08dc-0 |
|
/payload-job periodic-ci-openshift-release-main-nightly-4.22-e2e-metal-ovn-two-node-fencing-ipv6-recovery-techpreview |
|
@clobrano: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/2a88bf70-0ccd-11f1-8791-faece5a7aad1-0 |
cb2c920 to
1c6d4c8
Compare
|
/payload-job periodic-ci-openshift-release-main-nightly-4.22-e2e-metal-ovn-two-node-fencing-dualstack-recovery-techpreview |
|
@clobrano: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/a39bb3d0-0d6e-11f1-8d4d-d438dda61354-0 |
|
/lgtm |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: clobrano, eggfoobar The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
This is failing for a problem in the job workflow (see openshift/release PR75198) /verified by @clobrano |
|
@clobrano: This PR has been marked as verified by DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
New changes are detected. LGTM label has been removed. |
|
/payload-job periodic-ci-openshift-release-main-nightly-4.22-e2e-metal-ovn-two-node-fencing-dualstack-recovery-techpreview |
|
@clobrano: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/be4057d0-121f-11f1-8b59-d7e8efccc748-0 |
|
/hold |
|
/payload-job periodic-ci-openshift-release-main-nightly-4.22-e2e-metal-ovn-two-node-fencing-dualstack-recovery-techpreview |
WalkthroughRefactors Kubernetes secret operations in the baremetalhost test utility from using Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 golangci-lint (2.5.0)Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions Comment |
|
@clobrano: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/e3084a50-1224-11f1-8b2e-d0254e6a9f08-0 |
|
@clobrano: This pull request references Jira Issue OCPBUGS-76943, which is valid. 3 validation(s) were run on this bug
No GitHub users were found matching the public email listed for the QA contact in Jira (dhensel@redhat.com), skipping review request. The bug has been updated to refer to the pull request using the external bug tracker. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
test/extended/two_node/utils/apis/baremetalhost.go (1)
133-133: Update stale comment to match implementation.Line 133 says “using oc patch”, but this block now uses client-go
Update.Suggested fix
- // Rotate password using oc patch + // Rotate password using Kubernetes client update🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/extended/two_node/utils/apis/baremetalhost.go` at line 133, Update the stale comment that reads "Rotate password using oc patch" to reflect the current implementation which uses the client-go Update call—e.g., change wording to "Rotate password using client-go Update" or similar; locate the comment near the password rotation block in test/extended/two_node/utils/apis/baremetalhost.go (the comment above the password rotation logic) and edit it to accurately describe that the code performs an Update via client-go rather than an oc patch.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/extended/two_node/utils/apis/baremetalhost.go`:
- Around line 153-156: The restore function RestoreBMCPassword is incorrectly
using the constant BMCSecretNamespace when creating secretClient and later
operations, so it can fetch/update the wrong Secret; replace usages of
BMCSecretNamespace with the function parameter namespace when constructing
secretClient (secretClient := oc.AdminKubeClient().CoreV1().Secrets(namespace))
and any subsequent calls that assume BMCSecretNamespace (e.g., the re-fetch/get
and update) so the function operates on the caller-provided namespace.
- Around line 135-136: The assignment to updated.Data[secretsDataPasswordKey]
can panic if secret.Data is nil after DeepCopy(); before assigning, check if
updated.Data == nil and if so initialize it (e.g., make(map[string][]byte)) then
set updated.Data[secretsDataPasswordKey] = []byte(newPass); apply the same
nil-check/initialization for the other assignment at the later usage so both
places that touch updated.Data (after DeepCopy()) guard against nil maps.
---
Nitpick comments:
In `@test/extended/two_node/utils/apis/baremetalhost.go`:
- Line 133: Update the stale comment that reads "Rotate password using oc patch"
to reflect the current implementation which uses the client-go Update call—e.g.,
change wording to "Rotate password using client-go Update" or similar; locate
the comment near the password rotation block in
test/extended/two_node/utils/apis/baremetalhost.go (the comment above the
password rotation logic) and edit it to accurately describe that the code
performs an Update via client-go rather than an oc patch.
ℹ️ Review info
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to data retention organization setting
📒 Files selected for processing (1)
test/extended/two_node/utils/apis/baremetalhost.go
| updated := secret.DeepCopy() | ||
| updated.Data[secretsDataPasswordKey] = []byte(newPass) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
cat -n test/extended/two_node/utils/apis/baremetalhost.go | sed -n '130,170p'Repository: openshift/origin
Length of output: 1584
🏁 Script executed:
rg -n "secret\s*:=\s*" test/extended/two_node/utils/apis/baremetalhost.go | head -20Repository: openshift/origin
Length of output: 42
🏁 Script executed:
head -50 test/extended/two_node/utils/apis/baremetalhost.goRepository: openshift/origin
Length of output: 1899
🏁 Script executed:
# Check if there's any Data initialization or nil checks between secret fetch and assignment
rg -n "secret.Data\s*=" test/extended/two_node/utils/apis/baremetalhost.goRepository: openshift/origin
Length of output: 42
🏁 Script executed:
# Look at the function signatures to understand the flow better
sed -n '125,145p' test/extended/two_node/utils/apis/baremetalhost.goRepository: openshift/origin
Length of output: 748
🏁 Script executed:
# Search for where secrets are created or used in the file to understand if Data can be nil
rg -n "secretClient\|Data" test/extended/two_node/utils/apis/baremetalhost.goRepository: openshift/origin
Length of output: 42
🏁 Script executed:
# Check Kubernetes corev1.Secret definition to understand Data field behavior
# Search for any Secret initialization or handling patterns in the codebase
rg -n "corev1\.Secret{|\.Data\s*=" test/extended/two_node/utils/apis/ -A 2Repository: openshift/origin
Length of output: 183
🏁 Script executed:
# Search for Secret initialization patterns
rg -n "corev1.Secret" test/extended/two_node/utils/apis/baremetalhost.goRepository: openshift/origin
Length of output: 42
🏁 Script executed:
# Check if there's documentation or tests showing Secret.Data behavior
rg -n "Data.*nil\|nil.*Data" test/extended/two_node/utils/apis/ -B2 -A2Repository: openshift/origin
Length of output: 42
🏁 Script executed:
# Get the full function context to see the entire flow
sed -n '105,170p' test/extended/two_node/utils/apis/baremetalhost.goRepository: openshift/origin
Length of output: 2550
🏁 Script executed:
# Check if there are other examples in the codebase of assigning to Secret.Data
rg -n "\.Data\[" test/extended/ -A 1 -B 1 | grep -E "updated|secret" -A 1 -B 1 | head -50Repository: openshift/origin
Length of output: 2704
🏁 Script executed:
# Check Kubernetes API documentation by looking at how Secrets are typically handled
# Look for patterns of nil-checking Data in the codebase
rg -n "\.Data == nil|if.*\.Data" test/extended/ | head -20Repository: openshift/origin
Length of output: 2107
🏁 Script executed:
# Verify the Go behavior: reading vs writing to nil maps
python3 << 'EOF'
# In Go:
# - Reading from a nil map: m := map[string]string(nil); v := m["key"] // safe, returns zero value
# - Writing to a nil map: m := map[string]string(nil); m["key"] = "value" // PANICS
#
# The code:
# Line 131: original := secret.Data[secretsDataPasswordKey] // SAFE - read from map
# Line 136: updated.Data[secretsDataPasswordKey] = []byte(newPass) // PANICS if Data == nil
#
# The review comment is CORRECT in principle about the Go behavior
print("Go nil map behavior:")
print("1. Reading from nil map is safe (returns zero value)")
print("2. Writing to nil map panics with 'assignment to entry in nil map'")
print("\nThe code at lines 136 and 160 writes to updated.Data")
print("If updated.Data is nil, both will panic")
print("\nQuestion: Can secret.Data be nil after secretClient.Get()?")
EOFRepository: openshift/origin
Length of output: 354
🏁 Script executed:
# Search for how Secrets are created/initialized in tests to understand typical state
rg -n "DeepCopy\|Secret{" test/extended/two_node/ -B 2 -A 2 | head -40Repository: openshift/origin
Length of output: 157
🏁 Script executed:
# Look for test fixtures or mock Secrets that might have nil Data
rg -n "corev1.Secret.*nil|&corev1.Secret" test/extended/two_node/ -A 3Repository: openshift/origin
Length of output: 42
🏁 Script executed:
# Check the actual Kubernetes k8s.io/api/core/v1 Secret type by searching documentation patterns
# Search for how Secrets are created in similar code patterns
rg -n "\.Data\[.*\] =" test/extended/ -B 5 | grep -E "DeepCopy|secretClient|secret :=" -A 5 | head -50Repository: openshift/origin
Length of output: 962
🏁 Script executed:
# Check whether k8s corev1.Secret.Data is typically initialized or can be nil
# by looking at how secrets are accessed elsewhere with ok checks
rg -n ", ok := secret.Data|, ok := .*\.Data" test/extended/ | head -20Repository: openshift/origin
Length of output: 1130
🏁 Script executed:
# Check if there's any pattern in the codebase where DeepCopy() is followed by Data assignment without nil check
rg -n "DeepCopy()" test/extended/ -A 3 | grep -E "\.Data\[|DeepCopy" | head -30Repository: openshift/origin
Length of output: 2815
🏁 Script executed:
# Verify the exact behavior: does the codebase follow a pattern of checking Data before assignment?
rg -n "if.*\.Data == nil" test/extended/ -B 1 -A 3 | head -40Repository: openshift/origin
Length of output: 805
🏁 Script executed:
# Show the exact nil-check patterns used elsewhere for Data assignment
sed -n '228,235p' test/extended/util/openshift/clusterversionoperator/adminack.goRepository: openshift/origin
Length of output: 394
🏁 Script executed:
sed -n '926,933p' test/extended/networking/util.goRepository: openshift/origin
Length of output: 254
Guard against nil Secret.Data before assignment.
Lines 136 and 160 assign to updated.Data[secretsDataPasswordKey] without ensuring the map is initialized. In Go, assigning to a nil map panics with "assignment to entry in nil map". Since DeepCopy() preserves the field state, if secret.Data is nil, the assignment will panic.
The codebase establishes the pattern of nil-checking before Data assignment (see test/extended/util/openshift/clusterversionoperator/adminack.go:230-233 and test/extended/networking/util.go:928-930).
Suggested fix
updated := secret.DeepCopy()
+ if updated.Data == nil {
+ updated.Data = map[string][]byte{}
+ }
updated.Data[secretsDataPasswordKey] = []byte(newPass) updated := secret.DeepCopy()
+ if updated.Data == nil {
+ updated.Data = map[string][]byte{}
+ }
updated.Data[secretsDataPasswordKey] = originalPassword📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| updated := secret.DeepCopy() | |
| updated.Data[secretsDataPasswordKey] = []byte(newPass) | |
| updated := secret.DeepCopy() | |
| if updated.Data == nil { | |
| updated.Data = map[string][]byte{} | |
| } | |
| updated.Data[secretsDataPasswordKey] = []byte(newPass) |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/extended/two_node/utils/apis/baremetalhost.go` around lines 135 - 136,
The assignment to updated.Data[secretsDataPasswordKey] can panic if secret.Data
is nil after DeepCopy(); before assigning, check if updated.Data == nil and if
so initialize it (e.g., make(map[string][]byte)) then set
updated.Data[secretsDataPasswordKey] = []byte(newPass); apply the same
nil-check/initialization for the other assignment at the later usage so both
places that touch updated.Data (after DeepCopy()) guard against nil maps.
| secretClient := oc.AdminKubeClient().CoreV1().Secrets(BMCSecretNamespace) | ||
| secret, err := secretClient.Get(ctx, name, metav1.GetOptions{}) | ||
| if err != nil { | ||
| return fmt.Errorf("failed to re-fetch BMC secret %s/%s: %w", namespace, name, err) |
There was a problem hiding this comment.
Use the function’s namespace parameter in restore operations.
Line 153 hardcodes BMCSecretNamespace, so RestoreBMCPassword can fetch/update the wrong Secret when callers pass a different namespace.
Suggested fix
- secretClient := oc.AdminKubeClient().CoreV1().Secrets(BMCSecretNamespace)
+ secretClient := oc.AdminKubeClient().CoreV1().Secrets(namespace)Also applies to: 162-163
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/extended/two_node/utils/apis/baremetalhost.go` around lines 153 - 156,
The restore function RestoreBMCPassword is incorrectly using the constant
BMCSecretNamespace when creating secretClient and later operations, so it can
fetch/update the wrong Secret; replace usages of BMCSecretNamespace with the
function parameter namespace when constructing secretClient (secretClient :=
oc.AdminKubeClient().CoreV1().Secrets(namespace)) and any subsequent calls that
assume BMCSecretNamespace (e.g., the re-fetch/get and update) so the function
operates on the caller-provided namespace.
|
Scheduling required tests: Scheduling tests matching the |
|
/retest-required |
|
@clobrano: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
/payload-job periodic-ci-openshift-release-main-nightly-4.22-e2e-metal-ovn-two-node-fencing-ipv6-recovery-techpreview |
|
@eggfoobar: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/7d19b040-1282-11f1-9b9b-70b85c63653e-0 |
|
Job Failure Risk Analysis for sha: 00c065a
|
|
Job Failure Risk Analysis for sha: 00c065a
|
|
/payload-job periodic-ci-openshift-release-main-nightly-4.22-e2e-metal-ovn-two-node-fencing-ipv6-recovery-techpreview |
|
@eggfoobar: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/eff695b0-12c8-11f1-8d3d-b0fe2b387aed-0 |
Extract short name from FQDN node names before pattern matching BMC secrets and BareMetalHosts. Updates RotateNodeBMCPassword and RestoreBMCPassword to use oc commands for consistency.
fixes: OCPBUGS-76943
Summary by CodeRabbit