Skip to content

DNM/TEST: force IPv4 for idle test in dual-stack + BGP environments#30813

Closed
jluhrsen wants to merge 1 commit intoopenshift:mainfrom
jluhrsen:fix-idle-test-k8s-1.34-auth
Closed

DNM/TEST: force IPv4 for idle test in dual-stack + BGP environments#30813
jluhrsen wants to merge 1 commit intoopenshift:mainfrom
jluhrsen:fix-idle-test-k8s-1.34-auth

Conversation

@jluhrsen
Copy link
Contributor

@jluhrsen jluhrsen commented Feb 26, 2026

Root cause: HAProxy prefers IPv6 endpoints in dual-stack configurations, but the IPv6 path hits Kubernetes 1.34 anonymous authentication restrictions (KEP-4633), resulting in HTTP 403 Forbidden errors.

This issue is specific to bare metal + BGP + dual-stack because:

  1. BGP advertises both IPv4 and IPv6 routes
  2. HAProxy sees both IPv4 and IPv6 endpoints
  3. HAProxy defaults to preferring IPv6 (documented behavior)
  4. K8s 1.34 (KEP-4633) restricts anonymous auth by default
  5. IPv6 requests through the BGP path hit auth restrictions

Virtual/cloud environments don't exhibit this because their traffic path doesn't involve the same BGP speaker routing.

Fix: Configure the test service with IPFamilyPolicy: SingleStack and IPFamilies: [IPv4] to prevent HAProxy from seeing IPv6 endpoints, forcing all traffic through the IPv4 path which works with anonymous auth.

Timeline:

  • K8s 1.34 shipped in OCP 4.22 payloads on Feb 18-19, 2026
  • Test started failing 100% on Feb 20, 2026
  • Only affects: e2e-metal-ipi-ovn-dualstack-bgp jobs

Related:

  • KEP-4633: Anonymous Auth Configurable Endpoints (K8s 1.34 GA)
  • KEP-4601: Authorization with Selectors (K8s 1.34 GA)
  • HAProxy issue Refactor Config labels, fix #286 #508: IPv6 preference behavior
  • OVN-Kubernetes conntrack fix (merged Jan 27, 2026)

Affects: e2e-metal-ipi-ovn-dualstack-bgp
Fixes: Consistent HTTP 403 errors on route access

Summary by CodeRabbit

  • Tests
    • Improved test logging for HTTP checks: when responses don't match the expected status during early attempts, tests now capture and log response headers and a truncated body preview.
    • Logs include errors encountered while reading response bodies, improving visibility, while preserving existing test behavior and outcomes.

@openshift-ci-robot
Copy link

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To trigger manually all jobs from second stage use /pipeline required command.

This repository is configured in: automatic mode

@jluhrsen jluhrsen marked this pull request as draft February 26, 2026 04:27
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 26, 2026
@jluhrsen
Copy link
Contributor Author

/test e2e-metal-ipi-ovn-dualstack-bgp

@coderabbitai
Copy link

coderabbitai bot commented Feb 26, 2026

Walkthrough

The waitHTTPGetStatus helper in the extended router tests now reads and inspects HTTP response bodies and improves logging: it logs read errors, conditionally logs headers and a truncated body preview for early failures, and otherwise preserves previous status-check behavior and return logic.

Changes

Cohort / File(s) Summary
Test: enhanced HTTP response logging
test/extended/router/idle.go
Replaced prior response-discard behavior with reading the response body via ioutil.ReadAll; log read errors with status and URL; on non-matching status within first 10 attempts, log headers and a truncated (max 500 chars) body preview; otherwise keep simple status logging. No signature or public API changes.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Title check ⚠️ Warning The PR title describes forcing IPv4 for an idle test in dual-stack BGP environments, but the actual changes modify HTTP response body reading and logging in the waitHTTPGetStatus function—unrelated to IPv4 enforcement or service IP family configuration. Update the title to reflect the actual changes made (e.g., 'Improve HTTP status check logging by reading and displaying response bodies') or ensure the code changes match the stated objective of forcing IPv4 in the idle test.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Stable And Deterministic Test Names ✅ Passed Test file contains no dynamic information in test names. PR changes only modify logging implementation within helper function body, not test titles.
Test Structure And Quality ✅ Passed PR enhances test quality by adding detailed logging for debugging HTTP response failures without modifying core test behavior.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.5.0)

Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions
The command is terminated due to an error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions


Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci openshift-ci bot requested review from frobware and rfredette February 26, 2026 04:29
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Feb 26, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: jluhrsen
Once this PR has been reviewed and has the lgtm label, please assign knobunc for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
test/extended/router/idle.go (1)

102-109: LGTM! Clear fix for the dual-stack IPv6 preference issue.

The IPFamilies and IPFamilyPolicy configuration correctly forces IPv4-only behavior, and the comment explains the rationale well (HAProxy IPv6 preference + K8s 1.34 KEP-4633 restrictions).

Optional: The IIFE for pointer creation is functional but verbose. Consider using ptr.To() from k8s.io/utils/ptr for cleaner code:

♻️ Optional simplification using ptr.To()

Add import:

"k8s.io/utils/ptr"

Then simplify:

-				IPFamilyPolicy: func() *corev1.IPFamilyPolicy {
-					policy := corev1.IPFamilyPolicySingleStack
-					return &policy
-				}(),
+				IPFamilyPolicy: ptr.To(corev1.IPFamilyPolicySingleStack),

,

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/extended/router/idle.go` around lines 102 - 109, Replace the anonymous
IIFE used to create a pointer for IPFamilyPolicy with the helper from
k8s.io/utils/ptr: import "k8s.io/utils/ptr" and change the IPFamilyPolicy value
from the inline func that returns &policy to
ptr.To(corev1.IPFamilyPolicySingleStack) so the block uses IPFamilyPolicy:
ptr.To(corev1.IPFamilyPolicySingleStack) while leaving IPFamilies and the
surrounding comment unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@test/extended/router/idle.go`:
- Around line 102-109: Replace the anonymous IIFE used to create a pointer for
IPFamilyPolicy with the helper from k8s.io/utils/ptr: import "k8s.io/utils/ptr"
and change the IPFamilyPolicy value from the inline func that returns &policy to
ptr.To(corev1.IPFamilyPolicySingleStack) so the block uses IPFamilyPolicy:
ptr.To(corev1.IPFamilyPolicySingleStack) while leaving IPFamilies and the
surrounding comment unchanged.

ℹ️ Review info

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between a631c60 and a4d3dfa.

📒 Files selected for processing (1)
  • test/extended/router/idle.go

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Feb 26, 2026

@jluhrsen: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-metal-ipi-ovn-dualstack-bgp a4d3dfa link false /test e2e-metal-ipi-ovn-dualstack-bgp

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

jluhrsen added a commit to jluhrsen/origin that referenced this pull request Feb 26, 2026
The router idling test is failing with HTTP 403 Forbidden on
metal-ipi-ovn-dualstack-bgp jobs starting Feb 20, 2026.

Previous hypothesis (IPv6 endpoint preference) was disproven when
forcing IPv4-only service still resulted in HTTP 403.

This commit:
- Reverts the IPv4-only service configuration (didn't help)
- Adds logging of HTTP response body and headers for the first 10
  non-200 responses
- Limits body preview to 500 chars to avoid log spam

This will help identify what component is returning 403 and why.

Affects: e2e-metal-ipi-ovn-dualstack-bgp
Related: openshift#30813
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
test/extended/router/idle.go (2)

314-331: Use io.ReadAll instead of deprecated ioutil.ReadAll.

ioutil.ReadAll is deprecated since Go 1.16. The io package is already imported, so switch to io.ReadAll.

The diagnostic logging logic itself looks good for debugging the 403 errors in dual-stack BGP environments.

♻️ Proposed fix
 		// Read response body to help debug unexpected status codes
-		bodyBytes, readErr := ioutil.ReadAll(resp.Body)
+		bodyBytes, readErr := io.ReadAll(resp.Body)
 		if readErr != nil {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/extended/router/idle.go` around lines 314 - 331, Replace the deprecated
ioutil.ReadAll call with io.ReadAll: in the block that reads the response body
(where bodyBytes, readErr are assigned from ioutil.ReadAll(resp.Body)), call
io.ReadAll(resp.Body) instead; keep the existing error handling and subsequent
logging logic that uses bodyBytes, readErr, resp.StatusCode, resp.Header and
attempt unchanged so behavior remains identical while removing the deprecated
import usage.

8-8: Remove deprecated io/ioutil import.

The io/ioutil package has been deprecated since Go 1.16. Since io is already imported on line 7, this import can be removed after updating the ioutil.ReadAll call on line 316.

♻️ Proposed fix
-	"io/ioutil"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/extended/router/idle.go` at line 8, Remove the deprecated io/ioutil
import and replace its use with the io package: delete the "io/ioutil" import
entry and change the ioutil.ReadAll(...) call (the ReadAll invocation in this
file, e.g., where ioutil.ReadAll is used) to io.ReadAll(...); ensure the io
import remains and run go vet/go test to verify there are no remaining
references to io/ioutil.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@test/extended/router/idle.go`:
- Around line 314-331: Replace the deprecated ioutil.ReadAll call with
io.ReadAll: in the block that reads the response body (where bodyBytes, readErr
are assigned from ioutil.ReadAll(resp.Body)), call io.ReadAll(resp.Body)
instead; keep the existing error handling and subsequent logging logic that uses
bodyBytes, readErr, resp.StatusCode, resp.Header and attempt unchanged so
behavior remains identical while removing the deprecated import usage.
- Line 8: Remove the deprecated io/ioutil import and replace its use with the io
package: delete the "io/ioutil" import entry and change the ioutil.ReadAll(...)
call (the ReadAll invocation in this file, e.g., where ioutil.ReadAll is used)
to io.ReadAll(...); ensure the io import remains and run go vet/go test to
verify there are no remaining references to io/ioutil.

ℹ️ Review info

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between a4d3dfa and f8070c3.

📒 Files selected for processing (1)
  • test/extended/router/idle.go

The router idling test is failing with HTTP 403 Forbidden on
metal-ipi-ovn-dualstack-bgp jobs starting Feb 20, 2026.

Root cause is still unknown. This commit adds diagnostic logging to
capture the actual HTTP response body and headers for the first 10
failed requests.

Changes:
- Log full HTTP headers on non-200 responses (first 10 attempts)
- Log response body preview (up to 500 chars)
- After attempt 10, just log status code to avoid log spam

This will help identify:
- What component is returning 403 (HAProxy, proxy, auth service)
- Why it's forbidden (error message in response body)
- Any relevant headers (forwarding, auth, server info)

Affects: e2e-metal-ipi-ovn-dualstack-bgp
Related: Investigation of Feb 20, 2026 test failures
@jluhrsen jluhrsen force-pushed the fix-idle-test-k8s-1.34-auth branch from f8070c3 to 8256187 Compare February 26, 2026 17:35
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
test/extended/router/idle.go (2)

98-110: ⚠️ Potential issue | 🟠 Major

Implement the IPv4 SingleStack service config described by the PR objective.

idleService.Spec currently does not set IPFamilyPolicy / IPFamilies. On dual-stack+BGP, this can still expose IPv6 endpoints and keep the 403 path active instead of enforcing IPv4.

💡 Proposed fix
 			Spec: corev1.ServiceSpec{
+				IPFamilyPolicy: utilpointer.To(corev1.IPFamilyPolicySingleStack),
+				IPFamilies: []corev1.IPFamily{
+					corev1.IPv4Protocol,
+				},
 				Selector: map[string]string{
 					"app": "idle-test",
 				},
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/extended/router/idle.go` around lines 98 - 110, The Service spec for the
test's idleService is missing IPFamilyPolicy/IPFamilies and must be configured
for IPv4 single-stack; update the corev1.ServiceSpec (the object created as
idleService / the inline Spec block) to set IPFamilyPolicy to
corev1.IPFamilyPolicySingleStack and IPFamilies to
[]corev1.IPFamily{corev1.IPv4Protocol} so the test always exposes IPv4 endpoints
only.

7-9: ⚠️ Potential issue | 🔴 Critical

Switch to io.ReadAll and remove io/ioutil import.

The io package is imported at line 7 but never used. Line 316 calls ioutil.ReadAll, which is deprecated since Go 1.16. Use io.ReadAll instead and remove the unused io/ioutil import. Unused imports cause compilation failures.

💡 Proposed fix
-	"io"
-	"io/ioutil"
+	"io"
...
-		bodyBytes, readErr := ioutil.ReadAll(resp.Body)
+		bodyBytes, readErr := io.ReadAll(resp.Body)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/extended/router/idle.go` around lines 7 - 9, Replace the deprecated call
to ioutil.ReadAll with io.ReadAll and remove the unused "io/ioutil" import:
change the call to ioutil.ReadAll(...) to io.ReadAll(...), delete the
"io/ioutil" entry from the import block, and run goimports/gofmt to clean up
imports so the existing "io" import is used (this fixes the unused-import
compile error referencing ioutil.ReadAll).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@test/extended/router/idle.go`:
- Around line 98-110: The Service spec for the test's idleService is missing
IPFamilyPolicy/IPFamilies and must be configured for IPv4 single-stack; update
the corev1.ServiceSpec (the object created as idleService / the inline Spec
block) to set IPFamilyPolicy to corev1.IPFamilyPolicySingleStack and IPFamilies
to []corev1.IPFamily{corev1.IPv4Protocol} so the test always exposes IPv4
endpoints only.
- Around line 7-9: Replace the deprecated call to ioutil.ReadAll with io.ReadAll
and remove the unused "io/ioutil" import: change the call to ioutil.ReadAll(...)
to io.ReadAll(...), delete the "io/ioutil" entry from the import block, and run
goimports/gofmt to clean up imports so the existing "io" import is used (this
fixes the unused-import compile error referencing ioutil.ReadAll).

ℹ️ Review info

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between f8070c3 and 8256187.

📒 Files selected for processing (1)
  • test/extended/router/idle.go

@jluhrsen
Copy link
Contributor Author

/close
talk about barking up the wrong tree

@openshift-ci openshift-ci bot closed this Feb 26, 2026
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Feb 26, 2026

@jluhrsen: Closed this PR.

Details

In response to this:

/close
talk about barking up the wrong tree

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants