Skip to content

test(e2e): add test-mode failure handling tests#988

Open
shifa-khan wants to merge 1 commit intopython-wheel-build:mainfrom
shifa-khan:895
Open

test(e2e): add test-mode failure handling tests#988
shifa-khan wants to merge 1 commit intopython-wheel-build:mainfrom
shifa-khan:895

Conversation

@shifa-khan
Copy link
Copy Markdown
Contributor

@shifa-khan shifa-khan commented Mar 26, 2026

Add e2e tests for --test-mode bootstrapping failures as requested in issue

Tests cover:

  • Top-level resolution failure (non-existent package)
  • Secondary dependency resolution failure (constrained pbr)
  • Build failure without prebuilt fallback (local git fixture)
  • Build failure with prebuilt fallback (broken patch on setuptools)

Closes: #895

@shifa-khan shifa-khan requested a review from a team as a code owner March 26, 2026 21:34
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 26, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds end-to-end tests for fromager bootstrap --test-mode. Introduces a failing-package fixture at e2e/test_build_failure/ (pyproject.toml, setup.py, package init that raises) and four new test scripts: e2e/test_mode_resolution.sh, e2e/test_mode_deps.sh, e2e/test_mode_build.sh, and e2e/test_mode_fallback.sh. Each script runs bootstrap in test-mode and validates exit codes, bootstrap logs, and generated test-mode-failures-*.json contents for resolution, dependency, build, and fallback scenarios. The CI suite e2e/ci_bootstrap_suite.sh adds a new "bootstrap test-mode tests" section that runs these four scripts in sequence.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the main change—adding end-to-end tests for --test-mode failure handling scenarios.
Description check ✅ Passed The description clearly outlines the test coverage (resolution, dependency, and build failure scenarios) and links to issue #895.
Linked Issues check ✅ Passed All code changes fulfill issue #895 requirements: four test scripts cover resolution failure, dependency resolution failure, build failure without fallback, and build failure with fallback.
Out of Scope Changes check ✅ Passed All changes are scoped to issue #895: test fixture files (test_build_failure) and four test scripts align with the stated objective of validating --test-mode failure handling.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@shifa-khan shifa-khan requested a review from smoparth March 26, 2026 21:34
@mergify mergify bot added the ci label Mar 26, 2026
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
e2e/test_build_failure/setup.py (1)

2-2: Unused import: sys is imported but never used.

This import can be removed as it serves no purpose in the current implementation.

🧹 Proposed fix
 """Setup script that intentionally fails during build."""
-import sys
 
 # Fail immediately when this module is imported during build
 raise RuntimeError("Intentional build failure for e2e testing")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@e2e/test_build_failure/setup.py` at line 2, Remove the unused import by
deleting the lone "import sys" statement at the top of the module (the unused
symbol "sys"); update the file so there is no unused import left and run the
linter/pytest to verify no remaining import warnings.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@e2e/test_mode_build.sh`:
- Around line 88-91: The conditional uses `if ! grep -q "pre-built fallback"
"$OUTDIR/bootstrap.log"` but the echo message is inverted; update the message or
the condition so it reflects the actual check: either remove the `!` to print
"INFO: Log mentions fallback (may or may not be present)" when the pattern
exists, or (preferably) keep the negation and change the echo to "INFO: Log does
not mention fallback" so the `grep -q "pre-built fallback"` check and the `echo`
text align; modify the line with `grep -q` and the `echo` accordingly.

In `@e2e/test_mode_fallback.sh`:
- Around line 66-106: Ensure the script actually fails the test when fallback
doesn't happen: validate EXIT_CODE is 0 or 1 and set pass=false otherwise;
require that either the log contains "successfully used pre-built wheel" (treat
as success) or a valid failures file exists (FAILURES_FILE with .failures length
> 0) — if neither is true set pass=false; change the informational checks around
"applying patch\|patch", "pre-built fallback", and the test mode summary grep
("test mode:" and "test mode enabled") so they set pass=false when their
expected positive evidence is missing (use the variables EXIT_CODE,
OUTDIR/bootstrap.log, FAILURES_FILE, FAILURE_COUNT and the exact grep strings
"successfully used pre-built wheel", "pre-built fallback", "test mode enabled",
and "test mode:").

---

Nitpick comments:
In `@e2e/test_build_failure/setup.py`:
- Line 2: Remove the unused import by deleting the lone "import sys" statement
at the top of the module (the unused symbol "sys"); update the file so there is
no unused import left and run the linter/pytest to verify no remaining import
warnings.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1073cfef-4117-4697-bab9-123724b65de7

📥 Commits

Reviewing files that changed from the base of the PR and between 9b399c8 and 33d9867.

📒 Files selected for processing (8)
  • e2e/ci_bootstrap_suite.sh
  • e2e/test_build_failure/pyproject.toml
  • e2e/test_build_failure/setup.py
  • e2e/test_build_failure/test_build_failure/__init__.py
  • e2e/test_mode_build.sh
  • e2e/test_mode_deps.sh
  • e2e/test_mode_fallback.sh
  • e2e/test_mode_resolution.sh

# Initialize git repo at runtime (fixture files are committed without .git)
FIXTURE_DIR="$SCRIPTDIR/test_build_failure"
if [ ! -d "$FIXTURE_DIR/.git" ]; then
(cd "$FIXTURE_DIR" && git init -q && git add -A && git commit -q -m "init")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test is failing CI
Error: https://github.com/python-wheel-build/fromager/actions/runs/23619151714/job/68794088620?pr=988#step:8:5039
Fix: The git commit needs user.name and user.email to be set. Use placeholder values ex: testuser@example.com , testuser

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@smoparth I've fixed it using your suggestion. Mind taking a look when available

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
e2e/test_mode_fallback.sh (1)

79-111: ⚠️ Potential issue | 🟠 Major

Key checks are still permissive and can hide regressions.

Line 79-82 and Line 106-111 still allow pass-through when expected evidence is missing, and Line 96-104 skips validation when the failures report is missing. Tighten these to fail the test on missing expected artifacts/log markers.

Proposed change
 # Check 3: Look for evidence of the patch failure
 if grep -q "applying patch\|patch" "$OUTDIR/bootstrap.log"; then
   echo "Patch application was attempted"
+else
+  echo "FAIL: Expected patch activity in bootstrap log" 1>&2
+  pass=false
 fi
@@
-FAILURES_FILE=$(find "$OUTDIR/work-dir" -name "test-mode-failures-*.json" 2>/dev/null | head -1)
-if [ -n "$FAILURES_FILE" ] && [ -f "$FAILURES_FILE" ]; then
+FAILURES_FILE=$(find "$OUTDIR/work-dir" -name "test-mode-failures-*.json" -print -quit 2>/dev/null)
+if [ -n "$FAILURES_FILE" ] && [ -f "$FAILURES_FILE" ]; then
   FAILURE_COUNT=$(jq '.failures | length' "$FAILURES_FILE")
   if [ "$FAILURE_COUNT" -gt 0 ]; then
     echo "FAIL: Expected no failures (fallback should succeed), got $FAILURE_COUNT" 1>&2
     jq '.failures[] | {package, failure_type, exception_type}' "$FAILURES_FILE" 1>&2
     pass=false
   fi
+else
+  echo "FAIL: Expected test-mode failures report file to be present" 1>&2
+  pass=false
 fi
@@
 else
-  echo "NOTE: Test mode summary not found in log" 1>&2
+  echo "FAIL: Test mode summary not found in log" 1>&2
+  pass=false
 fi
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@e2e/test_mode_fallback.sh` around lines 79 - 111, The current checks are
permissive; update the grep and failure-file logic so missing expected artifacts
cause the script to fail by setting pass=false and emitting an error to stderr.
Specifically: for the "applying patch\|patch" grep (check 3) treat absence as an
error (set pass=false and echo to stderr); for the "pre-built fallback" and
"successfully used pre-built wheel" checks (check 4) ensure both missing and
partial matches set pass=false and print failure messages to stderr instead of
only echoing success; for FAILURES_FILE handling (variable FAILURES_FILE and
FAILURE_COUNT logic) treat a missing FAILURES_FILE as a test failure (set
pass=false and echo an error) rather than skipping validation; and for the "test
mode:" grep (check 6) treat a missing summary as a failure (set pass=false and
echo to stderr) instead of a note. Ensure all error messages go to stderr and
the global pass variable is updated consistently.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@e2e/test_mode_fallback.sh`:
- Around line 12-14: Add nounset handling by enabling -u in the script: after
sourcing common.sh (the line "source \"$SCRIPTDIR/common.sh\""), add an explicit
set -u (or set -euo pipefail) so the script treats unset variables as errors;
this ensures test_mode_fallback.sh honors nounset even if common.sh doesn't set
it.

---

Duplicate comments:
In `@e2e/test_mode_fallback.sh`:
- Around line 79-111: The current checks are permissive; update the grep and
failure-file logic so missing expected artifacts cause the script to fail by
setting pass=false and emitting an error to stderr. Specifically: for the
"applying patch\|patch" grep (check 3) treat absence as an error (set pass=false
and echo to stderr); for the "pre-built fallback" and "successfully used
pre-built wheel" checks (check 4) ensure both missing and partial matches set
pass=false and print failure messages to stderr instead of only echoing success;
for FAILURES_FILE handling (variable FAILURES_FILE and FAILURE_COUNT logic)
treat a missing FAILURES_FILE as a test failure (set pass=false and echo an
error) rather than skipping validation; and for the "test mode:" grep (check 6)
treat a missing summary as a failure (set pass=false and echo to stderr) instead
of a note. Ensure all error messages go to stderr and the global pass variable
is updated consistently.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: c32c822f-a251-47b0-bfee-1a87bcea06fd

📥 Commits

Reviewing files that changed from the base of the PR and between 33d9867 and cdeb286.

📒 Files selected for processing (8)
  • e2e/ci_bootstrap_suite.sh
  • e2e/test_build_failure/pyproject.toml
  • e2e/test_build_failure/setup.py
  • e2e/test_build_failure/test_build_failure/__init__.py
  • e2e/test_mode_build.sh
  • e2e/test_mode_deps.sh
  • e2e/test_mode_fallback.sh
  • e2e/test_mode_resolution.sh
✅ Files skipped from review due to trivial changes (5)
  • e2e/test_build_failure/setup.py
  • e2e/ci_bootstrap_suite.sh
  • e2e/test_mode_deps.sh
  • e2e/test_build_failure/pyproject.toml
  • e2e/test_build_failure/test_build_failure/init.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • e2e/test_mode_resolution.sh
  • e2e/test_mode_build.sh

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
e2e/test_mode_build.sh (1)

55-55: Make failure-file discovery pipefail-safe.

Line 55 uses find ... | head -1, which is brittle under set -o pipefail (upstream find can exit on SIGPIPE). Use find ... -print -quit instead.

Suggested patch
-FAILURES_FILE=$(find "$OUTDIR/work-dir" -name "test-mode-failures-*.json" 2>/dev/null | head -1)
+FAILURES_FILE=$(find "$OUTDIR/work-dir" -name "test-mode-failures-*.json" -print -quit 2>/dev/null)

As per coding guidelines, e2e/**/*.sh: Ensure set -euo pipefail is used.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@e2e/test_mode_build.sh` at line 55, The current FAILURES_FILE assignment uses
a pipeline `find ... | head -1` which is unsafe under set -o pipefail; modify
the FAILURES_FILE assignment (the line that sets the FAILURES_FILE variable) to
use find's builtin exit-after-first-match flags (e.g., `-print -quit`) and keep
the stderr redirection (2>/dev/null) instead of piping to head, and ensure the
script enables strict mode (`set -euo pipefail`) near the top of the file so the
shell honors pipefail semantics.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@e2e/test_mode_build.sh`:
- Around line 18-23: The fixture .git is created without teardown and may
persist between runs; add a trap to remove "$FIXTURE_DIR/.git" when the script
exits if this script created it. Implement a flag (e.g., CREATED_FIXTURE_GIT=1)
set when you run git init in the block that checks "$FIXTURE_DIR/.git", register
a trap like trap '[[ $CREATED_FIXTURE_GIT ]] && rm -rf "$FIXTURE_DIR/.git"'
EXIT, and after the init block ensure the repository has a valid HEAD (for
example, check git rev-parse --verify HEAD or test for "$FIXTURE_DIR/.git/HEAD"
and if missing perform a commit in FIXTURE_DIR to create HEAD using git add -A
&& git commit -q -m "init").
- Around line 12-14: The script relies on shell strict mode coming implicitly
from common.sh; explicitly enable strict mode to avoid fragile coupling by
adding set -euo pipefail after sourcing common.sh so the script enforces failure
on errors/unset variables and pipe failures even if common.sh changes; modify
the file that defines SCRIPTDIR and sources common.sh (the lines using SCRIPTDIR
and source "$SCRIPTDIR/common.sh") to include an immediate set -euo pipefail
following the source statement.

---

Nitpick comments:
In `@e2e/test_mode_build.sh`:
- Line 55: The current FAILURES_FILE assignment uses a pipeline `find ... | head
-1` which is unsafe under set -o pipefail; modify the FAILURES_FILE assignment
(the line that sets the FAILURES_FILE variable) to use find's builtin
exit-after-first-match flags (e.g., `-print -quit`) and keep the stderr
redirection (2>/dev/null) instead of piping to head, and ensure the script
enables strict mode (`set -euo pipefail`) near the top of the file so the shell
honors pipefail semantics.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: fcca1ffb-2624-40a2-9920-45a60f65111b

📥 Commits

Reviewing files that changed from the base of the PR and between cdeb286 and 168cb16.

📒 Files selected for processing (8)
  • e2e/ci_bootstrap_suite.sh
  • e2e/test_build_failure/pyproject.toml
  • e2e/test_build_failure/setup.py
  • e2e/test_build_failure/test_build_failure/__init__.py
  • e2e/test_mode_build.sh
  • e2e/test_mode_deps.sh
  • e2e/test_mode_fallback.sh
  • e2e/test_mode_resolution.sh
✅ Files skipped from review due to trivial changes (6)
  • e2e/test_build_failure/test_build_failure/init.py
  • e2e/ci_bootstrap_suite.sh
  • e2e/test_build_failure/setup.py
  • e2e/test_build_failure/pyproject.toml
  • e2e/test_mode_fallback.sh
  • e2e/test_mode_deps.sh
🚧 Files skipped from review as they are similar to previous changes (1)
  • e2e/test_mode_resolution.sh

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@e2e/test_mode_build.sh`:
- Around line 80-106: Update the failure-path assertions to require the intended
build→fallback-fail flow: broaden the allowed FAILURE_TYPE values by adding
"hook" and "dependency_extraction" to the check that uses the FAILURE_TYPE
variable, and make the fallback log check strict by treating absence of the
"pre-built fallback" message in OUTDIR/bootstrap.log as a failure (set
pass=false and echo a FAIL message) instead of printing INFO; preserve the
existing EXCEPTION_MSG check but keep it tied to the same failure entry so the
three checks (FAILURE_TYPE, EXCEPTION_MSG, and fallback log) all must pass to
signal the expected build→fallback-fail path.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 432ce6b5-41fc-4b13-9155-41b1395750dd

📥 Commits

Reviewing files that changed from the base of the PR and between 50d3f0a and f26a89e.

📒 Files selected for processing (8)
  • e2e/ci_bootstrap_suite.sh
  • e2e/test_build_failure/pyproject.toml
  • e2e/test_build_failure/setup.py
  • e2e/test_build_failure/test_build_failure/__init__.py
  • e2e/test_mode_build.sh
  • e2e/test_mode_deps.sh
  • e2e/test_mode_fallback.sh
  • e2e/test_mode_resolution.sh
✅ Files skipped from review due to trivial changes (3)
  • e2e/test_build_failure/test_build_failure/init.py
  • e2e/test_build_failure/setup.py
  • e2e/test_build_failure/pyproject.toml
🚧 Files skipped from review as they are similar to previous changes (1)
  • e2e/ci_bootstrap_suite.sh

@@ -0,0 +1,3 @@
# This file intentionally raises an error when imported
# to cause build failures for e2e testing
raise RuntimeError("Intentional build failure for e2e testing")
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is misplaced and will be never invoked during a test.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A reasonable fix would be to either:

  • Replace the body with a docstring: """Test fixture package for e2e build failure tests."""
  • Or just leave it empty (Python requires init.py to exist for the package to be recognized by setuptools' auto-discovery, but it doesn't need to contain anything)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! I've simplified it to just a docstring since the actual failure mechanism is in setup.py.

echo "FAIL: Unexpected failure_type '$FAILURE_TYPE'" 1>&2
pass=false
;;
esac
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The assertion should be tightened to "bootstrap".
The current case statement accepting all four types would pass even if a code change accidentally reclassified this failure as "resolution", which would be a real bug (the package resolved fine but the build failed). Pinning it catches that regression.

  # Current (too loose):                                                                                    
  case "$FAILURE_TYPE" in                                                                                                                                                                           
    resolution|bootstrap|hook|dependency_extraction)                                                                                                                                                
# Should be:                                                                                                                                                                                      
  if [ "$FAILURE_TYPE" != "bootstrap" ]; then                                                                                                                                                       
    echo "FAIL: Expected failure_type 'bootstrap', got '$FAILURE_TYPE'" 1>&2                                                                                                                        
    pass=false                                                                                                                                                                                      
  fi                                                                                                           

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. I've reworked the fixture to fail during actual wheel building instead of metadata extraction.

# Check 3: Look for evidence of the patch failure
if grep -q "applying patch\|patch" "$OUTDIR/bootstrap.log"; then
echo "Patch application was attempted"
fi
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check 3 (line 80) is informational-only, no failure branch, so it can't catch regressions. Also, grep -q "applying patch|patch" matches any line containing "patch" (including the startup log
"patches dir: ..."), so it always passes.

Suggest converting to a real assertion:

 # Check 3: Patch application must be attempted
 if ! grep -q "applying patch file.*break-build.patch" "$OUTDIR/bootstrap.log"; then                                                                                                               
   echo "FAIL: Expected patch 'break-build.patch' to be applied" 1>&2                                                                                                                              
   pass=false
 fi                                                                                                                                                                                                

This adds a failure branch, targets the specific log message, and checks the actual patch filename

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Changed the pattern from "applying patch|patch" to "applying patch file.*break-build.patch" so it specifically looks for our test patch.

Add e2e tests for --test-mode bootstrapping failures as requested
in issue

 Tests cover:
- Top-level resolution failure (non-existent package)
- Secondary dependency resolution failure (constrained pbr)
- Build failure without prebuilt fallback (local git fixture)
- Build failure with prebuilt fallback (broken patch on setuptools)

Co-Authored-By: Claude <claude@anthropic.com>
Closes: python-wheel-build#895
Made-with: Cursor
@shifa-khan
Copy link
Copy Markdown
Contributor Author

@LalatenduMohanty Worked on fixing suggested changes. Mind taking a look again?
Thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add e2e test for bootstrapping failures

3 participants