
[doc-only] Fix backport release checks and add release dry-run#2161

Merged: rwgk merged 9 commits into main from check-release-notes_with_backport-git-tag on Jun 13, 2026
rwgk commented Jun 2, 2026

Description

Closes #2141.

Supersedes #2143. PR #2143 was useful for unblocking the 12.9.7 release process quickly, but this PR replaces it with a narrower and more maintainable fix for the release workflow.

This PR fixes the issue that blocked the 12.9.7 release process and, just as importantly, makes the release workflow easy to test before an actual release. The new dry-run mode validates the release path without publishing to GitHub Releases, TestPyPI, or PyPI, and it can optionally deploy generated docs to a non-production gh-pages-* branch so we can inspect the docs exactly as they would be published before running the real release.

Fix

The release-note check now runs from the workflow branch, not from the release tag checkout. This keeps the checker on current CI tooling while still using the requested tag as the release identity/source for the release build.

For cuda-bindings and cuda-python mainline releases, the workflow now also requires an explicit backport decision:

  • provide the planned backport tag, such as v12.9.7; or
  • set backport-git-tag to not planned.

For cuda-bindings and cuda-python, the tag inputs are interpreted differently for mainline and backport releases.

For a mainline release, the mainline release notes are required, and backport-git-tag must either name the planned backport tag or be set to not planned. If a planned backport tag is provided, the corresponding backport release notes are required too, and missing or empty notes fail the mainline release.

For an actual backport release, backport-git-tag is left blank. The checker looks for release notes for the backport tag itself, but missing or empty notes only emit a GitHub Actions warning, so they do not block an otherwise valid package release late in the process. With the mainline-time backport-git-tag check in place, that warning case is not expected to occur in the normal release flow.
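The mainline and backport rules described above can be sketched as a small decision function. This is an illustration only, not the checker shipped in this PR: `check_release_notes`, `Outcome`, the `has_notes` callback, the `is_backport_release` flag, and the `vX.Y.Z` tag pattern are all hypothetical names and assumptions introduced for the example.

```python
import re
from dataclasses import dataclass, field

NOT_PLANNED = "not planned"
TAG_RE = re.compile(r"^v\d+\.\d+\.\d+$")  # assumed tag shape, e.g. v12.9.7


@dataclass
class Outcome:
    errors: list = field(default_factory=list)    # these block the release
    warnings: list = field(default_factory=list)  # surfaced but non-blocking


def check_release_notes(git_tag, backport_git_tag, is_backport_release, has_notes):
    """Apply the mainline/backport rules.

    has_notes(tag) -> bool is a hypothetical lookup that reports whether
    non-empty release notes exist for the given tag.
    """
    out = Outcome()
    if is_backport_release:
        # Actual backport release: backport-git-tag stays blank and
        # missing notes only produce a warning.
        if not has_notes(git_tag):
            out.warnings.append(f"no release notes for backport tag {git_tag}")
        return out

    # Mainline release: its own notes are always required.
    if not has_notes(git_tag):
        out.errors.append(f"no release notes for {git_tag}")

    decision = backport_git_tag.strip()
    if not decision:
        out.errors.append("backport-git-tag must name a tag or be 'not planned'")
    elif decision == NOT_PLANNED:
        pass  # explicit decision that no backport is planned
    elif not TAG_RE.match(decision):
        out.errors.append(f"malformed backport tag: {decision!r}")
    elif not has_notes(decision):
        out.errors.append(f"no release notes for planned backport {decision}")
    return out
```

In this sketch, a mainline run with a blank `backport_git_tag` fails, while an actual backport run with missing notes returns an empty `errors` list and one warning.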

The docs workflow also accepts older release-tag source trees that still have ci/versions.json instead of ci/versions.yml.
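One way to handle both layouts is to probe for the newer file first and fall back to the older one. The sketch below is an assumption about how such a lookup could be written; `find_versions_file` and `CANDIDATES` are hypothetical names, and the actual workflow may implement the fallback differently (for example, directly in a shell step).

```python
from pathlib import Path

# Newer trees first, then the older layout still present in some release tags.
CANDIDATES = ("ci/versions.yml", "ci/versions.json")


def find_versions_file(source_root):
    """Return the first versions file that exists under source_root."""
    root = Path(source_root)
    for rel in CANDIDATES:
        path = root / rel
        if path.is_file():
            return path
    raise FileNotFoundError(f"none of {CANDIDATES} found under {root}")
```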

Dry-Run Mode

The manual release workflow now has a release-action input:

  • dry-run is the default and safe choice.
  • full-release must be selected deliberately for the real release.

In dry-run mode:

  • the release tag is still validated;
  • release notes are checked;
  • the artifact run ID is resolved or accepted from the user;
  • source archives are created and checksummed;
  • wheels are downloaded and validated against the release tag;
  • docs are generated from the release tag;
  • GitHub release draft checks/creation are skipped;
  • GitHub release asset uploads are skipped;
  • TestPyPI and PyPI publishing are skipped;
  • dry-run release artifacts are uploaded as workflow artifacts.

If dry-run-docs-branch is provided, the workflow deploys generated docs to that branch. The branch must match gh-pages-*, for example gh-pages-dry-run, and this input is rejected for full-release.
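The branch rule can be illustrated with a short validation helper. This is a minimal sketch under stated assumptions: `validate_docs_branch` is a hypothetical name, and glob-style matching via `fnmatch` is assumed to be equivalent to the workflow's own `gh-pages-*` check.

```python
from fnmatch import fnmatchcase

DRY_RUN_BRANCH_PATTERN = "gh-pages-*"


def validate_docs_branch(release_action, dry_run_docs_branch):
    """Return the dry-run docs branch, or None if no dry-run deployment was requested."""
    branch = dry_run_docs_branch.strip()
    if not branch:
        return None  # artifact-only dry-run, or a full release
    if release_action != "dry-run":
        raise ValueError("dry-run-docs-branch is only valid with release-action: dry-run")
    # "gh-pages" itself does not match, so production docs stay out of reach.
    if not fnmatchcase(branch, DRY_RUN_BRANCH_PATTERN):
        raise ValueError(f"{branch!r} does not match {DRY_RUN_BRANCH_PATTERN}")
    return branch
```

With this helper, `gh-pages-dry-run` is accepted for a dry-run, while `gh-pages`, `main`, or any branch combined with `full-release` raises `ValueError`.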

Dry-Run Docs Branch Reset

Before a release dry-run batch, reset the dry-run docs branch to the current production docs branch:

git fetch upstream
git push --force-with-lease=refs/heads/gh-pages-dry-run upstream \
  refs/remotes/upstream/gh-pages:refs/heads/gh-pages-dry-run

Then run the dry-run workflows sequentially, using dry-run-docs-branch: gh-pages-dry-run. After the batch finishes, fetch upstream/gh-pages-dry-run and compare it with upstream/gh-pages to inspect the exact generated docs update before running full-release.
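For the comparison step, a small script can summarize what the dry-run batch changed. The sketch below assumes the remote is named `upstream` as in the commands above and that both branches have already been fetched; `summarize_name_status` and `docs_branch_delta` are hypothetical helper names, not part of the repository.

```python
import subprocess
from collections import Counter


def summarize_name_status(text):
    """Count paths per status letter in `git diff --name-status` output."""
    counts = Counter()
    for line in text.splitlines():
        if line.strip():
            # The first character is the status letter (A, M, D, R..., C...).
            counts[line[0]] += 1
    return counts


def docs_branch_delta(remote="upstream", base="gh-pages", head="gh-pages-dry-run"):
    """Diff the production docs branch against the dry-run branch in the current repo."""
    result = subprocess.run(
        ["git", "diff", "--name-status", f"{remote}/{base}", f"{remote}/{head}"],
        check=True, capture_output=True, text=True,
    )
    return summarize_name_status(result.stdout)
```

Calling `docs_branch_delta()` from inside a clone returns a `Counter` keyed by status letter, which makes unexpected deletions (`D`) easy to spot before running full-release.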

Validation

Focused unit coverage was added for the release-note checker, including:

  • loading the configured backport branch;
  • requiring a backport decision for mainline cuda-bindings and cuda-python
    releases;
  • accepting not planned;
  • validating planned backport release notes;
  • warning rather than failing for missing notes on actual backport releases;
  • rejecting malformed or non-backport planned tags.

I also tested the PR exhaustively with a four-run dry-run batch:

  1. cuda-bindings, v13.3.1
  2. cuda-python, v13.3.1
  3. cuda-bindings, v12.9.7
  4. cuda-python, v12.9.7

Full test report:

#2161 (comment)

The batch verified both the mainline and backport release paths for both components, including release-note checks, artifact validation, skipped external publishing, and docs deployment to gh-pages-dry-run.

rwgk added 2 commits June 1, 2026 21:23
Require mainline cuda-bindings and cuda-python releases to explicitly declare a planned backport tag or mark it not planned. Keep actual backport releases unblocked while surfacing missing notes as warnings, and preserve docs builds for older tags that still use ci/versions.json.
Validate release docs, archives, and wheels without publishing to GitHub Releases, GitHub Pages, TestPyPI, or PyPI.
rwgk added this to the cuda.bindings next milestone Jun 2, 2026
rwgk self-assigned this Jun 2, 2026
rwgk added the P0 and CI/CD labels Jun 2, 2026
copy-pr-bot (bot) commented Jun 2, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

rwgk changed the title from "Fix backport release checks and add release dry-run" to "[no-ci] Fix backport release checks and add release dry-run" Jun 2, 2026
rwgk added 4 commits June 2, 2026 15:18
Add an explicit dry-run docs branch input so release dry-runs can optionally write generated docs to a seeded non-production branch while keeping artifact-only dry-runs as the default.
Make dry-run the first and default release action so production publishing must be deliberately selected for manual release workflow runs.
Require optional dry-run docs deployments to target a non-production gh-pages-* branch so manual release dry-runs cannot accidentally publish docs to production or source branches.
rwgk commented Jun 4, 2026

Comprehensive dry-run testing report

I tested this PR with a four-run dry-run batch designed to approximate the
release sequence we would use for both mainline and backport releases:

  1. cuda-bindings, v13.3.1
  2. cuda-python, v13.3.1
  3. cuda-bindings, v12.9.7
  4. cuda-python, v12.9.7

All four workflow runs completed successfully. The batch exercised:

  • mainline release-note handling with backport-git-tag: not planned;
  • backport release-note handling with blank backport-git-tag;
  • dry-run release artifact validation without publishing to GitHub Releases,
    TestPyPI, or PyPI;
  • docs generation from the release tag;
  • dry-run docs deployment to gh-pages-dry-run;
  • sequential docs updates where the second backport component run preserved the
    docs written by the first backport component run.

Workflow runs

| # | Component | Git tag | Workflow run | Result |
|---|-----------|---------|--------------|--------|
| 1 | cuda-bindings | v13.3.1 | https://github.com/NVIDIA/cuda-python/actions/runs/26918388859 | success |
| 2 | cuda-python | v13.3.1 | https://github.com/NVIDIA/cuda-python/actions/runs/26919456751 | success |
| 3 | cuda-bindings | v12.9.7 | https://github.com/NVIDIA/cuda-python/actions/runs/26919970969 | success after rerun |
| 4 | cuda-python | v12.9.7 | https://github.com/NVIDIA/cuda-python/actions/runs/26921351868 | success |

Run 3 needed a rerun because of an infrastructure failure. The rerun succeeded,
and GitHub reports the latest attempt as successful.

Common checks verified

For all four runs:

  • The workflow ran from check-release-notes_with_backport-git-tag at
    86cc688be2228ca35d2a52cec7df10f8665827c7.
  • release-action was dry-run.
  • dry-run-docs-branch was gh-pages-dry-run.
  • Release notes validation passed.
  • Docs generation succeeded.
  • Release archive creation and wheel validation ran.
  • GitHub release asset upload was skipped.
  • TestPyPI publishing was skipped.
  • PyPI publishing was skipped.
  • Dry-run release artifacts were uploaded as workflow artifacts.

Per-run observations

1. cuda-bindings v13.3.1

Workflow: https://github.com/NVIDIA/cuda-python/actions/runs/26918388859

Observed inputs:

  • component: cuda-bindings
  • git-tag: v13.3.1
  • release-action: dry-run
  • backport-git-tag: not planned
  • dry-run-docs-branch: gh-pages-dry-run
  • run-id: 26663354706

Result:

  • Release notes validation passed.
  • Docs built from v13.3.1.
  • No files changed on gh-pages-dry-run.
  • upstream/gh-pages and upstream/gh-pages-dry-run both remained at
    0824a79330b9d45fb775bb4b0372dbc54388a61c.

2. cuda-python v13.3.1

Workflow: https://github.com/NVIDIA/cuda-python/actions/runs/26919456751

Observed inputs:

  • component: cuda-python
  • git-tag: v13.3.1
  • release-action: dry-run
  • backport-git-tag: not planned
  • dry-run-docs-branch: gh-pages-dry-run
  • run-id: 26663354706

Result:

  • Release notes validation passed.
  • Docs built from v13.3.1.
  • The deploy action targeted gh-pages-dry-run and reported:
    There is nothing to commit. Exiting early.
  • No files changed on gh-pages-dry-run.
  • upstream/gh-pages and upstream/gh-pages-dry-run still matched at
    0824a79330b9d45fb775bb4b0372dbc54388a61c.

3. cuda-bindings v12.9.7

Workflow: https://github.com/NVIDIA/cuda-python/actions/runs/26919970969

Observed inputs from the successful rerun:

  • component: cuda-bindings
  • git-tag: v12.9.7
  • release-action: dry-run
  • backport-git-tag: blank
  • dry-run-docs-branch: gh-pages-dry-run
  • run-id: 26491171597

Result:

  • Release notes validation passed as a backport release.
  • Docs built from v12.9.7.
  • gh-pages-dry-run advanced from
    0824a79330b9d45fb775bb4b0372dbc54388a61c to
    ad1ebe50a57dad930e5afd7fb2f0b932f9427ce4.
  • The docs deploy commit was Deploy release docs: v12.9.7.
  • 104 docs files changed, all under docs/cuda-bindings/12.9.7/, plus
    docs/cuda-bindings/objects.inv and docs/cuda-bindings/versions.json.

4. cuda-python v12.9.7

Workflow: https://github.com/NVIDIA/cuda-python/actions/runs/26921351868

Observed inputs:

  • component: cuda-python
  • git-tag: v12.9.7
  • release-action: dry-run
  • backport-git-tag: blank
  • dry-run-docs-branch: gh-pages-dry-run
  • run-id: 26491171597

Result:

  • check-tag verified v12.9.7 and skipped GitHub release draft work because
    this was a dry run.
  • Release notes validation passed as a backport release.
  • Docs built from v12.9.7.
  • gh-pages-dry-run advanced from
    ad1ebe50a57dad930e5afd7fb2f0b932f9427ce4 to
    f4581dbc59645e843cf18ff6dbb1634839fcd3ae.
  • The docs deploy commit was Deploy release docs: v12.9.7.
  • 15 docs files changed: top-level docs/12.9.7/..., plus
    docs/objects.inv and docs/versions.json.
  • This run did not overwrite the docs/cuda-bindings/12.9.7/... files written
    by run 3.

Final docs branch state

After the four-run batch:

  • gh-pages-dry-run ended at
    f4581dbc59645e843cf18ff6dbb1634839fcd3ae.
  • The two v13.3.1 dry-runs produced no docs branch changes because the
    generated output matched the seeded branch.
  • The two v12.9.7 dry-runs produced the expected backport docs updates.
  • The final diff from upstream/gh-pages to upstream/gh-pages-dry-run
    contained 119 docs files total:
    • 104 from cuda-bindings v12.9.7;
    • 15 from cuda-python v12.9.7.
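The per-component split of the diff can be reproduced by grouping changed paths by their prefix. This sketch assumes the layout observed in the runs above, where cuda-bindings docs live under `docs/cuda-bindings/` and cuda-python docs sit at the top level of `docs/`; `component_of` and `count_by_component` are hypothetical names, and other components that may exist on the docs branch are not modeled.

```python
from collections import Counter


def component_of(path):
    """Map a docs path to the component that owns it (assumed layout)."""
    # docs/cuda-bindings/... belongs to cuda-bindings; every other path is
    # treated here as top-level cuda-python docs.
    parts = path.split("/")
    if len(parts) >= 2 and parts[0] == "docs" and parts[1] == "cuda-bindings":
        return "cuda-bindings"
    return "cuda-python"


def count_by_component(paths):
    """Count changed files per component."""
    return Counter(component_of(p) for p in paths)
```

Feeding it the file list from a diff of upstream/gh-pages against upstream/gh-pages-dry-run gives a per-component count that can be checked against the totals reported for each run.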

This confirms that the dry-run mode exercised the release validation path and
docs deployment path with high fidelity, while avoiding external publishing
side effects.

Conclusion

The four-run batch passed and covered the intended mainline/backport and
component matrix:

  • cuda-bindings mainline dry-run
  • cuda-python mainline dry-run
  • cuda-bindings backport dry-run
  • cuda-python backport dry-run

The checks validate the PR's two intended behaviors:

  • the release-note checker handles mainline and backport releases correctly;
  • the dry-run release workflow validates artifacts and docs deployment without
    publishing to GitHub Releases, TestPyPI, or PyPI.

Screenshots from triggering the workflow runs

[Four screenshots, captured 2026-06-03 at 15:59:20, 16:25:30, 16:38:03, and 17:11:52]

rwgk marked this pull request as ready for review June 4, 2026 04:18
rwgk added the bug and PR review get-together labels Jun 4, 2026
mdboom mentioned this pull request Jun 10, 2026
rparolin requested review from mdboom and rparolin June 12, 2026 18:22
rparolin commented

Do the unit tests run on the CI? If not, we should resolve that.
Other than that, lgtm.

rwgk added 2 commits June 12, 2026 12:58
# Conflicts:
#	.github/workflows/build-docs.yml
#	.github/workflows/release.yml
PR #2177 made the release workflow checkout shallow, so the dry-run tag validation now needs tags fetched explicitly while keeping history shallow.
rwgk marked this pull request as draft June 12, 2026 20:43
rwgk changed the title from "[no-ci] Fix backport release checks and add release dry-run" to "[doc-only] Fix backport release checks and add release dry-run" Jun 12, 2026
rwgk commented Jun 12, 2026

Follow-up on the release-note checker tests, based on offline conversation: Rob correctly noticed that the existing ci/tools/tests/test_check_release_notes.py tests are not currently wired into regular CI. Since that test file already existed before this PR, I’m leaving the added coverage in place here and will address the broader CI wiring as follow-on work so this release workflow fix can land without expanding scope.

rwgk commented Jun 12, 2026

GPT-5.5


Recommended retest set

To get focused coverage of the PR #2177-related merge conflict resolution, especially the shallow-checkout/tag-fetch interaction and the docs deployment workflow, the recommended retest set is:

  1. Repeat one mainline dry-run:

    • component: cuda-bindings
    • git-tag: v13.3.1
    • release-action: dry-run
    • backport-git-tag: not planned
    • dry-run-docs-branch: gh-pages-dry-run

    This verifies that check-tag still resolves the requested tag with fetch-depth: 1 plus fetch-tags: true, and that dry-run mode still skips GitHub release draft creation.

  2. Repeat the two backport docs dry-runs in sequence:

    • component: cuda-bindings
    • git-tag: v12.9.7
    • release-action: dry-run
    • backport-git-tag: blank
    • dry-run-docs-branch: gh-pages-dry-run

    then:

    • component: cuda-python
    • git-tag: v12.9.7
    • release-action: dry-run
    • backport-git-tag: blank
    • dry-run-docs-branch: gh-pages-dry-run

    This revalidates the most merge-sensitive docs behavior: deployment to the non-production docs branch, and the second component run preserving docs written by the first component run.

I would not repeat the full previous four-run matrix unless we want maximum confidence. The omitted cuda-python / v13.3.1 mainline dry-run is useful parity coverage, but the merge/fix did not create a specific new reason to suspect that path more than the others.

The [doc-only] PR CI docs build is still useful, but it does not replace these workflow-dispatch dry-run tests because it does not exercise release.yml's check-tag, dry-run-docs-branch, or release artifact validation path.

rwgk commented Jun 12, 2026

/ok to test


Add workflow-local guidance for validating non-trivial release workflow changes with focused dry-run coverage.
rwgk commented Jun 13, 2026

I added a comment-only follow-up commit (d6d554f) documenting the focused dry-run retest matrix for future non-trivial changes to release.yml. This captures the coverage we found useful here: tag resolution with shallow checkouts, release-note validation, dry-run artifact validation, docs generation, and docs-branch preservation. The comment is targeted at both humans and agents.

rwgk commented Jun 13, 2026

/ok to test

rwgk commented Jun 13, 2026

Pre-merge retest set

  1. Mainline dry-run, cuda-bindings, v13.3.1

  2. Backport dry-run, cuda-bindings, v12.9.7

    • URL: https://github.com/NVIDIA/cuda-python/actions/runs/27474371299
    • Result: success
    • Notes: release notes validation passed for blank backport-git-tag; check-tag resolved v12.9.7 with fetch-depth: 1 plus fetch-tags: true; docs deploy committed Deploy release docs: v12.9.7 to gh-pages-dry-run on top of the first dry-run commit.
  3. Backport dry-run, cuda-python, v12.9.7

    • URL: https://github.com/NVIDIA/cuda-python/actions/runs/27475130878
    • Result: success
    • Notes: blank run-id exercised determine-run-id, which auto-detected run 26491171597; release notes validation passed for blank backport-git-tag; artifact validation passed for cuda-python version 12.9.7; docs deploy added top-level docs/12.9.7 content on top of the existing dry-run branch without deleting prior cuda-bindings docs.

[Three screenshots: 01_release_yml_cuda-bindings_13 3 1, 02_release_yml_cuda-bindings_12 9 7, 03_release_yml_cuda-python_12 9 7]

rwgk marked this pull request as ready for review June 13, 2026 18:32
rwgk enabled auto-merge (squash) June 13, 2026 18:33
rwgk merged commit 8ba39e8 into main Jun 13, 2026 (35 checks passed)
rwgk deleted the check-release-notes_with_backport-git-tag branch June 13, 2026 18:38

rwgk commented Jun 13, 2026

Post-merge dry-run test report

Post-merge dry-run testing was run after PR #2161 was merged into main.

Workflow run:

Workflow metadata

  • Workflow: CI: Release
  • Event: workflow_dispatch
  • Branch: main
  • Commit: 8ba39e858718fe67f0e80fab9ac4267640944fd0
  • Result: success
  • Started: 2026-06-13T19:20:04Z
  • Completed: 2026-06-13T19:21:50Z
  • Duration: about 1 minute 46 seconds

Inputs tested

  • component: cuda-python
  • release-action: dry-run
  • git-tag: v13.3.1
  • backport-git-tag: not planned
  • run-id: blank
  • dry-run-docs-branch: gh-pages-dry-run

This was the omitted mainline cuda-python dry-run from the focused retest plan,
run after merge from the main branch.

Checks observed

  • check-release-notes passed for cuda-python v13.3.1.
  • The checker reported: Backport release not planned for v13.3.1, skipping backport release-notes check.
  • The checker also reported: Release notes present for tag v13.3.1, component cuda-python.
  • determine-run-id used fetch-depth: 1 plus fetch-tags: true.
  • With run-id left blank, determine-run-id resolved tag v13.3.1 to commit 96c7f518d73e977ca9c9fbfd20f0c1293a479d56 and auto-detected workflow run 26663354706.
  • check-tag used fetch-depth: 1 plus fetch-tags: true.
  • check-tag successfully ran git rev-parse --verify 'v13.3.1^{commit}'.
  • Dry-run mode skipped GitHub release draft creation with: Dry-run selected; not checking or creating a GitHub release draft.
  • Docs built from tag v13.3.1, with CUDA_PYTHON_DOCS_GITHUB_REF: v13.3.1.
  • Release artifact validation downloaded cuda-python-wheel from run 26663354706.
  • Release artifact validation passed for cuda-python version 13.3.1 from tag v13.3.1.
  • TestPyPI and PyPI publishing jobs were skipped.
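The tag check observed above can be reproduced locally with a thin wrapper around `git rev-parse`. This is a sketch, not the workflow's own step: `rev_parse_command` and `resolve_tag` are hypothetical helper names. In a shallow clone where tags were not fetched, the command exits non-zero, which is the failure mode that fetching tags explicitly avoids.

```python
import subprocess


def rev_parse_command(tag):
    """Build the command that check-tag was observed to run."""
    # The ^{commit} suffix peels an annotated tag to the commit it points at.
    return ["git", "rev-parse", "--verify", f"{tag}^{{commit}}"]


def resolve_tag(tag, repo_dir="."):
    """Return the commit SHA for tag; raises CalledProcessError if it cannot be resolved."""
    result = subprocess.run(
        rev_parse_command(tag), cwd=repo_dir,
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()
```

For example, `resolve_tag("v13.3.1")` inside a clone that has the tag returns the full commit SHA, which can then be compared with the commit that the artifact run was built from.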

Docs deployment

The docs deployment step targeted gh-pages-dry-run and used commit message
Deploy release docs: v13.3.1.

Before deployment, gh-pages-dry-run pointed at the pre-merge third dry-run commit:

  • 47057670cbcb5b785ab409cfeca78fe6a39a775c

After deployment, gh-pages-dry-run pointed at:

  • 374fe65e5ff4d650187707cb76f039b24942bf02

The deploy action found the existing branch, fetched it, checked it out, rsynced
the generated docs, committed, and force-pushed:

  • 4705767..374fe65 github-pages-deploy-action/9pyg890f9 -> gh-pages-dry-run

Docs branch delta

Comparing the previous dry-run branch head
47057670cbcb5b785ab409cfeca78fe6a39a775c to the post-merge head
374fe65e5ff4d650187707cb76f039b24942bf02:

  • 1 new commit
  • 90 files changed
  • 88 files added
  • 2 files modified
  • 0 files deleted

The added files were under:

  • docs/13.3.1/...

The modified files were:

  • docs/objects.inv
  • docs/versions.json

No files under the previously deployed backport docs paths were deleted. This
confirms that the post-merge mainline cuda-python dry-run preserved the
pre-merge backport dry-run docs while adding the v13.3.1 top-level docs.

Conclusion

The post-merge dry-run succeeded from main and covered the merged release
workflow path for mainline cuda-python: shallow checkout plus tag fetching,
automatic run ID detection, release-note validation with
backport-git-tag: not planned, dry-run artifact validation, docs generation,
docs deployment to gh-pages-dry-run, and skipped external publishing.


[Screenshot: 04_release_yml_cuda-python_13 3 1]


Linked issue: Release workflow fails for backport tags because check-release-notes runs from the tag checkout