ci: speed up Windows integration tests #595

Merged: lwshang merged 1 commit into main from lwshang/ci-windows-test-speedup on Jun 9, 2026

@lwshang lwshang commented Jun 9, 2026

Contributor

Problem

The Windows test jobs spend ~10 min compiling in the "Run tests" step while Ubuntu takes 1–2 min, and this happens regardless of what a PR changes. The Rust cache is working, but it only stores third-party dependency artifacts. By design, Swatinem/rust-cache strips workspace crates from target/ before saving (cache-workspace-crates: false), so the workspace crates, the per-file integration test binaries (one per tests/*.rs), and the link output are never cached. As a result, every test job recompiles the whole workspace and links its test binary from scratch. That residual work takes ~1–2 min on Ubuntu but ~10 min on Windows because of slow MSVC link.exe linking, full debuginfo, and Defender real-time file scanning.

Changes (quick, CI-scoped — no structural change)

All in .github/workflows/test.yml:

  • CARGO_PROFILE_DEV_DEBUG=line-tables-only — drop full debuginfo so object files are smaller and link faster; keeps file:line in backtraces. The test profile inherits this.
  • lld-link on windows-msvc — link with LLVM's lld-link (preinstalled at C:\Program Files\LLVM on windows-2025, no install step) instead of the slow link.exe. Only consumed when building the windows-msvc target → no-op on Linux/macOS. Scoped to this workflow, so the release/dist thin-LTO build keeps the default linker (lld + LTO has known segfault caveats).
  • Windows Defender exclusion for the workspace + .cargo/.rustup, in both the unit-tests and test jobs. The step is non-fatal, so a failure to add the exclusion can't break CI.
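
For readers who want to see the shape of these three changes, here is a minimal sketch of what the workflow additions could look like. It is not copied from the actual test.yml. The target triple (x86_64-pc-windows-msvc), the bin\lld-link.exe location under the LLVM install directory, the matrix, the step names, and the excluded paths are all assumptions made for illustration.

```yaml
# Illustrative sketch only; not the actual .github/workflows/test.yml.
env:
  # Keep file:line in backtraces but drop full debuginfo.
  # The test profile inherits from dev, so this also applies to `cargo test`.
  CARGO_PROFILE_DEV_DEBUG: line-tables-only
  # Cargo reads this only when building for this target triple,
  # so it has no effect on Linux/macOS runners.
  # Assumed: the triple and the exact path of lld-link.exe under the LLVM install dir.
  CARGO_TARGET_X86_64_PC_WINDOWS_MSVC_LINKER: 'C:\Program Files\LLVM\bin\lld-link.exe'

jobs:
  test:
    strategy:
      matrix:
        os: [ubuntu-latest, windows-2025]   # assumed matrix
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v4

      - name: Exclude build directories from Windows Defender
        if: runner.os == 'Windows'
        # Non-fatal: if the exclusion cannot be added, the job continues.
        continue-on-error: true
        shell: pwsh
        run: |
          # Assumed paths: the checkout plus the default Cargo and rustup homes.
          Add-MpPreference -ExclusionPath "$env:GITHUB_WORKSPACE"
          Add-MpPreference -ExclusionPath "$env:USERPROFILE\.cargo"
          Add-MpPreference -ExclusionPath "$env:USERPROFILE\.rustup"

      - uses: Swatinem/rust-cache@v2

      - name: Run tests
        run: cargo test
```

Because the linker and debug settings are set as workflow-level environment variables rather than in a checked-in Cargo config, they apply only to jobs in this workflow, which matches the stated intent of leaving the release/dist build on the default linker.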

Notes for reviewers

  • First run recompiles fully on every OS — changing the linker and debug level invalidates the cache fingerprint once. Steady-state is what improves.
  • The lld-link swap is validated by this PR's Windows CI run (can't link MSVC locally from macOS). If lld-link.exe isn't found at that path on some image variant, those jobs fail at link with an obvious "linker not found" error; the fix/revert is a one-line env change.
  • Deferred follow-up: the larger structural win is to prime all integration test binaries once in the unit-tests job and set cache-workspace-crates: true, so the parallel test jobs reuse pre-built binaries instead of each recompiling the workspace. This has a larger payoff but grows the caches and needs care around the cache eviction limit.
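
The deferred follow-up is not part of this PR, but a rough sketch may help in judging it. The job layout, the shared-key value, and the choice to make test jobs restore-only are hypothetical. Only cache-workspace-crates, the unit-tests and test job names, and the sync_tests test come from the description above.

```yaml
# Hypothetical sketch of the deferred follow-up; not implemented in this PR.
jobs:
  unit-tests:
    runs-on: windows-2025
    steps:
      - uses: actions/checkout@v4
      - uses: Swatinem/rust-cache@v2
        with:
          # Keep workspace crate artifacts in target/ when the cache is saved.
          cache-workspace-crates: true
          # Assumed value: a stable key so other jobs can restore this cache.
          shared-key: windows-tests
      # Build every test binary once, without running any of them.
      - run: cargo test --workspace --no-run
      # Then run only the unit tests in this job.
      - run: cargo test --workspace --lib

  test:
    needs: unit-tests
    runs-on: windows-2025
    steps:
      - uses: actions/checkout@v4
      - uses: Swatinem/rust-cache@v2
        with:
          cache-workspace-crates: true
          shared-key: windows-tests
          # Restore only; the unit-tests job owns writing the cache.
          save-if: false
      # With a warm cache, Cargo should find the binary already built.
      - run: cargo test --test sync_tests
```

The trade-off noted above still applies: caching workspace artifacts makes each cache entry larger, which matters when the repository is already near GitHub's cache size limit.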

Verification

Confirmed against this PR's Test run on Windows:

  • Warm-cache compile+link dropped from ~10 min to ~3.2 min. Example: sync_tests spent 194s compiling/linking + 182s actually running tests (was ~10 min compile before). Jobs that hit the deps cache finished in ~380s end-to-end.
  • lld-link works — all Windows jobs linked and passed; no "linker not found" errors, so LLVM's lld-link is present and used on windows-2025.
  • The few slow jobs were cache misses, not a regression. They cold-rebuilt dependencies (~7 min extra) because the repo is currently over GitHub's 10 GB cache cap and 30 jobs contend for the single cache that the unit-tests job had just created. This is a transient warm-up effect, unrelated to the linker/debuginfo change. It clears once the new cache key is the norm across main and open PRs, which is why the rollout plan is: merge this, then rebase/merge it into existing PRs so all future CI populates and reuses the new caches.

🤖 Generated with Claude Code

The Windows test jobs spend ~10 min compiling in the "Run tests" step
while Ubuntu takes 1-2 min. The cache only stores third-party dependency
artifacts; workspace crates, the per-file integration test binaries, and
linking are recompiled every run, and that residual work is dominated by
slow MSVC linking + Defender file scanning on Windows.

Quick, CI-scoped wins (no structural change):

- CARGO_PROFILE_DEV_DEBUG=line-tables-only: drop full debuginfo so object
  files are smaller and link faster; keeps file:line in backtraces.
- Link windows-msvc with LLVM's lld-link (preinstalled on windows-2025)
  instead of link.exe. Scoped to this workflow, so the release/dist
  thin-LTO build keeps the default linker.
- Exclude the workspace and .cargo/.rustup from Windows Defender real-time
  scanning in both the unit-tests and test jobs.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@lwshang lwshang marked this pull request as ready for review June 9, 2026 19:56
@lwshang lwshang requested a review from a team as a code owner June 9, 2026 19:56
@lwshang lwshang merged commit 3c8f933 into main Jun 9, 2026
91 checks passed
@lwshang lwshang deleted the lwshang/ci-windows-test-speedup branch June 9, 2026 19:59

2 participants