GH-50046: [CI][C++] Improve caching with apache/infrastructure-actions/stash and more general cache keys #50047
rok wants to merge 6 commits into
Pull request overview
This PR refactors C++ CI cache handling to improve ccache reuse. Instead of using actions/cache with cache keys that include hashFiles('cpp/**'), it splits cache restore/save into separate steps, uses github.run_id as the unique save key, and only saves new caches when running on main. This avoids creating new cache entries per PR (which can cause eviction) while still letting PR builds benefit from the stable restore prefix.
Changes:
- Replace `actions/cache@v5` with paired `actions/cache/restore@v5` and `actions/cache/save@v5` steps across C++ workflows.
- Drop `hashFiles('cpp/**')` from cache keys; use `${{ github.run_id }}` as the save key and a stable prefix as the restore-key.
- Gate `Save` steps with `if: github.ref == 'refs/heads/main'` so only main pushes write new cache entries.
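The restore/save split summarized above can be sketched as a workflow fragment. This is a minimal illustration, not the PR's actual diff: the step names, the `ccache` path, and the `cpp-ccache-` key prefix are hypothetical.

```yaml
# Hypothetical job fragment; step names, path, and key prefix are illustrative.
steps:
  - name: Restore ccache
    uses: actions/cache/restore@v5
    with:
      path: ccache
      # The run-specific key does not match an existing entry on a first
      # attempt, so the restore falls back to the most recent entry whose
      # key starts with the stable prefix in restore-keys.
      key: cpp-ccache-${{ runner.os }}-${{ github.run_id }}
      restore-keys: |
        cpp-ccache-${{ runner.os }}-

  # ... build steps that populate ccache ...

  - name: Save ccache
    # Only pushes to main write a new cache entry; PR runs restore only.
    if: github.ref == 'refs/heads/main'
    uses: actions/cache/save@v5
    with:
      path: ccache
      key: cpp-ccache-${{ runner.os }}-${{ github.run_id }}
```

Because PR runs never save, they cannot push main's entries out of the repository's cache quota, yet they still pick up the latest main cache through the prefix match.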
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| .github/workflows/cpp.yml | Splits Docker volume and ccache caching into restore/save pairs for docker, macOS, and MinGW jobs; saves only from main. |
| .github/workflows/cpp_windows.yml | Splits Windows ccache caching into restore/save; saves only from main. |
| .github/workflows/cpp_extra.yml | Same restore/save split applied to docker-extra, JNI (linux + macOS), ODBC linux/macOS/MSVC jobs. |
https://github.com/apache/infrastructure-actions/tree/main/stash
@kou I think just tweaking ccache could have worked, but I updated the PR to use |
kou left a comment
Could you update the PR title and description?
Let's try this. If this doesn't work, we can fix this or revert this later.
If we switch to |
@pitrou any further comments or shall we merge and see what happens? |
Co-authored-by: Antoine Pitrou <pitrou@free.fr>
Co-authored-by: Rok Mihevc <rok@mihevc.org>
Rationale for this change
C++ CI jobs currently key ccache/Docker volume caches with `hashFiles('cpp/**')`. This creates new immutable GitHub Actions cache entries whenever any C++ file changes, even though `ccache` already handles per-file invalidation internally.
Recent CI logs show that when caches are restored, C++ jobs get high ccache hit rates, but GitHub cache restore misses are common and cause large CI runtime regressions.
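For illustration, the keying pattern described above looks roughly like the following. This is a hypothetical fragment rather than a copy of the existing workflows; the step name, path, and key prefix are assumptions.

```yaml
# Hypothetical fragment showing the old keying pattern.
steps:
  - name: Cache ccache
    uses: actions/cache@v5
    with:
      path: ccache
      # hashFiles('cpp/**') changes whenever any file under cpp/ changes,
      # so nearly every C++ commit produces a new, immutable cache entry.
      key: cpp-ccache-${{ runner.os }}-${{ hashFiles('cpp/**') }}
      restore-keys: |
        cpp-ccache-${{ runner.os }}-
```

Each new entry counts against the repository's cache storage, which is how frequent key changes can lead to eviction of entries that would otherwise be restored.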
What changes are included in this PR?
- Replace `actions/cache` with `apache/infrastructure-actions/stash/restore` and `apache/infrastructure-actions/stash/save`.
- Remove `hashFiles('cpp/**')` from cache keys.
Are these changes tested?
By CI?
Are there any user-facing changes?
CI users should see faster builds.
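As a rough sketch of the stash-based steps listed under "What changes are included in this PR?", a job might restore and save ccache as shown below. The `key` and `path` inputs follow the stash action's README as I understand it and should be checked against that README; the `@main` ref, step names, path, and key are assumptions for illustration only, and any conditions the PR places on the save step are left out.

```yaml
# Hypothetical fragment; verify the input names against the stash README.
steps:
  - name: Restore ccache
    uses: apache/infrastructure-actions/stash/restore@main
    with:
      # A general key with no hashFiles('cpp/**') component, so it stays
      # stable across C++ source changes.
      key: cpp-ccache-${{ runner.os }}
      path: ccache

  # ... build steps that populate ccache ...

  - name: Save ccache
    uses: apache/infrastructure-actions/stash/save@main
    with:
      key: cpp-ccache-${{ runner.os }}
      path: ccache
```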