feat: Optimize Docker caching and harden CI cache behavior #295
Merged
BenjaminMichaelis merged 10 commits on May 19, 2026
- Reorder build stage: install OS tools first, then copy only manifest files (*.csproj, package.json/lock, NuGet.config, Directory.*.props, eng/), run dotnet restore, then COPY all source. Routine code edits no longer bust the tool-install and restore layers.
- Add --mount=type=cache for /root/.nuget/packages (both restore and publish) and /root/.npm (build-js.sh) to speed up local builds.
- Combine separate tdnf install commands into one RUN in both stages, avoiding unnecessary layer overhead.
- Add .dockerignore to exclude .git, **/obj, **/bin, **/dist, **/node_modules, TestResults, artifacts, and *.tar. The **/obj exclusion is important: local dotnet restore writes project.assets.json there; without ignoring it those files could overwrite the container's restore output and cause silent mismatches.
- Add cache-from: type=gha to the PR Docker build step so pull requests can read from the cache written by main-branch builds, avoiding full rebuilds on every PR. PRs do not write cache to avoid consuming the 10 GB quota with per-branch entries.
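The read-only PR cache behavior in the last bullet might look roughly like the step below. This is a minimal sketch, not the workflow from this PR: the step name is borrowed from the review excerpt, and `context: .` and `push: false` are assumptions.

```yaml
# Sketch of a PR-only build step (context and push values are assumed).
- name: Docker build (no push)
  if: github.event_name == 'pull_request'
  uses: docker/build-push-action@v7
  with:
    context: .
    push: false
    # Read layers that main-branch builds wrote to the GHA cache.
    cache-from: type=gha
    # No cache-to: PR runs read the cache but never write it.
```

Leaving out `cache-to` is what keeps PR runs from adding per-branch entries to the cache quota.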
Use reproducible-containers/buildkit-cache-dance@v3 to serialize the BuildKit cache mounts (/root/.nuget/packages and /root/.npm) into actions/cache between runs. On ephemeral GHA runners these mounts were previously empty at the start of every build; now they are warm on every run where dependencies haven't changed.

Changes:
- Add id: setup-buildx to the setup-buildx action step (required for the builder name output used by cache-dance).
- Add actions/cache@v4 step with a key that includes hashes of Dockerfile, Directory.Packages.props, NuGet.config, and all package-lock.json files. restore-keys provides a warm fallback when only a subset of those files changes.
- Add reproducible-containers/buildkit-cache-dance@v3 step before the Docker builds. skip-extraction is set from the cache-hit output so extraction is skipped when the key is an exact hit (actions/cache will not overwrite an existing entry with the same key anyway).
- Add .buildkit-cache and scratch to .dockerignore. The cache-dance action populates .buildkit-cache with serialized NuGet and npm package data (can exceed 1 GB) and uses scratch as a temp dir; both must be excluded from the Docker build context.
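The three steps described above could be wired together along these lines. This is a hedged sketch: the step names, the `buildkit-mounts-` key prefix, the `buildkit-cache` step id, and the `docker/setup-buildx-action@v3` version are assumptions, and the cache-dance input names (`builder`, `cache-dir`, `dockerfile`, `skip-extraction`) follow my reading of that action's README and should be checked against the pinned version.

```yaml
- name: Set up Docker Buildx
  id: setup-buildx                       # id is needed for the builder name output
  uses: docker/setup-buildx-action@v3    # version is an assumption

- name: Cache BuildKit mounts
  id: buildkit-cache                     # hypothetical step id
  uses: actions/cache@v4
  with:
    path: .buildkit-cache
    # Key prefix is hypothetical; the hashed files are the ones listed in the commit message.
    key: buildkit-mounts-${{ hashFiles('Dockerfile', 'Directory.Packages.props', 'NuGet.config', '**/package-lock.json') }}
    restore-keys: |
      buildkit-mounts-

- name: Inject cache mounts into BuildKit
  uses: reproducible-containers/buildkit-cache-dance@v3
  with:
    builder: ${{ steps.setup-buildx.outputs.name }}
    cache-dir: .buildkit-cache
    dockerfile: Dockerfile
    # Skip the post-build extraction when the key was an exact hit.
    skip-extraction: ${{ steps.buildkit-cache.outputs.cache-hit }}
```

The `restore-keys` prefix lets a run start from the most recent older cache when any hashed file changes, rather than from empty mounts.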
- Add '# syntax=docker/dockerfile:1' header to Dockerfile to explicitly pin the BuildKit frontend (recommended when using --mount=type=cache).
- Add --mount=type=cache,target=/var/cache/tdnf,sharing=locked to both tdnf install RUN instructions. The cache mount keeps downloaded RPMs out of the image layer without needing tdnf clean all. Remove those clean calls so the persistent cache is not wiped after each install.
- Add permissions: actions: write to the build-and-test job. docker/build-push-action requires this permission to write to the GitHub Actions cache (type=gha). Without it cache writes fail silently.
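For the permissions bullet, a job-level block might look like the sketch below. The runner label, the `contents: read` entry, and the checkout step are assumptions added so the fragment is complete. Once a `permissions` block is present, unlisted scopes default to none, which is why `contents: read` is shown alongside `actions: write`.

```yaml
jobs:
  build-and-test:
    runs-on: ubuntu-latest        # runner label is an assumption
    permissions:
      contents: read              # assumed; checkout needs it once permissions are explicit
      actions: write              # the permission added by this commit
    steps:
      - uses: actions/checkout@v4 # assumed first step
```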
- Add scope=try-main to GHA Docker layer cache (cache-from/cache-to) so PR builds can't evict main branch cache entries
- Switch to actions/cache/restore@v5 (restore-only) + conditional actions/cache/save@v5 (main only) so PRs never write buildkit-cache entries to the shared GHA cache
- Set skip-extraction based on event type: false on push (always refresh cache after build), true on pull_request (no save, no need to extract)
- Add explicit cache mount IDs (try-tdnf, try-nuget, try-npm) to all --mount=type=cache directives in Dockerfile to prevent collisions on shared runners
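The scoped layer cache from the first bullet can be sketched as a main-branch build step. The step name, the `if:` condition, and the `context`/`push` values are assumptions; only the `scope=try-main` setting comes from the commit message.

```yaml
# Sketch of a main-branch build step that reads and writes the scoped layer cache.
- name: Docker build (main)       # hypothetical step name
  if: github.event_name == 'push' && github.ref == 'refs/heads/main'
  uses: docker/build-push-action@v7
  with:
    context: .
    push: false
    cache-from: type=gha,scope=try-main
    cache-to: type=gha,scope=try-main
```

A PR build step would keep the same `cache-from` line and omit `cache-to`, so it reads the `try-main` scope without writing to it.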
Caches the compiled output as a distinct layer. If publish fails (e.g. publish config issue), the build layer is already cached so the next run skips the full recompile.
Avoids compiling test projects, SimulatorGenerator, and WasmRunner that are not needed in the runtime image. dotnet build resolves project references automatically so all required dependencies still build.
- Split non-PR container build into main/non-main paths
- Keep cache-to scope=try-main only on refs/heads/main
- Restrict actions/cache/save for buildkit mounts to main branch
- Keep non-main dispatch builds read-only against main cache
Revert split build/publish layering and keep a single publish command for the deployment target project:
- dotnet publish --no-restore /App/src/Microsoft.TryDotNet

This keeps the Dockerfile simpler while still avoiding a second restore.
Pull request overview
This PR improves Docker build performance and CI cache stability by restructuring the Dockerfile to maximize layer reuse, introducing persistent BuildKit cache mounts (NuGet/npm/tdnf), and hardening GitHub Actions caching so only main updates shared cache state.
Changes:
- Re-layered the `Dockerfile` to restore dependencies before copying full source, and added BuildKit `--mount=type=cache` for `tdnf`, NuGet, and npm.
- Added mount-cache persistence in CI via `reproducible-containers/buildkit-cache-dance` + `actions/cache`, and scoped GHA layer caching to `try-main`.
- Added a `.dockerignore` to reduce build context churn (e.g., `bin/`, `obj/`, `node_modules/`, `.buildkit-cache`).
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| `Dockerfile` | Splits stable vs. volatile layers and adds BuildKit cache mounts for faster rebuilds. |
| `.github/workflows/Build-Test-And-Deploy.yaml` | Adds Buildx + mount cache persistence, and restricts cache writes to `refs/heads/main`. |
| `.dockerignore` | Reduces Docker context size/churn to improve cache hit rates and build speed. |
Comments suppressed due to low confidence (1)
.github/workflows/Build-Test-And-Deploy.yaml:77

- The workflow isn't configured to run on `merge_group` events (the `on:` section only lists `push`, `pull_request`, and `workflow_dispatch`), so the `|| github.event_name == 'merge_group'` part of this condition is currently unreachable. Either add `merge_group:` to the workflow triggers or drop the extra condition to avoid confusion.

      - name: Docker build (no push)
        if: github.event_name == 'pull_request' || github.event_name == 'merge_group'
        uses: docker/build-push-action@v7
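If the first option in the comment were taken, the trigger section might look like the sketch below. The three existing triggers are the ones the review names. Any branch or path filters the real workflow uses are not shown in the source and are omitted here.

```yaml
# Sketch: adding merge_group makes the 'merge_group' branch of the condition reachable.
on:
  push:
  pull_request:
  merge_group:
  workflow_dispatch:
```

The alternative is to shorten the step condition to `github.event_name == 'pull_request'` and leave the triggers unchanged.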
- Keep PR validation path as build-only/no-push
- Keep main branch path for image build, cache writes, and artifact upload
- Remove non-main non-PR image build path
- Gate deploy jobs to main branch to match artifact-producing path
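Gating a deploy job to main, as in the last bullet, is commonly done with a job-level condition. In this sketch the job name `deploy`, the runner label, and the placeholder step are hypothetical; only the `build-and-test` job name appears in the source.

```yaml
jobs:
  deploy:                                   # hypothetical job name
    needs: build-and-test
    # Run only for refs that also produce the image artifact.
    if: github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest                  # runner label is an assumption
    steps:
      - run: echo "deploy steps go here"    # placeholder
```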
- Use actions/cache@v5 on main (restore + post-job save)
- Keep PR path restore-only with actions/cache/restore@v5
- Remove explicit actions/cache/save step
- Keep extraction disabled on PRs and on main cache-hit runs
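One way to express this split is sketched below. The step names, step ids, `if:` conditions, and the `buildkit-mounts-` key prefix are assumptions. The cache-dance input names follow my reading of that action's README and should be verified against the pinned version. The sketch also assumes an earlier Buildx setup step with `id: setup-buildx`.

```yaml
# main: restore now, save automatically in the post-job phase.
- name: Cache BuildKit mounts (main)
  id: cache-main                            # hypothetical step id
  if: github.ref == 'refs/heads/main'
  uses: actions/cache@v5
  with:
    path: .buildkit-cache
    key: buildkit-mounts-${{ hashFiles('Dockerfile', 'Directory.Packages.props', 'NuGet.config', '**/package-lock.json') }}
    restore-keys: |
      buildkit-mounts-

# PR: restore only, never save.
- name: Restore BuildKit mounts (PR)
  if: github.event_name == 'pull_request'
  uses: actions/cache/restore@v5
  with:
    path: .buildkit-cache
    key: buildkit-mounts-${{ hashFiles('Dockerfile', 'Directory.Packages.props', 'NuGet.config', '**/package-lock.json') }}
    restore-keys: |
      buildkit-mounts-

- name: Inject cache mounts into BuildKit
  uses: reproducible-containers/buildkit-cache-dance@v3
  with:
    builder: ${{ steps.setup-buildx.outputs.name }}
    cache-dir: .buildkit-cache
    dockerfile: Dockerfile
    # No extraction on PRs (nothing is saved) or on an exact main cache hit.
    skip-extraction: ${{ github.event_name == 'pull_request' || steps.cache-main.outputs.cache-hit == 'true' }}
```

Post-job steps run in reverse order of step definition, so with the cache step declared first, the cache-dance extraction runs before the `actions/cache` save.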
This improves Docker build performance and cache stability in CI by separating stable and volatile Docker layers, persisting BuildKit cache mounts, and preventing non-main runs from polluting mainline cache state.
What changed

- `Dockerfile` layering for cache efficiency:
  - BuildKit cache mounts with explicit IDs for `tdnf` (`try-tdnf`), NuGet (`try-nuget`), and npm (`try-npm`)
  - a single `dotnet publish --no-restore /App/src/Microsoft.TryDotNet`
- `.dockerignore` entries to reduce context churn and avoid cache-dance directory leakage (`.buildkit-cache`, `scratch`, `**/obj`, `**/bin`, etc.).

CI cache behavior

- `reproducible-containers/buildkit-cache-dance` to persist `RUN --mount=type=cache` data across runs.
- Only `refs/heads/main` can write `scope=try-main` and save mount cache, avoiding cache pollution from non-main `workflow_dispatch` runs.

Notes for reviewers