Background
Your team has been actively optimizing CI runners — from migrating to VM-based runners (Feb 2026, #4045) to custom runners for heavy workloads (#4061, #3699, #3716). This proposal is a natural next step in that journey.
Problem
Several ubuntu-latest jobs in .github/workflows/ci.yml perform heavy Rust compilation but run on GitHub's standard 2-vCPU runner (limited I/O). The most impactful example:
smoketests_mod_rs_complete (line ~1192) explicitly rebuilds V8 from source when cache misses occur:
- name: Check v8 outputs
  run: |
    if ! [ -f "${CARGO_TARGET_DIR}"/debug/gn_out/obj/librusty_v8.a ]; then
      cargo clean -p v8 || true
      cargo build -p v8 # ← V8 compile: 15–20 minutes cold on ubuntu-latest
    fi
This is a known pain point (related to recent fixes in #4223, #4209). Other affected jobs:
- global_json_policy: runs cargo ci global-json-policy on 2-vCPU runner
- internal-tests: orchestration overhead on standard runner
- warn-python-smoketests: lightweight but uses standard runner
Solution: Blacksmith
Blacksmith provides:
- Drop-in replacement for ubuntu-latest → blacksmith-4vcpu-ubuntu-2204 (or higher vCPU options)
- 2–3x faster for Rust workloads (NVMe SSD vs HDD, better cache coherency)
- Native Cargo caching for dependency compilation
- Price-competitive with GitHub-hosted minutes (often cheaper for compile-heavy repos)
- No code changes — just runner label swaps
Change Example
# Before
runs-on: ubuntu-latest
# After
runs-on: blacksmith-4vcpu-ubuntu-2204
Suggested Evaluation Path
Phase 1 (Low Risk)
- Team signs up at blacksmith.sh and configures billing integration
- Pilot on 2 jobs: smoketests_mod_rs_complete and global_json_policy
- Measure runtime reduction + verify cache hit rates
- Decision point: roll to all ubuntu-latest jobs or iterate
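One way the Phase 1 pilot could be wired up is sketched below. It assumes a repository-level Actions variable, here called PILOT_RUNNER_LABEL (a hypothetical name, not something that exists in the repo today), so the two pilot jobs can be switched between runners without another commit. Treat it as an illustration rather than the required approach; a plain label swap works just as well.

```yaml
# Sketch only. PILOT_RUNNER_LABEL is a hypothetical repository variable
# (Settings > Secrets and variables > Actions > Variables).
jobs:
  smoketests_mod_rs_complete:
    # Falls back to the current runner when the variable is unset or empty.
    runs-on: ${{ vars.PILOT_RUNNER_LABEL || 'ubuntu-latest' }}
    # ... existing steps unchanged ...

  global_json_policy:
    runs-on: ${{ vars.PILOT_RUNNER_LABEL || 'ubuntu-latest' }}
    # ... existing steps unchanged ...
```

Setting the variable to blacksmith-4vcpu-ubuntu-2204 would start the pilot, and clearing it would send both jobs back to ubuntu-latest.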
Phase 2 (Optional)
- Evaluate Blacksmith's premium tiers (8, 16, 32 vCPU) for heavy jobs
- Consider consolidating some custom runners (benchmarks-runner, arm-runner) if cost-effective
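If it helps with sizing, a throwaway comparison job could run the same cold V8 build on each runner size. The sketch below is illustrative only: the job name is made up, the 8-vCPU label is an assumption extrapolated from the 4-vCPU label's naming pattern and should be checked against Blacksmith's documentation, and any toolchain or cache setup the real jobs need is omitted.

```yaml
# Sketch only: a manually triggered comparison, not part of ci.yml.
on: workflow_dispatch

jobs:
  v8_cold_build_compare:        # hypothetical job name
    strategy:
      fail-fast: false          # let every size finish even if one fails
      matrix:
        runner:
          - ubuntu-latest
          - blacksmith-4vcpu-ubuntu-2204
          - blacksmith-8vcpu-ubuntu-2204   # assumed label, verify before use
    runs-on: ${{ matrix.runner }}
    steps:
      - uses: actions/checkout@v4
      # Toolchain and environment setup from the real jobs would go here.
      - name: Cold V8 build
        run: cargo build -p v8
```

Comparing the step durations across the matrix entries would give a direct answer on whether the larger tier is worth its price for V8-heavy jobs.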
Metrics to Track
- Job duration before/after (especially smoketests_mod_rs_complete and V8 rebuild time)
- Cost per run compared to current GitHub-hosted minutes
- Cache hit rates (Blacksmith provides native Cargo cache metrics)
- Wall-clock time to run full ci.yml workflow
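For the V8 rebuild metric specifically, the existing check step could record its own timing in the job summary. The following is a minimal sketch based on the step quoted in the Problem section. The timing lines and the summary wording are additions for illustration, while GITHUB_STEP_SUMMARY and RUNNER_NAME are standard variables that Actions provides to every step.

```yaml
- name: Check v8 outputs (timed)
  run: |
    start=$(date +%s)
    if ! [ -f "${CARGO_TARGET_DIR}"/debug/gn_out/obj/librusty_v8.a ]; then
      cargo clean -p v8 || true
      cargo build -p v8
      rebuilt=yes
    else
      rebuilt=no
    fi
    end=$(date +%s)
    # Append one line per run to the job summary for before/after comparison.
    echo "v8 rebuilt: ${rebuilt}, elapsed: $((end - start))s, runner: ${RUNNER_NAME}" >> "$GITHUB_STEP_SUMMARY"
```

This separates cache-hit runs from cold rebuilds, which matters because the 15–20 minute cost only shows up on a miss.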
Questions for Review
- Does the team have budget/interest in evaluating Blacksmith?
- Should we start with 4-vCPU or jump to 8-vCPU for V8-heavy jobs?
- Are there other ubuntu-latest jobs not in ci.yml that should be included?
- Would you prefer a full pilot PR or an exploratory branch first?
Notes
- Your recent VM-based runner migration (#4045) and V8 fixes (#4223, #4209) show strong commitment to CI reliability and performance. Blacksmith is a natural complement to that work.
- This is not a replacement for custom runners like spacetimedb-new-runner-2; rather, it optimizes the ubuntu-latest jobs that don't warrant dedicated hardware.
- Blacksmith allows easy rollback if needed (just revert the runs-on labels).
Related PRs
- #4045: Use VM-based github runners
- #4223: CI - Fix v8 in debug and release
- #4209: CI - csharp-testsuite v8 dance done properly