ci: Evaluate Blacksmith runners to improve ubuntu-latest job performance #4479

### Background

Your team has been actively optimizing CI runners, from the migration to VM-based runners (Feb 2026, #4045) to custom runners for heavy workloads (#4061, #3699, #3716). This proposal is the next step in that work.

### Problem

Several `ubuntu-latest` jobs in [`.github/workflows/ci.yml`](https://github.com/clockworklabs/SpacetimeDB/blob/master/.github/workflows/ci.yml) perform heavy Rust compilation but run on GitHub's standard hosted runner (2 vCPUs on private repositories, 4 vCPUs on public ones; limited I/O either way). The most impactful example:

**`smoketests_mod_rs_complete`** (line ~1192) explicitly rebuilds **V8 from source** on a cache miss:

```yaml
- name: Check v8 outputs
  run: |
    if ! [ -f "${CARGO_TARGET_DIR}"/debug/gn_out/obj/librusty_v8.a ]; then
      cargo clean -p v8 || true
      cargo build -p v8   # ← V8 compile: 15–20 minutes cold on ubuntu-latest
    fi
```

This is a known pain point (see the recent fixes in #4223 and #4209). Other affected jobs:
- `global_json_policy`: runs `cargo ci global-json-policy` on the standard runner
- `internal-tests`: orchestration overhead on the standard runner
- `warn-python-smoketests`: lightweight, but also uses the standard runner

### Solution: Blacksmith

[Blacksmith](https://www.blacksmith.sh/) provides:
- **Drop-in replacement** for `ubuntu-latest` → `blacksmith-4vcpu-ubuntu-2204` (or higher vCPU options)
- **2–3x faster** for Rust workloads (an estimate to confirm in the pilot; attributed mainly to NVMe-backed storage)
- **Native Cargo caching** for dependency compilation
- **Price-competitive** with GitHub-hosted minutes (often cheaper for compile-heavy repos)
- **No code changes** — just runner label swaps

### Change Example

```yaml
# Before
runs-on: ubuntu-latest

# After
runs-on: blacksmith-4vcpu-ubuntu-2204
```
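During a pilot it can help to make the label switchable without a commit. A minimal sketch, assuming a repository configuration variable named `SMOKETESTS_RUNNER` (a hypothetical name; nothing like it exists in the repo today):

```yaml
jobs:
  smoketests_mod_rs_complete:
    # SMOKETESTS_RUNNER is a hypothetical repository variable (Settings → Variables).
    # When it is unset, the expression evaluates to an empty string and the
    # job falls back to the current GitHub-hosted runner.
    runs-on: ${{ vars.SMOKETESTS_RUNNER || 'ubuntu-latest' }}
```

With this in place, setting the variable to `blacksmith-4vcpu-ubuntu-2204` moves the job to Blacksmith, and clearing it moves the job back.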

### Suggested Evaluation Path

**Phase 1 (Low Risk)**
1. Team signs up at blacksmith.sh and configures billing integration
2. Pilot on 2 jobs: `smoketests_mod_rs_complete` and `global_json_policy`
3. Measure the runtime reduction and verify cache hit rates
4. Decision point: roll to all `ubuntu-latest` jobs or iterate
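One possible way to run steps 2 and 3 is to execute a pilot job on both runners in the same workflow run, so the durations are directly comparable. This is only a sketch: the matrix key `runner` is an arbitrary name, and the job's existing steps are assumed to stay unchanged.

```yaml
jobs:
  smoketests_mod_rs_complete:
    strategy:
      fail-fast: false   # let both legs finish so their durations can be compared
      matrix:
        runner: [ubuntu-latest, blacksmith-4vcpu-ubuntu-2204]
    runs-on: ${{ matrix.runner }}
    # ...existing steps unchanged...
```

If the job restores a cache under a shared key, the two legs may pick up each other's artifacts; including `${{ matrix.runner }}` in the cache key should keep the comparison clean.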

**Phase 2 (Optional)**
- Evaluate Blacksmith's premium tiers (8, 16, 32 vCPU) for heavy jobs
- Consider consolidating some custom runners (`benchmarks-runner`, `arm-runner`) if cost-effective

### Metrics to Track

- **Job duration** before/after (especially `smoketests_mod_rs_complete` and V8 rebuild time)
- **Cost per run** compared to current GitHub-hosted minutes
- **Cache hit rates** (Blacksmith provides native Cargo cache metrics)
- **Wall-clock time** to run the full `ci.yml` workflow
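To capture the V8 rebuild time specifically, the build command could be wrapped in a timer that writes to the job summary. A rough sketch, assuming the step runs under the default `bash` shell on Linux; the step name is illustrative and not taken from `ci.yml`:

```yaml
- name: Build v8 (timed)
  run: |
    start=$(date +%s)
    cargo build -p v8
    elapsed=$(( $(date +%s) - start ))
    # GITHUB_STEP_SUMMARY, RUNNER_NAME and RUNNER_ARCH are set by GitHub Actions;
    # text appended to the summary file is shown on the workflow run page.
    echo "v8 build: ${elapsed}s on ${RUNNER_NAME} (${RUNNER_ARCH})" >> "$GITHUB_STEP_SUMMARY"
```

Recording the figure per run makes it easy to compare cold builds on each runner without digging through step logs.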

### Questions for Review

1. Does the team have budget/interest in evaluating Blacksmith?
2. Should we start with 4-vCPU or jump to 8-vCPU for V8-heavy jobs?
3. Are there other `ubuntu-latest` jobs not in `ci.yml` that should be included?
4. Would you prefer a full pilot PR or an exploratory branch first?

---

## Notes

- Your recent VM-based runner migration (PR #4045) and V8 fixes (#4223, #4209) show strong commitment to CI reliability and performance. Blacksmith is a natural complement to that work.
- This is **not** a replacement for custom runners like `spacetimedb-new-runner-2`; rather, it optimizes the `ubuntu-latest` jobs that don't warrant dedicated hardware.
- Blacksmith allows easy **rollback** if needed (just revert the `runs-on` labels).

---

## Related PRs

- #4045 — VM-based GitHub runners migration
- #4223 — CI - Fix v8 in debug and release
- #4209 — CI - csharp-testsuite v8 dance done properly

