test: mark CPU-heavy tests exclusive for scheduling stability #166

Merged: chen3feng merged 1 commit into master from mark-heavy-tests-exclusive on Jun 3, 2026
@chen3feng
Collaborator

Summary

Mark 5 CPU-heavy tests with exclusive = True so that they run serially after the parallel pool has finished, instead of oversubscribing the machine and starving neighbouring tests of CPU.

  Test                                             Standalone CPU%
  flare/base/object_pool:memory_node_shared_test   811%
  flare/base/internal:dpc_test                     576%
  flare/base:monitoring_test                       554%
  flare/fiber/detail:waitable_test                 331%
  flare/base:hazptr_test                           321%
(measured on M2, release/-O2, blade test --full-test)
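
The selection rule behind this list can be sketched in a few lines of Python. The CPU figures are the measurements listed above, and the 300% cut-off is the threshold mentioned in the commit message. The `pick_exclusive` helper is a hypothetical name for illustration, not part of blade.

```python
# Standalone CPU% per test, copied from the measurements above.
STANDALONE_CPU = {
    "flare/base/object_pool:memory_node_shared_test": 811,
    "flare/base/internal:dpc_test": 576,
    "flare/base:monitoring_test": 554,
    "flare/fiber/detail:waitable_test": 331,
    "flare/base:hazptr_test": 321,
}


def pick_exclusive(cpu_by_test, threshold=300):
    """Return tests whose standalone CPU% exceeds `threshold`, heaviest first.

    Hypothetical helper: it only illustrates the "CPU% > 300" rule.
    """
    heavy = [test for test, cpu in cpu_by_test.items() if cpu > threshold]
    return sorted(heavy, key=lambda test: cpu_by_test[test], reverse=True)


if __name__ == "__main__":
    for target in pick_exclusive(STANDALONE_CPU):
        print(f"{target}: {STANDALONE_CPU[target]}%")
```

With the default threshold all five tests are selected. Any cut-off between 554 and 576 would leave only memory_node_shared_test and dpc_test, which matches the top-2 fallback discussed under "Wall-time impact".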

Why this, not loop reduction

The point is scheduling stability, not wall time. When these tests run in the parallel pool, they each want 3-8 cores; together they saturate the box and starve timing-sensitive neighbours, producing CI flakes that don't reproduce locally.

This extends an existing pattern: chrono_test, time_keeper_test, time_view_test, timer_test, this_fiber_test, scheduling_group_test, and dumper_test are already marked exclusive for the same reason (timing accuracy under contention).
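
For readers unfamiliar with the two-phase model, here is a minimal sketch of the scheduling idea, assuming a runner that drains a parallel pool first and then runs exclusive tests one at a time. It is not blade's actual scheduler; `run_two_phase`, `run_one`, and the test names in the demo are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def run_two_phase(tests, run_one, jobs=4):
    """Run non-exclusive tests in a pool, then exclusive tests serially.

    `tests` is a list of (name, exclusive) pairs and `run_one(name)` runs a
    single test and returns its result. Returns a {name: result} dict.
    """
    parallel = [name for name, exclusive in tests if not exclusive]
    serial = [name for name, exclusive in tests if exclusive]
    results = {}

    # Phase 1: tests that tolerate neighbours share the worker pool.
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        for name, result in zip(parallel, pool.map(run_one, parallel)):
            results[name] = result

    # Phase 2: the `with` block has joined all workers, so each exclusive
    # test now runs with no other test competing for CPU.
    for name in serial:
        results[name] = run_one(name)
    return results


if __name__ == "__main__":
    order = []
    demo = [("light_a_test", False), ("heavy_test", True), ("light_b_test", False)]
    run_two_phase(demo, lambda name: order.append(name) or "ok", jobs=2)
    print(order[-1])  # prints "heavy_test": the exclusive test runs last
```

The trade-off is the one this PR accepts: phase 2 adds the serial sum of the exclusive tests to the wall time, in exchange for timing-sensitive tests in phase 1 no longer competing with tests that want several cores each.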

Wall-time impact

Local M2 8-core: 2m44 → 4m30 (+106s). The pool was already CPU-saturated, so moving these out doesn't free up much for phase 1 and adds the serial sum to phase 2.

CI is the actual question. GHA standard runners are 2-core, so the oversubscription pain was worse there and phase-1 savings should be larger relative to the serial cost. Measuring on this PR's CI is the point: if Run tests regresses, we'll scope back to the top 2 (memory_node_shared_test + dpc_test, the two heaviest at 811% and 576% CPU).
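
The local delta quoted above can be checked mechanically, and the same helper accepts the `M:SS` form used for the CI baseline in the test plan. This is a small sketch; `to_seconds` and `regression` are hypothetical helper names, not existing tooling.

```python
def to_seconds(stamp):
    """Convert 'MmSS' (e.g. '2m44') or 'M:SS' (e.g. '8:31') to seconds."""
    minutes, seconds = stamp.replace("m", ":").split(":")
    return int(minutes) * 60 + int(seconds)


def regression(before, after):
    """Return (delta_seconds, percent_change) between two wall times."""
    b, a = to_seconds(before), to_seconds(after)
    return a - b, round(100 * (a - b) / b, 1)


if __name__ == "__main__":
    # Local M2 numbers from this PR: 2m44 before, 4m30 after.
    print(regression("2m44", "4m30"))  # prints (106, 64.6)
```

Running `regression` with the master baseline and this PR's Run tests duration gives the number that decides whether to keep all five tests exclusive or scope down.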

Test plan

  • All 235 tests pass after change (local M2)
  • CI Run tests duration: compare to master baseline (8:31 / 8:09 on ubuntu_22/24)
  • If CI also regresses meaningfully, scope down to top-2

🤖 Generated with Claude Code

Local profiling on M2 (release/-O2) shows these tests each consume
3+ effective cores when run standalone:

  memory_node_shared_test   811% CPU
  dpc_test                  576% CPU
  monitoring_test           554% CPU
  waitable_test             331% CPU
  hazptr_test               321% CPU

When run inside blade's parallel pool, they over-subscribe the box
and starve other tests for CPU. The primary harm isn't wall time --
it's *stability*: timing-sensitive tests neighboring these get
non-deterministic latency, leading to CI flakes that don't reproduce
locally.

Mark all five exclusive (same scheduler queue as the existing
exclusive tests: chrono_test / time_keeper_test / timer_test /
this_fiber_test / scheduling_group_test / dumper_test, which are
exclusive for the same reason -- timing accuracy under contention).

Local wall: +106s (2m44 -> 4m30) on 8-core M2 where pool is
already CPU-saturated. CI runners (2-core GHA) likely see a smaller
regression -- or improvement -- because the oversubscription pain
was worse there. We'll measure on the actual CI workflow and
re-tune the threshold (currently CPU% > 300) if needed.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@chen3feng chen3feng merged commit f2da35a into master Jun 3, 2026
8 checks passed
@chen3feng chen3feng deleted the mark-heavy-tests-exclusive branch June 3, 2026 08:19