test: mark CPU-heavy tests exclusive for scheduling stability #166
Merged
Local profiling on M2 (release/-O2) shows these tests each consume 3+ effective cores when run standalone:

  memory_node_shared_test  811% CPU
  dpc_test                 576% CPU
  monitoring_test          554% CPU
  waitable_test            331% CPU
  hazptr_test              321% CPU

When run inside blade's parallel pool, they over-subscribe the box and starve other tests for CPU. The primary harm isn't wall time -- it's *stability*: timing-sensitive tests neighboring these get non-deterministic latency, leading to CI flakes that don't reproduce locally.

Mark all five exclusive (same scheduler queue as the existing exclusive tests: chrono_test / time_keeper_test / timer_test / this_fiber_test / scheduling_group_test / dumper_test, which are exclusive for the same reason -- timing accuracy under contention).

Local wall: +106s (2m44 -> 4m30) on 8-core M2 where pool is already CPU-saturated. CI runners (2-core GHA) likely see smaller regression -- or improvement -- because the oversubscription pain was worse there. We'll measure on the actual CI workflow and re-tune the threshold (currently CPU% > 300) if needed.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Summary
Mark 5 CPU-heavy tests exclusive = True so they run serially after the parallel pool, rather than oversubscribing the box and starving neighbours of CPU.
- flare/base/object_pool:memory_node_shared_test (811% CPU)
- flare/base/internal:dpc_test (576% CPU)
- flare/base:monitoring_test (554% CPU)
- flare/fiber/detail:waitable_test (331% CPU)
- flare/base:hazptr_test (321% CPU)
(measured on M2, release/-O2, blade test --full-test)
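For readers who have not seen the attribute, here is a rough sketch of what marking one of these targets might look like in a blade BUILD file. Only the target name and `exclusive = True` come from this PR; the `srcs` and `deps` entries are placeholders chosen for illustration and are not copied from the repository.

```python
# BUILD fragment (blade syntax). Illustrative only: srcs and deps are
# hypothetical placeholders, not the real contents of flare/base/internal/BUILD.
cc_test(
    name = 'dpc_test',
    srcs = ['dpc_test.cc'],        # assumed source file name
    deps = [
        ':dpc',                    # hypothetical library under test
        '//flare/testing:main',    # assumed test entry point
    ],
    exclusive = True,              # run outside the parallel pool, one at a time
)
```

The other four targets would get the same one-line addition.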
Why this, not loop reduction
The point is scheduling stability, not wall time. When these tests run in the parallel pool, they each want 3-8 cores; together they saturate the box and starve timing-sensitive neighbours, producing CI flakes that don't reproduce locally.
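The selection rule and the scale of the problem can be sketched in a few lines of Python. The CPU figures and the 300% threshold are the ones quoted in this PR; the function names are made up for illustration, and the oversubscription ratio assumes the worst case where all five tests run at the same moment and each keeps its standalone demand.

```python
# Standalone CPU% figures quoted in this PR (M2, release/-O2).
CPU_PERCENT = {
    "memory_node_shared_test": 811,
    "dpc_test": 576,
    "monitoring_test": 554,
    "waitable_test": 331,
    "hazptr_test": 321,
}

# Cutoff named in the commit message ("currently CPU% > 300").
EXCLUSIVE_THRESHOLD = 300


def exclusive_candidates(measurements, threshold=EXCLUSIVE_THRESHOLD):
    """Return tests whose CPU% exceeds the threshold, heaviest first."""
    heavy = [name for name, pct in measurements.items() if pct > threshold]
    return sorted(heavy, key=lambda name: measurements[name], reverse=True)


def oversubscription(measurements, cores):
    """Total core demand divided by available cores.

    Worst-case assumption: every listed test runs concurrently and asks
    for the same number of cores it used when run standalone.
    """
    demand_cores = sum(measurements.values()) / 100.0
    return demand_cores / cores


if __name__ == "__main__":
    print("exclusive:", exclusive_candidates(CPU_PERCENT))
    print("top 2:", exclusive_candidates(CPU_PERCENT)[:2])
    for cores in (8, 2):
        ratio = oversubscription(CPU_PERCENT, cores)
        print(f"{cores}-core box: {ratio:.1f}x oversubscribed")
```

Under those assumptions the five tests alone ask for about 26 cores, roughly 3.2x what an 8-core M2 offers and about 13x what a 2-core runner offers, before any other test in the pool is counted.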
This extends an existing pattern: chrono_test, time_keeper_test, time_view_test, timer_test, this_fiber_test, scheduling_group_test, and dumper_test are already marked exclusive for the same reason (timing accuracy under contention).
Wall-time impact
Local M2 8-core: 2m44 → 4m30 (+106s). The pool was already CPU-saturated, so moving these out doesn't free up much for phase 1 and adds the serial sum to phase 2.
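As a quick check of the arithmetic, the snippet below converts the two wall times to seconds and recomputes the delta. The `to_seconds` helper is a hypothetical name introduced here; only the two timings come from the measurement above.

```python
import re


def to_seconds(stamp):
    """Convert a duration written like '2m44' into whole seconds."""
    match = re.fullmatch(r"(\d+)m(\d{1,2})", stamp)
    if match is None:
        raise ValueError(f"unrecognised duration: {stamp!r}")
    minutes, seconds = (int(part) for part in match.groups())
    return minutes * 60 + seconds


before = to_seconds("2m44")   # wall time with everything in the parallel pool
after = to_seconds("4m30")    # wall time with the five tests marked exclusive
delta = after - before

if __name__ == "__main__":
    print(f"before={before}s after={after}s delta=+{delta}s")
    print(f"relative increase: {delta / before:.0%}")
```

The delta comes out to the quoted +106s, which is roughly a 65% increase over the 164s baseline on this machine.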
CI is the actual question. GHA standard runners are 2-core: the oversubscription pain was worse there, so phase-1 savings should be larger relative to the serial cost. Measuring on this PR's CI is the point: if Run tests regresses, we'll scope back to the top 2 (memory_node_shared_test + dpc_test, the two heaviest at 811% and 576% CPU).
Test plan
🤖 Generated with Claude Code