feat(platform): CBM_WORKERS env override for cbm_default_worker_count #364
Open
yangsec888 wants to merge 1 commit into
Adds a single env knob, CBM_WORKERS, that explicitly sets the worker count returned by cbm_default_worker_count(). When unset, behaviour is unchanged.

In containerized deployments, cbm_default_worker_count(initial=true) returns sysconf(_SC_NPROCESSORS_ONLN) (the host CPU count, not the container's CPU quota). On a 1-vCPU pod scheduled onto a 16-core node, cbm spawns ~16 indexing workers, each with its own per-worker buffers, which is the dominant OOMKill driver in container deployments.

The broader ask (full cgroup awareness for cbm_system_info / mem.init budget logs) lives in a sibling issue; this knob is the smaller, lower-risk path that ships independently.

Same precedence shape as the existing CBM_* env overrides: explicit override > implicit detection. The accepted range is [1, 256]; out-of-range or non-numeric values are ignored, and the function logs `workers.env.invalid` at WARN before falling through to the sysconf-derived default.

Files touched:
- src/foundation/system_info.c: env probe + clamp + warn-and-fall-through
- tests/test_platform.c: 3 new TEST cases covering override, invalid fallback, and unset baseline
- README.md: CBM_WORKERS row in the Environment Variables table

Related: cgroup-aware detection issue (filed alongside this PR).
What
Adds a single env knob, CBM_WORKERS, that explicitly sets the worker count returned by cbm_default_worker_count(). When unset, behaviour is unchanged.
Why
In containerized deployments, cbm_default_worker_count(initial=true) returns sysconf(_SC_NPROCESSORS_ONLN) (the host CPU count, not the container's CPU quota). On a 1-vCPU pod scheduled onto a 16-core node, cbm spawns ~16 indexing workers, each with its own per-worker buffers. This is the dominant OOMKill driver in container deployments (see companion issue #363 for the broader cgroup-awareness ask).

A small env knob solves the immediate operator pain without taking on the scope of full cgroup detection, and it follows the same shape as the existing CBM_* env overrides: explicit override > implicit detection.
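For readers who want the "explicit override > implicit detection" shape spelled out, here is a minimal standalone sketch. It is not the patch itself: it uses plain getenv, strtol, and fprintf instead of the project's cbm_safe_getenv and cbm_log_warn, and the names parse_workers_override, detected_cpu_count, default_worker_count_sketch, and the SKETCH_* constants are hypothetical. It assumes, per the test list in this PR, that out-of-range values fall back to the detected count rather than saturating at the bounds.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Bounds taken from the PR description; the enum names are illustrative. */
enum { SKETCH_WORKERS_MIN = 1, SKETCH_WORKERS_MAX = 256 };

/* Hypothetical helper: returns the parsed count, or 0 when the text is
 * empty, non-numeric, has trailing characters, or is out of range. */
static int parse_workers_override(const char *text) {
    if (text == NULL || *text == '\0') {
        return 0;
    }
    char *end = NULL;
    errno = 0;
    long value = strtol(text, &end, 10);
    if (errno != 0 || end == text || *end != '\0') {
        return 0;
    }
    if (value < SKETCH_WORKERS_MIN || value > SKETCH_WORKERS_MAX) {
        return 0;
    }
    return (int)value;
}

/* Implicit detection: online CPU count, never less than 1. */
static int detected_cpu_count(void) {
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    return n < 1 ? 1 : (int)n;
}

/* Explicit override first, implicit detection second. */
static int default_worker_count_sketch(void) {
    const char *raw = getenv("CBM_WORKERS");
    if (raw != NULL) {
        int requested = parse_workers_override(raw);
        if (requested > 0) {
            assert(requested <= SKETCH_WORKERS_MAX);
            return requested;
        }
        /* Stand-in for the WARN-level workers.env.invalid log line. */
        fprintf(stderr, "WARN workers.env.invalid value=%s\n", raw);
    }
    return detected_cpu_count();
}
```

With this shape, a 1-vCPU pod started with CBM_WORKERS=1 would get one worker regardless of how many cores the node reports, while a deployment that never sets the variable keeps the detected count.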
Patch
src/foundation/system_info.c:
- CBM_WORKERS_MAX = 256 is added to the file-local enum; happy to drop the upper clamp if you'd rather trust the operator, or move the constant somewhere shared.
- cbm_safe_getenv, CBM_DECIMAL_BASE, CBM_SZ_32, SKIP_ONE, MIN_WORKERS, and cbm_log_warn all already exist (see src/foundation/platform.h, constants.h, and log.h).
Tests
tests/test_platform.c gains three cases inside the existing SUITE(platform):
- platform_default_workers_env_override: CBM_WORKERS=4 is honored for both initial=true and initial=false.
- platform_default_workers_env_invalid: 0, -1, 9999, and not-a-number all fall back to the sysconf-derived default (and emit workers.env.invalid at WARN).
- platform_default_workers_env_unset: when unset, cbm_default_worker_count(true) == cbm_system_info().total_cores, preserving today's contract.
Docs
One-line addition to the Environment Variables table in README.md:
Behaviour when unset
Identical to today. No deployment that doesn't set CBM_WORKERS sees any difference.
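The invalid-value and unset cases listed under Tests share one property: setting CBM_WORKERS to a bad value must give the same result as leaving it unset. The sketch below shows one way to check that property against any candidate function. Everything in it is hypothetical (worker_count_fn, count_fallback_failures, and the two stand-in functions with their fixed baseline of 8); it does not use the project's TEST/SUITE macros or call cbm_default_worker_count.

```c
#define _DEFAULT_SOURCE /* glibc: expose setenv/unsetenv under strict -std=c11 */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical signature for the function under test. */
typedef int (*worker_count_fn)(void);

/* The values this PR lists as invalid for CBM_WORKERS. */
static const char *const k_invalid_values[] = {"0", "-1", "9999", "not-a-number"};

/* Returns how many invalid values changed the result relative to the
 * baseline observed with the variable unset. Leaves the variable unset. */
static int count_fallback_failures(worker_count_fn fn, const char *var) {
    assert(fn != NULL && var != NULL);
    unsetenv(var);
    const int baseline = fn();
    const size_t n = sizeof k_invalid_values / sizeof k_invalid_values[0];
    int failures = 0;
    for (size_t i = 0; i < n; i++) {
        setenv(var, k_invalid_values[i], 1);
        if (fn() != baseline) {
            failures++;
        }
    }
    unsetenv(var);
    return failures;
}

/* Stand-in that trusts the variable blindly (no validation). */
static int naive_worker_count(void) {
    const char *raw = getenv("CBM_WORKERS");
    return raw != NULL ? atoi(raw) : 8;
}

/* Stand-in that ignores the variable entirely. */
static int fixed_worker_count(void) {
    return 8;
}
```

The naive stand-in fails all four cases, which is the behaviour the range check and warn-and-fall-through path are meant to rule out; the fixed stand-in passes them trivially, so a check like this needs to be paired with a positive case such as CBM_WORKERS=4.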
Verification on my side
I have not run scripts/build.sh / scripts/test.sh / scripts/lint.sh locally because the full ASan + UBSan setup isn't installed on my machine. Happy to iterate quickly on any CI feedback (lint, clang-tidy, cppcheck, format); the patch is small enough that a turnaround should be fast.
Related
ask this PR carves a smaller, lower-risk path out of).
Segmentation fault (core dumped) #336, Segmentation fault at start of lsp_cross pass on repositories with 200+ files #340.
MCP-server process inside a 2 GiB Kubernetes pod. Backend pod's
current workaround is a memory-limit bump from 2 GiB → 4 GiB to
ride out the overspawn until this lands.