fix(index): bound + source-cap parallel retention with re-read fallback (low-RAM peak RSS) #854
Merged
…ck (low-RAM peak RSS)
Distilled from #685 (nguyentamdat) rebased onto current main, plus two research-driven refinements and a genuine reproduce-first guard.
The parallel extract retains each file's source text so the fused cross-file LSP resolve can re-parse it. That retention is transient but a peak-RSS driver. On main the caps are flat (100 MiB/file, 2 GiB total) and a file over the cap is silently unretained AND its cross-file resolution is skipped -- a graph-quality gap. This change bounds retention AND keeps every cross-file edge.
- Source-text cap as a FLOOR, not just a ceiling: retention total defaults to min(cbm_mem_budget()/8, 1 GiB), per-file min(32 MiB, total). Following the rust-analyzer memory model, the RAM-derived default is clamped to a small absolute ceiling so a huge-RAM host does not hold tens of GB it would re-read cheaply. Both caps env-overridable via CBM_RETAIN_TOTAL_MB / CBM_RETAIN_PER_FILE_MB (limits.c convention); ceilings bound only the auto-derived default, never an explicit operator/caller choice. A dropped file emits one index.retain_capped WARN per run.
- Bounded re-read fallback (the correctness guarantee): resolve_worker re-reads an unretained file's source on demand (bounded, freed immediately) instead of skipping resolution, wired at every cross-LSP site that consumes source. The cap now only trades retained RAM for a bounded re-read, never a lost edge.
- cbm_parallel_extract_ex + opts struct (cbm_parallel_extract is now a wrapper passing NULL -> env-derived defaults); malloc/calloc NULL-check hardening.
Reproduce-first: parallel_cross_file_reread_preserves_unretained_edges uses a Java<->Kotlin pair whose cross-file lsp edges are genuinely source-dependent; the edges are lost when the caller is unretained and the fallback is absent (RED), present with it (GREEN), with a retained CONTROL scenario proving non-vacuity. #685's original Python red test was a false guard (per-file py_lsp already resolves those calls) and is replaced. Peak-bound guards (retained_bytes <= total_cap; retain_sources=false retains nothing) in test_mem.c.
Verify: make -f Makefile.cbm cbm && make -f Makefile.cbm lint-ci; test-runner parallel pipeline incremental py_lsp ts_lsp java_lsp kotlin_lsp c_lsp cs_lsp go_lsp rust_lsp mem -> 2323 passed.
Co-authored-by: nguyentamdat <nguyentamdat@gmail.com>
Signed-off-by: Martin Vogel <martin.vogel.tech@gmail.com>
This was referenced Jul 4, 2026
fix(index): bound + source-cap parallel retention with re-read fallback (low-RAM peak RSS)
Distilled from #685 (nguyentamdat), rebased onto current main, with two
research-driven refinements and a genuine reproduce-first guard.
The tradeoff: peak RSS vs. graph quality
During a full index the parallel extract RETAINS each file's source text so the
later fused cross-file LSP resolve can re-parse it without re-opening the file.
That retention is TRANSIENT (freed at run end) but it is a PEAK-RSS driver:
every retained byte is resident at once across the extract→resolve handoff.
On main the caps are flat (PP_RETAIN_PER_FILE_MAX_BYTES 100 MiB,
PP_RETAIN_TOTAL_BUDGET_BYTES 2 GiB) and a file over the per-file cap is
silently not retained and its cross-file LSP resolution is then skipped: a
graph-quality gap (the resolver can't re-read it). Lowering the cap to save
RAM would make that gap worse.
This change breaks the tradeoff: bound retention and keep every cross-file
edge, by re-reading dropped files on demand.
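The re-read-on-demand idea above can be sketched roughly as follows. This is a minimal illustration, not the code in pass_parallel.c: `read_source_bounded`, `source_for_resolve`, and `source_view` are hypothetical names, and `max_bytes` merely stands in for whatever limit the real resolver applies.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read a whole file only if it is at most max_bytes
 * long. Returns a NUL-terminated heap buffer (caller frees), or NULL if the
 * file is missing, too large, or unreadable. */
static char *read_source_bounded(const char *path, size_t max_bytes, size_t *out_len) {
    FILE *f = fopen(path, "rb");
    if (f == NULL) return NULL;
    char *buf = NULL;
    if (fseek(f, 0, SEEK_END) == 0) {
        long size = ftell(f);
        if (size >= 0 && (size_t)size <= max_bytes && fseek(f, 0, SEEK_SET) == 0) {
            buf = malloc((size_t)size + 1);
            if (buf != NULL) {
                size_t n = fread(buf, 1, (size_t)size, f);
                if (n != (size_t)size) {
                    free(buf); /* short read: treat as failure */
                    buf = NULL;
                } else {
                    buf[n] = '\0';
                    *out_len = n;
                }
            }
        }
    }
    fclose(f);
    return buf;
}

/* Hypothetical view over a file's source for one resolve step. */
typedef struct {
    const char *text; /* source text, or NULL if unavailable */
    size_t len;
    char *owned;      /* non-NULL when text was re-read and must be freed */
} source_view;

/* Prefer the retained copy; fall back to a bounded re-read instead of
 * skipping resolution when nothing was retained. */
static source_view source_for_resolve(const char *retained, size_t retained_len,
                                      const char *path, size_t max_bytes) {
    source_view v = {retained, retained_len, NULL};
    if (retained == NULL) {
        v.len = 0;
        v.owned = read_source_bounded(path, max_bytes, &v.len);
        v.text = v.owned;
    }
    return v;
}
```

In this sketch a caller would use `v.text` for the resolve step and then call `free(v.owned)` right away, so a re-read buffer never outlives the file that needed it.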
What changed (scope: pass_parallel.c + pipeline_internal.h + tests only)
- Source-text cap as a FLOOR, not just a ceiling: the retention total now defaults to min(cbm_mem_budget()/8, 1 GiB) with a modest per-file cap of min(32 MiB, total). Following the rust-analyzer memory model, the RAM-derived default is clamped to a small absolute ceiling so a huge-RAM host does not HOLD tens of GB of source it would re-read cheaply anyway. Both caps are env-overridable via CBM_RETAIN_TOTAL_MB / CBM_RETAIN_PER_FILE_MB (limits.c strtol convention); the hard ceilings bound only the auto-derived default, never a deliberate operator/caller choice. A dropped file emits a single index.retain_capped path=… bytes=… WARN per run.
- Bounded re-read fallback (the correctness guarantee): when resolution needs an unretained file's source, resolve_worker re-reads it from disk (bounded by cbm_max_file_bytes, freed immediately) instead of skipping resolution. Wired at every cross-LSP site that consumes source (Python / C·C++·CUDA / C# / TS·JS / the per-file cbm_pxc_run_one(_ts) path), so lowering the cap only trades retained RAM for a bounded re-read; it never loses a cross-file edge.
- cbm_parallel_extract_ex + opts struct (cbm_parallel_extract is now a thin wrapper passing NULL → env-derived defaults) and malloc/calloc NULL-check hardening (sorted, pkg_entries).
The production pipeline reaches the new caps + fallback through the existing cbm_parallel_extract call; no change to pipeline.c / pipeline_incremental.c is required, keeping this PR atomic and disjoint from the memory-workstream keystone that owns mem.c / mcp.c / main.c.
Reproduce-first (green ⟺ fixed)
RED/GREEN A: graph quality (parallel_cross_file_reread_preserves_unretained_edges). A Java↔Kotlin pair with genuinely cross-language calls that ONLY the source-dependent cross-file LSP resolves (JavaCaller.call → KotlinService.ping, KotlinService.ping → JavaService.pong, strategy lsp). Three scenarios: CONTROL (retained), NO-RETAIN (retain_sources=false), OVER-CAP (1-byte per-file cap). All GREEN with the fallback.
Red-first evidence (fallback disabled to simulate main):
The CONTROL (retained) scenario passes both ways = non-vacuity; the drop scenarios lose the lsp cross-file resolution without the re-read = RED.
Guard B: peak bound (parallel_extract_tiny_source_retention_budget, parallel_extract_without_source_retention, test_mem.c). Index a fixture whose total source exceeds a tiny budget; assert retained_bytes <= total_cap (and retain_sources=false retains nothing) while defs/nodes still extract. Correctness of the over-cap files is covered by the re-read exercised in RED/GREEN A.
Caps are forced tiny via the opts/env knobs so the over-cap path is deterministic (no giant fixtures).
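To illustrate why an explicit knob can force a tiny cap while the auto-derived default stays clamped, here is a minimal sketch of the cap resolution described in this PR. It is an assumption-laden illustration: `env_mib`, `retain_caps`, and `resolve_retain_caps` are hypothetical names, the memory budget is passed in as a parameter rather than read from cbm_mem_budget(), and treating unset or invalid values as "use the default" is a guess at the limits.c convention. Only the environment variable names and the min(budget/8, 1 GiB) / min(32 MiB, total) defaults come from the description above.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MIB ((uint64_t)1024 * 1024)
#define GIB (1024 * MIB)

/* Hypothetical: parse a positive MiB count from an environment variable.
 * Returns 0 when the variable is unset, empty, malformed, or out of range
 * (assumed here to mean "fall back to the derived default"). */
static uint64_t env_mib(const char *name) {
    const char *s = getenv(name);
    if (s == NULL || *s == '\0') return 0;
    errno = 0;
    char *end = NULL;
    long v = strtol(s, &end, 10);
    if (errno != 0 || end == s || *end != '\0' || v <= 0) return 0;
    if ((uint64_t)v > UINT64_MAX / MIB) return 0; /* would overflow in bytes */
    return (uint64_t)v;
}

typedef struct {
    uint64_t total;    /* total retained-source budget, in bytes */
    uint64_t per_file; /* largest single file that may be retained, in bytes */
} retain_caps;

/* Explicit env values win unclamped; only the auto-derived defaults are
 * bounded by the 1 GiB and 32 MiB ceilings. */
static retain_caps resolve_retain_caps(uint64_t mem_budget_bytes) {
    retain_caps c;
    uint64_t total_mib = env_mib("CBM_RETAIN_TOTAL_MB");
    if (total_mib != 0) {
        c.total = total_mib * MIB;
    } else {
        c.total = mem_budget_bytes / 8;
        if (c.total > GIB) c.total = GIB;
    }
    uint64_t per_file_mib = env_mib("CBM_RETAIN_PER_FILE_MB");
    if (per_file_mib != 0) {
        c.per_file = per_file_mib * MIB;
    } else {
        c.per_file = 32 * MIB;
        if (c.per_file > c.total) c.per_file = c.total;
    }
    return c;
}
```

The environment knobs in this sketch have MiB granularity, so a 1-byte per-file cap like the OVER-CAP scenario would have to come through the opts struct rather than the environment.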
Verification
- make -f Makefile.cbm cbm -> clean (-Werror)
- make -f Makefile.cbm lint-ci -> clean (cppcheck + clang-format + NOLINT)
- CBM_INDEX_SUPERVISOR=0 ./build/c/test-runner parallel pipeline incremental py_lsp ts_lsp java_lsp kotlin_lsp c_lsp cs_lsp go_lsp rust_lsp mem -> 2323 passed, 0 failed (ASan/UBSan)
Closes #685-review
Refs #832 (retention layer of the memory workstream)