Skip to content

fix(ui): bound octree recursion for coincident points (3D layout)#848

Merged
DeusData merged 1 commit into
mainfrom
distill/821-octree-guard
Jul 4, 2026
Merged

fix(ui): bound octree recursion for coincident points (3D layout)#848
DeusData merged 1 commit into
mainfrom
distill/821-octree-guard

Conversation

@DeusData

@DeusData DeusData commented Jul 4, 2026

Copy link
Copy Markdown
Owner

fix(ui): bound octree recursion for coincident points (3D layout)

Distilled from #821 (guard half) — thanks @Ljove02 for the analysis and fix.

Mechanism

The graph-UI 3D layout builds a Barnes-Hut octree every iteration
(src/ui/layout3d.c). When two or more bodies share (or nearly share) a
position, subdivision never separates them: octree_insert recurses forever,
calloc-ing one cell per level while half_size shrinks toward zero. The
process either overflows the stack (SIGSEGV) or — with the tail call
optimized — spins allocating without bound (the 34GB-swap freeze reported in
the field). Plausibly the driver of #498, #726 and #402.

Coincident positions are reachable through the public layout API: nodes are
anchored by fnv1a(file cluster key) and jittered by a PRNG seeded with
fnv1a(qualified_name), so same-file nodes with hash-colliding qualified
names get bit-identical positions.

Fix (from #821)

  • octree_insert now carries a depth parameter; at depth >= 26 or
    half_size < 1e-4f it stops subdividing and folds the body into the cell
    as an aggregate (mass-weighted centroid, body_index = -1).
  • octree_repulse interaction audited: a folded cell has body_index = -1
    so the self-skip doesn't apply, but d is clamped to 0.01 before the
    dx/d division — coincident bodies get exactly zero force, no NaN.
  • HARD_MAX_NODES raised 10000 → 200000: the opt-in
    CBM_UI_MAX_RENDER_NODES escape hatch now reaches the new ceiling for
    power users (safe now that the octree is bounded).

Excluded from #821 (deferred to its own discussion)

The default-cap raise is a UX/perf policy change, not part of the bug fix,
and was split out per review (atomic PRs):

  • DEFAULT_MAX_NODES 2000 → 60000 in src/ui/layout3d.ckept at 2000.
  • GRAPH_RENDER_NODE_LIMIT 2000 → 60000 in
    graph-ui/src/hooks/useGraphData.ts (+ its test) — kept at 2000.

Users who want more nodes today can already opt in via
CBM_UI_MAX_RENDER_NODES (now up to 200000).

Reproduce-first evidence

New guard: layout_coincident_nodes_bounded in tests/test_ui.c. It drives
the public layout API with three same-file nodes whose distinct qualified
names share one 32-bit FNV-1a hash (0x06bb012e) — bit-identical coincident
positions on every platform — plus spread nodes for a realistic root box. The
child does all work under fork + alarm(5) so the unfixed runaway cannot
take down the runner; deliberately no memory rlimit (a failing calloc makes
the unfixed octree_insert silently truncate and complete — a vacuous
green). The child also asserts the trio's output coordinates are bitwise
identical, so the fixture fails loudly if seeding ever changes instead of
silently losing coverage.

  • RED on origin/main layout3d.c: layout_coincident_nodes_bounded FAIL tests/test_ui.c:584: ASSERT(!WIFSIGNALED(status)) — the child is killed by
    a signal under the recursion runaway; suite 15 passed, 1 failed, runner
    unharmed (fork+alarm harness contained it).
  • GREEN with the guard: ui suite 16 passed, 0 failed — the child exits 0 in
    milliseconds with bounded depth, the coincident trio keeps bitwise-identical
    coordinates, all coordinates finite.
  • C: make -f Makefile.cbm cbm (-Werror) clean; make -f Makefile.cbm lint-ci (cppcheck + clang-format + NOLINT whitelist) passed.
  • graph-ui: npm ci (0 vulnerabilities); useGraphData.test.ts 1/1 pass with
    GRAPH_RENDER_NODE_LIMIT back at 2000; npx tsc -b clean.
    (Pre-existing, unrelated: GraphTab.test.ts "truncated for render safety"
    fails on non-en-US-locale machines — formatGraphLimitNotice uses
    host-locale toLocaleString() while the test hardcodes commas; present on
    origin/main, untouched by this PR.)

A literal sub-ULP-separated pair is not constructible through the public API
(same-anchor positions are quantized to exact multiples of the jitter
quantum, 20 ULP at anchor magnitude; cross-anchor separations are libm
platform-dependent). Exact coincidence is the API-reachable degenerate input
and drives the recursion through the same sub-ULP regime the floor guards.

Refs #498, #726, #402

Coincident (or sub-ULP-separated) bodies made octree_insert subdivide
forever in the graph-UI 3D layout, calloc-ing one cell per level until
the process crashed (stack overflow) or froze the machine allocating
(the 34GB-swap reports). Guard distilled from #821: octree_insert now
carries a depth and stops at depth 26 or half_size < 1e-4f, folding the
body into the cell as a mass-weighted centroid aggregate
(body_index = -1). octree_repulse already clamps d to 0.01 before the
dx/d division, so folded coincident bodies get exactly zero force and
no NaN.

The default-cap raise bundled in #821 (DEFAULT_MAX_NODES and
GRAPH_RENDER_NODE_LIMIT 2000 -> 60000) is a UX policy change deferred
to its own discussion per review; HARD_MAX_NODES is raised to 200000 so
opt-in CBM_UI_MAX_RENDER_NODES users get the new ceiling.

Guard test layout_coincident_nodes_bounded drives the public layout API
with same-file nodes whose distinct qualified names share one 32-bit
FNV-1a hash (bit-identical coincident positions on every platform), in
a fork+alarm child so the unfixed runaway cannot take down the runner.

Refs #498, #726, #402

Co-authored-by: Ljove02 <135197334+Ljove02@users.noreply.github.com>
Signed-off-by: Martin Vogel <martin.vogel.tech@gmail.com>
@DeusData DeusData enabled auto-merge July 4, 2026 11:52
@DeusData DeusData merged commit 2e8f0f0 into main Jul 4, 2026
15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant