fix(ui): bound octree recursion for coincident points (3D layout)#848
Merged
Conversation
Coincident (or sub-ULP-separated) bodies made octree_insert subdivide forever in the graph-UI 3D layout, calloc-ing one cell per level until the process crashed (stack overflow) or froze the machine allocating (the 34GB-swap reports). Guard distilled from #821: octree_insert now carries a depth and stops at depth 26 or half_size < 1e-4f, folding the body into the cell as a mass-weighted centroid aggregate (body_index = -1). octree_repulse already clamps d to 0.01 before the dx/d division, so folded coincident bodies get exactly zero force and no NaN. The default-cap raise bundled in #821 (DEFAULT_MAX_NODES and GRAPH_RENDER_NODE_LIMIT 2000 -> 60000) is a UX policy change deferred to its own discussion per review; HARD_MAX_NODES is raised to 200000 so opt-in CBM_UI_MAX_RENDER_NODES users get the new ceiling. Guard test layout_coincident_nodes_bounded drives the public layout API with same-file nodes whose distinct qualified names share one 32-bit FNV-1a hash (bit-identical coincident positions on every platform), in a fork+alarm child so the unfixed runaway cannot take down the runner. Refs #498, #726, #402 Co-authored-by: Ljove02 <135197334+Ljove02@users.noreply.github.com> Signed-off-by: Martin Vogel <martin.vogel.tech@gmail.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
fix(ui): bound octree recursion for coincident points (3D layout)
Distilled from #821 (guard half) — thanks @Ljove02 for the analysis and fix.
Mechanism
The graph-UI 3D layout builds a Barnes-Hut octree every iteration
(
src/ui/layout3d.c). When two or more bodies share (or nearly share) aposition, subdivision never separates them:
octree_insertrecurses forever,calloc-ing one cell per level whilehalf_sizeshrinks toward zero. Theprocess either overflows the stack (SIGSEGV) or — with the tail call
optimized — spins allocating without bound (the 34GB-swap freeze reported in
the field). Plausibly the driver of #498, #726 and #402.
Coincident positions are reachable through the public layout API: nodes are
anchored by
fnv1a(file cluster key)and jittered by a PRNG seeded withfnv1a(qualified_name), so same-file nodes with hash-colliding qualifiednames get bit-identical positions.
Fix (from #821)
octree_insertnow carries adepthparameter; atdepth >= 26orhalf_size < 1e-4fit stops subdividing and folds the body into the cellas an aggregate (mass-weighted centroid,
body_index = -1).octree_repulseinteraction audited: a folded cell hasbody_index = -1so the self-skip doesn't apply, but
dis clamped to0.01before thedx/ddivision — coincident bodies get exactly zero force, no NaN.HARD_MAX_NODESraised 10000 → 200000: the opt-inCBM_UI_MAX_RENDER_NODESescape hatch now reaches the new ceiling forpower users (safe now that the octree is bounded).
Excluded from #821 (deferred to its own discussion)
The default-cap raise is a UX/perf policy change, not part of the bug fix,
and was split out per review (atomic PRs):
DEFAULT_MAX_NODES2000 → 60000 insrc/ui/layout3d.c— kept at 2000.GRAPH_RENDER_NODE_LIMIT2000 → 60000 ingraph-ui/src/hooks/useGraphData.ts(+ its test) — kept at 2000.Users who want more nodes today can already opt in via
CBM_UI_MAX_RENDER_NODES(now up to 200000).Reproduce-first evidence
New guard:
layout_coincident_nodes_boundedintests/test_ui.c. It drivesthe public layout API with three same-file nodes whose distinct qualified
names share one 32-bit FNV-1a hash (
0x06bb012e) — bit-identical coincidentpositions on every platform — plus spread nodes for a realistic root box. The
child does all work under
fork+alarm(5)so the unfixed runaway cannottake down the runner; deliberately no memory rlimit (a failing
callocmakesthe unfixed
octree_insertsilently truncate and complete — a vacuousgreen). The child also asserts the trio's output coordinates are bitwise
identical, so the fixture fails loudly if seeding ever changes instead of
silently losing coverage.
origin/mainlayout3d.c:layout_coincident_nodes_bounded FAIL tests/test_ui.c:584: ASSERT(!WIFSIGNALED(status))— the child is killed bya signal under the recursion runaway; suite
15 passed, 1 failed, runnerunharmed (fork+alarm harness contained it).
16 passed, 0 failed — the child exits 0 inmilliseconds with bounded depth, the coincident trio keeps bitwise-identical
coordinates, all coordinates finite.
make -f Makefile.cbm cbm(-Werror) clean;make -f Makefile.cbm lint-ci(cppcheck + clang-format + NOLINT whitelist) passed.npm ci(0 vulnerabilities);useGraphData.test.ts1/1 pass withGRAPH_RENDER_NODE_LIMITback at 2000;npx tsc -bclean.(Pre-existing, unrelated:
GraphTab.test.ts"truncated for render safety"fails on non-en-US-locale machines —
formatGraphLimitNoticeuseshost-locale
toLocaleString()while the test hardcodes commas; present onorigin/main, untouched by this PR.)
A literal sub-ULP-separated pair is not constructible through the public API
(same-anchor positions are quantized to exact multiples of the jitter
quantum, 20 ULP at anchor magnitude; cross-anchor separations are libm
platform-dependent). Exact coincidence is the API-reachable degenerate input
and drives the recursion through the same sub-ULP regime the floor guards.
Refs #498, #726, #402