refactor: extract graph model subsystem#436
Conversation
…d classifiers Introduce src/graph/ with a unified in-memory CodeGraph model that replaces ad-hoc graphology usage scattered across consumers. Graph construction (from DB), algorithms (BFS, Tarjan, Louvain, shortest-path), and classification (roles, risk) are now centralized. Refactors communities, cycles, structure, triage, and viewer to use the new subsystem, removing ~300 lines of duplicated graph-building logic.
|
Claude finished @carlos-alm's task —— View job PR Review: Graph Model Subsystem RefactoringRoot Cause AnalysisProblem Being Solved: This PR addresses architectural debt by consolidating scattered graph construction logic across multiple modules. Previously, files like Why This Approach: The refactor extracts this scattered logic into a dedicated
Risk Assessment: Low risk. This is a pure refactoring that consolidates existing functionality without changing behavior. The graph model provides the same interface as the previous ad-hoc implementations. Configuration Changes✅ No risky configuration changes detected
Test Integrity✅ Test quality maintained
Code Quality AssessmentExcellent Design Decisions
Code Quality Issues
Architectural Improvements
Minor Concerns
Backlog Compliance
Critical ConcernsNone identified. This is a well-executed refactoring. Final Recommendation
|
Greptile SummaryThis PR extracts a unified
Confidence Score: 5/5
Important Files Changed
Sequence DiagramsequenceDiagram
participant W as watcher.js
participant R as rebuildFile (incremental.js)
participant S as stmts (prepared SQLite statements)
W->>R: rebuildFile(db, rootDir, filePath, stmts, engineOpts, cache, opts)
note over R: db/_db is received but<br/>never used inside the function
R->>S: stmts.deleteEdgesForFile.run(relPath)
R->>S: stmts.deleteNodes.run(relPath)
R->>S: stmts.insertNode.run(...)
R->>S: stmts.insertEdge.run(...)
R-->>W: { file, nodesAdded, nodesRemoved, edgesAdded, ... }
Last reviewed commit: f4d2d1e |
src/graph/model.js
Outdated
| if (type === 'undirected') { | ||
| // Deduplicate: only add each unordered pair once | ||
| for (const [src, tgt] of this.edges()) { | ||
| if (src === tgt) continue; | ||
| if (!g.hasEdge(src, tgt)) g.addEdge(src, tgt); | ||
| } | ||
| } else { | ||
| for (const [src, tgt] of this.edges()) { | ||
| if (src === tgt) continue; | ||
| if (!g.hasEdge(src, tgt)) g.addEdge(src, tgt); | ||
| } | ||
| } |
There was a problem hiding this comment.
Duplicate if/else branches in toGraphology
Both the undirected and else branches contain identical code, making the conditional a no-op. The comment "Deduplicate: only add each unordered pair once" appears only in the undirected block but the same loop body exists in the else block, so it's misleading. The whole if/else can be collapsed into a single loop:
| if (type === 'undirected') { | |
| // Deduplicate: only add each unordered pair once | |
| for (const [src, tgt] of this.edges()) { | |
| if (src === tgt) continue; | |
| if (!g.hasEdge(src, tgt)) g.addEdge(src, tgt); | |
| } | |
| } else { | |
| for (const [src, tgt] of this.edges()) { | |
| if (src === tgt) continue; | |
| if (!g.hasEdge(src, tgt)) g.addEdge(src, tgt); | |
| } | |
| } | |
| for (const [src, tgt] of this.edges()) { | |
| if (src === tgt) continue; | |
| if (!g.hasEdge(src, tgt)) g.addEdge(src, tgt); | |
| } |
Note: for the directed→undirected case (the primary use for Louvain), this.edges() will yield both A→B and B→A when they exist as directed edges, but the !g.hasEdge guard correctly deduplicates them in the undirected graphology graph.
There was a problem hiding this comment.
Fixed in ca868ba — collapsed the identical if/else branches into a single loop. The !g.hasEdge guard on the graphology undirected graph handles deduplication for both directed and undirected source graphs.
There was a problem hiding this comment.
Already fixed in ca868ba — the identical if/else branches were collapsed into a single loop with the !g.hasEdge guard handling deduplication.
There was a problem hiding this comment.
Already fixed in ca868ba — the identical if/else branches were collapsed into a single loop with the !g.hasEdge guard handling deduplication.
…cate round4 - Collapse identical if/else branches in CodeGraph.toGraphology() into a single loop (the undirected dedup is handled by graphology's hasEdge guard) - Export round4 from risk.js and import in triage.js instead of redeclaring
|
Both findings addressed in ca868ba:
|
| get edgeCount() { | ||
| let count = 0; | ||
| for (const targets of this._successors.values()) count += targets.size; | ||
| return count; | ||
| } |
There was a problem hiding this comment.
edgeCount double-counts edges on undirected graphs
For an undirected CodeGraph, addEdge(A, B) stores both A→B and B→A in _successors (lines 78-81). Because edgeCount simply sums all entries across all successor maps, it returns 2× the actual logical edge count for undirected graphs. By contrast, the edges() generator correctly deduplicates via the seen set.
buildTemporalGraph creates an undirected CodeGraph, and any caller that checks graph.edgeCount against a threshold (e.g. to skip processing an empty graph) will see inflated values. This won't cause a wrong "skip" today, but as more consumers are added for the temporal graph the stale API contract will produce subtle bugs.
The fix is to either (a) apply the same deduplication logic as edges(), or (b) track a separate _edgeCount field incremented only once per logical edge:
| get edgeCount() { | |
| let count = 0; | |
| for (const targets of this._successors.values()) count += targets.size; | |
| return count; | |
| } | |
| get edgeCount() { | |
| if (this._directed) { | |
| let count = 0; | |
| for (const targets of this._successors.values()) count += targets.size; | |
| return count; | |
| } | |
| // Undirected: each logical edge is stored in both directions, count once | |
| let count = 0; | |
| for (const targets of this._successors.values()) count += targets.size; | |
| return count / 2; | |
| } |
There was a problem hiding this comment.
Fixed in merge commit f4d2d1e — resolved conflict by taking main's fix: \ now returns \ for undirected graphs since each logical edge is stored in both directions.
Impact: 5 functions changed, 10 affected
|
Claude finished @carlos-alm's task —— View job Code Review: Graph Model Subsystem RefactoringTodo List:
Review Summary: ⭐⭐⭐⭐ APPROVEThis is a high-quality refactoring that successfully consolidates scattered graph construction logic into a well-designed unified subsystem. The fixes from commits Key Strengths1. Architectural Excellence
2. Fixes Properly Implemented
3. Code Quality
4. Test CoverageBased on the comprehensive test structure I analyzed:
Architecture AssessmentBefore: Scattered Logic
After: Unified SubsystemBenefits Achieved
Risk Assessment: LOW
Minor Observations
Test Plan Verification✅ CLI Commands Confirmed:
✅ Test Structure: 29 graph + integration test files with 564 tests (as claimed in PR description) ✅ Integration Coverage: Tests exist for all refactored functionality in RecommendationAPPROVE - This refactor represents a significant architectural improvement that:
The code quality is high, fixes have been properly implemented, and the comprehensive test coverage provides confidence in the refactoring's correctness. --- • |
… rationale Impact: 2 functions changed, 0 affected
Address PR #436 review feedback: document why each risk dimension and role category is weighted the way it is.
Summary
src/graph/subsystem with a unifiedCodeGraphin-memory model replacing ad-hoc graphology graph construction scattered across multiple modulessrc/graph/classifiers/communities.js,cycles.js,structure.js,triage.js, andviewer.jsto use the new subsystem, removing ~300 lines of duplicated graph-building logicTest plan
codegraph communities,codegraph cycles,codegraph structure,codegraph triageCLI commands work end-to-endnpm test)