feat: parser refactoring, Jedi call resolution, and performance optimizations by gzenz · Pull Request #158 · tirth8205/code-review-graph

gzenz · 2026-04-08T20:27:42Z

Summary

Major improvements spanning parser architecture, call graph accuracy, build performance, and dead code detection:

Parser refactoring: Extracted 16 per-language handler modules into code_review_graph/lang/ using a strategy pattern, replacing monolithic conditionals in parser.py. Thread-safe parser caches with double-check locking.
Jedi-based call resolution: New jedi_resolver.py resolves Python method calls at build time. Pre-scan filtering by project function names reduces enrichment from 36s to 3s on large repos. New [enrichment] optional dependency group.
PreToolUse search enrichment: New enrich.py module and code-review-graph enrich CLI command inject graph context (callers, flows, community, tests) into agent search results passively via hook.
Call graph improvements: Typed variable call enrichment (Python, JS/TS, Kotlin/Java), star import resolution, namespace imports, CommonJS require(), Angular template parsing, JSX handler tracking, module-qualified call resolution, function/class references as arguments.
Dead code FP reduction: Framework decorators recognized as entry points, CDK construct methods and abstract overrides excluded, e2e test directories filtered.
Community detection 21x speedup: Bulk node loading + adjacency-indexed cohesion computation (48.6s to 2.3s on 41k-node repos).
Build performance: Batch file storage (50-file transactions), batch risk_index (2 GROUP BY queries replace per-node loops).
DB schema v8: Composite edge index for upsert performance (v7 reserved by PR #127, "fix: add sqlite edge compound indexes").
Other: Weighted flow risk scoring, transitive TESTED_BY, --quiet/--json CLI flags, search deduplication, 829+ tests (up from 615).
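The handler extraction and the cache locking described in the parser refactoring item can be pictured roughly as follows. This is a minimal sketch, not the code in code_review_graph/lang/: the names LanguageHandler, PythonHandler, JavaScriptHandler, HANDLER_FACTORIES, and get_handler are hypothetical, and the node type strings are illustrative.

```python
import threading
from typing import Callable, Dict, Protocol


class LanguageHandler(Protocol):
    """Hypothetical per-language strategy interface (not the PR's actual API)."""

    language: str

    def call_node_types(self) -> frozenset[str]: ...


class PythonHandler:
    language = "python"

    def call_node_types(self) -> frozenset[str]:
        # Illustrative syntax-node names only.
        return frozenset({"call"})


class JavaScriptHandler:
    language = "javascript"

    def call_node_types(self) -> frozenset[str]:
        return frozenset({"call_expression", "new_expression"})


# A registry lookup replaces an if/elif chain over language names.
HANDLER_FACTORIES: Dict[str, Callable[[], LanguageHandler]] = {
    "python": PythonHandler,
    "javascript": JavaScriptHandler,
}

_handler_cache: Dict[str, LanguageHandler] = {}
_cache_lock = threading.Lock()


def get_handler(language: str) -> LanguageHandler:
    """Return a cached handler, creating it at most once across threads."""
    handler = _handler_cache.get(language)  # first check, without the lock
    if handler is None:
        with _cache_lock:
            handler = _handler_cache.get(language)  # second check, under the lock
            if handler is None:
                handler = HANDLER_FACTORIES[language]()
                _handler_cache[language] = handler
    return handler
```

The unlocked first check keeps the common cache-hit path cheap, while the second check under the lock prevents two threads from both constructing a handler for the same language.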

Evaluation

Tested against Gadgetbridge (41k nodes, 280k edges):

8/10 scorecard PASS (callers_of, callees_of, tests_for, communities, flows, impact radius, risk scores)
Call resolution rate improved from 28% to 39.6%
Community detection: 48.6s to 2.3s
Full build time: ~22s for 3,574 files
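To illustrate why an adjacency index helps the community detection timing above, here is a small sketch. It assumes cohesion is defined as internal edges divided by all edges touching the community; the PR does not state its exact formula, and build_adjacency and cohesion are hypothetical names.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def build_adjacency(edges: Iterable[Tuple[int, int]]) -> Dict[int, Set[int]]:
    """Index undirected neighbours once, so each community only visits its own edges."""
    adj: Dict[int, Set[int]] = defaultdict(set)
    for src, dst in edges:
        if src != dst:  # ignore self-loops in this sketch
            adj[src].add(dst)
            adj[dst].add(src)
    return adj


def cohesion(members: Set[int], adj: Dict[int, Set[int]]) -> float:
    """Assumed definition: internal edges / (internal + boundary edges)."""
    internal_endpoints = 0
    boundary = 0
    for node in members:
        for neighbour in adj.get(node, ()):
            if neighbour in members:
                internal_endpoints += 1  # each internal edge is seen from both ends
            else:
                boundary += 1
    internal = internal_endpoints // 2
    total = internal + boundary
    return internal / total if total else 0.0


edges = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5)]
adj = build_adjacency(edges)
print(cohesion({1, 2, 3}, adj))  # 3 internal edges, 1 boundary edge -> 0.75
print(cohesion({4, 5}, adj))     # 1 internal edge, 1 boundary edge -> 0.5
```

With the index built once, the work per community is proportional to the degrees of its members rather than to the total edge count, which is the kind of saving the reported 48.6s to 2.3s change suggests.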

Migration note

Our composite edge index migration is numbered v8 to avoid conflict with v6 (summary tables, already on main) and v7 (reserved by PR #127).
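For readers unfamiliar with numbered migrations, a step like this could look roughly like the sketch below. Everything specific is an assumption: the edges table layout, the index name idx_edges_composite, the indexed columns, and the use of PRAGMA user_version to store the schema version are illustrative and may not match how this project tracks migrations.

```python
import sqlite3

# Hypothetical migration table: version number -> SQL statements.
# v7 is intentionally absent here, mirroring the reservation for PR #127.
MIGRATIONS = {
    8: [
        "CREATE INDEX IF NOT EXISTS idx_edges_composite "
        "ON edges (kind, source, target)",
    ],
}


def migrate(conn: sqlite3.Connection) -> int:
    """Apply every migration newer than the stored version; return the final version."""
    current = conn.execute("PRAGMA user_version").fetchone()[0]
    for version in sorted(MIGRATIONS):
        if version <= current:
            continue
        with conn:  # commit on success, roll back on error
            for statement in MIGRATIONS[version]:
                conn.execute(statement)
            # PRAGMA does not accept bound parameters; version is a trusted int.
            conn.execute(f"PRAGMA user_version = {version}")
        current = version
    return current


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edges (kind TEXT, source TEXT, target TEXT)")
conn.execute("PRAGMA user_version = 6")
print(migrate(conn))  # 8
```

A purely linear counter like this one would skip a v7 step that lands after a database has already reached v8; how the project handles that ordering is not stated in the PR.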

Test plan

uv run pytest tests/ --tb=short -q (829 passed, 4 skipped)
uv run ruff check code_review_graph/ (all checks passed)
Full rebuild + evaluation on Gadgetbridge (41k nodes)
Full rebuild + evaluation on internal Python/TS/React project (261 files)
CI pipeline (lint, type-check, security, test matrix)

…izations

Major improvements to code-review-graph spanning parser architecture, call graph accuracy, and build performance.

Parser refactoring:
- Extract 16 per-language handler modules into code_review_graph/lang/ using a strategy pattern, replacing monolithic conditionals in parser.py
- Thread-safe parser caches with double-check locking

Call graph enrichment:
- Jedi-based Python method call resolution at build time (jedi_resolver.py)
- Pre-scan filtering by project function names (36s to 3s on large repos)
- Typed variable call enrichment (Python, JS/TS, Kotlin/Java)
- Star import resolution, namespace imports, CommonJS require()
- Angular template parsing, JSX handler tracking
- Module-level import tracking and module-qualified call resolution
- Function/class references passed as call arguments

PreToolUse search enrichment:
- New enrich.py module and code-review-graph enrich CLI command
- Injects graph context (callers, flows, community, tests) into agent search results passively via hook

Dead code false positive reduction:
- Framework decorators recognized as entry points
- CDK construct methods, abstract overrides excluded
- E2e test directories excluded from dead code detection

Performance:
- Community detection: 48.6s to 2.3s (21x speedup) via bulk node loading and adjacency-indexed cohesion computation
- Jedi enrichment: 36s to 3s (12x) via pre-scan filtering
- Batch file storage (50-file transactions)
- Batch risk_index (2 GROUP BY queries replace per-node loops)

Other:
- Weighted flow risk scoring by criticality
- Transitive TESTED_BY lookup for tests_for and risk scoring
- DB schema v8: composite edge index (v7 reserved by PR tirth8205#127)
- --quiet and --json CLI flags
- Search query deduplication, test function deprioritization
- New [enrichment] optional dependency group for Jedi
- 829+ tests across 26 test files (up from 615)

Evaluated against Gadgetbridge (41k nodes, 280k edges): 8/10 PASS, call resolution rate improved from 28% to 39.6%.
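The pre-scan filtering idea mentioned in the commit message can be sketched with the standard library alone. This is an illustration under assumptions, not the logic in jedi_resolver.py: it uses ast to collect called names and skips any file that never calls a name defined in the project, so that an expensive resolver would only run where it could find something. The function names called_names and files_worth_resolving are hypothetical.

```python
import ast
from typing import Dict, Set


def called_names(source: str) -> Set[str]:
    """Collect the bare or attribute name of every call expression in a module."""
    names: Set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            func = node.func
            if isinstance(func, ast.Attribute):
                names.add(func.attr)  # obj.method(...) -> "method"
            elif isinstance(func, ast.Name):
                names.add(func.id)    # function(...) -> "function"
    return names


def files_worth_resolving(
    sources: Dict[str, str], project_functions: Set[str]
) -> Dict[str, Set[str]]:
    """Keep only files that call at least one name defined in the project."""
    worth: Dict[str, Set[str]] = {}
    for path, source in sources.items():
        hits = called_names(source) & project_functions
        if hits:
            worth[path] = hits
    return worth


sources = {
    "a.py": "import os\nos.path.join('x', 'y')\n",
    "b.py": "svc.load_user(1)\nprint('hi')\n",
}
print(files_worth_resolving(sources, {"load_user", "save_user"}))
# {'b.py': {'load_user'}}
```

Matching on names alone over-approximates (an unrelated method with the same name still passes the filter), which is acceptable for a pre-filter because the resolver makes the final decision.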

gzenz force-pushed the feat/lang-handlers-jedi-resolver-and-perf branch from 6df1421 to 702ac5b on April 8, 2026, 20:43

gzenz wants to merge 1 commit into tirth8205:main from gzenz:feat/lang-handlers-jedi-resolver-and-perf
