docs: add Phase 4 (Native Analysis Acceleration) to roadmap #471
carlos-alm merged 7 commits into main
…in backlog

These two items deliver the highest immediate impact on agent experience and graph accuracy without requiring Rust porting or TypeScript migration. They should be implemented before any Phase 4+ roadmap work.

- #83: hook-optimized `codegraph brief` enriches passively-injected context
- #71: basic type inference closes the biggest resolution gap for TS/Java
Impact: 14 functions changed, 0 affected
Add new Phase 4 covering the port of JS-only build phases to Rust:

- 4.1-4.3: AST nodes, CFG, dataflow visitor ports (~587ms savings)
- 4.4: Batch SQLite inserts (~143ms)
- 4.5: Role classification & structure (~42ms)
- 4.6: Complete complexity pre-computation
- 4.7: Fix incremental rebuild data loss on native engine
- 4.8: Incremental rebuild performance (target sub-100ms)

Bump old Phases 4-10 to 5-11 with all cross-references updated. Benchmark evidence shows ~50% of native build time is spent in JS visitors that run identically on both engines.
Claude finished @carlos-alm's task

Comprehensive PR Review: Phase 4 Documentation Addition

Todo List:
Root Cause Analysis

Problem Being Solved: Adding Phase 4 (Native Analysis Acceleration) to the roadmap to document the next planned infrastructure improvement. The current native engine only accelerates ~50% of the build pipeline (parse, resolve, edges) while the remaining 7 phases (AST nodes, CFG, dataflow, insert nodes, structure, roles, complexity) still run in JavaScript, leaving ~690ms of potential speedup on the table.

Why This Approach: This is purely documentation work: no code changes, just adding comprehensive Phase 4 documentation with benchmarks, sub-phases, technical details, and dependency updates. The approach is systematic: detailed benchmarking data, clear technical specifications, and proper renumbering of subsequent phases.

Risk Assessment: VERY LOW. Pure documentation with no implementation. The technical content appears well-researched with specific benchmark data and clear technical specifications.

Backlog Compliance
Technical Review

✅ Phase 1: Root Cause Analysis
The documentation clearly explains the performance problem: benchmarks show native builds spend ~690ms in JS visitor walks that could be eliminated. The Phase 4 goals are well-justified with concrete performance targets (2x speedup, sub-100ms incremental rebuilds).

✅ Phase 2: Configuration Change Detection
No configuration changes; this is pure documentation.

✅ Phase 3: Test Integrity
No test changes; documentation only.

✅ Phase 4: Code Quality
Documentation quality is excellent:

✅ Phase 5: Content Verification

Phase Numbering Consistency ✅
All phase references properly updated:
Technical Accuracy ✅
Completeness ✅
Minor Observations
Test Plan Verification
The PR author requested verification of:
Final Recommendation
Excellent work on the technical analysis and comprehensive documentation. The benchmark data and sub-phase breakdown provide a clear technical roadmap for the upcoming native acceleration work.
Greptile Summary

This PR inserts a new Phase 4 (Native Analysis Acceleration) into the roadmap and shifts the old Phases 4–10 to Phases 5–11, with updated headers, cross-references, dependency graph, and the overview table. The new phase documents the plan to port the remaining JS-only build visitors (AST nodes, CFG, dataflow, insert nodes, structure, roles, complexity) to Rust, and records the pre-existing incremental-rebuild data-loss bug as item 4.7.

Key changes:
Confidence Score: 3/5
Important Files Changed
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
P1["Phase 1 — Rust Core ✅\nv1.3.0"]
P2["Phase 2 — Foundation Hardening ✅\nv1.5.0"]
P25["Phase 2.5 — Analysis Expansion ✅\nv2.7.0"]
P27["Phase 2.7 — Deep Analysis ✅\nv3.0.0"]
P3["Phase 3 — Architectural Refactoring 🔄\nv3.1.4"]
P4["Phase 4 — Native Analysis Acceleration 🆕\nPlanned"]
P5["Phase 5 — TypeScript Migration\nPlanned"]
P6["Phase 6 — Runtime & Extensibility\nPlanned"]
P7["Phase 7 — Intelligent Embeddings\nPlanned"]
P8["Phase 8 — Natural Language Queries\nPlanned"]
P9["Phase 9 — Expanded Language Support\nPlanned"]
P10["Phase 10 — GitHub Integration & CI\nPlanned"]
P11["Phase 11 — Visualization & Advanced\nPlanned"]
P1 --> P2 --> P25 --> P27 --> P3 --> P4 --> P5
P5 --> P6
P5 --> P7 --> P8
P5 --> P9
P5 --> P10
P7 --> P10
P6 & P7 & P8 & P9 & P10 --> P11
style P4 fill:#d4edda,stroke:#28a745,color:#000
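
The dependency edges in the flowchart above can be checked mechanically. The following is a minimal sketch, not part of the PR: it transcribes the edges by hand and uses Kahn's algorithm to confirm the graph is acyclic and that the new Phase 4 sits between Phase 3 and Phase 5. The `edges` array and the `topoOrder` helper are illustrative names, not codegraph APIs.

```javascript
// Edges transcribed from the flowchart ("A --> B" means B depends on A).
const edges = [
  ["P1", "P2"], ["P2", "P25"], ["P25", "P27"], ["P27", "P3"],
  ["P3", "P4"], ["P4", "P5"],
  ["P5", "P6"], ["P5", "P7"], ["P7", "P8"], ["P5", "P9"],
  ["P5", "P10"], ["P7", "P10"],
  ["P6", "P11"], ["P7", "P11"], ["P8", "P11"], ["P9", "P11"], ["P10", "P11"],
];

// Kahn's algorithm: repeatedly emit nodes whose prerequisites are all emitted.
function topoOrder(edgeList) {
  const indegree = new Map();
  const next = new Map();
  for (const [from, to] of edgeList) {
    if (!indegree.has(from)) indegree.set(from, 0);
    indegree.set(to, (indegree.get(to) ?? 0) + 1);
    if (!next.has(from)) next.set(from, []);
    next.get(from).push(to);
  }
  const ready = [...indegree].filter(([, d]) => d === 0).map(([n]) => n);
  const order = [];
  while (ready.length > 0) {
    const node = ready.shift();
    order.push(node);
    for (const dep of next.get(node) ?? []) {
      indegree.set(dep, indegree.get(dep) - 1);
      if (indegree.get(dep) === 0) ready.push(dep);
    }
  }
  // If some node was never emitted, the graph contains a cycle.
  if (order.length !== indegree.size) throw new Error("cycle in phase graph");
  return order;
}

const order = topoOrder(edges);
console.log(order.join(" -> "));
```

A check like this only validates the graph as drawn; it says nothing about whether the roadmap text agrees with it.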
| Current label | Actual phase content | Correct label |
|---|---|---|
| 4 | `tsc --noEmit` / TypeScript Migration | 5 |
| 5 | Embeddings quality / Intelligent Embeddings | 7 |
| 6 | `codegraph ask` / Natural Language Queries | 8 |
| 7 | New language parsers / Expanded Language Support | 9 |
| 8 | GitHub Action PR test / GitHub Integration & CI | 10 |
| 9 | `codegraph viz` / Visualization & Advanced | 11 |
Additionally, there is no row for the new Phase 4 (Native Analysis Acceleration) at all. A reader using this table to verify a completed Phase 4 build would run the TypeScript Migration checks instead.
The table needs a new Phase 4 row (e.g. "Benchmark native full-build time ≤700ms; verify parity between native and WASM outputs for AST nodes, CFG, and dataflow; regression-test 1-file incremental rebuilds for complexity/CFG/dataflow data preservation") and all existing rows should be relabelled to match the new phase numbering.
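
The parity and budget checks suggested for the new Phase 4 row could look roughly like the sketch below. Everything here is an assumption for illustration: the row shape (`file`, `line`, `kind`), the `parityDiff` and `withinBudget` helpers, and the idea that both engines' outputs are available as plain arrays. Real codegraph rows and test helpers may differ.

```javascript
// Hypothetical row shape: { file, line, kind }. Real codegraph rows may differ.
function rowKey(row) {
  return `${row.file}:${row.line}:${row.kind}`;
}

// Returns the keys that appear in only one of the two engine outputs.
function parityDiff(nativeRows, wasmRows) {
  const nativeKeys = new Set(nativeRows.map(rowKey));
  const wasmKeys = new Set(wasmRows.map(rowKey));
  return {
    onlyNative: [...nativeKeys].filter((k) => !wasmKeys.has(k)).sort(),
    onlyWasm: [...wasmKeys].filter((k) => !nativeKeys.has(k)).sort(),
  };
}

// Budget check for the suggested "native full-build time <= 700ms" criterion.
function withinBudget(elapsedMs, budgetMs = 700) {
  return elapsedMs <= budgetMs;
}

// Made-up sample data: the native output has one row the WASM output lacks.
const sample = parityDiff(
  [
    { file: "a.js", line: 1, kind: "call" },
    { file: "a.js", line: 9, kind: "new" },
  ],
  [{ file: "a.js", line: 1, kind: "call" }],
);
console.log(sample, withinBudget(650));
```

Comparing sets of stable keys rather than raw arrays keeps the check independent of row ordering, which may legitimately differ between engines.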
Last reviewed commit: 30fdd26
docs/roadmap/ROADMAP.md
Outdated
  - Embedding tests in separate workflow requiring HuggingFace token
  - 312 `setTimeout`/`sleep` instances in tests — potential flakiness under load
- - No dependency audit step in CI (see also [4.7](#47----supply-chain-security--audit))
+ - No dependency audit step in CI (see also [5.7](#47----supply-chain-security--audit))
Broken anchor links to section 5.7
Both cross-references on lines 1230 and 1237 update the display text from 4.7 to 5.7, but the anchor targets were not updated. They still point to #47----supply-chain-security--audit instead of #57----supply-chain-security--audit. These links will not resolve in rendered markdown.
- - No dependency audit step in CI (see also [5.7](#47----supply-chain-security--audit))
+ - No dependency audit step in CI (see also [5.7](#57----supply-chain-security--audit))
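
Stale anchors like this one can be caught with a small link check. The sketch below is an approximation, not GitHub's exact slug implementation: it lowercases the heading, drops characters that are not letters, digits, whitespace, `_` or `-`, and turns each whitespace character into a hyphen. The heading text in the sample is an assumption inferred from the anchor in the suggestion; `headingSlug` and `findBrokenAnchors` are illustrative names.

```javascript
// Approximate GitHub heading-to-anchor rule (not the exact implementation).
function headingSlug(text) {
  return text
    .trim()
    .toLowerCase()
    .replace(/[^\p{L}\p{N}\s_\-]/gu, "") // drop punctuation and emoji
    .replace(/\s/g, "-"); // each whitespace character becomes one hyphen
}

// Returns every in-page link target "#..." that matches no heading slug.
function findBrokenAnchors(markdown) {
  const slugs = new Set();
  for (const m of markdown.matchAll(/^#{1,6}\s+(.*)$/gm)) {
    slugs.add(headingSlug(m[1]));
  }
  const broken = [];
  for (const m of markdown.matchAll(/\]\(#([^)\s]+)\)/g)) {
    if (!slugs.has(m[1])) broken.push(m[1]);
  }
  return broken;
}

// Assumed heading text, reconstructed from the anchor; the links are from the diff.
const doc = [
  "### 5.7 -- Supply Chain Security & Audit",
  "- see also [5.7](#47----supply-chain-security--audit)",
  "- see also [5.7](#57----supply-chain-security--audit)",
].join("\n");

const broken = findBrokenAnchors(doc);
console.log(broken);
```

Under these assumptions only the old `#47-` target is reported, which matches the reviewer's finding that the display text was updated but the anchor was not.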
docs/roadmap/ROADMAP.md
Outdated
  2. **Unified test workflow** -- merge embedding tests into the main CI workflow using a securely stored `HF_TOKEN` secret; eliminate the separate workflow
  3. **Timer cleanup** -- audit and reduce `setTimeout`/`sleep` usage in tests; replace with deterministic waits (event-based, polling with backoff, or `vi.useFakeTimers()`) to reduce flakiness
- 4. > _Dependency audit step is covered by [4.7](#47----supply-chain-security--audit) deliverable 1._
+ 4. > _Dependency audit step is covered by [5.7](#47----supply-chain-security--audit) deliverable 1._
Broken anchor link (same issue as line 1230)
Same broken anchor — display text says 5.7 but the anchor still targets the old #47- prefix.
- 4. > _Dependency audit step is covered by [5.7](#47----supply-chain-security--audit) deliverable 1._
+ 4. > _Dependency audit step is covered by [5.7](#57----supply-chain-security--audit) deliverable 1._
Fixed — same anchor correction applied.
Take main's corrected #57 section anchors; keep HEAD's v2.7.0 version reference.
Impact: 10 functions changed, 11 affected
docs/roadmap/ROADMAP.md
Outdated
@@ -219,7 +219,7 @@ Per-function complexity analysis using language-specific AST rules.

  **New file:** `src/complexity.js` (2,163 lines)

- ### 2.5.2 -- Community Detection & Drift ✅
+ ### 2.6.2 -- Community Detection & Drift ✅

  Louvain community detection at file or function level.

@@ -230,7 +230,7 @@ Louvain community detection at file or function level.

  **New file:** `src/communities.js` (310 lines)

- ### 2.5.3 -- Structure & Role Classification ✅
+ ### 2.6.3 -- Structure & Role Classification ✅

  Directory structure graph with node role classification.

@@ -243,7 +243,7 @@ Directory structure graph with node role classification.

  **New file:** `src/structure.js` (668 lines)

- ### 2.5.4 -- Execution Flow Tracing ✅
+ ### 2.6.4 -- Execution Flow Tracing ✅

  Forward BFS from framework entry points through callees to leaves.

@@ -253,7 +253,7 @@ Forward BFS from framework entry points through callees to leaves.

  **New file:** `src/flow.js` (362 lines)

- ### 2.5.5 -- Temporal Coupling (Co-change Analysis) ✅
+ ### 2.6.5 -- Temporal Coupling (Co-change Analysis) ✅

  Git history analysis for temporal file coupling.

@@ -264,7 +264,7 @@ Git history analysis for temporal file coupling.

  **New file:** `src/cochange.js` (502 lines)

- ### 2.5.6 -- Manifesto Rule Engine ✅
+ ### 2.6.6 -- Manifesto Rule Engine ✅

  Configurable rule engine with warn/fail thresholds for function, file, and graph rules.

@@ -276,7 +276,7 @@ Configurable rule engine with warn/fail thresholds for function, file, and graph

  **New file:** `src/manifesto.js` (511 lines)

- ### 2.5.7 -- Architecture Boundary Rules ✅
+ ### 2.6.7 -- Architecture Boundary Rules ✅

  Architecture enforcement using glob patterns and presets.

@@ -287,7 +287,7 @@ Architecture enforcement using glob patterns and presets.

  **New file:** `src/boundaries.js` (347 lines)

- ### 2.5.8 -- CI Validation Predicates (`check`) ✅
+ ### 2.6.8 -- CI Validation Predicates (`check`) ✅

  Structured pass/fail checks for CI pipelines.

@@ -301,7 +301,7 @@ Structured pass/fail checks for CI pipelines.

  **New file:** `src/check.js` (433 lines)

- ### 2.5.9 -- Composite Analysis Commands ✅
+ ### 2.6.9 -- Composite Analysis Commands ✅

  High-level commands that compose multiple analysis steps.

@@ -311,7 +311,7 @@ High-level commands that compose multiple analysis steps.

  **New files:** `src/audit.js` (424 lines), `src/batch.js` (91 lines), `src/triage.js` (274 lines)

- ### 2.5.10 -- Hybrid Search ✅
+ ### 2.6.10 -- Hybrid Search ✅

  BM25 keyword search + semantic vector search with RRF fusion.

@@ -323,7 +323,7 @@ BM25 keyword search + semantic vector search with RRF fusion.

  **Affected file:** `src/embedder.js` (grew from 525 -> 1,113 lines)

- ### 2.5.11 -- Supporting Infrastructure ✅
+ ### 2.6.11 -- Supporting Infrastructure ✅

  Cross-cutting utilities added during the expansion.

@@ -335,7 +335,7 @@ Cross-cutting utilities added during the expansion.
  - ✅ **Journal:** change journal validation/management (`src/journal.js`, 110 lines)
  - ✅ **Update Check:** npm registry polling with 24h cache (`src/update-check.js`, 161 lines)

- ### 2.5.12 -- MCP Tool Expansion ✅
+ ### 2.6.12 -- MCP Tool Expansion ✅
Sub-section numbers don't match parent phase
All twelve sub-sections inside ## Phase 2.5 -- Analysis Expansion have been renumbered from 2.5.x to 2.6.x (e.g. ### 2.6.1 -- Complexity Metrics, ### 2.6.12 -- MCP Tool Expansion), but the parent phase heading itself was not changed — it is still ## Phase 2.5 -- Analysis Expansion. This creates a structural mismatch: 2.6.x sub-sections nested under Phase 2.5 look like they belong to a "Phase 2.6" that doesn't exist anywhere in the overview table, dependency graph, or any anchor target. Anyone navigating the document by section number will be confused, and any future cross-reference written as #261----complexity-metrics- would not resolve against the parent phase.
If the intent is to keep the phase named "2.5", the sub-sections should revert to 2.5.1 – 2.5.12. If the intent is to rename Phase 2.5 to Phase 2.6 (perhaps to create room for a hypothetical future Phase 2.6 between 2.5 and 2.7), the phase header, overview table entry, dependency graph entry, and anchor target (#phase-25--analysis-expansion) all need to be updated consistently as well.
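
A mismatch of this kind is easy to lint for. The following is a minimal sketch under assumed heading shapes (`## Phase X.Y ...` for phases and `### X.Y.Z -- ...` for sub-sections, as quoted in this review); `findMisnumberedSubsections` is an illustrative helper, not an existing codegraph command.

```javascript
// Flags "### X.Y.Z -- ..." headings whose prefix differs from the enclosing
// "## Phase X.Y" heading. Headings without " -- " (e.g. "### 2.7 Summary")
// are deliberately ignored.
function findMisnumberedSubsections(markdown) {
  let currentPhase = null;
  const problems = [];
  for (const line of markdown.split("\n")) {
    const phase = line.match(/^## Phase (\d+(?:\.\d+)?)\b/);
    if (phase) {
      currentPhase = phase[1];
      continue;
    }
    // Group 1 is everything before the last numeric component.
    const sub = line.match(/^### ((?:\d+\.)*\d+)\.(\d+) -- /);
    if (sub && currentPhase !== null && sub[1] !== currentPhase) {
      problems.push({ heading: line.trim(), expectedPrefix: currentPhase });
    }
  }
  return problems;
}

// Sample built from headings quoted in this review.
const sampleDoc = [
  "## Phase 2.5 -- Analysis Expansion",
  "### 2.6.2 -- Community Detection & Drift ✅",
  "## Phase 2.7 -- Deep Analysis & Graph Enrichment",
  "### 2.7.2 -- Expanded Node Types (Phase 1) ✅",
  "### 2.7 Summary",
].join("\n");

const problems = findMisnumberedSubsections(sampleDoc);
console.log(problems);
```

On this sample the check reports only the `2.6.2` heading, with `2.5` as the expected prefix.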
Fixed — reverted all 2.6.x sub-sections back to 2.5.x to match the Phase 2.5 parent heading.
docs/roadmap/ROADMAP.md
Outdated
@@ -384,7 +384,7 @@ Define-use chain extraction tracking how data flows between functions.

  **New file:** `src/dataflow.js` (1,187 lines)

- ### 2.7.2 -- Expanded Node Types (Phase 1) ✅
+ ### 2.8.2 -- Expanded Node Types (Phase 1) ✅

  Extend the graph model with sub-declaration node kinds.

@@ -398,7 +398,7 @@ Extend the graph model with sub-declaration node kinds.

  **Affected files:** All extractors, `src/builder.js`, `src/queries.js`, `src/db.js`

- ### 2.7.3 -- Expanded Edge Types (Phase 2) ✅
+ ### 2.8.3 -- Expanded Edge Types (Phase 2) ✅

  Structural edges for richer graph relationships.

@@ -409,7 +409,7 @@ Structural edges for richer graph relationships.

  **Affected files:** `src/builder.js`, `src/queries.js`

- ### 2.7.4 -- Intraprocedural Control Flow Graph (CFG) ✅
+ ### 2.8.4 -- Intraprocedural Control Flow Graph (CFG) ✅

  Basic-block control flow graph construction from function ASTs.

@@ -424,7 +424,7 @@ Basic-block control flow graph construction from function ASTs.

  **New file:** `src/cfg.js` (1,451 lines)

- ### 2.7.5 -- Stored Queryable AST Nodes ✅
+ ### 2.8.5 -- Stored Queryable AST Nodes ✅

  Persist and query selected AST node types for pattern-based codebase exploration.

@@ -439,7 +439,7 @@ Persist and query selected AST node types for pattern-based codebase exploration

  **New file:** `src/ast.js` (392 lines)

- ### 2.7.6 -- Extractors Refactoring ✅
+ ### 2.8.6 -- Extractors Refactoring ✅

  Split per-language extractors from monolithic `parser.js` into dedicated modules.

@@ -453,7 +453,7 @@ Split per-language extractors from monolithic `parser.js` into dedicated modules

  **New directory:** `src/extractors/`

- ### 2.7.7 -- normalizeSymbol Utility ✅
+ ### 2.8.7 -- normalizeSymbol Utility ✅

  Stable JSON schema for symbol output across all query functions.

@@ -463,7 +463,7 @@ Stable JSON schema for symbol output across all query functions.

  **Affected file:** `src/queries.js`

- ### 2.7.8 -- Interactive Graph Viewer ✅
+ ### 2.8.8 -- Interactive Graph Viewer ✅

  Self-contained HTML visualization with vis-network.

@@ -480,7 +480,7 @@ Self-contained HTML visualization with vis-network.

  **New file:** `src/viewer.js` (948 lines)

- ### 2.7.9 -- Exports Command ✅
+ ### 2.8.9 -- Exports Command ✅

  Per-symbol consumer analysis for file exports.

@@ -491,7 +491,7 @@ Per-symbol consumer analysis for file exports.

  **Affected file:** `src/queries.js`

- ### 2.7.10 -- Export Format Expansion ✅
+ ### 2.8.10 -- Export Format Expansion ✅

  Three new graph export formats for external tooling integration.

@@ -501,7 +501,7 @@ Three new graph export formats for external tooling integration.

  **Affected file:** `src/export.js` (681 lines)

- ### 2.7.11 -- CLI Consolidation ✅
+ ### 2.8.11 -- CLI Consolidation ✅

  First CLI surface area reduction -- 5 commands merged into existing ones.

@@ -514,7 +514,7 @@ First CLI surface area reduction -- 5 commands merged into existing ones.

  **Affected file:** `src/cli.js`

- ### 2.7.12 -- MCP Tool Consolidation & Expansion ✅
+ ### 2.8.12 -- MCP Tool Consolidation & Expansion ✅
Same sub-section renumbering mismatch as Phase 2.5
All twelve sub-sections inside ## Phase 2.7 -- Deep Analysis & Graph Enrichment have been renumbered from 2.7.x to 2.8.x (e.g. ### 2.8.1 -- Dataflow Analysis through ### 2.8.12 -- MCP Tool Consolidation), but the parent phase heading remains ## Phase 2.7. This produces the same structural inconsistency as Phase 2.5: there is no "Phase 2.8" in the overview table or dependency graph, yet the sub-sections are prefixed as if there were. The "2.7 Summary" table immediately following also still reads ### 2.7 Summary, making it ambiguous whether the summary belongs to Phase 2.7 or Phase 2.8.
The same fix applies: either revert sub-sections to 2.7.x, or rename the phase header + all affected references to 2.8 consistently.
Fixed — reverted all 2.8.x sub-sections back to 2.7.x to match the Phase 2.7 parent heading.
docs/roadmap/ROADMAP.md
Outdated
  ### 11.1 -- Interactive Web Visualization (Partially Complete)

- > **Phase 2.7 progress:** `codegraph plot` (Phase 2.7.8) ships a self-contained HTML viewer with vis-network. It supports layout switching, color/size/cluster overlays, drill-down, community detection, and a detail panel. The remaining work is the server-based experience below.
+ > **Phase 2.7 progress:** `codegraph plot` (Phase 2.8.8) ships a self-contained HTML viewer with vis-network. It supports layout switching, color/size/cluster overlays, drill-down, community detection, and a detail panel. The remaining work is the server-based experience below.
Inconsistent phase cross-reference in Phase 11 note
The inline note reads:
> **Phase 2.7 progress:** `codegraph plot` (Phase 2.8.8) ships a self-contained HTML viewer…
This references "Phase 2.8.8" in the same sentence that attributes the work to "Phase 2.7 progress". Because the parent phase heading is ## Phase 2.7, the sub-section that ships codegraph plot is numbered ### 2.8.8 -- Interactive Graph Viewer as a result of this PR's renumbering — but there is no standalone "Phase 2.8" in the document. A reader following this note will look for a "Phase 2.8" header and find only "Phase 2.7". This should either say "Phase 2.7.8" (reverting the sub-section numbering) or be written as "sub-section 2.8.8 of Phase 2.7" with an explicit clarification, once the sub-section numbering issue above is resolved.
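
Prose references to sub-section numbers can be cross-checked against the headings that actually exist. The sketch below assumes references are written as `Phase X.Y.Z` and sub-section headings start with the bare number; `findDanglingPhaseRefs` is a hypothetical helper. The sample shows the state where sub-sections are numbered 2.7.x but the note still says 2.8.8.

```javascript
// Lists "Phase X.Y.Z" mentions that have no heading starting with "X.Y.Z".
function findDanglingPhaseRefs(markdown) {
  const defined = new Set();
  for (const m of markdown.matchAll(/^#{2,6} (\d+(?:\.\d+)+)\b/gm)) {
    defined.add(m[1]);
  }
  const dangling = [];
  for (const m of markdown.matchAll(/\bPhase (\d+\.\d+\.\d+)\b/g)) {
    if (!defined.has(m[1])) dangling.push(m[1]);
  }
  return dangling;
}

// Heading and note text taken from this review; the combination is illustrative.
const note = [
  "### 2.7.8 -- Interactive Graph Viewer ✅",
  "> **Phase 2.7 progress:** `codegraph plot` (Phase 2.8.8) ships a self-contained HTML viewer",
].join("\n");

const dangling = findDanglingPhaseRefs(note);
console.log(dangling);
```

This only catches references whose number has no heading at all; it would not flag the original mismatch, where a `2.8.8` heading existed under a Phase 2.7 parent.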
Fixed — updated the cross-reference from 2.8.8 to 2.7.8 to match the corrected sub-section numbering.
docs/roadmap/ROADMAP.md
Outdated
  ### 2.7 Summary

- | Metric | Before (v2.6.0) | After (v3.0.0) | Delta |
+ | Metric | Before (v2.7.0) | After (v3.0.0) | Delta |
"Before" version in Phase 2.7 summary now equals Phase 2.5's completion version
The 2.7 Summary table was updated to show Before (v2.7.0) as the baseline. However, the Phase 2.5 status block was also updated to shipped across v2.0.0 -> v2.7.0 in this same PR. The result is that Phase 2.5 and Phase 2.7's "before" state both reference v2.7.0, making it appear as though the two phases started and ended at the same version.
If Phase 2.5 shipped its final features in v2.7.0 and Phase 2.7 then began from that state, the "Before" column in the 2.7 Summary table should reflect the version prior to Phase 2.7 work — which is v2.7.0 only if 2.7 started immediately at that same release. Consider clarifying with a note or using a more precise baseline version to avoid the circular appearance.
The Phase 2.7 summary baseline reflects that Phase 2.5 shipped through v2.7.0 and Phase 2.7 began from that same state. Added a clarifying note to the summary table header ("v2.7.0 baseline").
Summary
Test plan