diff --git a/docs/roadmap/ROADMAP.md b/docs/roadmap/ROADMAP.md index 19231fb..c5fd25c 100644 --- a/docs/roadmap/ROADMAP.md +++ b/docs/roadmap/ROADMAP.md @@ -16,7 +16,7 @@ Codegraph is a strong local-first code graph CLI. This roadmap describes planned | [**2**](#phase-2--foundation-hardening) | Foundation Hardening | Parser registry, complete MCP, test coverage, enhanced config, multi-repo MCP | **Complete** (v1.4.0) | | [**2.5**](#phase-25--analysis-expansion) | Analysis Expansion | Complexity metrics, community detection, flow tracing, co-change, manifesto, boundary rules, check, triage, audit, batch, hybrid search | **Complete** (v2.6.0) | | [**2.7**](#phase-27--deep-analysis--graph-enrichment) | Deep Analysis & Graph Enrichment | Dataflow analysis, intraprocedural CFG, AST node storage, expanded node/edge types, extractors refactoring, CLI consolidation, interactive viewer, exports command, normalizeSymbol | **Complete** (v3.0.0) | -| [**3**](#phase-3--architectural-refactoring) | Architectural Refactoring (Vertical Slice) | Unified AST analysis framework, command/query separation, repository pattern, queries.js decomposition, composable MCP, CLI commands, domain errors, presentation layer, domain grouping, curated API, unified graph model | **In Progress** (v3.1.3) | +| [**3**](#phase-3--architectural-refactoring) | Architectural Refactoring (Vertical Slice) | Unified AST analysis framework, command/query separation, repository pattern, queries.js decomposition, composable MCP, CLI commands, domain errors, builder pipeline, presentation layer, domain grouping, curated API, unified graph model, qualified names | **In Progress** (v3.1.3) | | [**4**](#phase-4--typescript-migration) | TypeScript Migration | Project setup, core type definitions, leaf -> core -> orchestration module migration, test migration | Planned | | [**5**](#phase-5--intelligent-embeddings) | Intelligent Embeddings | LLM-generated descriptions, enhanced embeddings, build-time semantic 
metadata, module summaries | Planned | | [**6**](#phase-6--natural-language-queries) | Natural Language Queries | `ask` command, conversational sessions, LLM-narrated graph queries, onboarding tools | Planned | @@ -667,7 +667,7 @@ src/ src/ db/ connection.js # Open, WAL mode, pragma tuning - migrations.js # Schema versions (currently 13 migrations) + migrations.js # Schema versions (currently 15 migrations) query-builder.js # Lightweight SQL builder for common filtered queries repository/ index.js # Barrel re-export @@ -775,9 +775,9 @@ Reduced `index.js` from ~190 named exports (243 lines) to 48 curated exports (57 > **Removed: Decompose complexity.js** — Subsumed by 3.1. The standalone complexity decomposition from the previous revision is now part of the unified AST analysis framework (3.1). The `complexity.js` per-language rules become `ast-analysis/rules/complexity/{lang}.js` alongside CFG and dataflow rules. -### 3.8 -- Domain Error Hierarchy +### 3.8 -- Domain Error Hierarchy ✅ -Replace ad-hoc error handling (mix of thrown `Error`, returned `null`, `logger.warn()`, `process.exit(1)`) across 50 modules with structured domain errors. +Structured domain errors replace ad-hoc error handling across the codebase. 8 error classes in `src/errors.js`: `CodegraphError`, `ParseError`, `DbError`, `ConfigError`, `ResolutionError`, `EngineError`, `AnalysisError`, `BoundaryError`. The CLI catches domain errors and formats for humans; MCP returns structured `{ isError, code }` responses. ```js class CodegraphError extends Error { constructor(message, { code, file, cause }) { ... } } @@ -790,41 +790,43 @@ class AnalysisError extends CodegraphError { code = 'ANALYSIS_FAILED' } class BoundaryError extends CodegraphError { code = 'BOUNDARY_VIOLATION' } ``` -The CLI catches domain errors and formats for humans. MCP returns structured error responses. No more `process.exit()` from library code. 
+- ✅ `src/errors.js` — 8 domain error classes with `code`, `file`, `cause` fields +- ✅ CLI top-level catch formats domain errors for humans +- ✅ MCP returns structured error responses +- ✅ Domain errors adopted across config, boundaries, triage, and query modules **New file:** `src/errors.js` -### 3.9 -- Builder Pipeline Architecture +### 3.9 -- Builder Pipeline Architecture ✅ -Refactor `buildGraph()` (1,355 lines) from a mega-function into explicit, independently testable pipeline stages. Phase 2.7 added 4 opt-in stages, bringing the total to 11 core + 4 optional. +Refactored `buildGraph()` from a monolithic mega-function into explicit, independently testable pipeline stages. `src/builder.js` is now a 12-line barrel re-export. `src/builder/pipeline.js` orchestrates 9 stages via `PipelineContext`. Each stage is a separate file in `src/builder/stages/`. -```js -const pipeline = [ - // Core (always) - collectFiles, // (rootDir, config) => filePaths[] - detectChanges, // (filePaths, db) => { changed, removed, isFullBuild } - parseFiles, // (filePaths, engineOpts) => Map - insertNodes, // (symbolMap, db) => nodeIndex - resolveImports, // (symbolMap, rootDir, aliases) => importEdges[] - buildCallEdges, // (symbolMap, nodeIndex) => callEdges[] - buildClassEdges, // (symbolMap, nodeIndex) => classEdges[] - resolveBarrels, // (edges, symbolMap) => resolvedEdges[] - insertEdges, // (allEdges, db) => stats - extractASTNodes, // (fileSymbols, db) => astStats (always, post-parse) - buildStructure, // (db, fileSymbols, rootDir) => structureStats - classifyRoles, // (db) => roleStats - emitChangeJournal, // (rootDir, changes) => void - - // Opt-in (dynamic imports) - computeComplexity, // --complexity: (db, rootDir, engine) => complexityStats - buildDataflowEdges, // --dataflow: (db, fileSymbols, rootDir) => dataflowStats - buildCFGData, // --cfg: (db, fileSymbols, rootDir) => cfgStats -] +``` +src/ + builder.js # 12-line barrel re-export + builder/ + context.js # 
PipelineContext — shared state across stages + pipeline.js # Orchestrator: setup → stages → timing + helpers.js # batchInsertNodes, collectFiles, fileHash, etc. + incremental.js # Incremental build logic + stages/ + collect-files.js # Discover source files + detect-changes.js # Incremental: hash comparison, removed detection + parse-files.js # Parse via native/WASM engine + insert-nodes.js # Batch-insert nodes, children, contains/parameter_of edges + resolve-imports.js # Import resolution with aliases + build-edges.js # Call edges, class edges, barrel resolution + build-structure.js # Directory/file hierarchy + run-analyses.js # Complexity, CFG, dataflow, AST store + finalize.js # Build meta, timing, db close ``` -Watch mode reuses the same stages triggered per-file, eliminating the `watcher.js` divergence. +- ✅ `PipelineContext` shared state replaces function parameters +- ✅ 9 sequential stages, each independently testable +- ✅ `src/builder.js` reduced to barrel re-export +- ✅ Timing tracked per-stage in `ctx.timing` -**Affected files:** `src/builder.js`, `src/watcher.js` +**Affected files:** `src/builder.js` → split into `src/builder/` ### 3.10 -- Embedder Subsystem Extraction @@ -852,49 +854,70 @@ The pluggable store interface enables future O(log n) ANN search (e.g., `hnswlib **Affected files:** `src/embedder.js` -> split into `src/embeddings/` -### 3.11 -- Unified Graph Model +### 3.11 -- Unified Graph Model ✅ -Unify the four parallel graph representations (structure.js, cochange.js, communities.js, viewer.js) into a shared in-memory graph model. +Unified the four parallel graph representations into a shared in-memory `CodeGraph` model. The `src/graph/` directory contains the model, 3 builders, 6 algorithms, and 2 classifiers. Algorithms are composable — run community detection on the dependency graph, the temporal graph, or a merged graph. 
``` src/ graph/ - model.js # Shared in-memory graph (nodes + edges + metadata) + index.js # Barrel re-export + model.js # CodeGraph class: nodes Map, directed/undirected adjacency builders/ - dependency.js # Build from SQLite edges + index.js # Barrel + dependency.js # Build from SQLite call/import edges structure.js # Build from file/directory hierarchy - temporal.js # Build from git history (co-changes) + temporal.js # Build from git co-change history algorithms/ + index.js # Barrel bfs.js # Breadth-first traversal - shortest-path.js # Path finding - tarjan.js # Cycle detection + shortest-path.js # Dijkstra path finding + tarjan.js # Strongly connected components / cycle detection louvain.js # Community detection - centrality.js # Fan-in/fan-out, betweenness - clustering.js # Cohesion, coupling, density + centrality.js # Fan-in/fan-out, betweenness centrality classifiers/ - roles.js # Node role classification - risk.js # Risk scoring + index.js # Barrel + roles.js # Node role classification (hub, utility, leaf, etc.) + risk.js # Composite risk scoring ``` -Algorithms become composable -- run community detection on the dependency graph, the temporal graph, or a merged graph. 
+- ✅ `CodeGraph` in-memory model with nodes Map, successors/predecessors adjacency +- ✅ 3 builders: dependency (SQLite edges), structure (file hierarchy), temporal (git co-changes) +- ✅ 6 algorithms: BFS, shortest-path, Tarjan SCC, Louvain community, centrality +- ✅ 2 classifiers: role classification, risk scoring +- ✅ `structure.js`, `communities.js`, `cycles.js`, `triage.js`, `viewer.js` refactored to use graph model **Affected files:** `src/structure.js`, `src/cochange.js`, `src/communities.js`, `src/cycles.js`, `src/triage.js`, `src/viewer.js` -### 3.12 -- Qualified Names & Hierarchical Scoping (Partially Addressed) +### 3.12 -- Qualified Names & Hierarchical Scoping ✅ -> **Phase 2.7 progress:** `parent_id` column, `contains` edges, `parameter_of` edges, and `childrenData()` query now model one-level parent-child relationships. This addresses ~80% of the use case. +> **Phase 2.7 progress:** `parent_id` column, `contains` edges, `parameter_of` edges, and `childrenData()` query now model one-level parent-child relationships. -Remaining work -- enrich the node model with deeper scope information: +Node model enriched with `qualified_name`, `scope`, and `visibility` columns (migration v15). Enables direct lookups like "all methods of class X" via `findNodesByScope()` and qualified name resolution via `findNodeByQualifiedName()` — no edge traversal needed. ```sql -ALTER TABLE nodes ADD COLUMN qualified_name TEXT; -- 'DateHelper.format' -ALTER TABLE nodes ADD COLUMN scope TEXT; -- 'DateHelper' +ALTER TABLE nodes ADD COLUMN qualified_name TEXT; -- 'DateHelper.format', 'freeFunction.x' +ALTER TABLE nodes ADD COLUMN scope TEXT; -- 'DateHelper', null for top-level ALTER TABLE nodes ADD COLUMN visibility TEXT; -- 'public' | 'private' | 'protected' +CREATE INDEX idx_nodes_qualified_name ON nodes(qualified_name); +CREATE INDEX idx_nodes_scope ON nodes(scope); ``` -Enables queries like "all methods of class X" without traversing edges. 
The `parent_id` FK only goes one level -- deeply nested scopes (namespace > class > method > closure) aren't fully represented. `qualified_name` would allow direct lookup. - -**Affected files:** `src/db.js`, `src/extractors/`, `src/queries.js`, `src/builder.js` +- ✅ Migration v15: `qualified_name`, `scope`, `visibility` columns + indexes +- ✅ `batchInsertNodes` expanded to 9 columns (name, kind, file, line, end_line, parent_id, qualified_name, scope, visibility) +- ✅ `insert-nodes.js` computes qualified_name and scope during insertion: methods get scope from class prefix, children get `parent.child` qualified names +- ✅ Visibility extraction for all 8 language extractors: + - JS/TS: `accessibility_modifier` nodes + `#` private field detection + - Java/C#/PHP: `modifiers`/`visibility_modifier` AST nodes via shared `extractModifierVisibility()` + - Python: convention-based (`__name` → private, `_name` → protected) + - Go: capitalization convention (uppercase → public, lowercase → private) + - Rust: `visibility_modifier` child (`pub` → public, else private) +- ✅ `findNodesByScope(db, scopeName, opts)` — query by scope with optional kind/file filters +- ✅ `findNodeByQualifiedName(db, qualifiedName)` — direct lookup without edge traversal +- ✅ `childrenData()` returns `qualifiedName`, `scope`, `visibility` for parent and children +- ✅ Integration tests covering qualified_name, scope, visibility, and childrenData output + +**Affected files:** `src/db/migrations.js`, `src/db/repository/nodes.js`, `src/builder/helpers.js`, `src/builder/stages/insert-nodes.js`, `src/extractors/*.js`, `src/extractors/helpers.js`, `src/analysis/symbol-lookup.js` ### 3.13 -- Testing Pyramid with InMemoryRepository diff --git a/src/analysis/symbol-lookup.js b/src/analysis/symbol-lookup.js index fbd1ebf..78ea24f 100644 --- a/src/analysis/symbol-lookup.js +++ b/src/analysis/symbol-lookup.js @@ -209,11 +209,17 @@ export function childrenData(name, customDbPath, opts = {}) { kind: node.kind, file: 
node.file, line: node.line, + scope: node.scope || null, + visibility: node.visibility || null, + qualifiedName: node.qualified_name || null, children: children.map((c) => ({ name: c.name, kind: c.kind, line: c.line, endLine: c.end_line || null, + qualifiedName: c.qualified_name || null, + scope: c.scope || null, + visibility: c.visibility || null, })), }; }); diff --git a/src/builder/helpers.js b/src/builder/helpers.js index f2dd9d1..0ad89de 100644 --- a/src/builder/helpers.js +++ b/src/builder/helpers.js @@ -183,17 +183,17 @@ export const BATCH_CHUNK = 200; /** * Batch-insert node rows via multi-value INSERT statements. - * Each row: [name, kind, file, line, end_line, parent_id] + * Each row: [name, kind, file, line, end_line, parent_id, qualified_name, scope, visibility] */ export function batchInsertNodes(db, rows) { if (!rows.length) return; - const ph = '(?,?,?,?,?,?)'; + const ph = '(?,?,?,?,?,?,?,?,?)'; for (let i = 0; i < rows.length; i += BATCH_CHUNK) { const chunk = rows.slice(i, i + BATCH_CHUNK); const vals = []; - for (const r of chunk) vals.push(r[0], r[1], r[2], r[3], r[4], r[5]); + for (const r of chunk) vals.push(r[0], r[1], r[2], r[3], r[4], r[5], r[6], r[7], r[8]); db.prepare( - 'INSERT OR IGNORE INTO nodes (name,kind,file,line,end_line,parent_id) VALUES ' + + 'INSERT OR IGNORE INTO nodes (name,kind,file,line,end_line,parent_id,qualified_name,scope,visibility) VALUES ' + chunk.map(() => ph).join(','), ).run(...vals); } diff --git a/src/builder/stages/insert-nodes.js b/src/builder/stages/insert-nodes.js index 5007603..d024426 100644 --- a/src/builder/stages/insert-nodes.js +++ b/src/builder/stages/insert-nodes.js @@ -50,14 +50,29 @@ export async function insertNodes(ctx) { const insertAll = db.transaction(() => { // Phase 1: Batch insert all file nodes + definitions + exports + // Row format: [name, kind, file, line, end_line, parent_id, qualified_name, scope, visibility] const phase1Rows = []; for (const [relPath, symbols] of allSymbols) { - 
phase1Rows.push([relPath, 'file', relPath, 0, null, null]); + phase1Rows.push([relPath, 'file', relPath, 0, null, null, null, null, null]); for (const def of symbols.definitions) { - phase1Rows.push([def.name, def.kind, relPath, def.line, def.endLine || null, null]); + // Methods already have 'Class.method' as name — use as qualified_name. + // For methods, scope is the class portion; for top-level defs, scope is null. + const dotIdx = def.name.lastIndexOf('.'); + const scope = dotIdx !== -1 ? def.name.slice(0, dotIdx) : null; + phase1Rows.push([ + def.name, + def.kind, + relPath, + def.line, + def.endLine || null, + null, + def.name, + scope, + def.visibility || null, + ]); } for (const exp of symbols.exports) { - phase1Rows.push([exp.name, exp.kind, relPath, exp.line, null, null]); + phase1Rows.push([exp.name, exp.kind, relPath, exp.line, null, null, exp.name, null, null]); } } batchInsertNodes(db, phase1Rows); @@ -84,6 +99,7 @@ export async function insertNodes(ctx) { const defId = nodeIdMap.get(`${def.name}|${def.kind}|${def.line}`); if (!defId) continue; for (const child of def.children) { + const qualifiedName = `${def.name}.${child.name}`; childRows.push([ child.name, child.kind, @@ -91,6 +107,9 @@ export async function insertNodes(ctx) { child.line, child.endLine || null, defId, + qualifiedName, + def.name, + child.visibility || null, ]); } } diff --git a/src/db.js b/src/db.js index eb77eeb..f4de972 100644 --- a/src/db.js +++ b/src/db.js @@ -29,8 +29,10 @@ export { findImportTargets, findIntraFileCallEdges, findNodeById, + findNodeByQualifiedName, findNodeChildren, findNodesByFile, + findNodesByScope, findNodesForTriage, findNodesWithFanIn, getCallableNodes, diff --git a/src/db/migrations.js b/src/db/migrations.js index 3f0d60c..e3925cd 100644 --- a/src/db/migrations.js +++ b/src/db/migrations.js @@ -229,6 +229,17 @@ export const MIGRATIONS = [ CREATE INDEX IF NOT EXISTS idx_nodes_exported ON nodes(exported); `, }, + { + version: 15, + up: ` + ALTER TABLE 
nodes ADD COLUMN qualified_name TEXT; + ALTER TABLE nodes ADD COLUMN scope TEXT; + ALTER TABLE nodes ADD COLUMN visibility TEXT; + UPDATE nodes SET qualified_name = name WHERE qualified_name IS NULL; + CREATE INDEX IF NOT EXISTS idx_nodes_qualified_name ON nodes(qualified_name); + CREATE INDEX IF NOT EXISTS idx_nodes_scope ON nodes(scope); + `, + }, ]; export function getBuildMeta(db, key) { @@ -309,4 +320,34 @@ export function initSchema(db) { } catch { /* already exists */ } + try { + db.exec('ALTER TABLE nodes ADD COLUMN qualified_name TEXT'); + } catch { + /* already exists */ + } + try { + db.exec('ALTER TABLE nodes ADD COLUMN scope TEXT'); + } catch { + /* already exists */ + } + try { + db.exec('ALTER TABLE nodes ADD COLUMN visibility TEXT'); + } catch { + /* already exists */ + } + try { + db.exec('UPDATE nodes SET qualified_name = name WHERE qualified_name IS NULL'); + } catch { + /* nodes table may not exist yet */ + } + try { + db.exec('CREATE INDEX IF NOT EXISTS idx_nodes_qualified_name ON nodes(qualified_name)'); + } catch { + /* already exists */ + } + try { + db.exec('CREATE INDEX IF NOT EXISTS idx_nodes_scope ON nodes(scope)'); + } catch { + /* already exists */ + } } diff --git a/src/db/repository/index.js b/src/db/repository/index.js index f3f0dc8..c504140 100644 --- a/src/db/repository/index.js +++ b/src/db/repository/index.js @@ -32,8 +32,10 @@ export { countNodes, findFileNodes, findNodeById, + findNodeByQualifiedName, findNodeChildren, findNodesByFile, + findNodesByScope, findNodesForTriage, findNodesWithFanIn, getFunctionNodeId, diff --git a/src/db/repository/nodes.js b/src/db/repository/nodes.js index af4a347..17876e1 100644 --- a/src/db/repository/nodes.js +++ b/src/db/repository/nodes.js @@ -116,6 +116,7 @@ const _getNodeIdStmt = new WeakMap(); const _getFunctionNodeIdStmt = new WeakMap(); const _bulkNodeIdsByFileStmt = new WeakMap(); const _findNodeChildrenStmt = new WeakMap(); +const _findNodeByQualifiedNameStmt = new WeakMap(); /** * 
Count total nodes. @@ -239,12 +240,67 @@ export function bulkNodeIdsByFile(db, file) { * Find child nodes (parameters, properties, constants) of a parent. * @param {object} db * @param {number} parentId - * @returns {{ name: string, kind: string, line: number, end_line: number|null }[]} + * @returns {{ name: string, kind: string, line: number, end_line: number|null, qualified_name: string|null, scope: string|null, visibility: string|null }[]} */ export function findNodeChildren(db, parentId) { return cachedStmt( _findNodeChildrenStmt, db, - 'SELECT name, kind, line, end_line FROM nodes WHERE parent_id = ? ORDER BY line', + 'SELECT name, kind, line, end_line, qualified_name, scope, visibility FROM nodes WHERE parent_id = ? ORDER BY line', ).all(parentId); } + +/** Escape LIKE wildcards in a literal string segment. */ +function escapeLike(s) { + return s.replace(/[%_\\]/g, '\\$&'); +} + +/** + * Find all nodes that belong to a given scope (by scope column). + * Enables "all methods of class X" without traversing edges. + * @param {object} db + * @param {string} scopeName - The scope to search for (e.g., class name) + * @param {object} [opts] + * @param {string} [opts.kind] - Filter by node kind + * @param {string} [opts.file] - Filter by file path (LIKE match) + * @returns {object[]} + */ +export function findNodesByScope(db, scopeName, opts = {}) { + let sql = 'SELECT * FROM nodes WHERE scope = ?'; + const params = [scopeName]; + if (opts.kind) { + sql += ' AND kind = ?'; + params.push(opts.kind); + } + if (opts.file) { + sql += " AND file LIKE ? ESCAPE '\\'"; + params.push(`%${escapeLike(opts.file)}%`); + } + sql += ' ORDER BY file, line'; + return db.prepare(sql).all(...params); +} + +/** + * Find nodes by qualified name. Returns all matches since the same + * qualified_name can exist in different files (e.g., two methods named + * `DateHelper.format` in separate modules). Pass `opts.file` to narrow. 
+ * @param {object} db + * @param {string} qualifiedName - e.g., 'DateHelper.format' + * @param {object} [opts] + * @param {string} [opts.file] - Filter by file path (LIKE match) + * @returns {object[]} + */ +export function findNodeByQualifiedName(db, qualifiedName, opts = {}) { + if (opts.file) { + return db + .prepare( + "SELECT * FROM nodes WHERE qualified_name = ? AND file LIKE ? ESCAPE '\\' ORDER BY file, line", + ) + .all(qualifiedName, `%${escapeLike(opts.file)}%`); + } + return cachedStmt( + _findNodeByQualifiedNameStmt, + db, + 'SELECT * FROM nodes WHERE qualified_name = ? ORDER BY file, line', + ).all(qualifiedName); +} diff --git a/src/extractors/csharp.js b/src/extractors/csharp.js index 43231d1..9dafa45 100644 --- a/src/extractors/csharp.js +++ b/src/extractors/csharp.js @@ -1,4 +1,4 @@ -import { findChild, nodeEndLine } from './helpers.js'; +import { extractModifierVisibility, findChild, nodeEndLine } from './helpers.js'; /** * Extract symbols from C# files. @@ -133,6 +133,7 @@ export function extractCSharpSymbols(tree, _filePath) { line: node.startPosition.row + 1, endLine: nodeEndLine(node), children: params.length > 0 ? params : undefined, + visibility: extractModifierVisibility(node), }); } break; @@ -150,6 +151,7 @@ export function extractCSharpSymbols(tree, _filePath) { line: node.startPosition.row + 1, endLine: nodeEndLine(node), children: params.length > 0 ? 
params : undefined, + visibility: extractModifierVisibility(node), }); } break; @@ -165,6 +167,7 @@ export function extractCSharpSymbols(tree, _filePath) { kind: 'property', line: node.startPosition.row + 1, endLine: nodeEndLine(node), + visibility: extractModifierVisibility(node), }); } break; @@ -260,7 +263,12 @@ function extractCSharpClassFields(classNode) { if (!child || child.type !== 'variable_declarator') continue; const nameNode = child.childForFieldName('name'); if (nameNode) { - fields.push({ name: nameNode.text, kind: 'property', line: member.startPosition.row + 1 }); + fields.push({ + name: nameNode.text, + kind: 'property', + line: member.startPosition.row + 1, + visibility: extractModifierVisibility(member), + }); } } } diff --git a/src/extractors/go.js b/src/extractors/go.js index a3a5015..50460c8 100644 --- a/src/extractors/go.js +++ b/src/extractors/go.js @@ -1,4 +1,4 @@ -import { findChild, nodeEndLine } from './helpers.js'; +import { findChild, goVisibility, nodeEndLine } from './helpers.js'; /** * Extract symbols from Go files. @@ -22,6 +22,7 @@ export function extractGoSymbols(tree, _filePath) { line: node.startPosition.row + 1, endLine: nodeEndLine(node), children: params.length > 0 ? params : undefined, + visibility: goVisibility(nameNode.text), }); } break; @@ -55,6 +56,7 @@ export function extractGoSymbols(tree, _filePath) { line: node.startPosition.row + 1, endLine: nodeEndLine(node), children: params.length > 0 ? params : undefined, + visibility: goVisibility(nameNode.text), }); } break; diff --git a/src/extractors/helpers.js b/src/extractors/helpers.js index d197da7..1f8c7d2 100644 --- a/src/extractors/helpers.js +++ b/src/extractors/helpers.js @@ -9,3 +9,74 @@ export function findChild(node, type) { } return null; } + +/** + * Extract visibility from a node by scanning its children for modifier keywords. + * Works for Java, C#, PHP, and similar languages where modifiers are child nodes. 
+ * @param {object} node - tree-sitter node + * @param {Set} [modifierTypes] - node types that indicate modifiers + * @returns {'public'|'private'|'protected'|undefined} + */ +const DEFAULT_MODIFIER_TYPES = new Set([ + 'modifiers', + 'modifier', + 'visibility_modifier', + 'accessibility_modifier', +]); +const VISIBILITY_KEYWORDS = new Set(['public', 'private', 'protected']); + +/** + * Python convention: __name → private, _name → protected, else undefined. + */ +export function pythonVisibility(name) { + if (name.startsWith('__') && name.endsWith('__')) return undefined; // dunder — public + if (name.startsWith('__')) return 'private'; + if (name.startsWith('_')) return 'protected'; + return undefined; +} + +/** + * Go convention: uppercase first letter → public, lowercase → private. + */ +export function goVisibility(name) { + if (!name) return undefined; + // Strip receiver prefix (e.g., "Receiver.Method" → check "Method") + const bare = name.includes('.') ? name.split('.').pop() : name; + if (!bare) return undefined; + return bare[0] === bare[0].toUpperCase() && bare[0] !== bare[0].toLowerCase() + ? 'public' + : 'private'; +} + +/** + * Rust: check for `visibility_modifier` child (pub, pub(crate), etc.). 
+ */ +export function rustVisibility(node) { + for (let i = 0; i < node.childCount; i++) { + const child = node.child(i); + if (!child) continue; + if (child.type === 'visibility_modifier') { + return 'public'; // pub, pub(crate), pub(super) all mean "visible" + } + } + return 'private'; +} + +export function extractModifierVisibility(node, modifierTypes = DEFAULT_MODIFIER_TYPES) { + for (let i = 0; i < node.childCount; i++) { + const child = node.child(i); + if (!child) continue; + // Direct keyword match (e.g., PHP visibility_modifier = "public") + if (modifierTypes.has(child.type)) { + const text = child.text; + if (VISIBILITY_KEYWORDS.has(text)) return text; + // C# 'private protected' — accessible to derived types in same assembly → protected + if (text === 'private protected') return 'protected'; + // Compound modifiers node (Java: "public static") — scan its text for a keyword + for (const kw of VISIBILITY_KEYWORDS) { + if (text.includes(kw)) return kw; + } + } + } + return undefined; +} diff --git a/src/extractors/java.js b/src/extractors/java.js index bfa2457..2bf0bb2 100644 --- a/src/extractors/java.js +++ b/src/extractors/java.js @@ -1,4 +1,4 @@ -import { findChild, nodeEndLine } from './helpers.js'; +import { extractModifierVisibility, findChild, nodeEndLine } from './helpers.js'; /** * Extract symbols from Java files. @@ -165,6 +165,7 @@ export function extractJavaSymbols(tree, _filePath) { line: node.startPosition.row + 1, endLine: nodeEndLine(node), children: params.length > 0 ? params : undefined, + visibility: extractModifierVisibility(node), }); } break; @@ -182,6 +183,7 @@ export function extractJavaSymbols(tree, _filePath) { line: node.startPosition.row + 1, endLine: nodeEndLine(node), children: params.length > 0 ? 
params : undefined, + visibility: extractModifierVisibility(node), }); } break; @@ -267,7 +269,12 @@ function extractClassFields(classNode) { if (!child || child.type !== 'variable_declarator') continue; const nameNode = child.childForFieldName('name'); if (nameNode) { - fields.push({ name: nameNode.text, kind: 'property', line: member.startPosition.row + 1 }); + fields.push({ + name: nameNode.text, + kind: 'property', + line: member.startPosition.row + 1, + visibility: extractModifierVisibility(member), + }); } } } diff --git a/src/extractors/javascript.js b/src/extractors/javascript.js index b59c5db..06f9468 100644 --- a/src/extractors/javascript.js +++ b/src/extractors/javascript.js @@ -77,12 +77,14 @@ function extractSymbolsQuery(tree, query) { const parentClass = findParentClass(c.meth_node); const fullName = parentClass ? `${parentClass}.${methName}` : methName; const methChildren = extractParameters(c.meth_node); + const methVis = extractVisibility(c.meth_node); definitions.push({ name: fullName, kind: 'method', line: c.meth_node.startPosition.row + 1, endLine: nodeEndLine(c.meth_node), children: methChildren.length > 0 ? methChildren : undefined, + visibility: methVis, }); } else if (c.iface_node) { // interface_declaration (TS/TSX only) @@ -375,12 +377,14 @@ function extractSymbolsWalk(tree) { const parentClass = findParentClass(node); const fullName = parentClass ? `${parentClass}.${nameNode.text}` : nameNode.text; const methChildren = extractParameters(node); + const methVis = extractVisibility(node); definitions.push({ name: fullName, kind: 'method', line: node.startPosition.row + 1, endLine: nodeEndLine(node), children: methChildren.length > 0 ? 
methChildren : undefined,
+            visibility: methVis,
           });
         }
         break;
@@ -701,13 +705,46 @@ function extractClassProperties(classNode) {
           nameNode.type === 'identifier' ||
           nameNode.type === 'private_property_identifier')
       ) {
-        props.push({ name: nameNode.text, kind: 'property', line: child.startPosition.row + 1 });
+        // Private # fields: nameNode.type is 'private_property_identifier'
+        // TS modifiers: accessibility_modifier child on the field_definition
+        const vis =
+          nameNode.type === 'private_property_identifier' ? 'private' : extractVisibility(child);
+        props.push({
+          name: nameNode.text,
+          kind: 'property',
+          line: child.startPosition.row + 1,
+          visibility: vis,
+        });
       }
     }
   }
   return props;
 }
 
+/**
+ * Extract visibility modifier from a class member node.
+ * Checks for TS access modifiers (public/private/protected) and JS private (#) fields.
+ * Returns 'public' | 'private' | 'protected' | undefined.
+ */
+function extractVisibility(node) {
+  // Check for TS accessibility modifiers (accessibility_modifier child)
+  for (let i = 0; i < node.childCount; i++) {
+    const child = node.child(i);
+    if (!child) continue;
+    if (child.type === 'accessibility_modifier') {
+      const text = child.text;
+      if (text === 'private' || text === 'protected' || text === 'public') return text;
+    }
+  }
+  // Check for JS private name (# prefix) — try multiple field names
+  const nameNode =
+    node.childForFieldName('name') || node.childForFieldName('property') || node.child(0);
+  if (nameNode && nameNode.type === 'private_property_identifier') {
+    return 'private';
+  }
+  return undefined;
+}
+
 function isConstantValue(valueNode) {
   if (!valueNode) return false;
   const t = valueNode.type;
diff --git a/src/extractors/php.js b/src/extractors/php.js
index d2b4f09..fd00816 100644
--- a/src/extractors/php.js
+++ b/src/extractors/php.js
@@ -1,4 +1,4 @@
-import { findChild, nodeEndLine } from './helpers.js';
+import { extractModifierVisibility, findChild, nodeEndLine } from './helpers.js';
 
 function extractPhpParameters(fnNode) {
   const params = [];
@@ -35,6 +35,7 @@ function extractPhpClassChildren(classNode) {
           name: varNode.text,
           kind: 'property',
           line: member.startPosition.row + 1,
+          visibility: extractModifierVisibility(member),
         });
       }
     }
@@ -231,6 +232,7 @@ export function extractPHPSymbols(tree, _filePath) {
           line: node.startPosition.row + 1,
           endLine: nodeEndLine(node),
           children: params.length > 0 ? params : undefined,
+          visibility: extractModifierVisibility(node),
         });
       }
       break;
diff --git a/src/extractors/python.js b/src/extractors/python.js
index 6542aab..968dbac 100644
--- a/src/extractors/python.js
+++ b/src/extractors/python.js
@@ -1,4 +1,4 @@
-import { findChild, nodeEndLine } from './helpers.js';
+import { findChild, nodeEndLine, pythonVisibility } from './helpers.js';
 
 /**
  * Extract symbols from Python files.
@@ -30,6 +30,7 @@ export function extractPythonSymbols(tree, _filePath) {
           endLine: nodeEndLine(node),
           decorators,
           children: fnChildren.length > 0 ? fnChildren : undefined,
+          visibility: pythonVisibility(nameNode.text),
         });
       }
       break;
@@ -209,7 +210,12 @@ export function extractPythonSymbols(tree, _filePath) {
           const left = assignment.childForFieldName('left');
           if (left && left.type === 'identifier' && !seen.has(left.text)) {
             seen.add(left.text);
-            props.push({ name: left.text, kind: 'property', line: child.startPosition.row + 1 });
+            props.push({
+              name: left.text,
+              kind: 'property',
+              line: child.startPosition.row + 1,
+              visibility: pythonVisibility(left.text),
+            });
           }
         }
       }
@@ -262,7 +268,12 @@ export function extractPythonSymbols(tree, _filePath) {
             !seen.has(attr.text)
           ) {
             seen.add(attr.text);
-            props.push({ name: attr.text, kind: 'property', line: stmt.startPosition.row + 1 });
+            props.push({
+              name: attr.text,
+              kind: 'property',
+              line: stmt.startPosition.row + 1,
+              visibility: pythonVisibility(attr.text),
+            });
           }
         }
       }
diff --git a/src/extractors/rust.js b/src/extractors/rust.js
index 2a01348..705f9bd 100644
--- a/src/extractors/rust.js
+++ b/src/extractors/rust.js
@@ -1,4 +1,4 @@
-import { findChild, nodeEndLine } from './helpers.js';
+import { findChild, nodeEndLine, rustVisibility } from './helpers.js';
 
 /**
  * Extract symbols from Rust files.
@@ -37,6 +37,7 @@ export function extractRustSymbols(tree, _filePath) {
           line: node.startPosition.row + 1,
           endLine: nodeEndLine(node),
           children: params.length > 0 ? params : undefined,
+          visibility: rustVisibility(node),
         });
       }
       break;
@@ -52,6 +53,7 @@ export function extractRustSymbols(tree, _filePath) {
           line: node.startPosition.row + 1,
           endLine: nodeEndLine(node),
           children: fields.length > 0 ? fields : undefined,
+          visibility: rustVisibility(node),
         });
       }
       break;
diff --git a/tests/integration/qualified-names.test.js b/tests/integration/qualified-names.test.js
new file mode 100644
index 0000000..0ea987a
--- /dev/null
+++ b/tests/integration/qualified-names.test.js
@@ -0,0 +1,196 @@
+import fs from 'node:fs';
+import os from 'node:os';
+import path from 'node:path';
+import { afterAll, beforeAll, describe, expect, it } from 'vitest';
+import { buildGraph } from '../../src/builder.js';
+import { findNodeByQualifiedName, findNodesByScope, openReadonlyOrFail } from '../../src/db.js';
+import { pythonVisibility } from '../../src/extractors/helpers.js';
+import { childrenData } from '../../src/queries.js';
+
+// Fixture: a small project with classes, methods, and visibility modifiers
+const FIXTURE_FILES = {
+  'date-helper.js': `
+export class DateHelper {
+  #locale;
+
+  constructor(locale) {
+    this.#locale = locale;
+  }
+
+  format(date) {
+    return date.toLocaleDateString(this.#locale);
+  }
+
+  static now() {
+    return new Date();
+  }
+}
+
+export function freeFunction(x) {
+  return x + 1;
+}
+`,
+  'math-utils.js': `
+export class MathUtils {
+  static PI = 3.14159;
+
+  static add(a, b) {
+    return a + b;
+  }
+
+  static multiply(a, b) {
+    return a * b;
+  }
+}
+`,
+};
+
+let tmpDir;
+let dbPath;
+
+beforeAll(async () => {
+  tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), 'cg-qualified-'));
+  for (const [file, content] of Object.entries(FIXTURE_FILES)) {
+    fs.writeFileSync(path.join(tmpDir, file), content);
+  }
+  // package.json so codegraph sees it as a project
+  fs.writeFileSync(path.join(tmpDir, 'package.json'), '{"name":"test"}');
+  await buildGraph(tmpDir);
+  dbPath = path.join(tmpDir, '.codegraph', 'graph.db');
+});
+
+afterAll(() => {
+  fs.rmSync(tmpDir, { recursive: true, force: true });
+});
+
+describe('qualified_name column', () => {
+  it('methods have qualified_name matching their full name', () => {
+    const db = openReadonlyOrFail(dbPath);
+    try {
+      const nodes = findNodeByQualifiedName(db, 'DateHelper.format');
+      expect(nodes.length).toBe(1);
+      expect(nodes[0].name).toBe('DateHelper.format');
+      expect(nodes[0].kind).toBe('method');
+      expect(nodes[0].qualified_name).toBe('DateHelper.format');
+    } finally {
+      db.close();
+    }
+  });
+
+  it('top-level functions have qualified_name equal to name', () => {
+    const db = openReadonlyOrFail(dbPath);
+    try {
+      const nodes = findNodeByQualifiedName(db, 'freeFunction');
+      expect(nodes.length).toBe(1);
+      expect(nodes[0].name).toBe('freeFunction');
+      expect(nodes[0].qualified_name).toBe('freeFunction');
+    } finally {
+      db.close();
+    }
+  });
+
+  it('child nodes have qualified_name = parent.child', () => {
+    const db = openReadonlyOrFail(dbPath);
+    try {
+      // Parameters of freeFunction should have qualified_name 'freeFunction.x'
+      const nodes = findNodeByQualifiedName(db, 'freeFunction.x');
+      expect(nodes.length).toBe(1);
+      expect(nodes[0].kind).toBe('parameter');
+      expect(nodes[0].scope).toBe('freeFunction');
+    } finally {
+      db.close();
+    }
+  });
+});
+
+describe('scope column', () => {
+  it('methods have scope set to their parent class', () => {
+    const db = openReadonlyOrFail(dbPath);
+    try {
+      const nodes = findNodesByScope(db, 'DateHelper');
+      expect(nodes.length).toBeGreaterThan(0);
+      const names = nodes.map((n) => n.name);
+      expect(names).toContain('DateHelper.format');
+      expect(names).toContain('DateHelper.constructor');
+    } finally {
+      db.close();
+    }
+  });
+
+  it('findNodesByScope with kind filter returns only matching kinds', () => {
+    const db = openReadonlyOrFail(dbPath);
+    try {
+      const methods = findNodesByScope(db, 'MathUtils', { kind: 'method' });
+      for (const m of methods) {
+        expect(m.kind).toBe('method');
+      }
+      expect(methods.length).toBeGreaterThan(0);
+    } finally {
+      db.close();
+    }
+  });
+
+  it('top-level functions have null scope', () => {
+    const db = openReadonlyOrFail(dbPath);
+    try {
+      const nodes = findNodeByQualifiedName(db, 'freeFunction');
+      expect(nodes[0].scope).toBeNull();
+    } finally {
+      db.close();
+    }
+  });
+});
+
+describe('visibility column', () => {
+  it('python dunder methods are not marked as protected', () => {
+    // pythonVisibility('__init__') should return undefined, not 'protected'
+    expect(pythonVisibility('__init__')).toBeUndefined();
+    expect(pythonVisibility('__str__')).toBeUndefined();
+    expect(pythonVisibility('__len__')).toBeUndefined();
+    // But true name-mangled privates should still be private
+    expect(pythonVisibility('__secret')).toBe('private');
+    expect(pythonVisibility('_protected')).toBe('protected');
+    expect(pythonVisibility('public_method')).toBeUndefined();
+  });
+
+  it('private # fields are marked as private (WASM engine)', async () => {
+    // Visibility extraction requires the WASM engine (JS extractor).
+    // The native engine doesn't populate visibility yet.
+    const wasmDir = fs.mkdtempSync(path.join(os.tmpdir(), 'cg-vis-wasm-'));
+    try {
+      for (const [file, content] of Object.entries(FIXTURE_FILES)) {
+        fs.writeFileSync(path.join(wasmDir, file), content);
+      }
+      fs.writeFileSync(path.join(wasmDir, 'package.json'), '{"name":"test"}');
+      await buildGraph(wasmDir, { engine: 'wasm' });
+      const wasmDbPath = path.join(wasmDir, '.codegraph', 'graph.db');
+      const db = openReadonlyOrFail(wasmDbPath);
+      try {
+        const nodes = findNodesByScope(db, 'DateHelper', { kind: 'property' });
+        const locale = nodes.find((n) => n.name === '#locale');
+        expect(locale).toBeDefined();
+        expect(locale.visibility).toBe('private');
+      } finally {
+        db.close();
+      }
+    } finally {
+      fs.rmSync(wasmDir, { recursive: true, force: true });
+    }
+  });
+});
+
+describe('childrenData exposes new columns', () => {
+  it('childrenData returns scope and visibility for children', () => {
+    const result = childrenData('DateHelper', dbPath, { kind: 'class' });
+    expect(result.results.length).toBeGreaterThan(0);
+    const cls = result.results[0];
+    expect(cls.qualifiedName).toBe('DateHelper');
+    expect(cls.children.length).toBeGreaterThan(0);
+    for (const child of cls.children) {
+      expect(child).toHaveProperty('scope');
+      expect(child).toHaveProperty('visibility');
+      expect(child).toHaveProperty('qualifiedName');
+      expect(child.scope).toBe('DateHelper');
+    }
+  });
+});
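
Reviewer note: the `visibility column` test above pins down the naming convention that `pythonVisibility` is expected to follow, but the helper's body in `src/extractors/helpers.js` is not part of this excerpt. The sketch below is one possible implementation that satisfies those six assertions. The name `pythonVisibilitySketch` is hypothetical, and the real helper may be written differently.

```javascript
// Hypothetical stand-in for pythonVisibility (assumption: the real helper
// in src/extractors/helpers.js is not shown in this diff).
function pythonVisibilitySketch(name) {
  // Dunder names such as __init__ are special methods, not private members.
  if (name.startsWith('__') && name.endsWith('__')) return undefined;
  // A leading double underscore without the trailing pair is name-mangled.
  if (name.startsWith('__')) return 'private';
  // A single leading underscore is the conventional "protected" marker.
  if (name.startsWith('_')) return 'protected';
  // Everything else carries no explicit visibility.
  return undefined;
}
```

The dunder check has to run before the `__` prefix check; otherwise `__init__` would be classified as private, which is the regression the test guards against.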