[BUG] Intra-file CALLS edges incorrectly marked as external due to missing target ID resolution #2

@schneidermr

Describe the bug
All intra-file CALLS edges are misclassified as external even when the target symbol is defined in the same file. For example, a function that calls oauth2_scheme (a module-level variable defined 3 lines above in the same file) produces a CALLS edge with to_node_type: external instead of resolving to the actual LexicalNode.

Root cause: _extract_call_edges() in _ast_utils.py (line 371) stores the bare callee name string as target_id (e.g. "oauth2_scheme"). LexicalNode.make_id() produces structured IDs of the form var:a1b2c3d4e5f6 (sha1 of {tenant_id}:{repo_id}:{file}:{name}:{node_type}). When Neo4jWriter.write_edges() runs MERGE (tgt:LexicalNode {node_id: "oauth2_scheme"}), no match is found and Neo4j creates a new stub node with node_type = 'external'. The resolver that rewrites CALLS target_id values to proper node_id hashes (_resolve_call_targets(), present in the main codesteward agent) was never ported to codesteward-graph.
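
The mismatch can be illustrated with a minimal, self-contained sketch. The `make_id` function below is a hypothetical stand-in for `LexicalNode.make_id()`: the hash input follows the format quoted above, but the 12-character truncation and the explicit `prefix` argument are assumptions based on the example ID. `merge_target` only mimics the create-if-missing behaviour of `MERGE` with a dict and is not the real `Neo4jWriter` code.

```python
import hashlib


def make_id(tenant_id: str, repo_id: str, file: str, name: str,
            node_type: str, prefix: str) -> str:
    """Hypothetical stand-in for LexicalNode.make_id().

    Assumes the ID is '<prefix>:<first 12 hex chars of sha1>'.
    """
    raw = f"{tenant_id}:{repo_id}:{file}:{name}:{node_type}"
    return f"{prefix}:{hashlib.sha1(raw.encode('utf-8')).hexdigest()[:12]}"


def merge_target(index: dict, target_id: str) -> str:
    """Mimic MERGE: reuse the node if it exists, else create an 'external' stub."""
    return index.setdefault(target_id, "external")


# The node that the parser emitted for the module-level variable.
node_id = make_id("t", "r", "auth.py", "oauth2_scheme", "variable", "var")
index = {node_id: "variable"}

# The CALLS edge carries the bare name, so the lookup misses the real node
# and a second, external stub node is created instead.
bare_result = merge_target(index, "oauth2_scheme")

# With the structured ID the existing node is matched.
resolved_result = merge_target(index, node_id)
```

Here `bare_result` ends up as `"external"` and `index` holds two entries, which mirrors the stub node observed in the graph.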

To reproduce

  1. Create a Python file with a module-level variable and a function that references it:
# auth.py
from fastapi import Depends
from fastapi.security import OAuth2PasswordBearer

oauth2_scheme = OAuth2PasswordBearer(tokenUrl="token")

async def get_current_user(token: str = Depends(oauth2_scheme)):
    ...
  2. Build the graph: graph_rebuild(repo_path=..., tenant_id="t", repo_id="r")
  3. Query referential edges for get_current_user:
    codebase_graph_query(query_type="referential", query="get_current_user", ...)
  4. Observe that the CALLS edge to oauth2_scheme has to_node_type: external and to_file: null, even though oauth2_scheme is defined in the same file.

Expected behavior
The CALLS edge should point to the LexicalNode for oauth2_scheme defined in auth.py. to_node_type should be variable and to_file should be auth.py.
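
A regression check for this expectation might look like the sketch below. The edge dictionaries are an assumed shape: `to_node_type` and `to_file` come from the query output described above, while `edge_type` and `to_name` are hypothetical field names chosen for illustration and may not match the real response.

```python
def misclassified_calls(edges, local_names):
    """Return CALLS edges that target a locally defined name
    but are still reported as external or without a file."""
    return [
        e for e in edges
        if e.get("edge_type") == "CALLS"
        and e.get("to_name") in local_names
        and (e.get("to_node_type") == "external" or e.get("to_file") is None)
    ]


# Edge as currently observed (bug) and as expected after the fix.
buggy = [{"edge_type": "CALLS", "to_name": "oauth2_scheme",
          "to_node_type": "external", "to_file": None}]
fixed = [{"edge_type": "CALLS", "to_name": "oauth2_scheme",
          "to_node_type": "variable", "to_file": "auth.py"}]

buggy_hits = misclassified_calls(buggy, {"oauth2_scheme"})
fixed_hits = misclassified_calls(fixed, {"oauth2_scheme"})
```

With the bug present the helper returns the offending edge; once targets are resolved it should return an empty list.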

Environment
codesteward version: codesteward-graph 0.2.2 / codesteward-mcp 0.2.2
Backend: Neo4j (also reproducible in stub mode — incorrect target_id is visible in raw ParseResult)
Transport: any
OS: any

Additional context
The fix requires a post-parse resolution pass in GraphBuilder.build_graph() (after all files are parsed, before write_edges()). The pass should:

  1. Build a name → node_id map from all collected LexicalNodes
  2. Rewrite any CALLS edge whose target_id is a bare name string (i.e., does not start with a type prefix such as fn:, var:, or cls:) to the corresponding resolved node_id
  3. Leave unresolved edges (genuinely external symbols) unchanged

This resolver already exists as _resolve_call_targets() in the main codesteward agent and can be ported directly to codesteward-graph. The issue affects all languages, not just Python: any call to a same-file symbol is falsely marked external.
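
The three steps above can be sketched as follows. This is not the actual `_resolve_call_targets()` implementation, which is not shown in this issue: the `Node` and `Edge` dataclasses, their field names, and the prefix tuple are hypothetical. The sketch also keys the map by `(file, name)` rather than by name alone, an assumption made here so that same-named symbols in different files do not collide.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Node:            # hypothetical stand-in for LexicalNode
    node_id: str
    name: str
    file: str


@dataclass(frozen=True)
class Edge:            # hypothetical stand-in for a parsed edge
    edge_type: str
    source_id: str
    target_id: str
    file: str          # file in which the call site appears


# Assumed set of structured-ID prefixes; extend as needed.
ID_PREFIXES = ("fn:", "var:", "cls:")


def resolve_call_targets(nodes, edges):
    """Rewrite bare-name CALLS targets to node_ids of same-file symbols."""
    # Step 1: build the (file, name) -> node_id map.
    by_file_and_name = {(n.file, n.name): n.node_id for n in nodes}

    resolved = []
    for e in edges:
        # Step 2: only CALLS edges with a bare (unprefixed) target qualify.
        is_bare = e.edge_type == "CALLS" and not e.target_id.startswith(ID_PREFIXES)
        node_id = by_file_and_name.get((e.file, e.target_id)) if is_bare else None
        # Step 3: edges without a match (external symbols) are left unchanged.
        resolved.append(replace(e, target_id=node_id) if node_id else e)
    return resolved


nodes = [
    Node("var:a1b2c3d4e5f6", "oauth2_scheme", "auth.py"),
    Node("fn:0123456789ab", "get_current_user", "auth.py"),
]
edges = [
    Edge("CALLS", "fn:0123456789ab", "oauth2_scheme", "auth.py"),  # same-file symbol
    Edge("CALLS", "fn:0123456789ab", "Depends", "auth.py"),        # external symbol
]
result = resolve_call_targets(nodes, edges)
```

In this example the first edge is rewritten to `var:a1b2c3d4e5f6`, while the `Depends` edge keeps its bare name and would still be written as an external stub.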

Labels: bug
