Skip to content
This repository was archived by the owner on Mar 10, 2026. It is now read-only.

Add C# language support#107

Merged
gkorland merged 4 commits intomainfrom
csharp
Feb 19, 2026
Merged

Add C# language support#107
gkorland merged 4 commits intomainfrom
csharp

Conversation

@gkorland
Copy link
Contributor

@gkorland gkorland commented Feb 18, 2026

  • Add tree-sitter-c-sharp dependency to pyproject.toml
  • Create CSharpAnalyzer in api/analyzers/csharp/ supporting: classes, interfaces, enums, structs, methods, constructors
  • Register .cs extension in source_analyzer.py with LSP config
  • Add C# test source file and unit test
  • Update README to list C# as supported language

Summary by CodeRabbit

  • New Features

    • Added C# language support for comprehensive source analysis (types, members, inheritance, method usage).
  • Documentation

    • Updated displayed supported languages to list Python, Java, and C#.
  • Tests

    • Added unit tests exercising the C# analysis workflow and validating discovered entities and relationships.
  • Refactor

    • Improved symbol resolution to consistently return multiple resolved entities for more accurate cross-references.

- Add tree-sitter-c-sharp dependency to pyproject.toml
- Create CSharpAnalyzer in api/analyzers/csharp/ supporting:
  classes, interfaces, enums, structs, methods, constructors
- Register .cs extension in source_analyzer.py with LSP config
- Add C# test source file and unit test
- Update README to list C# as supported language

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@vercel
Copy link

vercel bot commented Feb 18, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
code-graph-backend Ready Ready Preview, Comment Feb 18, 2026 9:07pm

Request Review

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 18, 2026

📝 Walkthrough

Walkthrough

Adds comprehensive C# support: new Tree-sitter–based CSharpAnalyzer, integrates .cs into source collection and language-server startup, adds tree-sitter-c-sharp dependency, and includes C# sample code plus tests to validate extracted entities and relationships.

Changes

Cohort / File(s) Summary
Docs & Project Config
README.md, pyproject.toml
Updated supported languages text (now "Python, Java, and C#") and added tree-sitter-c-sharp>=0.23.1,<0.24.0.
C# Analyzer Implementation
api/analyzers/csharp/analyzer.py
New CSharpAnalyzer class: Tree-sitter integration, query/capture helpers, entity label/name/docstring extraction, dependency handling (dotnet restore guard), symbol extraction (types, interfaces, bases, methods, parameters, calls), and resolve_type/resolve_method/resolve_symbol returning lists of Entity.
Analyzer Core API
api/analyzers/analyzer.py
Changed AbstractAnalyzer.resolve_symbol signature and docstring to return list[Entity] instead of a single Entity.
Language Integrations
api/analyzers/source_analyzer.py
Registered .cs analyzer, include *.cs in source collection, and conditionally initialize/start a C# language server (MultilspyConfig "csharp") alongside other LS instances.
Java/Python Analyzer Adjustments
api/analyzers/java/analyzer.py, api/analyzers/python/analyzer.py
Updated resolve_symbol signatures to return list[Entity] and adjusted implementations to consistently return lists.
Tests & Samples
tests/source_files/csharp/Program.cs, tests/test_csharp_analyzer.py
Added sample C# source (ILogger, ConsoleLogger, Task) and unit tests asserting presence of interface/class/method/constructor nodes and file→defines relationships in analysis graph.
Misc (startup/test infra)
...
Second-pass language-server startup and analyzer registry updated to include .cs; test assets added to exercise new analyzer.

Sequence Diagram(s)

sequenceDiagram
    participant User as User
    participant SourceAnalyzer as SourceAnalyzer
    participant CSharpAnalyzer as CSharpAnalyzer
    participant TreeSitter as TreeSitter
    participant LanguageServer as LanguageServer
    participant Graph as Graph

    User->>SourceAnalyzer: analyze_sources(path)
    SourceAnalyzer->>SourceAnalyzer: collect *.cs files
    SourceAnalyzer->>LanguageServer: init/start C# LS (MultilspyConfig)
    SourceAnalyzer->>CSharpAnalyzer: analyze(file)
    CSharpAnalyzer->>TreeSitter: parse file -> AST
    TreeSitter-->>CSharpAnalyzer: AST nodes
    CSharpAnalyzer->>CSharpAnalyzer: extract entities (types, methods, docs)
    CSharpAnalyzer->>LanguageServer: resolve_type / resolve_method
    LanguageServer-->>CSharpAnalyzer: resolved symbols
    CSharpAnalyzer->>SourceAnalyzer: enriched entities
    SourceAnalyzer->>Graph: create nodes & relationships
    Graph-->>User: analysis results available
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I hopped in code where C# grows,
Trees of syntax, dotted rows.
Servers wake and join the chase,
I nibble lines with joyful pace—
carrots and tests left in my trace. 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 15.38% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Add C# language support' is concise and clearly summarizes the main change: adding support for C# as a new supported language alongside Python and Java.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch csharp

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

else:
lsps[".py"] = NullLanguageServer()
with lsps[".java"].start_server(), lsps[".py"].start_server():
if any(path.rglob('*.cs')):

Check failure

Code scanning / CodeQL

Uncontrolled data used in path expression High

This path depends on a
user-provided value
.
This path depends on a
user-provided value
.

def analyze_sources(self, path: Path, ignore: list[str], graph: Graph) -> None:
files = list(path.rglob("*.java")) + list(path.rglob("*.py"))
files = list(path.rglob("*.java")) + list(path.rglob("*.py")) + list(path.rglob("*.cs"))

Check failure

Code scanning / CodeQL

Uncontrolled data used in path expression High

This path depends on a
user-provided value
.
This path depends on a
user-provided value
.

def analyze_sources(self, path: Path, ignore: list[str], graph: Graph) -> None:
files = list(path.rglob("*.java")) + list(path.rglob("*.py"))
files = list(path.rglob("*.java")) + list(path.rglob("*.py")) + list(path.rglob("*.cs"))

Check failure

Code scanning / CodeQL

Uncontrolled data used in path expression High

This path depends on a
user-provided value
.
This path depends on a
user-provided value
.

def analyze_sources(self, path: Path, ignore: list[str], graph: Graph) -> None:
files = list(path.rglob("*.java")) + list(path.rglob("*.py"))
files = list(path.rglob("*.java")) + list(path.rglob("*.py")) + list(path.rglob("*.cs"))

Check failure

Code scanning / CodeQL

Uncontrolled data used in path expression High

This path depends on a
user-provided value
.
This path depends on a
user-provided value
.
- Remove unused 'import os' from csharp/analyzer.py
- Replace 'from ...entities import *' with explicit imports
- Remove unused 'from pathlib import Path' from test_csharp_analyzer.py

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Nitpick comments (3)
api/analyzers/csharp/analyzer.py (1)

114-122: else branch in resolve_method is a no-op — dead code.

When func_node is neither member_access_expression nor identifier, the else branch reassigns func_node to node.child_by_field_name('function') — the exact value func_node was just set to on line 114. The assignment is redundant and the branch can be removed.

♻️ Proposed fix
         func_node = node.child_by_field_name('function')
         if func_node and func_node.type == 'member_access_expression':
             func_node = func_node.child_by_field_name('name')
         elif func_node and func_node.type == 'identifier':
             pass
-        else:
-            func_node = node.child_by_field_name('function')
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/analyzers/csharp/analyzer.py` around lines 114 - 122, In resolve_method,
the else branch that reassigns func_node to node.child_by_field_name('function')
is redundant because func_node was already set to that; remove that else branch
and its assignment so the logic becomes: get func_node =
node.child_by_field_name('function'), if its type is 'member_access_expression'
replace it with its child 'name', elif it's 'identifier' leave it, otherwise do
nothing; retain the subsequent "if func_node: node = func_node" behavior
(references: resolve_method, func_node, node.child_by_field_name('function'),
member_access_expression, identifier).
tests/test_csharp_analyzer.py (2)

52-60: Consider asserting the IMPLEMENTS relationship between ConsoleLogger and ILogger.

The test covers DEFINES and entity detection but omits verifying the core IMPLEMENTS/EXTENDS relationship resolution — the primary value of add_symbols and resolve_symbol. A CI-green test that doesn't cover this could mask LSP resolution failures silently.

✅ Suggested addition
# Verify ConsoleLogger implements ILogger
q = "MATCH (c:Class {name: 'ConsoleLogger'})-[:IMPLEMENTS]->(i:Interface {name: 'ILogger'}) RETURN c, i LIMIT 1"
res = g._query(q).result_set
self.assertEqual(len(res), 1, "ConsoleLogger should implement ILogger")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_csharp_analyzer.py` around lines 52 - 60, Add an assertion that
verifies the IMPLEMENTS relationship between ConsoleLogger and ILogger: run a
Cypher query similar to the existing checks (e.g., use g._query with "MATCH
(c:Class {name: 'ConsoleLogger'})-[:IMPLEMENTS]->(i:Interface {name: 'ILogger'})
RETURN c, i LIMIT 1") and assert the query returns the expected row (e.g.,
assert len(result_set) == 1 or that result_set[0] exists) so the test ensures
resolve_symbol/add_symbols correctly records the implements relationship for
ConsoleLogger -> ILogger.

8-24: Add teardown to clean up the "csharp" graph after the test.

Graph("csharp") is created but never cleaned up. If the underlying FalkorDB instance persists data across runs, a subsequent test execution will find pre-populated entities and the assertions may pass vacuously even if the analyzer is broken.

♻️ Suggested fix
 class Test_CSharp_Analyzer(unittest.TestCase):
+    def tearDown(self):
+        try:
+            Graph("csharp").delete()
+        except Exception:
+            pass
+
     def test_analyzer(self):
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_csharp_analyzer.py` around lines 8 - 24, Add a tearDown to
Test_CSharp_Analyzer that cleans up the "csharp" graph after the test: implement
def tearDown(self): g = Graph("csharp"); call the Graph cleanup method provided
by your Graph API (e.g., g.delete(), g.clear(), or g.delete_graph()) to remove
all entities so the graph is empty between test runs; ensure the cleanup runs
regardless of test outcome so analyzer.analyze_local_folder(path, g) does not
leave persistent state.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@api/analyzers/csharp/analyzer.py`:
- Line 6: Replace the wildcard import in analyzer.py with explicit imports of
the entities you actually use (e.g., Entity, File, Symbol, Reference, Location —
list the exact class/variable names referenced in this module) to resolve Ruff
F403/F405; update the top import line from "from ...entities import *" to "from
...entities import Entity, File, Symbol, ..." matching the symbols used by
functions/methods in this file (refer to usages in analyzer.py such as any
constructor calls or type hints that reference Entity/File/Symbol) so static
analyzers can verify definitions.
- Line 50: Guard the call to node.child_by_field_name('name') before accessing
.text.decode('utf-8'): capture the result (e.g., name_node =
node.child_by_field_name('name')), return a safe default (None or empty string
consistent with the module) if name_node is None, otherwise decode
name_node.text.decode('utf-8'); update the site where the expression appears to
avoid AttributeError and keep behavior consistent with other analyzers.
- Around line 27-30: The current condition uses Path.glob(...) which returns
generators that are always truthy, causing subprocess.run(["dotnet", "restore"],
...) to run unconditionally; change the guard to actually test for matches (for
example use any(Path(path).glob("*.csproj")) or any(Path(path).glob("*.sln")) or
wrap the glob in list(...) and check truthiness) so dotnet restore only runs
when a .csproj or .sln is present; keep the existing temp_deps_cs directory
check and call to subprocess.run unchanged except for using the corrected
condition.
- Line 131: Update the return type annotation of resolve_symbol from "-> Entity"
to "-> list[Entity]" in api/analyzers/csharp/analyzer.py (the resolve_symbol
method) to match resolve_type and resolve_method and the usage in
Entity.resolved_symbol; also update the corresponding abstract declaration
AbstractAnalyzer.resolve_symbol to return list[Entity] so the signatures align
and mypy passes.

In `@api/analyzers/source_analyzer.py`:
- Around line 141-145: The rglob call uses the user-controlled variable path
directly, allowing path traversal; before any use of path.rglob (including the
C# block that creates SyncLanguageServer via SyncLanguageServer.create and
MultilspyConfig.from_dict), resolve and validate the incoming path in the entry
routines (e.g., analyze_local_folder/analyze_sources): convert to a Path and
call .resolve(), then ensure the resolved path is within an allowed base/root
(reject or normalize paths that escape the root, e.g., via ..), and only then
pass the safe resolved path to path.rglob and to SyncLanguageServer.create;
apply the same validation for the Java/Python blocks that also use path.rglob.

---

Duplicate comments:
In `@api/analyzers/source_analyzer.py`:
- Line 176: The rglob calls in analyze_sources (files =
list(path.rglob("*.java")) + list(path.rglob("*.py")) +
list(path.rglob("*.cs"))) use an unvalidated Path variable and trigger path
traversal findings; resolve and validate the incoming path once (e.g., call path
= path.resolve(strict=False) and ensure it is under an allowed root or repo root
/ whitelist) at the upstream site where path is created (the earlier code around
the path derivation) and then use the validated/resolved Path for the rglob
calls (resolved_path.rglob("*.java"), etc.) so all three rglob usages are fixed
together.

---

Nitpick comments:
In `@api/analyzers/csharp/analyzer.py`:
- Around line 114-122: In resolve_method, the else branch that reassigns
func_node to node.child_by_field_name('function') is redundant because func_node
was already set to that; remove that else branch and its assignment so the logic
becomes: get func_node = node.child_by_field_name('function'), if its type is
'member_access_expression' replace it with its child 'name', elif it's
'identifier' leave it, otherwise do nothing; retain the subsequent "if
func_node: node = func_node" behavior (references: resolve_method, func_node,
node.child_by_field_name('function'), member_access_expression, identifier).

In `@tests/test_csharp_analyzer.py`:
- Around line 52-60: Add an assertion that verifies the IMPLEMENTS relationship
between ConsoleLogger and ILogger: run a Cypher query similar to the existing
checks (e.g., use g._query with "MATCH (c:Class {name:
'ConsoleLogger'})-[:IMPLEMENTS]->(i:Interface {name: 'ILogger'}) RETURN c, i
LIMIT 1") and assert the query returns the expected row (e.g., assert
len(result_set) == 1 or that result_set[0] exists) so the test ensures
resolve_symbol/add_symbols correctly records the implements relationship for
ConsoleLogger -> ILogger.
- Around line 8-24: Add a tearDown to Test_CSharp_Analyzer that cleans up the
"csharp" graph after the test: implement def tearDown(self): g =
Graph("csharp"); call the Graph cleanup method provided by your Graph API (e.g.,
g.delete(), g.clear(), or g.delete_graph()) to remove all entities so the graph
is empty between test runs; ensure the cleanup runs regardless of test outcome
so analyzer.analyze_local_folder(path, g) does not leave persistent state.

from pathlib import Path

from multilspy import SyncLanguageServer
from ...entities import *
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

Replace star import with explicit imports.

from ...entities import * prevents static analysis from verifying that Entity, File, Symbol, etc. are actually defined — exactly what Ruff F403/F405 is reporting. Other analyzers in the project use the same pattern, but this is the right place to break the habit for a new file.

♻️ Proposed fix
-from ...entities import *
+from ...entities.entity import Entity
+from ...entities.file import File
🧰 Tools
🪛 Ruff (0.15.1)

[error] 6-6: from ...entities import * used; unable to detect undefined names

(F403)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/analyzers/csharp/analyzer.py` at line 6, Replace the wildcard import in
analyzer.py with explicit imports of the entities you actually use (e.g.,
Entity, File, Symbol, Reference, Location — list the exact class/variable names
referenced in this module) to resolve Ruff F403/F405; update the top import line
from "from ...entities import *" to "from ...entities import Entity, File,
Symbol, ..." matching the symbols used by functions/methods in this file (refer
to usages in analyzer.py such as any constructor calls or type hints that
reference Entity/File/Symbol) so static analyzers can verify definitions.

res.append(file.entities[method_dec])
return res

def resolve_symbol(self, files: dict[Path, File], lsp: SyncLanguageServer, file_path: Path, path: Path, key: str, symbol: Node) -> Entity:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Return type annotation -> Entity is incorrect; should be -> list[Entity].

Both resolve_type and resolve_method return list[Entity], and Entity.resolved_symbol (in entity.py) iterates the result with for resolved_symbol in f(key, symbol.symbol) — confirming the list contract. The annotation mismatch will cause mypy errors and mislead callers. The same fix is needed in the abstract base class AbstractAnalyzer.resolve_symbol.

🐛 Proposed fix
-    def resolve_symbol(self, files: dict[Path, File], lsp: SyncLanguageServer, file_path: Path, path: Path, key: str, symbol: Node) -> Entity:
+    def resolve_symbol(self, files: dict[Path, File], lsp: SyncLanguageServer, file_path: Path, path: Path, key: str, symbol: Node) -> list[Entity]:
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def resolve_symbol(self, files: dict[Path, File], lsp: SyncLanguageServer, file_path: Path, path: Path, key: str, symbol: Node) -> Entity:
def resolve_symbol(self, files: dict[Path, File], lsp: SyncLanguageServer, file_path: Path, path: Path, key: str, symbol: Node) -> list[Entity]:
🧰 Tools
🪛 Ruff (0.15.1)

[error] 131-131: File may be undefined, or defined from star imports

(F405)


[error] 131-131: Entity may be undefined, or defined from star imports

(F405)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/analyzers/csharp/analyzer.py` at line 131, Update the return type
annotation of resolve_symbol from "-> Entity" to "-> list[Entity]" in
api/analyzers/csharp/analyzer.py (the resolve_symbol method) to match
resolve_type and resolve_method and the usage in Entity.resolved_symbol; also
update the corresponding abstract declaration AbstractAnalyzer.resolve_symbol to
return list[Entity] so the signatures align and mypy passes.

Comment on lines +141 to +145
if any(path.rglob('*.cs')):
config = MultilspyConfig.from_dict({"code_language": "csharp"})
lsps[".cs"] = SyncLanguageServer.create(config, logger, str(path))
else:
lsps[".cs"] = NullLanguageServer()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

CodeQL: path traversal via user-controlled path in rglob.

path flows from the API request body into path.rglob('*.cs') without sanitization. A crafted path (e.g., containing ../) could cause the scanner to traverse outside the intended directory. CodeQL flagged this as a security failure. The pre-existing Java/Python blocks at lines 131 and 136 carry the same risk, but this new block adds another instance.

Resolve and validate path against an allowed base before using it in any rglob call. For example:

🔒 Suggested mitigation pattern
# In analyze_local_folder or analyze_sources, resolve and validate path early:
resolved = Path(path).resolve()
# Optionally assert it's within an allowed root
🧰 Tools
🪛 GitHub Check: CodeQL

[failure] 141-141: Uncontrolled data used in path expression
This path depends on a user-provided value.
This path depends on a user-provided value.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/analyzers/source_analyzer.py` around lines 141 - 145, The rglob call uses
the user-controlled variable path directly, allowing path traversal; before any
use of path.rglob (including the C# block that creates SyncLanguageServer via
SyncLanguageServer.create and MultilspyConfig.from_dict), resolve and validate
the incoming path in the entry routines (e.g.,
analyze_local_folder/analyze_sources): convert to a Path and call .resolve(),
then ensure the resolved path is within an allowed base/root (reject or
normalize paths that escape the root, e.g., via ..), and only then pass the safe
resolved path to path.rglob and to SyncLanguageServer.create; apply the same
validation for the Java/Python blocks that also use path.rglob.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
api/analyzers/csharp/analyzer.py (1)

113-122: Dead else branch is a no-op — remove it.

The else block on line 119 re-assigns func_node = node.child_by_field_name('function'), which func_node already holds (set on line 114 and left unchanged by the pass on line 118). The branch is unreachable in any meaningful way.

♻️ Proposed fix
         func_node = node.child_by_field_name('function')
         if func_node and func_node.type == 'member_access_expression':
             func_node = func_node.child_by_field_name('name')
-        elif func_node and func_node.type == 'identifier':
-            pass
-        else:
-            func_node = node.child_by_field_name('function')
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/analyzers/csharp/analyzer.py` around lines 113 - 122, The code handling
invocation_expression has a dead else that reassigns func_node to
node.child_by_field_name('function') even though func_node was already set, so
remove the redundant else branch: in the block where node.type ==
'invocation_expression' keep the initial func_node =
node.child_by_field_name('function'), handle the member_access_expression case
by replacing func_node with its 'name' child and keep the identifier branch (or
no-op), then drop the final else clause entirely so func_node is not redundantly
reassigned before the if func_node: node = func_node assignment.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@api/analyzers/csharp/analyzer.py`:
- Around line 53-59: get_entity_docstring currently only returns
node.prev_sibling.text which captures only the last line of multi-line "///" XML
doc blocks; change get_entity_docstring to walk backwards from node.prev_sibling
while siblings are comment nodes and contiguous (e.g., loop using prev_sibling),
collect each comment.text.decode('utf-8') into a list, then reverse or prepend
so the original order is preserved and join with newlines to return the full
docstring; keep the existing type checks and the ValueError for unknown types,
and ensure it still returns None when no preceding comment siblings exist.
- Around line 66-77: The current heuristic in the class/struct handling loop in
analyzer.py labels the first base_list entry as "base_class" even when it might
be an interface; update the block around the loop (where base_list_captures is
iterated) to: add a clear inline comment documenting this known false-positive,
and before calling entity.add_symbol("base_class", base_type) call
self.resolve_type(base_type) (or equivalent resolver used in this analyzer) and
check the resolved node's kind (e.g., 'class_declaration' or
'struct_declaration') — only then add "base_class"; otherwise fall back to
add_symbol("implement_interface", base_type). Keep the existing first-flag
behavior but gate the base_class assignment on the resolved type to avoid
spurious base_class edges.

---

Duplicate comments:
In `@api/analyzers/csharp/analyzer.py`:
- Around line 47-51: get_entity_name currently assumes
node.child_by_field_name('name') is non-null and calls .text, risking a None
dereference; update get_entity_name to fetch name_node =
node.child_by_field_name('name'), check if name_node is truthy before accessing
name_node.text.decode('utf-8'), and if it's None either raise a clearer
ValueError mentioning the parent node type and maybe node.start_point or return
a safe fallback (e.g., '') so callers of get_entity_name won't dereference None;
ensure the change touches only get_entity_name and preserves the existing node
type checks for 'class_declaration', 'interface_declaration',
'enum_declaration', 'struct_declaration', 'method_declaration', and
'constructor_declaration'.
- Line 131: The return type annotation for resolve_symbol is incorrect: change
the signature of resolve_symbol (in api/analyzers/csharp/analyzer.py) from "->
Entity" to "-> list[Entity]" so it matches resolve_type and resolve_method and
what Entity.resolved_symbol expects; update any import/type aliases if needed to
ensure list[Entity] is recognized by mypy.
- Around line 26-30: The guard in add_dependencies is ineffective and
Path.glob() is misused: change the early-return to check
path.joinpath("temp_deps_cs").is_dir(), only run dotnet restore when project
files exist by using any(path.glob("*.csproj")) or any(path.glob("*.sln"))
(instead of treating the generator as truthy), and after a successful
subprocess.run(..., cwd=str(path)) create the temp_deps_cs directory (e.g.,
path.joinpath("temp_deps_cs").mkdir()) so subsequent calls return early; adjust
references in add_dependencies to use the path parameter's methods (path.glob,
path.joinpath) rather than Path(f"{path}").

---

Nitpick comments:
In `@api/analyzers/csharp/analyzer.py`:
- Around line 113-122: The code handling invocation_expression has a dead else
that reassigns func_node to node.child_by_field_name('function') even though
func_node was already set, so remove the redundant else branch: in the block
where node.type == 'invocation_expression' keep the initial func_node =
node.child_by_field_name('function'), handle the member_access_expression case
by replacing func_node with its 'name' child and keep the identifier branch (or
no-op), then drop the final else clause entirely so func_node is not redundantly
reassigned before the if func_node: node = func_node assignment.

- Guard child_by_field_name('name') against None in get_entity_name
- Fix Path.glob() always-truthy bug in add_dependencies (use any())
- Fix resolve_symbol return type: -> list[Entity] in abstract class and all analyzers
- Remove redundant else branch in resolve_method
- Resolve path in analyze_sources to prevent traversal (path.resolve())
- Add setUp/tearDown to test class for graph cleanup
- Add IMPLEMENTS relationship assertion for ConsoleLogger -> ILogger

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request adds comprehensive C# language support to the code analysis tool, enabling it to parse and analyze C# source files alongside existing Python and Java support.

Changes:

  • Added tree-sitter-c-sharp dependency (v0.23.1) for C# parsing
  • Implemented CSharpAnalyzer supporting classes, interfaces, enums, structs, methods, and constructors with inheritance/implementation tracking
  • Registered .cs file extension with LSP integration for C# in the source analyzer
  • Added unit tests and test source file demonstrating C# analysis capabilities
  • Updated documentation to reflect C# as a supported language

Reviewed changes

Copilot reviewed 6 out of 8 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
pyproject.toml Added tree-sitter-c-sharp dependency version constraint
uv.lock Auto-generated lock file updates for C# dependencies (tree-sitter-c-sharp, rdflib, pyparsing)
api/analyzers/csharp/init.py Empty init file for C# analyzer package
api/analyzers/csharp/analyzer.py Complete C# analyzer implementation with tree-sitter query support, entity extraction, and LSP integration
api/analyzers/source_analyzer.py Registered CSharpAnalyzer for .cs extension and added C# LSP configuration
tests/test_csharp_analyzer.py Unit test validating C# entity detection and relationship tracking
tests/source_files/csharp/Program.cs Test source file demonstrating C# features (interfaces, classes, constructors, methods, inheritance)
README.md Updated supported languages list to include C#

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

def add_dependencies(self, path: Path, files: list[Path]):
if Path(f"{path}/temp_deps_cs").is_dir():
return
if Path(f"{path}").glob("*.csproj") or Path(f"{path}").glob("*.sln"):
Copy link

Copilot AI Feb 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The condition Path(f"{path}").glob("*.csproj") or Path(f"{path}").glob("*.sln") always evaluates to a truthy generator object, even when there are no matching files. This should use any() to check if files exist: any(Path(path).glob("*.csproj")) or any(Path(path).glob("*.sln"))

Suggested change
if Path(f"{path}").glob("*.csproj") or Path(f"{path}").glob("*.sln"):
if any(Path(path).glob("*.csproj")) or any(Path(path).glob("*.sln")):

Copilot uses AI. Check for mistakes.
Comment on lines +117 to +120
elif func_node and func_node.type == 'identifier':
pass
else:
func_node = node.child_by_field_name('function')
Copy link

Copilot AI Feb 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the resolve_method function, when node.type is 'invocation_expression', the code extracts func_node and then has a branch that does nothing (lines 117-118: elif func_node and func_node.type == 'identifier': pass). This empty branch followed by an else clause that reassigns func_node to the same value (line 120) makes the logic confusing. Consider simplifying: if func_node.type == 'member_access_expression', extract the name field; otherwise use the function node as-is.

Suggested change
elif func_node and func_node.type == 'identifier':
pass
else:
func_node = node.child_by_field_name('function')

Copilot uses AI. Check for mistakes.

def analyze_sources(self, path: Path, ignore: list[str], graph: Graph) -> None:
files = list(path.rglob("*.java")) + list(path.rglob("*.py"))
path = path.resolve()

Check failure

Code scanning / CodeQL

Uncontrolled data used in path expression High

This path depends on a
user-provided value
.
This path depends on a
user-provided value
.
- Walk back through contiguous comment siblings to collect all XML doc
  comment lines instead of only the last one
- Add clarifying comment about the base_list heuristic limitation
  (cannot distinguish base class from interface without semantic analysis)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
api/analyzers/csharp/analyzer.py (1)

20-24: Simplify _captures to align with Java and Python analyzers.

The CSharp analyzer currently uses QueryCursor(query).captures(node) via the _captures helper, while both JavaAnalyzer and PythonAnalyzer call query.captures(node) directly. These paths are functionally equivalent in current tree-sitter Python bindings. Consolidating to the direct pattern reduces maintenance surface and removes API-version sensitivity.

Proposed simplification
-    def _captures(self, pattern: str, node: Node) -> dict:
-        """Run a tree-sitter query and return captures dict."""
-        query = self.language.query(pattern)
-        cursor = QueryCursor(query)
-        return cursor.captures(node)
+    def _captures(self, pattern: str, node: Node) -> dict:
+        """Run a tree-sitter query and return captures dict."""
+        return self.language.query(pattern).captures(node)

Remove QueryCursor from the import on line 11, as it will no longer be used.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/analyzers/csharp/analyzer.py` around lines 20 - 24, The _captures helper
in CSharpAnalyzer should call the query's captures directly like the Java/Python
analyzers: replace the body of _captures in analyzer.py to return
self.language.query(pattern).captures(node) (i.e., remove use of QueryCursor),
and remove the unused QueryCursor import at the top; keep the method signature
def _captures(self, pattern: str, node: Node) -> dict and ensure other callers
to _captures continue to work unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@api/analyzers/csharp/analyzer.py`:
- Around line 26-30: The add_dependencies function never creates the sentinel
directory and never populates the files list; modify add_dependencies to run
subprocess.run(["dotnet","restore"], cwd=str(path), check=True) only when a
.csproj or .sln exists, and after a successful restore create the sentinel
Path(f"{path}/temp_deps_cs").mkdir(parents=True, exist_ok=True) so the
idempotency guard can short-circuit future calls; also populate the files
argument (e.g., extend files with Path(path).glob("**/*.cs") or other C# source
patterns) so C# source files are queued for analysis and the files parameter is
actually used.

---

Duplicate comments:
In `@api/analyzers/source_analyzer.py`:
- Around line 141-145: analyze_files currently passes the user-supplied Path
directly into first_pass/second_pass causing second_pass's use of
path.rglob('*.cs') to be vulnerable; update analyze_files to resolve the
incoming path with path = path.resolve() before calling first_pass and
second_pass, and additionally validate the resolved path is inside an allowed
analysis root (e.g., compare against a configurable ANALYSIS_ROOT or raise/abort
if not a subpath) so that rglob cannot operate on arbitrary filesystem
locations; ensure references include analyze_files, first_pass, second_pass, and
path.resolve() in your changes.

---

Nitpick comments:
In `@api/analyzers/csharp/analyzer.py`:
- Around line 20-24: The _captures helper in CSharpAnalyzer should call the
query's captures directly like the Java/Python analyzers: replace the body of
_captures in analyzer.py to return self.language.query(pattern).captures(node)
(i.e., remove use of QueryCursor), and remove the unused QueryCursor import at
the top; keep the method signature def _captures(self, pattern: str, node: Node)
-> dict and ensure other callers to _captures continue to work unchanged.

Comment on lines +26 to +30
def add_dependencies(self, path: Path, files: list[Path]):
if Path(f"{path}/temp_deps_cs").is_dir():
return
if any(Path(f"{path}").glob("*.csproj")) or any(Path(f"{path}").glob("*.sln")):
subprocess.run(["dotnet", "restore"], cwd=str(path))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

add_dependencies idempotency guard never fires — dotnet restore runs on every invocation.

Path(f"{path}/temp_deps_cs").is_dir() is checked on line 27, but nothing in this function (or anywhere else) creates the temp_deps_cs directory. The guard is therefore permanently False and dotnet restore will run on every call. This contrasts with the Python analyzer (checks venv/ which it creates) and the Java analyzer (creates temp_deps/ before scanning it).

Additionally, unlike Java (files.extend(...)) and Python (files.extend(...)), files is never populated here, so no C# dependency source files are queued for analysis. Ruff also flags the files parameter as unused (ARG002).

🐛 Proposed fix — create the sentinel directory after a successful restore
 def add_dependencies(self, path: Path, files: list[Path]):
     if Path(f"{path}/temp_deps_cs").is_dir():
         return
     if any(Path(f"{path}").glob("*.csproj")) or any(Path(f"{path}").glob("*.sln")):
-        subprocess.run(["dotnet", "restore"], cwd=str(path))
+        result = subprocess.run(["dotnet", "restore"], cwd=str(path))
+        if result.returncode == 0:
+            Path(f"{path}/temp_deps_cs").mkdir(exist_ok=True)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def add_dependencies(self, path: Path, files: list[Path]):
if Path(f"{path}/temp_deps_cs").is_dir():
return
if any(Path(f"{path}").glob("*.csproj")) or any(Path(f"{path}").glob("*.sln")):
subprocess.run(["dotnet", "restore"], cwd=str(path))
def add_dependencies(self, path: Path, files: list[Path]):
if Path(f"{path}/temp_deps_cs").is_dir():
return
if any(Path(f"{path}").glob("*.csproj")) or any(Path(f"{path}").glob("*.sln")):
result = subprocess.run(["dotnet", "restore"], cwd=str(path))
if result.returncode == 0:
Path(f"{path}/temp_deps_cs").mkdir(exist_ok=True)
🧰 Tools
🪛 Ruff (0.15.1)

[warning] 26-26: Unused method argument: files

(ARG002)


[error] 30-30: Starting a process with a partial executable path

(S607)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/analyzers/csharp/analyzer.py` around lines 26 - 30, The add_dependencies
function never creates the sentinel directory and never populates the files
list; modify add_dependencies to run subprocess.run(["dotnet","restore"],
cwd=str(path), check=True) only when a .csproj or .sln exists, and after a
successful restore create the sentinel
Path(f"{path}/temp_deps_cs").mkdir(parents=True, exist_ok=True) so the
idempotency guard can short-circuit future calls; also populate the files
argument (e.g., extend files with Path(path).glob("**/*.cs") or other C# source
patterns) so C# source files are queued for analysis and the files parameter is
actually used.

@gkorland gkorland merged commit 4fefd9f into main Feb 19, 2026
8 of 9 checks passed
@gkorland gkorland deleted the csharp branch February 19, 2026 08:31
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants