fix: prevent findDbPath from escaping git worktree boundary by carlos-alm · Pull Request #457 · optave/codegraph

carlos-alm · 2026-03-16T08:16:07Z

Summary

Add findRepoRoot() to src/db/connection.js using git rev-parse --show-toplevel, which returns the correct root for both regular repos and git worktrees. Results are cached per-process.
Use findRepoRoot() as a ceiling in findDbPath() so the walk-up stops at the git boundary — worktree sessions resolve to their own DB, not the parent repo's.
Add unit tests for findRepoRoot (caching, non-git fallback) and findDbPath ceiling behavior (worktree simulation, non-git fallback).

Context

When codegraph runs inside a git worktree (e.g. .claude/worktrees/agent-xxx/), findDbPath() walked up from process.cwd() and crossed the worktree boundary, finding the main repo's database. This made codegraph unusable in parallel worktree sessions — the exact workflow it's designed to support.

Behavior change

When no existing .codegraph/graph.db is found during the walk-up, the default DB creation path is now the git repo root (from findRepoRoot()) instead of process.cwd(). For example, running codegraph build from /repo/packages/foo/ with no pre-existing DB now creates /repo/.codegraph/graph.db instead of /repo/packages/foo/.codegraph/graph.db. When not inside a git repo, the previous behavior (process.cwd()) is preserved. This is intentional — the repo root is the correct default for most use cases.

Test plan

npm test — all existing tests pass
In a worktree: cd .claude/worktrees/xxx && npx codegraph build . && npx codegraph stats — uses worktree DB
In main repo: codegraph stats — still works as before
MCP tools from a worktree session resolve to the correct DB

Compatibility

Only 2 files changed (src/db/connection.js, tests/unit/db.test.js). No overlap with #456 on connection.js. Minor merge conflict possible in tests/unit/db.test.js import lines — trivial to resolve.

findDbPath() walks up from cwd looking for .codegraph/graph.db. In a git worktree (e.g. .claude/worktrees/agent-xxx/), this crosses the worktree boundary and finds the main repo's database instead. Add findRepoRoot() using `git rev-parse --show-toplevel` (which returns the correct root for both repos and worktrees) and use it as a ceiling in findDbPath(). The walk-up now stops at the git boundary, so each worktree resolves to its own database.

claude · 2026-03-16T08:16:20Z

Claude finished @carlos-alm's task —— View job

PR Review: fix: prevent findDbPath from escaping git worktree boundary

Todo List:

Read and analyze modified files
Phase 1: Root cause analysis
Phase 2: Configuration change detection
Phase 3: Test integrity assessment
Phase 4: Code quality review
Phase 5: Backlog compliance evaluation
Final assessment and rating

Root Cause Analysis

Problem Being Solved: When codegraph runs inside a git worktree (e.g., .claude/worktrees/agent-xxx/), the findDbPath() function would traverse up the directory tree and cross the worktree boundary, finding the main repo's database instead of creating/using a worktree-specific database. This made codegraph unusable in parallel worktree sessions — the exact workflow it's designed to support.

Why This Approach: The solution uses git rev-parse --show-toplevel to find the git boundary and uses it as a ceiling for the directory walk. This is the correct approach because:

git rev-parse --show-toplevel returns the proper root for both regular repos and worktrees
The ceiling-based approach prevents directory traversal beyond the git boundary
Results are cached for performance (important for repeated calls)

Risk Assessment: ✅ Low risk. The implementation is conservative with proper fallbacks when not in a git repo.

Phase 2: Configuration Change Detection

✅ No configuration changes detected. This PR only adds functionality without modifying any linting rules, build config, or relaxing any constraints.

Phase 3: Test Integrity Assessment

✅ Test integrity is strong. The PR adds comprehensive test coverage including:

findRepoRoot caching behavior and non-git fallback scenarios
findDbPath ceiling behavior with worktree simulation
Edge cases for non-git environments

The tests are well-designed and don't weaken existing assertions. Lines 266-290 in the test file specifically test the core worktree boundary functionality.

Phase 4: Code Quality Review

Strengths:

Clean separation of concerns: findRepoRoot() is a focused utility function
Proper error handling with try-catch and graceful fallbacks
Good caching strategy with per-process cache and explicit cache reset for testing
Comprehensive test coverage with edge cases
Security-conscious: uses execFileSync instead of shell execution to prevent command injection

Potential Issues:

Performance: execFileSync calls are blocking. However, the caching mitigates repeated calls for the common case (src/db/connection.js:19, 32)
Error handling: Silent catch blocks (lines 29-31) could potentially hide git configuration issues, but this is acceptable for graceful degradation

Code Style: Follows existing patterns in the codebase. Uses proper JSDoc documentation.

Phase 5: Backlog Compliance

Zero-dep: ✅ No new runtime dependencies. Uses only Node.js built-ins (node:child_process, node:path) and existing dependencies.

Foundation-aligned: ✅ Fully aligned with FOUNDATION.md principles:

Maintains zero-infrastructure requirement (uses local git commands)
No cloud API calls or external services
Purely enhances existing local functionality
Doesn't change the programmatic API contract

Problem-fit: 4/5 — This directly addresses a critical workflow issue for AI coding assistants using parallel sessions. When agents can't properly isolate their graph databases in worktrees, they interfere with each other and produce incorrect results. This fix enables the intended parallel session workflow.

Breaking: No — This is a pure enhancement. The findDbPath() function signature remains unchanged, and the behavior only improves for worktree scenarios while maintaining backward compatibility for regular repos.

Tier classification: Tier 1 (zero-dep ✓ + foundation-aligned ✓)

Critical Concerns

None identified. This is a well-engineered fix for a legitimate workflow issue.

Minor Suggestions

Consider adding a debug log when the git ceiling is applied (for troubleshooting edge cases)
The test at lines 218-224 could be more robust by testing against a known non-git directory rather than assuming os.tmpdir() isn't a repo

Final Recommendation

Rating: ⭐⭐⭐⭐⭐ (5/5)
Action: APPROVE
Reasoning: This PR solves a critical workflow issue with minimal risk, excellent test coverage, and clean implementation. It directly enables the parallel worktree sessions that codegraph is designed to support, with no breaking changes or dependencies added. The code quality is high with proper error handling and performance considerations.

greptile-apps · 2026-03-16T08:19:15Z

Greptile Summary

This PR fixes a real correctness bug where findDbPath() would escape a git worktree boundary and resolve to the parent repository's database — breaking parallel worktree workflows. The fix adds findRepoRoot() (backed by git rev-parse --show-toplevel, with cwd-keyed caching) and threads its result as a ceiling into the findDbPath() walk-up loop.

Key changes:

findRepoRoot() resolves symlinks (realpathSync) and uses inode-based comparison (isSameDirectory) to handle macOS /tmp → /private/tmp and Windows 8.3 short-name edge cases robustly.
Cache is keyed on process.cwd() at call time, so an MCP server serving multiple worktree sessions doesn't reuse a stale ceiling.
Default DB creation path changes from process.cwd() to the git root when inside a git repo (documented in PR description under Behavior change).
_resetRepoRootCache is correctly excluded from the public barrel (src/db/index.js) and imported directly from connection.js in tests.
Tests cover ceiling stop, within-boundary discovery, null-ceiling fallback, spy-verified caching, and macOS/Windows path equivalence — all previously-raised gaps have been addressed.

Confidence Score: 4/5

Safe to merge — the ceiling logic is correct and all previously-raised issues have been resolved; one minor test-hermetics gap remains.
Production code in connection.js is correct and handles all identified edge cases (symlinks, 8.3 short names, MCP multi-session cwd changes). All critical issues from prior review rounds have been addressed. Score is 4 rather than 5 solely because the pre-existing 'finds .codegraph/graph.db walking up parent directories' test still relies implicitly on the test-runner not being in a git-controlled temp directory, unlike the newly-added tests that control the ceiling explicitly.
No files require special attention — src/db/connection.js is the only production file changed and its logic is sound.

Important Files Changed

Filename	Overview
src/db/connection.js	Adds `findRepoRoot()` (with cwd-keyed caching) and `_resetRepoRootCache()`, then updates `findDbPath()` to use the git root as a ceiling. Logic is correct: ceiling check fires after DB-existence check on the same directory, `isSameDirectory` handles macOS symlinks and Windows 8.3 names via inode comparison, and `process.cwd()` fallback is preserved when not in a git repo. No issues found in production code.
src/db/index.js	Adds `findRepoRoot` to the public barrel export (intentional public API). Correctly excludes `_resetRepoRootCache` (test helper), which is now imported directly from `connection.js` in tests. No issues.
tests/unit/db.test.js	Comprehensive test coverage for `findRepoRoot` (caching, bypass, null path) and `findDbPath` ceiling behavior using a real `git init`. All previously-flagged issues addressed (spy-based cache verification, macOS symlink handling via `findRepoRoot()` for expected path, `mockImplementationOnce` for hermetic null-ceiling tests). One remaining minor gap: the pre-existing `'finds .codegraph/graph.db walking up parent directories'` test doesn't mock git, relying implicitly on `os.tmpdir()` not being inside a git repo.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["findDbPath(customPath)"] --> B{customPath?}
    B -- yes --> C["return path.resolve(customPath)"]
    B -- no --> D["ceiling = findRepoRoot()"]
    D --> D1{cached &\ncwd unchanged?}
    D1 -- yes --> D2["return _cachedRepoRoot"]
    D1 -- no --> D3["git rev-parse --show-toplevel"]
    D3 -- success --> D4["realpathSync(raw) → ceiling"]
    D3 -- throws --> D5["ceiling = null"]
    D4 --> E
    D5 --> E
    D2 --> E
    E["dir = realpathSync(cwd)"]
    E --> F["candidate = dir/.codegraph/graph.db"]
    F --> G{exists?}
    G -- yes --> H["return candidate"]
    G -- no --> I{ceiling &&\nisSameDirectory\ndirceiling?}
    I -- yes --> J["break — at ceiling"]
    I -- no --> K{parent === dir?}
    K -- yes --> J
    K -- no --> L["dir = parent"]
    L --> F
    J --> M["base = ceiling ∥ cwd()"]
    M --> N["return base/.codegraph/graph.db"]

_{Last reviewed commit: b87a12b}

greptile-apps · 2026-03-16T08:19:19Z

tests/unit/db.test.js

+  it('stops walking at git ceiling and does not find parent DB', () => {
+    const origCwd = process.cwd;
+    // Mock cwd to be inside the inner "worktree"
+    process.cwd = () => innerDir;
+    // Mock findRepoRoot to return the worktree root (one level up from innerDir)
+    const worktreeRoot = path.join(outerDir, 'worktree');
+    const origFindRepoRoot = findRepoRoot;
+
+    // We can't easily mock findRepoRoot since findDbPath calls it internally,
+    // but we can test the behavior by creating a DB at the worktree level
+    fs.mkdirSync(path.join(worktreeRoot, '.codegraph'), { recursive: true });
+    fs.writeFileSync(path.join(worktreeRoot, '.codegraph', 'graph.db'), '');
+
+    try {
+      _resetRepoRootCache();
+      const result = findDbPath();
+      // Should find the worktree DB, not the outer one
+      expect(result).toContain('worktree');
+      expect(result).toContain('.codegraph');
+    } finally {
+      process.cwd = origCwd;
+      // Clean up the worktree DB
+      fs.rmSync(path.join(worktreeRoot, '.codegraph'), { recursive: true, force: true });
+    }
+  });


Ceiling test doesn't exercise ceiling logic

This test runs inside a temp directory that is not a git repo, so findRepoRoot() returns null and the ceiling check on connection.js:104 (if (ceiling && ...)) is never triggered. The test passes because both DBs exist in the walk-up path and the worktree-level DB is found first — this would happen with or without the ceiling fix.

To actually verify that the ceiling prevents crossing the worktree boundary, you'd need to either:

Mock findRepoRoot (e.g., via vi.mock or dependency injection) to return the worktreeRoot path, then remove the worktree-level DB so the only DB is the outer one — and assert it is NOT found.

Initialize an actual git repo (via git init) in the worktreeRoot so findRepoRoot() returns it as the ceiling.

Without this, the core feature of this PR (stopping the walk at the git boundary) has no effective test coverage.

Additionally, origFindRepoRoot on line 272 is assigned but never used — leftover from an abandoned mocking approach.

Fixed in 83efde5. The ceiling test now uses git init in the worktree root so findRepoRoot() returns a real ceiling. The test no longer creates a worktree-level DB — the only DB is in the outer dir (beyond the ceiling), and the assertion verifies the walk stops at the ceiling and returns the default path at the worktree root, not the outer DB. Also removed the unused origFindRepoRoot variable.

greptile-apps · 2026-03-16T08:19:20Z

tests/unit/db.test.js

+  it('returns null when not in a git repo', () => {
+    const root = findRepoRoot(os.tmpdir());
+    // os.tmpdir() is typically not a git repo; if it is, skip gracefully
+    if (root === null) {
+      expect(root).toBeNull();
+    }
+  });


Test silently passes without asserting

If os.tmpdir() happens to be inside a git repo (possible on some CI environments or developer machines), this test passes without making any assertion at all — the if block is skipped. Consider using vi.spyOn to mock execFileSync to throw, ensuring the null path is always exercised regardless of the host environment.

Fixed in 83efde5. Now uses vi.mock('node:child_process') with mockImplementationOnce to force execFileSync to throw, ensuring the null path is always exercised regardless of host environment.

greptile-apps · 2026-03-16T08:19:21Z

tests/unit/db.test.js

+    process.cwd = () => innerDir;
+    // Mock findRepoRoot to return the worktree root (one level up from innerDir)
+    const worktreeRoot = path.join(outerDir, 'worktree');
+    const origFindRepoRoot = findRepoRoot;


Unused variable

origFindRepoRoot is assigned but never used. This appears to be a leftover from an abandoned attempt to mock findRepoRoot.

Suggested change

const origFindRepoRoot = findRepoRoot;

Fixed in 83efde5. Removed the unused origFindRepoRoot variable.

…t test - Ceiling test now uses `git init` to create a real git repo boundary, and verifies the outer DB is NOT found (without the ceiling fix it would be). Added separate test for finding DB within the boundary. - Non-git test uses a fresh mkdtemp dir instead of os.tmpdir(). - Added debug log when ceiling stops the walk-up. - Removed unused `origFindRepoRoot` variable.

carlos-alm · 2026-03-16T08:23:31Z

Addressed all three review comments in deac23e:

1. Ceiling test doesn't exercise ceiling logic — Rewrote the test to use git init in the worktree dir so findRepoRoot() returns a real ceiling. The outer DB (beyond the ceiling) is the only one — asserts it is NOT found and the default path resolves at the ceiling root instead. Added a separate test that creates a DB within the boundary to verify normal walk-up still works.

2. Test silently passes without asserting — Replaced os.tmpdir() with a fresh mkdtemp directory that is guaranteed not to be inside a git repo. The assertion is now unconditional.

3. Unused variable origFindRepoRoot — Removed.

Also added a debug() log when the git ceiling stops the walk-up, for troubleshooting edge cases.

@greptileai

greptile-apps · 2026-03-16T08:27:51Z

src/db/connection.js

+  const base = ceiling || process.cwd();
+  return path.join(base, '.codegraph', 'graph.db');


Undocumented behavior change for non-worktree users

When no existing DB is found during the walk-up, the default creation path has silently changed from process.cwd() to ceiling (the git repo root) for any user inside a git repository. Previously, running codegraph build from /repo/sub/ with no DB would create /repo/sub/.codegraph/graph.db. After this PR, it creates /repo/.codegraph/graph.db.

This is likely the desired long-term behavior, but it's a breaking change for users who intentionally scope their codegraph DB to a project subdirectory rather than the repo root. The PR description only mentions the worktree ceiling fix, not the changed default creation location.

Consider documenting this in the PR description or CHANGELOG, or preserving the previous fallback for non-worktree scenarios:

const base = ceiling || process.cwd(); return path.join(base, '.codegraph', 'graph.db');

Good catch on documenting this. Updated the PR description with a Behavior change section explaining that the default DB creation path is now the git repo root instead of process.cwd() when inside a git repo. The process.cwd() fallback is preserved when not in a git repo (line 109: const base = ceiling || process.cwd()).

greptile-apps · 2026-03-16T08:27:52Z

tests/unit/db.test.js

+  it('caches results when called without arguments', () => {
+    _resetRepoRootCache();
+    const first = findRepoRoot();
+    const second = findRepoRoot();
+    expect(first).toBe(second);
+  });


Caching test does not actually verify caching

JavaScript string primitives are compared by value, not reference, so expect(first).toBe(second) passes even if execFileSync is called twice and returns two distinct string objects with the same content. The test does not prove that the cache is being used.

To truly verify the caching behaviour, spy on execFileSync and assert it was called exactly once:

it('caches results when called without arguments', () => { _resetRepoRootCache(); const spy = vi.spyOn(childProcess, 'execFileSync'); const first = findRepoRoot(); const second = findRepoRoot(); expect(first).toBe(second); expect(spy).toHaveBeenCalledTimes(1); // cached on second call spy.mockRestore(); });

This requires importing node:child_process as a namespace (import * as childProcess from 'node:child_process') so the spy can intercept the module-level binding.

Fixed in 83efde5. The caching test now uses vi.mock('node:child_process') to create a spy on execFileSync, clears it before the test, and asserts toHaveBeenCalledTimes(1) — proving the second call hits the cache.

greptile-apps · 2026-03-16T08:27:53Z

tests/unit/db.test.js

+  it('does not use cache when called with explicit dir', () => {
+    _resetRepoRootCache();
+    const fromCwd = findRepoRoot();
+    // Calling with an explicit dir should still work (not use cwd cache)
+    const fromExplicit = findRepoRoot(process.cwd());
+    expect(fromExplicit).toBe(fromCwd);
+  });


Test name does not match what is being verified

The test is named "does not use cache when called with explicit dir", implying it verifies that execFileSync is invoked again when fromDir is provided. However, the test only asserts that both calls return the same value — which would pass whether or not the cache is bypassed.

Consider either renaming the test to reflect what is actually checked ('returns same value whether using cache or explicit dir'), or adding a spy to assert that execFileSync is called a second time when an explicit fromDir is passed.

Suggested change

it('does not use cache when called with explicit dir', () => {

_resetRepoRootCache();

const fromCwd = findRepoRoot();

// Calling with an explicit dir should still work (not use cwd cache)

const fromExplicit = findRepoRoot(process.cwd());

expect(fromExplicit).toBe(fromCwd);

});

it('returns same value whether using cache or explicit dir', () => {

Fixed in 83efde5. Renamed to "bypasses cache when called with explicit dir" and added a spy assertion: expect(execFileSyncSpy).toHaveBeenCalledTimes(2) — confirming the second call with an explicit dir actually invokes execFileSync again instead of using the cache.

…boundary

- Mock execFileSync via vi.mock to verify caching calls exactly once and cache bypass calls twice (not just comparing return values) - Mock execFileSync to throw in "not in git repo" test so the null path is always exercised regardless of host environment - Rename "does not use cache" test to "bypasses cache" with spy assertion

…edback" This reverts commit 83efde5.

…eedback" This reverts commit ccb8bd5.

carlos-alm · 2026-03-16T08:36:12Z

@greptileai

greptile-apps · 2026-03-16T08:41:19Z

tests/unit/db.test.js

+  it('falls back gracefully when not in a git repo', () => {
+    const emptyDir = fs.mkdtempSync(path.join(os.tmpdir(), 'cg-nogit-'));
+    const origCwd = process.cwd;
+    process.cwd = () => emptyDir;
+    _resetRepoRootCache();
+    try {
+      const result = findDbPath();
+      // Should return default path at cwd since there's no git ceiling
+      expect(result).toBe(path.join(emptyDir, '.codegraph', 'graph.db'));
+    } finally {
+      process.cwd = origCwd;
+      fs.rmSync(emptyDir, { recursive: true, force: true });
+    }
+  });


Flaky non-git fallback test — same issue that was fixed in findRepoRoot

This test creates emptyDir inside os.tmpdir() but never mocks execFileSync to throw. The spy's default implementation calls the real execFileSync, so if the test host's os.tmpdir() happens to be inside a git repository (possible on some CI setups or Docker images), findRepoRoot() will return a non-null ceiling. findDbPath() would then return ceiling/.codegraph/graph.db instead of emptyDir/.codegraph/graph.db, making the expect(result).toBe(path.join(emptyDir, ...)) assertion fail.

The analogous test in findRepoRoot ('returns null when not in a git repo') was already fixed by using mockImplementationOnce to force execFileSync to throw. The same fix should be applied here:

it('falls back gracefully when not in a git repo', () => { const emptyDir = fs.mkdtempSync(path.join(os.tmpdir(), 'cg-nogit-')); const origCwd = process.cwd; process.cwd = () => emptyDir; _resetRepoRootCache(); execFileSyncSpy.mockImplementationOnce(() => { throw new Error('not a git repo'); }); try { const result = findDbPath(); expect(result).toBe(path.join(emptyDir, '.codegraph', 'graph.db')); } finally { process.cwd = origCwd; fs.rmSync(emptyDir, { recursive: true, force: true }); } });

Fixed in e3c1d4a. Added \ before the test, ensuring \ always returns null regardless of host environment.

greptile-apps · 2026-03-16T08:41:20Z

tests/unit/db.test.js

 * Unit tests for src/db.js — build_meta helpers included
 */

+import { execFileSync as realExecFileSync } from 'node:child_process';


realExecFileSync is not the real function — it's the spy

Due to Vitest's vi.mock hoisting, vi.mock('node:child_process', ...) is applied before all import statements are resolved, including this one. As a result, realExecFileSync is bound to execFileSyncSpy, not the unspied original.

This is currently harmless — the spy's default mockImplementation delegates to the real execFileSync, and realExecFileSync is only used in beforeAll (before any mockImplementationOnce calls). However, the name realExecFileSync is misleading: it implies the import bypasses the mock, which it does not. A future developer who relies on this assumption (e.g., calling it after a mockReset()) would get unexpected behavior.

Consider renaming it to execFileSyncUnspied and obtaining it via importOriginal inside the factory, or at minimum add a comment:

Suggested change

import { execFileSync as realExecFileSync } from 'node:child_process';

import { execFileSync as execFileSyncForSetup } from 'node:child_process'; // Note: this is the spy's delegate, not the original — only safe to call before test-level mockImplementationOnce overrides

Fixed in e3c1d4a. Renamed realExecFileSync → execFileSyncForSetup and added a comment explaining it resolves to the spy due to vi.mock hoisting, and is only safe to call before test-level mockImplementationOnce overrides.

…rt name - Mock execFileSync to throw in "falls back gracefully when not in a git repo" test, preventing flakiness when os.tmpdir() is inside a git repo - Rename realExecFileSync → execFileSyncForSetup with comment explaining it resolves to the spy due to vi.mock hoisting

carlos-alm · 2026-03-16T09:02:07Z

@greptileai

Combine PR's connection.js imports (findRepoRoot, _resetRepoRootCache) with main's barrel db/index.js pattern. Add missing re-exports to barrel. Impact: 22 functions changed, 10 affected

greptile-apps · 2026-03-16T09:07:48Z

tests/unit/db.test.js

@@ -144,11 +160,13 @@ describe('findDbPath', () => {
    const origCwd = process.cwd;
    process.cwd = () => emptyDir;
    try {
+      _resetRepoRootCache();
      const result = findDbPath();
      expect(result).toContain('.codegraph');
      expect(result).toContain('graph.db');
    } finally {
      process.cwd = origCwd;


findDbPath fallback test doesn't control the ceiling

This test calls _resetRepoRootCache() but never mocks execFileSync to throw, so findRepoRoot() will call the real git rev-parse --show-toplevel from emptyDir. If the test host's os.tmpdir() is inside a git repo (common in some CI setups), ceiling will be set to that repo root and the default path returned will be <git-root>/.codegraph/graph.db rather than emptyDir/.codegraph/graph.db. The toContain assertions pass either way, so the test silently misses validating the actual fallback location.

The analogous 'falls back gracefully when not in a git repo' test in the new findDbPath with git ceiling describe already solves this correctly with execFileSyncSpy.mockImplementationOnce. The same treatment should be applied here:

Suggested change

it('returns default path when no DB found', () => {

const emptyDir = fs.mkdtempSync(path.join(tmpDir, 'empty-'));

const origCwd = process.cwd;

process.cwd = () => emptyDir;

execFileSyncSpy.mockImplementationOnce(() => {

throw new Error('not a git repo');

});

try {

_resetRepoRootCache();

const result = findDbPath();

expect(result).toBe(path.join(emptyDir, '.codegraph', 'graph.db'));

} finally {

process.cwd = origCwd;

_resetRepoRootCache();

}

});

This ensures ceiling = null, the process.cwd() fallback is exercised deterministically, and the assertion is precise enough to detect regressions.

Fixed in 0d0fdd9. The test now mocks execFileSync to throw (via mockImplementationOnce) so findRepoRoot() always returns null, ensuring the process.cwd() fallback is deterministically exercised. Also tightened the assertion from toContain to toBe with the exact expected path.

This was already fixed in 0d0fdd9 — the test now mocks execFileSync to throw via mockImplementationOnce, ensuring findRepoRoot() always returns null regardless of host environment. The assertion was also tightened from toContain to toBe with the exact expected path.

On macOS, os.tmpdir() returns /var/folders/... but git rev-parse returns /private/var/folders/... (resolved symlink). The ceiling comparison failed because the paths didn't match. Use fs.realpathSync on cwd to normalize symlinks before comparing against the ceiling. Impact: 1 functions changed, 1 affected

Impact: 1 functions changed, 1 affected

…boundary Impact: 9 functions changed, 19 affected

The 'returns default path when no DB found' test didn't control the git ceiling — if tmpDir was inside a git repo, findRepoRoot() would return a non-null ceiling and the default path would differ from emptyDir. Mock execFileSync to throw so the cwd fallback is always exercised.

carlos-alm · 2026-03-16T09:21:54Z

@greptileai

…m ceiling On macOS, os.tmpdir() returns /var/... but git resolves symlinks to /private/var/..., and on Windows, 8.3 short names (RUNNER~1) differ from long names (runneradmin). findDbPath normalizes dir via fs.realpathSync but findRepoRoot only used path.resolve, causing the ceiling comparison to fail — the walk crossed the worktree boundary. Fix findRepoRoot to use fs.realpathSync on git output, and resolve test paths after directory creation so assertions match. Impact: 1 functions changed, 77 affected

greptile-apps · 2026-03-16T09:28:50Z

tests/unit/db.test.js

+    try {
+      _resetRepoRootCache();


Test assertion will fail on macOS when /tmp is the temp dir

result is built from ceiling, which is the real (symlink-resolved) path returned by git rev-parse --show-toplevel. On macOS, if os.tmpdir() resolves through /tmp (which is a symlink to /private/tmp), git will return /private/tmp/cg-ceiling-xxx/worktree while worktreeRoot is still /tmp/cg-ceiling-xxx/worktree. The toBe assertion then compares:

actual: /private/tmp/cg-ceiling-xxx/worktree/.codegraph/graph.db

expected: /tmp/cg-ceiling-xxx/worktree/.codegraph/graph.db

These are not equal, so the test fails. The production code in connection.js is actually correct — realpathSync is applied to dir precisely to ensure ceiling (from git) and dir agree on symlinks. The test expectation just needs to match the same resolved form:

Suggested change

try {

_resetRepoRootCache();

expect(result).toBe(path.join(fs.realpathSync(worktreeRoot), '.codegraph', 'graph.db'));

expect(result).not.toContain(`${path.basename(outerDir)}${path.sep}.codegraph`);

Fixed in 651c164. The test now uses \ for the expected ceiling instead of the test's , so the assertion always matches the path form that \ uses internally. This handles both the macOS symlink case (\ → ) and the Windows 8.3 short name case (\ → ). Additionally, the production code now uses a stat-based \ comparison (dev+ino fallback) instead of string equality for the ceiling check, which is robust against all path representation differences.

git rev-parse resolves 8.3 short names (RUNNER~1 → runneradmin) but fs.realpathSync may not, causing the string comparison to fail and the walk to escape the git ceiling boundary. Replace string equality with isSameDirectory() that falls back to dev+ino comparison when paths differ textually but refer to the same directory. Also fix the test assertion to use findRepoRoot() for the expected ceiling path instead of the test's worktreeRoot, since git may resolve paths differently than realpathSync. Impact: 2 functions changed, 137 affected

carlos-alm · 2026-03-16T09:47:22Z

@greptileai

greptile-apps · 2026-03-16T09:51:45Z

src/db/index.js

+export {
+  _resetRepoRootCache,
+  closeDb,
+  findDbPath,
+  findRepoRoot,
+  openDb,
+  openReadonlyOrFail,
+} from './connection.js';


Test-only utility exported through the public barrel

_resetRepoRootCache is a testing helper (signalled by the leading underscore) but is now re-exported from the public barrel src/db/index.js. Any consumer that imports from …/db/index.js will see this function in their IDE completions and bundled types, and could accidentally call it in production code to blow away the per-process cache.

A cleaner approach is to have the test files import directly from the implementation module:

// in tests/unit/db.test.js import { _resetRepoRootCache, findRepoRoot, ... } from '../../src/db/connection.js';

Then remove _resetRepoRootCache from the barrel export, keeping the public API free of test infrastructure.

Fixed in the latest push. Removed _resetRepoRootCache from the src/db/index.js barrel export. The test file now imports it directly from ../../src/db/connection.js, keeping the public API free of test infrastructure.

carlos-alm · 2026-03-16T10:10:50Z

@greptileai

greptile-apps · 2026-03-16T10:15:46Z

tests/unit/db.test.js

+  it('finds DB within the ceiling boundary', () => {
+    // Create a DB inside the worktree — should be found normally
+    fs.mkdirSync(path.join(worktreeRoot, '.codegraph'), { recursive: true });
+    fs.writeFileSync(path.join(worktreeRoot, '.codegraph', 'graph.db'), '');
+    const origCwd = process.cwd;
+    process.cwd = () => innerDir;
+    try {
+      _resetRepoRootCache();
+      const result = findDbPath();
+      expect(result).toContain('worktree');
+      expect(result).toContain('.codegraph');
+    } finally {
+      process.cwd = origCwd;
+      fs.rmSync(path.join(worktreeRoot, '.codegraph'), { recursive: true, force: true });
+    }
+  });


Loose assertions don't validate the found DB location precisely

The two toContain assertions verify that the result string contains 'worktree' and '.codegraph', but they don't confirm the DB was found at worktreeRoot rather than innerDir. A path like path.join(innerDir, '.codegraph', 'graph.db') would also satisfy both checks. Every other assertion in this describe block uses toBe with an exact path. Consider using the same here for consistency and precision:

Suggested change

it('finds DB within the ceiling boundary', () => {

// Create a DB inside the worktree — should be found normally

fs.mkdirSync(path.join(worktreeRoot, '.codegraph'), { recursive: true });

fs.writeFileSync(path.join(worktreeRoot, '.codegraph', 'graph.db'), '');

const origCwd = process.cwd;

process.cwd = () => innerDir;

try {

_resetRepoRootCache();

const result = findDbPath();

expect(result).toContain('worktree');

expect(result).toContain('.codegraph');

} finally {

process.cwd = origCwd;

fs.rmSync(path.join(worktreeRoot, '.codegraph'), { recursive: true, force: true });

}

});

it('finds DB within the ceiling boundary', () => {

// Create a DB inside the worktree — should be found normally

fs.mkdirSync(path.join(worktreeRoot, '.codegraph'), { recursive: true });

fs.writeFileSync(path.join(worktreeRoot, '.codegraph', 'graph.db'), '');

const origCwd = process.cwd;

process.cwd = () => innerDir;

try {

_resetRepoRootCache();

const result = findDbPath();

const ceiling = findRepoRoot();

expect(result).toBe(path.join(ceiling, '.codegraph', 'graph.db'));

} finally {

process.cwd = origCwd;

fs.rmSync(path.join(worktreeRoot, '.codegraph'), { recursive: true, force: true });

}

});

Fixed — tightened the assertion to use toBe with the exact ceiling-based path, consistent with the rest of the describe block.

greptile-apps · 2026-03-16T10:15:48Z

src/db/connection.js

+export function findRepoRoot(fromDir) {
+  const dir = fromDir || process.cwd();
+  if (!fromDir && _cachedRepoRoot !== undefined) return _cachedRepoRoot;
+  let root = null;
+  try {
+    const raw = execFileSync('git', ['rev-parse', '--show-toplevel'], {
+      cwd: dir,
+      encoding: 'utf-8',
+      stdio: ['pipe', 'pipe', 'pipe'],
+    }).trim();
+    // Use realpathSync to resolve symlinks (macOS /var → /private/var) and
+    // 8.3 short names (Windows RUNNER~1 → runneradmin) so the ceiling path
+    // matches the realpathSync'd dir in findDbPath.
+    try {
+      root = fs.realpathSync(raw);
+    } catch {
+      root = path.resolve(raw);
+    }
+  } catch {
+    root = null;
+  }
+  if (!fromDir) _cachedRepoRoot = root;
+  return root;
+}


Stale cache if process.cwd() changes after first call

findRepoRoot() (no args) caches its result the first time it's called and returns that value on every subsequent no-arg call — even if process.cwd() has changed in the meantime. In a long-running process (e.g. an MCP server that serves multiple worktree sessions sequentially), a second session with a different cwd would receive the ceiling from the first session. findDbPath() then computes dir from the current process.cwd() but compares it against a ceiling from the old cwd, allowing the walk to escape the expected boundary.

The per-process cache is intentional and documented, and it's benign if each codegraph invocation is a fresh process. However, for the MCP-server use case explicitly called out in the PR, the lifetime of the process spans multiple client connections. A defensive fix would be to key the cache on the cwd value captured at call time:

let _cachedRepoRoot; // undefined = not computed let _cachedRepoRootCwd; // cwd when the cache was populated export function findRepoRoot(fromDir) { const dir = fromDir || process.cwd(); if (!fromDir && _cachedRepoRoot !== undefined && _cachedRepoRootCwd === dir) { return _cachedRepoRoot; } // ... rest of implementation ... if (!fromDir) { _cachedRepoRoot = root; _cachedRepoRootCwd = dir; } return root; }

Fixed — the cache is now keyed on the cwd value at call time. If process.cwd() changes between calls (e.g. MCP server serving multiple sessions), the cache is invalidated and recomputed. _resetRepoRootCache clears both the cached value and the cwd key.

Impact: 2 functions changed, 2 affected

carlos-alm · 2026-03-16T10:26:54Z

@greptileai

…hort names Impact: 1 functions changed, 1 affected

…tave/codegraph into fix/db-path-worktree-boundary Impact: 13 functions changed, 3 affected

On Windows CI, fs.realpathSync(process.cwd()) and git rev-parse --show-toplevel can resolve 8.3 short names (RUNNER~1) differently than long names (runneradmin). Apply realpathSync to both sides of the assertion so the comparison is path-equivalent.

…s 8.3 names

…boundary # Conflicts: # src/db/index.js Impact: 10 functions changed, 11 affected

carlos-alm · 2026-03-16T11:13:37Z

@greptileai