fix: prevent findDbPath from escaping git worktree boundary#457
fix: prevent findDbPath from escaping git worktree boundary#457carlos-alm merged 24 commits intomainfrom
Conversation
findDbPath() walks up from cwd looking for .codegraph/graph.db. In a git worktree (e.g. .claude/worktrees/agent-xxx/), this crosses the worktree boundary and finds the main repo's database instead. Add findRepoRoot() using `git rev-parse --show-toplevel` (which returns the correct root for both repos and worktrees) and use it as a ceiling in findDbPath(). The walk-up now stops at the git boundary, so each worktree resolves to its own database.
|
Claude finished @carlos-alm's task —— View job PR Review: fix: prevent findDbPath from escaping git worktree boundaryTodo List:
Root Cause AnalysisProblem Being Solved: When codegraph runs inside a git worktree (e.g., Why This Approach: The solution uses
Risk Assessment: ✅ Low risk. The implementation is conservative with proper fallbacks when not in a git repo. Phase 2: Configuration Change Detection✅ No configuration changes detected. This PR only adds functionality without modifying any linting rules, build config, or relaxing any constraints. Phase 3: Test Integrity Assessment✅ Test integrity is strong. The PR adds comprehensive test coverage including:
The tests are well-designed and don't weaken existing assertions. Lines 266-290 in the test file specifically test the core worktree boundary functionality. Phase 4: Code Quality ReviewStrengths:
Potential Issues:
Code Style: Follows existing patterns in the codebase. Uses proper JSDoc documentation. Phase 5: Backlog ComplianceZero-dep: ✅ No new runtime dependencies. Uses only Node.js built-ins ( Foundation-aligned: ✅ Fully aligned with FOUNDATION.md principles:
Problem-fit: 4/5 — This directly addresses a critical workflow issue for AI coding assistants using parallel sessions. When agents can't properly isolate their graph databases in worktrees, they interfere with each other and produce incorrect results. This fix enables the intended parallel session workflow. Breaking: No — This is a pure enhancement. The Tier classification: Tier 1 (zero-dep ✓ + foundation-aligned ✓) Critical ConcernsNone identified. This is a well-engineered fix for a legitimate workflow issue. Minor Suggestions
Final Recommendation
|
Greptile SummaryThis PR fixes a real correctness bug where Key changes:
Confidence Score: 4/5
Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A["findDbPath(customPath)"] --> B{customPath?}
B -- yes --> C["return path.resolve(customPath)"]
B -- no --> D["ceiling = findRepoRoot()"]
D --> D1{cached &\ncwd unchanged?}
D1 -- yes --> D2["return _cachedRepoRoot"]
D1 -- no --> D3["git rev-parse --show-toplevel"]
D3 -- success --> D4["realpathSync(raw) → ceiling"]
D3 -- throws --> D5["ceiling = null"]
D4 --> E
D5 --> E
D2 --> E
E["dir = realpathSync(cwd)"]
E --> F["candidate = dir/.codegraph/graph.db"]
F --> G{exists?}
G -- yes --> H["return candidate"]
G -- no --> I{ceiling &&\nisSameDirectory\ndirceiling?}
I -- yes --> J["break — at ceiling"]
I -- no --> K{parent === dir?}
K -- yes --> J
K -- no --> L["dir = parent"]
L --> F
J --> M["base = ceiling ∥ cwd()"]
M --> N["return base/.codegraph/graph.db"]
Last reviewed commit: b87a12b |
tests/unit/db.test.js
Outdated
| it('stops walking at git ceiling and does not find parent DB', () => { | ||
| const origCwd = process.cwd; | ||
| // Mock cwd to be inside the inner "worktree" | ||
| process.cwd = () => innerDir; | ||
| // Mock findRepoRoot to return the worktree root (one level up from innerDir) | ||
| const worktreeRoot = path.join(outerDir, 'worktree'); | ||
| const origFindRepoRoot = findRepoRoot; | ||
|
|
||
| // We can't easily mock findRepoRoot since findDbPath calls it internally, | ||
| // but we can test the behavior by creating a DB at the worktree level | ||
| fs.mkdirSync(path.join(worktreeRoot, '.codegraph'), { recursive: true }); | ||
| fs.writeFileSync(path.join(worktreeRoot, '.codegraph', 'graph.db'), ''); | ||
|
|
||
| try { | ||
| _resetRepoRootCache(); | ||
| const result = findDbPath(); | ||
| // Should find the worktree DB, not the outer one | ||
| expect(result).toContain('worktree'); | ||
| expect(result).toContain('.codegraph'); | ||
| } finally { | ||
| process.cwd = origCwd; | ||
| // Clean up the worktree DB | ||
| fs.rmSync(path.join(worktreeRoot, '.codegraph'), { recursive: true, force: true }); | ||
| } | ||
| }); |
There was a problem hiding this comment.
Ceiling test doesn't exercise ceiling logic
This test runs inside a temp directory that is not a git repo, so findRepoRoot() returns null and the ceiling check on connection.js:104 (if (ceiling && ...)) is never triggered. The test passes because both DBs exist in the walk-up path and the worktree-level DB is found first — this would happen with or without the ceiling fix.
To actually verify that the ceiling prevents crossing the worktree boundary, you'd need to either:
- Mock
findRepoRoot(e.g., viavi.mockor dependency injection) to return theworktreeRootpath, then remove the worktree-level DB so the only DB is the outer one — and assert it is NOT found. - Initialize an actual git repo (via
git init) in theworktreeRootsofindRepoRoot()returns it as the ceiling.
Without this, the core feature of this PR (stopping the walk at the git boundary) has no effective test coverage.
Additionally, origFindRepoRoot on line 272 is assigned but never used — leftover from an abandoned mocking approach.
There was a problem hiding this comment.
Fixed in 83efde5. The ceiling test now uses git init in the worktree root so findRepoRoot() returns a real ceiling. The test no longer creates a worktree-level DB — the only DB is in the outer dir (beyond the ceiling), and the assertion verifies the walk stops at the ceiling and returns the default path at the worktree root, not the outer DB. Also removed the unused origFindRepoRoot variable.
| it('returns null when not in a git repo', () => { | ||
| const root = findRepoRoot(os.tmpdir()); | ||
| // os.tmpdir() is typically not a git repo; if it is, skip gracefully | ||
| if (root === null) { | ||
| expect(root).toBeNull(); | ||
| } | ||
| }); |
There was a problem hiding this comment.
Test silently passes without asserting
If os.tmpdir() happens to be inside a git repo (possible on some CI environments or developer machines), this test passes without making any assertion at all — the if block is skipped. Consider using vi.spyOn to mock execFileSync to throw, ensuring the null path is always exercised regardless of the host environment.
There was a problem hiding this comment.
Fixed in 83efde5. Now uses vi.mock('node:child_process') with mockImplementationOnce to force execFileSync to throw, ensuring the null path is always exercised regardless of host environment.
tests/unit/db.test.js
Outdated
| process.cwd = () => innerDir; | ||
| // Mock findRepoRoot to return the worktree root (one level up from innerDir) | ||
| const worktreeRoot = path.join(outerDir, 'worktree'); | ||
| const origFindRepoRoot = findRepoRoot; |
There was a problem hiding this comment.
Unused variable
origFindRepoRoot is assigned but never used. This appears to be a leftover from an abandoned attempt to mock findRepoRoot.
| const origFindRepoRoot = findRepoRoot; |
There was a problem hiding this comment.
Fixed in 83efde5. Removed the unused origFindRepoRoot variable.
…t test - Ceiling test now uses `git init` to create a real git repo boundary, and verifies the outer DB is NOT found (without the ceiling fix it would be). Added separate test for finding DB within the boundary. - Non-git test uses a fresh mkdtemp dir instead of os.tmpdir(). - Added debug log when ceiling stops the walk-up. - Removed unused `origFindRepoRoot` variable.
|
Addressed all three review comments in deac23e: 1. Ceiling test doesn't exercise ceiling logic — Rewrote the test to use 2. Test silently passes without asserting — Replaced 3. Unused variable Also added a |
| const base = ceiling || process.cwd(); | ||
| return path.join(base, '.codegraph', 'graph.db'); |
There was a problem hiding this comment.
Undocumented behavior change for non-worktree users
When no existing DB is found during the walk-up, the default creation path has silently changed from process.cwd() to ceiling (the git repo root) for any user inside a git repository. Previously, running codegraph build from /repo/sub/ with no DB would create /repo/sub/.codegraph/graph.db. After this PR, it creates /repo/.codegraph/graph.db.
This is likely the desired long-term behavior, but it's a breaking change for users who intentionally scope their codegraph DB to a project subdirectory rather than the repo root. The PR description only mentions the worktree ceiling fix, not the changed default creation location.
Consider documenting this in the PR description or CHANGELOG, or preserving the previous fallback for non-worktree scenarios:
const base = ceiling || process.cwd();
return path.join(base, '.codegraph', 'graph.db');There was a problem hiding this comment.
Good catch on documenting this. Updated the PR description with a Behavior change section explaining that the default DB creation path is now the git repo root instead of process.cwd() when inside a git repo. The process.cwd() fallback is preserved when not in a git repo (line 109: const base = ceiling || process.cwd()).
| it('caches results when called without arguments', () => { | ||
| _resetRepoRootCache(); | ||
| const first = findRepoRoot(); | ||
| const second = findRepoRoot(); | ||
| expect(first).toBe(second); | ||
| }); |
There was a problem hiding this comment.
Caching test does not actually verify caching
JavaScript string primitives are compared by value, not reference, so expect(first).toBe(second) passes even if execFileSync is called twice and returns two distinct string objects with the same content. The test does not prove that the cache is being used.
To truly verify the caching behaviour, spy on execFileSync and assert it was called exactly once:
it('caches results when called without arguments', () => {
_resetRepoRootCache();
const spy = vi.spyOn(childProcess, 'execFileSync');
const first = findRepoRoot();
const second = findRepoRoot();
expect(first).toBe(second);
expect(spy).toHaveBeenCalledTimes(1); // cached on second call
spy.mockRestore();
});This requires importing node:child_process as a namespace (import * as childProcess from 'node:child_process') so the spy can intercept the module-level binding.
There was a problem hiding this comment.
Fixed in 83efde5. The caching test now uses vi.mock('node:child_process') to create a spy on execFileSync, clears it before the test, and asserts toHaveBeenCalledTimes(1) — proving the second call hits the cache.
tests/unit/db.test.js
Outdated
| it('does not use cache when called with explicit dir', () => { | ||
| _resetRepoRootCache(); | ||
| const fromCwd = findRepoRoot(); | ||
| // Calling with an explicit dir should still work (not use cwd cache) | ||
| const fromExplicit = findRepoRoot(process.cwd()); | ||
| expect(fromExplicit).toBe(fromCwd); | ||
| }); |
There was a problem hiding this comment.
Test name does not match what is being verified
The test is named "does not use cache when called with explicit dir", implying it verifies that execFileSync is invoked again when fromDir is provided. However, the test only asserts that both calls return the same value — which would pass whether or not the cache is bypassed.
Consider either renaming the test to reflect what is actually checked ('returns same value whether using cache or explicit dir'), or adding a spy to assert that execFileSync is called a second time when an explicit fromDir is passed.
| it('does not use cache when called with explicit dir', () => { | |
| _resetRepoRootCache(); | |
| const fromCwd = findRepoRoot(); | |
| // Calling with an explicit dir should still work (not use cwd cache) | |
| const fromExplicit = findRepoRoot(process.cwd()); | |
| expect(fromExplicit).toBe(fromCwd); | |
| }); | |
| it('returns same value whether using cache or explicit dir', () => { |
There was a problem hiding this comment.
Fixed in 83efde5. Renamed to "bypasses cache when called with explicit dir" and added a spy assertion: expect(execFileSyncSpy).toHaveBeenCalledTimes(2) — confirming the second call with an explicit dir actually invokes execFileSync again instead of using the cache.
- Mock execFileSync via vi.mock to verify caching calls exactly once and cache bypass calls twice (not just comparing return values) - Mock execFileSync to throw in "not in git repo" test so the null path is always exercised regardless of host environment - Rename "does not use cache" test to "bypasses cache" with spy assertion
…edback" This reverts commit 83efde5.
…eedback" This reverts commit ccb8bd5.
| it('falls back gracefully when not in a git repo', () => { | ||
| const emptyDir = fs.mkdtempSync(path.join(os.tmpdir(), 'cg-nogit-')); | ||
| const origCwd = process.cwd; | ||
| process.cwd = () => emptyDir; | ||
| _resetRepoRootCache(); | ||
| try { | ||
| const result = findDbPath(); | ||
| // Should return default path at cwd since there's no git ceiling | ||
| expect(result).toBe(path.join(emptyDir, '.codegraph', 'graph.db')); | ||
| } finally { | ||
| process.cwd = origCwd; | ||
| fs.rmSync(emptyDir, { recursive: true, force: true }); | ||
| } | ||
| }); |
There was a problem hiding this comment.
Flaky non-git fallback test — same issue that was fixed in findRepoRoot
This test creates emptyDir inside os.tmpdir() but never mocks execFileSync to throw. The spy's default implementation calls the real execFileSync, so if the test host's os.tmpdir() happens to be inside a git repository (possible on some CI setups or Docker images), findRepoRoot() will return a non-null ceiling. findDbPath() would then return ceiling/.codegraph/graph.db instead of emptyDir/.codegraph/graph.db, making the expect(result).toBe(path.join(emptyDir, ...)) assertion fail.
The analogous test in findRepoRoot ('returns null when not in a git repo') was already fixed by using mockImplementationOnce to force execFileSync to throw. The same fix should be applied here:
it('falls back gracefully when not in a git repo', () => {
const emptyDir = fs.mkdtempSync(path.join(os.tmpdir(), 'cg-nogit-'));
const origCwd = process.cwd;
process.cwd = () => emptyDir;
_resetRepoRootCache();
execFileSyncSpy.mockImplementationOnce(() => {
throw new Error('not a git repo');
});
try {
const result = findDbPath();
expect(result).toBe(path.join(emptyDir, '.codegraph', 'graph.db'));
} finally {
process.cwd = origCwd;
fs.rmSync(emptyDir, { recursive: true, force: true });
}
});There was a problem hiding this comment.
Fixed in e3c1d4a. Added \ before the test, ensuring \ always returns null regardless of host environment.
tests/unit/db.test.js
Outdated
| * Unit tests for src/db.js — build_meta helpers included | ||
| */ | ||
|
|
||
| import { execFileSync as realExecFileSync } from 'node:child_process'; |
There was a problem hiding this comment.
realExecFileSync is not the real function — it's the spy
Due to Vitest's vi.mock hoisting, vi.mock('node:child_process', ...) is applied before all import statements are resolved, including this one. As a result, realExecFileSync is bound to execFileSyncSpy, not the unspied original.
This is currently harmless — the spy's default mockImplementation delegates to the real execFileSync, and realExecFileSync is only used in beforeAll (before any mockImplementationOnce calls). However, the name realExecFileSync is misleading: it implies the import bypasses the mock, which it does not. A future developer who relies on this assumption (e.g., calling it after a mockReset()) would get unexpected behavior.
Consider renaming it to execFileSyncUnspied and obtaining it via importOriginal inside the factory, or at minimum add a comment:
| import { execFileSync as realExecFileSync } from 'node:child_process'; | |
| import { execFileSync as execFileSyncForSetup } from 'node:child_process'; // Note: this is the spy's delegate, not the original — only safe to call before test-level mockImplementationOnce overrides |
There was a problem hiding this comment.
Fixed in e3c1d4a. Renamed realExecFileSync → execFileSyncForSetup and added a comment explaining it resolves to the spy due to vi.mock hoisting, and is only safe to call before test-level mockImplementationOnce overrides.
…rt name - Mock execFileSync to throw in "falls back gracefully when not in a git repo" test, preventing flakiness when os.tmpdir() is inside a git repo - Rename realExecFileSync → execFileSyncForSetup with comment explaining it resolves to the spy due to vi.mock hoisting
Combine PR's connection.js imports (findRepoRoot, _resetRepoRootCache) with main's barrel db/index.js pattern. Add missing re-exports to barrel. Impact: 22 functions changed, 10 affected
| @@ -144,11 +160,13 @@ describe('findDbPath', () => { | |||
| const origCwd = process.cwd; | |||
| process.cwd = () => emptyDir; | |||
| try { | |||
| _resetRepoRootCache(); | |||
| const result = findDbPath(); | |||
| expect(result).toContain('.codegraph'); | |||
| expect(result).toContain('graph.db'); | |||
| } finally { | |||
| process.cwd = origCwd; | |||
There was a problem hiding this comment.
findDbPath fallback test doesn't control the ceiling
This test calls _resetRepoRootCache() but never mocks execFileSync to throw, so findRepoRoot() will call the real git rev-parse --show-toplevel from emptyDir. If the test host's os.tmpdir() is inside a git repo (common in some CI setups), ceiling will be set to that repo root and the default path returned will be <git-root>/.codegraph/graph.db rather than emptyDir/.codegraph/graph.db. The toContain assertions pass either way, so the test silently misses validating the actual fallback location.
The analogous 'falls back gracefully when not in a git repo' test in the new findDbPath with git ceiling describe already solves this correctly with execFileSyncSpy.mockImplementationOnce. The same treatment should be applied here:
| it('returns default path when no DB found', () => { | |
| const emptyDir = fs.mkdtempSync(path.join(tmpDir, 'empty-')); | |
| const origCwd = process.cwd; | |
| process.cwd = () => emptyDir; | |
| execFileSyncSpy.mockImplementationOnce(() => { | |
| throw new Error('not a git repo'); | |
| }); | |
| try { | |
| _resetRepoRootCache(); | |
| const result = findDbPath(); | |
| expect(result).toBe(path.join(emptyDir, '.codegraph', 'graph.db')); | |
| } finally { | |
| process.cwd = origCwd; | |
| _resetRepoRootCache(); | |
| } | |
| }); |
This ensures ceiling = null, the process.cwd() fallback is exercised deterministically, and the assertion is precise enough to detect regressions.
There was a problem hiding this comment.
Fixed in 0d0fdd9. The test now mocks execFileSync to throw (via mockImplementationOnce) so findRepoRoot() always returns null, ensuring the process.cwd() fallback is deterministically exercised. Also tightened the assertion from toContain to toBe with the exact expected path.
There was a problem hiding this comment.
This was already fixed in 0d0fdd9 — the test now mocks execFileSync to throw via mockImplementationOnce, ensuring findRepoRoot() always returns null regardless of host environment. The assertion was also tightened from toContain to toBe with the exact expected path.
On macOS, os.tmpdir() returns /var/folders/... but git rev-parse returns /private/var/folders/... (resolved symlink). The ceiling comparison failed because the paths didn't match. Use fs.realpathSync on cwd to normalize symlinks before comparing against the ceiling. Impact: 1 functions changed, 1 affected
Impact: 1 functions changed, 1 affected
…boundary Impact: 9 functions changed, 19 affected
The 'returns default path when no DB found' test didn't control the git ceiling — if tmpDir was inside a git repo, findRepoRoot() would return a non-null ceiling and the default path would differ from emptyDir. Mock execFileSync to throw so the cwd fallback is always exercised.
…m ceiling On macOS, os.tmpdir() returns /var/... but git resolves symlinks to /private/var/..., and on Windows, 8.3 short names (RUNNER~1) differ from long names (runneradmin). findDbPath normalizes dir via fs.realpathSync but findRepoRoot only used path.resolve, causing the ceiling comparison to fail — the walk crossed the worktree boundary. Fix findRepoRoot to use fs.realpathSync on git output, and resolve test paths after directory creation so assertions match. Impact: 1 functions changed, 77 affected
| try { | ||
| _resetRepoRootCache(); |
There was a problem hiding this comment.
Test assertion will fail on macOS when /tmp is the temp dir
result is built from ceiling, which is the real (symlink-resolved) path returned by git rev-parse --show-toplevel. On macOS, if os.tmpdir() resolves through /tmp (which is a symlink to /private/tmp), git will return /private/tmp/cg-ceiling-xxx/worktree while worktreeRoot is still /tmp/cg-ceiling-xxx/worktree. The toBe assertion then compares:
- actual:
/private/tmp/cg-ceiling-xxx/worktree/.codegraph/graph.db - expected:
/tmp/cg-ceiling-xxx/worktree/.codegraph/graph.db
These are not equal, so the test fails. The production code in connection.js is actually correct — realpathSync is applied to dir precisely to ensure ceiling (from git) and dir agree on symlinks. The test expectation just needs to match the same resolved form:
| try { | |
| _resetRepoRootCache(); | |
| expect(result).toBe(path.join(fs.realpathSync(worktreeRoot), '.codegraph', 'graph.db')); | |
| expect(result).not.toContain(`${path.basename(outerDir)}${path.sep}.codegraph`); |
There was a problem hiding this comment.
Fixed in 651c164. The test now uses \ for the expected ceiling instead of the test's , so the assertion always matches the path form that \ uses internally. This handles both the macOS symlink case (\ → ) and the Windows 8.3 short name case (\ → ). Additionally, the production code now uses a stat-based \ comparison (dev+ino fallback) instead of string equality for the ceiling check, which is robust against all path representation differences.
git rev-parse resolves 8.3 short names (RUNNER~1 → runneradmin) but fs.realpathSync may not, causing the string comparison to fail and the walk to escape the git ceiling boundary. Replace string equality with isSameDirectory() that falls back to dev+ino comparison when paths differ textually but refer to the same directory. Also fix the test assertion to use findRepoRoot() for the expected ceiling path instead of the test's worktreeRoot, since git may resolve paths differently than realpathSync. Impact: 2 functions changed, 137 affected
| export { | ||
| _resetRepoRootCache, | ||
| closeDb, | ||
| findDbPath, | ||
| findRepoRoot, | ||
| openDb, | ||
| openReadonlyOrFail, | ||
| } from './connection.js'; |
There was a problem hiding this comment.
Test-only utility exported through the public barrel
_resetRepoRootCache is a testing helper (signalled by the leading underscore) but is now re-exported from the public barrel src/db/index.js. Any consumer that imports from …/db/index.js will see this function in their IDE completions and bundled types, and could accidentally call it in production code to blow away the per-process cache.
A cleaner approach is to have the test files import directly from the implementation module:
// in tests/unit/db.test.js
import { _resetRepoRootCache, findRepoRoot, ... } from '../../src/db/connection.js';Then remove _resetRepoRootCache from the barrel export, keeping the public API free of test infrastructure.
There was a problem hiding this comment.
Fixed in the latest push. Removed _resetRepoRootCache from the src/db/index.js barrel export. The test file now imports it directly from ../../src/db/connection.js, keeping the public API free of test infrastructure.
| it('finds DB within the ceiling boundary', () => { | ||
| // Create a DB inside the worktree — should be found normally | ||
| fs.mkdirSync(path.join(worktreeRoot, '.codegraph'), { recursive: true }); | ||
| fs.writeFileSync(path.join(worktreeRoot, '.codegraph', 'graph.db'), ''); | ||
| const origCwd = process.cwd; | ||
| process.cwd = () => innerDir; | ||
| try { | ||
| _resetRepoRootCache(); | ||
| const result = findDbPath(); | ||
| expect(result).toContain('worktree'); | ||
| expect(result).toContain('.codegraph'); | ||
| } finally { | ||
| process.cwd = origCwd; | ||
| fs.rmSync(path.join(worktreeRoot, '.codegraph'), { recursive: true, force: true }); | ||
| } | ||
| }); |
There was a problem hiding this comment.
Loose assertions don't validate the found DB location precisely
The two toContain assertions verify that the result string contains 'worktree' and '.codegraph', but they don't confirm the DB was found at worktreeRoot rather than innerDir. A path like path.join(innerDir, '.codegraph', 'graph.db') would also satisfy both checks. Every other assertion in this describe block uses toBe with an exact path. Consider using the same here for consistency and precision:
| it('finds DB within the ceiling boundary', () => { | |
| // Create a DB inside the worktree — should be found normally | |
| fs.mkdirSync(path.join(worktreeRoot, '.codegraph'), { recursive: true }); | |
| fs.writeFileSync(path.join(worktreeRoot, '.codegraph', 'graph.db'), ''); | |
| const origCwd = process.cwd; | |
| process.cwd = () => innerDir; | |
| try { | |
| _resetRepoRootCache(); | |
| const result = findDbPath(); | |
| expect(result).toContain('worktree'); | |
| expect(result).toContain('.codegraph'); | |
| } finally { | |
| process.cwd = origCwd; | |
| fs.rmSync(path.join(worktreeRoot, '.codegraph'), { recursive: true, force: true }); | |
| } | |
| }); | |
| it('finds DB within the ceiling boundary', () => { | |
| // Create a DB inside the worktree — should be found normally | |
| fs.mkdirSync(path.join(worktreeRoot, '.codegraph'), { recursive: true }); | |
| fs.writeFileSync(path.join(worktreeRoot, '.codegraph', 'graph.db'), ''); | |
| const origCwd = process.cwd; | |
| process.cwd = () => innerDir; | |
| try { | |
| _resetRepoRootCache(); | |
| const result = findDbPath(); | |
| const ceiling = findRepoRoot(); | |
| expect(result).toBe(path.join(ceiling, '.codegraph', 'graph.db')); | |
| } finally { | |
| process.cwd = origCwd; | |
| fs.rmSync(path.join(worktreeRoot, '.codegraph'), { recursive: true, force: true }); | |
| } | |
| }); |
There was a problem hiding this comment.
Fixed — tightened the assertion to use toBe with the exact ceiling-based path, consistent with the rest of the describe block.
| export function findRepoRoot(fromDir) { | ||
| const dir = fromDir || process.cwd(); | ||
| if (!fromDir && _cachedRepoRoot !== undefined) return _cachedRepoRoot; | ||
| let root = null; | ||
| try { | ||
| const raw = execFileSync('git', ['rev-parse', '--show-toplevel'], { | ||
| cwd: dir, | ||
| encoding: 'utf-8', | ||
| stdio: ['pipe', 'pipe', 'pipe'], | ||
| }).trim(); | ||
| // Use realpathSync to resolve symlinks (macOS /var → /private/var) and | ||
| // 8.3 short names (Windows RUNNER~1 → runneradmin) so the ceiling path | ||
| // matches the realpathSync'd dir in findDbPath. | ||
| try { | ||
| root = fs.realpathSync(raw); | ||
| } catch { | ||
| root = path.resolve(raw); | ||
| } | ||
| } catch { | ||
| root = null; | ||
| } | ||
| if (!fromDir) _cachedRepoRoot = root; | ||
| return root; | ||
| } |
There was a problem hiding this comment.
Stale cache if process.cwd() changes after first call
findRepoRoot() (no args) caches its result the first time it's called and returns that value on every subsequent no-arg call — even if process.cwd() has changed in the meantime. In a long-running process (e.g. an MCP server that serves multiple worktree sessions sequentially), a second session with a different cwd would receive the ceiling from the first session. findDbPath() then computes dir from the current process.cwd() but compares it against a ceiling from the old cwd, allowing the walk to escape the expected boundary.
The per-process cache is intentional and documented, and it's benign if each codegraph invocation is a fresh process. However, for the MCP-server use case explicitly called out in the PR, the lifetime of the process spans multiple client connections. A defensive fix would be to key the cache on the cwd value captured at call time:
let _cachedRepoRoot; // undefined = not computed
let _cachedRepoRootCwd; // cwd when the cache was populated
export function findRepoRoot(fromDir) {
const dir = fromDir || process.cwd();
if (!fromDir && _cachedRepoRoot !== undefined && _cachedRepoRootCwd === dir) {
return _cachedRepoRoot;
}
// ... rest of implementation ...
if (!fromDir) {
_cachedRepoRoot = root;
_cachedRepoRootCwd = dir;
}
return root;
}There was a problem hiding this comment.
Fixed — the cache is now keyed on the cwd value at call time. If process.cwd() changes between calls (e.g. MCP server serving multiple sessions), the cache is invalidated and recomputed. _resetRepoRootCache clears both the cached value and the cwd key.
Impact: 2 functions changed, 2 affected
…hort names Impact: 1 functions changed, 1 affected
…tave/codegraph into fix/db-path-worktree-boundary Impact: 13 functions changed, 3 affected
On Windows CI, fs.realpathSync(process.cwd()) and git rev-parse --show-toplevel can resolve 8.3 short names (RUNNER~1) differently than long names (runneradmin). Apply realpathSync to both sides of the assertion so the comparison is path-equivalent.
…boundary # Conflicts: # src/db/index.js Impact: 10 functions changed, 11 affected
Summary
findRepoRoot()tosrc/db/connection.jsusinggit rev-parse --show-toplevel, which returns the correct root for both regular repos and git worktrees. Results are cached per-process.findRepoRoot()as a ceiling infindDbPath()so the walk-up stops at the git boundary — worktree sessions resolve to their own DB, not the parent repo's.findRepoRoot(caching, non-git fallback) andfindDbPathceiling behavior (worktree simulation, non-git fallback).Context
When codegraph runs inside a git worktree (e.g.
.claude/worktrees/agent-xxx/),findDbPath()walked up fromprocess.cwd()and crossed the worktree boundary, finding the main repo's database. This made codegraph unusable in parallel worktree sessions — the exact workflow it's designed to support.Behavior change
When no existing
.codegraph/graph.dbis found during the walk-up, the default DB creation path is now the git repo root (fromfindRepoRoot()) instead ofprocess.cwd(). For example, runningcodegraph buildfrom/repo/packages/foo/with no pre-existing DB now creates/repo/.codegraph/graph.dbinstead of/repo/packages/foo/.codegraph/graph.db. When not inside a git repo, the previous behavior (process.cwd()) is preserved. This is intentional — the repo root is the correct default for most use cases.Test plan
npm test— all existing tests passcd .claude/worktrees/xxx && npx codegraph build . && npx codegraph stats— uses worktree DBcodegraph stats— still works as beforeCompatibility
Only 2 files changed (
src/db/connection.js,tests/unit/db.test.js). No overlap with #456 onconnection.js. Minor merge conflict possible intests/unit/db.test.jsimport lines — trivial to resolve.