fix(pre-tool-use): don't intercept incidental memory-path mentions by Kaustubh1504 · Pull Request #251 · activeloopai/hivemind

Kaustubh1504 · 2026-06-09T23:23:26Z

Closes #87

Problem

The Bash branch ran touchesMemory(cmd) as a substring/regex test, so the
memory path triggered interception even when it only appeared as inert text
in a prompt — e.g. claude -p 'use ~/.deeplake/memory/' was rewritten to the
[RETRY REQUIRED] echo instead of running.

Fix

Added bashTouchesMemory() (in memory-path-utils.ts), which tokenizes the
command (parseBashTokens) and applies a narrow carve-out on top of broad
detection:

Default: any memory-path reference is a real interaction → intercept. This
keeps file builtins routed and interpreters/fetchers/substitutions on the
retry-guidance path (preserves the host-access + path-traversal protections
the existing tests pin).
Carve-out: echo/printf/claude stages where the path is inert text pass
through — UNLESS it's a redirection target (a write, the documented
echo '<content>' > '<path>' path) or smuggles a $()/backtick
substitution, which fall through to be caught.

Swapped touchesMemory → bashTouchesMemory in the two Bash-command gates in
pre-tool-use.ts; bare-path detection for Read/Grep/Glob/Write/Edit is
unchanged.

Acceptance (#87)

cat ~/.deeplake/memory/index.md — still intercepted
claude -p 'use ~/.deeplake/memory/' — passes through
echo '~/.deeplake/memory/' — passes through
grep foo ~/.deeplake/memory/summaries/ — still intercepted

Why not the argv/command-name approach from the issue

Command-name alone can't satisfy the acceptance: it can't pass echo '<path>'
while keeping rm/python3 on a bare memory path routed/guided, since those
are non-reader commands too. Worse, "unrecognized → pass through" would let
python3 ~/.deeplake/memory/../../../etc/passwd reach the host shell. The
carve-out approach passes all four cases and keeps the existing security tests
green.

Tests

Added a #87 block covering the two passthrough cases plus boundaries (write
redirect, $() substitution, quoted reader still intercept). Full existing
suite stays green.

Out of scope

parseBashTokens handles ; | || && \n, quotes, and >/>>; it does not
unescape backslashes, fully model 2>/&> fd-redirects, or parse <(
(isSafe() rejects <(, $(), backticks upstream).

Summary by CodeRabbit

New Features
- Improved Bash-aware parsing and memory-interaction detection for shell commands, used by pre-tool guidance.
Bug Fixes
- Avoids false positives for incidental memory-path mentions (e.g., printed, quoted, or heredoc content) while still catching real interactions like redirects, command substitution, and process reads.
Tests
- Added comprehensive boundary tests for quoting, substitutions, redirections, heredocs, multi-stage commands, and passthrough cases.
Chores
- CI workflow: tightened condition for posting coverage comments on PRs.

Tighten the Bash memory check so a path that only appears as inert text (echo/printf/claude prompt) passes through, while real interactions (file builtins, interpreters, redirects, substitutions) still route/guide. Closes activeloopai#87

coderabbitai · 2026-06-09T23:23:36Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: bea32559-bc71-4e54-be0a-2677202a76f4

📥 Commits

Reviewing files that changed from the base of the PR and between cc37426 and abbae04.

📒 Files selected for processing (4)

.github/workflows/ci.yaml
src/hooks/memory-path-utils.ts
tests/claude-code/pre-tool-use.test.ts
tests/shared/memory-path-utils.test.ts

✅ Files skipped from review due to trivial changes (1)

.github/workflows/ci.yaml

🚧 Files skipped from review as they are similar to previous changes (3)

tests/claude-code/pre-tool-use.test.ts
src/hooks/memory-path-utils.ts
tests/shared/memory-path-utils.test.ts

📝 Walkthrough

Walkthrough

Replaces substring memory-path checks with bash-aware tokenization and detection, wires bashTouchesMemory into the pre-tool-use hook, adds unit and integration tests for passthrough and boundary cases, and tightens a CI workflow step condition.

Changes

Bash memory path detection refinement

Layer / File(s)	Summary
Bash command parsing and memory-interaction detection `src/hooks/memory-path-utils.ts`	New `parseBashTokens` tokenizer respects quoting and escapes to split commands into stages with tokens. `bashTouchesMemory` uses it (and strips heredoc bodies) to detect memory-path interactions: flags redirection targets, treats command/process substitution + memory as interaction, exempts inert mentions inside `echo`/`printf`/`claude` stages, and flags other token-level memory uses.
Pre-tool-use hook integration `src/hooks/pre-tool-use.ts`	Imports `bashTouchesMemory` and replaces `touchesMemory` with `bashTouchesMemory` in the Bash branch of `getShellCommand` and the fallback retry-guidance path in `processPreToolUse`.
Test coverage for pass-through and interception `tests/claude-code/pre-tool-use.test.ts`, `tests/shared/memory-path-utils.test.ts`	Adds tests asserting incidental mentions (e.g., `claude -p '~/.deeplake/memory/'`, `echo` printing a memory path) pass through, and boundary cases (redirects, command substitution, quoted read paths, process substitution, input redirects, heredoc inertness) continue to trigger interception and guidance.

CI workflow condition tweak

Layer / File(s)	Summary
Post-coverage comment condition `.github/workflows/ci.yaml`	The "Post coverage comment on PR" step `if:` condition now additionally checks that the PR head repo equals the repository (fork check) while retaining the non-empty `steps.pr-coverage.outputs.body-file` requirement.

Sequence Diagram(s)

sequenceDiagram
  participant CLI as CLI command string
  participant Parser as parseBashTokens
  participant Detector as bashTouchesMemory
  participant Hook as preToolUseHook

  CLI->>Parser: raw bash command
  Parser->>Detector: stages and tokens (heredoc bodies stripped)
  Detector->>Hook: boolean: touchesMemory?
  Hook->>Hook: decide rewrite / guidance / passthrough

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

activeloopai/hivemind#229: Related handling of quoted heredoc bodies when detecting memory-path occurrences.

Suggested reviewers

efenocchi
kaghni

Poem

🐇 I hop through quotes and backslashes with care,
Parsing stages and tokens hidden there.
Redirects and subs show when memory's touched,
Echoes and prompts left alone — not too much.
A tiny rabbit cheers: no more false scare!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 20.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately describes the main change: replacing broad substring matching with token-aware detection to avoid intercepting incidental memory-path mentions in Bash commands.
Description check	✅ Passed	The description provides a clear problem statement, solution explanation, acceptance criteria validation, and test coverage details aligned with the template.
Linked Issues check	✅ Passed	All four acceptance criteria from `#87` are addressed: real memory-path references are intercepted, incidental mentions in prompts pass through, echo commands pass through, and grep on memory paths is intercepted.
Out of Scope Changes check	✅ Passed	All changes are directly related to fixing `#87`: new bash tokenizer/detection in memory-path-utils, integration into pre-tool-use gates, comprehensive tests, and a CI workflow refinement unrelated to the fix scope.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

🧹 Nitpick comments (1)

tests/claude-code/pre-tool-use.test.ts (1)

503-519: 💤 Low value

Consider strengthening boundary test assertions.

The boundary tests correctly verify that genuine memory interactions are intercepted (empty === false), but they don't assert on the decision type or content. For consistency with other interception tests in this file (e.g., lines 148-152 which check r.decision and r.updatedCommand), consider adding assertions on the expected decision shape.

Example strengthening

   it("still intercepts `echo` redirecting INTO memory (documented write path)", () => {
     const r = runPreToolUse("Bash", { command: "echo 'hi' > ~/.deeplake/memory/note.md" });
     expect(r.empty).toBe(false);
+    if (!r.empty) {
+      expect(r.decision).toBe("allow");
+      expect(r.updatedCommand).toContain("RETRY REQUIRED");
+    }
   });

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/claude-code/pre-tool-use.test.ts` around lines 503 - 519, The three
boundary tests using runPreToolUse ("still intercepts `echo` redirecting INTO
memory (documented write path)", "still intercepts `echo` with a substitution
touching memory", "intercepts a quoted reader path") only assert r.empty but
should also assert the interception decision and updated command like other
tests do; update each test to check r.decision (e.g., equals "intercept" or
matches the existing decision shape used in earlier tests) and r.updatedCommand
(or the expected sanitized/redirected command string) to ensure the decision
shape and content are verified alongside r.empty, referencing runPreToolUse,
r.decision, and r.updatedCommand to locate the change points.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@tests/claude-code/pre-tool-use.test.ts`:
- Around line 503-519: The three boundary tests using runPreToolUse ("still
intercepts `echo` redirecting INTO memory (documented write path)", "still
intercepts `echo` with a substitution touching memory", "intercepts a quoted
reader path") only assert r.empty but should also assert the interception
decision and updated command like other tests do; update each test to check
r.decision (e.g., equals "intercept" or matches the existing decision shape used
in earlier tests) and r.updatedCommand (or the expected sanitized/redirected
command string) to ensure the decision shape and content are verified alongside
r.empty, referencing runPreToolUse, r.decision, and r.updatedCommand to locate
the change points.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 3f5f08c1-2a08-4c2f-89f0-6d7433008af1

📥 Commits

Reviewing files that changed from the base of the PR and between 7ee79b3 and ebe4c6f.

📒 Files selected for processing (3)

src/hooks/memory-path-utils.ts
src/hooks/pre-tool-use.ts
tests/claude-code/pre-tool-use.test.ts

The echo/printf/claude passthrough let three genuinely-interacting shapes reach the host shell: - Process substitution: the stage-level guard checked $( and backticks but not <(. The PR notes said isSafe() rejects <( upstream, but isSafe only runs AFTER bashTouchesMemory returns true. Moved the substitution guard to the top of bashTouchesMemory on the raw command, covering <(. - Input redirection: < was not tokenized, so feeding a memory file into `claude -p` via `< <memory-path>` passed through. < is now its own token (<</<<< lumped and ignored) and a redirect target check runs alongside >/>>. - Quote parsing: backslash was treated as an escape inside single quotes, where bash treats it as literal — `echo 'a\' ; cat <memory-path>` swallowed the cat stage into the echo passthrough. Escape handling now skips inside single quotes. All three cases added as regression tests; boundary tests also assert decision shape per review feedback. 4395 tests green.

efenocchi · 2026-06-10T21:23:23Z

Thanks for this — the approach is right: a narrow carve-out on top of broad detection is the correct call, and your justification for rejecting the "unrecognized command → pass through" idea from #87 is spot on.

While reviewing, we probed the carve-out empirically and found three shapes that the old substring check intercepted but the new code passed through to the host shell. Rather than going back and forth, we pushed a follow-up commit (b19fb20) directly to this branch (maintainer edits are enabled) closing them:

Process substitution — echo <(cat <memory-path>). The stage-level guard checked $( and backticks but not <(. The "isSafe() rejects <( upstream" note in the description is inverted: isSafe() only runs after bashTouchesMemory() returns true, so the carve-out skipped it entirely. The substitution guard now runs once at the top of bashTouchesMemory() on the raw command and includes <(.
Input redirection — claude -p 'summarize' < <memory-path> is a genuine memory read, but only >/>> were tokenized. < is now its own token (<</<<< are lumped and ignored so heredocs/herestrings are unaffected) and checked as a redirect target.
Backslash inside single quotes — bash treats \ as literal there, but the parser treated it as an escape, so echo 'a\' ; cat <memory-path> swallowed the cat stage into the echo passthrough. The escape branch now skips inside single quotes.

All three are added as regression tests, the boundary tests now also assert the decision shape (per the CodeRabbit note), and the full suite is green (4395 tests). With these in, this looks good to us — thanks again for the contribution and the careful writeup.

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/claude-code/pre-tool-use.test.ts`:
- Around line 508-511: The test uses a loose substring assertion on
r.updatedCommand (expect(r.updatedCommand).toContain("RETRY REQUIRED")) which
can mask regressions; update the assertions for the cases where !r.empty and
r.decision === "allow" to assert a stricter shape — either compare
r.updatedCommand to a stable expected constant (exact string) or use an anchored
regex to validate the exact guidance format (e.g., start/end or full message
structure) and keep the decision check as expect(r.decision).toBe("allow");
change the four occurrences referencing r.updatedCommand so they assert the
exact/anchored message instead of toContain.
- Around line 541-547: The test only checks r.empty and needs explicit
assertions about the hook decision and guidance; update this test (and the two
sibling cases) to assert the expected decision shape returned by runPreToolUse
(e.g., that r.decision === 'allow' and that the guidance/retry fields contain
the expected retry semantics or message text), and also assert r.empty is false
as before so regressions that change decision mode or guidance are caught; look
for the runPreToolUse call and the test blocks named "still intercepts..." to
add these assertions.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 4549cfe0-3ac6-43dc-9e54-70445534ca83

📥 Commits

Reviewing files that changed from the base of the PR and between ebe4c6f and b19fb20.

📒 Files selected for processing (2)

src/hooks/memory-path-utils.ts
tests/claude-code/pre-tool-use.test.ts

🚧 Files skipped from review as they are similar to previous changes (1)

src/hooks/memory-path-utils.ts

coderabbitai · 2026-06-10T21:25:30Z

+    if (!r.empty) {
+      expect(r.decision).toBe("allow");
+      expect(r.updatedCommand).toContain("RETRY REQUIRED");
+    }


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Strengthen guidance assertions to specific shapes instead of substring-only checks.

On Line 510, Line 519, Line 528, and Line 537, .toContain("RETRY REQUIRED") is broad and can mask wording/shape regressions. Prefer asserting a stricter structure (e.g., decision === "allow" + exact/anchored guidance format or a stable expected message constant).

As per coding guidelines, "tests/**: Prefer asserting on specific values (paths, messages) over generic substrings."

Also applies to: 517-520, 526-529, 535-538

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/claude-code/pre-tool-use.test.ts` around lines 508 - 511, The test uses a loose substring assertion on r.updatedCommand (expect(r.updatedCommand).toContain("RETRY REQUIRED")) which can mask regressions; update the assertions for the cases where !r.empty and r.decision === "allow" to assert a stricter shape — either compare r.updatedCommand to a stable expected constant (exact string) or use an anchored regex to validate the exact guidance format (e.g., start/end or full message structure) and keep the decision check as expect(r.decision).toBe("allow"); change the four occurrences referencing r.updatedCommand so they assert the exact/anchored message instead of toContain.

Source: Coding guidelines

The new helpers were only exercised through the bundled hook in a subprocess, which v8 coverage cannot see, so the per-file 90% gate failed (82.67% statements / 81.01% branches). Direct imports of the real module bring the file to 99.21% statements / 97.46% branches / 100% functions and lines.

coderabbitai

🧹 Nitpick comments (1)

tests/shared/memory-path-utils.test.ts (1)
126-128: 💤 Low value

Consider clarifying the test description.

The test correctly validates that string-based detection intercepts path traversal attempts (the path contains the memory substring even though it would resolve elsewhere). The description "on a memory path" could be more precise, e.g., "intercepts an interpreter with memory-path prefix even when traversing elsewhere" to make the security intent clearer.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/shared/memory-path-utils.test.ts` around lines 126 - 128, Update the
test description to more clearly express the security intent: change the it()
title for the case using bashTouchesMemory(`python3 ${MEM}/../../etc/passwd`) to
something like "intercepts an interpreter with memory-path prefix even when
traversal targets elsewhere" so it references the memory-path prefix and path
traversal; locate the test using the bashTouchesMemory function and MEM constant
in tests/shared/memory-path-utils.test.ts and replace the existing description
string accordingly.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@tests/shared/memory-path-utils.test.ts`:
- Around line 126-128: Update the test description to more clearly express the
security intent: change the it() title for the case using
bashTouchesMemory(`python3 ${MEM}/../../etc/passwd`) to something like
"intercepts an interpreter with memory-path prefix even when traversal targets
elsewhere" so it references the memory-path prefix and path traversal; locate
the test using the bashTouchesMemory function and MEM constant in
tests/shared/memory-path-utils.test.ts and replace the existing description
string accordingly.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 21bf65d5-2b22-40a8-8956-50b65a519915

📥 Commits

Reviewing files that changed from the base of the PR and between b19fb20 and 5ee1d0b.

📒 Files selected for processing (1)

tests/shared/memory-path-utils.test.ts

Fork PRs run with a read-only GITHUB_TOKEN, so the sticky-comment action fails with 'Resource not accessible by integration', marking the job red and skipping the bundle build/smoke steps after it. The coverage gate itself runs (and enforces) earlier regardless.

… name Addresses CodeRabbit review notes: the three new boundary tests now assert decision and retry-guidance content instead of only non-empty output, and the interpreter traversal test name states the security intent.

coderabbitai Bot reviewed Jun 9, 2026

View reviewed changes

coderabbitai Bot reviewed Jun 10, 2026

View reviewed changes

efenocchi added 2 commits June 10, 2026 23:34

efenocchi force-pushed the fix/pre-tool-use-incidental-memory-mentions branch from cc37426 to abbae04 Compare June 10, 2026 23:47

efenocchi merged commit a6f9f55 into activeloopai:main Jun 10, 2026
10 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(pre-tool-use): don't intercept incidental memory-path mentions#251

fix(pre-tool-use): don't intercept incidental memory-path mentions#251
efenocchi merged 5 commits into
activeloopai:mainfrom
Kaustubh1504:fix/pre-tool-use-incidental-memory-mentions

Kaustubh1504 commented Jun 9, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Jun 9, 2026 •

edited

Loading

Reviews paused

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

efenocchi commented Jun 10, 2026

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot Jun 10, 2026

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

Kaustubh1504 commented Jun 9, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Problem

Fix

Acceptance (#87)

Why not the argv/command-name approach from the issue

Tests

Out of scope

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jun 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

efenocchi commented Jun 10, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 10, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Kaustubh1504 commented Jun 9, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 9, 2026 •

edited

Loading