v1.6.16: F-CACHE-DEFER + F-NATIVE-GREP (P0 dogfood fixes) by Anuj7411 · Pull Request #7 · Anuj7411/sipcode

Anuj7411 · 2026-06-21T18:32:43Z

Summary

Two P0 fixes from the 2026-06-17 dogfood backlog, shipped 24 hours before the public launch (Tue 2026-06-23). Tests 1,317 → 1,363 (46 new), 0 regressions.

The launch story arc gains a beat: v1.6.14 (24h) → v1.6.15 (24h) → v1.6.16. Three releases in 9 days, each closing a real user-felt bug found through dogfooding.

What this PR does

F-CACHE-DEFER (the user-felt bug Anuj could not see savings from)

When sipcode init modifies ~/.claude/settings.json mid-session, it invalidates Anthropic's prompt cache. For active users the extra input-token cost was outweighing the proxy savings. In Anuj's 2026-06-17 dogfood run, drift flagged "Cache reuse down 83 points" while proxy --stats reported only ~$1.16 saved.

Fix: Detect active Claude Code sessions before writing settings.json. If active, write a pending-install marker at ~/.sipcode/install-pending.json. The next quiet sipcode command (any command except init) auto-applies the deferred install. The hook script file itself is always written immediately (safe — does not invalidate the cache).

src/modules/init/sessionDetection.ts (pure, 13 tests)
src/modules/init/pendingInstall.ts + maybeApplyPendingInstall (pure, 22 tests)
src/commands/init.ts integration (8 new tests in init-system-setup.test.ts)
src/cli.ts Commander preAction hook
New --force flag on sipcode init for users who want install-now
New StepStatus variant { kind: "deferred"; reason: string } rendered with ⏸ glyph in the SETUP card
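
The gating rule described above can be sketched as a small pure function. This is a minimal illustration, not the code in src/commands/init.ts: the names `decideSettingsWrite` and `InstallDecision` are hypothetical, and the session counter is injected so the sketch stays free of file-system access. The fallback on a throwing detector follows the "proceed with install" behavior listed in the risk assessment.

```typescript
// Illustrative names only; not sipcode's actual exports.
type InstallDecision = "install-now" | "defer";

function decideSettingsWrite(
  force: boolean,
  countActiveSessions: () => number,
): InstallDecision {
  // --force skips detection entirely and writes settings.json right away.
  if (force) return "install-now";

  let active: number;
  try {
    active = countActiveSessions();
  } catch {
    // A failing detector must not block the install.
    return "install-now";
  }

  // Any active session means the write is deferred behind a pending marker.
  return active > 0 ? "defer" : "install-now";
}
```

In this sketch a "defer" result would correspond to writing the pending-install marker, while the hook script file is written in both cases.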

F-NATIVE-GREP (the highest-volume, lowest-integrity rewriter)

native-grep accounts for 30% of all proxy work but kept only 65% of signal, the worst of any rewriter in real-world data. The HEAD_LIMIT=50 cap was too aggressive for typical Claude Code symbol lookups in larger codebases, which return 50–100 matches that Claude needs for follow-up reads.

Fix: Cap 50 → 100, integrity declaration 0.65 → 0.78. 3 new tests verify new cap + integrity + field preservation.
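
To make the effect of the cap concrete, here is a hedged sketch of a head-limit truncation. Only the HEAD_LIMIT name and the values 50 and 100 come from this PR; `capMatches`, `CappedOutput`, and the sample match lines are hypothetical, and the real rewriter may truncate differently.

```typescript
// HEAD_LIMIT value after this PR (previously 50).
const HEAD_LIMIT = 100;

// Hypothetical result shape: the lines kept and how many were cut.
interface CappedOutput {
  kept: string[];
  dropped: number;
}

function capMatches(matches: string[], limit: number = HEAD_LIMIT): CappedOutput {
  const kept = matches.slice(0, limit);
  return { kept, dropped: matches.length - kept.length };
}

// A symbol lookup returning 80 matches, inside the 50-100 range cited above.
const lookup = Array.from({ length: 80 }, (_, i) => `src/file${i}.ts:1:symbol`);

console.log(capMatches(lookup, 50).dropped); // 30 matches lost under the old cap
console.log(capMatches(lookup).dropped);     // 0 matches lost under the new cap
```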

Test plan

Full suite green every commit (1,317 → 1,330 → 1,346 → 1,354 → 1,360 → 1,363)
TypeScript build clean (tsc -p tsconfig.json)
No regressions in any existing module
CHANGELOG entry covering all behaviors and the engineering trail
README test-count badge + llms.txt + llms-full.txt updated to 1,363 / v1.6.16
Anuj acceptance test (after merge): on a fresh Claude Code session with v1.6.16 installed via npm i -g sipcode && sipcode init, verify (a) deferred path triggers when called from inside an active session, (b) --force overrides, (c) proxy --stats shows native-grep integrity ≥ 75% on real workload, (d) drift no longer reports "Cache reuse down N points" after a clean install
sipcode benchmark produces 62.6% median (or within 1-2 points)

How to release after merge

git checkout main
git pull origin main
git tag v1.6.16
git push origin v1.6.16

CI publishes via OIDC. CDN propagates in ~30-60 seconds.

Commit trail (each step revertable in isolation)

| # | Commit | What |
|---|---|---|
| 1 | `7efcf58` | feat(init): detectActiveClaudeSessions |
| 2 | `09330c7` | feat(init): pendingInstall marker module |
| 3 | `55ba152` | feat(init): F-CACHE-DEFER runSystemSetup integration |
| 4 | `0357af3` | feat(cli,init): maybeApplyPendingInstall + --force flag |
| 5 | `7af879e` | feat(proxy): F-NATIVE-GREP cap 50 → 100 |
| 6 | `42e1d5c` | chore(release): bump to 1.6.16 + CHANGELOG + artifacts |

Risk assessment

| Risk | Mitigation |
|---|---|
| Active-session detection false negatives (real session not detected) | Defensive: detection failure degrades to "proceed with install" rather than blocking; `--force` always works |
| Active-session detection false positives (no session, but we defer) | Auto-apply on next sipcode command catches it within seconds; users can always pass `--force` |
| Pending marker survives across sipcode versions | Schema validation rejects unknown future versions rather than mis-applying (explicit guard) |
| nativeGrep cap raise reduces savings | Locked benchmark re-run as part of acceptance test (measurable, not a guess) |
| Auto-apply log noise on every command | Only logs when state actually changed; a no-op clears a stale marker silently |

🤖 Generated with Claude Code

…p 1)

Pure module with injected I/O seam (listDir + stat + now). Scans ~/.claude/projects/<proj>/sessions/<sid>.jsonl for files within the threshold (default 5 min = Anthropic prompt-cache TTL). Defensive: any subtree permission error degrades to "skip subtree", whole walk never throws.

13 tests cover: empty layouts, single recent / stale detection, multi-project counts, listDir failure mid-walk, stat race returning null, non-.jsonl files ignored, custom thresholds.

Test count: 1,317 -> 1,330.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
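
A minimal sketch of the walk this commit describes, assuming "within the threshold" means a file's modification time is at most the threshold old. The `SessionIo` interface, `safeList`, and `countActiveSessions` are hypothetical names chosen for illustration; only the listDir/stat/now seam, the path layout, and the 5-minute default come from the commit message.

```typescript
// Hypothetical I/O seam; shapes are illustrative, not sipcode's actual types.
interface SessionIo {
  listDir(path: string): string[];                 // may throw, e.g. on a permission error
  stat(path: string): { mtimeMs: number } | null;  // null if the file vanished mid-walk
  now(): number;                                   // epoch milliseconds
}

const DEFAULT_THRESHOLD_MS = 5 * 60 * 1000; // 5 minutes, the default cited above

// A directory that cannot be listed is treated as empty ("skip subtree").
function safeList(io: SessionIo, dir: string): string[] {
  try {
    return io.listDir(dir);
  } catch {
    return [];
  }
}

function countActiveSessions(
  io: SessionIo,
  projectsRoot: string,
  thresholdMs: number = DEFAULT_THRESHOLD_MS,
): number {
  let active = 0;
  for (const project of safeList(io, projectsRoot)) {
    const sessionsDir = `${projectsRoot}/${project}/sessions`;
    for (const file of safeList(io, sessionsDir)) {
      if (!file.endsWith(".jsonl")) continue;  // only session transcripts count
      const info = io.stat(`${sessionsDir}/${file}`);
      if (info === null) continue;             // stat race: file disappeared
      if (io.now() - info.mtimeMs <= thresholdMs) active += 1;
    }
  }
  return active;
}
```

Injecting the three I/O functions keeps the walk pure, so each listed test case (listDir failure, stat returning null, custom thresholds) can be driven from an in-memory fake rather than a real directory tree.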

Marker at ~/.sipcode/install-pending.json carries the intent "install proxy hook at scriptPath" when the settings.json write is deferred to protect Anthropic's prompt cache. Schema sipcode-install-pending/1, strictly validated on read so unknown future versions are rejected rather than mis-applied.

applyPendingInstall re-generates the script (latest version even if deferred days ago) and applies installProxyHook against current settings.json so user-managed hook entries survive. Idempotent: second apply returns no-marker.

16 tests cover: round-trip, overwrite, missing/corrupt/wrong-schema/missing-fields rejection, no-op clear, no-marker no-op, full apply, idempotency, no-change redundant apply, non-sipcode hook preservation, corrupt settings.json fallback.

Test count: 1,330 -> 1,346.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
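
The strict read-side validation can be sketched as follows. The schema string and the scriptPath intent come from the commit message; the exact field names (`schema`, `scriptPath`) and the function `parsePendingInstall` are assumptions made for this example, and the real marker may carry more fields.

```typescript
// Schema identifier quoted in the commit message.
const PENDING_SCHEMA = "sipcode-install-pending/1";

// Assumed marker shape; field names are illustrative.
interface PendingInstall {
  schema: typeof PENDING_SCHEMA;
  scriptPath: string;
}

// Returns the marker if it is well-formed, otherwise null.
function parsePendingInstall(raw: string): PendingInstall | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // corrupt file
  }
  if (typeof data !== "object" || data === null) return null;

  const record = data as Record<string, unknown>;
  // Exact match only: an unknown or future schema version is rejected, not guessed at.
  if (record["schema"] !== PENDING_SCHEMA) return null;

  const scriptPath = record["scriptPath"];
  if (typeof scriptPath !== "string" || scriptPath.length === 0) return null;

  return { schema: PENDING_SCHEMA, scriptPath };
}
```

Rejecting anything but an exact schema match is what keeps a marker written by a newer sipcode from being mis-applied by an older one.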

Step 3 of runSystemSetup now gates the settings.json write on active-session detection. When an active Claude Code session is detected, the script file is still written (safe, doesn't invalidate the prompt cache) and a pending marker is dropped so the next quiet sipcode invocation applies the install. The --force flag bypasses the check.

StepStatus gains a "deferred" variant. formatSetupCard renders it with ⏸ glyph and a specific footer ("auto-applies on your next sipcode command outside an active session, or pass --force").

Install marker is also deferred when the proxy is deferred so `sipcode impact` baselines do not get skewed by the deferral window.

Defensive: detection or marker-write failures degrade gracefully. Detection throwing falls through to the normal install path. Marker-write throwing surfaces as proxyHook=failed without blocking later steps.

8 new tests cover: defer when active, install when quiet, --force override, install-marker defer cascade, detection-throws fallback, marker-write failure, singular/plural reason, deferred SETUP card rendering.

Test count: 1,346 -> 1,354.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
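
As an illustration of the new variant, the sketch below models StepStatus as a discriminated union and renders one card line per step. Only the `{ kind: "deferred"; reason: string }` variant and the ⏸ glyph are taken from this PR. The other variants, their glyphs, the reason wording, and the helper names `deferredReason` and `renderStep` are assumptions and will differ from formatSetupCard.

```typescript
type StepStatus =
  | { kind: "ok" }                         // hypothetical variant
  | { kind: "failed"; error: string }      // hypothetical shape
  | { kind: "deferred"; reason: string };  // variant added by this PR

// Builds a reason string with singular/plural agreement (wording is illustrative).
function deferredReason(activeSessions: number): string {
  const noun = activeSessions === 1 ? "session" : "sessions";
  return `${activeSessions} active Claude Code ${noun} detected`;
}

// Renders a single SETUP card line for a step.
function renderStep(label: string, status: StepStatus): string {
  switch (status.kind) {
    case "ok":
      return `✓ ${label}`;
    case "failed":
      return `✗ ${label}: ${status.error}`;
    case "deferred":
      return `⏸ ${label}: ${status.reason}`;
  }
}

console.log(renderStep("proxy hook", { kind: "deferred", reason: deferredReason(1) }));
// ⏸ proxy hook: 1 active Claude Code session detected
```

Because the switch covers every `kind`, adding a further variant later makes the compiler flag any renderer that has not been updated.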

…eps 4+5)

maybeApplyPendingInstall is the CLI startup wrapper for F-CACHE-DEFER: fast no-op when no marker exists; skip-active-session when a session is detected; skip-detection-error when the scan throws; apply otherwise. Logs only when the apply actually changed something on disk, so a stale marker after an `init --force` run is cleared silently.

Wired into cli.ts via Commander preAction hook. Runs before every command except `init` (which manages its own state). Hook failures are swallowed so the user's real command never gets blocked by an ergonomic wrapper.

Added --force option to the `init` command. When set, runSystemSetup bypasses active-session detection and installs settings.json directly.

6 new tests cover: no-op, skip-active-session, apply, defensive detection failure, no-log-when-no-marker, stale-marker-cleanup-without-log.

Test count: 1,354 -> 1,360.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
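
The four outcomes listed in this commit can be sketched as one decision function over injected dependencies. The outcome labels mirror the wording above, but the `ApplyDeps` interface, the function name `maybeApply`, and the log text are hypothetical; the Commander preAction wiring is left out so the sketch has no external imports.

```typescript
type ApplyOutcome =
  | "no-marker"
  | "skip-active-session"
  | "skip-detection-error"
  | "applied";

// Hypothetical dependency seam for the startup wrapper.
interface ApplyDeps {
  hasMarker(): boolean;
  countActiveSessions(): number; // may throw
  apply(): boolean;              // true if something changed on disk
  log(message: string): void;
}

function maybeApply(deps: ApplyDeps): ApplyOutcome {
  // Fast path: nothing pending, so no session scan is needed.
  if (!deps.hasMarker()) return "no-marker";

  let active: number;
  try {
    active = deps.countActiveSessions();
  } catch {
    // Unlike init, a failed scan here leaves settings.json untouched.
    return "skip-detection-error";
  }
  if (active > 0) return "skip-active-session";

  const changed = deps.apply();
  // Stay silent when the apply was a no-op, e.g. a stale marker after init --force.
  if (changed) deps.log("sipcode: applied deferred install");
  return "applied";
}
```

Note the asymmetry with init: there a throwing detector falls through to installing, while here it skips, since the pending install can wait for a later command.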

native-grep was the highest-volume, lowest-integrity rewriter in the 2026-06-17 dogfood data: 30% of all proxy work, 65% signal kept. The HEAD_LIMIT=50 cap was too aggressive for typical Claude Code grep workloads where symbol lookups in larger codebases routinely return 50-100 matches Claude needs for follow-up reads.

Raised cap to 100, integrity declaration 0.65 -> 0.78. 3 new tests verify new cap, integrity score, and field-preservation.

Acceptance criterion (manual): real-world dogfood should show native-grep integrity >= 75% in `sipcode proxy --stats`.

Test count: 1,360 -> 1,363.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…llms.txt

Locks v1.6.16 release artifacts:

- package.json version 1.6.15 -> 1.6.16
- CHANGELOG: [1.6.16] entry with F-CACHE-DEFER + F-NATIVE-GREP narrative
- README test count badge 1,317 -> 1,363
- llms.txt + llms-full.txt: current version + test count refreshed

Full suite green (1,363 / 1,363). Build clean. Ready for self-review PR into main, then `git tag v1.6.16 && git push --tags` to trigger npm publish.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Anuj7411 and others added 6 commits June 21, 2026 20:57

Anuj7411 merged commit d1972cf into main Jun 22, 2026
1 check passed

Anuj7411 merged 6 commits into main from v1.6.16-fixes
