v1.6.16: F-CACHE-DEFER + F-NATIVE-GREP (P0 dogfood fixes)#7

Merged
Anuj7411 merged 6 commits into main from v1.6.16-fixes on Jun 22, 2026

Conversation

@Anuj7411

Owner

Summary

Two P0 fixes from the 2026-06-17 dogfood backlog, shipped 24 hours before the public launch (Tue 2026-06-23). Tests 1,317 → 1,363 (46 new), 0 regressions.

The launch story arc gains a beat: v1.6.14 (24h) → v1.6.15 (24h) → v1.6.16. That is three releases in 9 days, each closing a real, user-felt bug found through dogfooding.

What this PR does

F-CACHE-DEFER (the user-felt bug that kept Anuj from seeing savings)

When sipcode init modifies ~/.claude/settings.json mid-session, it invalidates Anthropic's prompt cache. For active users, the extra input-token cost was outweighing the proxy savings. In Anuj's 2026-06-17 dogfood run, drift flagged "Cache reuse down 83 points" while proxy --stats reported only ~$1.16 saved.

Fix: Detect active Claude Code sessions before writing settings.json. If active, write a pending-install marker at ~/.sipcode/install-pending.json. The next quiet sipcode command (any command except init) auto-applies the deferred install. The hook script file itself is always written immediately (safe — does not invalidate the cache).

  • src/modules/init/sessionDetection.ts (pure, 13 tests)
  • src/modules/init/pendingInstall.ts + maybeApplyPendingInstall (pure, 22 tests)
  • src/commands/init.ts integration (8 new tests in init-system-setup.test.ts)
  • src/cli.ts Commander preAction hook
  • New --force flag on sipcode init for users who want install-now
  • New StepStatus variant { kind: "deferred"; reason: string } rendered with ⏸ glyph in the SETUP card

F-NATIVE-GREP (the highest-volume, lowest-integrity rewriter)

native-grep accounts for 30% of all proxy work but kept only 65% of signal, the worst of any rewriter in real-world data. The HEAD_LIMIT=50 cap was too aggressive for typical Claude Code symbol lookups in larger codebases, which routinely return 50–100 matches that Claude needs for follow-up reads.

Fix: Cap 50 → 100, integrity declaration 0.65 → 0.78. 3 new tests verify new cap + integrity + field preservation.
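
To make the effect of the cap concrete, here is a small sketch of a head-limit truncation. It assumes the rewriter simply keeps the first N match lines and reports a declared integrity score; `capGrepOutput`, `GrepRewrite`, and the `omitted` field are hypothetical names, and the actual rewriter may do more than this.

```typescript
// Hypothetical sketch of a head-limit cap on grep output. Only the two constants come from the PR.
const HEAD_LIMIT = 100;          // raised from 50
const DECLARED_INTEGRITY = 0.78; // raised from 0.65

interface GrepRewrite {
  lines: string[];   // match lines kept for the model
  omitted: number;   // match lines dropped by the cap
  integrity: number; // declared share of signal kept
}

function capGrepOutput(matches: string[], limit: number = HEAD_LIMIT): GrepRewrite {
  const kept = matches.slice(0, limit);
  return {
    lines: kept,
    omitted: matches.length - kept.length,
    integrity: DECLARED_INTEGRITY,
  };
}
```

Under these assumptions, a lookup that returns 80 matches loses 30 of them with a limit of 50 and none with a limit of 100, which is the 50–100 range the description calls out.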

Test plan

  • Full suite green every commit (1,317 → 1,330 → 1,346 → 1,354 → 1,360 → 1,363)
  • TypeScript build clean (tsc -p tsconfig.json)
  • No regressions in any existing module
  • CHANGELOG entry covering all behaviors and the engineering trail
  • README test-count badge + llms.txt + llms-full.txt updated to 1,363 / v1.6.16
  • Anuj acceptance test (after merge): on a fresh Claude Code session with v1.6.16 installed via npm i -g sipcode && sipcode init, verify (a) deferred path triggers when called from inside an active session, (b) --force overrides, (c) proxy --stats shows native-grep integrity ≥ 75% on real workload, (d) drift no longer reports "Cache reuse down N points" after a clean install
  • sipcode benchmark produces 62.6% median (or within 1-2 points)

How to release after merge

git checkout main
git pull origin main
git tag v1.6.16
git push origin v1.6.16

CI publishes via OIDC. CDN propagates in ~30-60 seconds.

Commit trail (each step revertable in isolation)

| # | Commit | What |
|---|---------|------|
| 1 | 7efcf58 | feat(init): detectActiveClaudeSessions |
| 2 | 09330c7 | feat(init): pendingInstall marker module |
| 3 | 55ba152 | feat(init): F-CACHE-DEFER runSystemSetup integration |
| 4 | 0357af3 | feat(cli,init): maybeApplyPendingInstall + --force flag |
| 5 | 7af879e | feat(proxy): F-NATIVE-GREP cap 50 → 100 |
| 6 | 42e1d5c | chore(release): bump to 1.6.16 + CHANGELOG + artifacts |

Risk assessment

| Risk | Mitigation |
|------|------------|
| Active-session detection false negatives (real session not detected) | Defensive: detection failure degrades to "proceed with install" rather than blocking; --force always works |
| Active-session detection false positives (no session, but we defer) | Auto-apply on next sipcode command catches it within seconds; users can always pass --force |
| Pending marker survives across sipcode versions | Schema validation rejects unknown future versions rather than mis-applying (explicit guard) |
| nativeGrep cap raise reduces savings | Locked benchmark re-run as part of acceptance test: measurable, not a guess |
| Auto-apply log noise on every command | Only logs when state actually changed; no-op clears stale marker silently |

🤖 Generated with Claude Code

Anuj7411 and others added 6 commits June 21, 2026 20:57
…p 1)

Pure module with injected I/O seam (listDir + stat + now). Scans
~/.claude/projects/<proj>/sessions/<sid>.jsonl for files within the
threshold (default 5 min = Anthropic prompt-cache TTL).

Defensive: any subtree permission error degrades to "skip subtree", whole
walk never throws. 13 tests cover: empty layouts, single recent / stale
detection, multi-project counts, listDir failure mid-walk, stat race
returning null, non-.jsonl files ignored, custom thresholds.

Test count: 1,317 -> 1,330.
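
A rough sketch of the walk with an injected I/O seam follows. It assumes "within the threshold" means the file's modification time is no older than the threshold; the `DetectDeps` shape, `countActiveSessions`, and `safeList` are illustrative names rather than the module's real exports.

```typescript
// Hypothetical sketch of session detection with injected I/O (listDir + stat + now).
interface DetectDeps {
  listDir: (path: string) => string[];                // may throw on permission errors
  stat: (path: string) => { mtimeMs: number } | null; // null when the file vanished (stat race)
  now: () => number;                                  // current time in ms
}

const DEFAULT_THRESHOLD_MS = 5 * 60 * 1000; // 5 min, matching the prompt-cache TTL named above

// A failing subtree is skipped instead of aborting the whole walk.
function safeList(deps: DetectDeps, path: string): string[] {
  try {
    return deps.listDir(path);
  } catch {
    return [];
  }
}

function countActiveSessions(
  projectsRoot: string,
  deps: DetectDeps,
  thresholdMs: number = DEFAULT_THRESHOLD_MS,
): number {
  let active = 0;
  for (const proj of safeList(deps, projectsRoot)) {
    const sessionsDir = `${projectsRoot}/${proj}/sessions`;
    for (const name of safeList(deps, sessionsDir)) {
      if (!name.endsWith(".jsonl")) continue; // non-.jsonl files are ignored
      const info = deps.stat(`${sessionsDir}/${name}`);
      if (info === null) continue;            // file disappeared between list and stat
      if (deps.now() - info.mtimeMs <= thresholdMs) active += 1;
    }
  }
  return active;
}
```

Injecting `listDir`, `stat`, and `now` keeps the function pure, so tests can describe a directory layout as plain data instead of touching the real home directory.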

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Marker at ~/.sipcode/install-pending.json carries the intent
"install proxy hook at scriptPath" when the settings.json write is
deferred to protect Anthropic's prompt cache.

Schema sipcode-install-pending/1, strictly validated on read so unknown
future versions are rejected rather than mis-applied. applyPendingInstall
re-generates the script (latest version even if deferred days ago) and
applies installProxyHook against current settings.json so user-managed
hook entries survive. Idempotent: second apply returns no-marker.

16 tests cover: round-trip, overwrite, missing/corrupt/wrong-schema/
missing-fields rejection, no-op clear, no-marker no-op, full apply,
idempotency, no-change redundant apply, non-sipcode hook preservation,
corrupt settings.json fallback.

Test count: 1,330 -> 1,346.
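
The strict read-side validation might look something like the sketch below. The schema string and `scriptPath` come from the message above; the JSON key name `schema`, the `PendingMarker` type, and `parsePendingMarker` are assumptions for illustration, and the real marker may carry more fields.

```typescript
// Hypothetical sketch of strict marker validation. Key names are assumed.
const PENDING_SCHEMA = "sipcode-install-pending/1";

interface PendingMarker {
  schema: typeof PENDING_SCHEMA;
  scriptPath: string;
}

function parsePendingMarker(raw: string): PendingMarker | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // corrupt JSON is rejected
  }
  if (typeof data !== "object" || data === null) return null;

  const rec = data as Record<string, unknown>;
  // Unknown or future schema versions are rejected rather than mis-applied.
  if (rec["schema"] !== PENDING_SCHEMA) return null;

  const scriptPath = rec["scriptPath"];
  if (typeof scriptPath !== "string" || scriptPath.length === 0) return null;

  return { schema: PENDING_SCHEMA, scriptPath };
}
```

Returning `null` for every invalid shape lets the caller treat a bad marker the same way as a missing one.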

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Step 3 of runSystemSetup now gates the settings.json write on active-session
detection. When an active Claude Code session is detected, the script file
is still written (safe, doesn't invalidate the prompt cache) and a pending
marker is dropped so the next quiet sipcode invocation applies the install.
The --force flag bypasses the check.

StepStatus gains a "deferred" variant. formatSetupCard renders it with ⏸
glyph and a specific footer ("auto-applies on your next sipcode command
outside an active session, or pass --force"). Install marker is also
deferred when the proxy is deferred so `sipcode impact` baselines do not
get skewed by the deferral window.

Defensive: detection or marker-write failures degrade gracefully. Detection
throwing falls through to the normal install path. Marker-write throwing
surfaces as proxyHook=failed without blocking later steps.

8 new tests cover: defer when active, install when quiet, --force override,
install-marker defer cascade, detection-throws fallback, marker-write
failure, singular/plural reason, deferred SETUP card rendering.

Test count: 1,346 -> 1,354.
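
As an illustration of how a "deferred" variant fits into a discriminated union, here is a small rendering sketch. Only the `deferred` shape and the ⏸ glyph are taken from the PR; the `ok` variant, the `error` field on `failed`, the other glyphs, and `renderStep` are assumptions, and the real formatSetupCard output is not reproduced here.

```typescript
// Hypothetical sketch of a StepStatus union with a "deferred" variant.
type StepStatus =
  | { kind: "ok" }                         // assumed variant
  | { kind: "failed"; error: string }      // "failed" is named in the PR; the field is assumed
  | { kind: "deferred"; reason: string };  // shape as given in the PR

function renderStep(label: string, status: StepStatus): string {
  switch (status.kind) {
    case "ok":
      return `✓ ${label}`;
    case "failed":
      return `✗ ${label}: ${status.error}`;
    case "deferred":
      return `⏸ ${label}: ${status.reason}`;
  }
}
```

Because the switch is exhaustive over `kind`, adding another variant later produces a compile error in `renderStep` until it is handled.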

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…eps 4+5)

maybeApplyPendingInstall is the CLI startup wrapper for F-CACHE-DEFER:
fast no-op when no marker exists; skip-active-session when a session is
detected; skip-detection-error when the scan throws; apply otherwise.
Logs only when the apply actually changed something on disk, so a
stale marker after an `init --force` run is cleared silently.

Wired into cli.ts via Commander preAction hook. Runs before every command
except `init` (which manages its own state). Hook failures are swallowed
so the user's real command never gets blocked by an ergonomic wrapper.

Added --force option to the `init` command. When set, runSystemSetup
bypasses active-session detection and installs settings.json directly.

6 new tests cover: no-op, skip-active-session, apply, defensive detection
failure, no-log-when-no-marker, stale-marker-cleanup-without-log.

Test count: 1,354 -> 1,360.
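
The four outcomes listed above can be sketched as a pure function over injected dependencies. This is an assumption-laden illustration: the `ApplyDeps` shape, the function name, the log text, and the "applied" label are hypothetical, and the Commander preAction wiring is left out so the example stays self-contained.

```typescript
// Hypothetical sketch of the startup wrapper's decision flow.
type ApplyOutcome =
  | "no-marker"
  | "skip-active-session"
  | "skip-detection-error"
  | "applied";

interface ApplyDeps {
  markerExists: () => boolean;
  countActiveSessions: () => number; // may throw
  apply: () => { changed: boolean }; // applies the install and clears the marker
  log: (message: string) => void;
}

function maybeApplyPendingInstallSketch(deps: ApplyDeps): ApplyOutcome {
  // Fast no-op when nothing is pending.
  if (!deps.markerExists()) return "no-marker";

  let active: number;
  try {
    active = deps.countActiveSessions();
  } catch {
    return "skip-detection-error";
  }
  if (active > 0) return "skip-active-session";

  const { changed } = deps.apply();
  // Log only when something on disk actually changed, so a stale marker
  // left behind after `init --force` is cleared silently.
  if (changed) deps.log("applied deferred proxy install");
  return "applied";
}
```

A caller in the CLI entry point would wrap this in its own try/catch so a failure here never blocks the user's real command.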

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

native-grep was the highest-volume, lowest-integrity rewriter in the
2026-06-17 dogfood data: 30% of all proxy work, 65% signal kept. The
HEAD_LIMIT=50 cap was too aggressive for typical Claude Code grep
workloads where symbol lookups in larger codebases routinely return
50-100 matches Claude needs for follow-up reads.

Raised cap to 100, integrity declaration 0.65 -> 0.78. 3 new tests
verify new cap, integrity score, and field-preservation. Acceptance
criterion (manual): real-world dogfood should show native-grep
integrity >= 75% in `sipcode proxy --stats`.

Test count: 1,360 -> 1,363.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…llms.txt

Locks v1.6.16 release artifacts:
- package.json version 1.6.15 -> 1.6.16
- CHANGELOG: [1.6.16] entry with F-CACHE-DEFER + F-NATIVE-GREP narrative
- README test count badge 1,317 -> 1,363
- llms.txt + llms-full.txt: current version + test count refreshed

Full suite green (1,363 / 1,363). Build clean. Ready for self-review PR
into main, then `git tag v1.6.16 && git push --tags` to trigger npm publish.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@Anuj7411 Anuj7411 merged commit d1972cf into main Jun 22, 2026
1 check passed