JordanCoin commented Jan 29, 2026

Summary

  • Session-start hook now uses adaptive depth based on repo size (depth 2-4)
  • Both hook and MCP get_structure enforce 60KB max output (~15k tokens, <10% of context)
  • Truncates cleanly at line boundaries and appends a message noting that output was cut

Problem

Large repos (10k+ files, such as Rails monoliths) were emitting 1.3MB+ of tree structure on session start. At 500k+ tokens, that is more than 250% of Claude Code's ~200k-token context window, so the context overflowed before any conversation had started.

Root Cause: Hook Output Goes to "Messages"

The critical insight is that hook output gets injected into the "Messages" portion of Claude's context, not into the system prompt or tools. This means:

  1. Hook output competes directly with conversation history
  2. /clear doesn't help because hooks re-run on session start
  3. Even with no conversation, "Messages" can show 500k+ tokens
  4. Users see "Context limit reached" immediately on fresh sessions

Why This Matters for Hook Architecture

Hooks need to be context-aware. A hook that's helpful for a 500-file project becomes destructive for a 10k-file project. Until this PR, the hooks assumed unbounded output was fine; it is not.

Principles for future hook design:

  • Hooks should NEVER output more than ~10% of context window (~20k tokens / ~80KB)
  • Output should scale inversely with repo size (bigger repo = less detail)
  • Consider making hooks return structured data instead of free-form text
  • Add a global hook output budget that hooks share
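The ~10% guideline above comes down to simple arithmetic, sketched here. The ~200k-token window is the figure from this PR's commit message; the 4 bytes per token ratio is only the rough heuristic implied by "~20k tokens / ~80KB", not a measured value. The function name and constants are hypothetical and do not exist in codemap.

```go
package main

import "fmt"

// Assumed figures taken from this PR's description, not from any Claude Code API.
const (
	contextTokens = 200_000 // approximate context window size in tokens
	budgetPercent = 10      // share of the window a hook may use
	bytesPerToken = 4       // rough heuristic; real ratios vary with content
)

// hookBudgetBytes converts a percentage of the context window into a byte cap.
// Integer math is used so the result is exact for the inputs above.
func hookBudgetBytes(tokens, percent, perToken int) int {
	return tokens * percent / 100 * perToken
}

func main() {
	fmt.Println(hookBudgetBytes(contextTokens, budgetPercent, bytesPerToken)) // 80000
}
```

The 60KB cap chosen in this PR sits below that ceiling, which leaves some headroom for content that tokenizes less efficiently than the heuristic assumes.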

The Quick Fix (This PR)

  • Adaptive depth: >5000 files → depth 2, >2000 files → depth 3, else depth 4
  • Hard cap: 60KB max output with clean truncation
  • Applies to both CLI hooks and MCP get_structure tool
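For readers who want the shape of the fix without opening the diff, here is a minimal sketch of the two mechanisms listed above. It is not the code in cmd/hooks.go: the function names, the notice text, and the reading of "60KB" as 60 × 1024 bytes are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// Assumption: "60KB" is read as 60*1024 bytes; the real constant may differ.
const maxOutputBytes = 60 * 1024

// Hypothetical notice text; the PR only says a helpful message is appended.
const truncationNotice = "\n... (output truncated to fit the context budget)\n"

// depthForFileCount maps repo size to tree depth using the PR's thresholds.
func depthForFileCount(files int) int {
	switch {
	case files > 5000:
		return 2
	case files > 2000:
		return 3
	default:
		return 4
	}
}

// truncateAtLine returns s unchanged if it fits within limit bytes. Otherwise
// it cuts at the last newline inside the limit, so no line is split, and
// appends the notice. The notice itself is not counted against the limit.
func truncateAtLine(s string, limit int) string {
	if len(s) <= limit {
		return s
	}
	cut := strings.LastIndexByte(s[:limit], '\n')
	if cut < 0 {
		cut = 0 // no newline within the limit: drop the partial line entirely
	}
	return s[:cut] + truncationNotice
}

func main() {
	fmt.Println(depthForFileCount(10144)) // repo size from the test results: depth 2

	tree := strings.Repeat("src/file.go\n", 10000) // stand-in for a large tree
	out := truncateAtLine(tree, maxOutputBytes)
	fmt.Println(len(tree), len(out))
}
```

Cutting at a newline also avoids splitting a multi-byte UTF-8 character, since the cut always lands on an ASCII byte.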

Future Refactoring Ideas

  1. Hook output budget: Global limit shared across all hooks
  2. Structured hook responses: Return JSON that Claude Code can format/truncate
  3. Lazy loading: Show summary first, let user request details
  4. Project-specific config: Allow .codemap/config.json to tune hook behavior
  5. Hook priority system: Critical info first, nice-to-have truncated
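One way ideas 1 and 5 could fit together is sketched below, purely as an illustration of the concept. Nothing here exists in codemap or Claude Code: the `hookOutput` type, the `allocate` function, and the choice to drop (rather than truncate) outputs that do not fit are all hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// hookOutput is a hypothetical record of what one hook wants to emit.
type hookOutput struct {
	Name     string
	Priority int // lower value = more important
	Text     string
}

// allocate walks the outputs in priority order and keeps each one that still
// fits in the remaining shared budget (in bytes). Outputs that do not fit are
// skipped whole; a real design might truncate them instead.
func allocate(outputs []hookOutput, budget int) []hookOutput {
	sorted := append([]hookOutput(nil), outputs...) // leave the caller's slice untouched
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].Priority < sorted[j].Priority
	})
	var kept []hookOutput
	for _, o := range sorted {
		if len(o.Text) > budget {
			continue
		}
		budget -= len(o.Text)
		kept = append(kept, o)
	}
	return kept
}

func main() {
	outputs := []hookOutput{
		{Name: "recent-changes", Priority: 2, Text: "12345"},
		{Name: "structure", Priority: 1, Text: "12345678"},
		{Name: "tips", Priority: 3, Text: "1234"},
	}
	for _, o := range allocate(outputs, 12) {
		fmt.Println(o.Name)
	}
}
```

With a 12-byte budget, the highest-priority output takes 8 bytes, the 5-byte one no longer fits, and the 4-byte one uses what is left.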

Test Results

  • Tested on gumroad repo (10,144 files, 303MB):
    • Before: 1,375,873 bytes (1.3MB) → 500k+ tokens
    • After: 60,973 bytes (61KB) → ~15k tokens
  • Both codemap and codemap-mcp rebuild successfully
  • Claude Code sessions now start with normal context usage

Files Changed

  • cmd/hooks.go: Adaptive depth + size limit for session-start hook
  • mcp/main.go: Size limit for get_structure MCP tool

🤖 Generated with Claude Code

JordanCoin and others added 2 commits January 29, 2026 01:49
- Session-start hook now uses adaptive depth based on repo size:
  - >5000 files: depth 2
  - >2000 files: depth 3
  - Otherwise: depth 4
- Both hook and MCP get_structure enforce 60KB max output (~15k tokens)
- Truncates cleanly at line boundaries with helpful message
- Prevents consuming >10% of LLM context window

Fixes issue where 10k+ file repos (like Rails monoliths) would output
1.3MB+ of tree structure, overwhelming Claude Code's context.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Hook output goes directly into Claude's "Messages" context, not system
prompt. This means hook output competes with conversation history for
the ~200k token limit. A 1.3MB output (like a full tree of a 10k file
repo) equals ~500k tokens, causing instant context overflow.

The size limits (adaptive depth + 60KB cap) are critical safeguards.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>