⚡ Claude Token Optimization 2026-05-28 - documentation-maintainer #3948

@github-actions

Target Workflow: documentation-maintainer

Source report: #3946
Estimated cost per run: $0.39
Total tokens per run: ~2,410K
Effective tokens per run: ~9,726K (~4× multiplier from cache writes)
Cache read rate: N/A (no token_usage_summary available)
Cache write rate: N/A (no token_usage_summary available; ~4× effective multiplier strongly suggests cache write dominance)
LLM turns: 31
Duration: 5.0 min
Model: claude-haiku-4-5

Current Configuration

| Setting | Value |
|---|---|
| Tools loaded | 3: bash, edit, github: false |
| Tools actually used | bash (14 calls), Edit (2 calls), Read (5 calls), Write (5 calls) |
| Network groups | AWF sandbox (id: awf) |
| Pre-agent steps | ✅ Yes: 5 steps pre-computing git diffs, doc lists, affected docs |
| Prompt size | ~3,081 chars (~770 tokens) |
| max-turns set | ❌ No |

Key Finding

31 turns is the primary cost driver. With no turn cap, the growing context window compounds quadratically: each turn re-sends the accumulated conversation history. The ~4× effective-token multiplier (2.4M raw → 9.7M effective) is consistent with Haiku's context being re-encoded as cache writes on every new turn, with few cache reads (the prompt body changes each run based on git diff output injected via ${{ steps.git-changes.outputs.RECENT_DIFFS }}).

Additionally, despite having comprehensive pre-computed context in /tmp/gh-aw/doc-maintainer-context/, the agent still runs exploratory git commands (git show <sha>, git log) — adding unnecessary turns.
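
The compounding described above can be made concrete with a small model. This is a sketch under stated assumptions, not a measurement: `BASE_TOKENS` and `GROWTH_PER_TURN` are hypothetical values, and a real run also includes output tokens and caching effects that this model ignores.

```python
def cumulative_input_tokens(turns: int, base_tokens: int, growth_per_turn: int) -> int:
    """Total input tokens over a run when every turn re-sends the full history.

    Turn k (1-indexed) sends base_tokens + (k - 1) * growth_per_turn tokens,
    so the total is an arithmetic series that grows quadratically in `turns`.
    """
    return turns * base_tokens + growth_per_turn * turns * (turns - 1) // 2


# Hypothetical values chosen only for illustration, not taken from the run data.
BASE_TOKENS = 20_000
GROWTH_PER_TURN = 4_000

total_31 = cumulative_input_tokens(31, BASE_TOKENS, GROWTH_PER_TURN)
total_15 = cumulative_input_tokens(15, BASE_TOKENS, GROWTH_PER_TURN)
reduction = 1 - total_15 / total_31

print(f"31 turns: {total_31:,} tokens")
print(f"15 turns: {total_15:,} tokens")
print(f"reduction: {reduction:.0%}")
```

In this model the saving from cutting 31 turns to 15 lies between about 52% (when the fixed base context dominates) and about 77% (when per-turn growth dominates), so the exact figure depends on how large the initial context is relative to what each turn adds.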

Recommendations

1. Add max-turns limit

Estimated savings: ~50–60% tokens/run (~1,200–1,450K tokens, ~$0.20–$0.24/run)

A well-scoped documentation sync should complete in 10–15 turns. Without a cap, the agent over-explores and runs 31 turns. Add to the frontmatter:

engine:
  id: claude
  model: claude-haiku-4-5
  max-turns: 15

This is the single highest-impact change. Capping at 15 turns roughly halves the turn count; because each turn re-sends the accumulated history, both raw and effective tokens should fall by ~50% or more.

2. Harden prompt against redundant git exploration

Estimated savings: ~5–8 turns eliminated (~300–500K tokens, ~$0.05–$0.08/run)

The tool usage shows bash_git show <sha> and bash_git log being called despite pre-computed recent-diffs.txt. Strengthen the constraint in the prompt:

Change in ### 1. Analyze Pre-computed Changes:

Use `/tmp/gh-aw/doc-maintainer-context/recent-diffs.txt` as your **sole source** for recent
source changes. **Do not run any `git` commands** — all required git data is already
pre-computed. Running `git show`, `git log`, or `git diff` wastes turns.

Also add an explicit early-exit instruction before all other steps:

### 0. Check For Changes First (Do This Before Anything Else)

Read `/tmp/gh-aw/doc-maintainer-context/has-changes.txt`.
- If `false`: call `safeoutputs noop` immediately and stop. Do not read any other files.
- If `true`: proceed to Step 1.

3. Cap pre-agent diff context

Estimated savings: ~5–10% token reduction in turn 1 (~120–240K tokens, ~$0.02–$0.04/run)

recent-diffs.txt is currently truncated with head -500, which can produce 25–50KB of diff context in the first turn. Reduce the cap to 200 lines and tighten the diff context:

- name: Gather recent git diffs
  run: |
    ...
    git log --since="7 days ago" --format="=== Commit %H: %s ===" \
      --patch --stat --unified=2 -- src/ containers/ scripts/ docs/ '*.md' \
      | head -200 > "$CONTEXT_DIR/recent-diffs.txt"

--unified=2 (2 context lines instead of 3) further reduces diff verbosity by ~15%.
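
To gauge what the cap buys, the diff size can be converted to an approximate token count. The sketch below assumes roughly 4 characters per token, the ratio implied by the prompt-size figure in Current Configuration (~3,081 chars, ~770 tokens). Real tokenization of diffs may differ, the linear scaling from 500 to 200 lines is an assumption, and `estimate_tokens` is a hypothetical helper.

```python
CHARS_PER_TOKEN = 4  # assumption: ratio implied by ~3,081 chars ~= ~770 tokens


def estimate_tokens(size_bytes: int) -> int:
    """Rough token estimate for ASCII-heavy text of the given size."""
    return size_bytes // CHARS_PER_TOKEN


# 25-50KB is the diff size range quoted for the head -500 cap.
low = estimate_tokens(25_000)
high = estimate_tokens(50_000)
print(f"head -500 diff: ~{low:,}-{high:,} tokens")

# Assumption: size scales linearly with the line cap (line lengths vary in practice).
low_200 = low * 200 // 500
high_200 = high * 200 // 500
print(f"head -200 diff: ~{low_200:,}-{high_200:,} tokens")
```

Because the diff sits in the conversation history, any reduction here is carried forward into every later turn that re-sends that history.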

4. Limit affected-docs to top 10 files

Estimated savings: ~3–5% per turn on doc-reading turns (~75–120K tokens, ~$0.01–$0.02/run)

Both head -30 limits in the identify affected docs step allow up to 30 files — reading them all generates many turns. Cap at 10:

grep -i -F -f "$TOKENS" "$DOC_POOL" | head -10 > "$AFFECTED"
# fallback:
head -10 "$DOC_POOL" > "$AFFECTED"

And reinforce in the prompt:

### 2. Identify Documentation Gaps
Review **only** the files listed in `affected-docs.txt` (max 10 files).
Do not proactively read additional files not in this list.
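
The selection logic in the shell snippet above can be sketched in Python for clarity. This is an illustration only: it assumes `$DOC_POOL` holds one doc path per line, that the fallback applies when nothing matches (the source snippet does not show the condition), and `select_affected_docs` is a hypothetical name.

```python
def select_affected_docs(tokens: list[str], doc_pool: list[str], limit: int = 10) -> list[str]:
    """Return up to `limit` pool entries containing any token, case-insensitively.

    Approximates `grep -i -F -f "$TOKENS" "$DOC_POOL" | head -10`:
    fixed-string matching, case-insensitive, pool order preserved.
    """
    # Empty tokens are skipped here; with grep -f an empty pattern line
    # would match every line, which is probably not what is wanted.
    needles = [t.lower() for t in tokens if t]
    matches = [doc for doc in doc_pool if any(n in doc.lower() for n in needles)]
    # Assumption: fall back to the head of the pool when nothing matches.
    return (matches or doc_pool)[:limit]


pool = ["docs/setup.md", "docs/Network.md", "docs/cli.md"]
print(select_affected_docs(["network"], pool))
print(select_affected_docs(["no-such-token"], pool, limit=2))
```

The first call returns only the matching path; the second finds no match and falls back to the first two pool entries.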

Cache Analysis (Anthropic-Specific)

Per-turn cache breakdown unavailable (no token_usage_summary in run data). The ~4× effective/raw multiplier is the key signal:

| Metric | Value | Interpretation |
|---|---|---|
| Raw tokens | 2,410,409 | Actual input + output |
| Effective tokens | 9,726,361 | Billed equivalent |
| Effective multiplier | ~4.04× | ~3× overhead from cache writes each turn |
| Turns | 31 | Each turn grows context window |

Cache write amortization: Within a single run, Turn 1's cache write is reused across turns 2–31 (good). However, because RECENT_DIFFS and AFFECTED_DOCS are injected as step outputs (varying each run), the system prompt changes between runs — so cross-run cache hit rate is effectively 0%. All cache writes amortize only within the single run.

Cache cost vs. benefit: The dominant cost is the number of turns (31), not the per-turn cache write rate. Reducing turns has far greater impact than cache strategy tuning.
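
The multiplier quoted in this section follows directly from the two token counts. The arithmetic below uses only figures from the report; the per-turn average is a derived illustration and says nothing about how tokens were actually distributed across turns.

```python
raw_tokens = 2_410_409
effective_tokens = 9_726_361
turns = 31

multiplier = effective_tokens / raw_tokens
overhead = multiplier - 1  # portion of the effective count above raw tokens
avg_raw_per_turn = raw_tokens / turns

print(f"effective multiplier: {multiplier:.2f}x")
print(f"overhead above raw: {overhead:.2f}x")
print(f"average raw tokens per turn: {avg_raw_per_turn:,.0f}")
```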

Expected Impact

| Metric | Current | Projected | Savings |
|---|---|---|---|
| Total tokens/run | ~2,410K | ~900–1,100K | ~55–63% |
| Effective tokens/run | ~9,726K | ~3,000–4,000K | ~59–69% |
| Cost/run | $0.39 | ~$0.13–$0.18 | ~$0.21–$0.26 |
| LLM turns | 31 | ≤15 | −16 turns |
| Session time | 5.0 min | ~2.5–3.0 min | ~40–50% |
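
The savings column can be re-derived from the current and projected values. The sketch below is a cross-check using the rounded figures from the table; `savings_range` is a hypothetical helper, and small differences from the table (for example the low end of total tokens) come from rounding.

```python
def savings_range(current: float, projected_low: float, projected_high: float) -> tuple[float, float]:
    """Return (min, max) fractional savings for a projected value range."""
    return (1 - projected_high / current, 1 - projected_low / current)


# (current, projected low, projected high), taken from the table above.
rows = {
    "total tokens (K)": (2410, 900, 1100),
    "effective tokens (K)": (9726, 3000, 4000),
    "cost (USD)": (0.39, 0.13, 0.18),
    "session time (min)": (5.0, 2.5, 3.0),
}

for name, (current, low, high) in rows.items():
    s_min, s_max = savings_range(current, low, high)
    print(f"{name}: {s_min:.0%}-{s_max:.0%}")
```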

Implementation Checklist

  • Add max-turns: 15 to engine: block in .github/workflows/documentation-maintainer.md
  • Replace "Do not run git show <sha> per commit unless absolutely necessary" with hard prohibition: "Do not run any git commands"
  • Add Step 0 early-exit check before all other steps
  • Change head -500 → head -200 and --unified=3 → --unified=2 in git-diff pre-agent step
  • Change head -30 → head -10 in affected-docs step (both grep path and fallback)
  • Recompile: gh aw compile .github/workflows/documentation-maintainer.md
  • Post-process: npx tsx scripts/ci/postprocess-smoke-workflows.ts
  • Verify CI passes on PR
  • Compare token usage on next scheduled run vs $0.39 baseline (target: <$0.18)

Generated by Daily Claude Token Optimization Advisor · sonnet46 1.2M
