Target Workflow: documentation-maintainer
Source report: #3946
Estimated cost per run: $0.39
Total tokens per run: ~2,410K
Effective tokens per run: ~9,726K (~4× multiplier from cache writes)
Cache read rate: N/A (no token_usage_summary available)
Cache write rate: N/A (no token_usage_summary available; ~4× effective multiplier strongly suggests cache write dominance)
LLM turns: 31
Duration: 5.0 min
Model: claude-haiku-4-5
Current Configuration
| Setting | Value |
| --- | --- |
| Tools loaded | 3: bash, edit, github: false |
| Tools actually used | bash (14 calls), Edit (2 calls), Read (5 calls), Write (5 calls) |
| Network groups | AWF sandbox (id: awf) |
| Pre-agent steps | ✅ Yes: 5 steps pre-computing git diffs, doc lists, affected docs |
| Prompt size | ~3,081 chars (~770 tokens) |
| max-turns set | ❌ No |
Key Finding
31 turns is the primary cost driver. With no turn cap, the growing context window compounds quadratically: each turn re-sends the accumulated conversation history. The ~4× effective-token multiplier (2.4M raw → 9.7M effective) is consistent with Haiku's context being re-encoded as cache writes on every new turn, with few cache reads (the prompt body changes each run based on git diff output injected via ${{ steps.git-changes.outputs.RECENT_DIFFS }}).
Additionally, despite having comprehensive pre-computed context in /tmp/gh-aw/doc-maintainer-context/, the agent still runs exploratory git commands (git show <sha>, git log) — adding unnecessary turns.
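The growth described above can be sketched with a simple model. This is a minimal illustration, assuming every turn re-sends a fixed base context plus a constant amount of new history for each earlier turn. The function name and the sample inputs (60,000 base tokens, 1,200 tokens added per turn) are hypothetical, chosen only for illustration, and are not measurements from the run.

```python
def cumulative_input_tokens(turns: int, base: int, growth: int) -> int:
    """Total input tokens sent over a run under a simple linear-growth model.

    Turn k (1-indexed) is assumed to send `base + (k - 1) * growth` tokens,
    so the total is turns * base + growth * turns * (turns - 1) / 2.
    """
    if turns < 0:
        raise ValueError("turns must be non-negative")
    # turns * (turns - 1) is always even, so the integer division is exact.
    return turns * base + growth * turns * (turns - 1) // 2


if __name__ == "__main__":
    # Hypothetical inputs, not measured values.
    base, growth = 60_000, 1_200
    for turns in (15, 31):
        print(turns, cumulative_input_tokens(turns, base, growth))
```

The second term grows with the square of the turn count, which is why the later turns of a long run account for a disproportionate share of the total.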
Recommendations
1. Add max-turns limit
Estimated savings: ~50–60% tokens/run (~1,200–1,450K tokens, ~$0.20–$0.24/run)
A well-scoped documentation sync should complete in 10–15 turns. Without a cap, the agent over-explores and runs 31 turns. Add to the frontmatter:
    engine:
      id: claude
      model: claude-haiku-4-5
      max-turns: 15
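This is the single highest-impact change. Cutting the run from 31 turns to 15 roughly halves the number of times the conversation history is re-sent, and the turns it removes are the latest ones, which carry the largest contexts. Both raw and effective tokens should therefore fall by ~50% or more.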
2. Harden prompt against redundant git exploration
Estimated savings: ~5–8 turns eliminated (~300–500K tokens, ~$0.05–$0.08/run)
The tool usage shows bash_git show <sha> and bash_git log being called despite pre-computed recent-diffs.txt. Strengthen the constraint in the prompt:
Change in ### 1. Analyze Pre-computed Changes:
Use `/tmp/gh-aw/doc-maintainer-context/recent-diffs.txt` as your **sole source** for recent
source changes. **Do not run any `git` commands** — all required git data is already
pre-computed. Running `git show`, `git log`, or `git diff` wastes turns.
Also add an explicit early-exit instruction before all other steps:
### 0. Check For Changes First (Do This Before Anything Else)
Read `/tmp/gh-aw/doc-maintainer-context/has-changes.txt`.
- If `false`: call `safeoutputs noop` immediately and stop. Do not read any other files.
- If `true`: proceed to Step 1.
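The decision that Step 0 asks the agent to make is a single-file check. A minimal sketch of that check follows, assuming `has-changes.txt` holds the single word `true` or `false` (the report does not show the file's exact format). `has_changes` is a hypothetical helper, not part of the workflow.

```python
from pathlib import Path


def has_changes(context_dir: str) -> bool:
    """Return True only when has-changes.txt contains the literal 'true'.

    A missing file is treated as 'no changes' here; that fallback is an
    assumption of this sketch, not behavior confirmed by the workflow.
    """
    flag_file = Path(context_dir) / "has-changes.txt"
    if not flag_file.is_file():
        return False
    return flag_file.read_text(encoding="utf-8").strip().lower() == "true"
```

Called as `has_changes("/tmp/gh-aw/doc-maintainer-context")`, a `False` result corresponds to the noop branch and `True` to proceeding with Step 1.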
3. Cap pre-agent diff context
Estimated savings: ~5–10% of total tokens/run (~120–240K tokens, ~$0.02–$0.04/run), because diff context loaded in turn 1 is re-sent on every later turn
The recent-diffs.txt is generated with head -500 lines, which can produce 25–50KB of diff context in the first turn. Reduce to 200 lines and tighten context:
    - name: Gather recent git diffs
      run: |
        ...
        git log --since="7 days ago" --format="=== Commit %H: %s ===" \
          --patch --stat --unified=2 -- src/ containers/ scripts/ docs/ '*.md' \
          | head -200 > "$CONTEXT_DIR/recent-diffs.txt"
--unified=2 (2 context lines instead of 3) further reduces diff verbosity by ~15%.
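To see why a block of first-turn context matters across the whole run, a rough estimate helps. The sketch below assumes about 4 characters per token (the ratio implied by the report's prompt-size figure of ~3,081 chars for ~770 tokens), assumes the diff is re-sent unchanged on every turn, and ignores cache discounts. `carried_tokens` is a hypothetical helper.

```python
def carried_tokens(context_chars: int, turns: int, chars_per_token: float = 4.0) -> int:
    """Estimate tokens spent re-sending a fixed block of context on every turn."""
    if chars_per_token <= 0:
        raise ValueError("chars_per_token must be positive")
    tokens_per_turn = context_chars / chars_per_token
    return round(tokens_per_turn * turns)


if __name__ == "__main__":
    # 25-50 KB of diff text carried across the 31 turns observed in this run.
    for size in (25_000, 50_000):
        print(size, carried_tokens(size, turns=31))
```

Under these assumptions, a 25–50KB diff accounts for a few hundred thousand tokens over 31 turns, so trimming it from 500 to 200 lines plausibly lands in the same range as the savings estimated for this recommendation.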
4. Limit affected-docs to top 10 files
Estimated savings: ~3–5% of total tokens/run from fewer doc-reading turns (~75–120K tokens, ~$0.01–$0.02/run)
Both head -30 limits in the identify affected docs step allow up to 30 files — reading them all generates many turns. Cap at 10:
grep -i -F -f "$TOKENS" "$DOC_POOL" | head -10 > "$AFFECTED"
# fallback:
head -10 "$DOC_POOL" > "$AFFECTED"
And reinforce in the prompt:
### 2. Identify Documentation Gaps
Review **only** the files listed in `affected-docs.txt` (max 10 files).
Do not proactively read additional files not in this list.
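The selection logic behind the cap can be sketched in a few lines. This is an illustrative Python equivalent of the `grep -i -F -f ... | head -10` pipeline, not the workflow's actual code. It assumes the fallback applies when no doc matches any token (the report does not show that condition), and `select_affected_docs` is a hypothetical name.

```python
from typing import Iterable, List


def select_affected_docs(
    tokens: Iterable[str], doc_pool: List[str], limit: int = 10
) -> List[str]:
    """Case-insensitive fixed-string match of tokens against doc paths, capped at `limit`."""
    # Empty tokens are skipped here; with grep -F -f, an empty pattern
    # would instead match every line.
    needles = [t.lower() for t in tokens if t]
    matches = [doc for doc in doc_pool if any(n in doc.lower() for n in needles)]
    # Assumed fallback condition: use the head of the pool when nothing matches.
    return (matches or doc_pool)[:limit]
```

Keeping the limit as a parameter makes it easy to compare how many files the agent would be asked to review at 10 versus 30.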
Cache Analysis (Anthropic-Specific)
Per-turn cache breakdown unavailable (no token_usage_summary in run data). The ~4× effective/raw multiplier is the key signal:
| Metric | Value | Interpretation |
| --- | --- | --- |
| Raw tokens | 2,410,409 | Actual input + output |
| Effective tokens | 9,726,361 | Billed equivalent |
| Effective multiplier | ~4.04× | ~3× overhead from cache writes each turn |
| Turns | 31 | Each turn grows context window |
Cache write amortization: Within a single run, Turn 1's cache write is reused across turns 2–31 (good). However, because RECENT_DIFFS and AFFECTED_DOCS are injected as step outputs (varying each run), the system prompt changes between runs — so cross-run cache hit rate is effectively 0%. All cache writes amortize only within the single run.
Cache cost vs. benefit: The dominant cost is the number of turns (31), not the per-turn cache write rate. Reducing turns has far greater impact than cache strategy tuning.
Expected Impact
| Metric | Current | Projected | Savings |
| --- | --- | --- | --- |
| Total tokens/run | ~2,410K | ~900–1,100K | ~55–63% |
| Effective tokens/run | ~9,726K | ~3,000–4,000K | ~59–69% |
| Cost/run | $0.39 | ~$0.13–$0.18 | ~$0.21–$0.26 |
| LLM turns | 31 | ≤15 | −16 turns |
| Session time | 5.0 min | ~2.5–3.0 min | ~40–50% |
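The percentage savings in the table can be recomputed from the current and projected figures. The helper below is a small sketch for doing so; `savings_pct` is a hypothetical name, and the inputs are the rounded figures from the table (tokens in thousands, cost in USD).

```python
def savings_pct(current: float, projected: float) -> float:
    """Percentage reduction from `current` to `projected`."""
    if current <= 0:
        raise ValueError("current must be positive")
    return (current - projected) / current * 100


if __name__ == "__main__":
    # (current, (best-case projection, worst-case projection))
    rows = {
        "total tokens": (2410, (900, 1100)),
        "effective tokens": (9726, (3000, 4000)),
        "cost": (0.39, (0.13, 0.18)),
    }
    for name, (current, (best, worst)) in rows.items():
        low, high = savings_pct(current, worst), savings_pct(current, best)
        print(f"{name}: {low:.0f}-{high:.0f}%")
```

Because the inputs are rounded, recomputed values may differ from the table by about a percentage point.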
Implementation Checklist
- Add `max-turns: 15` to the `engine:` block in `.github/workflows/documentation-maintainer.md`
- Replace the existing "... `git show <sha>` per commit unless absolutely necessary" wording with a hard prohibition: "Do not run any `git` commands"
- Change `head -500` → `head -200` and `--unified=3` → `--unified=2` in the git-diff pre-agent step
- Change `head -30` → `head -10` in the affected-docs step (both grep path and fallback)
- Run `gh aw compile .github/workflows/documentation-maintainer.md`
- Run `npx tsx scripts/ci/postprocess-smoke-workflows.ts`

Generated by Daily Claude Token Optimization Advisor · sonnet46 1.2M