fix: queue-based working memory capture (#179) by dean0x · Pull Request #180 · dean0x/devflow

dean0x · 2026-04-08T22:29:22Z

Summary

Replaces broken transcript extraction with queue-based turn capture from hook inputs
UserPromptSubmit (preamble) captures user prompts to .memory/.pending-turns.jsonl
Stop hook captures assistant_message (on end_turn only) to same queue, decouples throttle from capture
Background updater uses mv-based atomic handoff with crash recovery via .pending-turns.processing

What was wrong

The previous implementation extracted user/assistant messages from session transcript JSONL, but most entries are tool_result/tool_use with no text content. In a typical session (89 "user" messages, only 8 with actual text), the tail -3 windowing almost always missed real content.

Design

Hook inputs already provide both data points directly — no transcript needed:

UserPromptSubmit → prompt (user's text)
Stop → assistant_message (full response) + stop_reason

Queue accumulates turns; throttle only gates background processing (not capture). 10-turn batch cap, overflow safety, crash recovery.

Test plan

5 new queue behavior tests in shell-hooks.test.ts
618/618 tests passing
bash -n syntax validation on all 3 modified hooks
16 acceptance scenarios (stop_reason filtering, dual format, capture before throttle, DEVFLOW_BG_UPDATER guard, crash recovery, missing .memory/)
Evaluator: 5/5 plan requirements aligned

Closes #179

…extraction (#179) Transcript extraction was unreliable (most entries are tool_use with no text). Switch to hook-based capture using inputs available at hook execution time: - preamble (UserPromptSubmit) captures user prompt to .pending-turns.jsonl queue - stop-update-memory captures assistant_message (end_turn only) to same queue, then spawns throttled background updater (capture decoupled from throttle) - background-memory-update reads queue via atomic mv handoff, pairs user/assistant turns, caps at 10 most recent turns, with crash recovery via .processing file Key improvements: - No sleep 3 transcript flush wait - No transcript path computation - Queue overflow safety (>200 lines → keep last 100) - stop_reason filter (tool_use stops skipped) - Inline jq/node JSONL construction (avoids bash 4+ array syntax in json_construct) - 5 new behavior tests covering capture, skip, and JSONL format validation Co-Authored-By: Claude <noreply@anthropic.com>

- Cap .processing file at 200 lines after crash recovery merge to prevent unbounded growth when claude -p Write failures persist - Clean up .processing on claude exit 0 even if Write was not invoked (retrying same turns is futile; new turns arrive via queue) - Use json_field helper in turn parsing loop (Simplifier) - Deduplicate HOOKS_DIR in tests (Simplifier)

dean0x · 2026-04-08T22:35:03Z

🔍 Code Review: Consolidated High-Confidence Findings (≥80%)

Overview

9 blocking/should-fix issues identified across 4 files. Multiple reviewers flagged the same concerns, indicating consistent patterns. Below, I've deduped and consolidated findings by location.

🔴 BLOCKING Issues (Fix Before Merge)

1. scripts/hooks/preamble — Stale Header Comment (95% confidence)

Line 5

The file header still claims:

# Zero file I/O beyond stdin — static injection only.

But this PR adds file I/O at lines 23–39 (appends to .memory/.pending-turns.jsonl). Misleading documentation that will confuse future maintainers.

Fix:

# Injects a detection-only preamble. Classification rules only — skill mappings live in devflow:router.
# Also captures user prompts to .memory/.pending-turns.jsonl for working memory.

2. scripts/hooks/preamble & stop-update-memory — Raw JSON Construction Instead of Helper (85% confidence)

Preamble lines 31–36 and stop-update-memory lines 74–79

Both scripts construct JSON objects using raw jq -n -c --arg patterns. The codebase already has a json_construct helper in the json-parse library that wraps exactly this pattern (with jq/node dual fallback).

Problem: Duplicated abstraction layer. If JSON construction changes are needed, multiple locations must be updated in sync.

Fix:

# In preamble (lines 31-36):
json_construct --arg role "user" --arg content "$_TRUNCATED_PROMPT" --argjson ts "$_TS" >> "$CWD/.memory/.pending-turns.jsonl"

# In stop-update-memory (lines 74-79):
json_construct --arg role "assistant" --arg content "$ASSISTANT_MSG" --argjson ts "$TS" >> "$QUEUE_FILE"

3. scripts/hooks/stop-update-memory — Mtime Logic Duplicated (92% confidence)

Lines 96–100

The stop hook inlines the Linux/macOS stat mtime detection, while background-memory-update defines a reusable get_mtime() function for the same operation. This pattern also appears in session-start-memory.

Problem: Two competing approaches for the identical operation. If stat detection needs a fix (new platform), both locations must be updated in sync.

Fix: Extract get_mtime() into a shared sourced file, or reuse the existing function from background-memory-update.

4. scripts/hooks/stop-update-memory — Queue Overflow Logic Duplicated (85% confidence)

Lines 82–89 (also in background-memory-update lines 107–111)

The overflow safety pattern (check >200 lines, truncate to 100 via tail/mv) appears in both scripts with identical thresholds and idioms (wc -l | tr -d ' ', tail > .tmp && mv).

Problem: DRY violation. If thresholds need tuning, two locations must be updated.

Fix: Extract a shared cap_file_lines() function into a sourced helper.

5. scripts/hooks/background-memory-update — Per-Line Subprocess Spawning (95% confidence)

Lines 148–181

The turn-parsing while-read loop calls json_field twice per JSONL line (role and content). Each invocation spawns a jq or node subprocess. With the 20-line cap (10 turns × 2 lines), this creates 40 subprocess spawns in sequence.

Problem: This is the exact pattern documented in PF-006 (per-line jq spawning adds 1–3s latency). Reintroducing it adds measurable latency and extends lock hold time.

Fix: Use single-pass jq -s (slurp) to extract all roles/contents at once:

if [ "$_HAS_JQ" = "true" ]; then
  PARSED=$(echo "$ENTRIES" | jq -r '[.role, .content] | @tsv')
else
  PARSED=$(echo "$ENTRIES" | node -e "d='';process.stdin.on('data',c=>d+=c);process.stdin.on('end',()=>{d.trim().split('\n').filter(Boolean).forEach(l=>{try{const o=JSON.parse(l);console.log(o.role+'\t'+o.content)}catch(e){}})})")
fi

CURRENT_USER=""
while IFS=$'\t' read -r ROLE CONTENT; do
  # ... same pairing logic, but ROLE/CONTENT already extracted
done <<< "$PARSED"

6. tests/shell-hooks.test.ts — Shell Injection Risk in execSync (82% confidence)

Multiple test blocks: lines 772, 791, 816, 840

The test pattern echo '${input.replace(/'/g, "'\\''")}' | bash escapes single quotes but doesn't guard against shell metacharacters in dynamically constructed JSON (backticks, $(), newlines).

Problem: While tests have controlled inputs today, the pattern is fragile. Test data evolution could cause unexpected failures.

Fix: Use safer input option:

execSync(`bash "${STOP_HOOK}"`, {
  input: JSON.stringify({ cwd: tmpDir, stop_reason: 'end_turn', assistant_message: 'test' }),
  stdio: ['pipe', 'pipe', 'pipe'],
});

🟡 SHOULD-FIX Issues (Gaps in Coverage/Documentation)

7. tests/shell-hooks.test.ts — Missing Content Array Format Test (85% confidence)

The stop hook handles two assistant_message formats: plain string and content array ([{type: "text", text: "..."}]). Tests only cover the string format.

Fix: Add test for content array:

it('stop_reason end_turn with content array — extracts text blocks', () => {
  const input = JSON.stringify({
    cwd: tmpDir,
    session_id: 'test-session',
    stop_reason: 'end_turn',
    assistant_message: [
      { type: 'text', text: 'First' },
      { type: 'tool_result', content: 'ignored' },
      { type: 'text', text: 'Second' },
    ],
  });
  // ... run hook and verify both text blocks in queue
});

8. tests/shell-hooks.test.ts — Missing Queue Overflow Truncation Test (82% confidence)

The stop hook implements queue overflow safety (>200 lines → truncate to 100). This has zero test coverage. Bugs in overflow handling would go undetected.

Fix: Pre-populate queue with 201 lines, trigger stop hook, verify truncation to ≤100 lines.

9. CLAUDE.md — Hook Count and Description Outdated (82–85% confidence)

Line 41: Says "Three shell-script hooks" but now involves four:

preamble (user prompt capture)
stop-update-memory (assistant response capture)
background-memory-update (queue processing)
session-start-memory (memory injection)

Line 108: Says WORKING-MEMORY.md is "overwritten each session" — now queue-based, so it accumulates turns across multiple stops.

Fix:

Line 41: Update to "Four shell-script hooks" or rephrase to avoid counting
Line 108: Update to "Auto-maintained by background updater (queue-fed by preamble + stop hooks)"

📊 Summary

Category	Count	Severity
Blocking	6	HIGH
Should-Fix	3	MEDIUM
Dedup Rate	4/13	—

Key Pattern: DRY violations (json_construct, mtime, queue overflow logic) + PF-006 reintroduction (subprocess spawning) + incomplete test coverage. Multiple reviewers flagged the same issues, indicating consistent concerns.

Recommendation: Address the 6 blocking issues before merge. They involve straightforward refactoring (extract helpers, replace duplicates) and test additions. The PF-006 pattern is the most complex fix but unlocks the biggest latency win.

Review compiled from 9 reviewer reports. Inline comments can be created on specific files if needed.

- Fix stale header comment in preamble claiming zero file I/O (now accurately describes .pending-turns.jsonl capture behavior) - Guard $CWD with [ ! -d ] check in both hooks so a non-existent path from a deleted/unmounted project exits cleanly (preamble:17, stop:24) - Add -- separator before positional args in node fallback invocations to prevent content starting with -- being misinterpreted as node flags (preamble:36, stop:88) - Extract get_mtime() function in stop-update-memory to eliminate inlined stat -c/%m duplication; mirrors the identical function in background-memory-update Co-Authored-By: Claude <noreply@anthropic.com>

… spawning Two fixes in background-memory-update: 1. CWD existence guard (security/robustness): validate $CWD from args before constructing any paths, exiting early with a clear error message if the directory does not exist. 2. Single-pass JSONL extraction (performance, PF-006): replace the per-line json_field calls inside the while-read loop — which spawned 2 jq/node subprocesses per JSONL entry (up to 40 total) — with a single jq/node invocation that extracts role+content as TSV in one pass. The loop now reads pre-extracted TSV rows with IFS=$'\t', spawning zero subprocesses per iteration. Co-Authored-By: Claude <noreply@anthropic.com>

…e injection - Add test: stop hook handles assistant_message as content array, joining text blocks with newline and excluding tool_use blocks - Add test: queue overflow truncates to last 100 lines when >200 entries - Fix: replace echo-pipe shell injection pattern in all 4 working memory queue behavior tests with execSync({ input, stdio: ['pipe','pipe','pipe'] }) - docs: update CLAUDE.md hook count from Three to Four (stop, session-start, pre-compact, preamble) and fix WORKING-MEMORY.md description to reflect queue-based batch update model Co-Authored-By: Claude <noreply@anthropic.com>

…mble to pure ambient Creates scripts/hooks/prompt-capture-memory (UserPromptSubmit) that captures user prompts to .memory/.pending-turns.jsonl. Reverts preamble to zero-file-I/O ambient-only injection — prompt capture no longer mixed into classification path. Co-Authored-By: Claude <noreply@anthropic.com>

…nt, queue cleanup on disable MEMORY_HOOK_CONFIG gains UserPromptSubmit → prompt-capture-memory as first entry. hasMemoryHooks uses Object.keys(MEMORY_HOOK_CONFIG).length (4) instead of hardcoded 3. devflow memory --disable now removes queue files (.pending-turns.jsonl + .pending-turns.processing). devflow init with memory=false also cleans up queue files. Co-Authored-By: Claude <noreply@anthropic.com>

…ture-memory, preamble separation memory.test.ts: update all hook counts 3→4, add UserPromptSubmit assertions, add upgrade test (3→4 hooks), add removeMemoryHooks preamble preservation test, add toggle cycle test, add queue file cleanup tests (delete files, safe when missing, safe without .memory/). shell-hooks.test.ts: add prompt-capture-memory to HOOK_SCRIPTS syntax check, replace preamble capture tests with prompt-capture-memory equivalents, add preamble-no-longer-writes test and slash-command-no-queue-write test. Co-Authored-By: Claude <noreply@anthropic.com>

…, add threshold comments file-organization.md: four hooks (not three), add prompt-capture-memory to hook table and source tree, update flow description to show queue-based capture path, document disable cleanup. CLAUDE.md: update Working Memory paragraph — prompt-capture-memory (not preamble) captures prompts; preamble is ambient-only; note disable cleanup. stop-update-memory + background-memory-update: add cross-reference comments for 200/100 overflow threshold to keep them in sync. Co-Authored-By: Claude <noreply@anthropic.com>

dean0x

Review Findings: High-Confidence Issues (≥80%)

This batch of inline comments flags blocking and should-fix issues from the code review. Lower-confidence suggestions (60-79%) are included in the summary comment below.

Issues Found

Total inline comments: 6

Idempotency regression in addMemoryHooks (Consistency, 85%)
Path resolution assumes project root (TypeScript, 85%)
Double JSON parsing (TypeScript, 80%)
Per-prompt subprocess overhead in hook (Performance, 85%)
get_mtime duplication across hooks (Architecture, 85%)
Preamble hook subprocess optimization (Performance, 80%)

Lower-Confidence Suggestions (60-79%)

These findings are consolidated here as they're architectural rather than blocking:

Queue Cleanup Logic Duplication (Complexity/Architecture, 70%)

Locations: src/cli/commands/init.ts:1413-1415, src/cli/commands/memory.ts:213-215

The .pending-turns.jsonl and .pending-turns.processing cleanup is implemented identically in two places. Extract to a shared utility in src/cli/utils/post-install.ts:

export async function cleanupQueueFiles(memoryDir: string): Promise<{ queue: boolean; processing: boolean }> {
  const queue = await fs.unlink(path.join(memoryDir, '.pending-turns.jsonl')).then(() => true).catch(() => false);
  const processing = await fs.unlink(path.join(memoryDir, '.pending-turns.processing')).then(() => true).catch(() => false);
  return { queue, processing };
}

Overflow Threshold Magic Numbers (Complexity, 85%)

Locations: scripts/hooks/stop-update-memory:92-98, scripts/hooks/background-memory-update:115-119

Both scripts use hardcoded 200/100 thresholds (200 lines trigger cleanup, keep 100). Define once:

QUEUE_MAX_LINES=200
QUEUE_TRIM_LINES=100

The cross-referencing # NOTE: comments are good but constants would be more robust.

Queue File Concurrency Assumption (Architecture, 70%)

Locations: scripts/hooks/prompt-capture-memory:37, scripts/hooks/stop-update-memory:67

Two hooks (prompt-capture-memory and stop-update-memory) both append to .pending-turns.jsonl. The >> append is atomic on POSIX for lines under PIPE_BUF, but a design comment would clarify this concurrency assumption:

# Atomic append (POSIX: lines under PIPE_BUF are atomic)
echo "$TURN_JSON" >> "$CWD/.memory/.pending-turns.jsonl"

Hook Lifecycle Abstraction Opportunity (Architecture, 65%)

Pattern: addMemoryHooks/removeMemoryHooks in memory.ts vs learning hooks in learn.ts

Both follow an identical pattern: iterate config map, match by marker substring, add/remove/count. A generic HookManager class could eliminate structural duplication. Lower priority as the current duplication is minor.

Summary by Category

Reviewer	Finding	Confidence	Action
Consistency	`changed` variable removal in `addMemoryHooks`	85%	Inline comment
TypeScript	`process.cwd()` path resolution	85%	Inline comment
TypeScript	Double JSON.parse()	80%	Inline comment
Performance	Per-prompt subprocess overhead	85%	Inline comment
Architecture	`get_mtime` duplication	85%	Inline comment
Performance	Preamble hook subprocess calls	80%	Inline comment
Complexity	Queue cleanup duplication	70%	Summary
Complexity	Magic number thresholds	85%	Summary
Architecture	Queue concurrency assumption	70%	Summary
Architecture	Hook lifecycle abstraction	65%	Summary

Deduplication Note

The following issues were reported by multiple reviewers and consolidated:

get_mtime duplication: Architecture (85%), Complexity (90%), Consistency (70%)
Queue overflow thresholds: Complexity (85%), Consistency (65%)
Queue cleanup duplication: Architecture (85%), Complexity (70%)
process.cwd() path resolution: TypeScript (85%), Architecture (82%)
Per-prompt subprocess overhead: Performance (85%)
Idempotency regression: Consistency (85%), TypeScript (82%)

Each deduplicated finding appears once in this comment at highest confidence.

Next Steps

Review inline comments and address blocking issues
Address should-fix items (double-parse, preamble optimization)
Consolidate lower-confidence findings as appropriate

Recommendation: Approve with conditions (fix blocking items, address should-fix items before merge).

Created by Claude Code git agent for devflow PR review. Findings ≥80% confidence elevated to inline comments.

- Semantic idempotency checks for --enable/--disable (Issue 1) - Remove queue cleanup from --disable, add --clear command (Issue 2) - Single-pass field extraction in prompt-capture-memory and preamble (Issue 3) - Truncation and BG_UPDATER guard tests (Issues 5, 6) - Fix file-organization.md hook table entries (Issue 7) - Extract get-mtime shared helper (Issue 8) - countMemoryHooks/hasMemoryHooks accept parsed Settings (Issue 9) - Add background-memory-update to docs and CLAUDE.md (Issue 10)

…behavior tests - Deduplicate filterProjectsWithMemory to delegate to hasMemoryDir - Remove 11 tests that only exercised JS builtins, not devflow logic

- Extract cleanQueueFiles() pure function (SRP, testable) - Parallelize discoverProjectGitRoots() + getGitRoot() with Promise.all - Add process.stdin.isTTY guard before p.select() (non-interactive default: clean all) - Replace currentProject! non-null assertion with null-safe conditional - Check for .working-memory.lock before deleting queue files to prevent data loss during active background update Co-Authored-By: Claude <noreply@anthropic.com>

…form, add behavioral test Restores the original detection order from stop-update-memory/background-memory-update: stat --version is probed first (succeeds on GNU/Linux, fails on BSD/macOS) so the correct stat flag is chosen. Previous get-mtime tried BSD stat -f %m first, which can return wrong values on Linux GNU stat (filesystem info instead of file mtime). Caches the detected platform type in _GET_MTIME_STAT_TYPE to avoid repeated capability probes on subsequent calls. Adds header comment documenting the Unix epoch return value. Adds behavioral test in shell-hooks.test.ts that creates a temp file, sources get-mtime, calls get_mtime, and validates a positive integer epoch within a sane range. Co-Authored-By: Claude <noreply@anthropic.com>

@TSV

Addresses batch-3 review issues: - Add json_extract_cwd_prompt() helper to json-parse: extracts cwd + prompt in one subprocess using ASCII SOH (U+0001) as delimiter. SOH is safe for prompts with tabs/backslashes and works in bash 3.2 parameter expansion (unlike NUL byte). Fixes the @TSV pattern which corrupted literal tabs in prompts to the string "\t". - Deduplicate identical 8-line extraction blocks from preamble and prompt-capture-memory — both now call json_extract_cwd_prompt with ${FIELDS%%$'\001'*} / ${FIELDS#*$'\001'} split (zero extra subprocesses). - Batch cwd + stop_reason extraction in stop-update-memory into one subprocess, consistent with the pattern introduced for UserPromptSubmit hooks in this PR. - Replace echo "$INPUT" | json_field with printf '%s' "$INPUT" | json_field across all 6 hooks (stop-update-memory, session-start-memory, session-start-classification, pre-compact-memory, session-end-learning). echo with arbitrary JSON content is unsafe when the shell interprets escape sequences. Co-Authored-By: Claude <noreply@anthropic.com>

…ve error handling - Export hasMemoryDir and filterProjectsWithMemory for testability - Parallelize cleanQueueFiles: project iteration and per-project deletions both use Promise.all, reducing N*2 sequential ops to O(1) async batches - Widen removeMemoryHooks to accept string | Settings (consistent with hasMemoryHooks/countMemoryHooks API) - Differentiate ENOENT from unexpected errors (EACCES etc.) in hasMemoryDir with console.warn for non-ENOENT cases - Add --disable hint pointing to --clear for stale queue file cleanup - Add 22 tests for cleanQueueFiles, hasMemoryDir, filterProjectsWithMemory, and removeMemoryHooks(Settings) covering all edge cases Co-Authored-By: Claude <noreply@anthropic.com>

…tings, fix memory description - hasAmbientHook and hasLearningHook now accept string | Settings, matching the existing hasMemoryHooks/countMemoryHooks API for consistency - Update memory command .description() to mention --clear option Co-Authored-By: Claude <noreply@anthropic.com>

Replace loose < 3000 / < 5000 length guards with exact toBe(2015), matching the 2000-char limit + '... [truncated]' suffix (15 chars). Weak assertions would pass even with broken truncation logic. Add JSDoc to hasMemoryDir and filterProjectsWithMemory to match the documentation style of adjacent exported functions. Co-Authored-By: Claude <noreply@anthropic.com>

…earn --list - session-start-memory: replace inline stat platform detection with shared get-mtime helper - learn.ts: consolidate --list to use shared readObservations helper

Dean Sharon and others added 2 commits April 9, 2026 01:04

Dean Sharon and others added 8 commits April 9, 2026 01:39

refactor(memory): remove dead changed flag from addMemoryHooks

a50aea0

dean0x commented Apr 9, 2026

View reviewed changes

Dean Sharon and others added 9 commits April 9, 2026 21:19

refactor(memory): simplify filterProjectsWithMemory, remove language-…

209f3fa

…behavior tests - Deduplicate filterProjectsWithMemory to delegate to hasMemoryDir - Remove 11 tests that only exercised JS builtins, not devflow logic

refactor: use shared get-mtime in session-start-memory, consolidate l…

9f3c4a3

…earn --list - session-start-memory: replace inline stat platform detection with shared get-mtime helper - learn.ts: consolidate --list to use shared readObservations helper

dean0x merged commit 9685f27 into main Apr 9, 2026
4 checks passed

dean0x deleted the fix/179-memory-extraction branch April 9, 2026 21:13

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: queue-based working memory capture (#179)#180

fix: queue-based working memory capture (#179)#180
dean0x merged 19 commits intomainfrom
fix/179-memory-extraction

dean0x commented Apr 8, 2026

Uh oh!

dean0x commented Apr 8, 2026

Uh oh!

dean0x left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

dean0x commented Apr 8, 2026

Summary

What was wrong

Design

Test plan

Uh oh!

dean0x commented Apr 8, 2026

🔍 Code Review: Consolidated High-Confidence Findings (≥80%)

Overview

🔴 BLOCKING Issues (Fix Before Merge)

1. scripts/hooks/preamble — Stale Header Comment (95% confidence)

2. scripts/hooks/preamble & stop-update-memory — Raw JSON Construction Instead of Helper (85% confidence)

3. scripts/hooks/stop-update-memory — Mtime Logic Duplicated (92% confidence)

4. scripts/hooks/stop-update-memory — Queue Overflow Logic Duplicated (85% confidence)

5. scripts/hooks/background-memory-update — Per-Line Subprocess Spawning (95% confidence)

6. tests/shell-hooks.test.ts — Shell Injection Risk in execSync (82% confidence)

🟡 SHOULD-FIX Issues (Gaps in Coverage/Documentation)

7. tests/shell-hooks.test.ts — Missing Content Array Format Test (85% confidence)

8. tests/shell-hooks.test.ts — Missing Queue Overflow Truncation Test (82% confidence)

9. CLAUDE.md — Hook Count and Description Outdated (82–85% confidence)

📊 Summary

Uh oh!

dean0x left a comment

Choose a reason for hiding this comment

Review Findings: High-Confidence Issues (≥80%)

Issues Found

Lower-Confidence Suggestions (60-79%)

Queue Cleanup Logic Duplication (Complexity/Architecture, 70%)

Overflow Threshold Magic Numbers (Complexity, 85%)

Queue File Concurrency Assumption (Architecture, 70%)

Hook Lifecycle Abstraction Opportunity (Architecture, 65%)

Summary by Category

Deduplication Note

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant