fix: prevent beehave noise patterns — title rules, literal guidance, output column patterns by nullhack · Pull Request #164 · nullhack/temple8

nullhack · 2026-05-19T10:32:29Z

Summary

Post-mortem analysis of two features (rate limit buckets, cache orderbook history) revealed recurring patterns where agents add noise to satisfy beehave checks instead of writing natural test code. Six changes across 4 files:

Title special characters (gherkin.md): POs wrote titles with hyphens/periods/underscores which break slug generation. Added explicit "ONLY Unicode letters, digits, and spaces" rule to Key Takeaways, Concepts, Title Conventions, and Common Mistakes.
Literal awareness (gherkin.md): POs wrote display-oriented literals that are awkward for tests. Added guidance: choose short, code-friendly literals.
Output columns and Hypothesis (gherkin.md, test-stubs.md): Scenario Outline output columns like expected=min(a,b) are generated as @given strategies. Documented the reassignment pattern — the @given value is noise, overridden in the test body.
String literal helpers (test-stubs.md): Documented the helper pattern (e.g., base, quote = "FOO/BAR".split("/")) so literals appear naturally instead of being stuffed into assert messages.
write-test skill step 3 (write-test/SKILL.md): Added explicit instructions for output column reassignment and string literal helpers. Added "NEVER stuff literals into assert messages" prohibition.
write-bdd-features skill step 9 (write-bdd-features/SKILL.md): Added beehave check at write time to catch title/placeholder issues before downstream stub generation.

Files Changed

File	Change
`.opencode/knowledge/requirements/gherkin.md`	Title chars, literal guidance, output columns, Hypothesis constraint
`.opencode/knowledge/software-craft/test-stubs.md`	Derived output columns, string literal helpers
`.opencode/skills/write-test/SKILL.md`	Step 3: output column reassignment, literal helper patterns
`.opencode/skills/write-bdd-features/SKILL.md`	Step 9: beehave check at write time

Testing

Applied during live cex-mm session. The cache_orderbook_history feature exposed all three failure modes. After fixes, beehave check and all tests pass clean with zero noise assertions.

…h verification, todo discipline, property patterns for BDD examples Post-session analysis of cex-mm project revealed three systemic failures: 1. Orchestrator routinely bypasses owner dispatch and does work directly. The todo template had no Dispatch step — it jumped from Preparation to Load Skills, so the orchestrator never saw the instruction to dispatch. Fix: added explicit Dispatch step (#2) in todo template with MUST NOT do the work itself constraint and owner mapping table. 2. Branch discipline not enforced at state entry. Agents entered states declaring git:dev while on feature branches and vice versa. Fix: Preparation step now verifies branch matches attrs.git. Golden rule 7 now says 'Verify before starting'. New golden rule 8: feature branches must be merged back to dev before new work starts. 3. Todo list goes stale or disappears mid-state as agents focus on work. Fix: added Todo discipline paragraph requiring update after every step and regeneration if missing. 4. Review-gate skill loaded smell-catalogue at #key-takeaways but detecting violations needs the full document (per progressive knowledge loading rules in AGENTS.md). Fix: step 5 now loads full docs for detection, #key-takeaways only for recall. 5. No guidance for choosing Example vs Scenario Outline during BDD example creation. Agents either over-used Scenario Outlines or under-used them. Fix: added property-patterns knowledge file (Wlaschin, 2014) with seven patterns and a decision tree. Updated write-bdd-features skill step 4 to apply patterns systematically. Added research reference.

@given

…output column patterns Post-mortem analysis of two features (rate limit buckets, cache history) revealed recurring patterns where agents add noise to satisfy beehave checks instead of writing natural test code: 1. Title special characters: POs write titles with hyphens, periods, underscores which break slug generation. Added explicit 'ONLY Unicode letters, digits, and spaces' rule to Key Takeaways, Concepts, Title Conventions, and Common Mistakes in gherkin.md. 2. Literal awareness: POs write display-oriented literals (e.g. 'United States of America') that are awkward for tests. Added guidance in Key Takeaways and Concepts: choose short, code-friendly literals. 3. Output columns and Hypothesis: Scenario Outline output columns (like expected=min(a,b)) are generated as @given strategies. Agents either remove them (breaking example-mismatch) or add noise (_=expected). Documented the reassignment pattern in gherkin.md Concepts and test-stubs.md Concepts. 4. String literal helpers: Documented the _pair_from('FOO/BAR') helper pattern in test-stubs.md. Agents were stuffing literals into assert messages or assigning to _. 5. write-test skill step 3: Added explicit instructions for output column reassignment and string literal helpers. Added 'NEVER stuff literals into assert messages' prohibition. 6. write-bdd-features skill step 9: Added beehave check at write time to catch title/placeholder issues before downstream stub generation.

Anti-patterns (output column reassignment, helper functions for literals, code-friendly literal guidance) treated symptoms — the test working around awkward spec values. Root cause: every value in a spec should carry domain meaning, and both PO and SE must preserve that meaning. Introduces Spec Value Fidelity as a first-class concept in test-design: values exist with domain intent; tests must reflect that intent; noise patterns (assigning to _, assert stuffing, helpers for traceability) violate this fidelity. Cascades to gherkin.md (PO: meaningful literals and columns), test-stubs.md (SE: use values for their domain purpose), and both skills (write-bdd-features adds semantic value check, write-test replaces workaround bullets).

Ubuntu added 3 commits May 19, 2026 08:06

nullhack merged commit 33f7b2e into main May 19, 2026
10 checks passed

nullhack deleted the fix/beehave-test-patterns-post-mortem branch May 19, 2026 11:03

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: prevent beehave noise patterns — title rules, literal guidance, output column patterns#164

fix: prevent beehave noise patterns — title rules, literal guidance, output column patterns#164
nullhack merged 3 commits into
mainfrom
fix/beehave-test-patterns-post-mortem

nullhack commented May 19, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

nullhack commented May 19, 2026

Summary

Files Changed

Testing

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant