nullhack · May 19, 2026
diff --git a/.opencode/knowledge/requirements/property-patterns.md b/.opencode/knowledge/requirements/property-patterns.md
@@ -0,0 +1,107 @@
+---
+domain: requirements
+tags: [property-based-testing, examples, scenario-outline, test-design, bdd, hypothesis]
+last-updated: 2026-05-19
+---
+
+# Property Patterns for BDD Example Selection
+
+## Key Takeaways
+
+- When writing BDD Examples, use these seven property patterns (Wlaschin, 2014) to decide whether an Example should be a simple `Example:` or a `Scenario Outline:` with multiple input combinations.
+- **Simple `Example:`** is appropriate when the behaviour is a single observable outcome with fixed inputs — no interesting property to generalise.
+- **`Scenario Outline:`** is appropriate when the same behavioural outcome holds across multiple input/output combinations — the property pattern reveals which combinations matter.
+- The seven patterns also surface missing Examples: if a property pattern applies but has no corresponding Example, the specification is incomplete.
+
+## Concepts
+
+**Seven Property Patterns** (Wlaschin, 2014). When choosing what to verify in a specification, these patterns help discover what properties (invariants, relationships) the system should satisfy:
+
+| Pattern | Core Idea | When to use Scenario Outline |
+|---------|-----------|------------------------------|
+| Different paths, same destination | Two operation sequences produce the same result | When multiple paths exist to the same outcome (e.g., different orderings, different constructors) |
+| There and back again | An operation and its inverse return to the starting state | When serialise/deserialise, encode/decode, add/remove pairs exist |
+| Some things never change | An invariant is preserved after a transformation | When a transform should preserve size, membership, ordering, or other invariants |
+| The more things change, the more they stay the same | Applying an operation twice is the same as applying it once (idempotence) | When operations should be idempotent (e.g., deduplicate, round, normalise) |
+| Solve a smaller problem first | A property true for a small case implies truth for a composed case (structural induction) | When recursive or composable structures are involved (lists, trees, nested objects) |
+| Hard to prove, easy to verify | Finding the answer is complex, but checking it is simple | When output can be verified by a simpler check (e.g., sort result is a permutation, parse result concatenates to original) |
+| The test oracle | An alternate implementation exists to verify results | When a brute-force or simplified reference implementation can validate the optimised version |
+
+**Using Patterns to Choose Example vs Scenario Outline**: During feature example creation (write-bdd-features skill), apply these patterns to each Rule:
+
+1. For each Rule, ask: "Does any of the seven patterns apply to this behaviour?"
+2. If **no pattern applies** — the behaviour is a single discrete outcome with fixed inputs — write a simple `Example:`.
+3. If a pattern applies and surfaces three or more input combinations with the same step structure, write a `Scenario Outline:` with an `Examples:` table covering the significant combinations; with only one or two combinations, write a simple `Example:` per combination.
+4. If a pattern reveals an edge case not covered by existing Examples — add the missing Example.
+
+**Pattern-to-Example Decision Tree**:
+
+```
+Does the Rule describe an invariant that holds across inputs?
+├─ Yes → Scenario Outline with inputs that exercise the invariant
+│        + Hypothesis property test per [[software-craft/test-design#concepts]]
+└─ No → Does the Rule have "easy to verify" checkable output?
+    ├─ Yes → Can multiple inputs produce different valid outputs?
+    │        ├─ Yes → Scenario Outline with representative input/output pairs
+    │        └─ No → Simple Example with the key input
+    └─ No → Simple Example (single observable outcome)
+```
+
+**Pre-mortem Integration**: During the behavior-level pre-mortem per [[requirements/pre-mortem#concepts]], apply property patterns adversarially: "Given this pattern applies to this Rule, what inputs would break it?" Surface failure modes as additional Examples.
+
+## Content
+
+### Pattern Application Examples
+
+**Different paths, same destination**: A sort function produces the same result regardless of input order. Use Scenario Outline with different input orderings asserting identical sorted output. This also applies to commutative operations: `a + b == b + a`.
+
+**There and back again**: JSON serialisation round-trips for JSON-representable values: `decode(encode(obj)) == obj`. Use Scenario Outline with different object shapes. HTTP encode/decode, compress/decompress, and format conversions all fit this pattern.
+
+**Some things never change**: A `map` operation preserves list length. A `sort` preserves the multiset of elements. Use Scenario Outline with different input sizes and element values, asserting the invariant holds.
+
+**The more things change, the more they stay the same (idempotence)**: Calling `distinct()` twice produces the same result as calling it once. Use Scenario Outline with different input sets, some already distinct, some with duplicates. REST PUT operations are another common case.
+
+**Solve a smaller problem first (structural induction)**: If a property holds for the base case (the empty list) and is preserved when one element is appended, it holds for all finite lists. Use Scenario Outline with list sizes 0, 1, 2, and N to cover the base case and the inductive step.
+
+**Hard to prove, easy to verify**: Finding a prime factorisation is hard, but multiplying the factors back together is trivial. Tokenising a string is hard, but checking that the concatenated tokens equal the original is simple, provided the tokeniser keeps separators. Use Scenario Outline with different input strings or numbers, asserting the verification check.
+
+**Test oracle**: A fast sorting algorithm can be verified against a naive bubble sort. A parallel computation can be verified against a sequential version. Use Scenario Outline where each row exercises a different input against both implementations.
+
+### Integration with BDD Workflow
+
+When the PO (or SE) writes Examples during `write-bdd-features`:
+
+1. Write the Rule's declarative behaviour first (Given/When/Then).
+2. Check each of the seven patterns against the Rule.
+3. For each matching pattern, determine the input combinations that exercise the property.
+4. If 1-2 combinations → simple `Example:` per combination.
+5. If 3+ combinations with the same step structure → `Scenario Outline:` with `Examples:` table.
+6. For invariant/structural Rules → also generate a Hypothesis property test per [[software-craft/test-design#concepts]].
+
+### Hypothesis Property Tests from Patterns
+
+Each invariant/structural Rule should produce both BDD Examples AND a Hypothesis property test. The property pattern guides the Hypothesis strategy:
+
+| Pattern | Hypothesis Strategy |
+|---------|-------------------|
+| Different paths, same destination | `@given(xs=inputs, data=strategies.data())` then `data.draw(strategies.permutations(xs))` and assert the same result |
+| There and back again | `@given(arbitrary_input)` then round-trip assert |
+| Some things never change | `@given(transform_input)` then assert invariant |
+| Idempotence | `@given(input)` then `assert f(f(x)) == f(x)` |
+| Structural induction | `@given(recursive_strategy)` with base + step |
+| Hard to prove, easy to verify | `@given(input)` then verify output with simple check |
+| Test oracle | `@given(input)` then `assert fast(input) == oracle(input)` |
+
+## Related
+
+- [[requirements/gherkin]]
+- [[requirements/pre-mortem]]
+- [[software-craft/test-design]]
+- [[software-craft/tdd]]
+
+## Related
+
+- [[software-craft/test-design]]
+- [[software-craft/tdd]]
+- [[requirements/gherkin]]
+- [[requirements/pre-mortem]]
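
The pattern-to-strategy table in `property-patterns.md` can be illustrated with a small dependency-free sketch. It is not taken from the PR: `random_int_lists`, `distinct`, `bubble_sort`, and the `check_*` helpers are hypothetical names, and the seeded generator is only a stand-in for what Hypothesis's `@given` would supply.

```python
import json
import random


def random_int_lists(runs=200, seed=0):
    """Yield pseudo-random integer lists (assumed stand-in for a Hypothesis strategy)."""
    rng = random.Random(seed)
    yield []  # always include the base case
    for _ in range(runs):
        size = rng.randint(0, 8)
        yield [rng.randint(-50, 50) for _ in range(size)]


def distinct(items):
    """Hypothetical function under test: drop duplicates, keep first-seen order."""
    return list(dict.fromkeys(items))


def bubble_sort(items):
    """Naive reference implementation used as the test oracle."""
    result = list(items)
    for end in range(len(result), 1, -1):
        for i in range(end - 1):
            if result[i] > result[i + 1]:
                result[i], result[i + 1] = result[i + 1], result[i]
    return result


def check_same_destination(xs):
    # Different paths, same destination: sorting the reversed list gives the same output.
    return sorted(xs[::-1]) == sorted(xs)


def check_round_trip(xs):
    # There and back again: decode(encode(x)) == x.
    return json.loads(json.dumps(xs)) == xs


def check_idempotence(xs):
    # Applying the operation twice equals applying it once.
    return distinct(distinct(xs)) == distinct(xs)


def check_oracle(xs):
    # Test oracle: the built-in sort must agree with the naive reference.
    return sorted(xs) == bubble_sort(xs)


def failures(check):
    """Return every generated input for which the property does not hold."""
    return [xs for xs in random_int_lists() if not check(xs)]


if __name__ == "__main__":
    for check in (check_same_destination, check_round_trip,
                  check_idempotence, check_oracle):
        print(check.__name__, "failures:", len(failures(check)))
```

In a real suite, `@given(strategies.lists(strategies.integers()))` would take the place of `random_int_lists`, and Hypothesis would also shrink any failing input to a minimal case.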
diff --git a/.opencode/skills/review-gate/SKILL.md b/.opencode/skills/review-gate/SKILL.md
@@ -15,7 +15,7 @@ Available knowledge: [[software-craft/code-review]], [[software-craft/test-desig
 2. Verify implementation aligns with architectural decisions per [[software-craft/code-review#concepts]]: ADR compliance, quality attributes met.
 3. Verify all `# Constraints:` in the .feature file are met in the implementation. For technology constraints, read domain_spec.md `### Technology Requirements` table and execute the Verification instruction for each row (grep imports, check file existence, inspect config). Zero evidence → FAIL. For quality attribute constraints, verify thresholds are enforced.
 4. Verify implementation aligns with feature specification: all Examples have corresponding test implementations, behavior matches Gherkin steps.
-5. Verify design principles adversarially per the priority order in [[software-craft/tdd#content]], loading ObjCal per [[software-craft/object-calisthenics#key-takeaways]], smells per [[software-craft/smell-catalogue#key-takeaways]], and SOLID per [[software-craft/solid#key-takeaways]].
+5. Verify design principles adversarially per the priority order in [[software-craft/tdd#content]], loading the full documents for detection: ObjCal per [[software-craft/object-calisthenics]], smells per [[software-craft/smell-catalogue]], and SOLID per [[software-craft/solid]]. Use `#key-takeaways` only when recalling principles, not when detecting violations.
 6. **FAIL-FAST**: If any design violations found → exit `fail` with specific citations (file:line). Do NOT proceed to structure review.
 
 ## Tier 2: Structure Review

diff --git a/.opencode/skills/write-bdd-features/SKILL.md b/.opencode/skills/write-bdd-features/SKILL.md
@@ -5,22 +5,22 @@ description: "Write concrete Given/When/Then Example blocks for each Rule in the
 
 # Write BDD Features
 
-Available knowledge: [[requirements/gherkin]], [[requirements/moscow]], [[requirements/pre-mortem]], [[requirements/decomposition]]. `in` artifacts: read all before starting work.
+Available knowledge: [[requirements/gherkin]], [[requirements/moscow]], [[requirements/pre-mortem]], [[requirements/decomposition]], [[requirements/property-patterns]]. `in` artifacts: read all before starting work.
 
 1. Discover and read the feature file, product definition, domain spec, and glossary from `in`.
 2. Run a pre-mortem per [[requirements/pre-mortem]] for each Rule before writing any Examples. All Rules must have their pre-mortems completed before any Examples are written.
 3. IF hidden failure modes surface from the pre-mortem → plan Examples to cover them per [[requirements/gherkin#key-takeaways]].
-4. For each Rule, write Example or Scenario Outline blocks directly from the Rule description and domain spec knowledge per [[requirements/gherkin#concepts]]. Do NOT use behavior hints — they have been removed from the flow. Derive Example behavior directly from:
-   - The Rule's behavioral description paragraph
-   - The domain spec's External Contracts, Data Shapes, and Invariants
-   - The feature's `# Constraints:` comments
-   - Quality attributes from product_definition.md
-   Write Examples per format rules in [[requirements/gherkin#concepts]].
+4. For each Rule, apply property patterns per [[requirements/property-patterns#concepts]] to determine Example structure:
+    a) Check each of the seven patterns against the Rule's behaviour.
+    b) If no pattern applies → write a simple `Example:` with fixed inputs.
+    c) If a pattern applies and reveals 3+ input combinations with the same step structure → write a `Scenario Outline:` with an `Examples:` table covering the significant combinations surfaced by the pattern.
+    d) If a pattern applies but only reveals 1-2 combinations → write simple `Example:` per combination.
+    Write Examples per format rules in [[requirements/gherkin#concepts]], deriving behavior from the Rule's description, domain spec External Contracts/Data Shapes/Invariants, the feature's `# Constraints:` comments, and quality attributes from product_definition.md.
 5. For each Rule, verify Examples cover distinct behaviours per [[requirements/gherkin#concepts]]:
    a) Group Examples by `Then` outcome. Same outcome = same behaviour. Keep one representative per outcome. Discard duplicates. Exception: Scenario Outline rows are parameterized variants of the same behaviour — they are NOT duplicates.
    b) For each distinct outcome, run the behavior-level pre-mortem per [[requirements/pre-mortem#concepts]].
    c) Add Examples targeting the failure modes surfaced.
-   d) Structural (invariant) rules: one representative Example suffices. Defer full coverage to a Hypothesis property test per [[software-craft/test-design#concepts]].
+    d) Structural (invariant) rules: one representative Example suffices. Defer full coverage to a Hypothesis property test per [[software-craft/test-design#concepts]], using the pattern-to-strategy mapping in [[requirements/property-patterns#content]].
 6. Classify each Example per [[requirements/moscow#concepts]]; MoSCoW classification is for internal triage only: do NOT add Must/Should/Could tags to Examples in the .feature file.
 7. IF a Rule has more than 8 Must behaviors (after grouping by Then-outcome and collapsing Scenario Outlines) → this is a soft flag for PO review. Do NOT split or modify the Rule — Rule structure is frozen after define-flow. Decomposition was applied during refine-features; this check catches edge cases that slipped through. A Rule with 9+ Must behaviors is acceptable if the behaviour genuinely requires that many distinct cases.
 8. Evaluate each Rule's Examples for quality, checking every criterion per [[requirements/gherkin#concepts]]:
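
The decision rule in step 4 (a to d) can be sketched as a small function. This is an illustrative sketch, not part of the skill: `choose_structure` and `render_outline` are hypothetical names, and the fallback for three or more combinations that do not share a step structure is an assumption, since the skill text does not cover that case.

```python
def choose_structure(pattern_applies, combinations, same_step_structure=True):
    """Map the step-4 rule onto a Gherkin block type."""
    if not pattern_applies:
        return "Example"  # 4b: fixed inputs, single outcome
    if combinations >= 3 and same_step_structure:
        return "Scenario Outline"  # 4c: table of significant combinations
    # 4d, plus the assumed fallback when step structures differ
    return "Example per combination"


def render_outline(title, steps, header, rows):
    """Render a Scenario Outline with an Examples table as Gherkin text."""
    lines = [f"Scenario Outline: {title}"]
    lines += [f"  {step}" for step in steps]
    lines.append("")
    lines.append("  Examples:")
    lines.append("    | " + " | ".join(header) + " |")
    for row in rows:
        lines.append("    | " + " | ".join(str(cell) for cell in row) + " |")
    return "\n".join(lines)


if __name__ == "__main__":
    rows = [("3,1,2", "1,2,3"), ("2,3,1", "1,2,3"), ("1,2,3", "1,2,3")]
    if choose_structure(True, len(rows)) == "Scenario Outline":
        print(render_outline(
            "Sorting ignores input order",
            ["Given the list <input>",
             "When it is sorted",
             "Then the result is <sorted>"],
            ("input", "sorted"),
            rows,
        ))
```

The three rows exercise the "different paths, same destination" pattern: every ordering of the same elements is expected to produce the same sorted output.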

diff --git a/AGENTS.md b/AGENTS.md
@@ -8,7 +8,8 @@ Post-mortem analysis shows these practices prevent most project failures. Violat
 4. **Never decompose a feature without stakeholder approval.** If a feature is too large for INVEST, propose the split to the stakeholder with rationale. They decide what's core vs. deferred.
 5. **Verify inputs exist before entering a state.** Every state's `in` artifacts must be readable on disk. If they're missing, stop and reconstruct them. Don't proceed with assumed knowledge.
 6. **A feature is not done until every interview requirement is traced.** Every stakeholder Q&A must map to either a passing @id test or an explicit stakeholder deferral. Untraced requirements = incomplete delivery.
-7. **Respect git branch discipline.** Every state declares `git: dev`, `git: feature`, or `git: main` in its attrs. Work on the branch the state declares. Never switch branches mid-state. Before exiting a project-phase flow (discovery, architecture, branding, setup), set `committed-to-dev-locally: ==verified` evidence. Changes must be committed to dev before advancing.
+7. **Respect git branch discipline.** Every state declares `git: dev`, `git: feature`, or `git: main` in its attrs. **Verify the current branch matches `attrs.git` before starting any work.** If the branch is wrong, checkout or create the correct branch before proceeding. Never switch branches mid-state. Before exiting a project-phase flow (discovery, architecture, branding, setup), set `committed-to-dev-locally: ==verified` evidence. Changes must be committed to dev before advancing.
+8. **Every feature branch must be merged back to dev.** A feature is not delivered until its commits are squash-merged into local dev and `task test-fast` passes on dev. The develop-flow exits to deliver-flow, which handles the merge, but the orchestrator must never leave a feature branch dangling: if the session ends mid-feature, resume and complete the merge before starting new work.
 
 ## Project Structure
 - `.flowr/flows/`: YAML state machine definitions (source of truth for routing)
@@ -163,16 +164,20 @@ Exception: The polish-code skill explicitly runs convention commands (`task conv
 
 ### Todo-Driven State Execution
 
-At state entry, generate a procedural todo list from the state's metadata using the todowrite tool. Format: `[X]` completed, `[ ]` pending, `[~]` anchor (always last).
+At state entry, generate a procedural todo list using the todowrite tool. Format: `[X]` completed, `[ ]` pending, `[~]` anchor (always last).
 
-1. **Preparation** (`[ ]`): list available `in` artifacts
-2. **Dispatch** (`[ ]`): call the state's owner agent with skills loaded
-3. **Output** (`[ ]`): one per `out` artifact
-4. **Verification** (`[ ]`): check constraints, run tests/lint if applicable
-5. **Anchor** (`[~]`, always last): flowr next → pick transition → flowr transition → rewrite todo
+1. **Preparation** (`[ ]`): verify current branch matches `attrs.git` (checkout or create if wrong). List available `in` artifacts.
+2. **Dispatch** (`[ ]`): dispatch to the owner agent listed in `attrs.owner` as a subagent with skills loaded. The orchestrator MUST NOT do the work itself — only route. Owner mapping: `PO` → product-owner, `DE` → domain-expert, `SE` → software-engineer, `SA` → system-architect, `R` → reviewer, `Design Agent` → design-agent, `Setup Agent` → setup-agent.
+3. **Load skills** (`[ ]`): read every skill file listed in `attrs.skills` from `.opencode/skills/<skill_name>/SKILL.md`. This step is MANDATORY — never skip it.
+4. **Skill-derived work items** (`[ ]`): one todo item per numbered step in the skill, using the skill's own language verbatim. These are the substantive work items. Self-generated items are only permitted for infrastructure (read artifacts, commit) — never for the core procedure.
+5. **Output** (`[ ]`): one per `out` artifact
+6. **Verification** (`[ ]`): check constraints, run tests/lint if applicable
+7. **Anchor** (`[~]`, always last): flowr next → pick transition → flowr transition → rewrite todo
 
 The todo is the execution contract. Every item must be marked `[X]` before the anchor fires. One state per todo; never span multiple states or collapse loop iterations. Full protocol: [[workflow/todo-anchor-protocol]].
 
+**Todo discipline**: After completing ANY step, update the todo through the todowrite tool: mark the finished step `[X]` and set the next `[ ]` step to `in_progress`. If the todo list is empty or missing, regenerate it immediately, because working without a todo means working without a contract. Never let the todo go stale between steps.
+
 ### Session Init
 
 Before starting a flow, create a session to track progress: