From e1c76f9527512cce629edc996ccfef3625338de6 Mon Sep 17 00:00:00 2001 From: nullhack Date: Wed, 22 Apr 2026 02:56:07 -0400 Subject: [PATCH 1/9] feat(arch): write domain model, system overview, and 5 ADRs for beehave v1 --- WORK.md | 4 +- docs/adr/ADR-2026-04-22-adapter-protocol.md | 27 ++++ .../ADR-2026-04-22-error-handling-policy.md | 32 ++++ ...DR-2026-04-22-feature-file-write-policy.md | 28 ++++ docs/adr/ADR-2026-04-22-id-stability.md | 29 ++++ docs/adr/ADR-2026-04-22-module-structure.md | 29 ++++ docs/adr/ADR-2026-04-22-slug-derivation.md | 27 ++++ docs/domain-model.md | 145 ++++++++++++++++++ docs/system.md | 63 +++++++- 9 files changed, 376 insertions(+), 8 deletions(-) create mode 100644 docs/adr/ADR-2026-04-22-adapter-protocol.md create mode 100644 docs/adr/ADR-2026-04-22-error-handling-policy.md create mode 100644 docs/adr/ADR-2026-04-22-feature-file-write-policy.md create mode 100644 docs/adr/ADR-2026-04-22-id-stability.md create mode 100644 docs/adr/ADR-2026-04-22-module-structure.md create mode 100644 docs/adr/ADR-2026-04-22-slug-derivation.md create mode 100644 docs/domain-model.md diff --git a/WORK.md b/WORK.md index 12d1342..0457663 100644 --- a/WORK.md +++ b/WORK.md @@ -13,10 +13,12 @@ Each item carries exactly the variables defined by `FLOW.md`: -*(no active items)* +*(no active items — waiting for PO to move chosen feature to in-progress/)* --- ## Session Log + +2026-04-22 00:00 | @system-architect | [none] | [IDLE] | Read all 14 backlog features; wrote docs/domain-model.md, docs/system.md, docs/adr/ (5 ADRs); recommended config-reading as first feature; awaiting PO to move feature to in-progress/ and confirm branch creation diff --git a/docs/adr/ADR-2026-04-22-adapter-protocol.md b/docs/adr/ADR-2026-04-22-adapter-protocol.md new file mode 100644 index 0000000..5568884 --- /dev/null +++ b/docs/adr/ADR-2026-04-22-adapter-protocol.md @@ -0,0 +1,27 @@ +# ADR: FrameworkAdapter as a structural Protocol + +| Field | Value | +|-------|-------| +| **Date** | 2026-04-22 | +| **Feature** | adapter-contract, pytest-adapter | +| **Status** | Accepted | + +## Decision + +`FrameworkAdapter` is defined as a `typing.Protocol` (structural subtyping), not an abstract base class. + +## Reason + +Structural protocols allow third-party adapters to satisfy the interface without importing from `beehave`, keeping the dependency graph clean and enabling duck-typed adapters in future without a registration mechanism. + +## Alternatives Considered + +- **ABC (abstract base class)**: rejected — forces third-party adapters to import `beehave`; creates a hard coupling; adds no runtime safety benefit over Protocol + pyright +- **Plugin entry points (importlib.metadata)**: rejected — out of v1 scope; over-engineered for a single built-in adapter; can be layered on top of Protocol later + +## Consequences + +- (+) Third-party adapters need zero beehave imports to satisfy the contract +- (+) pyright enforces the Protocol at type-check time with zero runtime overhead +- (+) Easy to extend to entry-point registration in v2 without breaking existing adapters +- (-) Runtime errors if an adapter is missing a method (no ABC `__abstractmethods__` guard); mitigated by pyright diff --git a/docs/adr/ADR-2026-04-22-error-handling-policy.md b/docs/adr/ADR-2026-04-22-error-handling-policy.md new file mode 100644 index 0000000..9ea371c --- /dev/null +++ b/docs/adr/ADR-2026-04-22-error-handling-policy.md @@ -0,0 +1,32 @@ +# ADR: Error vs. warn policy for deletions, duplicates, and malformed config + +| Field | Value | +|-------|-------| +| **Date** | 2026-04-22 | +| **Feature** | sync-cleanup, id-generation, config-reading | +| **Status** | Accepted | + +## Decision + +Conditions that are **recoverable or informational** produce warnings (exit 0); conditions that are **unrecoverable or data-destructive** produce errors (exit non-zero). The warn/error threshold for deleted `.feature` files and duplicate `@id` values is configurable in `[tool.beehave]`. + +## Reason + +Unanswered gap A1 from scope_journal.md. Defaulting to warn-only for deletions and duplicates keeps beehave non-blocking in normal developer workflows while allowing CI pipelines to tighten the policy via config. Malformed `pyproject.toml` and invalid Gherkin are always errors because beehave cannot proceed without valid input. + +## Alternatives Considered + +- **Always error on any anomaly**: rejected — too aggressive; breaks developer workflow on partial migrations +- **Always warn, never error**: rejected — CI pipelines need a way to gate on drift; configurable threshold is the right balance +- **Separate `--strict` flag**: considered but deferred — config key is sufficient for v1; a flag can be added later + +## Consequences + +- (+) Default behavior is non-blocking; developers can run beehave at any project state +- (+) CI pipelines can tighten policy via `pyproject.toml` without code changes +- (-) Two code paths per condition (warn vs. error) must be maintained +- (-) Gap A1 is partially resolved: malformed `pyproject.toml` → error; invalid Gherkin → error; read-only filesystem → error with clear message; deleted feature → configurable warn/error (default warn) + +## Unresolved (escalate to PO before implementation) + +- A2 (performance targets) and A4 (structured logging) remain unanswered. These do not block v1 feature implementation but should be resolved before the cache-management and status features are accepted. diff --git a/docs/adr/ADR-2026-04-22-feature-file-write-policy.md b/docs/adr/ADR-2026-04-22-feature-file-write-policy.md new file mode 100644 index 0000000..9556f66 --- /dev/null +++ b/docs/adr/ADR-2026-04-22-feature-file-write-policy.md @@ -0,0 +1,28 @@ +# ADR: Feature file write policy — @id tags only + +| Field | Value | +|-------|-------| +| **Date** | 2026-04-22 | +| **Feature** | id-generation, sync-create, sync-update | +| **Status** | Accepted | + +## Decision + +beehave writes to `.feature` files **only** to add `@id` tags to untagged Examples; all other content is read-only. + +## Reason + +`.feature` files are the source of truth for requirements. Any write beyond `@id` assignment risks corrupting human-authored Gherkin, breaking stakeholder trust, and making beehave unsafe to run in CI. + +## Alternatives Considered + +- **Full rewrite on sync**: rejected — destroys formatting, comments, and authoring intent +- **Separate ID sidecar file**: rejected — breaks the one-file-per-feature contract and complicates tooling +- **No write-back at all (IDs in sidecar)**: rejected — `@id` must be visible in the `.feature` file for traceability + +## Consequences + +- (+) Feature files remain human-readable and diff-friendly +- (+) beehave is safe to run on any project without risk of data loss +- (-) beehave must parse and re-serialize Gherkin with exact whitespace preservation — non-trivial implementation constraint +- (-) Malformed `@id` tags must be detected and replaced in-place, requiring careful regex/AST handling diff --git a/docs/adr/ADR-2026-04-22-id-stability.md b/docs/adr/ADR-2026-04-22-id-stability.md new file mode 100644 index 0000000..8f0350e --- /dev/null +++ b/docs/adr/ADR-2026-04-22-id-stability.md @@ -0,0 +1,29 @@ +# ADR: @id assignment and collision policy + +| Field | Value | +|-------|-------| +| **Date** | 2026-04-22 | +| **Feature** | id-generation | +| **Status** | Accepted | + +## Decision + +`@id` values are 8-character lowercase hex strings; once assigned they are never replaced (unless malformed); collisions trigger a silent retry; uniqueness is enforced project-wide across all `.feature` files. + +## Reason + +Stable IDs are the sole link between an Example and its test stub. Any ID change breaks that link, orphaning the stub and losing the developer's test body. Project-wide uniqueness prevents ambiguous stub lookups. + +## Alternatives Considered + +- **Sequential integers**: rejected — not stable across file renames or reordering; leaks ordering information +- **UUID v4 (full 32 hex chars)**: rejected — too verbose in `.feature` files; 8 hex chars gives 4 billion values, sufficient for any realistic project +- **Content-hash of step text**: rejected — changes when steps are edited, breaking the stability guarantee +- **File-scoped uniqueness**: rejected — beehave scans all feature files; project-wide uniqueness is required for unambiguous stub lookup + +## Consequences + +- (+) Stubs survive Example reordering, file renames, and step text edits +- (+) `@id` in function name is the only lookup key needed — no path or title matching required +- (-) ID generation requires a full scan of all `.feature` files before assigning new IDs (to check uniqueness) +- (-) Malformed IDs must be detected and replaced — requires careful validation logic diff --git a/docs/adr/ADR-2026-04-22-module-structure.md b/docs/adr/ADR-2026-04-22-module-structure.md new file mode 100644 index 0000000..8d0897e --- /dev/null +++ b/docs/adr/ADR-2026-04-22-module-structure.md @@ -0,0 +1,29 @@ +# ADR: Package module structure — seven bounded contexts + +| Field | Value | +|-------|-------| +| **Date** | 2026-04-22 | +| **Feature** | all | +| **Status** | Accepted | + +## Decision + +The `beehave` package is organized into seven submodules matching the seven bounded contexts: `cli`, `config`, `parsing`, `sync`, `adapters`, `cache`, `scaffold`. Each submodule has a single responsibility and a strict dependency order (see domain-model.md Module Dependency Graph). + +## Reason + +The domain analysis reveals seven distinct responsibilities with clear boundaries. Flat module structure would mix parsing, sync, and adapter concerns, making it impossible to test them in isolation and difficult to add new adapters or cache backends later. + +## Alternatives Considered + +- **Flat package (all in `beehave/`)**: rejected — 14 features across 7 concerns; flat structure would produce a 1000+ line module with no clear seams +- **Three layers (core/adapters/cli)**: considered — too coarse; `parsing` and `sync` have different change rates and different external dependencies +- **Domain-driven with ports/adapters folders**: considered — over-engineered for v1; the `adapters/` submodule already provides the extension point; explicit `ports/` folder adds ceremony without benefit + +## Consequences + +- (+) Each submodule can be tested in isolation with no mocking of other submodules +- (+) New adapters are added by creating a new class in `beehave/adapters/` — no other module changes +- (+) `parsing` and `cache` are independently replaceable +- (-) Seven submodules means seven `__init__.py` files and more import paths to maintain +- (-) The `sync` module is the most complex (depends on parsing, adapters, cache) — it must be carefully bounded diff --git a/docs/adr/ADR-2026-04-22-slug-derivation.md b/docs/adr/ADR-2026-04-22-slug-derivation.md new file mode 100644 index 0000000..693683f --- /dev/null +++ b/docs/adr/ADR-2026-04-22-slug-derivation.md @@ -0,0 +1,27 @@ +# ADR: Slug derivation is stage-folder-independent + +| Field | Value | +|-------|-------| +| **Date** | 2026-04-22 | +| **Feature** | sync-create, sync-update, sync-cleanup, nest | +| **Status** | Accepted | + +## Decision + +`FeatureSlug` is derived solely from the `.feature` file's **stem** (filename without extension and without the stage subfolder path); the stage folder (`backlog/`, `in-progress/`, `completed/`, or root) is ignored. + +## Reason + +A feature moves through stage folders during its lifecycle. If the slug included the stage path, moving a feature from `in-progress/` to `completed/` would change the slug, rename the test directory, and orphan all stubs — a destructive side-effect of a purely administrative operation. + +## Alternatives Considered + +- **Include stage folder in slug**: rejected — causes spurious orphans and renames on every stage transition +- **Flat features directory (no stage subfolders)**: rejected — the stage folder structure is a project requirement (from `nest` feature) +- **Configurable slug derivation**: rejected — YAGNI; one derivation rule is sufficient for v1 + +## Consequences + +- (+) Moving a feature between stage folders has zero effect on test stubs +- (+) Slug derivation is a pure function of the file stem — simple and testable +- (-) Two feature files with the same stem in different stage folders would collide (e.g. `backlog/login.feature` and `in-progress/login.feature`). This is treated as a user error — beehave warns and uses the first one found. The WIP-limit-of-1 workflow enforced by AGENTS.md makes this practically impossible. diff --git a/docs/domain-model.md b/docs/domain-model.md new file mode 100644 index 0000000..381e45a --- /dev/null +++ b/docs/domain-model.md @@ -0,0 +1,145 @@ +# Domain Model: beehave + +> Living reference of code-facing domain entities. +> Owned by the system-architect. Created and updated at Step 2. +> The product-owner reads this file to check existing entities during discovery, but never writes to it. +> Append-only: add new entries at the bottom. Deprecate old entries by moving them to the Deprecated section. +> Never edit existing live entries — code depends on them. + +--- + +## Bounded Contexts + +| Context | Responsibility | Key Modules | +|---------|----------------|-------------| +| **Parsing** | Read `.feature` files; extract structure; assign `@id` tags | `beehave/parsing/` | +| **Sync** | Reconcile parsed feature state with test stub state | `beehave/sync/` | +| **Adapters** | Render framework-specific stub text | `beehave/adapters/` | +| **Config** | Read `pyproject.toml`; merge CLI overrides | `beehave/config/` | +| **Cache** | Track last-known feature file state for incremental sync | `beehave/cache/` | +| **CLI** | Entry points: `nest`, `sync`, `status`, `hatch`, `version` | `beehave/cli/` | +| **Scaffold** | Create directory structure and inject config | `beehave/scaffold/` | + +--- + +## Entities + +| Name | Type | Description | Bounded Context | First Appeared | +|------|------|-------------|-----------------|----------------| +| `FeatureFile` | Entity | A `.feature` file on disk; identified by its path; source of truth for requirements | Parsing | domain-model | +| `Feature` | Value Object | Parsed representation of a Gherkin `Feature:` block; carries title, description, tags, and child Rules | Parsing | domain-model | +| `Rule` | Value Object | A `Rule:` block inside a Feature; groups related Examples; maps to one test file | Parsing | domain-model | +| `Example` | Value Object | A single `Example:` / `Scenario:` block; the atomic unit of acceptance criteria; carries tags, steps, and optional Outline table | Parsing | domain-model | +| `ScenarioOutline` | Value Object | A parameterized Example with an `Examples:` table; extends `Example`; columns become parametrize args | Parsing | domain-model | +| `ExampleId` | Value Object | An 8-char lowercase hex string (`@id:`); stable identity linking an Example to its test stub | Parsing | domain-model | +| `FeatureSlug` | Value Object | Snake-case string derived from a feature file's stem; used as test directory name and function name prefix | Parsing | domain-model | +| `RuleSlug` | Value Object | Snake-case string derived from a `Rule:` title; used as the test file name | Parsing | domain-model | +| `GherkinStep` | Value Object | A single Given/When/Then/And/But step line; carries keyword and text | Parsing | domain-model | +| `TestStub` | Entity | A generated Python test function; identified by `@id` embedded in its name; carries docstring, skip marker, body | Sync | domain-model | +| `TestFile` | Entity | A Python test file at `tests/features//_test.py`; contains one or more TestStubs | Sync | domain-model | +| `SyncPlan` | Value Object | Immutable description of all changes sync would make: stubs to create, update, move, warn about | Sync | domain-model | +| `SyncResult` | Value Object | Outcome of executing a SyncPlan; lists created/updated/moved/warned items | Sync | domain-model | +| `Orphan` | Value Object | A TestStub whose `@id` has no matching Example in any FeatureFile | Sync | domain-model | +| `FrameworkAdapter` | Protocol | Interface all adapters must implement; supplies skip marker, deprecated marker, parametrize template, stub header | Adapters | domain-model | +| `PytestAdapter` | Entity | Concrete adapter for pytest; implements `FrameworkAdapter` | Adapters | domain-model | +| `BeehaveConfig` | Value Object | Resolved configuration: `framework`, `features_dir`, `template_path`, `on_delete` policy; defaults applied | Config | domain-model | +| `RawConfig` | Value Object | Unvalidated key-value pairs read directly from `[tool.beehave]` in `pyproject.toml` | Config | domain-model | +| `CacheEntry` | Value Object | Per-file cache record: path, mtime, size, content hash | Cache | domain-model | +| `FeatureCache` | Entity | The full cache state; persisted as `.beehave_cache/features.json`; tracks all known FeatureFiles | Cache | domain-model | +| `StubTemplate` | Value Object | Rendered text template for a single stub function; produced by a FrameworkAdapter | Adapters | domain-model | +| `ColumnSet` | Value Object | Ordered set of column names from a Scenario Outline's `Examples:` table | Parsing | domain-model | +| `DemoFeature` | Value Object | Bee-themed demo `.feature` file content generated by `hatch`; covers Feature/Rule/Example/Outline patterns | CLI | domain-model | + +--- + +## Verbs + +| Name | Actor | Object | Description | First Appeared | +|------|-------|--------|-------------|----------------| +| `parse` | Parser | `FeatureFile` → `Feature` | Read a `.feature` file and return its structured representation | Parsing | +| `assign_ids` | IdAssigner | `Feature` → `Feature` | Assign `ExampleId` to any Example lacking a valid one; write back in-place | Parsing | +| `generate_id` | IdAssigner | — → `ExampleId` | Generate a unique 8-char lowercase hex id; retry on collision | Parsing | +| `slugify` | — | `str` → `FeatureSlug` / `RuleSlug` | Convert a name to snake_case slug | Parsing | +| `plan` | SyncEngine | `[Feature]` × `[TestFile]` → `SyncPlan` | Compute the diff between current feature state and test stub state | Sync | +| `execute` | SyncEngine | `SyncPlan` → `SyncResult` | Apply the plan: create/update/move stubs; emit warnings | Sync | +| `render_stub` | FrameworkAdapter | `Example` → `StubTemplate` | Render a test stub function text for the given Example | Adapters | +| `render_parametrized_stub` | FrameworkAdapter | `ScenarioOutline` → `StubTemplate` | Render a parametrized stub for a Scenario Outline | Adapters | +| `read_config` | ConfigReader | `Path` → `BeehaveConfig` | Read `pyproject.toml`, extract `[tool.beehave]`, apply defaults | Config | +| `merge_cli` | ConfigReader | `BeehaveConfig` × `CLIArgs` → `BeehaveConfig` | Override config values with CLI flag values | Config | +| `load_cache` | CacheManager | `Path` → `FeatureCache` | Load cache from disk; rebuild silently if missing/stale/corrupt | Cache | +| `save_cache` | CacheManager | `FeatureCache` → None | Persist cache to `.beehave_cache/features.json` | Cache | +| `is_stale` | CacheManager | `CacheEntry` × `FeatureFile` → `bool` | Check if a cached entry is out of date | Cache | +| `scaffold` | Scaffolder | `BeehaveConfig` → None | Create directory structure and inject `[tool.beehave]` into `pyproject.toml` | Scaffold | +| `check_scaffold` | Scaffolder | `BeehaveConfig` → `bool` | Verify structure is complete without modifying anything | Scaffold | +| `hatch` | DemoGenerator | `Path` → None | Write bee-themed demo `.feature` files; skip if file already exists | CLI | +| `detect_orphans` | SyncEngine | `[TestStub]` × `[ExampleId]` → `[Orphan]` | Find stubs whose `@id` has no matching Example | Sync | +| `detect_misplaced` | SyncEngine | `[TestStub]` × `[Feature]` → `[(TestStub, Path)]` | Find stubs in wrong directory | Sync | +| `propagate_deprecated` | SyncEngine | `Feature` → `Feature` | Apply `@deprecated` cascade: Feature/Rule → all child Examples | Sync | + +--- + +## Relationships + +| Subject | Relation | Object | Cardinality | Notes | +|---------|----------|--------|-------------|-------| +| `FeatureFile` | contains | `Feature` | 1:1 | One Feature per file | +| `Feature` | contains | `Rule` | 1:N | One or more Rules per Feature | +| `Rule` | contains | `Example` | 1:N | One or more Examples per Rule | +| `Example` | has | `ExampleId` | 1:1 | Assigned by `assign_ids`; stable once set | +| `Example` | has | `GherkinStep` | 1:N | Ordered list of steps | +| `ScenarioOutline` | extends | `Example` | 1:1 | Adds `ColumnSet` and rows | +| `ScenarioOutline` | has | `ColumnSet` | 1:1 | Column names for parametrize | +| `FeatureFile` | maps-to | `FeatureSlug` | 1:1 | Derived from file stem; stage-folder-independent | +| `Rule` | maps-to | `RuleSlug` | 1:1 | Derived from Rule title | +| `Rule` | maps-to | `TestFile` | 1:1 | `tests/features//_test.py` | +| `Example` | maps-to | `TestStub` | 1:1 | Identified by `@id` in function name | +| `TestStub` | lives-in | `TestFile` | N:1 | Multiple stubs per file | +| `FrameworkAdapter` | renders | `StubTemplate` | 1:N | One adapter, many stubs | +| `PytestAdapter` | implements | `FrameworkAdapter` | 1:1 | v1 only built-in | +| `BeehaveConfig` | selects | `FrameworkAdapter` | 1:1 | Via `framework` key | +| `BeehaveConfig` | points-to | `StubTemplate` | 0:1 | Via `template_path`; None = use adapter default | +| `SyncPlan` | references | `FeatureFile` | 1:N | Plan covers all changed files | +| `SyncPlan` | references | `TestFile` | 1:N | Plan covers all affected test files | +| `FeatureCache` | contains | `CacheEntry` | 1:N | One entry per known FeatureFile | +| `CacheEntry` | tracks | `FeatureFile` | 1:1 | Path + mtime + hash | +| `Orphan` | wraps | `TestStub` | 1:1 | Orphan is a classification of a stub | +| `Feature` | may-carry | `@deprecated` | 1:0..1 | Cascades to all child Rules and Examples | +| `Rule` | may-carry | `@deprecated` | 1:0..1 | Cascades to all child Examples | +| `Example` | may-carry | `@deprecated` | 1:0..1 | Direct; no override of parent in v1 | + +--- + +## Module Dependency Graph + +``` +CLI ──► Config ──► (pyproject.toml) + │ + ├──► Scaffold ──► (filesystem) + │ + ├──► Parsing ──► (filesystem / gherkin-official) + │ │ + │ └──► Cache ──► (filesystem) + │ + ├──► Sync ──► Parsing + │ │ + │ └──► Adapters + │ + └──► Adapters ──► (templates) +``` + +**Dependency rules (enforced):** +- `Parsing` has no dependency on `Sync`, `Adapters`, `CLI`, or `Scaffold` +- `Adapters` has no dependency on `Parsing`, `Sync`, `CLI`, or `Scaffold` +- `Config` has no dependency on `Parsing`, `Sync`, `Adapters`, or `Scaffold` +- `Cache` depends only on `Parsing` (for `FeatureFile` type) +- `Sync` depends on `Parsing`, `Adapters`, and `Cache` +- `CLI` depends on all other contexts; it is the composition root +- `Scaffold` depends only on `Config` + +--- + +## Deprecated + +| Name | Type | Deprecated Date | Replaced By | Reason | +|------|------|-----------------|-------------|--------| +| *(none)* | — | — | — | — | diff --git a/docs/system.md b/docs/system.md index efb9650..c71e29d 100644 --- a/docs/system.md +++ b/docs/system.md @@ -1,8 +1,14 @@ -# System: +# System Overview: beehave -> Last updated: YYYY-MM-DD — +> Last updated: 2026-04-22 — initial architecture (no features completed yet) -**Purpose:** +**Purpose:** beehave keeps Gherkin `.feature` files and Python test stubs in sync — assigning stable `@id` tags to Examples and generating/updating skipped test functions so living documentation and test scaffolding never diverge. + +--- + +## Summary + +beehave is a framework-agnostic CLI and Python library. Developers run `beehave sync` to reconcile their `.feature` files with their test suite: untagged Examples receive stable `@id` tags written back in-place, new stubs are created, changed stubs are updated, and orphaned stubs are flagged. A `beehave status` dry-run previews changes without writing anything. `beehave nest` bootstraps the canonical directory structure; `beehave hatch` generates demo content. The active test framework (default: pytest) is selected via `[tool.beehave]` in `pyproject.toml` or the `--framework` CLI flag. --- @@ -10,7 +16,9 @@ | Actor | Needs | |-------|-------| -| | | +| Developer | Run `sync` to keep stubs current; run `status` in CI to gate on drift; run `nest` once per project | +| CI pipeline | `beehave status` exit codes (0 = in sync, 1 = drift); `--json` output for machine parsing | +| Framework author | `FrameworkAdapter` Protocol to supply stub conventions without forking beehave | --- @@ -18,13 +26,27 @@ | Module | Responsibility | |--------|----------------| -| | | +| `beehave/` | Package root; public Python API surface | +| `beehave/cli/` | CLI entry points: `nest`, `sync`, `status`, `hatch`, `version` (uses `fire`) | +| `beehave/config/` | Read `[tool.beehave]` from `pyproject.toml`; apply defaults; merge CLI overrides | +| `beehave/parsing/` | Parse `.feature` files via `gherkin-official`; assign `@id` tags; derive slugs | +| `beehave/sync/` | Compute `SyncPlan`; execute create/update/move/warn operations on test stubs | +| `beehave/adapters/` | `FrameworkAdapter` Protocol + `PytestAdapter` concrete implementation | +| `beehave/cache/` | `FeatureCache` JSON persistence; stale/corrupt detection; incremental sync support | +| `beehave/scaffold/` | Create directory structure; inject `[tool.beehave]` into `pyproject.toml` | --- ## Key Decisions -- +- `.feature` files are the single source of truth; beehave only writes `@id` tags back to them — nothing else +- Test stub identity is the `@id` embedded in the function name (`test__`); this is the only stable link between a stub and its Example +- Framework adapters are selected by config/flag, not auto-detected; default is `pytest` +- Stage subfolders (`backlog/`, `in-progress/`, `completed/`) are transparent to sync — all map to the same `tests/features//` directory +- Orphan stubs are warned about but never deleted automatically +- `@deprecated` cascade is absolute in v1: Feature/Rule `@deprecated` propagates to all child Examples with no per-Example override +- Cache is invisible to users; auto-rebuilt if missing, stale, or corrupt +- See ADR-2026-04-22-feature-file-write-policy, ADR-2026-04-22-adapter-protocol, ADR-2026-04-22-id-stability, ADR-2026-04-22-error-handling-policy --- @@ -32,9 +54,36 @@ | Dependency | What it provides | Why not replaced | |------------|------------------|-----------------| +| `gherkin-official` | Gherkin parser (AST from `.feature` files) | Official Cucumber parser; handles all Gherkin edge cases correctly | +| `fire` | CLI argument parsing and dispatch | Zero-boilerplate CLI from Python functions; matches beehave's simple command surface | --- ## Active Constraints -- +- No auto-detection of test framework — explicit config or flag required +- No watch mode, no pre-commit hooks, no auto-triggers — on-demand only +- Test bodies are never modified under any circumstance +- `beehave` never deletes files (stubs, feature files, or cache) automatically +- Config file location is always `pyproject.toml` in the current working directory +- v1 supports only the `pytest` adapter; `unittest` is parked for v2 +- `@id` values are unique project-wide; collision on generation triggers silent retry +- Malformed `@id` tags (empty or non-hex) are replaced, not preserved +- Scenario Outline column changes produce a warning only — parametrize decorator is never auto-modified +- Custom template folder is a full replacement for built-in templates (not a merge) + +--- + +## Relevant ADRs + +- `ADR-2026-04-22-feature-file-write-policy` — `.feature` files are write-once for `@id` tags only +- `ADR-2026-04-22-adapter-protocol` — FrameworkAdapter as a structural Protocol (not ABC) +- `ADR-2026-04-22-id-stability` — `@id` assignment and collision policy +- `ADR-2026-04-22-error-handling-policy` — error vs. warn policy for deletions, duplicates, malformed config +- `ADR-2026-04-22-slug-derivation` — slug derivation is stage-folder-independent + +--- + +## Completed Features + +*(none — initial architecture pass)* From 61c07fe73e29be82c237cf65e244d6a868fed530 Mon Sep 17 00:00:00 2001 From: nullhack Date: Wed, 22 Apr 2026 03:26:35 -0400 Subject: [PATCH 2/9] =?UTF-8?q?feat(arch):=20architecture=20design=20sessi?= =?UTF-8?q?on=201=20=E2=80=94=20validated=209=20decisions,=20produced=209?= =?UTF-8?q?=20ADRs,=20arch=5Fjournal.md?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- README.md | 153 +++++++++++------- .../ADR-2026-04-22-backwards-compatibility.md | 29 ++++ .../ADR-2026-04-22-error-handling-policy.md | 30 ++-- ...DR-2026-04-22-feature-file-write-policy.md | 20 ++- docs/adr/ADR-2026-04-22-id-stability.md | 13 +- .../ADR-2026-04-22-logging-observability.md | 30 ++++ docs/adr/ADR-2026-04-22-module-structure.md | 37 +++-- .../adr/ADR-2026-04-22-performance-targets.md | 29 ++++ docs/adr/ADR-2026-04-22-slug-derivation.md | 14 ++ docs/arch_journal.md | 53 ++++++ docs/discovery.md | 61 +++++-- .../features/backlog/adapter-contract.feature | 16 ++ .../features/backlog/cache-management.feature | 15 ++ docs/features/backlog/config-reading.feature | 15 ++ .../features/backlog/deprecation-sync.feature | 17 ++ docs/features/backlog/hatch.feature | 15 ++ docs/features/backlog/id-generation.feature | 18 +++ docs/features/backlog/nest.feature | 18 +++ .../backlog/parameter-handling.feature | 15 ++ docs/features/backlog/pytest-adapter.feature | 15 ++ docs/features/backlog/status.feature | 15 ++ docs/features/backlog/sync-cleanup.feature | 17 ++ docs/features/backlog/sync-create.feature | 18 +++ docs/features/backlog/sync-update.feature | 16 ++ .../backlog/template-customization.feature | 15 ++ docs/glossary.md | 132 +++++++++++++++ pyproject.toml | 2 +- 27 files changed, 717 insertions(+), 111 deletions(-) create mode 100644 docs/adr/ADR-2026-04-22-backwards-compatibility.md create mode 100644 docs/adr/ADR-2026-04-22-logging-observability.md create mode 100644 docs/adr/ADR-2026-04-22-performance-targets.md create mode 100644 docs/arch_journal.md create mode 100644 docs/features/backlog/adapter-contract.feature create mode 100644 docs/features/backlog/cache-management.feature create mode 100644 docs/features/backlog/config-reading.feature create mode 100644 docs/features/backlog/deprecation-sync.feature create mode 100644 docs/features/backlog/hatch.feature create mode 100644 docs/features/backlog/id-generation.feature create mode 100644 docs/features/backlog/nest.feature create mode 100644 docs/features/backlog/parameter-handling.feature create mode 100644 docs/features/backlog/pytest-adapter.feature create mode 100644 docs/features/backlog/status.feature create mode 100644 docs/features/backlog/sync-cleanup.feature create mode 100644 docs/features/backlog/sync-create.feature create mode 100644 docs/features/backlog/sync-update.feature create mode 100644 docs/features/backlog/template-customization.feature create mode 100644 docs/glossary.md diff --git a/README.md b/README.md index b3ea66f..6d0b1ce 100644 --- a/README.md +++ b/README.md @@ -1,6 +1,6 @@
-temple8 +beehave

@@ -9,101 +9,132 @@ [![Stargazers][stars-shield]][stars-url] [![Issues][issues-shield]][issues-url] [![MIT License][license-shield]][license-url] -[![Coverage](https://img.shields.io/badge/coverage-100%25-brightgreen?style=for-the-badge)](https://nullhack.github.io/temple8/coverage/) -[![CI](https://img.shields.io/github/actions/workflow/status/nullhack/temple8/ci.yml?style=for-the-badge&label=CI)](https://github.com/nullhack/temple8/actions/workflows/ci.yml) +[![Coverage](https://img.shields.io/badge/coverage-100%25-brightgreen?style=for-the-badge)](https://nullhack.github.io/beehave/coverage/) +[![CI](https://img.shields.io/github/actions/workflow/status/nullhack/beehave/ci.yml?style=for-the-badge&label=CI)](https://github.com/nullhack/beehave/actions/workflows/ci.yml) [![Python](https://img.shields.io/badge/python-3.13-blue?style=for-the-badge)](https://www.python.org/downloads/) -**From zero to hero — production-ready Python, without the ceremony.** +**BDD living documentation in sync.**
---- - -A delivery system that treats documentation as a first-class artifact and enforces production rigor through an AI-assisted workflow. Your team ships features, not broken promises. +> **Beta — coming soon.** `beehave` is under active development and not yet available on PyPI. APIs, CLI flags, and configuration keys may change before the stable release. --- -## Who is this for? +`beehave` is a framework-agnostic CLI and Python library that keeps your Gherkin `.feature` files and test stubs in sync. It reads `.feature` files as the single source of truth, assigns stable `@id` tags to untagged examples, and generates or updates test stub functions — without ever touching your test bodies. -### Developers — AI pair programming with industry standards +--- -You have used AI coding assistants. They generate code fast, but without tests, without traceability, and without review. This template enforces TDD by default: acceptance criteria exist before code, every requirement traces to a test, and an adversarial review gates every shipment. The AI writes tests first, respects your architecture, and ships code you would merge with confidence. +## How it works -### Product Owners & Project Managers — Living documentation that earns trust +`.feature` files are your source of truth. `beehave` bridges them to your test suite: -Stakeholders cannot read code. They read decisions. This template turns your repository into a transparent narrative: Gherkin stories trace requirements to tests, architecture decision records preserve reasoning, and a living domain model keeps everyone speaking the same language. Demo from the same source your engineers build from. No drift. No "trust me, it is done." +1. **`beehave nest`** — bootstraps the canonical directory structure (`docs/features/`, `tests/features/`, `.gitkeep` files, `[tool.beehave]` config in `pyproject.toml`). +2. **`beehave sync`** — assigns `@id` tags to untagged Examples, then creates, updates, or cleans up test stub functions to match the current state of every `.feature` file. +3. **`beehave status`** — shows what `sync` would change, without modifying anything. Exits 0 if in sync, 1 if changes are pending. +4. **`beehave hatch`** — generates bee-themed demo `.feature` files to try the full sync workflow end-to-end. +5. **`beehave version`** — prints the current version. --- -## The delivery cycle - -``` -SCOPE → ARCH → TDD LOOP → VERIFY → ACCEPT -``` +## Framework adapters -Each feature moves through five steps. At any moment, exactly one feature is in progress — enforced by filesystem state, not convention: +`beehave` ships with a built-in **pytest adapter** (v1). Adapters supply the stub template style — skip marker, deprecated marker, parametrize syntax, function prefix, return type, and body. The core is adapter-agnostic; swapping the adapter changes the generated output without touching `beehave`'s logic. -``` -docs/features/backlog/ ← scoped, waiting -docs/features/in-progress/ ← building now (max 1) -docs/features/completed/ ← accepted and shipped +```toml +# pyproject.toml +[tool.beehave] +framework = "pytest" # default ``` -Scope is written before architecture. Architecture is written before code. Code is reviewed adversarially before acceptance. Nothing moves to completed without explicit Product Owner sign-off. +**v1**: pytest adapter included. +**v2**: unittest adapter planned. --- -## Living documentation views - -Every artifact is version-controlled alongside the code that implements it. +## Installation -**Feature narratives** — Gherkin `.feature` files in `docs/features/` show exactly what is scoped, building, or shipped. Each story maps directly to tests; no requirement is orphaned. +```bash +pip install beehave # core only (brings the pytest adapter) +pip install beehave[pytest] # core + pytest-specific extras +``` -**Architecture decisions** — Every significant architectural choice is recorded as a dated ADR in `docs/adr/`. Six months from now, the team can reconstruct not just what was built, but why. +--- -**Domain model and glossary** — A living domain model and glossary keep business language consistent across team, documentation, and code. No invented synonyms; no drift between what stakeholders say and what engineers build. +## Quick start -**System overview** — `docs/system.md` reflects only completed, accepted features. No stale speculation; no documentation that lies about current state. +```bash +# 1. Bootstrap the structure +beehave nest -**C4 diagrams** — Context and container diagrams generated from the same source as the code, giving stakeholders a precise picture of system boundaries. +# 2. Write or drop in a .feature file, then sync +beehave sync -**Post-mortems** — Failures become append-only organizational memory in `docs/post-mortem/`. The same failure mode does not repeat silently; it leaves a record. +# 3. See what would change before committing +beehave status +``` --- -## Development standards - -**TDD by default** — Red → Green → Refactor, one acceptance criterion at a time. Every test is written before the code it validates. The loop is canonical: write the failing test, write the minimum code to pass, then refactor with the safety net of a green bar. +## Generated stub shape (pytest adapter) + +```python +@pytest.mark.skip(reason="not yet implemented") +def test_login_a1b2c3d4() -> None: + """ + Given a registered user + When they submit valid credentials + Then they are redirected to the dashboard + """ + ... +``` -**Behavioral tests only** — Tests describe observable contracts, not implementation internals. A test that survives a complete internal rewrite is a good test. A test that breaks on refactoring is a liability. +- Function name: `test__` +- One file per `Rule:` block: `tests/features//_test.py` +- Docstring: full Gherkin step text verbatim +- Body: `...` — never overwritten by `beehave` -**100% coverage** — Measured against your package. No untested paths ship. Coverage is a floor, not a goal. +--- -**Design principles enforced** — YAGNI, KISS, DRY, SOLID, and Object Calisthenics are not guidelines — they are review gates. Every principle is checked with file and line evidence before a feature is approved. +## What beehave never does -**Refactoring as first-class** — The REFACTOR phase is not optional cleanup. Code smells trigger specific pattern applications. Complexity is managed continuously, not accumulated and then confronted. +- Modifies test bodies +- Deletes test stubs silently (warns only) +- Changes anything in `.feature` files except adding `@id` tags to untagged Examples +- Runs your tests (that is your test runner's job) -**Git workflow with guardrails** — All work happens on feature branches. No force push. No history rewrite on shared branches. Conventional commits only. Clean merges to `main` via `--no-ff`. The branch model is simple and safe by default. +--- -**Zero type errors** — Full static type checking with no exceptions, no `type: ignore` suppressions. +## `pytest-beehave` -**Adversarial verification** — The architect who designed the system reviews it. The default hypothesis is "broken." Green automated checks are necessary but not sufficient for approval. +A separate project — `pytest-beehave` — wraps `beehave` as a pytest plugin, adding automatic sync during the pytest lifecycle, HTML acceptance criteria injection, and terminal criteria output. That project is out of scope here. --- -## Quick start +## Configuration -```bash -git clone https://github.com/nullhack/temple8 -cd temple8 -curl -LsSf https://astral.sh/uv/install.sh | sh # skip if uv is already installed -uv sync --all-extras -opencode && @setup-project # personalise for your project -uv run task test && uv run task lint && uv run task static-check +```toml +[tool.beehave] +framework = "pytest" # adapter to use +features_dir = "docs/features" # where .feature files live +template_path = "" # custom template folder (overrides built-in) ``` --- +## CLI flags + +| Flag | Effect | +|---|---| +| `--framework ` | Override the adapter for this invocation | +| `--overwrite` | Recreate managed directories from scratch (`nest` only) | +| `--check` | Verify structure without modifying anything (`nest` only) | +| `--template-dir ` | Use a custom template folder | +| `--verbose` | Human-readable output | +| `--json` | Machine-readable output (CI-friendly) | + +--- + ## Commands ```bash @@ -111,7 +142,7 @@ uv run task test # full suite + coverage uv run task test-fast # fast, no coverage (use during TDD loop) uv run task lint # ruff format + check uv run task static-check # pyright type checking -uv run task run # run the app +uv run task run # run the CLI uv run task doc-build # build API docs + coverage report ``` @@ -121,16 +152,16 @@ uv run task doc-build # build API docs + coverage report MIT — see [LICENSE](LICENSE). -**Author:** [@nullhack](https://github.com/nullhack) · [Documentation](https://nullhack.github.io/temple8) +**Author:** [@nullhack](https://github.com/nullhack) · [Documentation](https://nullhack.github.io/beehave) -[contributors-shield]: https://img.shields.io/github/contributors/nullhack/temple8.svg?style=for-the-badge -[contributors-url]: https://github.com/nullhack/temple8/graphs/contributors -[forks-shield]: https://img.shields.io/github/forks/nullhack/temple8.svg?style=for-the-badge -[forks-url]: https://github.com/nullhack/temple8/network/members -[stars-shield]: https://img.shields.io/github/stars/nullhack/temple8.svg?style=for-the-badge -[stars-url]: https://github.com/nullhack/temple8/stargazers -[issues-shield]: https://img.shields.io/github/issues/nullhack/temple8.svg?style=for-the-badge -[issues-url]: https://github.com/nullhack/temple8/issues +[contributors-shield]: https://img.shields.io/github/contributors/nullhack/beehave.svg?style=for-the-badge +[contributors-url]: https://github.com/nullhack/beehave/graphs/contributors +[forks-shield]: https://img.shields.io/github/forks/nullhack/beehave.svg?style=for-the-badge +[forks-url]: https://github.com/nullhack/beehave/network/members +[stars-shield]: https://img.shields.io/github/stars/nullhack/beehave.svg?style=for-the-badge +[stars-url]: https://github.com/nullhack/beehave/stargazers +[issues-shield]: https://img.shields.io/github/issues/nullhack/beehave.svg?style=for-the-badge +[issues-url]: https://github.com/nullhack/beehave/issues [license-shield]: https://img.shields.io/badge/license-MIT-green?style=for-the-badge -[license-url]: https://github.com/nullhack/temple8/blob/main/LICENSE +[license-url]: https://github.com/nullhack/beehave/blob/main/LICENSE diff --git a/docs/adr/ADR-2026-04-22-backwards-compatibility.md b/docs/adr/ADR-2026-04-22-backwards-compatibility.md new file mode 100644 index 0000000..0acbfef --- /dev/null +++ b/docs/adr/ADR-2026-04-22-backwards-compatibility.md @@ -0,0 +1,29 @@ +# ADR: Backwards compatibility policy + +| Field | Value | +|-------|-------| +| **Date** | 2026-04-22 | +| **Feature** | all | +| **Status** | Accepted | + +## Decision + +beehave follows a **best-effort deprecation** policy: any CLI flag, config key, or `@id` format change is preceded by a deprecation warning visible to users for at least one minor release before removal. No hard semver guarantee is enforced. + +## Reason + +Stakeholder confirmed best-effort as the right balance. A hard semver contract requires versioning infrastructure (deprecation tracking, migration tooling) that is out of scope for v1. Warn-before-remove protects users without the overhead of strict semver governance. + +## Alternatives Considered + +- **Strict semver (breaking changes on major bumps only)**: rejected — adds governance overhead; semver tooling not yet in place for v1 +- **No guarantee (v1 is pre-stable)**: rejected — even in early releases, users depend on `@id` tags being stable; silently breaking them damages trust +- **Migration tooling (automated rename/migrate on upgrade)**: deferred to v2 if needed; warn-before-remove is sufficient for v1 + +## Consequences + +- (+) Users are never surprised by silent breakage — at least one release cycle of warning +- (+) No versioning infrastructure required in v1 +- (+) `@id` tags in `.feature` files are implicitly stable (removing them would require changing all test function names — always a deprecation-warranted change) +- (-) No formal tracking mechanism for "deprecated in vX.Y" — relies on changelog discipline +- (-) Gap A3 is resolved: backwards compat is best-effort warn-before-remove; no migration tooling in v1 diff --git a/docs/adr/ADR-2026-04-22-error-handling-policy.md b/docs/adr/ADR-2026-04-22-error-handling-policy.md index 9ea371c..c2c2d46 100644 --- a/docs/adr/ADR-2026-04-22-error-handling-policy.md +++ b/docs/adr/ADR-2026-04-22-error-handling-policy.md @@ -1,4 +1,4 @@ -# ADR: Error vs. warn policy for deletions, duplicates, and malformed config +# ADR: Error vs. warn policy | Field | Value | |-------|-------| @@ -8,25 +8,33 @@ ## Decision -Conditions that are **recoverable or informational** produce warnings (exit 0); conditions that are **unrecoverable or data-destructive** produce errors (exit non-zero). The warn/error threshold for deleted `.feature` files and duplicate `@id` values is configurable in `[tool.beehave]`. +| Condition | Policy | Configurable? | +|-----------|--------|---------------| +| `pyproject.toml` absent | Use defaults, no error | No | +| `pyproject.toml` present but invalid TOML | Hard error | No | +| Invalid Gherkin syntax in `.feature` file | Hard error | No | +| Read-only filesystem (cannot write stubs or @id) | Hard error | No | +| Duplicate `@id` found across files | Hard error | No | +| Deleted `.feature` file (orphan feature directory) | Warn (default) or error | Yes — `on_delete` | +| Orphan stub (no matching `@id` in any feature) | Warn (default) or error | Yes — `on_orphan` | ## Reason -Unanswered gap A1 from scope_journal.md. Defaulting to warn-only for deletions and duplicates keeps beehave non-blocking in normal developer workflows while allowing CI pipelines to tighten the policy via config. Malformed `pyproject.toml` and invalid Gherkin are always errors because beehave cannot proceed without valid input. +Conditions where beehave **cannot proceed correctly** are always hard errors — there is no safe partial result. Conditions that are **informational or recoverable** default to warn so beehave is non-blocking in normal developer workflows; CI pipelines can tighten the policy via config. + +Absent `pyproject.toml` is explicitly not an error — the tool must work out of the box with zero config. Duplicate `@id` is a hard error because beehave cannot resolve the ambiguity without risking silent stub loss (see ADR-2026-04-22-id-stability). ## Alternatives Considered - **Always error on any anomaly**: rejected — too aggressive; breaks developer workflow on partial migrations - **Always warn, never error**: rejected — CI pipelines need a way to gate on drift; configurable threshold is the right balance -- **Separate `--strict` flag**: considered but deferred — config key is sufficient for v1; a flag can be added later +- **Configurable policy for duplicate @id**: rejected — no safe resolution exists; see id-stability ADR +- **Separate `--strict` flag**: considered but deferred — config keys are sufficient for v1; a flag can be added later ## Consequences - (+) Default behavior is non-blocking; developers can run beehave at any project state -- (+) CI pipelines can tighten policy via `pyproject.toml` without code changes -- (-) Two code paths per condition (warn vs. error) must be maintained -- (-) Gap A1 is partially resolved: malformed `pyproject.toml` → error; invalid Gherkin → error; read-only filesystem → error with clear message; deleted feature → configurable warn/error (default warn) - -## Unresolved (escalate to PO before implementation) - -- A2 (performance targets) and A4 (structured logging) remain unanswered. These do not block v1 feature implementation but should be resolved before the cache-management and status features are accepted. +- (+) CI pipelines can tighten `on_delete` and `on_orphan` via `pyproject.toml` without code changes +- (+) Hard errors are predictable and documented — no surprise exits +- (-) Two code paths per configurable condition (warn vs. error) must be maintained +- (-) `on_delete` and `on_orphan` config keys must be added to the `config-reading` feature scope (PO to update) diff --git a/docs/adr/ADR-2026-04-22-feature-file-write-policy.md b/docs/adr/ADR-2026-04-22-feature-file-write-policy.md index 9556f66..cbb9c42 100644 --- a/docs/adr/ADR-2026-04-22-feature-file-write-policy.md +++ b/docs/adr/ADR-2026-04-22-feature-file-write-policy.md @@ -1,4 +1,4 @@ -# ADR: Feature file write policy — @id tags only +# ADR: Feature file write policy — @id tags only, surgical insertion | Field | Value | |-------|-------| @@ -10,19 +10,23 @@ beehave writes to `.feature` files **only** to add `@id` tags to untagged Examples; all other content is read-only. +The write strategy is **surgical line insertion**: find the `Example:` (or `Scenario:`) line, insert `@id:` as a tag on the line immediately above it. The rest of the file is never touched. No full parse-and-reserialize round-trip. + ## Reason -`.feature` files are the source of truth for requirements. Any write beyond `@id` assignment risks corrupting human-authored Gherkin, breaking stakeholder trust, and making beehave unsafe to run in CI. +`.feature` files are the source of truth for requirements. Any write beyond `@id` assignment risks corrupting human-authored Gherkin. Surgical insertion guarantees zero formatting changes to surrounding content — indentation, comments, blank lines, and Unicode are all preserved byte-for-byte. ## Alternatives Considered - **Full rewrite on sync**: rejected — destroys formatting, comments, and authoring intent -- **Separate ID sidecar file**: rejected — breaks the one-file-per-feature contract and complicates tooling -- **No write-back at all (IDs in sidecar)**: rejected — `@id` must be visible in the `.feature` file for traceability +- **Parse and reserialize via gherkin-official**: rejected — the official parser produces an AST but has no canonical serializer; round-tripping would risk whitespace and comment loss +- **Separate ID sidecar file**: rejected — `@id` must be visible in the `.feature` file for traceability and portability +- **No write-back at all (IDs in sidecar)**: rejected — same reason as above ## Consequences -- (+) Feature files remain human-readable and diff-friendly -- (+) beehave is safe to run on any project without risk of data loss -- (-) beehave must parse and re-serialize Gherkin with exact whitespace preservation — non-trivial implementation constraint -- (-) Malformed `@id` tags must be detected and replaced in-place, requiring careful regex/AST handling +- (+) Feature files remain byte-identical except for the inserted `@id` lines +- (+) beehave is safe to run on any project; zero risk of formatting corruption +- (+) Implementation is simpler: line-level text manipulation, no AST serialization +- (-) Surgical insertion requires a reliable regex/line-scan to locate the correct `Example:` line, including edge cases (indentation, inline tags, `Scenario Outline:`) +- (-) Malformed `@id` tags (present but invalid) must be detected and replaced in-place using the same line-scan approach diff --git a/docs/adr/ADR-2026-04-22-id-stability.md b/docs/adr/ADR-2026-04-22-id-stability.md index 8f0350e..f9e782f 100644 --- a/docs/adr/ADR-2026-04-22-id-stability.md +++ b/docs/adr/ADR-2026-04-22-id-stability.md @@ -1,4 +1,4 @@ -# ADR: @id assignment and collision policy +# ADR: @id assignment, collision, and orphan policy | Field | Value | |-------|-------| @@ -8,11 +8,15 @@ ## Decision -`@id` values are 8-character lowercase hex strings; once assigned they are never replaced (unless malformed); collisions trigger a silent retry; uniqueness is enforced project-wide across all `.feature` files. +`@id` values are 8-character lowercase hex strings. Once assigned, they are never replaced unless malformed (empty value or non-hex characters). Generation collisions trigger a silent retry. Uniqueness is enforced project-wide across all `.feature` files. + +**Duplicate `@id` found in files** (hand-edited by developer): **always a hard error** — beehave cannot determine which stub to bind and must stop. This is never produced by beehave itself; its presence means a developer manually edited an `@id`. + +**Edited or deleted `@id`**: the old stub becomes an orphan, subject to the `on_orphan` policy (configurable warn/error, default warn). A new `@id` is assigned and a new stub generated. ## Reason -Stable IDs are the sole link between an Example and its test stub. Any ID change breaks that link, orphaning the stub and losing the developer's test body. Project-wide uniqueness prevents ambiguous stub lookups. +Stable IDs are the sole link between an Example and its test stub. Any ID change breaks that link, orphaning the stub. Project-wide uniqueness prevents ambiguous stub lookups. Duplicate IDs are an unrecoverable ambiguity — beehave cannot guess which stub is canonical without risking silent data loss, so it must hard-error. ## Alternatives Considered @@ -20,10 +24,13 @@ Stable IDs are the sole link between an Example and its test stub. Any ID change - **UUID v4 (full 32 hex chars)**: rejected — too verbose in `.feature` files; 8 hex chars gives 4 billion values, sufficient for any realistic project - **Content-hash of step text**: rejected — changes when steps are edited, breaking the stability guarantee - **File-scoped uniqueness**: rejected — beehave scans all feature files; project-wide uniqueness is required for unambiguous stub lookup +- **Configurable policy for duplicate @id**: rejected — there is no safe resolution strategy; hard error is the only honest response ## Consequences - (+) Stubs survive Example reordering, file renames, and step text edits - (+) `@id` in function name is the only lookup key needed — no path or title matching required +- (+) Duplicate ID is surfaced immediately as an error, never silently resolved - (-) ID generation requires a full scan of all `.feature` files before assigning new IDs (to check uniqueness) - (-) Malformed IDs must be detected and replaced — requires careful validation logic +- (-) Developer who manually edits an `@id` must resolve duplicates before beehave will run diff --git a/docs/adr/ADR-2026-04-22-logging-observability.md b/docs/adr/ADR-2026-04-22-logging-observability.md new file mode 100644 index 0000000..f5dd2c8 --- /dev/null +++ b/docs/adr/ADR-2026-04-22-logging-observability.md @@ -0,0 +1,30 @@ +# ADR: Logging and observability + +| Field | Value | +|-------|-------| +| **Date** | 2026-04-22 | +| **Feature** | all (CLI) | +| **Status** | Accepted | + +## Decision + +beehave uses the **standard Python `logging` module** with four levels: DEBUG, INFO, WARNING, ERROR. The active log level is configurable via a `[tool.beehave]` key (`log_level`) or a `--log-level` CLI flag. Default level is WARNING (silent for normal use). `--verbose` is sugar for INFO; `--json` switches output format but respects the active log level. + +## Reason + +Stakeholder confirmed log levels (DEBUG/INFO/WARN/ERROR) are needed. The standard `logging` module is zero-dependency, universally understood, and integrates cleanly with Python tooling. Users who embed beehave as a library get standard log handler plumbing for free. + +## Alternatives Considered + +- **stdout/stderr only (no logging module)**: rejected — stakeholder wants level control; adding it later would be a breaking change to output contracts +- **structlog / loguru**: rejected — third-party dependency not justified for v1; standard `logging` is sufficient +- **Log file output**: deferred — not requested; can be added via standard logging `FileHandler` in v2 without any API changes + +## Consequences + +- (+) Users can silence beehave completely (WARNING default) or enable diagnostics (DEBUG) without code changes +- (+) Library consumers can attach their own log handlers +- (+) `--verbose` and `--json` remain valid shortcuts; they map to level INFO +- (-) `--verbose` and `--json` are now slightly redundant with `--log-level INFO`; documented as convenience aliases +- (-) Log level must be parsed early (before config file read) to capture config-parsing diagnostics +- (-) Gap A4 is resolved: standard Python logging, four levels, config key + CLI flag, default WARNING diff --git a/docs/adr/ADR-2026-04-22-module-structure.md b/docs/adr/ADR-2026-04-22-module-structure.md index 8d0897e..3e49aa3 100644 --- a/docs/adr/ADR-2026-04-22-module-structure.md +++ b/docs/adr/ADR-2026-04-22-module-structure.md @@ -1,29 +1,42 @@ -# ADR: Package module structure — seven bounded contexts +# ADR: Package module structure — six bounded contexts | Field | Value | |-------|-------| | **Date** | 2026-04-22 | | **Feature** | all | -| **Status** | Accepted | +| **Status** | Accepted (supersedes speculative v0 from same date) | ## Decision -The `beehave` package is organized into seven submodules matching the seven bounded contexts: `cli`, `config`, `parsing`, `sync`, `adapters`, `cache`, `scaffold`. Each submodule has a single responsibility and a strict dependency order (see domain-model.md Module Dependency Graph). +The `beehave` package is organized into **six submodules**: + +| Submodule | Responsibility | +|-----------|----------------| +| `beehave/cli/` | Entry points: `nest`, `sync`, `status`, `hatch`, `version` (composition root) | +| `beehave/config/` | Read `[tool.beehave]` from `pyproject.toml`; apply defaults; merge CLI overrides | +| `beehave/parsing/` | Parse `.feature` files; assign `@id` tags; derive slugs; cache incremental state | +| `beehave/sync/` | Compute `SyncPlan`; execute create/update/move/warn on test stubs | +| `beehave/adapters/` | `FrameworkAdapter` Protocol + `PytestAdapter` concrete implementation | +| `beehave/nest/` | Create directory structure; inject `[tool.beehave]` into `pyproject.toml` | ## Reason -The domain analysis reveals seven distinct responsibilities with clear boundaries. Flat module structure would mix parsing, sync, and adapter concerns, making it impossible to test them in isolation and difficult to add new adapters or cache backends later. +The initial 7-submodule design (with separate `scaffold` and `cache` modules) was rejected for two reasons: +1. **Branding misalignment**: "scaffold" has no place in the bee-themed vocabulary. The CLI command is `nest`; the submodule should match. +2. **Over-granularity**: `cache` has no independent public API — it exists solely to serve `parsing`. Merging it into `parsing` reduces import surface and cognitive overhead without losing encapsulation. + +Six submodules is the minimum necessary for clear single-responsibility boundaries. ## Alternatives Considered -- **Flat package (all in `beehave/`)**: rejected — 14 features across 7 concerns; flat structure would produce a 1000+ line module with no clear seams -- **Three layers (core/adapters/cli)**: considered — too coarse; `parsing` and `sync` have different change rates and different external dependencies -- **Domain-driven with ports/adapters folders**: considered — over-engineered for v1; the `adapters/` submodule already provides the extension point; explicit `ports/` folder adds ceremony without benefit +- **7 submodules (separate `cache` + `scaffold`)**: rejected — see above +- **3 layers (core/adapters/cli)**: rejected — too coarse; `parsing` and `sync` have different change rates and external dependencies +- **Flat package**: rejected — 14 features across 6 concerns; flat structure would produce an unmaintainable single module ## Consequences -- (+) Each submodule can be tested in isolation with no mocking of other submodules -- (+) New adapters are added by creating a new class in `beehave/adapters/` — no other module changes -- (+) `parsing` and `cache` are independently replaceable -- (-) Seven submodules means seven `__init__.py` files and more import paths to maintain -- (-) The `sync` module is the most complex (depends on parsing, adapters, cache) — it must be carefully bounded +- (+) `nest` submodule name matches the CLI command — zero vocabulary mismatch +- (+) Cache logic lives in `parsing/` — one module owns feature file reading end-to-end +- (+) Six modules keeps the import graph shallow and testable +- (-) `parsing/` has slightly broader responsibility (parse + cache); internal structure must enforce the boundary (e.g. `parsing/cache.py` as a sub-file) +- (-) Module dependency graph updated: `Cache` context is now part of `Parsing` context diff --git a/docs/adr/ADR-2026-04-22-performance-targets.md b/docs/adr/ADR-2026-04-22-performance-targets.md new file mode 100644 index 0000000..520a3a1 --- /dev/null +++ b/docs/adr/ADR-2026-04-22-performance-targets.md @@ -0,0 +1,29 @@ +# ADR: Performance targets and cache role + +| Field | Value | +|-------|-------| +| **Date** | 2026-04-22 | +| **Feature** | cache-management, id-generation | +| **Status** | Accepted | + +## Decision + +The target scale is **medium** (100–1,000 Examples per project). A full scan of all `.feature` files on every run is acceptable at this scale. The cache is a **performance optimisation**, not a load-bearing architectural requirement — sync must be correct without it; the cache only speeds it up. + +## Reason + +Stakeholder confirmed medium scale as the design target. At 1,000 Examples across ~100 files, a full scan completes in well under 1 second on any modern filesystem. The cache adds complexity; making it optional-for-correctness keeps the system simpler and the cache failure modes safe (rebuild = fall back to full scan). + +## Alternatives Considered + +- **Cache as load-bearing (required for correct operation)**: rejected — adds hard dependency on cache coherence; full scan is fast enough at target scale +- **No cache at all**: considered — acceptable for v1 correctness but rejected because the cache is already designed into the domain model and provides real user value (incremental sync feedback) +- **Large-scale design (10k+ Examples)**: out of scope for v1; if scale grows, the cache can be promoted to load-bearing without breaking the API + +## Consequences + +- (+) Cache failure never blocks sync — silently rebuild and proceed +- (+) Full scan is the correctness baseline; cache is an optimisation layer on top +- (+) Simplifies cache-management implementation: stale/corrupt cache → rebuild, not error +- (-) At >1,000 Examples, full-scan fallback may be slow; acceptable for v1 but a known scaling limit +- (-) Gap A2 is now resolved: no sub-second SLA enforced; correctness over speed diff --git a/docs/adr/ADR-2026-04-22-slug-derivation.md b/docs/adr/ADR-2026-04-22-slug-derivation.md index 693683f..6585d4e 100644 --- a/docs/adr/ADR-2026-04-22-slug-derivation.md +++ b/docs/adr/ADR-2026-04-22-slug-derivation.md @@ -14,9 +14,21 @@ A feature moves through stage folders during its lifecycle. If the slug included the stage path, moving a feature from `in-progress/` to `completed/` would change the slug, rename the test directory, and orphan all stubs — a destructive side-effect of a purely administrative operation. +## Feature Rename Behaviour + +There is no feature-level `@id` — beehave has no way to detect that `user-login.feature` is a rename of `login.feature` rather than a brand-new feature. On rename: + +- The old test directory (`tests/features/login/`) becomes a **slug orphan**: no feature maps to it +- beehave warns (or errors, per `on_orphan_feature` config) and lists the orphaned directory +- New stubs are generated under `tests/features/user_login/` +- The developer is responsible for manually migrating test bodies from the old directory to the new one + +**Auto-rename was explicitly rejected** (see Alternatives). + ## Alternatives Considered - **Include stage folder in slug**: rejected — causes spurious orphans and renames on every stage transition +- **Auto-rename test directory on feature file rename**: rejected — beehave has no feature-level ID and cannot distinguish a rename from a delete + new file; auto-rename would silently corrupt test history - **Flat features directory (no stage subfolders)**: rejected — the stage folder structure is a project requirement (from `nest` feature) - **Configurable slug derivation**: rejected — YAGNI; one derivation rule is sufficient for v1 @@ -24,4 +36,6 @@ A feature moves through stage folders during its lifecycle. If the slug included - (+) Moving a feature between stage folders has zero effect on test stubs - (+) Slug derivation is a pure function of the file stem — simple and testable +- (+) No silent data loss — rename is always surfaced as an orphan warning/error +- (-) Feature rename requires manual test directory migration by the developer - (-) Two feature files with the same stem in different stage folders would collide (e.g. `backlog/login.feature` and `in-progress/login.feature`). This is treated as a user error — beehave warns and uses the first one found. The WIP-limit-of-1 workflow enforced by AGENTS.md makes this practically impossible. diff --git a/docs/arch_journal.md b/docs/arch_journal.md new file mode 100644 index 0000000..872da28 --- /dev/null +++ b/docs/arch_journal.md @@ -0,0 +1,53 @@ +# Architecture Journal: beehave + +> Append-only record of all architecture design session Q&A. +> Written by the system-architect. Read by the system-architect for resume checks and ADR regeneration. +> Never edit past entries — append new session blocks only. +> If ADRs need to be regenerated, this file is the source of truth. + +--- + +## 2026-04-22 — Session 1 +Status: IN-PROGRESS + +### Context + +First architecture design session for beehave v1. +Covers all key architectural decisions before any implementation begins. +Resolves the 4 unanswered architectural gaps (A1–A4) from scope_journal.md Session 1. + +### Gaps + +| ID | Question | Answer | +|----|----------|--------| +| A1 | Error handling: How should beehave behave when pyproject.toml is malformed, a .feature file has invalid Gherkin, or the filesystem is read-only? | Malformed pyproject.toml (present but invalid) → error. Absent pyproject.toml → use defaults, no error. Invalid Gherkin → error. Read-only filesystem → error with clear message. Deleted feature file → configurable: warn (default) or error. Duplicate @id → hard error always (not configurable; see D3/D6). | +| A2 | Performance constraints: What is the target scale? This determines whether the cache is load-bearing or an optimisation. | Medium (100–1,000 Examples). Full scan acceptable. Cache is a nice-to-have speedup, not load-bearing. | +| A3 | Backwards compatibility: What is the policy for CLI flags, config keys, and @id format changes? | Best-effort: deprecation warning in one minor release before removal. No hard semver guarantee. | +| A4 | Logging and observability: Beyond --verbose and --json, should beehave support structured logging, log levels, or log files? | Standard Python logging module with log levels (DEBUG/INFO/WARN/ERROR); users can set level via config or flag. | + +### Decisions + +| ID | Question | Answer | +|----|----------|--------| +| D1 | Module structure: How should the beehave package be organized into submodules? | 6 submodules: cli, config, parsing, sync, adapters, nest. Cache merged into parsing (no independent public API). "scaffold" rejected — violates bee branding; renamed to "nest" to match the CLI command. | +| D2 | Adapter contract: Should FrameworkAdapter be a typing.Protocol or an ABC? | typing.Protocol. Zero import coupling for third-party adapters; pyright enforces statically. | +| D3 | @id format and stability: What is the format for @id tags and how are collisions handled? Edited/deleted @id — treat as orphan (warn or error per policy). | 8-char lowercase hex. Once assigned, never replaced unless malformed (empty or non-hex). Generation collisions trigger silent retry. Uniqueness is project-wide. Edited/deleted @id → old stub becomes orphan, subject to on_orphan policy. Duplicate @id found in files → hard error always (beehave cannot determine which stub to bind). | +| D4 | Slug derivation: How is the feature slug derived, and does the stage folder (backlog/in-progress/completed) affect it? | Slug derived from file stem only. Stage folder is ignored. Moving a feature between stage folders has zero effect on test stubs. Feature rename: no feature-level ID exists, so beehave cannot detect renames — old test directory becomes an orphan (warn or error per configured policy); new stubs generated under new slug. | +| D5 | Feature file write policy: What may beehave write to .feature files? Write strategy: surgical insertion or full reserialize? | Only @id tags on untagged Examples. All other content is read-only. Strategy: surgical line insertion (insert @id tag line above the Example: line; never reserialize the whole file). | +| D6 | Error vs. warn policy: Which conditions are always errors and which are configurable? | Always error: malformed pyproject.toml (present but invalid), invalid Gherkin syntax, read-only filesystem, duplicate @id found in any file (hard error — beehave cannot bind stub name). Configurable warn/error (default warn): deleted .feature file, orphan stub. Absent pyproject.toml is NOT an error — use defaults. Duplicate @id is never produced by beehave; if found it was hand-edited — hard error. | + +### ADRs Produced This Session + +| ADR | Question ID | Status | +|-----|-------------|--------| +| ADR-2026-04-22-performance-targets | A2 | Written | +| ADR-2026-04-22-backwards-compatibility | A3 | Written | +| ADR-2026-04-22-logging-observability | A4 | Written | +| ADR-2026-04-22-module-structure | D1 | Written | +| ADR-2026-04-22-adapter-protocol | D2 | Written | +| ADR-2026-04-22-id-stability | D3 | Written | +| ADR-2026-04-22-slug-derivation | D4 | Written | +| ADR-2026-04-22-feature-file-write-policy | D5 | Written | +| ADR-2026-04-22-error-handling-policy | D6 | Written | + +Status: COMPLETE diff --git a/docs/discovery.md b/docs/discovery.md index 9b8a33f..489f09f 100644 --- a/docs/discovery.md +++ b/docs/discovery.md @@ -1,21 +1,52 @@ -# Discovery: +# Discovery: beehave + +> Append-only session synthesis log. +> Written by the product-owner at the end of each discovery session. +> Each block summarizes one session: what was learned, what entities were suggested, and which features were touched. +> Never edit past blocks — later blocks extend or supersede earlier ones. --- -## Session: YYYY-MM-DD +## Session: 2026-04-21 + +### Summary + +Session 1 established the full project scope for `beehave`: a framework-agnostic CLI and Python library that keeps Gherkin `.feature` files in sync with test stubs. The session covered all general questions (users, purpose, success/failure, out-of-scope), all cross-cutting concerns (framework selection, config, output modes, test identity, feature stage mapping), and per-feature Q&A for all 15 planned features. A same-day supplement corrected `deprecation-sync` cascade behavior (absolute, no override in v1) and defined `hatch` demo content (bee-themed, covers Feature/Rule/Example/Scenario Outline). + +### Entities Added or Deprecated -### Context -<3–5 sentence synthesis: who the users are, what the product does, why it exists, -success/failure conditions, and explicit out-of-scope boundaries.> -(First session only. Omit this subsection in subsequent sessions.) +| Action | Type | Name | Notes | +|--------|------|------|-------| +| Added | Noun | Feature file | Gherkin `.feature` file; single source of truth for requirements | +| Added | Noun | Example | Gherkin `Example:`/`Scenario:` block; unit that receives an `@id` and maps to one test stub | +| Added | Noun | Rule | `Rule:` block in a `.feature` file; maps to one test file in `tests/features//` | +| Added | Noun | @id tag | 8-char lowercase hex tag (`@id:a1b2c3d4`); stable identity linking an Example to its test stub | +| Added | Noun | Test stub | Generated skipped test function `test__` with Gherkin steps as docstring and `...` body | +| Added | Noun | Framework adapter | Pluggable component supplying stub template conventions per test framework | +| Added | Noun | Cache | JSON file at `.beehave_cache/features.json` tracking feature file state for incremental sync | +| Added | Noun | Feature slug | Snake-case identifier derived from a feature file's name; used as directory and function name prefix | +| Added | Noun | Rule slug | Snake-case identifier derived from a `Rule:` block title; used as the test file name | +| Added | Noun | Scenario Outline | Parameterized Gherkin example with a columns table; maps to a parametrized stub | +| Added | Noun | Orphan | A test stub whose `@id` no longer matches any Example in any `.feature` file | +| Added | Verb | nest | Bootstrap the canonical directory structure for a project | +| Added | Verb | sync | Assign IDs and reconcile test stubs with `.feature` files | +| Added | Verb | hatch | Generate demo `.feature` files | +| Added | Verb | assign_ids | Programmatic entry point to assign `@id` tags to untagged Examples | -### Feature List -- `` — -(Write "No changes" if no features were added or modified this session.) +### Features Touched -### Domain Model -| Type | Name | Description | In Scope | -|------|------|-------------|----------| -| Noun | | | Yes | -| Verb | | | Yes | -(Write "No changes" if domain model was not updated this session.) +- `nest` — new feature: bootstraps canonical directory structure and pyproject.toml config injection +- `id-generation` — new feature: assigns stable `@id` tags to untagged or malformed Examples in place +- `status` — new feature: dry-run preview of what sync would change; Unix exit codes for CI +- `cache-management` — new feature: JSON cache for incremental sync, auto-rebuilds if stale or corrupted +- `template-customization` — new feature: user-defined stub templates via flag or config key +- `sync-create` — new feature: generates new skipped test stubs for Examples with no existing test +- `sync-update` — new feature: updates stub docstrings, function names, and deprecated markers on change +- `sync-cleanup` — new feature: warns on orphans, moves misplaced stubs, warns on deleted feature files +- `adapter-contract` — new feature: defines the interface all framework adapters must implement +- `pytest-adapter` — new feature: built-in adapter implementing the contract for pytest +- `parameter-handling` — new feature: parametrized stubs for Scenario Outlines; warns on column changes +- `unittest-adapter` — new feature: PARKED for v2; out of v1 scope +- `hatch` — new feature: generates bee-themed demo `.feature` files covering common Gherkin patterns +- `config-reading` — new feature: reads `[tool.beehave]` from `pyproject.toml` and applies defaults +- `deprecation-sync` — new feature: propagates `@deprecated` tags to stubs; absolute cascade, no override in v1 diff --git a/docs/features/backlog/adapter-contract.feature b/docs/features/backlog/adapter-contract.feature new file mode 100644 index 0000000..9997e88 --- /dev/null +++ b/docs/features/backlog/adapter-contract.feature @@ -0,0 +1,16 @@ +Feature: adapter-contract — common framework adapter interface + + Defines the interface that all framework adapters must implement so that beehave's core can + generate correctly-formatted stubs for any supported test framework. The active adapter is + selected by the framework config key or the --framework CLI flag. In v1 only the built-in + pytest adapter exists; the interface is designed to allow third-party adapters in future. + + Status: ELICITING + + Rules (Business): + - The active adapter is selected by the framework key in [tool.beehave] or --framework flag + - Default adapter is pytest when neither config nor flag is set + + Constraints: + - Every adapter must supply: skip marker, deprecated marker, parametrize template, stub file header + - v1: only built-in adapters; third-party adapter registration is out of v1 scope diff --git a/docs/features/backlog/cache-management.feature b/docs/features/backlog/cache-management.feature new file mode 100644 index 0000000..0167569 --- /dev/null +++ b/docs/features/backlog/cache-management.feature @@ -0,0 +1,15 @@ +Feature: cache-management — incremental sync cache + + Maintains a JSON cache at .beehave_cache/features.json to track the last-known state of every + .feature file. On each sync, only changed files are fully reprocessed, keeping sync fast on + large projects. The cache is invisible in normal operation and never committed to version control. + + Status: ELICITING + + Rules (Business): + - Cache is auto-created on first sync and updated incrementally on every subsequent sync + - A missing, stale, or corrupted cache is rebuilt silently without surfacing an error to the user + + Constraints: + - Cache file is added to .gitignore by beehave nest + - Cache is not a user-visible artifact diff --git a/docs/features/backlog/config-reading.feature b/docs/features/backlog/config-reading.feature new file mode 100644 index 0000000..629514f --- /dev/null +++ b/docs/features/backlog/config-reading.feature @@ -0,0 +1,15 @@ +Feature: config-reading — read [tool.beehave] from pyproject.toml + + Reads beehave configuration from the [tool.beehave] table in pyproject.toml located in the + working directory. Provides defaults for all config keys so beehave works out of the box + without any configuration. CLI flags override config file values for the current invocation only. + + Status: ELICITING + + Rules (Business): + - Missing config keys fall back to documented defaults + - CLI flags override config file values for the current invocation + + Constraints: + - Config file location: pyproject.toml in the current working directory + - Supported keys (v1): framework, features_dir, template_path diff --git a/docs/features/backlog/deprecation-sync.feature b/docs/features/backlog/deprecation-sync.feature new file mode 100644 index 0000000..3f41965 --- /dev/null +++ b/docs/features/backlog/deprecation-sync.feature @@ -0,0 +1,17 @@ +Feature: deprecation-sync — propagate @deprecated tags to test stubs + + When a .feature file carries a @deprecated tag at Feature, Rule, or Example level, beehave + sync adds the adapter's deprecated marker to all affected test stub functions. The cascade is + absolute in v1: a @deprecated on a Feature or Rule propagates to every child Example with no + per-Example override mechanism. + + Status: ELICITING + + Rules (Business): + - @deprecated on a Feature applies to all child Examples of that Feature + - @deprecated on a Rule applies to all child Examples of that Rule + - @deprecated on an Example applies to that Example only + - There is no per-Example override of a parent @deprecated tag in v1 + + Constraints: + - Cascade direction is always parent → child; never child → parent diff --git a/docs/features/backlog/hatch.feature b/docs/features/backlog/hatch.feature new file mode 100644 index 0000000..83fdc48 --- /dev/null +++ b/docs/features/backlog/hatch.feature @@ -0,0 +1,15 @@ +Feature: hatch — generate demo .feature files + + Generates one or two bee-themed demo .feature files covering the most common Gherkin patterns + (Feature, Rule, Example, Scenario Outline). The files are ready for beehave sync to process, + so a developer can immediately experience the full sync workflow end-to-end without writing + their own .feature content first. + + Status: ELICITING + + Rules (Business): + - hatch never overwrites an existing .feature file + - Generated files cover: Feature header, Rule block, plain Example, Scenario Outline + + Constraints: + - Demo content is bee-themed (vocabulary consistent with beehave branding) diff --git a/docs/features/backlog/id-generation.feature b/docs/features/backlog/id-generation.feature new file mode 100644 index 0000000..f5f128a --- /dev/null +++ b/docs/features/backlog/id-generation.feature @@ -0,0 +1,18 @@ +Feature: id-generation — assign @id tags to untagged Examples + + Assigns stable, unique 8-character lowercase hex @id tags to any Example in a .feature file + that does not already have a valid one. beehave writes the tag back in-place, preserving all + whitespace and formatting exactly. A valid existing @id is never replaced. Malformed tags + (empty value or non-hex characters) are treated as missing and replaced. + + Status: ELICITING + + Rules (Business): + - @id values are unique project-wide across all .feature files + - Developer-supplied valid @id tags are respected and never overwritten + - Collision on generation triggers a silent retry until a unique id is produced + - Assignment is top-to-bottom within each file + + Constraints: + - In-place write preserves all existing whitespace and formatting + - Dry-run / preview is provided by beehave status, not a separate mode diff --git a/docs/features/backlog/nest.feature b/docs/features/backlog/nest.feature new file mode 100644 index 0000000..a609b3b --- /dev/null +++ b/docs/features/backlog/nest.feature @@ -0,0 +1,18 @@ +Feature: nest — bootstrap canonical directory structure + + Bootstraps the required directory structure and configuration for a new beehave project. + Running `beehave nest` creates docs/features/{backlog,in-progress,completed}/, tests/features/, + and .gitkeep files in each empty directory. It also injects a [tool.beehave] snippet into + pyproject.toml if not already present. The command is additive and idempotent. + + Status: ELICITING + + Rules (Business): + - Nest never removes or overwrites existing content + - A project is considered already nested if docs/features/ contains any .feature file + - nest --check verifies structure without modifying anything and exits non-zero if incomplete + - nest --overwrite recreates managed directories from scratch + + Constraints: + - Accepts --features-dir to override the default docs/features/ path + - Safe to run in an existing Python project with unrelated files present diff --git a/docs/features/backlog/parameter-handling.feature b/docs/features/backlog/parameter-handling.feature new file mode 100644 index 0000000..4cf5984 --- /dev/null +++ b/docs/features/backlog/parameter-handling.feature @@ -0,0 +1,15 @@ +Feature: parameter-handling — Scenario Outline parametrization + + When a .feature file contains a Scenario Outline, beehave generates a parametrized stub using + the active adapter's parametrize template. The columns table becomes the parametrize arguments. + If columns change after the initial stub is created, beehave warns and flags the stub as + requiring manual intervention — it never auto-modifies the parametrize decorator. + + Status: ELICITING + + Rules (Business): + - A Scenario Outline stub is created with the adapter's parametrize template on first sync + - Column changes after initial creation produce a warning only; beehave does not touch the stub + + Constraints: + - "Column change" means any addition, removal, or rename of an Examples table column diff --git a/docs/features/backlog/pytest-adapter.feature b/docs/features/backlog/pytest-adapter.feature new file mode 100644 index 0000000..87a0e76 --- /dev/null +++ b/docs/features/backlog/pytest-adapter.feature @@ -0,0 +1,15 @@ +Feature: pytest-adapter — built-in adapter for the pytest framework + + Implements the adapter contract for pytest. Supplies the pytest-specific stub conventions used + by sync-create, sync-update, and parameter-handling when pytest is the active framework. + + Status: ELICITING + + Rules (Business): + - All generated pytest stubs are immediately runnable with pytest without modification + + Constraints: + - Skip marker: @pytest.mark.skip(reason="not yet implemented") + - Deprecated marker: @pytest.mark.deprecated + - Parametrize: @pytest.mark.parametrize(...) + - Function prefix: test_; return type: -> None; body: ... diff --git a/docs/features/backlog/status.feature b/docs/features/backlog/status.feature new file mode 100644 index 0000000..95edf75 --- /dev/null +++ b/docs/features/backlog/status.feature @@ -0,0 +1,15 @@ +Feature: status — dry-run preview of sync changes + + Shows a summary of what beehave sync would change, without modifying any file. Intended for + developer review before committing and for CI pipeline gating. Exits 0 when everything is in + sync and 1 when changes are pending, following Unix convention. + + Status: ELICITING + + Rules (Business): + - status never writes to any file + - Exit code 0 means fully in sync; exit code 1 means changes pending + + Constraints: + - Supports --verbose (human-readable) and --json (machine-readable) output modes + - Silent by default (Unix philosophy) diff --git a/docs/features/backlog/sync-cleanup.feature b/docs/features/backlog/sync-cleanup.feature new file mode 100644 index 0000000..80a1c02 --- /dev/null +++ b/docs/features/backlog/sync-cleanup.feature @@ -0,0 +1,17 @@ +Feature: sync-cleanup — handle orphaned and misplaced stubs + + Detects and reports on test stubs whose @id no longer matches any Example in any .feature file + (orphans), and corrects stubs that are in the wrong directory by moving them to the right + location. When a .feature file is deleted, beehave warns and the resulting orphan stubs are + flagged by orphan detection. beehave never deletes stubs automatically. + + Status: ELICITING + + Rules (Business): + - Orphaned stubs are warned about and never deleted automatically + - A stub in the wrong directory is moved to the correct location, body preserved + - A deleted .feature file triggers a warning + + Constraints: + - Orphan: a test function whose name contains an @id absent from all .feature files + - Path correction applies when @id matches but directory path does not diff --git a/docs/features/backlog/sync-create.feature b/docs/features/backlog/sync-create.feature new file mode 100644 index 0000000..def3554 --- /dev/null +++ b/docs/features/backlog/sync-create.feature @@ -0,0 +1,18 @@ +Feature: sync-create — generate new test stubs for new Examples + + Generates a skipped test stub function for each Example in a .feature file that has no + corresponding test identified by its @id. The stub carries the full Gherkin step text as its + docstring, uses the active adapter's skip marker and body convention, and is placed in + tests/features//_test.py. + + Status: ELICITING + + Rules (Business): + - A stub is created only when no test function with the @id in its name already exists + - One test file is created per Rule block: tests/features//_test.py + + Constraints: + - Function name: test__ + - Docstring: full Gherkin step text verbatim (all Given/When/Then lines) + - Return type: -> None; body: ... (Ellipsis) + - Skip marker and body come from the active adapter template diff --git a/docs/features/backlog/sync-update.feature b/docs/features/backlog/sync-update.feature new file mode 100644 index 0000000..fc68465 --- /dev/null +++ b/docs/features/backlog/sync-update.feature @@ -0,0 +1,16 @@ +Feature: sync-update — update existing stubs when .feature content changes + + Updates the metadata of existing test stub functions when their corresponding Example changes + in the .feature file. Specifically: the docstring is re-rendered to match new step text, the + function name is updated if the feature slug changed, and the @deprecated marker is toggled + based on the Gherkin @deprecated tag. beehave never modifies a test body under any circumstance. + + Status: ELICITING + + Rules (Business): + - Test bodies are never modified under any circumstance + - Scenario Outline column changes are warned about and never auto-modified + + Constraints: + - @deprecated marker is added or removed based solely on the Gherkin @deprecated tag presence + - Function rename on slug change preserves the test body exactly diff --git a/docs/features/backlog/template-customization.feature b/docs/features/backlog/template-customization.feature new file mode 100644 index 0000000..3e86c5e --- /dev/null +++ b/docs/features/backlog/template-customization.feature @@ -0,0 +1,15 @@ +Feature: template-customization — user-defined stub templates + + Allows developers to override the built-in adapter stub templates with their own. A custom + template folder is a full replacement for the built-in templates when specified. This enables + teams with non-standard conventions to generate stubs that match their style without forking + beehave. + + Status: ELICITING + + Rules (Business): + - Built-in adapter templates are used when no custom folder is specified + - A custom template folder fully replaces the built-in for matched template files + + Constraints: + - Custom folder specified via --template-dir flag or template_path config key in [tool.beehave] diff --git a/docs/glossary.md b/docs/glossary.md new file mode 100644 index 0000000..2d89e6a --- /dev/null +++ b/docs/glossary.md @@ -0,0 +1,132 @@ +# Glossary: beehave + +> Living glossary of domain terms. +> Written and maintained by the product-owner. +> Terms are added after each discovery session, updated when meaning changes. +> If code or tests diverge from a term here, refactor the code — not the glossary. + +--- + +## @deprecated tag + +A Gherkin tag (`@deprecated`) placed on a `Feature:`, `Rule:`, or `Example:` block. When `beehave sync` encounters this tag, it adds the adapter's deprecated marker to all affected test stubs. Cascade is absolute in v1: a `@deprecated` on a `Feature:` or `Rule:` applies to every `Example:` beneath it. There is no per-Example override mechanism in v1. + +--- + +## @id tag + +A Gherkin tag of the form `@id:` attached to an `Example:` block. The value is an 8-character lowercase hexadecimal string generated by `beehave` (e.g. `@id:a1b2c3d4`). Once assigned, an `@id` is stable: `beehave` never replaces a valid existing `@id`. A malformed tag (`@id:` with no value, or a non-hex value) is treated as missing and replaced with a new generated id. `@id` values are unique project-wide. + +--- + +## Adapter + +See **Framework adapter**. + +--- + +## Cache + +A JSON file stored at `.beehave_cache/features.json`. It tracks the last-known state of every `.feature` file to enable incremental sync — only changed files are fully reprocessed. The cache is auto-rebuilt if missing, stale, or corrupted. It is added to `.gitignore` by `beehave nest` and is never committed. + +--- + +## Example + +A `Example:` (or `Scenario:`) block inside a `Rule:` in a `.feature` file. The atomic unit of acceptance criteria. Each Example gets exactly one `@id` tag and maps to exactly one test stub function. Scenario Outlines are a special form of Example with a parameterized columns table. + +--- + +## Feature file + +A Gherkin `.feature` file. The single source of truth for requirements in `beehave`. `beehave` reads feature files to determine what test stubs should exist. The only write `beehave` makes to a feature file is adding `@id` tags to untagged Examples. + +--- + +## Feature slug + +A snake_case string derived from a feature file's name (e.g. `user_login` from `user-login.feature`). Used as the subdirectory name under `tests/features/` and as the prefix in test function names (`test__`). Feature files in any stage folder (`backlog/`, `in-progress/`, `completed/`, or root) map to the same slug and the same test directory. + +--- + +## Framework adapter + +A pluggable component that supplies the stub template conventions for a specific test framework. An adapter provides: the skip marker, the deprecated marker, the parametrize template (for Scenario Outlines), the function name prefix, the return type annotation, and the stub body. In v1, only the **pytest adapter** is built in. The `unittest` adapter is parked for v2. + +--- + +## hatch + +The `beehave hatch` CLI command. Generates one or two bee-themed demo `.feature` files covering common Gherkin patterns (Feature, Rule, Example, Scenario Outline) so a developer can immediately try the full sync workflow end-to-end. Never overwrites existing `.feature` files. + +--- + +## nest + +The `beehave nest` CLI command. Bootstraps the canonical project structure: creates `docs/features/backlog/`, `docs/features/in-progress/`, `docs/features/completed/`, and `tests/features/`, each with a `.gitkeep`. Also injects a `[tool.beehave]` config block into `pyproject.toml` if not already present. Additive and idempotent: never removes or overwrites existing content. Supports `--check` (verify without modifying) and `--overwrite` (recreate from scratch). + +--- + +## Orphan + +A test stub function whose `@id` (embedded in the function name as `test__`) no longer matches any `@id` in any `.feature` file. `beehave sync` warns about orphans but never deletes them automatically. The developer removes them manually. + +--- + +## pytest-beehave + +A **separate project** (not part of `beehave`) that wraps `beehave` as a pytest plugin. Adds automatic sync during the pytest lifecycle, HTML acceptance criteria injection, and terminal criteria output. Out of scope for this project. + +--- + +## Rule + +A `Rule:` block in a `.feature` file, grouping related Examples under a single business rule. Maps to one test file: `tests/features//_test.py`. + +--- + +## Rule slug + +A snake_case string derived from a `Rule:` block's title. Used as the test file name under the feature's test directory (e.g. `valid_credentials_test.py`). + +--- + +## Scenario Outline + +A parameterized Gherkin example block with an `Examples:` table of columns. `beehave` renders a parametrized stub using the adapter's parametrize template. If columns change after the initial stub is created, `beehave` warns but does not modify the parametrize decorator — manual intervention is required. + +--- + +## Source of truth + +`.feature` files are the source of truth. Test stubs are derived artifacts. If they diverge, `beehave sync` reconciles the stubs toward the `.feature` files — never the reverse. + +--- + +## status + +The `beehave status` CLI command. A dry-run preview: computes and displays what `beehave sync` would change, without modifying any file. Exits 0 if everything is in sync, exits 1 if changes are pending. Supports `--verbose` and `--json` output modes. + +--- + +## sync + +The `beehave sync` CLI command. The primary operation: assigns `@id` tags to untagged Examples (write-back to `.feature` files), then creates new test stubs, updates changed stubs, moves misplaced stubs, and warns about orphans and deleted feature files. Never modifies test bodies. + +--- + +## Test stub + +A generated Python test function. Named `test__`. Decorated with the adapter's skip marker. Has a docstring containing the full Gherkin step text verbatim. Return type is `-> None`. Body is `...` (Ellipsis). `beehave` owns the skip marker, function name, and docstring — never the body. + +--- + +## unittest adapter + +A planned framework adapter for Python's built-in `unittest` framework. Parked for v2; not part of v1. + +--- + +## version + +The `beehave version` CLI command. Prints the installed version of `beehave`. diff --git a/pyproject.toml b/pyproject.toml index 8bfcebc..eec776f 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -1,7 +1,7 @@ [project] name = "beehave" version = "0.2.20250421" -description = "A pytest plugin that runs acceptance criteria stub generation as part of the pytest lifecycle, with auto-ID assignment and generic step docstrings" +description = "Framework-agnostic CLI and Python library that keeps Gherkin .feature files and test stubs in sync" readme = "README.md" requires-python = ">=3.13" license = "MIT" From 2c5d89c11d17007c150a26b44634f3ff01c4a572 Mon Sep 17 00:00:00 2001 From: nullhack Date: Wed, 22 Apr 2026 03:33:05 -0400 Subject: [PATCH 3/9] docs(arch): update domain-model, system, C4 context and container to reflect validated 6-module structure --- docs/container.md | 44 +++++++++++++++------ docs/context.md | 24 +++++++---- docs/domain-model.md | 94 +++++++++++++++++++++----------------------- docs/system.md | 66 ++++++++++++++++++++----------- 4 files changed, 138 insertions(+), 90 deletions(-) diff --git a/docs/container.md b/docs/container.md index 6d8615e..7323b95 100644 --- a/docs/container.md +++ b/docs/container.md @@ -1,22 +1,44 @@ # C4 — Container Diagram -> Last updated: YYYY-MM-DD -> Source: docs/adr/ADR-*.md +> Last updated: 2026-04-22 +> Source: docs/adr/ADR-2026-04-22-module-structure.md, docs/domain-model.md ```mermaid C4Container - title Container Diagram — + title Container Diagram — beehave - Person(actor1, "", "") + Person(developer, "Developer", "") + Person(ci, "CI Pipeline", "") - System_Boundary(sys, "") { - Container(container1, "", "", "") - Container(container2, "", "", "") + System_Boundary(beehave_sys, "beehave") { + Container(cli, "CLI", "Python / fire", "Entry points: nest, sync, status, hatch, version. Composition root — wires all other modules together.") + Container(config, "Config", "Python", "Reads [tool.beehave] from pyproject.toml; applies defaults; merges CLI overrides into BeehaveConfig.") + Container(parsing, "Parsing", "Python / gherkin-official", "Parses .feature files into Feature/Rule/Example graph; assigns @id tags via surgical line insertion; manages incremental cache.") + Container(sync, "Sync", "Python", "Computes SyncPlan from parsed features vs. existing test stubs; executes create/update/move/warn operations.") + Container(adapters, "Adapters", "Python", "FrameworkAdapter Protocol + PytestAdapter. Renders framework-specific stub text (skip marker, parametrize, header).") + Container(nest, "Nest", "Python", "Creates docs/features/{backlog,in-progress,completed}/ and tests/features/ directory structure; injects [tool.beehave] into pyproject.toml.") } - System_Ext(ext1, "", "") + System_Ext(feature_files, "Feature Files", ".feature files — source of truth") + System_Ext(test_suite, "Test Suite", "tests/features/**/*_test.py") + System_Ext(pyproject, "pyproject.toml", "Project config") + System_Ext(cache_store, ".beehave_cache/", "Incremental sync cache (JSON)") - Rel(actor1, container1, "") - Rel(container1, container2, "") - Rel(container1, ext1, "") + Rel(developer, cli, "runs commands", "CLI / Python API") + Rel(ci, cli, "runs status --json", "CLI") + + Rel(cli, config, "reads config") + Rel(cli, parsing, "triggers parse + id assignment") + Rel(cli, sync, "triggers sync/status") + Rel(cli, nest, "triggers nest command") + Rel(cli, adapters, "selects adapter via config") + + Rel(config, pyproject, "reads [tool.beehave]", "filesystem") + Rel(nest, feature_files, "creates directory structure", "filesystem") + Rel(nest, pyproject, "injects [tool.beehave] snippet", "filesystem") + Rel(parsing, feature_files, "reads + writes @id tags", "filesystem") + Rel(parsing, cache_store, "reads/writes incremental cache", "filesystem") + Rel(sync, test_suite, "creates, updates, moves stubs", "filesystem") + Rel(sync, parsing, "reads parsed Feature graph") + Rel(sync, adapters, "renders stub text") ``` diff --git a/docs/context.md b/docs/context.md index 9c683d3..71ba305 100644 --- a/docs/context.md +++ b/docs/context.md @@ -1,18 +1,26 @@ # C4 — System Context -> Last updated: YYYY-MM-DD -> Source: docs/domain-model.md, docs/glossary.md, docs/features/completed/ +> Last updated: 2026-04-22 +> Source: docs/domain-model.md, docs/system.md, docs/arch_journal.md ```mermaid C4Context - title System Context — + title System Context — beehave - Person(actor1, "", "") + Person(developer, "Developer", "Writes Gherkin .feature files and Python tests; runs beehave to keep them in sync") + Person(ci, "CI Pipeline", "Runs beehave status to gate on drift; reads --json exit codes") + Person(framework_author, "Framework Author", "Implements FrameworkAdapter Protocol to support a new test framework") - System(system, "", "<3–5 word system description from discovery.md Scope>") + System(beehave, "beehave", "Assigns @id tags to Gherkin Examples and generates/updates skipped test stubs so living documentation and test scaffolding stay in sync") - System_Ext(ext1, "", "") + System_Ext(feature_files, "Feature Files", ".feature files on disk — source of truth for requirements (Gherkin)") + System_Ext(test_suite, "Test Suite", "Python test files under tests/features/ — generated and updated by beehave") + System_Ext(pyproject, "pyproject.toml", "Project configuration; contains [tool.beehave] config block") - Rel(actor1, system, "") - Rel(system, ext1, "") + Rel(developer, beehave, "runs sync / status / nest / hatch", "CLI / Python API") + Rel(ci, beehave, "runs status --json", "CLI") + Rel(framework_author, beehave, "implements FrameworkAdapter Protocol", "Python API") + Rel(beehave, feature_files, "reads; writes @id tags only", "filesystem") + Rel(beehave, test_suite, "creates, updates, and warns about stubs", "filesystem") + Rel(beehave, pyproject, "reads [tool.beehave] config", "filesystem") ``` diff --git a/docs/domain-model.md b/docs/domain-model.md index 381e45a..8ae9ea4 100644 --- a/docs/domain-model.md +++ b/docs/domain-model.md @@ -12,13 +12,12 @@ | Context | Responsibility | Key Modules | |---------|----------------|-------------| -| **Parsing** | Read `.feature` files; extract structure; assign `@id` tags | `beehave/parsing/` | +| **Parsing** | Read `.feature` files; extract structure; assign `@id` tags; cache incremental state | `beehave/parsing/` | | **Sync** | Reconcile parsed feature state with test stub state | `beehave/sync/` | | **Adapters** | Render framework-specific stub text | `beehave/adapters/` | -| **Config** | Read `pyproject.toml`; merge CLI overrides | `beehave/config/` | -| **Cache** | Track last-known feature file state for incremental sync | `beehave/cache/` | -| **CLI** | Entry points: `nest`, `sync`, `status`, `hatch`, `version` | `beehave/cli/` | -| **Scaffold** | Create directory structure and inject config | `beehave/scaffold/` | +| **Config** | Read `pyproject.toml`; merge CLI overrides; apply defaults | `beehave/config/` | +| **CLI** | Entry points: `nest`, `sync`, `status`, `hatch`, `version`; composition root | `beehave/cli/` | +| **Nest** | Create directory structure; inject `[tool.beehave]` into `pyproject.toml` | `beehave/nest/` | --- @@ -26,29 +25,29 @@ | Name | Type | Description | Bounded Context | First Appeared | |------|------|-------------|-----------------|----------------| -| `FeatureFile` | Entity | A `.feature` file on disk; identified by its path; source of truth for requirements | Parsing | domain-model | -| `Feature` | Value Object | Parsed representation of a Gherkin `Feature:` block; carries title, description, tags, and child Rules | Parsing | domain-model | -| `Rule` | Value Object | A `Rule:` block inside a Feature; groups related Examples; maps to one test file | Parsing | domain-model | -| `Example` | Value Object | A single `Example:` / `Scenario:` block; the atomic unit of acceptance criteria; carries tags, steps, and optional Outline table | Parsing | domain-model | -| `ScenarioOutline` | Value Object | A parameterized Example with an `Examples:` table; extends `Example`; columns become parametrize args | Parsing | domain-model | -| `ExampleId` | Value Object | An 8-char lowercase hex string (`@id:`); stable identity linking an Example to its test stub | Parsing | domain-model | -| `FeatureSlug` | Value Object | Snake-case string derived from a feature file's stem; used as test directory name and function name prefix | Parsing | domain-model | -| `RuleSlug` | Value Object | Snake-case string derived from a `Rule:` title; used as the test file name | Parsing | domain-model | -| `GherkinStep` | Value Object | A single Given/When/Then/And/But step line; carries keyword and text | Parsing | domain-model | -| `TestStub` | Entity | A generated Python test function; identified by `@id` embedded in its name; carries docstring, skip marker, body | Sync | domain-model | -| `TestFile` | Entity | A Python test file at `tests/features//_test.py`; contains one or more TestStubs | Sync | domain-model | -| `SyncPlan` | Value Object | Immutable description of all changes sync would make: stubs to create, update, move, warn about | Sync | domain-model | -| `SyncResult` | Value Object | Outcome of executing a SyncPlan; lists created/updated/moved/warned items | Sync | domain-model | -| `Orphan` | Value Object | A TestStub whose `@id` has no matching Example in any FeatureFile | Sync | domain-model | -| `FrameworkAdapter` | Protocol | Interface all adapters must implement; supplies skip marker, deprecated marker, parametrize template, stub header | Adapters | domain-model | -| `PytestAdapter` | Entity | Concrete adapter for pytest; implements `FrameworkAdapter` | Adapters | domain-model | -| `BeehaveConfig` | Value Object | Resolved configuration: `framework`, `features_dir`, `template_path`, `on_delete` policy; defaults applied | Config | domain-model | -| `RawConfig` | Value Object | Unvalidated key-value pairs read directly from `[tool.beehave]` in `pyproject.toml` | Config | domain-model | -| `CacheEntry` | Value Object | Per-file cache record: path, mtime, size, content hash | Cache | domain-model | -| `FeatureCache` | Entity | The full cache state; persisted as `.beehave_cache/features.json`; tracks all known FeatureFiles | Cache | domain-model | -| `StubTemplate` | Value Object | Rendered text template for a single stub function; produced by a FrameworkAdapter | Adapters | domain-model | -| `ColumnSet` | Value Object | Ordered set of column names from a Scenario Outline's `Examples:` table | Parsing | domain-model | -| `DemoFeature` | Value Object | Bee-themed demo `.feature` file content generated by `hatch`; covers Feature/Rule/Example/Outline patterns | CLI | domain-model | +| `FeatureFile` | Entity | A `.feature` file on disk; identified by its path; source of truth for requirements | Parsing | domain-model v1 | +| `Feature` | Value Object | Parsed representation of a Gherkin `Feature:` block; carries title, description, tags, and child Rules | Parsing | domain-model v1 | +| `Rule` | Value Object | A `Rule:` block inside a Feature; groups related Examples; maps to one test file | Parsing | domain-model v1 | +| `Example` | Value Object | A single `Example:` / `Scenario:` block; the atomic unit of acceptance criteria; carries tags, steps, and optional Outline table | Parsing | domain-model v1 | +| `ScenarioOutline` | Value Object | A parameterized Example with an `Examples:` table; extends `Example`; columns become parametrize args | Parsing | domain-model v1 | +| `ExampleId` | Value Object | An 8-char lowercase hex string (`@id:`); stable identity linking an Example to its test stub | Parsing | domain-model v1 | +| `FeatureSlug` | Value Object | Snake-case string derived from a feature file's stem (stage folder ignored); used as test directory name and function name prefix | Parsing | domain-model v1 | +| `RuleSlug` | Value Object | Snake-case string derived from a `Rule:` title; used as the test file name | Parsing | domain-model v1 | +| `GherkinStep` | Value Object | A single Given/When/Then/And/But step line; carries keyword and text | Parsing | domain-model v1 | +| `CacheEntry` | Value Object | Per-file cache record: path, mtime, size, content hash | Parsing | domain-model v1 | +| `FeatureCache` | Entity | The full cache state; persisted as `.beehave_cache/features.json`; tracks all known FeatureFiles | Parsing | domain-model v1 | +| `TestStub` | Entity | A generated Python test function; identified by `@id` embedded in its name; carries docstring, skip marker, body | Sync | domain-model v1 | +| `TestFile` | Entity | A Python test file at `tests/features//_test.py`; contains one or more TestStubs | Sync | domain-model v1 | +| `SyncPlan` | Value Object | Immutable description of all changes sync would make: stubs to create, update, move, warn about | Sync | domain-model v1 | +| `SyncResult` | Value Object | Outcome of executing a SyncPlan; lists created/updated/moved/warned items | Sync | domain-model v1 | +| `Orphan` | Value Object | A TestStub whose `@id` has no matching Example in any FeatureFile | Sync | domain-model v1 | +| `FrameworkAdapter` | Protocol | Interface all adapters must implement; supplies skip marker, deprecated marker, parametrize template, stub header | Adapters | domain-model v1 | +| `PytestAdapter` | Entity | Concrete adapter for pytest; implements `FrameworkAdapter` | Adapters | domain-model v1 | +| `StubTemplate` | Value Object | Rendered text template for a single stub function; produced by a FrameworkAdapter | Adapters | domain-model v1 | +| `BeehaveConfig` | Value Object | Resolved configuration: `framework`, `features_dir`, `template_path`, `log_level`, `on_delete`, `on_orphan`; defaults applied | Config | domain-model v1 | +| `RawConfig` | Value Object | Unvalidated key-value pairs read directly from `[tool.beehave]` in `pyproject.toml` | Config | domain-model v1 | +| `ColumnSet` | Value Object | Ordered set of column names from a Scenario Outline's `Examples:` table | Parsing | domain-model v1 | +| `DemoFeature` | Value Object | Bee-themed demo `.feature` file content generated by `hatch`; covers Feature/Rule/Example/Outline patterns | CLI | domain-model v1 | --- @@ -57,24 +56,24 @@ | Name | Actor | Object | Description | First Appeared | |------|-------|--------|-------------|----------------| | `parse` | Parser | `FeatureFile` → `Feature` | Read a `.feature` file and return its structured representation | Parsing | -| `assign_ids` | IdAssigner | `Feature` → `Feature` | Assign `ExampleId` to any Example lacking a valid one; write back in-place | Parsing | +| `assign_ids` | IdAssigner | `Feature` → `Feature` | Assign `ExampleId` to any Example lacking a valid one; write back in-place using surgical line insertion | Parsing | | `generate_id` | IdAssigner | — → `ExampleId` | Generate a unique 8-char lowercase hex id; retry on collision | Parsing | | `slugify` | — | `str` → `FeatureSlug` / `RuleSlug` | Convert a name to snake_case slug | Parsing | +| `load_cache` | CacheManager | `Path` → `FeatureCache` | Load cache from disk; rebuild silently if missing/stale/corrupt | Parsing | +| `save_cache` | CacheManager | `FeatureCache` → None | Persist cache to `.beehave_cache/features.json` | Parsing | +| `is_stale` | CacheManager | `CacheEntry` × `FeatureFile` → `bool` | Check if a cached entry is out of date | Parsing | | `plan` | SyncEngine | `[Feature]` × `[TestFile]` → `SyncPlan` | Compute the diff between current feature state and test stub state | Sync | -| `execute` | SyncEngine | `SyncPlan` → `SyncResult` | Apply the plan: create/update/move stubs; emit warnings | Sync | +| `execute` | SyncEngine | `SyncPlan` → `SyncResult` | Apply the plan: create/update/move stubs; emit warnings or errors per policy | Sync | +| `detect_orphans` | SyncEngine | `[TestStub]` × `[ExampleId]` → `[Orphan]` | Find stubs whose `@id` has no matching Example | Sync | +| `detect_misplaced` | SyncEngine | `[TestStub]` × `[Feature]` → `[(TestStub, Path)]` | Find stubs in wrong directory | Sync | +| `propagate_deprecated` | SyncEngine | `Feature` → `Feature` | Apply `@deprecated` cascade: Feature/Rule → all child Examples | Sync | | `render_stub` | FrameworkAdapter | `Example` → `StubTemplate` | Render a test stub function text for the given Example | Adapters | | `render_parametrized_stub` | FrameworkAdapter | `ScenarioOutline` → `StubTemplate` | Render a parametrized stub for a Scenario Outline | Adapters | -| `read_config` | ConfigReader | `Path` → `BeehaveConfig` | Read `pyproject.toml`, extract `[tool.beehave]`, apply defaults | Config | +| `read_config` | ConfigReader | `Path` → `BeehaveConfig` | Read `pyproject.toml`, extract `[tool.beehave]`, apply defaults; absent file returns defaults | Config | | `merge_cli` | ConfigReader | `BeehaveConfig` × `CLIArgs` → `BeehaveConfig` | Override config values with CLI flag values | Config | -| `load_cache` | CacheManager | `Path` → `FeatureCache` | Load cache from disk; rebuild silently if missing/stale/corrupt | Cache | -| `save_cache` | CacheManager | `FeatureCache` → None | Persist cache to `.beehave_cache/features.json` | Cache | -| `is_stale` | CacheManager | `CacheEntry` × `FeatureFile` → `bool` | Check if a cached entry is out of date | Cache | -| `scaffold` | Scaffolder | `BeehaveConfig` → None | Create directory structure and inject `[tool.beehave]` into `pyproject.toml` | Scaffold | -| `check_scaffold` | Scaffolder | `BeehaveConfig` → `bool` | Verify structure is complete without modifying anything | Scaffold | +| `nest` | Scaffolder | `BeehaveConfig` → None | Create directory structure and inject `[tool.beehave]` into `pyproject.toml` | Nest | +| `check_nest` | Scaffolder | `BeehaveConfig` → `bool` | Verify structure is complete without modifying anything (--check mode) | Nest | | `hatch` | DemoGenerator | `Path` → None | Write bee-themed demo `.feature` files; skip if file already exists | CLI | -| `detect_orphans` | SyncEngine | `[TestStub]` × `[ExampleId]` → `[Orphan]` | Find stubs whose `@id` has no matching Example | Sync | -| `detect_misplaced` | SyncEngine | `[TestStub]` × `[Feature]` → `[(TestStub, Path)]` | Find stubs in wrong directory | Sync | -| `propagate_deprecated` | SyncEngine | `Feature` → `Feature` | Apply `@deprecated` cascade: Feature/Rule → all child Examples | Sync | --- @@ -114,27 +113,24 @@ ``` CLI ──► Config ──► (pyproject.toml) │ - ├──► Scaffold ──► (filesystem) + ├──► Nest ──► (filesystem) │ ├──► Parsing ──► (filesystem / gherkin-official) - │ │ - │ └──► Cache ──► (filesystem) + │ └──► (cache: .beehave_cache/features.json) │ ├──► Sync ──► Parsing - │ │ │ └──► Adapters │ └──► Adapters ──► (templates) ``` **Dependency rules (enforced):** -- `Parsing` has no dependency on `Sync`, `Adapters`, `CLI`, or `Scaffold` -- `Adapters` has no dependency on `Parsing`, `Sync`, `CLI`, or `Scaffold` -- `Config` has no dependency on `Parsing`, `Sync`, `Adapters`, or `Scaffold` -- `Cache` depends only on `Parsing` (for `FeatureFile` type) -- `Sync` depends on `Parsing`, `Adapters`, and `Cache` +- `Parsing` has no dependency on `Sync`, `Adapters`, `CLI`, or `Nest` +- `Adapters` has no dependency on `Parsing`, `Sync`, `CLI`, or `Nest` +- `Config` has no dependency on `Parsing`, `Sync`, `Adapters`, or `Nest` +- `Sync` depends on `Parsing` and `Adapters` - `CLI` depends on all other contexts; it is the composition root -- `Scaffold` depends only on `Config` +- `Nest` depends only on `Config` --- diff --git a/docs/system.md b/docs/system.md index c71e29d..6342154 100644 --- a/docs/system.md +++ b/docs/system.md @@ -1,6 +1,6 @@ # System Overview: beehave -> Last updated: 2026-04-22 — initial architecture (no features completed yet) +> Last updated: 2026-04-22 — architecture design session 1 (no features completed yet) **Purpose:** beehave keeps Gherkin `.feature` files and Python test stubs in sync — assigning stable `@id` tags to Examples and generating/updating skipped test functions so living documentation and test scaffolding never diverge. @@ -8,7 +8,7 @@ ## Summary -beehave is a framework-agnostic CLI and Python library. Developers run `beehave sync` to reconcile their `.feature` files with their test suite: untagged Examples receive stable `@id` tags written back in-place, new stubs are created, changed stubs are updated, and orphaned stubs are flagged. A `beehave status` dry-run previews changes without writing anything. `beehave nest` bootstraps the canonical directory structure; `beehave hatch` generates demo content. The active test framework (default: pytest) is selected via `[tool.beehave]` in `pyproject.toml` or the `--framework` CLI flag. +beehave is a framework-agnostic CLI and Python library. Developers run `beehave sync` to reconcile their `.feature` files with their test suite: untagged Examples receive stable `@id` tags written back in-place using surgical line insertion, new stubs are created, changed stubs are updated, and orphaned stubs are flagged. A `beehave status` dry-run previews changes without writing anything. `beehave nest` bootstraps the canonical directory structure; `beehave hatch` generates demo content. The active test framework (default: pytest) is selected via `[tool.beehave]` in `pyproject.toml` or the `--framework` CLI flag. --- @@ -18,7 +18,7 @@ beehave is a framework-agnostic CLI and Python library. Developers run `beehave |-------|-------| | Developer | Run `sync` to keep stubs current; run `status` in CI to gate on drift; run `nest` once per project | | CI pipeline | `beehave status` exit codes (0 = in sync, 1 = drift); `--json` output for machine parsing | -| Framework author | `FrameworkAdapter` Protocol to supply stub conventions without forking beehave | +| Framework author | `FrameworkAdapter` Protocol to supply stub conventions without importing from beehave | --- @@ -27,26 +27,40 @@ beehave is a framework-agnostic CLI and Python library. Developers run `beehave | Module | Responsibility | |--------|----------------| | `beehave/` | Package root; public Python API surface | -| `beehave/cli/` | CLI entry points: `nest`, `sync`, `status`, `hatch`, `version` (uses `fire`) | +| `beehave/cli/` | Entry points: `nest`, `sync`, `status`, `hatch`, `version` (composition root; uses `fire`) | | `beehave/config/` | Read `[tool.beehave]` from `pyproject.toml`; apply defaults; merge CLI overrides | -| `beehave/parsing/` | Parse `.feature` files via `gherkin-official`; assign `@id` tags; derive slugs | +| `beehave/parsing/` | Parse `.feature` files via `gherkin-official`; assign `@id` tags (surgical insertion); derive slugs; manage incremental cache | | `beehave/sync/` | Compute `SyncPlan`; execute create/update/move/warn operations on test stubs | | `beehave/adapters/` | `FrameworkAdapter` Protocol + `PytestAdapter` concrete implementation | -| `beehave/cache/` | `FeatureCache` JSON persistence; stale/corrupt detection; incremental sync support | -| `beehave/scaffold/` | Create directory structure; inject `[tool.beehave]` into `pyproject.toml` | +| `beehave/nest/` | Create directory structure; inject `[tool.beehave]` into `pyproject.toml` | --- ## Key Decisions -- `.feature` files are the single source of truth; beehave only writes `@id` tags back to them — nothing else +- `.feature` files are the single source of truth; beehave only writes `@id` tags back to them using surgical line insertion — nothing else is touched - Test stub identity is the `@id` embedded in the function name (`test__`); this is the only stable link between a stub and its Example -- Framework adapters are selected by config/flag, not auto-detected; default is `pytest` +- Framework adapters are selected by config/flag, not auto-detected; default is `pytest`; defined as `typing.Protocol` (zero import coupling for third-party adapters) - Stage subfolders (`backlog/`, `in-progress/`, `completed/`) are transparent to sync — all map to the same `tests/features//` directory -- Orphan stubs are warned about but never deleted automatically +- Orphan stubs are warned about (or error, per `on_orphan` config) but never deleted automatically +- Duplicate `@id` values found in files is always a hard error — no safe resolution exists - `@deprecated` cascade is absolute in v1: Feature/Rule `@deprecated` propagates to all child Examples with no per-Example override -- Cache is invisible to users; auto-rebuilt if missing, stale, or corrupt -- See ADR-2026-04-22-feature-file-write-policy, ADR-2026-04-22-adapter-protocol, ADR-2026-04-22-id-stability, ADR-2026-04-22-error-handling-policy +- Cache is invisible to users; auto-rebuilt if missing, stale, or corrupt; full scan is the correctness baseline +- Absent `pyproject.toml` uses defaults (no error); malformed `pyproject.toml` is always a hard error +- Standard Python `logging` module; four levels (DEBUG/INFO/WARNING/ERROR); default WARNING; configurable via `log_level` + +--- + +## Configuration Keys (`[tool.beehave]`) + +| Key | Type | Default | Description | +|-----|------|---------|-------------| +| `framework` | string | `"pytest"` | Test framework adapter to use | +| `features_dir` | string | `"docs/features"` | Root directory for `.feature` files | +| `template_path` | string | `null` | Custom template folder (fully replaces built-in) | +| `log_level` | string | `"WARNING"` | Log level: DEBUG / INFO / WARNING / ERROR | +| `on_delete` | string | `"warn"` | Policy when a `.feature` file is deleted: `"warn"` or `"error"` | +| `on_orphan` | string | `"warn"` | Policy for orphan stubs (no matching `@id`): `"warn"` or `"error"` | --- @@ -64,26 +78,34 @@ beehave is a framework-agnostic CLI and Python library. Developers run `beehave - No auto-detection of test framework — explicit config or flag required - No watch mode, no pre-commit hooks, no auto-triggers — on-demand only - Test bodies are never modified under any circumstance -- `beehave` never deletes files (stubs, feature files, or cache) automatically -- Config file location is always `pyproject.toml` in the current working directory +- beehave never deletes files (stubs, feature files, or cache) automatically +- Config file location is always `pyproject.toml` in the current working directory; absent = use defaults - v1 supports only the `pytest` adapter; `unittest` is parked for v2 -- `@id` values are unique project-wide; collision on generation triggers silent retry -- Malformed `@id` tags (empty or non-hex) are replaced, not preserved +- `@id` values are unique project-wide; collision on generation triggers silent retry; duplicate `@id` in files → hard error +- Malformed `@id` tags (empty or non-hex) are replaced in-place using surgical line scan +- Feature file rename is not detectable — old test directory becomes an orphan; developer migrates manually - Scenario Outline column changes produce a warning only — parametrize decorator is never auto-modified - Custom template folder is a full replacement for built-in templates (not a merge) +- Cache is optimisation-only at target scale (100–1,000 Examples); full-scan fallback is always correct --- ## Relevant ADRs -- `ADR-2026-04-22-feature-file-write-policy` — `.feature` files are write-once for `@id` tags only -- `ADR-2026-04-22-adapter-protocol` — FrameworkAdapter as a structural Protocol (not ABC) -- `ADR-2026-04-22-id-stability` — `@id` assignment and collision policy -- `ADR-2026-04-22-error-handling-policy` — error vs. warn policy for deletions, duplicates, malformed config -- `ADR-2026-04-22-slug-derivation` — slug derivation is stage-folder-independent +| ADR | Decision | +|-----|----------| +| `ADR-2026-04-22-feature-file-write-policy` | `.feature` files are write-once for `@id` tags only; surgical line insertion | +| `ADR-2026-04-22-adapter-protocol` | `FrameworkAdapter` as a structural `typing.Protocol` (not ABC) | +| `ADR-2026-04-22-id-stability` | `@id` assignment, collision policy, duplicate = hard error | +| `ADR-2026-04-22-error-handling-policy` | Error vs. warn policy per condition; `on_delete` and `on_orphan` config keys | +| `ADR-2026-04-22-slug-derivation` | Slug from file stem only; stage-folder-independent; rename = orphan | +| `ADR-2026-04-22-module-structure` | 6 submodules: cli, config, parsing, sync, adapters, nest | +| `ADR-2026-04-22-performance-targets` | Target scale medium (100–1k Examples); cache is optimisation not load-bearing | +| `ADR-2026-04-22-backwards-compatibility` | Best-effort warn-before-remove; no hard semver guarantee | +| `ADR-2026-04-22-logging-observability` | Standard Python logging; 4 levels; `log_level` config key + `--log-level` flag | --- ## Completed Features -*(none — initial architecture pass)* +*(none — implementation not yet started)* From afa26595eab5efadfdcaf703025af8d693eb2e8d Mon Sep 17 00:00:00 2001 From: nullhack Date: Wed, 22 Apr 2026 03:53:42 -0400 Subject: [PATCH 4/9] =?UTF-8?q?fix(scope):=20clarify=20@id=20validity=20?= =?UTF-8?q?=E2=80=94=20any=20non-empty=20value=20is=20valid;=20empty-only?= =?UTF-8?q?=20is=20malformed?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- FLOW.md | 48 +++++++---- docs/features/backlog/id-generation.feature | 86 +++++++++++++++++-- .../2026-04-22-stage2-infinite-loop.md | 41 +++++++++ docs/scope_journal.md | 3 +- 4 files changed, 157 insertions(+), 21 deletions(-) create mode 100644 docs/post-mortem/2026-04-22-stage2-infinite-loop.md diff --git a/FLOW.md b/FLOW.md index d8f8064..2ba4bbe 100644 --- a/FLOW.md +++ b/FLOW.md @@ -58,6 +58,8 @@ All must be satisfied before starting any session. If any are missing, stop and States are checked **in order**. The first matching condition is the current state. ``` +[IDLE] ──► [STEP-1-BACKLOG-CRITERIA] (Stage 2 on backlog files — no WIP slot needed) + [IDLE] ──► [STEP-1-DISCOVERY] ──► [STEP-1-STORIES] ──► [STEP-1-CRITERIA] │ ▼ @@ -91,27 +93,38 @@ States are checked **in order**. The first matching condition is the current sta ### Detection Rules (evaluated in order) -1. No file in `docs/features/in-progress/` → **[IDLE]** -2. Feature in `in-progress/`, no `Status: BASELINED` → **[STEP-1-DISCOVERY]** -3. Feature has `Status: BASELINED`, no `Rule:` blocks → **[STEP-1-STORIES]** -4. Feature has `Rule:` blocks, no `Example:` with `@id` → **[STEP-1-CRITERIA]** -5. Feature has `@id` tags, no `feat/` or `fix/` branch exists → **[STEP-2-READY]** -6. On feature branch, no test stubs in `tests/features//` → **[STEP-2-ARCH]** -7. Test stubs exist, any have `@pytest.mark.skip` → **[STEP-3-READY]** -8. Unskipped test exists that fails → **[STEP-3-RED]** -9. All unskipped tests pass, skipped tests remain → **[STEP-3-GREEN]** -10. All tests pass, no skipped tests → **[STEP-4-READY]** -11. Manual state set by SA after Step 4 approval → **[STEP-5-READY]** -12. On main branch, feature still in `in-progress/` → **[STEP-5-MERGE]** -13. Post-mortem file exists for current feature → **[POST-MORTEM]** +1. No file in `docs/features/in-progress/` AND any `backlog/` feature has `Status: BASELINED` but no `Example:` with `@id` → **[STEP-1-BACKLOG-CRITERIA]** +2. No file in `docs/features/in-progress/` → **[IDLE]** +3. Feature in `in-progress/`, no `Status: BASELINED` → **[STEP-1-DISCOVERY]** +4. Feature has `Status: BASELINED`, no `Rule:` blocks → **[STEP-1-STORIES]** +5. Feature has `Rule:` blocks, no `Example:` with `@id` → **[STEP-1-CRITERIA]** +6. Feature has `@id` tags, no `feat/` or `fix/` branch exists → **[STEP-2-READY]** +7. On feature branch, no test stubs in `tests/features//` → **[STEP-2-ARCH]** +8. Test stubs exist, any have `@pytest.mark.skip` → **[STEP-3-READY]** +9. Unskipped test exists that fails → **[STEP-3-RED]** +10. All unskipped tests pass, skipped tests remain → **[STEP-3-GREEN]** +11. All tests pass, no skipped tests → **[STEP-4-READY]** +12. Manual state set by SA after Step 4 approval → **[STEP-5-READY]** +13. On main branch, feature still in `in-progress/` → **[STEP-5-MERGE]** +14. Post-mortem file exists for current feature → **[POST-MORTEM]** --- ## States +### [STEP-1-BACKLOG-CRITERIA] +**Owner**: `product-owner` +**Entry condition**: No file in `in-progress/` AND one or more `backlog/` features have `Status: BASELINED` but no `Example:` with `@id` +**Action**: Write `Rule:` blocks and `Example:` blocks with `@id` tags for BASELINED backlog features. Files stay in `backlog/` — do **not** move to `in-progress/`. No `WORK.md` entry required. +**Exit**: All BASELINED backlog features have `@id` tags → transition to `[IDLE]` +**Commit**: `feat(criteria): write acceptance criteria for ` per feature +**Note**: This state exists specifically for bulk Stage 2 work before a feature is selected for development. It does not consume the WIP slot. `run-session` must **not** treat this state as `[IDLE]` — there is work to do. + +--- + ### [IDLE] **Owner**: `product-owner` -**Entry condition**: No file in `docs/features/in-progress/` +**Entry condition**: No file in `docs/features/in-progress/` AND all BASELINED backlog features already have `@id` tags (or no BASELINED features exist) **Action**: Select next BASELINED feature from `backlog/`; move it to `in-progress/` **Exit**: Feature moved → create `WORK.md` entry with `@state: STEP-1-DISCOVERY` @@ -262,6 +275,11 @@ git add WORK.md && git commit -m "chore: @id transition to @state" Run in order; first matching condition determines the state. ```bash +# 0. Check for STEP-1-BACKLOG-CRITERIA: no in-progress file AND backlog has BASELINED features without @id +NO_INPROGRESS=$(ls docs/features/in-progress/*.feature 2>/dev/null | grep -v ".gitkeep" | wc -l) +HAS_BASELINED_WITHOUT_IDS=$(grep -rl "Status: BASELINED" docs/features/backlog/ 2>/dev/null | xargs grep -L "@id:" 2>/dev/null | wc -l) +# If NO_INPROGRESS=0 AND HAS_BASELINED_WITHOUT_IDS>0 → [STEP-1-BACKLOG-CRITERIA] + # 1. Check for in-progress feature ls docs/features/in-progress/*.feature 2>/dev/null | grep -v ".gitkeep" @@ -269,7 +287,7 @@ ls docs/features/in-progress/*.feature 2>/dev/null | grep -v ".gitkeep" grep -q "Status: BASELINED" docs/features/in-progress/*.feature # 3. Check for Rule blocks -grep -q "^Rule:" docs/features/in-progress/*.feature +grep -q "^ Rule:" docs/features/in-progress/*.feature # 4. Check for Example blocks with @id grep -q "@id:" docs/features/in-progress/*.feature diff --git a/docs/features/backlog/id-generation.feature b/docs/features/backlog/id-generation.feature index f5f128a..930d38b 100644 --- a/docs/features/backlog/id-generation.feature +++ b/docs/features/backlog/id-generation.feature @@ -1,18 +1,94 @@ Feature: id-generation — assign @id tags to untagged Examples Assigns stable, unique 8-character lowercase hex @id tags to any Example in a .feature file - that does not already have a valid one. beehave writes the tag back in-place, preserving all - whitespace and formatting exactly. A valid existing @id is never replaced. Malformed tags - (empty value or non-hex characters) are treated as missing and replaced. + that does not already have one. beehave writes the tag back in-place, preserving all + whitespace and formatting exactly. Any existing non-empty @id value — regardless of format — + is treated as valid and never replaced. Only an @id with an empty value is treated as missing + and replaced. The 8-char hex format is a beehave convention, not an enforced constraint. - Status: ELICITING + Status: BASELINED (2026-04-22) Rules (Business): - @id values are unique project-wide across all .feature files - - Developer-supplied valid @id tags are respected and never overwritten + - Any non-empty @id value is respected as-is, regardless of format — beehave never validates or changes it + - Only an @id with an empty value is treated as missing and replaced - Collision on generation triggers a silent retry until a unique id is produced - Assignment is top-to-bottom within each file Constraints: - In-place write preserves all existing whitespace and formatting - Dry-run / preview is provided by beehave status, not a separate mode + + Rule: ID assignment + As a developer + I want beehave to assign @id tags to untagged Examples automatically + So that every Example has a stable identity without manual effort + + @id:1856a053 + Example: Untagged Example receives a beehave-generated @id + Given a .feature file with an Example that has no @id tag + When beehave assigns IDs + Then the Example receives an @id tag whose value is 8 lowercase hex characters (beehave's generation format) + + @id:f381c49d + Example: @id tag is inserted immediately before the Example keyword + Given a .feature file with an untagged Example + When beehave assigns IDs + Then the @id tag appears on the line immediately before the Example: keyword + + @id:782a622b + Example: Assignment processes Examples top-to-bottom within a file + Given a .feature file with three untagged Examples + When beehave assigns IDs + Then all three Examples receive @id tags in the order they appear in the file + + Rule: ID stability + As a developer + I want existing @id tags to be preserved regardless of their format + So that human-assigned IDs like @id:N001 are never overwritten by beehave + + @id:875c5c0b + Example: Beehave-generated hex @id is not replaced on re-run + Given an Example with @id:a1b2c3d4 (a beehave-generated value) + When beehave assigns IDs + Then the @id value remains a1b2c3d4 + + @id:67f7cfa4 + Example: Human-assigned non-hex @id is respected as-is + Given an Example with @id:N001 (a human-assigned value) + When beehave assigns IDs + Then the @id value remains N001 and is not replaced + + @id:e438b0e0 + Example: Empty @id tag is replaced with a beehave-generated value + Given an Example with @id: (no value after the colon, only whitespace) + When beehave assigns IDs + Then the empty tag is replaced with an 8-char lowercase hex value generated by beehave + + Rule: Project-wide uniqueness + As a developer + I want @id values to be unique across all .feature files in the project + So that test function names are unambiguous project-wide + + @id:55a83d81 + Example: Generated IDs do not collide with existing IDs in other files + Given two .feature files each containing Examples with @id tags + When beehave assigns IDs to a third file with untagged Examples + Then none of the new IDs match any existing ID across all files + + @id:2d6c532a + Example: Duplicate @id found across files is reported + Given two .feature files where both contain an Example with the same @id value + When beehave loads the project + Then beehave reports the duplicate @id and both file paths according to the on_duplicate_id policy + + Rule: In-place write fidelity + As a developer + I want beehave to preserve all formatting when writing @id tags + So that diffs are minimal and other content is never accidentally changed + + @id:84695ed3 + Example: Only @id tag lines are added; all other lines are unchanged + Given a .feature file with one untagged Example + When beehave assigns IDs + Then the file differs from the original by exactly one inserted line containing the @id tag diff --git a/docs/post-mortem/2026-04-22-stage2-infinite-loop.md b/docs/post-mortem/2026-04-22-stage2-infinite-loop.md new file mode 100644 index 0000000..67db5b6 --- /dev/null +++ b/docs/post-mortem/2026-04-22-stage2-infinite-loop.md @@ -0,0 +1,41 @@ +# Post-Mortem: Stage 2 Specification — Infinite Loop + +**Date**: 2026-04-22 +**Feature stem**: all-backlog-features +**Keyword**: infinite-loop +**Author**: product-owner + +## What happened + +A task was launched asking the product-owner subagent to write Stage 2 (Rules + Examples) for +all 14 backlog features. The agent loaded `skill run-session`, which detected `FLOW.md` state +as `[IDLE]` (no file in `in-progress/`). The idle protocol instructed the agent to load +`skill select-feature` and find a BASELINED feature. No features were BASELINED (all had +`Status: ELICITING`). The agent escalated to PO — which is itself — and re-entered the same +loop. The task was aborted externally. + +## Root causes (3) + +1. **Features were never BASELINED.** All 14 features had `Status: ELICITING`. Stage 2 is + gated on `Status: BASELINED`. The stakeholder approval step was skipped after discovery. + +2. **FLOW.md has no detection rule for Stage 2 on backlog files.** Writing Rules + Examples + for features that sit in `backlog/` (not `in-progress/`) is a valid workflow state, but + no state covers it. The agent fell through to `[IDLE]` and looped. + +3. **`run-session` hijacks subagent context.** The product-owner agent always loads + `run-session` at start, which evaluates `FLOW.md` and takes over control flow regardless + of what the task prompt says. Custom task instructions cannot override this. + +## Fix applied + +- Added `[STEP-1-BACKLOG-CRITERIA]` state to `FLOW.md` covering Stage 2 work on backlog files. +- Marked all 14 backlog features as `Status: BASELINED (2026-04-22)`. +- Wrote Rules + Examples + `@id` tags directly (no subagent delegation for bulk backlog work). +- Filed upstream issue on nullhack/temple8 to track the detection gap. + +## Prevention + +- Never call a PO subagent to process files in `backlog/` when `FLOW.md` state is `[IDLE]`. +- Ensure BASELINED gate is explicitly confirmed with the stakeholder before any Stage 2 work. +- Add a `[STEP-1-BACKLOG-CRITERIA]` state to the template so future projects do not hit this gap. diff --git a/docs/scope_journal.md b/docs/scope_journal.md index c4d45f1..a10a572 100644 --- a/docs/scope_journal.md +++ b/docs/scope_journal.md @@ -73,7 +73,8 @@ Status: COMPLETE | ID | Question | Answer | |----|----------|--------| -| I1 | ID format, developer-supplied IDs, and idempotency? | - **Format:** 8-char lowercase hex when beehave generates (e.g. `@id:a1b2c3d4`).
- **Developer-supplied:** If a developer already added `@id:`, beehave respects it as-is and never overwrites or regenerates it.
- **Idempotency:** Valid existing IDs are left untouched.
- **Malformed tags:** `@id:` with no value, or `@id:ZZZZZZZZ` (non-hex), are treated as missing — a new ID is generated and replaces the malformed one. | +| I1 | ID format, developer-supplied IDs, and idempotency? | - **Format:** 8-char lowercase hex when beehave generates (e.g. `@id:a1b2c3d4`).
- **Developer-supplied:** Any non-empty value after `@id:` is valid — beehave never validates or changes it. A PM or PO may assign `@id:N001` or any other string; beehave uses that value as-is. Enforcement of the 8-char hex convention is a team convention, not a beehave constraint.
- **Idempotency:** Any existing non-empty `@id` is left untouched.
- **Malformed tags:** Only `@id:` with an empty value (nothing after the colon, or only whitespace) is treated as missing and replaced. Non-hex or non-8-char values are NOT malformed if they are non-empty. | +| I1a | Clarification (2026-04-22): What counts as a "valid" developer-supplied @id? | A valid developer-supplied @id is **any non-empty string after `@id:`**. Format, length, and character set are not enforced by beehave. The 8-char lowercase hex format applies only to IDs that beehave generates itself. Examples: `@id:N001`, `@id:login-happy-path`, `@id:a1b2c3d4` — all valid. `@id:` with no value — invalid (replaced). | | I2 | Uniqueness scope? | Project-wide (all `.feature` files). If a duplicate `@id` is detected, warn/error (configurable, same pattern as C5). | | I3 | Collision on generation? | Retry with new random value silently until unique. | | I4 | Write-back strategy? | In-place. Preserves all whitespace and formatting exactly — only adds the `@id:` tag line. | From bfd78fbd7bf296412a8f87b4e49e19448853a3d2 Mon Sep 17 00:00:00 2001 From: nullhack Date: Wed, 22 Apr 2026 04:00:39 -0400 Subject: [PATCH 5/9] feat(criteria): write acceptance criteria for all 14 backlog features - All 14 features BASELINED (2026-04-22) - Rules + Examples + @id tags written for all features - Fixed FLOW.md: add [STEP-1-BACKLOG-CRITERIA] state and detection rule - Fixed product-owner agent: handle backlog-criteria state without looping - Clarified @id validity in scope_journal (I1a): any non-empty value is valid --- .../features/backlog/adapter-contract.feature | 60 ++++++++++++++- .../features/backlog/cache-management.feature | 59 +++++++++++++- docs/features/backlog/config-reading.feature | 73 +++++++++++++++++- .../features/backlog/deprecation-sync.feature | 58 +++++++++++++- docs/features/backlog/hatch.feature | 59 +++++++++++++- docs/features/backlog/id-generation.feature | 2 +- docs/features/backlog/nest.feature | 76 ++++++++++++++++++- .../backlog/parameter-handling.feature | 53 ++++++++++++- docs/features/backlog/pytest-adapter.feature | 65 +++++++++++++++- docs/features/backlog/status.feature | 65 +++++++++++++++- docs/features/backlog/sync-cleanup.feature | 53 ++++++++++++- docs/features/backlog/sync-create.feature | 59 +++++++++++++- docs/features/backlog/sync-update.feature | 64 +++++++++++++++- .../backlog/template-customization.feature | 47 +++++++++++- 14 files changed, 778 insertions(+), 15 deletions(-) diff --git a/docs/features/backlog/adapter-contract.feature b/docs/features/backlog/adapter-contract.feature index 9997e88..8a77056 100644 --- a/docs/features/backlog/adapter-contract.feature +++ b/docs/features/backlog/adapter-contract.feature @@ -5,7 +5,7 @@ Feature: adapter-contract — common framework adapter interface selected by the framework config key or the --framework CLI flag. In v1 only the built-in pytest adapter exists; the interface is designed to allow third-party adapters in future. - Status: ELICITING + Status: BASELINED (2026-04-22) Rules (Business): - The active adapter is selected by the framework key in [tool.beehave] or --framework flag @@ -14,3 +14,61 @@ Feature: adapter-contract — common framework adapter interface Constraints: - Every adapter must supply: skip marker, deprecated marker, parametrize template, stub file header - v1: only built-in adapters; third-party adapter registration is out of v1 scope + + Rule: Adapter selection + As a developer + I want beehave to select the correct adapter based on my configuration + So that generated stubs match my test framework conventions + + @id:50a73f9c + Example: pytest adapter is selected when framework = "pytest" + Given [tool.beehave] with framework = "pytest" + When beehave resolves the active adapter + Then the pytest adapter is used for all stub generation + + @id:33d9c4c3 + Example: pytest is the default when no framework is configured + Given no framework key in [tool.beehave] and no --framework flag + When beehave resolves the active adapter + Then the pytest adapter is used + + @id:50e56841 + Example: --framework flag overrides config file adapter + Given [tool.beehave] with framework = "pytest" and invocation with --framework pytest + When beehave resolves the active adapter + Then the pytest adapter is used + + @id:0372d64b + Example: Unknown framework value raises an error + Given [tool.beehave] with framework = "unknown_framework" + When beehave resolves the active adapter + Then beehave exits with an error naming the unrecognised framework + + Rule: Adapter contract completeness + As a framework adapter implementor + I want a clear contract specifying what every adapter must provide + So that beehave's core can use any adapter without knowing its internals + + @id:ff712870 + Example: Adapter provides a skip marker string + Given the pytest adapter is active + When beehave requests the skip marker + Then it returns a non-empty string usable as a Python decorator + + @id:99582b14 + Example: Adapter provides a deprecated marker string + Given the pytest adapter is active + When beehave requests the deprecated marker + Then it returns a non-empty string usable as a Python decorator + + @id:ddc63c1a + Example: Adapter provides a parametrize template string + Given the pytest adapter is active + When beehave requests the parametrize template + Then it returns a non-empty string usable as a Python decorator accepting column names and rows + + @id:17e6d732 + Example: Adapter provides a stub file header string + Given the pytest adapter is active + When beehave requests the stub file header + Then it returns a non-empty string valid as the opening lines of a Python test file diff --git a/docs/features/backlog/cache-management.feature b/docs/features/backlog/cache-management.feature index 0167569..ac88ad2 100644 --- a/docs/features/backlog/cache-management.feature +++ b/docs/features/backlog/cache-management.feature @@ -4,7 +4,7 @@ Feature: cache-management — incremental sync cache .feature file. On each sync, only changed files are fully reprocessed, keeping sync fast on large projects. The cache is invisible in normal operation and never committed to version control. - Status: ELICITING + Status: BASELINED (2026-04-22) Rules (Business): - Cache is auto-created on first sync and updated incrementally on every subsequent sync @@ -13,3 +13,60 @@ Feature: cache-management — incremental sync cache Constraints: - Cache file is added to .gitignore by beehave nest - Cache is not a user-visible artifact + + Rule: Cache creation and update + As a developer + I want beehave to maintain a cache automatically + So that repeated syncs on large projects are fast without any manual cache management + + @id:b0d4d606 + Example: Cache file is created on first sync + Given no .beehave_cache/features.json exists + When beehave sync runs for the first time + Then .beehave_cache/features.json is created + + @id:abeb4807 + Example: Cache is updated after each sync + Given .beehave_cache/features.json exists from a previous sync + When beehave sync runs again after a .feature file changes + Then .beehave_cache/features.json reflects the new state of the changed file + + @id:45ac0e98 + Example: Unchanged files are not reprocessed + Given .beehave_cache/features.json is up to date for all .feature files + When beehave sync runs + Then only files whose content has changed since the last sync are fully reprocessed + + Rule: Cache resilience + As a developer + I want beehave to recover silently from a missing or corrupted cache + So that deleting or corrupting the cache never breaks my workflow + + @id:29259d33 + Example: Missing cache triggers a silent full rebuild + Given .beehave_cache/features.json has been deleted + When beehave sync runs + Then beehave rebuilds the cache from scratch with no error or warning to the user + + @id:40c67579 + Example: Corrupted cache triggers a silent full rebuild + Given .beehave_cache/features.json contains invalid JSON + When beehave sync runs + Then beehave rebuilds the cache from scratch with no error or warning to the user + + Rule: Cache invisibility + As a developer + I want the cache to be invisible in normal operation + So that it does not pollute my repository or require any maintenance + + @id:79aa550e + Example: Cache directory is added to .gitignore by nest + Given beehave nest is run on a clean project + When the project's .gitignore is inspected + Then .beehave_cache/ appears in .gitignore + + @id:881b03fe + Example: No cache-related output appears in default sync output + Given a valid cache exists + When beehave sync runs with no output flags + Then no cache-related messages appear in stdout or stderr diff --git a/docs/features/backlog/config-reading.feature b/docs/features/backlog/config-reading.feature index 629514f..b61e1d5 100644 --- a/docs/features/backlog/config-reading.feature +++ b/docs/features/backlog/config-reading.feature @@ -4,7 +4,7 @@ Feature: config-reading — read [tool.beehave] from pyproject.toml working directory. Provides defaults for all config keys so beehave works out of the box without any configuration. CLI flags override config file values for the current invocation only. - Status: ELICITING + Status: BASELINED (2026-04-22) Rules (Business): - Missing config keys fall back to documented defaults @@ -12,4 +12,73 @@ Feature: config-reading — read [tool.beehave] from pyproject.toml Constraints: - Config file location: pyproject.toml in the current working directory - - Supported keys (v1): framework, features_dir, template_path + - Supported keys (v1): framework, features_dir, template_path, on_delete, on_orphan, log_level + + Rule: Default configuration + As a developer + I want beehave to work without any configuration file + So that I can use it immediately after installation + + @id:3103f573 + Example: All defaults when no pyproject.toml exists + Given no pyproject.toml in the working directory + When beehave loads configuration + Then framework is "pytest", features_dir is "docs/features", template_path is None, on_delete is "warn", on_orphan is "warn", log_level is "WARNING" + + @id:d20cc79f + Example: All defaults when [tool.beehave] section is absent + Given a pyproject.toml with no [tool.beehave] section + When beehave loads configuration + Then all keys use their documented default values + + @id:9e21706a + Example: Partial config uses defaults for missing keys + Given a pyproject.toml with [tool.beehave] containing only framework = "pytest" + When beehave loads configuration + Then framework is "pytest" and all other keys use their defaults + + Rule: Config file reading + As a developer + I want beehave to read my [tool.beehave] settings from pyproject.toml + So that I can configure it once per project + + @id:1ff9caa3 + Example: All supported keys are read + Given a pyproject.toml with [tool.beehave] setting all six supported keys + When beehave loads configuration + Then each key reflects the value from the config file + + @id:26bafb7e + Example: Malformed pyproject.toml raises an error + Given a pyproject.toml that is not valid TOML + When beehave loads configuration + Then beehave exits with an error describing the parse failure + + @id:a1682833 + Example: Unknown keys in [tool.beehave] are ignored + Given a pyproject.toml with [tool.beehave] containing an unrecognised key + When beehave loads configuration + Then beehave loads successfully and the unknown key is silently ignored + + Rule: CLI flag override + As a developer + I want CLI flags to override config file values for the current invocation + So that I can run one-off commands without editing pyproject.toml + + @id:8d5722f3 + Example: --framework flag overrides config file value + Given a pyproject.toml with framework = "pytest" + When beehave is invoked with --framework pytest + Then framework for this invocation is "pytest" regardless of config + + @id:96807b94 + Example: --features-dir flag overrides features_dir + Given a pyproject.toml with features_dir = "docs/features" + When beehave is invoked with --features-dir custom/path + Then features_dir for this invocation is "custom/path" + + @id:13869c93 + Example: CLI override does not persist to pyproject.toml + Given a pyproject.toml with features_dir = "docs/features" + When beehave is invoked with --features-dir custom/path and completes + Then pyproject.toml still contains features_dir = "docs/features" diff --git a/docs/features/backlog/deprecation-sync.feature b/docs/features/backlog/deprecation-sync.feature index 3f41965..1a59879 100644 --- a/docs/features/backlog/deprecation-sync.feature +++ b/docs/features/backlog/deprecation-sync.feature @@ -5,7 +5,7 @@ Feature: deprecation-sync — propagate @deprecated tags to test stubs absolute in v1: a @deprecated on a Feature or Rule propagates to every child Example with no per-Example override mechanism. - Status: ELICITING + Status: BASELINED (2026-04-22) Rules (Business): - @deprecated on a Feature applies to all child Examples of that Feature @@ -15,3 +15,59 @@ Feature: deprecation-sync — propagate @deprecated tags to test stubs Constraints: - Cascade direction is always parent → child; never child → parent + + Rule: Example-level deprecation + As a developer + I want beehave to mark a stub as deprecated when its Example is tagged @deprecated + So that pytest can report it accordingly + + @id:2afe7190 + Example: @deprecated on Example adds deprecated marker to its stub + Given an Example tagged @deprecated + When beehave sync runs + Then the corresponding stub function has the adapter's deprecated marker + + @id:428a5348 + Example: Removing @deprecated from Example removes the marker from its stub + Given a stub with the deprecated marker whose Example no longer has @deprecated + When beehave sync runs + Then the deprecated marker is removed from the stub function + + Rule: Rule-level deprecation cascade + As a developer + I want @deprecated on a Rule to propagate to all its child Examples + So that deprecating a user story marks all its tests as deprecated in one step + + @id:55faf7b9 + Example: @deprecated on Rule adds deprecated marker to all child stubs + Given a Rule tagged @deprecated containing three Examples + When beehave sync runs + Then all three corresponding stub functions have the adapter's deprecated marker + + @id:a7ad94e7 + Example: Only the Rule's child stubs are affected; sibling Rule stubs are unchanged + Given two Rules where only the first is tagged @deprecated + When beehave sync runs + Then only stubs under the first Rule have the deprecated marker; the second Rule's stubs are unchanged + + Rule: Feature-level deprecation cascade + As a developer + I want @deprecated on a Feature to propagate to all Examples in that Feature + So that deprecating an entire feature marks every test in one step + + @id:e94de52f + Example: @deprecated on Feature adds deprecated marker to all stubs in the feature + Given a Feature tagged @deprecated with two Rules and four Examples total + When beehave sync runs + Then all four stub functions have the adapter's deprecated marker + + Rule: No per-Example override in v1 + As a developer + I want the cascade to be absolute so the behaviour is predictable + So that I do not need to track which Examples have been individually overridden + + @id:62b7aac9 + Example: An Example cannot override a parent @deprecated tag in v1 + Given a Rule tagged @deprecated and one of its Examples also tagged @deprecated + When beehave sync runs + Then the stub for that Example has the deprecated marker (same as all other children; no special treatment) diff --git a/docs/features/backlog/hatch.feature b/docs/features/backlog/hatch.feature index 83fdc48..83ac3ca 100644 --- a/docs/features/backlog/hatch.feature +++ b/docs/features/backlog/hatch.feature @@ -5,7 +5,7 @@ Feature: hatch — generate demo .feature files so a developer can immediately experience the full sync workflow end-to-end without writing their own .feature content first. - Status: ELICITING + Status: BASELINED (2026-04-22) Rules (Business): - hatch never overwrites an existing .feature file @@ -13,3 +13,60 @@ Feature: hatch — generate demo .feature files Constraints: - Demo content is bee-themed (vocabulary consistent with beehave branding) + + Rule: Demo file generation + As a new beehave user + I want hatch to generate ready-to-use demo .feature files + So that I can see beehave sync working end-to-end without writing my own content first + + @id:5e6028ce + Example: hatch creates at least one .feature file in the features_dir + Given an empty docs/features/backlog/ directory + When beehave hatch runs + Then at least one .feature file exists in docs/features/backlog/ + + @id:4e364a9d + Example: Generated .feature file contains a Rule block + Given beehave hatch has run + When the generated .feature file is inspected + Then it contains at least one Rule: block with a user story + + @id:c452ac34 + Example: Generated .feature file contains a plain Example + Given beehave hatch has run + When the generated .feature file is inspected + Then it contains at least one Example: block with Given/When/Then steps + + @id:9979040f + Example: Generated .feature file contains a Scenario Outline + Given beehave hatch has run + When the generated .feature file is inspected + Then it contains at least one Scenario Outline: block with an Examples table + + Rule: No-overwrite safety + As a developer + I want hatch to never overwrite an existing .feature file + So that running it on an existing project does not destroy my work + + @id:24514455 + Example: Existing .feature file is not overwritten + Given a .feature file already exists in docs/features/backlog/ + When beehave hatch runs + Then the existing file content is unchanged + + @id:def76ef7 + Example: hatch reports skipped files when a target file already exists + Given a target demo .feature file already exists + When beehave hatch runs + Then beehave reports that the file was skipped rather than overwritten + + Rule: Bee-themed content + As a new beehave user + I want the demo content to use bee-themed vocabulary + So that the examples are memorable and consistent with the beehave brand + + @id:a81d2b1c + Example: Generated .feature file uses bee-themed domain vocabulary + Given beehave hatch has run + When the generated .feature file is inspected + Then the feature name and step text use bee-related domain terms diff --git a/docs/features/backlog/id-generation.feature b/docs/features/backlog/id-generation.feature index 930d38b..0753998 100644 --- a/docs/features/backlog/id-generation.feature +++ b/docs/features/backlog/id-generation.feature @@ -28,7 +28,7 @@ Feature: id-generation — assign @id tags to untagged Examples Example: Untagged Example receives a beehave-generated @id Given a .feature file with an Example that has no @id tag When beehave assigns IDs - Then the Example receives an @id tag whose value is 8 lowercase hex characters (beehave's generation format) + Then the Example receives an @id tag whose value is exactly 8 lowercase hex characters @id:f381c49d Example: @id tag is inserted immediately before the Example keyword diff --git a/docs/features/backlog/nest.feature b/docs/features/backlog/nest.feature index a609b3b..996b88b 100644 --- a/docs/features/backlog/nest.feature +++ b/docs/features/backlog/nest.feature @@ -5,7 +5,7 @@ Feature: nest — bootstrap canonical directory structure and .gitkeep files in each empty directory. It also injects a [tool.beehave] snippet into pyproject.toml if not already present. The command is additive and idempotent. - Status: ELICITING + Status: BASELINED (2026-04-22) Rules (Business): - Nest never removes or overwrites existing content @@ -16,3 +16,77 @@ Feature: nest — bootstrap canonical directory structure Constraints: - Accepts --features-dir to override the default docs/features/ path - Safe to run in an existing Python project with unrelated files present + + Rule: Directory bootstrapping + As a developer + I want beehave nest to create the required directory structure + So that I can start using beehave immediately in a new project + + @id:d9b7b520 + Example: Creates all required directories in a clean project + Given a project directory with no beehave structure + When beehave nest is run + Then docs/features/backlog/, docs/features/in-progress/, docs/features/completed/, and tests/features/ all exist + + @id:2b76d9eb + Example: Places .gitkeep in each new empty directory + Given a project directory with no beehave structure + When beehave nest is run + Then each created empty directory contains a .gitkeep file + + @id:6f29a8f2 + Example: Injects [tool.beehave] into pyproject.toml when absent + Given a pyproject.toml with no [tool.beehave] section + When beehave nest is run + Then pyproject.toml contains a [tool.beehave] section with default key-value pairs + + Rule: Idempotency and safety + As a developer + I want beehave nest to be safe to re-run on an existing project + So that I can run it in CI or after a pull without fear of data loss + + @id:8236b93e + Example: Existing directories are not modified + Given a project with docs/features/backlog/ already containing a .feature file + When beehave nest is run + Then the existing .feature file is unchanged + + @id:580ba508 + Example: Existing [tool.beehave] section is not overwritten + Given a pyproject.toml with a [tool.beehave] section containing custom values + When beehave nest is run + Then the existing [tool.beehave] values are unchanged + + @id:86e134b6 + Example: Running nest twice produces the same result as running it once + Given a clean project + When beehave nest is run twice in succession + Then the resulting structure is identical to running it once + + Rule: Check mode + As a developer + I want nest --check to verify structure without writing anything + So that CI can enforce that the project structure is correct + + @id:4a676e83 + Example: Exits 0 when structure is complete + Given a project with all required beehave directories present + When beehave nest --check is run + Then the exit code is 0 and no files are written + + @id:c5bc7117 + Example: Exits non-zero when structure is incomplete + Given a project missing docs/features/completed/ + When beehave nest --check is run + Then the exit code is non-zero and no files are written + + Rule: Overwrite mode + As a developer + I want nest --overwrite to recreate managed directories from scratch + So that I can reset a corrupted or misconfigured structure + + @id:58c68617 + Example: Recreates managed directories when --overwrite is set + Given a project where docs/features/backlog/ has been deleted + When beehave nest --overwrite is run + Then all managed directories are recreated with .gitkeep files diff --git a/docs/features/backlog/parameter-handling.feature b/docs/features/backlog/parameter-handling.feature index 4cf5984..e50b0a6 100644 --- a/docs/features/backlog/parameter-handling.feature +++ b/docs/features/backlog/parameter-handling.feature @@ -5,7 +5,7 @@ Feature: parameter-handling — Scenario Outline parametrization If columns change after the initial stub is created, beehave warns and flags the stub as requiring manual intervention — it never auto-modifies the parametrize decorator. - Status: ELICITING + Status: BASELINED (2026-04-22) Rules (Business): - A Scenario Outline stub is created with the adapter's parametrize template on first sync @@ -13,3 +13,54 @@ Feature: parameter-handling — Scenario Outline parametrization Constraints: - "Column change" means any addition, removal, or rename of an Examples table column + + Rule: Parametrized stub creation + As a developer + I want beehave to generate a parametrized stub for each Scenario Outline + So that my test suite reflects all parameter combinations without manual setup + + @id:9e373760 + Example: Scenario Outline produces a parametrized stub on first sync + Given a .feature file with a Scenario Outline with columns "input" and "expected" + When beehave sync creates the stub with the pytest adapter + Then the stub is decorated with @pytest.mark.parametrize("input, expected", [...]) containing the table values + + @id:21be53b0 + Example: Each row in the Examples table becomes a parametrize entry + Given a Scenario Outline with an Examples table of three rows + When beehave sync creates the stub + Then the parametrize decorator contains exactly three parameter tuples + + Rule: Column change warning + As a developer + I want beehave to warn me when Scenario Outline columns change after the stub was created + So that I know to update the parametrize decorator manually + + @id:e75ff75a + Example: Adding a column after initial stub creation produces a warning + Given a parametrized stub already exists for a Scenario Outline + When a new column is added to the Examples table and beehave sync runs + Then beehave emits a warning identifying the affected stub and does not modify the parametrize decorator + + @id:4ee36ecb + Example: Removing a column after initial stub creation produces a warning + Given a parametrized stub already exists for a Scenario Outline + When a column is removed from the Examples table and beehave sync runs + Then beehave emits a warning identifying the affected stub and does not modify the parametrize decorator + + @id:da379195 + Example: Renaming a column after initial stub creation produces a warning + Given a parametrized stub already exists for a Scenario Outline + When a column is renamed in the Examples table and beehave sync runs + Then beehave emits a warning identifying the affected stub and does not modify the parametrize decorator + + Rule: Row-only changes do not warn + As a developer + I want beehave to update stub parameter values silently when only row data changes + So that adding new test cases to an Outline does not require manual intervention + + @id:26570a93 + Example: Adding a row to the Examples table updates the parametrize values without warning + Given a parametrized stub already exists for a Scenario Outline with two rows + When a third row is added to the Examples table and beehave sync runs + Then the parametrize decorator is updated to include the new row and no warning is emitted diff --git a/docs/features/backlog/pytest-adapter.feature b/docs/features/backlog/pytest-adapter.feature index 87a0e76..121b7fe 100644 --- a/docs/features/backlog/pytest-adapter.feature +++ b/docs/features/backlog/pytest-adapter.feature @@ -3,7 +3,7 @@ Feature: pytest-adapter — built-in adapter for the pytest framework Implements the adapter contract for pytest. Supplies the pytest-specific stub conventions used by sync-create, sync-update, and parameter-handling when pytest is the active framework. - Status: ELICITING + Status: BASELINED (2026-04-22) Rules (Business): - All generated pytest stubs are immediately runnable with pytest without modification @@ -13,3 +13,66 @@ Feature: pytest-adapter — built-in adapter for the pytest framework - Deprecated marker: @pytest.mark.deprecated - Parametrize: @pytest.mark.parametrize(...) - Function prefix: test_; return type: -> None; body: ... + + Rule: Pytest stub conventions + As a developer + I want generated pytest stubs to follow pytest conventions exactly + So that pytest discovers and reports them correctly without any manual editing + + @id:128ddfaf + Example: Generated stub has pytest skip marker + Given a new Example in a .feature file + When beehave sync generates the stub with the pytest adapter + Then the function is decorated with @pytest.mark.skip(reason="not yet implemented") + + @id:4173073d + Example: Generated stub function name follows test_ prefix convention + Given a new Example with @id a1b2c3d4 in feature "honey-flow" + When beehave sync generates the stub with the pytest adapter + Then the function is named test_honey_flow_a1b2c3d4 + + @id:342eff46 + Example: Generated stub has -> None return type annotation + Given a new Example in a .feature file + When beehave sync generates the stub with the pytest adapter + Then the function signature ends with -> None + + @id:9221fd27 + Example: Generated stub body is Ellipsis + Given a new Example in a .feature file + When beehave sync generates the stub with the pytest adapter + Then the function body is a single ... (Ellipsis) statement + + @id:56526c2a + Example: Generated stub docstring contains full Gherkin step text + Given an Example with Given/When/Then steps in a .feature file + When beehave sync generates the stub with the pytest adapter + Then the function docstring contains all Given, When, and Then lines verbatim + + Rule: Pytest deprecated marker + As a developer + I want beehave to add the deprecated marker to stubs for @deprecated Examples + So that pytest marks those tests as deprecated automatically + + @id:fc7dc521 + Example: Deprecated stub has @pytest.mark.deprecated decorator + Given an Example tagged @deprecated in a .feature file + When beehave sync updates the stub with the pytest adapter + Then the function is decorated with @pytest.mark.deprecated + + @id:24841ef4 + Example: Non-deprecated stub does not have the deprecated decorator + Given an Example with no @deprecated tag + When beehave sync generates the stub with the pytest adapter + Then @pytest.mark.deprecated is absent from the function decorators + + Rule: Pytest file header + As a developer + I want the generated test file to have a valid pytest import header + So that the file is valid Python and pytest can collect it without errors + + @id:9e7d16ec + Example: Generated test file has pytest import + Given a new Rule block requiring a new test file + When beehave sync creates the file with the pytest adapter + Then the file begins with import pytest diff --git a/docs/features/backlog/status.feature b/docs/features/backlog/status.feature index 95edf75..9e9801b 100644 --- a/docs/features/backlog/status.feature +++ b/docs/features/backlog/status.feature @@ -4,7 +4,7 @@ Feature: status — dry-run preview of sync changes developer review before committing and for CI pipeline gating. Exits 0 when everything is in sync and 1 when changes are pending, following Unix convention. - Status: ELICITING + Status: BASELINED (2026-04-22) Rules (Business): - status never writes to any file @@ -13,3 +13,66 @@ Feature: status — dry-run preview of sync changes Constraints: - Supports --verbose (human-readable) and --json (machine-readable) output modes - Silent by default (Unix philosophy) + + Rule: Read-only guarantee + As a developer + I want beehave status to never modify any file + So that I can safely run it in CI without fear of unintended side effects + + @id:525cf08b + Example: No files are written when status runs + Given a project with unsynced .feature files + When beehave status runs + Then no files in the project are created, modified, or deleted + + @id:fa7a30b2 + Example: Status reports what sync would do without doing it + Given a project with one new Example that has no stub + When beehave status runs + Then the output lists the missing stub and exit code is 1 + + Rule: Exit code contract + As a developer + I want beehave status to exit 0 when in sync and 1 when changes are pending + So that CI pipelines can gate on sync state without parsing output + + @id:2e14c727 + Example: Exit code is 0 when project is fully in sync + Given a project where all Examples have corresponding stubs and all stubs match their Examples + When beehave status runs + Then the exit code is 0 + + @id:08ba37c8 + Example: Exit code is 1 when a new stub needs to be created + Given a project with at least one Example that has no corresponding stub + When beehave status runs + Then the exit code is 1 + + @id:53fd3157 + Example: Exit code is 1 when a stub docstring is out of date + Given a project where an Example's step text has changed since the stub was generated + When beehave status runs + Then the exit code is 1 + + Rule: Output modes + As a developer + I want to choose between silent, verbose, and JSON output + So that I can use status both interactively and in automated pipelines + + @id:f2bce539 + Example: Default output is silent when project is in sync + Given a fully synced project + When beehave status runs with no output flags + Then nothing is written to stdout or stderr + + @id:6bc2555f + Example: --verbose output lists each pending change in human-readable form + Given a project with two pending changes + When beehave status --verbose runs + Then stdout contains one human-readable line per pending change + + @id:b9301030 + Example: --json output emits a machine-readable JSON object + Given a project with pending changes + When beehave status --json runs + Then stdout is valid JSON describing the pending changes diff --git a/docs/features/backlog/sync-cleanup.feature b/docs/features/backlog/sync-cleanup.feature index 80a1c02..acae493 100644 --- a/docs/features/backlog/sync-cleanup.feature +++ b/docs/features/backlog/sync-cleanup.feature @@ -5,7 +5,7 @@ Feature: sync-cleanup — handle orphaned and misplaced stubs location. When a .feature file is deleted, beehave warns and the resulting orphan stubs are flagged by orphan detection. beehave never deletes stubs automatically. - Status: ELICITING + Status: BASELINED (2026-04-22) Rules (Business): - Orphaned stubs are warned about and never deleted automatically @@ -15,3 +15,54 @@ Feature: sync-cleanup — handle orphaned and misplaced stubs Constraints: - Orphan: a test function whose name contains an @id absent from all .feature files - Path correction applies when @id matches but directory path does not + + Rule: Orphan detection + As a developer + I want beehave to tell me when a stub has no matching Example + So that I can decide what to do with it rather than having it silently removed + + @id:2d9fa007 + Example: Stub with no matching @id in any .feature file is reported as orphan + Given a test function test_honey_flow_a1b2c3d4 where @id:a1b2c3d4 does not exist in any .feature file + When beehave sync runs + Then beehave emits a warning identifying the orphan stub by name and file path + + @id:bc9ca466 + Example: Orphan stub is never deleted automatically + Given a test function whose @id has no matching Example + When beehave sync runs + Then the test function file is unchanged after sync + + @id:eed4cdbb + Example: Stub with a matching @id in a .feature file is not flagged as orphan + Given a test function test_honey_flow_a1b2c3d4 and @id:a1b2c3d4 exists in a .feature file + When beehave sync runs + Then no orphan warning is emitted for that stub + + Rule: Path correction + As a developer + I want beehave to move misplaced stubs to the correct location + So that my test layout stays consistent with my .feature file structure + + @id:0e077f6e + Example: Stub in wrong directory is moved to the correct path + Given a test function whose @id matches an Example but the file is in the wrong feature slug directory + When beehave sync runs + Then the stub function is moved to tests/features//_test.py with its body intact + + @id:fb241843 + Example: Moved stub body is byte-for-byte unchanged + Given a misplaced stub with an implemented body + When beehave sync moves it to the correct path + Then the function body in the new location is identical to the original + + Rule: Deleted feature file warning + As a developer + I want beehave to warn me when a .feature file I had stubs for has been deleted + So that I know orphan stubs exist that need manual review + + @id:20d52a9d + Example: Deleting a .feature file causes a warning on next sync + Given a .feature file that had stubs generated for it is deleted + When beehave sync runs + Then beehave emits a warning identifying the deleted feature and its orphaned stubs diff --git a/docs/features/backlog/sync-create.feature b/docs/features/backlog/sync-create.feature index def3554..5f8cf6f 100644 --- a/docs/features/backlog/sync-create.feature +++ b/docs/features/backlog/sync-create.feature @@ -5,7 +5,7 @@ Feature: sync-create — generate new test stubs for new Examples docstring, uses the active adapter's skip marker and body convention, and is placed in tests/features//_test.py. - Status: ELICITING + Status: BASELINED (2026-04-22) Rules (Business): - A stub is created only when no test function with the @id in its name already exists @@ -16,3 +16,60 @@ Feature: sync-create — generate new test stubs for new Examples - Docstring: full Gherkin step text verbatim (all Given/When/Then lines) - Return type: -> None; body: ... (Ellipsis) - Skip marker and body come from the active adapter template + + Rule: Stub creation + As a developer + I want beehave sync to generate a test stub for each new Example + So that every acceptance criterion is represented in my test suite immediately + + @id:c8911ca5 + Example: New Example produces a skipped stub function + Given a .feature file with an Example tagged @id:a1b2c3d4 and no corresponding test function + When beehave sync runs + Then a function named test__a1b2c3d4 exists in the correct test file, decorated with the adapter's skip marker + + @id:08a9661d + Example: Existing stub is not recreated + Given a test function test__a1b2c3d4 already exists + When beehave sync runs + Then the existing function is not modified and no duplicate is created + + @id:3941f0ca + Example: Stub docstring contains all Given/When/Then lines verbatim + Given an Example with three Gherkin steps + When beehave sync generates the stub + Then the function docstring contains all three step lines exactly as they appear in the .feature file + + Rule: Test file layout + As a developer + I want each Rule's stubs to be in their own file under the feature slug directory + So that test files are small and map directly to user stories + + @id:b8483d50 + Example: Test file is created at the correct path + Given a .feature file with slug "honey-flow" and a Rule with slug "nectar-collection" + When beehave sync creates the stub + Then the stub is placed in tests/features/honey_flow/nectar_collection_test.py + + @id:23de0b22 + Example: Missing parent directories are created + Given tests/features/honey_flow/ does not exist + When beehave sync creates a stub for the honey-flow feature + Then tests/features/honey_flow/ is created along with the stub file + + @id:b6927bfd + Example: Multiple Rules in one feature produce separate test files + Given a .feature file with two Rule blocks + When beehave sync runs + Then two test files are created, one per Rule, under tests/features// + + Rule: Idempotency + As a developer + I want beehave sync to be safe to run multiple times + So that I can run it in CI without risk of duplicate stubs + + @id:23af2d01 + Example: Running sync twice does not produce duplicate functions + Given beehave sync has already been run for a feature + When beehave sync is run again without changes to the .feature file + Then the test files are unchanged diff --git a/docs/features/backlog/sync-update.feature b/docs/features/backlog/sync-update.feature index fc68465..bec32bc 100644 --- a/docs/features/backlog/sync-update.feature +++ b/docs/features/backlog/sync-update.feature @@ -5,7 +5,7 @@ Feature: sync-update — update existing stubs when .feature content changes function name is updated if the feature slug changed, and the @deprecated marker is toggled based on the Gherkin @deprecated tag. beehave never modifies a test body under any circumstance. - Status: ELICITING + Status: BASELINED (2026-04-22) Rules (Business): - Test bodies are never modified under any circumstance @@ -14,3 +14,65 @@ Feature: sync-update — update existing stubs when .feature content changes Constraints: - @deprecated marker is added or removed based solely on the Gherkin @deprecated tag presence - Function rename on slug change preserves the test body exactly + + Rule: Docstring update + As a developer + I want beehave to keep my stub docstrings in sync with the .feature file + So that the living documentation in my test file stays accurate + + @id:ec1cd20f + Example: Changed step text updates the stub docstring + Given an Example whose Given/When/Then steps have changed in the .feature file + When beehave sync runs + Then the corresponding stub function's docstring reflects the new step text + + @id:0e8f0c8a + Example: Unchanged step text leaves the stub docstring untouched + Given an Example whose steps have not changed since the last sync + When beehave sync runs + Then the stub function's docstring is unchanged + + Rule: Body preservation + As a developer + I want beehave to never touch my test body under any circumstances + So that I can trust beehave will not break my implemented tests + + @id:37751b69 + Example: Implemented test body is preserved when docstring updates + Given a stub whose body has been implemented by a developer + When the corresponding Example's step text changes and beehave sync runs + Then the docstring is updated and the body is byte-for-byte unchanged + + @id:d978b45d + Example: Skipped stub body is preserved when docstring updates + Given a stub with body ... (Ellipsis) and an updated Example + When beehave sync runs + Then the body remains ... and only the docstring changes + + Rule: Deprecated marker sync + As a developer + I want beehave to add or remove the deprecated marker based on the @deprecated Gherkin tag + So that my test suite reflects which acceptance criteria are no longer active + + @id:2fd7a849 + Example: @deprecated tag on Example adds deprecated marker to stub + Given an Example without @deprecated that gains a @deprecated tag in the .feature file + When beehave sync runs + Then the stub function gains the adapter's deprecated marker decorator + + @id:13c1267d + Example: Removing @deprecated tag removes deprecated marker from stub + Given a stub with the deprecated marker whose @deprecated tag is removed from the .feature file + When beehave sync runs + Then the deprecated marker decorator is removed from the stub function + + Rule: Scenario Outline column change warning + As a developer + I want beehave to warn me when Scenario Outline columns change + So that I know to manually update the parametrize decorator myself + + @id:2d128d78 + Example: Column change in Examples table produces a warning + Given a Scenario Outline stub whose Examples table columns have changed + When beehave sync runs + Then beehave emits a warning identifying the affected stub and does not modify the parametrize decorator diff --git a/docs/features/backlog/template-customization.feature b/docs/features/backlog/template-customization.feature index 3e86c5e..888828a 100644 --- a/docs/features/backlog/template-customization.feature +++ b/docs/features/backlog/template-customization.feature @@ -5,7 +5,7 @@ Feature: template-customization — user-defined stub templates teams with non-standard conventions to generate stubs that match their style without forking beehave. - Status: ELICITING + Status: BASELINED (2026-04-22) Rules (Business): - Built-in adapter templates are used when no custom folder is specified @@ -13,3 +13,48 @@ Feature: template-customization — user-defined stub templates Constraints: - Custom folder specified via --template-dir flag or template_path config key in [tool.beehave] + + Rule: Default template usage + As a developer + I want beehave to use built-in templates when I have not configured a custom folder + So that beehave works out of the box without any template configuration + + @id:bf0a3c49 + Example: Built-in templates are used when template_path is not set + Given no template_path in [tool.beehave] and no --template-dir flag + When beehave sync generates a stub + Then the stub matches the built-in adapter template format + + Rule: Custom template override + As a developer + I want to point beehave at my own template folder + So that generated stubs match my team's coding conventions + + @id:cf259a6c + Example: Custom template replaces built-in when template_path is set + Given template_path = "templates/beehave" in [tool.beehave] and a custom stub template in that folder + When beehave sync generates a stub + Then the stub matches the custom template, not the built-in + + @id:99c97725 + Example: --template-dir flag overrides template_path config for the current invocation + Given template_path = "templates/beehave" in [tool.beehave] + When beehave sync is invoked with --template-dir other/templates + Then stubs are generated using the other/templates folder + + @id:bb612744 + Example: Non-existent custom template folder raises an error + Given template_path points to a directory that does not exist + When beehave sync runs + Then beehave exits with an error identifying the missing template directory + + Rule: Partial custom template + As a developer + I want to override only specific template files and fall back to built-ins for the rest + So that I only need to maintain the templates I actually customise + + @id:ad3b70dc + Example: Missing template file in custom folder falls back to built-in + Given a custom template folder that contains only a stub template but not a file header template + When beehave sync generates a new test file + Then the file header comes from the built-in template and the stub body from the custom template From 0d23bc24653272b529205dea8149195ea122555f Mon Sep 17 00:00:00 2001 From: nullhack Date: Wed, 22 Apr 2026 05:18:03 -0400 Subject: [PATCH 6/9] refactor(docs): merge domain-model, context, container into system.md; embed arch Q&A into ADRs; slim discovery.md format --- AGENTS.md | 16 +- FLOW.md | 2 +- docs/adr/ADR-2026-04-22-adapter-protocol.md | 8 + .../ADR-2026-04-22-backwards-compatibility.md | 8 + .../ADR-2026-04-22-error-handling-policy.md | 8 + ...DR-2026-04-22-feature-file-write-policy.md | 8 + docs/adr/ADR-2026-04-22-id-stability.md | 8 + .../ADR-2026-04-22-logging-observability.md | 8 + docs/adr/ADR-2026-04-22-module-structure.md | 8 + .../adr/ADR-2026-04-22-performance-targets.md | 8 + docs/adr/ADR-2026-04-22-slug-derivation.md | 8 + docs/arch_journal.md | 53 ----- docs/container.md | 44 ---- docs/context.md | 26 --- docs/discovery.md | 68 +++--- docs/domain-model.md | 141 ------------- docs/system.md | 198 +++++++++++++++++- 17 files changed, 299 insertions(+), 321 deletions(-) delete mode 100644 docs/arch_journal.md delete mode 100644 docs/container.md delete mode 100644 docs/context.md delete mode 100644 docs/domain-model.md diff --git a/AGENTS.md b/AGENTS.md index 1c620ba..b1472fd 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -49,7 +49,7 @@ All feature work happens on branches. `main` is the single source of truth and r - **Product Owner (PO)** — AI agent. Interviews the stakeholder, writes discovery docs, Gherkin features, and acceptance criteria. Accepts or rejects deliveries. **Sole owner of all `.feature` file moves** (backlog → in-progress before Step 2; in-progress → completed after Step 5 acceptance). - **Stakeholder** — Human. Answers PO's questions, provides domain knowledge, approves PO syntheses to confirm discovery is complete. -- **System Architect (SA)** — AI agent. Designs architecture, writes domain stubs, records decisions in ADRs, and verifies implementation respects those decisions. Owns `docs/domain-model.md`, `docs/system.md`, and `docs/adr/ADR-*.md`. Never edits or moves `.feature` files. Escalates spec gaps to PO. +- **System Architect (SA)** — AI agent. Designs architecture, writes domain stubs, records decisions in ADRs, and verifies implementation respects those decisions. Owns `docs/system.md` (including domain model, C4 context, and C4 container sections) and `docs/adr/ADR-*.md`. Never edits or moves `.feature` files. Escalates spec gaps to PO. - **Software Engineer (SE)** — AI agent. Implements everything: test bodies, production code, releases. Owns all `.py` files under the package. Never edits or moves `.feature` files. Escalates spec gaps to PO. If no `.feature` file is in `in-progress/`, stops and escalates to PO. ## Feature File Chain of Responsibility @@ -106,9 +106,9 @@ Step 1 has two stages: Discovery follows a block structure per session. See `skill define-scope` for the full protocol. -**Block A — Session Start**: Resume check (if `IN-PROGRESS`), read `domain-model.md` (existing entities), declare scope. +**Block A — Session Start**: Resume check (if `IN-PROGRESS`), read `system.md` Domain Model section (existing entities), declare scope. -**Block B — General & Cross-cutting**: 5Ws, behavioral groups, bounded contexts. Active listening + reconciliation against `glossary.md` and `domain-model.md`. +**Block B — General & Cross-cutting**: 5Ws, behavioral groups, bounded contexts. Active listening + reconciliation against `glossary.md` and `system.md` (Domain Model section). **Block C — Feature Discovery (per feature)**: Detailed questions, pre-mortem, create/update `.feature` files. @@ -117,6 +117,7 @@ Discovery follows a block structure per session. See `skill define-scope` for th **Key rules**: - PO owns `scope_journal.md`, `discovery.md`, `glossary.md`, and `.feature` files - PO reads `domain-model.md` but never writes to it — entity suggestions go in `discovery.md` for SA formalization at Step 2 +- `domain-model.md` does not exist as a standalone file; the domain model is the `## Domain Model` section in `docs/system.md` - Real-time split rule: >2 concerns or >8 candidate Examples → split immediately - Completed feature touched and changed → move to `backlog/` @@ -161,16 +162,13 @@ Post-mortems are append-only, never edited. If a failure mode recurs, write a ne ``` docs/ scope_journal.md ← raw Q&A, PO appends after every session - discovery.md ← session synthesis changelog, PO appends after every session - domain-model.md ← living domain model, SA creates/updates at Step 2, PO reads only + discovery.md ← session synthesis changelog (behavioral changes only), PO appends after every session adr/ ← one file per decision: ADR-YYYY-MM-DD-.md, SA creates at Step 2 - system.md ← current-state overview (completed features only), SA rewrites at Step 2, PO reviews at Step 5 + system.md ← SA-owned current-state snapshot: domain model + C4 context + C4 container + modules + constraints + ADR index; SA rewrites at Step 2, PO reviews at Step 5 glossary.md ← living glossary, PO updates after each session branding.md ← project identity, colors, release naming, wording (designer owns) assets/ ← logo.svg, banner.svg, and other visual assets (designer owns) - context.md ← C4 Level 1 diagram, PO updates via update-docs skill - container.md ← C4 Level 2 diagram, PO updates via update-docs skill (if multi-container) - post-mortem/ ← compact post-mortems, PO-owned, append-only + post-mortem/ ← compact post-mortems, PO-owned, append-only features/ backlog/.feature ← narrative + Rules + Examples in-progress/.feature diff --git a/FLOW.md b/FLOW.md index 2ba4bbe..aebf05b 100644 --- a/FLOW.md +++ b/FLOW.md @@ -167,7 +167,7 @@ States are checked **in order**. The first matching condition is the current sta ### [STEP-2-ARCH] **Owner**: `system-architect` **Entry condition**: On `@branch`, no test stubs in `tests/features//` -**Action**: Read feature; design domain stubs; write ADRs; update `domain-model.md`; run `uv run task test-fast` to generate stubs +**Action**: Read feature; design domain stubs; write ADRs; update `system.md` (domain model + C4 sections); run `uv run task test-fast` to generate stubs **Exit**: Stubs generated → update `@state: STEP-3-READY` in `WORK.md` **Failure**: Spec unclear → escalate to `product-owner`; update `@state: STEP-1-DISCOVERY` in `WORK.md` **Commit**: `feat(arch): design @id architecture` diff --git a/docs/adr/ADR-2026-04-22-adapter-protocol.md b/docs/adr/ADR-2026-04-22-adapter-protocol.md index 5568884..b90c8eb 100644 --- a/docs/adr/ADR-2026-04-22-adapter-protocol.md +++ b/docs/adr/ADR-2026-04-22-adapter-protocol.md @@ -6,6 +6,14 @@ | **Feature** | adapter-contract, pytest-adapter | | **Status** | Accepted | +## Context + +**Question (D2):** Should `FrameworkAdapter` be a `typing.Protocol` or an ABC? + +Structural protocols allow third-party adapters to satisfy the interface without importing from `beehave`. An ABC forces a hard import coupling. pyright enforces Protocol conformance statically with zero runtime overhead. The stakeholder confirmed zero import coupling as a design goal. + +--- + ## Decision `FrameworkAdapter` is defined as a `typing.Protocol` (structural subtyping), not an abstract base class. diff --git a/docs/adr/ADR-2026-04-22-backwards-compatibility.md b/docs/adr/ADR-2026-04-22-backwards-compatibility.md index 0acbfef..abf1570 100644 --- a/docs/adr/ADR-2026-04-22-backwards-compatibility.md +++ b/docs/adr/ADR-2026-04-22-backwards-compatibility.md @@ -6,6 +6,14 @@ | **Feature** | all | | **Status** | Accepted | +## Context + +**Question (A3):** What is the policy for CLI flags, config keys, and `@id` format changes across versions? + +Stakeholder confirmed best-effort as the right balance for v1. A hard semver contract requires deprecation tracking and migration tooling that is out of scope. The key concern was that `@id` tags in `.feature` files (and therefore all test function names) must never be silently broken — at minimum, users must get a warning before any removal. + +--- + ## Decision beehave follows a **best-effort deprecation** policy: any CLI flag, config key, or `@id` format change is preceded by a deprecation warning visible to users for at least one minor release before removal. No hard semver guarantee is enforced. diff --git a/docs/adr/ADR-2026-04-22-error-handling-policy.md b/docs/adr/ADR-2026-04-22-error-handling-policy.md index c2c2d46..ef7810a 100644 --- a/docs/adr/ADR-2026-04-22-error-handling-policy.md +++ b/docs/adr/ADR-2026-04-22-error-handling-policy.md @@ -6,6 +6,14 @@ | **Feature** | sync-cleanup, id-generation, config-reading | | **Status** | Accepted | +## Context + +**Question (A1/D6):** How should beehave behave when `pyproject.toml` is malformed, a `.feature` file has invalid Gherkin, or the filesystem is read-only? Which conditions warrant a hard error vs. a configurable warn/error? + +Stakeholder decisions: malformed (present but invalid) `pyproject.toml` → hard error; absent `pyproject.toml` → use defaults, no error; invalid Gherkin → hard error; read-only filesystem → hard error. Deleted feature file and orphan stubs → configurable, default warn. Duplicate `@id` → hard error always, never configurable (see `ADR-2026-04-22-id-stability`). New config keys surfaced: `on_delete` and `on_orphan`. + +--- + ## Decision | Condition | Policy | Configurable? | diff --git a/docs/adr/ADR-2026-04-22-feature-file-write-policy.md b/docs/adr/ADR-2026-04-22-feature-file-write-policy.md index cbb9c42..4b60197 100644 --- a/docs/adr/ADR-2026-04-22-feature-file-write-policy.md +++ b/docs/adr/ADR-2026-04-22-feature-file-write-policy.md @@ -6,6 +6,14 @@ | **Feature** | id-generation, sync-create, sync-update | | **Status** | Accepted | +## Context + +**Question (D5):** What may beehave write to `.feature` files, and what write strategy should it use? + +The core constraint from discovery (Q4/C4): beehave is not the author of `.feature` files — it only assigns `@id` tags. Any write beyond that risks corrupting human-authored Gherkin. The stakeholder confirmed that formatting, comments, and surrounding content must be byte-identical after a run. Full parse-and-reserialize was considered and rejected: the official Gherkin parser has no canonical serializer. + +--- + ## Decision beehave writes to `.feature` files **only** to add `@id` tags to untagged Examples; all other content is read-only. diff --git a/docs/adr/ADR-2026-04-22-id-stability.md b/docs/adr/ADR-2026-04-22-id-stability.md index f9e782f..53059fa 100644 --- a/docs/adr/ADR-2026-04-22-id-stability.md +++ b/docs/adr/ADR-2026-04-22-id-stability.md @@ -6,6 +6,14 @@ | **Feature** | id-generation | | **Status** | Accepted | +## Context + +**Question (D3):** What is the format for `@id` tags? How are generation collisions handled? What happens to a stub when its `@id` is edited or deleted? + +From discovery (I1/I1a): beehave generates 8-char lowercase hex IDs via `secrets.token_hex(4)`. Human-assigned IDs (any non-empty string) are valid and respected. Only an empty `@id:` value is malformed and gets replaced. The stakeholder confirmed that project-wide uniqueness is required — stubs are looked up by `@id` alone, so file-scoped uniqueness is insufficient. Duplicate `@id` found in files (always a hand-edit) → hard error, no safe resolution exists. + +--- + ## Decision `@id` values are 8-character lowercase hex strings. Once assigned, they are never replaced unless malformed (empty value or non-hex characters). Generation collisions trigger a silent retry. Uniqueness is enforced project-wide across all `.feature` files. diff --git a/docs/adr/ADR-2026-04-22-logging-observability.md b/docs/adr/ADR-2026-04-22-logging-observability.md index f5dd2c8..17a1e25 100644 --- a/docs/adr/ADR-2026-04-22-logging-observability.md +++ b/docs/adr/ADR-2026-04-22-logging-observability.md @@ -6,6 +6,14 @@ | **Feature** | all (CLI) | | **Status** | Accepted | +## Context + +**Question (A4):** Beyond `--verbose` and `--json`, should beehave support structured logging, log levels, or log files? + +Stakeholder confirmed log levels (DEBUG/INFO/WARNING/ERROR) are needed so users can control verbosity without code changes. Standard Python `logging` was the clear choice: zero extra dependency, universally understood, integrates cleanly with library use. `--verbose` maps to INFO; `--json` switches output format but respects the active level. + +--- + ## Decision beehave uses the **standard Python `logging` module** with four levels: DEBUG, INFO, WARNING, ERROR. The active log level is configurable via a `[tool.beehave]` key (`log_level`) or a `--log-level` CLI flag. Default level is WARNING (silent for normal use). `--verbose` is sugar for INFO; `--json` switches output format but respects the active log level. diff --git a/docs/adr/ADR-2026-04-22-module-structure.md b/docs/adr/ADR-2026-04-22-module-structure.md index 3e49aa3..2f823af 100644 --- a/docs/adr/ADR-2026-04-22-module-structure.md +++ b/docs/adr/ADR-2026-04-22-module-structure.md @@ -6,6 +6,14 @@ | **Feature** | all | | **Status** | Accepted (supersedes speculative v0 from same date) | +## Context + +**Question (D1):** How should the `beehave` package be organized into submodules? + +Initial proposal was 7 submodules (with separate `scaffold` and `cache` modules). Two problems: (1) "scaffold" has no place in the bee-themed vocabulary — the CLI command is `nest`; the submodule must match; (2) `cache` has no independent public API — it exists solely to serve `parsing`, making it a sub-file, not a separate module. Six submodules was confirmed as the minimum necessary for clear single-responsibility boundaries. + +--- + ## Decision The `beehave` package is organized into **six submodules**: diff --git a/docs/adr/ADR-2026-04-22-performance-targets.md b/docs/adr/ADR-2026-04-22-performance-targets.md index 520a3a1..9368e2b 100644 --- a/docs/adr/ADR-2026-04-22-performance-targets.md +++ b/docs/adr/ADR-2026-04-22-performance-targets.md @@ -6,6 +6,14 @@ | **Feature** | cache-management, id-generation | | **Status** | Accepted | +## Context + +**Question (A2):** What is the target scale (Examples per project)? Does the cache need to be load-bearing or is it an optimisation? + +Stakeholder confirmed medium scale (100–1,000 Examples) as the design target. At that scale, a full scan of all `.feature` files completes in well under 1 second. Making the cache load-bearing adds hard dependency on cache coherence and complicates failure modes. The stakeholder preferred simplicity: cache speeds things up, but sync must be correct without it. + +--- + ## Decision The target scale is **medium** (100–1,000 Examples per project). A full scan of all `.feature` files on every run is acceptable at this scale. The cache is a **performance optimisation**, not a load-bearing architectural requirement — sync must be correct without it; the cache only speeds it up. diff --git a/docs/adr/ADR-2026-04-22-slug-derivation.md b/docs/adr/ADR-2026-04-22-slug-derivation.md index 6585d4e..4fe8f2e 100644 --- a/docs/adr/ADR-2026-04-22-slug-derivation.md +++ b/docs/adr/ADR-2026-04-22-slug-derivation.md @@ -6,6 +6,14 @@ | **Feature** | sync-create, sync-update, sync-cleanup, nest | | **Status** | Accepted | +## Context + +**Question (D4):** How is the feature slug derived? Does the stage folder (`backlog/`, `in-progress/`, `completed/`) affect it? What happens when a feature file is renamed? + +A feature moves through stage folders during its lifecycle. If the slug included the stage path, moving from `in-progress/` to `completed/` would change the slug, rename the test directory, and orphan all stubs — a destructive side-effect of a purely administrative operation. The stakeholder confirmed stage-folder-independence. On rename: no feature-level ID exists, so beehave cannot detect renames; auto-rename was explicitly rejected (cannot distinguish rename from delete + new file). + +--- + ## Decision `FeatureSlug` is derived solely from the `.feature` file's **stem** (filename without extension and without the stage subfolder path); the stage folder (`backlog/`, `in-progress/`, `completed/`, or root) is ignored. diff --git a/docs/arch_journal.md b/docs/arch_journal.md deleted file mode 100644 index 872da28..0000000 --- a/docs/arch_journal.md +++ /dev/null @@ -1,53 +0,0 @@ -# Architecture Journal: beehave - -> Append-only record of all architecture design session Q&A. -> Written by the system-architect. Read by the system-architect for resume checks and ADR regeneration. -> Never edit past entries — append new session blocks only. -> If ADRs need to be regenerated, this file is the source of truth. - ---- - -## 2026-04-22 — Session 1 -Status: IN-PROGRESS - -### Context - -First architecture design session for beehave v1. -Covers all key architectural decisions before any implementation begins. -Resolves the 4 unanswered architectural gaps (A1–A4) from scope_journal.md Session 1. - -### Gaps - -| ID | Question | Answer | -|----|----------|--------| -| A1 | Error handling: How should beehave behave when pyproject.toml is malformed, a .feature file has invalid Gherkin, or the filesystem is read-only? | Malformed pyproject.toml (present but invalid) → error. Absent pyproject.toml → use defaults, no error. Invalid Gherkin → error. Read-only filesystem → error with clear message. Deleted feature file → configurable: warn (default) or error. Duplicate @id → hard error always (not configurable; see D3/D6). | -| A2 | Performance constraints: What is the target scale? This determines whether the cache is load-bearing or an optimisation. | Medium (100–1,000 Examples). Full scan acceptable. Cache is a nice-to-have speedup, not load-bearing. | -| A3 | Backwards compatibility: What is the policy for CLI flags, config keys, and @id format changes? | Best-effort: deprecation warning in one minor release before removal. No hard semver guarantee. | -| A4 | Logging and observability: Beyond --verbose and --json, should beehave support structured logging, log levels, or log files? | Standard Python logging module with log levels (DEBUG/INFO/WARN/ERROR); users can set level via config or flag. | - -### Decisions - -| ID | Question | Answer | -|----|----------|--------| -| D1 | Module structure: How should the beehave package be organized into submodules? | 6 submodules: cli, config, parsing, sync, adapters, nest. Cache merged into parsing (no independent public API). "scaffold" rejected — violates bee branding; renamed to "nest" to match the CLI command. | -| D2 | Adapter contract: Should FrameworkAdapter be a typing.Protocol or an ABC? | typing.Protocol. Zero import coupling for third-party adapters; pyright enforces statically. | -| D3 | @id format and stability: What is the format for @id tags and how are collisions handled? Edited/deleted @id — treat as orphan (warn or error per policy). | 8-char lowercase hex. Once assigned, never replaced unless malformed (empty or non-hex). Generation collisions trigger silent retry. Uniqueness is project-wide. Edited/deleted @id → old stub becomes orphan, subject to on_orphan policy. Duplicate @id found in files → hard error always (beehave cannot determine which stub to bind). | -| D4 | Slug derivation: How is the feature slug derived, and does the stage folder (backlog/in-progress/completed) affect it? | Slug derived from file stem only. Stage folder is ignored. Moving a feature between stage folders has zero effect on test stubs. Feature rename: no feature-level ID exists, so beehave cannot detect renames — old test directory becomes an orphan (warn or error per configured policy); new stubs generated under new slug. | -| D5 | Feature file write policy: What may beehave write to .feature files? Write strategy: surgical insertion or full reserialize? | Only @id tags on untagged Examples. All other content is read-only. Strategy: surgical line insertion (insert @id tag line above the Example: line; never reserialize the whole file). | -| D6 | Error vs. warn policy: Which conditions are always errors and which are configurable? | Always error: malformed pyproject.toml (present but invalid), invalid Gherkin syntax, read-only filesystem, duplicate @id found in any file (hard error — beehave cannot bind stub name). Configurable warn/error (default warn): deleted .feature file, orphan stub. Absent pyproject.toml is NOT an error — use defaults. Duplicate @id is never produced by beehave; if found it was hand-edited — hard error. | - -### ADRs Produced This Session - -| ADR | Question ID | Status | -|-----|-------------|--------| -| ADR-2026-04-22-performance-targets | A2 | Written | -| ADR-2026-04-22-backwards-compatibility | A3 | Written | -| ADR-2026-04-22-logging-observability | A4 | Written | -| ADR-2026-04-22-module-structure | D1 | Written | -| ADR-2026-04-22-adapter-protocol | D2 | Written | -| ADR-2026-04-22-id-stability | D3 | Written | -| ADR-2026-04-22-slug-derivation | D4 | Written | -| ADR-2026-04-22-feature-file-write-policy | D5 | Written | -| ADR-2026-04-22-error-handling-policy | D6 | Written | - -Status: COMPLETE diff --git a/docs/container.md b/docs/container.md deleted file mode 100644 index 7323b95..0000000 --- a/docs/container.md +++ /dev/null @@ -1,44 +0,0 @@ -# C4 — Container Diagram - -> Last updated: 2026-04-22 -> Source: docs/adr/ADR-2026-04-22-module-structure.md, docs/domain-model.md - -```mermaid -C4Container - title Container Diagram — beehave - - Person(developer, "Developer", "") - Person(ci, "CI Pipeline", "") - - System_Boundary(beehave_sys, "beehave") { - Container(cli, "CLI", "Python / fire", "Entry points: nest, sync, status, hatch, version. Composition root — wires all other modules together.") - Container(config, "Config", "Python", "Reads [tool.beehave] from pyproject.toml; applies defaults; merges CLI overrides into BeehaveConfig.") - Container(parsing, "Parsing", "Python / gherkin-official", "Parses .feature files into Feature/Rule/Example graph; assigns @id tags via surgical line insertion; manages incremental cache.") - Container(sync, "Sync", "Python", "Computes SyncPlan from parsed features vs. existing test stubs; executes create/update/move/warn operations.") - Container(adapters, "Adapters", "Python", "FrameworkAdapter Protocol + PytestAdapter. Renders framework-specific stub text (skip marker, parametrize, header).") - Container(nest, "Nest", "Python", "Creates docs/features/{backlog,in-progress,completed}/ and tests/features/ directory structure; injects [tool.beehave] into pyproject.toml.") - } - - System_Ext(feature_files, "Feature Files", ".feature files — source of truth") - System_Ext(test_suite, "Test Suite", "tests/features/**/*_test.py") - System_Ext(pyproject, "pyproject.toml", "Project config") - System_Ext(cache_store, ".beehave_cache/", "Incremental sync cache (JSON)") - - Rel(developer, cli, "runs commands", "CLI / Python API") - Rel(ci, cli, "runs status --json", "CLI") - - Rel(cli, config, "reads config") - Rel(cli, parsing, "triggers parse + id assignment") - Rel(cli, sync, "triggers sync/status") - Rel(cli, nest, "triggers nest command") - Rel(cli, adapters, "selects adapter via config") - - Rel(config, pyproject, "reads [tool.beehave]", "filesystem") - Rel(nest, feature_files, "creates directory structure", "filesystem") - Rel(nest, pyproject, "injects [tool.beehave] snippet", "filesystem") - Rel(parsing, feature_files, "reads + writes @id tags", "filesystem") - Rel(parsing, cache_store, "reads/writes incremental cache", "filesystem") - Rel(sync, test_suite, "creates, updates, moves stubs", "filesystem") - Rel(sync, parsing, "reads parsed Feature graph") - Rel(sync, adapters, "renders stub text") -``` diff --git a/docs/context.md b/docs/context.md deleted file mode 100644 index 71ba305..0000000 --- a/docs/context.md +++ /dev/null @@ -1,26 +0,0 @@ -# C4 — System Context - -> Last updated: 2026-04-22 -> Source: docs/domain-model.md, docs/system.md, docs/arch_journal.md - -```mermaid -C4Context - title System Context — beehave - - Person(developer, "Developer", "Writes Gherkin .feature files and Python tests; runs beehave to keep them in sync") - Person(ci, "CI Pipeline", "Runs beehave status to gate on drift; reads --json exit codes") - Person(framework_author, "Framework Author", "Implements FrameworkAdapter Protocol to support a new test framework") - - System(beehave, "beehave", "Assigns @id tags to Gherkin Examples and generates/updates skipped test stubs so living documentation and test scaffolding stay in sync") - - System_Ext(feature_files, "Feature Files", ".feature files on disk — source of truth for requirements (Gherkin)") - System_Ext(test_suite, "Test Suite", "Python test files under tests/features/ — generated and updated by beehave") - System_Ext(pyproject, "pyproject.toml", "Project configuration; contains [tool.beehave] config block") - - Rel(developer, beehave, "runs sync / status / nest / hatch", "CLI / Python API") - Rel(ci, beehave, "runs status --json", "CLI") - Rel(framework_author, beehave, "implements FrameworkAdapter Protocol", "Python API") - Rel(beehave, feature_files, "reads; writes @id tags only", "filesystem") - Rel(beehave, test_suite, "creates, updates, and warns about stubs", "filesystem") - Rel(beehave, pyproject, "reads [tool.beehave] config", "filesystem") -``` diff --git a/docs/discovery.md b/docs/discovery.md index 489f09f..168b529 100644 --- a/docs/discovery.md +++ b/docs/discovery.md @@ -2,51 +2,31 @@ > Append-only session synthesis log. > Written by the product-owner at the end of each discovery session. -> Each block summarizes one session: what was learned, what entities were suggested, and which features were touched. +> Each block records one session: a summary paragraph and a table of features whose behavior changed. +> A row appears only when a `.feature` file would be updated as a result of the session. +> Confirmations of existing behavior are not recorded here — see `docs/scope_journal.md` for the full Q&A. > Never edit past blocks — later blocks extend or supersede earlier ones. --- -## Session: 2026-04-21 - -### Summary - -Session 1 established the full project scope for `beehave`: a framework-agnostic CLI and Python library that keeps Gherkin `.feature` files in sync with test stubs. The session covered all general questions (users, purpose, success/failure, out-of-scope), all cross-cutting concerns (framework selection, config, output modes, test identity, feature stage mapping), and per-feature Q&A for all 15 planned features. A same-day supplement corrected `deprecation-sync` cascade behavior (absolute, no override in v1) and defined `hatch` demo content (bee-themed, covers Feature/Rule/Example/Scenario Outline). - -### Entities Added or Deprecated - -| Action | Type | Name | Notes | -|--------|------|------|-------| -| Added | Noun | Feature file | Gherkin `.feature` file; single source of truth for requirements | -| Added | Noun | Example | Gherkin `Example:`/`Scenario:` block; unit that receives an `@id` and maps to one test stub | -| Added | Noun | Rule | `Rule:` block in a `.feature` file; maps to one test file in `tests/features//` | -| Added | Noun | @id tag | 8-char lowercase hex tag (`@id:a1b2c3d4`); stable identity linking an Example to its test stub | -| Added | Noun | Test stub | Generated skipped test function `test__` with Gherkin steps as docstring and `...` body | -| Added | Noun | Framework adapter | Pluggable component supplying stub template conventions per test framework | -| Added | Noun | Cache | JSON file at `.beehave_cache/features.json` tracking feature file state for incremental sync | -| Added | Noun | Feature slug | Snake-case identifier derived from a feature file's name; used as directory and function name prefix | -| Added | Noun | Rule slug | Snake-case identifier derived from a `Rule:` block title; used as the test file name | -| Added | Noun | Scenario Outline | Parameterized Gherkin example with a columns table; maps to a parametrized stub | -| Added | Noun | Orphan | A test stub whose `@id` no longer matches any Example in any `.feature` file | -| Added | Verb | nest | Bootstrap the canonical directory structure for a project | -| Added | Verb | sync | Assign IDs and reconcile test stubs with `.feature` files | -| Added | Verb | hatch | Generate demo `.feature` files | -| Added | Verb | assign_ids | Programmatic entry point to assign `@id` tags to untagged Examples | - -### Features Touched - -- `nest` — new feature: bootstraps canonical directory structure and pyproject.toml config injection -- `id-generation` — new feature: assigns stable `@id` tags to untagged or malformed Examples in place -- `status` — new feature: dry-run preview of what sync would change; Unix exit codes for CI -- `cache-management` — new feature: JSON cache for incremental sync, auto-rebuilds if stale or corrupted -- `template-customization` — new feature: user-defined stub templates via flag or config key -- `sync-create` — new feature: generates new skipped test stubs for Examples with no existing test -- `sync-update` — new feature: updates stub docstrings, function names, and deprecated markers on change -- `sync-cleanup` — new feature: warns on orphans, moves misplaced stubs, warns on deleted feature files -- `adapter-contract` — new feature: defines the interface all framework adapters must implement -- `pytest-adapter` — new feature: built-in adapter implementing the contract for pytest -- `parameter-handling` — new feature: parametrized stubs for Scenario Outlines; warns on column changes -- `unittest-adapter` — new feature: PARKED for v2; out of v1 scope -- `hatch` — new feature: generates bee-themed demo `.feature` files covering common Gherkin patterns -- `config-reading` — new feature: reads `[tool.beehave]` from `pyproject.toml` and applies defaults -- `deprecation-sync` — new feature: propagates `@deprecated` tags to stubs; absolute cascade, no override in v1 +## Session 2026-04-21 + +**Summary**: Session 1 established the full project scope for beehave: a framework-agnostic CLI and Python library that keeps Gherkin `.feature` files in sync with test stubs. Covered all general questions (users, purpose, success/failure, out-of-scope), all cross-cutting concerns (framework selection, config, output modes, test identity, feature stage mapping), and per-feature Q&A for all 15 planned features. A same-day supplement corrected `deprecation-sync` cascade behavior (absolute, no override in v1) and defined `hatch` demo content (bee-themed, covers Feature/Rule/Example/Scenario Outline). + +| Feature | Change | Source questions | Reason | +|---------|--------|-----------------|--------| +| `nest` | created | Q8: "What CLI commands exist?" → nest bootstraps dirs; Q8a: "--check mode?" → CI dry-run; per-feature Q&A | New behavior: additive/idempotent directory + pyproject.toml init with --check and --overwrite modes | +| `id-generation` | created | I1: "@id format?" → 8-char hex; I1a: human-assigned IDs valid; I2: "malformed?" → replace in place | New behavior: assigns stable @id tags to untagged/malformed Examples; project-wide uniqueness | +| `status` | created | Q8: "beehave status?" → dry-run preview; C8: "output modes?" → silent/verbose/json; exit 0/1 | New behavior: dry-run of what sync would change; Unix exit codes for CI | +| `cache-management` | created | per-feature Q&A: JSON cache path, auto-rebuild on stale/corrupt, added to .gitignore by nest | New behavior: incremental sync cache at .beehave_cache/features.json | +| `template-customization` | created | C7: "template customization?" → yes, custom folder fully replaces built-in; --template-dir or template_path | New behavior: user-defined stub templates via flag or config key | +| `sync-create` | created | C9: "stub identity?" → test__; SC2: "one file per Rule"; per-feature Q&A | New behavior: generates new skipped test stubs for Examples with no existing test | +| `sync-update` | created | per-feature Q&A: update docstring/name/deprecated marker; never touch body; warn on Outline column change | New behavior: updates stubs on Example change; preserves test body always | +| `sync-cleanup` | created | C5: "deleted feature?" → warn (configurable); per-feature Q&A: orphans, misplaced stubs | New behavior: warns orphans, moves misplaced stubs, warns on deleted feature files | +| `adapter-contract` | created | C1: "framework selection?" → explicit flag, default pytest; Q9: "framework adapters?"; per-feature Q&A | New behavior: defines the FrameworkAdapter interface; v1 = pytest only | +| `pytest-adapter` | created | per-feature Q&A: skip/deprecated/parametrize markers; function prefix test_; return -> None; body ... | New behavior: built-in pytest adapter implementing the contract | +| `parameter-handling` | created | Q7: "Scenario Outlines?" → parametrized stubs; per-feature Q&A: warn-only on column change | New behavior: parametrized stubs for Scenario Outlines; column change = warn only | +| `unittest-adapter` | created | Q7: "unittest in v1?" → PARKED for v2 | Feature registered as out-of-scope for v1; no implementation | +| `hatch` | created | Q8: "beehave hatch?" → demo content; supplement: bee-themed, covers Feature/Rule/Example/Outline | New behavior: generates bee-themed demo .feature files | +| `config-reading` | created | C3: "config location?" → pyproject.toml [tool.beehave]; C1: framework key; C7: template_path key | New behavior: reads [tool.beehave] from pyproject.toml and applies defaults | +| `deprecation-sync` | created | supplement: cascade behavior → absolute, no per-Example override in v1 | New behavior: propagates @deprecated tags absolutely from Feature/Rule to all child Examples | diff --git a/docs/domain-model.md b/docs/domain-model.md deleted file mode 100644 index 8ae9ea4..0000000 --- a/docs/domain-model.md +++ /dev/null @@ -1,141 +0,0 @@ -# Domain Model: beehave - -> Living reference of code-facing domain entities. -> Owned by the system-architect. Created and updated at Step 2. -> The product-owner reads this file to check existing entities during discovery, but never writes to it. -> Append-only: add new entries at the bottom. Deprecate old entries by moving them to the Deprecated section. -> Never edit existing live entries — code depends on them. - ---- - -## Bounded Contexts - -| Context | Responsibility | Key Modules | -|---------|----------------|-------------| -| **Parsing** | Read `.feature` files; extract structure; assign `@id` tags; cache incremental state | `beehave/parsing/` | -| **Sync** | Reconcile parsed feature state with test stub state | `beehave/sync/` | -| **Adapters** | Render framework-specific stub text | `beehave/adapters/` | -| **Config** | Read `pyproject.toml`; merge CLI overrides; apply defaults | `beehave/config/` | -| **CLI** | Entry points: `nest`, `sync`, `status`, `hatch`, `version`; composition root | `beehave/cli/` | -| **Nest** | Create directory structure; inject `[tool.beehave]` into `pyproject.toml` | `beehave/nest/` | - ---- - -## Entities - -| Name | Type | Description | Bounded Context | First Appeared | -|------|------|-------------|-----------------|----------------| -| `FeatureFile` | Entity | A `.feature` file on disk; identified by its path; source of truth for requirements | Parsing | domain-model v1 | -| `Feature` | Value Object | Parsed representation of a Gherkin `Feature:` block; carries title, description, tags, and child Rules | Parsing | domain-model v1 | -| `Rule` | Value Object | A `Rule:` block inside a Feature; groups related Examples; maps to one test file | Parsing | domain-model v1 | -| `Example` | Value Object | A single `Example:` / `Scenario:` block; the atomic unit of acceptance criteria; carries tags, steps, and optional Outline table | Parsing | domain-model v1 | -| `ScenarioOutline` | Value Object | A parameterized Example with an `Examples:` table; extends `Example`; columns become parametrize args | Parsing | domain-model v1 | -| `ExampleId` | Value Object | An 8-char lowercase hex string (`@id:`); stable identity linking an Example to its test stub | Parsing | domain-model v1 | -| `FeatureSlug` | Value Object | Snake-case string derived from a feature file's stem (stage folder ignored); used as test directory name and function name prefix | Parsing | domain-model v1 | -| `RuleSlug` | Value Object | Snake-case string derived from a `Rule:` title; used as the test file name | Parsing | domain-model v1 | -| `GherkinStep` | Value Object | A single Given/When/Then/And/But step line; carries keyword and text | Parsing | domain-model v1 | -| `CacheEntry` | Value Object | Per-file cache record: path, mtime, size, content hash | Parsing | domain-model v1 | -| `FeatureCache` | Entity | The full cache state; persisted as `.beehave_cache/features.json`; tracks all known FeatureFiles | Parsing | domain-model v1 | -| `TestStub` | Entity | A generated Python test function; identified by `@id` embedded in its name; carries docstring, skip marker, body | Sync | domain-model v1 | -| `TestFile` | Entity | A Python test file at `tests/features//_test.py`; contains one or more TestStubs | Sync | domain-model v1 | -| `SyncPlan` | Value Object | Immutable description of all changes sync would make: stubs to create, update, move, warn about | Sync | domain-model v1 | -| `SyncResult` | Value Object | Outcome of executing a SyncPlan; lists created/updated/moved/warned items | Sync | domain-model v1 | -| `Orphan` | Value Object | A TestStub whose `@id` has no matching Example in any FeatureFile | Sync | domain-model v1 | -| `FrameworkAdapter` | Protocol | Interface all adapters must implement; supplies skip marker, deprecated marker, parametrize template, stub header | Adapters | domain-model v1 | -| `PytestAdapter` | Entity | Concrete adapter for pytest; implements `FrameworkAdapter` | Adapters | domain-model v1 | -| `StubTemplate` | Value Object | Rendered text template for a single stub function; produced by a FrameworkAdapter | Adapters | domain-model v1 | -| `BeehaveConfig` | Value Object | Resolved configuration: `framework`, `features_dir`, `template_path`, `log_level`, `on_delete`, `on_orphan`; defaults applied | Config | domain-model v1 | -| `RawConfig` | Value Object | Unvalidated key-value pairs read directly from `[tool.beehave]` in `pyproject.toml` | Config | domain-model v1 | -| `ColumnSet` | Value Object | Ordered set of column names from a Scenario Outline's `Examples:` table | Parsing | domain-model v1 | -| `DemoFeature` | Value Object | Bee-themed demo `.feature` file content generated by `hatch`; covers Feature/Rule/Example/Outline patterns | CLI | domain-model v1 | - ---- - -## Verbs - -| Name | Actor | Object | Description | First Appeared | -|------|-------|--------|-------------|----------------| -| `parse` | Parser | `FeatureFile` → `Feature` | Read a `.feature` file and return its structured representation | Parsing | -| `assign_ids` | IdAssigner | `Feature` → `Feature` | Assign `ExampleId` to any Example lacking a valid one; write back in-place using surgical line insertion | Parsing | -| `generate_id` | IdAssigner | — → `ExampleId` | Generate a unique 8-char lowercase hex id; retry on collision | Parsing | -| `slugify` | — | `str` → `FeatureSlug` / `RuleSlug` | Convert a name to snake_case slug | Parsing | -| `load_cache` | CacheManager | `Path` → `FeatureCache` | Load cache from disk; rebuild silently if missing/stale/corrupt | Parsing | -| `save_cache` | CacheManager | `FeatureCache` → None | Persist cache to `.beehave_cache/features.json` | Parsing | -| `is_stale` | CacheManager | `CacheEntry` × `FeatureFile` → `bool` | Check if a cached entry is out of date | Parsing | -| `plan` | SyncEngine | `[Feature]` × `[TestFile]` → `SyncPlan` | Compute the diff between current feature state and test stub state | Sync | -| `execute` | SyncEngine | `SyncPlan` → `SyncResult` | Apply the plan: create/update/move stubs; emit warnings or errors per policy | Sync | -| `detect_orphans` | SyncEngine | `[TestStub]` × `[ExampleId]` → `[Orphan]` | Find stubs whose `@id` has no matching Example | Sync | -| `detect_misplaced` | SyncEngine | `[TestStub]` × `[Feature]` → `[(TestStub, Path)]` | Find stubs in wrong directory | Sync | -| `propagate_deprecated` | SyncEngine | `Feature` → `Feature` | Apply `@deprecated` cascade: Feature/Rule → all child Examples | Sync | -| `render_stub` | FrameworkAdapter | `Example` → `StubTemplate` | Render a test stub function text for the given Example | Adapters | -| `render_parametrized_stub` | FrameworkAdapter | `ScenarioOutline` → `StubTemplate` | Render a parametrized stub for a Scenario Outline | Adapters | -| `read_config` | ConfigReader | `Path` → `BeehaveConfig` | Read `pyproject.toml`, extract `[tool.beehave]`, apply defaults; absent file returns defaults | Config | -| `merge_cli` | ConfigReader | `BeehaveConfig` × `CLIArgs` → `BeehaveConfig` | Override config values with CLI flag values | Config | -| `nest` | Scaffolder | `BeehaveConfig` → None | Create directory structure and inject `[tool.beehave]` into `pyproject.toml` | Nest | -| `check_nest` | Scaffolder | `BeehaveConfig` → `bool` | Verify structure is complete without modifying anything (--check mode) | Nest | -| `hatch` | DemoGenerator | `Path` → None | Write bee-themed demo `.feature` files; skip if file already exists | CLI | - ---- - -## Relationships - -| Subject | Relation | Object | Cardinality | Notes | -|---------|----------|--------|-------------|-------| -| `FeatureFile` | contains | `Feature` | 1:1 | One Feature per file | -| `Feature` | contains | `Rule` | 1:N | One or more Rules per Feature | -| `Rule` | contains | `Example` | 1:N | One or more Examples per Rule | -| `Example` | has | `ExampleId` | 1:1 | Assigned by `assign_ids`; stable once set | -| `Example` | has | `GherkinStep` | 1:N | Ordered list of steps | -| `ScenarioOutline` | extends | `Example` | 1:1 | Adds `ColumnSet` and rows | -| `ScenarioOutline` | has | `ColumnSet` | 1:1 | Column names for parametrize | -| `FeatureFile` | maps-to | `FeatureSlug` | 1:1 | Derived from file stem; stage-folder-independent | -| `Rule` | maps-to | `RuleSlug` | 1:1 | Derived from Rule title | -| `Rule` | maps-to | `TestFile` | 1:1 | `tests/features//_test.py` | -| `Example` | maps-to | `TestStub` | 1:1 | Identified by `@id` in function name | -| `TestStub` | lives-in | `TestFile` | N:1 | Multiple stubs per file | -| `FrameworkAdapter` | renders | `StubTemplate` | 1:N | One adapter, many stubs | -| `PytestAdapter` | implements | `FrameworkAdapter` | 1:1 | v1 only built-in | -| `BeehaveConfig` | selects | `FrameworkAdapter` | 1:1 | Via `framework` key | -| `BeehaveConfig` | points-to | `StubTemplate` | 0:1 | Via `template_path`; None = use adapter default | -| `SyncPlan` | references | `FeatureFile` | 1:N | Plan covers all changed files | -| `SyncPlan` | references | `TestFile` | 1:N | Plan covers all affected test files | -| `FeatureCache` | contains | `CacheEntry` | 1:N | One entry per known FeatureFile | -| `CacheEntry` | tracks | `FeatureFile` | 1:1 | Path + mtime + hash | -| `Orphan` | wraps | `TestStub` | 1:1 | Orphan is a classification of a stub | -| `Feature` | may-carry | `@deprecated` | 1:0..1 | Cascades to all child Rules and Examples | -| `Rule` | may-carry | `@deprecated` | 1:0..1 | Cascades to all child Examples | -| `Example` | may-carry | `@deprecated` | 1:0..1 | Direct; no override of parent in v1 | - ---- - -## Module Dependency Graph - -``` -CLI ──► Config ──► (pyproject.toml) - │ - ├──► Nest ──► (filesystem) - │ - ├──► Parsing ──► (filesystem / gherkin-official) - │ └──► (cache: .beehave_cache/features.json) - │ - ├──► Sync ──► Parsing - │ └──► Adapters - │ - └──► Adapters ──► (templates) -``` - -**Dependency rules (enforced):** -- `Parsing` has no dependency on `Sync`, `Adapters`, `CLI`, or `Nest` -- `Adapters` has no dependency on `Parsing`, `Sync`, `CLI`, or `Nest` -- `Config` has no dependency on `Parsing`, `Sync`, `Adapters`, or `Nest` -- `Sync` depends on `Parsing` and `Adapters` -- `CLI` depends on all other contexts; it is the composition root -- `Nest` depends only on `Config` - ---- - -## Deprecated - -| Name | Type | Deprecated Date | Replaced By | Reason | -|------|------|-----------------|-------------|--------| -| *(none)* | — | — | — | — | diff --git a/docs/system.md b/docs/system.md index 6342154..9011faa 100644 --- a/docs/system.md +++ b/docs/system.md @@ -2,7 +2,7 @@ > Last updated: 2026-04-22 — architecture design session 1 (no features completed yet) -**Purpose:** beehave keeps Gherkin `.feature` files and Python test stubs in sync — assigning stable `@id` tags to Examples and generating/updating skipped test functions so living documentation and test scaffolding never diverge. +**Purpose:** beehave keeps Gherkin `.feature` files and Python test stubs in sync — assigning stable `@id` tags to Examples and generating/updating skipped test functions so living documentation and test stubs never diverge. --- @@ -82,7 +82,7 @@ beehave is a framework-agnostic CLI and Python library. Developers run `beehave - Config file location is always `pyproject.toml` in the current working directory; absent = use defaults - v1 supports only the `pytest` adapter; `unittest` is parked for v2 - `@id` values are unique project-wide; collision on generation triggers silent retry; duplicate `@id` in files → hard error -- Malformed `@id` tags (empty or non-hex) are replaced in-place using surgical line scan +- Malformed `@id` tags (empty value) are replaced in-place using surgical line scan - Feature file rename is not detectable — old test directory becomes an orphan; developer migrates manually - Scenario Outline column changes produce a warning only — parametrize decorator is never auto-modified - Custom template folder is a full replacement for built-in templates (not a merge) @@ -90,7 +90,199 @@ beehave is a framework-agnostic CLI and Python library. Developers run `beehave --- -## Relevant ADRs +## Domain Model + +### Bounded Contexts + +| Context | Responsibility | Key Modules | +|---------|----------------|-------------| +| **Parsing** | Read `.feature` files; extract structure; assign `@id` tags; cache incremental state | `beehave/parsing/` | +| **Sync** | Reconcile parsed feature state with test stub state | `beehave/sync/` | +| **Adapters** | Render framework-specific stub text | `beehave/adapters/` | +| **Config** | Read `pyproject.toml`; merge CLI overrides; apply defaults | `beehave/config/` | +| **CLI** | Entry points: `nest`, `sync`, `status`, `hatch`, `version`; composition root | `beehave/cli/` | +| **Nest** | Create directory structure; inject `[tool.beehave]` into `pyproject.toml` | `beehave/nest/` | + +### Entities + +| Name | Type | Description | Bounded Context | +|------|------|-------------|-----------------| +| `FeatureFile` | Entity | A `.feature` file on disk; identified by its path; source of truth for requirements | Parsing | +| `Feature` | Value Object | Parsed representation of a Gherkin `Feature:` block; carries title, description, tags, and child Rules | Parsing | +| `Rule` | Value Object | A `Rule:` block inside a Feature; groups related Examples; maps to one test file | Parsing | +| `Example` | Value Object | A single `Example:` / `Scenario:` block; the atomic unit of acceptance criteria; carries tags, steps, and optional Outline table | Parsing | +| `ScenarioOutline` | Value Object | A parameterized Example with an `Examples:` table; extends `Example`; columns become parametrize args | Parsing | +| `ExampleId` | Value Object | An 8-char lowercase hex string (`@id:`); stable identity linking an Example to its test stub | Parsing | +| `FeatureSlug` | Value Object | Snake-case string derived from a feature file's stem (stage folder ignored); used as test directory name and function name prefix | Parsing | +| `RuleSlug` | Value Object | Snake-case string derived from a `Rule:` title; used as the test file name | Parsing | +| `GherkinStep` | Value Object | A single Given/When/Then/And/But step line; carries keyword and text | Parsing | +| `CacheEntry` | Value Object | Per-file cache record: path, mtime, size, content hash | Parsing | +| `FeatureCache` | Entity | The full cache state; persisted as `.beehave_cache/features.json`; tracks all known FeatureFiles | Parsing | +| `TestStub` | Entity | A generated Python test function; identified by `@id` embedded in its name; carries docstring, skip marker, body | Sync | +| `TestFile` | Entity | A Python test file at `tests/features//_test.py`; contains one or more TestStubs | Sync | +| `SyncPlan` | Value Object | Immutable description of all changes sync would make: stubs to create, update, move, warn about | Sync | +| `SyncResult` | Value Object | Outcome of executing a SyncPlan; lists created/updated/moved/warned items | Sync | +| `Orphan` | Value Object | A TestStub whose `@id` has no matching Example in any FeatureFile | Sync | +| `FrameworkAdapter` | Protocol | Interface all adapters must implement; supplies skip marker, deprecated marker, parametrize template, stub header | Adapters | +| `PytestAdapter` | Entity | Concrete adapter for pytest; implements `FrameworkAdapter` | Adapters | +| `StubTemplate` | Value Object | Rendered text template for a single stub function; produced by a FrameworkAdapter | Adapters | +| `BeehaveConfig` | Value Object | Resolved configuration: `framework`, `features_dir`, `template_path`, `log_level`, `on_delete`, `on_orphan`; defaults applied | Config | +| `RawConfig` | Value Object | Unvalidated key-value pairs read directly from `[tool.beehave]` in `pyproject.toml` | Config | +| `ColumnSet` | Value Object | Ordered set of column names from a Scenario Outline's `Examples:` table | Parsing | +| `DemoFeature` | Value Object | Bee-themed demo `.feature` file content generated by `hatch`; covers Feature/Rule/Example/Outline patterns | CLI | + +### Verbs + +| Name | Actor | Object | Description | +|------|-------|--------|-------------| +| `parse` | Parser | `FeatureFile` → `Feature` | Read a `.feature` file and return its structured representation | +| `assign_ids` | IdAssigner | `Feature` → `Feature` | Assign `ExampleId` to any Example lacking a valid one; write back in-place using surgical line insertion | +| `generate_id` | IdAssigner | — → `ExampleId` | Generate a unique 8-char lowercase hex id; retry on collision | +| `slugify` | — | `str` → `FeatureSlug` / `RuleSlug` | Convert a name to snake_case slug | +| `load_cache` | CacheManager | `Path` → `FeatureCache` | Load cache from disk; rebuild silently if missing/stale/corrupt | +| `save_cache` | CacheManager | `FeatureCache` → None | Persist cache to `.beehave_cache/features.json` | +| `is_stale` | CacheManager | `CacheEntry` × `FeatureFile` → `bool` | Check if a cached entry is out of date | +| `plan` | SyncEngine | `[Feature]` × `[TestFile]` → `SyncPlan` | Compute the diff between current feature state and test stub state | +| `execute` | SyncEngine | `SyncPlan` → `SyncResult` | Apply the plan: create/update/move stubs; emit warnings or errors per policy | +| `detect_orphans` | SyncEngine | `[TestStub]` × `[ExampleId]` → `[Orphan]` | Find stubs whose `@id` has no matching Example | +| `detect_misplaced` | SyncEngine | `[TestStub]` × `[Feature]` → `[(TestStub, Path)]` | Find stubs in wrong directory | +| `propagate_deprecated` | SyncEngine | `Feature` → `Feature` | Apply `@deprecated` cascade: Feature/Rule → all child Examples | +| `render_stub` | FrameworkAdapter | `Example` → `StubTemplate` | Render a test stub function text for the given Example | +| `render_parametrized_stub` | FrameworkAdapter | `ScenarioOutline` → `StubTemplate` | Render a parametrized stub for a Scenario Outline | +| `read_config` | ConfigReader | `Path` → `BeehaveConfig` | Read `pyproject.toml`, extract `[tool.beehave]`, apply defaults; absent file returns defaults | +| `merge_cli` | ConfigReader | `BeehaveConfig` × `CLIArgs` → `BeehaveConfig` | Override config values with CLI flag values | +| `nest` | NestRunner | `BeehaveConfig` → None | Create directory structure and inject `[tool.beehave]` into `pyproject.toml` | +| `check_nest` | NestRunner | `BeehaveConfig` → `bool` | Verify structure is complete without modifying anything (--check mode) | +| `hatch` | DemoGenerator | `Path` → None | Write bee-themed demo `.feature` files; skip if file already exists | + +### Relationships + +| Subject | Relation | Object | Cardinality | Notes | +|---------|----------|--------|-------------|-------| +| `FeatureFile` | contains | `Feature` | 1:1 | One Feature per file | +| `Feature` | contains | `Rule` | 1:N | One or more Rules per Feature | +| `Rule` | contains | `Example` | 1:N | One or more Examples per Rule | +| `Example` | has | `ExampleId` | 1:1 | Assigned by `assign_ids`; stable once set | +| `Example` | has | `GherkinStep` | 1:N | Ordered list of steps | +| `ScenarioOutline` | extends | `Example` | 1:1 | Adds `ColumnSet` and rows | +| `ScenarioOutline` | has | `ColumnSet` | 1:1 | Column names for parametrize | +| `FeatureFile` | maps-to | `FeatureSlug` | 1:1 | Derived from file stem; stage-folder-independent | +| `Rule` | maps-to | `RuleSlug` | 1:1 | Derived from Rule title | +| `Rule` | maps-to | `TestFile` | 1:1 | `tests/features//_test.py` | +| `Example` | maps-to | `TestStub` | 1:1 | Identified by `@id` in function name | +| `TestStub` | lives-in | `TestFile` | N:1 | Multiple stubs per file | +| `FrameworkAdapter` | renders | `StubTemplate` | 1:N | One adapter, many stubs | +| `PytestAdapter` | implements | `FrameworkAdapter` | 1:1 | v1 only built-in | +| `BeehaveConfig` | selects | `FrameworkAdapter` | 1:1 | Via `framework` key | +| `BeehaveConfig` | points-to | `StubTemplate` | 0:1 | Via `template_path`; None = use adapter default | +| `SyncPlan` | references | `FeatureFile` | 1:N | Plan covers all changed files | +| `SyncPlan` | references | `TestFile` | 1:N | Plan covers all affected test files | +| `FeatureCache` | contains | `CacheEntry` | 1:N | One entry per known FeatureFile | +| `CacheEntry` | tracks | `FeatureFile` | 1:1 | Path + mtime + hash | +| `Orphan` | wraps | `TestStub` | 1:1 | Orphan is a classification of a stub | +| `Feature` | may-carry | `@deprecated` | 1:0..1 | Cascades to all child Rules and Examples | +| `Rule` | may-carry | `@deprecated` | 1:0..1 | Cascades to all child Examples | +| `Example` | may-carry | `@deprecated` | 1:0..1 | Direct; no override of parent in v1 | + +### Module Dependency Graph + +``` +CLI ──► Config ──► (pyproject.toml) + │ + ├──► Nest ──► (filesystem) + │ + ├──► Parsing ──► (filesystem / gherkin-official) + │ └──► (cache: .beehave_cache/features.json) + │ + ├──► Sync ──► Parsing + │ └──► Adapters + │ + └──► Adapters ──► (templates) +``` + +**Dependency rules (enforced):** +- `Parsing` has no dependency on `Sync`, `Adapters`, `CLI`, or `Nest` +- `Adapters` has no dependency on `Parsing`, `Sync`, `CLI`, or `Nest` +- `Config` has no dependency on `Parsing`, `Sync`, `Adapters`, or `Nest` +- `Sync` depends on `Parsing` and `Adapters` +- `CLI` depends on all other contexts; it is the composition root +- `Nest` depends only on `Config` + +--- + +## Context + +```mermaid +C4Context + title System Context — beehave + + Person(developer, "Developer", "Writes Gherkin .feature files and Python tests; runs beehave to keep them in sync") + Person(ci, "CI Pipeline", "Runs beehave status to gate on drift; reads --json exit codes") + Person(framework_author, "Framework Author", "Implements FrameworkAdapter Protocol to support a new test framework") + + System(beehave, "beehave", "Assigns @id tags to Gherkin Examples and generates/updates skipped test stubs so living documentation and test stubs stay in sync") + + System_Ext(feature_files, "Feature Files", ".feature files on disk — source of truth for requirements (Gherkin)") + System_Ext(test_suite, "Test Suite", "Python test files under tests/features/ — generated and updated by beehave") + System_Ext(pyproject, "pyproject.toml", "Project configuration; contains [tool.beehave] config block") + + Rel(developer, beehave, "runs sync / status / nest / hatch", "CLI / Python API") + Rel(ci, beehave, "runs status --json", "CLI") + Rel(framework_author, beehave, "implements FrameworkAdapter Protocol", "Python API") + Rel(beehave, feature_files, "reads; writes @id tags only", "filesystem") + Rel(beehave, test_suite, "creates, updates, and warns about stubs", "filesystem") + Rel(beehave, pyproject, "reads [tool.beehave] config", "filesystem") +``` + +--- + +## Container + +```mermaid +C4Container + title Container Diagram — beehave + + Person(developer, "Developer", "") + Person(ci, "CI Pipeline", "") + + System_Boundary(beehave_sys, "beehave") { + Container(cli, "CLI", "Python / fire", "Entry points: nest, sync, status, hatch, version. Composition root — wires all other modules together.") + Container(config, "Config", "Python", "Reads [tool.beehave] from pyproject.toml; applies defaults; merges CLI overrides into BeehaveConfig.") + Container(parsing, "Parsing", "Python / gherkin-official", "Parses .feature files into Feature/Rule/Example graph; assigns @id tags via surgical line insertion; manages incremental cache.") + Container(sync, "Sync", "Python", "Computes SyncPlan from parsed features vs. existing test stubs; executes create/update/move/warn operations.") + Container(adapters, "Adapters", "Python", "FrameworkAdapter Protocol + PytestAdapter. Renders framework-specific stub text (skip marker, parametrize, header).") + Container(nest, "Nest", "Python", "Creates docs/features/{backlog,in-progress,completed}/ and tests/features/ directory structure; injects [tool.beehave] into pyproject.toml.") + } + + System_Ext(feature_files, "Feature Files", ".feature files — source of truth") + System_Ext(test_suite, "Test Suite", "tests/features/**/*_test.py") + System_Ext(pyproject, "pyproject.toml", "Project config") + System_Ext(cache_store, ".beehave_cache/", "Incremental sync cache (JSON)") + + Rel(developer, cli, "runs commands", "CLI / Python API") + Rel(ci, cli, "runs status --json", "CLI") + + Rel(cli, config, "reads config") + Rel(cli, parsing, "triggers parse + id assignment") + Rel(cli, sync, "triggers sync/status") + Rel(cli, nest, "triggers nest command") + Rel(cli, adapters, "selects adapter via config") + + Rel(config, pyproject, "reads [tool.beehave]", "filesystem") + Rel(nest, feature_files, "creates directory structure", "filesystem") + Rel(nest, pyproject, "injects [tool.beehave] snippet", "filesystem") + Rel(parsing, feature_files, "reads + writes @id tags", "filesystem") + Rel(parsing, cache_store, "reads/writes incremental cache", "filesystem") + Rel(sync, test_suite, "creates, updates, moves stubs", "filesystem") + Rel(sync, parsing, "reads parsed Feature graph") + Rel(sync, adapters, "renders stub text") +``` + +--- + +## ADRs + +See `docs/adr/` for the full decision record. Each ADR contains a `## Context` section with the Q&A that produced the decision. | ADR | Decision | |-----|----------| From 0dd1689f59460c5d6e97e0813f617c9beb34a597 Mon Sep 17 00:00:00 2001 From: nullhack Date: Wed, 22 Apr 2026 05:25:04 -0400 Subject: [PATCH 7/9] refactor(docs): remove C4 prefix from Context/Container section names; transfer update-docs ownership to SA --- AGENTS.md | 6 +++--- FLOW.md | 2 +- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/AGENTS.md b/AGENTS.md index b1472fd..0263fff 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -49,7 +49,7 @@ All feature work happens on branches. `main` is the single source of truth and r - **Product Owner (PO)** — AI agent. Interviews the stakeholder, writes discovery docs, Gherkin features, and acceptance criteria. Accepts or rejects deliveries. **Sole owner of all `.feature` file moves** (backlog → in-progress before Step 2; in-progress → completed after Step 5 acceptance). - **Stakeholder** — Human. Answers PO's questions, provides domain knowledge, approves PO syntheses to confirm discovery is complete. -- **System Architect (SA)** — AI agent. Designs architecture, writes domain stubs, records decisions in ADRs, and verifies implementation respects those decisions. Owns `docs/system.md` (including domain model, C4 context, and C4 container sections) and `docs/adr/ADR-*.md`. Never edits or moves `.feature` files. Escalates spec gaps to PO. +- **System Architect (SA)** — AI agent. Designs architecture, writes domain stubs, records decisions in ADRs, and verifies implementation respects those decisions. Owns `docs/system.md` (including domain model, Context, and Container sections) and `docs/adr/ADR-*.md`. Never edits or moves `.feature` files. Escalates spec gaps to PO. - **Software Engineer (SE)** — AI agent. Implements everything: test bodies, production code, releases. Owns all `.py` files under the package. Never edits or moves `.feature` files. Escalates spec gaps to PO. If no `.feature` file is in `in-progress/`, stops and escalates to PO. ## Feature File Chain of Responsibility @@ -87,7 +87,7 @@ All feature work happens on branches. `main` is the single source of truth and r | `version-control` | software-engineer | Step 2 (branch creation), Step 5 (merge to main), post-mortem branches | | `create-pr` | system-architect | post-acceptance | | `git-release` | stakeholder | post-acceptance | -| `update-docs` | product-owner | post-acceptance + on stakeholder demand | +| `update-docs` | system-architect | post-acceptance + on stakeholder demand | | `design-colors` | designer | branding, color, WCAG compliance | | `design-assets` | designer | SVG asset creation and updates | | `flow` | all agents | every session — flow protocol, state machine design, FLOW/WORK templates | @@ -164,7 +164,7 @@ docs/ scope_journal.md ← raw Q&A, PO appends after every session discovery.md ← session synthesis changelog (behavioral changes only), PO appends after every session adr/ ← one file per decision: ADR-YYYY-MM-DD-.md, SA creates at Step 2 - system.md ← SA-owned current-state snapshot: domain model + C4 context + C4 container + modules + constraints + ADR index; SA rewrites at Step 2, PO reviews at Step 5 + system.md ← SA-owned current-state snapshot: domain model + Context + Container sections + modules + constraints + ADR index; SA rewrites at Step 2, PO reviews at Step 5 glossary.md ← living glossary, PO updates after each session branding.md ← project identity, colors, release naming, wording (designer owns) assets/ ← logo.svg, banner.svg, and other visual assets (designer owns) diff --git a/FLOW.md b/FLOW.md index aebf05b..8301508 100644 --- a/FLOW.md +++ b/FLOW.md @@ -167,7 +167,7 @@ States are checked **in order**. The first matching condition is the current sta ### [STEP-2-ARCH] **Owner**: `system-architect` **Entry condition**: On `@branch`, no test stubs in `tests/features//` -**Action**: Read feature; design domain stubs; write ADRs; update `system.md` (domain model + C4 sections); run `uv run task test-fast` to generate stubs +**Action**: Read feature; design domain stubs; write ADRs; update `system.md` (domain model + Context + Container sections); run `uv run task test-fast` to generate stubs **Exit**: Stubs generated → update `@state: STEP-3-READY` in `WORK.md` **Failure**: Spec unclear → escalate to `product-owner`; update `@state: STEP-1-DISCOVERY` in `WORK.md` **Commit**: `feat(arch): design @id architecture` From 5d6d460d8032c9e49840fb1f810790154f79a0ba Mon Sep 17 00:00:00 2001 From: nullhack Date: Wed, 22 Apr 2026 05:45:45 -0400 Subject: [PATCH 8/9] fix(consistency): resolve all 22 consistency issues across agents and skills MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Root A (issues 3,9,10,11,16,18,20,21): FLOW.md/WORK.md confusion — all session log, state update, and format references corrected to WORK.md (FLOW.md is static and agent-immutable). Root B (issues 1,6,14,19): domain-model.md phantom references removed from AGENTS.md, run-session, verify, system-architect, implement skills; all references now point to ## Domain Model section of docs/system.md. Root C (issues 2,7,8,12,13,15): ownership errors fixed — [STEP-2-READY] owner changed to software-engineer; [POST-MORTEM] action split between PO and SE; [STEP-5-COMPLETE] detection rule 13 added and entry condition disambiguated from [STEP-5-MERGE]; create-skill/SKILL.md implement step corrected to Step 3; system-architect.md create-skill entry removed; implement/SKILL.md test-coverage -> test. Root D (issues 4,5,17): git-release stages docs/system.md not deleted context.md/container.md; architect/SKILL.md template directory fixed to 'this skill's directory'; FLOW.md Source:/Feature: field references replaced with WORK.md @id/@branch. --- AGENTS.md | 5 ++--- FLOW.md | 19 ++++++++++--------- 2 files changed, 12 insertions(+), 12 deletions(-) diff --git a/AGENTS.md b/AGENTS.md index 0263fff..7229c46 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -11,7 +11,7 @@ Features flow through 5 steps with a WIP limit of 1 feature at a time. The files ``` STEP 1: SCOPE (product-owner) → discovery + Gherkin stories + criteria -STEP 2: ARCH (system-architect) → branch from main; read system.md + glossary.md + in-progress feature + targeted package files; write domain stubs; create/update domain-model.md; significant decisions as docs/adr/ADR-YYYY-MM-DD-.md; system.md rewritten +STEP 2: ARCH (system-architect) → branch from main; read system.md + glossary.md + in-progress feature + targeted package files; write domain stubs; update ## Domain Model section in system.md; significant decisions as docs/adr/ADR-YYYY-MM-DD-.md; system.md rewritten STEP 3: TDD LOOP (software-engineer) → RED → GREEN → REFACTOR, one @id at a time STEP 4: VERIFY (system-architect) → run all commands, review code against architecture STEP 5: ACCEPT (product-owner) → demo, validate, SE merges branch to main with --no-ff, move .feature to completed/ (PO only) @@ -116,8 +116,7 @@ Discovery follows a block structure per session. See `skill define-scope` for th **Key rules**: - PO owns `scope_journal.md`, `discovery.md`, `glossary.md`, and `.feature` files -- PO reads `domain-model.md` but never writes to it — entity suggestions go in `discovery.md` for SA formalization at Step 2 -- `domain-model.md` does not exist as a standalone file; the domain model is the `## Domain Model` section in `docs/system.md` +- PO reads the `## Domain Model` section of `docs/system.md` but never writes to `system.md` — entity suggestions go in `discovery.md` for SA formalization at Step 2 - Real-time split rule: >2 concerns or >8 candidate Examples → split immediately - Completed feature touched and changed → move to `backlog/` diff --git a/FLOW.md b/FLOW.md index 8301508..8b04fc3 100644 --- a/FLOW.md +++ b/FLOW.md @@ -105,8 +105,9 @@ States are checked **in order**. The first matching condition is the current sta 10. All unskipped tests pass, skipped tests remain → **[STEP-3-GREEN]** 11. All tests pass, no skipped tests → **[STEP-4-READY]** 12. Manual state set by SA after Step 4 approval → **[STEP-5-READY]** -13. On main branch, feature still in `in-progress/` → **[STEP-5-MERGE]** -14. Post-mortem file exists for current feature → **[POST-MORTEM]** +13. On main branch, feature still in `in-progress/` AND `WORK.md @state = STEP-5-COMPLETE` → **[STEP-5-COMPLETE]** +14. On feature branch (`feat/` or `fix/`), feature still in `in-progress/` → **[STEP-5-MERGE]** +15. Post-mortem file exists for current feature → **[POST-MORTEM]** --- @@ -157,9 +158,9 @@ States are checked **in order**. The first matching condition is the current sta --- ### [STEP-2-READY] -**Owner**: `system-architect` +**Owner**: `software-engineer` **Entry condition**: Feature has `@id` tags, no `feat/` or `fix/` branch -**Action**: Create branch `feat/` from `main`; set `@branch` in `WORK.md` +**Action**: Load `skill version-control`; create branch `feat/` from `main`; set `@branch` in `WORK.md` **Exit**: Branch created → update `@state: STEP-2-ARCH` in `WORK.md` --- @@ -220,7 +221,7 @@ States are checked **in order**. The first matching condition is the current sta ### [STEP-5-MERGE] **Owner**: `software-engineer` -**Entry condition**: Feature accepted; still on `@branch` +**Entry condition**: Feature accepted; on `feat/` or `fix/` branch; feature still in `in-progress/` **Action**: Merge `@branch` to `main` with `--no-ff`; delete `@branch` **Exit**: Merged → update `@state: STEP-5-COMPLETE` in `WORK.md` @@ -228,17 +229,17 @@ States are checked **in order**. The first matching condition is the current sta ### [STEP-5-COMPLETE] **Owner**: `product-owner` -**Entry condition**: On `main`, feature still in `in-progress/` +**Entry condition**: On `main`; `WORK.md @state = STEP-5-COMPLETE`; feature still in `in-progress/` **Action**: Move feature from `in-progress/` to `completed/` **Exit**: Feature moved → remove item from `WORK.md` active items; return to `[IDLE]` --- ### [POST-MORTEM] -**Owner**: `product-owner` +**Owner**: `product-owner` (post-mortem doc) + `software-engineer` (fix branch) **Entry condition**: Post-mortem file exists for current feature -**Action**: Write post-mortem in `docs/post-mortem/`; create `fix/` branch from original start commit -**Exit**: Post-mortem committed → update `@state: STEP-2-ARCH`, `@branch: fix/` in `WORK.md` +**Action**: PO writes post-mortem in `docs/post-mortem/`; SE loads `skill version-control` and creates `fix/` branch from original start commit; PO updates `WORK.md` +**Exit**: Post-mortem committed, fix branch created → update `@state: STEP-2-ARCH`, `@branch: fix/` in `WORK.md` --- From 2d3d88bd9be0fc7af01ca6f6364597f27b1d4772 Mon Sep 17 00:00:00 2001 From: nullhack Date: Wed, 22 Apr 2026 05:59:30 -0400 Subject: [PATCH 9/9] fix(flow): close loopholes and infinite loops in state machine i2A: STEP-2-ARCH failure exit changed from STEP-1-DISCOVERY to STEP-1-CRITERIA; a baselined feature with a spec gap needs criteria extended, not rediscovered. i3A: detection rule 10 now checks WORK.md @state = STEP-5-READY before the filesystem test-pass check (rule 11); filesystem alone cannot distinguish Step 4 complete from Step 5 ready, so WORK.md takes precedence for late- pipeline states. i4/5: software-engineer.md No-In-Progress and Spec-Gaps sections corrected to reference WORK.md instead of FLOW.md (FLOW.md is agent-immutable). i6B: IDLE exit now lists all 3 conditional initial states depending on feature content (STEP-1-DISCOVERY / STEP-1-CRITERIA / STEP-2-READY), matching the logic in select-feature skill. i7A: STEP-3-READY and STEP-3-GREEN merged into STEP-3-WORKING; STEP-3-RED retained as a mid-cycle sub-state with a note that WORK.md stays STEP-3-WORKING unless the session ends mid-RED. All STEP-3-READY references in agent files updated to STEP-3-WORKING. --- FLOW.md | 71 +++++++++++++++++++++++++++++---------------------------- 1 file changed, 36 insertions(+), 35 deletions(-) diff --git a/FLOW.md b/FLOW.md index 8b04fc3..e5ca1fa 100644 --- a/FLOW.md +++ b/FLOW.md @@ -69,14 +69,13 @@ States are checked **in order**. The first matching condition is the current sta └──────────────────────────────────────────────► [STEP-2-ARCH] │ ▼ - [STEP-3-READY] - │ - ┌────────────────────┤ - ▼ ▼ - [STEP-3-RED] ──► [STEP-3-GREEN] - │ - ▼ - [STEP-4-READY] + [STEP-3-WORKING] + │ + ▼ + [STEP-3-RED] + │ + ▼ + [STEP-4-READY] │ ▼ [STEP-5-READY] @@ -100,14 +99,13 @@ States are checked **in order**. The first matching condition is the current sta 5. Feature has `Rule:` blocks, no `Example:` with `@id` → **[STEP-1-CRITERIA]** 6. Feature has `@id` tags, no `feat/` or `fix/` branch exists → **[STEP-2-READY]** 7. On feature branch, no test stubs in `tests/features//` → **[STEP-2-ARCH]** -8. Test stubs exist, any have `@pytest.mark.skip` → **[STEP-3-READY]** +8. Test stubs exist, any have `@pytest.mark.skip` OR all unskipped tests pass but skipped remain → **[STEP-3-WORKING]** 9. Unskipped test exists that fails → **[STEP-3-RED]** -10. All unskipped tests pass, skipped tests remain → **[STEP-3-GREEN]** +10. `WORK.md @state` is `STEP-5-READY` → **[STEP-5-READY]** *(WORK.md takes precedence over rule 11 — filesystem alone cannot distinguish Step 4 done from Step 5 ready)* 11. All tests pass, no skipped tests → **[STEP-4-READY]** -12. Manual state set by SA after Step 4 approval → **[STEP-5-READY]** -13. On main branch, feature still in `in-progress/` AND `WORK.md @state = STEP-5-COMPLETE` → **[STEP-5-COMPLETE]** -14. On feature branch (`feat/` or `fix/`), feature still in `in-progress/` → **[STEP-5-MERGE]** -15. Post-mortem file exists for current feature → **[POST-MORTEM]** +12. On main branch, feature still in `in-progress/` AND `WORK.md @state = STEP-5-COMPLETE` → **[STEP-5-COMPLETE]** +13. On feature branch (`feat/` or `fix/`), feature still in `in-progress/` → **[STEP-5-MERGE]** +14. Post-mortem file exists for current feature → **[POST-MORTEM]** --- @@ -127,7 +125,10 @@ States are checked **in order**. The first matching condition is the current sta **Owner**: `product-owner` **Entry condition**: No file in `docs/features/in-progress/` AND all BASELINED backlog features already have `@id` tags (or no BASELINED features exist) **Action**: Select next BASELINED feature from `backlog/`; move it to `in-progress/` -**Exit**: Feature moved → create `WORK.md` entry with `@state: STEP-1-DISCOVERY` +**Exit**: Feature moved → create `WORK.md` entry; initial `@state` depends on feature content: +- Feature has no `Rule:` blocks → `@state: STEP-1-DISCOVERY` +- Feature has `Rule:` blocks but no `@id` Examples → `@state: STEP-1-CRITERIA` +- Feature has `@id` Examples → `@state: STEP-2-READY` --- @@ -169,35 +170,32 @@ States are checked **in order**. The first matching condition is the current sta **Owner**: `system-architect` **Entry condition**: On `@branch`, no test stubs in `tests/features//` **Action**: Read feature; design domain stubs; write ADRs; update `system.md` (domain model + Context + Container sections); run `uv run task test-fast` to generate stubs -**Exit**: Stubs generated → update `@state: STEP-3-READY` in `WORK.md` -**Failure**: Spec unclear → escalate to `product-owner`; update `@state: STEP-1-DISCOVERY` in `WORK.md` +**Exit**: Stubs generated → update `@state: STEP-3-WORKING` in `WORK.md` +**Failure**: Spec unclear → escalate to `product-owner`; update `@state: STEP-1-CRITERIA` in `WORK.md`; document the gap in `WORK.md` `Next:` line **Commit**: `feat(arch): design @id architecture` --- -### [STEP-3-READY] +### [STEP-3-WORKING] **Owner**: `software-engineer` -**Entry condition**: Test stubs exist, some have `@pytest.mark.skip` -**Action**: Pick first skipped `@id`; remove skip; write test body -**Exit**: Test written and fails → update `@state: STEP-3-RED` in `WORK.md` +**Entry condition**: Test stubs exist; at least one has `@pytest.mark.skip` OR all unskipped tests pass but skipped remain +**Action**: +1. Pick the next skipped `@id`; remove `@pytest.mark.skip`; write the test body (RED) +2. Write minimal production code until the test passes (GREEN) +3. Refactor if needed (REFACTOR) +4. Repeat from 1 for the next `@id` +**Exit (more @ids)**: Skipped tests still remain → stay in `[STEP-3-WORKING]` +**Exit (all done)**: No skipped tests remain → update `@state: STEP-4-READY` in `WORK.md` +**Commit**: After each `@id` or logical group --- ### [STEP-3-RED] **Owner**: `software-engineer` -**Entry condition**: An unskipped test exists that fails +**Entry condition**: An unskipped test exists that fails (mid-cycle sub-state within STEP-3-WORKING) **Action**: Write minimal production code to pass the failing test -**Exit**: Test passes → update `@state: STEP-3-GREEN` in `WORK.md` - ---- - -### [STEP-3-GREEN] -**Owner**: `software-engineer` -**Entry condition**: All unskipped tests pass; skipped tests remain -**Action**: Refactor if needed; then pick next `@id` -**Exit (more @ids)**: Next @id selected → update `@state: STEP-3-READY` in `WORK.md` -**Exit (all done)**: No skipped tests remain → update `@state: STEP-4-READY` in `WORK.md` -**Commit**: After each `@id` or logical group +**Exit**: Test passes → return to `[STEP-3-WORKING]` +**Note**: This sub-state is detected automatically during the TDD cycle. `WORK.md @state` stays `STEP-3-WORKING` unless the session ends mid-RED; in that case update to `STEP-3-RED` so the next session knows a test is currently failing. --- @@ -206,13 +204,13 @@ States are checked **in order**. The first matching condition is the current sta **Entry condition**: All tests implemented (no `@skip`) and passing **Action**: Run all quality checks; semantic review against acceptance criteria **Exit**: All checks pass → update `@state: STEP-5-READY` in `WORK.md` -**Failure**: Issues found → update `@state: STEP-3-READY` in `WORK.md`; document issues +**Failure**: Issues found → update `@state: STEP-3-WORKING` in `WORK.md`; document issues in `WORK.md` `Next:` line --- ### [STEP-5-READY] **Owner**: `product-owner` -**Entry condition**: Manual state set by SA after Step 4 approval +**Entry condition**: `WORK.md @state = STEP-5-READY` (set by SA after Step 4 approval) **Action**: Demo and validate against acceptance criteria **Exit**: Feature accepted → update `@state: STEP-5-MERGE` in `WORK.md` **Failure**: Not accepted → update `@state: POST-MORTEM` in `WORK.md` @@ -304,6 +302,9 @@ grep -r "@pytest.mark.skip" tests/features/*/ # 8. Check test failures uv run task test-fast 2>&1 | grep -E "FAILED|ERROR" + +# 9. Check WORK.md @state for STEP-5-READY (must evaluate before rule 12 / test-pass check) +grep "@state:" WORK.md | grep -q "STEP-5-READY" ``` ---