3 changes: 3 additions & 0 deletions AGENTS.md
@@ -79,6 +79,8 @@ When developers use this template, they get:
- Comprehensive linting with ruff
- Static type checking with pyright
- Property-based testing with Hypothesis
- API documentation with pdoc
- BDD-style test reports with pytest-html-plus

## Template Usage

@@ -117,6 +119,7 @@ cookiecutter gh:your-username/python-project-template --checkout v1.2.20260312
- **v1.2.20260312**: Added meta template management system
- **v1.3.20260313**: Added session-workflow skill
- **v1.4.20260313**: Added AI-driven themed naming
- **v1.5.20260402**: Replaced mkdocs with pdoc for API docs, added pytest-html-plus with BDD docstring display

## Generated Project Features

3 changes: 2 additions & 1 deletion README.md
@@ -115,7 +115,8 @@ task doc-serve # Live documentation server
- Mutation testing with Cosmic Ray

**Documentation & Deployment**
- pdoc for API documentation with search
- pytest-html-plus with BDD docstring display
- Docker containerization
- GitHub Actions CI/CD
- Automated documentation deployment
33 changes: 31 additions & 2 deletions {{cookiecutter.project_slug}}/.opencode/agents/developer.md
@@ -36,17 +36,46 @@ Use `/skill session-workflow` for the complete session start and end protocol.
- **Python Version**: >=3.13

## Project Structure

```
{{cookiecutter.project_slug}}/
├── {{cookiecutter.package_name}}/ # Main package
│ ├── __init__.py
│ └── {{cookiecutter.module_name}}.py # Entry point
├── tests/ # Test suite (mirror source tree)
│ ├── unit/
│ │ ├── __init__.py
│ │ ├── domain/
│ │ │ ├── __init__.py
│ │ │ └── [module]_test.py
│ │ ├── storage/
│ │ │ ├── __init__.py
│ │ │ └── [adapter]_test.py
│ │ └── models_test.py
│ ├── integration/
│ │ ├── __init__.py
│ │ └── storage/
│ │ ├── __init__.py
│ │ ├── factory_test.py
│ │ ├── memory/
│ │ │ └── [repo]_test.py
│ │ └── sqlite/
│ │ └── [repo]_test.py
│ ├── conftest.py
│ └── {{cookiecutter.project_slug}}_test.py # Smoke test
├── docs/ # Documentation
├── pyproject.toml # Project config
├── TODO.md # Session state & development roadmap
└── README.md # Project docs
```

### Test Naming Convention
- Use `*_test.py` suffix (e.g., `models_test.py`, not `test_models.py`)
- Configure in `pyproject.toml`: `python_files = ["*_test.py"]`
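
The configuration above can be sketched as a `pyproject.toml` fragment (the `testpaths` value is an assumption; adjust to the generated project's layout):

```toml
[tool.pytest.ini_options]
python_files = ["*_test.py"]
testpaths = ["tests"]
```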

### Mirror Source Tree Rule
For each source module `{{cookiecutter.package_name}}/<path>/<module>.py`, create a corresponding unit test file `tests/unit/<path>/<module>_test.py` (for example, a source file at `<path>/user.py` is covered by `tests/unit/<path>/user_test.py`).

## Coding Standards
- Follow PEP 8 style guide
- Use Google docstring convention
Expand Down
@@ -174,14 +174,81 @@ def handle_data(processor: DataProcessor) -> None:
```

### 5. Mutation Testing with Cosmic Ray
Mutation testing validates **test quality** — not just coverage — by introducing small code changes (mutations) and verifying that tests catch them. A test suite with high coverage but a poor mutation score means tests are exercising code without actually verifying behavior.
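As a toy illustration of the idea, here is a hypothetical `require_positive` function alongside the same function with a boundary mutation applied; a test that only exercises a clearly-positive input passes against both, so the mutant survives despite full line coverage:

```python
def require_positive(value: float) -> float:
    """Return value unchanged; raise if it is not strictly positive."""
    if value <= 0:  # a mutant might weaken this boundary to `value < 0`
        raise ValueError("value must be > 0")
    return value


def mutant_require_positive(value: float) -> float:
    """The same function with the boundary mutation applied."""
    if value < 0:
        raise ValueError("value must be > 0")
    return value


def raises_value_error(fn, value: float) -> bool:
    """True if fn(value) raises ValueError."""
    try:
        fn(value)
        return False
    except ValueError:
        return True


# A positive-input test passes for BOTH versions, so the mutant survives:
assert require_positive(1.0) == 1.0
assert mutant_require_positive(1.0) == 1.0

# A boundary test at zero kills the mutant: the original raises,
# the mutant silently returns 0.0.
assert raises_value_error(require_positive, 0.0)
assert not raises_value_error(mutant_require_positive, 0.0)
```

Only the zero-boundary assertion distinguishes the two versions, which is exactly the kind of test a surviving mutant tells you to write.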

#### When to run cosmic-ray
- **After GREEN phase**: when all tests pass and coverage >= minimum
- **Before merging PRs** for core domain logic (models, services, value objects)
- **Not needed for**: storage adapters, web routers, CLI glue code — focus on the domain

#### Running cosmic-ray

```bash
# Full mutation report (slow — run once per feature, not per commit)
uv run task mut-report
# Generates: docs/mut_report.html

# Step-by-step run (init creates the mutation session, exec applies mutants)
uv run cosmic-ray init cosmic-ray.toml session.sqlite
uv run cosmic-ray exec cosmic-ray.toml session.sqlite
uv run cr-html session.sqlite > docs/mut_report.html
```

#### Configuration (`cosmic-ray.toml`)

```toml
[cosmic-ray]
module-path = "{{cookiecutter.package_name}}"
timeout = 10.0
excluded-modules = [] # Add web/adapter modules to skip
test-command = "uv run pytest tests/unit/ -x -q"

[cosmic-ray.distributor]
name = "local"
```

#### Interpreting results

| Metric | Target | Action if below |
|--------|--------|----------------|
| Mutation score | >= 80% | Add property-based tests to kill surviving mutants |
| Survived mutants | < 20% | Inspect the survivors listed by `cr-report` |

```bash
# List mutation outcomes, including surviving mutants (what tests aren't catching)
uv run cr-report session.sqlite
```

#### Fixing surviving mutants

A surviving mutant means a real bug could go undetected. For each survivor:

1. Read the mutant diff — what change survived?
2. Write a test that would **fail** with that mutation applied
3. Confirm the new test kills the mutant by re-running

```python
# Surviving mutant example:
# Original: if value > 0:
# Mutant: if value >= 0: ← survived — no test catches zero boundary

# Fix: add a boundary test
@given(st.floats(max_value=0.0, allow_nan=False, allow_infinity=False))
def test_when_value_is_zero_or_negative_should_raise_invalid_value_error(value):
with pytest.raises(InvalidValueError):
MyModel(value=value)
```

#### Prioritization — where mutation testing pays off most

| Code type | Run cosmic-ray? | Reason |
|-----------|-----------------|--------|
| Domain models | Yes — always | Core invariants must be airtight |
| Value objects | Yes — always | Validation logic is critical |
| Domain services | Yes | Business rules live here |
| Repository ports/interfaces | No | No logic to mutate |
| Storage adapters | No | Covered by integration tests |
| Web routers | No | Thin delegation layer |

### 6. Quality Gates and Automation

@@ -83,8 +83,21 @@ def real_api_response():
```

### Property-Based Testing with Hypothesis

#### When to use Hypothesis vs plain TDD

| Use plain TDD | Use Hypothesis |
|--------------|----------------|
| Side effects (DB, files, network) | Pure functions |
| Behavioral contracts ("when closed, ceases to exist") | Invariants over all valid inputs |
| Specific error messages | Round-trip properties |
| Integration between components | Algorithms, parsers, serializers |

**NEVER** use Hypothesis for side-effectful code — it is inefficient and produces flaky tests.

#### Basic property test
```python
from hypothesis import given, settings, strategies as st

@given(st.emails())
def test_when_any_valid_email_provided_should_generate_valid_jwt(email):
    # ... (lines collapsed in the diff view) ...
assert decoded["email"] == email
```

#### Settings profiles — match intensity to phase
```python
# Fast feedback during RED/GREEN cycle
@settings(max_examples=25, deadline=500)
@given(st.text(min_size=1))
def test_when_any_input_property_holds(value): ...

# Thorough check for CI / QA phase
@settings(max_examples=200, deadline=2000)
@given(st.text(min_size=1))
def test_when_any_input_property_holds_ci(value): ...
```
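
Instead of duplicating each test per phase, Hypothesis's profile mechanism lets one test run at different intensities; a minimal sketch (profile names are assumptions):

```python
from hypothesis import settings

# Register once, e.g. in tests/conftest.py
settings.register_profile("dev", max_examples=25, deadline=500)
settings.register_profile("ci", max_examples=200, deadline=2000)

# Select at runtime: call load_profile() here, or pass
# `pytest --hypothesis-profile=ci` on the command line.
settings.load_profile("dev")
```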

#### Composite strategies — build domain-valid objects
```python
from hypothesis import strategies as st

@st.composite
def valid_user_data(draw):
"""Generate valid user data that satisfies domain rules."""
return {
"email": draw(st.emails()),
"name": draw(st.text(min_size=1, max_size=100,
alphabet=st.characters(blacklist_characters="\n\r\t"))),
"age": draw(st.integers(min_value=13, max_value=120)),
}

@given(valid_user_data())
def test_when_valid_user_data_provided_should_always_create_successfully(data):
user = User.create(**data)
assert user.email == data["email"]
```

#### Round-trip invariant — classic Hypothesis use case
```python
@given(st.builds(MyModel, name=st.text(min_size=1)))
def test_when_model_serialized_and_deserialized_should_be_equal(model):
assert MyModel.from_dict(model.to_dict()) == model
```

#### Stateful testing — for state machines with interleaved operations
```python
from hypothesis.stateful import RuleBasedStateMachine, rule, initialize, invariant

class ConnectionMachine(RuleBasedStateMachine):
"""Explores all reachable connection state transitions."""

@initialize()
def setup(self):
self.conn = Connection.open()

@rule()
def close(self):
self.conn.close()

@invariant()
def closed_connection_is_inactive(self):
if self.conn.is_closed():
assert not self.conn.is_active()

TestConnectionLifecycle = ConnectionMachine.TestCase
```

### Test Fixtures and Factories
```python
@pytest.fixture