microsoft · Alan-Jowett · Jun 3, 2026
diff --git a/manifest.yaml b/manifest.yaml
@@ -639,6 +639,14 @@ protocols:
         cross-model semantic matching, and classifies consensus levels
         to identify fragile prompt language.
 
+    - name: coverage-gap-analysis
+      path: protocols/reasoning/coverage-gap-analysis.md
+      description: >
+        Deterministic protocol for turning code coverage gaps into
+        specification drift candidates. Normalizes uncovered regions,
+        filters incidental code, and classifies missing validation
+        versus undocumented behavior.
+
 formats:
   - name: requirements-doc
     path: formats/requirements-doc.md
@@ -1137,6 +1145,19 @@ templates:
       format: investigation-report
       requires: [requirements-document, validation-plan]
 
+    - name: audit-coverage-gaps
+      path: templates/audit-coverage-gaps.md
+      description: >
+        Audit uncovered code regions against requirements, validation
+        artifacts, and tests. Uses coverage data as a deterministic
+        discovery signal for missing validation and undocumented
+        behavior.
+      persona: specification-analyst
+      protocols: [anti-hallucination, self-verification, operational-constraints, coverage-gap-analysis]
+      taxonomies: [specification-drift]
+      format: investigation-report
+      requires: [requirements-document, validation-plan]
+
     - name: audit-integration-compliance
       path: templates/audit-integration-compliance.md
       description: >

diff --git a/protocols/reasoning/coverage-gap-analysis.md b/protocols/reasoning/coverage-gap-analysis.md
@@ -0,0 +1,204 @@
+<!-- SPDX-License-Identifier: MIT -->
+<!-- Copyright (c) PromptKit Contributors -->
+
+---
+name: coverage-gap-analysis
+type: reasoning
+description: >
+  Deterministic protocol for turning code coverage gaps into specification
+  drift candidates. Normalizes uncovered regions, filters incidental code,
+  traces remaining regions to requirements and validation artifacts, and
+  classifies missing validation versus undocumented behavior.
+applicable_to:
+  - audit-coverage-gaps
+---
+
+# Protocol: Coverage Gap Analysis
+
+Apply this protocol when a coverage report is available and the goal is to
+use uncovered code as a **discovery signal** for specification drift.
+
+Coverage gaps are **candidate generators, not findings by themselves**.
+Covered code is **out of scope** for this protocol and MUST NOT be treated
+as evidence that the behavior is specified or adequately validated.
+
+## Phase 1: Coverage Signal Inventory
+
+Build a reproducible inventory of coverage gaps before tracing them.
+
+1. **Identify the coverage artifact**:
+   - Record the coverage tool or report format if evident.
+   - Record what test scope produced it (unit, integration, mixed) if stated.
+   - Record any stated exclusions, filters, or generated-code suppressions.
+
+2. **Extract uncovered regions**:
+   - Capture every uncovered or partially covered region with file path,
+     line range, and coverage kind (`no hits`, `partial branch`, or
+     equivalent from the report).
+   - If the report only provides function-level or file-level data, keep
+     that granularity. Do NOT invent finer block boundaries.
+
+3. **Normalize regions into reviewable units**:
+   - Merge adjacent uncovered lines only when they are clearly part of the
+     same behavioral unit (same function, branch body, or error path).
+   - Preserve the original report evidence so the normalization can be
+     reproduced.
+
+4. **Create a candidate ledger**:
+   - Assign each normalized region a unique identifier (`CG-001`, `CG-002`, ...).
+   - For each entry record: file path, line range, enclosing symbol,
+     coverage kind, and the raw coverage evidence used to create it.
+
+## Phase 2: Disambiguation Before Drift Classification
+
+Do NOT classify uncovered regions until you determine whether they are
+behaviorally significant.
+
+1. **Exclude clearly non-significant code**:
+   - Logging, metrics, debug strings, tracing hooks, boilerplate
+     serialization, generated code, trivial accessors, and test-only
+     scaffolding are excluded unless the specification explicitly
+     constrains them.
+   - Record excluded regions in the coverage summary with rationale.
+     Do NOT turn them into findings.
+
+2. **Check for inactive or intentionally unreachable paths**:
+   - Feature-flagged code, platform-gated branches, deprecated paths,
+     fault-injection hooks, and known-dead fallback branches may explain
+     missing coverage without implying drift.
+   - If the inactive status is evidenced, exclude the region with the
+     supporting rationale.
+   - If the status is plausible but not evidenced, mark the region as
+     **INCONCLUSIVE** and state what additional context is needed.
+
+3. **Determine behavioral significance**:
+   A region is significant when it affects one or more of:
+   - user-visible behavior
+   - data mutation or persistence
+   - access control or trust boundaries
+   - external communication or side effects
+   - state transitions
+   - error contracts, retry logic, or timeout behavior
+   - resource lifecycle or requirement-bound constraints
+   - synchronization or enforcement of shared-resource access contracts
+
+4. **Only advance significant, in-scope regions**:
+   - Regions that are excluded or inconclusive stop here.
+   - Regions that are significant proceed to specification tracing.
+
+## Phase 3: Specification Trace for Significant Regions
+
+For each significant uncovered region, determine whether it traces to
+documented intent.
+
+1. **Search requirements and design artifacts**:
+   - Look for explicit REQ-ID references, acceptance criteria,
+     domain terminology, and design mechanisms that match the region's behavior.
+   - If no design document is provided, skip design checks and trace
+     directly from requirements to code.
+
+2. **Record positive traceability**:
+   - When a region maps to one or more REQ-IDs, record the governing
+     requirement(s), acceptance criteria, and any relevant design sections.
+
+3. **Handle absent traceability carefully**:
+   - If the region implements genuine product behavior and no requirement
+     or design trace can be found, classify it as a candidate
+     **D9_UNDOCUMENTED_BEHAVIOR**.
+   - If the region appears to be reasonable infrastructure that supports
+     other requirements indirectly, record it as excluded rather than D9.
+
+4. **Handle ambiguous traceability**:
+   - If multiple REQ-IDs are plausible, carry all plausible mappings
+     forward and mark the finding confidence accordingly.
+   - Do NOT invent a new requirement to resolve the ambiguity.
+
+## Phase 4: Validation Trace for Requirement-Linked Regions
+
+For each significant uncovered region that traces to a requirement,
+determine whether the uncovered status reflects missing validation,
+missing tests, or weak assertions.
+
+1. **Check the validation plan**:
+   - Determine whether the linked REQ-ID has one or more TC-NNN entries
+     in the validation plan or traceability matrix.
+   - If no validation entry exists, classify the gap as
+     **D2_UNTESTED_REQUIREMENT** unless the plan explicitly marks the
+     requirement as manual-only or deferred.
+
+2. **Check test implementation**:
+   - If a TC-NNN exists, search the provided test code for the
+     implementing test.
+   - If no implementing test is found, classify the gap as
+     **D11_UNIMPLEMENTED_TEST_CASE**.
+
+3. **Check assertion sufficiency**:
+   - If tests exist, determine whether the uncovered region corresponds
+     to unexercised acceptance criteria, negative paths, boundary cases,
+     ordering constraints, or semantic assertions that the test does not verify.
+   - A required acceptance criterion that no test exercises is
+     **D12_UNTESTED_ACCEPTANCE_CRITERION**.
+   - Incorrect or overly coarse assertions that leave the behavior
+     effectively unverified are **D13_ASSERTION_MISMATCH**.
+
+4. **Respect documented manual-only validation**:
+   - If the validation plan explicitly documents that the behavior is
+     validated manually or deferred outside the automated suite,
+     record that rationale and exclude the region from D2 and D11-D13 findings.
+
+5. **Handle insufficient evidence**:
+   - If the available test context is insufficient to distinguish D12
+     from D13, mark the region **INCONCLUSIVE** and state the missing
+     evidence instead of guessing.
+
+## Phase 5: Classification and Escalation
+
+Turn only the confirmed regions into findings.
+
+1. **Assign exactly one classification from the specification-drift
+   taxonomy** to each confirmed region:
+   - `D2_UNTESTED_REQUIREMENT`
+   - `D9_UNDOCUMENTED_BEHAVIOR`
+   - `D11_UNIMPLEMENTED_TEST_CASE`
+   - `D12_UNTESTED_ACCEPTANCE_CRITERION`
+   - `D13_ASSERTION_MISMATCH`
+
+   If one source location appears to support multiple labels, split it
+   into separate normalized candidate regions only when the evidence
+   supports distinct behavioral units. Do NOT stack multiple drift
+   labels onto one confirmed region.
+
+2. **For each finding provide**:
+   - the coverage region location
+   - the specification location(s), or `None — no matching requirement identified` for D9
+   - the validation and test location(s), or explicit absence
+   - the disambiguation rationale
+   - the impact of leaving the region uncovered
+   - a concrete recommended next action
+
+3. **Recommended next actions**:
+   - D9 findings that appear to describe real behavior with no governing
+     requirement are good candidates for
+     `requirements-from-implementation` or `spec-extraction-workflow`.
+   - D2, D11, D12, and D13 clusters that suggest broader validation drift
+     are good candidates for `audit-traceability` or
+     `audit-test-compliance`.
+
+4. **Do NOT promote excluded or inconclusive regions into findings**.
+
+## Phase 6: Coverage Summary
+
+After individual findings, produce aggregate metrics:
+
+1. **Coverage candidate count**: total normalized regions, excluded regions,
+   inconclusive regions, and classified findings.
+2. **Traceability split**: requirement-linked vs unlinked significant regions.
+3. **Finding distribution**: count by D2, D9, D11, D12, D13.
+4. **Exclusion reasons**: grouped counts for generated code,
+   infrastructure-only code, manual-only validation, inactive paths, and
+   other documented exclusions.
+5. **Overall assessment**: a short judgment of whether the dominant issue
+   appears to be missing validation, undocumented behavior, or mixed drift.
+6. **Scope limitation**: explicitly state that this protocol examined
+   uncovered regions only and did not clear covered code for
+   specification or validation compliance.
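
Review note on `protocols/reasoning/coverage-gap-analysis.md`: the decision flow in Phases 2 through 5 can be read as a short ordered cascade. The sketch below is one possible reading, not part of the change itself. The `Region` record, its field names, and the `classify` helper are hypothetical; only the outcome labels (`D2`, `D9`, `D11`, `D12`, `D13`, excluded, inconclusive) come from the protocol text. It assumes the manual-only or deferred check is applied before the D2 check, and it collapses the D12 versus D13 judgment into a single three-valued field.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Region:
    """Hypothetical evidence record for one normalized candidate (CG-NNN)."""
    region_id: str
    excluded_reason: Optional[str] = None  # Phase 2: evidenced exclusion rationale
    status_unclear: bool = False           # Phase 2: inactive status plausible but not evidenced
    significant: bool = False              # Phase 2: behaviorally significant
    req_ids: Tuple[str, ...] = ()          # Phase 3: linked REQ-IDs, empty if none found
    infrastructure_only: bool = False      # Phase 3: supports other requirements indirectly
    manual_or_deferred: bool = False       # Phase 4: plan documents manual-only or deferred
    has_test_case: bool = False            # Phase 4: a TC-NNN entry exists in the plan
    has_test_impl: bool = False            # Phase 4: an implementing test was found
    # Phase 4: True means a criterion is unexercised, False means the assertions
    # are wrong or too coarse, None means the evidence cannot tell them apart.
    criterion_unexercised: Optional[bool] = None


def classify(r: Region) -> str:
    """Return exactly one outcome per region, checking the phases in order."""
    # Phase 2: excluded and inconclusive regions stop before any tracing.
    if r.excluded_reason:
        return "EXCLUDED"
    if r.status_unclear:
        return "INCONCLUSIVE"
    if not r.significant:
        return "EXCLUDED"
    # Phase 3: no requirement trace means D9, unless it is supporting infrastructure.
    if not r.req_ids:
        return "EXCLUDED" if r.infrastructure_only else "D9_UNDOCUMENTED_BEHAVIOR"
    # Phase 4: documented manual-only or deferred validation is not a finding.
    if r.manual_or_deferred:
        return "EXCLUDED"
    if not r.has_test_case:
        return "D2_UNTESTED_REQUIREMENT"
    if not r.has_test_impl:
        return "D11_UNIMPLEMENTED_TEST_CASE"
    if r.criterion_unexercised is None:
        return "INCONCLUSIVE"
    if r.criterion_unexercised:
        return "D12_UNTESTED_ACCEPTANCE_CRITERION"
    return "D13_ASSERTION_MISMATCH"


if __name__ == "__main__":
    # Hypothetical candidates, for illustration only.
    print(classify(Region("CG-001", significant=True)))
    print(classify(Region("CG-002", significant=True, req_ids=("REQ-1",))))
```

Because the function returns at the first matching rule, a region can never carry two drift labels, which mirrors the "exactly one classification" rule in Phase 5. Regions that seem to deserve two labels would need to be split into separate candidates before classification.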
diff --git a/templates/audit-coverage-gaps.md b/templates/audit-coverage-gaps.md
@@ -0,0 +1,157 @@
+<!-- SPDX-License-Identifier: MIT -->
+<!-- Copyright (c) PromptKit Contributors -->
+
+---
+name: audit-coverage-gaps
+description: >
+  Audit uncovered code regions against requirements, validation artifacts,
+  and tests. Uses coverage data as a deterministic discovery signal for
+  missing validation and undocumented behavior.
+persona: specification-analyst
+protocols:
+  - guardrails/anti-hallucination
+  - guardrails/self-verification
+  - guardrails/operational-constraints
+  - reasoning/coverage-gap-analysis
+taxonomies:
+  - specification-drift
+format: investigation-report
+params:
+  project_name: "Name of the project or feature being audited"
+  coverage_report: "Coverage artifact content or report excerpt showing uncovered or partially covered regions"
+  requirements_doc: "The requirements document content"
+  validation_plan: "The validation plan content"
+  design_doc: "The design document content (optional — omit for a requirements-only audit)"
+  code_context: "Source code to audit — files, modules, or repository path"
+  test_code: "Test source code to inspect for validation coverage"
+  coverage_scope: "Optional narrowing for the coverage signal — e.g., '0-hit regions only', 'include partial branches', 'coverage below 80%'"
+  focus_areas: "Optional narrowing — e.g., 'authentication module', 'retry paths' (default: audit all significant uncovered regions)"
+  audience: "Who will read the audit report — e.g., 'spec owners', 'engineering leads'"
+input_contract:
+  type: validation-plan
+  description: >
+    A validation plan and requirements document, plus a coverage artifact,
+    source code, and test code used to triage uncovered regions against
+    specification intent.
+output_contract:
+  type: investigation-report
+  description: >
+    An investigation report classifying coverage-driven drift findings
+    using the specification-drift taxonomy (D2, D9, D11, D12, D13),
+    with evidence, exclusions, and escalation guidance.
+---
+
+# Task: Audit Coverage Gaps
+
+You are tasked with auditing **uncovered code regions** against the
+requirements, validation plan, and test suite to determine whether low
+coverage signals missing validation or undocumented behavior.
+
+## Inputs
+
+**Project Name**: {{project_name}}
+
+**Coverage Report**:
+{{coverage_report}}
+
+**Requirements Document**:
+{{requirements_doc}}
+
+**Validation Plan**:
+{{validation_plan}}
+
+**Design Document** (if provided):
+{{design_doc}}
+
+**Source Code**:
+{{code_context}}
+
+**Test Code**:
+{{test_code}}
+
+**Coverage Scope**: {{coverage_scope}}
+
+**Focus Areas**: {{focus_areas}}
+
+**Audience**: {{audience}}
+
+## Instructions
+
+1. **Apply the coverage-gap-analysis protocol.** Execute all phases in
+   order. Treat the coverage report as a deterministic source of
+   **candidates**, not as direct proof of drift.
+
+2. **Classify only confirmed findings** using the specification-drift
+   taxonomy. Every reported finding MUST have exactly one of:
+   - `D2_UNTESTED_REQUIREMENT`
+   - `D9_UNDOCUMENTED_BEHAVIOR`
+   - `D11_UNIMPLEMENTED_TEST_CASE`
+   - `D12_UNTESTED_ACCEPTANCE_CRITERION`
+   - `D13_ASSERTION_MISMATCH`
+
+   `D8_UNIMPLEMENTED_REQUIREMENT` is intentionally out of scope for this
+   workflow: this audit starts from uncovered implemented regions in a
+   coverage artifact, so requirements with no implementation at all are
+   better handled by `audit-code-compliance`.
+
+   Excluded regions belong in **Investigation Scope** and inconclusive
+   regions belong in **Open Questions**, not in the findings list.
+
+3. **If the design document is not provided**, skip design-specific
+   tracing. Trace uncovered regions directly from requirements to code.
+   Do NOT fabricate design intent.
+
+4. **If coverage scope or focus areas are specified**, still build the
+   initial candidate ledger from the provided coverage artifact, but
+   restrict detailed tracing and classification to the narrowed scope.
+   Explicitly document which candidate regions were excluded by scope.
+
+5. **Apply the anti-hallucination protocol.** Every finding must cite:
+   - the coverage region location and raw coverage evidence
+   - the requirement or design location, or explicit absence for D9
+   - the validation-plan location, or explicit absence for D2
+   - the test-code location, or explicit absence for D11
+
+   Do NOT invent requirements, tests, branch boundaries, or intended
+   behavior that are not evidenced in the provided artifacts.
+
+6. **Apply the operational-constraints protocol.** Do not attempt to
+   ingest the entire codebase or test suite blindly. Use the coverage
+   artifact to identify candidate regions first, then deep-read only the
+   code and tests needed to disambiguate those regions.
+
+7. **Format the output** according to the investigation-report format.
+   Map this task's work products as follows:
+   - Phase 1 candidate ledger and scoping method -> **Investigation Scope**
+   - Phase 2 disambiguation results -> **Investigation Scope** and
+     **Open Questions** for inconclusive regions
+   - Phases 3-5 classified regions -> **Findings**, one F-NNN per finding
+   - Phase 6 metrics -> **Executive Summary** and a coverage subsection
+     in **Root Cause Analysis**
+   - Escalation paths and next actions -> **Remediation Plan**
+
+8. **State the scope boundary explicitly** in the report:
+   - This audit examined uncovered or partially covered regions only.
+   - Covered code was not evaluated for specification alignment by this task.
+
+9. **Quality checklist** — before finalizing, verify:
+   - [ ] Every finding has exactly one drift label from D2, D9, D11, D12, D13
+   - [ ] Each normalized candidate region maps to at most one finding; split distinct behavioral units instead of stacking labels
+   - [ ] Every finding cites coverage evidence and concrete artifact locations
+   - [ ] Excluded regions are documented with rationale and are not reported as findings
+   - [ ] Inconclusive regions state what evidence is missing
+   - [ ] The report distinguishes missing validation from undocumented behavior
+   - [ ] The report states that covered code remains out of scope
+   - [ ] Coverage metrics are calculated from actual candidate counts
+   - [ ] Escalation recommendations are concrete and aligned to the finding type
+
+## Non-Goals
+
+- Do NOT treat uncovered code as automatically buggy or drifted.
+- Do NOT clear covered code as specified, correct, or adequately validated.
+- Do NOT execute the code or run the coverage tool — this task analyzes
+  the provided coverage artifact and related source material.
+- Do NOT rewrite requirements, tests, or code — report findings and
+  recommended next actions only.
+- Do NOT expand into a full repository maintenance audit unless the
+  findings explicitly warrant escalation.
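
Review note on `templates/audit-coverage-gaps.md`: the quality checklist asks that coverage metrics be calculated from actual candidate counts. A minimal sketch of that tally follows, assuming each ledger entry already has a single outcome. The `coverage_summary` function, its input shape (a mapping from `CG-NNN` identifier to outcome label), and the output key names are hypothetical and not defined by the template; only the outcome labels come from the change.

```python
from collections import Counter
from typing import Dict

# Drift labels that count as findings in this workflow.
DRIFT_LABELS = (
    "D2_UNTESTED_REQUIREMENT",
    "D9_UNDOCUMENTED_BEHAVIOR",
    "D11_UNIMPLEMENTED_TEST_CASE",
    "D12_UNTESTED_ACCEPTANCE_CRITERION",
    "D13_ASSERTION_MISMATCH",
)
NON_FINDINGS = ("EXCLUDED", "INCONCLUSIVE")


def coverage_summary(outcomes: Dict[str, str]) -> dict:
    """Tally Phase 6 counts from a ledger of candidate id -> outcome label."""
    unknown = sorted(set(outcomes.values()) - set(DRIFT_LABELS) - set(NON_FINDINGS))
    if unknown:
        # Reject labels outside the workflow (for example D8) instead of
        # silently dropping them from the totals.
        raise ValueError(f"unexpected outcome labels: {unknown}")
    counts = Counter(outcomes.values())
    distribution = {label: counts[label] for label in DRIFT_LABELS}
    return {
        "total_candidates": len(outcomes),
        "excluded": counts["EXCLUDED"],
        "inconclusive": counts["INCONCLUSIVE"],
        "classified_findings": sum(distribution.values()),
        "finding_distribution": distribution,
    }


if __name__ == "__main__":
    # Hypothetical ledger, for illustration only.
    ledger = {
        "CG-001": "EXCLUDED",
        "CG-002": "D9_UNDOCUMENTED_BEHAVIOR",
        "CG-003": "D2_UNTESTED_REQUIREMENT",
        "CG-004": "INCONCLUSIVE",
        "CG-005": "D2_UNTESTED_REQUIREMENT",
    }
    print(coverage_summary(ledger))
```

Since unknown labels are rejected, the excluded, inconclusive, and classified counts always add up to the total, which gives the report a simple self-check before the Executive Summary is written.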