fix: add falsification discipline to security audit template by Alan-Jowett · Pull Request #243 · microsoft/PromptKit

Alan-Jowett · 2026-04-14T15:28:27Z

Summary

Add adversarial falsification discipline to the security audit workflow to reduce false-positive findings. This PR makes three targeted changes to the investigate-security template and its protocols, plus a matching `manifest.yaml` sync. The changes are motivated by a real-world audit where 2 of 9 findings were false positives due to missed upstream validation.

Problem

During a security audit of a protocol implementation, the audit produced false positives because:

Incomplete provenance tracing: A finding reported an integer underflow at a use site, but an upstream caller function already validated the constraint. The audit analyzed the arithmetic in isolation without tracing back to the validation origin.
Missing API contract analysis: A finding reported a heap overflow via `memcpy` with an unchecked length, but the length came from `ZwQueryValueKey`, which guarantees the output fits within the provided buffer on success. The audit treated an API output as untrusted input.
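The first failure mode can be sketched as follows. This is a hypothetical illustration, not code from the audited project: `parse_payload`, `handle_packet`, and `HEADER_SIZE` are invented names, and 32-bit unsigned arithmetic is simulated with a mask because Python integers do not wrap.

```python
HEADER_SIZE = 8          # hypothetical fixed header length, in bytes
U32_MASK = 0xFFFFFFFF    # simulate 32-bit unsigned wraparound


def parse_payload(total_len: int) -> int:
    """Use site: looks like an integer underflow when read in isolation."""
    # If total_len < HEADER_SIZE this wraps to a huge value, as it would in C.
    return (total_len - HEADER_SIZE) & U32_MASK


def handle_packet(total_len: int) -> int:
    """Caller: the validation origin that a backward trace should find."""
    if total_len < HEADER_SIZE:
        raise ValueError("packet shorter than header")
    return parse_payload(total_len)
```

Read alone, `parse_payload` wraps for any input below `HEADER_SIZE`. If `handle_packet` is the only caller, that path is unreachable, so reporting the subtraction as an underflow would be a false positive.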

Both failures share a root cause: the audit applied falsification rigor asymmetrically — more rigorously to "this looks safe" conclusions (6 correctly falsified) than to "this looks dangerous" conclusions (2 false positives reported).

Changes

| # | Component | Change | Prevents |
|---|-----------|--------|----------|
| 1 | `templates/investigate-security.md` | Add `guardrails/adversarial-falsification` to protocol list | Requires "disprove before reporting" and "verify helpers and callers", which directly catches upstream validation in caller functions |
| 2 | `protocols/analysis/security-vulnerability.md` | Add validation provenance check (Phase 2, step 5): backward tracing from use site to validation origin | Forces checking caller validation, API postconditions, and init-time invariants before reporting |
| 3 | `protocols/guardrails/self-verification.md` | Add symmetric falsification to Sampling Verification (Rule 1) | Breaks confirmation bias by requiring equal falsification rigor for "dangerous" and "safe" conclusions |
| 4 | `manifest.yaml` | Sync protocol list for `investigate-security` | Consistency with template frontmatter |
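One way to picture the validation provenance check in row 2 is as a record that must be filled in before a finding is reported. The sketch below is an assumption for illustration only. The protocol itself is prose, and `ProvenanceCheck`, `Finding`, `reportable`, and the field names are hypothetical, not part of PromptKit.

```python
from dataclasses import dataclass, field

# The three sources the provenance check is described as covering.
REQUIRED_KINDS = ("caller_validation", "api_postcondition", "init_invariant")


@dataclass
class ProvenanceCheck:
    kind: str          # one of REQUIRED_KINDS
    evidence: str      # where the auditor looked, e.g. a file and function name
    neutralizes: bool  # True if this check makes the finding unreachable
    rationale: str     # why the check does or does not neutralize the finding


@dataclass
class Finding:
    title: str
    checks: list[ProvenanceCheck] = field(default_factory=list)


def reportable(finding: Finding) -> bool:
    """Report only when every source was checked, documented, and none neutralizes."""
    by_kind = {c.kind: c for c in finding.checks}
    for kind in REQUIRED_KINDS:
        check = by_kind.get(kind)
        if check is None or not check.evidence or not check.rationale:
            return False  # undocumented check: not ready to report
        if check.neutralizes:
            return False  # upstream guarantee found: candidate is a false positive
    return True
```

Under this sketch, a finding with no recorded checks is held back, and so is one where any check is recorded as neutralizing, such as an API postcondition that bounds the length.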

Design Decisions

Reused adversarial-falsification protocol — it already contains the exact rules needed (Rule 2: "Disprove Before Reporting", Rule 4: "Verify Helpers and Callers"). The template simply wasn't including it.
Added backward tracing to security-vulnerability rather than creating a new protocol — the step is specific to input validation audit (Phase 2) and extends the existing forward-tracing methodology.
Added symmetric falsification to self-verification — this is a cross-cutting guardrail change that benefits all templates, not just security audits. The language is minimal and scoped to the sampling verification step.

Checklist

All files have SPDX license headers
YAML frontmatter is valid and complete
Component names match file names (kebab-case)
manifest.yaml updated with all new components
No vague instructions in protocols or templates
Templates have a quality checklist section
New components do not conflict with existing ones

Three changes to reduce false positive findings in security audits:

1. Add adversarial-falsification protocol to investigate-security template. The template previously lacked the protocol that requires actively disproving each finding before reporting it. This omission allowed findings to be reported without checking for upstream validation in caller functions or API postconditions.

2. Add validation provenance check to security-vulnerability protocol (Phase 2, step 5). The protocol described forward tracing (entry point to use site) but not backward tracing (use site back to validation origin). The new step requires checking caller validation, API postconditions, and initialization-time invariants before reporting any finding.

3. Add symmetric falsification to self-verification protocol (Rule 1). The sampling verification step now requires attempting to disprove reported findings with the same rigor applied when falsifying candidates that were concluded safe, breaking confirmation bias toward reporting.

Motivated by a real audit where 2 of 9 findings were false positives because upstream validation (in a caller function and a kernel API contract) was missed. The audit correctly falsified 6 other candidates as safe, demonstrating the capability existed but was applied asymmetrically.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds adversarial falsification rigor to the security audit workflow to reduce false-positive vulnerability findings by forcing backward validation tracing and symmetric verification standards.

Changes:

Add guardrails/adversarial-falsification to the investigate-security template protocol list.
Extend the security vulnerability analysis protocol with a “validation provenance check” step (backward tracing to upstream validation/contracts).
Strengthen self-verification sampling with a symmetric falsification requirement; sync the template’s protocol list in manifest.yaml.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

| File | Description |
|------|-------------|
| `templates/investigate-security.md` | Adds `guardrails/adversarial-falsification` to the template’s protocol set. |
| `protocols/guardrails/self-verification.md` | Adds symmetric falsification guidance to sampling verification. |
| `protocols/analysis/security-vulnerability.md` | Adds backward-tracing validation provenance check before reporting findings. |
| `manifest.yaml` | Updates `investigate-security` protocol list to include `adversarial-falsification`. |

…equirements

Address review feedback: 'confirmed no upstream validation' and 'answer no with code evidence' ask for proof of nonexistence, which is hard to satisfy consistently. Rephrase to require explicit documentation of what was checked, the evidence source, and why each check does not neutralize the finding.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot AI review requested due to automatic review settings April 14, 2026 15:28

Copilot AI reviewed Apr 14, 2026

Comment thread manifest.yaml

Comment thread protocols/guardrails/self-verification.md Outdated

Comment thread protocols/analysis/security-vulnerability.md Outdated

Alan-Jowett merged commit 522f81a into microsoft:main Apr 14, 2026
2 of 4 checks passed

Alan-Jowett deleted the fix/security-audit-falsification-gaps branch April 14, 2026 15:42

Copilot started reviewing on behalf of Alan-Jowett April 14, 2026 16:30

Alan-Jowett merged 2 commits into microsoft:main from Alan-Jowett:fix/security-audit-falsification-gaps
