feat: write-sink guard for DIFC-compatible output servers #1760

Merged
lpcox merged 4 commits into main from feature/write-sink-guard on Mar 11, 2026

lpcox (Collaborator) commented Mar 10, 2026

Problem

When the GitHub guard is active, the agent acquires integrity tags (e.g., none:github/gh-aw*, unapproved:github/gh-aw*, approved:github/gh-aw*). Output servers like safeoutputs use noop guards, which return empty labels and OperationReadWrite. The DIFC read evaluation then fails because the resource integrity (empty) is lower than the agent's integrity (non-empty).

Solution

Add a write-sink guard that:

  1. Mirrors agent secrecy tags from context onto the resource — prevents secrecy violations for private repos
  2. Leaves resource integrity empty — no integrity requirements for writes
  3. Returns OperationWrite — skips the failing read check entirely
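
The three rules above can be sketched in Go as follows. This is a minimal illustration, not the code in internal/guard/write_sink.go: the `Labels` and `Operation` types, the context key, and the helper names `withAgentTags`, `writeSinkLabels`, and `labelResource` are all assumptions made for this example.

```go
package main

import (
	"context"
	"fmt"
)

// Operation is a hypothetical stand-in for the gateway's operation kinds.
type Operation string

const (
	OperationWrite     Operation = "write"
	OperationReadWrite Operation = "read-write"
)

// Labels is a hypothetical pair of secrecy and integrity tag sets.
type Labels struct {
	Secrecy   []string
	Integrity []string
}

// agentTagsKey is a hypothetical context key for the agent's tag snapshot.
type agentTagsKey struct{}

// withAgentTags stores a snapshot of the agent's labels in the context.
func withAgentTags(ctx context.Context, agent Labels) context.Context {
	return context.WithValue(ctx, agentTagsKey{}, agent)
}

// writeSinkLabels applies the three write-sink rules to an agent snapshot:
// mirror secrecy, leave integrity empty, classify the call as a write.
func writeSinkLabels(agent Labels) (Labels, Operation) {
	resource := Labels{
		Secrecy:   append([]string{}, agent.Secrecy...), // copy, do not alias
		Integrity: []string{},
	}
	return resource, OperationWrite
}

// labelResource reads the agent snapshot from the context. A missing
// snapshot yields empty agent labels and therefore an empty resource.
func labelResource(ctx context.Context) (Labels, Operation) {
	agent, _ := ctx.Value(agentTagsKey{}).(Labels)
	return writeSinkLabels(agent)
}

func main() {
	ctx := withAgentTags(context.Background(), Labels{
		Secrecy:   []string{"private:github/gh-aw*"},
		Integrity: []string{"approved:github/gh-aw*"},
	})
	resource, op := labelResource(ctx)
	fmt.Printf("secrecy=%v integrity=%v op=%s\n", resource.Secrecy, resource.Integrity, op)
}
```

Copying the secrecy slice keeps the resource labels independent of later changes to the agent's tags.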

After DIFC auto-detection, all noop guards are automatically upgraded to write-sink guards via UpgradeNoopToWriteSink().

DIFC Evaluation

| Check | Noop Guard | Write-Sink Guard |
| --- | --- | --- |
| Read integrity | ❌ empty < agent tags | skipped (OperationWrite) |
| Write integrity | ✅ empty → no requirements | ✅ empty → no requirements |
| Write secrecy | ❌ agent tags ⊄ empty | ✅ agent tags = resource tags |
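
The comparison above can be reproduced with a small evaluator. This is a sketch under assumptions: the `evaluate` function, its check order, and the `Labels` type are hypothetical, and only the three checks named in the table are modeled (read secrecy is omitted because the PR does not describe it).

```go
package main

import (
	"errors"
	"fmt"
)

// Operation is a hypothetical stand-in for the gateway's operation kinds.
type Operation string

const (
	OperationWrite     Operation = "write"
	OperationReadWrite Operation = "read-write"
)

// Labels is a hypothetical pair of secrecy and integrity tag sets.
type Labels struct {
	Secrecy   []string
	Integrity []string
}

// subset reports whether every tag in a also appears in b.
func subset(a, b []string) bool {
	set := make(map[string]bool, len(b))
	for _, t := range b {
		set[t] = true
	}
	for _, t := range a {
		if !set[t] {
			return false
		}
	}
	return true
}

// evaluate returns nil when the operation is allowed, or an error naming
// the first failing check from the table.
func evaluate(agent, resource Labels, op Operation) error {
	// Read integrity only applies when the operation includes a read.
	if op == OperationReadWrite && !subset(agent.Integrity, resource.Integrity) {
		return errors.New("read integrity: resource integrity is lower than agent integrity")
	}
	// Write integrity: the agent must hold every tag the resource requires.
	if !subset(resource.Integrity, agent.Integrity) {
		return errors.New("write integrity: agent lacks required tags")
	}
	// Write secrecy: the agent's secrecy tags must be covered by the resource.
	if !subset(agent.Secrecy, resource.Secrecy) {
		return errors.New("write secrecy: agent tags not accepted by resource")
	}
	return nil
}

func main() {
	agent := Labels{
		Secrecy:   []string{"private:github/gh-aw*"},
		Integrity: []string{"approved:github/gh-aw*"},
	}
	// Noop guard: empty labels, read-write operation.
	fmt.Println("noop:", evaluate(agent, Labels{}, OperationReadWrite))
	// Write-sink guard: mirrored secrecy, empty integrity, write operation.
	fmt.Println("write-sink:", evaluate(agent, Labels{Secrecy: agent.Secrecy}, OperationWrite))
}
```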

Changes

  • internal/guard/write_sink.go: WriteSinkGuard implementation
  • internal/guard/write_sink_test.go: 8 tests including e2e DIFC evaluation
  • internal/guard/registry.go: UpgradeNoopToWriteSink() method
  • internal/difc/agent.go: AddIntegrityTags() for merge semantics
  • internal/mcp/connection.go: Public GetAgentTagsSnapshotFromContext
  • internal/server/unified.go: Merge labels instead of replace; auto-upgrade after DIFC detection

Testing

  • 8 new unit tests including noop-fails-but-write-sink-passes comparison
  • All existing tests pass (make agent-finished ✅)

Servers like safeoutputs that receive agent writes but have no
WASM guard cause DIFC violations: the agent acquires integrity
tags from a guarded server (e.g., GitHub), but the noop guard
on the output server returns empty labels and OperationReadWrite,
causing the read evaluation to fail (empty resource integrity <
non-empty agent integrity).

The write-sink guard solves this by:
- Mirroring agent secrecy tags onto the resource (from context)
- Leaving resource integrity empty (no requirements)
- Returning OperationWrite (skip the failing read check)

This ensures writes pass DIFC evaluation:
- Write integrity: resource empty → agent has all 0 required tags ✓
- Write secrecy: resource = agent tags → agent ⊆ resource ✓

Changes:
- Add WriteSinkGuard (internal/guard/write_sink.go)
- Add UpgradeNoopToWriteSink() to Registry
- Add AddIntegrityTags() to agent registry (merge semantics)
- Make GetAgentTagsSnapshotFromContext public in mcp package
- Change ensureGuardInitialized to merge labels (not replace)
- Auto-upgrade noop guards to write-sink after DIFC detection

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 10, 2026 23:19

Copilot AI left a comment

Pull request overview

This PR introduces a DIFC “write-sink” guard intended for write-only output backends (e.g., safeoutputs) so that agents labeled by other guards (like the GitHub WASM guard) can write without triggering DIFC read-integrity violations.

Changes:

  • Add WriteSinkGuard that mirrors agent secrecy tags onto the resource, leaves resource integrity empty, and forces OperationWrite.
  • Auto-upgrade registered noop guards to write-sink guards when DIFC is enabled.
  • Make agent labeling additive (union semantics) and expose a public context helper for reading agent tag snapshots.
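
The additive (union) labeling in the last bullet can be illustrated with a set-backed type. This is a hedged sketch: the `agentLabels` type, its constructor, and the variadic signature of `AddIntegrityTags` are assumptions for this example and may differ from internal/difc/agent.go.

```go
package main

import (
	"fmt"
	"sort"
)

// agentLabels is a hypothetical holder for an agent's integrity tags.
type agentLabels struct {
	integrity map[string]struct{}
}

func newAgentLabels() *agentLabels {
	return &agentLabels{integrity: make(map[string]struct{})}
}

// AddIntegrityTags merges tags into the existing set instead of replacing
// it, so labels contributed by one guard survive labeling by another.
func (a *agentLabels) AddIntegrityTags(tags ...string) {
	for _, t := range tags {
		a.integrity[t] = struct{}{}
	}
}

// IntegrityTags returns the current tags in sorted order.
func (a *agentLabels) IntegrityTags() []string {
	out := make([]string, 0, len(a.integrity))
	for t := range a.integrity {
		out = append(out, t)
	}
	sort.Strings(out)
	return out
}

func main() {
	labels := newAgentLabels()
	labels.AddIntegrityTags("approved:github/gh-aw*")
	// A second guard adds its tags; the first guard's tag is kept.
	labels.AddIntegrityTags("approved:github/gh-aw*", "unapproved:github/gh-aw*")
	fmt.Println(labels.IntegrityTags())
}
```

With replace semantics, a guard that returns empty labels would wipe tags acquired earlier; the union avoids that.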

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| internal/server/unified.go | Auto-upgrades noop guards when DIFC is enabled; changes agent label handling to merge instead of replace. |
| internal/mcp/connection.go | Exposes GetAgentTagsSnapshotFromContext publicly for guard use. |
| internal/guard/write_sink.go | Implements the write-sink guard (write-only classification + secrecy mirroring). |
| internal/guard/registry.go | Adds UpgradeNoopToWriteSink() to swap noop guards to write-sink. |
| internal/difc/agent.go | Adds AddIntegrityTags() to support additive/merge semantics. |
| internal/guard/write_sink_test.go | Adds unit/e2e-style DIFC evaluation tests for write-sink behavior and registry upgrade. |
Comments suppressed due to low confidence (1)

internal/guard/write_sink.go:51

  • LabelAgent sets DIFCMode to "filter", which will override the gateway’s configured enforcement mode for any server upgraded from noop → write-sink (noop previously returned "strict"). If write-sink shouldn’t change enforcement behavior, consider leaving DIFCMode empty (so NewUnified’s configured mode applies) or matching noop’s default (strict).
func (g *WriteSinkGuard) LabelAgent(_ context.Context, _ interface{}, _ BackendCaller, _ *difc.Capabilities) (*LabelAgentResult, error) {
	logWriteSink.Print("LabelAgent: returning empty labels (write-sink does not label agents)")
	return &LabelAgentResult{
		Agent: AgentLabelsPayload{
			Secrecy:   []string{},
			Integrity: []string{},
		},
		DIFCMode: difc.ModeFilter,
	}, nil
}
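
The concern in this review comment can be shown with a small mode-resolution sketch. It assumes, as the comment implies, that a non-empty DIFCMode returned by a guard takes precedence over the configured mode and that an empty one falls back to it; the `effectiveMode` helper and the string constants are hypothetical.

```go
package main

import "fmt"

// Hypothetical mode names mirroring the "strict" and "filter" values
// mentioned in the review comment.
const (
	modeStrict = "strict"
	modeFilter = "filter"
)

// effectiveMode returns the guard's mode when it sets one, otherwise the
// gateway's configured mode.
func effectiveMode(guardMode, configuredMode string) string {
	if guardMode == "" {
		return configuredMode
	}
	return guardMode
}

func main() {
	// A guard that returns "filter" overrides a gateway configured as strict.
	fmt.Println(effectiveMode(modeFilter, modeStrict))
	// A guard that leaves the mode empty keeps the configured behavior.
	fmt.Println(effectiveMode("", modeStrict))
}
```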

lpcox and others added 3 commits March 10, 2026 17:13
Replace automatic noop-to-write-sink upgrade with explicit configuration.
MCP servers are no longer treated as write sinks by default. Instead,
servers must declare a write-sink guard policy:

  "guard-policies": {
    "write-sink": {
      "accept": ["private:github/gh-aw*"]
    }
  }

The accept field lists secrecy labels the sink accepts. When an agent
carries secrecy tags (e.g., from reading a private repo), the write-sink
sets those accept patterns as the resource's secrecy, allowing writes
that would otherwise be blocked by DIFC.
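
A minimal sketch of reading this policy with encoding/json follows. The `WriteSinkPolicy` name comes from the commit message, but its field layout, the `serverConfig` wrapper, and the helpers `parseWriteSinkPolicy` and `resourceSecrecy` are assumptions for illustration; the surrounding server configuration is reduced to the one key shown above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// WriteSinkPolicy holds the secrecy patterns a sink accepts.
// The field layout here is assumed, not taken from the repository.
type WriteSinkPolicy struct {
	Accept []string `json:"accept"`
}

// serverConfig is a hypothetical slice of a server entry containing only
// the guard-policies key.
type serverConfig struct {
	GuardPolicies struct {
		WriteSink *WriteSinkPolicy `json:"write-sink"`
	} `json:"guard-policies"`
}

// parseWriteSinkPolicy returns the declared write-sink policy, or nil when
// the server does not declare one.
func parseWriteSinkPolicy(raw []byte) (*WriteSinkPolicy, error) {
	var cfg serverConfig
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return nil, fmt.Errorf("parse server config: %w", err)
	}
	return cfg.GuardPolicies.WriteSink, nil
}

// resourceSecrecy returns the accept patterns as the resource's secrecy
// labels; a server without a policy gets no secrecy labels.
func resourceSecrecy(p *WriteSinkPolicy) []string {
	if p == nil {
		return nil
	}
	return append([]string{}, p.Accept...)
}

func main() {
	raw := []byte(`{"guard-policies": {"write-sink": {"accept": ["private:github/gh-aw*"]}}}`)
	policy, err := parseWriteSinkPolicy(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(resourceSecrecy(policy))
}
```

Returning nil for a missing policy keeps the opt-in behavior described in the commit: servers are not write sinks unless they declare it.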

Changes:
- Add WriteSinkPolicy to GuardPolicy with accept field
- Update GuardPolicy JSON marshal/unmarshal for write-sink
- Add validation for accept entries (visibility:owner/repo*)
- WriteSinkGuard now stores configured accept patterns
- Remove UpgradeNoopToWriteSink auto-upgrade from registry
- Add resolveWriteSinkPolicy to unified server
- Detect write-sink in parsePolicyMap
- 22 new tests: policy parsing, config loading, DIFC evaluation
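
The accept-entry validation listed above could look roughly like the following. The commit only names the shape visibility:owner/repo*, so the exact rules here (non-empty visibility and owner, a single optional trailing wildcard, no whitespace) and the `validateAccept` helper are assumptions, not the project's actual validator.

```go
package main

import (
	"fmt"
	"strings"
)

// validateAccept checks one accept entry against the assumed shape
// visibility:owner/repo with an optional trailing "*".
func validateAccept(entry string) error {
	visibility, scope, ok := strings.Cut(entry, ":")
	if !ok || visibility == "" {
		return fmt.Errorf("accept entry %q: missing visibility prefix", entry)
	}
	// Only a single trailing wildcard is allowed in this sketch.
	scope = strings.TrimSuffix(scope, "*")
	owner, repo, ok := strings.Cut(scope, "/")
	if !ok || owner == "" {
		return fmt.Errorf("accept entry %q: expected owner/repo after visibility", entry)
	}
	if strings.ContainsAny(owner+repo, "*: \t") {
		return fmt.Errorf("accept entry %q: unexpected character in owner/repo", entry)
	}
	return nil
}

func main() {
	for _, entry := range []string{"private:github/gh-aw*", "github/gh-aw", "private:github"} {
		fmt.Printf("%-24s %v\n", entry, validateAccept(entry))
	}
}
```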

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Prevent accidental commit of WASM guard binaries, which are
downloaded at runtime, not stored in source control.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- README.md: Add write-sink section alongside allow-only in guard-policies
  docs, with JSON/TOML examples and accept pattern format reference.
  Add safeoutputs write-sink example to the comprehensive JSON config.
- AGENTS.md: Update guard package description to list all guard types.
- config.example.toml: Add Example 2 showing write-sink configuration
  for safe-outputs server with accept pattern documentation.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
lpcox merged commit 6d5c2e3 into main on Mar 11, 2026
13 checks passed
lpcox deleted the feature/write-sink-guard branch on Mar 11, 2026 03:15