Consult: Gemini (agy/Antigravity) reviewer runs in empty sandbox + can execute real commands in the worktree (ran porch done, wrote files) #1051

**Severity: safety.** A consult *reviewer* lane must be read-only and advisory. The Gemini (`agy`/Antigravity) adapter's agent was observed executing real, mutating commands inside a builder worktree, including a porch state transition (`porch done <project>`) and a file write. This issue covers two distinct failures; the second is the serious one.

## (1) Recurring — empty sandbox, no real review

The Gemini lane repeatedly launches in an **empty temporary sandbox directory with no repo/diff context**, so it never sees the code under review. Its transcript is the agent flailing in that empty dir — *"I will list the contents of the workspace directory…"*, inspecting `antigravity-cli` folders, env vars, and permissions — instead of reviewing a diff. **No `VERDICT:` line is produced.**

This has recurred on nearly every consult run across a multi-PR session. The Claude and Codex lanes receive the worktree diff correctly; the Gemini lane does not.
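A preflight check in the adapter could turn this silent failure into a loud one. The following is a minimal sketch, not the adapter's actual code: `check_review_context`, `MissingReviewContext`, and both parameters are hypothetical names, and it assumes the adapter knows the launch directory and the diff text before it starts the agent.

```python
from pathlib import Path


class MissingReviewContext(RuntimeError):
    """Raised when a reviewer lane would start with nothing to review."""


def check_review_context(workdir: Path, diff_text: str) -> None:
    """Fail fast if the lane has no diff or an empty working directory.

    Both parameters are assumptions for illustration: `workdir` is wherever
    the adapter launches the agent, `diff_text` is the diff under review.
    """
    if not diff_text.strip():
        raise MissingReviewContext("no diff was passed to the reviewer lane")
    if not workdir.is_dir() or not any(workdir.iterdir()):
        raise MissingReviewContext(f"working directory is empty: {workdir}")
```

Calling this before launch would surface the empty-sandbox case as an explicit adapter error rather than a transcript with no verdict.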

Knock-on effect: when a lane produces no verdict, **porch defaults the result to `REQUEST_CHANGES`** (see proposed fix 3), so Gemini contributes a *false blocking verdict* on essentially every PR. Reviewers learn to dismiss it as "just the gemini tooling failure", which is dangerous: it trains people to ignore a `REQUEST_CHANGES`.
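To make the distinction concrete, here is a rough sketch of a parser that reports a missing verdict as its own outcome instead of a blocking one. It is not porch's real parser: `LaneResult`, `parse_lane_result`, the `APPROVE` value, and the exact `VERDICT: <VALUE>` line format are assumptions. Only `REQUEST_CHANGES` and `ERRORED` are named in this issue.

```python
import re
from enum import Enum


class LaneResult(Enum):
    APPROVE = "APPROVE"                  # assumed verdict value
    REQUEST_CHANGES = "REQUEST_CHANGES"  # named in this issue
    ERRORED = "ERRORED"                  # proposed: lane produced no usable verdict


# Matches a line such as "VERDICT: REQUEST_CHANGES" (line format is assumed).
_VERDICT_RE = re.compile(r"^\s*VERDICT:\s*([A-Z_]+)\s*$", re.MULTILINE)


def parse_lane_result(transcript: str) -> LaneResult:
    """Return the last verdict in the transcript, or ERRORED if there is none."""
    matches = _VERDICT_RE.findall(transcript)
    if not matches:
        # No verdict line at all: a tooling failure, not a review outcome.
        return LaneResult.ERRORED
    try:
        return LaneResult(matches[-1])
    except ValueError:
        # A verdict line with an unrecognised value is also not a real review.
        return LaneResult.ERRORED
```

With this shape, the empty-sandbox transcript maps to `ERRORED`, and only a lane that actually wrote `VERDICT: REQUEST_CHANGES` can block.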

## (2) NEW — the reviewer agent executed real commands in the worktree

On a re-run, the Gemini agent **ran real commands inside the builder worktree**:
- **`porch done <project>`** — which advanced porch protocol state and fired a gate.
- **Wrote a spurious file** into the project's `codev/projects/<project>/` directory.

In the observed instance the end state happened to be correct (the review *was* complete, so the gate landed where it belonged). **But that was luck, not safety.** A reviewer agent with shell access to `porch`/`git`/the filesystem can:
- Advance a protocol **past a gate before review is actually complete**.
- Mutate, delete, or commit code in the worktree it's supposed to passively review.
- Pollute the project's audit trail with stray files.
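Until the lane is properly sandboxed, filesystem mutations like these could at least be detected by snapshotting the worktree before and after a reviewer run. The sketch below is one possible approach under stated assumptions: `snapshot` and `changed_paths` are hypothetical helpers, and they only see files under the given root, so they would not notice porch state kept elsewhere. This is detection, not prevention.

```python
import hashlib
from pathlib import Path


def snapshot(root: Path) -> dict[str, str]:
    """Map each file path (relative to root, POSIX style) to a SHA-256 digest."""
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def changed_paths(before: dict[str, str], after: dict[str, str]) -> list[str]:
    """Return paths added, removed, or modified between two snapshots."""
    all_paths = before.keys() | after.keys()
    return sorted(p for p in all_paths if before.get(p) != after.get(p))
```

A consult runner could take `snapshot(worktree)` before launching the reviewer and again afterwards, then fail the lane if `changed_paths` is non-empty.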

## Why this matters

The consult trust model is that reviewer lanes **read + emit a verdict** — they are advisory, not actors. A reviewer that can execute `porch done` or write files is a privilege-escalation-shaped bug: it can corrupt orchestration state and the worktree it's reviewing.

## Proposed fixes

1. **Sandbox consult reviewer agents read-only.** The Gemini/Antigravity (`agy`) reviewer must run with no ability to execute `porch`, `git`, or filesystem-write commands. Its only output is a verdict + comments. Audit how the `agy` adapter grants tool/command access — it evidently runs with a shell that has real `porch` + write access to the worktree.
2. **Fix the empty-sandbox context plumbing.** The adapter must receive the builder worktree's diff/context the same way the Claude and Codex lanes do. Until it does, it structurally cannot review.
3. **Porch: distinguish "absent verdict / tooling failure" from `REQUEST_CHANGES`.** A lane that produces no `VERDICT:` line should be recorded as `SKIPPED`/`ERRORED`, not defaulted to a blocking `REQUEST_CHANGES`. The current default trains operators to ignore real REQUEST_CHANGES verdicts.
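For fix 1, a default-deny command gate is one layer that could sit in front of whatever shell the reviewer agent uses. This is only a sketch: whether the `agy` adapter exposes a hook where such a check can run is not known, and `READ_ONLY_COMMANDS` and `is_allowed` are hypothetical names with an illustrative allowlist. A filter like this is likely best treated as defence in depth alongside OS-level read-only access, not as a replacement for it.

```python
import shlex

# Illustrative allowlist of read-only programs (an assumption, not a vetted set).
# `porch` and `git` are deliberately absent, so they are denied by default.
READ_ONLY_COMMANDS = {"cat", "ls", "grep", "head", "wc"}

# Characters that enable redirection, chaining, or substitution in a shell.
SHELL_METACHARACTERS = ";&|><`$"


def is_allowed(command_line: str) -> bool:
    """Return True only for a single allowlisted command with no shell tricks."""
    if any(ch in command_line for ch in SHELL_METACHARACTERS):
        return False
    try:
        argv = shlex.split(command_line)
    except ValueError:
        # Unbalanced quotes or similar: refuse rather than guess.
        return False
    if not argv:
        return False
    return argv[0] in READ_ONLY_COMMANDS
```

Under this gate, `porch done <project>` is rejected twice over: `porch` is not on the allowlist, and the angle brackets count as metacharacters.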

## Evidence

- The Gemini lane's review-output file for an affected run is the empty-sandbox flailing transcript (lists workspace dir, inspects `antigravity-cli`, checks env/permissions — never references the actual diff).
- Builder report from an affected run: the agent *"went rogue and executed real commands in the worktree (ran `porch done …` and wrote a spurious rebuttals file, since removed); porch state verified consistent afterward."*
- Pattern: empty-sandbox / no-verdict on the Gemini lane across ~7 consecutive PRs in one session.

## Areas

Primary: `area/consult` (the `agy`/Antigravity reviewer adapter — read-only sandboxing + empty-sandbox context). Secondary: `area/porch` (the absent-verdict → REQUEST_CHANGES default in the verdict parser).
