agy/Gemini consult lane goes off-task on structured reviews: emits no VERDICT, breaking the porch consult loop #1032

## Problem

The Gemini consult lane (Antigravity `agy`, migrated in #778) **runs successfully but does not produce a structured review verdict** when asked for one (`consult -m gemini --type <spec|plan|impl|phase|pr|integration>`). Instead of returning `VERDICT: APPROVE` or `VERDICT: REQUEST_CHANGES` with findings, it **wanders off-task**: it explores the repo and, notably, latches onto the local **`google-antigravity-sdk` skill** and summarizes *that* instead of reviewing the target.

Because no verdict comes back, porch's review loop treats the result as `REQUEST_CHANGES` (see companion porch bug), so **every phase loops**. This is **not** an auth problem: `agy` 1.0.6 authenticates fine (OAuth, exit 0). The lane is *behaviorally* broken for structured reviews.
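The failure mode can be modeled in a few lines. This is a sketch of the behavior described above, not porch's actual code: `extract_verdict`, `run_phase`, and the verdict regex are hypothetical names, and the loop assumes a phase only advances once every lane approves and that each lane repeats the same output on every iteration.

```python
import re

# Hypothetical verdict pattern; the real parser may differ.
VERDICT_RE = re.compile(r"^VERDICT:\s*(APPROVE|REQUEST_CHANGES)\b", re.MULTILINE)


def extract_verdict(output: str) -> str:
    """Return the lane's verdict, defaulting to REQUEST_CHANGES when none
    is present (the defaulting behavior reported in this issue)."""
    match = VERDICT_RE.search(output)
    return match.group(1) if match else "REQUEST_CHANGES"


def run_phase(lane_outputs: dict[str, str], max_iterations: int = 8) -> int:
    """Return how many iterations a phase consumes before it advances
    or hits the iteration cap."""
    for iteration in range(1, max_iterations + 1):
        verdicts = {lane: extract_verdict(out) for lane, out in lane_outputs.items()}
        if all(v == "APPROVE" for v in verdicts.values()):
            return iteration
    return max_iterations


outputs = {
    "codex": "VERDICT: APPROVE\nNo findings.",
    "claude": "VERDICT: APPROVE\nNo findings.",
    "gemini": "## Sandbox state\nSummary of google-antigravity-sdk SKILL.md ...",
}
print(run_phase(outputs))  # the off-task lane keeps the phase looping to the cap
```

Under these assumptions, a single lane that never emits a verdict is enough to run the phase to `max_iterations`, even when the other two lanes approve.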

## Reproductions

1. **Architect smoke test (2026-06-09):** `consult -m gemini --prompt "Reply with exactly: AGY_LANE_OK <year>. Nothing else."` → completed in 47.6s, exit 0, but **ignored the instruction** and produced a multi-section "sandbox state" summary that referenced `google-antigravity-sdk` SKILL.md. Agentic exploration overrode the prompt.
2. **Builder spir-987, phase_1 (2026-06-11/12):** iter1 timed out (skipped, fine); **iter2 ran but went off-task**: it explored the unrelated `google-antigravity-sdk` skill and returned **no VERDICT**. porch defaulted to `REQUEST_CHANGES` and looped (reached iter3 of `max_iterations=8`). Codex and Claude both returned `APPROVE` for the same phase cleanly (build + 5 tests green).
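The smoke test in reproduction 1 can be turned into a pass/fail check on the lane's captured output. This is a minimal sketch: `smoke_reply_ok` is a hypothetical helper, and it assumes `<year>` in the prompt stands for the current calendar year and that the lane's stdout has already been captured as a string.

```python
from datetime import date
from typing import Optional


def smoke_reply_ok(output: str, year: Optional[int] = None) -> bool:
    """Return True only if the lane replied with exactly the requested token.

    Assumes `<year>` means the current calendar year unless one is passed in.
    """
    expected_year = year if year is not None else date.today().year
    return output.strip() == f"AGY_LANE_OK {expected_year}"


# A compliant reply versus the kind of off-task summary seen in the smoke test.
compliant = "AGY_LANE_OK 2026\n"
off_task = "## Sandbox state\n\nFound google-antigravity-sdk SKILL.md ...\n"

print(smoke_reply_ok(compliant, year=2026))
print(smoke_reply_ok(off_task, year=2026))
```

An exact-match check is deliberate here: exit 0 alone did not detect the failure in either reproduction, so the check has to look at the content of the reply.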

## Suspected root cause

`agy`'s agentic/`--sandbox` mode lets Gemini self-direct, and the presence of the `google-antigravity-sdk` skill in context appears to hijack the run: Gemini keeps exploring that skill. The review prompt's "return a VERDICT" instruction is not honored and does not constrain the run.

## Expected

- A `--type` review consult on the gemini lane returns a parseable `VERDICT` (APPROVE / REQUEST_CHANGES + findings), same contract as codex/claude.
- agy should be constrained to the review task (less free-roaming agentic exploration for `--type` reviews), and should not be derailed by unrelated local skills.
- If the lane genuinely can't produce structured verdicts reliably, consider gating gemini out of the default review model set until it can.
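The verdict contract and the gating idea above could be sketched as follows. Everything here is illustrative: `parse_verdict`, `reliable_lanes`, the exact `VERDICT:` line format, and the 0.9 threshold are assumptions, not the existing consult or porch implementation.

```python
import re
from typing import Optional

# Assumed contract: a line consisting of "VERDICT: APPROVE" or
# "VERDICT: REQUEST_CHANGES". The real format may differ.
VERDICT_LINE = re.compile(r"^VERDICT:\s*(APPROVE|REQUEST_CHANGES)\s*$", re.MULTILINE)


def parse_verdict(output: str) -> Optional[str]:
    """Return the verdict, or None when the lane emitted no parseable verdict.

    Returning None keeps "no verdict" distinct from REQUEST_CHANGES."""
    match = VERDICT_LINE.search(output)
    return match.group(1) if match else None


def reliable_lanes(history: dict[str, list[str]], min_rate: float = 0.9) -> list[str]:
    """Keep lanes whose recent outputs contained a parseable verdict at
    least `min_rate` of the time (hypothetical gating threshold)."""
    keep = []
    for lane, outputs in history.items():
        if not outputs:
            continue
        parsed = sum(parse_verdict(out) is not None for out in outputs)
        if parsed / len(outputs) >= min_rate:
            keep.append(lane)
    return keep


history = {
    "codex": ["VERDICT: APPROVE\nNo findings."],
    "claude": ["VERDICT: REQUEST_CHANGES\n- Missing test for edge case."],
    "gemini": ["## Sandbox state\nSummary of google-antigravity-sdk ..."],
}
print(reliable_lanes(history))
```

With the sample history, only the lanes that produced a verdict survive the gate, which mirrors the codex + claude workaround described in this issue.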

## Impact / workaround

Surfaced on #987; worked around by scoping that project's consult models to **codex + claude only**. Left unfixed, this recurs on every phase of every project that uses the default 3-model review set. Related: #778 (the agy migration), and the companion porch no-verdict bug (filed alongside this).
