agy/Gemini consult lane goes off-task on structured reviews — emits no VERDICT, breaking the porch consult loop #1032

@waleedkadous

Problem

The Gemini consult lane (Antigravity agy, migrated in #778) runs successfully but does not produce a structured review verdict when asked for one (consult -m gemini --type <spec|plan|impl|phase|pr|integration>). Instead of returning VERDICT: APPROVE/REQUEST_CHANGES with findings, it wanders off-task: it explores the repo and, notably, latches onto the local google-antigravity-sdk skill and summarizes that instead of reviewing the target.

Because no verdict comes back, porch's review loop treats the missing verdict as REQUEST_CHANGES (see companion porch bug) and loops on every phase. This is not an auth problem: agy 1.0.6 authenticates fine (OAuth, exit 0). The lane is behaviorally broken for structured reviews.

Reproductions

  1. Architect smoke test (2026-06-09): consult -m gemini --prompt "Reply with exactly: AGY_LANE_OK <year>. Nothing else." completed in 47.6s with exit 0, but ignored the instruction and produced a multi-section "sandbox state" summary that referenced google-antigravity-sdk SKILL.md. Agentic exploration overrode the prompt.
  2. Builder spir-987, phase_1 (2026-06-11/12): iter1 timed out (skipped, fine); iter2 ran but went off-task, explored the unrelated google-antigravity-sdk skill, and returned no VERDICT. porch defaulted to REQUEST_CHANGES and looped (reached iter3 of max_iterations=8). Codex and Claude both returned APPROVE for the same phase cleanly (build + 5 tests green).
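The smoke test in reproduction 1 can be checked mechanically rather than by eye. The following is a minimal sketch, assuming the lane's raw output is already captured as a string; the helper name smoke_reply_ok is hypothetical and not part of consult or agy.

```python
import re
from typing import Optional

# Hypothetical helper (not part of consult/agy): accepts only the exact reply
# the smoke-test prompt asked for, "AGY_LANE_OK <year>", and nothing else.
SMOKE_RE = re.compile(r"AGY_LANE_OK (\d{4})")


def smoke_reply_ok(output: str, year: Optional[int] = None) -> bool:
    """Return True if the output is exactly 'AGY_LANE_OK <4-digit year>'.

    Surrounding whitespace is ignored. If `year` is given, the reported
    year must match it as well.
    """
    match = SMOKE_RE.fullmatch(output.strip())
    if match is None:
        return False
    return year is None or int(match.group(1)) == year
```

Under this check, the multi-section "sandbox state" summary from the 2026-06-09 run would fail even though the process exited 0.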

Suspected root cause

agy's agentic/--sandbox mode lets Gemini self-direct, and the presence of the google-antigravity-sdk skill in context appears to hijack the run (it keeps exploring that skill). The review prompt's "return a VERDICT" instruction is not honored and does not constrain the run.

Expected

  • A --type review consult on the gemini lane returns a parseable VERDICT (APPROVE / REQUEST_CHANGES + findings), same contract as codex/claude.
  • agy should be constrained to the review task (less free-roaming agentic exploration for --type reviews), and should not be derailed by unrelated local skills.
  • If the lane genuinely can't produce structured verdicts reliably, consider gating gemini out of the default review model set until it can.
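To make the "parseable VERDICT" contract concrete, here is a minimal sketch of a parser that reports a missing verdict explicitly instead of folding it into REQUEST_CHANGES. The exact line format (VERDICT: followed by APPROVE or REQUEST_CHANGES at the start of a line) is an assumption based on the wording above, and parse_verdict is a hypothetical name, not porch's actual parser.

```python
import re
from typing import Optional

# Assumed format: a line beginning with "VERDICT:" followed by one of the
# two values named in this issue. The real contract may differ.
VERDICT_RE = re.compile(
    r"^\s*VERDICT:\s*(APPROVE|REQUEST_CHANGES)\b", re.MULTILINE
)


def parse_verdict(output: str) -> Optional[str]:
    """Return 'APPROVE' or 'REQUEST_CHANGES', or None if no verdict line exists.

    If several verdict lines appear, the last one wins, on the assumption
    that a reviewer states its final verdict at the end.
    """
    matches = VERDICT_RE.findall(output)
    return matches[-1] if matches else None
```

Returning None for the off-task output lets a caller tell "the reviewer asked for changes" apart from "the reviewer never answered", which is the distinction the looping behavior currently loses.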

Impact / workaround

Surfaced on #987; worked around by scoping that project's consult models to codex + claude only. Left unfixed, this recurs on every phase of every project that uses the default 3-model review set. Related: #778 (the agy migration), and the companion porch no-verdict bug (filed alongside this).
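The workaround, and the gating option under Expected, both amount to filtering the review model set. A minimal sketch of that idea follows, assuming the default set is gemini, codex, and claude; the names DEFAULT_REVIEW_MODELS and review_models are hypothetical, and the real consult configuration format is not shown in this issue.

```python
from typing import FrozenSet, List, Sequence

# Assumed default 3-model review set; the ordering is illustrative only.
DEFAULT_REVIEW_MODELS = ("gemini", "codex", "claude")


def review_models(
    disabled: FrozenSet[str] = frozenset({"gemini"}),
    defaults: Sequence[str] = DEFAULT_REVIEW_MODELS,
) -> List[str]:
    """Return the review lanes to run, skipping any that are gated out."""
    return [model for model in defaults if model not in disabled]
```

Calling review_models() with the defaults yields the codex + claude set used as the workaround; passing disabled=frozenset() restores all three lanes once gemini returns verdicts reliably.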

Metadata

Labels: area/consult (Area: Consult CLI / consultation tooling), bug (Something isn't working)
