feat(agent): add in-pipeline pre-PR self-review phase by nizar-lahlali · Pull Request #263 · aws-samples/sample-autonomous-cloud-coding-agents

nizar-lahlali · 2026-06-04T17:39:47Z

Summary

Adds an optional self-review pipeline phase where the LLM critiques its own cumulative diff before the PR is created
New self_review.py orchestration module with run_self_review() that generates the diff, invokes run_agent() a second time with a review-focused prompt, and accumulates turns/cost
New prompts/self_review.py with a focused review prompt template (correctness, bugs, security, style, test gaps checklist)
TaskConfig extended with self_review_enabled (default False) and self_review_max_turns (default 5)
Fields threaded through build_config(), get_config(), server.py _extract_invocation_params(), and pipeline.py run_task()
NEW: Self-review summary posted as a PR comment — the review agent writes .self-review-summary.md with structured findings, the pipeline posts it via gh pr comment, then deletes the file so it never appears in the diff

Design decisions:

Fail-open: self-review errors never block PR creation
Uses remaining turns/budget from original max_turns allocation (capped at self_review_max_turns)
Second run_agent() call gives the model fresh context and a clean review perspective
Skipped for pr_review (read-only), empty diff, no remaining turns/budget
Feature is opt-in (disabled by default) — no behavior change for existing users
Summary file approach (vs. parsing trajectory): deterministic, decoupled from SDK internals
Comment posted after PR creation (natural ordering — PR must exist before commenting)

Closes #262

Test plan

All 34 unit tests pass (tests/test_self_review.py) — includes 13 new tests for summary reader and PR comment posting
All existing tests pass (no regressions)
Ruff lint passes on all new/modified files
Manual validation: set SELF_REVIEW_ENABLED=true in local batch mode to verify the phase executes and summary is posted
Integration: deploy and submit a task with self_review_enabled: true in the orchestrator payload

krokoko · 2026-06-12T16:22:09Z

I think this is fine, just should be exposed using the new workflow system, a.k.a: add a method to do self review, and make it configurable as a step in the coding task workflow yaml file

Adds an optional self-review phase between agent execution and post-hooks where the LLM critiques its own cumulative diff before the PR is created. This improves first-pass PR quality by catching bugs, style issues, and test gaps before human review. - New self_review.py orchestration module with run_self_review() - New prompts/self_review.py with focused review prompt template - TaskConfig extended with self_review_enabled and self_review_max_turns - Fields threaded through build_config, get_config, server, pipeline - Fail-open: self-review errors never block PR creation - Uses remaining turns/budget from original allocation (capped) - Feature is opt-in (disabled by default)

After the self-review phase completes, the agent writes a structured summary of findings to `.self-review-summary.md`. The pipeline reads this file, posts it as a PR comment via `gh pr comment`, then deletes it so it never appears in the PR diff. Fail-open: if the file is missing or the comment fails to post, the pipeline continues normally.

Addresses review feedback: rather than a hardcoded inline pipeline phase gated by a per-task API flag, self-review is now a first-class workflow StepKind, configurable as a step in the coding workflow YAML. - Add `self_review` StepKind to workflow.schema.json (with a `max_turns` step field, 1-20, default 5) + models.Step; excluded from repo-less workflows (schema conditional + validator rule 3) since it diffs/re-runs the agent against the cloned tree. - Register `_handle_self_review` in the runner's STEP_HANDLERS; it wraps run_self_review and accumulates the review loop's turns/cost onto the shared agent_result. Non-side-effecting, so it may use on_failure: continue (advisory / fail-open). - Drive the step from pipeline.run_task via `_execute_self_review_step` (only_kinds={"self_review"}), mirroring _execute_agent_step. Runs after the cancel short-circuit and before post-hooks; posts the summary PR comment after ensure_pr when the step ran. - Declare the step in coding/new-task-v1.yaml (opt-in by presence; on_failure: continue). Remove it to disable; tune via max_turns. - Drop the now-redundant self_review_enabled / self_review_max_turns fields from TaskConfig, build_config, get_config, server params, and run_task — configurability lives in the workflow YAML. - Add golden validator-corpus fixtures (valid self_review step; repo-less + self_review rejected) to lock parity. Agent quality green: ruff, ty (0 errors), 1129 tests, 79.87% coverage.

theagenticguy · 2026-06-22T20:56:37Z

It might be semantics but should this be called a self review or should it be explicitly described as a isolated sub agent critic (aka adversarial review)?

nizar-lahlali requested a review from a team as a code owner June 4, 2026 17:39

bgagent added 3 commits June 22, 2026 14:34

nizar-lahlali force-pushed the feat/262-pre-pr-self-review branch from 19b763c to 3211230 Compare June 22, 2026 20:16

nizar-lahlali requested a review from a team as a code owner June 22, 2026 20:16

Merge branch 'main' into feat/262-pre-pr-self-review

2a8b5a9

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(agent): add in-pipeline pre-PR self-review phase#263

feat(agent): add in-pipeline pre-PR self-review phase#263
nizar-lahlali wants to merge 4 commits into
mainfrom
feat/262-pre-pr-self-review

nizar-lahlali commented Jun 4, 2026 •

edited

Loading

Uh oh!

krokoko commented Jun 12, 2026

Uh oh!

theagenticguy commented Jun 22, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

nizar-lahlali commented Jun 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test plan

Uh oh!

krokoko commented Jun 12, 2026

Uh oh!

theagenticguy commented Jun 22, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

nizar-lahlali commented Jun 4, 2026 •

edited

Loading